NASA Astrophysics Data System (ADS)
Nagai, Toshiki; Mitsutake, Ayori; Takano, Hiroshi
2013-02-01
A new relaxation mode analysis method, which is referred to as the principal component relaxation mode analysis method, has been proposed to handle a large number of degrees of freedom of protein systems. In this method, principal component analysis is carried out first and then relaxation mode analysis is applied to a small number of principal components with large fluctuations. To reduce the contribution of fast relaxation modes in these principal components efficiently, we have also proposed a relaxation mode analysis method using multiple evolution times. The principal component relaxation mode analysis method using two evolution times has been applied to an all-atom molecular dynamics simulation of human lysozyme in aqueous solution. Slow relaxation modes and corresponding relaxation times have been appropriately estimated, demonstrating that the method is applicable to protein systems.
Krohn, M.D.; Milton, N.M.; Segal, D.; Enland, A.
1981-01-01
A principal component image enhancement has been effective in applying Landsat data to geologic mapping in a heavily forested area of E Virginia. The image enhancement procedure consists of a principal component transformation, a histogram normalization, and the inverse principal componnet transformation. The enhancement preserves the independence of the principal components, yet produces a more readily interpretable image than does a single principal component transformation. -from Authors
An Introductory Application of Principal Components to Cricket Data
ERIC Educational Resources Information Center
Manage, Ananda B. W.; Scariano, Stephen M.
2013-01-01
Principal Component Analysis is widely used in applied multivariate data analysis, and this article shows how to motivate student interest in this topic using cricket sports data. Here, principal component analysis is successfully used to rank the cricket batsmen and bowlers who played in the 2012 Indian Premier League (IPL) competition. In…
Non-linear principal component analysis applied to Lorenz models and to North Atlantic SLP
NASA Astrophysics Data System (ADS)
Russo, A.; Trigo, R. M.
2003-04-01
A non-linear generalisation of Principal Component Analysis (PCA), denoted Non-Linear Principal Component Analysis (NLPCA), is introduced and applied to the analysis of three data sets. Non-Linear Principal Component Analysis allows for the detection and characterisation of low-dimensional non-linear structure in multivariate data sets. This method is implemented using a 5-layer feed-forward neural network introduced originally in the chemical engineering literature (Kramer, 1991). The method is described and details of its implementation are addressed. Non-Linear Principal Component Analysis is first applied to a data set sampled from the Lorenz attractor (1963). It is found that the NLPCA approximations are more representative of the data than are the corresponding PCA approximations. The same methodology was applied to the less known Lorenz attractor (1984). However, the results obtained weren't as good as those attained with the famous 'Butterfly' attractor. Further work with this model is underway in order to assess if NLPCA techniques can be more representative of the data characteristics than are the corresponding PCA approximations. The application of NLPCA to relatively 'simple' dynamical systems, such as those proposed by Lorenz, is well understood. However, the application of NLPCA to a large climatic data set is much more challenging. Here, we have applied NLPCA to the sea level pressure (SLP) field for the entire North Atlantic area and the results show a slight imcrement of explained variance associated. Finally, directions for future work are presented.%}
Hemmateenejad, Bahram; Akhond, Morteza; Miri, Ramin; Shamsipur, Mojtaba
2003-01-01
A QSAR algorithm, principal component-genetic algorithm-artificial neural network (PC-GA-ANN), has been applied to a set of newly synthesized calcium channel blockers, which are of special interest because of their role in cardiac diseases. A data set of 124 1,4-dihydropyridines bearing different ester substituents at the C-3 and C-5 positions of the dihydropyridine ring and nitroimidazolyl, phenylimidazolyl, and methylsulfonylimidazolyl groups at the C-4 position with known Ca(2+) channel binding affinities was employed in this study. Ten different sets of descriptors (837 descriptors) were calculated for each molecule. The principal component analysis was used to compress the descriptor groups into principal components. The most significant descriptors of each set were selected and used as input for the ANN. The genetic algorithm (GA) was used for the selection of the best set of extracted principal components. A feed forward artificial neural network with a back-propagation of error algorithm was used to process the nonlinear relationship between the selected principal components and biological activity of the dihydropyridines. A comparison between PC-GA-ANN and routine PC-ANN shows that the first model yields better prediction ability.
Principal component analysis of the nonlinear coupling of harmonic modes in heavy-ion collisions
NASA Astrophysics Data System (ADS)
BoŻek, Piotr
2018-03-01
The principal component analysis of flow correlations in heavy-ion collisions is studied. The correlation matrix of harmonic flow is generalized to correlations involving several different flow vectors. The method can be applied to study the nonlinear coupling between different harmonic modes in a double differential way in transverse momentum or pseudorapidity. The procedure is illustrated with results from the hydrodynamic model applied to Pb + Pb collisions at √{sN N}=2760 GeV. Three examples of generalized correlations matrices in transverse momentum are constructed corresponding to the coupling of v22 and v4, of v2v3 and v5, or of v23,v33 , and v6. The principal component decomposition is applied to the correlation matrices and the dominant modes are calculated.
NASA Astrophysics Data System (ADS)
Lim, Hoong-Ta; Murukeshan, Vadakke Matham
2017-06-01
Hyperspectral imaging combines imaging and spectroscopy to provide detailed spectral information for each spatial point in the image. This gives a three-dimensional spatial-spatial-spectral datacube with hundreds of spectral images. Probe-based hyperspectral imaging systems have been developed so that they can be used in regions where conventional table-top platforms would find it difficult to access. A fiber bundle, which is made up of specially-arranged optical fibers, has recently been developed and integrated with a spectrograph-based hyperspectral imager. This forms a snapshot hyperspectral imaging probe, which is able to form a datacube using the information from each scan. Compared to the other configurations, which require sequential scanning to form a datacube, the snapshot configuration is preferred in real-time applications where motion artifacts and pixel misregistration can be minimized. Principal component analysis is a dimension-reducing technique that can be applied in hyperspectral imaging to convert the spectral information into uncorrelated variables known as principal components. A confidence ellipse can be used to define the region of each class in the principal component feature space and for classification. This paper demonstrates the use of the snapshot hyperspectral imaging probe to acquire data from samples of different colors. The spectral library of each sample was acquired and then analyzed using principal component analysis. Confidence ellipse was then applied to the principal components of each sample and used as the classification criteria. The results show that the applied analysis can be used to perform classification of the spectral data acquired using the snapshot hyperspectral imaging probe.
NASA Technical Reports Server (NTRS)
Williams, D. L.; Borden, F. Y.
1977-01-01
Methods to accurately delineate the types of land cover in the urban-rural transition zone of metropolitan areas were considered. The application of principal components analysis to multidate LANDSAT imagery was investigated as a means of reducing the overlap between residential and agricultural spectral signatures. The statistical concepts of principal components analysis were discussed, as well as the results of this analysis when applied to multidate LANDSAT imagery of the Washington, D.C. metropolitan area.
Principal Component Analysis: Resources for an Essential Application of Linear Algebra
ERIC Educational Resources Information Center
Pankavich, Stephen; Swanson, Rebecca
2015-01-01
Principal Component Analysis (PCA) is a highly useful topic within an introductory Linear Algebra course, especially since it can be used to incorporate a number of applied projects. This method represents an essential application and extension of the Spectral Theorem and is commonly used within a variety of fields, including statistics,…
A modified procedure for mixture-model clustering of regional geochemical data
Ellefsen, Karl J.; Smith, David B.; Horton, John D.
2014-01-01
A modified procedure is proposed for mixture-model clustering of regional-scale geochemical data. The key modification is the robust principal component transformation of the isometric log-ratio transforms of the element concentrations. This principal component transformation and the associated dimension reduction are applied before the data are clustered. The principal advantage of this modification is that it significantly improves the stability of the clustering. The principal disadvantage is that it requires subjective selection of the number of clusters and the number of principal components. To evaluate the efficacy of this modified procedure, it is applied to soil geochemical data that comprise 959 samples from the state of Colorado (USA) for which the concentrations of 44 elements are measured. The distributions of element concentrations that are derived from the mixture model and from the field samples are similar, indicating that the mixture model is a suitable representation of the transformed geochemical data. Each cluster and the associated distributions of the element concentrations are related to specific geologic and anthropogenic features. In this way, mixture model clustering facilitates interpretation of the regional geochemical data.
NASA Astrophysics Data System (ADS)
Chen, Zhe; Parker, B. J.; Feng, D. D.; Fulton, R.
2004-10-01
In this paper, we compare various temporal analysis schemes applied to dynamic PET for improved quantification, image quality and temporal compression purposes. We compare an optimal sampling schedule (OSS) design, principal component analysis (PCA) applied in the image domain, and principal component analysis applied in the sinogram domain; for region-of-interest quantification, sinogram-domain PCA is combined with the Huesman algorithm to quantify from the sinograms directly without requiring reconstruction of all PCA channels. Using a simulated phantom FDG brain study and three clinical studies, we evaluate the fidelity of the compressed data for estimation of local cerebral metabolic rate of glucose by a four-compartment model. Our results show that using a noise-normalized PCA in the sinogram domain gives similar compression ratio and quantitative accuracy to OSS, but with substantially better precision. These results indicate that sinogram-domain PCA for dynamic PET can be a useful preprocessing stage for PET compression and quantification applications.
Maurer, Christian; Federolf, Peter; von Tscharner, Vinzenz; Stirling, Lisa; Nigg, Benno M
2012-05-01
Changes in gait kinematics have often been analyzed using pattern recognition methods such as principal component analysis (PCA). It is usually just the first few principal components that are analyzed, because they describe the main variability within a dataset and thus represent the main movement patterns. However, while subtle changes in gait pattern (for instance, due to different footwear) may not change main movement patterns, they may affect movements represented by higher principal components. This study was designed to test two hypotheses: (1) speed and gender differences can be observed in the first principal components, and (2) small interventions such as changing footwear change the gait characteristics of higher principal components. Kinematic changes due to different running conditions (speed - 3.1m/s and 4.9 m/s, gender, and footwear - control shoe and adidas MicroBounce shoe) were investigated by applying PCA and support vector machine (SVM) to a full-body reflective marker setup. Differences in speed changed the basic movement pattern, as was reflected by a change in the time-dependent coefficient derived from the first principal. Gender was differentiated by using the time-dependent coefficient derived from intermediate principal components. (Intermediate principal components are characterized by limb rotations of the thigh and shank.) Different shoe conditions were identified in higher principal components. This study showed that different interventions can be analyzed using a full-body kinematic approach. Within the well-defined vector space spanned by the data of all subjects, higher principal components should also be considered because these components show the differences that result from small interventions such as footwear changes. Crown Copyright © 2012. Published by Elsevier B.V. All rights reserved.
Directly Reconstructing Principal Components of Heterogeneous Particles from Cryo-EM Images
Tagare, Hemant D.; Kucukelbir, Alp; Sigworth, Fred J.; Wang, Hongwei; Rao, Murali
2015-01-01
Structural heterogeneity of particles can be investigated by their three-dimensional principal components. This paper addresses the question of whether, and with what algorithm, the three-dimensional principal components can be directly recovered from cryo-EM images. The first part of the paper extends the Fourier slice theorem to covariance functions showing that the three-dimensional covariance, and hence the principal components, of a heterogeneous particle can indeed be recovered from two-dimensional cryo-EM images. The second part of the paper proposes a practical algorithm for reconstructing the principal components directly from cryo-EM images without the intermediate step of calculating covariances. This algorithm is based on maximizing the (posterior) likelihood using the Expectation-Maximization algorithm. The last part of the paper applies this algorithm to simulated data and to two real cryo-EM data sets: a data set of the 70S ribosome with and without Elongation Factor-G (EF-G), and a data set of the inluenza virus RNA dependent RNA Polymerase (RdRP). The first principal component of the 70S ribosome data set reveals the expected conformational changes of the ribosome as the EF-G binds and unbinds. The first principal component of the RdRP data set reveals a conformational change in the two dimers of the RdRP. PMID:26049077
Brown, C. Erwin
1993-01-01
Correlation analysis in conjunction with principal-component and multiple-regression analyses were applied to laboratory chemical and petrographic data to assess the usefulness of these techniques in evaluating selected physical and hydraulic properties of carbonate-rock aquifers in central Pennsylvania. Correlation and principal-component analyses were used to establish relations and associations among variables, to determine dimensions of property variation of samples, and to filter the variables containing similar information. Principal-component and correlation analyses showed that porosity is related to other measured variables and that permeability is most related to porosity and grain size. Four principal components are found to be significant in explaining the variance of data. Stepwise multiple-regression analysis was used to see how well the measured variables could predict porosity and (or) permeability for this suite of rocks. The variation in permeability and porosity is not totally predicted by the other variables, but the regression is significant at the 5% significance level. ?? 1993.
Karasawa, N; Mitsutake, A; Takano, H
2017-12-01
Proteins implement their functionalities when folded into specific three-dimensional structures, and their functions are related to the protein structures and dynamics. Previously, we applied a relaxation mode analysis (RMA) method to protein systems; this method approximately estimates the slow relaxation modes and times via simulation and enables investigation of the dynamic properties underlying the protein structural fluctuations. Recently, two-step RMA with multiple evolution times has been proposed and applied to a slightly complex homopolymer system, i.e., a single [n]polycatenane. This method can be applied to more complex heteropolymer systems, i.e., protein systems, to estimate the relaxation modes and times more accurately. In two-step RMA, we first perform RMA and obtain rough estimates of the relaxation modes and times. Then, we apply RMA with multiple evolution times to a small number of the slowest relaxation modes obtained in the previous calculation. Herein, we apply this method to the results of principal component analysis (PCA). First, PCA is applied to a 2-μs molecular dynamics simulation of hen egg-white lysozyme in aqueous solution. Then, the two-step RMA method with multiple evolution times is applied to the obtained principal components. The slow relaxation modes and corresponding relaxation times for the principal components are much improved by the second RMA.
NASA Astrophysics Data System (ADS)
Karasawa, N.; Mitsutake, A.; Takano, H.
2017-12-01
Proteins implement their functionalities when folded into specific three-dimensional structures, and their functions are related to the protein structures and dynamics. Previously, we applied a relaxation mode analysis (RMA) method to protein systems; this method approximately estimates the slow relaxation modes and times via simulation and enables investigation of the dynamic properties underlying the protein structural fluctuations. Recently, two-step RMA with multiple evolution times has been proposed and applied to a slightly complex homopolymer system, i.e., a single [n ] polycatenane. This method can be applied to more complex heteropolymer systems, i.e., protein systems, to estimate the relaxation modes and times more accurately. In two-step RMA, we first perform RMA and obtain rough estimates of the relaxation modes and times. Then, we apply RMA with multiple evolution times to a small number of the slowest relaxation modes obtained in the previous calculation. Herein, we apply this method to the results of principal component analysis (PCA). First, PCA is applied to a 2-μ s molecular dynamics simulation of hen egg-white lysozyme in aqueous solution. Then, the two-step RMA method with multiple evolution times is applied to the obtained principal components. The slow relaxation modes and corresponding relaxation times for the principal components are much improved by the second RMA.
The Complexity of Human Walking: A Knee Osteoarthritis Study
Kotti, Margarita; Duffell, Lynsey D.; Faisal, Aldo A.; McGregor, Alison H.
2014-01-01
This study proposes a framework for deconstructing complex walking patterns to create a simple principal component space before checking whether the projection to this space is suitable for identifying changes from the normality. We focus on knee osteoarthritis, the most common knee joint disease and the second leading cause of disability. Knee osteoarthritis affects over 250 million people worldwide. The motivation for projecting the highly dimensional movements to a lower dimensional and simpler space is our belief that motor behaviour can be understood by identifying a simplicity via projection to a low principal component space, which may reflect upon the underlying mechanism. To study this, we recruited 180 subjects, 47 of which reported that they had knee osteoarthritis. They were asked to walk several times along a walkway equipped with two force plates that capture their ground reaction forces along 3 axes, namely vertical, anterior-posterior, and medio-lateral, at 1000 Hz. Data when the subject does not clearly strike the force plate were excluded, leaving 1–3 gait cycles per subject. To examine the complexity of human walking, we applied dimensionality reduction via Probabilistic Principal Component Analysis. The first principal component explains 34% of the variance in the data, whereas over 80% of the variance is explained by 8 principal components or more. This proves the complexity of the underlying structure of the ground reaction forces. To examine if our musculoskeletal system generates movements that are distinguishable between normal and pathological subjects in a low dimensional principal component space, we applied a Bayes classifier. For the tested cross-validated, subject-independent experimental protocol, the classification accuracy equals 82.62%. Also, a novel complexity measure is proposed, which can be used as an objective index to facilitate clinical decision making. This measure proves that knee osteoarthritis subjects exhibit more variability in the two-dimensional principal component space. PMID:25232949
NASA Astrophysics Data System (ADS)
Wang, Zhuozheng; Deller, J. R.; Fleet, Blair D.
2016-01-01
Acquired digital images are often corrupted by a lack of camera focus, faulty illumination, or missing data. An algorithm is presented for fusion of multiple corrupted images of a scene using the lifting wavelet transform. The method employs adaptive fusion arithmetic based on matrix completion and self-adaptive regional variance estimation. Characteristics of the wavelet coefficients are used to adaptively select fusion rules. Robust principal component analysis is applied to low-frequency image components, and regional variance estimation is applied to high-frequency components. Experiments reveal that the method is effective for multifocus, visible-light, and infrared image fusion. Compared with traditional algorithms, the new algorithm not only increases the amount of preserved information and clarity but also improves robustness.
ERIC Educational Resources Information Center
Rahayu, Sri; Sugiarto, Teguh; Madu, Ludiro; Holiawati; Subagyo, Ahmad
2017-01-01
This study aims to apply the model principal component analysis to reduce multicollinearity on variable currency exchange rate in eight countries in Asia against US Dollar including the Yen (Japan), Won (South Korea), Dollar (Hong Kong), Yuan (China), Bath (Thailand), Rupiah (Indonesia), Ringgit (Malaysia), Dollar (Singapore). It looks at yield…
Directly reconstructing principal components of heterogeneous particles from cryo-EM images.
Tagare, Hemant D; Kucukelbir, Alp; Sigworth, Fred J; Wang, Hongwei; Rao, Murali
2015-08-01
Structural heterogeneity of particles can be investigated by their three-dimensional principal components. This paper addresses the question of whether, and with what algorithm, the three-dimensional principal components can be directly recovered from cryo-EM images. The first part of the paper extends the Fourier slice theorem to covariance functions showing that the three-dimensional covariance, and hence the principal components, of a heterogeneous particle can indeed be recovered from two-dimensional cryo-EM images. The second part of the paper proposes a practical algorithm for reconstructing the principal components directly from cryo-EM images without the intermediate step of calculating covariances. This algorithm is based on maximizing the posterior likelihood using the Expectation-Maximization algorithm. The last part of the paper applies this algorithm to simulated data and to two real cryo-EM data sets: a data set of the 70S ribosome with and without Elongation Factor-G (EF-G), and a data set of the influenza virus RNA dependent RNA Polymerase (RdRP). The first principal component of the 70S ribosome data set reveals the expected conformational changes of the ribosome as the EF-G binds and unbinds. The first principal component of the RdRP data set reveals a conformational change in the two dimers of the RdRP. Copyright © 2015 Elsevier Inc. All rights reserved.
Fast, Exact Bootstrap Principal Component Analysis for p > 1 million
Fisher, Aaron; Caffo, Brian; Schwartz, Brian; Zipunnikov, Vadim
2015-01-01
Many have suggested a bootstrap procedure for estimating the sampling variability of principal component analysis (PCA) results. However, when the number of measurements per subject (p) is much larger than the number of subjects (n), calculating and storing the leading principal components from each bootstrap sample can be computationally infeasible. To address this, we outline methods for fast, exact calculation of bootstrap principal components, eigenvalues, and scores. Our methods leverage the fact that all bootstrap samples occupy the same n-dimensional subspace as the original sample. As a result, all bootstrap principal components are limited to the same n-dimensional subspace and can be efficiently represented by their low dimensional coordinates in that subspace. Several uncertainty metrics can be computed solely based on the bootstrap distribution of these low dimensional coordinates, without calculating or storing the p-dimensional bootstrap components. Fast bootstrap PCA is applied to a dataset of sleep electroencephalogram recordings (p = 900, n = 392), and to a dataset of brain magnetic resonance images (MRIs) (p ≈ 3 million, n = 352). For the MRI dataset, our method allows for standard errors for the first 3 principal components based on 1000 bootstrap samples to be calculated on a standard laptop in 47 minutes, as opposed to approximately 4 days with standard methods. PMID:27616801
Combination of PCA and LORETA for sources analysis of ERP data: an emotional processing study
NASA Astrophysics Data System (ADS)
Hu, Jin; Tian, Jie; Yang, Lei; Pan, Xiaohong; Liu, Jiangang
2006-03-01
The purpose of this paper is to study spatiotemporal patterns of neuronal activity in emotional processing by analysis of ERP data. 108 pictures (categorized as positive, negative and neutral) were presented to 24 healthy, right-handed subjects while 128-channel EEG data were recorded. An analysis of two steps was applied to the ERP data. First, principal component analysis was performed to obtain significant ERP components. Then LORETA was applied to each component to localize their brain sources. The first six principal components were extracted, each of which showed different spatiotemporal patterns of neuronal activity. The results agree with other emotional study by fMRI or PET. The combination of PCA and LORETA can be used to analyze spatiotemporal patterns of ERP data in emotional processing.
Principal components analysis in clinical studies.
Zhang, Zhongheng; Castelló, Adela
2017-09-01
In multivariate analysis, independent variables are usually correlated to each other which can introduce multicollinearity in the regression models. One approach to solve this problem is to apply principal components analysis (PCA) over these variables. This method uses orthogonal transformation to represent sets of potentially correlated variables with principal components (PC) that are linearly uncorrelated. PCs are ordered so that the first PC has the largest possible variance and only some components are selected to represent the correlated variables. As a result, the dimension of the variable space is reduced. This tutorial illustrates how to perform PCA in R environment, the example is a simulated dataset in which two PCs are responsible for the majority of the variance in the data. Furthermore, the visualization of PCA is highlighted.
Principal component greenness transformation in multitemporal agricultural Landsat data
NASA Technical Reports Server (NTRS)
Abotteen, R. A.
1978-01-01
A data compression technique for multitemporal Landsat imagery which extracts phenological growth pattern information for agricultural crops is described. The principal component greenness transformation was applied to multitemporal agricultural Landsat data for information retrieval. The transformation was favorable for applications in agricultural Landsat data analysis because of its physical interpretability and its relation to the phenological growth of crops. It was also found that the first and second greenness eigenvector components define a temporal small-grain trajectory and nonsmall-grain trajectory, respectively.
Dascălu, Cristina Gena; Antohe, Magda Ecaterina
2009-01-01
Based on the eigenvalues and the eigenvectors analysis, the principal component analysis has the purpose to identify the subspace of the main components from a set of parameters, which are enough to characterize the whole set of parameters. Interpreting the data for analysis as a cloud of points, we find through geometrical transformations the directions where the cloud's dispersion is maximal--the lines that pass through the cloud's center of weight and have a maximal density of points around them (by defining an appropriate criteria function and its minimization. This method can be successfully used in order to simplify the statistical analysis on questionnaires--because it helps us to select from a set of items only the most relevant ones, which cover the variations of the whole set of data. For instance, in the presented sample we started from a questionnaire with 28 items and, applying the principal component analysis we identified 7 principal components--or main items--fact that simplifies significantly the further data statistical analysis.
Machine learning of frustrated classical spin models. I. Principal component analysis
NASA Astrophysics Data System (ADS)
Wang, Ce; Zhai, Hui
2017-10-01
This work aims at determining whether artificial intelligence can recognize a phase transition without prior human knowledge. If this were successful, it could be applied to, for instance, analyzing data from the quantum simulation of unsolved physical models. Toward this goal, we first need to apply the machine learning algorithm to well-understood models and see whether the outputs are consistent with our prior knowledge, which serves as the benchmark for this approach. In this work, we feed the computer data generated by the classical Monte Carlo simulation for the X Y model in frustrated triangular and union jack lattices, which has two order parameters and exhibits two phase transitions. We show that the outputs of the principal component analysis agree very well with our understanding of different orders in different phases, and the temperature dependences of the major components detect the nature and the locations of the phase transitions. Our work offers promise for using machine learning techniques to study sophisticated statistical models, and our results can be further improved by using principal component analysis with kernel tricks and the neural network method.
Das, Atanu; Mukhopadhyay, Chaitali
2007-10-28
We have performed molecular dynamics (MD) simulation of the thermal denaturation of one protein and one peptide-ubiquitin and melittin. To identify the correlation in dynamics among various secondary structural fragments and also the individual contribution of different residues towards thermal unfolding, principal component analysis method was applied in order to give a new insight to protein dynamics by analyzing the contribution of coefficients of principal components. The cross-correlation matrix obtained from MD simulation trajectory provided important information regarding the anisotropy of backbone dynamics that leads to unfolding. Unfolding of ubiquitin was found to be a three-state process, while that of melittin, though smaller and mostly helical, is more complicated.
NASA Astrophysics Data System (ADS)
Das, Atanu; Mukhopadhyay, Chaitali
2007-10-01
We have performed molecular dynamics (MD) simulation of the thermal denaturation of one protein and one peptide—ubiquitin and melittin. To identify the correlation in dynamics among various secondary structural fragments and also the individual contribution of different residues towards thermal unfolding, principal component analysis method was applied in order to give a new insight to protein dynamics by analyzing the contribution of coefficients of principal components. The cross-correlation matrix obtained from MD simulation trajectory provided important information regarding the anisotropy of backbone dynamics that leads to unfolding. Unfolding of ubiquitin was found to be a three-state process, while that of melittin, though smaller and mostly helical, is more complicated.
A novel principal component analysis for spatially misaligned multivariate air pollution data.
Jandarov, Roman A; Sheppard, Lianne A; Sampson, Paul D; Szpiro, Adam A
2017-01-01
We propose novel methods for predictive (sparse) PCA with spatially misaligned data. These methods identify principal component loading vectors that explain as much variability in the observed data as possible, while also ensuring the corresponding principal component scores can be predicted accurately by means of spatial statistics at locations where air pollution measurements are not available. This will make it possible to identify important mixtures of air pollutants and to quantify their health effects in cohort studies, where currently available methods cannot be used. We demonstrate the utility of predictive (sparse) PCA in simulated data and apply the approach to annual averages of particulate matter speciation data from national Environmental Protection Agency (EPA) regulatory monitors.
Coastal modification of a scene employing multispectral images and vector operators.
Lira, Jorge
2017-05-01
Changes in sea level, wind patterns, sea current patterns, and tide patterns have produced morphologic transformations in the coastline area of Tamaulipas Sate in North East Mexico. Such changes generated a modification of the coastline and variations of the texture-relief and texture of the continental area of Tamaulipas. Two high-resolution multispectral satellite Satellites Pour l'Observation de la Terre images were employed to quantify the morphologic change of such continental area. The images cover a time span close to 10 years. A variant of the principal component analysis was used to delineate the modification of the land-water line. To quantify changes in texture-relief and texture, principal component analysis was applied to the multispectral images. The first principal components of each image were modeled as a discrete bidimensional vector field. The divergence and Laplacian vector operators were applied to the discrete vector field. The divergence provided the change of texture, while the Laplacian produced the change of texture-relief in the area of study.
Binding Isotherms and Time Courses Readily from Magnetic Resonance.
Xu, Jia; Van Doren, Steven R
2016-08-16
Evidence is presented that binding isotherms, simple or biphasic, can be extracted directly from noninterpreted, complex 2D NMR spectra using principal component analysis (PCA) to reveal the largest trend(s) across the series. This approach renders peak picking unnecessary for tracking population changes. In 1:1 binding, the first principal component captures the binding isotherm from NMR-detected titrations in fast, slow, and even intermediate and mixed exchange regimes, as illustrated for phospholigand associations with proteins. Although the sigmoidal shifts and line broadening of intermediate exchange distorts binding isotherms constructed conventionally, applying PCA directly to these spectra along with Pareto scaling overcomes the distortion. Applying PCA to time-domain NMR data also yields binding isotherms from titrations in fast or slow exchange. The algorithm readily extracts from magnetic resonance imaging movie time courses such as breathing and heart rate in chest imaging. Similarly, two-step binding processes detected by NMR are easily captured by principal components 1 and 2. PCA obviates the customary focus on specific peaks or regions of images. Applying it directly to a series of complex data will easily delineate binding isotherms, equilibrium shifts, and time courses of reactions or fluctuations.
Optimized principal component analysis on coronagraphic images of the fomalhaut system
DOE Office of Scientific and Technical Information (OSTI.GOV)
Meshkat, Tiffany; Kenworthy, Matthew A.; Quanz, Sascha P.
We present the results of a study to optimize the principal component analysis (PCA) algorithm for planet detection, a new algorithm complementing angular differential imaging and locally optimized combination of images (LOCI) for increasing the contrast achievable next to a bright star. The stellar point spread function (PSF) is constructed by removing linear combinations of principal components, allowing the flux from an extrasolar planet to shine through. The number of principal components used determines how well the stellar PSF is globally modeled. Using more principal components may decrease the number of speckles in the final image, but also increases themore » background noise. We apply PCA to Fomalhaut Very Large Telescope NaCo images acquired at 4.05 μm with an apodized phase plate. We do not detect any companions, with a model dependent upper mass limit of 13-18 M {sub Jup} from 4-10 AU. PCA achieves greater sensitivity than the LOCI algorithm for the Fomalhaut coronagraphic data by up to 1 mag. We make several adaptations to the PCA code and determine which of these prove the most effective at maximizing the signal-to-noise from a planet very close to its parent star. We demonstrate that optimizing the number of principal components used in PCA proves most effective for pulling out a planet signal.« less
NASA Astrophysics Data System (ADS)
Hristian, L.; Ostafe, M. M.; Manea, L. R.; Apostol, L. L.
2017-06-01
The work pursued the distribution of combed wool fabrics destined to manufacturing of external articles of clothing in terms of the values of durability and physiological comfort indices, using the mathematical model of Principal Component Analysis (PCA). Principal Components Analysis (PCA) applied in this study is a descriptive method of the multivariate analysis/multi-dimensional data, and aims to reduce, under control, the number of variables (columns) of the matrix data as much as possible to two or three. Therefore, based on the information about each group/assortment of fabrics, it is desired that, instead of nine inter-correlated variables, to have only two or three new variables called components. The PCA target is to extract the smallest number of components which recover the most of the total information contained in the initial data.
Relaxation mode analysis of a peptide system: comparison with principal component analysis.
Mitsutake, Ayori; Iijima, Hiromitsu; Takano, Hiroshi
2011-10-28
This article reports the first attempt to apply the relaxation mode analysis method to a simulation of a biomolecular system. In biomolecular systems, the principal component analysis is a well-known method for analyzing the static properties of fluctuations of structures obtained by a simulation and classifying the structures into some groups. On the other hand, the relaxation mode analysis has been used to analyze the dynamic properties of homopolymer systems. In this article, a long Monte Carlo simulation of Met-enkephalin in gas phase has been performed. The results are analyzed by the principal component analysis and relaxation mode analysis methods. We compare the results of both methods and show the effectiveness of the relaxation mode analysis.
NASA Astrophysics Data System (ADS)
Zhang, Qiong; Peng, Cong; Lu, Yiming; Wang, Hao; Zhu, Kaiguang
2018-04-01
A novel technique is developed to level airborne geophysical data using principal component analysis based on flight line difference. In the paper, flight line difference is introduced to enhance the features of levelling error for airborne electromagnetic (AEM) data and improve the correlation between pseudo tie lines. Thus we conduct levelling to the flight line difference data instead of to the original AEM data directly. Pseudo tie lines are selected distributively cross profile direction, avoiding the anomalous regions. Since the levelling errors of selective pseudo tie lines show high correlations, principal component analysis is applied to extract the local levelling errors by low-order principal components reconstruction. Furthermore, we can obtain the levelling errors of original AEM data through inverse difference after spatial interpolation. This levelling method does not need to fly tie lines and design the levelling fitting function. The effectiveness of this method is demonstrated by the levelling results of survey data, comparing with the results from tie-line levelling and flight-line correlation levelling.
NASA Astrophysics Data System (ADS)
Nordemann, D. J. R.; Rigozo, N. R.; de Souza Echer, M. P.; Echer, E.
2008-11-01
We present here an implementation of a least squares iterative regression method applied to the sine functions embedded in the principal components extracted from geophysical time series. This method seems to represent a useful improvement for the non-stationary time series periodicity quantitative analysis. The principal components determination followed by the least squares iterative regression method was implemented in an algorithm written in the Scilab (2006) language. The main result of the method is to obtain the set of sine functions embedded in the series analyzed in decreasing order of significance, from the most important ones, likely to represent the physical processes involved in the generation of the series, to the less important ones that represent noise components. Taking into account the need of a deeper knowledge of the Sun's past history and its implication to global climate change, the method was applied to the Sunspot Number series (1750-2004). With the threshold and parameter values used here, the application of the method leads to a total of 441 explicit sine functions, among which 65 were considered as being significant and were used for a reconstruction that gave a normalized mean squared error of 0.146.
Decomposing the Apoptosis Pathway Into Biologically Interpretable Principal Components
Wang, Min; Kornblau, Steven M; Coombes, Kevin R
2018-01-01
Principal component analysis (PCA) is one of the most common techniques in the analysis of biological data sets, but applying PCA raises 2 challenges. First, one must determine the number of significant principal components (PCs). Second, because each PC is a linear combination of genes, it rarely has a biological interpretation. Existing methods to determine the number of PCs are either subjective or computationally extensive. We review several methods and describe a new R package, PCDimension, that implements additional methods, the most important being an algorithm that extends and automates a graphical Bayesian method. Using simulations, we compared the methods. Our newly automated procedure is competitive with the best methods when considering both accuracy and speed and is the most accurate when the number of objects is small compared with the number of attributes. We applied the method to a proteomics data set from patients with acute myeloid leukemia. Proteins in the apoptosis pathway could be explained using 6 PCs. By clustering the proteins in PC space, we were able to replace the PCs by 6 “biological components,” 3 of which could be immediately interpreted from the current literature. We expect this approach combining PCA with clustering to be widely applicable. PMID:29881252
Finger crease pattern recognition using Legendre moments and principal component analysis
NASA Astrophysics Data System (ADS)
Luo, Rongfang; Lin, Tusheng
2007-03-01
The finger joint lines defined as finger creases and its distribution can identify a person. In this paper, we propose a new finger crease pattern recognition method based on Legendre moments and principal component analysis (PCA). After obtaining the region of interest (ROI) for each finger image in the pre-processing stage, Legendre moments under Radon transform are applied to construct a moment feature matrix from the ROI, which greatly decreases the dimensionality of ROI and can represent principal components of the finger creases quite well. Then, an approach to finger crease pattern recognition is designed based on Karhunen-Loeve (K-L) transform. The method applies PCA to a moment feature matrix rather than the original image matrix to achieve the feature vector. The proposed method has been tested on a database of 824 images from 103 individuals using the nearest neighbor classifier. The accuracy up to 98.584% has been obtained when using 4 samples per class for training. The experimental results demonstrate that our proposed approach is feasible and effective in biometrics.
From measurements to metrics: PCA-based indicators of cyber anomaly
NASA Astrophysics Data System (ADS)
Ahmed, Farid; Johnson, Tommy; Tsui, Sonia
2012-06-01
We present a framework of the application of Principal Component Analysis (PCA) to automatically obtain meaningful metrics from intrusion detection measurements. In particular, we report the progress made in applying PCA to analyze the behavioral measurements of malware and provide some preliminary results in selecting dominant attributes from an arbitrary number of malware attributes. The results will be useful in formulating an optimal detection threshold in the principal component space, which can both validate and augment existing malware classifiers.
NASA Astrophysics Data System (ADS)
Gao, Yang; Chen, Maomao; Wu, Junyu; Zhou, Yuan; Cai, Chuangjian; Wang, Daliang; Luo, Jianwen
2017-09-01
Fluorescence molecular imaging has been used to target tumors in mice with xenograft tumors. However, tumor imaging is largely distorted by the aggregation of fluorescent probes in the liver. A principal component analysis (PCA)-based strategy was applied on the in vivo dynamic fluorescence imaging results of three mice with xenograft tumors to facilitate tumor imaging, with the help of a tumor-specific fluorescent probe. Tumor-relevant features were extracted from the original images by PCA and represented by the principal component (PC) maps. The second principal component (PC2) map represented the tumor-related features, and the first principal component (PC1) map retained the original pharmacokinetic profiles, especially of the liver. The distribution patterns of the PC2 map of the tumor-bearing mice were in good agreement with the actual tumor location. The tumor-to-liver ratio and contrast-to-noise ratio were significantly higher on the PC2 map than on the original images, thus distinguishing the tumor from its nearby fluorescence noise of liver. The results suggest that the PC2 map could serve as a bioimaging marker to facilitate in vivo tumor localization, and dynamic fluorescence molecular imaging with PCA could be a valuable tool for future studies of in vivo tumor metabolism and progression.
Surzhikov, V D; Surzhikov, D V
2014-01-01
The search and measurement of causal relationships between exposure to air pollution and health state of the population is based on the system analysis and risk assessment to improve the quality of research. With this purpose there is applied the modern statistical analysis with the use of criteria of independence, principal component analysis and discriminate function analysis. As a result of analysis out of all atmospheric pollutants there were separated four main components: for diseases of the circulatory system main principal component is implied with concentrations of suspended solids, nitrogen dioxide, carbon monoxide, hydrogen fluoride, for the respiratory diseases the main c principal component is closely associated with suspended solids, sulfur dioxide and nitrogen dioxide, charcoal black. The discriminant function was shown to be used as a measure of the level of air pollution.
Perturbation analyses of intermolecular interactions
NASA Astrophysics Data System (ADS)
Koyama, Yohei M.; Kobayashi, Tetsuya J.; Ueda, Hiroki R.
2011-08-01
Conformational fluctuations of a protein molecule are important to its function, and it is known that environmental molecules, such as water molecules, ions, and ligand molecules, significantly affect the function by changing the conformational fluctuations. However, it is difficult to systematically understand the role of environmental molecules because intermolecular interactions related to the conformational fluctuations are complicated. To identify important intermolecular interactions with regard to the conformational fluctuations, we develop herein (i) distance-independent and (ii) distance-dependent perturbation analyses of the intermolecular interactions. We show that these perturbation analyses can be realized by performing (i) a principal component analysis using conditional expectations of truncated and shifted intermolecular potential energy terms and (ii) a functional principal component analysis using products of intermolecular forces and conditional cumulative densities. We refer to these analyses as intermolecular perturbation analysis (IPA) and distance-dependent intermolecular perturbation analysis (DIPA), respectively. For comparison of the IPA and the DIPA, we apply them to the alanine dipeptide isomerization in explicit water. Although the first IPA principal components discriminate two states (the α state and PPII (polyproline II) + β states) for larger cutoff length, the separation between the PPII state and the β state is unclear in the second IPA principal components. On the other hand, in the large cutoff value, DIPA eigenvalues converge faster than that for IPA and the top two DIPA principal components clearly identify the three states. By using the DIPA biplot, the contributions of the dipeptide-water interactions to each state are analyzed systematically. Since the DIPA improves the state identification and the convergence rate with retaining distance information, we conclude that the DIPA is a more practical method compared with the IPA. To test the feasibility of the DIPA for larger molecules, we apply the DIPA to the ten-residue chignolin folding in explicit water. The top three principal components identify the four states (native state, two misfolded states, and unfolded state) and their corresponding eigenfunctions identify important chignolin-water interactions to each state. Thus, the DIPA provides the practical method to identify conformational states and their corresponding important intermolecular interactions with distance information.
Perturbation analyses of intermolecular interactions.
Koyama, Yohei M; Kobayashi, Tetsuya J; Ueda, Hiroki R
2011-08-01
Conformational fluctuations of a protein molecule are important to its function, and it is known that environmental molecules, such as water molecules, ions, and ligand molecules, significantly affect the function by changing the conformational fluctuations. However, it is difficult to systematically understand the role of environmental molecules because intermolecular interactions related to the conformational fluctuations are complicated. To identify important intermolecular interactions with regard to the conformational fluctuations, we develop herein (i) distance-independent and (ii) distance-dependent perturbation analyses of the intermolecular interactions. We show that these perturbation analyses can be realized by performing (i) a principal component analysis using conditional expectations of truncated and shifted intermolecular potential energy terms and (ii) a functional principal component analysis using products of intermolecular forces and conditional cumulative densities. We refer to these analyses as intermolecular perturbation analysis (IPA) and distance-dependent intermolecular perturbation analysis (DIPA), respectively. For comparison of the IPA and the DIPA, we apply them to the alanine dipeptide isomerization in explicit water. Although the first IPA principal components discriminate two states (the α state and PPII (polyproline II) + β states) for larger cutoff length, the separation between the PPII state and the β state is unclear in the second IPA principal components. On the other hand, in the large cutoff value, DIPA eigenvalues converge faster than that for IPA and the top two DIPA principal components clearly identify the three states. By using the DIPA biplot, the contributions of the dipeptide-water interactions to each state are analyzed systematically. Since the DIPA improves the state identification and the convergence rate with retaining distance information, we conclude that the DIPA is a more practical method compared with the IPA. To test the feasibility of the DIPA for larger molecules, we apply the DIPA to the ten-residue chignolin folding in explicit water. The top three principal components identify the four states (native state, two misfolded states, and unfolded state) and their corresponding eigenfunctions identify important chignolin-water interactions to each state. Thus, the DIPA provides the practical method to identify conformational states and their corresponding important intermolecular interactions with distance information.
Dynamic of consumer groups and response of commodity markets by principal component analysis
NASA Astrophysics Data System (ADS)
Nobi, Ashadun; Alam, Shafiqul; Lee, Jae Woo
2017-09-01
This study investigates financial states and group dynamics by applying principal component analysis to the cross-correlation coefficients of the daily returns of commodity futures. The eigenvalues of the cross-correlation matrix in the 6-month timeframe displays similar values during 2010-2011, but decline following 2012. A sharp drop in eigenvalue implies the significant change of the market state. Three commodity sectors, energy, metals and agriculture, are projected into two dimensional spaces consisting of two principal components (PC). We observe that they form three distinct clusters in relation to various sectors. However, commodities with distinct features have intermingled with one another and scattered during severe crises, such as the European sovereign debt crises. We observe the notable change of the position of two dimensional spaces of groups during financial crises. By considering the first principal component (PC1) within the 6-month moving timeframe, we observe that commodities of the same group change states in a similar pattern, and the change of states of one group can be used as a warning for other group.
Giesen, E B W; Ding, M; Dalstra, M; van Eijden, T M G J
2003-09-01
As several morphological parameters of cancellous bone express more or less the same architectural measure, we applied principal components analysis to group these measures and correlated these to the mechanical properties. Cylindrical specimens (n = 24) were obtained in different orientations from embalmed mandibular condyles; the angle of the first principal direction and the axis of the specimen, expressing the orientation of the trabeculae, ranged from 10 degrees to 87 degrees. Morphological parameters were determined by a method based on Archimedes' principle and by micro-CT scanning, and the mechanical properties were obtained by mechanical testing. The principal components analysis was used to obtain a set of independent components to describe the morphology. This set was entered into linear regression analyses for explaining the variance in mechanical properties. The principal components analysis revealed four components: amount of bone, number of trabeculae, trabecular orientation, and miscellaneous. They accounted for about 90% of the variance in the morphological variables. The component loadings indicated that a higher amount of bone was primarily associated with more plate-like trabeculae, and not with more or thicker trabeculae. The trabecular orientation was most determinative (about 50%) in explaining stiffness, strength, and failure energy. The amount of bone was second most determinative and increased the explained variance to about 72%. These results suggest that trabecular orientation and amount of bone are important in explaining the anisotropic mechanical properties of the cancellous bone of the mandibular condyle.
NASA Astrophysics Data System (ADS)
Xie, Hong-Bo; Dokos, Socrates
2013-06-01
We present a hybrid symplectic geometry and central tendency measure (CTM) method for detection of determinism in noisy time series. CTM is effective for detecting determinism in short time series and has been applied in many areas of nonlinear analysis. However, its performance significantly degrades in the presence of strong noise. In order to circumvent this difficulty, we propose to use symplectic principal component analysis (SPCA), a new chaotic signal de-noising method, as the first step to recover the system dynamics. CTM is then applied to determine whether the time series arises from a stochastic process or has a deterministic component. Results from numerical experiments, ranging from six benchmark deterministic models to 1/f noise, suggest that the hybrid method can significantly improve detection of determinism in noisy time series by about 20 dB when the data are contaminated by Gaussian noise. Furthermore, we apply our algorithm to study the mechanomyographic (MMG) signals arising from contraction of human skeletal muscle. Results obtained from the hybrid symplectic principal component analysis and central tendency measure demonstrate that the skeletal muscle motor unit dynamics can indeed be deterministic, in agreement with previous studies. However, the conventional CTM method was not able to definitely detect the underlying deterministic dynamics. This result on MMG signal analysis is helpful in understanding neuromuscular control mechanisms and developing MMG-based engineering control applications.
Xie, Hong-Bo; Dokos, Socrates
2013-06-01
We present a hybrid symplectic geometry and central tendency measure (CTM) method for detection of determinism in noisy time series. CTM is effective for detecting determinism in short time series and has been applied in many areas of nonlinear analysis. However, its performance significantly degrades in the presence of strong noise. In order to circumvent this difficulty, we propose to use symplectic principal component analysis (SPCA), a new chaotic signal de-noising method, as the first step to recover the system dynamics. CTM is then applied to determine whether the time series arises from a stochastic process or has a deterministic component. Results from numerical experiments, ranging from six benchmark deterministic models to 1/f noise, suggest that the hybrid method can significantly improve detection of determinism in noisy time series by about 20 dB when the data are contaminated by Gaussian noise. Furthermore, we apply our algorithm to study the mechanomyographic (MMG) signals arising from contraction of human skeletal muscle. Results obtained from the hybrid symplectic principal component analysis and central tendency measure demonstrate that the skeletal muscle motor unit dynamics can indeed be deterministic, in agreement with previous studies. However, the conventional CTM method was not able to definitely detect the underlying deterministic dynamics. This result on MMG signal analysis is helpful in understanding neuromuscular control mechanisms and developing MMG-based engineering control applications.
Three dimensional empirical mode decomposition analysis apparatus, method and article manufacture
NASA Technical Reports Server (NTRS)
Gloersen, Per (Inventor)
2004-01-01
An apparatus and method of analysis for three-dimensional (3D) physical phenomena. The physical phenomena may include any varying 3D phenomena such as time varying polar ice flows. A repesentation of the 3D phenomena is passed through a Hilbert transform to convert the data into complex form. A spatial variable is separated from the complex representation by producing a time based covariance matrix. The temporal parts of the principal components are produced by applying Singular Value Decomposition (SVD). Based on the rapidity with which the eigenvalues decay, the first 3-10 complex principal components (CPC) are selected for Empirical Mode Decomposition into intrinsic modes. The intrinsic modes produced are filtered in order to reconstruct the spatial part of the CPC. Finally, a filtered time series may be reconstructed from the first 3-10 filtered complex principal components.
Investigation of domain walls in PPLN by confocal raman microscopy and PCA analysis
NASA Astrophysics Data System (ADS)
Shur, Vladimir Ya.; Zelenovskiy, Pavel; Bourson, Patrice
2017-07-01
Confocal Raman microscopy (CRM) is a powerful tool for investigation of ferroelectric domains. Mechanical stresses and electric fields existed in the vicinity of neutral and charged domain walls modify frequency, intensity and width of spectral lines [1], thus allowing to visualize micro- and nanodomain structures both at the surface and in the bulk of the crystal [2,3]. Stresses and fields are naturally coupled in ferroelectrics due to inverse piezoelectric effect and hardly can be separated in Raman spectra. PCA is a powerful statistical method for analysis of large data matrix providing a set of orthogonal variables, called principal components (PCs). PCA is widely used for classification of experimental data, for example, in crystallization experiments, for detection of small amounts of components in solid mixtures etc. [4,5]. In Raman spectroscopy PCA was applied for analysis of phase transitions and provided critical pressure with good accuracy [6]. In the present work we for the first time applied Principal Component Analysis (PCA) method for analysis of Raman spectra measured in periodically poled lithium niobate (PPLN). We found that principal components demonstrate different sensitivity to mechanical stresses and electric fields in the vicinity of the domain walls. This allowed us to separately visualize spatial distribution of fields and electric fields at the surface and in the bulk of PPLN.
Ibrahim, George M; Morgan, Benjamin R; Macdonald, R Loch
2014-03-01
Predictors of outcome after aneurysmal subarachnoid hemorrhage have been determined previously through hypothesis-driven methods that often exclude putative covariates and require a priori knowledge of potential confounders. Here, we apply a data-driven approach, principal component analysis, to identify baseline patient phenotypes that may predict neurological outcomes. Principal component analysis was performed on 120 subjects enrolled in a prospective randomized trial of clazosentan for the prevention of angiographic vasospasm. Correlation matrices were created using a combination of Pearson, polyserial, and polychoric regressions among 46 variables. Scores of significant components (with eigenvalues>1) were included in multivariate logistic regression models with incidence of severe angiographic vasospasm, delayed ischemic neurological deficit, and long-term outcome as outcomes of interest. Sixteen significant principal components accounting for 74.6% of the variance were identified. A single component dominated by the patients' initial hemodynamic status, World Federation of Neurosurgical Societies score, neurological injury, and initial neutrophil/leukocyte counts was significantly associated with poor outcome. Two additional components were associated with angiographic vasospasm, of which one was also associated with delayed ischemic neurological deficit. The first was dominated by the aneurysm-securing procedure, subarachnoid clot clearance, and intracerebral hemorrhage, whereas the second had high contributions from markers of anemia and albumin levels. Principal component analysis, a data-driven approach, identified patient phenotypes that are associated with worse neurological outcomes. Such data reduction methods may provide a better approximation of unique patient phenotypes and may inform clinical care as well as patient recruitment into clinical trials. http://www.clinicaltrials.gov. Unique identifier: NCT00111085.
Visualizing Hyolaryngeal Mechanics in Swallowing Using Dynamic MRI
Pearson, William G.; Zumwalt, Ann C.
2013-01-01
Introduction Coordinates of anatomical landmarks are captured using dynamic MRI to explore whether a proposed two-sling mechanism underlies hyolaryngeal elevation in pharyngeal swallowing. A principal components analysis (PCA) is applied to coordinates to determine the covariant function of the proposed mechanism. Methods Dynamic MRI (dMRI) data were acquired from eleven healthy subjects during a repeated swallows task. Coordinates mapping the proposed mechanism are collected from each dynamic (frame) of a dynamic MRI swallowing series of a randomly selected subject in order to demonstrate shape changes in a single subject. Coordinates representing minimum and maximum hyolaryngeal elevation of all 11 subjects were also mapped to demonstrate shape changes of the system among all subjects. MophoJ software was used to perform PCA and determine vectors of shape change (eigenvectors) for elements of the two-sling mechanism of hyolaryngeal elevation. Results For both single subject and group PCAs, hyolaryngeal elevation accounted for the first principal component of variation. For the single subject PCA, the first principal component accounted for 81.5% of the variance. For the between subjects PCA, the first principal component accounted for 58.5% of the variance. Eigenvectors and shape changes associated with this first principal component are reported. Discussion Eigenvectors indicate that two-muscle slings and associated skeletal elements function as components of a covariant mechanism to elevate the hyolaryngeal complex. Morphological analysis is useful to model shape changes in the two-sling mechanism of hyolaryngeal elevation. PMID:25090608
Support vector machine based classification of fast Fourier transform spectroscopy of proteins
NASA Astrophysics Data System (ADS)
Lazarevic, Aleksandar; Pokrajac, Dragoljub; Marcano, Aristides; Melikechi, Noureddine
2009-02-01
Fast Fourier transform spectroscopy has proved to be a powerful method for study of the secondary structure of proteins since peak positions and their relative amplitude are affected by the number of hydrogen bridges that sustain this secondary structure. However, to our best knowledge, the method has not been used yet for identification of proteins within a complex matrix like a blood sample. The principal reason is the apparent similarity of protein infrared spectra with actual differences usually masked by the solvent contribution and other interactions. In this paper, we propose a novel machine learning based method that uses protein spectra for classification and identification of such proteins within a given sample. The proposed method uses principal component analysis (PCA) to identify most important linear combinations of original spectral components and then employs support vector machine (SVM) classification model applied on such identified combinations to categorize proteins into one of given groups. Our experiments have been performed on the set of four different proteins, namely: Bovine Serum Albumin, Leptin, Insulin-like Growth Factor 2 and Osteopontin. Our proposed method of applying principal component analysis along with support vector machines exhibits excellent classification accuracy when identifying proteins using their infrared spectra.
Color enhancement of landsat agricultural imagery: JPL LACIE image processing support task
NASA Technical Reports Server (NTRS)
Madura, D. P.; Soha, J. M.; Green, W. B.; Wherry, D. B.; Lewis, S. D.
1978-01-01
Color enhancement techniques were applied to LACIE LANDSAT segments to determine if such enhancement can assist analysis in crop identification. The procedure involved increasing the color range by removing correlation between components. First, a principal component transformation was performed, followed by contrast enhancement to equalize component variances, followed by an inverse transformation to restore familiar color relationships. Filtering was applied to lower order components to reduce color speckle in the enhanced products. Use of single acquisition and multiple acquisition statistics to control the enhancement were compared, and the effects of normalization investigated. Evaluation is left to LACIE personnel.
Variability search in M 31 using principal component analysis and the Hubble Source Catalogue
NASA Astrophysics Data System (ADS)
Moretti, M. I.; Hatzidimitriou, D.; Karampelas, A.; Sokolovsky, K. V.; Bonanos, A. Z.; Gavras, P.; Yang, M.
2018-06-01
Principal component analysis (PCA) is being extensively used in Astronomy but not yet exhaustively exploited for variability search. The aim of this work is to investigate the effectiveness of using the PCA as a method to search for variable stars in large photometric data sets. We apply PCA to variability indices computed for light curves of 18 152 stars in three fields in M 31 extracted from the Hubble Source Catalogue. The projection of the data into the principal components is used as a stellar variability detection and classification tool, capable of distinguishing between RR Lyrae stars, long-period variables (LPVs) and non-variables. This projection recovered more than 90 per cent of the known variables and revealed 38 previously unknown variable stars (about 30 per cent more), all LPVs except for one object of uncertain variability type. We conclude that this methodology can indeed successfully identify candidate variable stars.
Jović, Ozren; Smolić, Tomislav; Primožič, Ines; Hrenar, Tomica
2016-04-19
The aim of this study was to investigate the feasibility of FTIR-ATR spectroscopy coupled with the multivariate numerical methodology for qualitative and quantitative analysis of binary and ternary edible oil mixtures. Four pure oils (extra virgin olive oil, high oleic sunflower oil, rapeseed oil, and sunflower oil), as well as their 54 binary and 108 ternary mixtures, were analyzed using FTIR-ATR spectroscopy in combination with principal component and discriminant analysis, partial least-squares, and principal component regression. It was found that the composition of all 166 samples can be excellently represented using only the first three principal components describing 98.29% of total variance in the selected spectral range (3035-2989, 1170-1140, 1120-1100, 1093-1047, and 930-890 cm(-1)). Factor scores in 3D space spanned by these three principal components form a tetrahedral-like arrangement: pure oils being at the vertices, binary mixtures at the edges, and ternary mixtures on the faces of a tetrahedron. To confirm the validity of results, we applied several cross-validation methods. Quantitative analysis was performed by minimization of root-mean-square error of cross-validation values regarding the spectral range, derivative order, and choice of method (partial least-squares or principal component regression), which resulted in excellent predictions for test sets (R(2) > 0.99 in all cases). Additionally, experimentally more demanding gas chromatography analysis of fatty acid content was carried out for all specimens, confirming the results obtained by FTIR-ATR coupled with principal component analysis. However, FTIR-ATR provided a considerably better model for prediction of mixture composition than gas chromatography, especially for high oleic sunflower oil.
Principal component analysis of Raman spectra for TiO2 nanoparticle characterization
NASA Astrophysics Data System (ADS)
Ilie, Alina Georgiana; Scarisoareanu, Monica; Morjan, Ion; Dutu, Elena; Badiceanu, Maria; Mihailescu, Ion
2017-09-01
The Raman spectra of anatase/rutile mixed phases of Sn doped TiO2 nanoparticles and undoped TiO2 nanoparticles, synthesised by laser pyrolysis, with nanocrystallite dimensions varying from 8 to 28 nm, was simultaneously processed with a self-written software that applies Principal Component Analysis (PCA) on the measured spectrum to verify the possibility of objective auto-characterization of nanoparticles from their vibrational modes. The photo-excited process of Raman scattering is very sensible to the material characteristics, especially in the case of nanomaterials, where more properties become relevant for the vibrational behaviour. We used PCA, a statistical procedure that performs eigenvalue decomposition of descriptive data covariance, to automatically analyse the sample's measured Raman spectrum, and to interfere the correlation between nanoparticle dimensions, tin and carbon concentration, and their Principal Component values (PCs). This type of application can allow an approximation of the crystallite size, or tin concentration, only by measuring the Raman spectrum of the sample. The study of loadings of the principal components provides information of the way the vibrational modes are affected by the nanoparticle features and the spectral area relevant for the classification.
Transforming Graph Data for Statistical Relational Learning
2012-10-01
Jordan, 2003), PLSA (Hofmann, 1999), ? Classification via RMN (Taskar et al., 2003) or SVM (Hasan, Chaoji, Salem , & Zaki, 2006) ? Hierarchical...dimensionality reduction methods such as Principal 407 Rossi, McDowell, Aha, & Neville Component Analysis (PCA), Principal Factor Analysis ( PFA ), and...clustering algorithm. Journal of the Royal Statistical Society. Series C, Applied statistics, 28, 100–108. Hasan, M. A., Chaoji, V., Salem , S., & Zaki, M
NASA Astrophysics Data System (ADS)
Xu, Roger; Stevenson, Mark W.; Kwan, Chi-Man; Haynes, Leonard S.
2001-07-01
At Ford Motor Company, thrust bearing in drill motors is often damaged by metal chips. Since the vibration frequency is several Hz only, it is very difficult to use accelerometers to pick up the vibration signals. Under the support of Ford and NASA, we propose to use a piezo film as a sensor to pick up the slow vibrations of the bearing. Then a neural net based fault detection algorithm is applied to differentiate normal bearing from bad bearing. The first step involves a Fast Fourier Transform which essentially extracts the significant frequency components in the sensor. Then Principal Component Analysis is used to further reduce the dimension of the frequency components by extracting the principal features inside the frequency components. The features can then be used to indicate the status of bearing. Experimental results are very encouraging.
Principal Component Analysis of Thermographic Data
NASA Technical Reports Server (NTRS)
Winfree, William P.; Cramer, K. Elliott; Zalameda, Joseph N.; Howell, Patricia A.; Burke, Eric R.
2015-01-01
Principal Component Analysis (PCA) has been shown effective for reducing thermographic NDE data. While a reliable technique for enhancing the visibility of defects in thermal data, PCA can be computationally intense and time consuming when applied to the large data sets typical in thermography. Additionally, PCA can experience problems when very large defects are present (defects that dominate the field-of-view), since the calculation of the eigenvectors is now governed by the presence of the defect, not the "good" material. To increase the processing speed and to minimize the negative effects of large defects, an alternative method of PCA is being pursued where a fixed set of eigenvectors, generated from an analytic model of the thermal response of the material under examination, is used to process the thermal data from composite materials. This method has been applied for characterization of flaws.
NASA Astrophysics Data System (ADS)
Aida, S.; Matsuno, T.; Hasegawa, T.; Tsuji, K.
2017-07-01
Micro X-ray fluorescence (micro-XRF) analysis is repeated as a means of producing elemental maps. In some cases, however, the XRF images of trace elements that are obtained are not clear due to high background intensity. To solve this problem, we applied principal component analysis (PCA) to XRF spectra. We focused on improving the quality of XRF images by applying PCA. XRF images of the dried residue of standard solution on the glass substrate were taken. The XRF intensities for the dried residue were analyzed before and after PCA. Standard deviations of XRF intensities in the PCA-filtered images were improved, leading to clear contrast of the images. This improvement of the XRF images was effective in cases where the XRF intensity was weak.
NASA Astrophysics Data System (ADS)
Farsadnia, Farhad; Ghahreman, Bijan
2016-04-01
Hydrologic homogeneous group identification is considered both fundamental and applied research in hydrology. Clustering methods are among conventional methods to assess the hydrological homogeneous regions. Recently, Self-Organizing feature Map (SOM) method has been applied in some studies. However, the main problem of this method is the interpretation on the output map of this approach. Therefore, SOM is used as input to other clustering algorithms. The aim of this study is to apply a two-level Self-Organizing feature map and Ward hierarchical clustering method to determine the hydrologic homogenous regions in North and Razavi Khorasan provinces. At first by principal component analysis, we reduced SOM input matrix dimension, then the SOM was used to form a two-dimensional features map. To determine homogeneous regions for flood frequency analysis, SOM output nodes were used as input into the Ward method. Generally, the regions identified by the clustering algorithms are not statistically homogeneous. Consequently, they have to be adjusted to improve their homogeneity. After adjustment of the homogeneity regions by L-moment tests, five hydrologic homogeneous regions were identified. Finally, adjusted regions were created by a two-level SOM and then the best regional distribution function and associated parameters were selected by the L-moment approach. The results showed that the combination of self-organizing maps and Ward hierarchical clustering by principal components as input is more effective than the hierarchical method, by principal components or standardized inputs to achieve hydrologic homogeneous regions.
Roopwani, Rahul; Buckner, Ira S
2011-10-14
Principal component analysis (PCA) was applied to pharmaceutical powder compaction. A solid fraction parameter (SF(c/d)) and a mechanical work parameter (W(c/d)) representing irreversible compression behavior were determined as functions of applied load. Multivariate analysis of the compression data was carried out using PCA. The first principal component (PC1) showed loadings for the solid fraction and work values that agreed with changes in the relative significance of plastic deformation to consolidation at different pressures. The PC1 scores showed the same rank order as the relative plasticity ranking derived from the literature for common pharmaceutical materials. The utility of PC1 in understanding deformation was extended to binary mixtures using a subset of the original materials. Combinations of brittle and plastic materials were characterized using the PCA method. The relationships between PC1 scores and the weight fractions of the mixtures were typically linear showing ideal mixing in their deformation behaviors. The mixture consisting of two plastic materials was the only combination to show a consistent positive deviation from ideality. The application of PCA to solid fraction and mechanical work data appears to be an effective means of predicting deformation behavior during compaction of simple powder mixtures. Copyright © 2011 Elsevier B.V. All rights reserved.
Sano, Yuko; Kandori, Akihiko; Shima, Keisuke; Yamaguchi, Yuki; Tsuji, Toshio; Noda, Masafumi; Higashikawa, Fumiko; Yokoe, Masaru; Sakoda, Saburo
2016-06-01
We propose a novel index of Parkinson's disease (PD) finger-tapping severity, called "PDFTsi," for quantifying the severity of symptoms related to the finger tapping of PD patients with high accuracy. To validate the efficacy of PDFTsi, the finger-tapping movements of normal controls and PD patients were measured by using magnetic sensors, and 21 characteristics were extracted from the finger-tapping waveforms. To distinguish motor deterioration due to PD from that due to aging, the aging effect on finger tapping was removed from these characteristics. Principal component analysis (PCA) was applied to the age-normalized characteristics, and principal components that represented the motion properties of finger tapping were calculated. Multiple linear regression (MLR) with stepwise variable selection was applied to the principal components, and PDFTsi was calculated. The calculated PDFTsi indicates that PDFTsi has a high estimation ability, namely a mean square error of 0.45. The estimation ability of PDFTsi is higher than that of the alternative method, MLR with stepwise regression selection without PCA, namely a mean square error of 1.30. This result suggests that PDFTsi can quantify PD finger-tapping severity accurately. Furthermore, the result of interpreting a model for calculating PDFTsi indicated that motion wideness and rhythm disorder are important for estimating PD finger-tapping severity.
Steingass, Christof Björn; Jutzi, Manfred; Müller, Jenny; Carle, Reinhold; Schmarr, Hans-Georg
2015-03-01
Ripening-dependent changes of pineapple volatiles were studied in a nontargeted profiling analysis. Volatiles were isolated via headspace solid phase microextraction and analyzed by comprehensive 2D gas chromatography and mass spectrometry (HS-SPME-GC×GC-qMS). Profile patterns presented in the contour plots were evaluated applying image processing techniques and subsequent multivariate statistical data analysis. Statistical methods comprised unsupervised hierarchical cluster analysis (HCA) and principal component analysis (PCA) to classify the samples. Supervised partial least squares discriminant analysis (PLS-DA) and partial least squares (PLS) regression were applied to discriminate different ripening stages and describe the development of volatiles during postharvest storage, respectively. Hereby, substantial chemical markers allowing for class separation were revealed. The workflow permitted the rapid distinction between premature green-ripe pineapples and postharvest-ripened sea-freighted fruits. Volatile profiles of fully ripe air-freighted pineapples were similar to those of green-ripe fruits postharvest ripened for 6 days after simulated sea freight export, after PCA with only two principal components. However, PCA considering also the third principal component allowed differentiation between air-freighted fruits and the four progressing postharvest maturity stages of sea-freighted pineapples.
Caprihan, A; Pearlson, G D; Calhoun, V D
2008-08-15
Principal component analysis (PCA) is often used to reduce the dimension of data before applying more sophisticated data analysis methods such as non-linear classification algorithms or independent component analysis. This practice is based on selecting components corresponding to the largest eigenvalues. If the ultimate goal is separation of data in two groups, then these set of components need not have the most discriminatory power. We measured the distance between two such populations using Mahalanobis distance and chose the eigenvectors to maximize it, a modified PCA method, which we call the discriminant PCA (DPCA). DPCA was applied to diffusion tensor-based fractional anisotropy images to distinguish age-matched schizophrenia subjects from healthy controls. The performance of the proposed method was evaluated by the one-leave-out method. We show that for this fractional anisotropy data set, the classification error with 60 components was close to the minimum error and that the Mahalanobis distance was twice as large with DPCA, than with PCA. Finally, by masking the discriminant function with the white matter tracts of the Johns Hopkins University atlas, we identified left superior longitudinal fasciculus as the tract which gave the least classification error. In addition, with six optimally chosen tracts the classification error was zero.
Dong, Fengxia; Mitchell, Paul D; Colquhoun, Jed
2015-01-01
Measuring farm sustainability performance is a crucial component for improving agricultural sustainability. While extensive assessments and indicators exist that reflect the different facets of agricultural sustainability, because of the relatively large number of measures and interactions among them, a composite indicator that integrates and aggregates over all variables is particularly useful. This paper describes and empirically evaluates a method for constructing a composite sustainability indicator that individually scores and ranks farm sustainability performance. The method first uses non-negative polychoric principal component analysis to reduce the number of variables, to remove correlation among variables and to transform categorical variables to continuous variables. Next the method applies common-weight data envelope analysis to these principal components to individually score each farm. The method solves weights endogenously and allows identifying important practices in sustainability evaluation. An empirical application to Wisconsin cranberry farms finds heterogeneity in sustainability practice adoption, implying that some farms could adopt relevant practices to improve the overall sustainability performance of the industry. Copyright © 2014 Elsevier Ltd. All rights reserved.
Comparison of AIS Versus TMS Data Collected over the Virginia Piedmont
NASA Technical Reports Server (NTRS)
Bell, R.; Evans, C. S.
1985-01-01
The Airborne Imaging Spectrometer (AIS, NS001 Thematic Mapper Simlulator (TMS), and Zeiss camera collected remotely sensed data simultaneously on October 27, 1983, at an altitude of 6860 meters (22,500 feet). AIS data were collected in 32 channels covering 1200 to 1500 nm. A simple atmospheric correction was applied to the AIS data, after which spectra for four cover types were plotted. Spectra for these ground cover classes showed a telescoping effect for the wavelength endpoints. Principal components were extracted from the shortwave region of the AIS (1200 to 1280 nm), full spectrum AIS (1200 to 1500 nm) and TMS (450 to 12,500 nm) to create three separate three-component color image composites. A comparison of the TMS band 5 (1000 to 1300 nm) to the six principal components from the shortwave AIS region (1200 to 1280 nm) showed improved visual discrimination of ground cover types. Contrast of color image composites created from principal components showed the AIS composites to exhibit a clearer demarcation between certain ground cover types but subtle differences within other regions of the imagery were not as readily seen.
Dihedral angle principal component analysis of molecular dynamics simulations.
Altis, Alexandros; Nguyen, Phuong H; Hegger, Rainer; Stock, Gerhard
2007-06-28
It has recently been suggested by Mu et al. [Proteins 58, 45 (2005)] to use backbone dihedral angles instead of Cartesian coordinates in a principal component analysis of molecular dynamics simulations. Dihedral angles may be advantageous because internal coordinates naturally provide a correct separation of internal and overall motion, which was found to be essential for the construction and interpretation of the free energy landscape of a biomolecule undergoing large structural rearrangements. To account for the circular statistics of angular variables, a transformation from the space of dihedral angles {phi(n)} to the metric coordinate space {x(n)=cos phi(n),y(n)=sin phi(n)} was employed. To study the validity and the applicability of the approach, in this work the theoretical foundations underlying the dihedral angle principal component analysis (dPCA) are discussed. It is shown that the dPCA amounts to a one-to-one representation of the original angle distribution and that its principal components can readily be characterized by the corresponding conformational changes of the peptide. Furthermore, a complex version of the dPCA is introduced, in which N angular variables naturally lead to N eigenvalues and eigenvectors. Applying the methodology to the construction of the free energy landscape of decaalanine from a 300 ns molecular dynamics simulation, a critical comparison of the various methods is given.
Dihedral angle principal component analysis of molecular dynamics simulations
NASA Astrophysics Data System (ADS)
Altis, Alexandros; Nguyen, Phuong H.; Hegger, Rainer; Stock, Gerhard
2007-06-01
It has recently been suggested by Mu et al. [Proteins 58, 45 (2005)] to use backbone dihedral angles instead of Cartesian coordinates in a principal component analysis of molecular dynamics simulations. Dihedral angles may be advantageous because internal coordinates naturally provide a correct separation of internal and overall motion, which was found to be essential for the construction and interpretation of the free energy landscape of a biomolecule undergoing large structural rearrangements. To account for the circular statistics of angular variables, a transformation from the space of dihedral angles {φn} to the metric coordinate space {xn=cosφn,yn=sinφn} was employed. To study the validity and the applicability of the approach, in this work the theoretical foundations underlying the dihedral angle principal component analysis (dPCA) are discussed. It is shown that the dPCA amounts to a one-to-one representation of the original angle distribution and that its principal components can readily be characterized by the corresponding conformational changes of the peptide. Furthermore, a complex version of the dPCA is introduced, in which N angular variables naturally lead to N eigenvalues and eigenvectors. Applying the methodology to the construction of the free energy landscape of decaalanine from a 300ns molecular dynamics simulation, a critical comparison of the various methods is given.
Performance evaluation of PCA-based spike sorting algorithms.
Adamos, Dimitrios A; Kosmidis, Efstratios K; Theophilidis, George
2008-09-01
Deciphering the electrical activity of individual neurons from multi-unit noisy recordings is critical for understanding complex neural systems. A widely used spike sorting algorithm is being evaluated for single-electrode nerve trunk recordings. The algorithm is based on principal component analysis (PCA) for spike feature extraction. In the neuroscience literature it is generally assumed that the use of the first two or most commonly three principal components is sufficient. We estimate the optimum PCA-based feature space by evaluating the algorithm's performance on simulated series of action potentials. A number of modifications are made to the open source nev2lkit software to enable systematic investigation of the parameter space. We introduce a new metric to define clustering error considering over-clustering more favorable than under-clustering as proposed by experimentalists for our data. Both the program patch and the metric are available online. Correlated and white Gaussian noise processes are superimposed to account for biological and artificial jitter in the recordings. We report that the employment of more than three principal components is in general beneficial for all noise cases considered. Finally, we apply our results to experimental data and verify that the sorting process with four principal components is in agreement with a panel of electrophysiology experts.
Ramli, Saifullah; Ismail, Noryati; Alkarkhi, Abbas Fadhl Mubarek; Easa, Azhar Mat
2010-08-01
Banana peel flour (BPF) prepared from green or ripe Cavendish and Dream banana fruits were assessed for their total starch (TS), digestible starch (DS), resistant starch (RS), total dietary fibre (TDF), soluble dietary fibre (SDF) and insoluble dietary fibre (IDF). Principal component analysis (PCA) identified that only 1 component was responsible for 93.74% of the total variance in the starch and dietary fibre components that differentiated ripe and green banana flours. Cluster analysis (CA) applied to similar data obtained two statistically significant clusters (green and ripe bananas) to indicate difference in behaviours according to the stages of ripeness based on starch and dietary fibre components. We concluded that the starch and dietary fibre components could be used to discriminate between flours prepared from peels obtained from fruits of different ripeness. The results were also suggestive of the potential of green and ripe BPF as functional ingredients in food.
Ramli, Saifullah; Ismail, Noryati; Alkarkhi, Abbas Fadhl Mubarek; Easa, Azhar Mat
2010-01-01
Banana peel flour (BPF) prepared from green or ripe Cavendish and Dream banana fruits were assessed for their total starch (TS), digestible starch (DS), resistant starch (RS), total dietary fibre (TDF), soluble dietary fibre (SDF) and insoluble dietary fibre (IDF). Principal component analysis (PCA) identified that only 1 component was responsible for 93.74% of the total variance in the starch and dietary fibre components that differentiated ripe and green banana flours. Cluster analysis (CA) applied to similar data obtained two statistically significant clusters (green and ripe bananas) to indicate difference in behaviours according to the stages of ripeness based on starch and dietary fibre components. We concluded that the starch and dietary fibre components could be used to discriminate between flours prepared from peels obtained from fruits of different ripeness. The results were also suggestive of the potential of green and ripe BPF as functional ingredients in food. PMID:24575193
Study on nondestructive discrimination of genuine and counterfeit wild ginsengs using NIRS
NASA Astrophysics Data System (ADS)
Lu, Q.; Fan, Y.; Peng, Z.; Ding, H.; Gao, H.
2012-07-01
A new approach for the nondestructive discrimination between genuine wild ginsengs and the counterfeit ones by near infrared spectroscopy (NIRS) was developed. Both discriminant analysis and back propagation artificial neural network (BP-ANN) were applied to the model establishment for discrimination. Optimal modeling wavelengths were determined based on the anomalous spectral information of counterfeit samples. Through principal component analysis (PCA) of various wild ginseng samples, genuine and counterfeit, the cumulative percentages of variance of the principal components were obtained, serving as a reference for principal component (PC) factor determination. Discriminant analysis achieved an identification ratio of 88.46%. With sample' truth values as its outputs, a three-layer BP-ANN model was built, which yielded a higher discrimination accuracy of 100%. The overall results sufficiently demonstrate that NIRS combined with BP-ANN classification algorithm performs better on ginseng discrimination than discriminant analysis, and can be used as a rapid and nondestructive method for the detection of counterfeit wild ginsengs in food and pharmaceutical industry.
NASA Astrophysics Data System (ADS)
Lin, Jyh-Woei
2012-09-01
This paper uses Nonlinear Principal Component Analysis (NLPCA) and Principal Component Analysis (PCA) to determine Total Electron Content (TEC) anomalies in the ionosphere for the Nakri Typhoon on 29 May, 2008 (UTC). NLPCA, PCA and image processing are applied to the global ionospheric map (GIM) with transforms conducted for the time period 12:00-14:00 UT on 29 May 2008 when the wind was most intense. Results show that at a height of approximately 150-200 km the TEC anomaly using NLPCA is more localized; however its intensity increases with height and becomes more widespread. The TEC anomalies are not found by PCA. Potential causes of the results are discussed with emphasis given to vertical acoustic gravity waves. The approximate position of the typhoon's eye can be detected if the GIM is divided into fine enough maps with adequate spatial-resolution at GPS-TEC receivers. This implies that the trace of the typhoon in the regional GIM is caught using NLPCA.
Sparse modeling of spatial environmental variables associated with asthma
Chang, Timothy S.; Gangnon, Ronald E.; Page, C. David; Buckingham, William R.; Tandias, Aman; Cowan, Kelly J.; Tomasallo, Carrie D.; Arndt, Brian G.; Hanrahan, Lawrence P.; Guilbert, Theresa W.
2014-01-01
Geographically distributed environmental factors influence the burden of diseases such as asthma. Our objective was to identify sparse environmental variables associated with asthma diagnosis gathered from a large electronic health record (EHR) dataset while controlling for spatial variation. An EHR dataset from the University of Wisconsin’s Family Medicine, Internal Medicine and Pediatrics Departments was obtained for 199,220 patients aged 5–50 years over a three-year period. Each patient’s home address was geocoded to one of 3,456 geographic census block groups. Over one thousand block group variables were obtained from a commercial database. We developed a Sparse Spatial Environmental Analysis (SASEA). Using this method, the environmental variables were first dimensionally reduced with sparse principal component analysis. Logistic thin plate regression spline modeling was then used to identify block group variables associated with asthma from sparse principal components. The addresses of patients from the EHR dataset were distributed throughout the majority of Wisconsin’s geography. Logistic thin plate regression spline modeling captured spatial variation of asthma. Four sparse principal components identified via model selection consisted of food at home, dog ownership, household size, and disposable income variables. In rural areas, dog ownership and renter occupied housing units from significant sparse principal components were associated with asthma. Our main contribution is the incorporation of sparsity in spatial modeling. SASEA sequentially added sparse principal components to Logistic thin plate regression spline modeling. This method allowed association of geographically distributed environmental factors with asthma using EHR and environmental datasets. SASEA can be applied to other diseases with environmental risk factors. PMID:25533437
Sparse modeling of spatial environmental variables associated with asthma.
Chang, Timothy S; Gangnon, Ronald E; David Page, C; Buckingham, William R; Tandias, Aman; Cowan, Kelly J; Tomasallo, Carrie D; Arndt, Brian G; Hanrahan, Lawrence P; Guilbert, Theresa W
2015-02-01
Geographically distributed environmental factors influence the burden of diseases such as asthma. Our objective was to identify sparse environmental variables associated with asthma diagnosis gathered from a large electronic health record (EHR) dataset while controlling for spatial variation. An EHR dataset from the University of Wisconsin's Family Medicine, Internal Medicine and Pediatrics Departments was obtained for 199,220 patients aged 5-50years over a three-year period. Each patient's home address was geocoded to one of 3456 geographic census block groups. Over one thousand block group variables were obtained from a commercial database. We developed a Sparse Spatial Environmental Analysis (SASEA). Using this method, the environmental variables were first dimensionally reduced with sparse principal component analysis. Logistic thin plate regression spline modeling was then used to identify block group variables associated with asthma from sparse principal components. The addresses of patients from the EHR dataset were distributed throughout the majority of Wisconsin's geography. Logistic thin plate regression spline modeling captured spatial variation of asthma. Four sparse principal components identified via model selection consisted of food at home, dog ownership, household size, and disposable income variables. In rural areas, dog ownership and renter occupied housing units from significant sparse principal components were associated with asthma. Our main contribution is the incorporation of sparsity in spatial modeling. SASEA sequentially added sparse principal components to Logistic thin plate regression spline modeling. This method allowed association of geographically distributed environmental factors with asthma using EHR and environmental datasets. SASEA can be applied to other diseases with environmental risk factors. Copyright © 2014 Elsevier Inc. All rights reserved.
Effect of noise in principal component analysis with an application to ozone pollution
NASA Astrophysics Data System (ADS)
Tsakiri, Katerina G.
This thesis analyzes the effect of independent noise in principal components of k normally distributed random variables defined by a covariance matrix. We prove that the principal components as well as the canonical variate pairs determined from joint distribution of original sample affected by noise can be essentially different in comparison with those determined from the original sample. However when the differences between the eigenvalues of the original covariance matrix are sufficiently large compared to the level of the noise, the effect of noise in principal components and canonical variate pairs proved to be negligible. The theoretical results are supported by simulation study and examples. Moreover, we compare our results about the eigenvalues and eigenvectors in the two dimensional case with other models examined before. This theory can be applied in any field for the decomposition of the components in multivariate analysis. One application is the detection and prediction of the main atmospheric factor of ozone concentrations on the example of Albany, New York. Using daily ozone, solar radiation, temperature, wind speed and precipitation data, we determine the main atmospheric factor for the explanation and prediction of ozone concentrations. A methodology is described for the decomposition of the time series of ozone and other atmospheric variables into the global term component which describes the long term trend and the seasonal variations, and the synoptic scale component which describes the short term variations. By using the Canonical Correlation Analysis, we show that solar radiation is the only main factor between the atmospheric variables considered here for the explanation and prediction of the global and synoptic scale component of ozone. The global term components are modeled by a linear regression model, while the synoptic scale components by a vector autoregressive model and the Kalman filter. The coefficient of determination, R2, for the prediction of the synoptic scale ozone component was found to be the highest when we consider the synoptic scale component of the time series for solar radiation and temperature. KEY WORDS: multivariate analysis; principal component; canonical variate pairs; eigenvalue; eigenvector; ozone; solar radiation; spectral decomposition; Kalman filter; time series prediction
NASA Technical Reports Server (NTRS)
Boyd, R. K.; Brumfield, J. O.; Campbell, W. J.
1984-01-01
Three feature extraction methods, canonical analysis (CA), principal component analysis (PCA), and band selection, have been applied to Thematic Mapper Simulator (TMS) data in order to evaluate the relative performance of the methods. The results obtained show that CA is capable of providing a transformation of TMS data which leads to better classification results than provided by all seven bands, by PCA, or by band selection. A second conclusion drawn from the study is that TMS bands 2, 3, 4, and 7 (thermal) are most important for landcover classification.
NASA Astrophysics Data System (ADS)
Schelkanova, Irina; Toronov, Vladislav
2011-07-01
Although near infrared spectroscopy (NIRS) is now widely used both in emerging clinical techniques and in cognitive neuroscience, the development of the apparatuses and signal processing methods for these applications is still a hot research topic. The main unresolved problem in functional NIRS is the separation of functional signals from the contaminations by systemic and local physiological fluctuations. This problem was approached by using various signal processing methods, including blind signal separation techniques. In particular, principal component analysis (PCA) and independent component analysis (ICA) were applied to the data acquired at the same wavelength and at multiple sites on the human or animal heads during functional activation. These signal processing procedures resulted in a number of principal or independent components that could be attributed to functional activity but their physiological meaning remained unknown. On the other hand, the best physiological specificity is provided by broadband NIRS. Also, a comparison with functional magnetic resonance imaging (fMRI) allows determining the spatial origin of fNIRS signals. In this study we applied PCA and ICA to broadband NIRS data to distill the components correlating with the breath hold activation paradigm and compared them with the simultaneously acquired fMRI signals. Breath holding was used because it generates blood carbon dioxide (CO2) which increases the blood-oxygen-level-dependent (BOLD) signal as CO2 acts as a cerebral vasodilator. Vasodilation causes increased cerebral blood flow which washes deoxyhaemoglobin out of the cerebral capillary bed thus increasing both the cerebral blood volume and oxygenation. Although the original signals were quite diverse, we found very few different components which corresponded to fMRI signals at different locations in the brain and to different physiological chromophores.
Rein, Thomas R; Harvati, Katerina; Harrison, Terry
2015-01-01
Uncovering links between skeletal morphology and locomotor behavior is an essential component of paleobiology because it allows researchers to infer the locomotor repertoire of extinct species based on preserved fossils. In this study, we explored ulnar shape in anthropoid primates using 3D geometric morphometrics to discover novel aspects of shape variation that correspond to observed differences in the relative amount of forelimb suspensory locomotion performed by species. The ultimate goal of this research was to construct an accurate predictive model that can be applied to infer the significance of these behaviors. We studied ulnar shape variation in extant species using principal component analysis. Species mainly clustered into phylogenetic groups along the first two principal components. Upon closer examination, the results showed that the position of species within each major clade corresponded closely with the proportion of forelimb suspensory locomotion that they have been observed to perform in nature. We used principal component regression to construct a predictive model for the proportion of these behaviors that would be expected to occur in the locomotor repertoire of anthropoid primates. We then applied this regression analysis to Pliopithecus vindobonensis, a stem catarrhine from the Miocene of central Europe, and found strong evidence that this species was adapted to perform a proportion of forelimb suspensory locomotion similar to that observed in the extant woolly monkey, Lagothrix lagothricha. Copyright © 2014 Elsevier Ltd. All rights reserved.
NASA Technical Reports Server (NTRS)
Feldman, Sandra C.
1987-01-01
Methods of applying principal component (PC) analysis to high resolution remote sensing imagery were examined. Using Airborne Imaging Spectrometer (AIS) data, PC analysis was found to be useful for removing the effects of albedo and noise and for isolating the significant information on argillic alteration, zeolite, and carbonate minerals. An effective technique for using PC analysis using an input the first 16 AIS bands, 7 intermediate bands, and the last 16 AIS bands from the 32 flat field corrected bands between 2048 and 2337 nm. Most of the significant mineralogical information resided in the second PC. PC color composites and density sliced images provided a good mineralogical separation when applied to a AIS data set. Although computer intensive, the advantage of PC analysis is that it employs algorithms which already exist on most image processing systems.
Dynamical Analysis of Stock Market Instability by Cross-correlation Matrix
NASA Astrophysics Data System (ADS)
Takaishi, Tetsuya
2016-08-01
We study stock market instability by using cross-correlations constructed from the return time series of 366 stocks traded on the Tokyo Stock Exchange from January 5, 1998 to December 30, 2013. To investigate the dynamical evolution of the cross-correlations, crosscorrelation matrices are calculated with a rolling window of 400 days. To quantify the volatile market stages where the potential risk is high, we apply the principal components analysis and measure the cumulative risk fraction (CRF), which is the system variance associated with the first few principal components. From the CRF, we detected three volatile market stages corresponding to the bankruptcy of Lehman Brothers, the 2011 Tohoku Region Pacific Coast Earthquake, and the FRB QE3 reduction observation in the study period. We further apply the random matrix theory for the risk analysis and find that the first eigenvector is more equally de-localized when the market is volatile.
Determination of butter adulteration with margarine using Raman spectroscopy.
Uysal, Reyhan Selin; Boyaci, Ismail Hakki; Genis, Hüseyin Efe; Tamer, Ugur
2013-12-15
In this study, adulteration of butter with margarine was analysed using Raman spectroscopy combined with chemometric methods (principal component analysis (PCA), principal component regression (PCR), partial least squares (PLS)) and artificial neural networks (ANNs). Different butter and margarine samples were mixed at various concentrations ranging from 0% to 100% w/w. PCA analysis was applied for the classification of butters, margarines and mixtures. PCR, PLS and ANN were used for the detection of adulteration ratios of butter. Models were created using a calibration data set and developed models were evaluated using a validation data set. The coefficient of determination (R(2)) values between actual and predicted values obtained for PCR, PLS and ANN for the validation data set were 0.968, 0.987 and 0.978, respectively. In conclusion, a combination of Raman spectroscopy with chemometrics and ANN methods can be applied for testing butter adulteration. Copyright © 2013 Elsevier Ltd. All rights reserved.
Akbari, Hamed; Macyszyn, Luke; Da, Xiao; Wolf, Ronald L.; Bilello, Michel; Verma, Ragini; O’Rourke, Donald M.
2014-01-01
Purpose To augment the analysis of dynamic susceptibility contrast material–enhanced magnetic resonance (MR) images to uncover unique tissue characteristics that could potentially facilitate treatment planning through a better understanding of the peritumoral region in patients with glioblastoma. Materials and Methods Institutional review board approval was obtained for this study, with waiver of informed consent for retrospective review of medical records. Dynamic susceptibility contrast-enhanced MR imaging data were obtained for 79 patients, and principal component analysis was applied to the perfusion signal intensity. The first six principal components were sufficient to characterize more than 99% of variance in the temporal dynamics of blood perfusion in all regions of interest. The principal components were subsequently used in conjunction with a support vector machine classifier to create a map of heterogeneity within the peritumoral region, and the variance of this map served as the heterogeneity score. Results The calculated principal components allowed near-perfect separability of tissue that was likely highly infiltrated with tumor and tissue that was unlikely infiltrated with tumor. The heterogeneity map created by using the principal components showed a clear relationship between voxels judged by the support vector machine to be highly infiltrated and subsequent recurrence. The results demonstrated a significant correlation (r = 0.46, P < .0001) between the heterogeneity score and patient survival. The hazard ratio was 2.23 (95% confidence interval: 1.4, 3.6; P < .01) between patients with high and low heterogeneity scores on the basis of the median heterogeneity score. Conclusion Analysis of dynamic susceptibility contrast-enhanced MR imaging data by using principal component analysis can help identify imaging variables that can be subsequently used to evaluate the peritumoral region in glioblastoma. These variables are potentially indicative of tumor infiltration and may become useful tools in guiding therapy, as well as individualized prognostication. © RSNA, 2014 PMID:24955928
Principal component analysis on a torus: Theory and application to protein dynamics.
Sittel, Florian; Filk, Thomas; Stock, Gerhard
2017-12-28
A dimensionality reduction method for high-dimensional circular data is developed, which is based on a principal component analysis (PCA) of data points on a torus. Adopting a geometrical view of PCA, various distance measures on a torus are introduced and the associated problem of projecting data onto the principal subspaces is discussed. The main idea is that the (periodicity-induced) projection error can be minimized by transforming the data such that the maximal gap of the sampling is shifted to the periodic boundary. In a second step, the covariance matrix and its eigendecomposition can be computed in a standard manner. Adopting molecular dynamics simulations of two well-established biomolecular systems (Aib 9 and villin headpiece), the potential of the method to analyze the dynamics of backbone dihedral angles is demonstrated. The new approach allows for a robust and well-defined construction of metastable states and provides low-dimensional reaction coordinates that accurately describe the free energy landscape. Moreover, it offers a direct interpretation of covariances and principal components in terms of the angular variables. Apart from its application to PCA, the method of maximal gap shifting is general and can be applied to any other dimensionality reduction method for circular data.
Principal component analysis on a torus: Theory and application to protein dynamics
NASA Astrophysics Data System (ADS)
Sittel, Florian; Filk, Thomas; Stock, Gerhard
2017-12-01
A dimensionality reduction method for high-dimensional circular data is developed, which is based on a principal component analysis (PCA) of data points on a torus. Adopting a geometrical view of PCA, various distance measures on a torus are introduced and the associated problem of projecting data onto the principal subspaces is discussed. The main idea is that the (periodicity-induced) projection error can be minimized by transforming the data such that the maximal gap of the sampling is shifted to the periodic boundary. In a second step, the covariance matrix and its eigendecomposition can be computed in a standard manner. Adopting molecular dynamics simulations of two well-established biomolecular systems (Aib9 and villin headpiece), the potential of the method to analyze the dynamics of backbone dihedral angles is demonstrated. The new approach allows for a robust and well-defined construction of metastable states and provides low-dimensional reaction coordinates that accurately describe the free energy landscape. Moreover, it offers a direct interpretation of covariances and principal components in terms of the angular variables. Apart from its application to PCA, the method of maximal gap shifting is general and can be applied to any other dimensionality reduction method for circular data.
NASA Astrophysics Data System (ADS)
Pacholski, Michaeleen L.
2004-06-01
Principal component analysis (PCA) has been successfully applied to time-of-flight secondary ion mass spectrometry (TOF-SIMS) spectra, images and depth profiles. Although SIMS spectral data sets can be small (in comparison to datasets typically discussed in literature from other analytical techniques such as gas or liquid chromatography), each spectrum has thousands of ions resulting in what can be a difficult comparison of samples. Analysis of industrially-derived samples means the identity of most surface species are unknown a priori and samples must be analyzed rapidly to satisfy customer demands. PCA enables rapid assessment of spectral differences (or lack there of) between samples and identification of chemically different areas on sample surfaces for images. Depth profile analysis helps define interfaces and identify low-level components in the system.
A stable systemic risk ranking in China's banking sector: Based on principal component analysis
NASA Astrophysics Data System (ADS)
Fang, Libing; Xiao, Binqing; Yu, Honghai; You, Qixing
2018-02-01
In this paper, we compare five popular systemic risk rankings, and apply principal component analysis (PCA) model to provide a stable systemic risk ranking for the Chinese banking sector. Our empirical results indicate that five methods suggest vastly different systemic risk rankings for the same bank, while the combined systemic risk measure based on PCA provides a reliable ranking. Furthermore, according to factor loadings of the first component, PCA combined ranking is mainly based on fundamentals instead of market price data. We clearly find that price-based rankings are not as practical a method as fundamentals-based ones. This PCA combined ranking directly shows systemic risk contributions of each bank for banking supervision purpose and reminds banks to prevent and cope with the financial crisis in advance.
Using Structural Equation Modeling To Fit Models Incorporating Principal Components.
ERIC Educational Resources Information Center
Dolan, Conor; Bechger, Timo; Molenaar, Peter
1999-01-01
Considers models incorporating principal components from the perspectives of structural-equation modeling. These models include the following: (1) the principal-component analysis of patterned matrices; (2) multiple analysis of variance based on principal components; and (3) multigroup principal-components analysis. Discusses fitting these models…
Stashenko, Elena E; Martínez, Jairo R; Ruíz, Carlos A; Arias, Ginna; Durán, Camilo; Salgar, William; Cala, Mónica
2010-01-01
Chromatographic (GC/flame ionization detection, GC/MS) and statistical analyses were applied to the study of essential oils and extracts obtained from flowers, leaves, and stems of Lippia origanoides plants, growing wild in different Colombian regions. Retention indices, mass spectra, and standard substances were used in the identification of 139 substances detected in these essential oils and extracts. Principal component analysis allowed L. origanoides classification into three chemotypes, characterized according to their essential oil major components. Alpha- and beta-phellandrenes, p-cymene, and limonene distinguished chemotype A; carvacrol and thymol were the distinctive major components of chemotypes B and C, respectively. Pinocembrin (5,7-dihydroxyflavanone) was found in L. origanoides chemotype A supercritical fluid (CO(2)) extract at a concentration of 0.83+/-0.03 mg/g of dry plant material, which makes this plant an interesting source of an important bioactive flavanone with diverse potential applications in cosmetic, food, and pharmaceutical products.
Ferrero, Alejandro; Rabal, Ana María; Campos, Joaquín; Pons, Alicia; Hernanz, María Luisa
2012-06-01
A type of representation of the spectral bidirectional reflectance distribution function (BRDF) is proposed that distinctly separates the spectral variable (wavelength) from the geometrical variables (spherical coordinates of the irradiation and viewing directions). Principal components analysis (PCA) is used in order to decompose the spectral BRDF in decorrelated spectral components, and the weight that they have at every geometrical configuration of irradiation/viewing is established. This method was applied to the spectral BRDF measurement of a special effect pigment sample, and four principal components with relevant variance were identified. These four components are enough to reproduce the great diversity of spectral reflectances observed at different geometrical configurations. Since this representation is able to separate spectral and geometrical variables, it facilitates the interpretation of the color variation of special effect pigments coatings versus the geometrical configuration of irradiation/viewing.
Salvatore, Stefania; Røislien, Jo; Baz-Lomba, Jose A; Bramness, Jørgen G
2017-03-01
Wastewater-based epidemiology is an alternative method for estimating the collective drug use in a community. We applied functional data analysis, a statistical framework developed for analysing curve data, to investigate weekly temporal patterns in wastewater measurements of three prescription drugs with known abuse potential: methadone, oxazepam and methylphenidate, comparing them to positive and negative control drugs. Sewage samples were collected in February 2014 from a wastewater treatment plant in Oslo, Norway. The weekly pattern of each drug was extracted by fitting of generalized additive models, using trigonometric functions to model the cyclic behaviour. From the weekly component, the main temporal features were then extracted using functional principal component analysis. Results are presented through the functional principal components (FPCs) and corresponding FPC scores. Clinically, the most important weekly feature of the wastewater-based epidemiology data was the second FPC, representing the difference between average midweek level and a peak during the weekend, representing possible recreational use of a drug in the weekend. Estimated scores on this FPC indicated recreational use of methylphenidate, with a high weekend peak, but not for methadone and oxazepam. The functional principal component analysis uncovered clinically important temporal features of the weekly patterns of the use of prescription drugs detected from wastewater analysis. This may be used as a post-marketing surveillance method to monitor prescription drugs with abuse potential. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.
Espeland, Mark A; Bray, George A; Neiberg, Rebecca; Rejeski, W Jack; Knowler, William C; Lang, Wei; Cheskin, Lawrence J; Williamson, Don; Lewis, C Beth; Wing, Rena
2009-10-01
To demonstrate how principal components analysis can be used to describe patterns of weight changes in response to an intensive lifestyle intervention. Principal components analysis was applied to monthly percent weight changes measured on 2,485 individuals enrolled in the lifestyle arm of the Action for Health in Diabetes (Look AHEAD) clinical trial. These individuals were 45 to 75 years of age, with type 2 diabetes and body mass indices greater than 25 kg/m(2). Associations between baseline characteristics and weight loss patterns were described using analyses of variance. Three components collectively accounted for 97.0% of total intrasubject variance: a gradually decelerating weight loss (88.8%), early versus late weight loss (6.6%), and a mid-year trough (1.6%). In agreement with previous reports, each of the baseline characteristics we examined had statistically significant relationships with weight loss patterns. As examples, males tended to have a steeper trajectory of percent weight loss and to lose weight more quickly than women. Individuals with higher hemoglobin A(1c) (glycosylated hemoglobin; HbA(1c)) tended to have a flatter trajectory of percent weight loss and to have mid-year troughs in weight loss compared to those with lower HbA(1c). Principal components analysis provided a coherent description of characteristic patterns of weight changes and is a useful vehicle for identifying their correlates and potentially for predicting weight control outcomes.
Differentiation of tea varieties using UV-Vis spectra and pattern recognition techniques
NASA Astrophysics Data System (ADS)
Palacios-Morillo, Ana; Alcázar, Ángela.; de Pablos, Fernando; Jurado, José Marcos
2013-02-01
Tea, one of the most consumed beverages all over the world, is of great importance in the economies of a number of countries. Several methods have been developed to classify tea varieties or origins based in pattern recognition techniques applied to chemical data, such as metal profile, amino acids, catechins and volatile compounds. Some of these analytical methods become tedious and expensive to be applied in routine works. The use of UV-Vis spectral data as discriminant variables, highly influenced by the chemical composition, can be an alternative to these methods. UV-Vis spectra of methanol-water extracts of tea have been obtained in the interval 250-800 nm. Absorbances have been used as input variables. Principal component analysis was used to reduce the number of variables and several pattern recognition methods, such as linear discriminant analysis, support vector machines and artificial neural networks, have been applied in order to differentiate the most common tea varieties. A successful classification model was built by combining principal component analysis and multilayer perceptron artificial neural networks, allowing the differentiation between tea varieties. This rapid and simple methodology can be applied to solve classification problems in food industry saving economic resources.
Zuendorf, Gerhard; Kerrouche, Nacer; Herholz, Karl; Baron, Jean-Claude
2003-01-01
Principal component analysis (PCA) is a well-known technique for reduction of dimensionality of functional imaging data. PCA can be looked at as the projection of the original images onto a new orthogonal coordinate system with lower dimensions. The new axes explain the variance in the images in decreasing order of importance, showing correlations between brain regions. We used an efficient, stable and analytical method to work out the PCA of Positron Emission Tomography (PET) images of 74 normal subjects using [(18)F]fluoro-2-deoxy-D-glucose (FDG) as a tracer. Principal components (PCs) and their relation to age effects were investigated. Correlations between the projections of the images on the new axes and the age of the subjects were carried out. The first two PCs could be identified as being the only PCs significantly correlated to age. The first principal component, which explained 10% of the data set variance, was reduced only in subjects of age 55 or older and was related to loss of signal in and adjacent to ventricles and basal cisterns, reflecting expected age-related brain atrophy with enlarging CSF spaces. The second principal component, which accounted for 8% of the total variance, had high loadings from prefrontal, posterior parietal and posterior cingulate cortices and showed the strongest correlation with age (r = -0.56), entirely consistent with previously documented age-related declines in brain glucose utilization. Thus, our method showed that the effect of aging on brain metabolism has at least two independent dimensions. This method should have widespread applications in multivariate analysis of brain functional images. Copyright 2002 Wiley-Liss, Inc.
Asymptotics of empirical eigenstructure for high dimensional spiked covariance.
Wang, Weichen; Fan, Jianqing
2017-06-01
We derive the asymptotic distributions of the spiked eigenvalues and eigenvectors under a generalized and unified asymptotic regime, which takes into account the magnitude of spiked eigenvalues, sample size, and dimensionality. This regime allows high dimensionality and diverging eigenvalues and provides new insights into the roles that the leading eigenvalues, sample size, and dimensionality play in principal component analysis. Our results are a natural extension of those in Paul (2007) to a more general setting and solve the rates of convergence problems in Shen et al. (2013). They also reveal the biases of estimating leading eigenvalues and eigenvectors by using principal component analysis, and lead to a new covariance estimator for the approximate factor model, called shrinkage principal orthogonal complement thresholding (S-POET), that corrects the biases. Our results are successfully applied to outstanding problems in estimation of risks of large portfolios and false discovery proportions for dependent test statistics and are illustrated by simulation studies.
Asymptotics of empirical eigenstructure for high dimensional spiked covariance
Wang, Weichen
2017-01-01
We derive the asymptotic distributions of the spiked eigenvalues and eigenvectors under a generalized and unified asymptotic regime, which takes into account the magnitude of spiked eigenvalues, sample size, and dimensionality. This regime allows high dimensionality and diverging eigenvalues and provides new insights into the roles that the leading eigenvalues, sample size, and dimensionality play in principal component analysis. Our results are a natural extension of those in Paul (2007) to a more general setting and solve the rates of convergence problems in Shen et al. (2013). They also reveal the biases of estimating leading eigenvalues and eigenvectors by using principal component analysis, and lead to a new covariance estimator for the approximate factor model, called shrinkage principal orthogonal complement thresholding (S-POET), that corrects the biases. Our results are successfully applied to outstanding problems in estimation of risks of large portfolios and false discovery proportions for dependent test statistics and are illustrated by simulation studies. PMID:28835726
NASA Astrophysics Data System (ADS)
LIN, JYH-WOEI
2012-08-01
Principal Component Analysis (PCA) and image processing are used to determine Total Electron Content (TEC) anomalies in the F-layer of the ionosphere relating to Typhoon Nakri for 29 May, 2008 (UTC). PCA and image processing are applied to the global ionospheric map (GIM) with transforms conducted for the time period 12:00-14:00 UT on 29 May, 2008 when the wind was most intense. Results show that at a height of approximately 150-200 km the TEC anomaly is highly localized; however, it becomes more intense and widespread with height. Potential causes of these results are discussed with emphasis given to acoustic gravity waves caused by wind force.
Iris recognition based on robust principal component analysis
NASA Astrophysics Data System (ADS)
Karn, Pradeep; He, Xiao Hai; Yang, Shuai; Wu, Xiao Hong
2014-11-01
Iris images acquired under different conditions often suffer from blur, occlusion due to eyelids and eyelashes, specular reflection, and other artifacts. Existing iris recognition systems do not perform well on these types of images. To overcome these problems, we propose an iris recognition method based on robust principal component analysis. The proposed method decomposes all training images into a low-rank matrix and a sparse error matrix, where the low-rank matrix is used for feature extraction. The sparsity concentration index approach is then applied to validate the recognition result. Experimental results using CASIA V4 and IIT Delhi V1iris image databases showed that the proposed method achieved competitive performances in both recognition accuracy and computational efficiency.
Colniță, Alia; Dina, Nicoleta Elena; Leopold, Nicolae; Vodnar, Dan Cristian; Bogdan, Diana; Porav, Sebastian Alin; David, Leontin
2017-09-01
Raman scattering and its particular effect, surface-enhanced Raman scattering (SERS), are whole-organism fingerprinting spectroscopic techniques that gain more and more popularity in bacterial detection. In this work, two relevant Gram-positive bacteria species, Lactobacillus casei ( L. casei ) and Listeria monocytogenes ( L. monocytogenes ) were characterized based on their Raman and SERS spectral fingerprints. The SERS spectra were used to identify the biochemical structures of the bacterial cell wall. Two synthesis methods of the SERS-active nanomaterials were used and the recorded spectra were analyzed. L. casei and L. monocytogenes were successfully discriminated by applying Principal Component Analysis (PCA) to their specific spectral data.
Leopold, Nicolae; Vodnar, Dan Cristian; Bogdan, Diana; Porav, Sebastian Alin; David, Leontin
2017-01-01
Raman scattering and its particular effect, surface-enhanced Raman scattering (SERS), are whole-organism fingerprinting spectroscopic techniques that gain more and more popularity in bacterial detection. In this work, two relevant Gram-positive bacteria species, Lactobacillus casei (L. casei) and Listeria monocytogenes (L. monocytogenes) were characterized based on their Raman and SERS spectral fingerprints. The SERS spectra were used to identify the biochemical structures of the bacterial cell wall. Two synthesis methods of the SERS-active nanomaterials were used and the recorded spectra were analyzed. L. casei and L. monocytogenes were successfully discriminated by applying Principal Component Analysis (PCA) to their specific spectral data. PMID:28862655
Principal shapes and squeezed limits in the effective field theory of large scale structure
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bertolini, Daniele; Solon, Mikhail P., E-mail: dbertolini@lbl.gov, E-mail: mpsolon@lbl.gov
2016-11-01
We apply an orthogonalization procedure on the effective field theory of large scale structure (EFT of LSS) shapes, relevant for the angle-averaged bispectrum and non-Gaussian covariance of the matter power spectrum at one loop. Assuming natural-sized EFT parameters, this identifies a linear combination of EFT shapes—referred to as the principal shape—that gives the dominant contribution for the whole kinematic plane, with subdominant combinations suppressed by a few orders of magnitude. For the covariance, our orthogonal transformation is in excellent agreement with a principal component analysis applied to available data. Additionally we find that, for both observables, the coefficients of themore » principal shapes are well approximated by the EFT coefficients appearing in the squeezed limit, and are thus measurable from power spectrum response functions. Employing data from N-body simulations for the growth-only response, we measure the single EFT coefficient describing the angle-averaged bispectrum with Ο (10%) precision. These methods of shape orthogonalization and measurement of coefficients from response functions are valuable tools for developing the EFT of LSS framework, and can be applied to more general observables.« less
Grimbergen, M C M; van Swol, C F P; Kendall, C; Verdaasdonk, R M; Stone, N; Bosch, J L H R
2010-01-01
The overall quality of Raman spectra in the near-infrared region, where biological samples are often studied, has benefited from various improvements to optical instrumentation over the past decade. However, obtaining ample spectral quality for analysis is still challenging due to device requirements and short integration times required for (in vivo) clinical applications of Raman spectroscopy. Multivariate analytical methods, such as principal component analysis (PCA) and linear discriminant analysis (LDA), are routinely applied to Raman spectral datasets to develop classification models. Data compression is necessary prior to discriminant analysis to prevent or decrease the degree of over-fitting. The logical threshold for the selection of principal components (PCs) to be used in discriminant analysis is likely to be at a point before the PCs begin to introduce equivalent signal and noise and, hence, include no additional value. Assessment of the signal-to-noise ratio (SNR) at a certain peak or over a specific spectral region will depend on the sample measured. Therefore, the mean SNR over the whole spectral region (SNR(msr)) is determined in the original spectrum as well as for spectra reconstructed from an increasing number of principal components. This paper introduces a method of assessing the influence of signal and noise from individual PC loads and indicates a method of selection of PCs for LDA. To evaluate this method, two data sets with different SNRs were used. The sets were obtained with the same Raman system and the same measurement parameters on bladder tissue collected during white light cystoscopy (set A) and fluorescence-guided cystoscopy (set B). This method shows that the mean SNR over the spectral range in the original Raman spectra of these two data sets is related to the signal and noise contribution of principal component loads. The difference in mean SNR over the spectral range can also be appreciated since fewer principal components can reliably be used in the low SNR data set (set B) compared to the high SNR data set (set A). Despite the fact that no definitive threshold could be found, this method may help to determine the cutoff for the number of principal components used in discriminant analysis. Future analysis of a selection of spectral databases using this technique will allow optimum thresholds to be selected for different applications and spectral data quality levels.
Principal component analysis for designed experiments.
Konishi, Tomokazu
2015-01-01
Principal component analysis is used to summarize matrix data, such as found in transcriptome, proteome or metabolome and medical examinations, into fewer dimensions by fitting the matrix to orthogonal axes. Although this methodology is frequently used in multivariate analyses, it has disadvantages when applied to experimental data. First, the identified principal components have poor generality; since the size and directions of the components are dependent on the particular data set, the components are valid only within the data set. Second, the method is sensitive to experimental noise and bias between sample groups. It cannot reflect the experimental design that is planned to manage the noise and bias; rather, it estimates the same weight and independence to all the samples in the matrix. Third, the resulting components are often difficult to interpret. To address these issues, several options were introduced to the methodology. First, the principal axes were identified using training data sets and shared across experiments. These training data reflect the design of experiments, and their preparation allows noise to be reduced and group bias to be removed. Second, the center of the rotation was determined in accordance with the experimental design. Third, the resulting components were scaled to unify their size unit. The effects of these options were observed in microarray experiments, and showed an improvement in the separation of groups and robustness to noise. The range of scaled scores was unaffected by the number of items. Additionally, unknown samples were appropriately classified using pre-arranged axes. Furthermore, these axes well reflected the characteristics of groups in the experiments. As was observed, the scaling of the components and sharing of axes enabled comparisons of the components beyond experiments. The use of training data reduced the effects of noise and bias in the data, facilitating the physical interpretation of the principal axes. Together, these introduced options result in improved generality and objectivity of the analytical results. The methodology has thus become more like a set of multiple regression analyses that find independent models that specify each of the axes.
NASA Astrophysics Data System (ADS)
Raju, B. S.; Sekhar, U. Chandra; Drakshayani, D. N.
2017-08-01
The paper investigates optimization of stereolithography process for SL5530 epoxy resin material to enhance part quality. The major characteristics indexed for performance selected to evaluate the processes are tensile strength, Flexural strength, Impact strength and Density analysis and corresponding process parameters are Layer thickness, Orientation and Hatch spacing. In this study, the process is intrinsically with multiple parameters tuning so that grey relational analysis which uses grey relational grade as performance index is specially adopted to determine the optimal combination of process parameters. Moreover, the principal component analysis is applied to evaluate the weighting values corresponding to various performance characteristics so that their relative importance can be properly and objectively desired. The results of confirmation experiments reveal that grey relational analysis coupled with principal component analysis can effectively acquire the optimal combination of process parameters. Hence, this confirm that the proposed approach in this study can be an useful tool to improve the process parameters in stereolithography process, which is very useful information for machine designers as well as RP machine users.
Białek, A; Białek, M; Lepionka, T; Kaszperuk, K; Banaszkiewicz, T; Tokarz, A
2018-04-23
The aim of this study was to determine whether diet modification with different doses of grapeseed oil or pomegranate seed oil will improve the nutritive value of poultry meat in terms of n-3 and n-6 fatty acids, as well as rumenic acid (cis-9, trans-11 conjugated linoleic acid) content in tissues diversified in lipid composition and roles in lipid metabolism. To evaluate the influence of applied diet modification comprehensively, two chemometric methods were used. Results of cluster analysis demonstrated that pomegranate seed oil modifies fatty acids profile in the most potent way, mainly by an increase in rumenic acid content. Principal component analysis showed that regardless of type of tissue first principal component is strongly associated with type of deposited fatty acid, while second principal component enables identification of place of deposition-type of tissue. Pomegranate seed oil seems to be a valuable feed additive in chickens' feeding. © 2018 Blackwell Verlag GmbH.
Mao, Zhi-Hua; Yin, Jian-Hua; Zhang, Xue-Xi; Wang, Xiao; Xia, Yang
2016-01-01
Fourier transform infrared spectroscopic imaging (FTIRI) technique can be used to obtain the quantitative information of content and spatial distribution of principal components in cartilage by combining with chemometrics methods. In this study, FTIRI combining with principal component analysis (PCA) and Fisher’s discriminant analysis (FDA) was applied to identify the healthy and osteoarthritic (OA) articular cartilage samples. Ten 10-μm thick sections of canine cartilages were imaged at 6.25μm/pixel in FTIRI. The infrared spectra extracted from the FTIR images were imported into SPSS software for PCA and FDA. Based on the PCA result of 2 principal components, the healthy and OA cartilage samples were effectively discriminated by the FDA with high accuracy of 94% for the initial samples (training set) and cross validation, as well as 86.67% for the prediction group. The study showed that cartilage degeneration became gradually weak with the increase of the depth. FTIRI combined with chemometrics may become an effective method for distinguishing healthy and OA cartilages in future. PMID:26977354
Yin, Yihang; Liu, Fengzheng; Zhou, Xiang; Li, Quanzhong
2015-08-07
Wireless sensor networks (WSNs) have been widely used to monitor the environment, and sensors in WSNs are usually power constrained. Because inner-node communication consumes most of the power, efficient data compression schemes are needed to reduce the data transmission to prolong the lifetime of WSNs. In this paper, we propose an efficient data compression model to aggregate data, which is based on spatial clustering and principal component analysis (PCA). First, sensors with a strong temporal-spatial correlation are grouped into one cluster for further processing with a novel similarity measure metric. Next, sensor data in one cluster are aggregated in the cluster head sensor node, and an efficient adaptive strategy is proposed for the selection of the cluster head to conserve energy. Finally, the proposed model applies principal component analysis with an error bound guarantee to compress the data and retain the definite variance at the same time. Computer simulations show that the proposed model can greatly reduce communication and obtain a lower mean square error than other PCA-based algorithms.
Development of a glottal area index that integrates glottal gap size and open quotient
Chen, Gang; Kreiman, Jody; Gerratt, Bruce R.; Neubauer, Juergen; Shue, Yen-Liang; Alwan, Abeer
2013-01-01
Because voice signals result from vocal fold vibration, perceptually meaningful vibratory measures should quantify those aspects of vibration that correspond to differences in voice quality. In this study, glottal area waveforms were extracted from high-speed videoendoscopy of the vocal folds. Principal component analysis was applied to these waveforms to investigate the factors that vary with voice quality. Results showed that the first principal component derived from tokens without glottal gaps was significantly (p < 0.01) associated with the open quotient (OQ). The alternating-current (AC) measure had a significant effect (p < 0.01) on the first principal component among tokens exhibiting glottal gaps. A measure AC/OQ, defined as the ratio of AC to OQ, was proposed to combine both amplitude and temporal characteristics of the glottal area waveform for both complete and incomplete glottal closures. Analyses of “glide” phonations in which quality varied continuously from breathy to pressed showed that the AC/OQ measure was able to characterize the corresponding continuum of glottal area waveform variation, regardless of the presence or absence of glottal gaps. PMID:23464035
Corrected confidence bands for functional data using principal components.
Goldsmith, J; Greven, S; Crainiceanu, C
2013-03-01
Functional principal components (FPC) analysis is widely used to decompose and express functional observations. Curve estimates implicitly condition on basis functions and other quantities derived from FPC decompositions; however these objects are unknown in practice. In this article, we propose a method for obtaining correct curve estimates by accounting for uncertainty in FPC decompositions. Additionally, pointwise and simultaneous confidence intervals that account for both model- and decomposition-based variability are constructed. Standard mixed model representations of functional expansions are used to construct curve estimates and variances conditional on a specific decomposition. Iterated expectation and variance formulas combine model-based conditional estimates across the distribution of decompositions. A bootstrap procedure is implemented to understand the uncertainty in principal component decomposition quantities. Our method compares favorably to competing approaches in simulation studies that include both densely and sparsely observed functions. We apply our method to sparse observations of CD4 cell counts and to dense white-matter tract profiles. Code for the analyses and simulations is publicly available, and our method is implemented in the R package refund on CRAN. Copyright © 2013, The International Biometric Society.
Corrected Confidence Bands for Functional Data Using Principal Components
Goldsmith, J.; Greven, S.; Crainiceanu, C.
2014-01-01
Functional principal components (FPC) analysis is widely used to decompose and express functional observations. Curve estimates implicitly condition on basis functions and other quantities derived from FPC decompositions; however these objects are unknown in practice. In this article, we propose a method for obtaining correct curve estimates by accounting for uncertainty in FPC decompositions. Additionally, pointwise and simultaneous confidence intervals that account for both model- and decomposition-based variability are constructed. Standard mixed model representations of functional expansions are used to construct curve estimates and variances conditional on a specific decomposition. Iterated expectation and variance formulas combine model-based conditional estimates across the distribution of decompositions. A bootstrap procedure is implemented to understand the uncertainty in principal component decomposition quantities. Our method compares favorably to competing approaches in simulation studies that include both densely and sparsely observed functions. We apply our method to sparse observations of CD4 cell counts and to dense white-matter tract profiles. Code for the analyses and simulations is publicly available, and our method is implemented in the R package refund on CRAN. PMID:23003003
Classification of fMRI resting-state maps using machine learning techniques: A comparative study
NASA Astrophysics Data System (ADS)
Gallos, Ioannis; Siettos, Constantinos
2017-11-01
We compare the efficiency of Principal Component Analysis (PCA) and nonlinear learning manifold algorithms (ISOMAP and Diffusion maps) for classifying brain maps between groups of schizophrenia patients and healthy from fMRI scans during a resting-state experiment. After a standard pre-processing pipeline, we applied spatial Independent component analysis (ICA) to reduce (a) noise and (b) spatial-temporal dimensionality of fMRI maps. On the cross-correlation matrix of the ICA components, we applied PCA, ISOMAP and Diffusion Maps to find an embedded low-dimensional space. Finally, support-vector-machines (SVM) and k-NN algorithms were used to evaluate the performance of the algorithms in classifying between the two groups.
NASA Astrophysics Data System (ADS)
Kopparla, P.; Natraj, V.; Shia, R. L.; Spurr, R. J. D.; Crisp, D.; Yung, Y. L.
2015-12-01
Radiative transfer (RT) computations form the engine of atmospheric retrieval codes. However, full treatment of RT processes is computationally expensive, prompting usage of two-stream approximations in current exoplanetary atmospheric retrieval codes [Line et al., 2013]. Natraj et al. [2005, 2010] and Spurr and Natraj [2013] demonstrated the ability of a technique using principal component analysis (PCA) to speed up RT computations. In the PCA method for RT performance enhancement, empirical orthogonal functions are developed for binned sets of inherent optical properties that possess some redundancy; costly multiple-scattering RT calculations are only done for those few optical states corresponding to the most important principal components, and correction factors are applied to approximate radiation fields. Kopparla et al. [2015, in preparation] extended the PCA method to a broadband spectral region from the ultraviolet to the shortwave infrared (0.3-3 micron), accounting for major gas absorptions in this region. Here, we apply the PCA method to a some typical (exo-)planetary retrieval problems. Comparisons between the new model, called Universal Principal Component Analysis Radiative Transfer (UPCART) model, two-stream models and line-by-line RT models are performed, for spectral radiances, spectral fluxes and broadband fluxes. Each of these are calculated at the top of the atmosphere for several scenarios with varying aerosol types, extinction and scattering optical depth profiles, and stellar and viewing geometries. We demonstrate that very accurate radiance and flux estimates can be obtained, with better than 1% accuracy in all spectral regions and better than 0.1% in most cases, as compared to a numerically exact line-by-line RT model. The accuracy is enhanced when the results are convolved to typical instrument resolutions. The operational speed and accuracy of UPCART can be further improved by optimizing binning schemes and parallelizing the codes, work on which is under way.
Madeo, Andrea; Piras, Paolo; Re, Federica; Gabriele, Stefano; Nardinocchi, Paola; Teresi, Luciano; Torromeo, Concetta; Chialastri, Claudia; Schiariti, Michele; Giura, Geltrude; Evangelista, Antonietta; Dominici, Tania; Varano, Valerio; Zachara, Elisabetta; Puddu, Paolo Emilio
2015-01-01
The assessment of left ventricular shape changes during cardiac revolution may be a new step in clinical cardiology to ease early diagnosis and treatment. To quantify these changes, only point registration was adopted and neither Generalized Procrustes Analysis nor Principal Component Analysis were applied as we did previously to study a group of healthy subjects. Here, we extend to patients affected by hypertrophic cardiomyopathy the original approach and preliminarily include genotype positive/phenotype negative individuals to explore the potential that incumbent pathology might also be detected. Using 3D Speckle Tracking Echocardiography, we recorded left ventricular shape of 48 healthy subjects, 24 patients affected by hypertrophic cardiomyopathy and 3 genotype positive/phenotype negative individuals. We then applied Generalized Procrustes Analysis and Principal Component Analysis and inter-individual differences were cleaned by Parallel Transport performed on the tangent space, along the horizontal geodesic, between the per-subject consensuses and the grand mean. Endocardial and epicardial layers were evaluated separately, different from many ecocardiographic applications. Under a common Principal Component Analysis, we then evaluated left ventricle morphological changes (at both layers) explained by first Principal Component scores. Trajectories’ shape and orientation were investigated and contrasted. Logistic regression and Receiver Operating Characteristic curves were used to compare these morphometric indicators with traditional 3D Speckle Tracking Echocardiography global parameters. Geometric morphometrics indicators performed better than 3D Speckle Tracking Echocardiography global parameters in recognizing pathology both in systole and diastole. Genotype positive/phenotype negative individuals clustered with patients affected by hypertrophic cardiomyopathy during diastole, suggesting that incumbent pathology may indeed be foreseen by these methods. Left ventricle deformation in patients affected by hypertrophic cardiomyopathy compared to healthy subjects may be assessed by modern shape analysis better than by traditional 3D Speckle Tracking Echocardiography global parameters. Hypertrophic cardiomyopathy pathophysiology was unveiled in a new manner whereby also diastolic phase abnormalities are evident which is more difficult to investigate by traditional ecocardiographic techniques. PMID:25875818
Groundwater quality assessment of urban Bengaluru using multivariate statistical techniques
NASA Astrophysics Data System (ADS)
Gulgundi, Mohammad Shahid; Shetty, Amba
2018-03-01
Groundwater quality deterioration due to anthropogenic activities has become a subject of prime concern. The objective of the study was to assess the spatial and temporal variations in groundwater quality and to identify the sources in the western half of the Bengaluru city using multivariate statistical techniques. Water quality index rating was calculated for pre and post monsoon seasons to quantify overall water quality for human consumption. The post-monsoon samples show signs of poor quality in drinking purpose compared to pre-monsoon. Cluster analysis (CA), principal component analysis (PCA) and discriminant analysis (DA) were applied to the groundwater quality data measured on 14 parameters from 67 sites distributed across the city. Hierarchical cluster analysis (CA) grouped the 67 sampling stations into two groups, cluster 1 having high pollution and cluster 2 having lesser pollution. Discriminant analysis (DA) was applied to delineate the most meaningful parameters accounting for temporal and spatial variations in groundwater quality of the study area. Temporal DA identified pH as the most important parameter, which discriminates between water quality in the pre-monsoon and post-monsoon seasons and accounts for 72% seasonal assignation of cases. Spatial DA identified Mg, Cl and NO3 as the three most important parameters discriminating between two clusters and accounting for 89% spatial assignation of cases. Principal component analysis was applied to the dataset obtained from the two clusters, which evolved three factors in each cluster, explaining 85.4 and 84% of the total variance, respectively. Varifactors obtained from principal component analysis showed that groundwater quality variation is mainly explained by dissolution of minerals from rock water interactions in the aquifer, effect of anthropogenic activities and ion exchange processes in water.
Mapping brain activity in gradient-echo functional MRI using principal component analysis
NASA Astrophysics Data System (ADS)
Khosla, Deepak; Singh, Manbir; Don, Manuel
1997-05-01
The detection of sites of brain activation in functional MRI has been a topic of immense research interest and many technique shave been proposed to this end. Recently, principal component analysis (PCA) has been applied to extract the activated regions and their time course of activation. This method is based on the assumption that the activation is orthogonal to other signal variations such as brain motion, physiological oscillations and other uncorrelated noises. A distinct advantage of this method is that it does not require any knowledge of the time course of the true stimulus paradigm. This technique is well suited to EPI image sequences where the sampling rate is high enough to capture the effects of physiological oscillations. In this work, we propose and apply tow methods that are based on PCA to conventional gradient-echo images and investigate their usefulness as tools to extract reliable information on brain activation. The first method is a conventional technique where a single image sequence with alternating on and off stages is subject to a principal component analysis. The second method is a PCA-based approach called the common spatial factor analysis technique (CSF). As the name suggests, this method relies on common spatial factors between the above fMRI image sequence and a background fMRI. We have applied these methods to identify active brain ares during visual stimulation and motor tasks. The results from these methods are compared to those obtained by using the standard cross-correlation technique. We found good agreement in the areas identified as active across all three techniques. The results suggest that PCA and CSF methods have good potential in detecting the true stimulus correlated changes in the presence of other interfering signals.
NASA Astrophysics Data System (ADS)
Wojciechowski, Adam
2017-04-01
In order to assess ecodiversity understood as a comprehensive natural landscape factor (Jedicke 2001), it is necessary to apply research methods which recognize the environment in a holistic way. Principal component analysis may be considered as one of such methods as it allows to distinguish the main factors determining landscape diversity on the one hand, and enables to discover regularities shaping the relationships between various elements of the environment under study on the other hand. The procedure adopted to assess ecodiversity with the use of principal component analysis involves: a) determining and selecting appropriate factors of the assessed environment qualities (hypsometric, geological, hydrographic, plant, and others); b) calculating the absolute value of individual qualities for the basic areas under analysis (e.g. river length, forest area, altitude differences, etc.); c) principal components analysis and obtaining factor maps (maps of selected components); d) generating a resultant, detailed map and isolating several classes of ecodiversity. An assessment of ecodiversity with the use of principal component analysis was conducted in the test area of 299,67 km2 in Debnica Kaszubska commune. The whole commune is situated in the Weichselian glaciation area of high hypsometric and morphological diversity as well as high geo- and biodiversity. The analysis was based on topographical maps of the commune area in scale 1:25000 and maps of forest habitats. Consequently, nine factors reflecting basic environment elements were calculated: maximum height (m), minimum height (m), average height (m), the length of watercourses (km), the area of water reservoirs (m2), total forest area (ha), coniferous forests habitats area (ha), deciduous forest habitats area (ha), alder habitats area (ha). The values for individual factors were analysed for 358 grid cells of 1 km2. Based on the principal components analysis, four major factors affecting commune ecodiversity were distinguished: hypsometric component (PC1), deciduous forest habitats component (PC2), river valleys and alder habitats component (PC3), and lakes component (PC4). The distinguished factors characterise natural qualities of postglacial area and reflect well the role of the four most important groups of environment components in shaping ecodiversity of the area under study. The map of ecodiversity of Debnica Kaszubska commune was created on the basis of the first four principal component scores and then five classes of diversity were isolated: very low, low, average, high and very high. As a result of the assessment, five commune regions of very high ecodiversity were separated. These regions are also very attractive for tourists and valuable in terms of their rich nature which include protected areas such as Slupia Valley Landscape Park. The suggested method of ecodiversity assessment with the use of principal component analysis may constitute an alternative methodological proposition to other research methods used so far. Literature Jedicke E., 2001. Biodiversität, Geodiversität, Ökodiversität. Kriterien zur Analyse der Landschaftsstruktur - ein konzeptioneller Diskussionsbeitrag. Naturschutz und Landschaftsplanung, 33(2/3), 59-68.
Mjørud, Marit; Kirkevold, Marit; Røsvik, Janne; Engedal, Knut
2014-01-01
To investigate which factors the Quality of Life in Late-Stage Dementia (QUALID) scale holds when used among people with dementia (pwd) in nursing homes and to find out how the symptom load varies across the different severity levels of dementia. We included 661 pwd [mean age ± SD, 85.3 ± 8.6 years; 71.4% women]. The QUALID and the Clinical Dementia Rating (CDR) scale were applied. A principal component analysis (PCA) with varimax rotation and Kaiser normalization was applied to test the factor structure. Nonparametric analyses were applied to examine differences of symptom load across the three CDR groups. The mean QUALID score was 21.5 (±7.1), and the CDR scores of the three groups were 1 in 22.5%, 2 in 33.6% and 3 in 43.9%. The results of the statistical measures employed were the following: Crohnbach's α of QUALID, 0.74; Bartlett's test of sphericity, p <0.001; the Kaiser-Meyer-Olkin measure, 0.77. The PCA analysis resulted in three components accounting for 53% of the variance. The first component was 'tension' ('facial expression of discomfort', 'appears physically uncomfortable', 'verbalization suggests discomfort', 'being irritable and aggressive', 'appears calm', Crohnbach's α = 0.69), the second was 'well-being' ('smiles', 'enjoys eating', 'enjoys touching/being touched', 'enjoys social interaction', Crohnbach's α = 0.62) and the third was 'sadness' ('appears sad', 'cries', 'facial expression of discomfort', Crohnbach's α 0.65). The mean score on the components 'tension' and 'well-being' increased significantly with increasing severity levels of dementia. Three components of quality of life (qol) were identified. Qol decreased with increasing severity of dementia. © 2013 S. Karger AG, Basel.
Principal component regression analysis with SPSS.
Liu, R X; Kuang, J; Gong, Q; Hou, X L
2003-06-01
The paper introduces all indices of multicollinearity diagnoses, the basic principle of principal component regression and determination of 'best' equation method. The paper uses an example to describe how to do principal component regression analysis with SPSS 10.0: including all calculating processes of the principal component regression and all operations of linear regression, factor analysis, descriptives, compute variable and bivariate correlations procedures in SPSS 10.0. The principal component regression analysis can be used to overcome disturbance of the multicollinearity. The simplified, speeded up and accurate statistical effect is reached through the principal component regression analysis with SPSS.
Clerici, Nicola; Bodini, Antonio; Ferrarini, Alessandro
2004-10-01
In order to achieve improved sustainability, local authorities need to use tools that adequately describe and synthesize environmental information. This article illustrates a methodological approach that organizes a wide suite of environmental indicators into few aggregated indices, making use of correlation, principal component analysis, and fuzzy sets. Furthermore, a weighting system, which includes stakeholders' priorities and ambitions, is applied. As a case study, the described methodology is applied to the Reggio Emilia Province in Italy, by considering environmental information from 45 municipalities. Principal component analysis is used to condense an initial set of 19 indicators into 6 fundamental dimensions that highlight patterns of environmental conditions at the provincial scale. These dimensions are further aggregated in two indices of environmental performance through fuzzy sets. The simple form of these indices makes them particularly suitable for public communication, as they condensate a wide set of heterogeneous indicators. The main outcomes of the analysis and the potential applications of the method are discussed.
Šašiċ, Slobodan; Ojakovo, Peter; Warman, Martin; Sanghvi, Tapan
2013-09-01
Raman chemical mapping was used to determine the distribution of magnesium stearate, a lubricant, on the surface of tablets. The lubrication was carried out via a punch-face lubrication system with different spraying rates applied on placebo and active-containing tablets. Principal component analysis was used for decomposing the matrix of Raman mapping spectra. Some of the loadings associated with minuscule variation in the data significantly overlap with the Raman spectrum of magnesium stearate in placebo tablets and allow for imaging the domains of magnesium stearate via corresponding scores. Despite the negligible variation accounted for by respective principal components, the score images seem reliable as demonstrated through thresholding the one-dimensional representation and the spectra of the hot pixels that show a weak but perceivable magnesium stearate band at 1295 cm(-1). The same approach was applied on the active formulation, but no magnesium stearate was identified, presumably due to overwhelming concentration and spectral contribution of the active pharmaceutical ingredient.
Mueller, Daniela; Ferrão, Marco Flôres; Marder, Luciano; da Costa, Adilson Ben; de Cássia de Souza Schneider, Rosana
2013-01-01
The main objective of this study was to use infrared spectroscopy to identify vegetable oils used as raw material for biodiesel production and apply multivariate analysis to the data. Six different vegetable oil sources—canola, cotton, corn, palm, sunflower and soybeans—were used to produce biodiesel batches. The spectra were acquired by Fourier transform infrared spectroscopy using a universal attenuated total reflectance sensor (FTIR-UATR). For the multivariate analysis principal component analysis (PCA), hierarchical cluster analysis (HCA), interval principal component analysis (iPCA) and soft independent modeling of class analogy (SIMCA) were used. The results indicate that is possible to develop a methodology to identify vegetable oils used as raw material in the production of biodiesel by FTIR-UATR applying multivariate analysis. It was also observed that the iPCA found the best spectral range for separation of biodiesel batches using FTIR-UATR data, and with this result, the SIMCA method classified 100% of the soybean biodiesel samples. PMID:23539030
Gouvinhas, Irene; Machado, Nelson; Carvalho, Teresa; de Almeida, José M M M; Barros, Ana I R N A
2015-01-01
Extra virgin olive oils produced from three cultivars on different maturation stages were characterized using Raman spectroscopy. Chemometric methods (principal component analysis, discriminant analysis, principal component regression and partial least squares regression) applied to Raman spectral data were utilized to evaluate and quantify the statistical differences between cultivars and their ripening process. The models for predicting the peroxide value and free acidity of olive oils showed good calibration and prediction values and presented high coefficients of determination (>0.933). Both the R(2), and the correlation equations between the measured chemical parameters, and the values predicted by each approach are presented; these comprehend both PCR and PLS, used to assess SNV normalized Raman data, as well as first and second derivative of the spectra. This study demonstrates that a combination of Raman spectroscopy with multivariate analysis methods can be useful to predict rapidly olive oil chemical characteristics during the maturation process. Copyright © 2014 Elsevier B.V. All rights reserved.
Alakent, Burak; Doruker, Pemra; Camurdan, Mehmet C
2004-09-08
Time series analysis is applied on the collective coordinates obtained from principal component analysis of independent molecular dynamics simulations of alpha-amylase inhibitor tendamistat and immunity protein of colicin E7 based on the Calpha coordinates history. Even though the principal component directions obtained for each run are considerably different, the dynamics information obtained from these runs are surprisingly similar in terms of time series models and parameters. There are two main differences in the dynamics of the two proteins: the higher density of low frequencies and the larger step sizes for the interminima motions of colicin E7 than those of alpha-amylase inhibitor, which may be attributed to the higher number of residues of colicin E7 and/or the structural differences of the two proteins. The cumulative density function of the low frequencies in each run conforms to the expectations from the normal mode analysis. When different runs of alpha-amylase inhibitor are projected on the same set of eigenvectors, it is found that principal components obtained from a certain conformational region of a protein has a moderate explanation power in other conformational regions and the local minima are similar to a certain extent, while the height of the energy barriers in between the minima significantly change. As a final remark, time series analysis tools are further exploited in this study with the motive of explaining the equilibrium fluctuations of proteins. Copyright 2004 American Institute of Physics
NASA Astrophysics Data System (ADS)
Alakent, Burak; Doruker, Pemra; Camurdan, Mehmet C.
2004-09-01
Time series analysis is applied on the collective coordinates obtained from principal component analysis of independent molecular dynamics simulations of α-amylase inhibitor tendamistat and immunity protein of colicin E7 based on the Cα coordinates history. Even though the principal component directions obtained for each run are considerably different, the dynamics information obtained from these runs are surprisingly similar in terms of time series models and parameters. There are two main differences in the dynamics of the two proteins: the higher density of low frequencies and the larger step sizes for the interminima motions of colicin E7 than those of α-amylase inhibitor, which may be attributed to the higher number of residues of colicin E7 and/or the structural differences of the two proteins. The cumulative density function of the low frequencies in each run conforms to the expectations from the normal mode analysis. When different runs of α-amylase inhibitor are projected on the same set of eigenvectors, it is found that principal components obtained from a certain conformational region of a protein has a moderate explanation power in other conformational regions and the local minima are similar to a certain extent, while the height of the energy barriers in between the minima significantly change. As a final remark, time series analysis tools are further exploited in this study with the motive of explaining the equilibrium fluctuations of proteins.
Descriptive Characteristics of Surface Water Quality in Hong Kong by a Self-Organising Map
An, Yan; Zou, Zhihong; Li, Ranran
2016-01-01
In this study, principal component analysis (PCA) and a self-organising map (SOM) were used to analyse a complex dataset obtained from the river water monitoring stations in the Tolo Harbor and Channel Water Control Zone (Hong Kong), covering the period of 2009–2011. PCA was initially applied to identify the principal components (PCs) among the nonlinear and complex surface water quality parameters. SOM followed PCA, and was implemented to analyze the complex relationships and behaviors of the parameters. The results reveal that PCA reduced the multidimensional parameters to four significant PCs which are combinations of the original ones. The positive and inverse relationships of the parameters were shown explicitly by pattern analysis in the component planes. It was found that PCA and SOM are efficient tools to capture and analyze the behavior of multivariable, complex, and nonlinear related surface water quality data. PMID:26761018
Descriptive Characteristics of Surface Water Quality in Hong Kong by a Self-Organising Map.
An, Yan; Zou, Zhihong; Li, Ranran
2016-01-08
In this study, principal component analysis (PCA) and a self-organising map (SOM) were used to analyse a complex dataset obtained from the river water monitoring stations in the Tolo Harbor and Channel Water Control Zone (Hong Kong), covering the period of 2009-2011. PCA was initially applied to identify the principal components (PCs) among the nonlinear and complex surface water quality parameters. SOM followed PCA, and was implemented to analyze the complex relationships and behaviors of the parameters. The results reveal that PCA reduced the multidimensional parameters to four significant PCs which are combinations of the original ones. The positive and inverse relationships of the parameters were shown explicitly by pattern analysis in the component planes. It was found that PCA and SOM are efficient tools to capture and analyze the behavior of multivariable, complex, and nonlinear related surface water quality data.
Nuclear Forensics Applications of Principal Component Analysis on Micro X-ray Fluorescence Images
analysis on quantified micro x-ray fluorescence intensity values. This method is then applied to address goals of nuclear forensics . Thefirst...researchers in the development and validation of nuclear forensics methods. A method for determining material homogeneity is developed and demonstrated
Scientific Elitism and the Information System of Science
ERIC Educational Resources Information Center
Amick, Daniel James
1973-01-01
Scientific elitism must be viewed as a multidimensional phenomenon. Ten variables of elitism are considered and a principal components factor analysis is used to scale this multivariate domain. Two significant dimensions of elitism were found; one in basic and one in applied science. (20 references) (Author)
Construction of a Physician Skills Inventory
ERIC Educational Resources Information Center
Richard, George V.; Zarconi, Joseph; Savickas, Mark L.
2012-01-01
The current study applied Holland's RIASEC typology to develop a "Physician Skills Inventory". We identified the transferable skills and abilities that are critical to effective performance in medicine and had 140 physicians in 25 different specialties rate the importance of those skills. Principal component analysis of their responses produced…
Non-rigid image registration using a statistical spline deformation model.
Loeckx, Dirk; Maes, Frederik; Vandermeulen, Dirk; Suetens, Paul
2003-07-01
We propose a statistical spline deformation model (SSDM) as a method to solve non-rigid image registration. Within this model, the deformation is expressed using a statistically trained B-spline deformation mesh. The model is trained by principal component analysis of a training set. This approach allows to reduce the number of degrees of freedom needed for non-rigid registration by only retaining the most significant modes of variation observed in the training set. User-defined transformation components, like affine modes, are merged with the principal components into a unified framework. Optimization proceeds along the transformation components rather then along the individual spline coefficients. The concept of SSDM's is applied to the temporal registration of thorax CR-images using pattern intensity as the registration measure. Our results show that, using 30 training pairs, a reduction of 33% is possible in the number of degrees of freedom without deterioration of the result. The same accuracy as without SSDM's is still achieved after a reduction up to 66% of the degrees of freedom.
Mathias, Patrick C; Turner, Emily H; Scroggins, Sheena M; Salipante, Stephen J; Hoffman, Noah G; Pritchard, Colin C; Shirts, Brian H
2016-03-01
To apply techniques for ancestry and sex computation from next-generation sequencing (NGS) data as an approach to confirm sample identity and detect sample processing errors. We combined a principal component analysis method with k-nearest neighbors classification to compute the ancestry of patients undergoing NGS testing. By combining this calculation with X chromosome copy number data, we determined the sex and ancestry of patients for comparison with self-report. We also modeled the sensitivity of this technique in detecting sample processing errors. We applied this technique to 859 patient samples with reliable self-report data. Our k-nearest neighbors ancestry screen had an accuracy of 98.7% for patients reporting a single ancestry. Visual inspection of principal component plots was consistent with self-report in 99.6% of single-ancestry and mixed-ancestry patients. Our model demonstrates that approximately two-thirds of potential sample swaps could be detected in our patient population using this technique. Patient ancestry can be estimated from NGS data incidentally sequenced in targeted panels, enabling an inexpensive quality control method when coupled with patient self-report. © American Society for Clinical Pathology, 2016. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
NASA Astrophysics Data System (ADS)
Zhao, Yan-Ru; Yu, Ke-Qiang; Li, Xiaoli; He, Yong
2016-12-01
Infected petals are often regarded as the source for the spread of fungi Sclerotinia sclerotiorum in all growing process of rapeseed (Brassica napus L.) plants. This research aimed to detect fungal infection of rapeseed petals by applying hyperspectral imaging in the spectral region of 874-1734 nm coupled with chemometrics. Reflectance was extracted from regions of interest (ROIs) in the hyperspectral image of each sample. Firstly, principal component analysis (PCA) was applied to conduct a cluster analysis with the first several principal components (PCs). Then, two methods including X-loadings of PCA and random frog (RF) algorithm were used and compared for optimizing wavebands selection. Least squares-support vector machine (LS-SVM) methodology was employed to establish discriminative models based on the optimal and full wavebands. Finally, area under the receiver operating characteristics curve (AUC) was utilized to evaluate classification performance of these LS-SVM models. It was found that LS-SVM based on the combination of all optimal wavebands had the best performance with AUC of 0.929. These results were promising and demonstrated the potential of applying hyperspectral imaging in fungus infection detection on rapeseed petals.
A Factor Analytic Validation of Holland's Vocational Preference Inventory
ERIC Educational Resources Information Center
Di Scipio, William J.
1974-01-01
A principal components analysis was applied to a 135-item pool of the Holland Vocational Preference Inventory, Sixth Revision. The a priori clinical scales were partially upheld with differences attributed to the characteristics of the sample and sociopolitical time context during which the test was administered. (Author)
Controlled Assessments in 14-19 Diplomas: Implementation and Effects on Learning Experiences
ERIC Educational Resources Information Center
Crisp, Victoria; Green, Sylvia
2012-01-01
The Principal Learning components of 14-19 Diplomas (introduced in England in 2008) are assessed predominantly via "controlled assessments". These assessments are conducted within the learning context under specified conditions (or "controls") and require learners to apply their skills to work-related tasks. In this research,…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Larour, Jean; Aranchuk, Leonid E.; Danisman, Yusuf
2016-03-15
Principal component analysis is applied and compared with the line ratios of special Ne-like transitions for investigating the electron beam effects on the L-shell Cu synthetic spectra. The database for the principal component extraction is created over a non Local Thermodynamic Equilibrium (non-LTE) collisional radiative L-shell Copper model. The extracted principal components are used as a database for Artificial Neural Network in order to estimate the plasma electron temperature, density, and beam fractions from a representative time-integrated spatially resolved L-shell Cu X-pinch plasma spectrum. The spectrum is produced by the explosion of 25-μm Cu wires on a compact LC (40more » kV, 200 kA, and 200 ns) generator. The modeled plasma electron temperatures are about T{sub e} ∼ 150 eV and N{sub e} = 5 × 10{sup 19} cm{sup −3} in the presence of the fraction of the beams with f ∼ 0.05 and a centered energy of ∼10 keV.« less
Fu, Jun; Huang, Canqin; Xing, Jianguo; Zheng, Junbao
2012-01-01
Biologically-inspired models and algorithms are considered as promising sensor array signal processing methods for electronic noses. Feature selection is one of the most important issues for developing robust pattern recognition models in machine learning. This paper describes an investigation into the classification performance of a bionic olfactory model with the increase of the dimensions of input feature vector (outer factor) as well as its parallel channels (inner factor). The principal component analysis technique was applied for feature selection and dimension reduction. Two data sets of three classes of wine derived from different cultivars and five classes of green tea derived from five different provinces of China were used for experiments. In the former case the results showed that the average correct classification rate increased as more principal components were put in to feature vector. In the latter case the results showed that sufficient parallel channels should be reserved in the model to avoid pattern space crowding. We concluded that 6~8 channels of the model with principal component feature vector values of at least 90% cumulative variance is adequate for a classification task of 3~5 pattern classes considering the trade-off between time consumption and classification rate.
Roblová, Vendula; Bittová, Miroslava; Kubáň, Petr; Kubáň, Vlastimil
2016-07-01
In this work aqueous infusions from ten Mentha herbal samples (four different Mentha species and six hybrids of Mentha x piperita) and 20 different peppermint teas were screened by capillary electrophoresis with UV detection. The fingerprint separation was accomplished in a 25 mM borate background electrolyte with 10% methanol at pH 9.3. The total polyphenolic content in the extracts was determined spectrophotometrically at 765 nm by a Folin-Ciocalteu phenol assay. Total antioxidant activity was determined by scavenging of 2,2-diphenyl-1-picrylhydrazyl radical at 515 nm. The peak areas of 12 dominant peaks from CE analysis, present in all samples, and the value of total polyphenolic content and total antioxidant activity obtained by spectrophotometry was combined into a single data matrix and principal component analysis was applied. The obtained principal component analysis model resulted in distinct clusters of Mentha and peppermint tea samples distinguishing the samples according to their potential protective antioxidant effect. Principal component analysis, using a non-targeted approach with no need for compound identification, was found as a new promising tool for the screening of herbal tea products. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
On the Fallibility of Principal Components in Research
ERIC Educational Resources Information Center
Raykov, Tenko; Marcoulides, George A.; Li, Tenglong
2017-01-01
The measurement error in principal components extracted from a set of fallible measures is discussed and evaluated. It is shown that as long as one or more measures in a given set of observed variables contains error of measurement, so also does any principal component obtained from the set. The error variance in any principal component is shown…
Corilo, Yuri E; Podgorski, David C; McKenna, Amy M; Lemkau, Karin L; Reddy, Christopher M; Marshall, Alan G; Rodgers, Ryan P
2013-10-01
One fundamental challenge with either acute or chronic oil spills is to identify the source, especially in highly polluted areas, near natural oil seeps, when the source contains more than one petroleum product or when extensive weathering has occurred. Here we focus on heavy fuel oil that spilled (~200,000 L) from two suspected fuel tanks that were ruptured on the motor vessel (M/V) Cosco Busan when it struck the San Francisco-Oakland Bay Bridge in November 2007. We highlight the utility of principal component analysis (PCA) of elemental composition data obtained by high resolution FT-ICR mass spectrometry to correctly identify the source of environmental contamination caused by the unintended release of heavy fuel oil (HFO). Using ultrahigh resolution electrospray ionization (ESI) Fourier transform ion cyclotron resonance mass spectrometry, we uniquely assigned thousands of elemental compositions of heteroatom-containing species in neat samples from both tanks and then applied principal component analysis. The components were based on double bond equivalents for constituents of elemental composition, CcHhN1S1. To determine if the fidelity of our source identification was affected by weathering, field samples were collected at various intervals up to two years after the spill. We are able to identify a suite of polar petroleum markers that are environmentally persistent, enabling us to confidently identify that only one tank was the source of the spilled oil: in fact, a single principal component could account for 98% of the variance. Although identification is unaffected by the presence of higher polarity, petrogenic oxidation (weathering) products, future studies may require removal of such species by anion exchange chromatography prior to mass spectral analysis due to their preferential ionization by ESI.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Koch, C.D.; Pirkle, F.L.; Schmidt, J.S.
1981-01-01
A Principal Components Analysis (PCA) has been written to aid in the interpretation of multivariate aerial radiometric data collected by the US Department of Energy (DOE) under the National Uranium Resource Evaluation (NURE) program. The variations exhibited by these data have been reduced and classified into a number of linear combinations by using the PCA program. The PCA program then generates histograms and outlier maps of the individual variates. Black and white plots can be made on a Calcomp plotter by the application of follow-up programs. All programs referred to in this guide were written for a DEC-10. From thismore » analysis a geologist may begin to interpret the data structure. Insight into geological processes underlying the data may be obtained.« less
Ocaña-Peinado, Francisco M; Valderrama, Mariano J; Bouzas, Paula R
2013-05-01
The problem of developing a 2-week-on ahead forecast of atmospheric cypress pollen levels is tackled in this paper by developing a principal component multiple regression model involving several climatic variables. The efficacy of the proposed model is validated by means of an application to real data of Cupressaceae pollen concentration in the city of Granada (southeast of Spain). The model was applied to data from 11 consecutive years (1995-2005), with 2006 being used to validate the forecasts. Based on the work of different authors, factors as temperature, humidity, hours of sun and wind speed were incorporated in the model. This methodology explains approximately 75-80% of the variability in the airborne Cupressaceae pollen concentration.
Automated cloud screening of AVHRR imagery using split-and-merge clustering
NASA Technical Reports Server (NTRS)
Gallaudet, Timothy C.; Simpson, James J.
1991-01-01
Previous methods to segment clouds from ocean in AVHRR imagery have shown varying degrees of success, with nighttime approaches being the most limited. An improved method of automatic image segmentation, the principal component transformation split-and-merge clustering (PCTSMC) algorithm, is presented and applied to cloud screening of both nighttime and daytime AVHRR data. The method combines spectral differencing, the principal component transformation, and split-and-merge clustering to sample objectively the natural classes in the data. This segmentation method is then augmented by supervised classification techniques to screen clouds from the imagery. Comparisons with other nighttime methods demonstrate its improved capability in this application. The sensitivity of the method to clustering parameters is presented; the results show that the method is insensitive to the split-and-merge thresholds.
Liu, Xiaona; Zhang, Qiao; Wu, Zhisheng; Shi, Xinyuan; Zhao, Na; Qiao, Yanjiang
2015-01-01
Laser-induced breakdown spectroscopy (LIBS) was applied to perform a rapid elemental analysis and provenance study of Blumea balsamifera DC. Principal component analysis (PCA) and partial least squares discriminant analysis (PLS-DA) were implemented to exploit the multivariate nature of the LIBS data. Scores and loadings of computed principal components visually illustrated the differing spectral data. The PLS-DA algorithm showed good classification performance. The PLS-DA model using complete spectra as input variables had similar discrimination performance to using selected spectral lines as input variables. The down-selection of spectral lines was specifically focused on the major elements of B. balsamifera samples. Results indicated that LIBS could be used to rapidly analyze elements and to perform provenance study of B. balsamifera. PMID:25558999
NASA Astrophysics Data System (ADS)
Vítková, Gabriela; Prokeš, Lubomír; Novotný, Karel; Pořízka, Pavel; Novotný, Jan; Všianský, Dalibor; Čelko, Ladislav; Kaiser, Jozef
2014-11-01
Focusing on historical aspect, during archeological excavation or restoration works of buildings or different structures built from bricks it is important to determine, preferably in-situ and in real-time, the locality of bricks origin. Fast classification of bricks on the base of Laser-Induced Breakdown Spectroscopy (LIBS) spectra is possible using multivariate statistical methods. Combination of principal component analysis (PCA) and linear discriminant analysis (LDA) was applied in this case. LIBS was used to classify altogether the 29 brick samples from 7 different localities. Realizing comparative study using two different LIBS setups - stand-off and table-top it is shown that stand-off LIBS has a big potential for archeological in-field measurements.
Principal elementary mode analysis (PEMA).
Folch-Fortuny, Abel; Marques, Rodolfo; Isidro, Inês A; Oliveira, Rui; Ferrer, Alberto
2016-03-01
Principal component analysis (PCA) has been widely applied in fluxomics to compress data into a few latent structures in order to simplify the identification of metabolic patterns. These latent structures lack a direct biological interpretation due to the intrinsic constraints associated with a PCA model. Here we introduce a new method that significantly improves the interpretability of the principal components with a direct link to metabolic pathways. This method, called principal elementary mode analysis (PEMA), establishes a bridge between a PCA-like model, aimed at explaining the maximum variance in flux data, and the set of elementary modes (EMs) of a metabolic network. It provides an easy way to identify metabolic patterns in large fluxomics datasets in terms of the simplest pathways of the organism metabolism. The results using a real metabolic model of Escherichia coli show the ability of PEMA to identify the EMs that generated the different simulated flux distributions. Actual flux data of E. coli and Pichia pastoris cultures confirm the results observed in the simulated study, providing a biologically meaningful model to explain flux data of both organisms in terms of the EM activation. The PEMA toolbox is freely available for non-commercial purposes on http://mseg.webs.upv.es.
A method to map errors in the deformable registration of 4DCT images1
Vaman, Constantin; Staub, David; Williamson, Jeffrey; Murphy, Martin J.
2010-01-01
Purpose: To present a new approach to the problem of estimating errors in deformable image registration (DIR) applied to sequential phases of a 4DCT data set. Methods: A set of displacement vector fields (DVFs) are made by registering a sequence of 4DCT phases. The DVFs are assumed to display anatomical movement, with the addition of errors due to the imaging and registration processes. The positions of physical landmarks in each CT phase are measured as ground truth for the physical movement in the DVF. Principal component analysis of the DVFs and the landmarks is used to identify and separate the eigenmodes of physical movement from the error eigenmodes. By subtracting the physical modes from the principal components of the DVFs, the registration errors are exposed and reconstructed as DIR error maps. The method is demonstrated via a simple numerical model of 4DCT DVFs that combines breathing movement with simulated maps of spatially correlated DIR errors. Results: The principal components of the simulated DVFs were observed to share the basic properties of principal components for actual 4DCT data. The simulated error maps were accurately recovered by the estimation method. Conclusions: Deformable image registration errors can have complex spatial distributions. Consequently, point-by-point landmark validation can give unrepresentative results that do not accurately reflect the registration uncertainties away from the landmarks. The authors are developing a method for mapping the complete spatial distribution of DIR errors using only a small number of ground truth validation landmarks. PMID:21158288
An application of principal component analysis to the clavicle and clavicle fixation devices.
Daruwalla, Zubin J; Courtis, Patrick; Fitzpatrick, Clare; Fitzpatrick, David; Mullett, Hannan
2010-03-26
Principal component analysis (PCA) enables the building of statistical shape models of bones and joints. This has been used in conjunction with computer assisted surgery in the past. However, PCA of the clavicle has not been performed. Using PCA, we present a novel method that examines the major modes of size and three-dimensional shape variation in male and female clavicles and suggests a method of grouping the clavicle into size and shape categories. Twenty-one high-resolution computerized tomography scans of the clavicle were reconstructed and analyzed using a specifically developed statistical software package. After performing statistical shape analysis, PCA was applied to study the factors that account for anatomical variation. The first principal component representing size accounted for 70.5 percent of anatomical variation. The addition of a further three principal components accounted for almost 87 percent. Using statistical shape analysis, clavicles in males have a greater lateral depth and are longer, wider and thicker than in females. However, the sternal angle in females is larger than in males. PCA confirmed these differences between genders but also noted that men exhibit greater variance and classified clavicles into five morphological groups. This unique approach is the first that standardizes a clavicular orientation. It provides information that is useful to both, the biomedical engineer and clinician. Other applications include implant design with regard to modifying current or designing future clavicle fixation devices. Our findings support the need for further development of clavicle fixation devices and the questioning of whether gender-specific devices are necessary.
A first application of independent component analysis to extracting structure from stock returns.
Back, A D; Weigend, A S
1997-08-01
This paper explores the application of a signal processing technique known as independent component analysis (ICA) or blind source separation to multivariate financial time series such as a portfolio of stocks. The key idea of ICA is to linearly map the observed multivariate time series into a new space of statistically independent components (ICs). We apply ICA to three years of daily returns of the 28 largest Japanese stocks and compare the results with those obtained using principal component analysis. The results indicate that the estimated ICs fall into two categories, (i) infrequent large shocks (responsible for the major changes in the stock prices), and (ii) frequent smaller fluctuations (contributing little to the overall level of the stocks). We show that the overall stock price can be reconstructed surprisingly well by using a small number of thresholded weighted ICs. In contrast, when using shocks derived from principal components instead of independent components, the reconstructed price is less similar to the original one. ICA is shown to be a potentially powerful method of analyzing and understanding driving mechanisms in financial time series. The application to portfolio optimization is described in Chin and Weigend (1998).
Inayama, T; Kashiwazaki, H; Sakamoto, M
1998-12-01
We tried to analyze synthetically teachers' view points associated with health education and roles of school lunch in primary education. For this purpose, a survey using an open-ended questionnaire consisting of eight items relating to health education in the school curriculum was carried out in 100 teachers of ten public primary schools. Subjects were asked to describe their view regarding the following eight items: 1) health and physical guidance education, 2) school lunch guidance education, 3) pupils' attitude toward their own health and nutrition, 4) health education, 5) role of school lunch in education, 6) future subjects of health education, 7) class room lesson related to school lunch, 8) guidance in case of pupil with unbalanced dieting and food avoidance. Subjects described their own opinions on an open-ended questionnaire response sheet. Keywords in individual descriptions were selected, rearranged and classified into categories according to their own meanings, and each of the selected keywords were used as the dummy variable. To assess individual opinions synthetically, a principal component analysis was then applied to the variables collected through the teachers' descriptions, and four factors were extracted. The results were as follows. 1) Four factors obtained from the repeated principal component analysis were summarized as; roles of health education and school lunch program (the first principal component), cooperation with nurse-teachers and those in charge of lunch service (the second principal component), time allocation for health education in home-room activity and lunch time (the third principal component) and contents of health education and school lunch guidance and their future plan (the fourth principal component). 2) Teachers regarded the role of school lunch in primary education as providing daily supply of nutrients, teaching of table manners and building up friendships with classmates, health education and food and nutrition education, and developing food preferences through eating lunch together with classmates. 3) Significant positive correlation was observed between "the teachers' opinion about the role of school lunch of providing opportunity to learn good behavior for food preferences through eating lunch together with classmates" and the first principal component "roles of health education and school lunch program" (r = 0.39, p < 0.01). The variable "the role of school lunch is health education and food and nutrition education" showed positive correlation with the principle component "cooperation with nurse-teachers and those in charge of lunch service" (r = 0.27, p < 0.01). Interesting relationships obtained were that teachers with longer educational experience tended to place importance in health education and food and nutrition education as the role of school lunch, and that male teachers regarded the roles of school lunch more importantly for future education in primary education than female teachers did.
NASA Astrophysics Data System (ADS)
Dafu, Shen; Leihong, Zhang; Dong, Liang; Bei, Li; Yi, Kang
2017-07-01
The purpose of this study is to improve the reconstruction precision and better copy the color of spectral image surfaces. A new spectral reflectance reconstruction algorithm based on an iterative threshold combined with weighted principal component space is presented in this paper, and the principal component with weighted visual features is the sparse basis. Different numbers of color cards are selected as the training samples, a multispectral image is the testing sample, and the color differences in the reconstructions are compared. The channel response value is obtained by a Mega Vision high-accuracy, multi-channel imaging system. The results show that spectral reconstruction based on weighted principal component space is superior in performance to that based on traditional principal component space. Therefore, the color difference obtained using the compressive-sensing algorithm with weighted principal component analysis is less than that obtained using the algorithm with traditional principal component analysis, and better reconstructed color consistency with human eye vision is achieved.
Nelemans, Stefanie A; van Assche, Evelien; Bijttebier, Patricia; Colpin, Hilde; van Leeuwen, Karla; Verschueren, Karine; Claes, Stephan; van den Noortgate, Wim; Goossens, Luc
2018-04-26
Guided by a developmental psychopathology framework, research has increasingly focused on the interplay of genetics and environment as a predictor of different forms of psychopathology, including social anxiety. In these efforts, the polygenic nature of complex phenotypes such as social anxiety is increasingly recognized, but studies applying polygenic approaches are still scarce. In this study, we applied Principal Covariates Regression as a novel approach to creating polygenic components for the oxytocin system, which has recently been put forward as particularly relevant to social anxiety. Participants were 978 adolescents (49.4% girls; M age T 1 = 13.8 years). Across 3 years, questionnaires were used to assess adolescent social anxiety symptoms and multi-informant reports of parental psychological control and autonomy support. All adolescents were genotyped for 223 oxytocin single nucleotide polymorphisms (SNPs) in 14 genes. Using Principal Covariates Regression, these SNPs could be reduced to five polygenic components. Four components reflected the underlying linkage disequilibrium and ancestry structure, whereas the fifth component, which consisted of small contributions of many SNPs across multiple genes, was strongly positively associated with adolescent social anxiety symptoms, pointing to an index of genetic risk. Moreover, significant interactions were found with this polygenic component and the environmental variables of interest. Specifically, adolescents who scored high on this polygenic component and experienced less adequate parenting (i.e., high psychological control or low autonomy support) showed the highest levels of social anxiety. Implications of these findings are discussed in the context of individual-by-environment models.
Principal Component and Linkage Analysis of Cardiovascular Risk Traits in the Norfolk Isolate
Cox, Hannah C.; Bellis, Claire; Lea, Rod A.; Quinlan, Sharon; Hughes, Roger; Dyer, Thomas; Charlesworth, Jac; Blangero, John; Griffiths, Lyn R.
2009-01-01
Objective(s) An individual's risk of developing cardiovascular disease (CVD) is influenced by genetic factors. This study focussed on mapping genetic loci for CVD-risk traits in a unique population isolate derived from Norfolk Island. Methods This investigation focussed on 377 individuals descended from the population founders. Principal component analysis was used to extract orthogonal components from 11 cardiovascular risk traits. Multipoint variance component methods were used to assess genome-wide linkage using SOLAR to the derived factors. A total of 285 of the 377 related individuals were informative for linkage analysis. Results A total of 4 principal components accounting for 83% of the total variance were derived. Principal component 1 was loaded with body size indicators; principal component 2 with body size, cholesterol and triglyceride levels; principal component 3 with the blood pressures; and principal component 4 with LDL-cholesterol and total cholesterol levels. Suggestive evidence of linkage for principal component 2 (h2 = 0.35) was observed on chromosome 5q35 (LOD = 1.85; p = 0.0008). While peak regions on chromosome 10p11.2 (LOD = 1.27; p = 0.005) and 12q13 (LOD = 1.63; p = 0.003) were observed to segregate with principal components 1 (h2 = 0.33) and 4 (h2 = 0.42), respectively. Conclusion(s): This study investigated a number of CVD risk traits in a unique isolated population. Findings support the clustering of CVD risk traits and provide interesting evidence of a region on chromosome 5q35 segregating with weight, waist circumference, HDL-c and total triglyceride levels. PMID:19339786
Prediction of pH of fresh chicken breast fillets by VNIR hyperspectral imaging
USDA-ARS?s Scientific Manuscript database
Visible and near-infrared (VNIR) hyperspectral imaging (400–900 nm) was used to evaluate pH of fresh chicken breast fillets (pectoralis major muscle) from the bone (dorsal) side of individual fillets. After the principal component analysis (PCA), a band threshold method was applied to the first prin...
Nonparametric regression applied to quantitative structure-activity relationships
Constans; Hirst
2000-03-01
Several nonparametric regressors have been applied to modeling quantitative structure-activity relationship (QSAR) data. The simplest regressor, the Nadaraya-Watson, was assessed in a genuine multivariate setting. Other regressors, the local linear and the shifted Nadaraya-Watson, were implemented within additive models--a computationally more expedient approach, better suited for low-density designs. Performances were benchmarked against the nonlinear method of smoothing splines. A linear reference point was provided by multilinear regression (MLR). Variable selection was explored using systematic combinations of different variables and combinations of principal components. For the data set examined, 47 inhibitors of dopamine beta-hydroxylase, the additive nonparametric regressors have greater predictive accuracy (as measured by the mean absolute error of the predictions or the Pearson correlation in cross-validation trails) than MLR. The use of principal components did not improve the performance of the nonparametric regressors over use of the original descriptors, since the original descriptors are not strongly correlated. It remains to be seen if the nonparametric regressors can be successfully coupled with better variable selection and dimensionality reduction in the context of high-dimensional QSARs.
Research of seafloor topographic analyses for a staged mineral exploration
NASA Astrophysics Data System (ADS)
Ikeda, M.; Kadoshima, K.; Koizumi, Y.; Yamakawa, T.; Asakawa, E.; Sumi, T.; Kose, M.
2016-12-01
J-MARES (Research and Development Partnership for Next Generation Technology of Marine Resources Survey, JAPAN) has been designing a low-cost and high-efficiency exploration system for seafloor hydrothermal massive sulfide (SMS) deposits in "Cross-ministerial Strategic Innovation Promotion Program (SIP)" granted by the Cabinet Office, Government of Japan since 2014. We proposed the multi-stage approach, which is designed from the regional scaled to the detail scaled survey stages through semi-detail scaled, focusing a prospective area by seafloor topographic analyses. We applied this method to the area of more than 100km x 100km around Okinawa Trough, including some well-known mineralized deposits. In the regional scale survey, we assume survey areas are more than 100 km x 100km. Then the spatial resolution of topography data should be bigger than 100m. The 500 m resolution data which is interpolated into 250 m resolution was used for extracting depression and performing principal component analysis (PCA) by the wavelength obtained from frequency analysis. As the result, we have successfully extracted the areas having the topographic features quite similar to well-known mineralized deposits. In the semi-local survey stage, we use the topography data obtained by bathymetric survey using multi-narrow beam echo-sounder. The 30m-resolution data was used for extracting depression, relative-large mounds, hills, lineaments as fault, and also for performing frequency analysis. As the result, wavelength as principal component constituting in the target area was extracted by PCA of wavelength obtained from frequency analysis. Therefore, color image was composited by using the second principal component (PC2) to the forth principal component (PC4) in which the continuity of specific wavelength was observed, and consistent with extracted lineaments. In addition, well-known mineralized deposits were discriminated in the same clusters by using clustering from PC2 to PC4.We applied the results described above to a new area, and successfully extract the quite similar area in vicinity to one of the well-known mineralized deposits. So we are going to verify the extracted areas by using geophysical methods, such as vertical cable seismic and time-domain EM survey, developed in this SIP project.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Viennot, David
We show that the holonomy of a connection defined on a principal composite bundle is related by a non-Abelian Stokes theorem to the composition of the holonomies associated with the connections of the component bundles of the composite. We apply this formalism to describe the non-Abelian geometric phase (when the geometric phase generator does not commute with the dynamical phase generator). We find then an assumption to obtain a new kind of separation between the dynamical and the geometric phases. We also apply this formalism to the gauge theory of gravity in the presence of a Dirac spinor field inmore » order to decompose the holonomy of the Lorentz connection into holonomies of the linear connection and of the Cartan connection.« less
Aznar, Margarita; Arroyo, Teresa
2007-09-21
The purge-and-trap extraction method, coupled to a gas chromatograph with mass spectrometry detection, has been applied to the determination of 26 aromatic volatiles in wine. The method was optimized, validated and applied to the analyses of 40 red and white wines from 7 different Spanish regions. Principal components analyses of data showed the correlation between wines of similar origin.
Principal Component Noise Filtering for NAST-I Radiometric Calibration
NASA Technical Reports Server (NTRS)
Tian, Jialin; Smith, William L., Sr.
2011-01-01
The National Polar-orbiting Operational Environmental Satellite System (NPOESS) Airborne Sounder Testbed- Interferometer (NAST-I) instrument is a high-resolution scanning interferometer that measures emitted thermal radiation between 3.3 and 18 microns. The NAST-I radiometric calibration is achieved using internal blackbody calibration references at ambient and hot temperatures. In this paper, we introduce a refined calibration technique that utilizes a principal component (PC) noise filter to compensate for instrument distortions and artifacts, therefore, further improve the absolute radiometric calibration accuracy. To test the procedure and estimate the PC filter noise performance, we form dependent and independent test samples using odd and even sets of blackbody spectra. To determine the optimal number of eigenvectors, the PC filter algorithm is applied to both dependent and independent blackbody spectra with a varying number of eigenvectors. The optimal number of PCs is selected so that the total root-mean-square (RMS) error is minimized. To estimate the filter noise performance, we examine four different scenarios: apply PC filtering to both dependent and independent datasets, apply PC filtering to dependent calibration data only, apply PC filtering to independent data only, and no PC filters. The independent blackbody radiances are predicted for each case and comparisons are made. The results show significant reduction in noise in the final calibrated radiances with the implementation of the PC filtering algorithm.
NASA Astrophysics Data System (ADS)
Martinez, J. C.; Guzmán-Sepúlveda, J. R.; Bolañoz Evia, G. R.; Córdova, T.; Guzmán-Cabrera, R.
2018-06-01
In this work, we applied machine learning techniques to Raman spectra for the characterization and classification of manufactured pharmaceutical products. Our measurements were taken with commercial equipment, for accurate assessment of variations with respect to one calibrated control sample. Unlike the typical use of Raman spectroscopy in pharmaceutical applications, in our approach the principal components of the Raman spectrum are used concurrently as attributes in machine learning algorithms. This permits an efficient comparison and classification of the spectra measured from the samples under study. This also allows for accurate quality control as all relevant spectral components are considered simultaneously. We demonstrate our approach with respect to the specific case of acetaminophen, which is one of the most widely used analgesics in the market. In the experiments, commercial samples from thirteen different laboratories were analyzed and compared against a control sample. The raw data were analyzed based on an arithmetic difference between the nominal active substance and the measured values in each commercial sample. The principal component analysis was applied to the data for quantitative verification (i.e., without considering the actual concentration of the active substance) of the difference in the calibrated sample. Our results show that by following this approach adulterations in pharmaceutical compositions can be clearly identified and accurately quantified.
Bignardi, A B; El Faro, L; Rosa, G J M; Cardoso, V L; Machado, P F; Albuquerque, L G
2012-04-01
A total of 46,089 individual monthly test-day (TD) milk yields (10 test-days), from 7,331 complete first lactations of Holstein cattle were analyzed. A standard multivariate analysis (MV), reduced rank analyses fitting the first 2, 3, and 4 genetic principal components (PC2, PC3, PC4), and analyses that fitted a factor analytic structure considering 2, 3, and 4 factors (FAS2, FAS3, FAS4), were carried out. The models included the random animal genetic effect and fixed effects of the contemporary groups (herd-year-month of test-day), age of cow (linear and quadratic effects), and days in milk (linear effect). The residual covariance matrix was assumed to have full rank. Moreover, 2 random regression models were applied. Variance components were estimated by restricted maximum likelihood method. The heritability estimates ranged from 0.11 to 0.24. The genetic correlation estimates between TD obtained with the PC2 model were higher than those obtained with the MV model, especially on adjacent test-days at the end of lactation close to unity. The results indicate that for the data considered in this study, only 2 principal components are required to summarize the bulk of genetic variation among the 10 traits. Copyright © 2012 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Perturbational formulation of principal component analysis in molecular dynamics simulation.
Koyama, Yohei M; Kobayashi, Tetsuya J; Tomoda, Shuji; Ueda, Hiroki R
2008-10-01
Conformational fluctuations of a molecule are important to its function since such intrinsic fluctuations enable the molecule to respond to the external environmental perturbations. For extracting large conformational fluctuations, which predict the primary conformational change by the perturbation, principal component analysis (PCA) has been used in molecular dynamics simulations. However, several versions of PCA, such as Cartesian coordinate PCA and dihedral angle PCA (dPCA), are limited to use with molecules with a single dominant state or proteins where the dihedral angle represents an important internal coordinate. Other PCAs with general applicability, such as the PCA using pairwise atomic distances, do not represent the physical meaning clearly. Therefore, a formulation that provides general applicability and clearly represents the physical meaning is yet to be developed. For developing such a formulation, we consider the conformational distribution change by the perturbation with arbitrary linearly independent perturbation functions. Within the second order approximation of the Kullback-Leibler divergence by the perturbation, the PCA can be naturally interpreted as a method for (1) decomposing a given perturbation into perturbations that independently contribute to the conformational distribution change or (2) successively finding the perturbation that induces the largest conformational distribution change. In this perturbational formulation of PCA, (i) the eigenvalue measures the Kullback-Leibler divergence from the unperturbed to perturbed distributions, (ii) the eigenvector identifies the combination of the perturbation functions, and (iii) the principal component determines the probability change induced by the perturbation. Based on this formulation, we propose a PCA using potential energy terms, and we designate it as potential energy PCA (PEPCA). The PEPCA provides both general applicability and clear physical meaning. For demonstrating its power, we apply the PEPCA to an alanine dipeptide molecule in vacuum as a minimal model of a nonsingle dominant conformational biomolecule. The first and second principal components clearly characterize two stable states and the transition state between them. Positive and negative components with larger absolute values of the first and second eigenvectors identify the electrostatic interactions, which stabilize or destabilize each stable state and the transition state. Our result therefore indicates that PCA can be applied, by carefully selecting the perturbation functions, not only to identify the molecular conformational fluctuation but also to predict the conformational distribution change by the perturbation beyond the limitation of the previous methods.
Perturbational formulation of principal component analysis in molecular dynamics simulation
NASA Astrophysics Data System (ADS)
Koyama, Yohei M.; Kobayashi, Tetsuya J.; Tomoda, Shuji; Ueda, Hiroki R.
2008-10-01
Conformational fluctuations of a molecule are important to its function since such intrinsic fluctuations enable the molecule to respond to the external environmental perturbations. For extracting large conformational fluctuations, which predict the primary conformational change by the perturbation, principal component analysis (PCA) has been used in molecular dynamics simulations. However, several versions of PCA, such as Cartesian coordinate PCA and dihedral angle PCA (dPCA), are limited to use with molecules with a single dominant state or proteins where the dihedral angle represents an important internal coordinate. Other PCAs with general applicability, such as the PCA using pairwise atomic distances, do not represent the physical meaning clearly. Therefore, a formulation that provides general applicability and clearly represents the physical meaning is yet to be developed. For developing such a formulation, we consider the conformational distribution change by the perturbation with arbitrary linearly independent perturbation functions. Within the second order approximation of the Kullback-Leibler divergence by the perturbation, the PCA can be naturally interpreted as a method for (1) decomposing a given perturbation into perturbations that independently contribute to the conformational distribution change or (2) successively finding the perturbation that induces the largest conformational distribution change. In this perturbational formulation of PCA, (i) the eigenvalue measures the Kullback-Leibler divergence from the unperturbed to perturbed distributions, (ii) the eigenvector identifies the combination of the perturbation functions, and (iii) the principal component determines the probability change induced by the perturbation. Based on this formulation, we propose a PCA using potential energy terms, and we designate it as potential energy PCA (PEPCA). The PEPCA provides both general applicability and clear physical meaning. For demonstrating its power, we apply the PEPCA to an alanine dipeptide molecule in vacuum as a minimal model of a nonsingle dominant conformational biomolecule. The first and second principal components clearly characterize two stable states and the transition state between them. Positive and negative components with larger absolute values of the first and second eigenvectors identify the electrostatic interactions, which stabilize or destabilize each stable state and the transition state. Our result therefore indicates that PCA can be applied, by carefully selecting the perturbation functions, not only to identify the molecular conformational fluctuation but also to predict the conformational distribution change by the perturbation beyond the limitation of the previous methods.
Dong, Jianghu J; Wang, Liangliang; Gill, Jagbir; Cao, Jiguo
2017-01-01
This article is motivated by some longitudinal clinical data of kidney transplant recipients, where kidney function progression is recorded as the estimated glomerular filtration rates at multiple time points post kidney transplantation. We propose to use the functional principal component analysis method to explore the major source of variations of glomerular filtration rate curves. We find that the estimated functional principal component scores can be used to cluster glomerular filtration rate curves. Ordering functional principal component scores can detect abnormal glomerular filtration rate curves. Finally, functional principal component analysis can effectively estimate missing glomerular filtration rate values and predict future glomerular filtration rate values.
Analyzing coastal environments by means of functional data analysis
NASA Astrophysics Data System (ADS)
Sierra, Carlos; Flor-Blanco, Germán; Ordoñez, Celestino; Flor, Germán; Gallego, José R.
2017-07-01
Here we used Functional Data Analysis (FDA) to examine particle-size distributions (PSDs) in a beach/shallow marine sedimentary environment in Gijón Bay (NW Spain). The work involved both Functional Principal Components Analysis (FPCA) and Functional Cluster Analysis (FCA). The grainsize of the sand samples was characterized by means of laser dispersion spectroscopy. Within this framework, FPCA was used as a dimension reduction technique to explore and uncover patterns in grain-size frequency curves. This procedure proved useful to describe variability in the structure of the data set. Moreover, an alternative approach, FCA, was applied to identify clusters and to interpret their spatial distribution. Results obtained with this latter technique were compared with those obtained by means of two vector approaches that combine PCA with CA (Cluster Analysis). The first method, the point density function (PDF), was employed after adapting a log-normal distribution to each PSD and resuming each of the density functions by its mean, sorting, skewness and kurtosis. The second applied a centered-log-ratio (clr) to the original data. PCA was then applied to the transformed data, and finally CA to the retained principal component scores. The study revealed functional data analysis, specifically FPCA and FCA, as a suitable alternative with considerable advantages over traditional vector analysis techniques in sedimentary geology studies.
Statistical downscaling of summer precipitation over northwestern South America
NASA Astrophysics Data System (ADS)
Palomino Lemus, Reiner; Córdoba Machado, Samir; Raquel Gámiz Fortis, Sonia; Castro Díez, Yolanda; Jesús Esteban Parra, María
2015-04-01
In this study a statistical downscaling (SD) model using Principal Component Regression (PCR) for simulating summer precipitation in Colombia during the period 1950-2005, has been developed, and climate projections during the 2071-2100 period by applying the obtained SD model have been obtained. For these ends the Principal Components (PCs) of the SLP reanalysis data from NCEP were used as predictor variables, while the observed gridded summer precipitation was the predictand variable. Period 1950-1993 was utilized for calibration and 1994-2010 for validation. The Bootstrap with replacement was applied to provide estimations of the statistical errors. All models perform reasonably well at regional scales, and the spatial distribution of the correlation coefficients between predicted and observed gridded precipitation values show high values (between 0.5 and 0.93) along Andes range, north and north Pacific of Colombia. Additionally, the ability of the MIROC5 GCM to simulate the summer precipitation in Colombia, for present climate (1971-2005), has been analyzed by calculating the differences between the simulated and observed precipitation values. The simulation obtained by this GCM strongly overestimates the precipitation along a horizontal sector through the center of Colombia, especially important at the east and west of this country. However, the SD model applied to the SLP of the GCM shows its ability to faithfully reproduce the rainfall field. Finally, in order to get summer precipitation projections in Colombia for the period 1971-2100, the downscaled model, recalibrated for the total period 1950-2010, has been applied to the SLP output from MIROC5 model under the RCP2.6, RCP4.5 and RCP8.5 scenarios. The changes estimated by the SD models are not significant under the RCP2.6 scenario, while for the RCP4.5 and RCP8.5 scenarios a significant increase of precipitation appears regard to the present values in all the regions, reaching around the 27% in northern Colombia region under the RCP8.5 scenario. Keywords: Statistical downscaling, precipitation, Principal Component Regression, climate change, Colombia. ACKNOWLEDGEMENTS This work has been financed by the projects P11-RNM-7941 (Junta de Andalucía-Spain) and CGL2013-48539-R (MINECO-Spain, FEDER).
NASA Astrophysics Data System (ADS)
Hirose, Misa; Toyota, Saori; Ojima, Nobutoshi; Ogawa-Ochiai, Keiko; Tsumura, Norimichi
2017-08-01
In this paper, principal component analysis is applied to the distribution of pigmentation, surface reflectance, and landmarks in whole facial images to obtain feature values. The relationship between the obtained feature vectors and the age of the face is then estimated by multiple regression analysis so that facial images can be modulated for woman aged 10-70. In a previous study, we analyzed only the distribution of pigmentation, and the reproduced images appeared to be younger than the apparent age of the initial images. We believe that this happened because we did not modulate the facial structures and detailed surfaces, such as wrinkles. By considering landmarks and surface reflectance over the entire face, we were able to analyze the variation in the distributions of facial structures and fine asperity, and pigmentation. As a result, our method is able to appropriately modulate the appearance of a face so that it appears to be the correct age.
Principal Component Analysis for Normal-Distribution-Valued Symbolic Data.
Wang, Huiwen; Chen, Meiling; Shi, Xiaojun; Li, Nan
2016-02-01
This paper puts forward a new approach to principal component analysis (PCA) for normal-distribution-valued symbolic data, which has a vast potential of applications in the economic and management field. We derive a full set of numerical characteristics and variance-covariance structure for such data, which forms the foundation for our analytical PCA approach. Our approach is able to use all of the variance information in the original data than the prevailing representative-type approach in the literature which only uses centers, vertices, etc. The paper also provides an accurate approach to constructing the observations in a PC space based on the linear additivity property of normal distribution. The effectiveness of the proposed method is illustrated by simulated numerical experiments. At last, our method is applied to explain the puzzle of risk-return tradeoff in China's stock market.
Ferrero, A; Campos, J; Rabal, A M; Pons, A; Hernanz, M L; Corróns, A
2011-09-26
The Bidirectional Reflectance Distribution Function (BRDF) is essential to characterize an object's reflectance properties. This function depends both on the various illumination-observation geometries as well as on the wavelength. As a result, the comprehensive interpretation of the data becomes rather complex. In this work we assess the use of the multivariable analysis technique of Principal Components Analysis (PCA) applied to the experimental BRDF data of a ceramic colour standard. It will be shown that the result may be linked to the various reflection processes occurring on the surface, assuming that the incoming spectral distribution is affected by each one of these processes in a specific manner. Moreover, this procedure facilitates the task of interpolating a series of BRDF measurements obtained for a particular sample. © 2011 Optical Society of America
Wavelet decomposition based principal component analysis for face recognition using MATLAB
NASA Astrophysics Data System (ADS)
Sharma, Mahesh Kumar; Sharma, Shashikant; Leeprechanon, Nopbhorn; Ranjan, Aashish
2016-03-01
For the realization of face recognition systems in the static as well as in the real time frame, algorithms such as principal component analysis, independent component analysis, linear discriminate analysis, neural networks and genetic algorithms are used for decades. This paper discusses an approach which is a wavelet decomposition based principal component analysis for face recognition. Principal component analysis is chosen over other algorithms due to its relative simplicity, efficiency, and robustness features. The term face recognition stands for identifying a person from his facial gestures and having resemblance with factor analysis in some sense, i.e. extraction of the principal component of an image. Principal component analysis is subjected to some drawbacks, mainly the poor discriminatory power and the large computational load in finding eigenvectors, in particular. These drawbacks can be greatly reduced by combining both wavelet transform decomposition for feature extraction and principal component analysis for pattern representation and classification together, by analyzing the facial gestures into space and time domain, where, frequency and time are used interchangeably. From the experimental results, it is envisaged that this face recognition method has made a significant percentage improvement in recognition rate as well as having a better computational efficiency.
The Relation between Factor Score Estimates, Image Scores, and Principal Component Scores
ERIC Educational Resources Information Center
Velicer, Wayne F.
1976-01-01
Investigates the relation between factor score estimates, principal component scores, and image scores. The three methods compared are maximum likelihood factor analysis, principal component analysis, and a variant of rescaled image analysis. (RC)
The Butterflies of Principal Components: A Case of Ultrafine-Grained Polyphase Units
NASA Astrophysics Data System (ADS)
Rietmeijer, F. J. M.
1996-03-01
Dusts in the accretion regions of chondritic interplanetary dust particles [IDPs] consisted of three principal components: carbonaceous units [CUs], carbon-bearing chondritic units [GUs] and carbon-free silicate units [PUs]. Among others, differences among chondritic IDP morphologies and variable bulk C/Si ratios reflect variable mixtures of principal components. The spherical shapes of the initially amorphous principal components remain visible in many chondritic porous IDPs but fusion was documented for CUs, GUs and PUs. The PUs occur as coarse- and ultrafine-grained units that include so called GEMS. Spherical principal components preserved in an IDP as recognisable textural units have unique proporties with important implications for their petrological evolution from pre-accretion processing to protoplanet alteration and dynamic pyrometamorphism. Throughout their lifetime the units behaved as closed-systems without chemical exchange with other units. This behaviour is reflected in their mineralogies while the bulk compositions of principal components define the environments wherein they were formed.
McLeod, Lianne; Bharadwaj, Lalita; Epp, Tasha; Waldner, Cheryl L.
2017-01-01
Groundwater drinking water supply surveillance data were accessed to summarize water quality delivered as public and private water supplies in southern Saskatchewan as part of an exposure assessment for epidemiologic analyses of associations between water quality and type 2 diabetes or cardiovascular disease. Arsenic in drinking water has been linked to a variety of chronic diseases and previous studies have identified multiple wells with arsenic above the drinking water standard of 0.01 mg/L; therefore, arsenic concentrations were of specific interest. Principal components analysis was applied to obtain principal component (PC) scores to summarize mixtures of correlated parameters identified as health standards and those identified as aesthetic objectives in the Saskatchewan Drinking Water Quality Standards and Objective. Ordinary, universal, and empirical Bayesian kriging were used to interpolate arsenic concentrations and PC scores in southern Saskatchewan, and the results were compared. Empirical Bayesian kriging performed best across all analyses, based on having the greatest number of variables for which the root mean square error was lowest. While all of the kriging methods appeared to underestimate high values of arsenic and PC scores, empirical Bayesian kriging was chosen to summarize large scale geographic trends in groundwater-sourced drinking water quality and assess exposure to mixtures of trace metals and ions. PMID:28914824
Fu, Jun; Huang, Canqin; Xing, Jianguo; Zheng, Junbao
2012-01-01
Biologically-inspired models and algorithms are considered as promising sensor array signal processing methods for electronic noses. Feature selection is one of the most important issues for developing robust pattern recognition models in machine learning. This paper describes an investigation into the classification performance of a bionic olfactory model with the increase of the dimensions of input feature vector (outer factor) as well as its parallel channels (inner factor). The principal component analysis technique was applied for feature selection and dimension reduction. Two data sets of three classes of wine derived from different cultivars and five classes of green tea derived from five different provinces of China were used for experiments. In the former case the results showed that the average correct classification rate increased as more principal components were put in to feature vector. In the latter case the results showed that sufficient parallel channels should be reserved in the model to avoid pattern space crowding. We concluded that 6∼8 channels of the model with principal component feature vector values of at least 90% cumulative variance is adequate for a classification task of 3∼5 pattern classes considering the trade-off between time consumption and classification rate. PMID:22736979
Kalegowda, Yogesh; Harmer, Sarah L
2013-01-08
Artificial neural network (ANN) and a hybrid principal component analysis-artificial neural network (PCA-ANN) classifiers have been successfully implemented for classification of static time-of-flight secondary ion mass spectrometry (ToF-SIMS) mass spectra collected from complex Cu-Fe sulphides (chalcopyrite, bornite, chalcocite and pyrite) at different flotation conditions. ANNs are very good pattern classifiers because of: their ability to learn and generalise patterns that are not linearly separable; their fault and noise tolerance capability; and high parallelism. In the first approach, fragments from the whole ToF-SIMS spectrum were used as input to the ANN, the model yielded high overall correct classification rates of 100% for feed samples, 88% for conditioned feed samples and 91% for Eh modified samples. In the second approach, the hybrid pattern classifier PCA-ANN was integrated. PCA is a very effective multivariate data analysis tool applied to enhance species features and reduce data dimensionality. Principal component (PC) scores which accounted for 95% of the raw spectral data variance, were used as input to the ANN, the model yielded high overall correct classification rates of 88% for conditioned feed samples and 95% for Eh modified samples. Copyright © 2012 Elsevier B.V. All rights reserved.
McLeod, Lianne; Bharadwaj, Lalita; Epp, Tasha; Waldner, Cheryl L
2017-09-15
Groundwater drinking water supply surveillance data were accessed to summarize water quality delivered as public and private water supplies in southern Saskatchewan as part of an exposure assessment for epidemiologic analyses of associations between water quality and type 2 diabetes or cardiovascular disease. Arsenic in drinking water has been linked to a variety of chronic diseases and previous studies have identified multiple wells with arsenic above the drinking water standard of 0.01 mg/L; therefore, arsenic concentrations were of specific interest. Principal components analysis was applied to obtain principal component (PC) scores to summarize mixtures of correlated parameters identified as health standards and those identified as aesthetic objectives in the Saskatchewan Drinking Water Quality Standards and Objective. Ordinary, universal, and empirical Bayesian kriging were used to interpolate arsenic concentrations and PC scores in southern Saskatchewan, and the results were compared. Empirical Bayesian kriging performed best across all analyses, based on having the greatest number of variables for which the root mean square error was lowest. While all of the kriging methods appeared to underestimate high values of arsenic and PC scores, empirical Bayesian kriging was chosen to summarize large scale geographic trends in groundwater-sourced drinking water quality and assess exposure to mixtures of trace metals and ions.
Vina, Andres; Peters, Albert J.; Ji, Lei
2003-01-01
There is a global concern about the increase in atmospheric concentrations of greenhouse gases. One method being discussed to encourage greenhouse gas mitigation efforts is based on a trading system whereby carbon emitters can buy effective mitigation efforts from farmers implementing conservation tillage practices. These practices sequester carbon from the atmosphere, and such a trading system would require a low-cost and accurate method of verification. Remote sensing technology can offer such a verification technique. This paper is focused on the use of standard image processing procedures applied to a multispectral Ikonos image, to determine whether it is possible to validate that farmers have complied with agreements to implement conservation tillage practices. A principal component analysis (PCA) was performed in order to isolate image variance in cropped fields. Analyses of variance (ANOVA) statistical procedures were used to evaluate the capability of each Ikonos band and each principal component to discriminate between conventional and conservation tillage practices. A logistic regression model was implemented on the principal component most effective in discriminating between conventional and conservation tillage, in order to produce a map of the probability of conventional tillage. The Ikonos imagery, in combination with ground-reference information, proved to be a useful tool for verification of conservation tillage practices.
Qu, Mingkai; Wang, Yan; Huang, Biao; Zhao, Yongcun
2018-06-01
The traditional source apportionment models, such as absolute principal component scores-multiple linear regression (APCS-MLR), are usually susceptible to outliers, which may be widely present in the regional geochemical dataset. Furthermore, the models are merely built on variable space instead of geographical space and thus cannot effectively capture the local spatial characteristics of each source contributions. To overcome the limitations, a new receptor model, robust absolute principal component scores-robust geographically weighted regression (RAPCS-RGWR), was proposed based on the traditional APCS-MLR model. Then, the new method was applied to the source apportionment of soil metal elements in a region of Wuhan City, China as a case study. Evaluations revealed that: (i) RAPCS-RGWR model had better performance than APCS-MLR model in the identification of the major sources of soil metal elements, and (ii) source contributions estimated by RAPCS-RGWR model were more close to the true soil metal concentrations than that estimated by APCS-MLR model. It is shown that the proposed RAPCS-RGWR model is a more effective source apportionment method than APCS-MLR (i.e., non-robust and global model) in dealing with the regional geochemical dataset. Copyright © 2018 Elsevier B.V. All rights reserved.
Liu, Xiao-Fang; Xue, Chang-Hu; Wang, Yu-Ming; Li, Zhao-Jie; Xue, Yong; Xu, Jie
2011-11-01
The present study is to investigate the feasibility of multi-elements analysis in determination of the geographical origin of sea cucumber Apostichopus japonicus, and to make choice of the effective tracers in sea cucumber Apostichopus japonicus geographical origin assessment. The content of the elements such as Al, V, Cr, Mn, Fe, Co, Ni, Cu, Zn, As, Se, Mo, Cd, Hg and Pb in sea cucumber Apostichopus japonicus samples from seven places of geographical origin were determined by means of ICP-MS. The results were used for the development of elements database. Cluster analysis(CA) and principal component analysis (PCA) were applied to differentiate the sea cucumber Apostichopus japonicus geographical origin. Three principal components which accounted for over 89% of the total variance were extracted from the standardized data. The results of Q-type cluster analysis showed that the 26 samples could be clustered reasonably into five groups, the classification results were significantly associated with the marine distribution of the sea cucumber Apostichopus japonicus samples. The CA and PCA were the effective methods for elements analysis of sea cucumber Apostichopus japonicus samples. The content of the mineral elements in sea cucumber Apostichopus japonicus samples was good chemical descriptors for differentiating their geographical origins.
NASA Astrophysics Data System (ADS)
Norinder, Ulf
1990-12-01
An experimental design based 3-D QSAR analysis using a combination of principal component and PLS analysis is presented and applied to human corticosteroid-binding globulin complexes. The predictive capability of the created model is good. The technique can also be used as guidance when selecting new compounds to be investigated.
Yongqiang Liu
2003-01-01
The relations between monthly-seasonal soil moisture and precipitation variability are investigated by identifying the coupled patterns of the two hydrological fields using singular value decomposition (SVD). SVD is a technique of principal component analysis similar to empirical orthogonal knctions (EOF). However, it is applied to two variables simultaneously and is...
Foch, Eric; Milner, Clare E
2014-01-03
Iliotibial band syndrome (ITBS) is a common knee overuse injury among female runners. Atypical discrete trunk and lower extremity biomechanics during running may be associated with the etiology of ITBS. Examining discrete data points limits the interpretation of a waveform to a single value. Characterizing entire kinematic and kinetic waveforms may provide additional insight into biomechanical factors associated with ITBS. Therefore, the purpose of this cross-sectional investigation was to determine whether female runners with previous ITBS exhibited differences in kinematics and kinetics compared to controls using a principal components analysis (PCA) approach. Forty participants comprised two groups: previous ITBS and controls. Principal component scores were retained for the first three principal components and were analyzed using independent t-tests. The retained principal components accounted for 93-99% of the total variance within each waveform. Runners with previous ITBS exhibited low principal component one scores for frontal plane hip angle. Principal component one accounted for the overall magnitude in hip adduction which indicated that runners with previous ITBS assumed less hip adduction throughout stance. No differences in the remaining retained principal component scores for the waveforms were detected among groups. A smaller hip adduction angle throughout the stance phase of running may be a compensatory strategy to limit iliotibial band strain. This running strategy may have persisted after ITBS symptoms subsided. © 2013 Published by Elsevier Ltd.
Yazdani, Farzaneh; Razeghi, Mohsen; Karimi, Mohammad Taghi; Raeisi Shahraki, Hadi; Salimi Bani, Milad
2018-05-01
Despite the theoretical link between foot hyperpronation and biomechanical dysfunction of the pelvis, the literature lacks evidence that confirms this assumption in truly hyperpronated feet subjects during gait. Changes in the kinematic pattern of the pelvic segment were assessed in 15 persons with hyperpronated feet and compared to a control group of 15 persons with normally aligned feet during the stance phase of gait based on biomechanical musculoskeletal simulation. Kinematic and kinetic data were collected while participants walked at a comfortable self-selected speed. A generic OpenSim musculoskeletal model with 23 degrees of freedom and 92 muscles was scaled for each participant. OpenSim inverse kinematic analysis was applied to calculate segment angles in the sagittal, frontal and horizontal planes. Principal component analysis was employed as a data reduction technique, as well as a computational tool to obtain principal component scores. Independent-sample t-test was used to detect group differences. The difference between groups in scores for the first principal component in the sagittal plane was statistically significant (p = 0.01; effect size = 1.06), but differences between principal component scores in the frontal and horizontal planes were not significant. The hyperpronation group had greater anterior pelvic tilt during 20%-80% of the stance phase. In conclusion, in persons with hyperpronation we studied the role of the pelvic segment was mainly to maintain postural balance in the sagittal plane by increasing anterior pelvic inclination. Since anterior pelvic tilt may be associated with low back symptoms, the evaluation of foot posture should be considered in assessing the patients with low back and pelvic dysfunction.
Tipton, John; Hooten, Mevin B.; Goring, Simon
2017-01-01
Scientific records of temperature and precipitation have been kept for several hundred years, but for many areas, only a shorter record exists. To understand climate change, there is a need for rigorous statistical reconstructions of the paleoclimate using proxy data. Paleoclimate proxy data are often sparse, noisy, indirect measurements of the climate process of interest, making each proxy uniquely challenging to model statistically. We reconstruct spatially explicit temperature surfaces from sparse and noisy measurements recorded at historical United States military forts and other observer stations from 1820 to 1894. One common method for reconstructing the paleoclimate from proxy data is principal component regression (PCR). With PCR, one learns a statistical relationship between the paleoclimate proxy data and a set of climate observations that are used as patterns for potential reconstruction scenarios. We explore PCR in a Bayesian hierarchical framework, extending classical PCR in a variety of ways. First, we model the latent principal components probabilistically, accounting for measurement error in the observational data. Next, we extend our method to better accommodate outliers that occur in the proxy data. Finally, we explore alternatives to the truncation of lower-order principal components using different regularization techniques. One fundamental challenge in paleoclimate reconstruction efforts is the lack of out-of-sample data for predictive validation. Cross-validation is of potential value, but is computationally expensive and potentially sensitive to outliers in sparse data scenarios. To overcome the limitations that a lack of out-of-sample records presents, we test our methods using a simulation study, applying proper scoring rules including a computationally efficient approximation to leave-one-out cross-validation using the log score to validate model performance. The result of our analysis is a spatially explicit reconstruction of spatio-temporal temperature from a very sparse historical record.
Estimation of surface curvature from full-field shape data using principal component analysis
NASA Astrophysics Data System (ADS)
Sharma, Sameer; Vinuchakravarthy, S.; Subramanian, S. J.
2017-01-01
Three-dimensional digital image correlation (3D-DIC) is a popular image-based experimental technique for estimating surface shape, displacements and strains of deforming objects. In this technique, a calibrated stereo rig is used to obtain and stereo-match pairs of images of the object of interest from which the shapes of the imaged surface are then computed using the calibration parameters of the rig. Displacements are obtained by performing an additional temporal correlation of the shapes obtained at various stages of deformation and strains by smoothing and numerically differentiating the displacement data. Since strains are of primary importance in solid mechanics, significant efforts have been put into computation of strains from the measured displacement fields; however, much less attention has been paid to date to computation of curvature from the measured 3D surfaces. In this work, we address this gap by proposing a new method of computing curvature from full-field shape measurements using principal component analysis (PCA) along the lines of a similar work recently proposed to measure strains (Grama and Subramanian 2014 Exp. Mech. 54 913-33). PCA is a multivariate analysis tool that is widely used to reveal relationships between a large number of variables, reduce dimensionality and achieve significant denoising. This technique is applied here to identify dominant principal components in the shape fields measured by 3D-DIC and these principal components are then differentiated systematically to obtain the first and second fundamental forms used in the curvature calculation. The proposed method is first verified using synthetically generated noisy surfaces and then validated experimentally on some real world objects with known ground-truth curvatures.
Söhn, Matthias; Alber, Markus; Yan, Di
2007-09-01
The variability of dose-volume histogram (DVH) shapes in a patient population can be quantified using principal component analysis (PCA). We applied this to rectal DVHs of prostate cancer patients and investigated the correlation of the PCA parameters with late bleeding. PCA was applied to the rectal wall DVHs of 262 patients, who had been treated with a four-field box, conformal adaptive radiotherapy technique. The correlated changes in the DVH pattern were revealed as "eigenmodes," which were ordered by their importance to represent data set variability. Each DVH is uniquely characterized by its principal components (PCs). The correlation of the first three PCs and chronic rectal bleeding of Grade 2 or greater was investigated with uni- and multivariate logistic regression analyses. Rectal wall DVHs in four-field conformal RT can primarily be represented by the first two or three PCs, which describe approximately 94% or 96% of the DVH shape variability, respectively. The first eigenmode models the total irradiated rectal volume; thus, PC1 correlates to the mean dose. Mode 2 describes the interpatient differences of the relative rectal volume in the two- or four-field overlap region. Mode 3 reveals correlations of volumes with intermediate doses ( approximately 40-45 Gy) and volumes with doses >70 Gy; thus, PC3 is associated with the maximal dose. According to univariate logistic regression analysis, only PC2 correlated significantly with toxicity. However, multivariate logistic regression analysis with the first two or three PCs revealed an increased probability of bleeding for DVHs with more than one large PC. PCA can reveal the correlation structure of DVHs for a patient population as imposed by the treatment technique and provide information about its relationship to toxicity. It proves useful for augmenting normal tissue complication probability modeling approaches.
Bouhlel, Jihéne; Jouan-Rimbaud Bouveresse, Delphine; Abouelkaram, Said; Baéza, Elisabeth; Jondreville, Catherine; Travel, Angélique; Ratel, Jérémy; Engel, Erwan; Rutledge, Douglas N
2018-02-01
The aim of this work is to compare a novel exploratory chemometrics method, Common Components Analysis (CCA), with Principal Components Analysis (PCA) and Independent Components Analysis (ICA). CCA consists in adapting the multi-block statistical method known as Common Components and Specific Weights Analysis (CCSWA or ComDim) by applying it to a single data matrix, with one variable per block. As an application, the three methods were applied to SPME-GC-MS volatolomic signatures of livers in an attempt to reveal volatile organic compounds (VOCs) markers of chicken exposure to different types of micropollutants. An application of CCA to the initial SPME-GC-MS data revealed a drift in the sample Scores along CC2, as a function of injection order, probably resulting from time-related evolution in the instrument. This drift was eliminated by orthogonalization of the data set with respect to CC2, and the resulting data are used as the orthogonalized data input into each of the three methods. Since the first step in CCA is to norm-scale all the variables, preliminary data scaling has no effect on the results, so that CCA was applied only to orthogonalized SPME-GC-MS data, while, PCA and ICA were applied to the "orthogonalized", "orthogonalized and Pareto-scaled", and "orthogonalized and autoscaled" data. The comparison showed that PCA results were highly dependent on the scaling of variables, contrary to ICA where the data scaling did not have a strong influence. Nevertheless, for both PCA and ICA the clearest separations of exposed groups were obtained after autoscaling of variables. The main part of this work was to compare the CCA results using the orthogonalized data with those obtained with PCA and ICA applied to orthogonalized and autoscaled variables. The clearest separations of exposed chicken groups were obtained by CCA. CCA Loadings also clearly identified the variables contributing most to the Common Components giving separations. The PCA Loadings did not highlight the most influencing variables for each separation, whereas the ICA Loadings highlighted the same variables as did CCA. This study shows the potential of CCA for the extraction of pertinent information from a data matrix, using a procedure based on an original optimisation criterion, to produce results that are complementary, and in some cases may be superior, to those of PCA and ICA. Copyright © 2017 Elsevier B.V. All rights reserved.
Functional data analysis of sleeping energy expenditure.
Lee, Jong Soo; Zakeri, Issa F; Butte, Nancy F
2017-01-01
Adequate sleep is crucial during childhood for metabolic health, and physical and cognitive development. Inadequate sleep can disrupt metabolic homeostasis and alter sleeping energy expenditure (SEE). Functional data analysis methods were applied to SEE data to elucidate the population structure of SEE and to discriminate SEE between obese and non-obese children. Minute-by-minute SEE in 109 children, ages 5-18, was measured in room respiration calorimeters. A smoothing spline method was applied to the calorimetric data to extract the true smoothing function for each subject. Functional principal component analysis was used to capture the important modes of variation of the functional data and to identify differences in SEE patterns. Combinations of functional principal component analysis and classifier algorithm were used to classify SEE. Smoothing effectively removed instrumentation noise inherent in the room calorimeter data, providing more accurate data for analysis of the dynamics of SEE. SEE exhibited declining but subtly undulating patterns throughout the night. Mean SEE was markedly higher in obese than non-obese children, as expected due to their greater body mass. SEE was higher among the obese than non-obese children (p<0.01); however, the weight-adjusted mean SEE was not statistically different (p>0.1, after post hoc testing). Functional principal component scores for the first two components explained 77.8% of the variance in SEE and also differed between groups (p = 0.037). Logistic regression, support vector machine or random forest classification methods were able to distinguish weight-adjusted SEE between obese and non-obese participants with good classification rates (62-64%). Our results implicate other factors, yet to be uncovered, that affect the weight-adjusted SEE of obese and non-obese children. Functional data analysis revealed differences in the structure of SEE between obese and non-obese children that may contribute to disruption of metabolic homeostasis.
NASA Astrophysics Data System (ADS)
Durigon, Angelica; Lier, Quirijn de Jong van; Metselaar, Klaas
2016-10-01
To date, measuring plant transpiration at canopy scale is laborious and its estimation by numerical modelling can be used to assess high time frequency data. When using the model by Jacobs (1994) to simulate transpiration of water stressed plants it needs to be reparametrized. We compare the importance of model variables affecting simulated transpiration of water stressed plants. A systematic literature review was performed to recover existing parameterizations to be tested in the model. Data from a field experiment with common bean under full and deficit irrigation were used to correlate estimations to forcing variables applying principal component analysis. New parameterizations resulted in a moderate reduction of prediction errors and in an increase in model performance. Ags model was sensitive to changes in the mesophyll conductance and leaf angle distribution parameterizations, allowing model improvement. Simulated transpiration could be separated in temporal components. Daily, afternoon depression and long-term components for the fully irrigated treatment were more related to atmospheric forcing variables (specific humidity deficit between stomata and air, relative air humidity and canopy temperature). Daily and afternoon depression components for the deficit-irrigated treatment were related to both atmospheric and soil dryness, and long-term component was related to soil dryness.
Phytoplankton across Tropical and Subtropical Regions of the Atlantic, Indian and Pacific Oceans
Estrada, Marta; Delgado, Maximino; Blasco, Dolors; Latasa, Mikel; Cabello, Ana María; Benítez-Barrios, Verónica; Fraile-Nuez, Eugenio; Mozetič, Patricija; Vidal, Montserrat
2016-01-01
We examine the large-scale distribution patterns of the nano- and microphytoplankton collected from 145 oceanic stations, at 3 m depth, the 20% light level and the depth of the subsurface chlorophyll maximum, during the Malaspina-2010 Expedition (December 2010-July 2011), which covered 15 biogeographical provinces across the Atlantic, Indian and Pacific oceans, between 35°N and 40°S. In general, the water column was stratified, the surface layers were nutrient-poor and the nano- and microplankton (hereafter phytoplankton, for simplicity, although it included also heterotrophic protists) community was dominated by dinoflagellates, other flagellates and coccolithophores, while the contribution of diatoms was only important in zones with shallow nutriclines such as the equatorial upwelling regions. We applied a principal component analysis to the correlation matrix among the abundances (after logarithmic transform) of the 76 most frequent taxa to synthesize the information contained in the phytoplankton data set. The main trends of variability identified consisted of: 1) A contrast between the community composition of the upper and the lower parts of the euphotic zone, expressed respectively by positive or negative scores of the first principal component, which was positively correlated with taxa such as the dinoflagellates Oxytoxum minutum and Scrippsiella spp., and the coccolithophores Discosphaera tubifera and Syracosphaera pulchra (HOL and HET), and negatively correlated with taxa like Ophiaster hydroideus (coccolithophore) and several diatoms, 2) a general abundance gradient between phytoplankton-rich regions with high abundances of dinoflagellate, coccolithophore and ciliate taxa, and phytoplankton-poor regions (second principal component), 3) differences in dominant phytoplankton and ciliate taxa among the Atlantic, the Indian and the Pacific oceans (third principal component) and 4) the occurrence of a diatom-dominated assemblage (the fourth principal component assemblage), including several pennate taxa, Planktoniella sol, Hemiaulus hauckii and Pseudo-nitzschia spp., in the divergence regions. Our findings indicate that consistent assemblages of co-occurring phytoplankton taxa can be identified and that their distribution is best explained by a combination in different degrees of both environmental and historical influences. PMID:26982180
Phytoplankton across Tropical and Subtropical Regions of the Atlantic, Indian and Pacific Oceans.
Estrada, Marta; Delgado, Maximino; Blasco, Dolors; Latasa, Mikel; Cabello, Ana María; Benítez-Barrios, Verónica; Fraile-Nuez, Eugenio; Mozetič, Patricija; Vidal, Montserrat
2016-01-01
We examine the large-scale distribution patterns of the nano- and microphytoplankton collected from 145 oceanic stations, at 3 m depth, the 20% light level and the depth of the subsurface chlorophyll maximum, during the Malaspina-2010 Expedition (December 2010-July 2011), which covered 15 biogeographical provinces across the Atlantic, Indian and Pacific oceans, between 35°N and 40°S. In general, the water column was stratified, the surface layers were nutrient-poor and the nano- and microplankton (hereafter phytoplankton, for simplicity, although it included also heterotrophic protists) community was dominated by dinoflagellates, other flagellates and coccolithophores, while the contribution of diatoms was only important in zones with shallow nutriclines such as the equatorial upwelling regions. We applied a principal component analysis to the correlation matrix among the abundances (after logarithmic transform) of the 76 most frequent taxa to synthesize the information contained in the phytoplankton data set. The main trends of variability identified consisted of: 1) A contrast between the community composition of the upper and the lower parts of the euphotic zone, expressed respectively by positive or negative scores of the first principal component, which was positively correlated with taxa such as the dinoflagellates Oxytoxum minutum and Scrippsiella spp., and the coccolithophores Discosphaera tubifera and Syracosphaera pulchra (HOL and HET), and negatively correlated with taxa like Ophiaster hydroideus (coccolithophore) and several diatoms, 2) a general abundance gradient between phytoplankton-rich regions with high abundances of dinoflagellate, coccolithophore and ciliate taxa, and phytoplankton-poor regions (second principal component), 3) differences in dominant phytoplankton and ciliate taxa among the Atlantic, the Indian and the Pacific oceans (third principal component) and 4) the occurrence of a diatom-dominated assemblage (the fourth principal component assemblage), including several pennate taxa, Planktoniella sol, Hemiaulus hauckii and Pseudo-nitzschia spp., in the divergence regions. Our findings indicate that consistent assemblages of co-occurring phytoplankton taxa can be identified and that their distribution is best explained by a combination in different degrees of both environmental and historical influences.
A Nonlinear Model for Gene-Based Gene-Environment Interaction.
Sa, Jian; Liu, Xu; He, Tao; Liu, Guifen; Cui, Yuehua
2016-06-04
A vast amount of literature has confirmed the role of gene-environment (G×E) interaction in the etiology of complex human diseases. Traditional methods are predominantly focused on the analysis of interaction between a single nucleotide polymorphism (SNP) and an environmental variable. Given that genes are the functional units, it is crucial to understand how gene effects (rather than single SNP effects) are influenced by an environmental variable to affect disease risk. Motivated by the increasing awareness of the power of gene-based association analysis over single variant based approach, in this work, we proposed a sparse principle component regression (sPCR) model to understand the gene-based G×E interaction effect on complex disease. We first extracted the sparse principal components for SNPs in a gene, then the effect of each principal component was modeled by a varying-coefficient (VC) model. The model can jointly model variants in a gene in which their effects are nonlinearly influenced by an environmental variable. In addition, the varying-coefficient sPCR (VC-sPCR) model has nice interpretation property since the sparsity on the principal component loadings can tell the relative importance of the corresponding SNPs in each component. We applied our method to a human birth weight dataset in Thai population. We analyzed 12,005 genes across 22 chromosomes and found one significant interaction effect using the Bonferroni correction method and one suggestive interaction. The model performance was further evaluated through simulation studies. Our model provides a system approach to evaluate gene-based G×E interaction.
NASA Astrophysics Data System (ADS)
Fang, Leyuan; Wang, Chong; Li, Shutao; Yan, Jun; Chen, Xiangdong; Rabbani, Hossein
2017-11-01
We present an automatic method, termed as the principal component analysis network with composite kernel (PCANet-CK), for the classification of three-dimensional (3-D) retinal optical coherence tomography (OCT) images. Specifically, the proposed PCANet-CK method first utilizes the PCANet to automatically learn features from each B-scan of the 3-D retinal OCT images. Then, multiple kernels are separately applied to a set of very important features of the B-scans and these kernels are fused together, which can jointly exploit the correlations among features of the 3-D OCT images. Finally, the fused (composite) kernel is incorporated into an extreme learning machine for the OCT image classification. We tested our proposed algorithm on two real 3-D spectral domain OCT (SD-OCT) datasets (of normal subjects and subjects with the macular edema and age-related macular degeneration), which demonstrated its effectiveness.
Portable XRF and principal component analysis for bill characterization in forensic science.
Appoloni, C R; Melquiades, F L
2014-02-01
Several modern techniques have been applied to prevent counterfeiting of money bills. The objective of this study was to demonstrate the potential of Portable X-ray Fluorescence (PXRF) technique and the multivariate analysis method of Principal Component Analysis (PCA) for classification of bills in order to use it in forensic science. Bills of Dollar, Euro and Real (Brazilian currency) were measured directly at different colored regions, without any previous preparation. Spectra interpretation allowed the identification of Ca, Ti, Fe, Cu, Sr, Y, Zr and Pb. PCA analysis separated the bills in three groups and subgroups among Brazilian currency. In conclusion, the samples were classified according to its origin identifying the elements responsible for differentiation and basic pigment composition. PXRF allied to multivariate discriminate methods is a promising technique for rapid and no destructive identification of false bills in forensic science. Copyright © 2013 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Bhushan, A.; Sharker, M. H.; Karimi, H. A.
2015-07-01
In this paper, we address outliers in spatiotemporal data streams obtained from sensors placed across geographically distributed locations. Outliers may appear in such sensor data due to various reasons such as instrumental error and environmental change. Real-time detection of these outliers is essential to prevent propagation of errors in subsequent analyses and results. Incremental Principal Component Analysis (IPCA) is one possible approach for detecting outliers in such type of spatiotemporal data streams. IPCA has been widely used in many real-time applications such as credit card fraud detection, pattern recognition, and image analysis. However, the suitability of applying IPCA for outlier detection in spatiotemporal data streams is unknown and needs to be investigated. To fill this research gap, this paper contributes by presenting two new IPCA-based outlier detection methods and performing a comparative analysis with the existing IPCA-based outlier detection methods to assess their suitability for spatiotemporal sensor data streams.
Preliminary study of soil permeability properties using principal component analysis
NASA Astrophysics Data System (ADS)
Yulianti, M.; Sudriani, Y.; Rustini, H. A.
2018-02-01
Soil permeability measurement is undoubtedly important in carrying out soil-water research such as rainfall-runoff modelling, irrigation water distribution systems, etc. It is also known that acquiring reliable soil permeability data is rather laborious, time-consuming, and costly. Therefore, it is desirable to develop the prediction model. Several studies of empirical equations for predicting permeability have been undertaken by many researchers. These studies derived the models from areas which soil characteristics are different from Indonesian soil, which suggest a possibility that these permeability models are site-specific. The purpose of this study is to identify which soil parameters correspond strongly to soil permeability and propose a preliminary model for permeability prediction. Principal component analysis (PCA) was applied to 16 parameters analysed from 37 sites consist of 91 samples obtained from Batanghari Watershed. Findings indicated five variables that have strong correlation with soil permeability, and we recommend a preliminary permeability model, which is potential for further development.
Rowlands, G J; Musoke, A J; Morzaria, S P; Nagda, S M; Ballingall, K T; McKeever, D J
2000-04-01
A statistically derived disease reaction index based on parasitological, clinical and haematological measurements observed in 309 5 to 8-month-old Boran cattle following laboratory challenge with Theileria parva is described. Principal component analysis was applied to 13 measures including first appearance of schizonts, first appearance of piroplasms and first occurrence of pyrexia, together with the duration and severity of these symptoms, and white blood cell count. The first principal component, which was based on approximately equal contributions of the 13 variables, provided the definition for the disease reaction index, defined on a scale of 0-10. As well as providing a more objective measure of the severity of the reaction, the continuous nature of the index score enables more powerful statistical analysis of the data compared with that which has been previously possible through clinically derived categories of non-, mild, moderate and severe reactions.
Optical Aptasensors for Adenosine Triphosphate
Ng, Stella; Lim, Hui Si; Ma, Qian; Gao, Zhiqiang
2016-01-01
Nucleic acids are among the most researched and applied biomolecules. Their diverse two- and three-dimensional structures in conjunction with their robust chemistry and ease of manipulation provide a rare opportunity for sensor applications. Moreover, their high biocompatibility has seen them being used in the construction of in vivo assays. Various nucleic acid-based devices have been extensively studied as either the principal element in discrete molecule-like sensors or as the main component in the fabrication of sensing devices. The use of aptamers in sensors - aptasensors, in particular, has led to improvements in sensitivity, selectivity, and multiplexing capacity for a wide verity of analytes like proteins, nucleic acids, as well as small biomolecules such as glucose and adenosine triphosphate (ATP). This article reviews the progress in the use of aptamers as the principal component in sensors for optical detection of ATP with an emphasis on sensing mechanism, performance, and applications with some discussion on challenges and perspectives. PMID:27446501
Optical Aptasensors for Adenosine Triphosphate.
Ng, Stella; Lim, Hui Si; Ma, Qian; Gao, Zhiqiang
2016-01-01
Nucleic acids are among the most researched and applied biomolecules. Their diverse two- and three-dimensional structures in conjunction with their robust chemistry and ease of manipulation provide a rare opportunity for sensor applications. Moreover, their high biocompatibility has seen them being used in the construction of in vivo assays. Various nucleic acid-based devices have been extensively studied as either the principal element in discrete molecule-like sensors or as the main component in the fabrication of sensing devices. The use of aptamers in sensors - aptasensors, in particular, has led to improvements in sensitivity, selectivity, and multiplexing capacity for a wide verity of analytes like proteins, nucleic acids, as well as small biomolecules such as glucose and adenosine triphosphate (ATP). This article reviews the progress in the use of aptamers as the principal component in sensors for optical detection of ATP with an emphasis on sensing mechanism, performance, and applications with some discussion on challenges and perspectives.
DeWalt, Emma L.; Begue, Victoria J.; Ronau, Judith A.; Sullivan, Shane Z.; Das, Chittaranjan; Simpson, Garth J.
2013-01-01
Polarization-resolved second-harmonic generation (PR-SHG) microscopy is described and applied to identify the presence of multiple crystallographic domains within protein-crystal conglomerates, which was confirmed by synchrotron X-ray diffraction. Principal component analysis (PCA) of PR-SHG images resulted in principal component 2 (PC2) images with areas of contrasting negative and positive values for conglomerated crystals and PC2 images exhibiting uniformly positive or uniformly negative values for single crystals. Qualitative assessment of PC2 images allowed the identification of domains of different internal ordering within protein-crystal samples as well as differentiation between multi-domain conglomerated crystals and single crystals. PR-SHG assessments of crystalline domains were in good agreement with spatially resolved synchrotron X-ray diffraction measurements. These results have implications for improving the productive throughput of protein structure determination through early identification of multi-domain crystals. PMID:23275165
NASA Astrophysics Data System (ADS)
Jha, S. K.; Brockman, R. A.; Hoffman, R. M.; Sinha, V.; Pilchak, A. L.; Porter, W. J.; Buchanan, D. J.; Larsen, J. M.; John, R.
2018-05-01
Principal component analysis and fuzzy c-means clustering algorithms were applied to slip-induced strain and geometric metric data in an attempt to discover unique microstructural configurations and their frequencies of occurrence in statistically representative instantiations of a titanium alloy microstructure. Grain-averaged fatigue indicator parameters were calculated for the same instantiation. The fatigue indicator parameters strongly correlated with the spatial location of the microstructural configurations in the principal components space. The fuzzy c-means clustering method identified clusters of data that varied in terms of their average fatigue indicator parameters. Furthermore, the number of points in each cluster was inversely correlated to the average fatigue indicator parameter. This analysis demonstrates that data-driven methods have significant potential for providing unbiased determination of unique microstructural configurations and their frequencies of occurrence in a given volume from the point of view of strain localization and fatigue crack initiation.
Classification of adulterated honeys by multivariate analysis.
Amiry, Saber; Esmaiili, Mohsen; Alizadeh, Mohammad
2017-06-01
In this research, honey samples were adulterated with date syrup (DS) and invert sugar syrup (IS) at three concentrations (7%, 15% and 30%). 102 adulterated samples were prepared in six batches with 17 replications for each batch. For each sample, 32 parameters including color indices, rheological, physical, and chemical parameters were determined. To classify the samples, based on type and concentrations of adulterant, a multivariate analysis was applied using principal component analysis (PCA) followed by a linear discriminant analysis (LDA). Then, 21 principal components (PCs) were selected in five sets. Approximately two-thirds were identified correctly using color indices (62.75%) or rheological properties (67.65%). A power discrimination was obtained using physical properties (97.06%), and the best separations were achieved using two sets of chemical properties (set 1: lactone, diastase activity, sucrose - 100%) (set 2: free acidity, HMF, ash - 95%). Copyright © 2016 Elsevier Ltd. All rights reserved.
Multivariate Statistical Analysis of MSL APXS Bulk Geochemical Data
NASA Astrophysics Data System (ADS)
Hamilton, V. E.; Edwards, C. S.; Thompson, L. M.; Schmidt, M. E.
2014-12-01
We apply cluster and factor analyses to bulk chemical data of 130 soil and rock samples measured by the Alpha Particle X-ray Spectrometer (APXS) on the Mars Science Laboratory (MSL) rover Curiosity through sol 650. Multivariate approaches such as principal components analysis (PCA), cluster analysis, and factor analysis compliment more traditional approaches (e.g., Harker diagrams), with the advantage of simultaneously examining the relationships between multiple variables for large numbers of samples. Principal components analysis has been applied with success to APXS, Pancam, and Mössbauer data from the Mars Exploration Rovers. Factor analysis and cluster analysis have been applied with success to thermal infrared (TIR) spectral data of Mars. Cluster analyses group the input data by similarity, where there are a number of different methods for defining similarity (hierarchical, density, distribution, etc.). For example, without any assumptions about the chemical contributions of surface dust, preliminary hierarchical and K-means cluster analyses clearly distinguish the physically adjacent rock targets Windjana and Stephen as being distinctly different than lithologies observed prior to Curiosity's arrival at The Kimberley. In addition, they are separated from each other, consistent with chemical trends observed in variation diagrams but without requiring assumptions about chemical relationships. We will discuss the variation in cluster analysis results as a function of clustering method and pre-processing (e.g., log transformation, correction for dust cover) and implications for interpreting chemical data. Factor analysis shares some similarities with PCA, and examines the variability among observed components of a dataset so as to reveal variations attributable to unobserved components. Factor analysis has been used to extract the TIR spectra of components that are typically observed in mixtures and only rarely in isolation; there is the potential for similar results with data from APXS. These techniques offer new ways to understand the chemical relationships between the materials interrogated by Curiosity, and potentially their relation to materials observed by APXS instruments on other landed missions.
NASA Astrophysics Data System (ADS)
Polat, Esra; Gunay, Suleyman
2013-10-01
One of the problems encountered in Multiple Linear Regression (MLR) is multicollinearity, which causes the overestimation of the regression parameters and increase of the variance of these parameters. Hence, in case of multicollinearity presents, biased estimation procedures such as classical Principal Component Regression (CPCR) and Partial Least Squares Regression (PLSR) are then performed. SIMPLS algorithm is the leading PLSR algorithm because of its speed, efficiency and results are easier to interpret. However, both of the CPCR and SIMPLS yield very unreliable results when the data set contains outlying observations. Therefore, Hubert and Vanden Branden (2003) have been presented a robust PCR (RPCR) method and a robust PLSR (RPLSR) method called RSIMPLS. In RPCR, firstly, a robust Principal Component Analysis (PCA) method for high-dimensional data on the independent variables is applied, then, the dependent variables are regressed on the scores using a robust regression method. RSIMPLS has been constructed from a robust covariance matrix for high-dimensional data and robust linear regression. The purpose of this study is to show the usage of RPCR and RSIMPLS methods on an econometric data set, hence, making a comparison of two methods on an inflation model of Turkey. The considered methods have been compared in terms of predictive ability and goodness of fit by using a robust Root Mean Squared Error of Cross-validation (R-RMSECV), a robust R2 value and Robust Component Selection (RCS) statistic.
Dale L. Bartos; Gordon D. Booth
1994-01-01
Temperature measurements were made to better understand the role of microclimate on mountain pine beetle, Dendroctonus ponderosae Hopkins (Coleoptera: Scolytidae), activity as a result of thinning lodgepole pine stands. Sampling was done over 61 days on the north slope of the Unita Mountain Range in northeastern Utah. Principal components analysis was applied to all...
Zachery A. Holden; Michael A. Crimmins; Samuel A. Cushman; Jeremy S. Littell
2010-01-01
Accurate, fine spatial resolution predictions of surface air temperatures are critical for understanding many hydrologic and ecological processes. This study examines the spatial and temporal variability in nocturnal air temperatures across a mountainous region of Northern Idaho. Principal components analysis (PCA) was applied to a network of 70 Hobo temperature...
Nonlinear Principal Components Analysis: Introduction and Application
ERIC Educational Resources Information Center
Linting, Marielle; Meulman, Jacqueline J.; Groenen, Patrick J. F.; van der Koojj, Anita J.
2007-01-01
The authors provide a didactic treatment of nonlinear (categorical) principal components analysis (PCA). This method is the nonlinear equivalent of standard PCA and reduces the observed variables to a number of uncorrelated principal components. The most important advantages of nonlinear over linear PCA are that it incorporates nominal and ordinal…
USDA-ARS?s Scientific Manuscript database
Selective principal component regression analysis (SPCR) uses a subset of the original image bands for principal component transformation and regression. For optimal band selection before the transformation, this paper used genetic algorithms (GA). In this case, the GA process used the regression co...
Similarities between principal components of protein dynamics and random diffusion
NASA Astrophysics Data System (ADS)
Hess, Berk
2000-12-01
Principal component analysis, also called essential dynamics, is a powerful tool for finding global, correlated motions in atomic simulations of macromolecules. It has become an established technique for analyzing molecular dynamics simulations of proteins. The first few principal components of simulations of large proteins often resemble cosines. We derive the principal components for high-dimensional random diffusion, which are almost perfect cosines. This resemblance between protein simulations and noise implies that for many proteins the time scales of current simulations are too short to obtain convergence of collective motions.
Least Principal Components Analysis (LPCA): An Alternative to Regression Analysis.
ERIC Educational Resources Information Center
Olson, Jeffery E.
Often, all of the variables in a model are latent, random, or subject to measurement error, or there is not an obvious dependent variable. When any of these conditions exist, an appropriate method for estimating the linear relationships among the variables is Least Principal Components Analysis. Least Principal Components are robust, consistent,…
Identifying apple surface defects using principal components analysis and artifical neural networks
USDA-ARS?s Scientific Manuscript database
Artificial neural networks and principal components were used to detect surface defects on apples in near-infrared images. Neural networks were trained and tested on sets of principal components derived from columns of pixels from images of apples acquired at two wavelengths (740 nm and 950 nm). I...
Gao, Wen; Wang, Rui; Li, Dan; Liu, Ke; Chen, Jun; Li, Hui-Jun; Xu, Xiaojun; Li, Ping; Yang, Hua
2016-01-05
The flowers of Lonicera japonica Thunb. were extensively used to treat many diseases. As the demands for L. japonica increased, some related Lonicera plants were often confused or misused. Caffeoylquinic acids were always regarded as chemical markers in the quality control of L. japonica, but they could be found in all Lonicera species. Thus, a simple and reliable method for the evaluation of different Lonicera flowers is necessary to be established. In this work a method based on single standard to determine multi-components (SSDMC) combined with principal component analysis (PCA) for control and distinguish of Lonicera species flowers have been developed. Six components including three caffeoylquinic acids and three iridoid glycosides were assayed simultaneously using chlorogenic acid as the reference standard. The credibility and feasibility of the SSDMC method were carefully validated and the results demonstrated that there were no remarkable differences compared with external standard method. Finally, a total of fifty-one batches covering five Lonicera species were analyzed and PCA was successfully applied to distinguish the Lonicera species. This strategy simplifies the processes in the quality control of multiple-componential herbal medicine which effectively adapted for improving the quality control of those herbs belonging to closely related species. Copyright © 2015 Elsevier B.V. All rights reserved.
Quantitation of flavonoid constituents in citrus fruits.
Kawaii, S; Tomono, Y; Katase, E; Ogawa, K; Yano, M
1999-09-01
Twenty-four flavonoids have been determined in 66 Citrus species and near-citrus relatives, grown in the same field and year, by means of reversed phase high-performance liquid chromatography analysis. Statistical methods have been applied to find relations among the species. The F ratios of 21 flavonoids obtained by applying ANOVA analysis are significant, indicating that a classification of the species using these variables is reasonable to pursue. Principal component analysis revealed that the distributions of Citrus species belonging to different classes were largely in accordance with Tanaka's classification system.
Source localization of temporal lobe epilepsy using PCA-LORETA analysis on ictal EEG recordings.
Stern, Yaki; Neufeld, Miriam Y; Kipervasser, Svetlana; Zilberstein, Amir; Fried, Itzhak; Teicher, Mina; Adi-Japha, Esther
2009-04-01
Localizing the source of an epileptic seizure using noninvasive EEG suffers from inaccuracies produced by other generators not related to the epileptic source. The authors isolated the ictal epileptic activity, and applied a source localization algorithm to identify its estimated location. Ten ictal EEG scalp recordings from five different patients were analyzed. The patients were known to have temporal lobe epilepsy with a single epileptic focus that had a concordant MRI lesion. The patients had become seizure-free following partial temporal lobectomy. A midinterval (approximately 5 seconds) period of ictal activity was used for Principal Component Analysis starting at ictal onset. The level of epileptic activity at each electrode (i.e., the eigenvector of the component that manifest epileptic characteristic), was used as an input for low-resolution tomography analysis for EEG inverse solution (Zilberstain et al., 2004). The algorithm accurately and robustly identified the epileptic focus in these patients. Principal component analysis and source localization methods can be used in the future to monitor the progression of an epileptic seizure and its expansion to other areas.
Principal Component Analysis of Chinese Porcelains from the Five Dynasties to the Qing Dynasty
NASA Astrophysics Data System (ADS)
Yap, C. T.; Hua, Younan
1992-10-01
This is a study of the possibility of identifying antique Chinese porcelains according to the period or dynasty, using major and minor chemical components (SiO2 , Al2O3 , Fe2O3 , K2O, Na2O, CaO and MgO) from the body of the porcelain. Principal component analysis is applied to published data on 66 pieces of Chinese procelains made in Jingdezhen during the Five Dynasties and the Song, Yuan, Ming and Qing Dynasties. It is shown that porcelains made during the Five Dynasties and the Yuan (or Ming) and Qing Dynasties can be segregated completely without any overlap. However, there is appreciable overlap between the Five Dynasties and the Song Dynasty, some overlap between the Song and Ming Dynasties and also between the Yuan and Ming Dynasties. Interestingly, Qing procelains are well separated from all the others. The percentage of silica in the porcelain body decreases and that of alumina increases with recentness with the exception of the Yuan and Ming Dynasties, where this trend is reversed.
Chavez, P.S.; Kwarteng, A.Y.
1989-01-01
A challenge encountered with Landsat Thematic Mapper (TM) data, which includes data from size reflective spectral bands, is displaying as much information as possible in a three-image set for color compositing or digital analysis. Principal component analysis (PCA) applied to the six TM bands simultaneously is often used to address this problem. However, two problems that can be encountered using the PCA method are that information of interest might be mathematically mapped to one of the unused components and that a color composite can be difficult to interpret. "Selective' PCA can be used to minimize both of these problems. The spectral contrast among several spectral regions was mapped for a northern Arizona site using Landsat TM data. Field investigations determined that most of the spectral contrast seen in this area was due to one of the following: the amount of iron and hematite in the soils and rocks, vegetation differences, standing and running water, or the presence of gypsum, which has a higher moisture retention capability than do the surrounding soils and rocks. -from Authors
Optical system for tablet variety discrimination using visible/near-infrared spectroscopy
NASA Astrophysics Data System (ADS)
Shao, Yongni; He, Yong; Hu, Xingyue
2007-12-01
An optical system based on visible/near-infrared spectroscopy (Vis/NIRS) for variety discrimination of ginkgo (Ginkgo biloba L.) tablets was developed. This system consisted of a light source, beam splitter system, sample chamber, optical detector (diffuse reflection detector), and data collection. The tablet varieties used in the research include Da na kang, Xin bang, Tian bao ning, Yi kang, Hua na xing, Dou le, Lv yuan, Hai wang, and Ji yao. All samples (n=270) were scanned in the Vis/NIR region between 325 and 1075 nm using a spectrograph. The chemometrics method of principal component artificial neural network (PC-ANN) was used to establish discrimination models of them. In PC-ANN models, the scores of the principal components were chosen as the input nodes for the input layer of ANN, and the best discrimination rate of 91.1% was reached. Principal component analysis was also executed to select several optimal wavelengths based on loading values. Wavelengths at 481, 458, 466, 570, 1000, 662, and 400 nm were then used as the input data of stepwise multiple linear regression, the regression equation of ginkgo tablets was obtained, and the discrimination rate was researched 84.4%. The results indicated that this optical system could be applied to discriminating ginkgo (Ginkgo biloba L.) tablets, and it supplied a new method for fast ginkgo tablet variety discrimination.
Methods to assess an exercise intervention trial based on 3-level functional data.
Li, Haocheng; Kozey Keadle, Sarah; Staudenmayer, John; Assaad, Houssein; Huang, Jianhua Z; Carroll, Raymond J
2015-10-01
Motivated by data recording the effects of an exercise intervention on subjects' physical activity over time, we develop a model to assess the effects of a treatment when the data are functional with 3 levels (subjects, weeks and days in our application) and possibly incomplete. We develop a model with 3-level mean structure effects, all stratified by treatment and subject random effects, including a general subject effect and nested effects for the 3 levels. The mean and random structures are specified as smooth curves measured at various time points. The association structure of the 3-level data is induced through the random curves, which are summarized using a few important principal components. We use penalized splines to model the mean curves and the principal component curves, and cast the proposed model into a mixed effects model framework for model fitting, prediction and inference. We develop an algorithm to fit the model iteratively with the Expectation/Conditional Maximization Either (ECME) version of the EM algorithm and eigenvalue decompositions. Selection of the number of principal components and handling incomplete data issues are incorporated into the algorithm. The performance of the Wald-type hypothesis test is also discussed. The method is applied to the physical activity data and evaluated empirically by a simulation study. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
[Research on spectra recognition method for cabbages and weeds based on PCA and SIMCA].
Zu, Qin; Deng, Wei; Wang, Xiu; Zhao, Chun-Jiang
2013-10-01
In order to improve the accuracy and efficiency of weed identification, the difference of spectral reflectance was employed to distinguish between crops and weeds. Firstly, the different combinations of Savitzky-Golay (SG) convolutional derivation and multiplicative scattering correction (MSC) method were applied to preprocess the raw spectral data. Then the clustering analysis of various types of plants was completed by using principal component analysis (PCA) method, and the feature wavelengths which were sensitive for classifying various types of plants were extracted according to the corresponding loading plots of the optimal principal components in PCA results. Finally, setting the feature wavelengths as the input variables, the soft independent modeling of class analogy (SIMCA) classification method was used to identify the various types of plants. The experimental results of classifying cabbages and weeds showed that on the basis of the optimal pretreatment by a synthetic application of MSC and SG convolutional derivation with SG's parameters set as 1rd order derivation, 3th degree polynomial and 51 smoothing points, 23 feature wavelengths were extracted in accordance with the top three principal components in PCA results. When SIMCA method was used for classification while the previously selected 23 feature wavelengths were set as the input variables, the classification rates of the modeling set and the prediction set were respectively up to 98.6% and 100%.
Finding Planets in K2: A New Method of Cleaning the Data
NASA Astrophysics Data System (ADS)
Currie, Miles; Mullally, Fergal; Thompson, Susan E.
2017-01-01
We present a new method of removing systematic flux variations from K2 light curves by employing a pixel-level principal component analysis (PCA). This method decomposes the light curves into its principal components (eigenvectors), each with an associated eigenvalue, the value of which is correlated to how much influence the basis vector has on the shape of the light curve. This method assumes that the most influential basis vectors will correspond to the unwanted systematic variations in the light curve produced by K2’s constant motion. We correct the raw light curve by automatically fitting and removing the strongest principal components. The strongest principal components generally correspond to the flux variations that result from the motion of the star in the field of view. Our primary method of calculating the strongest principal components to correct for in the raw light curve estimates the noise by measuring the scatter in the light curve after using an algorithm for Savitsy-Golay detrending, which computes the combined photometric precision value (SG-CDPP value) used in classic Kepler. We calculate this value after correcting the raw light curve for each element in a list of cumulative sums of principal components so that we have as many noise estimate values as there are principal components. We then take the derivative of the list of SG-CDPP values and take the number of principal components that correlates to the point at which the derivative effectively goes to zero. This is the optimal number of principal components to exclude from the refitting of the light curve. We find that a pixel-level PCA is sufficient for cleaning unwanted systematic and natural noise from K2’s light curves. We present preliminary results and a basic comparison to other methods of reducing the noise from the flux variations.
Independent component analysis for automatic note extraction from musical trills
NASA Astrophysics Data System (ADS)
Brown, Judith C.; Smaragdis, Paris
2004-05-01
The method of principal component analysis, which is based on second-order statistics (or linear independence), has long been used for redundancy reduction of audio data. The more recent technique of independent component analysis, enforcing much stricter statistical criteria based on higher-order statistical independence, is introduced and shown to be far superior in separating independent musical sources. This theory has been applied to piano trills and a database of trill rates was assembled from experiments with a computer-driven piano, recordings of a professional pianist, and commercially available compact disks. The method of independent component analysis has thus been shown to be an outstanding, effective means of automatically extracting interesting musical information from a sea of redundant data.
Principal Components Analysis of Triaxial Vibration Data From Helicopter Transmissions
NASA Technical Reports Server (NTRS)
Tumer, Irem Y.; Huff, Edward M.
2001-01-01
Research on the nature of the vibration data collected from helicopter transmissions during flight experiments has led to several crucial observations believed to be responsible for the high rates of false alarms and missed detections in aircraft vibration monitoring systems. This work focuses on one such finding, namely, the need to consider additional sources of information about system vibrations. In this light, helicopter transmission vibration data, collected using triaxial accelerometers, were explored in three different directions, analyzed for content, and then combined using Principal Components Analysis (PCA) to analyze changes in directionality. In this paper, the PCA transformation is applied to 176 test conditions/data sets collected from an OH58C helicopter to derive the overall experiment-wide covariance matrix and its principal eigenvectors. The experiment-wide eigenvectors. are then projected onto the individual test conditions to evaluate changes and similarities in their directionality based on the various experimental factors. The paper will present the foundations of the proposed approach, addressing the question of whether experiment-wide eigenvectors accurately model the vibration modes in individual test conditions. The results will further determine the value of using directionality and triaxial accelerometers for vibration monitoring and anomaly detection.
40 CFR 60.2998 - What are the principal components of the model rule?
Code of Federal Regulations, 2010 CFR
2010-07-01
... 40 Protection of Environment 6 2010-07-01 2010-07-01 false What are the principal components of... December 9, 2004 Model Rule-Use of Model Rule § 60.2998 What are the principal components of the model rule... management plan. (c) Operator training and qualification. (d) Emission limitations and operating limits. (e...
40 CFR 60.2570 - What are the principal components of the model rule?
Code of Federal Regulations, 2010 CFR
2010-07-01
... 40 Protection of Environment 6 2010-07-01 2010-07-01 false What are the principal components of... Construction On or Before November 30, 1999 Use of Model Rule § 60.2570 What are the principal components of... (k) of this section. (a) Increments of progress toward compliance. (b) Waste management plan. (c...
2007-10-01
1984. Complex principal component analysis : Theory and examples. Journal of Climate and Applied Meteorology 23: 1660-1673. Hotelling, H. 1933...Sediments 99. ASCE: 2,566-2,581. Von Storch, H., and A. Navarra. 1995. Analysis of climate variability. Applications of statistical techniques. Berlin...ERDC TN-SWWRP-07-9 October 2007 Regional Morphology Empirical Analysis Package (RMAP): Orthogonal Function Analysis , Background and Examples by
Maisuradze, Gia G; Leitner, David M
2007-05-15
Dihedral principal component analysis (dPCA) has recently been developed and shown to display complex features of the free energy landscape of a biomolecule that may be absent in the free energy landscape plotted in principal component space due to mixing of internal and overall rotational motion that can occur in principal component analysis (PCA) [Mu et al., Proteins: Struct Funct Bioinfo 2005;58:45-52]. Another difficulty in the implementation of PCA is sampling convergence, which we address here for both dPCA and PCA using a tetrapeptide as an example. We find that for both methods the sampling convergence can be reached over a similar time. Minima in the free energy landscape in the space of the two largest dihedral principal components often correspond to unique structures, though we also find some distinct minima to correspond to the same structure. 2007 Wiley-Liss, Inc.
Zhang, Hong-Guang; Yang, Qin-Min; Lu, Jian-Gang
2014-04-01
In this paper, a novel discriminant methodology based on near infrared spectroscopic analysis technique and least square support vector machine was proposed for rapid and nondestructive discrimination of different types of Polyacrylamide. The diffuse reflectance spectra of samples of Non-ionic Polyacrylamide, Anionic Polyacrylamide and Cationic Polyacrylamide were measured. Then principal component analysis method was applied to reduce the dimension of the spectral data and extract of the principal compnents. The first three principal components were used for cluster analysis of the three different types of Polyacrylamide. Then those principal components were also used as inputs of least square support vector machine model. The optimization of the parameters and the number of principal components used as inputs of least square support vector machine model was performed through cross validation based on grid search. 60 samples of each type of Polyacrylamide were collected. Thus a total of 180 samples were obtained. 135 samples, 45 samples for each type of Polyacrylamide, were randomly split into a training set to build calibration model and the rest 45 samples were used as test set to evaluate the performance of the developed model. In addition, 5 Cationic Polyacrylamide samples and 5 Anionic Polyacrylamide samples adulterated with different proportion of Non-ionic Polyacrylamide were also prepared to show the feasibilty of the proposed method to discriminate the adulterated Polyacrylamide samples. The prediction error threshold for each type of Polyacrylamide was determined by F statistical significance test method based on the prediction error of the training set of corresponding type of Polyacrylamide in cross validation. The discrimination accuracy of the built model was 100% for prediction of the test set. The prediction of the model for the 10 mixing samples was also presented, and all mixing samples were accurately discriminated as adulterated samples. The overall results demonstrate that the discrimination method proposed in the present paper can rapidly and nondestructively discriminate the different types of Polyacrylamide and the adulterated Polyacrylamide samples, and offered a new approach to discriminate the types of Polyacrylamide.
Roberts, Miguel E; Han, Kyunghee; Weed, Nathan C
2006-09-01
This study documents the development of an MMPI-2 scale designed to assess features of the Korean culture-bound syndrome, Hwa-Byung (HB). An American research team and psychiatric practitioners in Korea created an 18-item HB scale via rational item selection and psycho-metric refinement. Principal components analysis of scale items revealed four components, reflecting content domains of general health, gastrointestinal symptoms, hopelessness, and anger. This four-component solution applied well to both Korean men and women, but not to an American sample. Although some findings were encouraging, future studies employing clinical samples are needed to provide further validation of this scale.
Multivariate analysis for scanning tunneling spectroscopy data
NASA Astrophysics Data System (ADS)
Yamanishi, Junsuke; Iwase, Shigeru; Ishida, Nobuyuki; Fujita, Daisuke
2018-01-01
We applied principal component analysis (PCA) to two-dimensional tunneling spectroscopy (2DTS) data obtained on a Si(111)-(7 × 7) surface to explore the effectiveness of multivariate analysis for interpreting 2DTS data. We demonstrated that several components that originated mainly from specific atoms at the Si(111)-(7 × 7) surface can be extracted by PCA. Furthermore, we showed that hidden components in the tunneling spectra can be decomposed (peak separation), which is difficult to achieve with normal 2DTS analysis without the support of theoretical calculations. Our analysis showed that multivariate analysis can be an additional powerful way to analyze 2DTS data and extract hidden information from a large amount of spectroscopic data.
ERIC Educational Resources Information Center
Oplatka, Izhar
2017-01-01
Purpose: In order to fill the gap in theoretical and empirical knowledge about the characteristics of principal workload, the purpose of this paper is to explore the components of principal workload as well as its determinants and the coping strategies commonly used by principals to face this personal state. Design/methodology/approach:…
Zeemering, Stef; Bonizzi, Pietro; Maesen, Bart; Peeters, Ralf; Schotten, Ulrich
2015-01-01
Spatiotemporal complexity of atrial fibrillation (AF) patterns is often quantified by annotated intracardiac contact mapping. We introduce a new approach that applies recurrence plot (RP) construction followed by recurrence quantification analysis (RQA) to epicardial atrial electrograms, recorded with a high-density grid of electrodes. In 32 patients with no history of AF (aAF, n=11), paroxysmal AF (PAF, n=12) and persistent AF (persAF, n=9), RPs were constructed using a phase space electrogram embedding dimension equal to the estimated AF cycle length. Spatial information was incorporated by 1) averaging the recurrence over all electrodes, and 2) by applying principal component analysis (PCA) to the matrix of embedded electrograms and selecting the first principal component as a representation of spatial diversity. Standard RQA parameters were computed on the constructed RPs and correlated to the number of fibrillation waves per AF cycle (NW). Averaged RP RQA parameters showed no correlation with NW. Correlations improved when applying PCA, with maximum correlation achieved between RP threshold and NW (RR1%, r=0.68, p <; 0.001) and RP determinism (DET, r=-0.64, p <; 0.001). All studied RQA parameters based on the PCA RP were able to discriminate between persAF and aAF/PAF (DET persAF 0.40 ± 0.11 vs. 0.59 ± 0.14/0.62 ± 0.16, p <; 0.01). RP construction and RQA combined with PCA provide a quick and reliable tool to visualize dynamical behaviour and to assess the complexity of contact mapping patterns in AF.
Considering Horn's Parallel Analysis from a Random Matrix Theory Point of View.
Saccenti, Edoardo; Timmerman, Marieke E
2017-03-01
Horn's parallel analysis is a widely used method for assessing the number of principal components and common factors. We discuss the theoretical foundations of parallel analysis for principal components based on a covariance matrix by making use of arguments from random matrix theory. In particular, we show that (i) for the first component, parallel analysis is an inferential method equivalent to the Tracy-Widom test, (ii) its use to test high-order eigenvalues is equivalent to the use of the joint distribution of the eigenvalues, and thus should be discouraged, and (iii) a formal test for higher-order components can be obtained based on a Tracy-Widom approximation. We illustrate the performance of the two testing procedures using simulated data generated under both a principal component model and a common factors model. For the principal component model, the Tracy-Widom test performs consistently in all conditions, while parallel analysis shows unpredictable behavior for higher-order components. For the common factor model, including major and minor factors, both procedures are heuristic approaches, with variable performance. We conclude that the Tracy-Widom procedure is preferred over parallel analysis for statistically testing the number of principal components based on a covariance matrix.
Tailored multivariate analysis for modulated enhanced diffraction
DOE Office of Scientific and Technical Information (OSTI.GOV)
Caliandro, Rocco; Guccione, Pietro; Nico, Giovanni
2015-10-21
Modulated enhanced diffraction (MED) is a technique allowing the dynamic structural characterization of crystalline materials subjected to an external stimulus, which is particularly suited forin situandoperandostructural investigations at synchrotron sources. Contributions from the (active) part of the crystal system that varies synchronously with the stimulus can be extracted by an offline analysis, which can only be applied in the case of periodic stimuli and linear system responses. In this paper a new decomposition approach based on multivariate analysis is proposed. The standard principal component analysis (PCA) is adapted to treat MED data: specific figures of merit based on their scoresmore » and loadings are found, and the directions of the principal components obtained by PCA are modified to maximize such figures of merit. As a result, a general method to decompose MED data, called optimum constrained components rotation (OCCR), is developed, which produces very precise results on simulated data, even in the case of nonperiodic stimuli and/or nonlinear responses. The multivariate analysis approach is able to supply in one shot both the diffraction pattern related to the active atoms (through the OCCR loadings) and the time dependence of the system response (through the OCCR scores). When applied to real data, OCCR was able to supply only the latter information, as the former was hindered by changes in abundances of different crystal phases, which occurred besides structural variations in the specific case considered. To develop a decomposition procedure able to cope with this combined effect represents the next challenge in MED analysis.« less
NASA Astrophysics Data System (ADS)
Toparli, M. Burak; Fitzpatrick, Michael E.; Gungor, Salih
2015-09-01
In this study, residual stress fields, including the near-surface residual stresses, were determined for an Al7050-T7451 sample after laser peening. The contour method was applied to measure one component of the residual stress, and the relaxed stresses on the cut surfaces were then measured by X-ray diffraction. This allowed calculation of the three orthogonal stress components using the superposition principle. The near-surface results were validated with results from incremental hole drilling and conventional X-ray diffraction. The results demonstrate that multiple residual stress components can be determined using a combination of the contour method and another technique. If the measured stress components are congruent with the principal stress axes in the sample, then this allows for determination of the complete stress tensor.
NASA Astrophysics Data System (ADS)
Lin, Z. D.; Wang, Y. B.; Wang, R. J.; Wang, L. S.; Lu, C. P.; Zhang, Z. Y.; Song, L. T.; Liu, Y.
2017-07-01
A total of 130 topsoil samples collected from Guoyang County, Anhui Province, China, were used to establish a Vis-NIR model for the prediction of organic matter content (OMC) in lime concretion black soils. Different spectral pretreatments were applied for minimizing the irrelevant and useless information of the spectra and increasing the spectra correlation with the measured values. Subsequently, the Kennard-Stone (KS) method and sample set partitioning based on joint x-y distances (SPXY) were used to select the training set. Successive projection algorithm (SPA) and genetic algorithm (GA) were then applied for wavelength optimization. Finally, the principal component regression (PCR) model was constructed, in which the optimal number of principal components was determined using the leave-one-out cross validation technique. The results show that the combination of the Savitzky-Golay (SG) filter for smoothing and multiplicative scatter correction (MSC) can eliminate the effect of noise and baseline drift; the SPXY method is preferable to KS in the sample selection; both the SPA and the GA can significantly reduce the number of wavelength variables and favorably increase the accuracy, especially GA, which greatly improved the prediction accuracy of soil OMC with Rcc, RMSEP, and RPD up to 0.9316, 0.2142, and 2.3195, respectively.
Plazas-Nossa, Leonardo; Hofer, Thomas; Gruber, Günter; Torres, Andres
2017-02-01
This work proposes a methodology for the forecasting of online water quality data provided by UV-Vis spectrometry. Therefore, a combination of principal component analysis (PCA) to reduce the dimensionality of a data set and artificial neural networks (ANNs) for forecasting purposes was used. The results obtained were compared with those obtained by using discrete Fourier transform (DFT). The proposed methodology was applied to four absorbance time series data sets composed by a total number of 5705 UV-Vis spectra. Absolute percentage errors obtained by applying the proposed PCA/ANN methodology vary between 10% and 13% for all four study sites. In general terms, the results obtained were hardly generalizable, as they appeared to be highly dependent on specific dynamics of the water system; however, some trends can be outlined. PCA/ANN methodology gives better results than PCA/DFT forecasting procedure by using a specific spectra range for the following conditions: (i) for Salitre wastewater treatment plant (WWTP) (first hour) and Graz West R05 (first 18 min), from the last part of UV range to all visible range; (ii) for Gibraltar pumping station (first 6 min) for all UV-Vis absorbance spectra; and (iii) for San Fernando WWTP (first 24 min) for all of UV range to middle part of visible range.
Q-mode versus R-mode principal component analysis for linear discriminant analysis (LDA)
NASA Astrophysics Data System (ADS)
Lee, Loong Chuen; Liong, Choong-Yeun; Jemain, Abdul Aziz
2017-05-01
Many literature apply Principal Component Analysis (PCA) as either preliminary visualization or variable con-struction methods or both. Focus of PCA can be on the samples (R-mode PCA) or variables (Q-mode PCA). Traditionally, R-mode PCA has been the usual approach to reduce high-dimensionality data before the application of Linear Discriminant Analysis (LDA), to solve classification problems. Output from PCA composed of two new matrices known as loadings and scores matrices. Each matrix can then be used to produce a plot, i.e. loadings plot aids identification of important variables whereas scores plot presents spatial distribution of samples on new axes that are also known as Principal Components (PCs). Fundamentally, the scores matrix always be the input variables for building classification model. A recent paper uses Q-mode PCA but the focus of analysis was not on the variables but instead on the samples. As a result, the authors have exchanged the use of both loadings and scores plots in which clustering of samples was studied using loadings plot whereas scores plot has been used to identify important manifest variables. Therefore, the aim of this study is to statistically validate the proposed practice. Evaluation is based on performance of external error obtained from LDA models according to number of PCs. On top of that, bootstrapping was also conducted to evaluate the external error of each of the LDA models. Results show that LDA models produced by PCs from R-mode PCA give logical performance and the matched external error are also unbiased whereas the ones produced with Q-mode PCA show the opposites. With that, we concluded that PCs produced from Q-mode is not statistically stable and thus should not be applied to problems of classifying samples, but variables. We hope this paper will provide some insights on the disputable issues.
The Influence Function of Principal Component Analysis by Self-Organizing Rule.
Higuchi; Eguchi
1998-07-28
This article is concerned with a neural network approach to principal component analysis (PCA). An algorithm for PCA by the self-organizing rule has been proposed and its robustness observed through the simulation study by Xu and Yuille (1995). In this article, the robustness of the algorithm against outliers is investigated by using the theory of influence function. The influence function of the principal component vector is given in an explicit form. Through this expression, the method is shown to be robust against any directions orthogonal to the principal component vector. In addition, a statistic generated by the self-organizing rule is proposed to assess the influence of data in PCA.
Principal Component Analysis: A Method for Determining the Essential Dynamics of Proteins
David, Charles C.; Jacobs, Donald J.
2015-01-01
It has become commonplace to employ principal component analysis to reveal the most important motions in proteins. This method is more commonly known by its acronym, PCA. While most popular molecular dynamics packages inevitably provide PCA tools to analyze protein trajectories, researchers often make inferences of their results without having insight into how to make interpretations, and they are often unaware of limitations and generalizations of such analysis. Here we review best practices for applying standard PCA, describe useful variants, discuss why one may wish to make comparison studies, and describe a set of metrics that make comparisons possible. In practice, one will be forced to make inferences about the essential dynamics of a protein without having the desired amount of samples. Therefore, considerable time is spent on describing how to judge the significance of results, highlighting pitfalls. The topic of PCA is reviewed from the perspective of many practical considerations, and useful recipes are provided. PMID:24061923
Principal component analysis: a method for determining the essential dynamics of proteins.
David, Charles C; Jacobs, Donald J
2014-01-01
It has become commonplace to employ principal component analysis to reveal the most important motions in proteins. This method is more commonly known by its acronym, PCA. While most popular molecular dynamics packages inevitably provide PCA tools to analyze protein trajectories, researchers often make inferences of their results without having insight into how to make interpretations, and they are often unaware of limitations and generalizations of such analysis. Here we review best practices for applying standard PCA, describe useful variants, discuss why one may wish to make comparison studies, and describe a set of metrics that make comparisons possible. In practice, one will be forced to make inferences about the essential dynamics of a protein without having the desired amount of samples. Therefore, considerable time is spent on describing how to judge the significance of results, highlighting pitfalls. The topic of PCA is reviewed from the perspective of many practical considerations, and useful recipes are provided.
Dopico-García, M S; Fique, A; Guerra, L; Afonso, J M; Pereira, O; Valentão, P; Andrade, P B; Seabra, R M
2008-06-15
Phenolic profile of 10 different varieties of red "Vinho Verde" grapes (Azal Tinto, Borraçal, Brancelho, Doçal, Espadeiro, Padeiro de Basto, Pedral, Rabo de ovelha, Verdelho and Vinhão), from Minho (Portugal) were studied. Nine Flavonols, four phenolic acids, three flavan-3-ols, one stilben and eight anthocyanins were determined. Malvidin-3-O-glucoside was the most abundant anthocyanin while the main non-coloured compound was much more heterogeneous: catechin, epicatechin, myricetin-3-O-glucoside, quercetin-3-O-glucoside or syringetin-3-O-glucoside. Anthocyanin contents ranged from 42 to 97%. Principal component analysis (PCA) was applied to analyse the date and study the relations between the samples and their phenolic profiles. Anthocyanin profile proved to be a good marker to characterize the varieties even considering different origin and harvest. "Vinhão" grapes showed anthocyanins levels until twenty four times higher than the rest of the samples, with 97% of these compounds.
Buonaccorsi, G A; Rose, C J; O'Connor, J P B; Roberts, C; Watson, Y; Jackson, A; Jayson, G C; Parker, G J M
2010-01-01
Clinical trials of anti-angiogenic and vascular-disrupting agents often use biomarkers derived from DCE-MRI, typically reporting whole-tumor summary statistics and so overlooking spatial parameter variations caused by tissue heterogeneity. We present a data-driven segmentation method comprising tracer-kinetic model-driven registration for motion correction, conversion from MR signal intensity to contrast agent concentration for cross-visit normalization, iterative principal components analysis for imputation of missing data and dimensionality reduction, and statistical outlier detection using the minimum covariance determinant to obtain a robust Mahalanobis distance. After applying these techniques we cluster in the principal components space using k-means. We present results from a clinical trial of a VEGF inhibitor, using time-series data selected because of problems due to motion and outlier time series. We obtained spatially-contiguous clusters that map to regions with distinct microvascular characteristics. This methodology has the potential to uncover localized effects in trials using DCE-MRI-based biomarkers.
Luo, Jie; Qi, Shihua; Xie, Xianming; Gu, X W Sophie; Wang, Jinji
2017-01-01
Guiyu is a well-known electronic waste dismantling and recycling town in south China. Concentrations and distribution of the 21 mineral elements and 16 polycyclic aromatic hydrocarbons (PAHs) collected there were evaluated. Principal component analyses (PCA) applied to the data matrix of PAHs in the soil extracted three major factors explaining 85.7% of the total variability identified as traffic emission, coal combustion, and an unidentified source. By using metallic or metalloid element concentrations as variables, five principal components (PCs) were identified and accounted for 70.4% of the information included in the initial data matrix, which can be denoted as e-waste dismantling-related contamination, two different geological origins, anthropogenic influenced source, and marine aerosols. Combining the 21 metallic and metalloid element datasets with the 16 PAH concentrations can narrow down the coarse source and decrease the unidentified contribution to soil in the present study and therefore effectively assists the source identification process.
Fang, Leyuan; Wang, Chong; Li, Shutao; Yan, Jun; Chen, Xiangdong; Rabbani, Hossein
2017-11-01
We present an automatic method, termed as the principal component analysis network with composite kernel (PCANet-CK), for the classification of three-dimensional (3-D) retinal optical coherence tomography (OCT) images. Specifically, the proposed PCANet-CK method first utilizes the PCANet to automatically learn features from each B-scan of the 3-D retinal OCT images. Then, multiple kernels are separately applied to a set of very important features of the B-scans and these kernels are fused together, which can jointly exploit the correlations among features of the 3-D OCT images. Finally, the fused (composite) kernel is incorporated into an extreme learning machine for the OCT image classification. We tested our proposed algorithm on two real 3-D spectral domain OCT (SD-OCT) datasets (of normal subjects and subjects with the macular edema and age-related macular degeneration), which demonstrated its effectiveness. (2017) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE).
Yücel, Yasin; Sultanoğlu, Pınar
2013-09-01
Chemical characterisation has been carried out on 45 honey samples collected from Hatay region of Turkey. The concentrations of 17 elements were determined by inductively coupled plasma optical emission spectrometry (ICP-OES). Ca, K, Mg and Na were the most abundant elements, with mean contents of 219.38, 446.93, 49.06 and 95.91 mg kg(-1) respectively. The trace element mean contents ranged between 0.03 and 15.07 mg kg(-1). Chemometric methods such as principal component analysis (PCA) and cluster analysis (CA) techniques were applied to classify honey according to mineral content. The first most important principal component (PC) was strongly associated with the value of Al, B, Cd and Co. CA showed eight clusters corresponding to the eight botanical origins of honey. PCA explained 75.69% of the variance with the first six PC variables. Chemometric analysis of the analytical data allowed the accurate classification of the honey samples according to origin. Copyright © 2013 Elsevier Ltd. All rights reserved.
NASA Technical Reports Server (NTRS)
Parada, N. D. J. (Principal Investigator); Formaggio, A. R.; Dossantos, J. R.; Dias, L. A. V.
1984-01-01
Automatic pre-processing technique called Principal Components (PRINCO) in analyzing LANDSAT digitized data, for land use and vegetation cover, on the Brazilian cerrados was evaluated. The chosen pilot area, 223/67 of MSS/LANDSAT 3, was classified on a GE Image-100 System, through a maximum-likehood algorithm (MAXVER). The same procedure was applied to the PRINCO treated image. PRINCO consists of a linear transformation performed on the original bands, in order to eliminate the information redundancy of the LANDSAT channels. After PRINCO only two channels were used thus reducing computer effort. The original channels and the PRINCO channels grey levels for the five identified classes (grassland, "cerrado", burned areas, anthropic areas, and gallery forest) were obtained through the MAXVER algorithm. This algorithm also presented the average performance for both cases. In order to evaluate the results, the Jeffreys-Matusita distance (JM-distance) between classes was computed. The classification matrix, obtained through MAXVER, after a PRINCO pre-processing, showed approximately the same average performance in the classes separability.
NASA Astrophysics Data System (ADS)
Miller, Shelly L.; Anderson, Melissa J.; Daly, Eileen P.; Milford, Jana B.
Four receptor-oriented source apportionment models were evaluated by applying them to simulated personal exposure data for select volatile organic compounds (VOCs) that were generated by Monte Carlo sampling from known source contributions and profiles. The exposure sources modeled are environmental tobacco smoke, paint emissions, cleaning and/or pesticide products, gasoline vapors, automobile exhaust, and wastewater treatment plant emissions. The receptor models analyzed are chemical mass balance, principal component analysis/absolute principal component scores, positive matrix factorization (PMF), and graphical ratio analysis for composition estimates/source apportionment by factors with explicit restriction, incorporated in the UNMIX model. All models identified only the major contributors to total exposure concentrations. PMF extracted factor profiles that most closely represented the major sources used to generate the simulated data. None of the models were able to distinguish between sources with similar chemical profiles. Sources that contributed <5% to the average total VOC exposure were not identified.
NASA Astrophysics Data System (ADS)
Babanova, Sofia; Artyushkova, Kateryna; Ulyanova, Yevgenia; Singhal, Sameer; Atanassov, Plamen
2014-01-01
Two statistical methods, design of experiments (DOE) and principal component analysis (PCA) are employed to investigate and improve performance of air-breathing gas-diffusional enzymatic electrodes. DOE is utilized as a tool for systematic organization and evaluation of various factors affecting the performance of the composite system. Based on the results from the DOE, an improved cathode is constructed. The current density generated utilizing the improved cathode (755 ± 39 μA cm-2 at 0.3 V vs. Ag/AgCl) is 2-5 times higher than the highest current density previously achieved. Three major factors contributing to the cathode performance are identified: the amount of enzyme, the volume of phosphate buffer used to immobilize the enzyme, and the thickness of the gas-diffusion layer (GDL). PCA is applied as an independent confirmation tool to support conclusions made by DOE and to visualize the contribution of factors in individual cathode configurations.
Diversity in shortjaw cisco (Coregonus zenithicus) in North America
Todd, T.N.; Steinhilber, M.
2002-01-01
Shortjaw cisco (Coregonus zenithicus) exhibit morphological variability across their geographic range in North America and could comprise more than one distinct morph or taxon. To investigate this, principal components analysis was applied to a data set that consisted of four variables from nine localities. All data were obtained from digital images of the specimens and the excised first gill arch. Confidence ellipses (95%) about the means of bivariate distributions of the principal components revealed that some populations were distinct from the others, but a continuity of overlap clouded understanding of pattern among the variation. Most populations had more and longer gillrakers than shortjaw cisco from George Lake (Manitoba) and Basswood Lake (Ontario) that had fewer and shorter gillrakers. This analysis supports the existence of a short- and few-rakered morph and a long- and many-rakered morph. However, most populations of shortjaw cisco from the Great Lakes across Canada to the Arctic share a similar morphology and likely represent a single, widespread species.
Relevant principal component analysis applied to the characterisation of Portuguese heather honey.
Martins, Rui C; Lopes, Victor V; Valentão, Patrícia; Carvalho, João C M F; Isabel, Paulo; Amaral, Maria T; Batista, Maria T; Andrade, Paula B; Silva, Branca M
2008-01-01
The main purpose of this study was the characterisation of 'Serra da Lousã' heather honey by using novel statistical methodology, relevant principal component analysis, in order to assess the correlations between production year, locality and composition. Herein, we also report its chemical composition in terms of sugars, glycerol and ethanol, and physicochemical parameters. Sugars profiles from 'Serra da Lousã' heather and 'Terra Quente de Trás-os-Montes' lavender honeys were compared and allowed the discrimination: 'Serra da Lousã' honeys do not contain sucrose, generally exhibit lower contents of turanose, trehalose and maltose and higher contents of fructose and glucose. Different localities from 'Serra da Lousã' provided groups of samples with high and low glycerol contents. Glycerol and ethanol contents were revealed to be independent of the sugars profiles. These data and statistical models can be very useful in the comparison and detection of adulterations during the quality control analysis of 'Serra da Lousã' honey.
SESNPCA: Principal Component Analysis Applied to Stripped-Envelope Core-Collapse Supernovae
NASA Astrophysics Data System (ADS)
Williamson, Marc; Bianco, Federica; Modjaz, Maryam
2018-01-01
In the new era of time-domain astronomy, it will become increasingly important to have rigorous, data driven models for classifying transients, including supernovae (SNe). We present the first application of principal component analysis (PCA) to stripped-envelope core-collapse supernovae (SESNe). Previous studies of SNe types Ib, IIb, Ic, and broad-line Ic (Ic-BL) focus only on specific spectral features, while our PCA algorithm uses all of the information contained in each spectrum. We use one of the largest compiled datasets of SESNe, containing over 150 SNe, each with spectra taken at multiple phases. Our work focuses on 49 SNe with spectra taken 15 ± 5 days after maximum V-band light where better distinctions can be made between SNe type Ib and Ic spectra. We find that spectra of SNe type IIb and Ic-BL are separable from the other types in PCA space, indicating that PCA is a promising option for developing a purely data driven model for SESNe classification.
Krishnan, T; Reddy, B M
1994-01-01
The graphical technique of biplot due to Gabriel and others is explained, and is applied to ten finger ridge-count means of 239 populations, mostly Indian. The biplots, together with concentration ellipses based on them, are used to study geographical, gender and ethnic/social group variability, to compare Indian populations with other populations and to study relations between individual counts and populations. The correlation structure of ridge-counts exhibits a tripartite division of digits demonstrated by many other studies, but with a somewhat different combination of digits. Comparisons are also made with the results of Leguebe and Vrydagh, who used principal components, discriminant functions, Andrews functions, etc., to study geographical and gender variations. There is a great deal of homogeneity in Indian populations when compared to populations from the rest of the world. Although broad geographical contiguity is reflected in the biplots, local (states within India) level contiguity is not maintained. Monogoloids and Caucasoids have distinct ridge-count structures. The higher level of homogeneity in females and on the left side observed by Leguebe and Vrydagh is also observed in the biplots. A comparison with principal component plots indicates that biplots yield a graphical representation similar to component plots, and convey more information than component plots.
Progress Towards Improved Analysis of TES X-ray Data Using Principal Component Analysis
NASA Technical Reports Server (NTRS)
Busch, S. E.; Adams, J. S.; Bandler, S. R.; Chervenak, J. A.; Eckart, M. E.; Finkbeiner, F. M.; Fixsen, D. J.; Kelley, R. L.; Kilbourne, C. A.; Lee, S.-J.;
2015-01-01
The traditional method of applying a digital optimal filter to measure X-ray pulses from transition-edge sensor (TES) devices does not achieve the best energy resolution when the signals have a highly non-linear response to energy, or the noise is non-stationary during the pulse. We present an implementation of a method to analyze X-ray data from TESs, which is based upon principal component analysis (PCA). Our method separates the X-ray signal pulse into orthogonal components that have the largest variance. We typically recover pulse height, arrival time, differences in pulse shape, and the variation of pulse height with detector temperature. These components can then be combined to form a representation of pulse energy. An added value of this method is that by reporting information on more descriptive parameters (as opposed to a single number representing energy), we generate a much more complete picture of the pulse received. Here we report on progress in developing this technique for future implementation on X-ray telescopes. We used an 55Fe source to characterize Mo/Au TESs. On the same dataset, the PCA method recovers a spectral resolution that is better by a factor of two than achievable with digital optimal filters.
Physiological and anthropometric determinants of rhythmic gymnastics performance.
Douda, Helen T; Toubekis, Argyris G; Avloniti, Alexandra A; Tokmakidis, Savvas P
2008-03-01
To identify the physiological and anthropometric predictors of rhythmic gymnastics performance, which was defined from the total ranking score of each athlete in a national competition. Thirty-four rhythmic gymnasts were divided into 2 groups, elite (n = 15) and nonelite (n = 19), and they underwent a battery of anthropometric, physical fitness, and physiological measurements. The principal-components analysis extracted 6 components: anthropometric, flexibility, explosive strength, aerobic capacity, body dimensions, and anaerobic metabolism. These were used in a simultaneous multiple-regression procedure to determine which best explain the variance in rhythmic gymnastics performance. Based on the principal-component analysis, the anthropometric component explained 45% of the total variance, flexibility 12.1%, explosive strength 9.2%, aerobic capacity 7.4%, body dimensions 6.8%, and anaerobic metabolism 4.6%. Components of anthropometric (r = .50) and aerobic capacity (r = .49) were significantly correlated with performance (P < .01). When the multiple-regression model-y = 10.708 + (0.0005121 x VO2max) + (0.157 x arm span) + (0.814 x midthigh circumference) - (0.293 x body mass)-was applied to elite gymnasts, 92.5% of the variation was explained by VO2max (58.9%), arm span (12%), midthigh circumference (13.1%), and body mass (8.5%). Selected anthropometric characteristics, aerobic power, flexibility, and explosive strength are important determinants of successful performance. These findings might have practical implications for both training and talent identification in rhythmic gymnastics.
Lin, Nan; Jiang, Junhai; Guo, Shicheng; Xiong, Momiao
2015-01-01
Due to the advancement in sensor technology, the growing large medical image data have the ability to visualize the anatomical changes in biological tissues. As a consequence, the medical images have the potential to enhance the diagnosis of disease, the prediction of clinical outcomes and the characterization of disease progression. But in the meantime, the growing data dimensions pose great methodological and computational challenges for the representation and selection of features in image cluster analysis. To address these challenges, we first extend the functional principal component analysis (FPCA) from one dimension to two dimensions to fully capture the space variation of image the signals. The image signals contain a large number of redundant features which provide no additional information for clustering analysis. The widely used methods for removing the irrelevant features are sparse clustering algorithms using a lasso-type penalty to select the features. However, the accuracy of clustering using a lasso-type penalty depends on the selection of the penalty parameters and the threshold value. In practice, they are difficult to determine. Recently, randomized algorithms have received a great deal of attentions in big data analysis. This paper presents a randomized algorithm for accurate feature selection in image clustering analysis. The proposed method is applied to both the liver and kidney cancer histology image data from the TCGA database. The results demonstrate that the randomized feature selection method coupled with functional principal component analysis substantially outperforms the current sparse clustering algorithms in image cluster analysis. PMID:26196383
[Discrimination of Rice Syrup Adulterant of Acacia Honey Based Using Near-Infrared Spectroscopy].
Zhang, Yan-nan; Chen, Lan-zhen; Xue, Xiao-feng; Wu, Li-ming; Li, Yi; Yang, Juan
2015-09-01
At present, the rice syrup as a low price of the sweeteners was often adulterated into acacia honey and the adulterated honeys were sold in honey markets, while there is no suitable and fast method to identify honey adulterated with rice syrup. In this study, Near infrared spectroscopy (NIR) combined with chemometric methods were used to discriminate authenticity of honey. 20 unprocessed acacia honey samples from the different honey producing areas, mixed? with different proportion of rice syrup, were prepared of seven different concentration gradient? including 121 samples. The near infrared spectrum (NIR) instrument and spectrum processing software have been applied in the? spectrum? scanning and data conversion on adulterant samples, respectively. Then it was analyzed by Principal component analysis (PCA) and canonical discriminant analysis methods in order to discriminating adulterated honey. The results showed that after principal components analysis, the first two principal components accounted for 97.23% of total variation, but the regionalism of the score plot of the first two PCs was not obvious, so the canonical discriminant analysis was used to make the further discrimination, all samples had been discriminated correctly, the first two discriminant functions accounted for 91.6% among the six canonical discriminant functions, Then the different concentration of adulterant samples can be discriminated correctly, it illustrate that canonical discriminant analysis method combined with NIR spectroscopy is not only feasible but also practical for rapid and effective discriminate of the rice syrup adulterant of acacia honey.
Early forest fire detection using principal component analysis of infrared video
NASA Astrophysics Data System (ADS)
Saghri, John A.; Radjabi, Ryan; Jacobs, John T.
2011-09-01
A land-based early forest fire detection scheme which exploits the infrared (IR) temporal signature of fire plume is described. Unlike common land-based and/or satellite-based techniques which rely on measurement and discrimination of fire plume directly from its infrared and/or visible reflectance imagery, this scheme is based on exploitation of fire plume temporal signature, i.e., temperature fluctuations over the observation period. The method is simple and relatively inexpensive to implement. The false alarm rate is expected to be lower that of the existing methods. Land-based infrared (IR) cameras are installed in a step-stare-mode configuration in potential fire-prone areas. The sequence of IR video frames from each camera is digitally processed to determine if there is a fire within camera's field of view (FOV). The process involves applying a principal component transformation (PCT) to each nonoverlapping sequence of video frames from the camera to produce a corresponding sequence of temporally-uncorrelated principal component (PC) images. Since pixels that form a fire plume exhibit statistically similar temporal variation (i.e., have a unique temporal signature), PCT conveniently renders the footprint/trace of the fire plume in low-order PC images. The PC image which best reveals the trace of the fire plume is then selected and spatially filtered via simple threshold and median filter operations to remove the background clutter, such as traces of moving tree branches due to wind.
Salvatore, Stefania; Bramness, Jørgen G; Røislien, Jo
2016-07-12
Wastewater-based epidemiology (WBE) is a novel approach in drug use epidemiology which aims to monitor the extent of use of various drugs in a community. In this study, we investigate functional principal component analysis (FPCA) as a tool for analysing WBE data and compare it to traditional principal component analysis (PCA) and to wavelet principal component analysis (WPCA) which is more flexible temporally. We analysed temporal wastewater data from 42 European cities collected daily over one week in March 2013. The main temporal features of ecstasy (MDMA) were extracted using FPCA using both Fourier and B-spline basis functions with three different smoothing parameters, along with PCA and WPCA with different mother wavelets and shrinkage rules. The stability of FPCA was explored through bootstrapping and analysis of sensitivity to missing data. The first three principal components (PCs), functional principal components (FPCs) and wavelet principal components (WPCs) explained 87.5-99.6 % of the temporal variation between cities, depending on the choice of basis and smoothing. The extracted temporal features from PCA, FPCA and WPCA were consistent. FPCA using Fourier basis and common-optimal smoothing was the most stable and least sensitive to missing data. FPCA is a flexible and analytically tractable method for analysing temporal changes in wastewater data, and is robust to missing data. WPCA did not reveal any rapid temporal changes in the data not captured by FPCA. Overall the results suggest FPCA with Fourier basis functions and common-optimal smoothing parameter as the most accurate approach when analysing WBE data.
[Applied ecology: retrospect and prospect].
He, Xingyuan; Zeng, Dehui
2004-10-01
Applied ecology is evolved into a principal part of modern ecology that rapidly develops. The major stimulus for the development of applied ecology roots in seeking the solutions for the problems of human populations, resources and environments. Through four decades, the science of applied ecology has been becoming a huge group of disciplines. The future for the applied ecology should concern more with human-influenced and managed ecosystems, and acknowledge humans as the components of ecosystems. Nowadays and in future, the top-priorities in applied ecology should include following fields: sustainable ecosystems and biosphere, ecosystem services and ecological design, ecological assessment of genetically modified organisms, ecology of biological invasions, epidemical ecology, ecological forecasting, ecological process and its control. The authors believe that the comprehensive and active research hotspots coupled some new traits would occur around these fields in foreseeable future.
Socaci, Sonia A; Socaciu, Carmen; Tofană, Maria; Raţi, Ioan V; Pintea, Adela
2013-01-01
The health benefits of sea buckthorn (Hippophae rhamnoides L.) are well documented due to its rich content in bioactive phytochemicals (pigments, phenolics and vitamins) as well as volatiles responsible for specific flavours and bacteriostatic action. The volatile compounds are good biomarkers of berry freshness, quality and authenticity. To develop a fast and efficient GC-MS method including a minimal sample preparation technique (in-tube extraction, ITEX) for the discrimination of sea buckthorn varieties based on their chromatographic volatile fingerprint. Twelve sea buckthorn varieties (wild and cultivated) were collected from forestry departments and experimental fields, respectively. The extraction of volatile compounds was performed using the ITEX technique whereas separation and identification was performed using a GC-MS QP-2010. Principal component analysis (PCA) was applied to discriminate the differences among sample composition. Using GC-MS analysis, from the headspace of sea buckthorn samples, 46 volatile compounds were separated with 43 being identified. The most abundant derivatives were ethyl esters of 2-methylbutanoic acid, 3-methylbutanoic acid, hexanoic acid, octanoic acid and butanoic acid, as well as 3-methylbutyl 3-methylbutanoate, 3-methylbutyl 2-methylbutanoate and benzoic acid ethyl ester (over 80% of all volatile compounds). Principal component analysis showed that the first two components explain 79% of data variance, demonstrating a good discrimination between samples. A reliable, fast and eco-friendly ITEX/GC-MS method was applied to fingerprint the volatile profile and to discriminate between wild and cultivated sea buckthorn berries originating from the Carpathians, with relevance to food science and technology. Copyright © 2013 John Wiley & Sons, Ltd.
40 CFR 62.14505 - What are the principal components of this subpart?
Code of Federal Regulations, 2010 CFR
2010-07-01
... 40 Protection of Environment 8 2010-07-01 2010-07-01 false What are the principal components of this subpart? 62.14505 Section 62.14505 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY... components of this subpart? This subpart contains the eleven major components listed in paragraphs (a...
A Filtering of Incomplete GNSS Position Time Series with Probabilistic Principal Component Analysis
NASA Astrophysics Data System (ADS)
Gruszczynski, Maciej; Klos, Anna; Bogusz, Janusz
2018-04-01
For the first time, we introduced the probabilistic principal component analysis (pPCA) regarding the spatio-temporal filtering of Global Navigation Satellite System (GNSS) position time series to estimate and remove Common Mode Error (CME) without the interpolation of missing values. We used data from the International GNSS Service (IGS) stations which contributed to the latest International Terrestrial Reference Frame (ITRF2014). The efficiency of the proposed algorithm was tested on the simulated incomplete time series, then CME was estimated for a set of 25 stations located in Central Europe. The newly applied pPCA was compared with previously used algorithms, which showed that this method is capable of resolving the problem of proper spatio-temporal filtering of GNSS time series characterized by different observation time span. We showed, that filtering can be carried out with pPCA method when there exist two time series in the dataset having less than 100 common epoch of observations. The 1st Principal Component (PC) explained more than 36% of the total variance represented by time series residuals' (series with deterministic model removed), what compared to the other PCs variances (less than 8%) means that common signals are significant in GNSS residuals. A clear improvement in the spectral indices of the power-law noise was noticed for the Up component, which is reflected by an average shift towards white noise from - 0.98 to - 0.67 (30%). We observed a significant average reduction in the accuracy of stations' velocity estimated for filtered residuals by 35, 28 and 69% for the North, East, and Up components, respectively. CME series were also subjected to analysis in the context of environmental mass loading influences of the filtering results. Subtraction of the environmental loading models from GNSS residuals provides to reduction of the estimated CME variance by 20 and 65% for horizontal and vertical components, respectively.
González-Vidal, Juan José; Pérez-Pueyo, Rosanna; Soneira, María José; Ruiz-Moreno, Sergio
2015-03-01
A new method has been developed to automatically identify Raman spectra, whether they correspond to single- or multicomponent spectra. The method requires no user input or judgment. There are thus no parameters to be tweaked. Furthermore, it provides a reliability factor on the resulting identification, with the aim of becoming a useful support tool for the analyst in the decision-making process. The method relies on the multivariate techniques of principal component analysis (PCA) and independent component analysis (ICA), and on some metrics. It has been developed for the application of automated spectral analysis, where the analyzed spectrum is provided by a spectrometer that has no previous knowledge of the analyzed sample, meaning that the number of components in the sample is unknown. We describe the details of this method and demonstrate its efficiency by identifying both simulated spectra and real spectra. The method has been applied to artistic pigment identification. The reliable and consistent results that were obtained make the methodology a helpful tool suitable for the identification of pigments in artwork or in paint in general.
NASA Astrophysics Data System (ADS)
Ying, Yibin; Liu, Yande; Fu, Xiaping; Lu, Huishan
2005-11-01
The artificial neural networks (ANNs) have been used successfully in applications such as pattern recognition, image processing, automation and control. However, majority of today's applications of ANNs is back-propagate feed-forward ANN (BP-ANN). In this paper, back-propagation artificial neural networks (BP-ANN) were applied for modeling soluble solid content (SSC) of intact pear from their Fourier transform near infrared (FT-NIR) spectra. One hundred and sixty-four pear samples were used to build the calibration models and evaluate the models predictive ability. The results are compared to the classical calibration approaches, i.e. principal component regression (PCR), partial least squares (PLS) and non-linear PLS (NPLS). The effects of the optimal methods of training parameters on the prediction model were also investigated. BP-ANN combine with principle component regression (PCR) resulted always better than the classical PCR, PLS and Weight-PLS methods, from the point of view of the predictive ability. Based on the results, it can be concluded that FT-NIR spectroscopy and BP-ANN models can be properly employed for rapid and nondestructive determination of fruit internal quality.
Hierarchical Regularity in Multi-Basin Dynamics on Protein Landscapes
NASA Astrophysics Data System (ADS)
Matsunaga, Yasuhiro; Kostov, Konstatin S.; Komatsuzaki, Tamiki
2004-04-01
We analyze time series of potential energy fluctuations and principal components at several temperatures for two kinds of off-lattice 46-bead models that have two distinctive energy landscapes. The less-frustrated "funnel" energy landscape brings about stronger nonstationary behavior of the potential energy fluctuations at the folding temperature than the other, rather frustrated energy landscape at the collapse temperature. By combining principal component analysis with an embedding nonlinear time-series analysis, it is shown that the fast fluctuations with small amplitudes of 70-80% of the principal components cause the time series to become almost "random" in only 100 simulation steps. However, the stochastic feature of the principal components tends to be suppressed through a wide range of degrees of freedom at the transition temperature.
Selection of solubility parameters for characterization of pharmaceutical excipients.
Adamska, Katarzyna; Voelkel, Adam; Héberger, Károly
2007-11-09
The solubility parameter (delta(2)), corrected solubility parameter (delta(T)) and its components (delta(d), delta(p), delta(h)) were determined for series of pharmaceutical excipients by using inverse gas chromatography (IGC). Principal component analysis (PCA) was applied for the selection of the solubility parameters which assure the complete characterization of examined materials. Application of PCA suggests that complete description of examined materials is achieved with four solubility parameters, i.e. delta(2) and Hansen solubility parameters (delta(d), delta(p), delta(h)). Selection of the excipients through PCA of their solubility parameters data can be used for prediction of their behavior in a multi-component system, e.g. for selection of the best materials to form stable pharmaceutical liquid mixtures or stable coating formulation.
Principals' Perceptions Regarding Their Supervision and Evaluation
ERIC Educational Resources Information Center
Hvidston, David J.; Range, Bret G.; McKim, Courtney Ann
2015-01-01
This study examined the perceptions of principals concerning principal evaluation and supervisory feedback. Principals were asked two open-ended questions. Respondents included 82 principals in the Rocky Mountain region. The emerging themes were "Superintendent Performance," "Principal Evaluation Components," "Specific…
Bayesian wavelet PCA methodology for turbomachinery damage diagnosis under uncertainty
NASA Astrophysics Data System (ADS)
Xu, Shengli; Jiang, Xiaomo; Huang, Jinzhi; Yang, Shuhua; Wang, Xiaofang
2016-12-01
Centrifugal compressor often suffers various defects such as impeller cracking, resulting in forced outage of the total plant. Damage diagnostics and condition monitoring of such a turbomachinery system has become an increasingly important and powerful tool to prevent potential failure in components and reduce unplanned forced outage and further maintenance costs, while improving reliability, availability and maintainability of a turbomachinery system. This paper presents a probabilistic signal processing methodology for damage diagnostics using multiple time history data collected from different locations of a turbomachine, considering data uncertainty and multivariate correlation. The proposed methodology is based on the integration of three advanced state-of-the-art data mining techniques: discrete wavelet packet transform, Bayesian hypothesis testing, and probabilistic principal component analysis. The multiresolution wavelet analysis approach is employed to decompose a time series signal into different levels of wavelet coefficients. These coefficients represent multiple time-frequency resolutions of a signal. Bayesian hypothesis testing is then applied to each level of wavelet coefficient to remove possible imperfections. The ratio of posterior odds Bayesian approach provides a direct means to assess whether there is imperfection in the decomposed coefficients, thus avoiding over-denoising. Power spectral density estimated by the Welch method is utilized to evaluate the effectiveness of Bayesian wavelet cleansing method. Furthermore, the probabilistic principal component analysis approach is developed to reduce dimensionality of multiple time series and to address multivariate correlation and data uncertainty for damage diagnostics. The proposed methodology and generalized framework is demonstrated with a set of sensor data collected from a real-world centrifugal compressor with impeller cracks, through both time series and contour analyses of vibration signal and principal components.
Riahi, Siavash; Hadiloo, Farshad; Milani, Seyed Mohammad R; Davarkhah, Nazila; Ganjali, Mohammad R; Norouzi, Parviz; Seyfi, Payam
2011-05-01
The accuracy in predicting different chemometric methods was compared when applied on ordinary UV spectra and first order derivative spectra. Principal component regression (PCR) and partial least squares with one dependent variable (PLS1) and two dependent variables (PLS2) were applied on spectral data of pharmaceutical formula containing pseudoephedrine (PDP) and guaifenesin (GFN). The ability to derivative in resolved overlapping spectra chloropheniramine maleate was evaluated when multivariate methods are adopted for analysis of two component mixtures without using any chemical pretreatment. The chemometrics models were tested on an external validation dataset and finally applied to the analysis of pharmaceuticals. Significant advantages were found in analysis of the real samples when the calibration models from derivative spectra were used. It should also be mentioned that the proposed method is a simple and rapid way requiring no preliminary separation steps and can be used easily for the analysis of these compounds, especially in quality control laboratories. Copyright © 2011 John Wiley & Sons, Ltd.
Nguyen, Phuong H
2007-05-15
Principal component analysis is a powerful method for projecting multidimensional conformational space of peptides or proteins onto lower dimensional subspaces in which the main conformations are present, making it easier to reveal the structures of molecules from e.g. molecular dynamics simulation trajectories. However, the identification of all conformational states is still difficult if the subspaces consist of more than two dimensions. This is mainly due to the fact that the principal components are not independent with each other, and states in the subspaces cannot be visualized. In this work, we propose a simple and fast scheme that allows one to obtain all conformational states in the subspaces. The basic idea is that instead of directly identifying the states in the subspace spanned by principal components, we first transform this subspace into another subspace formed by components that are independent of one other. These independent components are obtained from the principal components by employing the independent component analysis method. Because of independence between components, all states in this new subspace are defined as all possible combinations of the states obtained from each single independent component. This makes the conformational analysis much simpler. We test the performance of the method by analyzing the conformations of the glycine tripeptide and the alanine hexapeptide. The analyses show that our method is simple and quickly reveal all conformational states in the subspaces. The folding pathways between the identified states of the alanine hexapeptide are analyzed and discussed in some detail. 2007 Wiley-Liss, Inc.
NASA Astrophysics Data System (ADS)
Mitsutake, Ayori; Takano, Hiroshi
2015-09-01
It is important to extract reaction coordinates or order parameters from protein simulations in order to investigate the local minimum-energy states and the transitions between them. The most popular method to obtain such data is principal component analysis, which extracts modes of large conformational fluctuations around an average structure. We recently applied relaxation mode analysis for protein systems, which approximately estimates the slow relaxation modes and times from a simulation and enables investigations of the dynamic properties underlying the structural fluctuations of proteins. In this study, we apply this relaxation mode analysis to extract reaction coordinates for a system in which there are large conformational changes such as those commonly observed in protein folding/unfolding. We performed a 750-ns simulation of chignolin protein near its folding transition temperature and observed many transitions between the most stable, misfolded, intermediate, and unfolded states. We then applied principal component analysis and relaxation mode analysis to the system. In the relaxation mode analysis, we could automatically extract good reaction coordinates. The free-energy surfaces provide a clearer understanding of the transitions not only between local minimum-energy states but also between the folded and unfolded states, even though the simulation involved large conformational changes. Moreover, we propose a new analysis method called Markov state relaxation mode analysis. We applied the new method to states with slow relaxation, which are defined by the free-energy surface obtained in the relaxation mode analysis. Finally, the relaxation times of the states obtained with a simple Markov state model and the proposed Markov state relaxation mode analysis are compared and discussed.
Anatomical curve identification
Bowman, Adrian W.; Katina, Stanislav; Smith, Joanna; Brown, Denise
2015-01-01
Methods for capturing images in three dimensions are now widely available, with stereo-photogrammetry and laser scanning being two common approaches. In anatomical studies, a number of landmarks are usually identified manually from each of these images and these form the basis of subsequent statistical analysis. However, landmarks express only a very small proportion of the information available from the images. Anatomically defined curves have the advantage of providing a much richer expression of shape. This is explored in the context of identifying the boundary of breasts from an image of the female torso and the boundary of the lips from a facial image. The curves of interest are characterised by ridges or valleys. Key issues in estimation are the ability to navigate across the anatomical surface in three-dimensions, the ability to recognise the relevant boundary and the need to assess the evidence for the presence of the surface feature of interest. The first issue is addressed by the use of principal curves, as an extension of principal components, the second by suitable assessment of curvature and the third by change-point detection. P-spline smoothing is used as an integral part of the methods but adaptations are made to the specific anatomical features of interest. After estimation of the boundary curves, the intermediate surfaces of the anatomical feature of interest can be characterised by surface interpolation. This allows shape variation to be explored using standard methods such as principal components. These tools are applied to a collection of images of women where one breast has been reconstructed after mastectomy and where interest lies in shape differences between the reconstructed and unreconstructed breasts. They are also applied to a collection of lip images where possible differences in shape between males and females are of interest. PMID:26041943
Liu, Hui-lin; Wan, Xia; Yang, Gong-huan
2013-02-01
To explore the relationship between the strength of tobacco control and the effectiveness of creating smoke-free hospital, and summarize the main factors that affect the program of creating smoke-free hospitals. A total of 210 hospitals from 7 provinces/municipalities directly under the central government were enrolled in this study using stratified random sampling method. Principle component analysis and regression analysis were conducted to analyze the strength of tobacco control and the effectiveness of creating smoke-free hospitals. Two principal components were extracted in the strength of tobacco control index, which respectively reflected the tobacco control policies and efforts, and the willingness and leadership of hospital managers regarding tobacco control. The regression analysis indicated that only the first principal component was significantly correlated with the progression in creating smoke-free hospital (P<0.001), i.e. hospitals with higher scores on the first principal component had better achievements in smoke-free environment creation. Tobacco control policies and efforts are critical in creating smoke-free hospitals. The principal component analysis provides a comprehensive and objective tool for evaluating the creation of smoke-free hospitals.
Critical Factors Explaining the Leadership Performance of High-Performing Principals
ERIC Educational Resources Information Center
Hutton, Disraeli M.
2018-01-01
The study explored critical factors that explain leadership performance of high-performing principals and examined the relationship between these factors based on the ratings of school constituents in the public school system. The principal component analysis with the use of Varimax Rotation revealed that four components explain 51.1% of the…
Molecular dynamics in principal component space.
Michielssens, Servaas; van Erp, Titus S; Kutzner, Carsten; Ceulemans, Arnout; de Groot, Bert L
2012-07-26
A molecular dynamics algorithm in principal component space is presented. It is demonstrated that sampling can be improved without changing the ensemble by assigning masses to the principal components proportional to the inverse square root of the eigenvalues. The setup of the simulation requires no prior knowledge of the system; a short initial MD simulation to extract the eigenvectors and eigenvalues suffices. Independent measures indicated a 6-7 times faster sampling compared to a regular molecular dynamics simulation.
[A study of Boletus bicolor from different areas using Fourier transform infrared spectrometry].
Zhou, Zai-Jin; Liu, Gang; Ren, Xian-Pei
2010-04-01
It is hard to differentiate the same species of wild growing mushrooms from different areas by macromorphological features. In this paper, Fourier transform infrared (FTIR) spectroscopy combined with principal component analysis was used to identify 58 samples of boletus bicolor from five different areas. Based on the fingerprint infrared spectrum of boletus bicolor samples, principal component analysis was conducted on 58 boletus bicolor spectra in the range of 1 350-750 cm(-1) using the statistical software SPSS 13.0. According to the result, the accumulated contributing ratio of the first three principal components accounts for 88.87%. They included almost all the information of samples. The two-dimensional projection plot using first and second principal component is a satisfactory clustering effect for the classification and discrimination of boletus bicolor. All boletus bicolor samples were divided into five groups with a classification accuracy of 98.3%. The study demonstrated that wild growing boletus bicolor at species level from different areas can be identified by FTIR spectra combined with principal components analysis.
Tailored multivariate analysis for modulated enhanced diffraction
Caliandro, Rocco; Guccione, Pietro; Nico, Giovanni; ...
2015-10-21
Modulated enhanced diffraction (MED) is a technique allowing the dynamic structural characterization of crystalline materials subjected to an external stimulus, which is particularly suited forin situandoperandostructural investigations at synchrotron sources. Contributions from the (active) part of the crystal system that varies synchronously with the stimulus can be extracted by an offline analysis, which can only be applied in the case of periodic stimuli and linear system responses. In this paper a new decomposition approach based on multivariate analysis is proposed. The standard principal component analysis (PCA) is adapted to treat MED data: specific figures of merit based on their scoresmore » and loadings are found, and the directions of the principal components obtained by PCA are modified to maximize such figures of merit. As a result, a general method to decompose MED data, called optimum constrained components rotation (OCCR), is developed, which produces very precise results on simulated data, even in the case of nonperiodic stimuli and/or nonlinear responses. Furthermore, the multivariate analysis approach is able to supply in one shot both the diffraction pattern related to the active atoms (through the OCCR loadings) and the time dependence of the system response (through the OCCR scores). Furthermore, when applied to real data, OCCR was able to supply only the latter information, as the former was hindered by changes in abundances of different crystal phases, which occurred besides structural variations in the specific case considered. In order to develop a decomposition procedure able to cope with this combined effect represents the next challenge in MED analysis.« less
Motegi, Hiromi; Tsuboi, Yuuri; Saga, Ayako; Kagami, Tomoko; Inoue, Maki; Toki, Hideaki; Minowa, Osamu; Noda, Tetsuo; Kikuchi, Jun
2015-11-04
There is an increasing need to use multivariate statistical methods for understanding biological functions, identifying the mechanisms of diseases, and exploring biomarkers. In addition to classical analyses such as hierarchical cluster analysis, principal component analysis, and partial least squares discriminant analysis, various multivariate strategies, including independent component analysis, non-negative matrix factorization, and multivariate curve resolution, have recently been proposed. However, determining the number of components is problematic. Despite the proposal of several different methods, no satisfactory approach has yet been reported. To resolve this problem, we implemented a new idea: classifying a component as "reliable" or "unreliable" based on the reproducibility of its appearance, regardless of the number of components in the calculation. Using the clustering method for classification, we applied this idea to multivariate curve resolution-alternating least squares (MCR-ALS). Comparisons between conventional and modified methods applied to proton nuclear magnetic resonance ((1)H-NMR) spectral datasets derived from known standard mixtures and biological mixtures (urine and feces of mice) revealed that more plausible results are obtained by the modified method. In particular, clusters containing little information were detected with reliability. This strategy, named "cluster-aided MCR-ALS," will facilitate the attainment of more reliable results in the metabolomics datasets.
How multi segmental patterns deviate in spastic diplegia from typical developed.
Zago, Matteo; Sforza, Chiarella; Bona, Alessia; Cimolin, Veronica; Costici, Pier Francesco; Condoluci, Claudia; Galli, Manuela
2017-10-01
The relationship between gait features and coordination in children with Cerebral Palsy is not sufficiently analyzed yet. Principal Component Analysis can help in understanding motion patterns decomposing movement into its fundamental components (Principal Movements). This study aims at quantitatively characterizing the functional connections between multi-joint gait patterns in Cerebral Palsy. 65 children with spastic diplegia aged 10.6 (SD 3.7) years participated in standardized gait analysis trials; 31 typically developing adolescents aged 13.6 (4.4) years were also tested. To determine if posture affects gait patterns, patients were split into Crouch and knee Hyperextension group according to knee flexion angle at standing. 3D coordinates of hips, knees, ankles, metatarsal joints, pelvis and shoulders were submitted to Principal Component Analysis. Four Principal Movements accounted for 99% of global variance; components 1-3 explained major sagittal patterns, components 4-5 referred to movements on frontal plane and component 6 to additional movement refinements. Dimensionality was higher in patients than in controls (p<0.01), and the Crouch group significantly differed from controls in the application of components 1 and 4-6 (p<0.05), while the knee Hyperextension group in components 1-2 and 5 (p<0.05). Compensatory strategies of children with Cerebral Palsy (interactions between main and secondary movement patterns), were objectively determined. Principal Movements can reduce the effort in interpreting gait reports, providing an immediate and quantitative picture of the connections between movement components. Copyright © 2017 Elsevier Ltd. All rights reserved.
Constrained Principal Component Analysis: Various Applications.
ERIC Educational Resources Information Center
Hunter, Michael; Takane, Yoshio
2002-01-01
Provides example applications of constrained principal component analysis (CPCA) that illustrate the method on a variety of contexts common to psychological research. Two new analyses, decompositions into finer components and fitting higher order structures, are presented, followed by an illustration of CPCA on contingency tables and the CPCA of…
A baseline drift detrending technique for fast scan cyclic voltammetry.
DeWaele, Mark; Oh, Yoonbae; Park, Cheonho; Kang, Yu Min; Shin, Hojin; Blaha, Charles D; Bennet, Kevin E; Kim, In Young; Lee, Kendall H; Jang, Dong Pyo
2017-11-06
Fast scan cyclic voltammetry (FSCV) has been commonly used to measure extracellular neurotransmitter concentrations in the brain. Due to the unstable nature of the background currents inherent in FSCV measurements, analysis of FSCV data is limited to very short amounts of time using traditional background subtraction. In this paper, we propose the use of a zero-phase high pass filter (HPF) as the means to remove the background drift. Instead of the traditional method of low pass filtering across voltammograms to increase the signal to noise ratio, a HPF with a low cutoff frequency was applied to the temporal dataset at each voltage point to remove the background drift. As a result, the HPF utilizing cutoff frequencies between 0.001 Hz and 0.01 Hz could be effectively used to a set of FSCV data for removing the drifting patterns while preserving the temporal kinetics of the phasic dopamine response recorded in vivo. In addition, compared to a drift removal method using principal component analysis, this was found to be significantly more effective in reducing the drift (unpaired t-test p < 0.0001, t = 10.88) when applied to data collected from Tris buffer over 24 hours although a drift removal method using principal component analysis also showed the effective background drift reduction. The HPF was also applied to 5 hours of FSCV in vivo data. Electrically evoked dopamine peaks, observed in the nucleus accumbens, were clearly visible even without background subtraction. This technique provides a new, simple, and yet robust, approach to analyse FSCV data with an unstable background.
Gorgulho, B M; Pot, G K; Marchioni, D M
2017-05-01
The aim of this study was to evaluate the validity and reliability of the Main Meal Quality Index when applied on the UK population. The indicator was developed to assess meal quality in different populations, and is composed of 10 components: fruit, vegetables (excluding potatoes), ratio of animal protein to total protein, fiber, carbohydrate, total fat, saturated fat, processed meat, sugary beverages and desserts, and energy density, resulting in a score range of 0-100 points. The performance of the indicator was measured using strategies for assessing content validity, construct validity, discriminant validity and reliability, including principal component analysis, linear regression models and Cronbach's alpha. The indicator presented good reliability. The Main Meal Quality Index has been shown to be valid for use as an instrument to evaluate, monitor and compare the quality of meals consumed by adults in the United Kingdom.
NASA Astrophysics Data System (ADS)
Ginanjar, Irlandia; Pasaribu, Udjianna S.; Indratno, Sapto W.
2017-03-01
This article presents the application of the principal component analysis (PCA) biplot for the needs of data mining. This article aims to simplify and objectify the methods for objects clustering in PCA biplot. The novelty of this paper is to get a measure that can be used to objectify the objects clustering in PCA biplot. Orthonormal eigenvectors, which are the coefficients of a principal component model representing an association between principal components and initial variables. The existence of the association is a valid ground to objects clustering based on principal axes value, thus if m principal axes used in the PCA, then the objects can be classified into 2m clusters. The inter-city buses are clustered based on maintenance costs data by using two principal axes PCA biplot. The buses are clustered into four groups. The first group is the buses with high maintenance costs, especially for lube, and brake canvass. The second group is the buses with high maintenance costs, especially for tire, and filter. The third group is the buses with low maintenance costs, especially for lube, and brake canvass. The fourth group is buses with low maintenance costs, especially for tire, and filter.
PCA based clustering for brain tumor segmentation of T1w MRI images.
Kaya, Irem Ersöz; Pehlivanlı, Ayça Çakmak; Sekizkardeş, Emine Gezmez; Ibrikci, Turgay
2017-03-01
Medical images are huge collections of information that are difficult to store and process consuming extensive computing time. Therefore, the reduction techniques are commonly used as a data pre-processing step to make the image data less complex so that a high-dimensional data can be identified by an appropriate low-dimensional representation. PCA is one of the most popular multivariate methods for data reduction. This paper is focused on T1-weighted MRI images clustering for brain tumor segmentation with dimension reduction by different common Principle Component Analysis (PCA) algorithms. Our primary aim is to present a comparison between different variations of PCA algorithms on MRIs for two cluster methods. Five most common PCA algorithms; namely the conventional PCA, Probabilistic Principal Component Analysis (PPCA), Expectation Maximization Based Principal Component Analysis (EM-PCA), Generalize Hebbian Algorithm (GHA), and Adaptive Principal Component Extraction (APEX) were applied to reduce dimensionality in advance of two clustering algorithms, K-Means and Fuzzy C-Means. In the study, the T1-weighted MRI images of the human brain with brain tumor were used for clustering. In addition to the original size of 512 lines and 512 pixels per line, three more different sizes, 256 × 256, 128 × 128 and 64 × 64, were included in the study to examine their effect on the methods. The obtained results were compared in terms of both the reconstruction errors and the Euclidean distance errors among the clustered images containing the same number of principle components. According to the findings, the PPCA obtained the best results among all others. Furthermore, the EM-PCA and the PPCA assisted K-Means algorithm to accomplish the best clustering performance in the majority as well as achieving significant results with both clustering algorithms for all size of T1w MRI images. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Structure of the principal olfactory tract.
Gil-Carcedo, L M; Vallejo, L A; Gil-Carcedo, E
2000-01-01
Although the purpose and importance of the sense of smell in human beings has not been totally clarified, it is one of the principal information channels in macrosmatic animals. It was the first long-distance information system to have appeared in phylogenetic evolution. The objective of this article is to deepen the knowledge of the pathways that join the olfactory epithelium with the cortical olfaction areas, to better understand olfactory dysfunction in human beings. Differential staining and marking techniques were applied to histologic sections obtained from 155 animals of different species, to study the different connections existing among olfactory tract components. Our study of the connections between the olfactory mucosa and the principal olfactory bulb deserves special mention. The distribution of second neuron connections of the olfactory tract with the central nervous system is quite complex and diffuse. This indicates an interrelation between the sense of smell and a multitude of functions. These connections seem to be of different quantitative importance according to species, but qualitatively they exist in both human beings and other macrosmatic animals.
Kakio, Tomoko; Nagase, Hitomi; Takaoka, Takashi; Yoshida, Naoko; Hirakawa, Junichi; Macha, Susan; Hiroshima, Takashi; Ikeda, Yukihiro; Tsuboi, Hirohito; Kimura, Kazuko
2018-06-01
The World Health Organization has warned that substandard and falsified medical products (SFs) can harm patients and fail to treat the diseases for which they were intended, and they affect every region of the world, leading to loss of confidence in medicines, health-care providers, and health systems. Therefore, development of analytical procedures to detect SFs is extremely important. In this study, we investigated the quality of pharmaceutical tablets containing the antihypertensive candesartan cilexetil, collected in China, Indonesia, Japan, and Myanmar, using the Japanese pharmacopeial analytical procedures for quality control, together with principal component analysis (PCA) of Raman spectrum obtained with handheld Raman spectrometer. Some samples showed delayed dissolution and failed to meet the pharmacopeial specification, whereas others failed the assay test. These products appeared to be substandard. Principal component analysis showed that all Raman spectra could be explained in terms of two components: the amount of the active pharmaceutical ingredient and the kinds of excipients. Principal component analysis score plot indicated one substandard, and the falsified tablets have similar principal components in Raman spectra, in contrast to authentic products. The locations of samples within the PCA score plot varied according to the source country, suggesting that manufacturers in different countries use different excipients. Our results indicate that the handheld Raman device will be useful for detection of SFs in the field. Principal component analysis of that Raman data clarify the difference in chemical properties between good quality products and SFs that circulate in the Asian market.
Dabo-Niang, S; Zoueu, J T
2012-09-01
In this communication, we demonstrate how kriging, combine with multispectral and multimodal microscopy can enhance the resolution of malaria-infected images and provide more details on their composition, for analysis and diagnosis. The results of this interpolation applied to the two principal components of multispectral and multimodal images illustrate that the examination of the content of Plasmodium falciparum infected human erythrocyte is improved. © 2012 The Authors Journal of Microscopy © 2012 Royal Microscopical Society.
Halouska, Steven; Chacon, Ofelia; Fenton, Robert J.; Zinniel, Denise K.; Barletta, Raul G.; Powers, Robert
2008-01-01
D-cycloserine (DCS) is only used with multi-drug resistant strains of tuberculosis because of serious side-effects. DCS is known to inhibit cell wall biosynthesis, but the in vivo lethal target is still unknown. We have applied NMR-based metabolomics combined with principal component analysis to monitor the in vivo affect of DCS on M. smegmatis. Our analysis suggests DCS functions by inhibiting multiple protein targets. PMID:17979227
Principal component analysis and the locus of the Fréchet mean in the space of phylogenetic trees.
Nye, Tom M W; Tang, Xiaoxian; Weyenberg, Grady; Yoshida, Ruriko
2017-12-01
Evolutionary relationships are represented by phylogenetic trees, and a phylogenetic analysis of gene sequences typically produces a collection of these trees, one for each gene in the analysis. Analysis of samples of trees is difficult due to the multi-dimensionality of the space of possible trees. In Euclidean spaces, principal component analysis is a popular method of reducing high-dimensional data to a low-dimensional representation that preserves much of the sample's structure. However, the space of all phylogenetic trees on a fixed set of species does not form a Euclidean vector space, and methods adapted to tree space are needed. Previous work introduced the notion of a principal geodesic in this space, analogous to the first principal component. Here we propose a geometric object for tree space similar to the [Formula: see text]th principal component in Euclidean space: the locus of the weighted Fréchet mean of [Formula: see text] vertex trees when the weights vary over the [Formula: see text]-simplex. We establish some basic properties of these objects, in particular showing that they have dimension [Formula: see text], and propose algorithms for projection onto these surfaces and for finding the principal locus associated with a sample of trees. Simulation studies demonstrate that these algorithms perform well, and analyses of two datasets, containing Apicomplexa and African coelacanth genomes respectively, reveal important structure from the second principal components.
Multivariate classification of the infrared spectra of cell and tissue samples
DOE Office of Scientific and Technical Information (OSTI.GOV)
Haaland, D.M.; Jones, H.D.; Thomas, E.V.
1997-03-01
Infrared microspectroscopy of biopsied canine lymph cells and tissue was performed to investigate the possibility of using IR spectra coupled with multivariate classification methods to classify the samples as normal, hyperplastic, or neoplastic (malignant). IR spectra were obtained in transmission mode through BaF{sub 2} windows and in reflection mode from samples prepared on gold-coated microscope slides. Cytology and histopathology samples were prepared by a variety of methods to identify the optimal methods of sample preparation. Cytospinning procedures that yielded a monolayer of cells on the BaF{sub 2} windows produced a limited set of IR transmission spectra. These transmission spectra weremore » converted to absorbance and formed the basis for a classification rule that yielded 100{percent} correct classification in a cross-validated context. Classifications of normal, hyperplastic, and neoplastic cell sample spectra were achieved by using both partial least-squares (PLS) and principal component regression (PCR) classification methods. Linear discriminant analysis applied to principal components obtained from the spectral data yielded a small number of misclassifications. PLS weight loading vectors yield valuable qualitative insight into the molecular changes that are responsible for the success of the infrared classification. These successful classification results show promise for assisting pathologists in the diagnosis of cell types and offer future potential for {ital in vivo} IR detection of some types of cancer. {copyright} {ital 1997} {ital Society for Applied Spectroscopy}« less
Sghaier, Lilia; Cordella, Christophe B Y; Rutledge, Douglas N; Lefèvre, Fanny; Watiez, Mickaël; Breton, Sylvie; Sassiat, Patrick; Thiebaut, Didier; Vial, Jérôme
2017-06-01
Lipid oxidation leads to the formation of volatile compounds and very often to off-flavors. In the case of the heating of rapeseed oil, unpleasant odors, characterized as a fishy odor, are emitted. In this study, 2 different essential oils (coriander and nutmeg essential oils) were added to refined rapeseed oil as odor masking agents. The aim of this work was to determine a potential antioxidant effect of these essential oils on the thermal stability of rapeseed oil subject to heating cycles between room temperature and 180 °C. For this purpose, normed determinations of different parameters (peroxide value, anisidine value, and the content of total polar compounds, free fatty acids and tocopherols) were carried out to examine the differences between pure and degraded oil. No significant difference was observed between pure rapeseed oil and rapeseed oil with essential oils for each parameter separately. However, a stabilizing effect of the essential oils, with a higher effect for the nutmeg essential oil was highlighted by principal component analysis applied on physicochemical dataset. Moreover, the analysis of the volatile compounds performed by GC × GC showed a substantial loss of the volatile compounds of the essential oils from the first heating cycle. © 2017 Institute of Food Technologists®.
Fractal analysis of scatter imaging signatures to distinguish breast pathologies
NASA Astrophysics Data System (ADS)
Eguizabal, Alma; Laughney, Ashley M.; Krishnaswamy, Venkataramanan; Wells, Wendy A.; Paulsen, Keith D.; Pogue, Brian W.; López-Higuera, José M.; Conde, Olga M.
2013-02-01
Fractal analysis combined with a label-free scattering technique is proposed for describing the pathological architecture of tumors. Clinicians and pathologists are conventionally trained to classify abnormal features such as structural irregularities or high indices of mitosis. The potential of fractal analysis lies in the fact of being a morphometric measure of the irregular structures providing a measure of the object's complexity and self-similarity. As cancer is characterized by disorder and irregularity in tissues, this measure could be related to tumor growth. Fractal analysis has been probed in the understanding of the tumor vasculature network. This work addresses the feasibility of applying fractal analysis to the scattering power map (as a physical modeling) and principal components (as a statistical modeling) provided by a localized reflectance spectroscopic system. Disorder, irregularity and cell size variation in tissue samples is translated into the scattering power and principal components magnitude and its fractal dimension is correlated with the pathologist assessment of the samples. The fractal dimension is computed applying the box-counting technique. Results show that fractal analysis of ex-vivo fresh tissue samples exhibits separated ranges of fractal dimension that could help classifier combining the fractal results with other morphological features. This contrast trend would help in the discrimination of tissues in the intraoperative context and may serve as a useful adjunct to surgeons.
NASA Astrophysics Data System (ADS)
Ye, M.; Pacheco Castro, R. B.; Pacheco Avila, J.; Cabrera Sansores, A.
2014-12-01
The karstic aquifer of Yucatan is a vulnerable and complex system. The first fifteen meters of this aquifer have been polluted, due to this the protection of this resource is important because is the only source of potable water of the entire State. Through the assessment of groundwater quality we can gain some knowledge about the main processes governing water chemistry as well as spatial patterns which are important to establish protection zones. In this work multivariate statistical techniques are used to assess the groundwater quality of the supply wells (30 to 40 meters deep) in the hidrogeologic region of the Ring of Cenotes, located in Yucatan, Mexico. Cluster analysis and principal component analysis are applied in groundwater chemistry data of the study area. Results of principal component analysis show that the main sources of variation in the data are due sea water intrusion and the interaction of the water with the carbonate rocks of the system and some pollution processes. The cluster analysis shows that the data can be divided in four clusters. The spatial distribution of the clusters seems to be random, but is consistent with sea water intrusion and pollution with nitrates. The overall results show that multivariate statistical analysis can be successfully applied in the groundwater quality assessment of this karstic aquifer.
Choi, Ji Yeh; Hwang, Heungsun; Yamamoto, Michio; Jung, Kwanghee; Woodward, Todd S
2017-06-01
Functional principal component analysis (FPCA) and functional multiple-set canonical correlation analysis (FMCCA) are data reduction techniques for functional data that are collected in the form of smooth curves or functions over a continuum such as time or space. In FPCA, low-dimensional components are extracted from a single functional dataset such that they explain the most variance of the dataset, whereas in FMCCA, low-dimensional components are obtained from each of multiple functional datasets in such a way that the associations among the components are maximized across the different sets. In this paper, we propose a unified approach to FPCA and FMCCA. The proposed approach subsumes both techniques as special cases. Furthermore, it permits a compromise between the techniques, such that components are obtained from each set of functional data to maximize their associations across different datasets, while accounting for the variance of the data well. We propose a single optimization criterion for the proposed approach, and develop an alternating regularized least squares algorithm to minimize the criterion in combination with basis function approximations to functions. We conduct a simulation study to investigate the performance of the proposed approach based on synthetic data. We also apply the approach for the analysis of multiple-subject functional magnetic resonance imaging data to obtain low-dimensional components of blood-oxygen level-dependent signal changes of the brain over time, which are highly correlated across the subjects as well as representative of the data. The extracted components are used to identify networks of neural activity that are commonly activated across the subjects while carrying out a working memory task.
Paris, Guillaume; Ramseyer, Christophe; Enescu, Mironel
2014-05-01
The conformational dynamics of human serum albumin (HSA) was investigated by principal component analysis (PCA) applied to three molecular dynamics trajectories of 200 ns each. The overlap of the essential subspaces spanned by the first 10 principal components (PC) of different trajectories was about 0.3 showing that the PCA based on a trajectory length of 200 ns is not completely convergent for this protein. The contributions of the relative motion of subdomains and of the subdomains (internal) distortion to the first 10 PCs were found to be comparable. Based on the distribution of the first 3 PC, 10 protein conformers are identified showing relative root mean square deviations (RMSD) between 2.3 and 4.6 Å. The main PCs are found to be delocalized over the whole protein structure indicating that the motions of different protein subdomains are coupled. This coupling is considered as being related to the allosteric effects observed upon ligand binding to HSA. On the other hand, the first PC of one of the three trajectories describes a conformational transition of the protein domain I that is close to that experimentally observed upon myristate binding. This is a theoretical support for the older hypothesis stating that changes of the protein onformation favorable to binding can precede the ligand complexation. A detailed all atoms PCA performed on the primary Sites 1 and 2 confirms the multiconformational character of the HSA binding sites as well as the significant coupling of their motions. Copyright © 2013 Wiley Periodicals, Inc.
Genetic Classification of Populations Using Supervised Learning
Bridges, Michael; Heron, Elizabeth A.; O'Dushlaine, Colm; Segurado, Ricardo; Morris, Derek; Corvin, Aiden; Gill, Michael; Pinto, Carlos
2011-01-01
There are many instances in genetics in which we wish to determine whether two candidate populations are distinguishable on the basis of their genetic structure. Examples include populations which are geographically separated, case–control studies and quality control (when participants in a study have been genotyped at different laboratories). This latter application is of particular importance in the era of large scale genome wide association studies, when collections of individuals genotyped at different locations are being merged to provide increased power. The traditional method for detecting structure within a population is some form of exploratory technique such as principal components analysis. Such methods, which do not utilise our prior knowledge of the membership of the candidate populations. are termed unsupervised. Supervised methods, on the other hand are able to utilise this prior knowledge when it is available. In this paper we demonstrate that in such cases modern supervised approaches are a more appropriate tool for detecting genetic differences between populations. We apply two such methods, (neural networks and support vector machines) to the classification of three populations (two from Scotland and one from Bulgaria). The sensitivity exhibited by both these methods is considerably higher than that attained by principal components analysis and in fact comfortably exceeds a recently conjectured theoretical limit on the sensitivity of unsupervised methods. In particular, our methods can distinguish between the two Scottish populations, where principal components analysis cannot. We suggest, on the basis of our results that a supervised learning approach should be the method of choice when classifying individuals into pre-defined populations, particularly in quality control for large scale genome wide association studies. PMID:21589856
[Geographical distribution of left ventricular Tei index based on principal component analysis].
Xu, Jinhui; Ge, Miao; He, Jinwei; Xue, Ranyin; Yang, Shaofang; Jiang, Jilin
2014-11-01
To provide a scientific standard of left ventricular Tei index for healthy people from various region of China, and to lay a reliable foundation for the evaluation of left ventricular diastolic and systolic function. The correlation and principal component analysis were used to explore the left ventricular Tei index, which based on the data of 3 562 samples from 50 regions of China by means of literature retrieval. Th e nine geographical factors were longitude(X₁), latitude(X₂), altitude(X₃), annual sunshine hours (X₄), the annual average temperature (X₅), annual average relative humidity (X₆), annual precipitation (X₇), annual temperature range (X₈) and annual average wind speed (X₉). ArcGIS soft ware was applied to calculate the spatial distribution regularities of left ventricular Tei index. There is a significant correlation between the healthy people's left ventricular Tei index and geographical factors, and the correlation coefficients were -0.107 (r₁), -0.301 (r₂), -0.029 (r₃), -0.277 (r₄), -0.256(r₅), -0.289(r₆), -0.320(r₇), -0.310 (r₈) and -0.117 (r₉), respectively. A linear equation between the Tei index and the geographical factor was obtained by regression analysis based on the three extracting principal components. The geographical distribution tendency chart for healthy people's left Tei index was fitted out by the ArcGIS spatial interpolation analysis. The geographical distribution for left ventricular Tei index in China follows certain pattern. The reference value in North is higher than that in South, while the value in East is higher than that in West.
Ding, Haiquan; Lu, Qipeng; Gao, Hongzhi; Peng, Zhongqi
2014-01-01
To facilitate non-invasive diagnosis of anemia, specific equipment was developed, and non-invasive hemoglobin (HB) detection method based on back propagation artificial neural network (BP-ANN) was studied. In this paper, we combined a broadband light source composed of 9 LEDs with grating spectrograph and Si photodiode array, and then developed a high-performance spectrophotometric system. By using this equipment, fingertip spectra of 109 volunteers were measured. In order to deduct the interference of redundant data, principal component analysis (PCA) was applied to reduce the dimensionality of collected spectra. Then the principal components of the spectra were taken as input of BP-ANN model. On this basis we obtained the optimal network structure, in which node numbers of input layer, hidden layer, and output layer was 9, 11, and 1. Calibration and correction sample sets were used for analyzing the accuracy of non-invasive hemoglobin measurement, and prediction sample set was used for testing the adaptability of the model. The correlation coefficient of network model established by this method is 0.94, standard error of calibration, correction, and prediction are 11.29g/L, 11.47g/L, and 11.01g/L respectively. The result proves that there exist good correlations between spectra of three sample sets and actual hemoglobin level, and the model has a good robustness. It is indicated that the developed spectrophotometric system has potential for the non-invasive detection of HB levels with the method of BP-ANN combined with PCA. PMID:24761296
Study on pattern recognition of Raman spectrum based on fuzzy neural network
NASA Astrophysics Data System (ADS)
Zheng, Xiangxiang; Lv, Xiaoyi; Mo, Jiaqing
2017-10-01
Hydatid disease is a serious parasitic disease in many regions worldwide, especially in Xinjiang, China. Raman spectrum of the serum of patients with echinococcosis was selected as the research object in this paper. The Raman spectrum of blood samples from healthy people and patients with echinococcosis are measured, of which the spectrum characteristics are analyzed. The fuzzy neural network not only has the ability of fuzzy logic to deal with uncertain information, but also has the ability to store knowledge of neural network, so it is combined with the Raman spectrum on the disease diagnosis problem based on Raman spectrum. Firstly, principal component analysis (PCA) is used to extract the principal components of the Raman spectrum, reducing the network input and accelerating the prediction speed and accuracy of Network based on remaining the original data. Then, the information of the extracted principal component is used as the input of the neural network, the hidden layer of the network is the generation of rules and the inference process, and the output layer of the network is fuzzy classification output. Finally, a part of samples are randomly selected for the use of training network, then the trained network is used for predicting the rest of the samples, and the predicted results are compared with general BP neural network to illustrate the feasibility and advantages of fuzzy neural network. Success in this endeavor would be helpful for the research work of spectroscopic diagnosis of disease and it can be applied in practice in many other spectral analysis technique fields.
NASA Astrophysics Data System (ADS)
Singh, Dharmendra; Kumar, Harish
Earth observation satellites provide data that covers different portions of the electromagnetic spectrum at different spatial and spectral resolutions. The increasing availability of information products generated from satellite images are extending the ability to understand the patterns and dynamics of the earth resource systems at all scales of inquiry. In which one of the most important application is the generation of land cover classification from satellite images for understanding the actual status of various land cover classes. The prospect for the use of satel-lite images in land cover classification is an extremely promising one. The quality of satellite images available for land-use mapping is improving rapidly by development of advanced sensor technology. Particularly noteworthy in this regard is the improved spatial and spectral reso-lution of the images captured by new satellite sensors like MODIS, ASTER, Landsat 7, and SPOT 5. For the full exploitation of increasingly sophisticated multisource data, fusion tech-niques are being developed. Fused images may enhance the interpretation capabilities. The images used for fusion have different temporal, and spatial resolution. Therefore, the fused image provides a more complete view of the observed objects. It is one of the main aim of image fusion to integrate different data in order to obtain more information that can be de-rived from each of the single sensor data alone. A good example of this is the fusion of images acquired by different sensors having a different spatial resolution and of different spectral res-olution. Researchers are applying the fusion technique since from three decades and propose various useful methods and techniques. The importance of high-quality synthesis of spectral information is well suited and implemented for land cover classification. More recently, an underlying multiresolution analysis employing the discrete wavelet transform has been used in image fusion. It was found that multisensor image fusion is a tradeoff between the spectral information from a low resolution multi-spectral images and the spatial information from a high resolution multi-spectral images. With the wavelet transform based fusion method, it is easy to control this tradeoff. A new transform, the curvelet transform was used in recent years by Starck. A ridgelet transform is applied to square blocks of detail frames of undecimated wavelet decomposition, consequently the curvelet transform is obtained. Since the ridgelet transform possesses basis functions matching directional straight lines therefore, the curvelet transform is capable of representing piecewise linear contours on multiple scales through few significant coefficients. This property leads to a better separation between geometric details and background noise, which may be easily reduced by thresholding curvelet coefficients before they are used for fusion. The Terra and Aqua Moderate Resolution Imaging Spectroradiometer (MODIS) instrument provides high radiometric sensitivity (12 bit) in 36 spectral bands ranging in wavelength from 0.4 m to 14.4 m and also it is freely available. Two bands are imaged at a nominal resolution of 250 m at nadir, with five bands at 500 m, and the remaining 29 bands at 1 km. In this paper, the band 1 of spatial resolution 250 m and bandwidth 620-670 nm, and band 2, of spatial resolution of 250m and bandwidth 842-876 nm is considered as these bands has special features to identify the agriculture and other land covers. In January 2006, the Advanced Land Observing Satellite (ALOS) was successfully launched by the Japan Aerospace Exploration Agency (JAXA). The Phased Arraytype L-band SAR (PALSAR) sensor onboard the satellite acquires SAR imagery at a wavelength of 23.5 cm (frequency 1.27 GHz) with capabilities of multimode and multipolarization observation. PALSAR can operate in several modes: the fine-beam single (FBS) polarization mode (HH), fine-beam dual (FBD) polariza-tion mode (HH/HV or VV/VH), polarimetric (PLR) mode (HH/HV/VH/VV), and ScanSAR (WB) mode (HH/VV) [15]. These makes PALSAR imagery very attractive for spatially and temporally consistent monitoring system. The Overview of Principal Component Analysis is that the most of the information within all the bands can be compressed into a much smaller number of bands with little loss of information. It allows us to extract the low-dimensional subspaces that capture the main linear correlation among the high-dimensional image data. This facilitates viewing the explained variance or signal in the available imagery, allowing both gross and more subtle features in the imagery to be seen. In this paper we have explored the fusion technique for enhancing the land cover classification of low resolution satellite data espe-cially freely available satellite data. For this purpose, we have considered to fuse the PALSAR principal component data with MODIS principal component data. Initially, the MODIS band 1 and band 2 is considered, its principal component is computed. Similarly the PALSAR HH, HV and VV polarized data are considered, and there principal component is also computed. con-sequently, the PALSAR principal component image is fused with MODIS principal component image. The aim of this paper is to analyze the effect of classification accuracy on major type of land cover types like agriculture, water and urban bodies with fusion of PALSAR data to MODIS data. Curvelet transformation has been applied for fusion of these two satellite images and Minimum Distance classification technique has been applied for the resultant fused image. It is qualitatively and visually observed that the overall classification accuracy of MODIS image after fusion is enhanced. This type of fusion technique may be quite helpful in near future to use freely available satellite data to develop monitoring system for different land cover classes on the earth.
Remembering the dynamic changes in pain intensity and unpleasantness: a psychophysical study.
Khoshnejad, Mina; Fortin, Marie C; Rohani, Farzan; Duncan, Gary H; Rainville, Pierre
2014-03-01
This study investigated the short-term memory of dynamic changes in acute pain using psychophysical methods. Pain intensity or unpleasantness induced by painful contact-heat stimuli of 8, 9, or 10s was rated continuously during the stimulus or after a 14-s delay using an electronic visual analog scale in 10 healthy volunteers. Because the continuous visual analog scale time courses contained large amounts of redundant information, a principal component analysis was applied to characterize the main features inherent to both the concurrent rating and retrospective evaluations. Three components explained about 90% of the total variance across all trials and subjects, with the first component reflecting the global perceptual profile, and the second and third components explaining finer perceptual aspects (eg, changes in slope at onset and offset and shifts in peak latency). We postulate that these 3 principal components may provide some information about the structure of the mental representations of what one perceives, stores, and remembers during the course of few seconds. Analysis performed on the components confirmed significant memory distortions and revealed that the discriminative information about pain dimensions in concurrent ratings was partly or completely lost in retrospective ratings. Importantly, our results highlight individual differences affecting these memory processes. These results provide further evidence of the important transformations underlying the processing of pain in explicit memory and raise fundamental questions about the conversion of dynamic nociceptive signals into a mental representation of pain in perception and memory. Copyright © 2013 International Association for the Study of Pain. Published by Elsevier B.V. All rights reserved.
Independent component analysis decomposition of hospital emergency department throughput measures
NASA Astrophysics Data System (ADS)
He, Qiang; Chu, Henry
2016-05-01
We present a method adapted from medical sensor data analysis, viz. independent component analysis of electroencephalography data, to health system analysis. Timely and effective care in a hospital emergency department is measured by throughput measures such as median times patients spent before they were admitted as an inpatient, before they were sent home, before they were seen by a healthcare professional. We consider a set of five such measures collected at 3,086 hospitals distributed across the U.S. One model of the performance of an emergency department is that these correlated throughput measures are linear combinations of some underlying sources. The independent component analysis decomposition of the data set can thus be viewed as transforming a set of performance measures collected at a site to a collection of outputs of spatial filters applied to the whole multi-measure data. We compare the independent component sources with the output of the conventional principal component analysis to show that the independent components are more suitable for understanding the data sets through visualizations.
Meyer, Karin; Kirkpatrick, Mark
2005-01-01
Principal component analysis is a widely used 'dimension reduction' technique, albeit generally at a phenotypic level. It is shown that we can estimate genetic principal components directly through a simple reparameterisation of the usual linear, mixed model. This is applicable to any analysis fitting multiple, correlated genetic effects, whether effects for individual traits or sets of random regression coefficients to model trajectories. Depending on the magnitude of genetic correlation, a subset of the principal component generally suffices to capture the bulk of genetic variation. Corresponding estimates of genetic covariance matrices are more parsimonious, have reduced rank and are smoothed, with the number of parameters required to model the dispersion structure reduced from k(k + 1)/2 to m(2k - m + 1)/2 for k effects and m principal components. Estimation of these parameters, the largest eigenvalues and pertaining eigenvectors of the genetic covariance matrix, via restricted maximum likelihood using derivatives of the likelihood, is described. It is shown that reduced rank estimation can reduce computational requirements of multivariate analyses substantially. An application to the analysis of eight traits recorded via live ultrasound scanning of beef cattle is given. PMID:15588566
Morin, R.H.
1997-01-01
Returns from drilling in unconsolidated cobble and sand aquifers commonly do not identify lithologic changes that may be meaningful for Hydrogeologic investigations. Vertical resolution of saturated, Quaternary, coarse braided-slream deposits is significantly improved by interpreting natural gamma (G), epithermal neutron (N), and electromagnetically induced resistivity (IR) logs obtained from wells at the Capital Station site in Boise, Idaho. Interpretation of these geophysical logs is simplified because these sediments are derived largely from high-gamma-producing source rocks (granitics of the Boise River drainage), contain few clays, and have undergone little diagenesis. Analysis of G, N, and IR data from these deposits with principal components analysis provides an objective means to determine if units can be recognized within the braided-stream deposits. In particular, performing principal components analysis on G, N, and IR data from eight wells at Capital Station (1) allows the variable system dimensionality to be reduced from three to two by selecting the two eigenvectors with the greatest variance as axes for principal component scatterplots, (2) generates principal components with interpretable physical meanings, (3) distinguishes sand from cobble-dominated units, and (4) provides a means to distinguish between cobble-dominated units.
Analysis and Evaluation of the Characteristic Taste Components in Portobello Mushroom.
Wang, Jinbin; Li, Wen; Li, Zhengpeng; Wu, Wenhui; Tang, Xueming
2018-05-10
To identify the characteristic taste components of the common cultivated mushroom (brown; Portobello), Agaricus bisporus, taste components in the stipe and pileus of Portobello mushroom harvested at different growth stages were extracted and identified, and principal component analysis (PCA) and taste active value (TAV) were used to reveal the characteristic taste components during the each of the growth stages of Portobello mushroom. In the stipe and pileus, 20 and 14 different principal taste components were identified, respectively, and they were considered as the principal taste components of Portobello mushroom fruit bodies, which included most amino acids and 5'-nucleotides. Some taste components that were found at high levels, such as lactic acid and citric acid, were not detected as Portobello mushroom principal taste components through PCA. However, due to their high content, Portobello mushroom could be used as a source of organic acids. The PCA and TAV results revealed that 5'-GMP, glutamic acid, malic acid, alanine, proline, leucine, and aspartic acid were the characteristic taste components of Portobello mushroom fruit bodies. Portobello mushroom was also found to be rich in protein and amino acids, so it might also be useful in the formulation of nutraceuticals and functional food. The results in this article could provide a theoretical basis for understanding and regulating the characteristic flavor components synthesis process of Portobello mushroom. © 2018 Institute of Food Technologists®.
NASA Astrophysics Data System (ADS)
Kistenev, Yu. V.; Shapovalov, A. V.; Borisov, A. V.; Vrazhnov, D. A.; Nikolaev, V. V.; Nikiforova, O. Y.
2015-12-01
The results of numerical simulation of application principal component analysis to absorption spectra of breath air of patients with pulmonary diseases are presented. Various methods of experimental data preprocessing are analyzed.
NASA Astrophysics Data System (ADS)
Pal, S. K.; Majumdar, T. J.; Bhattacharya, Amit K.
Fusion of optical and synthetic aperture radar data has been attempted in the present study for mapping of various lithologic units over a part of the Singhbhum Shear Zone (SSZ) and its surroundings. ERS-2 SAR data over the study area has been enhanced using Fast Fourier Transformation (FFT) based filtering approach, and also using Frost filtering technique. Both the enhanced SAR imagery have been then separately fused with histogram equalized IRS-1C LISS III image using Principal Component Analysis (PCA) technique. Later, Feature-oriented Principal Components Selection (FPCS) technique has been applied to generate False Color Composite (FCC) images, from which corresponding geological maps have been prepared. Finally, GIS techniques have been successfully used for change detection analysis in the lithological interpretation between the published geological map and the fusion based geological maps. In general, there is good agreement between these maps over a large portion of the study area. Based on the change detection studies, few areas could be identified which need attention for further detailed ground-based geological studies.
Chtourou, Fatma; Jabeur, Hazem; Lazzez, Ayda; Bouaziz, Mohamed
2017-05-03
Dynamics of squalene, sterol, aliphatic alcohol, pigment, and triterpenic diol accumulations in olive oils from adult and young trees of the Oueslati cultivar were studied for two consecutive years, 2013-2014 and 2014-2015. Data were compared statistically for differences by age of trees, maturation of olive, and year of harvesting. Results showed that the mean campesterol content in olive oil from adult trees at the green stage of maturation was significantly (p < 0.02) above the limit established by IOC legislation. However, the mean values of campesterol and Δ-7-stigmastenol were significantly (p < 0.01) above the limits in oils from young trees at the black stage of ripening. Principal component analysis was applied to alcohols, squalene, pigments, and sterols having noncompliance with the legislation. Then, data of 36 samples were subjected to a discriminant analysis with "maturation" as grouping variable and principal components as input variables. The model revealed clear discrimination of each tree age/maturation stage group.
Raman spectra of single cells with autofluorescence suppression by modulated wavelength excitation
NASA Astrophysics Data System (ADS)
Krafft, Christoph; Dochow, Sebastian; Bergner, Norbert; Clement, Joachim H.; Praveen, Bavishna B.; Mazilu, Michael; Marchington, Rob; Dholakia, Kishan; Popp, Jürgen
2012-01-01
Raman spectroscopy is a non-invasive technique offering great potential in the biomedical field for label-free discrimination between normal and tumor cells based on their biochemical composition. First, this contribution describes Raman spectra of lymphocytes after drying, in laser tweezers, and trapped in a microfluidic environment. Second, spectral differences between lymphocytes and acute myeloid leukemia cells (OCI-AML3) are compared for these three experimental conditions. Significant similarities of difference spectra are consistent with the biological relevance of the spectral features. Third, modulated wavelength Raman spectroscopy has been applied to this model system to demonstrate background suppression. Here, the laser excitation wavelength of 785 nm was modulated with a frequency of 40 mHz by 0.6 nm. 40 spectra were accumulated with an exposure time of 5 seconds each. These data were subjected to principal component analysis to calculate modulated Raman signatures. The loading of the principal component shows characteristics of first derivatives with derivative like band shapes. The derivative of this loading corresponds to a pseudo-second derivative spectrum and enables to determine band positions.
Bravo, Ignacio; Mazo, Manuel; Lázaro, José L.; Gardel, Alfredo; Jiménez, Pedro; Pizarro, Daniel
2010-01-01
This paper presents a complete implementation of the Principal Component Analysis (PCA) algorithm in Field Programmable Gate Array (FPGA) devices applied to high rate background segmentation of images. The classical sequential execution of different parts of the PCA algorithm has been parallelized. This parallelization has led to the specific development and implementation in hardware of the different stages of PCA, such as computation of the correlation matrix, matrix diagonalization using the Jacobi method and subspace projections of images. On the application side, the paper presents a motion detection algorithm, also entirely implemented on the FPGA, and based on the developed PCA core. This consists of dynamically thresholding the differences between the input image and the one obtained by expressing the input image using the PCA linear subspace previously obtained as a background model. The proposal achieves a high ratio of processed images (up to 120 frames per second) and high quality segmentation results, with a completely embedded and reliable hardware architecture based on commercial CMOS sensors and FPGA devices. PMID:22163406
Mari, Angela; Montoro, Paola; Pizza, Cosimo; Piacente, Sonia
2012-11-01
A validated analytical method for the quantitative determination of seven chemical markers occurring in a hydroalcoholic extract of Vitex agnus-castus fruits by liquid chromatography electrospray triple quadrupole tandem mass spectrometry (LC/ESI/(QqQ)MSMS) is reported. To carry out a comparative study, five commercial food supplements corresponding to hydroalcoholic extracts of V. agnus-castus fruits were analysed under the same chromatographic conditions of the crude extract. Principal component analysis (PCA), based only on the variation of the amount of the seven chemical markers, was applied in order to find similarities between the hydroalcoholic extract and the food supplements. A second PCA analysis was carried out considering the whole spectroscopic data deriving from liquid chromatography electrospray linear ion trap mass spectrometry (LC/ESI/(LIT)MS) analysis. High similarity between the two PCA was observed, showing the possibility to select one of these two approaches for future applications in the field of comparative analysis of food supplements and quality control procedures. Copyright © 2012 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Díaz-Ayil, Gilberto; Amouroux, Marine; Clanché, Fabien; Granjon, Yves; Blondel, Walter C. P. M.
2009-07-01
Spatially-resolved bimodal spectroscopy (multiple AutoFluorescence AF excitation and Diffuse Reflectance DR), was used in vivo to discriminate various healthy and precancerous skin stages in a pre-clinical model (UV-irradiated mouse): Compensatory Hyperplasia CH, Atypical Hyperplasia AH and Dysplasia D. A specific data preprocessing scheme was applied to intensity spectra (filtering, spectral correction and intensity normalization), and several sets of spectral characteristics were automatically extracted and selected based on their discrimination power, statistically tested for every pair-wise comparison of histological classes. Data reduction with Principal Components Analysis (PCA) was performed and 3 classification methods were implemented (k-NN, LDA and SVM), in order to compare diagnostic performance of each method. Diagnostic performance was studied and assessed in terms of Sensibility (Se) and Specificity (Sp) as a function of the selected features, of the combinations of 3 different inter-fibres distances and of the numbers of principal components, such that: Se and Sp ~ 100% when discriminating CH vs. others; Sp ~ 100% and Se > 95% when discriminating Healthy vs. AH or D; Sp ~ 74% and Se ~ 63% for AH vs. D.
Gu, Qun; David, Frank; Lynen, Frédéric; Rumpel, Klaus; Dugardeyn, Jasper; Van Der Straeten, Dominique; Xu, Guowang; Sandra, Pat
2011-05-27
In this paper, automated sample preparation, retention time locked gas chromatography-mass spectrometry (GC-MS) and data analysis methods for the metabolomics study were evaluated. A miniaturized and automated derivatisation method using sequential oximation and silylation was applied to a polar extract of 4 types (2 types×2 ages) of Arabidopsis thaliana, a popular model organism often used in plant sciences and genetics. Automation of the derivatisation process offers excellent repeatability, and the time between sample preparation and analysis was short and constant, reducing artifact formation. Retention time locked (RTL) gas chromatography-mass spectrometry was used, resulting in reproducible retention times and GC-MS profiles. Two approaches were used for data analysis. XCMS followed by principal component analysis (approach 1) and AMDIS deconvolution combined with a commercially available program (Mass Profiler Professional) followed by principal component analysis (approach 2) were compared. Several features that were up- or down-regulated in the different types were detected. Copyright © 2011 Elsevier B.V. All rights reserved.
Røislien, Jo; Winje, Brita
2013-09-20
Clinical studies frequently include repeated measurements of individuals, often for long periods. We present a methodology for extracting common temporal features across a set of individual time series observations. In particular, the methodology explores extreme observations within the time series, such as spikes, as a possible common temporal phenomenon. Wavelet basis functions are attractive in this sense, as they are localized in both time and frequency domains simultaneously, allowing for localized feature extraction from a time-varying signal. We apply wavelet basis function decomposition of individual time series, with corresponding wavelet shrinkage to remove noise. We then extract common temporal features using linear principal component analysis on the wavelet coefficients, before inverse transformation back to the time domain for clinical interpretation. We demonstrate the methodology on a subset of a large fetal activity study aiming to identify temporal patterns in fetal movement (FM) count data in order to explore formal FM counting as a screening tool for identifying fetal compromise and thus preventing adverse birth outcomes. Copyright © 2013 John Wiley & Sons, Ltd.
Kharroubi, Adel; Gargouri, Dorra; Baati, Houda; Azri, Chafai
2012-06-01
Concentrations of selected heavy metals (Cd, Pb, Zn, Cu, Mn, and Fe) in surface sediments from 66 sites in both northern and eastern Mediterranean Sea-Boughrara lagoon exchange areas (southeastern Tunisia) were studied in order to understand current metal contamination due to the urbanization and economic development of nearby several coastal regions of the Gulf of Gabès. Multiple approaches were applied for the sediment quality assessment. These approaches were based on GIS coupled with chemometric methods (enrichment factors, geoaccumulation index, principal component analysis, and cluster analysis). Enrichment factors and principal component analysis revealed two distinct groups of metals. The first group corresponded to Fe and Mn derived from natural sources, and the second group contained Cd, Pb, Zn, and Cu originated from man-made sources. For these latter metals, cluster analysis showed two distinct distributions in the selected areas. They were attributed to temporal and spatial variations of contaminant sources input. The geoaccumulation index (I (geo)) values explained that only Cd, Pb, and Cu can be considered as moderate to extreme pollutants in the studied sediments.
NASA Astrophysics Data System (ADS)
Yang, Haiqing; Wu, Di; He, Yong
2007-11-01
Near-infrared spectroscopy (NIRS) with the characteristics of high speed, non-destructiveness, high precision and reliable detection data, etc. is a pollution-free, rapid, quantitative and qualitative analysis method. A new approach for variety discrimination of brown sugars using short-wave NIR spectroscopy (800-1050nm) was developed in this work. The relationship between the absorbance spectra and brown sugar varieties was established. The spectral data were compressed by the principal component analysis (PCA). The resulting features can be visualized in principal component (PC) space, which can lead to discovery of structures correlative with the different class of spectral samples. It appears to provide a reasonable variety clustering of brown sugars. The 2-D PCs plot obtained using the first two PCs can be used for the pattern recognition. Least-squares support vector machines (LS-SVM) was applied to solve the multivariate calibration problems in a relatively fast way. The work has shown that short-wave NIR spectroscopy technique is available for the brand identification of brown sugar, and LS-SVM has the better identification ability than PLS when the calibration set is small.
Bravo, Ignacio; Mazo, Manuel; Lázaro, José L; Gardel, Alfredo; Jiménez, Pedro; Pizarro, Daniel
2010-01-01
This paper presents a complete implementation of the Principal Component Analysis (PCA) algorithm in Field Programmable Gate Array (FPGA) devices applied to high rate background segmentation of images. The classical sequential execution of different parts of the PCA algorithm has been parallelized. This parallelization has led to the specific development and implementation in hardware of the different stages of PCA, such as computation of the correlation matrix, matrix diagonalization using the Jacobi method and subspace projections of images. On the application side, the paper presents a motion detection algorithm, also entirely implemented on the FPGA, and based on the developed PCA core. This consists of dynamically thresholding the differences between the input image and the one obtained by expressing the input image using the PCA linear subspace previously obtained as a background model. The proposal achieves a high ratio of processed images (up to 120 frames per second) and high quality segmentation results, with a completely embedded and reliable hardware architecture based on commercial CMOS sensors and FPGA devices.
Nuclear norm-based 2-DPCA for extracting features from images.
Zhang, Fanlong; Yang, Jian; Qian, Jianjun; Xu, Yong
2015-10-01
The 2-D principal component analysis (2-DPCA) is a widely used method for image feature extraction. However, it can be equivalently implemented via image-row-based principal component analysis. This paper presents a structured 2-D method called nuclear norm-based 2-DPCA (N-2-DPCA), which uses a nuclear norm-based reconstruction error criterion. The nuclear norm is a matrix norm, which can provide a structured 2-D characterization for the reconstruction error image. The reconstruction error criterion is minimized by converting the nuclear norm-based optimization problem into a series of F-norm-based optimization problems. In addition, N-2-DPCA is extended to a bilateral projection-based N-2-DPCA (N-B2-DPCA). The virtue of N-B2-DPCA over N-2-DPCA is that an image can be represented with fewer coefficients. N-2-DPCA and N-B2-DPCA are applied to face recognition and reconstruction and evaluated using the Extended Yale B, CMU PIE, FRGC, and AR databases. Experimental results demonstrate the effectiveness of the proposed methods.
Szabo, J.K.; Fedriani, E.M.; Segovia-Gonzalez, M. M.; Astheimer, L.B.; Hooper, M.J.
2010-01-01
This paper introduces a new technique in ecology to analyze spatial and temporal variability in environmental variables. By using simple statistics, we explore the relations between abiotic and biotic variables that influence animal distributions. However, spatial and temporal variability in rainfall, a key variable in ecological studies, can cause difficulties to any basic model including time evolution. The study was of a landscape scale (three million square kilometers in eastern Australia), mainly over the period of 19982004. We simultaneously considered qualitative spatial (soil and habitat types) and quantitative temporal (rainfall) variables in a Geographical Information System environment. In addition to some techniques commonly used in ecology, we applied a new method, Functional Principal Component Analysis, which proved to be very suitable for this case, as it explained more than 97% of the total variance of the rainfall data, providing us with substitute variables that are easier to manage and are even able to explain rainfall patterns. The main variable came from a habitat classification that showed strong correlations with rainfall values and soil types. ?? 2010 World Scientific Publishing Company.
Balsamo, Geisi M; Valentim-Neto, Pedro A; Mello, Carla S; Arisi, Ana C M
2015-12-09
The genetically modified (GM) common bean event Embrapa 5.1 was commercially approved in Brazil in 2011; it is resistant to golden mosaic virus infection. In the present work grain proteome profiles of two Embrapa 5.1 common bean varieties, Pérola and Pontal, and their non-GM counterparts were compared by two-dimensional gel electrophoresis (2-DE) followed by mass spectrometry (MS). Analyses detected 23 spots differentially accumulated between GM Pérola and non-GM Pérola and 21 spots between GM Pontal and non-GM Pontal, although they were not the same proteins in Pérola and Pontal varieties, indicating that the variability observed may not be due to the genetic transformation. Among them, eight proteins were identified in Pérola varieties, and four proteins were identified in Pontal. Moreover, we applied principal component analysis (PCA) on 2-DE data, and variation between varieties was explained in the first two principal components. This work provides a first 2-DE-MS/MS-based analysis of Embrapa 5.1 common bean grains.
Does boldness explain vulnerability to angling in Eurasian perch Perca fluviatilis?
Vainikka, Anssi; Tammela, Ilkka; Hyvärinen, Pekka
2016-04-01
Consistent individual differences (CIDs) in behavior are of interest to both basic and applied research, because any selection acting on them could induce evolution of animal behavior. It has been suggested that CIDs in the behavior of fish might explain individual differences in vulnerability to fishing. If so, fishing could impose selection on fish behavior. In this study, we assessed boldness-indicating behaviors of Eurasian perch Perca fluviatilis using individually conducted experiments measuring the time taken to explore a novel arena containing predator (burbot, Lota lota ) cues. We studied if individual differences in boldness would explain vulnerability of individually tagged perch to experimental angling in outdoor ponds, or if fishing would impose selection on boldness-indicating behavior. Perch expressed repeatable individual differences in boldness-indicating behavior but the individual boldness-score (the first principal component) obtained using principal component analysis combining all the measured behavioral responses did not explain vulnerability to experimental angling. Instead, large body size appeared as the only statistically significant predictor of capture probability. Our results suggest that angling is selective for large size, but not always selective for high boldness.
Does boldness explain vulnerability to angling in Eurasian perch Perca fluviatilis?
Vainikka, Anssi; Tammela, Ilkka; Hyvärinen, Pekka
2016-01-01
Abstract Consistent individual differences (CIDs) in behavior are of interest to both basic and applied research, because any selection acting on them could induce evolution of animal behavior. It has been suggested that CIDs in the behavior of fish might explain individual differences in vulnerability to fishing. If so, fishing could impose selection on fish behavior. In this study, we assessed boldness-indicating behaviors of Eurasian perch Perca fluviatilis using individually conducted experiments measuring the time taken to explore a novel arena containing predator (burbot, Lota lota) cues. We studied if individual differences in boldness would explain vulnerability of individually tagged perch to experimental angling in outdoor ponds, or if fishing would impose selection on boldness-indicating behavior. Perch expressed repeatable individual differences in boldness-indicating behavior but the individual boldness-score (the first principal component) obtained using principal component analysis combining all the measured behavioral responses did not explain vulnerability to experimental angling. Instead, large body size appeared as the only statistically significant predictor of capture probability. Our results suggest that angling is selective for large size, but not always selective for high boldness. PMID:29491897
Tchabo, William; Ma, Yongkun; Kwaw, Emmanuel; Zhang, Haining; Xiao, Lulu; Apaliya, Maurice T
2018-01-15
The four different methods of color measurement of wine proposed by Boulton, Giusti, Glories and Commission International de l'Eclairage (CIE) were applied to assess the statistical relationship between the phytochemical profile and chromatic characteristics of sulfur dioxide-free mulberry (Morus nigra) wine submitted to non-thermal maturation processes. The alteration in chromatic properties and phenolic composition of non-thermal aged mulberry wine were examined, aided by the used of Pearson correlation, cluster and principal component analysis. The results revealed a positive effect of non-thermal processes on phytochemical families of wines. From Pearson correlation analysis relationships between chromatic indexes and flavonols as well as anthocyanins were established. Cluster analysis highlighted similarities between Boulton and Giusti parameters, as well as Glories and CIE parameters in the assessment of chromatic properties of wines. Finally, principal component analysis was able to discriminate wines subjected to different maturation techniques on the basis of their chromatic and phenolics characteristics. Copyright © 2017. Published by Elsevier Ltd.
Zarzo, Manuel; Fernández-Navajas, Angel; García-Diego, Fernando-Juan
2011-01-01
We describe the performance of a microclimate monitoring system that was implemented for the preventive conservation of the Renaissance frescoes in the apse vault of the Cathedral of Valencia, that were restored in 2006. This system comprises 29 relative humidity (RH) and temperature sensors: 10 of them inserted into the plaster layer supporting the fresco paintings, 10 sensors in the walls close to the frescoes and nine sensors measuring the indoor microclimate at different points of the vault. Principal component analysis was applied to RH data recorded in 2007. The analysis was repeated with data collected in 2008 and 2010. The resulting loading plots revealed that the similarities and dissimilarities among sensors were approximately maintained along the three years. A physical interpretation was provided for the first and second principal components. Interestingly, sensors recording the highest RH values correspond to zones where humidity problems are causing formation of efflorescence. Recorded data of RH and temperature are discussed according to Italian Standard UNI 10829 (1999). PMID:22164100
Liu, Tsang-Sen; Lin, Jhen-Nan; Peng, Tsung-Ren
2018-01-16
Isotopic compositions of δ 2 H, δ 18 O, δ 13 C, and δ 15 N and concentrations of 22 trace elements from garlic samples were analyzed and processed with stepwise principal component analysis (PCA) to discriminate garlic's country of origin among Asian regions including South Korea, Vietnam, Taiwan, and China. Results indicate that there is no single trace-element concentration or isotopic composition that can accomplish the study's purpose and the stepwise PCA approach proposed does allow for discrimination between countries on a regional basis. Sequentially, Step-1 PCA distinguishes garlic's country of origin among Taiwanese, South Korean, and Vietnamese samples; Step-2 PCA discriminates Chinese garlic from South Korean garlic; and Step-3 and Step-4 PCA, Chinese garlic from Vietnamese garlic. In model tests, countries of origin of all audit samples were correctly discriminated by stepwise PCA. Consequently, this study demonstrates that stepwise PCA as applied is a simple and effective approach to discriminating country of origin among Asian garlics. © 2018 American Academy of Forensic Sciences.
ERIC Educational Resources Information Center
Mugrage, Beverly; And Others
Three ridge regression solutions are compared with ordinary least squares regression and with principal components regression using all components. Ridge regression, particularly the Lawless-Wang solution, out-performed ordinary least squares regression and the principal components solution on the criteria of stability of coefficient and closeness…
A Note on McDonald's Generalization of Principal Components Analysis
ERIC Educational Resources Information Center
Shine, Lester C., II
1972-01-01
It is shown that McDonald's generalization of Classical Principal Components Analysis to groups of variables maximally channels the totalvariance of the original variables through the groups of variables acting as groups. An equation is obtained for determining the vectors of correlations of the L2 components with the original variables.…
Peterson, Leif E
2002-01-01
CLUSFAVOR (CLUSter and Factor Analysis with Varimax Orthogonal Rotation) 5.0 is a Windows-based computer program for hierarchical cluster and principal-component analysis of microarray-based transcriptional profiles. CLUSFAVOR 5.0 standardizes input data; sorts data according to gene-specific coefficient of variation, standard deviation, average and total expression, and Shannon entropy; performs hierarchical cluster analysis using nearest-neighbor, unweighted pair-group method using arithmetic averages (UPGMA), or furthest-neighbor joining methods, and Euclidean, correlation, or jack-knife distances; and performs principal-component analysis. PMID:12184816
Study on fast discrimination of varieties of yogurt using Vis/NIR-spectroscopy
NASA Astrophysics Data System (ADS)
He, Yong; Feng, Shuijuan; Deng, Xunfei; Li, Xiaoli
2006-09-01
A new approach for discrimination of varieties of yogurt by means of VisINTR-spectroscopy was present in this paper. Firstly, through the principal component analysis (PCA) of spectroscopy curves of 5 typical kinds of yogurt, the clustering of yogurt varieties was processed. The analysis results showed that the cumulate reliabilities of PC1 and PC2 (the first two principle components) were more than 98.956%, and the cumulate reliabilities from PC1 to PC7 (the first seven principle components) was 99.97%. Secondly, a discrimination model of Artificial Neural Network (ANN-BP) was set up. The first seven principles components of the samples were applied as ANN-BP inputs, and the value of type of yogurt were applied as outputs, then the three-layer ANN-BP model was build. In this model, every variety yogurt includes 27 samples, the total number of sample is 135, and the rest 25 samples were used as prediction set. The results showed the distinguishing rate of the five yogurt varieties was 100%. It presented that this model was reliable and practicable. So a new approach for the rapid and lossless discrimination of varieties of yogurt was put forward.
Diagnostics and Active Control of Aircraft Interior Noise
NASA Technical Reports Server (NTRS)
Fuller, C. R.
1998-01-01
This project deals with developing advanced methods for investigating and controlling interior noise in aircraft. The work concentrates on developing and applying the techniques of Near Field Acoustic Holography (NAH) and Principal Component Analysis (PCA) to the aircraft interior noise dynamic problem. This involves investigating the current state of the art, developing new techniques and then applying them to the particular problem being studied. The knowledge gained under the first part of the project was then used to develop and apply new, advanced noise control techniques for reducing interior noise. A new fully active control approach based on the PCA was developed and implemented on a test cylinder. Finally an active-passive approach based on tunable vibration absorbers was to be developed and analytically applied to a range of test structures from simple plates to aircraft fuselages.
NASA Astrophysics Data System (ADS)
Geminale, A.; Grassi, D.; Altieri, F.; Serventi, G.; Carli, C.; Carrozzo, F. G.; Sgavetti, M.; Orosei, R.; D'Aversa, E.; Bellucci, G.; Frigeri, A.
2015-06-01
The aim of this work is to extract the surface contribution in the martian visible/near-infrared spectra removing the atmospheric components by means of Principal Component Analysis (PCA) and target transformation (TT). The developed technique is suitable for separating spectral components in a data set large enough to enable an effective usage of statistical methods, in support to the more common approaches to remove the gaseous component. In this context, a key role is played by the estimation, from the spectral population, of the covariance matrix that describes the statistical correlation of the signal among different points in the spectrum. As a general rule, the covariance matrix becomes more and more meaningful increasing the size of initial population, justifying therefore the importance of sizable datasets. Data collected by imaging spectrometers, such as the OMEGA (Observatoire pour la Minéralogie, l'Eau, les Glaces et l'Activité) instrument on board the ESA mission Mars Express (MEx), are particularly suitable for this purpose since it includes in the same session of observation a large number of spectra with different content of aerosols, gases and mineralogy. The methodology presented in this work has been first validated using a simulated dataset of spectra to evaluate its accuracy. Then, it has been applied to the analysis of OMEGA sessions over Nili Fossae and Mawrth Vallis regions, which have been already widely studied because of the presence of hydrated minerals. These minerals are key components of the surface to investigate the presence of liquid water flowing on the martian surface in the Noachian period. Moreover, since a correction for the atmospheric aerosols (dust) component is also applied to these observations, the present work is able to completely remove the atmospheric contribution from the analysed spectra. Once the surface reflectance, free from atmospheric contributions, has been obtained, the Modified Gaussian Model (MGM) has been applied to spectra showing the hydrated phase. Silicates and iron-bearing hydrated minerals have been identified by means of the electronic transitions of Fe2+ between 0.8 and 1.2 μm, while at longer wavelengths the hydrated mineralogy is identified by overtones of the OH group. Surface reflectance spectra, as derived through the method discussed in this paper, clearly show a lower level of the atmospheric residuals in the 1.9 hydration band, thus resulting in a better match with the MGM deconvolution parameters found for the laboratory spectra of martian hydrated mineral analogues and allowing a deeper investigation of this spectral range.
Konaté, Ahmed Amara; Ma, Huolin; Pan, Heping; Qin, Zhen; Ahmed, Hafizullah Abba; Dembele, N'dji Dit Jacques
2017-10-01
The availability of a deep well that penetrates deep into the Ultra High Pressure (UHP) metamorphic rocks is unusual and consequently offers a unique chance to study the metamorphic rocks. One such borehole is located in the southern part of Donghai County in the Sulu UHP metamorphic belt of Eastern China, from the Chinese Continental Scientific Drilling Main hole. This study reports the results obtained from the analysis of oxide log data. A geochemical logging tool provides in situ, gamma ray spectroscopy measurements of major and trace elements in the borehole. Dry weight percent oxide concentration logs obtained for this study were SiO 2 , K 2 O, TiO 2 , H 2 O, CO 2 , Na 2 O, Fe 2 O 3 , FeO, CaO, MnO, MgO, P 2 O 5 and Al 2 O 3 . Cross plot and Principal Component Analysis methods were applied for lithology characterization and mineralogy description respectively. Cross plot analysis allows lithological variations to be characterized. Principal Component Analysis shows that the oxide logs can be summarized by two components related to the feldspar and hydrous minerals. This study has shown that geochemical logging tool data is accurate and adequate to be tremendously useful in UHP metamorphic rocks analysis. Copyright © 2017 Elsevier Ltd. All rights reserved.
Metsalu, Tauno; Vilo, Jaak
2015-01-01
The Principal Component Analysis (PCA) is a widely used method of reducing the dimensionality of high-dimensional data, often followed by visualizing two of the components on the scatterplot. Although widely used, the method is lacking an easy-to-use web interface that scientists with little programming skills could use to make plots of their own data. The same applies to creating heatmaps: it is possible to add conditional formatting for Excel cells to show colored heatmaps, but for more advanced features such as clustering and experimental annotations, more sophisticated analysis tools have to be used. We present a web tool called ClustVis that aims to have an intuitive user interface. Users can upload data from a simple delimited text file that can be created in a spreadsheet program. It is possible to modify data processing methods and the final appearance of the PCA and heatmap plots by using drop-down menus, text boxes, sliders etc. Appropriate defaults are given to reduce the time needed by the user to specify input parameters. As an output, users can download PCA plot and heatmap in one of the preferred file formats. This web server is freely available at http://biit.cs.ut.ee/clustvis/. PMID:25969447
Principal Components Analysis of a JWST NIRSpec Detector Subsystem
NASA Technical Reports Server (NTRS)
Arendt, Richard G.; Fixsen, D. J.; Greenhouse, Matthew A.; Lander, Matthew; Lindler, Don; Loose, Markus; Moseley, S. H.; Mott, D. Brent; Rauscher, Bernard J.; Wen, Yiting;
2013-01-01
We present principal component analysis (PCA) of a flight-representative James Webb Space Telescope NearInfrared Spectrograph (NIRSpec) Detector Subsystem. Although our results are specific to NIRSpec and its T - 40 K SIDECAR ASICs and 5 m cutoff H2RG detector arrays, the underlying technical approach is more general. We describe how we measured the systems response to small environmental perturbations by modulating a set of bias voltages and temperature. We used this information to compute the systems principal noise components. Together with information from the astronomical scene, we show how the zeroth principal component can be used to calibrate out the effects of small thermal and electrical instabilities to produce cosmetically cleaner images with significantly less correlated noise. Alternatively, if one were designing a new instrument, one could use a similar PCA approach to inform a set of environmental requirements (temperature stability, electrical stability, etc.) that enabled the planned instrument to meet performance requirements
Ghosh, Debasree; Chattopadhyay, Parimal
2012-06-01
The objective of the work was to use the method of quantitative descriptive analysis (QDA) to describe the sensory attributes of the fermented food products prepared with the incorporation of lactic cultures. Panellists were selected and trained to evaluate various attributes specially color and appearance, body texture, flavor, overall acceptability and acidity of the fermented food products like cow milk curd and soymilk curd, idli, sauerkraut and probiotic ice cream. Principal component analysis (PCA) identified the six significant principal components that accounted for more than 90% of the variance in the sensory attribute data. Overall product quality was modelled as a function of principal components using multiple least squares regression (R (2) = 0.8). The result from PCA was statistically analyzed by analysis of variance (ANOVA). These findings demonstrate the utility of quantitative descriptive analysis for identifying and measuring the fermented food product attributes that are important for consumer acceptability.
Wavelet packets for multi- and hyper-spectral imagery
NASA Astrophysics Data System (ADS)
Benedetto, J. J.; Czaja, W.; Ehler, M.; Flake, C.; Hirn, M.
2010-01-01
State of the art dimension reduction and classification schemes in multi- and hyper-spectral imaging rely primarily on the information contained in the spectral component. To better capture the joint spatial and spectral data distribution we combine the Wavelet Packet Transform with the linear dimension reduction method of Principal Component Analysis. Each spectral band is decomposed by means of the Wavelet Packet Transform and we consider a joint entropy across all the spectral bands as a tool to exploit the spatial information. Dimension reduction is then applied to the Wavelet Packets coefficients. We present examples of this technique for hyper-spectral satellite imaging. We also investigate the role of various shrinkage techniques to model non-linearity in our approach.
Health behaviours, body weight and self-esteem among grade five students in Canada.
Wu, Xiuyun; Kirk, Sara F L; Ohinmaa, Arto; Veugelers, Paul
2016-01-01
This study sought to identify the principal components of self-esteem and the health behavioural determinants of these components among grade five students. We analysed data from a population-based survey among 4918 grade five students, who are primarily 10 and 11 years of age, and their parents in the Canadian province of Nova Scotia. The survey comprised the Harvard Youth and Adolescent Questionnaire, parental reporting of students' physical activity (PA) and time spent watching television or using computer/video games. Students heights and weights were objectively measured. We applied principal component analysis (PCA) to derive the components of self-esteem, and multilevel, multivariable logistic regression to quantify associations of diet quality, PA, sedentary behaviour and body weight with these components of self-esteem. PCA identified four components for self-esteem: self-perception, externalizing problems, internalizing problems, social-perception. Influences of health behaviours and body weight on self-esteem varied across the components. Better diet quality was associated with higher self-perception and fewer externalizing problems. Less PA and more use of computer/video games were related to lower self-perception and social-perception. Excessive TV watching was associated with more internalizing problems. Students classified as obese were more likely to report low self- and social-perception, and to experience fewer externalizing problems relative to students classified as normal weight. This study demonstrates independent influences of diet quality, physical activity, sedentary behaviour and body weight on four aspects of self-esteem among children. These findings suggest that school programs and health promotion strategies that target health behaviours may benefit self-esteem in childhood, and mental health and quality of life later in life.
Independent component analysis applied to long bunch beams in the Los Alamos Proton Storage Ring
NASA Astrophysics Data System (ADS)
Kolski, Jeffrey S.; Macek, Robert J.; McCrady, Rodney C.; Pang, Xiaoying
2012-11-01
Independent component analysis (ICA) is a powerful blind source separation (BSS) method. Compared to the typical BSS method, principal component analysis, ICA is more robust to noise, coupling, and nonlinearity. The conventional ICA application to turn-by-turn position data from multiple beam position monitors (BPMs) yields information about cross-BPM correlations. With this scheme, multi-BPM ICA has been used to measure the transverse betatron phase and amplitude functions, dispersion function, linear coupling, sextupole strength, and nonlinear beam dynamics. We apply ICA in a new way to slices along the bunch revealing correlations of particle motion within the beam bunch. We digitize beam signals of the long bunch at the Los Alamos Proton Storage Ring with a single device (BPM or fast current monitor) for an entire injection-extraction cycle. ICA of the digitized beam signals results in source signals, which we identify to describe varying betatron motion along the bunch, locations of transverse resonances along the bunch, measurement noise, characteristic frequencies of the digitizing oscilloscopes, and longitudinal beam structure.
Qin, Kunming; Wang, Bin; Li, Weidong; Cai, Hao; Chen, Danni; Liu, Xiao; Yin, Fangzhou; Cai, Baochang
2015-05-01
In traditional Chinese medicine, raw and processed herbs are used to treat different diseases. Suitable quality assessment methods are crucial for the discrimination between raw and processed herbs. The dried fruit of Arctium lappa L. and their processed products are widely used in traditional Chinese medicine, yet their therapeutic effects are different. In this study, a novel strategy using high-performance liquid chromatography and diode array detection coupled with multivariate statistical analysis to rapidly explore raw and processed Arctium lappa L. was proposed and validated. Four main components in a total of 30 batches of raw and processed Fructus Arctii samples were analyzed, and ten characteristic peaks were identified in the fingerprint common pattern. Furthermore, similarity evaluation, principal component analysis, and hierachical cluster analysis were applied to demonstrate the distinction. The results suggested that the relative amounts of the chemical components of raw and processed Fructus Arctii samples are different. This new method has been successfully applied to detect the raw and processed Fructus Arctii in marketed herbal medicinal products. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Structured Sparse Principal Components Analysis With the TV-Elastic Net Penalty.
de Pierrefeu, Amicie; Lofstedt, Tommy; Hadj-Selem, Fouad; Dubois, Mathieu; Jardri, Renaud; Fovet, Thomas; Ciuciu, Philippe; Frouin, Vincent; Duchesnay, Edouard
2018-02-01
Principal component analysis (PCA) is an exploratory tool widely used in data analysis to uncover the dominant patterns of variability within a population. Despite its ability to represent a data set in a low-dimensional space, PCA's interpretability remains limited. Indeed, the components produced by PCA are often noisy or exhibit no visually meaningful patterns. Furthermore, the fact that the components are usually non-sparse may also impede interpretation, unless arbitrary thresholding is applied. However, in neuroimaging, it is essential to uncover clinically interpretable phenotypic markers that would account for the main variability in the brain images of a population. Recently, some alternatives to the standard PCA approach, such as sparse PCA (SPCA), have been proposed, their aim being to limit the density of the components. Nonetheless, sparsity alone does not entirely solve the interpretability problem in neuroimaging, since it may yield scattered and unstable components. We hypothesized that the incorporation of prior information regarding the structure of the data may lead to improved relevance and interpretability of brain patterns. We therefore present a simple extension of the popular PCA framework that adds structured sparsity penalties on the loading vectors in order to identify the few stable regions in the brain images that capture most of the variability. Such structured sparsity can be obtained by combining, e.g., and total variation (TV) penalties, where the TV regularization encodes information on the underlying structure of the data. This paper presents the structured SPCA (denoted SPCA-TV) optimization framework and its resolution. We demonstrate SPCA-TV's effectiveness and versatility on three different data sets. It can be applied to any kind of structured data, such as, e.g., -dimensional array images or meshes of cortical surfaces. The gains of SPCA-TV over unstructured approaches (such as SPCA and ElasticNet PCA) or structured approach (such as GraphNet PCA) are significant, since SPCA-TV reveals the variability within a data set in the form of intelligible brain patterns that are easier to interpret and more stable across different samples.
Climatic change projections for winter streamflow in Guadalquivir river
NASA Astrophysics Data System (ADS)
Jesús Esteban Parra, María; Hidalgo Muñoz, José Manuel; García-Valdecasas-Ojeda, Matilde; Raquel Gámiz Fortis, Sonia; Castro Díez, Yolanda
2015-04-01
In this work we have obtained climate change projections for winter streamflow of the Guadalquivir River in the period 2071-2100 using the Principal Component Regression (PCR) method. The streamflow data base used has been provided by the Center for Studies and Experimentation of Public Works, CEDEX. Series from gauging stations and reservoirs with less than 10% of missing data (filled by regression with well correlated neighboring stations) have been considered. The homogeneity of these series has been evaluated through the Pettit test and degree of human alteration by the Common Area Index. The application of these criteria led to the selection of 13 streamflow time series homogeneously distributed over the basin, covering the period 1952-2011. For this streamflow data, winter seasonal values were obtained by averaging the monthly values from January to March. The PCR method has been applied using the Principal Components of the mean anomalies of sea level pressure (SLP) in winter (December to February averaged) as predictors of streamflow for the development of a downscaled statistical model. The SLP database is the NCEP reanalysis covering the North Atlantic region, and the calibration and validation periods used for fitting and evaluating the ability of the model are 1952-1992 and 1993-2011, respectively. In general, using four Principal Components, regression models are able to explain up to 70% of the variance of the streamflow data. Finally, the statistical model obtained for the observational data was applied to the SLP data for the period 2071-2100, using the outputs of different GCMs of the CMIP5 under the RPC8.5 scenario. The results found for the end of the century show no significant changes or moderate decrease in the streamflow of this river for most GCMs in winter, but for some of them the decrease is very strong. Keywords: Statistical downscaling, streamflow, Guadalquivir River, climate change. ACKNOWLEDGEMENTS This work has been financed by the projects P11-RNM-7941 (Junta de Andalucía-Spain) and CGL2013-48539-R (MINECO-Spain, FEDER).
Wei, Fang; Hu, Na; Lv, Xin; Dong, Xu-Yan; Chen, Hong
2015-07-24
In this investigation, off-line comprehensive two-dimensional liquid chromatography-atmospheric pressure chemical ionization mass spectrometry using a single column has been applied for the identification and quantification of triacylglycerols in edible oils. A novel mixed-mode phenyl-hexyl chromatographic column was employed in this off-line two-dimensional separation system. The phenyl-hexyl column combined the features of traditional C18 and silver-ion columns, which could provide hydrophobic interactions with triacylglycerols under acetonitrile conditions and can offer π-π interactions with triacylglycerols under methanol conditions. When compared with traditional off-line comprehensive two-dimensional liquid chromatography employing two different chromatographic columns (C18 and silver-ion column) and using elution solvents comprised of two phases (reversed-phase/normal-phase) for triacylglycerols separation, the novel off-line comprehensive two-dimensional liquid chromatography using a single column can be achieved by simply altering the mobile phase between acetonitrile and methanol, which exhibited a much higher selectivity for the separation of triacylglycerols with great efficiency and rapid speed. In addition, an approach based on the use of response factor with atmospheric pressure chemical ionization mass spectrometry has been developed for triacylglycerols quantification. Due to the differences between saturated and unsaturated acyl chains, the use of response factors significantly improves the quantitation of triacylglycerols. This two-dimensional liquid chromatography-mass spectrometry system was successfully applied for the profiling of triacylglycerols in soybean oils, peanut oils and lord oils. A total of 68 triacylglycerols including 40 triacylglycerols in soybean oils, 50 triacylglycerols in peanut oils and 44 triacylglycerols in lord oils have been identified and quantified. The liquid chromatography-mass spectrometry data were analyzed using principal component analysis. The results of the principal component analysis enabled a clear identification of different plant oils. By using this two-dimensional liquid chromatography-mass spectrometry system coupled with principal component analysis, adulterated soybean oils with 5% added lord oil and peanut oils with 5% added soybean oil can be clearly identified. Copyright © 2015 Elsevier B.V. All rights reserved.
Application of digital image processing techniques to astronomical imagery, 1979
NASA Technical Reports Server (NTRS)
Lorre, J. J.
1979-01-01
Several areas of applications of image processing to astronomy were identified and discussed. These areas include: (1) deconvolution for atmospheric seeing compensation; a comparison between maximum entropy and conventional Wiener algorithms; (2) polarization in galaxies from photographic plates; (3) time changes in M87 and methods of displaying these changes; (4) comparing emission line images in planetary nebulae; and (5) log intensity, hue saturation intensity, and principal component color enhancements of M82. Examples are presented of these techniques applied to a variety of objects.
Contrast improvement of terahertz images of thin histopathologic sections
Formanek, Florian; Brun, Marc-Aurèle; Yasuda, Akio
2011-01-01
We present terahertz images of 10 μm thick histopathologic sections obtained in reflection geometry with a time-domain spectrometer, and demonstrate improved contrast for sections measured in paraffin with water. Automated segmentation is applied to the complex refractive index data to generate clustered terahertz images distinguishing cancer from healthy tissues. The degree of classification of pixels is then evaluated using registered visible microscope images. Principal component analysis and propagation simulations are employed to investigate the origin and the gain of image contrast. PMID:21326635
Contrast improvement of terahertz images of thin histopathologic sections.
Formanek, Florian; Brun, Marc-Aurèle; Yasuda, Akio
2010-12-03
We present terahertz images of 10 μm thick histopathologic sections obtained in reflection geometry with a time-domain spectrometer, and demonstrate improved contrast for sections measured in paraffin with water. Automated segmentation is applied to the complex refractive index data to generate clustered terahertz images distinguishing cancer from healthy tissues. The degree of classification of pixels is then evaluated using registered visible microscope images. Principal component analysis and propagation simulations are employed to investigate the origin and the gain of image contrast.
Quantifying and Reducing Uncertainty in Correlated Multi-Area Short-Term Load Forecasting
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sun, Yannan; Hou, Zhangshuan; Meng, Da
2016-07-17
In this study, we represent and reduce the uncertainties in short-term electric load forecasting by integrating time series analysis tools including ARIMA modeling, sequential Gaussian simulation, and principal component analysis. The approaches are mainly focusing on maintaining the inter-dependency between multiple geographically related areas. These approaches are applied onto cross-correlated load time series as well as their forecast errors. Multiple short-term prediction realizations are then generated from the reduced uncertainty ranges, which are useful for power system risk analyses.
Pepper seed variety identification based on visible/near-infrared spectral technology
NASA Astrophysics Data System (ADS)
Li, Cuiling; Wang, Xiu; Meng, Zhijun; Fan, Pengfei; Cai, Jichen
2016-11-01
Pepper is a kind of important fruit vegetable, with the expansion of pepper hybrid planting area, detection of pepper seed purity is especially important. This research used visible/near infrared (VIS/NIR) spectral technology to detect the variety of single pepper seed, and chose hybrid pepper seeds "Zhuo Jiao NO.3", "Zhuo Jiao NO.4" and "Zhuo Jiao NO.5" as research sample. VIS/NIR spectral data of 80 "Zhuo Jiao NO.3", 80 "Zhuo Jiao NO.4" and 80 "Zhuo Jiao NO.5" pepper seeds were collected, and the original spectral data was pretreated with standard normal variable (SNV) transform, first derivative (FD), and Savitzky-Golay (SG) convolution smoothing methods. Principal component analysis (PCA) method was adopted to reduce the dimension of the spectral data and extract principal components, according to the distribution of the first principal component (PC1) along with the second principal component(PC2) in the twodimensional plane, similarly, the distribution of PC1 coupled with the third principal component(PC3), and the distribution of PC2 combined with PC3, distribution areas of three varieties of pepper seeds were divided in each twodimensional plane, and the discriminant accuracy of PCA was tested through observing the distribution area of samples' principal components in validation set. This study combined PCA and linear discriminant analysis (LDA) to identify single pepper seed varieties, results showed that with the FD preprocessing method, the discriminant accuracy of pepper seed varieties was 98% for validation set, it concludes that using VIS/NIR spectral technology is feasible for identification of single pepper seed varieties.
Long, J.M.; Fisher, W.L.
2006-01-01
We present a method for spatial interpretation of environmental variation in a reservoir that integrates principal components analysis (PCA) of environmental data with geographic information systems (GIS). To illustrate our method, we used data from a Great Plains reservoir (Skiatook Lake, Oklahoma) with longitudinal variation in physicochemical conditions. We measured 18 physicochemical features, mapped them using GIS, and then calculated and interpreted four principal components. Principal component 1 (PC1) was readily interpreted as longitudinal variation in water chemistry, but the other principal components (PC2-4) were difficult to interpret. Site scores for PC1-4 were calculated in GIS by summing weighted overlays of the 18 measured environmental variables, with the factor loadings from the PCA as the weights. PC1-4 were then ordered into a landscape hierarchy, an emergent property of this technique, which enabled their interpretation. PC1 was interpreted as a reservoir scale change in water chemistry, PC2 was a microhabitat variable of rip-rap substrate, PC3 identified coves/embayments and PC4 consisted of shoreline microhabitats related to slope. The use of GIS improved our ability to interpret the more obscure principal components (PC2-4), which made the spatial variability of the reservoir environment more apparent. This method is applicable to a variety of aquatic systems, can be accomplished using commercially available software programs, and allows for improved interpretation of the geographic environmental variability of a system compared to using typical PCA plots. ?? Copyright by the North American Lake Management Society 2006.
NASA Astrophysics Data System (ADS)
Heikkilä, U.; Shi, X.; Phipps, S. J.; Smith, A. M.
2013-10-01
This study investigates the effect of deglacial climate on the deposition of the solar proxy 10Be globally, and at two specific locations, the GRIP site at Summit, Central Greenland, and the Law Dome site in coastal Antarctica. The deglacial climate is represented by three 30 yr time slice simulations of 10 000 BP (years before present = 1950 CE), 11 000 BP and 12 000 BP, compared with a preindustrial control simulation. The model used is the ECHAM5-HAM atmospheric aerosol-climate model, driven with sea surface temperatures and sea ice cover simulated using the CSIRO Mk3L coupled climate system model. The focus is on isolating the 10Be production signal, driven by solar variability, from the weather or climate driven noise in the 10Be deposition flux during different stages of climate. The production signal varies on lower frequencies, dominated by the 11yr solar cycle within the 30 yr time scale of these experiments. The climatic noise is of higher frequencies. We first apply empirical orthogonal functions (EOF) analysis to global 10Be deposition on the annual scale and find that the first principal component, consisting of the spatial pattern of mean 10Be deposition and the temporally varying solar signal, explains 64% of the variability. The following principal components are closely related to those of precipitation. Then, we apply ensemble empirical decomposition (EEMD) analysis on the time series of 10Be deposition at GRIP and at Law Dome, which is an effective method for adaptively decomposing the time series into different frequency components. The low frequency components and the long term trend represent production and have reduced noise compared to the entire frequency spectrum of the deposition. The high frequency components represent climate driven noise related to the seasonal cycle of e.g. precipitation and are closely connected to high frequencies of precipitation. These results firstly show that the 10Be atmospheric production signal is preserved in the deposition flux to surface even during climates very different from today's both in global data and at two specific locations. Secondly, noise can be effectively reduced from 10Be deposition data by simply applying the EOF analysis in the case of a reasonably large number of available data sets, or by decomposing the individual data sets to filter out high-frequency fluctuations.
2017-01-01
Introduction This research paper aims to assess factors reported by parents associated with the successful transition of children with complex additional support requirements that have undergone a transition between school environments from 8 European Union member states. Methods Quantitative data were collected from 306 parents within education systems from 8 EU member states (Bulgaria, Cyprus, Greece, Ireland, the Netherlands, Romania, Spain and the UK). The data were derived from an online questionnaire and consisted of 41 questions. Information was collected on: parental involvement in their child’s transition, child involvement in transition, child autonomy, school ethos, professionals’ involvement in transition and integrated working, such as, joint assessment, cooperation and coordination between agencies. Survey questions that were designed on a Likert-scale were included in the Principal Components Analysis (PCA), additional survey questions, along with the results from the PCA, were used to build a logistic regression model. Results Four principal components were identified accounting for 48.86% of the variability in the data. Principal component 1 (PC1), ‘child inclusive ethos,’ contains 16.17% of the variation. Principal component 2 (PC2), which represents child autonomy and involvement, is responsible for 8.52% of the total variation. Principal component 3 (PC3) contains questions relating to parental involvement and contributed to 12.26% of the overall variation. Principal component 4 (PC4), which involves transition planning and coordination, contributed to 11.91% of the overall variation. Finally, the principal components were included in a logistic regression to evaluate the relationship between inclusion and a successful transition, as well as whether other factors that may have influenced transition. All four principal components were significantly associated with a successful transition, with PC1 being having the most effect (OR: 4.04, CI: 2.43–7.18, p<0.0001). Discussion To support a child with complex additional support requirements through transition from special school to mainstream, governments and professionals need to ensure children with additional support requirements and their parents are at the centre of all decisions that affect them. It is important that professionals recognise the educational, psychological, social and cultural contexts of a child with additional support requirements and their families which will provide a holistic approach and remove barriers for learning. PMID:28636649
Ravenscroft, John; Wazny, Kerri; Davis, John M
2017-01-01
This research paper aims to assess factors reported by parents associated with the successful transition of children with complex additional support requirements that have undergone a transition between school environments from 8 European Union member states. Quantitative data were collected from 306 parents within education systems from 8 EU member states (Bulgaria, Cyprus, Greece, Ireland, the Netherlands, Romania, Spain and the UK). The data were derived from an online questionnaire and consisted of 41 questions. Information was collected on: parental involvement in their child's transition, child involvement in transition, child autonomy, school ethos, professionals' involvement in transition and integrated working, such as, joint assessment, cooperation and coordination between agencies. Survey questions that were designed on a Likert-scale were included in the Principal Components Analysis (PCA), additional survey questions, along with the results from the PCA, were used to build a logistic regression model. Four principal components were identified accounting for 48.86% of the variability in the data. Principal component 1 (PC1), 'child inclusive ethos,' contains 16.17% of the variation. Principal component 2 (PC2), which represents child autonomy and involvement, is responsible for 8.52% of the total variation. Principal component 3 (PC3) contains questions relating to parental involvement and contributed to 12.26% of the overall variation. Principal component 4 (PC4), which involves transition planning and coordination, contributed to 11.91% of the overall variation. Finally, the principal components were included in a logistic regression to evaluate the relationship between inclusion and a successful transition, as well as whether other factors that may have influenced transition. All four principal components were significantly associated with a successful transition, with PC1 being having the most effect (OR: 4.04, CI: 2.43-7.18, p<0.0001). To support a child with complex additional support requirements through transition from special school to mainstream, governments and professionals need to ensure children with additional support requirements and their parents are at the centre of all decisions that affect them. It is important that professionals recognise the educational, psychological, social and cultural contexts of a child with additional support requirements and their families which will provide a holistic approach and remove barriers for learning.
Principal components of wrist circumduction from electromagnetic surgical tracking.
Rasquinha, Brian J; Rainbow, Michael J; Zec, Michelle L; Pichora, David R; Ellis, Randy E
2017-02-01
An electromagnetic (EM) surgical tracking system was used for a functionally calibrated kinematic analysis of wrist motion. Circumduction motions were tested for differences in subject gender and for differences in the sense of the circumduction as clockwise or counter-clockwise motion. Twenty subjects were instrumented for EM tracking. Flexion-extension motion was used to identify the functional axis. Subjects performed unconstrained wrist circumduction in a clockwise and counter-clockwise sense. Data were decomposed into orthogonal flexion-extension motions and radial-ulnar deviation motions. PCA was used to concisely represent motions. Nonparametric Wilcoxon tests were used to distinguish the groups. Flexion-extension motions were projected onto a direction axis with a root-mean-square error of [Formula: see text]. Using the first three principal components, there was no statistically significant difference in gender (all [Formula: see text]). For motion sense, radial-ulnar deviation distinguished the sense of circumduction in the first principal component ([Formula: see text]) and in the third principal component ([Formula: see text]); flexion-extension distinguished the sense in the second principal component ([Formula: see text]). The clockwise sense of circumduction could be distinguished by a multifactorial combination of components; there were no gender differences in this small population. These data constitute a baseline for normal wrist circumduction. The multifactorial PCA findings suggest that a higher-dimensional method, such as manifold analysis, may be a more concise way of representing circumduction in human joints.
Introduction to uses and interpretation of principal component analyses in forest biology.
J. G. Isebrands; Thomas R. Crow
1975-01-01
The application of principal component analysis for interpretation of multivariate data sets is reviewed with emphasis on (1) reduction of the number of variables, (2) ordination of variables, and (3) applications in conjunction with multiple regression.
Principal component analysis of phenolic acid spectra
USDA-ARS?s Scientific Manuscript database
Phenolic acids are common plant metabolites that exhibit bioactive properties and have applications in functional food and animal feed formulations. The ultraviolet (UV) and infrared (IR) spectra of four closely related phenolic acid structures were evaluated by principal component analysis (PCA) to...
Optimal pattern synthesis for speech recognition based on principal component analysis
NASA Astrophysics Data System (ADS)
Korsun, O. N.; Poliyev, A. V.
2018-02-01
The algorithm for building an optimal pattern for the purpose of automatic speech recognition, which increases the probability of correct recognition, is developed and presented in this work. The optimal pattern forming is based on the decomposition of an initial pattern to principal components, which enables to reduce the dimension of multi-parameter optimization problem. At the next step the training samples are introduced and the optimal estimates for principal components decomposition coefficients are obtained by a numeric parameter optimization algorithm. Finally, we consider the experiment results that show the improvement in speech recognition introduced by the proposed optimization algorithm.
NASA Astrophysics Data System (ADS)
Ueki, Kenta; Iwamori, Hikaru
2017-10-01
In this study, with a view of understanding the structure of high-dimensional geochemical data and discussing the chemical processes at work in the evolution of arc magmas, we employed principal component analysis (PCA) to evaluate the compositional variations of volcanic rocks from the Sengan volcanic cluster of the Northeastern Japan Arc. We analyzed the trace element compositions of various arc volcanic rocks, sampled from 17 different volcanoes in a volcanic cluster. The PCA results demonstrated that the first three principal components accounted for 86% of the geochemical variation in the magma of the Sengan region. Based on the relationships between the principal components and the major elements, the mass-balance relationships with respect to the contributions of minerals, the composition of plagioclase phenocrysts, geothermal gradient, and seismic velocity structure in the crust, the first, the second, and the third principal components appear to represent magma mixing, crystallizations of olivine/pyroxene, and crystallizations of plagioclase, respectively. These represented 59%, 20%, and 6%, respectively, of the variance in the entire compositional range, indicating that magma mixing accounted for the largest variance in the geochemical variation of the arc magma. Our result indicated that crustal processes dominate the geochemical variation of magma in the Sengan volcanic cluster.
[Identification of two varieties of Citri Fructus by fingerprint and chemometrics].
Su, Jing-hua; Zhang, Chao; Sun, Lei; Gu, Bing-ren; Ma, Shuang-cheng
2015-06-01
Citri Fructus identification by fingerprint and chemometrics was investigated in this paper. Twenty-three Citri Fructus samples were collected which referred to two varieties as Cirtus wilsonii and C. medica recorded in Chinese Pharmacopoeia. HPLC chromatograms were obtained. The components were partly identified by reference substances, and then common pattern was established for chemometrics analysis. Similarity analysis, principal component analysis (PCA) , partial least squares-discriminant analysis (PLS-DA) and hierarchical cluster analysis heatmap were applied. The results indicated that C. wilsonii and C. medica could be ideally classified with common pattern contained twenty-five characteristic peaks. Besides, preliminary pattern recognition had verified the chemometrics analytical results. Absolute peak area (APA) was used for relevant quantitative analysis, results showed the differences between two varieties and it was valuable for further quality control as selection of characteristic components.
Advanced methods in NDE using machine learning approaches
NASA Astrophysics Data System (ADS)
Wunderlich, Christian; Tschöpe, Constanze; Duckhorn, Frank
2018-04-01
Machine learning (ML) methods and algorithms have been applied recently with great success in quality control and predictive maintenance. Its goal to build new and/or leverage existing algorithms to learn from training data and give accurate predictions, or to find patterns, particularly with new and unseen similar data, fits perfectly to Non-Destructive Evaluation. The advantages of ML in NDE are obvious in such tasks as pattern recognition in acoustic signals or automated processing of images from X-ray, Ultrasonics or optical methods. Fraunhofer IKTS is using machine learning algorithms in acoustic signal analysis. The approach had been applied to such a variety of tasks in quality assessment. The principal approach is based on acoustic signal processing with a primary and secondary analysis step followed by a cognitive system to create model data. Already in the second analysis steps unsupervised learning algorithms as principal component analysis are used to simplify data structures. In the cognitive part of the software further unsupervised and supervised learning algorithms will be trained. Later the sensor signals from unknown samples can be recognized and classified automatically by the algorithms trained before. Recently the IKTS team was able to transfer the software for signal processing and pattern recognition to a small printed circuit board (PCB). Still, algorithms will be trained on an ordinary PC; however, trained algorithms run on the Digital Signal Processor and the FPGA chip. The identical approach will be used for pattern recognition in image analysis of OCT pictures. Some key requirements have to be fulfilled, however. A sufficiently large set of training data, a high signal-to-noise ratio, and an optimized and exact fixation of components are required. The automated testing can be done subsequently by the machine. By integrating the test data of many components along the value chain further optimization including lifetime and durability prediction based on big data becomes possible, even if components are used in different versions or configurations. This is the promise behind German Industry 4.0.
Integrated Approaches On Archaeo-Geophysical Data
NASA Astrophysics Data System (ADS)
Kucukdemirci, M.; Piro, S.; Zamuner, D.; Ozer, E.
2015-12-01
Key words: Ground Penetrating Radar (GPR), Magnetometry, Geophysical Data Integration, Principal Component Analyse (PCA), Aizanoi Archaeological Site An application of geophysical integration methods which often appealed are divided into two classes as qualitative and quantitative approaches. This work focused on the application of quantitative integration approaches, which involve the mathematical and statistical integration techniques, on the archaeo-geophysical data obtained in Aizanoi Archaeological Site,Turkey. Two geophysical methods were applied as Ground Penetrating Radar (GPR) and Magnetometry for archaeological prospection on the selected archaeological site. After basic data processing of each geophysical method, the mathematical approaches of Sums and Products and the statistical approach of Principal Component Analysis (PCA) have been applied for the integration. These integration approches were first tested on synthetic digital images before application to field data. Then the same approaches were applied to 2D magnetic maps and 2D GPR time slices which were obtained on the same unit grids in the archaeological site. Initially, the geophysical data were examined individually by referencing with archeological maps and informations obtained from archaeologists and some important structures as possible walls, roads and relics were determined. The results of all integration approaches provided very important and different details about the anomalies related to archaeological features. By using all those applications, integrated images can provide complementary informations as well about the archaeological relics under the ground. Acknowledgements The authors would like to thanks to Scientific and Technological Research Council of Turkey (TUBITAK), Fellowship for Visiting Scientists Programme for their support, Istanbul University Scientific Research Project Fund, (Project.No:12302) and archaeologist team of Aizanoi Archaeological site for their support during the field work.
Investigation of inversion polymorphisms in the human genome using principal components analysis.
Ma, Jianzhong; Amos, Christopher I
2012-01-01
Despite the significant advances made over the last few years in mapping inversions with the advent of paired-end sequencing approaches, our understanding of the prevalence and spectrum of inversions in the human genome has lagged behind other types of structural variants, mainly due to the lack of a cost-efficient method applicable to large-scale samples. We propose a novel method based on principal components analysis (PCA) to characterize inversion polymorphisms using high-density SNP genotype data. Our method applies to non-recurrent inversions for which recombination between the inverted and non-inverted segments in inversion heterozygotes is suppressed due to the loss of unbalanced gametes. Inside such an inversion region, an effect similar to population substructure is thus created: two distinct "populations" of inversion homozygotes of different orientations and their 1:1 admixture, namely the inversion heterozygotes. This kind of substructure can be readily detected by performing PCA locally in the inversion regions. Using simulations, we demonstrated that the proposed method can be used to detect and genotype inversion polymorphisms using unphased genotype data. We applied our method to the phase III HapMap data and inferred the inversion genotypes of known inversion polymorphisms at 8p23.1 and 17q21.31. These inversion genotypes were validated by comparing with literature results and by checking Mendelian consistency using the family data whenever available. Based on the PCA-approach, we also performed a preliminary genome-wide scan for inversions using the HapMap data, which resulted in 2040 candidate inversions, 169 of which overlapped with previously reported inversions. Our method can be readily applied to the abundant SNP data, and is expected to play an important role in developing human genome maps of inversions and exploring associations between inversions and susceptibility of diseases.
ERIC Educational Resources Information Center
Kronenberger, William G.; Thompson, Robert J., Jr.; Morrow, Catherine
1997-01-01
A principal components analysis of the Family Environment Scale (FES) (R. Moos and B. Moos, 1994) was performed using 113 undergraduates. Research supported 3 broad components encompassing the 10 FES subscales. These results supported previous research and the generalization of the FES to college samples. (SLD)
Time series analysis of collective motions in proteins
NASA Astrophysics Data System (ADS)
Alakent, Burak; Doruker, Pemra; ćamurdan, Mehmet C.
2004-01-01
The dynamics of α-amylase inhibitor tendamistat around its native state is investigated using time series analysis of the principal components of the Cα atomic displacements obtained from molecular dynamics trajectories. Collective motion along a principal component is modeled as a homogeneous nonstationary process, which is the result of the damped oscillations in local minima superimposed on a random walk. The motion in local minima is described by a stationary autoregressive moving average model, consisting of the frequency, damping factor, moving average parameters and random shock terms. Frequencies for the first 50 principal components are found to be in the 3-25 cm-1 range, which are well correlated with the principal component indices and also with atomistic normal mode analysis results. Damping factors, though their correlation is less pronounced, decrease as principal component indices increase, indicating that low frequency motions are less affected by friction. The existence of a positive moving average parameter indicates that the stochastic force term is likely to disturb the mode in opposite directions for two successive sampling times, showing the modes tendency to stay close to minimum. All these four parameters affect the mean square fluctuations of a principal mode within a single minimum. The inter-minima transitions are described by a random walk model, which is driven by a random shock term considerably smaller than that for the intra-minimum motion. The principal modes are classified into three subspaces based on their dynamics: essential, semiconstrained, and constrained, at least in partial consistency with previous studies. The Gaussian-type distributions of the intermediate modes, called "semiconstrained" modes, are explained by asserting that this random walk behavior is not completely free but between energy barriers.
NASA Technical Reports Server (NTRS)
Holliday, Ezekiel S. (Inventor)
2014-01-01
Vibrations at harmonic frequencies are reduced by injecting harmonic balancing signals into the armature of a linear motor/alternator coupled to a Stirling machine. The vibrations are sensed to provide a signal representing the mechanical vibrations. A harmonic balancing signal is generated for selected harmonics of the operating frequency by processing the sensed vibration signal with adaptive filter algorithms of adaptive filters for each harmonic. Reference inputs for each harmonic are applied to the adaptive filter algorithms at the frequency of the selected harmonic. The harmonic balancing signals for all of the harmonics are summed with a principal control signal. The harmonic balancing signals modify the principal electrical drive voltage and drive the motor/alternator with a drive voltage component in opposition to the vibration at each harmonic.
NASA Astrophysics Data System (ADS)
Kitao, Akio; Hirata, Fumio; Gō, Nobuhiro
1991-12-01
The effects of solvent on the conformation and dynamics of protein is studied by computer simulation. The dynamics is studied by focusing mainly on collective motions of the protein molecule. Three types of simulation, normal mode analysis, molecular dynamics in vacuum, and molecular dynamics in water are applied to melittin, the major component of bee venom. To define collective motions principal, component analysis as well as normal mode analysis has been carried out. The principal components with large fluctuation amplitudes have a very good correspondence with the low-frequency normal modes. Trajectories of the molecular dynamics simulation are projected onto the principal axes. From the projected motions time correlation functions are calculated. The results indicate that the very-low-frequency modes, whose frequencies are less than ≈ 50 cm -1, are overdamping in water with relaxation times roushly twice as long as the period of the oscillatory motion. Effective Langevin mode analysis is carried out by using the friction coefficient matrix determined from the velocity correlation function calculated from the molecular dynamics trajectory in water. This analysis reproduces the results of the simulation in water reasonably well. The presence of the solvent water is found also to affect the shape of the potential energy surface in such a way that it produces many local minima with low-energy barriers in between, the envelope of which is given by the surface in vacuum. Inter-minimum transitions endow the conformational dynamics of proteins in water another diffusive character, which already exists in the intra-minimum collective motions.
NASA Astrophysics Data System (ADS)
Voss, B. M.; Pereira, M. P.; Rolfe, B. F.; Doolan, M. C.
2017-09-01
The growth in use of Advanced High Strength Steels in the automotive industry for light-weighting and safety has increased the rates of tool wear in sheet metal stamping. This is an issue that adds significant costs to production in terms of manual inspection and part refinishing. To reduce these costs, a tool condition monitoring system is required and a firm understanding of process signal variation must form the foundation for any such monitoring system. Punch force is a stamping process signal that is widely collected by industrial presses and has been linked closely to part quality and tool condition, making it an ideal candidate as a tool condition monitoring signal. In this preliminary investigation, the variation of punch force due to different lubrication conditions and progressive wear are examined. Linking specific punch force signature changes to developing lubrication and wear events is valuable for die wear and stamping condition monitoring. A series of semi-industrial channel forming trials were conducted under different lubrication regimes and progressive die wear. Punch force signatures were captured for each part and Principal Component Analysis (PCA) was applied to determine the key Principal Components of the signature data sets. These Principal Components were linked to the evolution of friction conditions over the course of the stroke for the different lubrication regimes and mechanism of galling wear. As a result, variation in punch force signatures were correlated to the current mechanism of wear dominant on the formed part; either abrasion or adhesion, and to changes in lubrication mechanism. The outcomes of this study provide important insights into punch force signature variation, that will provide a foundation for future work into the development of die wear and lubrication monitoring systems for sheet metal stamping.
Hook, Andrew L; Scurr, David J
2016-04-01
Surface analysis plays a key role in understanding the function of materials, particularly in biological environments. Time-of-flight secondary ion mass spectrometry (ToF-SIMS) provides highly surface sensitive chemical information that can readily be acquired over large areas and has, thus, become an important surface analysis tool. However, the information-rich nature of ToF-SIMS complicates the interpretation and comparison of spectra, particularly in cases where multicomponent samples are being assessed. In this study, a method is presented to assess the chemical variance across 16 poly(meth)acrylates. Materials are selected to contain C 6 pendant groups, and ten replicates of each are printed as a polymer microarray. SIMS spectra are acquired for each material with the most intense and unique ions assessed for each material to identify the predominant and distinctive fragmentation pathways within the materials studied. Differentiating acrylate/methacrylate pairs is readily achieved using secondary ions derived from both the polymer backbone and pendant groups. Principal component analysis (PCA) is performed on the SIMS spectra of the 16 polymers, whereby the resulting principal components are able to distinguish phenyl from benzyl groups, mono-functional from multi-functional monomers and acrylates from methacrylates. The principal components are applied to copolymer series to assess the predictive capabilities of the PCA. Beyond being able to predict the copolymer ratio, in some cases, the SIMS analysis is able to provide insight into the molecular sequence of a copolymer. The insight gained in this study will be beneficial for developing structure-function relationships based upon ToF-SIMS data of polymer libraries. © 2016 The Authors Surface and Interface Analysis Published by John Wiley & Sons Ltd.
Voukantsis, Dimitris; Karatzas, Kostas; Kukkonen, Jaakko; Räsänen, Teemu; Karppinen, Ari; Kolehmainen, Mikko
2011-03-01
In this paper we propose a methodology consisting of specific computational intelligence methods, i.e. principal component analysis and artificial neural networks, in order to inter-compare air quality and meteorological data, and to forecast the concentration levels for environmental parameters of interest (air pollutants). We demonstrate these methods to data monitored in the urban areas of Thessaloniki and Helsinki in Greece and Finland, respectively. For this purpose, we applied the principal component analysis method in order to inter-compare the patterns of air pollution in the two selected cities. Then, we proceeded with the development of air quality forecasting models for both studied areas. On this basis, we formulated and employed a novel hybrid scheme in the selection process of input variables for the forecasting models, involving a combination of linear regression and artificial neural networks (multi-layer perceptron) models. The latter ones were used for the forecasting of the daily mean concentrations of PM₁₀ and PM₂.₅ for the next day. Results demonstrated an index of agreement between measured and modelled daily averaged PM₁₀ concentrations, between 0.80 and 0.85, while the kappa index for the forecasting of the daily averaged PM₁₀ concentrations reached 60% for both cities. Compared with previous corresponding studies, these statistical parameters indicate an improved performance of air quality parameters forecasting. It was also found that the performance of the models for the forecasting of the daily mean concentrations of PM₁₀ was not substantially different for both cities, despite the major differences of the two urban environments under consideration. Copyright © 2011 Elsevier B.V. All rights reserved.
Gaudreault, Nathaly; Mezghani, Neila; Turcot, Katia; Hagemeister, Nicola; Boivin, Karine; de Guise, Jacques A
2011-03-01
Interpreting gait data is challenging due to intersubject variability observed in the gait pattern of both normal and pathological populations. The objective of this study was to investigate the impact of using principal component analysis for grouping knee osteoarthritis (OA) patients' gait data in more homogeneous groups when studying the effect of a physiotherapy treatment. Three-dimensional (3D) knee kinematic and kinetic data were recorded during the gait of 29 participants diagnosed with knee OA before and after they received 12 weeks of physiotherapy treatment. Principal component analysis was applied to extract groups of knee flexion/extension, adduction/abduction and internal/external rotation angle and moment data. The treatment's effect on parameters of interest was assessed using paired t-tests performed before and after grouping the knee kinematic data. Increased quadriceps and hamstring strength was observed following treatment (P<0.05). Except for the knee flexion/extension angle, two different groups (G(1) and G(2)) were extracted from the angle and moment data. When pre- and post-treatment analyses were performed considering the groups, participants exhibiting a G(2) knee moment pattern demonstrated a greater first peak flexion moment, lower adduction moment impulse and smaller rotation angle range post-treatment (P<0.05). When pre- and post-treatment comparisons were performed without grouping, the data showed no treatment effect. The results of the present study suggest that the effect of physiotherapy on gait mechanics of knee osteoarthritis patients may be masked or underestimated if kinematic data are not separated into more homogeneous groups when performing pre- and post-treatment comparisons. Copyright © 2010 Elsevier Ltd. All rights reserved.
Qualitative data analysis for an exploratory sensory study of Grechetto wine.
Esti, Marco; González Airola, Ricardo L; Moneta, Elisabetta; Paperaio, Marina; Sinesio, Fiorella
2010-02-15
Grechetto is a traditional white-grape vine, widespread in Umbria and Lazio regions in central Italy. Despite the wine commercial diffusion, little literature on its sensory characteristics is available. The present study is an exploratory research conducted with the aim of identifying the sensory markers of Grechetto wine and of evaluating the effect of clone, geographical area, vintage and producer on sensory attributes. A qualitative sensory study was conducted on 16 wines, differing for vintage, Typical Geographic Indication, and clone, collected from 7 wineries, using a trained panel in isolation who referred to a glossary of 133 white wine descriptors. Sixty-five attributes identified by a minimum of 50% of the respondents were submitted to a correspondence analysis to link wine samples to the sensory attributes. Seventeen terms identified as common to all samples are considered as characteristics of Grechetto wine, 10 of which olfactory: fruity, apple, acacia flower, pineapple, banana, floral, herbaceous, honey, apricot and peach. In order to interpret the relationship between design variables and sensory attributes data on 2005 and 2006 wines, the 28 most discriminating descriptors were projected in a principal component analysis. The first principal component was best described by olfactory terms and the second by gustative attributes. Good reproducibility of results was obtained for the two vintages. For one winery, vintage effect (2002-2006) was described in a new principal component analysis model applied on 39 most discriminating descriptors, which globally explained about 84% of the variance. In the young wines the notes of sulphur, yeast, dried fruit, butter, combined with herbaceous fresh and tropical fruity notes (melon, grapefruit) were dominant. During wine aging, sweeter notes, like honey, caramel, jam, become more dominant as well as some mineral notes, such as tuff and flint. Copyright 2009 Elsevier B.V. All rights reserved.
Takegami, Shigehiko; Ueyama, Keita; Konishi, Atsuko; Kitade, Tatsuya
2018-06-06
The lipid fluidity of various lipid nanoemulsions (LNEs) without and with flutamide (FT) and containing one of two neutral lipids, one of four phosphatidylcholines as a surfactant, and sodium palmitate as a cosurfactant was investigated by the combination of 1 H nuclear magnetic resonance (NMR) spectroscopy and principal component analysis (PCA). In the 1 H NMR spectra, the peaks from the methylene groups of the neutral lipids and surfactants for all LNE preparations showed downfield shifts with increasing temperature from 20 to 60 °C. PCA was applied to the 1 H NMR spectral data obtained for the LNEs. The PCA resulted in a model in which the first two principal components (PCs) extracted 88% of the total spectral variation; the first PC (PC-1) axis and second PC (PC-2) axis accounted for 73 and 15%, respectively, of the total spectral variation. The Score-1 values for PC-1 plotted against temperature revealed the existence of two clusters, which were defined by the neutral lipid of the LNE preparations. Meanwhile, the Score-2 values decreased with rising temperature and reflected the increase in lipid fluidity of each LNE preparation, consistent with fluorescence anisotropy measurements. In addition, the changes of Score-2 values with temperature for LNE preparations with FT were smaller than those for LNE preparations without FT. This indicates that FT encapsulated in LNE particles markedly suppressed the increase in lipid fluidity of LNE particles with rising temperature. Thus, PCA of 1 H NMR spectra will become a powerful tool to analyze the lipid fluidity of lipid nanoparticles. Graphical abstract ᅟ.
NASA Astrophysics Data System (ADS)
Unglert, K.; Radić, V.; Jellinek, A. M.
2016-06-01
Variations in the spectral content of volcano seismicity related to changes in volcanic activity are commonly identified manually in spectrograms. However, long time series of monitoring data at volcano observatories require tools to facilitate automated and rapid processing. Techniques such as self-organizing maps (SOM) and principal component analysis (PCA) can help to quickly and automatically identify important patterns related to impending eruptions. For the first time, we evaluate the performance of SOM and PCA on synthetic volcano seismic spectra constructed from observations during two well-studied eruptions at Klauea Volcano, Hawai'i, that include features observed in many volcanic settings. In particular, our objective is to test which of the techniques can best retrieve a set of three spectral patterns that we used to compose a synthetic spectrogram. We find that, without a priori knowledge of the given set of patterns, neither SOM nor PCA can directly recover the spectra. We thus test hierarchical clustering, a commonly used method, to investigate whether clustering in the space of the principal components and on the SOM, respectively, can retrieve the known patterns. Our clustering method applied to the SOM fails to detect the correct number and shape of the known input spectra. In contrast, clustering of the data reconstructed by the first three PCA modes reproduces these patterns and their occurrence in time more consistently. This result suggests that PCA in combination with hierarchical clustering is a powerful practical tool for automated identification of characteristic patterns in volcano seismic spectra. Our results indicate that, in contrast to PCA, common clustering algorithms may not be ideal to group patterns on the SOM and that it is crucial to evaluate the performance of these tools on a control dataset prior to their application to real data.
Xue, Gang; Song, Wen-qi; Li, Shu-chao
2015-01-01
In order to achieve the rapid identification of fire resistive coating for steel structure of different brands in circulating, a new method for the fast discrimination of varieties of fire resistive coating for steel structure by means of near infrared spectroscopy was proposed. The raster scanning near infrared spectroscopy instrument and near infrared diffuse reflectance spectroscopy were applied to collect the spectral curve of different brands of fire resistive coating for steel structure and the spectral data were preprocessed with standard normal variate transformation(standard normal variate transformation, SNV) and Norris second derivative. The principal component analysis (principal component analysis, PCA)was used to near infrared spectra for cluster analysis. The analysis results showed that the cumulate reliabilities of PC1 to PC5 were 99. 791%. The 3-dimentional plot was drawn with the scores of PC1, PC2 and PC3 X 10, which appeared to provide the best clustering of the varieties of fire resistive coating for steel structure. A total of 150 fire resistive coating samples were divided into calibration set and validation set randomly, the calibration set had 125 samples with 25 samples of each variety, and the validation set had 25 samples with 5 samples of each variety. According to the principal component scores of unknown samples, Mahalanobis distance values between each variety and unknown samples were calculated to realize the discrimination of different varieties. The qualitative analysis model for external verification of unknown samples is a 10% recognition ration. The results demonstrated that this identification method can be used as a rapid, accurate method to identify the classification of fire resistive coating for steel structure and provide technical reference for market regulation.
Use of a Principal Components Analysis for the Generation of Daily Time Series.
NASA Astrophysics Data System (ADS)
Dreveton, Christine; Guillou, Yann
2004-07-01
A new approach for generating daily time series is considered in response to the weather-derivatives market. This approach consists of performing a principal components analysis to create independent variables, the values of which are then generated separately with a random process. Weather derivatives are financial or insurance products that give companies the opportunity to cover themselves against adverse climate conditions. The aim of a generator is to provide a wider range of feasible situations to be used in an assessment of risk. Generation of a temperature time series is required by insurers or bankers for pricing weather options. The provision of conditional probabilities and a good representation of the interannual variance are the main challenges of a generator when used for weather derivatives. The generator was developed according to this new approach using a principal components analysis and was applied to the daily average temperature time series of the Paris-Montsouris station in France. The observed dataset was homogenized and the trend was removed to represent correctly the present climate. The results obtained with the generator show that it represents correctly the interannual variance of the observed climate; this is the main result of the work, because one of the main discrepancies of other generators is their inability to represent accurately the observed interannual climate variance—this discrepancy is not acceptable for an application to weather derivatives. The generator was also tested to calculate conditional probabilities: for example, the knowledge of the aggregated value of heating degree-days in the middle of the heating season allows one to estimate the probability if reaching a threshold at the end of the heating season. This represents the main application of a climate generator for use with weather derivatives.
NASA Astrophysics Data System (ADS)
Li, Jun; Song, Minghui; Peng, Yuanxi
2018-03-01
Current infrared and visible image fusion methods do not achieve adequate information extraction, i.e., they cannot extract the target information from infrared images while retaining the background information from visible images. Moreover, most of them have high complexity and are time-consuming. This paper proposes an efficient image fusion framework for infrared and visible images on the basis of robust principal component analysis (RPCA) and compressed sensing (CS). The novel framework consists of three phases. First, RPCA decomposition is applied to the infrared and visible images to obtain their sparse and low-rank components, which represent the salient features and background information of the images, respectively. Second, the sparse and low-rank coefficients are fused by different strategies. On the one hand, the measurements of the sparse coefficients are obtained by the random Gaussian matrix, and they are then fused by the standard deviation (SD) based fusion rule. Next, the fused sparse component is obtained by reconstructing the result of the fused measurement using the fast continuous linearized augmented Lagrangian algorithm (FCLALM). On the other hand, the low-rank coefficients are fused using the max-absolute rule. Subsequently, the fused image is superposed by the fused sparse and low-rank components. For comparison, several popular fusion algorithms are tested experimentally. By comparing the fused results subjectively and objectively, we find that the proposed framework can extract the infrared targets while retaining the background information in the visible images. Thus, it exhibits state-of-the-art performance in terms of both fusion effects and timeliness.
Different odor tests contribute differently to the evaluation of olfactory loss.
Lötsch, Jörn; Reichmann, Heinz; Hummel, Thomas
2008-01-01
In a clinical context, the importance of the sense of smell has increasingly been recognized, for example, in terms of the evaluation of neurodegenerative disorders. In this study, 2 strategies of olfactory testing, a simple one and a more complex one, were compared with respect to their suitability to assess olfactory dysfunction. Odor threshold (T), discrimination (D), and identification (I) were assessed in a control sample of 916 males and 1160 females, aged 6-90 years, and in 81 men and 21 women, aged 38-80 years, suffering from idiopathic Parkinson's disease (IPD). Sums of the 3 subtest results T, D, and I yielded threshold discrimination identification (TDI) scores reflecting olfactory function. Sensitivity of any of the 3 subtests to confirm the diagnosis established by the composite TDI score was assessed separately for each test. Principal component analyses were applied to determine any common source of variance among the 3 specific subtests. Sensitivities of the subtests to provide the diagnosis established by the composite TDI score were 64% (T), 56% (D), and 47% (I), respectively. In IPD patients, each of the subtests provided the correct diagnosis (sensitivity >90%), as olfaction was impaired in 99% of the patient group. Two principal components emerged in both controls and IPD patients, with eigenvalues >0.5. The first component received high loadings from all factors. The second component received high loadings from odor threshold, whereas loadings from odor discrimination and identification were much smaller. In conclusion, combined testing of several components of olfaction, especially including assessment of thresholds, provides the most significant approach to the diagnosis of smell loss.
Burst and Principal Components Analyses of MEA Data Separates Chemicals by Class
Microelectrode arrays (MEAs) detect drug and chemical induced changes in action potential "spikes" in neuronal networks and can be used to screen chemicals for neurotoxicity. Analytical "fingerprinting," using Principal Components Analysis (PCA) on spike trains recorded from prim...
EVALUATION OF ACID DEPOSITION MODELS USING PRINCIPAL COMPONENT SPACES
An analytical technique involving principal components analysis is proposed for use in the evaluation of acid deposition models. elationships among model predictions are compared to those among measured data, rather than the more common one-to-one comparison of predictions to mea...
Complexity of free energy landscapes of peptides revealed by nonlinear principal component analysis.
Nguyen, Phuong H
2006-12-01
Employing the recently developed hierarchical nonlinear principal component analysis (NLPCA) method of Saegusa et al. (Neurocomputing 2004;61:57-70 and IEICE Trans Inf Syst 2005;E88-D:2242-2248), the complexities of the free energy landscapes of several peptides, including triglycine, hexaalanine, and the C-terminal beta-hairpin of protein G, were studied. First, the performance of this NLPCA method was compared with the standard linear principal component analysis (PCA). In particular, we compared two methods according to (1) the ability of the dimensionality reduction and (2) the efficient representation of peptide conformations in low-dimensional spaces spanned by the first few principal components. The study revealed that NLPCA reduces the dimensionality of the considered systems much better, than did PCA. For example, in order to get the similar error, which is due to representation of the original data of beta-hairpin in low dimensional space, one needs 4 and 21 principal components of NLPCA and PCA, respectively. Second, by representing the free energy landscapes of the considered systems as a function of the first two principal components obtained from PCA, we obtained the relatively well-structured free energy landscapes. In contrast, the free energy landscapes of NLPCA are much more complicated, exhibiting many states which are hidden in the PCA maps, especially in the unfolded regions. Furthermore, the study also showed that many states in the PCA maps are mixed up by several peptide conformations, while those of the NLPCA maps are more pure. This finding suggests that the NLPCA should be used to capture the essential features of the systems. (c) 2006 Wiley-Liss, Inc.
NASA Astrophysics Data System (ADS)
Li, Jiangtong; Luo, Yongdao; Dai, Honglin
2018-01-01
Water is the source of life and the essential foundation of all life. With the development of industrialization, the phenomenon of water pollution is becoming more and more frequent, which directly affects the survival and development of human. Water quality detection is one of the necessary measures to protect water resources. Ultraviolet (UV) spectral analysis is an important research method in the field of water quality detection, which partial least squares regression (PLSR) analysis method is becoming predominant technology, however, in some special cases, PLSR's analysis produce considerable errors. In order to solve this problem, the traditional principal component regression (PCR) analysis method was improved by using the principle of PLSR in this paper. The experimental results show that for some special experimental data set, improved PCR analysis method performance is better than PLSR. The PCR and PLSR is the focus of this paper. Firstly, the principal component analysis (PCA) is performed by MATLAB to reduce the dimensionality of the spectral data; on the basis of a large number of experiments, the optimized principal component is extracted by using the principle of PLSR, which carries most of the original data information. Secondly, the linear regression analysis of the principal component is carried out with statistic package for social science (SPSS), which the coefficients and relations of principal components can be obtained. Finally, calculating a same water spectral data set by PLSR and improved PCR, analyzing and comparing two results, improved PCR and PLSR is similar for most data, but improved PCR is better than PLSR for data near the detection limit. Both PLSR and improved PCR can be used in Ultraviolet spectral analysis of water, but for data near the detection limit, improved PCR's result better than PLSR.
Vargas-Bello-Pérez, Einar; Toro-Mujica, Paula; Enriquez-Hidalgo, Daniel; Fellenberg, María Angélica; Gómez-Cortés, Pilar
2017-06-01
We used a multivariate chemometric approach to differentiate or associate retail bovine milks with different fat contents and non-dairy beverages, using fatty acid profiles and statistical analysis. We collected samples of bovine milk (whole, semi-skim, and skim; n = 62) and non-dairy beverages (n = 27), and we analyzed them using gas-liquid chromatography. Principal component analysis of the fatty acid data yielded 3 significant principal components, which accounted for 72% of the total variance in the data set. Principal component 1 was related to saturated fatty acids (C4:0, C6:0, C8:0, C12:0, C14:0, C17:0, and C18:0) and monounsaturated fatty acids (C14:1 cis-9, C16:1 cis-9, C17:1 cis-9, and C18:1 trans-11); whole milk samples were clearly differentiated from the rest using this principal component. Principal component 2 differentiated semi-skim milk samples by n-3 fatty acid content (C20:3n-3, C20:5n-3, and C22:6n-3). Principal component 3 was related to C18:2 trans-9,trans-12 and C20:4n-6, and its lower scores were observed in skim milk and non-dairy beverages. A cluster analysis yielded 3 groups: group 1 consisted of only whole milk samples, group 2 was represented mainly by semi-skim milks, and group 3 included skim milk and non-dairy beverages. Overall, the present study showed that a multivariate chemometric approach is a useful tool for differentiating or associating retail bovine milks and non-dairy beverages using their fatty acid profile. Copyright © 2017 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Use of multivariate statistics to identify unreliable data obtained using CASA.
Martínez, Luis Becerril; Crispín, Rubén Huerta; Mendoza, Maximino Méndez; Gallegos, Oswaldo Hernández; Martínez, Andrés Aragón
2013-06-01
In order to identify unreliable data in a dataset of motility parameters obtained from a pilot study acquired by a veterinarian with experience in boar semen handling, but without experience in the operation of a computer assisted sperm analysis (CASA) system, a multivariate graphical and statistical analysis was performed. Sixteen boar semen samples were aliquoted then incubated with varying concentrations of progesterone from 0 to 3.33 µg/ml and analyzed in a CASA system. After standardization of the data, Chernoff faces were pictured for each measurement, and a principal component analysis (PCA) was used to reduce the dimensionality and pre-process the data before hierarchical clustering. The first twelve individual measurements showed abnormal features when Chernoff faces were drawn. PCA revealed that principal components 1 and 2 explained 63.08% of the variance in the dataset. Values of principal components for each individual measurement of semen samples were mapped to identify differences among treatment or among boars. Twelve individual measurements presented low values of principal component 1. Confidence ellipses on the map of principal components showed no statistically significant effects for treatment or boar. Hierarchical clustering realized on two first principal components produced three clusters. Cluster 1 contained evaluations of the two first samples in each treatment, each one of a different boar. With the exception of one individual measurement, all other measurements in cluster 1 were the same as observed in abnormal Chernoff faces. Unreliable data in cluster 1 are probably related to the operator inexperience with a CASA system. These findings could be used to objectively evaluate the skill level of an operator of a CASA system. This may be particularly useful in the quality control of semen analysis using CASA systems.
Liu, Xiang; Guo, Ling-Peng; Zhang, Fei-Yun; Ma, Jie; Mu, Shu-Yong; Zhao, Xin; Li, Lan-Hai
2015-02-01
Eight physical and chemical indicators related to water quality were monitored from nineteen sampling sites along the Kunes River at the end of snowmelt season in spring. To investigate the spatial distribution characteristics of water physical and chemical properties, cluster analysis (CA), discriminant analysis (DA) and principal component analysis (PCA) are employed. The result of cluster analysis showed that the Kunes River could be divided into three reaches according to the similarities of water physical and chemical properties among sampling sites, representing the upstream, midstream and downstream of the river, respectively; The result of discriminant analysis demonstrated that the reliability of such a classification was high, and DO, Cl- and BOD5 were the significant indexes leading to this classification; Three principal components were extracted on the basis of the principal component analysis, in which accumulative variance contribution could reach 86.90%. The result of principal component analysis also indicated that water physical and chemical properties were mostly affected by EC, ORP, NO3(-) -N, NH4(+) -N, Cl- and BOD5. The sorted results of principal component scores in each sampling sites showed that the water quality was mainly influenced by DO in upstream, by pH in midstream, and by the rest of indicators in downstream. The order of comprehensive scores for principal components revealed that the water quality degraded from the upstream to downstream, i.e., the upstream had the best water quality, followed by the midstream, while the water quality at downstream was the worst. This result corresponded exactly to the three reaches classified using cluster analysis. Anthropogenic activity and the accumulation of pollutants along the river were probably the main reasons leading to this spatial difference.
Putilov, Arcady A; Donskaya, Olga G
2016-01-01
Age-associated changes in different bandwidths of the human electroencephalographic (EEG) spectrum are well documented, but their functional significance is poorly understood. This spectrum seems to represent summation of simultaneous influences of several sleep-wake regulatory processes. Scoring of its orthogonal (uncorrelated) principal components can help in separation of the brain signatures of these processes. In particular, the opposite age-associated changes were documented for scores on the two largest (1st and 2nd) principal components of the sleep EEG spectrum. A decrease of the first score and an increase of the second score can reflect, respectively, the weakening of the sleep drive and disinhibition of the opposing wake drive with age. In order to support the suggestion of age-associated disinhibition of the wake drive from the antagonistic influence of the sleep drive, we analyzed principal component scores of the resting EEG spectra obtained in sleep deprivation experiments with 81 healthy young adults aged between 19 and 26 and 40 healthy older adults aged between 45 and 66 years. At the second day of the sleep deprivation experiments, frontal scores on the 1st principal component of the EEG spectrum demonstrated an age-associated reduction of response to eyes closed relaxation. Scores on the 2nd principal component were either initially increased during wakefulness or less responsive to such sleep-provoking conditions (frontal and occipital scores, respectively). These results are in line with the suggestion of disinhibition of the wake drive with age. They provide an explanation of why older adults are less vulnerable to sleep deprivation than young adults.
Taguchi, Y-H
2018-05-08
Even though coexistence of multiple phenotypes sharing the same genomic background is interesting, it remains incompletely understood. Epigenomic profiles may represent key factors, with unknown contributions to the development of multiple phenotypes, and social-insect castes are a good model for elucidation of the underlying mechanisms. Nonetheless, previous studies have failed to identify genes associated with aberrant gene expression and methylation profiles because of the lack of suitable methodology that can address this problem properly. A recently proposed principal component analysis (PCA)-based and tensor decomposition (TD)-based unsupervised feature extraction (FE) can solve this problem because these two approaches can deal with gene expression and methylation profiles even when a small number of samples is available. PCA-based and TD-based unsupervised FE methods were applied to the analysis of gene expression and methylation profiles in the brains of two social insects, Polistes canadensis and Dinoponera quadriceps. Genes associated with differential expression and methylation between castes were identified, and analysis of enrichment of Gene Ontology terms confirmed reliability of the obtained sets of genes from the biological standpoint. Biologically relevant genes, shown to be associated with significant differential gene expression and methylation between castes, were identified here for the first time. The identification of these genes may help understand the mechanisms underlying epigenetic control of development of multiple phenotypes under the same genomic conditions.
Li, Yong-Wei; Qi, Jin; Wen-Zhang; Zhou, Shui-Ping; Yan-Wu; Yu, Bo-Yang
2014-07-01
Liriope muscari (Decne.) L. H. Bailey is a well-known traditional Chinese medicine used for treating cough and insomnia. There are few reports on the quality evaluation of this herb partly because the major steroid saponins are not readily identified by UV detectors and are not easily isolated due to the existence of many similar isomers. In this study, a qualitative and quantitative method was developed to analyze the major components in L. muscari (Decne.) L. H. Bailey roots. Sixteen components were deduced and identified primarily by the information obtained from ultra high performance liquid chromatography with ion-trap time-of-flight mass spectrometry. The method demonstrated the desired specificity, linearity, stability, precision, and accuracy for simultaneous determination of 15 constituents (13 steroidal glycosides, 25(R)-ruscogenin, and pentylbenzoate) in 26 samples from different origins. The fingerprint was established, and the evaluation was achieved using similarity analysis and principal component analysis of 15 fingerprint peaks from 26 samples by ultra high performance liquid chromatography. The results from similarity analysis were consistent with those of principal component analysis. All results suggest that the established method could be applied effectively to the determination of multi-ingredients and fingerprint analysis of steroid saponins for quality assessment and control of L. muscari (Decne.) L. H. Bailey. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Sciutto, Giorgia; Oliveri, Paolo; Catelli, Emilio; Bonacini, Irene
2017-01-01
In the field of applied researches in heritage science, the use of multivariate approach is still quite limited and often chemometric results obtained are often underinterpreted. Within this scenario, the present paper is aimed at disseminating the use of suitable multivariate methodologies and proposes a procedural workflow applied on a representative group of case studies, of considerable importance for conservation purposes, as a sort of guideline on the processing and on the interpretation of this FTIR data. Initially, principal component analysis (PCA) is performed and the score values are converted into chemical maps. Successively, the brushing approach is applied, demonstrating its usefulness for a deep understanding of the relationships between the multivariate map and PC score space, as well as for the identification of the spectral bands mainly involved in the definition of each area localised within the score maps. PMID:29333162
Saunders, Kate; Bilderbeck, Amy; Palmius, Niclas; Goodwin, Guy; De Vos, Maarten
2017-01-01
Background We recently described a new questionnaire to monitor mood called mood zoom (MZ). MZ comprises 6 items assessing mood symptoms on a 7-point Likert scale; we had previously used standard principal component analysis (PCA) to tentatively understand its properties, but the presence of multiple nonzero loadings obstructed the interpretation of its latent variables. Objective The aim of this study was to rigorously investigate the internal properties and latent variables of MZ using an algorithmic approach which may lead to more interpretable results than PCA. Additionally, we explored three other widely used psychiatric questionnaires to investigate latent variable structure similarities with MZ: (1) Altman self-rating mania scale (ASRM), assessing mania; (2) quick inventory of depressive symptomatology (QIDS) self-report, assessing depression; and (3) generalized anxiety disorder (7-item) (GAD-7), assessing anxiety. Methods We elicited responses from 131 participants: 48 bipolar disorder (BD), 32 borderline personality disorder (BPD), and 51 healthy controls (HC), collected longitudinally (median [interquartile range, IQR]: 363 [276] days). Participants were requested to complete ASRM, QIDS, and GAD-7 weekly (all 3 questionnaires were completed on the Web) and MZ daily (using a custom-based smartphone app). We applied sparse PCA (SPCA) to determine the latent variables for the four questionnaires, where a small subset of the original items contributes toward each latent variable. Results We found that MZ had great consistency across the three cohorts studied. Three main principal components were derived using SPCA, which can be tentatively interpreted as (1) anxiety and sadness, (2) positive affect, and (3) irritability. The MZ principal component comprising anxiety and sadness explains most of the variance in BD and BPD, whereas the positive affect of MZ explains most of the variance in HC. The latent variables in ASRM were identical for the patient groups but different for HC; nevertheless, the latent variables shared common items across both the patient group and HC. On the contrary, QIDS had overall very different principal components across groups; sleep was a key element in HC and BD but was absent in BPD. In GAD-7, nervousness was the principal component explaining most of the variance in BD and HC. Conclusions This study has important implications for understanding self-reported mood. MZ has a consistent, intuitively interpretable latent variable structure and hence may be a good instrument for generic mood assessment. Irritability appears to be the key distinguishing latent variable between BD and BPD and might be useful for differential diagnosis. Anxiety and sadness are closely interlinked, a finding that might inform treatment effects to jointly address these covarying symptoms. Anxiety and nervousness appear to be amongst the cardinal latent variable symptoms in BD and merit close attention in clinical practice. PMID:28546141
A stochastic model of weather states and concurrent daily precipitation at multiple precipitation stations is described. our algorithms are invested for classification of daily weather states; k means, fuzzy clustering, principal components, and principal components coupled with ...
Rosacea assessment by erythema index and principal component analysis segmentation maps
NASA Astrophysics Data System (ADS)
Kuzmina, Ilona; Rubins, Uldis; Saknite, Inga; Spigulis, Janis
2017-12-01
RGB images of rosacea were analyzed using segmentation maps of principal component analysis (PCA) and erythema index (EI). Areas of segmented clusters were compared to Clinician's Erythema Assessment (CEA) values given by two dermatologists. The results show that visible blood vessels are segmented more precisely on maps of the erythema index and the third principal component (PC3). In many cases, a distribution of clusters on EI and PC3 maps are very similar. Mean values of clusters' areas on these maps show a decrease of the area of blood vessels and erythema and an increase of lighter skin area after the therapy for the patients with diagnosis CEA = 2 on the first visit and CEA=1 on the second visit. This study shows that EI and PC3 maps are more useful than the maps of the first (PC1) and second (PC2) principal components for indicating vascular structures and erythema on the skin of rosacea patients and therapy monitoring.
Multilevel sparse functional principal component analysis.
Di, Chongzhi; Crainiceanu, Ciprian M; Jank, Wolfgang S
2014-01-29
We consider analysis of sparsely sampled multilevel functional data, where the basic observational unit is a function and data have a natural hierarchy of basic units. An example is when functions are recorded at multiple visits for each subject. Multilevel functional principal component analysis (MFPCA; Di et al. 2009) was proposed for such data when functions are densely recorded. Here we consider the case when functions are sparsely sampled and may contain only a few observations per function. We exploit the multilevel structure of covariance operators and achieve data reduction by principal component decompositions at both between and within subject levels. We address inherent methodological differences in the sparse sampling context to: 1) estimate the covariance operators; 2) estimate the functional principal component scores; 3) predict the underlying curves. Through simulations the proposed method is able to discover dominating modes of variations and reconstruct underlying curves well even in sparse settings. Our approach is illustrated by two applications, the Sleep Heart Health Study and eBay auctions.
[Content of mineral elements of Gastrodia elata by principal components analysis].
Li, Jin-ling; Zhao, Zhi; Liu, Hong-chang; Luo, Chun-li; Huang, Ming-jin; Luo, Fu-lai; Wang, Hua-lei
2015-03-01
To study the content of mineral elements and the principal components in Gastrodia elata. Mineral elements were determined by ICP and the data was analyzed by SPSS. K element has the highest content-and the average content was 15.31 g x kg(-1). The average content of N element was 8.99 g x kg(-1), followed by K element. The coefficient of variation of K and N was small, but the Mn was the biggest with 51.39%. The highly significant positive correlation was found among N, P and K . Three principal components were selected by principal components analysis to evaluate the quality of G. elata. P, B, N, K, Cu, Mn, Fe and Mg were the characteristic elements of G. elata. The content of K and N elements was higher and relatively stable. The variation of Mn content was biggest. The quality of G. elata in Guizhou and Yunnan was better from the perspective of mineral elements.
Panazzolo, Diogo G; Sicuro, Fernando L; Clapauch, Ruth; Maranhão, Priscila A; Bouskela, Eliete; Kraemer-Aguiar, Luiz G
2012-11-13
We aimed to evaluate the multivariate association between functional microvascular variables and clinical-laboratorial-anthropometrical measurements. Data from 189 female subjects (34.0 ± 15.5 years, 30.5 ± 7.1 kg/m2), who were non-smokers, non-regular drug users, without a history of diabetes and/or hypertension, were analyzed by principal component analysis (PCA). PCA is a classical multivariate exploratory tool because it highlights common variation between variables allowing inferences about possible biological meaning of associations between them, without pre-establishing cause-effect relationships. In total, 15 variables were used for PCA: body mass index (BMI), waist circumference, systolic and diastolic blood pressure (BP), fasting plasma glucose, levels of total cholesterol, high-density lipoprotein cholesterol (HDL-c), low-density lipoprotein cholesterol (LDL-c), triglycerides (TG), insulin, C-reactive protein (CRP), and functional microvascular variables measured by nailfold videocapillaroscopy. Nailfold videocapillaroscopy was used for direct visualization of nutritive capillaries, assessing functional capillary density, red blood cell velocity (RBCV) at rest and peak after 1 min of arterial occlusion (RBCV(max)), and the time taken to reach RBCV(max) (TRBCV(max)). A total of 35% of subjects had metabolic syndrome, 77% were overweight/obese, and 9.5% had impaired fasting glucose. PCA was able to recognize that functional microvascular variables and clinical-laboratorial-anthropometrical measurements had a similar variation. The first five principal components explained most of the intrinsic variation of the data. For example, principal component 1 was associated with BMI, waist circumference, systolic BP, diastolic BP, insulin, TG, CRP, and TRBCV(max) varying in the same way. Principal component 1 also showed a strong association among HDL-c, RBCV, and RBCV(max), but in the opposite way. Principal component 3 was associated only with microvascular variables in the same way (functional capillary density, RBCV and RBCV(max)). Fasting plasma glucose appeared to be related to principal component 4 and did not show any association with microvascular reactivity. In non-diabetic female subjects, a multivariate scenario of associations between classic clinical variables strictly related to obesity and metabolic syndrome suggests a significant relationship between these diseases and microvascular reactivity.
The factorial reliability of the Middlesex Hospital Questionnaire in normal subjects.
Bagley, C
1980-03-01
The internal reliability of the Middlesex Hospital Questionnaire and its component subscales has been checked by means of principal components analyses of data on 256 normal subjects. The subscales (with the possible exception of Hysteria) were found to contribute to the general underlying factor of psychoneurosis. In general, the principal components analysis points to the reliability of the subscales, despite some item overlap.
ERIC Educational Resources Information Center
McCormick, Ernest J.; And Others
The study deals with the job component method of establishing compensation rates. The basic job analysis questionnaire used in the study was the Position Analysis Questionnaire (PAQ) (Form B). On the basis of a principal components analysis of PAQ data for a large sample (2,688) of jobs, a number of principal components (job dimensions) were…
NASA Astrophysics Data System (ADS)
Sokolov, Anton; Dmitriev, Egor; Delbarre, Hervé; Augustin, Patrick; Gengembre, Cyril; Fourmenten, Marc
2016-04-01
The problem of atmospheric contamination by principal air pollutants was considered in the industrialized coastal region of English Channel in Dunkirk influenced by north European metropolitan areas. MESO-NH nested models were used for the simulation of the local atmospheric dynamics and the online calculation of Lagrangian backward trajectories with 15-minute temporal resolution and the horizontal resolution down to 500 m. The one-month mesoscale numerical simulation was coupled with local pollution measurements of volatile organic components, particulate matter, ozone, sulphur dioxide and nitrogen oxides. Principal atmospheric pathways were determined by clustering technique applied to backward trajectories simulated. Six clusters were obtained which describe local atmospheric dynamics, four winds blowing through the English Channel, one coming from the south, and the biggest cluster with small wind speeds. This last cluster includes mostly sea breeze events. The analysis of meteorological data and pollution measurements allows relating the principal atmospheric pathways with local air contamination events. It was shown that contamination events are mostly connected with a channelling of pollution from local sources and low-turbulent states of the local atmosphere.
Metabolic profiles are principally different between cancers of the liver, pancreas and breast.
Budhu, Anuradha; Terunuma, Atsushi; Zhang, Geng; Hussain, S Perwez; Ambs, Stefan; Wang, Xin Wei
2014-01-01
Molecular profiling of primary tumors may facilitate the classification of patients with cancer into more homogenous biological groups to aid clinical management. Metabolomic profiling has been shown to be a powerful tool in characterizing the biological mechanisms underlying a disease but has not been evaluated for its ability to classify cancers by their tissue of origin. Thus, we assessed metabolomic profiling as a novel tool for multiclass cancer characterization. Global metabolic profiling was employed to identify metabolites in paired tumor and non-tumor liver (n=60), breast (n=130) and pancreatic (n=76) tissue specimens. Unsupervised principal component analysis showed that metabolites are principally unique to each tissue and cancer type. Such a difference can also be observed even among early stage cancers, suggesting a significant and unique alteration of global metabolic pathways associated with each cancer type. Our global high-throughput metabolomic profiling study shows that specific biochemical alterations distinguish liver, pancreatic and breast cancer and could be applied as cancer classification tools to differentiate tumors based on tissue of origin.
ERIC Educational Resources Information Center
Faginski-Stark, Erica; Casavant, Christopher; Collins, William; McCandless, Jason; Tencza, Marilyn
2012-01-01
Recent federal and state mandates have tasked school systems to move beyond principal evaluation as a bureaucratic function and to re-imagine it as a critical component to improve principal performance and compel school renewal. This qualitative study investigated the district leaders' and principals' perceptions of the performance evaluation…
2L-PCA: a two-level principal component analyzer for quantitative drug design and its applications.
Du, Qi-Shi; Wang, Shu-Qing; Xie, Neng-Zhong; Wang, Qing-Yan; Huang, Ri-Bo; Chou, Kuo-Chen
2017-09-19
A two-level principal component predictor (2L-PCA) was proposed based on the principal component analysis (PCA) approach. It can be used to quantitatively analyze various compounds and peptides about their functions or potentials to become useful drugs. One level is for dealing with the physicochemical properties of drug molecules, while the other level is for dealing with their structural fragments. The predictor has the self-learning and feedback features to automatically improve its accuracy. It is anticipated that 2L-PCA will become a very useful tool for timely providing various useful clues during the process of drug development.
Dynamic frequency tuning of electric and magnetic metamaterial response
O'Hara, John F; Averitt, Richard; Padilla, Willie; Chen, Hou-Tong
2014-09-16
A geometrically modifiable resonator is comprised of a resonator disposed on a substrate, and a means for geometrically modifying the resonator. The geometrically modifiable resonator can achieve active optical and/or electronic control of the frequency response in metamaterials and/or frequency selective surfaces, potentially with sub-picosecond response times. Additionally, the methods taught here can be applied to discrete geometrically modifiable circuit components such as inductors and capacitors. Principally, controlled conductivity regions, using either reversible photodoping or voltage induced depletion activation, are used to modify the geometries of circuit components, thus allowing frequency tuning of resonators without otherwise affecting the bulk substrate electrical properties. The concept is valid over any frequency range in which metamaterials are designed to operate.
26 CFR 1.181-6 - Effective/applicability date.
Code of Federal Regulations, 2014 CFR
2014-04-01
....181-1 through 1.181-5 apply to productions the first day of principal photography for which occurs on... principal photography or in-between animation occurs on or after December 7, 2012. (b) Pre-effective date... applies and for which the first day of principal photography (or in-between animation) occurred before...
26 CFR 1.181-6 - Effective/applicability date.
Code of Federal Regulations, 2013 CFR
2013-04-01
....181-1 through 1.181-5 apply to productions the first day of principal photography for which occurs on... principal photography or in-between animation occurs on or after December 7, 2012. (b) Pre-effective date... applies and for which the first day of principal photography (or in-between animation) occurred before...
Promising Practices: Building the Next Generation of School Leaders
ERIC Educational Resources Information Center
Bryant, Jennifer Edic; Escalante, Karen; Selva, Ashley
2017-01-01
This study applies transformational leadership theory practices to examine the purposeful ways in which principals work to build the next generation of teacher leaders in response to the shortage of K-12 principals. Given the impact principals have on student development and the shortage of those applying for the principalship, the purpose of this…
NASA Astrophysics Data System (ADS)
Chen, Shuming; Wang, Dengfeng; Liu, Bo
This paper investigates optimization design of the thickness of the sound package performed on a passenger automobile. The major characteristics indexes for performance selected to evaluate the processes are the SPL of the exterior noise and the weight of the sound package, and the corresponding parameters of the sound package are the thickness of the glass wool with aluminum foil for the first layer, the thickness of the glass fiber for the second layer, and the thickness of the PE foam for the third layer. In this paper, the process is fundamentally with multiple performances, thus, the grey relational analysis that utilizes grey relational grade as performance index is especially employed to determine the optimal combination of the thickness of the different layers for the designed sound package. Additionally, in order to evaluate the weighting values corresponding to various performance characteristics, the principal component analysis is used to show their relative importance properly and objectively. The results of the confirmation experiments uncover that grey relational analysis coupled with principal analysis methods can successfully be applied to find the optimal combination of the thickness for each layer of the sound package material. Therefore, the presented method can be an effective tool to improve the vehicle exterior noise and lower the weight of the sound package. In addition, it will also be helpful for other applications in the automotive industry, such as the First Automobile Works in China, Changan Automobile in China, etc.
Information extraction from multivariate images
NASA Technical Reports Server (NTRS)
Park, S. K.; Kegley, K. A.; Schiess, J. R.
1986-01-01
An overview of several multivariate image processing techniques is presented, with emphasis on techniques based upon the principal component transformation (PCT). Multiimages in various formats have a multivariate pixel value, associated with each pixel location, which has been scaled and quantized into a gray level vector, and the bivariate of the extent to which two images are correlated. The PCT of a multiimage decorrelates the multiimage to reduce its dimensionality and reveal its intercomponent dependencies if some off-diagonal elements are not small, and for the purposes of display the principal component images must be postprocessed into multiimage format. The principal component analysis of a multiimage is a statistical analysis based upon the PCT whose primary application is to determine the intrinsic component dimensionality of the multiimage. Computational considerations are also discussed.
Soleimani, Mohammad Ali; Yaghoobzadeh, Ameneh; Bahrami, Nasim; Sharif, Saeed Pahlevan; Sharif Nia, Hamid
2016-10-01
In this study, 398 Iranian cancer patients completed the 15-item Templer's Death Anxiety Scale (TDAS). Tests of internal consistency, principal components analysis, and confirmatory factor analysis were conducted to assess the internal consistency and factorial validity of the Persian TDAS. The construct reliability statistic and average variance extracted were also calculated to measure construct reliability, convergent validity, and discriminant validity. Principal components analysis indicated a 3-component solution, which was generally supported in the confirmatory analysis. However, acceptable cutoffs for construct reliability, convergent validity, and discriminant validity were not fulfilled for the three subscales that were derived from the principal component analysis. This study demonstrated both the advantages and potential limitations of using the TDAS with Persian-speaking cancer patients.
Functional Data Analysis in NTCP Modeling: A New Method to Explore the Radiation Dose-Volume Effects
DOE Office of Scientific and Technical Information (OSTI.GOV)
Benadjaoud, Mohamed Amine, E-mail: mohamedamine.benadjaoud@gustaveroussy.fr; Université Paris sud, Le Kremlin-Bicêtre; Institut Gustave Roussy, Villejuif
2014-11-01
Purpose/Objective(s): To describe a novel method to explore radiation dose-volume effects. Functional data analysis is used to investigate the information contained in differential dose-volume histograms. The method is applied to the normal tissue complication probability modeling of rectal bleeding (RB) for patients irradiated in the prostatic bed by 3-dimensional conformal radiation therapy. Methods and Materials: Kernel density estimation was used to estimate the individual probability density functions from each of the 141 rectum differential dose-volume histograms. Functional principal component analysis was performed on the estimated probability density functions to explore the variation modes in the dose distribution. The functional principalmore » components were then tested for association with RB using logistic regression adapted to functional covariates (FLR). For comparison, 3 other normal tissue complication probability models were considered: the Lyman-Kutcher-Burman model, logistic model based on standard dosimetric parameters (LM), and logistic model based on multivariate principal component analysis (PCA). Results: The incidence rate of grade ≥2 RB was 14%. V{sub 65Gy} was the most predictive factor for the LM (P=.058). The best fit for the Lyman-Kutcher-Burman model was obtained with n=0.12, m = 0.17, and TD50 = 72.6 Gy. In PCA and FLR, the components that describe the interdependence between the relative volumes exposed at intermediate and high doses were the most correlated to the complication. The FLR parameter function leads to a better understanding of the volume effect by including the treatment specificity in the delivered mechanistic information. For RB grade ≥2, patients with advanced age are significantly at risk (odds ratio, 1.123; 95% confidence interval, 1.03-1.22), and the fits of the LM, PCA, and functional principal component analysis models are significantly improved by including this clinical factor. Conclusion: Functional data analysis provides an attractive method for flexibly estimating the dose-volume effect for normal tissues in external radiation therapy.« less
Understanding software faults and their role in software reliability modeling
NASA Technical Reports Server (NTRS)
Munson, John C.
1994-01-01
This study is a direct result of an on-going project to model the reliability of a large real-time control avionics system. In previous modeling efforts with this system, hardware reliability models were applied in modeling the reliability behavior of this system. In an attempt to enhance the performance of the adapted reliability models, certain software attributes were introduced in these models to control for differences between programs and also sequential executions of the same program. As the basic nature of the software attributes that affect software reliability become better understood in the modeling process, this information begins to have important implications on the software development process. A significant problem arises when raw attribute measures are to be used in statistical models as predictors, for example, of measures of software quality. This is because many of the metrics are highly correlated. Consider the two attributes: lines of code, LOC, and number of program statements, Stmts. In this case, it is quite obvious that a program with a high value of LOC probably will also have a relatively high value of Stmts. In the case of low level languages, such as assembly language programs, there might be a one-to-one relationship between the statement count and the lines of code. When there is a complete absence of linear relationship among the metrics, they are said to be orthogonal or uncorrelated. Usually the lack of orthogonality is not serious enough to affect a statistical analysis. However, for the purposes of some statistical analysis such as multiple regression, the software metrics are so strongly interrelated that the regression results may be ambiguous and possibly even misleading. Typically, it is difficult to estimate the unique effects of individual software metrics in the regression equation. The estimated values of the coefficients are very sensitive to slight changes in the data and to the addition or deletion of variables in the regression equation. Since most of the existing metrics have common elements and are linear combinations of these common elements, it seems reasonable to investigate the structure of the underlying common factors or components that make up the raw metrics. The technique we have chosen to use to explore this structure is a procedure called principal components analysis. Principal components analysis is a decomposition technique that may be used to detect and analyze collinearity in software metrics. When confronted with a large number of metrics measuring a single construct, it may be desirable to represent the set by some smaller number of variables that convey all, or most, of the information in the original set. Principal components are linear transformations of a set of random variables that summarize the information contained in the variables. The transformations are chosen so that the first component accounts for the maximal amount of variation of the measures of any possible linear transform; the second component accounts for the maximal amount of residual variation; and so on. The principal components are constructed so that they represent transformed scores on dimensions that are orthogonal. Through the use of principal components analysis, it is possible to have a set of highly related software attributes mapped into a small number of uncorrelated attribute domains. This definitively solves the problem of multi-collinearity in subsequent regression analysis. There are many software metrics in the literature, but principal component analysis reveals that there are few distinct sources of variation, i.e. dimensions, in this set of metrics. It would appear perfectly reasonable to characterize the measurable attributes of a program with a simple function of a small number of orthogonal metrics each of which represents a distinct software attribute domain.
Feature extraction for ultrasonic sensor based defect detection in ceramic components
NASA Astrophysics Data System (ADS)
Kesharaju, Manasa; Nagarajah, Romesh
2014-02-01
High density silicon carbide materials are commonly used as the ceramic element of hard armour inserts used in traditional body armour systems to reduce their weight, while providing improved hardness, strength and elastic response to stress. Currently, armour ceramic tiles are inspected visually offline using an X-ray technique that is time consuming and very expensive. In addition, from X-rays multiple defects are also misinterpreted as single defects. Therefore, to address these problems the ultrasonic non-destructive approach is being investigated. Ultrasound based inspection would be far more cost effective and reliable as the methodology is applicable for on-line quality control including implementation of accept/reject criteria. This paper describes a recently developed methodology to detect, locate and classify various manufacturing defects in ceramic tiles using sub band coding of ultrasonic test signals. The wavelet transform is applied to the ultrasonic signal and wavelet coefficients in the different frequency bands are extracted and used as input features to an artificial neural network (ANN) for purposes of signal classification. Two different classifiers, using artificial neural networks (supervised) and clustering (un-supervised) are supplied with features selected using Principal Component Analysis(PCA) and their classification performance compared. This investigation establishes experimentally that Principal Component Analysis(PCA) can be effectively used as a feature selection method that provides superior results for classifying various defects in the context of ultrasonic inspection in comparison with the X-ray technique.
Modified Inverse First Order Reliability Method (I-FORM) for Predicting Extreme Sea States.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Eckert-Gallup, Aubrey Celia; Sallaberry, Cedric Jean-Marie; Dallman, Ann Renee
Environmental contours describing extreme sea states are generated as the input for numerical or physical model simulation s as a part of the stand ard current practice for designing marine structure s to survive extreme sea states. Such environmental contours are characterized by combinations of significant wave height ( ) and energy period ( ) values calculated for a given recurrence interval using a set of data based on hindcast simulations or buoy observations over a sufficient period of record. The use of the inverse first - order reliability method (IFORM) i s standard design practice for generating environmental contours.more » In this paper, the traditional appli cation of the IFORM to generating environmental contours representing extreme sea states is described in detail and its merits and drawbacks are assessed. The application of additional methods for analyzing sea state data including the use of principal component analysis (PCA) to create an uncorrelated representation of the data under consideration is proposed. A reexamination of the components of the IFORM application to the problem at hand including the use of new distribution fitting techniques are shown to contribute to the development of more accurate a nd reasonable representations of extreme sea states for use in survivability analysis for marine struc tures. Keywords: In verse FORM, Principal Component Analysis , Environmental Contours, Extreme Sea State Characteri zation, Wave Energy Converters« less
MR Image Reconstruction Using Block Matching and Adaptive Kernel Methods.
Schmidt, Johannes F M; Santelli, Claudio; Kozerke, Sebastian
2016-01-01
An approach to Magnetic Resonance (MR) image reconstruction from undersampled data is proposed. Undersampling artifacts are removed using an iterative thresholding algorithm applied to nonlinearly transformed image block arrays. Each block array is transformed using kernel principal component analysis where the contribution of each image block to the transform depends in a nonlinear fashion on the distance to other image blocks. Elimination of undersampling artifacts is achieved by conventional principal component analysis in the nonlinear transform domain, projection onto the main components and back-mapping into the image domain. Iterative image reconstruction is performed by interleaving the proposed undersampling artifact removal step and gradient updates enforcing consistency with acquired k-space data. The algorithm is evaluated using retrospectively undersampled MR cardiac cine data and compared to k-t SPARSE-SENSE, block matching with spatial Fourier filtering and k-t ℓ1-SPIRiT reconstruction. Evaluation of image quality and root-mean-squared-error (RMSE) reveal improved image reconstruction for up to 8-fold undersampled data with the proposed approach relative to k-t SPARSE-SENSE, block matching with spatial Fourier filtering and k-t ℓ1-SPIRiT. In conclusion, block matching and kernel methods can be used for effective removal of undersampling artifacts in MR image reconstruction and outperform methods using standard compressed sensing and ℓ1-regularized parallel imaging methods.
Sacks, Jason D; Ito, Kazuhiko; Wilson, William E; Neas, Lucas M
2012-10-01
With the advent of multicity studies, uniform statistical approaches have been developed to examine air pollution-mortality associations across cities. To assess the sensitivity of the air pollution-mortality association to different model specifications in a single and multipollutant context, the authors applied various regression models developed in previous multicity time-series studies of air pollution and mortality to data from Philadelphia, Pennsylvania (May 1992-September 1995). Single-pollutant analyses used daily cardiovascular mortality, fine particulate matter (particles with an aerodynamic diameter ≤2.5 µm; PM(2.5)), speciated PM(2.5), and gaseous pollutant data, while multipollutant analyses used source factors identified through principal component analysis. In single-pollutant analyses, risk estimates were relatively consistent across models for most PM(2.5) components and gaseous pollutants. However, risk estimates were inconsistent for ozone in all-year and warm-season analyses. Principal component analysis yielded factors with species associated with traffic, crustal material, residual oil, and coal. Risk estimates for these factors exhibited less sensitivity to alternative regression models compared with single-pollutant models. Factors associated with traffic and crustal material showed consistently positive associations in the warm season, while the coal combustion factor showed consistently positive associations in the cold season. Overall, mortality risk estimates examined using a source-oriented approach yielded more stable and precise risk estimates, compared with single-pollutant analyses.
Principal Component Clustering Approach to Teaching Quality Discriminant Analysis
ERIC Educational Resources Information Center
Xian, Sidong; Xia, Haibo; Yin, Yubo; Zhai, Zhansheng; Shang, Yan
2016-01-01
Teaching quality is the lifeline of the higher education. Many universities have made some effective achievement about evaluating the teaching quality. In this paper, we establish the Students' evaluation of teaching (SET) discriminant analysis model and algorithm based on principal component clustering analysis. Additionally, we classify the SET…
Analysis of the principal component algorithm in phase-shifting interferometry.
Vargas, J; Quiroga, J Antonio; Belenguer, T
2011-06-15
We recently presented a new asynchronous demodulation method for phase-sampling interferometry. The method is based in the principal component analysis (PCA) technique. In the former work, the PCA method was derived heuristically. In this work, we present an in-depth analysis of the PCA demodulation method.
Psychometric Measurement Models and Artificial Neural Networks
ERIC Educational Resources Information Center
Sese, Albert; Palmer, Alfonso L.; Montano, Juan J.
2004-01-01
The study of measurement models in psychometrics by means of dimensionality reduction techniques such as Principal Components Analysis (PCA) is a very common practice. In recent times, an upsurge of interest in the study of artificial neural networks apt to computing a principal component extraction has been observed. Despite this interest, the…
Microelectrode arrays (MEAs) detect drug and chemical induced changes in neuronal network function and have been used for neurotoxicity screening. As a proof-•of-concept, the current study assessed the utility of analytical "fingerprinting" using Principal Components Analysis (P...
Incremental principal component pursuit for video background modeling
Rodriquez-Valderrama, Paul A.; Wohlberg, Brendt
2017-03-14
An incremental Principal Component Pursuit (PCP) algorithm for video background modeling that is able to process one frame at a time while adapting to changes in background, with a computational complexity that allows for real-time processing, having a low memory footprint and is robust to translational and rotational jitter.
A single determinant dominates the rate of yeast protein evolution.
Drummond, D Allan; Raval, Alpan; Wilke, Claus O
2006-02-01
A gene's rate of sequence evolution is among the most fundamental evolutionary quantities in common use, but what determines evolutionary rates has remained unclear. Here, we carry out the first combined analysis of seven predictors (gene expression level, dispensability, protein abundance, codon adaptation index, gene length, number of protein-protein interactions, and the gene's centrality in the interaction network) previously reported to have independent influences on protein evolutionary rates. Strikingly, our analysis reveals a single dominant variable linked to the number of translation events which explains 40-fold more variation in evolutionary rate than any other, suggesting that protein evolutionary rate has a single major determinant among the seven predictors. The dominant variable explains nearly half the variation in the rate of synonymous and protein evolution. We show that the two most commonly used methods to disentangle the determinants of evolutionary rate, partial correlation analysis and ordinary multivariate regression, produce misleading or spurious results when applied to noisy biological data. We overcome these difficulties by employing principal component regression, a multivariate regression of evolutionary rate against the principal components of the predictor variables. Our results support the hypothesis that translational selection governs the rate of synonymous and protein sequence evolution in yeast.
NASA Astrophysics Data System (ADS)
Pournamdari, Mohsen; Hashim, Mazlan; Pour, Amin Beiranvand
2014-08-01
Spectral transformation methods, including correlation coefficient (CC) and Optimum Index Factor (OIF), band ratio (BR) and principal component analysis (PCA) were applied to ASTER and Landsat TM bands for lithological mapping of Soghan ophiolitic complex in south of Iran. The results indicated that the methods used evidently showed superior outputs for detecting lithological units in ophiolitic complexes. CC and OIF methods were used to establish enhanced Red-Green-Blue (RGB) color combination bands for discriminating lithological units. A specialized band ratio (4/1, 4/5, 4/7 in RGB) was developed using ASTER bands to differentiate lithological units in ophiolitic complexes. The band ratio effectively detected serpentinite dunite as host rock of chromite ore deposits from surrounding lithological units in the study area. Principal component images derived from first three bands of ASTER and Landsat TM produced well results for lithological mapping applications. ASTER bands contain improved spectral characteristics and higher spatial resolution for detecting serpentinite dunite in ophiolitic complexes. The developed approach used in this study offers great potential for lithological mapping using ASTER and Landsat TM bands, which contributes in economic geology for prospecting chromite ore deposits associated with ophiolitic complexes.
Spatial and spectral analysis of corneal epithelium injury using hyperspectral images
NASA Astrophysics Data System (ADS)
Md Noor, Siti Salwa; Michael, Kaleena; Marshall, Stephen; Ren, Jinchang
2017-12-01
Eye assessment is essential in preventing blindness. Currently, the existing methods to assess corneal epithelium injury are complex and require expert knowledge. Hence, we have introduced a non-invasive technique using hyperspectral imaging (HSI) and an image analysis algorithm of corneal epithelium injury. Three groups of images were compared and analyzed, namely healthy eyes, injured eyes, and injured eyes with stain. Dimensionality reduction using principal component analysis (PCA) was applied to reduce massive data and redundancies. The first 10 principal components (PCs) were selected for further processing. The mean vector of 10 PCs with 45 pairs of all combinations was computed and sent to two classifiers. A quadratic Bayes normal classifier (QDC) and a support vector classifier (SVC) were used in this study to discriminate the eleven eyes into three groups. As a result, the combined classifier of QDC and SVC showed optimal performance with 2D PCA features (2DPCA-QDSVC) and was utilized to classify normal and abnormal tissues, using color image segmentation. The result was compared with human segmentation. The outcome showed that the proposed algorithm produced extremely promising results to assist the clinician in quantifying a cornea injury.
A model of objective weighting for EIA.
Ying, L G; Liu, Y C
1995-06-01
In spite of progress achieved in the research of environmental impact assessment (EIA), the problem of weight distribution for a set of parameters has not as yet, been properly solved. This paper presents an approach of objective weighting by using a procedure of P ij principal component-factor analysis (P ij PCFA), which suits specifically those parameters measured directly by physical scales. The P ij PCFA weighting procedure reforms the conventional weighting practice in two aspects: first, the expert subjective judgment is replaced by the standardized measure P ij as the original input of weight processing and, secondly, the principal component-factor analysis is introduced to approach the environmental parameters for their respective contributions to the totality of the regional ecosystem. Not only is the P ij PCFA weighting logical in theoretical reasoning, it also suits practically all levels of professional routines in natural environmental assessment and impact analysis. Having been assured of objectivity and accuracy in the EIA case study of the Chuansha County in Shanghai, China, the P ij PCFA weighting procedure has the potential to be applied in other geographical fields that need assigning weights to parameters that are measured by physical scales.
Piscivory limits diversification of feeding morphology in centrarchid fishes.
Collar, David C; O'Meara, Brian C; Wainwright, Peter C; Near, Thomas J
2009-06-01
Proximity to an adaptive peak influences a lineage's potential to diversify. We tested whether piscivory, a high quality but functionally demanding trophic strategy, represents an adaptive peak that limits morphological diversification in the teleost fish clade, Centrarchidae. We synthesized published diet data and applied a well-resolved, multilocus and time-calibrated phylogeny to reconstruct ancestral piscivory. We measured functional features of the skull and performed principal components analysis on species' values for these variables. To assess the role of piscivory on morphological diversification, we compared the fit of several models of evolution for each principal component (PC), where model parameters were allowed to vary between lineages that differed in degree of piscivory. According to the best-fitting model, two adaptive peaks influenced PC 1 evolution, one peak shared between highly and moderately piscivorous lineages and another for nonpiscivores. Brownian motion better fit PCs 2, 3, and 4, but the best Brownian models infer a slow rate of PC 2 evolution shared among all piscivores and a uniquely slow rate of PC 4 evolution in highly piscivorous lineages. These results suggest that piscivory limits feeding morphology diversification, but this effect is most severe in lineages that exhibit an extreme form of this diet.
Mottese, Antonio Francesco; Naccari, Clara; Vadalà, Rossella; Bua, Giuseppe Daniel; Bartolomeo, Giovanni; Rando, Rossana; Cicero, Nicola; Dugo, Giacomo
2018-01-01
Opuntia ficus-indica L. Miller fruits, particularly 'Ficodindia dell'Etna' of Biancavilla (POD), 'Fico d'india tradizionale di Roccapalumba' with protected brand and samples from an experimental field in Pezzolo (Sicily) were analyzed by inductively coupled plasma mass spectrometry in order to determine the multi-element profile. A multivariate chemometric approach, specifically principal component analysis (PCA), was applied to individuate how mineral elements may represent a marker of geographic origin, which would be useful for traceability. PCA has allowed us to verify that the geographical origin of prickly pear fruits is significantly influenced by trace element content, and the results found in Biancavilla PDO samples were linked to the geological composition of this volcanic areas. It was observed that two principal components accounted for 72.03% of the total variance in the data and, in more detail, PC1 explains 45.51% and PC2 26.52%, respectively. This study demonstrated that PCA is an integrated tool for the traceability of food products and, at the same time, a useful method of authentication of typical local fruits such as prickly pear. © 2017 Society of Chemical Industry. © 2017 Society of Chemical Industry.
Support vector machine and principal component analysis for microarray data classification
NASA Astrophysics Data System (ADS)
Astuti, Widi; Adiwijaya
2018-03-01
Cancer is a leading cause of death worldwide although a significant proportion of it can be cured if it is detected early. In recent decades, technology called microarray takes an important role in the diagnosis of cancer. By using data mining technique, microarray data classification can be performed to improve the accuracy of cancer diagnosis compared to traditional techniques. The characteristic of microarray data is small sample but it has huge dimension. Since that, there is a challenge for researcher to provide solutions for microarray data classification with high performance in both accuracy and running time. This research proposed the usage of Principal Component Analysis (PCA) as a dimension reduction method along with Support Vector Method (SVM) optimized by kernel functions as a classifier for microarray data classification. The proposed scheme was applied on seven data sets using 5-fold cross validation and then evaluation and analysis conducted on term of both accuracy and running time. The result showed that the scheme can obtained 100% accuracy for Ovarian and Lung Cancer data when Linear and Cubic kernel functions are used. In term of running time, PCA greatly reduced the running time for every data sets.
NASA Astrophysics Data System (ADS)
Ogruc Ildiz, G.; Arslan, M.; Unsalan, O.; Araujo-Andrade, C.; Kurt, E.; Karatepe, H. T.; Yilmaz, A.; Yalcinkaya, O. B.; Herken, H.
2016-01-01
In this study, a methodology based on Fourier-transform infrared spectroscopy and principal component analysis and partial least square methods is proposed for the analysis of blood plasma samples in order to identify spectral changes correlated with some biomarkers associated with schizophrenia and bipolarity. Our main goal was to use the spectral information for the calibration of statistical models to discriminate and classify blood plasma samples belonging to bipolar and schizophrenic patients. IR spectra of 30 samples of blood plasma obtained from each, bipolar and schizophrenic patients and healthy control group were collected. The results obtained from principal component analysis (PCA) show a clear discrimination between the bipolar (BP), schizophrenic (SZ) and control group' (CG) blood samples that also give possibility to identify three main regions that show the major differences correlated with both mental disorders (biomarkers). Furthermore, a model for the classification of the blood samples was calibrated using partial least square discriminant analysis (PLS-DA), allowing the correct classification of BP, SZ and CG samples. The results obtained applying this methodology suggest that it can be used as a complimentary diagnostic tool for the detection and discrimination of these mental diseases.
The fine-scale genetic structure and evolution of the Japanese population.
Takeuchi, Fumihiko; Katsuya, Tomohiro; Kimura, Ryosuke; Nabika, Toru; Isomura, Minoru; Ohkubo, Takayoshi; Tabara, Yasuharu; Yamamoto, Ken; Yokota, Mitsuhiro; Liu, Xuanyao; Saw, Woei-Yuh; Mamatyusupu, Dolikun; Yang, Wenjun; Xu, Shuhua; Teo, Yik-Ying; Kato, Norihiro
2017-01-01
The contemporary Japanese populations largely consist of three genetically distinct groups-Hondo, Ryukyu and Ainu. By principal-component analysis, while the three groups can be clearly separated, the Hondo people, comprising 99% of the Japanese, form one almost indistinguishable cluster. To understand fine-scale genetic structure, we applied powerful haplotype-based statistical methods to genome-wide single nucleotide polymorphism data from 1600 Japanese individuals, sampled from eight distinct regions in Japan. We then combined the Japanese data with 26 other Asian populations data to analyze the shared ancestry and genetic differentiation. We found that the Japanese could be separated into nine genetic clusters in our dataset, showing a marked concordance with geography; and that major components of ancestry profile of Japanese were from the Korean and Han Chinese clusters. We also detected and dated admixture in the Japanese. While genetic differentiation between Ryukyu and Hondo was suggested to be caused in part by positive selection, genetic differentiation among the Hondo clusters appeared to result principally from genetic drift. Notably, in Asians, we found the possibility that positive selection accentuated genetic differentiation among distant populations but attenuated genetic differentiation among close populations. These findings are significant for studies of human evolution and medical genetics.
Boeing, Joana Schuelter; Barizão, Erica Oliveira; E Silva, Beatriz Costa; Montanher, Paula Fernandes; de Cinque Almeida, Vitor; Visentainer, Jesuí Vergilio
2014-01-01
This study evaluated the effect of the solvent on the extraction of antioxidant compounds from black mulberry (Morus nigra), blackberry (Rubus ulmifolius) and strawberry (Fragaria x ananassa). Different extracts of each berry were evaluated from the determination of total phenolic content, anthocyanin content and antioxidant capacity, and data were applied to the principal component analysis (PCA) to gain an overview of the effect of the solvent in extraction method. For all the berries analyzed, acetone/water (70/30, v/v) solvent mixture was more efficient solvent in the extracting of phenolic compounds, and methanol/water/acetic acid (70/29.5/0.5, v/v/v) showed the best values for anthocyanin content. Mixtures of ethanol/water (50/50, v/v), acetone water/acetic acid (70/29.5/0.5, v/v/v) and acetone/water (50/50, v/v) presented the highest antioxidant capacities for black mulberries, blackberries and strawberries, respectively. Antioxidants extractions are extremely affected by the solvent combination used. In addition, the obtained extracts with the organic solvent-water mixtures were distinguished from the extracts obtained with pure organic solvents, through the PCA analysis.
Li, Hong Zhi; Tao, Wei; Gao, Ting; Li, Hui; Lu, Ying Hua; Su, Zhong Min
2011-01-01
We propose a generalized regression neural network (GRNN) approach based on grey relational analysis (GRA) and principal component analysis (PCA) (GP-GRNN) to improve the accuracy of density functional theory (DFT) calculation for homolysis bond dissociation energies (BDE) of Y-NO bond. As a demonstration, this combined quantum chemistry calculation with the GP-GRNN approach has been applied to evaluate the homolysis BDE of 92 Y-NO organic molecules. The results show that the ull-descriptor GRNN without GRA and PCA (F-GRNN) and with GRA (G-GRNN) approaches reduce the root-mean-square (RMS) of the calculated homolysis BDE of 92 organic molecules from 5.31 to 0.49 and 0.39 kcal mol(-1) for the B3LYP/6-31G (d) calculation. Then the newly developed GP-GRNN approach further reduces the RMS to 0.31 kcal mol(-1). Thus, the GP-GRNN correction on top of B3LYP/6-31G (d) can improve the accuracy of calculating the homolysis BDE in quantum chemistry and can predict homolysis BDE which cannot be obtained experimentally.
Savio, Marianela; Ortiz, María S; Almeida, César A; Olsina, Roberto A; Martinez, Luis D; Gil, Raúl A
2014-09-15
Trace metals have negative effects on the oxidative stability of edible oils and they are important because of possibility for oils characterisation. A single-step procedure for trace elemental analysis of edible oils is presented. To this aim, a solubilisation with tetramethylammonium hydroxide (TMAH) was assayed prior to inductively coupled plasma mass spectrometry detection. Small amounts of TMAH were used, resulting in high elemental concentrations. This method was applied to edible oils commercially available in Argentine. Elements present in small amounts (Cu, Ge, Mn, Mo, Ni, Sb, Sr, Ti, and V) were determined in olive, corn, almond and sunflower oils. The limits of detection were between 0.004 μg g(-1) for Mn and Sr, and 0.32 μg g(-1) for Sb. Principal components analysis was used to correlate the content of trace metals with the type of oils. The two first principal components retained 91.6% of the variability of the system. This is a relatively simple and safe procedure, and could be an attractive alternative for quality control, traceability and routine analysis of edible oils. Copyright © 2014 Elsevier Ltd. All rights reserved.
Spectroscopic study of honey from Apis mellifera from different regions in Mexico
NASA Astrophysics Data System (ADS)
Frausto-Reyes, C.; Casillas-Peñuelas, R.; Quintanar-Stephano, JL; Macías-López, E.; Bujdud-Pérez, JM; Medina-Ramírez, I.
2017-05-01
The objective of this study was to analyze by Raman and UV-Vis-NIR Spectroscopic techniques, Mexican honey from Apis Mellífera, using representative samples with different botanic origins (unifloral and multifloral) and diverse climates. Using Raman spectroscopy together with principal components analysis, the results obtained represent the possibility to use them for determination of floral origin of honey, independently of the region of sampling. For this, the effect of heat up the honey was analyzed in relation that it was possible to greatly reduce the fluorescence background in Raman spectra, which allowed the visualization of fructose and glucose peaks. Using UV-Vis-NIR, spectroscopy, a characteristic spectrum profile of transmittance was obtained for each honey type. In addition, to have an objective characterization of color, a CIE Yxy and CIE L*a*b* colorimetric register was realized for each honey type. Applying the principal component analysis and their correlation with chromaticity coordinates allowed classifying the honey samples in one plot as: cutoff wavelength, maximum transmittance, tones and lightness. The results show that it is possible to obtain a spectroscopic record of honeys with specific characteristics by reducing the effects of fluorescence.
Macro policy responses to oil booms and busts in the United Arab Emirates
DOE Office of Scientific and Technical Information (OSTI.GOV)
Al-Mutawa, A.K.
1991-01-01
The effects of oil shocks and macro policy changes in the United Arab Emirates are analyzed. A theoretical model is developed within the framework of the Dutch Disease literature. It contains four unique features that are applicable to the United Arab Emirates' economy. There are: (1) the presence of a large foreign labor force; (2) OPEC's oil export quotas; (3) the division of oil profits; and (4) the important role of government expenditures. The model is then used to examine the welfare effects of the above-mentioned shocks. An econometric model is then specified that conforms to the analytical model. Inmore » the econometric model the method of principal components' is applied owing to the undersized sample data. The principal components methodology is used in both the identification testing and the estimation of the structural equations. The oil and macro policy shocks are then simulated. The simulation results show that an oil-quantity boom leads to a higher welfare gain than an oil-price boom. Under certain circumstances, this finding is also confirmed by the comparative statistics that follow from the analytical model.« less
Correlation between grade of pearlite spheroidization and laser induced spectra
NASA Astrophysics Data System (ADS)
Yao, Shunchun; Dong, Meirong; Lu, Jidong; Li, Jun; Dong, Xuan
2013-12-01
Laser induced breakdown spectroscopy (LIBS) which is used traditionally as a spectrochemical analytical technique was employed to analyze the grade of pearlite spheroidization. Three 12Cr1MoV steel specimens with different grades of pearlite spheroidization were ablated to produce plasma by pulse laser at 266 nm. In order to determine the optimal temporal condition and plasma parameters for correlating the grade of pearlite spheroidization and laser induced spectra, a set of spectra at different delays were analyzed by the principal component analysis method. Then, the relationship between plasma temperature, intensity ratios of ionic to atomic lines and grade of pearlite spheroidization was studied. The analysis results show that the laser induced spectra of different grades of pearlite spheroidization can be readily identifiable by principal component analysis in the range of 271.941-289.672 nm with 1000 ns delay time. It is also found that a good agreement exists between the Fe ionic to atomic line ratios and the tensile strength, whereas there is no obvious difference in the plasma temperature. Therefore, LIBS may be applied not only as a spectrochemical analytical technique but also as a new way to estimate the grade of pearlite spheroidization.
Sand/cement ratio evaluation on mortar using neural networks and ultrasonic transmission inspection.
Molero, M; Segura, I; Izquierdo, M A G; Fuente, J V; Anaya, J J
2009-02-01
The quality and degradation state of building materials can be determined by nondestructive testing (NDT). These materials are composed of a cementitious matrix and particles or fragments of aggregates. Sand/cement ratio (s/c) provides the final material quality; however, the sand content can mask the matrix properties in a nondestructive measurement. Therefore, s/c ratio estimation is needed in nondestructive characterization of cementitious materials. In this study, a methodology to classify the sand content in mortar is presented. The methodology is based on ultrasonic transmission inspection, data reduction, and features extraction by principal components analysis (PCA), and neural network classification. This evaluation is carried out with several mortar samples, which were made while taking into account different cement types and s/c ratios. The estimated s/c ratio is determined by ultrasonic spectral attenuation with three different broadband transducers (0.5, 1, and 2 MHz). Statistical PCA to reduce the dimension of the captured traces has been applied. Feed-forward neural networks (NNs) are trained using principal components (PCs) and their outputs are used to display the estimated s/c ratios in false color images, showing the s/c ratio distribution of the mortar samples.
Signal to Noise Studies on Thermographic Data with Fabricated Defects for Defense Structures
NASA Technical Reports Server (NTRS)
Zalameda, Joseph N.; Rajic, Nik; Genest, Marc
2006-01-01
There is a growing international interest in thermal inspection systems for asset life assessment and management of defense platforms. The efficacy of flash thermography is generally enhanced by applying image processing algorithms to the observations of raw temperature. Improving the defect signal to noise ratio (SNR) is of primary interest to reduce false calls and allow for easier interpretation of a thermal inspection image. Several factors affecting defect SNR were studied such as data compression and reconstruction using principal component analysis and time window processing.
A unified development of several techniques for the representation of random vectors and data sets
NASA Technical Reports Server (NTRS)
Bundick, W. T.
1973-01-01
Linear vector space theory is used to develop a general representation of a set of data vectors or random vectors by linear combinations of orthonormal vectors such that the mean squared error of the representation is minimized. The orthonormal vectors are shown to be the eigenvectors of an operator. The general representation is applied to several specific problems involving the use of the Karhunen-Loeve expansion, principal component analysis, and empirical orthogonal functions; and the common properties of these representations are developed.
Application of remote sensing to reconnaissance geologic mapping and mineral exploration
NASA Technical Reports Server (NTRS)
Birnie, R. W.; Dykstra, J. D.
1978-01-01
A method of mapping geology at a reconnaissance scale and locating zones of possible hydrothermal alteration has been developed. This method is based on principal component analysis of Landsat digital data and is applied to the desert area of the Chagai Hills, Baluchistan, Pakistan. A method for airborne spectrometric detection of geobotanical anomalies associated with prophyry Cu-Mo mineralization at Heddleston, Montana has also been developed. This method is based on discriminants in the 0.67 micron and 0.79 micron region of the spectrum.
Dynamic competitive probabilistic principal components analysis.
López-Rubio, Ezequiel; Ortiz-DE-Lazcano-Lobato, Juan Miguel
2009-04-01
We present a new neural model which extends the classical competitive learning (CL) by performing a Probabilistic Principal Components Analysis (PPCA) at each neuron. The model also has the ability to learn the number of basis vectors required to represent the principal directions of each cluster, so it overcomes a drawback of most local PCA models, where the dimensionality of a cluster must be fixed a priori. Experimental results are presented to show the performance of the network with multispectral image data.
NASA Astrophysics Data System (ADS)
Chinga-Carrasco, Gary
2011-06-01
During the last decade, major efforts have been made to develop adequate and commercially viable processes for disintegrating cellulose fibres into their structural components. Homogenisation of cellulose fibres has been one of the principal applied procedures. Homogenisation has produced materials which may be inhomogeneous, containing fibres, fibres fragments, fibrillar fines and nanofibrils. The material has been denominated microfibrillated cellulose (MFC). In addition, terms relating to the nano-scale have been given to the MFC material. Several modern and high-tech nano-applications have been envisaged for MFC. However, is MFC a nano-structure? It is concluded that MFC materials may be composed of (1) nanofibrils, (2) fibrillar fines, (3) fibre fragments and (4) fibres. This implies that MFC is not necessarily synonymous with nanofibrils, microfibrils or any other cellulose nano-structure. However, properly produced MFC materials contain nano-structures as a main component, i.e. nanofibrils.
Hainsworth, S V; Fitzpatrick, M E
2007-06-01
Forensic engineering is the application of engineering principles or techniques to the investigation of materials, products, structures or components that fail or do not perform as intended. In particular, forensic engineering can involve providing solutions to forensic problems by the application of engineering science. A criminal aspect may be involved in the investigation but often the problems are related to negligence, breach of contract, or providing information needed in the redesign of a product to eliminate future failures. Forensic engineering may include the investigation of the physical causes of accidents or other sources of claims and litigation (for example, patent disputes). It involves the preparation of technical engineering reports, and may require giving testimony and providing advice to assist in the resolution of disputes affecting life or property.This paper reviews the principal methods available for the analysis of failed components and then gives examples of different component failure modes through selected case studies.
A principal components model of soundscape perception.
Axelsson, Östen; Nilsson, Mats E; Berglund, Birgitta
2010-11-01
There is a need for a model that identifies underlying dimensions of soundscape perception, and which may guide measurement and improvement of soundscape quality. With the purpose to develop such a model, a listening experiment was conducted. One hundred listeners measured 50 excerpts of binaural recordings of urban outdoor soundscapes on 116 attribute scales. The average attribute scale values were subjected to principal components analysis, resulting in three components: Pleasantness, eventfulness, and familiarity, explaining 50, 18 and 6% of the total variance, respectively. The principal-component scores were correlated with physical soundscape properties, including categories of dominant sounds and acoustic variables. Soundscape excerpts dominated by technological sounds were found to be unpleasant, whereas soundscape excerpts dominated by natural sounds were pleasant, and soundscape excerpts dominated by human sounds were eventful. These relationships remained after controlling for the overall soundscape loudness (Zwicker's N(10)), which shows that 'informational' properties are substantial contributors to the perception of soundscape. The proposed principal components model provides a framework for future soundscape research and practice. In particular, it suggests which basic dimensions are necessary to measure, how to measure them by a defined set of attribute scales, and how to promote high-quality soundscapes.
Confirmatory Factor Analysis of the Delirium Rating Scale Revised-98 (DRS-R98).
Thurber, Steven; Kishi, Yasuhiro; Trzepacz, Paula T; Franco, Jose G; Meagher, David J; Lee, Yanghyun; Kim, Jeong-Lan; Furlanetto, Leticia M; Negreiros, Daniel; Huang, Ming-Chyi; Chen, Chun-Hsin; Kean, Jacob; Leonard, Maeve
2015-01-01
Principal components analysis applied to the Delirium Rating Scale-Revised-98 contributes to understanding the delirium construct. Using a multisite pooled international delirium database, the authors applied confirmatory factor analysis to Delirium Rating Scale-Revised-98 scores from 859 adult patients evaluated by delirium experts (delirium, N=516; nondelirium, N=343). Confirmatory factor analysis found all diagnostic features and core symptoms (cognitive, language, thought process, sleep-wake cycle, motor retardation), except motor agitation, loaded onto factor 1. Motor agitation loaded onto factor 2 with noncore symptoms (delusions, affective lability, and perceptual disturbances). Factor 1 loading supports delirium as a single construct, but when accompanied by psychosis, motor agitation's role may not be solely as a circadian activity indicator.
NASA Astrophysics Data System (ADS)
Liu, Wen; Zhang, Yuying; Yang, Si; Han, Donghai
2018-05-01
A new technique to identify the floral resources of honeys is demanded. Terahertz time-domain attenuated total reflection spectroscopy combined with chemometrics methods was applied to discriminate different categorizes (Medlar honey, Vitex honey, and Acacia honey). Principal component analysis (PCA), cluster analysis (CA) and partial least squares-discriminant analysis (PLS-DA) have been used to find information of the botanical origins of honeys. Spectral range also was discussed to increase the precision of PLS-DA model. The accuracy of 88.46% for validation set was obtained, using PLS-DA model in 0.5-1.5 THz. This work indicated terahertz time-domain attenuated total reflection spectroscopy was an available approach to evaluate the quality of honey rapidly.
Data on Support Vector Machines (SVM) model to forecast photovoltaic power.
Malvoni, M; De Giorgi, M G; Congedo, P M
2016-12-01
The data concern the photovoltaic (PV) power, forecasted by a hybrid model that considers weather variations and applies a technique to reduce the input data size, as presented in the paper entitled "Photovoltaic forecast based on hybrid pca-lssvm using dimensionality reducted data" (M. Malvoni, M.G. De Giorgi, P.M. Congedo, 2015) [1]. The quadratic Renyi entropy criteria together with the principal component analysis (PCA) are applied to the Least Squares Support Vector Machines (LS-SVM) to predict the PV power in the day-ahead time frame. The data here shared represent the proposed approach results. Hourly PV power predictions for 1,3,6,12, 24 ahead hours and for different data reduction sizes are provided in Supplementary material.
SAS program for quantitative stratigraphic correlation by principal components
Hohn, M.E.
1985-01-01
A SAS program is presented which constructs a composite section of stratigraphic events through principal components analysis. The variables in the analysis are stratigraphic sections and the observational units are range limits of taxa. The program standardizes data in each section, extracts eigenvectors, estimates missing range limits, and computes the composite section from scores of events on the first principal component. Provided is an option of several types of diagnostic plots; these help one to determine conservative range limits or unrealistic estimates of missing values. Inspection of the graphs and eigenvalues allow one to evaluate goodness of fit between the composite and measured data. The program is extended easily to the creation of a rank-order composite. ?? 1985.
NASA Astrophysics Data System (ADS)
Werth, Alexandra; Liakat, Sabbir; Dong, Anqi; Woods, Callie M.; Gmachl, Claire F.
2018-05-01
An integrating sphere is used to enhance the collection of backscattered light in a noninvasive glucose sensor based on quantum cascade laser spectroscopy. The sphere enhances signal stability by roughly an order of magnitude, allowing us to use a thermoelectrically (TE) cooled detector while maintaining comparable glucose prediction accuracy levels. Using a smaller TE-cooled detector reduces form factor, creating a mobile sensor. Principal component analysis has predicted principal components of spectra taken from human subjects that closely match the absorption peaks of glucose. These principal components are used as regressors in a linear regression algorithm to make glucose concentration predictions, over 75% of which are clinically accurate.
Principals' Perceptions of Collegial Support as a Component of Administrative Inservice.
ERIC Educational Resources Information Center
Daresh, John C.
To address the problem of increasing professional isolation of building administrators, the Principals' Inservice Project helps establish principals' collegial support groups across the nation. The groups are typically composed of 6 to 10 principals who meet at least once each month over a 2-year period. One collegial support group of seven…
Training the Trainers: Learning to Be a Principal Supervisor
ERIC Educational Resources Information Center
Saltzman, Amy
2017-01-01
While most principal supervisors are former principals themselves, few come to the role with specific training in how to do the job effectively. For this reason, both the Washington, D.C., and Tulsa, Oklahoma, principal supervisor programs include a strong professional development component. In this article, the author takes a look inside these…
NASA Astrophysics Data System (ADS)
Mao, Hanling; Zhang, Yuhua; Mao, Hanying; Li, Xinxin; Huang, Zhenfeng
2018-06-01
This paper presents the study of applying the nonlinear ultrasonic wave to evaluate the stress state of metallic materials under steady state. The pre-stress loading method is applied to guarantee components with steady stress. Three kinds of nonlinear ultrasonic experiments based on critically refracted longitudinal wave are conducted on components which the critically refracted longitudinal wave propagates along x, x1 and x2 direction. Experimental results indicate the second and third order relative nonlinear coefficients monotonically increase with stress, and the normalized relationship is consistent with simplified dislocation models, which indicates the experimental result is logical. The combined ultrasonic nonlinear parameter is proposed, and three stress evaluation models at x direction are established based on three ultrasonic nonlinear parameters, which the estimation error is below 5%. Then two stress detection models at x1 and x2 direction are built based on combined ultrasonic nonlinear parameter, the stress synthesis method is applied to calculate the magnitude and direction of principal stress. The results show the prediction error is within 5% and the angle deviation is within 1.5°. Therefore the nonlinear ultrasonic technique based on LCR wave could be applied to nondestructively evaluate the stress of metallic materials under steady state which the magnitude and direction are included.
ERIC Educational Resources Information Center
Rodrigue, Christine M.
2011-01-01
This paper presents a laboratory exercise used to teach principal components analysis (PCA) as a means of surface zonation. The lab was built around abundance data for 16 oxides and elements collected by the Mars Exploration Rover Spirit in Gusev Crater between Sol 14 and Sol 470. Students used PCA to reduce 15 of these into 3 components, which,…
ERIC Educational Resources Information Center
Ackermann, Margot Elise; Morrow, Jennifer Ann
2008-01-01
The present study describes the development and initial validation of the Coping with the College Environment Scale (CWCES). Participants included 433 college students who took an online survey. Principal Components Analysis (PCA) revealed six coping strategies: planning and self-management, seeking support from institutional resources, escaping…
NASA Astrophysics Data System (ADS)
Kistenev, Yu. V.; Shapovalov, A. V.; Borisov, A. V.; Vrazhnov, D. A.; Nikolaev, V. V.; Nikiforova, O. Yu.
2015-11-01
The comparison results of different mother wavelets used for de-noising of model and experimental data which were presented by profiles of absorption spectra of exhaled air are presented. The impact of wavelets de-noising on classification quality made by principal component analysis are also discussed.
Evaluation of skin melanoma in spectral range 450-950 nm using principal component analysis
NASA Astrophysics Data System (ADS)
Jakovels, D.; Lihacova, I.; Kuzmina, I.; Spigulis, J.
2013-06-01
Diagnostic potential of principal component analysis (PCA) of multi-spectral imaging data in the wavelength range 450- 950 nm for distant skin melanoma recognition is discussed. Processing of the measured clinical data by means of PCA resulted in clear separation between malignant melanomas and pigmented nevi.
ERIC Educational Resources Information Center
Linting, Marielle; Meulman, Jacqueline J.; Groenen, Patrick J. F.; van der Kooij, Anita J.
2007-01-01
Principal components analysis (PCA) is used to explore the structure of data sets containing linearly related numeric variables. Alternatively, nonlinear PCA can handle possibly nonlinearly related numeric as well as nonnumeric variables. For linear PCA, the stability of its solution can be established under the assumption of multivariate…
40 CFR 60.2998 - What are the principal components of the model rule?
Code of Federal Regulations, 2012 CFR
2012-07-01
... the model rule? 60.2998 Section 60.2998 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) AIR PROGRAMS (CONTINUED) STANDARDS OF PERFORMANCE FOR NEW STATIONARY SOURCES Emission Guidelines... December 9, 2004 Model Rule-Use of Model Rule § 60.2998 What are the principal components of the model rule...
40 CFR 60.2998 - What are the principal components of the model rule?
Code of Federal Regulations, 2014 CFR
2014-07-01
... the model rule? 60.2998 Section 60.2998 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) AIR PROGRAMS (CONTINUED) STANDARDS OF PERFORMANCE FOR NEW STATIONARY SOURCES Emission Guidelines... December 9, 2004 Model Rule-Use of Model Rule § 60.2998 What are the principal components of the model rule...
40 CFR 60.2998 - What are the principal components of the model rule?
Code of Federal Regulations, 2011 CFR
2011-07-01
... the model rule? 60.2998 Section 60.2998 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) AIR PROGRAMS (CONTINUED) STANDARDS OF PERFORMANCE FOR NEW STATIONARY SOURCES Emission Guidelines... December 9, 2004 Model Rule-Use of Model Rule § 60.2998 What are the principal components of the model rule...
40 CFR 60.1580 - What are the principal components of the model rule?
Code of Federal Regulations, 2010 CFR
2010-07-01
... the model rule? 60.1580 Section 60.1580 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) AIR PROGRAMS (CONTINUED) STANDARDS OF PERFORMANCE FOR NEW STATIONARY SOURCES Emission Guidelines..., 1999 Use of Model Rule § 60.1580 What are the principal components of the model rule? The model rule...
40 CFR 60.2998 - What are the principal components of the model rule?
Code of Federal Regulations, 2013 CFR
2013-07-01
... the model rule? 60.2998 Section 60.2998 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) AIR PROGRAMS (CONTINUED) STANDARDS OF PERFORMANCE FOR NEW STATIONARY SOURCES Emission Guidelines... December 9, 2004 Model Rule-Use of Model Rule § 60.2998 What are the principal components of the model rule...
Students' Perceptions of Teaching and Learning Practices: A Principal Component Approach
ERIC Educational Resources Information Center
Mukorera, Sophia; Nyatanga, Phocenah
2017-01-01
Students' attendance and engagement with teaching and learning practices is perceived as a critical element for academic performance. Even with stipulated attendance policies, students still choose not to engage. The study employed a principal component analysis to analyze first- and second-year students' perceptions of the importance of the 12…
ERIC Educational Resources Information Center
Hunley-Jenkins, Keisha Janine
2012-01-01
This qualitative study explores large, urban, mid-western principal perspectives about cyberbullying and the policy components and practices that they have found effective and ineffective at reducing its occurrence and/or negative effect on their schools' learning environments. More specifically, the researcher was interested in learning more…
Learning Principal Component Analysis by Using Data from Air Quality Networks
ERIC Educational Resources Information Center
Perez-Arribas, Luis Vicente; Leon-González, María Eugenia; Rosales-Conrado, Noelia
2017-01-01
With the final objective of using computational and chemometrics tools in the chemistry studies, this paper shows the methodology and interpretation of the Principal Component Analysis (PCA) using pollution data from different cities. This paper describes how students can obtain data on air quality and process such data for additional information…
Applications of Nonlinear Principal Components Analysis to Behavioral Data.
ERIC Educational Resources Information Center
Hicks, Marilyn Maginley
1981-01-01
An empirical investigation of the statistical procedure entitled nonlinear principal components analysis was conducted on a known equation and on measurement data in order to demonstrate the procedure and examine its potential usefulness. This method was suggested by R. Gnanadesikan and based on an early paper of Karl Pearson. (Author/AL)
ERIC Educational Resources Information Center
Hendrix, Dean
2010-01-01
This study analyzed 2005-2006 Web of Science bibliometric data from institutions belonging to the Association of Research Libraries (ARL) and corresponding ARL statistics to find any associations between indicators from the two data sets. Principal components analysis on 36 variables from 103 universities revealed obvious associations between…
Principal component analysis for protein folding dynamics.
Maisuradze, Gia G; Liwo, Adam; Scheraga, Harold A
2009-01-09
Protein folding is considered here by studying the dynamics of the folding of the triple beta-strand WW domain from the Formin-binding protein 28. Starting from the unfolded state and ending either in the native or nonnative conformational states, trajectories are generated with the coarse-grained united residue (UNRES) force field. The effectiveness of principal components analysis (PCA), an already established mathematical technique for finding global, correlated motions in atomic simulations of proteins, is evaluated here for coarse-grained trajectories. The problems related to PCA and their solutions are discussed. The folding and nonfolding of proteins are examined with free-energy landscapes. Detailed analyses of many folding and nonfolding trajectories at different temperatures show that PCA is very efficient for characterizing the general folding and nonfolding features of proteins. It is shown that the first principal component captures and describes in detail the dynamics of a system. Anomalous diffusion in the folding/nonfolding dynamics is examined by the mean-square displacement (MSD) and the fractional diffusion and fractional kinetic equations. The collisionless (or ballistic) behavior of a polypeptide undergoing Brownian motion along the first few principal components is accounted for.
Principal Component 2-D Long Short-Term Memory for Font Recognition on Single Chinese Characters.
Tao, Dapeng; Lin, Xu; Jin, Lianwen; Li, Xuelong
2016-03-01
Chinese character font recognition (CCFR) has received increasing attention as the intelligent applications based on optical character recognition becomes popular. However, traditional CCFR systems do not handle noisy data effectively. By analyzing in detail the basic strokes of Chinese characters, we propose that font recognition on a single Chinese character is a sequence classification problem, which can be effectively solved by recurrent neural networks. For robust CCFR, we integrate a principal component convolution layer with the 2-D long short-term memory (2DLSTM) and develop principal component 2DLSTM (PC-2DLSTM) algorithm. PC-2DLSTM considers two aspects: 1) the principal component layer convolution operation helps remove the noise and get a rational and complete font information and 2) simultaneously, 2DLSTM deals with the long-range contextual processing along scan directions that can contribute to capture the contrast between character trajectory and background. Experiments using the frequently used CCFR dataset suggest the effectiveness of PC-2DLSTM compared with other state-of-the-art font recognition methods.
Yuan, Yuan-Yuan; Zhou, Yu-Bi; Sun, Jing; Deng, Juan; Bai, Ying; Wang, Jie; Lu, Xue-Feng
2017-06-01
The content of elements in fifteen different regions of Nitraria roborowskii samples were determined by inductively coupled plasma-atomic emission spectrometry(ICP-OES), and its elemental characteristics were analyzed by principal component analysis. The results indicated that 18 mineral elements were detected in N. roborowskii of which V cannot be detected. In addition, contents of Na, K and Ca showed high concentration. Ti showed maximum content variance, while K is minimum. Four principal components were gained from the original data. The cumulative variance contribution rate is 81.542% and the variance contribution of the first principal component was 44.997%, indicating that Cr, Fe, P and Ca were the characteristic elements of N. roborowskii.Thus, the established method was simple, precise and can be used for determination of mineral elements in N.roborowskii Kom. fruits. The elemental distribution characteristics among N.roborowskii fruits are related to geographical origins which were clearly revealed by PCA. All the results will provide good basis for comprehensive utilization of N.roborowskii. Copyright© by the Chinese Pharmaceutical Association.
Lü, Gui-Cai; Zhao, Wei-Hong; Wang, Jiang-Tao
2011-01-01
The identification techniques for 10 species of red tide algae often found in the coastal areas of China were developed by combining the three-dimensional fluorescence spectra of fluorescence dissolved organic matter (FDOM) from the cultured red tide algae with principal component analysis. Based on the results of principal component analysis, the first principal component loading spectrum of three-dimensional fluorescence spectrum was chosen as the identification characteristic spectrum for red tide algae, and the phytoplankton fluorescence characteristic spectrum band was established. Then the 10 algae species were tested using Bayesian discriminant analysis with a correct identification rate of more than 92% for Pyrrophyta on the level of species, and that of more than 75% for Bacillariophyta on the level of genus in which the correct identification rates were more than 90% for the phaeodactylum and chaetoceros. The results showed that the identification techniques for 10 species of red tide algae based on the three-dimensional fluorescence spectra of FDOM from the cultured red tide algae and principal component analysis could work well.
NASA Astrophysics Data System (ADS)
Ji, Yi; Sun, Shanlin; Xie, Hong-Bo
2017-06-01
Discrete wavelet transform (WT) followed by principal component analysis (PCA) has been a powerful approach for the analysis of biomedical signals. Wavelet coefficients at various scales and channels were usually transformed into a one-dimensional array, causing issues such as the curse of dimensionality dilemma and small sample size problem. In addition, lack of time-shift invariance of WT coefficients can be modeled as noise and degrades the classifier performance. In this study, we present a stationary wavelet-based two-directional two-dimensional principal component analysis (SW2D2PCA) method for the efficient and effective extraction of essential feature information from signals. Time-invariant multi-scale matrices are constructed in the first step. The two-directional two-dimensional principal component analysis then operates on the multi-scale matrices to reduce the dimension, rather than vectors in conventional PCA. Results are presented from an experiment to classify eight hand motions using 4-channel electromyographic (EMG) signals recorded in healthy subjects and amputees, which illustrates the efficiency and effectiveness of the proposed method for biomedical signal analysis.
Hyperspectral optical imaging of human iris in vivo: characteristics of reflectance spectra
NASA Astrophysics Data System (ADS)
Medina, José M.; Pereira, Luís M.; Correia, Hélder T.; Nascimento, Sérgio M. C.
2011-07-01
We report a hyperspectral imaging system to measure the reflectance spectra of real human irises with high spatial resolution. A set of ocular prosthesis was used as the control condition. Reflectance data were decorrelated by the principal-component analysis. The main conclusion is that spectral complexity of the human iris is considerable: between 9 and 11 principal components are necessary to account for 99% of the cumulative variance in human irises. Correcting image misalignments associated with spontaneous ocular movements did not influence this result. The data also suggests a correlation between the first principal component and different levels of melanin present in the irises. It was also found that although the spectral characteristics of the first five principal components were not affected by the radial and angular position of the selected iridal areas, they affect the higher-order ones, suggesting a possible influence of the iris texture. The results show that hyperspectral imaging in the iris, together with adequate spectroscopic analyses provide more information than conventional colorimetric methods, making hyperspectral imaging suitable for the characterization of melanin and the noninvasive diagnosis of ocular diseases and iris color.
Seeing wholes: The concept of systems thinking and its implementation in school leadership
NASA Astrophysics Data System (ADS)
Shaked, Haim; Schechter, Chen
2013-12-01
Systems thinking (ST) is an approach advocating thinking about any given issue as a whole, emphasising the interrelationships between its components rather than the components themselves. This article aims to link ST and school leadership, claiming that ST may enable school principals to develop highly performing schools that can cope successfully with current challenges, which are more complex than ever before in today's era of accountability and high expectations. The article presents the concept of ST - its definition, components, history and applications. Thereafter, its connection to education and its contribution to school management are described. The article concludes by discussing practical processes including screening for ST-skilled principal candidates and developing ST skills among prospective and currently performing school principals, pinpointing three opportunities for skills acquisition: during preparatory programmes; during their first years on the job, supported by veteran school principals as mentors; and throughout their entire career. Such opportunities may not only provide school principals with ST skills but also improve their functioning throughout the aforementioned stages of professional development.
Temporal evolution of financial-market correlations.
Fenn, Daniel J; Porter, Mason A; Williams, Stacy; McDonald, Mark; Johnson, Neil F; Jones, Nick S
2011-08-01
We investigate financial market correlations using random matrix theory and principal component analysis. We use random matrix theory to demonstrate that correlation matrices of asset price changes contain structure that is incompatible with uncorrelated random price changes. We then identify the principal components of these correlation matrices and demonstrate that a small number of components accounts for a large proportion of the variability of the markets that we consider. We characterize the time-evolving relationships between the different assets by investigating the correlations between the asset price time series and principal components. Using this approach, we uncover notable changes that occurred in financial markets and identify the assets that were significantly affected by these changes. We show in particular that there was an increase in the strength of the relationships between several different markets following the 2007-2008 credit and liquidity crisis.
Temporal evolution of financial-market correlations
NASA Astrophysics Data System (ADS)
Fenn, Daniel J.; Porter, Mason A.; Williams, Stacy; McDonald, Mark; Johnson, Neil F.; Jones, Nick S.
2011-08-01
We investigate financial market correlations using random matrix theory and principal component analysis. We use random matrix theory to demonstrate that correlation matrices of asset price changes contain structure that is incompatible with uncorrelated random price changes. We then identify the principal components of these correlation matrices and demonstrate that a small number of components accounts for a large proportion of the variability of the markets that we consider. We characterize the time-evolving relationships between the different assets by investigating the correlations between the asset price time series and principal components. Using this approach, we uncover notable changes that occurred in financial markets and identify the assets that were significantly affected by these changes. We show in particular that there was an increase in the strength of the relationships between several different markets following the 2007-2008 credit and liquidity crisis.
TOF-SIMS imaging technique with information entropy
NASA Astrophysics Data System (ADS)
Aoyagi, Satoka; Kawashima, Y.; Kudo, Masahiro
2005-05-01
Time-of-flight secondary ion mass spectrometry (TOF-SIMS) is capable of chemical imaging of proteins on insulated samples in principal. However, selection of specific peaks related to a particular protein, which are necessary for chemical imaging, out of numerous candidates had been difficult without an appropriate spectrum analysis technique. Therefore multivariate analysis techniques, such as principal component analysis (PCA), and analysis with mutual information defined by information theory, have been applied to interpret SIMS spectra of protein samples. In this study mutual information was applied to select specific peaks related to proteins in order to obtain chemical images. Proteins on insulated materials were measured with TOF-SIMS and then SIMS spectra were analyzed by means of the analysis method based on the comparison using mutual information. Chemical mapping of each protein was obtained using specific peaks related to each protein selected based on values of mutual information. The results of TOF-SIMS images of proteins on the materials provide some useful information on properties of protein adsorption, optimality of immobilization processes and reaction between proteins. Thus chemical images of proteins by TOF-SIMS contribute to understand interactions between material surfaces and proteins and to develop sophisticated biomaterials.
Upadhyay, Neelam; Jaiswal, Pranita; Jha, Shyam Narayan
2016-10-01
Ghee forms an important component of the diet of human beings due to its rich flavor and high nutritive value. This high priced fat is prone to adulteration with cheaper fats. ATR-FTIR spectroscopy coupled with chemometrics was applied for determining the presence of goat body fat in ghee (@1, 3, 5, 10, 15 and 20% level in the laboratory made/spiked samples). The spectra of pure (ghee and goat body fat) and spiked samples were taken in the wavenumber range of 4000-500 cm -1 . Separated clusters of pure ghee and spiked samples were obtained on applying principal component analysis at 5% level of significance in the selected wavenumber range (1786-1680, 1490-919 and 1260-1040 cm -1 ). SIMCA was applied for classification of samples and pure ghee showed 100% classification efficiency. The value of R 2 was found to be >0.99 for calibration and validation sets using partial least square method at all the selected wavenumber range which indicate that the model was well developed. The study revealed that the spiked samples of goat body fat could be detected even at 1% level in ghee.
Xiao, Keke; Chen, Yun; Jiang, Xie; Zhou, Yan
2017-03-01
An investigation was conducted for 20 different types of sludge in order to identify the key organic compounds in extracellular polymeric substances (EPS) that are important in assessing variations of sludge filterability. The different types of sludge varied in initial total solids (TS) content, organic composition and pre-treatment methods. For instance, some of the sludges were pre-treated by acid, ultrasonic, thermal, alkaline, or advanced oxidation technique. The Pearson's correlation results showed significant correlations between sludge filterability and zeta potential, pH, dissolved organic carbon, protein and polysaccharide in soluble EPS (SB EPS), loosely bound EPS (LB EPS) and tightly bound EPS (TB EPS). The principal component analysis (PCA) method was used to further explore correlations between variables and similarities among EPS fractions of different types of sludge. Two principal components were extracted: principal component 1 accounted for 59.24% of total EPS variations, while principal component 2 accounted for 25.46% of total EPS variations. Dissolved organic carbon, protein and polysaccharide in LB EPS showed higher eigenvector projection values than the corresponding compounds in SB EPS and TB EPS in principal component 1. Further characterization of fractionized key organic compounds in LB EPS was conducted with size-exclusion chromatography-organic carbon detection-organic nitrogen detection (LC-OCD-OND). A numerical multiple linear regression model was established to describe relationship between organic compounds in LB EPS and sludge filterability. Copyright © 2016 Elsevier Ltd. All rights reserved.
QSAR modeling of flotation collectors using principal components extracted from topological indices.
Natarajan, R; Nirdosh, Inderjit; Basak, Subhash C; Mills, Denise R
2002-01-01
Several topological indices were calculated for substituted-cupferrons that were tested as collectors for the froth flotation of uranium. The principal component analysis (PCA) was used for data reduction. Seven principal components (PC) were found to account for 98.6% of the variance among the computed indices. The principal components thus extracted were used in stepwise regression analyses to construct regression models for the prediction of separation efficiencies (Es) of the collectors. A two-parameter model with a correlation coefficient of 0.889 and a three-parameter model with a correlation coefficient of 0.913 were formed. PCs were found to be better than partition coefficient to form regression equations, and inclusion of an electronic parameter such as Hammett sigma or quantum mechanically derived electronic charges on the chelating atoms did not improve the correlation coefficient significantly. The method was extended to model the separation efficiencies of mercaptobenzothiazoles (MBT) and aminothiophenols (ATP) used in the flotation of lead and zinc ores, respectively. Five principal components were found to explain 99% of the data variability in each series. A three-parameter equation with correlation coefficient of 0.985 and a two-parameter equation with correlation coefficient of 0.926 were obtained for MBT and ATP, respectively. The amenability of separation efficiencies of chelating collectors to QSAR modeling using PCs based on topological indices might lead to the selection of collectors for synthesis and testing from a virtual database.
A principal components approach to parent-to-newborn body composition associations in South India
Veena, Sargoor R; Krishnaveni, Ghattu V; Wills, Andrew K; Hill, Jacqueline C; Fall, Caroline HD
2009-01-01
Background Size at birth is influenced by environmental factors, like maternal nutrition and parity, and by genes. Birth weight is a composite measure, encompassing bone, fat and lean mass. These may have different determinants. The main purpose of this paper was to use anthropometry and principal components analysis (PCA) to describe maternal and newborn body composition, and associations between them, in an Indian population. We also compared maternal and paternal measurements (body mass index (BMI) and height) as predictors of newborn body composition. Methods Weight, height, head and mid-arm circumferences, skinfold thicknesses and external pelvic diameters were measured at 30 ± 2 weeks gestation in 571 pregnant women attending the antenatal clinic of the Holdsworth Memorial Hospital, Mysore, India. Paternal height and weight were also measured. At birth, detailed neonatal anthropometry was performed. Unrotated and varimax rotated PCA was applied to the maternal and neonatal measurements. Results Rotated PCA reduced maternal measurements to 4 independent components (fat, pelvis, height and muscle) and neonatal measurements to 3 components (trunk+head, fat, and leg length). An SD increase in maternal fat was associated with a 0.16 SD increase (β) in neonatal fat (p < 0.001, adjusted for gestation, maternal parity, newborn sex and socio-economic status). Maternal pelvis, height and (for male babies) muscle predicted neonatal trunk+head (β = 0. 09 SD; p = 0.017, β = 0.12 SD; p = 0.006 and β = 0.27 SD; p < 0.001). In the mother-baby and father-baby comparison, maternal BMI predicted neonatal fat (β = 0.20 SD; p < 0.001) and neonatal trunk+head (β = 0.15 SD; p = 0.001). Both maternal (β = 0.12 SD; p = 0.002) and paternal height (β = 0.09 SD; p = 0.030) predicted neonatal trunk+head but the associations became weak and statistically non-significant in multivariate analysis. Only paternal height predicted neonatal leg length (β = 0.15 SD; p = 0.003). Conclusion Principal components analysis is a useful method to describe neonatal body composition and its determinants. Newborn adiposity is related to maternal nutritional status and parity, while newborn length is genetically determined. Further research is needed to understand mechanisms linking maternal pelvic size to fetal growth and the determinants and implications of the components (trunk v leg length) of fetal skeletal growth. PMID:19236724
NASA Astrophysics Data System (ADS)
Heikkilä, U.; Shi, X.; Phipps, S. J.; Smith, A. M.
2014-04-01
This study investigates the effect of deglacial climate on the deposition of the solar proxy 10Be globally, and at two specific locations, the GRIP site at Summit, Central Greenland, and the Law Dome site in coastal Antarctica. The deglacial climate is represented by three 30 year time slice simulations of 10 000 BP (years before present = 1950 CE), 11 000 and 12 000 BP, compared with a preindustrial control simulation. The model used is the ECHAM5-HAM atmospheric aerosol-climate model, driven with sea-surface temperatures and sea ice cover simulated using the CSIRO Mk3L coupled climate system model. The focus is on isolating the 10Be production signal, driven by solar variability, from the weather- or climate-driven noise in the 10Be deposition flux during different stages of climate. The production signal varies at lower frequencies, dominated by the 11 year solar cycle within the 30 year timescale of these experiments. The climatic noise is of higher frequencies than 11 years during the 30 year period studied. We first apply empirical orthogonal function (EOF) analysis to global 10Be deposition on the annual scale and find that the first principal component, consisting of the spatial pattern of mean 10Be deposition and the temporally varying solar signal, explains 64% of the variability. The following principal components are closely related to those of precipitation. Then, we apply ensemble empirical decomposition (EEMD) analysis to the time series of 10Be deposition at GRIP and at Law Dome, which is an effective method for adaptively decomposing the time series into different frequency components. The low-frequency components and the long-term trend represent production and have reduced noise compared to the entire frequency spectrum of the deposition. The high-frequency components represent climate-driven noise related to the seasonal cycle of e.g. precipitation and are closely connected to high frequencies of precipitation. These results firstly show that the 10Be atmospheric production signal is preserved in the deposition flux to surface even during climates very different from today's both in global data and at two specific locations. Secondly, noise can be effectively reduced from 10Be deposition data by simply applying the EOF analysis in the case of a reasonably large number of available data sets, or by decomposing the individual data sets to filter out high-frequency fluctuations.
Principal component reconstruction (PCR) for cine CBCT with motion learning from 2D fluoroscopy.
Gao, Hao; Zhang, Yawei; Ren, Lei; Yin, Fang-Fang
2018-01-01
This work aims to generate cine CT images (i.e., 4D images with high-temporal resolution) based on a novel principal component reconstruction (PCR) technique with motion learning from 2D fluoroscopic training images. In the proposed PCR method, the matrix factorization is utilized as an explicit low-rank regularization of 4D images that are represented as a product of spatial principal components and temporal motion coefficients. The key hypothesis of PCR is that temporal coefficients from 4D images can be reasonably approximated by temporal coefficients learned from 2D fluoroscopic training projections. For this purpose, we can acquire fluoroscopic training projections for a few breathing periods at fixed gantry angles that are free from geometric distortion due to gantry rotation, that is, fluoroscopy-based motion learning. Such training projections can provide an effective characterization of the breathing motion. The temporal coefficients can be extracted from these training projections and used as priors for PCR, even though principal components from training projections are certainly not the same for these 4D images to be reconstructed. For this purpose, training data are synchronized with reconstruction data using identical real-time breathing position intervals for projection binning. In terms of image reconstruction, with a priori temporal coefficients, the data fidelity for PCR changes from nonlinear to linear, and consequently, the PCR method is robust and can be solved efficiently. PCR is formulated as a convex optimization problem with the sum of linear data fidelity with respect to spatial principal components and spatiotemporal total variation regularization imposed on 4D image phases. The solution algorithm of PCR is developed based on alternating direction method of multipliers. The implementation is fully parallelized on GPU with NVIDIA CUDA toolbox and each reconstruction takes about a few minutes. The proposed PCR method is validated and compared with a state-of-art method, that is, PICCS, using both simulation and experimental data with the on-board cone-beam CT setting. The results demonstrated the feasibility of PCR for cine CBCT and significantly improved reconstruction quality of PCR from PICCS for cine CBCT. With a priori estimated temporal motion coefficients using fluoroscopic training projections, the PCR method can accurately reconstruct spatial principal components, and then generate cine CT images as a product of temporal motion coefficients and spatial principal components. © 2017 American Association of Physicists in Medicine.
The quality of social relationships in ravens
Fraser, Orlaith N.; Bugnyar, Thomas
2015-01-01
The quality of a social relationship represents the history of past social interactions between two individuals, from which the nature and outcome of future interactions can be predicted. Current theory predicts that relationship quality comprises three separate components, its value, compatibility and security. This study is the first to investigate the components of relationship quality in a large-brained bird. Following methods recently used to obtain quantitative measures of each relationship quality component in chimpanzees, Pan troglodytes, we entered data on seven behavioural variables from a group of 11 ravens, Corvus corax, into a principal components analysis. The characteristics of the extracted components matched those predicted for value, compatibility and security, and were labelled as such. When the effects of kinship and sex combination on each relationship quality component were analysed, we found that kin had more valuable relationships, whereas females had less secure and compatible relationships, although the effect of sex combination on compatibility only applied to nonkin. These patterns are consistent with what little knowledge we have of raven relationships from aviary studies and show that the components of relationship quality in ravens may indeed be analogous to those in chimpanzees. PMID:25821236
NASA Astrophysics Data System (ADS)
Alevizos, Evangelos; Snellen, Mirjam; Simons, Dick; Siemes, Kerstin; Greinert, Jens
2018-06-01
This study applies three classification methods exploiting the angular dependence of acoustic seafloor backscatter along with high resolution sub-bottom profiling for seafloor sediment characterization in the Eckernförde Bay, Baltic Sea Germany. This area is well suited for acoustic backscatter studies due to its shallowness, its smooth bathymetry and the presence of a wide range of sediment types. Backscatter data were acquired using a Seabeam1180 (180 kHz) multibeam echosounder and sub-bottom profiler data were recorded using a SES-2000 parametric sonar transmitting 6 and 12 kHz. The high density of seafloor soundings allowed extracting backscatter layers for five beam angles over a large part of the surveyed area. A Bayesian probability method was employed for sediment classification based on the backscatter variability at a single incidence angle, whereas Maximum Likelihood Classification (MLC) and Principal Components Analysis (PCA) were applied to the multi-angle layers. The Bayesian approach was used for identifying the optimum number of acoustic classes because cluster validation is carried out prior to class assignment and class outputs are ordinal categorical values. The method is based on the principle that backscatter values from a single incidence angle express a normal distribution for a particular sediment type. The resulting Bayesian classes were well correlated to median grain sizes and the percentage of coarse material. The MLC method uses angular response information from five layers of training areas extracted from the Bayesian classification map. The subsequent PCA analysis is based on the transformation of these five layers into two principal components that comprise most of the data variability. These principal components were clustered in five classes after running an external cluster validation test. In general both methods MLC and PCA, separated the various sediment types effectively, showing good agreement (kappa >0.7) with the Bayesian approach which also correlates well with ground truth data (r2 > 0.7). In addition, sub-bottom data were used in conjunction with the Bayesian classification results to characterize acoustic classes with respect to their geological and stratigraphic interpretation. The joined interpretation of seafloor and sub-seafloor data sets proved to be an efficient approach for a better understanding of seafloor backscatter patchiness and to discriminate acoustically similar classes in different geological/bathymetric settings.
Cluster-based exposure variation analysis
2013-01-01
Background Static posture, repetitive movements and lack of physical variation are known risk factors for work-related musculoskeletal disorders, and thus needs to be properly assessed in occupational studies. The aims of this study were (i) to investigate the effectiveness of a conventional exposure variation analysis (EVA) in discriminating exposure time lines and (ii) to compare it with a new cluster-based method for analysis of exposure variation. Methods For this purpose, we simulated a repeated cyclic exposure varying within each cycle between “low” and “high” exposure levels in a “near” or “far” range, and with “low” or “high” velocities (exposure change rates). The duration of each cycle was also manipulated by selecting a “small” or “large” standard deviation of the cycle time. Theses parameters reflected three dimensions of exposure variation, i.e. range, frequency and temporal similarity. Each simulation trace included two realizations of 100 concatenated cycles with either low (ρ = 0.1), medium (ρ = 0.5) or high (ρ = 0.9) correlation between the realizations. These traces were analyzed by conventional EVA, and a novel cluster-based EVA (C-EVA). Principal component analysis (PCA) was applied on the marginal distributions of 1) the EVA of each of the realizations (univariate approach), 2) a combination of the EVA of both realizations (multivariate approach) and 3) C-EVA. The least number of principal components describing more than 90% of variability in each case was selected and the projection of marginal distributions along the selected principal component was calculated. A linear classifier was then applied to these projections to discriminate between the simulated exposure patterns, and the accuracy of classified realizations was determined. Results C-EVA classified exposures more correctly than univariate and multivariate EVA approaches; classification accuracy was 49%, 47% and 52% for EVA (univariate and multivariate), and C-EVA, respectively (p < 0.001). All three methods performed poorly in discriminating exposure patterns differing with respect to the variability in cycle time duration. Conclusion While C-EVA had a higher accuracy than conventional EVA, both failed to detect differences in temporal similarity. The data-driven optimality of data reduction and the capability of handling multiple exposure time lines in a single analysis are the advantages of the C-EVA. PMID:23557439
Dean, Jamie A; Wong, Kee H; Gay, Hiram; Welsh, Liam C; Jones, Ann-Britt; Schick, Ulrike; Oh, Jung Hun; Apte, Aditya; Newbold, Kate L; Bhide, Shreerang A; Harrington, Kevin J; Deasy, Joseph O; Nutting, Christopher M; Gulliford, Sarah L
2016-11-15
Current normal tissue complication probability modeling using logistic regression suffers from bias and high uncertainty in the presence of highly correlated radiation therapy (RT) dose data. This hinders robust estimates of dose-response associations and, hence, optimal normal tissue-sparing strategies from being elucidated. Using functional data analysis (FDA) to reduce the dimensionality of the dose data could overcome this limitation. FDA was applied to modeling of severe acute mucositis and dysphagia resulting from head and neck RT. Functional partial least squares regression (FPLS) and functional principal component analysis were used for dimensionality reduction of the dose-volume histogram data. The reduced dose data were input into functional logistic regression models (functional partial least squares-logistic regression [FPLS-LR] and functional principal component-logistic regression [FPC-LR]) along with clinical data. This approach was compared with penalized logistic regression (PLR) in terms of predictive performance and the significance of treatment covariate-response associations, assessed using bootstrapping. The area under the receiver operating characteristic curve for the PLR, FPC-LR, and FPLS-LR models was 0.65, 0.69, and 0.67, respectively, for mucositis (internal validation) and 0.81, 0.83, and 0.83, respectively, for dysphagia (external validation). The calibration slopes/intercepts for the PLR, FPC-LR, and FPLS-LR models were 1.6/-0.67, 0.45/0.47, and 0.40/0.49, respectively, for mucositis (internal validation) and 2.5/-0.96, 0.79/-0.04, and 0.79/0.00, respectively, for dysphagia (external validation). The bootstrapped odds ratios indicated significant associations between RT dose and severe toxicity in the mucositis and dysphagia FDA models. Cisplatin was significantly associated with severe dysphagia in the FDA models. None of the covariates was significantly associated with severe toxicity in the PLR models. Dose levels greater than approximately 1.0 Gy/fraction were most strongly associated with severe acute mucositis and dysphagia in the FDA models. FPLS and functional principal component analysis marginally improved predictive performance compared with PLR and provided robust dose-response associations. FDA is recommended for use in normal tissue complication probability modeling. Copyright © 2016 The Author(s). Published by Elsevier Inc. All rights reserved.
XCOM intrinsic dimensionality for low-Z elements at diagnostic energies
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bornefalk, Hans
2012-02-15
Purpose: To determine the intrinsic dimensionality of linear attenuation coefficients (LACs) from XCOM for elements with low atomic number (Z = 1-20) at diagnostic x-ray energies (25-120 keV). H{sub 0}{sup q}, the hypothesis that the space of LACs is spanned by q bases, is tested for various q-values. Methods: Principal component analysis is first applied and the LACs are projected onto the first q principal component bases. The residuals of the model values vs XCOM data are determined for all energies and atomic numbers. Heteroscedasticity invalidates the prerequisite of i.i.d. errors necessary for bootstrapping residuals. Instead wild bootstrap is applied,more » which, by not mixing residuals, allows the effect of the non-i.i.d residuals to be reflected in the result. Credible regions for the eigenvalues of the correlation matrix for the bootstrapped LAC data are determined. If subsequent credible regions for the eigenvalues overlap, the corresponding principal component is not considered to represent true data structure but noise. If this happens for eigenvalues l and l + 1, for any l{<=}q, H{sub 0}{sup q} is rejected. Results: The largest value of q for which H{sub 0}{sup q} is nonrejectable at the 5%-level is q = 4. This indicates that the statistically significant intrinsic dimensionality of low-Z XCOM data at diagnostic energies is four. Conclusions: The method presented allows determination of the statistically significant dimensionality of any noisy linear subspace. Knowledge of such significant dimensionality is of interest for any method making assumptions on intrinsic dimensionality and evaluating results on noisy reference data. For LACs, knowledge of the low-Z dimensionality might be relevant when parametrization schemes are tuned to XCOM data. For x-ray imaging techniques based on the basis decomposition method (Alvarez and Macovski, Phys. Med. Biol. 21, 733-744, 1976), an underlying dimensionality of two is commonly assigned to the LAC of human tissue at diagnostic energies. The finding of a higher statistically significant dimensionality thus raises the question whether a higher assumed model dimensionality (now feasible with the advent of multibin x-ray systems) might also be practically relevant, i.e., if better tissue characterization results can be obtained.« less
Villarrasa-Sapiña, Israel; Álvarez-Pitti, Julio; Cabeza-Ruiz, Ruth; Redón, Pau; Lurbe, Empar; García-Massó, Xavier
2018-02-01
Excess body weight during childhood causes reduced motor functionality and problems in postural control, a negative influence which has been reported in the literature. Nevertheless, no information regarding the effect of body composition on the postural control of overweight and obese children is available. The objective of this study was therefore to establish these relationships. A cross-sectional design was used to establish relationships between body composition and postural control variables obtained in bipedal eyes-open and eyes-closed conditions in twenty-two children. Centre of pressure signals were analysed in the temporal and frequency domains. Pearson correlations were applied to establish relationships between variables. Principal component analysis was applied to the body composition variables to avoid potential multicollinearity in the regression models. These principal components were used to perform a multiple linear regression analysis, from which regression models were obtained to predict postural control. Height and leg mass were the body composition variables that showed the highest correlation with postural control. Multiple regression models were also obtained and several of these models showed a higher correlation coefficient in predicting postural control than simple correlations. These models revealed that leg and trunk mass were good predictors of postural control. More equations were found in the eyes-open than eyes-closed condition. Body weight and height are negatively correlated with postural control. However, leg and trunk mass are better postural control predictors than arm or body mass. Finally, body composition variables are more useful in predicting postural control when the eyes are open. Copyright © 2017 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Dang, Thanh Duc; Cochrane, Thomas A.; Arias, Mauricio E.
2018-06-01
Temporal and spatial concentrations of suspended sediment in floodplains are difficult to quantify because in situ measurements can be logistically complex, time consuming and costly. In this research, satellite imagery with long temporal and large spatial coverage (Landsat TM/ETM+) was used to complement in situ suspended sediment measurements to reflect sediment dynamics in a large (70,000 km2) floodplain. Instead of using a single spectral band from Landsat, a Principal Component Analysis was applied to obtain uncorrelated reflectance values for five bands of Landsat TM/ETM+. Significant correlations between the scores of the 1st principal component and the values of continuously gauged suspended sediment concentration, shown via high coefficients of determination of sediment rating curves (R2 ranging from 0.66 to 0.92), permit the application of satellite images to quantify spatial and temporal sediment variation in the Mekong floodplains. Estimated suspended sediment maps show that hydraulic regimes at Chaktomuk (Cambodia), where the Mekong, Bassac, and Tonle Sap rivers diverge, determine the amount of seasonal sediment supplies to the Mekong Delta. The development of flood prevention systems to allow for three rice crops a year in the Vietnam Mekong Delta significantly reduces localized flooding, but also prevents sediment (source of nutrients) from entering fields. A direct consequence of this is the need to apply more artificial fertilizers to boost agricultural productivity, which may trigger environmental problems. Overall, remote sensing is shown to be an effective tool to understand temporal and spatial sediment dynamics in large floodplains.
Osis, Sean T; Hettinga, Blayne A; Leitch, Jessica; Ferber, Reed
2014-08-22
As 3-dimensional (3D) motion-capture for clinical gait analysis continues to evolve, new methods must be developed to improve the detection of gait cycle events based on kinematic data. Recently, the application of principal component analysis (PCA) to gait data has shown promise in detecting important biomechanical features. Therefore, the purpose of this study was to define a new foot strike detection method for a continuum of striking techniques, by applying PCA to joint angle waveforms. In accordance with Newtonian mechanics, it was hypothesized that transient features in the sagittal-plane accelerations of the lower extremity would be linked with the impulsive application of force to the foot at foot strike. Kinematic and kinetic data from treadmill running were selected for 154 subjects, from a database of gait biomechanics. Ankle, knee and hip sagittal plane angular acceleration kinematic curves were chained together to form a row input to a PCA matrix. A linear polynomial was calculated based on PCA scores, and a 10-fold cross-validation was performed to evaluate prediction accuracy against gold-standard foot strike as determined by a 10 N rise in the vertical ground reaction force. Results show 89-94% of all predicted foot strikes were within 4 frames (20 ms) of the gold standard with the largest error being 28 ms. It is concluded that this new foot strike detection is an improvement on existing methods and can be applied regardless of whether the runner exhibits a rearfoot, midfoot, or forefoot strike pattern. Copyright © 2014 Elsevier Ltd. All rights reserved.
Influence of apple pomace inclusion on the process of animal feed pelleting.
Maslovarić, Marijana D; Vukmirović, Đuro; Pezo, Lato; Čolović, Radmilo; Jovanović, Rade; Spasevski, Nedeljka; Tolimir, Nataša
2017-08-01
Apple pomace (AP) is the main by-product of apple juice production. Large amounts of this material disposed into landfills can cause serious environmental problems. One of the solutions is to utilise AP as animal feed. The aim of this study was to investigate the impact of dried AP inclusion into model mixtures made from conventional feedstuffs on pellet quality and pellet press performance. Three model mixtures, with different ratios of maize, sunflower meal and AP, were pelleted. Response surface methodology (RSM) was applied when designing the experiment. The simultaneous and interactive effects of apple pomace share (APS) in the mixtures, die thickness (DT) of the pellet press and initial moisture content of the mixtures (M), on pellet quality and production parameters were investigated. Principal component analysis (PCA) and standard score (SS) analysis were applied for comprehensive analysis of the experimental data. The increase in APS led to an improvement of pellet quality parameters: pellet durability index (PDI), hardness (H) and proportion of fines in pellets. The increase in DT and M resulted in pellet quality improvement. The increase in DT and APS resulted in higher energy consumption of the pellet press. APS was the most influential variable for PDI and H calculation, while APS and DT were the most influential variables in the calculation of pellet press energy consumption. PCA showed that the first two principal components could be considered sufficient for data representation. In conclusion, addition of dried AP to feed model mixtures significantly improved the quality of the pellets.
NASA Astrophysics Data System (ADS)
Hoell, Simon; Omenzetter, Piotr
2015-03-01
The development of large wind turbines that enable to harvest energy more efficiently is a consequence of the increasing demand for renewables in the world. To optimize the potential energy output, light and flexible wind turbine blades (WTBs) are designed. However, the higher flexibilities and lower buckling capacities adversely affect the long-term safety and reliability of WTBs, and thus the increased operation and maintenance costs reduce the expected revenue. Effective structural health monitoring techniques can help to counteract this by limiting inspection efforts and avoiding unplanned maintenance actions. Vibration-based methods deserve high attention due to the moderate instrumentation efforts and the applicability for in-service measurements. The present paper proposes the use of cross-correlations (CCs) of acceleration responses between sensors at different locations for structural damage detection in WTBs. CCs were in the past successfully applied for damage detection in numerical and experimental beam structures while utilizing only single lags between the signals. The present approach uses vectors of CC coefficients for multiple lags between measurements of two selected sensors taken from multiple possible combinations of sensors. To reduce the dimensionality of the damage sensitive feature (DSF) vectors, principal component analysis is performed. The optimal number of principal components (PCs) is chosen with respect to a statistical threshold. Finally, the detection phase uses the selected PCs of the healthy structure to calculate scores from a current DSF vector, where statistical hypothesis testing is performed for making a decision about the current structural state. The method is applied to laboratory experiments conducted on a small WTB with non-destructive damage scenarios.
NASA Astrophysics Data System (ADS)
Ni, Yongnian; Wang, Yong; Kokot, Serge
2008-10-01
A spectrophotometric method for the simultaneous determination of the important pharmaceuticals, pefloxacin and its structurally similar metabolite, norfloxacin, is described for the first time. The analysis is based on the monitoring of a kinetic spectrophotometric reaction of the two analytes with potassium permanganate as the oxidant. The measurement of the reaction process followed the absorbance decrease of potassium permanganate at 526 nm, and the accompanying increase of the product, potassium manganate, at 608 nm. It was essential to use multivariate calibrations to overcome severe spectral overlaps and similarities in reaction kinetics. Calibration curves for the individual analytes showed linear relationships over the concentration ranges of 1.0-11.5 mg L -1 at 526 and 608 nm for pefloxacin, and 0.15-1.8 mg L -1 at 526 and 608 nm for norfloxacin. Various multivariate calibration models were applied, at the two analytical wavelengths, for the simultaneous prediction of the two analytes including classical least squares (CLS), principal component regression (PCR), partial least squares (PLS), radial basis function-artificial neural network (RBF-ANN) and principal component-radial basis function-artificial neural network (PC-RBF-ANN). PLS and PC-RBF-ANN calibrations with the data collected at 526 nm, were the preferred methods—%RPE T ˜ 5, and LODs for pefloxacin and norfloxacin of 0.36 and 0.06 mg L -1, respectively. Then, the proposed method was applied successfully for the simultaneous determination of pefloxacin and norfloxacin present in pharmaceutical and human plasma samples. The results compared well with those from the alternative analysis by HPLC.
Liu, Wei; Wang, Zhen-Zhong; Qing, Jian-Ping; Li, Hong-Juan; Xiao, Wei
2014-01-01
Background: Peach kernels which contain kinds of fatty acids play an important role in the regulation of a variety of physiological and biological functions. Objective: To establish an innovative and rapid diffuse reflectance near-infrared spectroscopy (DR-NIR) analysis method along with chemometric techniques for the qualitative and quantitative determination of a peach kernel. Materials and Methods: Peach kernel samples from nine different origins were analyzed with high-performance liquid chromatography (HPLC) as a reference method. DR-NIR is in the spectral range 1100-2300 nm. Principal component analysis (PCA) and partial least squares regression (PLSR) algorithm were applied to obtain prediction models, The Savitzky-Golay derivative and first derivative were adopted for the spectral pre-processing, PCA was applied to classify the varieties of those samples. For the quantitative calibration, the models of linoleic and oleinic acids were established with the PLSR algorithm and the optimal principal component (PC) numbers were selected with leave-one-out (LOO) cross-validation. The established models were evaluated with the root mean square error of deviation (RMSED) and corresponding correlation coefficients (R2). Results: The PCA results of DR-NIR spectra yield clear classification of the two varieties of peach kernel. PLSR had a better predictive ability. The correlation coefficients of the two calibration models were above 0.99, and the RMSED of linoleic and oleinic acids were 1.266% and 1.412%, respectively. Conclusion: The DR-NIR combined with PCA and PLSR algorithm could be used efficiently to identify and quantify peach kernels and also help to solve variety problem. PMID:25422544
Taguchi, Y-h; Iwadate, Mitsuo; Umeyama, Hideaki
2015-04-30
Feature extraction (FE) is difficult, particularly if there are more features than samples, as small sample numbers often result in biased outcomes or overfitting. Furthermore, multiple sample classes often complicate FE because evaluating performance, which is usual in supervised FE, is generally harder than the two-class problem. Developing sample classification independent unsupervised methods would solve many of these problems. Two principal component analysis (PCA)-based FE, specifically, variational Bayes PCA (VBPCA) was extended to perform unsupervised FE, and together with conventional PCA (CPCA)-based unsupervised FE, were tested as sample classification independent unsupervised FE methods. VBPCA- and CPCA-based unsupervised FE both performed well when applied to simulated data, and a posttraumatic stress disorder (PTSD)-mediated heart disease data set that had multiple categorical class observations in mRNA/microRNA expression of stressed mouse heart. A critical set of PTSD miRNAs/mRNAs were identified that show aberrant expression between treatment and control samples, and significant, negative correlation with one another. Moreover, greater stability and biological feasibility than conventional supervised FE was also demonstrated. Based on the results obtained, in silico drug discovery was performed as translational validation of the methods. Our two proposed unsupervised FE methods (CPCA- and VBPCA-based) worked well on simulated data, and outperformed two conventional supervised FE methods on a real data set. Thus, these two methods have suggested equivalence for FE on categorical multiclass data sets, with potential translational utility for in silico drug discovery.
Relationships between NIR spectra and sensory attributes of Thai commercial fish sauces.
Ritthiruangdej, Pitiporn; Suwonsichon, Thongchai
2007-07-01
Twenty Thai commercial fish sauces were characterized by sensory descriptive analysis and near-infrared (NIR) spectroscopy. The main objectives were i) to investigate the relationships between sensory attributes and NIR spectra of samples and ii) to characterize the sensory characteristics of fish sauces based on NIR data. A generic descriptive analysis with 12 trained panels was used to characterize the sensory attributes. These attributes consisted of 15 descriptors: brown color, 5 aromatics (sweet, caramelized, fermented, fishy, and musty), 4 tastes (sweet, salty, bitter, and umami), 3 aftertastes (sweet, salty and bitter) and 2 flavors (caramelized and fishy). The results showed that Thai fish sauce samples exhibited significant differences in all of sensory attribute values (p < 0.05). NIR transflectance spectra were obtained from 1100 to 2500 nm. Prior to investigation of the relationships between sensory attributes and NIR spectra, principal component analysis (PCA) was applied to reduce the dimensionality of the spectral data from 622 wavelengths to two uncorrelated components (NIR1 and NIR2) which explained 92 and 7% of the total variation, respectively. NIR1 was highly correlated with the wavelength regions of 1100 - 1544, 1774 - 2062, 2092 - 2308, and 2358 - 2440 nm, while NIR2 was highly correlated with the wavelength regions of 1742 - 1764, 2066 - 2088, and 2312 - 2354 nm. Subsequently, the relationships among these two components and all sensory attributes were also investigated by PCA. The results showed that the first three principal components (PCs) named as fishy flavor component (PC1), sweet component (PC2) and bitterness component (PC3), respectively, explained a total of 66.86% of the variation. NIR1 was mainly correlated to the sensory attributes of fishy aromatic, fishy flavor and sweet aftertaste on PC1. In addition, the PCA using only the factor loadings of NIR1 and NIR2 could be used to classify samples into three groups which showed high, medium and low degrees of fishy aromatic, fishy flavor and sweet aftertaste.
ERIC Educational Resources Information Center
Lin, Mind-Dih
2012-01-01
Improving principal leadership is a vital component to the success of educational reform initiatives that seek to improve whole-school performance, as principal leadership often exercises positive but indirect effects on student learning. Because of the importance of principals within the field of school improvement, this article focuses on…
ERIC Educational Resources Information Center
Herrmann, Mariesa; Ross, Christine
2016-01-01
States and districts across the country are implementing new principal evaluation systems that include measures of the quality of principals' school leadership practices and measures of student achievement growth. Because these evaluation systems will be used for high-stakes decisions, it is important that the component measures of the evaluation…
ERIC Educational Resources Information Center
Hvidston, David J.; Range, Bret G.; McKim, Courtney Ann; Mette, Ian M.
2015-01-01
This study examined the perspectives of novice and late career principals concerning instructional and organizational leadership within their performance evaluations. An online survey was sent to 251 principals with a return rate of 49%. Instructional leadership components of the evaluation that were most important to all principals were:…
[Local conditions of vulnerability associated with dengue in two communities of Morelos].
Chuc, Silvia; Hurtado-Díaz, Magali; Schilmann, Astrid; Riojas-Rodríguez, Horacio; Rangel, Hilda; González-Fernández, Mariana Irina
2013-04-01
To evaluate the vulnerability associated with the occurrence of dengue in two villages of Morelos, Mexico from 2006 to 2009. MATERIALS AND METHODS. A survey on knowledge, risk perception, prevention practices and water use was applied in two villages of Morelos. Using a principal component analysis, an index of local vulnerability to dengue (IVL) was constructed. The association of IVL with the disease at home was assessed using a Chi-square test. The IVL included five components explaining 63% of the variance and was classified in three categories: low, medium and high. There was a significant association between increased vulnerability and prevalence of reported cases of dengue in Temixco and Tlaquiltenango. The study of vulnerability to dengue allows us to identify local needs in the field of health promotion.
Lateral separation of colloids or cells by dielectrophoresis augmented by AC electroosmosis.
Zhou, Hao; White, Lee R; Tilton, Robert D
2005-05-01
Colloidal particles and biological cells are patterned and separated laterally adjacent to a micropatterned electrode array by applying AC electric fields that are principally oriented normally to the electrode array. This is demonstrated for yeast cells, red blood cells, and colloidal polystyrene particles of different sizes and zeta-potentials. The separation mechanism is observed experimentally to depend on the applied field frequency and voltage. At high frequencies, particles position themselves in a manner that is consistent with dielectrophoresis, while at low frequencies, the positioning is explained in terms of a strong coupling between gravity, the vertical component of the dielectrophoretic force, and the Stokes drag on particles induced by AC electroosmotic flow. Compared to high frequency dielectrophoretic separations, the low frequency separations are faster and require lower applied voltages. Furthermore, the AC electroosmosis coupling with dielectrophoresis may enable cell separations that are not feasible based on dielectrophoresis alone.
ERIC Educational Resources Information Center
Chou, Yeh-Tai; Wang, Wen-Chung
2010-01-01
Dimensionality is an important assumption in item response theory (IRT). Principal component analysis on standardized residuals has been used to check dimensionality, especially under the family of Rasch models. It has been suggested that an eigenvalue greater than 1.5 for the first eigenvalue signifies a violation of unidimensionality when there…
ERIC Educational Resources Information Center
Brusco, Michael J.; Singh, Renu; Steinley, Douglas
2009-01-01
The selection of a subset of variables from a pool of candidates is an important problem in several areas of multivariate statistics. Within the context of principal component analysis (PCA), a number of authors have argued that subset selection is crucial for identifying those variables that are required for correct interpretation of the…
NASA Technical Reports Server (NTRS)
Murray, C. W., Jr.; Mueller, J. L.; Zwally, H. J.
1984-01-01
A field of measured anomalies of some physical variable relative to their time averages, is partitioned in either the space domain or the time domain. Eigenvectors and corresponding principal components of the smaller dimensioned covariance matrices associated with the partitioned data sets are calculated independently, then joined to approximate the eigenstructure of the larger covariance matrix associated with the unpartitioned data set. The accuracy of the approximation (fraction of the total variance in the field) and the magnitudes of the largest eigenvalues from the partitioned covariance matrices together determine the number of local EOF's and principal components to be joined by any particular level. The space-time distribution of Nimbus-5 ESMR sea ice measurement is analyzed.