Sample records for linear principal component

  1. Non-linear principal component analysis applied to Lorenz models and to North Atlantic SLP

    NASA Astrophysics Data System (ADS)

    Russo, A.; Trigo, R. M.

    2003-04-01

    A non-linear generalisation of Principal Component Analysis (PCA), denoted Non-Linear Principal Component Analysis (NLPCA), is introduced and applied to the analysis of three data sets. NLPCA allows for the detection and characterisation of low-dimensional non-linear structure in multivariate data sets. The method is implemented using a 5-layer feed-forward neural network originally introduced in the chemical engineering literature (Kramer, 1991); the method is described and details of its implementation are addressed. NLPCA is first applied to a data set sampled from the Lorenz (1963) attractor. The NLPCA approximations are found to be more representative of the data than the corresponding PCA approximations. The same methodology was applied to the less well-known Lorenz (1984) attractor; however, the results were not as good as those obtained with the famous 'butterfly' attractor. Further work with this model is underway to assess whether NLPCA techniques can represent the data characteristics better than the corresponding PCA approximations. The application of NLPCA to relatively 'simple' dynamical systems, such as those proposed by Lorenz, is well understood; its application to a large climatic data set is much more challenging. Here, we have applied NLPCA to the sea level pressure (SLP) field over the entire North Atlantic area, and the results show a slight increase in the explained variance. Finally, directions for future work are presented.
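
    A minimal, hypothetical sketch of the idea described above (not the authors' implementation): a bottleneck auto-associative network in the spirit of Kramer (1991) is fitted to points sampled from the Lorenz (1963) attractor and compared with a one-component PCA reconstruction. The network sizes, training settings and the variance measure are illustrative assumptions.

    ```python
    # Hedged sketch: Kramer-style autoencoder NLPCA vs. linear PCA on Lorenz-63 data.
    import numpy as np
    from scipy.integrate import solve_ivp
    from sklearn.preprocessing import StandardScaler
    from sklearn.decomposition import PCA
    from sklearn.neural_network import MLPRegressor

    def lorenz63(t, s, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
        x, y, z = s
        return [sigma * (y - x), x * (rho - z) - y, x * y - beta * z]

    sol = solve_ivp(lorenz63, (0, 50), [1.0, 1.0, 1.0], t_eval=np.linspace(0, 50, 5000))
    X = StandardScaler().fit_transform(sol.y.T)          # 5000 x 3 centred data matrix

    # Linear PCA approximation with a single component
    pca = PCA(n_components=1)
    X_pca = pca.inverse_transform(pca.fit_transform(X))

    # Five-layer bottleneck network: mapping layer -> 1-node bottleneck -> demapping layer
    ae = MLPRegressor(hidden_layer_sizes=(10, 1, 10), activation="tanh",
                      max_iter=5000, random_state=0).fit(X, X)
    X_nlpca = ae.predict(X)                              # nonlinear 1-component approximation

    for name, Xhat in [("PCA", X_pca), ("NLPCA", X_nlpca)]:
        frac = 1 - np.sum((X - Xhat) ** 2) / np.sum(X ** 2)
        print(f"{name}: fraction of variance captured by 1 component = {frac:.3f}")
    ```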

  2. Classical Testing in Functional Linear Models.

    PubMed

    Kong, Dehan; Staicu, Ana-Maria; Maity, Arnab

    2016-01-01

    We extend four tests common in classical regression - Wald, score, likelihood ratio and F tests - to functional linear regression, for testing the null hypothesis that there is no association between a scalar response and a functional covariate. Using functional principal component analysis, we re-express the functional linear model as a standard linear model, where the effect of the functional covariate can be approximated by a finite linear combination of the functional principal component scores. In this setting, we consider application of the four traditional tests. The proposed testing procedures are investigated theoretically for densely observed functional covariates when the number of principal components diverges. Using the theoretical distribution of the tests under the alternative hypothesis, we develop a procedure for sample size calculation in the context of functional linear regression. The four tests are further compared numerically for both densely and sparsely observed noisy functional data in simulation experiments and using two real data applications.
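
    The re-expression step described above can be illustrated with a short, hedged sketch: the functional covariate is reduced to a few functional principal component scores and the no-association hypothesis is tested with a classical F test on those scores. The simulated curves and the choice of K = 3 components are assumptions made for the example, not details taken from the paper.

    ```python
    # Hedged sketch: FPCA scores of densely observed curves + classical F test.
    import numpy as np
    import statsmodels.api as sm
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(0)
    n, m, K = 200, 100, 3
    t = np.linspace(0, 1, m)
    # curves built from two smooth basis functions plus noise
    scores_true = rng.normal(size=(n, 2))
    X = scores_true @ np.vstack([np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)]) \
        + 0.1 * rng.normal(size=(n, m))
    y = 1.0 + 2.0 * scores_true[:, 0] + rng.normal(size=n)   # depends on the first component only

    xi = PCA(n_components=K).fit_transform(X)                # estimated FPC scores
    ols = sm.OLS(y, sm.add_constant(xi)).fit()

    # F test of H0: all K score coefficients are zero (no association)
    R = np.hstack([np.zeros((K, 1)), np.eye(K)])
    print(ols.f_test(R))
    ```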

  3. Classical Testing in Functional Linear Models

    PubMed Central

    Kong, Dehan; Staicu, Ana-Maria; Maity, Arnab

    2016-01-01

    We extend four tests common in classical regression - Wald, score, likelihood ratio and F tests - to functional linear regression, for testing the null hypothesis that there is no association between a scalar response and a functional covariate. Using functional principal component analysis, we re-express the functional linear model as a standard linear model, where the effect of the functional covariate can be approximated by a finite linear combination of the functional principal component scores. In this setting, we consider application of the four traditional tests. The proposed testing procedures are investigated theoretically for densely observed functional covariates when the number of principal components diverges. Using the theoretical distribution of the tests under the alternative hypothesis, we develop a procedure for sample size calculation in the context of functional linear regression. The four tests are further compared numerically for both densely and sparsely observed noisy functional data in simulation experiments and using two real data applications. PMID:28955155

  4. Principal component regression analysis with SPSS.

    PubMed

    Liu, R X; Kuang, J; Gong, Q; Hou, X L

    2003-06-01

    The paper introduces the indices used for multicollinearity diagnosis, the basic principle of principal component regression, and the method for determining the 'best' equation. A worked example describes how to perform principal component regression analysis with SPSS 10.0, covering the full calculation process of the principal component regression and the linear regression, factor analysis, descriptives, compute variable and bivariate correlations procedures in SPSS 10.0. Principal component regression analysis can be used to overcome the disturbance caused by multicollinearity, and carrying it out with SPSS makes the analysis simpler, faster and accurate.
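
    The article works entirely in SPSS 10.0; as a rough parallel, the sketch below shows the same principal component regression idea in Python, under the assumption of simulated collinear predictors and an arbitrary choice of two retained components.

    ```python
    # Hedged sketch of principal component regression on collinear predictors.
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LinearRegression
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(1)
    n = 150
    x1 = rng.normal(size=n)
    x2 = x1 + 0.05 * rng.normal(size=n)        # nearly collinear with x1
    x3 = rng.normal(size=n)
    X = np.column_stack([x1, x2, x3])
    y = 3 * x1 + 2 * x3 + rng.normal(size=n)

    # Standardize, keep the leading principal components, regress on their scores
    pcr = make_pipeline(StandardScaler(), PCA(n_components=2), LinearRegression()).fit(X, y)
    print("R^2 of the 2-component PCR fit:", round(pcr.score(X, y), 3))
    ```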

  5. Nonlinear Principal Components Analysis: Introduction and Application

    ERIC Educational Resources Information Center

    Linting, Marielle; Meulman, Jacqueline J.; Groenen, Patrick J. F.; van der Kooij, Anita J.

    2007-01-01

    The authors provide a didactic treatment of nonlinear (categorical) principal components analysis (PCA). This method is the nonlinear equivalent of standard PCA and reduces the observed variables to a number of uncorrelated principal components. The most important advantages of nonlinear over linear PCA are that it incorporates nominal and ordinal…

  6. Stability of Nonlinear Principal Components Analysis: An Empirical Study Using the Balanced Bootstrap

    ERIC Educational Resources Information Center

    Linting, Marielle; Meulman, Jacqueline J.; Groenen, Patrick J. F.; van der Kooij, Anita J.

    2007-01-01

    Principal components analysis (PCA) is used to explore the structure of data sets containing linearly related numeric variables. Alternatively, nonlinear PCA can handle possibly nonlinearly related numeric as well as nonnumeric variables. For linear PCA, the stability of its solution can be established under the assumption of multivariate…

  7. Principal Component Analysis: Resources for an Essential Application of Linear Algebra

    ERIC Educational Resources Information Center

    Pankavich, Stephen; Swanson, Rebecca

    2015-01-01

    Principal Component Analysis (PCA) is a highly useful topic within an introductory Linear Algebra course, especially since it can be used to incorporate a number of applied projects. This method represents an essential application and extension of the Spectral Theorem and is commonly used within a variety of fields, including statistics,…

  8. Least Principal Components Analysis (LPCA): An Alternative to Regression Analysis.

    ERIC Educational Resources Information Center

    Olson, Jeffery E.

    Often, all of the variables in a model are latent, random, or subject to measurement error, or there is not an obvious dependent variable. When any of these conditions exist, an appropriate method for estimating the linear relationships among the variables is Least Principal Components Analysis. Least Principal Components are robust, consistent,…

  9. Wavelet decomposition based principal component analysis for face recognition using MATLAB

    NASA Astrophysics Data System (ADS)

    Sharma, Mahesh Kumar; Sharma, Shashikant; Leeprechanon, Nopbhorn; Ranjan, Aashish

    2016-03-01

    For the realization of face recognition systems in the static as well as the real-time frame, algorithms such as principal component analysis, independent component analysis, linear discriminant analysis, neural networks and genetic algorithms have been used for decades. This paper discusses a wavelet-decomposition-based principal component analysis approach for face recognition. Principal component analysis is chosen over other algorithms due to its relative simplicity, efficiency, and robustness. Face recognition means identifying a person from his or her facial features and bears some resemblance to factor analysis, i.e. the extraction of the principal components of an image. Principal component analysis is subject to some drawbacks, mainly its poor discriminatory power and, in particular, the large computational load of finding eigenvectors. These drawbacks can be greatly reduced by combining wavelet transform decomposition for feature extraction with principal component analysis for pattern representation and classification, analyzing the facial images in the space and time domains, where frequency and time are used interchangeably. The experimental results show that this face recognition method achieves a significant percentage improvement in recognition rate as well as better computational efficiency.
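
    A hypothetical sketch of the pipeline outlined above: a 2-D Haar wavelet decomposition supplies a low-resolution approximation of each image, and PCA of the flattened approximations gives compact face codes. Random arrays stand in for a real face database, the PyWavelets and scikit-learn packages are assumed to be available, and this is not the paper's MATLAB code.

    ```python
    # Hedged sketch: wavelet approximation band + PCA for face features.
    import numpy as np
    import pywt
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(0)
    faces = rng.random((40, 64, 64))                     # placeholder 64x64 "images"

    def wavelet_features(img):
        approx, _ = pywt.dwt2(img, "haar")               # keep the LL approximation band
        return approx.ravel()

    F = np.array([wavelet_features(img) for img in faces])
    pca = PCA(n_components=10).fit(F)
    projections = pca.transform(F)                       # low-dimensional face codes
    print(projections.shape)                             # (40, 10)

    # Recognition would then reduce to, e.g., nearest-neighbour matching of projections.
    ```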

  10. HT-FRTC: a fast radiative transfer code using kernel regression

    NASA Astrophysics Data System (ADS)

    Thelen, Jean-Claude; Havemann, Stephan; Lewis, Warren

    2016-09-01

    The HT-FRTC is a principal component based fast radiative transfer code that can be used across the electromagnetic spectrum from the microwave through to the ultraviolet to calculate transmittance, radiance and flux spectra. The principal components cover the spectrum at a very high spectral resolution, which allows very fast line-by-line, hyperspectral and broadband simulations for satellite-based, airborne and ground-based sensors. The principal components are derived during a code training phase from line-by-line simulations for a diverse set of atmosphere and surface conditions. The derived principal components are sensor independent, i.e. no extra training is required to include additional sensors. During the training phase we also derive the predictors which are required by the fast radiative transfer code to determine the principal component scores from the monochromatic radiances (or fluxes, transmittances). These predictors are calculated for each training profile at a small number of frequencies, which are selected by a k-means cluster algorithm during the training phase. Until recently the predictors were calculated using a linear regression. However, during a recent rewrite of the code the linear regression was replaced by a Gaussian Process (GP) regression which resulted in a significant increase in accuracy when compared to the linear regression. The HT-FRTC has been trained with a large variety of gases, surface properties and scatterers. Rayleigh scattering as well as scattering by frozen/liquid clouds, hydrometeors and aerosols have all been included. The scattering phase function can be fully accounted for by an integrated line-by-line version of the Edwards-Slingo spherical harmonics radiation code or approximately by a modification to the extinction (Chou scaling).
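
    To illustrate the regression swap described above (a linear fit of principal component scores on a few predictors replaced by a Gaussian Process regression), here is a hedged sketch on synthetic stand-in data; the kernel choice and data generator are assumptions, not details of the HT-FRTC.

    ```python
    # Hedged sketch: GP vs. linear regression for predicting a PC score from predictors.
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, WhiteKernel
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(0)
    predictors = rng.uniform(-1, 1, size=(300, 5))       # stand-in "radiances" at selected frequencies
    pc_score = np.sin(3 * predictors[:, 0]) + 0.5 * predictors[:, 1] ** 2 \
               + 0.05 * rng.normal(size=300)             # one PC score, mildly nonlinear in the predictors

    train, test = slice(0, 200), slice(200, 300)
    lin = LinearRegression().fit(predictors[train], pc_score[train])
    gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel()).fit(predictors[train], pc_score[train])

    print("linear R^2:", round(lin.score(predictors[test], pc_score[test]), 3))
    print("GP R^2:    ", round(gp.score(predictors[test], pc_score[test]), 3))
    ```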

  11. [A novel method of multi-channel feature extraction combining multivariate autoregression and multiple-linear principal component analysis].

    PubMed

    Wang, Jinjia; Zhang, Yanna

    2015-02-01

    Brain-computer interface (BCI) systems identify brain signals by extracting features from them. In view of the limitations of the autoregressive-model feature extraction method and of traditional principal component analysis in dealing with multichannel signals, this paper presents a multichannel feature extraction method in which a multivariate autoregressive (MVAR) model is combined with multiple-linear principal component analysis (MPCA), and applies it to the recognition of magnetoencephalography (MEG) and electroencephalography (EEG) signals. Firstly, we calculated the MVAR model coefficient matrix of the MEG/EEG signals, and then reduced its dimensionality using MPCA. Finally, we recognized the brain signals with a Bayes classifier. The key innovation of our investigation is the extension of the traditional single-channel feature extraction method to the multichannel case. We then carried out experiments using the data sets IV-III and IV-I. The experimental results showed that the method proposed in this paper is feasible.

  12. Relationship between rice yield and climate variables in southwest Nigeria using multiple linear regression and support vector machine analysis

    NASA Astrophysics Data System (ADS)

    Oguntunde, Philip G.; Lischeid, Gunnar; Dietrich, Ottfried

    2018-03-01

    This study examines the variations of climate variables and rice yield and quantifies the relationships among them using multiple linear regression, principal component analysis, and support vector machine (SVM) analysis in southwest Nigeria. The climate and yield data used covered a period of 36 years, from 1980 to 2015. Similar to the observed decrease (P < 0.001) in rice yield, pan evaporation, solar radiation, and wind speed declined significantly. Eight principal components exhibited an eigenvalue > 1 and explained 83.1% of the total variance of the predictor variables. The SVM regression function using the scores of the first principal component explained about 75% of the variance in the rice yield data, and linear regression about 64%. SVM regression between annual solar radiation values and yield explained 67% of the variance. Only the first component of the principal component analysis (PCA) exhibited a clear long-term trend and sometimes short-term variance similar to that of rice yield. Short-term fluctuations of the scores of PC1 are closely coupled to those of rice yield during the 1986-1993 and 2006-2013 periods, thereby revealing the inter-annual sensitivity of rice production to climate variability. Solar radiation stands out as the climate variable with the highest influence on rice yield, and the influence was especially strong during the monsoon and post-monsoon periods, which correspond to the vegetative, booting, flowering, and grain filling stages in the study area. The outcome is expected to provide a more in-depth, region-specific climate-rice linkage for the screening of better cultivars that can respond positively to future climate fluctuations, as well as information that may help optimize planting dates for improved radiation use efficiency in the study area.
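
    The modelling chain above (PCA of the climate predictors, then SVM regression of yield on the first principal component scores, compared against a linear fit) can be sketched as follows; the synthetic data replace the 36-year Nigerian records, so the numbers are purely illustrative.

    ```python
    # Hedged sketch: PCA of correlated climate variables, then SVR vs. linear fit on PC1.
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVR
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(0)
    n_years, n_vars = 36, 10
    factor = rng.normal(size=n_years)                     # shared climate "factor"
    climate = np.outer(factor, rng.uniform(0.5, 1.5, size=n_vars)) \
              + 0.3 * rng.normal(size=(n_years, n_vars))
    yield_t = 1.5 * factor - 0.8 * factor ** 2 + 0.3 * rng.normal(size=n_years)

    Z = StandardScaler().fit_transform(climate)
    pc1 = PCA(n_components=1).fit_transform(Z)            # scores of the first principal component

    svm = SVR(kernel="rbf").fit(pc1, yield_t)
    lin = LinearRegression().fit(pc1, yield_t)
    print("SVR R^2:   ", round(svm.score(pc1, yield_t), 2))
    print("linear R^2:", round(lin.score(pc1, yield_t), 2))
    ```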

  13. Principal components analysis in clinical studies.

    PubMed

    Zhang, Zhongheng; Castelló, Adela

    2017-09-01

    In multivariate analysis, independent variables are usually correlated with each other, which can introduce multicollinearity into regression models. One approach to solving this problem is to apply principal components analysis (PCA) to these variables. This method uses an orthogonal transformation to represent sets of potentially correlated variables with principal components (PC) that are linearly uncorrelated. PCs are ordered so that the first PC has the largest possible variance, and only some components are selected to represent the correlated variables. As a result, the dimension of the variable space is reduced. This tutorial illustrates how to perform PCA in the R environment; the example is a simulated dataset in which two PCs are responsible for the majority of the variance in the data. Furthermore, the visualization of PCA is highlighted.

  14. Implementation of an integrating sphere for the enhancement of noninvasive glucose detection using quantum cascade laser spectroscopy

    NASA Astrophysics Data System (ADS)

    Werth, Alexandra; Liakat, Sabbir; Dong, Anqi; Woods, Callie M.; Gmachl, Claire F.

    2018-05-01

    An integrating sphere is used to enhance the collection of backscattered light in a noninvasive glucose sensor based on quantum cascade laser spectroscopy. The sphere enhances signal stability by roughly an order of magnitude, allowing us to use a thermoelectrically (TE) cooled detector while maintaining comparable glucose prediction accuracy levels. Using a smaller TE-cooled detector reduces form factor, creating a mobile sensor. Principal component analysis has predicted principal components of spectra taken from human subjects that closely match the absorption peaks of glucose. These principal components are used as regressors in a linear regression algorithm to make glucose concentration predictions, over 75% of which are clinically accurate.

  15. A Graphical Approach to the Standard Principal-Agent Model.

    ERIC Educational Resources Information Center

    Zhou, Xianming

    2002-01-01

    States the principal-agent theory is difficult to teach because of its technical complexity and intractability. Indicates the equilibrium in the contract space is defined by the incentive parameter and insurance component of pay under a linear contract. Describes a graphical approach that students with basic knowledge of algebra and…

  16. Restricted maximum likelihood estimation of genetic principal components and smoothed covariance matrices

    PubMed Central

    Meyer, Karin; Kirkpatrick, Mark

    2005-01-01

    Principal component analysis is a widely used 'dimension reduction' technique, albeit generally at a phenotypic level. It is shown that we can estimate genetic principal components directly through a simple reparameterisation of the usual linear mixed model. This is applicable to any analysis fitting multiple, correlated genetic effects, whether effects for individual traits or sets of random regression coefficients to model trajectories. Depending on the magnitude of genetic correlation, a subset of the principal components generally suffices to capture the bulk of genetic variation. Corresponding estimates of genetic covariance matrices are more parsimonious, have reduced rank and are smoothed, with the number of parameters required to model the dispersion structure reduced from k(k + 1)/2 to m(2k - m + 1)/2 for k effects and m principal components. Estimation of these parameters, the largest eigenvalues and pertaining eigenvectors of the genetic covariance matrix, via restricted maximum likelihood using derivatives of the likelihood, is described. It is shown that reduced rank estimation can reduce the computational requirements of multivariate analyses substantially. An application to the analysis of eight traits recorded via live ultrasound scanning of beef cattle is given. PMID:15588566
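
    A quick check of the parameter counts quoted above: a full k x k covariance matrix needs k(k + 1)/2 parameters, whereas a rank-m factorisation needs m(2k - m + 1)/2. The choice k = 8 mirrors the beef-cattle application; the snippet is only arithmetic, not the REML estimation described in the paper.

    ```python
    # Parameter counts for full vs. reduced-rank covariance models (k = 8 traits assumed).
    k = 8
    full = k * (k + 1) // 2
    for m in range(1, k + 1):
        reduced = m * (2 * k - m + 1) // 2
        print(f"m = {m}: {reduced} parameters (full model: {full})")
    ```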

  17. Optimized principal component analysis on coronagraphic images of the fomalhaut system

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Meshkat, Tiffany; Kenworthy, Matthew A.; Quanz, Sascha P.

    We present the results of a study to optimize the principal component analysis (PCA) algorithm for planet detection, a new algorithm complementing angular differential imaging and locally optimized combination of images (LOCI) for increasing the contrast achievable next to a bright star. The stellar point spread function (PSF) is constructed by removing linear combinations of principal components, allowing the flux from an extrasolar planet to shine through. The number of principal components used determines how well the stellar PSF is globally modeled. Using more principal components may decrease the number of speckles in the final image, but also increases the background noise. We apply PCA to Fomalhaut Very Large Telescope NaCo images acquired at 4.05 μm with an apodized phase plate. We do not detect any companions, with a model dependent upper mass limit of 13-18 M_Jup from 4-10 AU. PCA achieves greater sensitivity than the LOCI algorithm for the Fomalhaut coronagraphic data by up to 1 mag. We make several adaptations to the PCA code and determine which of these prove the most effective at maximizing the signal-to-noise from a planet very close to its parent star. We demonstrate that optimizing the number of principal components used in PCA proves most effective for pulling out a planet signal.
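
    A schematic sketch of PCA-based PSF subtraction as described above: each frame is modelled as a linear combination of the leading principal components of the image stack and that model is removed, leaving any faint companion in the residuals. Synthetic frames stand in for the NaCo data and the choice of five components is arbitrary.

    ```python
    # Hedged sketch: PSF model from leading principal components, then subtraction.
    import numpy as np

    rng = np.random.default_rng(0)
    n_frames, npix = 50, 64 * 64
    stack = rng.normal(size=(n_frames, npix))       # flattened synthetic frames (stand-ins)

    centered = stack - stack.mean(axis=0)           # remove the mean frame
    # leading principal components of the stack via SVD
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    n_comp = 5
    basis = vt[:n_comp]                             # (n_comp, npix) orthonormal components

    # project each frame onto the basis and subtract the reconstruction (the PSF model)
    psf_model = (centered @ basis.T) @ basis
    residuals = centered - psf_model                # a faint companion would survive here
    print(residuals.shape)                          # (50, 4096)
    ```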

  18. Principal component reconstruction (PCR) for cine CBCT with motion learning from 2D fluoroscopy.

    PubMed

    Gao, Hao; Zhang, Yawei; Ren, Lei; Yin, Fang-Fang

    2018-01-01

    This work aims to generate cine CT images (i.e., 4D images with high temporal resolution) based on a novel principal component reconstruction (PCR) technique with motion learning from 2D fluoroscopic training images. In the proposed PCR method, matrix factorization is utilized as an explicit low-rank regularization of the 4D images, which are represented as a product of spatial principal components and temporal motion coefficients. The key hypothesis of PCR is that temporal coefficients from 4D images can be reasonably approximated by temporal coefficients learned from 2D fluoroscopic training projections. For this purpose, we can acquire fluoroscopic training projections for a few breathing periods at fixed gantry angles that are free from geometric distortion due to gantry rotation, that is, fluoroscopy-based motion learning. Such training projections can provide an effective characterization of the breathing motion. The temporal coefficients can be extracted from these training projections and used as priors for PCR, even though the principal components from training projections are certainly not the same as those of the 4D images to be reconstructed. For this purpose, training data are synchronized with reconstruction data using identical real-time breathing position intervals for projection binning. In terms of image reconstruction, with a priori temporal coefficients, the data fidelity for PCR changes from nonlinear to linear, and consequently the PCR method is robust and can be solved efficiently. PCR is formulated as a convex optimization problem with the sum of a linear data fidelity term with respect to the spatial principal components and a spatiotemporal total variation regularization imposed on the 4D image phases. The solution algorithm of PCR is developed based on the alternating direction method of multipliers. The implementation is fully parallelized on GPU with the NVIDIA CUDA toolbox and each reconstruction takes a few minutes. The proposed PCR method is validated and compared with a state-of-the-art method, PICCS, using both simulation and experimental data with the on-board cone-beam CT setting. The results demonstrated the feasibility of PCR for cine CBCT and its significantly improved reconstruction quality relative to PICCS. With a priori temporal motion coefficients estimated from fluoroscopic training projections, the PCR method can accurately reconstruct the spatial principal components, and then generate cine CT images as a product of the temporal motion coefficients and spatial principal components. © 2017 American Association of Physicists in Medicine.

  19. Comparative study on fast classification of brick samples by combination of principal component analysis and linear discriminant analysis using stand-off and table-top laser-induced breakdown spectroscopy

    NASA Astrophysics Data System (ADS)

    Vítková, Gabriela; Prokeš, Lubomír; Novotný, Karel; Pořízka, Pavel; Novotný, Jan; Všianský, Dalibor; Čelko, Ladislav; Kaiser, Jozef

    2014-11-01

    From a historical perspective, during archaeological excavations or the restoration of buildings and other structures built from bricks, it is important to determine, preferably in situ and in real time, the locality of the bricks' origin. Fast classification of bricks on the basis of Laser-Induced Breakdown Spectroscopy (LIBS) spectra is possible using multivariate statistical methods. A combination of principal component analysis (PCA) and linear discriminant analysis (LDA) was applied in this case. LIBS was used to classify a total of 29 brick samples from 7 different localities. A comparative study using two different LIBS setups, stand-off and table-top, shows that stand-off LIBS has great potential for archaeological in-field measurements.
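
    The PCA + LDA classification chain can be sketched briefly: spectra are compressed to a few principal component scores and a linear discriminant classifier assigns the class label. Synthetic three-class spectra stand in for the 29 brick samples from 7 localities, and the component count is an assumption.

    ```python
    # Hedged sketch: PCA compression followed by LDA classification of spectra.
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline

    rng = np.random.default_rng(0)
    n_per_class, n_channels = 20, 500
    means = rng.normal(size=(3, n_channels))                        # three class "signatures"
    spectra = np.vstack([m + 0.5 * rng.normal(size=(n_per_class, n_channels)) for m in means])
    labels = np.repeat([0, 1, 2], n_per_class)

    clf = make_pipeline(PCA(n_components=5), LinearDiscriminantAnalysis())
    print("cross-validated accuracy:", cross_val_score(clf, spectra, labels, cv=5).mean())
    ```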

  20. Assessment of mechanical properties of isolated bovine intervertebral discs from multi-parametric magnetic resonance imaging.

    PubMed

    Recuerda, Maximilien; Périé, Delphine; Gilbert, Guillaume; Beaudoin, Gilles

    2012-10-12

    The treatment planning of spine pathologies requires information on the rigidity and permeability of the intervertebral discs (IVDs). Magnetic resonance imaging (MRI) offers great potential as a sensitive and non-invasive technique for describing the mechanical properties of IVDs. However, the literature reported small correlation coefficients between mechanical properties and MRI parameters. Our hypothesis is that the compressive modulus and the permeability of the IVD can be predicted by a linear combination of MRI parameters. Sixty IVDs were harvested from bovine tails and randomly separated into four groups (in-situ, digested-6h, digested-18h, digested-24h). Multi-parametric MRI acquisitions were used to quantify the relaxation times T1 and T2, the magnetization transfer ratio MTR, the apparent diffusion coefficient ADC and the fractional anisotropy FA. Unconfined compression, confined compression and direct permeability measurements were performed to quantify the compressive moduli and the hydraulic permeabilities. Differences between groups were evaluated with a one-way ANOVA. Multiple linear regressions were performed between the dependent mechanical properties and the independent MRI parameters to verify our hypothesis. A principal component analysis was used to convert the set of possibly correlated variables into a set of linearly uncorrelated variables. Agglomerative hierarchical clustering was performed on the 3 principal components. The multiple linear regressions showed that 45 to 80% of the Young's modulus E, the aggregate modulus in absence of deformation HA0, the radial permeability kr and the axial permeability in absence of deformation k0 can be explained by the MRI parameters within both the nucleus pulposus and the annulus fibrosus. The principal component analysis reduced our variables to two principal components with a cumulative variability of 52-65%, which increased to 70-82% when considering the third principal component. The dendrograms showed a natural division into four clusters for the nucleus pulposus and into three or four clusters for the annulus fibrosus. The compressive moduli and the permeabilities of isolated IVDs can be assessed mostly by MT and diffusion sequences. However, the relationships have to be improved with the inclusion of MRI parameters more sensitive to IVD degeneration. Before this technique is used to quantify the mechanical properties of IVDs in vivo in patients suffering from various diseases, the relationships have to be defined for each degeneration state of the tissue that mimics the pathology. Our MRI protocol, combined with principal component analysis and agglomerative hierarchical clustering, is a promising tool to classify degenerated intervertebral discs and, further, to find biomarkers and predictive factors of the evolution of the pathologies.

  1. Differential Lipid Profiles of Normal Human Brain Matter and Gliomas by Positive and Negative Mode Desorption Electrospray Ionization – Mass Spectrometry Imaging

    PubMed Central

    Pirro, Valentina; Hattab, Eyas M.; Cohen-Gadol, Aaron A.; Cooks, R. Graham

    2016-01-01

    Desorption electrospray ionization—mass spectrometry (DESI-MS) imaging was used to analyze unmodified human brain tissue sections from 39 subjects sequentially in the positive and negative ionization modes. Acquisition of both MS polarities allowed a more complete analysis of the human brain tumor lipidome, as some phospholipids ionize preferentially in the positive and others in the negative ion mode. Normal brain parenchyma, comprised of grey matter and white matter, was differentiated from glioma using positive and negative ion mode DESI-MS lipid profiles with the aid of principal component analysis along with linear discriminant analysis. Principal component–linear discriminant analysis of the positive mode lipid profiles was able to distinguish grey matter, white matter, and glioma with an average sensitivity of 93.2% and specificity of 96.6%, while the negative mode lipid profiles had an average sensitivity of 94.1% and specificity of 97.4%. The positive and negative mode lipid profiles provided complementary information. Principal component–linear discriminant analysis of the combined positive and negative mode lipid profiles, via data fusion, resulted in approximately the same average sensitivity (94.7%) and specificity (97.6%) as the positive and negative modes used individually. However, the two modes complemented each other by improving the sensitivity and specificity of all classes (grey matter, white matter, and glioma) beyond 90% when used in combination. Further principal component analysis using the fused data resulted in the subgrouping of glioma into two groups associated with grey and white matter, respectively, a separation not apparent in the principal component analysis scores plots of the separate positive and negative mode data. The interrelationship of tumor cell percentage and the lipid profiles is discussed, along with how such a measure could be used to assess residual tumor at surgical margins. PMID:27658243

  2. Characterization of Type Ia Supernova Light Curves Using Principal Component Analysis of Sparse Functional Data

    NASA Astrophysics Data System (ADS)

    He, Shiyuan; Wang, Lifan; Huang, Jianhua Z.

    2018-04-01

    With growing data from ongoing and future supernova surveys, it is possible to empirically quantify the shapes of SNIa light curves in more detail, and to quantitatively relate the shape parameters with the intrinsic properties of SNIa. Building such relationships is critical in controlling systematic errors associated with supernova cosmology. Based on a collection of well-observed SNIa samples accumulated in the past years, we construct an empirical SNIa light curve model using a statistical method called the functional principal component analysis (FPCA) for sparse and irregularly sampled functional data. Using this method, the entire light curve of an SNIa is represented by a linear combination of principal component functions, and the SNIa is represented by a few numbers called “principal component scores.” These scores are used to establish relations between light curve shapes and physical quantities such as intrinsic color, interstellar dust reddening, spectral line strength, and spectral classes. These relations allow for descriptions of some critical physical quantities based purely on light curve shape parameters. Our study shows that some important spectral feature information is being encoded in the broad band light curves; for instance, we find that the light curve shapes are correlated with the velocity and velocity gradient of the Si II λ6355 line. This is important for supernova surveys (e.g., LSST and WFIRST). Moreover, the FPCA light curve model is used to construct the entire light curve shape, which in turn is used in a functional linear form to adjust intrinsic luminosity when fitting distance models.

  3. Application of third molar development and eruption models in estimating dental age in Malay sub-adults.

    PubMed

    Mohd Yusof, Mohd Yusmiaidil Putera; Cauwels, Rita; Deschepper, Ellen; Martens, Luc

    2015-08-01

    Third molar development (TMD) has been widely utilized as one of the radiographic methods for dental age estimation. By using the same radiograph of the same individual, third molar eruption (TME) information can be incorporated into the TMD regression model. This study aims to evaluate the performance of dental age estimation with the individual-method models and the combined model (TMD and TME), based on classic multiple linear and principal component regression analyses. A sample of 705 digital panoramic radiographs of Malay sub-adults aged between 14.1 and 23.8 years was collected. The techniques described by Gleiser and Hunt (modified by Kohler) and by Olze were employed to stage the TMD and TME, respectively. The data were divided to develop three respective models based on the two regression approaches, multiple linear and principal component analysis. The trained models were then validated on the test sample and the accuracy of age prediction was compared between the models. The coefficient of determination (R²) and root mean square error (RMSE) were calculated. In both genders, adjusted R² increased in the linear regressions of the combined model as compared to the individual models. An overall decrease in RMSE was detected in the combined model as compared to TMD (0.03-0.06) and TME (0.2-0.8). In principal component regression, the combined model exhibited a low adjusted R² and high RMSE, except in males. Dental age is thus better predicted by the combined model in multiple linear regression. Copyright © 2015 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.

  4. Architectural measures of the cancellous bone of the mandibular condyle identified by principal components analysis.

    PubMed

    Giesen, E B W; Ding, M; Dalstra, M; van Eijden, T M G J

    2003-09-01

    As several morphological parameters of cancellous bone express more or less the same architectural measure, we applied principal components analysis to group these measures and correlated these to the mechanical properties. Cylindrical specimens (n = 24) were obtained in different orientations from embalmed mandibular condyles; the angle of the first principal direction and the axis of the specimen, expressing the orientation of the trabeculae, ranged from 10 degrees to 87 degrees. Morphological parameters were determined by a method based on Archimedes' principle and by micro-CT scanning, and the mechanical properties were obtained by mechanical testing. The principal components analysis was used to obtain a set of independent components to describe the morphology. This set was entered into linear regression analyses for explaining the variance in mechanical properties. The principal components analysis revealed four components: amount of bone, number of trabeculae, trabecular orientation, and miscellaneous. They accounted for about 90% of the variance in the morphological variables. The component loadings indicated that a higher amount of bone was primarily associated with more plate-like trabeculae, and not with more or thicker trabeculae. The trabecular orientation was most determinative (about 50%) in explaining stiffness, strength, and failure energy. The amount of bone was second most determinative and increased the explained variance to about 72%. These results suggest that trabecular orientation and amount of bone are important in explaining the anisotropic mechanical properties of the cancellous bone of the mandibular condyle.

  5. Short communication: Principal components and factor analytic models for test-day milk yield in Brazilian Holstein cattle.

    PubMed

    Bignardi, A B; El Faro, L; Rosa, G J M; Cardoso, V L; Machado, P F; Albuquerque, L G

    2012-04-01

    A total of 46,089 individual monthly test-day (TD) milk yields (10 test-days), from 7,331 complete first lactations of Holstein cattle were analyzed. A standard multivariate analysis (MV), reduced rank analyses fitting the first 2, 3, and 4 genetic principal components (PC2, PC3, PC4), and analyses that fitted a factor analytic structure considering 2, 3, and 4 factors (FAS2, FAS3, FAS4), were carried out. The models included the random animal genetic effect and fixed effects of the contemporary groups (herd-year-month of test-day), age of cow (linear and quadratic effects), and days in milk (linear effect). The residual covariance matrix was assumed to have full rank. Moreover, 2 random regression models were applied. Variance components were estimated by restricted maximum likelihood method. The heritability estimates ranged from 0.11 to 0.24. The genetic correlation estimates between TD obtained with the PC2 model were higher than those obtained with the MV model, especially on adjacent test-days at the end of lactation close to unity. The results indicate that for the data considered in this study, only 2 principal components are required to summarize the bulk of genetic variation among the 10 traits. Copyright © 2012 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  6. Protein quantification on dendrimer-activated surfaces by using time-of-flight secondary ion mass spectrometry and principal component regression

    NASA Astrophysics Data System (ADS)

    Kim, Young-Pil; Hong, Mi-Young; Shon, Hyun Kyong; Chegal, Won; Cho, Hyun Mo; Moon, Dae Won; Kim, Hak-Sung; Lee, Tae Geol

    2008-12-01

    Interaction between streptavidin and biotin on poly(amidoamine) (PAMAM) dendrimer-activated surfaces and on self-assembled monolayers (SAMs) was quantitatively studied by using time-of-flight secondary ion mass spectrometry (ToF-SIMS). The surface protein density was systematically varied as a function of protein concentration and independently quantified using the ellipsometry technique. Principal component analysis (PCA) and principal component regression (PCR) were used to identify a correlation between the intensities of the secondary ion peaks and the surface protein densities. From the ToF-SIMS and ellipsometry results, a good linear correlation of protein density was found. Our study shows that surface protein densities are higher on dendrimer-activated surfaces than on SAMs surfaces due to the spherical property of the dendrimer, and that these surface protein densities can be easily quantified with high sensitivity in a label-free manner by ToF-SIMS.

  7. Pepper seed variety identification based on visible/near-infrared spectral technology

    NASA Astrophysics Data System (ADS)

    Li, Cuiling; Wang, Xiu; Meng, Zhijun; Fan, Pengfei; Cai, Jichen

    2016-11-01

    Pepper is an important fruit vegetable; with the expansion of the area planted with hybrid peppers, detection of pepper seed purity has become especially important. This research used visible/near infrared (VIS/NIR) spectral technology to detect the variety of single pepper seeds, choosing the hybrid pepper seeds "Zhuo Jiao NO.3", "Zhuo Jiao NO.4" and "Zhuo Jiao NO.5" as research samples. VIS/NIR spectral data of 80 "Zhuo Jiao NO.3", 80 "Zhuo Jiao NO.4" and 80 "Zhuo Jiao NO.5" pepper seeds were collected, and the original spectral data were pretreated with standard normal variate (SNV) transform, first derivative (FD), and Savitzky-Golay (SG) convolution smoothing methods. The principal component analysis (PCA) method was adopted to reduce the dimension of the spectral data and extract principal components. According to the distributions of the first principal component (PC1) with the second principal component (PC2), of PC1 with the third principal component (PC3), and of PC2 with PC3, the distribution areas of the three varieties of pepper seeds were delimited in each two-dimensional plane, and the discriminant accuracy of PCA was tested by observing the distribution of the validation-set samples' principal components. This study combined PCA and linear discriminant analysis (LDA) to identify single pepper seed varieties. The results showed that, with the FD preprocessing method, the discriminant accuracy for pepper seed varieties was 98% on the validation set, which indicates that VIS/NIR spectral technology is feasible for the identification of single pepper seed varieties.

  8. Strain Transient Detection Techniques: A Comparison of Source Parameter Inversions of Signals Isolated through Principal Component Analysis (PCA), Non-Linear PCA, and Rotated PCA

    NASA Astrophysics Data System (ADS)

    Lipovsky, B.; Funning, G. J.

    2009-12-01

    We compare several techniques for the analysis of geodetic time series, with the ultimate aim of characterizing the physical processes which are represented therein. We compare three methods for the analysis of these data: Principal Component Analysis (PCA), Non-Linear PCA (NLPCA), and Rotated PCA (RPCA). We evaluate each method by its ability to isolate signals which may be any combination of low amplitude (near noise level), temporally transient, unaccompanied by seismic emissions, and small scale with respect to the spatial domain. PCA is a powerful tool for extracting structure from large datasets, traditionally realized either through the solution of an eigenvalue problem or through iterative methods. PCA is a transformation of the coordinate system of our data such that the new "principal" data axes retain maximal variance and minimal reconstruction error (Pearson, 1901; Hotelling, 1933). RPCA is achieved by an orthogonal transformation of the principal axes determined in PCA. In the analysis of meteorological data sets, RPCA has been seen to overcome domain shape dependencies, correct for sampling errors, and determine principal axes which more closely represent physical processes (e.g., Richman, 1986). NLPCA generalizes PCA such that principal axes are replaced by principal curves (e.g., Hsieh 2004). We achieve NLPCA through an auto-associative feed-forward neural network (Scholz, 2005). We show the geophysical relevance of these techniques by applying each to a synthetic data set. Results are compared by inverting the principal axes to determine deformation source parameters. Temporal variability in the source parameters estimated by each method is also compared.

  9. Principal component analysis and neurocomputing-based models for total ozone concentration over different urban regions of India

    NASA Astrophysics Data System (ADS)

    Chattopadhyay, Goutami; Chattopadhyay, Surajit; Chakraborthy, Parthasarathi

    2012-07-01

    The present study deals with daily total ozone concentration time series over four metro cities of India, namely Kolkata, Mumbai, Chennai, and New Delhi, in a multivariate environment. Using the Kaiser-Meyer-Olkin measure, it is established that the data set under consideration is suitable for principal component analysis. Subsequently, by introducing the rotated component matrix for the principal components, the predictors suitable for generating an artificial neural network (ANN) for daily total ozone prediction are identified; the multicollinearity is removed in this way. ANN models in the form of multilayer perceptrons trained through backpropagation learning are generated for all of the study zones, and the model outcomes are assessed statistically. Measuring various statistics such as Pearson correlation coefficients, Willmott's indices, percentage errors of prediction, and mean absolute errors, it is observed that for Mumbai and Kolkata the proposed ANN model generates very good predictions. The results are supported by the linearly distributed coordinates in the scatterplots.

  10. Sparse principal component analysis in medical shape modeling

    NASA Astrophysics Data System (ADS)

    Sjöstrand, Karl; Stegmann, Mikkel B.; Larsen, Rasmus

    2006-03-01

    Principal component analysis (PCA) is a widely used tool in medical image analysis for data reduction, model building, and data understanding and exploration. While PCA is a holistic approach where each new variable is a linear combination of all original variables, sparse PCA (SPCA) aims at producing easily interpreted models through sparse loadings, i.e. each new variable is a linear combination of a subset of the original variables. One of the aims of using SPCA is the possible separation of the results into isolated and easily identifiable effects. This article introduces SPCA for shape analysis in medicine. Results for three different data sets are given in relation to standard PCA and sparse PCA by simple thresholding of small loadings. Focus is on a recent algorithm for computing sparse principal components, but a review of other approaches is supplied as well. The SPCA algorithm has been implemented using Matlab and is available for download. The general behavior of the algorithm is investigated, and strengths and weaknesses are discussed. The original report on the SPCA algorithm argues that the ordering of modes is not an issue. We disagree on this point and propose several approaches to establish sensible orderings. A method that orders modes by decreasing variance and maximizes the sum of variances for all modes is presented and investigated in detail.
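
    For orientation, a brief sketch contrasting ordinary PCA with sparse PCA on a toy data set in which two latent effects each drive a disjoint subset of variables; scikit-learn's SparsePCA is used here rather than the article's MATLAB implementation, and the penalty setting is an assumption.

    ```python
    # Hedged sketch: sparse loadings vs. dense loadings on a toy data set.
    import numpy as np
    from sklearn.decomposition import PCA, SparsePCA

    rng = np.random.default_rng(0)
    n, p = 100, 20
    # two latent effects that each drive a disjoint subset of variables
    latent = rng.normal(size=(n, 2))
    loadings = np.zeros((2, p))
    loadings[0, :5], loadings[1, 10:15] = 1.0, 1.0
    X = latent @ loadings + 0.1 * rng.normal(size=(n, p))

    pca = PCA(n_components=2).fit(X)
    spca = SparsePCA(n_components=2, alpha=1.0, random_state=0).fit(X)
    print("nonzero loadings, PCA:   ", np.count_nonzero(np.abs(pca.components_) > 1e-3))
    print("nonzero loadings, sparse:", np.count_nonzero(np.abs(spca.components_) > 1e-3))
    ```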

  11. Simplified model of statistically stationary spacecraft rotation and associated induced gravity environments

    NASA Technical Reports Server (NTRS)

    Fichtl, G. H.; Holland, R. L.

    1978-01-01

    A stochastic model of spacecraft motion was developed based on the assumption that the net torque vector due to crew activity and rocket thruster firings is a statistically stationary Gaussian vector process. The process had zero ensemble mean value, and the components of the torque vector were mutually stochastically independent. The linearized rigid-body equations of motion were used to derive the autospectral density functions of the components of the spacecraft rotation vector. The cross-spectral density functions of the components of the rotation vector vanish for all frequencies so that the components of rotation were mutually stochastically independent. The autospectral and cross-spectral density functions of the induced gravity environment imparted to scientific apparatus rigidly attached to the spacecraft were calculated from the rotation rate spectral density functions via linearized inertial frame to body-fixed principal axis frame transformation formulae. The induced gravity process was a Gaussian one with zero mean value. Transformation formulae were used to rotate the principal axis body-fixed frame to which the rotation rate and induced gravity vector were referred to a body-fixed frame in which the components of the induced gravity vector were stochastically independent. Rice's theory of exceedances was used to calculate expected exceedance rates of the components of the rotation and induced gravity vector processes.

  12. Modulated Hebb-Oja learning rule--a method for principal subspace analysis.

    PubMed

    Jankovic, Marko V; Ogawa, Hidemitsu

    2006-03-01

    This paper presents an analysis of the recently proposed modulated Hebb-Oja (MHO) method, which performs a linear mapping to a lower-dimensional subspace. The principal component subspace is the case that will be analyzed. Compared to some other well-known methods for yielding the principal component subspace (e.g., Oja's Subspace Learning Algorithm), the proposed method has one feature that could be seen as desirable from the biological point of view: the synaptic efficacy learning rule does not need explicit information about the values of the other efficacies to make an individual efficacy modification. Also, the simplicity of the "neural circuits" that perform global computations, and the fact that their number does not depend on the number of input and output neurons, could be seen as good features of the proposed method.
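
    For context only, the sketch below implements Oja's classical subspace learning rule, which the paper uses as a point of comparison; it is not the modulated Hebb-Oja rule itself. The learned weight rows come to span approximately the same subspace as the leading principal components.

    ```python
    # Hedged sketch: Oja's symmetric subspace rule (comparison baseline, not MHO).
    import numpy as np

    rng = np.random.default_rng(0)
    n, p, m, lr = 5000, 10, 3, 0.01
    # data with a dominant 3-dimensional principal subspace
    C = np.diag([5.0, 4.0, 3.0] + [0.3] * (p - 3))
    X = rng.normal(size=(n, p)) @ np.linalg.cholesky(C).T

    W = 0.1 * rng.normal(size=(m, p))
    for x in X:
        y = W @ x                                          # outputs of the m subspace units
        W += lr * (np.outer(y, x) - np.outer(y, y) @ W)    # Oja's symmetric subspace update

    # compare the learned subspace with the leading eigenvectors of the covariance
    eigvecs = np.linalg.eigh(np.cov(X, rowvar=False))[1][:, -m:]
    Q, _ = np.linalg.qr(W.T)                               # orthonormal basis of the learned subspace
    overlap = np.linalg.norm(eigvecs.T @ Q, "fro") ** 2 / m
    print("subspace overlap (1.0 = identical):", round(overlap, 3))
    ```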

  13. Improving the Power of GWAS and Avoiding Confounding from Population Stratification with PC-Select

    PubMed Central

    Tucker, George; Price, Alkes L.; Berger, Bonnie

    2014-01-01

    Using a reduced subset of SNPs in a linear mixed model can improve power for genome-wide association studies, yet this can result in insufficient correction for population stratification. We propose a hybrid approach using principal components that does not inflate statistics in the presence of population stratification and improves power over standard linear mixed models. PMID:24788602

  14. The comparison of robust partial least squares regression with robust principal component regression on a real

    NASA Astrophysics Data System (ADS)

    Polat, Esra; Gunay, Suleyman

    2013-10-01

    One of the problems encountered in Multiple Linear Regression (MLR) is multicollinearity, which causes the overestimation of the regression parameters and increases their variance. Hence, when multicollinearity is present, biased estimation procedures such as classical Principal Component Regression (CPCR) and Partial Least Squares Regression (PLSR) are performed. The SIMPLS algorithm is the leading PLSR algorithm because of its speed and efficiency and because its results are easier to interpret. However, both CPCR and SIMPLS yield very unreliable results when the data set contains outlying observations. Therefore, Hubert and Vanden Branden (2003) presented a robust PCR (RPCR) method and a robust PLSR (RPLSR) method called RSIMPLS. In RPCR, a robust Principal Component Analysis (PCA) method for high-dimensional data is first applied to the independent variables; then the dependent variables are regressed on the scores using a robust regression method. RSIMPLS is constructed from a robust covariance matrix for high-dimensional data and robust linear regression. The purpose of this study is to show the usage of the RPCR and RSIMPLS methods on an econometric data set, comparing the two methods on an inflation model of Turkey. The methods are compared in terms of predictive ability and goodness of fit by using a robust Root Mean Squared Error of Cross-validation (R-RMSECV), a robust R² value and the Robust Component Selection (RCS) statistic.

  15. Complexity of free energy landscapes of peptides revealed by nonlinear principal component analysis.

    PubMed

    Nguyen, Phuong H

    2006-12-01

    Employing the recently developed hierarchical nonlinear principal component analysis (NLPCA) method of Saegusa et al. (Neurocomputing 2004;61:57-70 and IEICE Trans Inf Syst 2005;E88-D:2242-2248), the complexities of the free energy landscapes of several peptides, including triglycine, hexaalanine, and the C-terminal beta-hairpin of protein G, were studied. First, the performance of this NLPCA method was compared with that of standard linear principal component analysis (PCA). In particular, the two methods were compared according to (1) their ability to reduce dimensionality and (2) how efficiently they represent peptide conformations in low-dimensional spaces spanned by the first few principal components. The study revealed that NLPCA reduces the dimensionality of the considered systems much better than PCA does. For example, to achieve a similar error in representing the original beta-hairpin data in a low-dimensional space, one needs 4 principal components with NLPCA but 21 with PCA. Second, by representing the free energy landscapes of the considered systems as a function of the first two principal components obtained from PCA, we obtained relatively well-structured free energy landscapes. In contrast, the free energy landscapes from NLPCA are much more complicated, exhibiting many states which are hidden in the PCA maps, especially in the unfolded regions. Furthermore, the study also showed that many states in the PCA maps mix several peptide conformations, while those of the NLPCA maps are purer. This finding suggests that NLPCA should be used to capture the essential features of the systems. (c) 2006 Wiley-Liss, Inc.

  16. Application of principal component regression and partial least squares regression in ultraviolet spectrum water quality detection

    NASA Astrophysics Data System (ADS)

    Li, Jiangtong; Luo, Yongdao; Dai, Honglin

    2018-01-01

    Water is the source of life and the essential foundation of all life. With the development of industrialization, water pollution is becoming more and more frequent, which directly affects human survival and development. Water quality detection is one of the necessary measures to protect water resources. Ultraviolet (UV) spectral analysis is an important research method in the field of water quality detection, in which the partial least squares regression (PLSR) analysis method is becoming the predominant technique; however, in some special cases PLSR produces considerable errors. In order to solve this problem, the traditional principal component regression (PCR) analysis method was improved in this paper by using the principle of PLSR. The experimental results show that for some special experimental data sets the improved PCR analysis method performs better than PLSR. PCR and PLSR are the focus of this paper. Firstly, principal component analysis (PCA) is performed in MATLAB to reduce the dimensionality of the spectral data; on the basis of a large number of experiments, the optimized principal component, which carries most of the original data information, is extracted using the principle of PLSR. Secondly, linear regression analysis of the principal components is carried out with the Statistical Package for the Social Sciences (SPSS), from which the coefficients and relations of the principal components are obtained. Finally, the same water spectral data set is processed by PLSR and by the improved PCR and the two results are analyzed and compared: the improved PCR and PLSR are similar for most data, but the improved PCR is better than PLSR for data near the detection limit. Both PLSR and the improved PCR can be used in ultraviolet spectral analysis of water, but for data near the detection limit the improved PCR gives better results than PLSR.
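
    The PCR-versus-PLSR comparison discussed above can be sketched on synthetic absorbance spectra, with both models scored by cross-validated R²; the data generator and component numbers are illustrative assumptions, and this is not the paper's MATLAB/SPSS workflow.

    ```python
    # Hedged sketch: principal component regression vs. partial least squares on spectra.
    import numpy as np
    from sklearn.cross_decomposition import PLSRegression
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline

    rng = np.random.default_rng(0)
    n, wavelengths = 120, 200
    conc = rng.uniform(0, 1, size=n)                      # "concentration" of the analyte
    peak = np.exp(-0.5 * ((np.arange(wavelengths) - 80) / 10.0) ** 2)
    spectra = np.outer(conc, peak) + 0.02 * rng.normal(size=(n, wavelengths))

    pcr = make_pipeline(PCA(n_components=5), LinearRegression())
    pls = PLSRegression(n_components=5)
    for name, model in [("PCR", pcr), ("PLSR", pls)]:
        r2 = cross_val_score(model, spectra, conc, cv=5).mean()
        print(f"{name}: mean cross-validated R^2 = {r2:.3f}")
    ```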

  17. Principal Curves on Riemannian Manifolds.

    PubMed

    Hauberg, Soren

    2016-09-01

    Euclidean statistics are often generalized to Riemannian manifolds by replacing straight-line interpolations with geodesic ones. While these Riemannian models are familiar-looking, they are restricted by the inflexibility of geodesics, and they rely on constructions which are optimal only in Euclidean domains. We consider extensions of Principal Component Analysis (PCA) to Riemannian manifolds. Classic Riemannian approaches seek a geodesic curve passing through the mean that optimizes a criterion of interest. The requirements that the solution both is geodesic and must pass through the mean tend to imply that the methods only work well when the manifold is mostly flat within the support of the generating distribution. We argue that instead of generalizing linear Euclidean models, it is more fruitful to generalize non-linear Euclidean models. Specifically, we extend the classic Principal Curves from Hastie & Stuetzle to data residing on a complete Riemannian manifold. We show that for elliptical distributions in the tangent of spaces of constant curvature, the standard principal geodesic is a principal curve. The proposed model is simple to compute and avoids many of the pitfalls of traditional geodesic approaches. We empirically demonstrate the effectiveness of the Riemannian principal curves on several manifolds and datasets.

  18. Evaluating filterability of different types of sludge by statistical analysis: The role of key organic compounds in extracellular polymeric substances.

    PubMed

    Xiao, Keke; Chen, Yun; Jiang, Xie; Zhou, Yan

    2017-03-01

    An investigation was conducted for 20 different types of sludge in order to identify the key organic compounds in extracellular polymeric substances (EPS) that are important in assessing variations of sludge filterability. The different types of sludge varied in initial total solids (TS) content, organic composition and pre-treatment methods. For instance, some of the sludges were pre-treated by acid, ultrasonic, thermal, alkaline, or advanced oxidation techniques. The Pearson correlation results showed significant correlations between sludge filterability and zeta potential, pH, dissolved organic carbon, protein and polysaccharide in soluble EPS (SB EPS), loosely bound EPS (LB EPS) and tightly bound EPS (TB EPS). The principal component analysis (PCA) method was used to further explore correlations between variables and similarities among EPS fractions of different types of sludge. Two principal components were extracted: principal component 1 accounted for 59.24% of total EPS variations, while principal component 2 accounted for 25.46% of total EPS variations. Dissolved organic carbon, protein and polysaccharide in LB EPS showed higher eigenvector projection values than the corresponding compounds in SB EPS and TB EPS in principal component 1. Further characterization of the fractionated key organic compounds in LB EPS was conducted with size-exclusion chromatography-organic carbon detection-organic nitrogen detection (LC-OCD-OND). A numerical multiple linear regression model was established to describe the relationship between organic compounds in LB EPS and sludge filterability. Copyright © 2016 Elsevier Ltd. All rights reserved.

  19. How many atoms are required to characterize accurately trajectory fluctuations of a protein?

    NASA Astrophysics Data System (ADS)

    Cukier, Robert I.

    2010-06-01

    Large molecules, whose thermal fluctuations sample a complex energy landscape, exhibit motions on an extended range of space and time scales. Principal component analysis (PCA) is often used to extract dominant motions that in proteins are typically domain motions. These motions are captured in the large eigenvalue (leading) principal components. There is also information in the small eigenvalues, arising from approximate linear dependencies among the coordinates. These linear dependencies suggest that instead of using all the atom coordinates to represent a trajectory, it should be possible to use a reduced set of coordinates with little loss in the information captured by the large eigenvalue principal components. In this work, methods that can monitor the correlation (overlap) between a reduced set of atoms and any number of retained principal components are introduced. For application to trajectory data generated by simulations, where the overall translational and rotational motion needs to be eliminated before PCA is carried out, some difficulties with the overlap measures arise and methods are developed to overcome them. The overlap measures are evaluated for a trajectory generated by molecular dynamics for the protein adenylate kinase, which consists of a stable, core domain, and two more mobile domains, referred to as the LID domain and the AMP-binding domain. The use of reduced sets corresponding, for the smallest set, to one-eighth of the alpha carbon (CA) atoms relative to using all the CA atoms is shown to predict the dominant motions of adenylate kinase. The overlap between using all the CA atoms and all the backbone atoms is essentially unity for a sum over PCA modes that effectively capture the exact trajectory. A reduction to a few atoms (three in the LID and three in the AMP-binding domain) shows that at least the first principal component, characterizing a large part of the LID-binding and AMP-binding motion, is well described. Based on these results, the overlap criterion should be applicable as a guide to postulating and validating coarse-grained descriptions of generic biomolecular assemblies.
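
    The overlap measures developed in the paper are not spelled out in the abstract; as a rough stand-in, the sketch below computes a common subspace-overlap score (the root mean square inner product, RMSIP) between principal modes obtained from all atoms and from a one-in-eight subset, on a purely synthetic trajectory array.

      # Illustrative subspace overlap (RMSIP) between PCA modes from all atoms and
      # from a reduced atom subset; `traj` is a synthetic (frames, atoms, 3) array.
      import numpy as np

      def pca_modes(coords, n_modes):
          """Top principal modes of centered, flattened coordinates."""
          flat = coords.reshape(coords.shape[0], -1)
          flat = flat - flat.mean(axis=0)
          _, _, vt = np.linalg.svd(flat, full_matrices=False)
          return vt[:n_modes]

      rng = np.random.default_rng(1)
      traj = rng.normal(size=(500, 64, 3))          # placeholder trajectory
      subset = np.arange(0, 64, 8)                  # keep every eighth atom

      full_modes = pca_modes(traj, 10)
      reduced_modes = pca_modes(traj[:, subset, :], 10)

      # Restrict the full modes to the subset coordinates and renormalize, so both
      # mode sets live in the same space before computing the overlap.
      cols = np.stack([3 * subset, 3 * subset + 1, 3 * subset + 2], axis=1).ravel()
      full_on_subset = full_modes[:, cols]
      full_on_subset /= np.linalg.norm(full_on_subset, axis=1, keepdims=True)

      rmsip = np.sqrt(((full_on_subset @ reduced_modes.T) ** 2).sum() / 10)
      print("RMSIP between full and reduced mode sets:", round(float(rmsip), 3))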

  20. Quantitative structure-activity relationship study of P2X7 receptor inhibitors using combination of principal component analysis and artificial intelligence methods.

    PubMed

    Ahmadi, Mehdi; Shahlaei, Mohsen

    2015-01-01

    P2X7 antagonist activity for a set of 49 molecules of the P2X7 receptor antagonists, derivatives of purine, was modeled with the aid of chemometric and artificial intelligence techniques. The activity of these compounds was estimated by means of combination of principal component analysis (PCA), as a well-known data reduction method, genetic algorithm (GA), as a variable selection technique, and artificial neural network (ANN), as a non-linear modeling method. First, a linear regression, combined with PCA, (principal component regression) was operated to model the structure-activity relationships, and afterwards a combination of PCA and ANN algorithm was employed to accurately predict the biological activity of the P2X7 antagonist. PCA preserves as much of the information as possible contained in the original data set. Seven most important PC's to the studied activity were selected as the inputs of ANN box by an efficient variable selection method, GA. The best computational neural network model was a fully-connected, feed-forward model with 7-7-1 architecture. The developed ANN model was fully evaluated by different validation techniques, including internal and external validation, and chemical applicability domain. All validations showed that the constructed quantitative structure-activity relationship model suggested is robust and satisfactory.
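
    The descriptor set and the genetic-algorithm selection step are not reproducible from the abstract alone; the sketch below only illustrates the PCA-to-network stage, with the GA replaced by simply taking the first seven components and with random placeholder descriptors and activities.

      # Minimal PCA -> neural-network sketch for a QSAR-style problem; the
      # descriptor matrix X and activity vector y are placeholders, and the
      # genetic-algorithm selection is replaced by taking the first seven PCs.
      import numpy as np
      from sklearn.decomposition import PCA
      from sklearn.preprocessing import StandardScaler
      from sklearn.neural_network import MLPRegressor
      from sklearn.model_selection import train_test_split

      rng = np.random.default_rng(0)
      X = rng.normal(size=(49, 120))        # 49 molecules, 120 descriptors
      y = rng.normal(size=49)               # placeholder activities

      X_std = StandardScaler().fit_transform(X)
      pcs = PCA(n_components=7).fit_transform(X_std)   # network inputs

      X_tr, X_te, y_tr, y_te = train_test_split(pcs, y, test_size=0.2, random_state=0)
      ann = MLPRegressor(hidden_layer_sizes=(7,), max_iter=5000, random_state=0)
      ann.fit(X_tr, y_tr)
      print("external-set R^2:", ann.score(X_te, y_te))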

  1. Quantitative structure–activity relationship study of P2X7 receptor inhibitors using combination of principal component analysis and artificial intelligence methods

    PubMed Central

    Ahmadi, Mehdi; Shahlaei, Mohsen

    2015-01-01

    P2X7 antagonist activity for a set of 49 molecules of the P2X7 receptor antagonists, derivatives of purine, was modeled with the aid of chemometric and artificial intelligence techniques. The activity of these compounds was estimated by means of combination of principal component analysis (PCA), as a well-known data reduction method, genetic algorithm (GA), as a variable selection technique, and artificial neural network (ANN), as a non-linear modeling method. First, a linear regression, combined with PCA, (principal component regression) was operated to model the structure–activity relationships, and afterwards a combination of PCA and ANN algorithm was employed to accurately predict the biological activity of the P2X7 antagonist. PCA preserves as much of the information as possible contained in the original data set. Seven most important PC's to the studied activity were selected as the inputs of ANN box by an efficient variable selection method, GA. The best computational neural network model was a fully-connected, feed-forward model with 7−7−1 architecture. The developed ANN model was fully evaluated by different validation techniques, including internal and external validation, and chemical applicability domain. All validations showed that the constructed quantitative structure–activity relationship model suggested is robust and satisfactory. PMID:26600858

  2. The conservative behavior of dissolved organic carbon in surface waters of the southern Chukchi Sea, Arctic Ocean, during early summer

    PubMed Central

    Tanaka, Kazuki; Takesue, Nobuyuki; Nishioka, Jun; Kondo, Yoshiko; Ooki, Atsushi; Kuma, Kenshi; Hirawake, Toru; Yamashita, Youhei

    2016-01-01

    The spatial distribution of dissolved organic carbon (DOC) concentrations and the optical properties of dissolved organic matter (DOM) determined by ultraviolet-visible absorbance and fluorescence spectroscopy were measured in surface waters of the southern Chukchi Sea, western Arctic Ocean, during the early summer of 2013. Neither the DOC concentration nor the optical parameters of the DOM correlated with salinity. Principal component analysis using the DOM optical parameters clearly separated the DOM sources. A significant linear relationship was evident between the DOC and the principal component score for specific water masses, indicating that a high DOC level was related to a terrigenous source, whereas a low DOC level was related to a marine source. Relationships between the DOC and the principal component scores of the surface waters of the southern Chukchi Sea implied that the major factor controlling the distribution of DOC concentrations was the mixing of plural water masses rather than local production and degradation. PMID:27658444

  3. Understanding software faults and their role in software reliability modeling

    NASA Technical Reports Server (NTRS)

    Munson, John C.

    1994-01-01

    This study is a direct result of an on-going project to model the reliability of a large real-time control avionics system. In previous modeling efforts with this system, hardware reliability models were applied in modeling the reliability behavior of this system. In an attempt to enhance the performance of the adapted reliability models, certain software attributes were introduced in these models to control for differences between programs and also sequential executions of the same program. As the basic nature of the software attributes that affect software reliability become better understood in the modeling process, this information begins to have important implications on the software development process. A significant problem arises when raw attribute measures are to be used in statistical models as predictors, for example, of measures of software quality. This is because many of the metrics are highly correlated. Consider the two attributes: lines of code, LOC, and number of program statements, Stmts. In this case, it is quite obvious that a program with a high value of LOC probably will also have a relatively high value of Stmts. In the case of low level languages, such as assembly language programs, there might be a one-to-one relationship between the statement count and the lines of code. When there is a complete absence of linear relationship among the metrics, they are said to be orthogonal or uncorrelated. Usually the lack of orthogonality is not serious enough to affect a statistical analysis. However, for the purposes of some statistical analysis such as multiple regression, the software metrics are so strongly interrelated that the regression results may be ambiguous and possibly even misleading. Typically, it is difficult to estimate the unique effects of individual software metrics in the regression equation. The estimated values of the coefficients are very sensitive to slight changes in the data and to the addition or deletion of variables in the regression equation. Since most of the existing metrics have common elements and are linear combinations of these common elements, it seems reasonable to investigate the structure of the underlying common factors or components that make up the raw metrics. The technique we have chosen to use to explore this structure is a procedure called principal components analysis. Principal components analysis is a decomposition technique that may be used to detect and analyze collinearity in software metrics. When confronted with a large number of metrics measuring a single construct, it may be desirable to represent the set by some smaller number of variables that convey all, or most, of the information in the original set. Principal components are linear transformations of a set of random variables that summarize the information contained in the variables. The transformations are chosen so that the first component accounts for the maximal amount of variation of the measures of any possible linear transform; the second component accounts for the maximal amount of residual variation; and so on. The principal components are constructed so that they represent transformed scores on dimensions that are orthogonal. Through the use of principal components analysis, it is possible to have a set of highly related software attributes mapped into a small number of uncorrelated attribute domains. This definitively solves the problem of multi-collinearity in subsequent regression analysis. 
There are many software metrics in the literature, but principal component analysis reveals that there are few distinct sources of variation, i.e. dimensions, in this set of metrics. It would appear perfectly reasonable to characterize the measurable attributes of a program with a simple function of a small number of orthogonal metrics each of which represents a distinct software attribute domain.
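
    As a toy illustration of the argument above, the sketch below builds two nearly collinear metrics (lines of code and statement count), inspects the principal component spectrum, and regresses a placeholder quality measure on the leading orthogonal components instead of the raw metrics; all numbers are synthetic.

      # Two highly correlated metrics (LOC and statement count) plus unrelated
      # metrics are reduced to orthogonal principal components before regression,
      # sidestepping the coefficient instability caused by multicollinearity.
      import numpy as np
      from sklearn.decomposition import PCA
      from sklearn.linear_model import LinearRegression

      rng = np.random.default_rng(0)
      loc = rng.integers(50, 2000, size=200).astype(float)
      stmts = 0.8 * loc + rng.normal(scale=20, size=200)     # nearly collinear with LOC
      other = rng.normal(size=(200, 3))                      # unrelated metrics
      X = np.column_stack([loc, stmts, other])
      faults = 0.01 * loc + rng.normal(scale=2, size=200)    # placeholder quality measure

      Xz = (X - X.mean(axis=0)) / X.std(axis=0)
      pca = PCA().fit(Xz)
      print("explained variance ratios:", np.round(pca.explained_variance_ratio_, 3))

      # Regress on the leading, mutually orthogonal components instead of raw metrics.
      Z = pca.transform(Xz)[:, :2]
      model = LinearRegression().fit(Z, faults)
      print("R^2 on orthogonal components:", round(model.score(Z, faults), 3))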

  4. Tracing and separating plasma components causing matrix effects in hydrophilic interaction chromatography-electrospray ionization mass spectrometry.

    PubMed

    Ekdahl, Anja; Johansson, Maria C; Ahnoff, Martin

    2013-04-01

    Matrix effects on electrospray ionization were investigated for plasma samples analysed by hydrophilic interaction chromatography (HILIC) in gradient elution mode, and HILIC columns of different chemistries were tested for separation of plasma components and model analytes. By combining mass spectral data with post-column infusion traces, the following components of protein-precipitated plasma were identified and found to have a significant effect on ionization: urea, creatinine, phosphocholine, lysophosphocholine, sphingomyelin, sodium ion, chloride ion, choline and proline betaine. The observed effect on ionization was both matrix-component and analyte dependent. The separation of identified plasma components and model analytes on eight columns was compared, using pair-wise linear correlation analysis and principal component analysis (PCA). Large changes in selectivity could be obtained by change of column, while smaller changes were seen when the mobile phase buffer was changed from ammonium formate pH 3.0 to ammonium acetate pH 4.5. While results from PCA and linear correlation analysis were largely in accord, linear correlation analysis was judged to be more straightforward to carry out and interpret.

  5. Artificial neural networks and multiple linear regression model using principal components to estimate rainfall over South America

    NASA Astrophysics Data System (ADS)

    Soares dos Santos, T.; Mendes, D.; Rodrigues Torres, R.

    2016-01-01

    Several studies have been devoted to dynamic and statistical downscaling for analysis of both climate variability and climate change. This paper introduces an application of artificial neural networks (ANNs) and multiple linear regression (MLR) by principal components to estimate rainfall in South America. This method is proposed for downscaling monthly precipitation time series over South America for three regions: the Amazon; northeastern Brazil; and the La Plata Basin, which is one of the regions of the planet that will be most affected by the climate change projected for the end of the 21st century. The downscaling models were developed and validated using CMIP5 model output and observed monthly precipitation. We used general circulation model (GCM) experiments for the 20th century (RCP historical; 1970-1999) and two scenarios (RCP 2.6 and 8.5; 2070-2100). The model test results indicate that the ANNs significantly outperform the MLR downscaling of monthly precipitation variability.

  6. Artificial neural networks and multiple linear regression model using principal components to estimate rainfall over South America

    NASA Astrophysics Data System (ADS)

    dos Santos, T. S.; Mendes, D.; Torres, R. R.

    2015-08-01

    Several studies have been devoted to dynamic and statistical downscaling for analysis of both climate variability and climate change. This paper introduces an application of artificial neural networks (ANN) and multiple linear regression (MLR) by principal components to estimate rainfall in South America. This method is proposed for downscaling monthly precipitation time series over South America for three regions: the Amazon, Northeastern Brazil and the La Plata Basin, which is one of the regions of the planet that will be most affected by the climate change projected for the end of the 21st century. The downscaling models were developed and validated using CMIP5 model output and observed monthly precipitation. We used GCM experiments for the 20th century (RCP Historical; 1970-1999) and two scenarios (RCP 2.6 and 8.5; 2070-2100). The model test results indicate that the ANN significantly outperforms the MLR downscaling of monthly precipitation variability.

  7. High Accuracy Passive Magnetic Field-Based Localization for Feedback Control Using Principal Component Analysis.

    PubMed

    Foong, Shaohui; Sun, Zhenglong

    2016-08-12

    In this paper, a novel magnetic field-based sensing system employing statistically optimized concurrent multiple sensor outputs for precise field-position association and localization is presented. This method capitalizes on the independence between simultaneous spatial field measurements at multiple locations to induce unique correspondences between field and position. This single-source-multi-sensor configuration is able to achieve accurate and precise localization and tracking of translational motion without contact over large travel distances for feedback control. Principal component analysis (PCA) is used as a pseudo-linear filter to optimally reduce the dimensions of the multi-sensor output space for computationally efficient field-position mapping with artificial neural networks (ANNs). Numerical simulations are employed to investigate the effects of geometric parameters and Gaussian noise corruption on PCA assisted ANN mapping performance. Using a 9-sensor network, the sensing accuracy and closed-loop tracking performance of the proposed optimal field-based sensing system is experimentally evaluated on a linear actuator with a significantly more expensive optical encoder as a comparison.

  8. A new modulated Hebbian learning rule--biologically plausible method for local computation of a principal subspace.

    PubMed

    Jankovic, Marko; Ogawa, Hidemitsu

    2003-08-01

    This paper presents one possible implementation of a transformation that performs a linear mapping to a lower-dimensional subspace; the principal component subspace is the one analyzed. The idea implemented in this paper is a generalization of the recently proposed infinity OH neural method for principal component extraction. The calculations in the newly proposed method are performed locally, a feature which is usually considered desirable from the biological point of view. Compared to some other well-known methods, the proposed synaptic efficacy learning rule requires less information about the values of the other efficacies to make a single efficacy modification. Synaptic efficacies are modified by implementation of a Modulated Hebb-type (MH) learning rule. A slightly modified MH algorithm, named the Modulated Hebb-Oja (MHO) algorithm, is also introduced. The structural similarity of the proposed network to part of the retinal circuit is presented as well.
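
    The modulated Hebb-Oja update itself is not specified in the abstract, so the sketch below shows only the classical Oja rule it builds on, which extracts the first principal direction with purely local Hebbian updates.

      # Classical Oja rule for the first principal component; the modulated
      # variants discussed above differ in how the Hebbian term is modulated,
      # which the abstract does not spell out, so only the baseline rule is shown.
      import numpy as np

      rng = np.random.default_rng(0)
      C = np.array([[3.0, 1.0], [1.0, 1.0]])
      X = rng.multivariate_normal([0.0, 0.0], C, size=5000)

      w = rng.normal(size=2)
      eta = 0.01
      for x in X:
          y = w @ x
          w += eta * y * (x - y * w)       # local Hebbian update with built-in decay

      w /= np.linalg.norm(w)
      true_pc = np.linalg.eigh(C)[1][:, -1]
      print("alignment with the true first PC:", round(abs(w @ true_pc), 3))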

  9. [Interrelations between plant communities and environmental factors of wetlands and surrounding lands in mid- and lower reaches of Tarim River].

    PubMed

    Zhao, Ruifeng; Zhou, Huarong; Qian, Yibing; Zhang, Jianjun

    2006-06-01

    A total of 16 quadrats of wetlands and surrounding lands in the mid- and lower reaches of the Tarim River were surveyed, and data on the characteristics of the plant communities and on environmental factors were collected. Using PCA (principal component analysis) ordination and regression procedures, the distribution patterns of the plant communities and the relationships between plant community structure and environmental factors were analyzed. The results showed that the distribution of the plant communities was closely related to soil moisture, salt, and nutrient contents. The accumulative contribution rate of soil moisture and salt contents in the first principal component accounted for 35.70%, and that of soil nutrient content in the second principal component reached 25.97%. There were 4 types of habitats for the plant community distribution, i.e., fenny--light salt--medium nutrient, moist--medium salt--medium nutrient, mesophytic--medium salt--low nutrient, and medium xerophytic--heavy salt--low nutrient. Along these habitats, swamp vegetation, meadow vegetation, riparian sparse forest, halophytic desert, and salinized shrub were distributed. In the wetlands and surrounding lands of the mid- and lower reaches of the Tarim River, the ecological dominance of the plant communities was significantly correlated, through a univariate linear relationship, with the compound gradient of soil moisture and salt contents, and the relationships between species diversity, ecological dominance, and this compound gradient were well described by a binary linear regression model.

  10. Effect of noise in principal component analysis with an application to ozone pollution

    NASA Astrophysics Data System (ADS)

    Tsakiri, Katerina G.

    This thesis analyzes the effect of independent noise in principal components of k normally distributed random variables defined by a covariance matrix. We prove that the principal components as well as the canonical variate pairs determined from joint distribution of original sample affected by noise can be essentially different in comparison with those determined from the original sample. However when the differences between the eigenvalues of the original covariance matrix are sufficiently large compared to the level of the noise, the effect of noise in principal components and canonical variate pairs proved to be negligible. The theoretical results are supported by simulation study and examples. Moreover, we compare our results about the eigenvalues and eigenvectors in the two dimensional case with other models examined before. This theory can be applied in any field for the decomposition of the components in multivariate analysis. One application is the detection and prediction of the main atmospheric factor of ozone concentrations on the example of Albany, New York. Using daily ozone, solar radiation, temperature, wind speed and precipitation data, we determine the main atmospheric factor for the explanation and prediction of ozone concentrations. A methodology is described for the decomposition of the time series of ozone and other atmospheric variables into the global term component which describes the long term trend and the seasonal variations, and the synoptic scale component which describes the short term variations. By using the Canonical Correlation Analysis, we show that solar radiation is the only main factor between the atmospheric variables considered here for the explanation and prediction of the global and synoptic scale component of ozone. The global term components are modeled by a linear regression model, while the synoptic scale components by a vector autoregressive model and the Kalman filter. The coefficient of determination, R2, for the prediction of the synoptic scale ozone component was found to be the highest when we consider the synoptic scale component of the time series for solar radiation and temperature. KEY WORDS: multivariate analysis; principal component; canonical variate pairs; eigenvalue; eigenvector; ozone; solar radiation; spectral decomposition; Kalman filter; time series prediction
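
    A small numerical illustration of the stated result, on assumed synthetic data: when the eigenvalue gap is large relative to the added independent noise, the leading principal direction barely rotates, whereas a nearly degenerate spectrum is far more sensitive.

      # Add independent noise to correlated Gaussian data and measure how far the
      # leading principal direction rotates, for a well-separated and a nearly
      # degenerate eigenvalue spectrum.
      import numpy as np

      def leading_direction(X):
          Xc = X - X.mean(axis=0)
          return np.linalg.svd(Xc, full_matrices=False)[2][0]

      rng = np.random.default_rng(0)
      for eigvals in ([10.0, 1.0], [2.0, 1.5]):       # large vs small eigenvalue gap
          X = rng.multivariate_normal([0.0, 0.0], np.diag(eigvals), size=2000)
          v_clean = leading_direction(X)
          v_noisy = leading_direction(X + rng.normal(scale=1.0, size=X.shape))
          angle = np.degrees(np.arccos(np.clip(abs(v_clean @ v_noisy), -1.0, 1.0)))
          print(f"eigenvalues {eigvals}: first PC rotated by {angle:.1f} degrees")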

  11. Classifying Facial Actions

    PubMed Central

    Donato, Gianluca; Bartlett, Marian Stewart; Hager, Joseph C.; Ekman, Paul; Sejnowski, Terrence J.

    2010-01-01

    The Facial Action Coding System (FACS) [23] is an objective method for quantifying facial movement in terms of component actions. This system is widely used in behavioral investigations of emotion, cognitive processes, and social interaction. The coding is presently performed by highly trained human experts. This paper explores and compares techniques for automatically recognizing facial actions in sequences of images. These techniques include analysis of facial motion through estimation of optical flow; holistic spatial analysis, such as principal component analysis, independent component analysis, local feature analysis, and linear discriminant analysis; and methods based on the outputs of local filters, such as Gabor wavelet representations and local principal components. Performance of these systems is compared to naive and expert human subjects. Best performances were obtained using the Gabor wavelet representation and the independent component representation, both of which achieved 96 percent accuracy for classifying 12 facial actions of the upper and lower face. The results provide converging evidence for the importance of using local filters, high spatial frequencies, and statistical independence for classifying facial actions. PMID:21188284

  12. Characterization of the lateral distribution of fluorescent lipid in binary-constituent lipid monolayers by principal component analysis.

    PubMed

    Sugár, István P; Zhai, Xiuhong; Boldyrev, Ivan A; Molotkovsky, Julian G; Brockman, Howard L; Brown, Rhoderick E

    2010-01-01

    Lipid lateral organization in binary-constituent monolayers consisting of fluorescent and nonfluorescent lipids has been investigated by acquiring multiple emission spectra during measurement of each force-area isotherm. The emission spectra reflect BODIPY-labeled lipid surface concentration and lateral mixing with different nonfluorescent lipid species. Using principal component analysis (PCA) each spectrum could be approximated as the linear combination of only two principal vectors. One point on a plane could be associated with each spectrum, where the coordinates of the point are the coefficients of the linear combination. Points belonging to the same lipid constituents and experimental conditions form a curve on the plane, where each point belongs to a different mole fraction. The location and shape of the curve reflects the lateral organization of the fluorescent lipid mixed with a specific nonfluorescent lipid. The method provides massive data compression that preserves and emphasizes key information pertaining to lipid distribution in different lipid monolayer phases. Collectively, the capacity of PCA for handling large spectral data sets, the nanoscale resolution afforded by the fluorescence signal, and the inherent versatility of monolayers for characterization of lipid lateral interactions enable significantly enhanced resolution of lipid lateral organizational changes induced by different lipid compositions.

  13. Decomposing the Apoptosis Pathway Into Biologically Interpretable Principal Components

    PubMed Central

    Wang, Min; Kornblau, Steven M; Coombes, Kevin R

    2018-01-01

    Principal component analysis (PCA) is one of the most common techniques in the analysis of biological data sets, but applying PCA raises 2 challenges. First, one must determine the number of significant principal components (PCs). Second, because each PC is a linear combination of genes, it rarely has a biological interpretation. Existing methods to determine the number of PCs are either subjective or computationally extensive. We review several methods and describe a new R package, PCDimension, that implements additional methods, the most important being an algorithm that extends and automates a graphical Bayesian method. Using simulations, we compared the methods. Our newly automated procedure is competitive with the best methods when considering both accuracy and speed and is the most accurate when the number of objects is small compared with the number of attributes. We applied the method to a proteomics data set from patients with acute myeloid leukemia. Proteins in the apoptosis pathway could be explained using 6 PCs. By clustering the proteins in PC space, we were able to replace the PCs by 6 “biological components,” 3 of which could be immediately interpreted from the current literature. We expect this approach combining PCA with clustering to be widely applicable. PMID:29881252
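
    The automated Bayesian procedure in PCDimension is not reproduced here; as a generic point of comparison, the sketch below estimates the number of significant components with a simple column-permutation test on synthetic data containing three true components.

      # Simple column-permutation estimate of the number of significant PCs; this
      # is NOT the Bayesian procedure implemented in PCDimension, only a generic
      # illustration of automating the choice.
      import numpy as np
      from sklearn.decomposition import PCA

      rng = np.random.default_rng(0)
      signal = rng.normal(size=(100, 3)) @ rng.normal(size=(3, 40))
      X = signal + 0.5 * rng.normal(size=(100, 40))    # three true components + noise

      observed = PCA().fit(X).explained_variance_

      n_perm = 200
      null = np.empty((n_perm, observed.size))
      for i in range(n_perm):
          Xp = np.column_stack([rng.permutation(col) for col in X.T])
          null[i] = PCA().fit(Xp).explained_variance_

      threshold = np.percentile(null, 95, axis=0)      # per-rank null eigenvalues
      keep = observed > threshold
      n_signif = observed.size if keep.all() else int(np.argmax(~keep))
      print("estimated number of significant PCs:", n_signif)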

  14. Structured functional additive regression in reproducing kernel Hilbert spaces.

    PubMed

    Zhu, Hongxiao; Yao, Fang; Zhang, Hao Helen

    2014-06-01

    Functional additive models (FAMs) provide a flexible yet simple framework for regressions involving functional predictors. The utilization of data-driven basis in an additive rather than linear structure naturally extends the classical functional linear model. However, the critical issue of selecting nonlinear additive components has been less studied. In this work, we propose a new regularization framework for the structure estimation in the context of Reproducing Kernel Hilbert Spaces. The proposed approach takes advantage of the functional principal components which greatly facilitates the implementation and the theoretical analysis. The selection and estimation are achieved by penalized least squares using a penalty which encourages the sparse structure of the additive components. Theoretical properties such as the rate of convergence are investigated. The empirical performance is demonstrated through simulation studies and a real data application.

  15. Spectral decomposition of asteroid Itokawa based on principal component analysis

    NASA Astrophysics Data System (ADS)

    Koga, Sumire C.; Sugita, Seiji; Kamata, Shunichi; Ishiguro, Masateru; Hiroi, Takahiro; Tatsumi, Eri; Sasaki, Sho

    2018-01-01

    The heliocentric stratification of asteroid spectral types may hold important information on the early evolution of the Solar System. Asteroid spectral taxonomy is based largely on principal component analysis. However, how the surface properties of asteroids, such as composition and age, are projected in the principal-component (PC) space is not understood well. We decompose multi-band disk-resolved visible spectra of the Itokawa surface with principal component analysis (PCA) in comparison with main-belt asteroids. The obtained distribution of Itokawa spectra projected in the PC space of main-belt asteroids follows a linear trend linking the Q-type and S-type regions and is consistent with the results of space-weathering experiments on ordinary chondrites and olivine, suggesting that this trend may be a space-weathering-induced spectral evolution track for S-type asteroids. Comparison with space-weathering experiments also yields a short average surface age (< a few million years) for Itokawa, consistent with the cosmic-ray-exposure time of returned samples from Itokawa. The Itokawa PC score distribution exhibits asymmetry along the evolution track, strongly suggesting that space weathering has begun to saturate on this young asteroid. The freshest spectrum found on Itokawa exhibits a clear sign of space weathering, indicating again that space weathering occurs very rapidly on this body. We also conducted PCA on Itokawa spectra alone and compared the results with space-weathering experiments. The obtained results indicate that the first principal component of the Itokawa surface spectra is consistent with spectral change due to space weathering and that the spatial variation in the degree of space weathering is very large (a factor of three in surface age), which strongly suggests the presence of strong regional/local resurfacing process(es) on this small asteroid.

  16. On the Problems of Construction and Statistical Inference Associated with a Generalization of Canonical Variables.

    DTIC Science & Technology

    1982-02-01

    of them are presented in this paper. As an application, important practical problems similar to the one posed by Gnanadesikan (1977), p. 77 can be... Gnanadesikan and Wilk (1969) to search for a non-linear combination, giving rise to a non-linear first principal component. So, a p-dimensional vector can...distribution, Gnanadesikan and Gupta (1970) and earlier Eaton (1967) have considered the problem of ranking the r underlying populations according to the

  17. Correcting for population structure and kinship using the linear mixed model: theory and extensions.

    PubMed

    Hoffman, Gabriel E

    2013-01-01

    Population structure and kinship are widespread confounding factors in genome-wide association studies (GWAS). It has been standard practice to include principal components of the genotypes in a regression model in order to account for population structure. More recently, the linear mixed model (LMM) has emerged as a powerful method for simultaneously accounting for population structure and kinship. The statistical theory underlying the differences in empirical performance between modeling principal components as fixed versus random effects has not been thoroughly examined. We undertake an analysis to formalize the relationship between these widely used methods and elucidate the statistical properties of each. Moreover, we introduce a new statistic, effective degrees of freedom, that serves as a metric of model complexity and a novel low rank linear mixed model (LRLMM) to learn the dimensionality of the correction for population structure and kinship, and we assess its performance through simulations. A comparison of the results of LRLMM and a standard LMM analysis applied to GWAS data from the Multi-Ethnic Study of Atherosclerosis (MESA) illustrates how our theoretical results translate into empirical properties of the mixed model. Finally, the analysis demonstrates the ability of the LRLMM to substantially boost the strength of an association for HDL cholesterol in Europeans.

  18. Wavelet packets for multi- and hyper-spectral imagery

    NASA Astrophysics Data System (ADS)

    Benedetto, J. J.; Czaja, W.; Ehler, M.; Flake, C.; Hirn, M.

    2010-01-01

    State of the art dimension reduction and classification schemes in multi- and hyper-spectral imaging rely primarily on the information contained in the spectral component. To better capture the joint spatial and spectral data distribution we combine the Wavelet Packet Transform with the linear dimension reduction method of Principal Component Analysis. Each spectral band is decomposed by means of the Wavelet Packet Transform and we consider a joint entropy across all the spectral bands as a tool to exploit the spatial information. Dimension reduction is then applied to the Wavelet Packets coefficients. We present examples of this technique for hyper-spectral satellite imaging. We also investigate the role of various shrinkage techniques to model non-linearity in our approach.

  19. Influence of damping on the frequency-dependent polarizabilities of doped quantum dot

    NASA Astrophysics Data System (ADS)

    Pal, Suvajit; Ghosh, Manas

    2014-09-01

    We investigate the profiles of diagonal components of frequency-dependent linear (αxx and αyy), and first nonlinear (βxxx and βyyy) optical response of repulsive impurity doped quantum dots. The dopant impurity potential chosen assumes Gaussian form. The study principally focuses on investigating the role of damping on the polarizability components. In view of this the dopant is considered to be propagating under damped condition which is otherwise linear inherently. The frequency-dependent polarizabilities are then analyzed by placing the doped dot to a periodically oscillating external electric field of given intensity. The damping strength, in conjunction with external oscillation frequency and confinement potentials, fabricate the polarizability components in a fascinating manner which is adorned with emergence of maximization, minimization, and saturation. The discrimination in the values of the polarizability components in x and y-directions has also been addressed in the present context.

  20. Discriminative components of data.

    PubMed

    Peltonen, Jaakko; Kaski, Samuel

    2005-01-01

    A simple probabilistic model is introduced to generalize classical linear discriminant analysis (LDA) in finding components that are informative of or relevant for data classes. The components maximize the predictability of the class distribution which is asymptotically equivalent to 1) maximizing mutual information with the classes, and 2) finding principal components in the so-called learning or Fisher metrics. The Fisher metric measures only distances that are relevant to the classes, that is, distances that cause changes in the class distribution. The components have applications in data exploration, visualization, and dimensionality reduction. In empirical experiments, the method outperformed, in addition to more classical methods, a Renyi entropy-based alternative while having essentially equivalent computational cost.

  1. Crude oil price forecasting based on hybridizing wavelet multiple linear regression model, particle swarm optimization techniques, and principal component analysis.

    PubMed

    Shabri, Ani; Samsudin, Ruhaidah

    2014-01-01

    Crude oil prices do play significant role in the global economy and are a key input into option pricing formulas, portfolio allocation, and risk measurement. In this paper, a hybrid model integrating wavelet and multiple linear regressions (MLR) is proposed for crude oil price forecasting. In this model, Mallat wavelet transform is first selected to decompose an original time series into several subseries with different scale. Then, the principal component analysis (PCA) is used in processing subseries data in MLR for crude oil price forecasting. The particle swarm optimization (PSO) is used to adopt the optimal parameters of the MLR model. To assess the effectiveness of this model, daily crude oil market, West Texas Intermediate (WTI), has been used as the case study. Time series prediction capability performance of the WMLR model is compared with the MLR, ARIMA, and GARCH models using various statistics measures. The experimental results show that the proposed model outperforms the individual models in forecasting of the crude oil prices series.

  2. Nonparametric regression applied to quantitative structure-activity relationships

    PubMed

    Constans; Hirst

    2000-03-01

    Several nonparametric regressors have been applied to modeling quantitative structure-activity relationship (QSAR) data. The simplest regressor, the Nadaraya-Watson, was assessed in a genuine multivariate setting. Other regressors, the local linear and the shifted Nadaraya-Watson, were implemented within additive models--a computationally more expedient approach, better suited for low-density designs. Performances were benchmarked against the nonlinear method of smoothing splines. A linear reference point was provided by multilinear regression (MLR). Variable selection was explored using systematic combinations of different variables and combinations of principal components. For the data set examined, 47 inhibitors of dopamine beta-hydroxylase, the additive nonparametric regressors have greater predictive accuracy (as measured by the mean absolute error of the predictions or the Pearson correlation in cross-validation trials) than MLR. The use of principal components did not improve the performance of the nonparametric regressors over use of the original descriptors, since the original descriptors are not strongly correlated. It remains to be seen if the nonparametric regressors can be successfully coupled with better variable selection and dimensionality reduction in the context of high-dimensional QSARs.

  3. Crude Oil Price Forecasting Based on Hybridizing Wavelet Multiple Linear Regression Model, Particle Swarm Optimization Techniques, and Principal Component Analysis

    PubMed Central

    Shabri, Ani; Samsudin, Ruhaidah

    2014-01-01

    Crude oil prices do play significant role in the global economy and are a key input into option pricing formulas, portfolio allocation, and risk measurement. In this paper, a hybrid model integrating wavelet and multiple linear regressions (MLR) is proposed for crude oil price forecasting. In this model, Mallat wavelet transform is first selected to decompose an original time series into several subseries with different scale. Then, the principal component analysis (PCA) is used in processing subseries data in MLR for crude oil price forecasting. The particle swarm optimization (PSO) is used to adopt the optimal parameters of the MLR model. To assess the effectiveness of this model, daily crude oil market, West Texas Intermediate (WTI), has been used as the case study. Time series prediction capability performance of the WMLR model is compared with the MLR, ARIMA, and GARCH models using various statistics measures. The experimental results show that the proposed model outperforms the individual models in forecasting of the crude oil prices series. PMID:24895666

  4. Linear measurements of the neurocranium are better indicators of population differences than those of the facial skeleton: comparative study of 1,961 skulls.

    PubMed

    Holló, Gábor; Szathmáry, László; Marcsik, Antónia; Barta, Zoltán

    2010-02-01

    The aim of this study is to individualize potential differences between two cranial regions used to differentiate human populations. We compared the neurocranium and the facial skeleton using skulls from the Great Hungarian Plain. The skulls date to the 1st-11th centuries, a long space of time that encompasses seven archaeological periods. We analyzed six neurocranial and seven facial measurements. The reduction of the number of variables was carried out using principal components analysis. Linear mixed-effects models were fitted to the principal components of each archaeological period, and then the models were compared using multiple pairwise tests. The neurocranium showed significant differences in seven cases between nonsubsequent periods and in one case, between two subsequent populations. For the facial skeleton, no significant results were found. Our results, which are also compared to previous craniofacial heritability estimates, suggest that the neurocranium is a more conservative region and that population differences can be pointed out better in the neurocranium than in the facial skeleton.

  5. Alignment of the Stanford Linear Collider Arcs: Concepts and results

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pitthan, R.; Bell, B.; Friedsam, H.

    1987-02-01

    The alignment of the Arcs for the Stanford Linear Collider at SLAC has posed problems in accelerator survey and alignment not encountered before. These problems come less from the tight tolerances of 0.1 mm, although reaching such a tight statistically defined accuracy in a controlled manner is difficult enough, but from the absence of a common reference plane for the Arcs. Traditional circular accelerators, including HERA and LEP, have been designed in one plane referenced to local gravity. For the SLC Arcs no such single plane exists. Methods and concepts developed to solve these and other problems, connected with the unique design of SLC, range from the first use of satellites for accelerator alignment, use of electronic laser theodolites for placement of components, computer control of the manual adjustment process, complete automation of the data flow incorporating the most advanced concepts of geodesy, strict separation of survey and alignment, to linear principal component analysis for the final statistical smoothing of the mechanical components.

  6. Signal-to-noise contribution of principal component loads in reconstructed near-infrared Raman tissue spectra.

    PubMed

    Grimbergen, M C M; van Swol, C F P; Kendall, C; Verdaasdonk, R M; Stone, N; Bosch, J L H R

    2010-01-01

    The overall quality of Raman spectra in the near-infrared region, where biological samples are often studied, has benefited from various improvements to optical instrumentation over the past decade. However, obtaining ample spectral quality for analysis is still challenging due to device requirements and short integration times required for (in vivo) clinical applications of Raman spectroscopy. Multivariate analytical methods, such as principal component analysis (PCA) and linear discriminant analysis (LDA), are routinely applied to Raman spectral datasets to develop classification models. Data compression is necessary prior to discriminant analysis to prevent or decrease the degree of over-fitting. The logical threshold for the selection of principal components (PCs) to be used in discriminant analysis is likely to be at a point before the PCs begin to introduce equivalent signal and noise and, hence, include no additional value. Assessment of the signal-to-noise ratio (SNR) at a certain peak or over a specific spectral region will depend on the sample measured. Therefore, the mean SNR over the whole spectral region (SNR(msr)) is determined in the original spectrum as well as for spectra reconstructed from an increasing number of principal components. This paper introduces a method of assessing the influence of signal and noise from individual PC loads and indicates a method of selection of PCs for LDA. To evaluate this method, two data sets with different SNRs were used. The sets were obtained with the same Raman system and the same measurement parameters on bladder tissue collected during white light cystoscopy (set A) and fluorescence-guided cystoscopy (set B). This method shows that the mean SNR over the spectral range in the original Raman spectra of these two data sets is related to the signal and noise contribution of principal component loads. The difference in mean SNR over the spectral range can also be appreciated since fewer principal components can reliably be used in the low SNR data set (set B) compared to the high SNR data set (set A). Despite the fact that no definitive threshold could be found, this method may help to determine the cutoff for the number of principal components used in discriminant analysis. Future analysis of a selection of spectral databases using this technique will allow optimum thresholds to be selected for different applications and spectral data quality levels.
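
    The Raman data sets are not available, so the sketch below only mimics, on synthetic spectra, the quantity tracked in the paper: a crude mean signal-to-noise estimate over the spectral range for reconstructions built from an increasing number of principal components.

      # Reconstruct synthetic "spectra" from the first k principal components and
      # track a crude mean signal-to-noise estimate over the spectral range.
      import numpy as np
      from sklearn.decomposition import PCA

      rng = np.random.default_rng(0)
      wav = np.linspace(0.0, 1.0, 300)
      centers = rng.uniform(0.2, 0.8, size=80)
      clean = np.array([np.exp(-((wav - c) ** 2) / 0.002) for c in centers])
      spectra = clean + 0.05 * rng.normal(size=clean.shape)   # noisy measurements

      pca = PCA().fit(spectra)
      for k in (1, 2, 5, 10, 20):
          scores = pca.transform(spectra)[:, :k]
          recon = scores @ pca.components_[:k] + pca.mean_
          resid = spectra - recon
          snr = recon.std(axis=1).mean() / resid.std(axis=1).mean()  # rough proxy
          print(f"k = {k:2d}   mean SNR estimate = {snr:.1f}")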

  7. Statistical techniques applied to aerial radiometric surveys (STAARS): principal components analysis user's manual. [NURE program

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Koch, C.D.; Pirkle, F.L.; Schmidt, J.S.

    1981-01-01

    A Principal Components Analysis (PCA) has been written to aid in the interpretation of multivariate aerial radiometric data collected by the US Department of Energy (DOE) under the National Uranium Resource Evaluation (NURE) program. The variations exhibited by these data have been reduced and classified into a number of linear combinations by using the PCA program. The PCA program then generates histograms and outlier maps of the individual variates. Black and white plots can be made on a Calcomp plotter by the application of follow-up programs. All programs referred to in this guide were written for a DEC-10. From this analysis a geologist may begin to interpret the data structure. Insight into geological processes underlying the data may be obtained.

  8. Structured functional additive regression in reproducing kernel Hilbert spaces

    PubMed Central

    Zhu, Hongxiao; Yao, Fang; Zhang, Hao Helen

    2013-01-01

    Summary Functional additive models (FAMs) provide a flexible yet simple framework for regressions involving functional predictors. The utilization of data-driven basis in an additive rather than linear structure naturally extends the classical functional linear model. However, the critical issue of selecting nonlinear additive components has been less studied. In this work, we propose a new regularization framework for the structure estimation in the context of Reproducing Kernel Hilbert Spaces. The proposed approach takes advantage of the functional principal components which greatly facilitates the implementation and the theoretical analysis. The selection and estimation are achieved by penalized least squares using a penalty which encourages the sparse structure of the additive components. Theoretical properties such as the rate of convergence are investigated. The empirical performance is demonstrated through simulation studies and a real data application. PMID:25013362

  9. Co-pyrolysis characteristics and kinetic analysis of organic food waste and plastic.

    PubMed

    Tang, Yijing; Huang, Qunxing; Sun, Kai; Chi, Yong; Yan, Jianhua

    2018-02-01

    In this work, typical organic food waste (soybean protein (SP)) and typical chlorine enriched plastic waste (polyvinyl chloride (PVC)) were chosen as principal MSW components and their interaction during co-pyrolysis was investigated. Results indicate that the interaction accelerated the reaction during co-pyrolysis. The activation energies needed were 2-13% lower for the decomposition of mixture compared with linear calculation while the maximum reaction rates were 12-16% higher than calculation. In the fixed-bed experiments, interaction was observed to reduce the yield of tar by 2-69% and promote the yield of char by 13-39% compared with linear calculation. In addition, 2-6 times more heavy components and 61-93% less nitrogen-containing components were formed for tar derived from mixtures. Copyright © 2017 Elsevier Ltd. All rights reserved.

  10. Balancing Vibrations at Harmonic Frequencies by Injecting Harmonic Balancing Signals into the Armature of a Linear Motor/Alternator Coupled to a Stirling Machine

    NASA Technical Reports Server (NTRS)

    Holliday, Ezekiel S. (Inventor)

    2014-01-01

    Vibrations at harmonic frequencies are reduced by injecting harmonic balancing signals into the armature of a linear motor/alternator coupled to a Stirling machine. The vibrations are sensed to provide a signal representing the mechanical vibrations. A harmonic balancing signal is generated for selected harmonics of the operating frequency by processing the sensed vibration signal with adaptive filter algorithms of adaptive filters for each harmonic. Reference inputs for each harmonic are applied to the adaptive filter algorithms at the frequency of the selected harmonic. The harmonic balancing signals for all of the harmonics are summed with a principal control signal. The harmonic balancing signals modify the principal electrical drive voltage and drive the motor/alternator with a drive voltage component in opposition to the vibration at each harmonic.
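
    The patented controller is not reproduced here; the sketch below is a generic adaptive-feedforward illustration of the same idea, in which sine and cosine reference inputs at one selected harmonic are weighted by an LMS update so that the synthesized balancing signal drives the sensed vibration at that harmonic toward zero.

      # Sine/cosine references at the selected harmonic are weighted by an LMS
      # update; the weighted sum is the harmonic balancing signal, and the
      # residual (sensed minus balancing) shrinks as the weights adapt.
      import numpy as np

      fs, f0, n = 2000.0, 60.0, 20000        # sample rate, harmonic frequency, steps
      t = np.arange(n) / fs
      vibration = 0.8 * np.sin(2 * np.pi * f0 * t + 0.7)   # unwanted harmonic

      w = np.zeros(2)                        # weights on the two reference inputs
      mu = 0.01                              # adaptation step size
      residual = np.empty(n)
      for k in range(n):
          ref = np.array([np.sin(2 * np.pi * f0 * t[k]), np.cos(2 * np.pi * f0 * t[k])])
          balance = w @ ref                  # harmonic balancing signal
          e = vibration[k] - balance         # sensed residual vibration
          w += 2 * mu * e * ref              # LMS weight update
          residual[k] = e

      print("residual RMS, first vs last second:",
            round(residual[:int(fs)].std(), 4), round(residual[-int(fs):].std(), 4))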

  11. Support vector machine based classification of fast Fourier transform spectroscopy of proteins

    NASA Astrophysics Data System (ADS)

    Lazarevic, Aleksandar; Pokrajac, Dragoljub; Marcano, Aristides; Melikechi, Noureddine

    2009-02-01

    Fast Fourier transform spectroscopy has proved to be a powerful method for study of the secondary structure of proteins since peak positions and their relative amplitude are affected by the number of hydrogen bridges that sustain this secondary structure. However, to our best knowledge, the method has not been used yet for identification of proteins within a complex matrix like a blood sample. The principal reason is the apparent similarity of protein infrared spectra with actual differences usually masked by the solvent contribution and other interactions. In this paper, we propose a novel machine learning based method that uses protein spectra for classification and identification of such proteins within a given sample. The proposed method uses principal component analysis (PCA) to identify most important linear combinations of original spectral components and then employs support vector machine (SVM) classification model applied on such identified combinations to categorize proteins into one of given groups. Our experiments have been performed on the set of four different proteins, namely: Bovine Serum Albumin, Leptin, Insulin-like Growth Factor 2 and Osteopontin. Our proposed method of applying principal component analysis along with support vector machines exhibits excellent classification accuracy when identifying proteins using their infrared spectra.
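
    The FT-IR spectra of the four proteins are not available, so the pipeline described above (PCA for dimension reduction followed by an SVM classifier) is sketched on placeholder spectra with four synthetic classes.

      # PCA for dimension reduction followed by an SVM classifier, evaluated by
      # cross-validation on placeholder spectra with four synthetic classes.
      import numpy as np
      from sklearn.decomposition import PCA
      from sklearn.pipeline import make_pipeline
      from sklearn.preprocessing import StandardScaler
      from sklearn.svm import SVC
      from sklearn.model_selection import cross_val_score

      rng = np.random.default_rng(0)
      n_per_class, n_points = 30, 400
      labels = np.repeat(np.arange(4), n_per_class)        # four "protein" classes
      shifts = labels[:, None] * 0.3                       # class-dependent peak shift
      spectra = np.sin(np.linspace(0, 20, n_points) + shifts)
      spectra += 0.2 * rng.normal(size=spectra.shape)      # measurement noise

      clf = make_pipeline(StandardScaler(), PCA(n_components=10), SVC(kernel="rbf"))
      acc = cross_val_score(clf, spectra, labels, cv=5).mean()
      print("cross-validated accuracy:", round(acc, 3))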

  12. Prediction of Knee Joint Contact Forces From External Measures Using Principal Component Prediction and Reconstruction.

    PubMed

    Saliba, Christopher M; Clouthier, Allison L; Brandon, Scott C E; Rainbow, Michael J; Deluzio, Kevin J

    2018-05-29

    Abnormal loading of the knee joint contributes to the pathogenesis of knee osteoarthritis. Gait retraining is a non-invasive intervention that aims to reduce knee loads by providing audible, visual, or haptic feedback of gait parameters. The computational expense of joint contact force prediction has limited real-time feedback to surrogate measures of the contact force, such as the knee adduction moment. We developed a method to predict knee joint contact forces using motion analysis and a statistical regression model that can be implemented in near real-time. Gait waveform variables were deconstructed using principal component analysis and a linear regression was used to predict the principal component scores of the contact force waveforms. Knee joint contact force waveforms were reconstructed using the predicted scores. We tested our method using a heterogeneous population of asymptomatic controls and subjects with knee osteoarthritis. The reconstructed contact force waveforms had mean (SD) RMS differences of 0.17 (0.05) bodyweight compared to the contact forces predicted by a musculoskeletal model. Our method successfully predicted subject-specific shape features of contact force waveforms and is a potentially powerful tool in biofeedback and clinical gait analysis.
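
    No gait or contact-force data accompany the abstract; the sketch below reproduces only the general scheme on synthetic waveforms: PCA deconstructs the waveforms, a linear regression predicts the PC scores from external measures, and the waveforms are rebuilt from the predicted scores.

      # PCA deconstructs synthetic force waveforms, a linear regression predicts
      # the PC scores from external gait measures, and waveforms are rebuilt from
      # the predicted scores.
      import numpy as np
      from sklearn.decomposition import PCA
      from sklearn.linear_model import LinearRegression
      from sklearn.model_selection import train_test_split

      rng = np.random.default_rng(0)
      n_subj, n_time, n_feat = 120, 101, 8
      gait = rng.normal(size=(n_subj, n_feat))                   # external measures
      basis = rng.normal(size=(3, n_time))
      forces = gait[:, :3] @ basis + 0.1 * rng.normal(size=(n_subj, n_time))

      g_tr, g_te, f_tr, f_te = train_test_split(gait, forces, test_size=0.3,
                                                random_state=0)

      pca = PCA(n_components=3).fit(f_tr)                        # waveform deconstruction
      reg = LinearRegression().fit(g_tr, pca.transform(f_tr))    # predict PC scores
      recon = pca.inverse_transform(reg.predict(g_te))           # rebuild waveforms

      rms = np.sqrt(((recon - f_te) ** 2).mean())
      print("RMS difference of reconstructed waveforms:", round(float(rms), 3))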

  13. Modeling vertebrate diversity in Oregon using satellite imagery

    NASA Astrophysics Data System (ADS)

    Cablk, Mary Elizabeth

    Vertebrate diversity was modeled for the state of Oregon using a parametric approach to regression tree analysis. This exploratory data analysis effectively modeled the non-linear relationships between vertebrate richness and phenology, terrain, and climate. Phenology was derived from time-series NOAA-AVHRR satellite imagery for the year 1992 using two methods: principal component analysis and derivation of EROS data center greenness metrics. These two measures of spatial and temporal vegetation condition incorporated the critical temporal element in this analysis. The first three principal components were shown to contain spatial and temporal information about the landscape and discriminated phenologically distinct regions in Oregon. Principal components 2 and 3, 6 greenness metrics, elevation, slope, aspect, annual precipitation, and annual seasonal temperature difference were investigated as correlates to amphibians, birds, all vertebrates, reptiles, and mammals. Variation explained for each regression tree by taxa were: amphibians (91%), birds (67%), all vertebrates (66%), reptiles (57%), and mammals (55%). Spatial statistics were used to quantify the pattern of each taxa and assess validity of resulting predictions from regression tree models. Regression tree analysis was relatively robust against spatial autocorrelation in the response data and graphical results indicated models were well fit to the data.

  14. Towards Solving the Mixing Problem in the Decomposition of Geophysical Time Series by Independent Component Analysis

    NASA Technical Reports Server (NTRS)

    Aires, Filipe; Rossow, William B.; Chedin, Alain; Hansen, James E. (Technical Monitor)

    2000-01-01

    The use of the Principal Component Analysis technique for the analysis of geophysical time series has been questioned in particular for its tendency to extract components that mix several physical phenomena even when the signal is just their linear sum. We demonstrate with a data simulation experiment that the Independent Component Analysis, a recently developed technique, is able to solve this problem. This new technique requires the statistical independence of components, a stronger constraint, that uses higher-order statistics, instead of the classical decorrelation a weaker constraint, that uses only second-order statistics. Furthermore, ICA does not require additional a priori information such as the localization constraint used in Rotational Techniques.
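
    A toy version of the simulation argument, with assumed synthetic sources: two independent signals are linearly mixed, PCA components remain mixtures, and FastICA recovers components close to the original sources (up to sign and scale).

      # Two independent sources are linearly mixed; PCA components stay mixed,
      # while FastICA recovers components close to the sources (up to sign/scale).
      import numpy as np
      from sklearn.decomposition import PCA, FastICA

      rng = np.random.default_rng(0)
      t = np.linspace(0, 10, 4000)
      sources = np.column_stack([np.sign(np.sin(3 * t)),   # square wave
                                 np.sin(5 * t)])           # sinusoid
      mixed = sources @ np.array([[1.0, 0.6], [0.4, 1.0]]).T
      mixed += 0.02 * rng.normal(size=mixed.shape)         # small sensor noise

      def best_corr(est, true):
          """Best absolute correlation of any estimated component with each source."""
          return [max(abs(np.corrcoef(est[:, i], true[:, j])[0, 1]) for i in range(2))
                  for j in range(2)]

      pca_est = PCA(n_components=2).fit_transform(mixed)
      ica_est = FastICA(n_components=2, random_state=0).fit_transform(mixed)
      print("PCA correlations with the true sources:", np.round(best_corr(pca_est, sources), 2))
      print("ICA correlations with the true sources:", np.round(best_corr(ica_est, sources), 2))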

  15. Spatial variation analyses of Thematic Mapper data for the identification of linear features in agricultural landscapes

    NASA Technical Reports Server (NTRS)

    Pelletier, R. E.

    1984-01-01

    A need exists for digitized information pertaining to linear features such as roads, streams, water bodies and agricultural field boundaries as component parts of a data base. For many areas where this data may not yet exist or is in need of updating, these features may be extracted from remotely sensed digital data. This paper examines two approaches for identifying linear features, one utilizing raw data and the other classified data. Each approach uses a series of data enhancement procedures including derivation of standard deviation values, principal component analysis and filtering procedures using a high-pass window matrix. Just as certain bands better classify different land covers, so too do these bands exhibit high spectral contrast by which boundaries between land covers can be delineated. A few applications for this kind of data are briefly discussed, including its potential in a Universal Soil Loss Equation Model.

  16. Using dynamic mode decomposition for real-time background/foreground separation in video

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kutz, Jose Nathan; Grosek, Jacob; Brunton, Steven

    The technique of dynamic mode decomposition (DMD) is disclosed herein for the purpose of robustly separating video frames into background (low-rank) and foreground (sparse) components in real-time. Foreground/background separation is achieved at the computational cost of just one singular value decomposition (SVD) and one linear equation solve, thus producing results orders of magnitude faster than robust principal component analysis (RPCA). Additional techniques, including techniques for analyzing the video for multi-resolution time-scale components, and techniques for reusing computations to allow processing of streaming video in real time, are also described herein.
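
    The disclosed real-time system is not reproduced here; the sketch below is a bare-bones exact DMD on a synthetic frame matrix, in which the mode whose eigenvalue has magnitude near one is taken as the static background and the remainder as foreground.

      # Exact DMD on a synthetic frame matrix: columns are vectorized frames, the
      # mode with eigenvalue magnitude closest to one is treated as the static
      # background, and the remainder as (sparse) foreground.
      import numpy as np

      rng = np.random.default_rng(0)
      n_pix, n_frames = 400, 60
      background = rng.uniform(size=(n_pix, 1)) * np.ones((1, n_frames))
      foreground = np.zeros((n_pix, n_frames))
      for k in range(n_frames):                     # a small moving bright patch
          start = (5 * k) % n_pix
          foreground[start:start + 20, k] = 1.0
      V = background + foreground

      X, Y = V[:, :-1], V[:, 1:]
      U, s, Wt = np.linalg.svd(X, full_matrices=False)
      r = 10                                        # truncation rank
      U, s, Wt = U[:, :r], s[:r], Wt[:r]
      A_tilde = U.T @ Y @ Wt.T @ np.diag(1.0 / s)   # reduced-order propagator
      eigvals, eigvecs = np.linalg.eig(A_tilde)
      modes = Y @ Wt.T @ np.diag(1.0 / s) @ eigvecs # exact DMD modes

      bg = np.argmin(np.abs(np.abs(eigvals) - 1.0)) # mode closest to |lambda| = 1
      b = np.linalg.lstsq(modes, V[:, 0].astype(complex), rcond=None)[0]
      dynamics = b[bg] * eigvals[bg] ** np.arange(n_frames)
      low_rank = np.outer(modes[:, bg], dynamics).real     # background estimate
      sparse = V - low_rank                                # foreground estimate
      print("mean absolute background error:", round(np.abs(low_rank - background).mean(), 3))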

  17. Improved Statistical Fault Detection Technique and Application to Biological Phenomena Modeled by S-Systems.

    PubMed

    Mansouri, Majdi; Nounou, Mohamed N; Nounou, Hazem N

    2017-09-01

    In our previous work, we demonstrated the effectiveness of the linear multiscale principal component analysis (PCA)-based moving window (MW)-generalized likelihood ratio test (GLRT) technique over the classical PCA and multiscale principal component analysis (MSPCA)-based GLRT methods. The developed fault detection algorithm provided optimal properties by maximizing the detection probability for a particular false alarm rate (FAR) for different window lengths. However, most real systems are nonlinear, so the linear PCA method cannot adequately handle the non-linearity. Thus, in this paper, we first apply a nonlinear PCA to obtain accurate principal components of a data set and handle a wide range of nonlinearities using the kernel principal component analysis (KPCA) model. The KPCA is among the most popular nonlinear statistical methods. Second, we extend the MW-GLRT technique to one that applies exponential weights to the residuals in the moving window (instead of equal weights), which can further improve fault detection performance by reducing the FAR using an exponentially weighted moving average (EWMA). The developed detection method, which is called EWMA-GLRT, provides improved properties, such as smaller missed detection rates, smaller FARs, and a smaller average run length. The idea behind the developed EWMA-GLRT is to compute a new GLRT statistic that integrates current and previous data information in a decreasing exponential fashion, giving more weight to the more recent data. This provides a more accurate estimation of the GLRT statistic and a stronger memory that enables better decision making with respect to fault detection. Therefore, in this paper, a KPCA-based EWMA-GLRT method is developed and applied in practice to improve fault detection in biological phenomena modeled by S-systems and to enhance monitoring of the process mean. The idea behind the KPCA-based EWMA-GLRT fault detection algorithm is to combine the advantages brought forward by the proposed EWMA-GLRT fault detection chart with the KPCA model. Thus, it is used to enhance fault detection of the Cad System in E. coli model through monitoring some of the key variables involved in this model, such as enzymes, transport proteins, regulatory proteins, lysine, and cadaverine. The results demonstrate the effectiveness of the proposed KPCA-based EWMA-GLRT method over the Q, GLRT, EWMA, Shewhart, and moving window-GLRT methods. The detection performance is assessed and evaluated in terms of FAR, missed detection rates, and average run length (ARL1) values.
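    A greatly simplified sketch of the monitoring idea is shown below: a KPCA model is fitted to fault-free data, a reconstruction-error (SPE-like) residual is computed for new samples, and an EWMA of that residual is compared with an empirical control limit. The GLRT statistic of the paper is replaced here by this simpler EWMA chart, and the process data, kernel parameters, and threshold are all invented for illustration.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import KernelPCA

rng = np.random.default_rng(0)

# Hypothetical nonlinear process data: training (fault-free) and test (fault after sample 100).
t = rng.uniform(-1, 1, size=(300, 1))
train = np.hstack([t, t ** 2, np.sin(3 * t)]) + 0.02 * rng.standard_normal((300, 3))
t2 = rng.uniform(-1, 1, size=(200, 1))
test = np.hstack([t2, t2 ** 2, np.sin(3 * t2)]) + 0.02 * rng.standard_normal((200, 3))
test[100:, 1] += 0.5                               # a small additive fault on one variable

scaler = StandardScaler().fit(train)
kpca = KernelPCA(n_components=2, kernel="rbf", gamma=2.0,
                 fit_inverse_transform=True).fit(scaler.transform(train))

def spe(X):
    """Squared prediction error (residual) of the KPCA model in the scaled input space."""
    Z = scaler.transform(X)
    Z_hat = kpca.inverse_transform(kpca.transform(Z))
    return np.sum((Z - Z_hat) ** 2, axis=1)

def ewma(x, lam=0.2):
    """EWMA of a statistic: more weight on recent samples, as in an EWMA chart."""
    out = np.empty_like(x, dtype=float)
    acc = x[0]
    for i, v in enumerate(x):
        acc = lam * v + (1 - lam) * acc
        out[i] = acc
    return out

threshold = np.percentile(ewma(spe(train)), 99)     # simple empirical control limit
alarms = ewma(spe(test)) > threshold
print("first alarm at test sample:", int(np.argmax(alarms)) if alarms.any() else None)
```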

  18. Kernel PLS-SVC for Linear and Nonlinear Discrimination

    NASA Technical Reports Server (NTRS)

    Rosipal, Roman; Trejo, Leonard J.; Matthews, Bryan

    2003-01-01

    A new methodology for discrimination is proposed, based on kernel orthonormalized partial least squares (PLS) dimensionality reduction of the original data space followed by support vector machines for classification. The close connection of orthonormalized PLS to Fisher's approach to linear discrimination, or equivalently to canonical correlation analysis, is described. This motivates the use of orthonormalized PLS rather than principal component analysis. The good behavior of the proposed method is demonstrated on 13 different benchmark data sets and on the real-world problem of classifying finger-movement periods versus non-movement periods based on electroencephalogram recordings.

  19. A unified development of several techniques for the representation of random vectors and data sets

    NASA Technical Reports Server (NTRS)

    Bundick, W. T.

    1973-01-01

    Linear vector space theory is used to develop a general representation of a set of data vectors or random vectors by linear combinations of orthonormal vectors such that the mean squared error of the representation is minimized. The orthonormal vectors are shown to be the eigenvectors of an operator. The general representation is applied to several specific problems involving the use of the Karhunen-Loeve expansion, principal component analysis, and empirical orthogonal functions; and the common properties of these representations are developed.

  20. A first application of independent component analysis to extracting structure from stock returns.

    PubMed

    Back, A D; Weigend, A S

    1997-08-01

    This paper explores the application of a signal processing technique known as independent component analysis (ICA) or blind source separation to multivariate financial time series such as a portfolio of stocks. The key idea of ICA is to linearly map the observed multivariate time series into a new space of statistically independent components (ICs). We apply ICA to three years of daily returns of the 28 largest Japanese stocks and compare the results with those obtained using principal component analysis. The results indicate that the estimated ICs fall into two categories, (i) infrequent large shocks (responsible for the major changes in the stock prices), and (ii) frequent smaller fluctuations (contributing little to the overall level of the stocks). We show that the overall stock price can be reconstructed surprisingly well by using a small number of thresholded weighted ICs. In contrast, when using shocks derived from principal components instead of independent components, the reconstructed price is less similar to the original one. ICA is shown to be a potentially powerful method of analyzing and understanding driving mechanisms in financial time series. The application to portfolio optimization is described in Chin and Weigend (1998).

  1. Near-infrared Raman spectroscopy for estimating biochemical changes associated with different pathological conditions of cervix

    NASA Astrophysics Data System (ADS)

    Daniel, Amuthachelvi; Prakasarao, Aruna; Ganesan, Singaravelu

    2018-02-01

    Molecular-level changes associated with oncogenesis precede the morphological changes in cells and tissues; hence molecular-level diagnosis would promote early diagnosis of the disease. Raman spectroscopy is capable of providing specific spectral signatures of the various biomolecules present in cells and tissues under different pathological conditions. The aim of this work is to develop a non-linear multi-class statistical methodology for discrimination of normal, neoplastic and malignant cells/tissues. The tissues were classified as normal, pre-malignant and malignant by employing Principal Component Analysis followed by an Artificial Neural Network (PC-ANN). The overall accuracy achieved was 99%. Further, to gain insight into the quantitative biochemical composition of the normal, neoplastic and malignant tissues, a linear combination of the major biochemicals was fitted to the measured Raman spectra of the tissues by a non-negative least squares technique. This technique confirms the changes in major biomolecules such as lipids, nucleic acids, actin, glycogen and collagen associated with the different pathological conditions. To study the efficacy of this technique in comparison with histopathology, we utilized Principal Component Analysis followed by Linear Discriminant Analysis (PC-LDA) to discriminate well differentiated, moderately differentiated and poorly differentiated squamous cell carcinoma with an accuracy of 94.0%. The results demonstrated that Raman spectroscopy has the potential to complement the established technique of histopathology.
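    A minimal sketch of the PC-LDA step described above, assuming placeholder "spectra" and labels rather than the authors' Raman data, is given below; the number of retained components and the cross-validation scheme are arbitrary choices.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

# Placeholder "Raman spectra": rows are spectra, columns are wavenumber channels,
# and y holds three tissue grades (all values here are synthetic stand-ins).
rng = np.random.default_rng(0)
n_per_class, n_channels = 40, 500
centers = rng.normal(0, 1, size=(3, n_channels))
X = np.vstack([c + 0.5 * rng.standard_normal((n_per_class, n_channels)) for c in centers])
y = np.repeat([0, 1, 2], n_per_class)

# PC-LDA: compress the spectra to a few principal component scores,
# then apply a linear discriminant in that low-dimensional space.
pc_lda = make_pipeline(StandardScaler(), PCA(n_components=10),
                       LinearDiscriminantAnalysis())
scores = cross_val_score(pc_lda, X, y, cv=5)
print("PC-LDA cross-validated accuracy: %.3f" % scores.mean())
```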

  2. Principal Component Analysis for Normal-Distribution-Valued Symbolic Data.

    PubMed

    Wang, Huiwen; Chen, Meiling; Shi, Xiaojun; Li, Nan

    2016-02-01

    This paper puts forward a new approach to principal component analysis (PCA) for normal-distribution-valued symbolic data, which has a vast potential of applications in the economic and management fields. We derive a full set of numerical characteristics and the variance-covariance structure for such data, which forms the foundation for our analytical PCA approach. Our approach is able to use all of the variance information in the original data, rather than only the centers, vertices, etc. used by the prevailing representative-type approaches in the literature. The paper also provides an accurate approach to constructing the observations in a PC space based on the linear additivity property of the normal distribution. The effectiveness of the proposed method is illustrated by simulated numerical experiments. Finally, our method is applied to explain the puzzle of the risk-return tradeoff in China's stock market.

  3. Pattern classification of fMRI data: applications for analysis of spatially distributed cortical networks.

    PubMed

    Yourganov, Grigori; Schmah, Tanya; Churchill, Nathan W; Berman, Marc G; Grady, Cheryl L; Strother, Stephen C

    2014-08-01

    The field of fMRI data analysis is rapidly growing in sophistication, particularly in the domain of multivariate pattern classification. However, the interaction between the properties of the analytical model and the parameters of the BOLD signal (e.g. signal magnitude, temporal variance and functional connectivity) is still an open problem. We addressed this problem by evaluating a set of pattern classification algorithms on simulated and experimental block-design fMRI data. The set of classifiers consisted of linear and quadratic discriminants, linear support vector machine, and linear and nonlinear Gaussian naive Bayes classifiers. For linear discriminant, we used two methods of regularization: principal component analysis, and ridge regularization. The classifiers were used (1) to classify the volumes according to the behavioral task that was performed by the subject, and (2) to construct spatial maps that indicated the relative contribution of each voxel to classification. Our evaluation metrics were: (1) accuracy of out-of-sample classification and (2) reproducibility of spatial maps. In simulated data sets, we performed an additional evaluation of spatial maps with ROC analysis. We varied the magnitude, temporal variance and connectivity of simulated fMRI signal and identified the optimal classifier for each simulated environment. Overall, the best performers were linear and quadratic discriminants (operating on principal components of the data matrix) and, in some rare situations, a nonlinear Gaussian naïve Bayes classifier. The results from the simulated data were supported by within-subject analysis of experimental fMRI data, collected in a study of aging. This is the first study that systematically characterizes interactions between analysis model and signal parameters (such as magnitude, variance and correlation) on the performance of pattern classifiers for fMRI. Copyright © 2014 Elsevier Inc. All rights reserved.

  4. Modified neural networks for rapid recovery of tokamak plasma parameters for real time control

    NASA Astrophysics Data System (ADS)

    Sengupta, A.; Ranjan, P.

    2002-07-01

    Two modified neural network techniques are used for the identification of the equilibrium plasma parameters of the Superconducting Steady State Tokamak I from external magnetic measurements. This is expected to ultimately assist in real-time plasma control. Unlike the conventional network structure, in which a single network with the optimum number of processing elements calculates the outputs, one of the methods uses a multinetwork system connected in parallel to perform the calculations. This network is called the double neural network. The accuracy of the recovered parameters is clearly higher than that of the conventional network. The other type of neural network used here is based on statistical function parametrization combined with a neural network. The principal component transformation removes linear dependences from the measurements, and a dimensional reduction process reduces the dimensionality of the input space. This reduced and transformed input set, rather than the entire set, is fed into the neural network input. This is known as the principal component transformation-based neural network. The accuracy of the parameters recovered by the latter type of modified network is found to be a further improvement over that of the double neural network. This result differs from that obtained in an earlier work, where the double neural network showed better performance. The conventional network and the function parametrization methods have also been used for comparison. The conventional network has been used for an optimization of the set of magnetic diagnostics. The effective set of sensors, as assessed by this network, is compared with that of the principal-component-based network. The fault tolerance of the neural networks has been tested. The double neural network showed the maximum resistance to faults in the diagnostics, while the principal-component-based network performed poorly. Finally, the processing times of the methods have been compared. The double network and the principal component network involve the minimum computation time, although the conventional network also performs well enough to be used in real time.

  5. Statistical methods and regression analysis of stratospheric ozone and meteorological variables in Isfahan

    NASA Astrophysics Data System (ADS)

    Hassanzadeh, S.; Hosseinibalam, F.; Omidvari, M.

    2008-04-01

    Data on seven meteorological variables (relative humidity, wet temperature, dry temperature, maximum temperature, minimum temperature, ground temperature and sun radiation time) and ozone values have been used for statistical analysis. The meteorological variables and ozone values were analyzed using both multiple linear regression and principal component methods. Data for the period 1999-2004 are analyzed jointly using both methods. For all periods, the temperature-dependent variables were highly correlated with each other, but all were negatively correlated with relative humidity. Multiple regression analysis was used to model the variables, and a variable selection method based on high loadings of varimax-rotated principal components was used to obtain subsets of the predictor variables to be included in the linear regression model of the meteorological variables. In 1999, 2001 and 2002, one of the meteorological variables was weakly but predominantly influenced by the ozone concentrations. For the year 2000, however, the model did not indicate a predominant influence of the ozone concentrations on the meteorological variables, which points to variation in sun radiation. This could be due to other factors that were not explicitly considered in this study.
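    The following sketch illustrates the general idea of loading-based variable selection followed by multiple linear regression; it uses synthetic variables, retains the raw (unrotated) principal component loadings for brevity where the paper applies a varimax rotation, and simply keeps the variable with the largest absolute loading on each retained component.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

# Hypothetical daily data: 7 correlated "meteorological" variables generated from
# three latent factors, and a target series driven by the first factor. All values
# are synthetic, and the varimax rotation used in the paper is omitted for brevity.
rng = np.random.default_rng(0)
n, p = 365, 7
latent = rng.standard_normal((n, 3))
X = latent @ rng.standard_normal((3, p)) + 0.3 * rng.standard_normal((n, p))
target = 2.0 * latent[:, 0] + rng.normal(0, 0.5, n)

Z = StandardScaler().fit_transform(X)
pca = PCA(n_components=3).fit(Z)

# Keep, for each retained component, the variable with the highest absolute loading.
loadings = pca.components_                 # shape (n_components, n_variables)
selected = sorted(set(np.abs(loadings).argmax(axis=1)))
print("selected predictor columns:", selected)

# Multiple linear regression restricted to that subset of predictors.
model = LinearRegression().fit(X[:, selected], target)
print("R^2 with the selected subset: %.3f" % model.score(X[:, selected], target))
```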

  6. Linear degrees of freedom in speech production: analysis of cineradio- and labio-film data and articulatory-acoustic modeling.

    PubMed

    Beautemps, D; Badin, P; Bailly, G

    2001-05-01

    The following contribution addresses several issues concerning speech degrees of freedom in French oral vowels, stop, and fricative consonants based on an analysis of tongue and lip shapes extracted from cineradio- and labio-films. The midsagittal tongue shapes have been submitted to a linear decomposition where some of the loading factors were selected such as jaw and larynx position while four other components were derived from principal component analysis (PCA). For the lips, in addition to the more traditional protrusion and opening components, a supplementary component was extracted to explain the upward movement of both the upper and lower lips in [v] production. A linear articulatory model was developed; the six tongue degrees of freedom were used as the articulatory control parameters of the midsagittal tongue contours and explained 96% of the tongue data variance. These control parameters were also used to specify the frontal lip width dimension derived from the labio-film front views. Finally, this model was complemented by a conversion model going from the midsagittal to the area function, based on a fitting of the midsagittal distances and the formant frequencies for both vowels and consonants.

  7. Cognitive load, emotion, and performance in high-fidelity simulation among beginning nursing students: a pilot study.

    PubMed

    Schlairet, Maura C; Schlairet, Timothy James; Sauls, Denise H; Bellflowers, Lois

    2015-03-01

    Establishing the impact of the high-fidelity simulation environment on student performance, as well as identifying factors that could predict learning, would refine simulation outcome expectations among educators. The purpose of this quasi-experimental pilot study was to explore the impact of simulation on emotion and cognitive load among beginning nursing students. Forty baccalaureate nursing students participated in teaching simulations, rated their emotional state and cognitive load, and completed evaluation simulations. Two principal components of emotion were identified, representing the pleasant activation and pleasant deactivation components of affect. The mean rating of cognitive load following simulation was high. Linear regression identified slight but statistically nonsignificant positive associations between the principal components of emotion and cognitive load. Logistic regression identified a negative but statistically nonsignificant effect of cognitive load on assessment performance. Among lower ability students, a more pronounced effect of cognitive load on assessment performance was observed; this also was statistically nonsignificant. Copyright 2015, SLACK Incorporated.

  8. Multivariate Analysis of Solar Spectral Irradiance Measurements

    NASA Technical Reports Server (NTRS)

    Pilewskie, P.; Rabbette, M.

    2001-01-01

    Principal component analysis is used to characterize approximately 7000 downwelling solar irradiance spectra retrieved at the Southern Great Plains site during an Atmospheric Radiation Measurement (ARM) shortwave intensive operating period. This analysis technique has proven to be very effective in reducing a large set of variables into a much smaller set of independent variables while retaining the information content. It is used to determine the minimum number of parameters necessary to characterize atmospheric spectral irradiance or the dimensionality of atmospheric variability. It was found that well over 99% of the spectral information was contained in the first six mutually orthogonal linear combinations of the observed variables (flux at various wavelengths). Rotation of the principal components was effective in separating various components by their independent physical influences. The majority of the variability in the downwelling solar irradiance (380-1000 nm) was explained by the following fundamental atmospheric parameters (in order of their importance): cloud scattering, water vapor absorption, molecular scattering, and ozone absorption. In contrast to what has been proposed as a resolution to a clear-sky absorption anomaly, no unexpected gaseous absorption signature was found in any of the significant components.

  9. Understanding deformation mechanisms during powder compaction using principal component analysis of compression data.

    PubMed

    Roopwani, Rahul; Buckner, Ira S

    2011-10-14

    Principal component analysis (PCA) was applied to pharmaceutical powder compaction. A solid fraction parameter (SF(c/d)) and a mechanical work parameter (W(c/d)) representing irreversible compression behavior were determined as functions of applied load. Multivariate analysis of the compression data was carried out using PCA. The first principal component (PC1) showed loadings for the solid fraction and work values that agreed with changes in the relative significance of plastic deformation to consolidation at different pressures. The PC1 scores showed the same rank order as the relative plasticity ranking derived from the literature for common pharmaceutical materials. The utility of PC1 in understanding deformation was extended to binary mixtures using a subset of the original materials. Combinations of brittle and plastic materials were characterized using the PCA method. The relationships between PC1 scores and the weight fractions of the mixtures were typically linear showing ideal mixing in their deformation behaviors. The mixture consisting of two plastic materials was the only combination to show a consistent positive deviation from ideality. The application of PCA to solid fraction and mechanical work data appears to be an effective means of predicting deformation behavior during compaction of simple powder mixtures. Copyright © 2011 Elsevier B.V. All rights reserved.

  10. Classification of time-of-flight secondary ion mass spectrometry spectra from complex Cu-Fe sulphides by principal component analysis and artificial neural networks.

    PubMed

    Kalegowda, Yogesh; Harmer, Sarah L

    2013-01-08

    Artificial neural network (ANN) and hybrid principal component analysis-artificial neural network (PCA-ANN) classifiers have been successfully implemented for the classification of static time-of-flight secondary ion mass spectrometry (ToF-SIMS) mass spectra collected from complex Cu-Fe sulphides (chalcopyrite, bornite, chalcocite and pyrite) under different flotation conditions. ANNs are very good pattern classifiers because of their ability to learn and generalise patterns that are not linearly separable, their fault and noise tolerance, and their high parallelism. In the first approach, fragments from the whole ToF-SIMS spectrum were used as input to the ANN; the model yielded high overall correct classification rates of 100% for feed samples, 88% for conditioned feed samples and 91% for Eh-modified samples. In the second approach, the hybrid pattern classifier PCA-ANN was used. PCA is a very effective multivariate data analysis tool applied to enhance species features and reduce data dimensionality. Principal component (PC) scores, which accounted for 95% of the raw spectral data variance, were used as input to the ANN; the model yielded high overall correct classification rates of 88% for conditioned feed samples and 95% for Eh-modified samples. Copyright © 2012 Elsevier B.V. All rights reserved.
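    A compact sketch of the hybrid PCA-ANN classifier, with placeholder fragment intensities and labels standing in for the ToF-SIMS data, is shown below; the network size, the 95% variance threshold applied via scikit-learn's PCA, and the train/test split are illustrative.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

# Placeholder ToF-SIMS-like feature matrix (fragment intensities) and mineral labels.
rng = np.random.default_rng(0)
n_per_class, n_fragments = 50, 300
centers = rng.normal(0, 1, size=(4, n_fragments))
X = np.vstack([c + 0.7 * rng.standard_normal((n_per_class, n_fragments)) for c in centers])
y = np.repeat(["chalcopyrite", "bornite", "chalcocite", "pyrite"], n_per_class)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0,
                                          stratify=y)

# Hybrid PCA-ANN: PC scores covering 95% of the spectral variance feed the network.
pca_ann = make_pipeline(StandardScaler(),
                        PCA(n_components=0.95),
                        MLPClassifier(hidden_layer_sizes=(20,), max_iter=2000,
                                      random_state=0))
pca_ann.fit(X_tr, y_tr)
print("held-out correct classification rate: %.2f" % pca_ann.score(X_te, y_te))
```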

  11. Source apportionment of soil heavy metals using robust absolute principal component scores-robust geographically weighted regression (RAPCS-RGWR) receptor model.

    PubMed

    Qu, Mingkai; Wang, Yan; Huang, Biao; Zhao, Yongcun

    2018-06-01

    Traditional source apportionment models, such as absolute principal component scores-multiple linear regression (APCS-MLR), are usually susceptible to outliers, which may be widely present in regional geochemical datasets. Furthermore, the models are built merely on variable space instead of geographical space and thus cannot effectively capture the local spatial characteristics of each source's contributions. To overcome these limitations, a new receptor model, robust absolute principal component scores-robust geographically weighted regression (RAPCS-RGWR), was proposed based on the traditional APCS-MLR model. The new method was then applied to the source apportionment of soil metal elements in a region of Wuhan City, China, as a case study. Evaluations revealed that: (i) the RAPCS-RGWR model had better performance than the APCS-MLR model in the identification of the major sources of soil metal elements, and (ii) the source contributions estimated by the RAPCS-RGWR model were closer to the true soil metal concentrations than those estimated by the APCS-MLR model. It is shown that the proposed RAPCS-RGWR model is a more effective source apportionment method than APCS-MLR (i.e., a non-robust and global model) in dealing with regional geochemical datasets. Copyright © 2018 Elsevier B.V. All rights reserved.

  12. Scalable Robust Principal Component Analysis Using Grassmann Averages.

    PubMed

    Hauberg, Søren; Feragen, Aasa; Enficiaud, Raffi; Black, Michael J

    2016-11-01

    In large datasets, manual data verification is impossible, and we must expect the number of outliers to increase with data size. While principal component analysis (PCA) can reduce data size, and scalable solutions exist, it is well-known that outliers can arbitrarily corrupt the results. Unfortunately, state-of-the-art approaches for robust PCA are not scalable. We note that in a zero-mean dataset, each observation spans a one-dimensional subspace, giving a point on the Grassmann manifold. We show that the average subspace corresponds to the leading principal component for Gaussian data. We provide a simple algorithm for computing this Grassmann Average (GA), and show that the subspace estimate is less sensitive to outliers than PCA for general distributions. Because averages can be efficiently computed, we immediately gain scalability. We exploit robust averaging to formulate the Robust Grassmann Average (RGA) as a form of robust PCA. The resulting Trimmed Grassmann Average (TGA) is appropriate for computer vision because it is robust to pixel outliers. The algorithm has linear computational complexity and minimal memory requirements. We demonstrate TGA for background modeling, video restoration, and shadow removal. We show scalability by performing robust PCA on the entire Star Wars IV movie, a task beyond any current method. Source code is available online.

  13. Quantifying Parkinson's disease finger-tapping severity by extracting and synthesizing finger motion properties.

    PubMed

    Sano, Yuko; Kandori, Akihiko; Shima, Keisuke; Yamaguchi, Yuki; Tsuji, Toshio; Noda, Masafumi; Higashikawa, Fumiko; Yokoe, Masaru; Sakoda, Saburo

    2016-06-01

    We propose a novel index of Parkinson's disease (PD) finger-tapping severity, called "PDFTsi," for quantifying the severity of symptoms related to the finger tapping of PD patients with high accuracy. To validate the efficacy of PDFTsi, the finger-tapping movements of normal controls and PD patients were measured using magnetic sensors, and 21 characteristics were extracted from the finger-tapping waveforms. To distinguish motor deterioration due to PD from that due to aging, the aging effect on finger tapping was removed from these characteristics. Principal component analysis (PCA) was applied to the age-normalized characteristics, and principal components that represented the motion properties of finger tapping were calculated. Multiple linear regression (MLR) with stepwise variable selection was applied to the principal components, and PDFTsi was calculated. The results indicate that PDFTsi has a high estimation ability, with a mean squared error of 0.45. The estimation ability of PDFTsi is higher than that of the alternative method, MLR with stepwise variable selection without PCA, which has a mean squared error of 1.30. This result suggests that PDFTsi can quantify PD finger-tapping severity accurately. Furthermore, the result of interpreting a model for calculating PDFTsi indicated that motion wideness and rhythm disorder are important for estimating PD finger-tapping severity.

  14. Product competitiveness analysis for e-commerce platform of special agricultural products

    NASA Astrophysics Data System (ADS)

    Wan, Fucheng; Ma, Ning; Yang, Dongwei; Xiong, Zhangyuan

    2017-09-01

    On the basis of analyzing the factors influencing the product competitiveness of the e-commerce platform for special agricultural products and the characteristics of the analytical methods for the competitiveness of special agricultural products, the price, sales volume, postage-included service, store reputation, popularity, etc. were selected in this paper as the dimensions for analyzing the competitiveness of the agricultural products, and principal component factor analysis was taken as the competitiveness analysis method. Specifically, a web crawler was adopted to capture the information of various special agricultural products on the e-commerce platform chi.taobao.com. The original data captured thereby were preprocessed, and a MySQL database was used to establish an information library for the special agricultural products. Principal component factor analysis was then adopted to establish the analysis model for the competitiveness of the special agricultural products, and SPSS was used in the principal component factor analysis process to obtain the competitiveness evaluation factor system (support degree factor, price factor, service factor and evaluation factor) of the special agricultural products. Finally, the linear regression method was adopted to establish the competitiveness index equation of the special agricultural products for estimating their competitiveness.

  15. Polarization Ratio Determination with Two Identical Linearly Polarized Antennas

    DTIC Science & Technology

    2017-01-17

    Fourier transform analysis of 21 measurements with one of the antennas rotating about its axis a circular polarization ratio is derived which can be... determined directly from a discrete Fourier transform (DFT) of (5). However, leakage between closely spaced DFT bins requires improving the... Fourier transform and a mechanical antenna rotation to separate the principal and opposite circular polarization components followed by a basis

  16. Using Structural Equation Modeling To Fit Models Incorporating Principal Components.

    ERIC Educational Resources Information Center

    Dolan, Conor; Bechger, Timo; Molenaar, Peter

    1999-01-01

    Considers models incorporating principal components from the perspectives of structural-equation modeling. These models include the following: (1) the principal-component analysis of patterned matrices; (2) multiple analysis of variance based on principal components; and (3) multigroup principal-components analysis. Discusses fitting these models…

  17. Principal components colour display of ERTS imagery

    NASA Technical Reports Server (NTRS)

    Taylor, M. M.

    1974-01-01

    In the technique presented, colours are not derived from single bands, but rather from independent linear combinations of the bands. Using a simple model of the processing done by the visual system, three informationally independent linear combinations of the four ERTS bands are mapped onto the three visual colour dimensions of brightness, redness-greenness and blueness-yellowness. The technique permits user-specific transformations which enhance particular features, but this is not usually needed, since a single transformation provides a picture which conveys much of the information implicit in the ERTS data. Examples of experimental vector images with matched individual band images are shown.

  18. Typification of cider brandy on the basis of cider used in its manufacture.

    PubMed

    Rodríguez Madrera, Roberto; Mangas Alonso, Juan J

    2005-04-20

    A study of the typification of cider brandies on the basis of the origin of the raw material used in their manufacture was conducted using chemometric techniques (principal component analysis, linear discriminant analysis, and Bayesian analysis) together with their composition in volatile compounds, analyzed by gas chromatography with flame ionization detection for the major volatiles and with mass spectrometric detection for the minor ones. Significant principal components computed by a double cross-validation procedure allowed the structure of the database to be visualized as a function of the raw material, that is, cider made from fresh apple juice versus cider made from apple juice concentrate. Feasible and robust discriminant rules were computed and validated by a cross-validation procedure that allowed the authors to classify fresh and concentrate cider brandies, obtaining classification hits of >92%. The most discriminating variables for typifying cider brandies according to their raw material were 1-butanol and ethyl hexanoate.

  19. Classification of adulterated honeys by multivariate analysis.

    PubMed

    Amiry, Saber; Esmaiili, Mohsen; Alizadeh, Mohammad

    2017-06-01

    In this research, honey samples were adulterated with date syrup (DS) and invert sugar syrup (IS) at three concentrations (7%, 15% and 30%). 102 adulterated samples were prepared in six batches with 17 replications for each batch. For each sample, 32 parameters, including color indices and rheological, physical, and chemical parameters, were determined. To classify the samples based on the type and concentration of adulterant, a multivariate analysis was applied using principal component analysis (PCA) followed by a linear discriminant analysis (LDA). Then, 21 principal components (PCs) were selected in five sets. Approximately two-thirds of the samples were identified correctly using color indices (62.75%) or rheological properties (67.65%). Stronger discrimination was obtained using physical properties (97.06%), and the best separations were achieved using two sets of chemical properties (set 1: lactone, diastase activity, sucrose - 100%; set 2: free acidity, HMF, ash - 95%). Copyright © 2016 Elsevier Ltd. All rights reserved.

  20. Application of principal component analysis to distinguish patients with schizophrenia from healthy controls based on fractional anisotropy measurements.

    PubMed

    Caprihan, A; Pearlson, G D; Calhoun, V D

    2008-08-15

    Principal component analysis (PCA) is often used to reduce the dimension of data before applying more sophisticated data analysis methods such as non-linear classification algorithms or independent component analysis. This practice is based on selecting components corresponding to the largest eigenvalues. If the ultimate goal is separation of the data into two groups, then this set of components need not have the most discriminatory power. We measured the distance between two such populations using the Mahalanobis distance and chose the eigenvectors to maximize it, a modified PCA method which we call discriminant PCA (DPCA). DPCA was applied to diffusion tensor-based fractional anisotropy images to distinguish age-matched schizophrenia subjects from healthy controls. The performance of the proposed method was evaluated by the leave-one-out method. We show that for this fractional anisotropy data set, the classification error with 60 components was close to the minimum error, and that the Mahalanobis distance was twice as large with DPCA as with PCA. Finally, by masking the discriminant function with the white matter tracts of the Johns Hopkins University atlas, we identified the left superior longitudinal fasciculus as the tract which gave the least classification error. In addition, with six optimally chosen tracts the classification error was zero.
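    The sketch below conveys the DPCA idea under simplifying assumptions: components are scored by a per-component Mahalanobis-style separation (squared between-group mean difference over pooled within-group variance) and the top-scoring components are kept instead of the largest-eigenvalue ones. The synthetic data, the number of retained components, and the use of a plain LDA for the final comparison are all illustrative choices, not the authors' exact procedure.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for two groups of fractional-anisotropy feature vectors:
# strong nuisance variance unrelated to the groups, plus a modest group difference.
rng = np.random.default_rng(0)
n, p = 200, 100
y = np.repeat([0, 1], n // 2)
nuisance = rng.standard_normal((n, 5)) @ (3.0 * rng.standard_normal((5, p)))
X = rng.standard_normal((n, p)) + nuisance
X[y == 1, :10] += 0.8                       # group signal in low-variance directions

pca = PCA(n_components=40).fit(X)
Z = pca.transform(X)

# Score each principal component by between-group separation scaled by pooled
# within-group variance (a per-component Mahalanobis-style criterion).
m0, m1 = Z[y == 0].mean(axis=0), Z[y == 1].mean(axis=0)
pooled_var = 0.5 * (Z[y == 0].var(axis=0, ddof=1) + Z[y == 1].var(axis=0, ddof=1))
separation = (m1 - m0) ** 2 / pooled_var

k = 10
largest_variance = np.arange(k)                          # classical PCA choice
most_discriminative = np.argsort(separation)[::-1][:k]   # DPCA-style choice

for name, idx in [("largest-variance PCs", largest_variance),
                  ("most-discriminative PCs", most_discriminative)]:
    acc = cross_val_score(LinearDiscriminantAnalysis(), Z[:, idx], y, cv=5).mean()
    print("%s: cross-validated accuracy %.2f" % (name, acc))
```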

  1. Independent component analysis for automatic note extraction from musical trills

    NASA Astrophysics Data System (ADS)

    Brown, Judith C.; Smaragdis, Paris

    2004-05-01

    The method of principal component analysis, which is based on second-order statistics (or linear independence), has long been used for redundancy reduction of audio data. The more recent technique of independent component analysis, enforcing much stricter statistical criteria based on higher-order statistical independence, is introduced and shown to be far superior in separating independent musical sources. This theory has been applied to piano trills and a database of trill rates was assembled from experiments with a computer-driven piano, recordings of a professional pianist, and commercially available compact disks. The method of independent component analysis has thus been shown to be an outstanding, effective means of automatically extracting interesting musical information from a sea of redundant data.

  2. Locally linear embedding: dimension reduction of massive protostellar spectra

    NASA Astrophysics Data System (ADS)

    Ward, J. L.; Lumsden, S. L.

    2016-09-01

    We present the results of the application of locally linear embedding (LLE) to reduce the dimensionality of dereddened and continuum-subtracted near-infrared spectra, using a combination of models and real spectra of massive protostars selected from the Red MSX Source survey database. A brief comparison is also made with two other dimension reduction techniques, principal component analysis (PCA) and Isomap, using the same set of spectra, as well as with a more advanced form of LLE, Hessian locally linear embedding. We find that whilst LLE certainly has its limitations, it significantly outperforms both PCA and Isomap in the classification of spectra based on the presence/absence of emission lines and provides a valuable tool for the classification and analysis of large spectral data sets.
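    For readers who want to reproduce the style of comparison, the sketch below runs PCA, LLE, Hessian LLE, and Isomap side by side using scikit-learn; a synthetic S-curve manifold stands in for the dereddened, continuum-subtracted spectra, and the neighbourhood size and target dimensionality are arbitrary.

```python
import numpy as np
from sklearn.datasets import make_s_curve
from sklearn.decomposition import PCA
from sklearn.manifold import LocallyLinearEmbedding, Isomap

# A synthetic nonlinear manifold stands in for the spectra; each method maps
# the data down to two dimensions for comparison.
X, _ = make_s_curve(n_samples=1500, random_state=0)

embeddings = {
    "PCA": PCA(n_components=2).fit_transform(X),
    "LLE": LocallyLinearEmbedding(n_neighbors=12, n_components=2,
                                  random_state=0).fit_transform(X),
    "Hessian LLE": LocallyLinearEmbedding(n_neighbors=12, n_components=2,
                                          method="hessian",
                                          random_state=0).fit_transform(X),
    "Isomap": Isomap(n_neighbors=12, n_components=2).fit_transform(X),
}
for name, Y in embeddings.items():
    print(name, "embedding shape:", Y.shape)
```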

  3. Chemometric investigation of light-shade effects on essential oil yield and morphology of Moroccan Myrtus communis L.

    PubMed

    Fadil, Mouhcine; Farah, Abdellah; Ihssane, Bouchaib; Haloui, Taoufik; Lebrazi, Sara; Zghari, Badreddine; Rachiq, Saâd

    2016-01-01

    To investigate the effect of environmental factors such as light and shade on the essential oil yield and morphological traits of Moroccan Myrtus communis, a chemometric study was conducted on 20 individuals growing under two contrasting light environments. The study of the individuals' parameters by principal component analysis showed that essential oil yield, altitude, and leaf thickness were positively correlated with each other and negatively correlated with plant height, leaf length and leaf width. Principal component analysis and hierarchical cluster analysis also showed that the individuals of each sampling site were grouped separately. A one-way ANOVA test confirmed the effect of light and shade on the essential oil yield and morphological parameters by showing a statistically significant difference between the shaded side and the sunny one. Finally, a multiple linear model containing main, interaction and quadratic terms was chosen for the modeling of essential oil yield in terms of the morphological parameters. Sun plants have a small height and small leaf length and width, but they are thicker and richer in essential oil than shade plants, which showed almost the opposite. The highlighted multiple linear model can be used to predict essential oil yield in the studied area.

  4. Acoustic-articulatory mapping in vowels by locally weighted regression

    PubMed Central

    McGowan, Richard S.; Berger, Michael A.

    2009-01-01

    A method for mapping between simultaneously measured articulatory and acoustic data is proposed. The method uses principal components analysis on the articulatory and acoustic variables, and mapping between the domains by locally weighted linear regression, or loess [Cleveland, W. S. (1979). J. Am. Stat. Assoc. 74, 829–836]. The latter method permits local variation in the slopes of the linear regression, assuming that the function being approximated is smooth. The methodology is applied to vowels of four speakers in the Wisconsin X-ray Microbeam Speech Production Database, with formant analysis. Results are examined in terms of (1) examples of forward (articulation-to-acoustics) mappings and inverse mappings, (2) distributions of local slopes and constants, (3) examples of correlations among slopes and constants, (4) root-mean-square error, and (5) sensitivity of formant frequencies to articulatory change. It is shown that the results are qualitatively correct and that loess performs better than global regression. The forward mappings show different root-mean-square error properties than the inverse mappings indicating that this method is better suited for the forward mappings than the inverse mappings, at least for the data chosen for the current study. Some preliminary results on sensitivity of the first two formant frequencies to the two most important articulatory principal components are presented. PMID:19813812

  5. Application of kernel principal component analysis and computational machine learning to exploration of metabolites strongly associated with diet.

    PubMed

    Shiokawa, Yuka; Date, Yasuhiro; Kikuchi, Jun

    2018-02-21

    Computer-based technological innovation provides advancements in sophisticated and diverse analytical instruments, enabling massive amounts of data collection with relative ease. This is accompanied by a fast-growing demand for technological progress in data mining methods for analysis of big data derived from chemical and biological systems. From this perspective, use of a general "linear" multivariate analysis alone limits interpretations due to "non-linear" variations in metabolic data from living organisms. Here we describe a kernel principal component analysis (KPCA)-incorporated analytical approach for extracting useful information from metabolic profiling data. To overcome the limitation of important variable (metabolite) determinations, we incorporated a random forest conditional variable importance measure into our KPCA-based analytical approach to demonstrate the relative importance of metabolites. Using a market basket analysis, hippurate, the most important variable detected in the importance measure, was associated with high levels of some vitamins and minerals present in foods eaten the previous day, suggesting a relationship between increased hippurate and intake of a wide variety of vegetables and fruits. Therefore, the KPCA-incorporated analytical approach described herein enabled us to capture input-output responses, and should be useful not only for metabolic profiling but also for profiling in other areas of biological and environmental systems.
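    A much-simplified sketch of the KPCA-plus-importance idea is given below: kernel PC scores are computed with scikit-learn, and permutation importance of a random forest predicting the first kernel PC is used to rank the original variables. Plain permutation importance stands in for the conditional variable importance measure used in the paper, and the metabolite table is synthetic.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import KernelPCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

# Hypothetical metabolite table: 150 samples x 20 metabolites, with two variables
# driving a nonlinear pattern (all names and values are illustrative).
rng = np.random.default_rng(0)
X = rng.standard_normal((150, 20))
X[:, 3] = np.sin(2 * X[:, 0]) + 0.1 * rng.standard_normal(150)   # nonlinear dependence

Z = StandardScaler().fit_transform(X)
scores = KernelPCA(n_components=2, kernel="rbf", gamma=0.1).fit_transform(Z)

# Rank metabolites by how much they matter for the first kernel PC score,
# using permutation importance of a random forest as a simple stand-in for the
# conditional variable importance measure described in the paper.
rf = RandomForestRegressor(n_estimators=300, random_state=0).fit(X, scores[:, 0])
imp = permutation_importance(rf, X, scores[:, 0], n_repeats=10, random_state=0)
top = np.argsort(imp.importances_mean)[::-1][:5]
print("most important metabolite columns for kernel PC1:", top.tolist())
```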

  6. Temporal trend and climate factors of hemorrhagic fever with renal syndrome epidemic in Shenyang City, China

    PubMed Central

    2011-01-01

    Background Hemorrhagic fever with renal syndrome (HFRS) is an important infectious disease caused by different species of hantaviruses. Because it is a rodent-borne disease with a seasonal distribution, external environmental factors, including climate, may play a significant role in its transmission. The city of Shenyang is one of the most seriously endemic areas for HFRS. Here, we characterized the dynamic temporal trend of HFRS, and identified climate-related risk factors and their roles in HFRS transmission in Shenyang, China. Methods The annual and monthly cumulative numbers of HFRS cases from 2004 to 2009 were calculated and plotted to show the annual and seasonal fluctuation in Shenyang. Cross-correlation and autocorrelation analyses were performed to detect the lagged effect of climate factors on HFRS transmission and the autocorrelation of monthly HFRS cases. Principal component analysis was performed using climate data from 2004 to 2009 to extract principal components of the climate factors and reduce collinearity. The extracted principal components and the autocorrelation terms of monthly HFRS cases were added into a multiple regression model, called the principal components regression (PCR) model, to quantify the relationship between climate factors, autocorrelation terms and transmission of HFRS. The PCR model was compared to a general multiple regression model conducted only with climate factors as independent variables. Results A distinctly declining temporal trend of annual HFRS incidence was identified. HFRS cases were reported every month, and the two peak periods occurred in spring (March to May) and winter (November to January), during which nearly 75% of the HFRS cases were reported. Three principal components were extracted with a cumulative contribution rate of 86.06%. Component 1 represented MinRH0, MT1, RH1, and MWV1; component 2 represented RH2, MaxT3, and MAP3; and component 3 represented MaxT2, MAP2, and MWV2. The PCR model was composed of the three principal components and two autocorrelation terms. The association between HFRS epidemics and climate factors was better explained by the PCR model (F = 446.452, P < 0.001, adjusted R2 = 0.75) than by the general multiple regression model (F = 223.670, P < 0.000, adjusted R2 = 0.51). Conclusion The temporal distribution of HFRS in Shenyang varied in different years with a distinctly declining trend. The monthly trends of HFRS were significantly associated with the local temperature, relative humidity, precipitation, air pressure, and wind velocity of different previous months. The model constructed in this study will make HFRS surveillance simpler and the control of HFRS more targeted in Shenyang. PMID:22133347
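    A minimal sketch of the principal components regression (PCR) construction described in the Methods is shown below, with synthetic monthly series in place of the Shenyang surveillance and climate data; the number of retained components and the two autoregressive lag terms are illustrative.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

# Synthetic monthly series standing in for climate variables and HFRS case counts.
rng = np.random.default_rng(0)
n_months, n_climate = 72, 10
climate = rng.standard_normal((n_months, n_climate))
cases = 50 + 5 * climate[:, 0] - 3 * climate[:, 4] + rng.normal(0, 2, n_months)
cases[1:] += 0.5 * cases[:-1]                    # induce autocorrelation in the case series

# Step 1: PCA on standardized climate variables to remove collinearity.
pcs = PCA(n_components=3).fit_transform(StandardScaler().fit_transform(climate))

# Step 2: regression on the PC scores plus lag-1 and lag-2 autocorrelation terms.
lag1, lag2 = cases[1:-1], cases[:-2]
design = np.column_stack([pcs[2:], lag1, lag2])
response = cases[2:]

pcr = LinearRegression().fit(design, response)
print("in-sample R^2 of the PCR model: %.2f" % pcr.score(design, response))
```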

  7. Preliminary Results Of PCA On MRO CRISM Multispectral Images

    NASA Astrophysics Data System (ADS)

    Klassen, David R.; Smith, M. D.

    2008-09-01

    Mars Reconnaissance Orbiter arrived at Mars in March 2006 and by September had achieved its science-phase orbit, with the Compact Reconnaissance Imaging Spectrometer for Mars (CRISM) beginning its visible to near-infrared (VIS/NIR) spectral imaging shortly thereafter. One of the goals of CRISM is to fill in the spatial gaps between the various targeted observations, eventually mapping the entire surface. Due to the large volume of data this would create, the instrument works in a reduced spectral sampling mode, creating "multispectral" images. From these data we can create image cubes using 70 wavelengths from 0.410 to 3.504 µm. We present here a preliminary analysis of these multispectral mode data products using the technique of Principal Components Analysis. Previous work with ground-based images has shown that over an entire visible hemisphere, there are only three to four meaningful components out of 32-105 wavelengths over 1.5-4.1 µm. The first two of these components are fairly consistent over all time intervals, from day to day and season to season [1-4]. The preliminary work on the CRISM image cubes implies similar results: three to four significant principal components that are fairly consistent over time. We will show these components and a rough linear mixture modeling based on in-data spectral endmembers derived from the extrema of the principal components [5]. References: [1] Klassen, D. R. and Bell III, J. F. (2001) BAAS, 33, 1069. [2] Klassen, D. R. and Bell III, J. F. (2003) BAAS, 35, 936. [3] Klassen, D. R., Wark, T. J., Cugliotta, C. G. (2005) BAAS, 37, 693. [4] Klassen, D. R. and Bell III, J. F. (2007) in preparation. [5] Klassen, D. R. and Bell III, J. F. (2000) BAAS, 32, 1105.

  8. Classification of breast tissue in mammograms using efficient coding.

    PubMed

    Costa, Daniel D; Campos, Lúcio F; Barros, Allan K

    2011-06-24

    Female breast cancer is the major cause of death by cancer in western countries. Efforts in computer vision have been made in order to improve the diagnostic accuracy of radiologists. Some methods for lesion diagnosis in mammogram images have been developed based on principal component analysis, which has been used for efficient coding of signals, and on 2D Gabor wavelets, which are used for computer vision applications and for modeling biological vision. In this work, we present a methodology that uses efficient coding along with linear discriminant analysis to distinguish between mass and non-mass in 5090 regions of interest from mammograms. The results show that the best success rates reached with Gabor wavelets and principal component analysis were 85.28% and 87.28%, respectively. In comparison, the model of efficient coding presented here reached up to 90.07%. Altogether, the results demonstrate that independent component analysis performed the efficient coding successfully in order to discriminate mass from non-mass tissues. In addition, we observed that LDA with ICA bases showed high predictive performance for some datasets and thus provides significant support for a more detailed clinical investigation.

  9. Discrimination of a chestnut-oak forest unit for geologic mapping by means of a principal component enhancement of Landsat multispectral scanner data.

    USGS Publications Warehouse

    Krohn, M.D.; Milton, N.M.; Segal, D.; Enland, A.

    1981-01-01

    A principal component image enhancement has been effective in applying Landsat data to geologic mapping in a heavily forested area of eastern Virginia. The image enhancement procedure consists of a principal component transformation, a histogram normalization, and the inverse principal component transformation. The enhancement preserves the independence of the principal components, yet produces a more readily interpretable image than does a single principal component transformation. -from Authors
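    The enhancement chain described above can be sketched as follows: a forward principal component transformation, a normalization of each component (here a simple equal-variance stretch stands in for the histogram normalization), and the inverse transformation back to band space. The four-band image is synthetic and the scaling choice is arbitrary.

```python
import numpy as np

# Hypothetical 4-band image flattened to (n_pixels, 4); bands are strongly correlated.
rng = np.random.default_rng(0)
base = rng.normal(100, 20, size=(256 * 256, 1))
bands = base + rng.normal(0, 4, size=(256 * 256, 4))

# Forward principal component transformation.
mean = bands.mean(axis=0)
cov = np.cov(bands - mean, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)           # eigenvalues in ascending order
pcs = (bands - mean) @ eigvecs

# "Histogram normalization" of each component: here a simple stretch of every
# component to zero mean and equal variance (a stand-in for the paper's step).
pcs_norm = (pcs - pcs.mean(axis=0)) / pcs.std(axis=0)

# Inverse principal component transformation back to band space.
enhanced = pcs_norm @ eigvecs.T * bands.std(axis=0).mean() + mean

print("correlation of first two original bands: %.2f"
      % np.corrcoef(bands[:, 0], bands[:, 1])[0, 1])
print("correlation of first two enhanced bands: %.2f"
      % np.corrcoef(enhanced[:, 0], enhanced[:, 1])[0, 1])
```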

  10. Comparison of three-dimensional fluorescence analysis methods for predicting formation of trihalomethanes and haloacetic acids.

    PubMed

    Peleato, Nicolás M; Andrews, Robert C

    2015-01-01

    This work investigated the application of several fluorescence excitation-emission matrix analysis methods as natural organic matter (NOM) indicators for use in predicting the formation of trihalomethanes (THMs) and haloacetic acids (HAAs). Waters from four different sources (two rivers and two lakes) were subjected to jar testing followed by 24 hr disinfection by-product formation tests using chlorine. NOM was quantified using three common measures: dissolved organic carbon, ultraviolet absorbance at 254 nm, and specific ultraviolet absorbance, as well as by principal component analysis, peak picking, and parallel factor analysis of fluorescence spectra. Based on multi-linear modeling of THMs and HAAs, principal component (PC) scores resulted in the lowest mean squared prediction error on cross-folded test sets (THMs: 43.7 (μg/L)², HAAs: 233.3 (μg/L)²). Inclusion of principal components representative of protein-like material significantly decreased prediction error for both THMs and HAAs. Parallel factor analysis did not identify a protein-like component and resulted in prediction errors similar to those of traditional NOM surrogates as well as fluorescence peak picking. These results support the value of fluorescence excitation-emission matrix-principal component analysis as a suitable NOM indicator in predicting the formation of THMs and HAAs for the water sources studied. Copyright © 2014. Published by Elsevier B.V.

  11. [HPLC fingerprint of flavonoids in Sophora flavescens and determination of five components].

    PubMed

    Ma, Hong-Yan; Zhou, Wan-Shan; Chu, Fu-Jiang; Wang, Dong; Liang, Sheng-Wang; Li, Shao

    2013-08-01

    A simple and reliable method of high-performance liquid chromatography with photodiode array detection (HPLC-DAD) was developed to evaluate the quality of the traditional Chinese medicine Sophora flavescens through establishing a chromatographic fingerprint and simultaneous determination of five flavonoids, including trifolirhizin, maackiain, kushenol I, kurarinone and sophoraflavanone G. The optimal conditions of separation and detection were achieved on an ULTIMATE XB-C18 column (4.6 mm x 250 mm, 5 microm) with a gradient of acetonitrile and water, detected at 295 nm. In the chromatographic fingerprint, 13 peaks were selected as the characteristic peaks to assess the similarities of samples collected from different origins in China according to the similarity evaluation for chromatographic fingerprints of traditional Chinese medicine (2004AB), and principal component analysis (PCA) was used in the data analysis. There were significant differences in the fingerprint chromatograms between S. flavescens and S. tonkinensis. Principal component analysis showed that kurarinone and sophoraflavanone G were the most important components. In the quantitative analysis, the five components showed good regression (R > 0.999) within their linear ranges, and their recoveries were in the range of 96.3% - 102.3%. This study indicated that the combination of quantitative and chromatographic fingerprint analysis can be readily utilized as a quality control method for S. flavescens and its related traditional Chinese medicinal preparations.

  12. Face Hallucination with Linear Regression Model in Semi-Orthogonal Multilinear PCA Method

    NASA Astrophysics Data System (ADS)

    Asavaskulkiet, Krissada

    2018-04-01

    In this paper, we propose a new face hallucination technique: face image reconstruction in HSV color space with a semi-orthogonal multilinear principal component analysis (SO-MPCA) method. This novel hallucination technique can operate directly on tensors via tensor-to-vector projection by imposing the orthogonality constraint in only one mode. In our experiments, we use facial images from the FERET database to test our hallucination approach, which is demonstrated by extensive experiments with high-quality hallucinated color faces. The experimental results clearly demonstrate that we can generate photorealistic color face images by using the SO-MPCA subspace with a linear regression model.

  13. An empirical comparative study on biological age estimation algorithms with an application of Work Ability Index (WAI).

    PubMed

    Cho, Il Haeng; Park, Kyung S; Lim, Chang Joo

    2010-02-01

    In this study, we described the characteristics of five different biological age (BA) estimation algorithms: (i) multiple linear regression, (ii) principal component analysis, and the somewhat unique methods developed by (iii) Hochschild, (iv) Klemera and Doubal, and (v) a variant of Klemera and Doubal's method. The objective of this study is to find the most appropriate method of BA estimation by examining the association between the Work Ability Index (WAI) and the difference of each algorithm's estimate from chronological age (CA). The WAI was found to be a measure that reflects an individual's current health status rather than deterioration that depends strongly on age. Experiments were conducted on 200 Korean male participants using a BA estimation system developed principally under the concept of being non-invasive, simple to operate and based on human function. Using the empirical data, BA estimation as well as various analyses, including correlation analysis and discriminant function analysis, was performed. As a result, it was confirmed by the empirical data that Klemera and Doubal's method with uncorrelated variables from principal component analysis produces relatively reliable and acceptable BA estimates. 2009 Elsevier Ireland Ltd. All rights reserved.

  14. Principal component of explained variance: An efficient and optimal data dimension reduction framework for association studies.

    PubMed

    Turgeon, Maxime; Oualkacha, Karim; Ciampi, Antonio; Miftah, Hanane; Dehghan, Golsa; Zanke, Brent W; Benedet, Andréa L; Rosa-Neto, Pedro; Greenwood, Celia Mt; Labbe, Aurélie

    2018-05-01

    The genomics era has led to an increase in the dimensionality of data collected in the investigation of biological questions. In this context, dimension-reduction techniques can be used to summarise high-dimensional signals into low-dimensional ones, to further test for association with one or more covariates of interest. This paper revisits one such approach, previously known as principal component of heritability and renamed here as principal component of explained variance (PCEV). As its name suggests, the PCEV seeks a linear combination of outcomes in an optimal manner, by maximising the proportion of variance explained by one or several covariates of interest. By construction, this method optimises power; however, due to its computational complexity, it has unfortunately received little attention in the past. Here, we propose a general analytical PCEV framework that builds on the assets of the original method, i.e. conceptually simple and free of tuning parameters. Moreover, our framework extends the range of applications of the original procedure by providing a computationally simple strategy for high-dimensional outcomes, along with exact and asymptotic testing procedures that drastically reduce its computational cost. We investigate the merits of the PCEV using an extensive set of simulations. Furthermore, the use of the PCEV approach is illustrated using three examples taken from the fields of epigenetics and brain imaging.
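    The core PCEV computation can be sketched as a generalized eigenvalue problem between the model and residual covariance matrices of the outcomes, as below; the outcome dimension, covariate, and effect pattern are invented for illustration, and the exact and asymptotic tests described in the paper are not reproduced.

```python
import numpy as np
from scipy.linalg import eigh

# Synthetic example: p correlated outcomes, one covariate of interest affecting
# a specific linear combination of them (all dimensions/values illustrative).
rng = np.random.default_rng(0)
n, p = 300, 8
x = rng.standard_normal(n)
effect = np.array([1.0, -1.0, 0.5, 0, 0, 0, 0, 0])
Y = np.outer(x, effect) + rng.standard_normal((n, p))

# Decompose the outcome variance into a model part (explained by x) and a
# residual part, using ordinary least squares fits of each outcome on x.
X_design = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X_design, Y, rcond=None)
fitted = X_design @ beta
resid = Y - fitted
V_model = np.cov(fitted, rowvar=False)
V_resid = np.cov(resid, rowvar=False)

# PCEV: the linear combination w maximizing w'V_model w / w'V_resid w is the top
# eigenvector of the generalized eigenvalue problem V_model w = lambda V_resid w.
eigvals, eigvecs = eigh(V_model, V_resid)
w = eigvecs[:, -1]                                   # largest eigenvalue last
pcev_score = Y @ w

print("proportion of PCEV variance explained by x: %.2f"
      % (eigvals[-1] / (1 + eigvals[-1])))
print("correlation of PCEV component with x: %.2f"
      % abs(np.corrcoef(pcev_score, x)[0, 1]))
```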

  15. Optical system for tablet variety discrimination using visible/near-infrared spectroscopy

    NASA Astrophysics Data System (ADS)

    Shao, Yongni; He, Yong; Hu, Xingyue

    2007-12-01

    An optical system based on visible/near-infrared spectroscopy (Vis/NIRS) for variety discrimination of ginkgo (Ginkgo biloba L.) tablets was developed. This system consisted of a light source, beam splitter system, sample chamber, optical detector (diffuse reflection detector), and data collection. The tablet varieties used in the research include Da na kang, Xin bang, Tian bao ning, Yi kang, Hua na xing, Dou le, Lv yuan, Hai wang, and Ji yao. All samples (n=270) were scanned in the Vis/NIR region between 325 and 1075 nm using a spectrograph. The chemometrics method of principal component artificial neural network (PC-ANN) was used to establish discrimination models. In the PC-ANN models, the scores of the principal components were chosen as the input nodes for the input layer of the ANN, and the best discrimination rate of 91.1% was reached. Principal component analysis was also executed to select several optimal wavelengths based on loading values. Wavelengths at 481, 458, 466, 570, 1000, 662, and 400 nm were then used as the input data for stepwise multiple linear regression; the regression equation for the ginkgo tablets was obtained, and a discrimination rate of 84.4% was reached. The results indicated that this optical system could be applied to discriminating ginkgo (Ginkgo biloba L.) tablets, and it supplied a new method for fast ginkgo tablet variety discrimination.

  16. Modeling and Prediction of Monthly Total Ozone Concentrations by Use of an Artificial Neural Network Based on Principal Component Analysis

    NASA Astrophysics Data System (ADS)

    Chattopadhyay, Surajit; Chattopadhyay, Goutami

    2012-10-01

    In the work discussed in this paper we considered total ozone time series over Kolkata (22°34'10.92″N, 88°22'10.92″E), an urban area in eastern India. Using cloud cover, average temperature, and rainfall as the predictors, we developed an artificial neural network, in the form of a multilayer perceptron with sigmoid non-linearity, for prediction of monthly total ozone concentrations from values of the predictors in previous months. We also estimated total ozone from values of the predictors in the same month. Before development of the neural network model we removed multicollinearity by means of principal component analysis. On the basis of the variables extracted by principal component analysis, we developed three artificial neural network models. By rigorous statistical assessment it was found that cloud cover and rainfall can act as good predictors for monthly total ozone when they are considered as the set of input variables for the neural network model constructed in the form of a multilayer perceptron. In general, the artificial neural network has good potential for predicting and estimating monthly total ozone on the basis of the meteorological predictors. It was further observed that during pre-monsoon and winter seasons, the proposed models perform better than during and after the monsoon.

  17. Principal Components Analysis Studies of Martian Clouds

    NASA Astrophysics Data System (ADS)

    Klassen, D. R.; Bell, J. F., III

    2001-11-01

    We present the principal components analysis (PCA) of absolutely calibrated multi-spectral images of Mars as a function of Martian season. The PCA technique is a mathematical rotation and translation of the data from a brightness/wavelength space to a vector space of principal 'traits' that lie along the directions of maximal variance. The first of these traits, accounting for over 90% of the data variance, is overall brightness and represented by an average Mars spectrum. Interpretation of the remaining traits, which account for the remaining ~10% of the variance, is not always the same and depends upon what other components are in the scene, and thus varies with Martian season. For example, during seasons with large amounts of water ice in the scene, the second trait correlates with the ice and anti-correlates with temperature. We will investigate the interpretation of the second and successive important PCA traits. Although these PCA traits are orthogonal in their own vector space, it is unlikely that any one trait represents a singular, mineralogic, spectral end-member. It is more likely that there are many spectral endmembers that vary identically to within the noise level, such that the PCA technique will not be able to distinguish them. Another possibility is that similar absorption features among spectral endmembers may be tied to one PCA trait, for example 'amount of 2 μm absorption'. We thus attempt to extract spectral endmembers by matching linear combinations of the PCA traits to USGS, JHU, and JPL spectral libraries as acquired through the JPL Aster project. The recovered spectral endmembers are then linearly combined to model the multi-spectral image set. We present here the spectral abundance maps of the water ice/frost endmember which allow us to track Martian clouds and ground frosts. This work supported in part through NASA Planetary Astronomy Grant NAG5-6776. All data gathered at the NASA Infrared Telescope Facility in collaboration with the telescope operators and with thanks to the support staff and day crew.

  18. Q-mode versus R-mode principal component analysis for linear discriminant analysis (LDA)

    NASA Astrophysics Data System (ADS)

    Lee, Loong Chuen; Liong, Choong-Yeun; Jemain, Abdul Aziz

    2017-05-01

    Much of the literature applies Principal Component Analysis (PCA) as a preliminary visualization method, a variable construction method, or both. The focus of PCA can be on the samples (R-mode PCA) or the variables (Q-mode PCA). Traditionally, R-mode PCA has been the usual approach to reduce high-dimensional data before the application of Linear Discriminant Analysis (LDA) to solve classification problems. The output from PCA is composed of two new matrices, known as the loadings and scores matrices. Each matrix can then be used to produce a plot: the loadings plot aids identification of important variables, whereas the scores plot presents the spatial distribution of samples on new axes, also known as Principal Components (PCs). Fundamentally, the scores matrix is always the input for building the classification model. A recent paper uses Q-mode PCA, but the focus of the analysis was not on the variables but instead on the samples. As a result, the authors exchanged the use of the loadings and scores plots: clustering of samples was studied using the loadings plot, whereas the scores plot was used to identify important manifest variables. Therefore, the aim of this study is to statistically validate the proposed practice. Evaluation is based on the external error of LDA models as a function of the number of PCs. On top of that, bootstrapping was also conducted to evaluate the external error of each of the LDA models. Results show that LDA models built on PCs from R-mode PCA give logical performance and unbiased external errors, whereas the ones produced with Q-mode PCA show the opposite. We therefore conclude that PCs produced from Q-mode PCA are not statistically stable and thus should not be applied to problems of classifying samples, but only variables. We hope this paper provides some insight into these disputed issues.
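
    A minimal sketch of the conventional R-mode route that the paper validates, on simulated data (the sample/variable sizes, the number of PCs, and the train/test split are illustrative assumptions): PCA scores computed on the training samples are the inputs to LDA, and the external error is estimated on held-out samples.

    ```python
    # R-mode PCA followed by LDA, with an external (held-out) error estimate.
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(2)
    X = rng.normal(size=(120, 50))                 # 120 samples x 50 variables (simulated)
    y = rng.integers(0, 3, size=120)               # 3 hypothetical classes

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

    pca = PCA(n_components=5).fit(X_tr)            # decompose the variables on training data
    lda = LinearDiscriminantAnalysis().fit(pca.transform(X_tr), y_tr)
    print("external error:", 1 - lda.score(pca.transform(X_te), y_te))
    ```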

  19. The influence of acceleration loading curve characteristics on traumatic brain injury.

    PubMed

    Post, Andrew; Blaine Hoshizaki, T; Gilchrist, Michael D; Brien, Susan; Cusimano, Michael D; Marshall, Shawn

    2014-03-21

    To prevent brain trauma, understanding the mechanism of injury is essential. Once the mechanism of brain injury has been identified, prevention technologies can then be developed. The incidence of brain injury is linked to how the kinematics of a brain injury event affects the internal structures of the brain. It is therefore essential to describe how the characteristics of the linear and rotational acceleration influence specific traumatic brain injury lesions. Accordingly, the purpose of this study was to examine, using a principal components analysis (PCA), how the characteristics of the linear and rotational acceleration pulses account for the variance in predicting the outcome of TBI lesions, namely contusion, subdural hematoma (SDH), subarachnoid hemorrhage (SAH), and epidural hematoma (EDH). Monorail impacts were conducted to simulate the falls that caused the TBI lesions. From these reconstructions, the characteristics of the linear and rotational acceleration were determined and used for the PCA. The results indicated that peak resultant acceleration variables did not account for any of the variance in predicting TBI lesions. The majority of the variance was accounted for by the duration of the resultant and component linear and rotational acceleration. In addition, the components of the linear and rotational acceleration characteristics on the x, y, and z axes accounted for the majority of the remaining variance after duration. Copyright © 2014 Elsevier Ltd. All rights reserved.

  20. Multivariate Quality Control Procedures

    DTIC Science & Technology

    1988-10-01

    The mathematical modeling work described in this report was authorized under Project No. IC162706A553, CB Defense and...the sum of the measurements. A CUSUM of the first principal component would detect changes in the overall thickness of the sheet. A linear trend could...development of a unique outlier rule for the specific application.

  1. Real time damage detection using recursive principal components and time varying auto-regressive modeling

    NASA Astrophysics Data System (ADS)

    Krishnan, M.; Bhowmik, B.; Hazra, B.; Pakrashi, V.

    2018-02-01

    In this paper, a novel baseline-free approach for continuous online damage detection of multi-degree-of-freedom vibrating structures using Recursive Principal Component Analysis (RPCA) in conjunction with Time Varying Auto-Regressive Modeling (TVAR) is proposed. In this method, the acceleration data are used to obtain recursive proper orthogonal components online using the rank-one perturbation method, followed by TVAR modeling of the first transformed response, to detect the change in the dynamic behavior of the vibrating system from its pristine state to contiguous linear/non-linear states that indicate damage. Most of the works available in the literature deal with algorithms that require windowing of the gathered data owing to their data-driven nature, which renders them ineffective for online implementation. Algorithms focused on mathematically consistent recursive techniques within a rigorous theoretical framework of structural damage detection are missing, which motivates the development of the present framework; it is amenable to online implementation and can be used alongside a suite of experimental and numerical investigations. The RPCA algorithm iterates the eigenvector and eigenvalue estimates of the sample covariance matrix with each new data point at successive time instants, using the rank-one perturbation method. TVAR modeling on the principal component explaining maximum variance is then utilized, and damage is identified by tracking the TVAR coefficients. This eliminates the need for offline post-processing and facilitates online damage detection, especially when applied to streaming data, without requiring any baseline data. Numerical simulations performed on a 5-dof nonlinear system under white noise excitation and El Centro (the 1940 Imperial Valley earthquake) excitation, for different damage scenarios, demonstrate the robustness of the proposed algorithm. The method is further validated on results from case studies involving experiments on a cantilever beam subjected to earthquake excitation and a two-storey bench-scale model with a tuned mass damper (TMD), as well as data from recorded responses of the UCLA Factor building, demonstrating the efficacy of the proposed methodology as an ideal candidate for real-time, reference-free structural health monitoring.
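
    A simplified sketch of the general idea on synthetic data, not the paper's exact rank-one perturbation update: the covariance estimate is refreshed recursively with a forgetting factor, the response is projected on the leading eigenvector, and time-varying AR(2) coefficients of that projection are tracked by recursive least squares (the forgetting factor, model order, and "damage" scenario are illustrative assumptions).

    ```python
    # Recursive covariance update + projection on the leading eigenvector,
    # with recursive-least-squares tracking of time-varying AR(2) coefficients.
    import numpy as np

    rng = np.random.default_rng(3)
    T, ndof = 2000, 5
    acc = rng.normal(size=(T, ndof))          # stand-in for measured accelerations
    acc[1200:] *= 1.5                         # crude "damage" introduced at t = 1200

    lam = 0.99                                # forgetting factor
    C = np.eye(ndof) * 1e-3                   # initial covariance estimate
    z_hist = []                               # first transformed response
    theta = np.zeros(2)                       # AR(2) coefficients
    P = np.eye(2) * 1e3                       # RLS covariance
    ar_track = []

    for t in range(T):
        x = acc[t]
        C = lam * C + (1 - lam) * np.outer(x, x)     # recursive covariance update
        w = np.linalg.eigh(C)[1][:, -1]              # leading eigenvector
        z_hist.append(x @ w)
        if len(z_hist) > 2:
            phi = np.array([z_hist[-2], z_hist[-3]]) # AR regressors
            err = z_hist[-1] - phi @ theta
            K = P @ phi / (lam + phi @ P @ phi)      # RLS gain
            theta = theta + K * err
            P = (P - np.outer(K, phi) @ P) / lam
        ar_track.append(theta.copy())

    # Tracking theta (and the projected response) over time is the
    # damage-sensitive feature in this kind of scheme.
    ```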

  2. Linear Tidal Vestige Found in the WM Sheet

    NASA Astrophysics Data System (ADS)

    Lee, Jounghun; Kim, Suk; Rey, Soo-Chang

    2018-06-01

    We present a vestige of the linear tidal influence on the spin orientations of the constituent galaxies of the WM sheet discovered in the vicinity of the Virgo Cluster and the Local Void. The WM sheet is chosen as an optimal target since it has a rectangular parallelepiped-like shape whose three sides are in parallel with the supergalactic Cartesian axes. Determining three probability density functions of the absolute values of the supergalactic Cartesian components of the spin vectors of the WM sheet galaxies, we investigate their alignments with the principal directions of the surrounding large-scale tidal field. When the WM sheet galaxies located in the central region within the distance of 2 h ‑1 Mpc are excluded, the spin vectors of the remaining WM sheet galaxies are found to be weakly aligned, strongly aligned, and strongly anti-aligned with the minor, intermediate, and major principal directions of the surrounding large-scale tidal field, respectively. To examine whether or not the origin of the observed alignment tendency from the WM sheet is the linear tidal effect, we infer the eigenvalues of the linear tidal tensor from the axial ratios of the WM sheet with the help of the Zeldovich approximation and conduct a full analytic evaluation of the prediction of the linear tidal torque model for the three probability density functions. A detailed comparison between the analytical and the observational results reveals a good quantitative agreement not only in the behaviors but also in the amplitudes of the three probability density functions.

  3. Some constraints on levels of shear stress in the crust from observations and theory.

    USGS Publications Warehouse

    McGarr, A.

    1980-01-01

    In situ stress determinations in North America, southern Africa, and Australia indicate that on average the maximum shear stress increases linearly with depth to at least 5.1 km measured in soft rock, such as shale and sandstone, and to 3.7 km in hard rock, including granite and quartzite. Regression lines fitted to the data yield gradients of 3.8 MPa/km and 6.6 MPa/km for soft and hard rock, respectively. Generally, the maximum shear stress in compressional states of stress, for which the least principal stress is oriented nearly vertically, is substantially greater than in extensional stress regimes, with the greatest principal stress in a vertical direction. The equations of equilibrium and compatibility can be used to provide functional constraints on the state of stress. If the stress is assumed to vary only with depth z in a given region, then all nonzero components must have the form A + Bz, where A and B are constants which generally differ for the various components. - Author

  4. Discrimination of serum Raman spectroscopy between normal and colorectal cancer

    NASA Astrophysics Data System (ADS)

    Li, Xiaozhou; Yang, Tianyue; Yu, Ting; Li, Siqi

    2011-07-01

    Raman spectroscopy of tissues has been widely studied for the diagnosis of various cancers, but biofluids have seldom been used as the analyte because of their low analyte concentrations. Here, Raman spectra of serum from 30 normal subjects, 46 colon cancer patients, and 44 rectal cancer patients were measured and analyzed. Information from the Raman peaks (intensity and width) and from the fluorescence background (baseline function coefficients) was selected as parameters for statistical analysis. Principal component regression (PCR) and partial least squares regression (PLSR) were applied separately to the selected parameters to assess their performance. PCR performed better than PLSR on our spectral data. Linear discriminant analysis (LDA) was then applied to the principal components (PCs) of the two regression methods on the selected parameters, and diagnostic accuracies of 88% and 83% were obtained. The conclusion is that the selected features retain the information of the original spectra well and that Raman spectroscopy of serum has potential for the diagnosis of colorectal cancer.
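
    A hedged sketch of the PCA-then-LDA step described above, on simulated feature data (the feature count, number of components, and class sizes are illustrative assumptions): the selected spectral parameters are reduced to principal component scores, which are then classified with LDA under cross-validation.

    ```python
    # PCA scores of selected spectral parameters classified with LDA.
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(4)
    features = rng.normal(size=(120, 30))   # peak intensities/widths + baseline coefficients (simulated)
    labels = rng.integers(0, 2, size=120)   # 0 = normal, 1 = colorectal cancer (simulated)

    model = make_pipeline(StandardScaler(), PCA(n_components=8),
                          LinearDiscriminantAnalysis())
    # Near-chance accuracy on random data; real spectra are needed for meaningful rates.
    print("diagnostic accuracy:", cross_val_score(model, features, labels, cv=5).mean())
    ```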

  5. Vibrational structure and antimicrobial activity of selected isonicotinates, potassium picolinate and nicotinate

    NASA Astrophysics Data System (ADS)

    Koczoń, P.; Piekut, J.; Borawska, M.; Lewandowski, W.

    2003-06-01

    The lithium, sodium, potassium, rubidium and caesium isonicotinates, potassium picolinate and nicotinate (microbiological data), as well as sodium benzoate (as a reference for the microbiological tests), were studied. The selected experimental bands occurring in the FT-IR and FT-Raman spectra of the studied alkali metal isonicotinates and potassium picolinate were assigned. The change of wavenumber of those bands was observed along the metal series and with the change of position of nitrogen in the aromatic ring. A linear combination of the wavenumbers of the assigned bands (principal component analysis) was performed to estimate the change in the electronic properties of the molecule along the metal series. The antimicrobial activity of the studied complexes against the yeasts Hansenula anomala and Saccharomyces cerevisiae, and the bacteria Escherichia coli and Bacillus subtilis, was measured after 24 and 48 h of incubation. An attempt was made to find out whether there is any correlation between the first principal component and the degree of growth inhibition exhibited by the studied complexes against the studied microorganisms.

  6. PCA-LBG-based algorithms for VQ codebook generation

    NASA Astrophysics Data System (ADS)

    Tsai, Jinn-Tsong; Yang, Po-Yuan

    2015-04-01

    Vector quantisation (VQ) codebooks are generated by combining principal component analysis (PCA) algorithms with Linde-Buzo-Gray (LBG) algorithms. All training vectors are grouped according to the projected values of the principal components. The PCA-LBG-based algorithms include (1) PCA-LBG-Median, which selects the median vector of each group, (2) PCA-LBG-Centroid, which adopts the centroid vector of each group, and (3) PCA-LBG-Random, which randomly selects a vector from each group. The LBG algorithm then refines the codebook starting from the initial codebook vectors supplied by the PCA. The PCA performs an orthogonal transformation to convert a set of potentially correlated variables into a set of variables that are not linearly correlated. Because the orthogonal transformation efficiently distinguishes test image vectors, the proposed PCA-LBG-based algorithms are expected to outperform conventional algorithms in designing VQ codebooks. The experimental results confirm that the proposed PCA-LBG-based algorithms indeed obtain better results than existing methods reported in the literature.
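
    An illustrative sketch of the PCA-LBG-Median variant on random training vectors (the block size, codebook size, quantile-based grouping, and iteration count are assumptions, not the authors' exact procedure): vectors are grouped by their projection on the first principal component, the median of each group seeds the codebook, and standard LBG/Lloyd iterations refine it.

    ```python
    # PCA-seeded codebook (group medians along PC1) refined by LBG/Lloyd iterations.
    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(5)
    vectors = rng.normal(size=(4096, 16))            # 4x4 image blocks, flattened (simulated)
    K = 16                                           # codebook size

    # 1) Group training vectors by their projection on the first principal component
    proj = PCA(n_components=1).fit_transform(vectors).ravel()
    edges = np.quantile(proj, np.linspace(0, 1, K + 1))
    groups = np.clip(np.searchsorted(edges, proj, side="right") - 1, 0, K - 1)
    codebook = np.array([np.median(vectors[groups == k], axis=0) for k in range(K)])

    # 2) Refine the codebook with LBG (Lloyd) iterations
    for _ in range(20):
        d = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        nearest = d.argmin(axis=1)
        for k in range(K):
            if np.any(nearest == k):
                codebook[k] = vectors[nearest == k].mean(axis=0)
    ```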

  7. Principal components technique analysis for vegetation and land use discrimination. [Brazilian cerrados

    NASA Technical Reports Server (NTRS)

    Parada, N. D. J. (Principal Investigator); Formaggio, A. R.; Dossantos, J. R.; Dias, L. A. V.

    1984-01-01

    An automatic pre-processing technique called Principal Components (PRINCO) was evaluated for analyzing digitized LANDSAT data on land use and vegetation cover of the Brazilian cerrados. The chosen pilot area, 223/67 of MSS/LANDSAT 3, was classified on a GE Image-100 System through a maximum-likelihood algorithm (MAXVER). The same procedure was applied to the PRINCO-treated image. PRINCO consists of a linear transformation performed on the original bands in order to eliminate the information redundancy of the LANDSAT channels. After PRINCO only two channels were used, thus reducing computational effort. The grey levels of the original channels and the PRINCO channels for the five identified classes (grassland, "cerrado", burned areas, anthropic areas, and gallery forest) were obtained through the MAXVER algorithm. This algorithm also provided the average performance for both cases. In order to evaluate the results, the Jeffreys-Matusita distance (JM-distance) between classes was computed. The classification matrix obtained through MAXVER after PRINCO pre-processing showed approximately the same average performance in class separability.

  8. Effects of distillation system and yeast strain on the aroma profile of Albariño (Vitis vinifera L.) grape pomace spirits.

    PubMed

    Arrieta-Garay, Y; Blanco, P; López-Vázquez, C; Rodríguez-Bencomo, J J; Pérez-Correa, J R; López, F; Orriols, I

    2014-10-29

    Orujo is a traditional alcoholic beverage produced in Galicia (northwest Spain) from distillation of grape pomace, a byproduct of the winemaking industry. In this study, the effect of the distillation system (copper charentais alembic versus packed column) and the yeast strain (native yeast L1 versus commercial yeast L2) on the chemical and sensory characteristics of orujo obtained from Albariño (Vitis vinifera L.) grape pomace has been analyzed. Principal component analysis, with two components explaining 74% of the variance, is able to clearly differentiate the distillates according to distillation system and yeast strain. Principal component 1, mainly defined by C6-C12 esters, isoamyl octanoate, and methanol, differentiates L1 from L2 distillates. In turn, principal component 2, mainly defined by linear alcohols, linalool, and 1-hexenol, differentiates alembic from packed column distillates. In addition, an aroma descriptive test reveals that the distillate obtained with a packed column from a pomace fermented with L1 presented the highest positive general impression, which is associated with the highest fruity and smallest solvent aroma scores. Moreover, chemical analysis shows that use of a packed column increases average ethanol recovery by 12%, increases the concentration of C6-C12 esters by 25%, and reduces the concentration of higher alcohols by 21%. In turn, L2 yeast obtained lower scores in the alembic distillates aroma profile. In addition, with L1, 9% higher ethanol yields were achieved, and L2 distillates contained 34%-40% more methanol than L1 distillates.

  9. Comparative Study of SVM Methods Combined with Voxel Selection for Object Category Classification on fMRI Data

    PubMed Central

    Song, Sutao; Zhan, Zhichao; Long, Zhiying; Zhang, Jiacai; Yao, Li

    2011-01-01

    Background Support vector machine (SVM) has been widely used as an accurate and reliable method to decipher brain patterns from functional MRI (fMRI) data. Previous studies have not found a clear benefit for non-linear (polynomial kernel) SVM over linear SVM. Here, a more effective non-linear SVM using a radial basis function (RBF) kernel is compared with linear SVM. Unlike traditional studies, which focused either merely on the evaluation of different types of SVM or on the voxel selection methods, we aimed to investigate the overall performance of linear and RBF SVM for fMRI classification together with voxel selection schemes, in terms of classification accuracy and computation time. Methodology/Principal Findings Six different voxel selection methods were employed to decide which voxels of fMRI data would be included in SVM classifiers with linear and RBF kernels in classifying 4-category objects. The overall performances of the voxel selection and classification methods were then compared. Results showed that: (1) Voxel selection had an important impact on the classification accuracy of the classifiers: in a relatively low-dimensional feature space, RBF SVM outperformed linear SVM significantly; in a relatively high-dimensional space, linear SVM performed better than its counterpart; (2) Considering classification accuracy and computation time holistically, linear SVM with relatively more voxels as features and RBF SVM with a small set of voxels (after PCA) achieved better accuracy in less time. Conclusions/Significance The present work provides the first empirical comparison of linear and RBF SVM for classification of fMRI data in combination with voxel selection methods. Based on the findings, if only classification accuracy is of concern, RBF SVM with an appropriately small set of voxels and linear SVM with relatively more voxels are two suggested solutions; if users are more concerned about computational time, RBF SVM with a relatively small set of voxels, keeping part of the principal components as features, is a better choice. PMID:21359184
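
    A sketch of the comparison described above on simulated data (voxel counts, the number of selected features/components, and the cross-validation scheme are illustrative assumptions): linear and RBF SVMs are evaluated after two voxel/feature reduction schemes, univariate F-score selection and PCA.

    ```python
    # Linear vs RBF SVM after two feature-reduction schemes (F-score selection, PCA).
    import numpy as np
    from sklearn.svm import SVC
    from sklearn.decomposition import PCA
    from sklearn.feature_selection import SelectKBest, f_classif
    from sklearn.pipeline import make_pipeline
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(6)
    X = rng.normal(size=(160, 5000))        # trials x voxels (simulated fMRI features)
    y = rng.integers(0, 4, size=160)        # 4 object categories

    pipelines = {
        "F-score + linear SVM": make_pipeline(SelectKBest(f_classif, k=500), SVC(kernel="linear")),
        "F-score + RBF SVM":    make_pipeline(SelectKBest(f_classif, k=500), SVC(kernel="rbf")),
        "PCA + linear SVM":     make_pipeline(PCA(n_components=50), SVC(kernel="linear")),
        "PCA + RBF SVM":        make_pipeline(PCA(n_components=50), SVC(kernel="rbf")),
    }
    # Accuracies are near chance on random data; the point is the pipeline structure.
    for name, pipe in pipelines.items():
        print(name, cross_val_score(pipe, X, y, cv=5).mean())
    ```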

  10. Forest Species Identification with High Spectral Resolution Data

    NASA Technical Reports Server (NTRS)

    Olson, C. E., Jr.; Zhu, Z.

    1985-01-01

    Data collected over the Sleeping Bear Sand Dunes Test Site and the Saginaw Forest Test Site (Michigan) with the JPL Airborne Imaging Spectrometer and the Collins' Airborne Spectroradiometer are being used for forest species identification. The linear discriminant function has provided higher identification accuracies than have principal components analyses. Highest identification accuracies are obtained in the 450 to 520 nm spectral region. Spectral bands near 1,300, 1,685 and 2,220 nm appear to be important, also.

  11. Multilayer neural networks for reduced-rank approximation.

    PubMed

    Diamantaras, K I; Kung, S Y

    1994-01-01

    This paper is developed in two parts. First, the authors formulate the solution to the general reduced-rank linear approximation problem relaxing the invertibility assumption of the input autocorrelation matrix used by previous authors. The authors' treatment unifies linear regression, Wiener filtering, full rank approximation, auto-association networks, SVD and principal component analysis (PCA) as special cases. The authors' analysis also shows that two-layer linear neural networks with reduced number of hidden units, trained with the least-squares error criterion, produce weights that correspond to the generalized singular value decomposition of the input-teacher cross-correlation matrix and the input data matrix. As a corollary the linear two-layer backpropagation model with reduced hidden layer extracts an arbitrary linear combination of the generalized singular vector components. Second, the authors investigate artificial neural network models for the solution of the related generalized eigenvalue problem. By introducing and utilizing the extended concept of deflation (originally proposed for the standard eigenvalue problem) the authors are able to find that a sequential version of linear BP can extract the exact generalized eigenvector components. The advantage of this approach is that it's easier to update the model structure by adding one more unit or pruning one or more units when the application requires it. An alternative approach for extracting the exact components is to use a set of lateral connections among the hidden units trained in such a way as to enforce orthogonality among the upper- and lower-layer weights. The authors call this the lateral orthogonalization network (LON) and show via theoretical analysis-and verify via simulation-that the network extracts the desired components. The advantage of the LON-based model is that it can be applied in a parallel fashion so that the components are extracted concurrently. Finally, the authors show the application of their results to the solution of the identification problem of systems whose excitation has a non-invertible autocorrelation matrix. Previous identification methods usually rely on the invertibility assumption of the input autocorrelation, therefore they can not be applied to this case.

  12. Serum Folate Shows an Inverse Association with Blood Pressure in a Cohort of Chinese Women of Childbearing Age: A Cross-Sectional Study

    PubMed Central

    Shen, Minxue; Tan, Hongzhuan; Zhou, Shujin; Retnakaran, Ravi; Smith, Graeme N.; Davidge, Sandra T.; Trasler, Jacquetta; Walker, Mark C.; Wen, Shi Wu

    2016-01-01

    Background It has been reported that higher folate intake from food and supplementation is associated with decreased blood pressure (BP). The association between serum folate concentration and BP has been examined in few studies. We aim to examine the association between serum folate and BP levels in a cohort of young Chinese women. Methods We used the baseline data from a pre-conception cohort of women of childbearing age in Liuyang, China, for this study. Demographic data were collected by structured interview. Serum folate concentration was measured by immunoassay, and homocysteine, blood glucose, triglyceride and total cholesterol were measured through standardized clinical procedures. Multiple linear regression and principal component regression model were applied in the analysis. Results A total of 1,532 healthy normotensive non-pregnant women were included in the final analysis. The mean concentration of serum folate was 7.5 ± 5.4 nmol/L and 55% of the women presented with folate deficiency (< 6.8 nmol/L). Multiple linear regression and principal component regression showed that serum folate levels were inversely associated with systolic and diastolic BP, after adjusting for demographic, anthropometric, and biochemical factors. Conclusions Serum folate is inversely associated with BP in non-pregnant women of childbearing age with high prevalence of folate deficiency. PMID:27182603

  13. Risk prediction for myocardial infarction via generalized functional regression models.

    PubMed

    Ieva, Francesca; Paganoni, Anna M

    2016-08-01

    In this paper, we propose a generalized functional linear regression model for a binary outcome indicating the presence/absence of a cardiac disease with multivariate functional data among the relevant predictors. In particular, the motivating aim is the analysis of electrocardiographic traces of patients whose pre-hospital electrocardiogram (ECG) has been sent to 118 Dispatch Center of Milan (the Italian free-toll number for emergencies) by life support personnel of the basic rescue units. The statistical analysis starts with a preprocessing of ECGs treated as multivariate functional data. The signals are reconstructed from noisy observations. The biological variability is then removed by a nonlinear registration procedure based on landmarks. Thus, in order to perform a data-driven dimensional reduction, a multivariate functional principal component analysis is carried out on the variance-covariance matrix of the reconstructed and registered ECGs and their first derivatives. We use the scores of the Principal Components decomposition as covariates in a generalized linear model to predict the presence of the disease in a new patient. Hence, a new semi-automatic diagnostic procedure is proposed to estimate the risk of infarction (in the case of interest, the probability of being affected by Left Bundle Brunch Block). The performance of this classification method is evaluated and compared with other methods proposed in literature. Finally, the robustness of the procedure is checked via leave-j-out techniques. © The Author(s) 2013.

  14. On the Fallibility of Principal Components in Research

    ERIC Educational Resources Information Center

    Raykov, Tenko; Marcoulides, George A.; Li, Tenglong

    2017-01-01

    The measurement error in principal components extracted from a set of fallible measures is discussed and evaluated. It is shown that as long as one or more measures in a given set of observed variables contains error of measurement, so also does any principal component obtained from the set. The error variance in any principal component is shown…

  15. Progress Towards Improved Analysis of TES X-ray Data Using Principal Component Analysis

    NASA Technical Reports Server (NTRS)

    Busch, S. E.; Adams, J. S.; Bandler, S. R.; Chervenak, J. A.; Eckart, M. E.; Finkbeiner, F. M.; Fixsen, D. J.; Kelley, R. L.; Kilbourne, C. A.; Lee, S.-J.; hide

    2015-01-01

    The traditional method of applying a digital optimal filter to measure X-ray pulses from transition-edge sensor (TES) devices does not achieve the best energy resolution when the signals have a highly non-linear response to energy, or the noise is non-stationary during the pulse. We present an implementation of a method to analyze X-ray data from TESs, which is based upon principal component analysis (PCA). Our method separates the X-ray signal pulse into orthogonal components that have the largest variance. We typically recover pulse height, arrival time, differences in pulse shape, and the variation of pulse height with detector temperature. These components can then be combined to form a representation of pulse energy. An added value of this method is that by reporting information on more descriptive parameters (as opposed to a single number representing energy), we generate a much more complete picture of the pulse received. Here we report on progress in developing this technique for future implementation on X-ray telescopes. We used an 55Fe source to characterize Mo/Au TESs. On the same dataset, the PCA method recovers a spectral resolution that is better by a factor of two than achievable with digital optimal filters.

  16. Three-Component Decomposition of Polarimetric SAR Data Integrating Eigen-Decomposition Results

    NASA Astrophysics Data System (ADS)

    Lu, Da; He, Zhihua; Zhang, Huan

    2018-01-01

    This paper presents a novel three-component scattering power decomposition of polarimetric SAR data. There are two problems with the three-component decomposition method: overestimation of the volume scattering component in urban areas, and a model parameter that is artificially set to a fixed value. Although the volume scattering overestimation can be partly solved by a deorientation process, volume scattering still dominates some oriented urban areas. The speckle-like decomposition results introduced by the artificially set value are not conducive to further image interpretation. This paper integrates the results of eigen-decomposition to solve the aforementioned problems. Two principal eigenvectors are used to substitute for the surface scattering model and the double-bounce scattering model. The decomposed scattering powers are obtained using a constrained linear least-squares method. The proposed method has been verified using an ESAR PolSAR image, and the results show that the proposed method has better performance in urban areas.

  17. Application of principal component regression and artificial neural network in FT-NIR soluble solids content determination of intact pear fruit

    NASA Astrophysics Data System (ADS)

    Ying, Yibin; Liu, Yande; Fu, Xiaping; Lu, Huishan

    2005-11-01

    Artificial neural networks (ANNs) have been used successfully in applications such as pattern recognition, image processing, automation and control. However, the majority of today's ANN applications use the back-propagation feed-forward ANN (BP-ANN). In this paper, back-propagation artificial neural networks (BP-ANN) were applied for modeling the soluble solids content (SSC) of intact pears from their Fourier transform near-infrared (FT-NIR) spectra. One hundred and sixty-four pear samples were used to build the calibration models and evaluate the models' predictive ability. The results were compared with classical calibration approaches, i.e. principal component regression (PCR), partial least squares (PLS) and non-linear PLS (NPLS). The effects of the choice of training parameters on the prediction model were also investigated. BP-ANN combined with principal component regression (PCR) always performed better than the classical PCR, PLS and weighted-PLS methods from the point of view of predictive ability. Based on the results, it can be concluded that FT-NIR spectroscopy and BP-ANN models can be properly employed for rapid and nondestructive determination of fruit internal quality.
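
    A hedged sketch, on simulated spectra, comparing classical PCR with a PCA + back-propagation network (an MLP regressor stands in for the BP-ANN; sample sizes, component counts, and hyper-parameters are illustrative assumptions, not the paper's settings).

    ```python
    # Classical PCR vs PCA + back-propagation (MLP) regression for SSC prediction.
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LinearRegression
    from sklearn.neural_network import MLPRegressor
    from sklearn.pipeline import make_pipeline
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(7)
    spectra = rng.normal(size=(164, 1500))         # 164 pear samples x NIR variables (simulated)
    ssc = 10 + rng.normal(size=164)                # soluble solids content (simulated, °Brix)

    pcr = make_pipeline(PCA(n_components=8), LinearRegression())
    pc_bpann = make_pipeline(PCA(n_components=8),
                             MLPRegressor(hidden_layer_sizes=(10,), max_iter=5000,
                                          random_state=0))
    for name, model in [("PCR", pcr), ("PCA + BP-ANN", pc_bpann)]:
        r2 = cross_val_score(model, spectra, ssc, cv=5, scoring="r2").mean()
        print(name, r2)
    ```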

  18. Recovery of a spectrum based on a compressive-sensing algorithm with weighted principal component analysis

    NASA Astrophysics Data System (ADS)

    Dafu, Shen; Leihong, Zhang; Dong, Liang; Bei, Li; Yi, Kang

    2017-07-01

    The purpose of this study is to improve reconstruction precision and to better reproduce the color of spectral image surfaces. A new spectral reflectance reconstruction algorithm based on an iterative threshold combined with a weighted principal component space is presented in this paper; the principal components with weighted visual features form the sparse basis. Different numbers of color cards are selected as the training samples, a multispectral image is the testing sample, and the color differences of the reconstructions are compared. The channel response values are obtained with a Mega Vision high-accuracy, multi-channel imaging system. The results show that spectral reconstruction based on the weighted principal component space is superior in performance to that based on the traditional principal component space. The color difference obtained using the compressive-sensing algorithm with weighted principal component analysis is therefore smaller than that obtained using the algorithm with traditional principal component analysis, and better reconstructed color consistency with human vision is achieved.

  19. Addressing the identification problem in age-period-cohort analysis: a tutorial on the use of partial least squares and principal components analysis.

    PubMed

    Tu, Yu-Kang; Krämer, Nicole; Lee, Wen-Chung

    2012-07-01

    In the analysis of trends in health outcomes, an ongoing issue is how to separate and estimate the effects of age, period, and cohort. As these 3 variables are perfectly collinear by definition, regression coefficients in a general linear model are not unique. In this tutorial, we review why identification is a problem, and how this problem may be tackled using partial least squares and principal components regression analyses. Both methods produce regression coefficients that fulfill the same collinearity constraint as the variables age, period, and cohort. We show that, because the constraint imposed by partial least squares and principal components regression is inherent in the mathematical relation among the 3 variables, this leads to more interpretable results. We use one dataset from a Taiwanese health-screening program to illustrate how to use partial least squares regression to analyze the trends in body heights with 3 continuous variables for age, period, and cohort. We then use another dataset of hepatocellular carcinoma mortality rates for Taiwanese men to illustrate how to use partial least squares regression to analyze tables with aggregated data. We use the second dataset to show the relation between the intrinsic estimator, a recently proposed method for the age-period-cohort analysis, and partial least squares regression. We also show that the inclusion of all indicator variables provides a more consistent approach. R code for our analyses is provided in the eAppendix.
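
    A minimal numerical sketch of why principal components regression sidesteps the identification problem (the data are simulated and the effect sizes are arbitrary): age, period and cohort are perfectly collinear because cohort = period - age, so the centred design matrix is rank-deficient; regressing on its principal components simply omits the zero-variance direction associated with that collinearity, and the back-transformed coefficients satisfy the same constraint as the variables.

    ```python
    # Principal components regression on a perfectly collinear age-period-cohort design.
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LinearRegression
    from sklearn.pipeline import make_pipeline

    rng = np.random.default_rng(8)
    age = rng.integers(20, 80, size=500).astype(float)
    period = rng.integers(1990, 2015, size=500).astype(float)
    cohort = period - age                              # exact linear dependence
    X = np.column_stack([age, period, cohort])
    y = 0.02 * age + 0.01 * period + rng.normal(scale=0.5, size=500)

    # The centred design is singular (rank 2, not 3), so OLS coefficients are not unique.
    print(np.linalg.matrix_rank(X - X.mean(axis=0)))

    pcr = make_pipeline(PCA(n_components=2), LinearRegression()).fit(X, y)
    pca, ols = pcr.named_steps["pca"], pcr.named_steps["linearregression"]
    beta = pca.components_.T @ ols.coef_
    # beta lies in the row space of the data, i.e. it fulfils the same
    # collinearity constraint (beta_age - beta_period + beta_cohort = 0).
    print(beta)
    ```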

  20. Reduced order surrogate modelling (ROSM) of high dimensional deterministic simulations

    NASA Astrophysics Data System (ADS)

    Mitry, Mina

    Computationally expensive engineering simulations can often impede the engineering design process. As a result, designers may turn to a less computationally demanding approximate, or surrogate, model to facilitate their design process. However, owing to the curse of dimensionality, classical surrogate models become too computationally expensive for high-dimensional data. To address this limitation of classical methods, we develop linear and non-linear Reduced Order Surrogate Modelling (ROSM) techniques. Two algorithms are presented, which are based on a combination of linear/kernel principal component analysis and radial basis functions. These algorithms are applied to subsonic and transonic aerodynamic data, as well as a model of a chemical spill in a channel. The results of this thesis show that ROSM can provide a significant computational benefit over classical surrogate modelling, sometimes at the expense of a minor loss in accuracy.
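
    A sketch of a linear ROSM-style surrogate under stated assumptions (the toy "simulation", the dimensions, and the choice of scipy's RBFInterpolator are illustrative, not the thesis implementation): PCA compresses the high-dimensional outputs, radial basis functions interpolate the reduced coordinates as a function of the design inputs, and the PCA basis reconstructs full-field predictions; a non-linear variant would swap in kernel PCA.

    ```python
    # Linear reduced-order surrogate: PCA on outputs + RBF interpolation of PC scores.
    import numpy as np
    from sklearn.decomposition import PCA
    from scipy.interpolate import RBFInterpolator

    rng = np.random.default_rng(9)
    n_train, n_inputs, n_outputs = 40, 3, 2000
    X = rng.uniform(-1, 1, size=(n_train, n_inputs))        # design variables
    # Stand-in for expensive simulation outputs (e.g. a discretised field)
    Y = np.sin(X @ rng.normal(size=(n_inputs, 5))) @ rng.normal(size=(5, n_outputs))

    pca = PCA(n_components=5).fit(Y)                         # reduced-order basis
    scores = pca.transform(Y)
    rbf = RBFInterpolator(X, scores)                         # inputs -> reduced coordinates

    X_new = rng.uniform(-1, 1, size=(3, n_inputs))
    Y_pred = pca.inverse_transform(rbf(X_new))               # full-field surrogate prediction
    print(Y_pred.shape)                                      # (3, 2000)
    ```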

  1. Source identification and apportionment of heavy metals in urban soil profiles.

    PubMed

    Luo, Xiao-San; Xue, Yan; Wang, Yan-Ling; Cang, Long; Xu, Bo; Ding, Jing

    2015-05-01

    Because heavy metals (HMs) occurring naturally in soils accumulate continuously due to human activities, identifying and apportioning their sources becomes a challenging task for pollution prevention in urban environments. Besides the enrichment factors (EFs) and principal component analysis (PCA) for source classification, the receptor model (Absolute Principal Component Scores-Multiple Linear Regression, APCS-MLR) and Pb isotopic mixing model were also developed to quantify the source contribution for typical HMs (Cd, Co, Cr, Cu, Mn, Ni, Pb, Zn) in urban park soils of Xiamen, a representative megacity in southeast China. Furthermore, distribution patterns of their concentrations and sources in 13 soil profiles (top 20 cm) were investigated by different depths (0-5, 5-10, 10-20 cm). Currently the principal anthropogenic source for HMs in urban soil of China is atmospheric deposition from coal combustion rather than vehicle exhaust. Specifically for Pb source by isotopic model ((206)Pb/(207)Pb and (208)Pb/(207)Pb), the average contributions were natural (49%)>coal combustion (45%)≫traffic emissions (6%). Although the urban surface soils are usually more contaminated owing to recent and current human sources, leaching effects and historic vehicle emissions can also make deep soil layer contaminated by HMs. Copyright © 2015 Elsevier Ltd. All rights reserved.

  2. Principal Component and Linkage Analysis of Cardiovascular Risk Traits in the Norfolk Isolate

    PubMed Central

    Cox, Hannah C.; Bellis, Claire; Lea, Rod A.; Quinlan, Sharon; Hughes, Roger; Dyer, Thomas; Charlesworth, Jac; Blangero, John; Griffiths, Lyn R.

    2009-01-01

    Objective(s) An individual's risk of developing cardiovascular disease (CVD) is influenced by genetic factors. This study focussed on mapping genetic loci for CVD-risk traits in a unique population isolate derived from Norfolk Island. Methods This investigation focussed on 377 individuals descended from the population founders. Principal component analysis was used to extract orthogonal components from 11 cardiovascular risk traits. Multipoint variance component methods were used to assess genome-wide linkage using SOLAR to the derived factors. A total of 285 of the 377 related individuals were informative for linkage analysis. Results A total of 4 principal components accounting for 83% of the total variance were derived. Principal component 1 was loaded with body size indicators; principal component 2 with body size, cholesterol and triglyceride levels; principal component 3 with the blood pressures; and principal component 4 with LDL-cholesterol and total cholesterol levels. Suggestive evidence of linkage for principal component 2 (h2 = 0.35) was observed on chromosome 5q35 (LOD = 1.85; p = 0.0008). While peak regions on chromosome 10p11.2 (LOD = 1.27; p = 0.005) and 12q13 (LOD = 1.63; p = 0.003) were observed to segregate with principal components 1 (h2 = 0.33) and 4 (h2 = 0.42), respectively. Conclusion(s): This study investigated a number of CVD risk traits in a unique isolated population. Findings support the clustering of CVD risk traits and provide interesting evidence of a region on chromosome 5q35 segregating with weight, waist circumference, HDL-c and total triglyceride levels. PMID:19339786

  3. Discrimination of gender-, speed-, and shoe-dependent movement patterns in runners using full-body kinematics.

    PubMed

    Maurer, Christian; Federolf, Peter; von Tscharner, Vinzenz; Stirling, Lisa; Nigg, Benno M

    2012-05-01

    Changes in gait kinematics have often been analyzed using pattern recognition methods such as principal component analysis (PCA). It is usually just the first few principal components that are analyzed, because they describe the main variability within a dataset and thus represent the main movement patterns. However, while subtle changes in gait pattern (for instance, due to different footwear) may not change main movement patterns, they may affect movements represented by higher principal components. This study was designed to test two hypotheses: (1) speed and gender differences can be observed in the first principal components, and (2) small interventions such as changing footwear change the gait characteristics of higher principal components. Kinematic changes due to different running conditions (speed: 3.1 m/s and 4.9 m/s; gender; and footwear: control shoe and adidas MicroBounce shoe) were investigated by applying PCA and support vector machine (SVM) to a full-body reflective marker setup. Differences in speed changed the basic movement pattern, as was reflected by a change in the time-dependent coefficient derived from the first principal component. Gender was differentiated by using the time-dependent coefficient derived from intermediate principal components. (Intermediate principal components are characterized by limb rotations of the thigh and shank.) Different shoe conditions were identified in higher principal components. This study showed that different interventions can be analyzed using a full-body kinematic approach. Within the well-defined vector space spanned by the data of all subjects, higher principal components should also be considered because these components show the differences that result from small interventions such as footwear changes. Crown Copyright © 2012. Published by Elsevier B.V. All rights reserved.

  4. Non-Linear Metamodeling Extensions to the Robust Parameter Design of Computer Simulations

    DTIC Science & Technology

    2016-09-15

  5. Multispectral histogram normalization contrast enhancement

    NASA Technical Reports Server (NTRS)

    Soha, J. M.; Schwartz, A. A.

    1979-01-01

    A multispectral histogram normalization or decorrelation enhancement which achieves effective color composites by removing interband correlation is described. The enhancement procedure employs either linear or nonlinear transformations to equalize principal component variances. An additional rotation to any set of orthogonal coordinates is thus possible, while full histogram utilization is maintained by avoiding the reintroduction of correlation. For the three-dimensional case, the enhancement procedure may be implemented with a lookup table. An application of the enhancement to Landsat multispectral scanning imagery is presented.

  6. Multiple long-term trends and trend reversals dominate environmental conditions in a man-made freshwater reservoir.

    PubMed

    Znachor, Petr; Nedoma, Jiří; Hejzlar, Josef; Seďa, Jaromír; Kopáček, Jiří; Boukal, David; Mrkvička, Tomáš

    2018-05-15

    Man-made reservoirs are common across the world and provide a wide range of ecological services. Environmental conditions in riverine reservoirs are affected by the changing climate, catchment-wide processes and manipulations with the water level, and water abstraction from the reservoir. Long-term trends of environmental conditions in reservoirs thus reflect a wider range of drivers in comparison to lakes, which makes the understanding of reservoir dynamics more challenging. We analysed a 32-year time series of 36 environmental variables characterising weather, land use in the catchment, reservoir hydrochemistry, hydrology and light availability in the small, canyon-shaped Římov Reservoir in the Czech Republic to detect underlying trends, trend reversals and regime shifts. To do so, we fitted linear and piecewise linear regression and a regime shift model to the time series of mean annual values of each variable and to principal components produced by Principal Component Analysis. Models were weighted and ranked using Akaike information criterion and the model selection approach. Most environmental variables exhibited temporal changes that included time-varying trends and trend reversals. For instance, dissolved organic carbon showed a linear increasing trend while nitrate concentration or conductivity exemplified trend reversal. All trend reversals and cessations of temporal trends in reservoir hydrochemistry (except total phosphorus concentrations) occurred in the late 1980s and during 1990s as a consequence of dramatic socioeconomic changes. After a series of heavy rains in the late 1990s, an administrative decision to increase the flood-retention volume of the reservoir resulted in a significant regime shift in reservoir hydraulic conditions in 1999. Our analyses also highlight the utility of the model selection framework, based on relatively simple extensions of linear regression, to describe temporal trends in reservoir characteristics. This approach can provide a solid basis for a better understanding of processes in freshwater reservoirs. Copyright © 2017 Elsevier B.V. All rights reserved.

  7. Principal Component Relaxation Mode Analysis of an All-Atom Molecular Dynamics Simulation of Human Lysozyme

    NASA Astrophysics Data System (ADS)

    Nagai, Toshiki; Mitsutake, Ayori; Takano, Hiroshi

    2013-02-01

    A new relaxation mode analysis method, which is referred to as the principal component relaxation mode analysis method, has been proposed to handle a large number of degrees of freedom of protein systems. In this method, principal component analysis is carried out first and then relaxation mode analysis is applied to a small number of principal components with large fluctuations. To reduce the contribution of fast relaxation modes in these principal components efficiently, we have also proposed a relaxation mode analysis method using multiple evolution times. The principal component relaxation mode analysis method using two evolution times has been applied to an all-atom molecular dynamics simulation of human lysozyme in aqueous solution. Slow relaxation modes and corresponding relaxation times have been appropriately estimated, demonstrating that the method is applicable to protein systems.

  8. Holonomy of a principal composite bundle connection, non-Abelian geometric phases, and gauge theory of gravity

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Viennot, David

    We show that the holonomy of a connection defined on a principal composite bundle is related by a non-Abelian Stokes theorem to the composition of the holonomies associated with the connections of the component bundles of the composite. We apply this formalism to describe the non-Abelian geometric phase (when the geometric phase generator does not commute with the dynamical phase generator). We then find an assumption under which a new kind of separation between the dynamical and the geometric phases is obtained. We also apply this formalism to the gauge theory of gravity in the presence of a Dirac spinor field in order to decompose the holonomy of the Lorentz connection into holonomies of the linear connection and of the Cartan connection.

  9. Functional principal component analysis of glomerular filtration rate curves after kidney transplant.

    PubMed

    Dong, Jianghu J; Wang, Liangliang; Gill, Jagbir; Cao, Jiguo

    2017-01-01

    This article is motivated by some longitudinal clinical data of kidney transplant recipients, where kidney function progression is recorded as the estimated glomerular filtration rates at multiple time points post kidney transplantation. We propose to use the functional principal component analysis method to explore the major source of variations of glomerular filtration rate curves. We find that the estimated functional principal component scores can be used to cluster glomerular filtration rate curves. Ordering functional principal component scores can detect abnormal glomerular filtration rate curves. Finally, functional principal component analysis can effectively estimate missing glomerular filtration rate values and predict future glomerular filtration rate values.
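
    A bare-bones sketch of the FPCA workflow on a regular time grid with simulated curves (real eGFR trajectories are sparse and irregular, which requires the more careful estimation machinery the article relies on; all sizes and parameters here are assumptions): the empirical covariance of the curves is eigen-decomposed, the leading eigenfunctions yield functional PC scores, and the scores are clustered.

    ```python
    # Discretised FPCA: eigen-decompose the empirical covariance of curves,
    # compute FPC scores, and cluster them.
    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(10)
    t = np.linspace(0, 36, 37)                        # months post-transplant (assumed grid)
    n = 150
    mean_curve = 55 - 0.2 * t
    scores_true = rng.normal(size=(n, 2)) * [8, 3]
    phi = np.column_stack([np.ones_like(t) / np.sqrt(len(t)),
                           np.sin(np.pi * t / 36)])
    curves = mean_curve + scores_true @ phi.T + rng.normal(scale=2, size=(n, len(t)))

    # Empirical mean and covariance of the curves, then eigen-decomposition
    mu = curves.mean(axis=0)
    C = np.cov(curves - mu, rowvar=False)
    vals, vecs = np.linalg.eigh(C)
    fpc = vecs[:, ::-1][:, :2]                        # leading two eigenfunctions (discretised)
    scores = (curves - mu) @ fpc                      # functional PC scores

    clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(scores)
    print(np.bincount(clusters))
    ```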

  10. Adherence to an (n-3) Fatty Acid/Fish Intake Pattern Is Inversely Associated with Metabolic Syndrome among Puerto Rican Adults in the Greater Boston Area

    PubMed Central

    Noel, Sabrina E.; Newby, P. K.; Ordovas, Jose M.; Tucker, Katherine L.

    2010-01-01

    Combinations of fatty acids may affect risk of metabolic syndrome. Puerto Ricans have a disproportionate number of chronic conditions compared with other Hispanic groups. We aimed to characterize fatty acid intake patterns of Puerto Rican adults aged 45–75 y and living in the Greater Boston area (n = 1207) and to examine associations between these patterns and metabolic syndrome. Dietary fatty acids, as a percentage of total fat, were entered into principal components analysis. Spearman correlation coefficients were used to examine associations between fatty acid intake patterns, nutrients, and food groups. Associations with metabolic syndrome were analyzed by using logistic regression and general linear models with quintiles of principal component scores. Four principal components (factors) emerged: factor 1, short- and medium-chain SFA/dairy; factor 2, (n-3) fatty acid/fish; factor 3, very long-chain (VLC) SFA and PUFA/oils; and factor 4, monounsaturated fatty acid/trans fat. The SFA/dairy factor was inversely associated with fasting serum glucose concentrations (P = 0.02) and the VLC SFA/oils factor was negatively related to waist circumference (P = 0.008). However, these associations were no longer significant after additional adjustment for BMI. The (n-3) fatty acid/fish factor was associated with a lower likelihood of metabolic syndrome (Q5 vs. Q1: odds ratio: 0.54, 95% CI: 0.34, 0.86). In summary, principal components analysis of fatty acid intakes revealed 4 dietary fatty acid patterns in this population. Identifying optimal combinations of fatty acids may be beneficial for understanding relationships with health outcomes given their diverse effects on metabolism. PMID:20702744

  11. The Relation between Factor Score Estimates, Image Scores, and Principal Component Scores

    ERIC Educational Resources Information Center

    Velicer, Wayne F.

    1976-01-01

    Investigates the relation between factor score estimates, principal component scores, and image scores. The three methods compared are maximum likelihood factor analysis, principal component analysis, and a variant of rescaled image analysis. (RC)

  12. The Butterflies of Principal Components: A Case of Ultrafine-Grained Polyphase Units

    NASA Astrophysics Data System (ADS)

    Rietmeijer, F. J. M.

    1996-03-01

    Dusts in the accretion regions of chondritic interplanetary dust particles [IDPs] consisted of three principal components: carbonaceous units [CUs], carbon-bearing chondritic units [GUs] and carbon-free silicate units [PUs]. Among others, differences among chondritic IDP morphologies and variable bulk C/Si ratios reflect variable mixtures of principal components. The spherical shapes of the initially amorphous principal components remain visible in many chondritic porous IDPs, but fusion was documented for CUs, GUs and PUs. The PUs occur as coarse- and ultrafine-grained units that include so-called GEMS. Spherical principal components preserved in an IDP as recognisable textural units have unique properties with important implications for their petrological evolution from pre-accretion processing to protoplanet alteration and dynamic pyrometamorphism. Throughout their lifetime the units behaved as closed systems without chemical exchange with other units. This behaviour is reflected in their mineralogies, while the bulk compositions of principal components define the environments wherein they were formed.

  13. Migration of scattered teleseismic body waves

    NASA Astrophysics Data System (ADS)

    Bostock, M. G.; Rondenay, S.

    1999-06-01

    The retrieval of near-receiver mantle structure from scattered waves associated with teleseismic P and S and recorded on three-component, linear seismic arrays is considered in the context of inverse scattering theory. A Ray + Born formulation is proposed which admits linearization of the forward problem and economy in the computation of the elastic wave Green's function. The high-frequency approximation further simplifies the problem by enabling (1) the use of an earth-flattened, 1-D reference model, (2) a reduction in computations to 2-D through the assumption of 2.5-D experimental geometry, and (3) band-diagonalization of the Hessian matrix in the inverse formulation. The final expressions are in a form reminiscent of the classical diffraction stack of seismic migration. Implementation of this procedure demands an accurate estimate of the scattered wave contribution to the impulse response, and thus requires the removal of both the reference wavefield and the source time signature from the raw record sections. An approximate separation of direct and scattered waves is achieved through application of the inverse free-surface transfer operator to individual station records and a Karhunen-Loeve transform to the resulting record sections. This procedure takes the full displacement field to a wave vector space wherein the first principal component of the incident wave-type section is identified with the direct wave and is used as an estimate of the source time function. The scattered displacement field is reconstituted from the remaining principal components using the forward free-surface transfer operator, and may be reduced to a scattering impulse response upon deconvolution of the source estimate. An example employing pseudo-spectral synthetic seismograms demonstrates an application of the methodology.

  14. Rotation of EOFs by the Independent Component Analysis: Towards A Solution of the Mixing Problem in the Decomposition of Geophysical Time Series

    NASA Technical Reports Server (NTRS)

    Aires, Filipe; Rossow, William B.; Chedin, Alain; Hansen, James E. (Technical Monitor)

    2001-01-01

The Independent Component Analysis is a recently developed technique for component extraction. This new method requires the statistical independence of the extracted components, a stronger constraint that uses higher-order statistics, instead of the classical decorrelation, a weaker constraint that uses only second-order statistics. This technique has been used recently for the analysis of geophysical time series with the goal of investigating the causes of variability in observed data (i.e. exploratory approach). We demonstrate with a data simulation experiment that, if initialized with a Principal Component Analysis, the Independent Component Analysis performs a rotation of the classical PCA (or EOF) solution. Unlike other Rotation Techniques (RT), this rotation uses no localization criterion; only the global generalization of decorrelation by statistical independence is used. This rotation of the PCA solution seems to be able to solve the tendency of PCA to mix several physical phenomena, even when the signal is just their linear sum.
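
    A hedged sketch of the exploratory workflow described above, on toy data: a classical PCA step followed by ICA applied to the retained principal component scores, so that ICA acts as a rotation of the EOF/PCA solution driven only by statistical independence. scikit-learn's FastICA is used purely for illustration; it is not necessarily the algorithm used by the authors.

    ```python
    import numpy as np
    from sklearn.decomposition import PCA, FastICA

    rng = np.random.default_rng(1)

    # Two independent, non-Gaussian sources mixed linearly: the situation in
    # which plain PCA tends to mix the underlying physical signals.
    t = np.linspace(0, 10, 2000)
    sources = np.column_stack([np.sign(np.sin(3 * t)), rng.laplace(size=t.size)])
    observed = sources @ rng.standard_normal((2, 6))   # six observed "grid points"

    # Step 1: classical PCA/EOF decomposition (second-order statistics only).
    pc_scores = PCA(n_components=2).fit_transform(observed)

    # Step 2: ICA initialized on the PCA solution, effectively a rotation of the
    # EOFs using higher-order statistics, with no localization criterion.
    ica = FastICA(n_components=2, random_state=0)
    independent_components = ica.fit_transform(pc_scores)
    ```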

  15. Independent component analysis decomposition of hospital emergency department throughput measures

    NASA Astrophysics Data System (ADS)

    He, Qiang; Chu, Henry

    2016-05-01

We present a method adapted from medical sensor data analysis, viz. independent component analysis of electroencephalography data, to health system analysis. Timely and effective care in a hospital emergency department is measured by throughput measures such as the median times patients spent before being admitted as an inpatient, before being sent home, or before being seen by a healthcare professional. We consider a set of five such measures collected at 3,086 hospitals distributed across the U.S. One model of the performance of an emergency department is that these correlated throughput measures are linear combinations of some underlying sources. The independent component analysis decomposition of the data set can thus be viewed as transforming a set of performance measures collected at a site to a collection of outputs of spatial filters applied to the whole multi-measure data. We compare the independent component sources with the output of the conventional principal component analysis to show that the independent components are more suitable for understanding the data sets through visualizations.

  16. First impressions: gait cues drive reliable trait judgements.

    PubMed

    Thoresen, John C; Vuong, Quoc C; Atkinson, Anthony P

    2012-09-01

    Personality trait attribution can underpin important social decisions and yet requires little effort; even a brief exposure to a photograph can generate lasting impressions. Body movement is a channel readily available to observers and allows judgements to be made when facial and body appearances are less visible; e.g., from great distances. Across three studies, we assessed the reliability of trait judgements of point-light walkers and identified motion-related visual cues driving observers' judgements. The findings confirm that observers make reliable, albeit inaccurate, trait judgements, and these were linked to a small number of motion components derived from a Principal Component Analysis of the motion data. Parametric manipulation of the motion components linearly affected trait ratings, providing strong evidence that the visual cues captured by these components drive observers' trait judgements. Subsequent analyses suggest that reliability of trait ratings was driven by impressions of emotion, attractiveness and masculinity. Copyright © 2012 Elsevier B.V. All rights reserved.

  17. The relationships between spatial ability, logical thinking, mathematics performance and kinematics graph interpretation skills of 12th grade physics students

    NASA Astrophysics Data System (ADS)

    Bektasli, Behzat

Graphs are widely used in science classrooms, especially in physics. In physics, kinematics is probably the topic for which graphs are most widely used. The participants in this study were from two different grade-12 physics classrooms, advanced placement and calculus-based physics. The main purpose of this study was to search for the relationships between student spatial ability, logical thinking, mathematical achievement, and kinematics graph interpretation skills. The Purdue Spatial Visualization Test, the Middle Grades Integrated Process Skills Test (MIPT), and the Test of Understanding Graphs in Kinematics (TUG-K) were used for quantitative data collection. Classroom observations were made to acquire ideas about classroom environment and instructional techniques. Factor analysis, simple linear correlation, multiple linear regression, and descriptive statistics were used to analyze the quantitative data. Each instrument has two principal components. The selection and calculation of the slope and of the area were the two principal components of TUG-K. MIPT was composed of a component based upon processing text and a second component based upon processing symbolic information. The Purdue Spatial Visualization Test was composed of a component based upon one-step processing and a second component based upon two-step processing of information. Student ability to determine the slope in a kinematics graph was significantly correlated with spatial ability, logical thinking, and mathematics aptitude and achievement. However, student ability to determine the area in a kinematics graph was only significantly correlated with student pre-calculus semester 2 grades. Male students performed significantly better than female students on the slope items of TUG-K. Also, male students performed significantly better than female students on the PSAT mathematics assessment and spatial ability. This study found that students have different levels of spatial ability, logical thinking, and mathematics aptitude and achievement. These different levels were related to student learning of kinematics and need to be considered when kinematics is being taught. It might be easier for students to understand kinematics graphs if curriculum developers include more activities related to spatial ability and logical thinking.

  18. Global and system-specific resting-state fMRI fluctuations are uncorrelated: principal component analysis reveals anti-correlated networks.

    PubMed

    Carbonell, Felix; Bellec, Pierre; Shmuel, Amir

    2011-01-01

    The influence of the global average signal (GAS) on functional-magnetic resonance imaging (fMRI)-based resting-state functional connectivity is a matter of ongoing debate. The global average fluctuations increase the correlation between functional systems beyond the correlation that reflects their specific functional connectivity. Hence, removal of the GAS is a common practice for facilitating the observation of network-specific functional connectivity. This strategy relies on the implicit assumption of a linear-additive model according to which global fluctuations, irrespective of their origin, and network-specific fluctuations are super-positioned. However, removal of the GAS introduces spurious negative correlations between functional systems, bringing into question the validity of previous findings of negative correlations between fluctuations in the default-mode and the task-positive networks. Here we present an alternative method for estimating global fluctuations, immune to the complications associated with the GAS. Principal components analysis was applied to resting-state fMRI time-series. A global-signal effect estimator was defined as the principal component (PC) that correlated best with the GAS. The mean correlation coefficient between our proposed PC-based global effect estimator and the GAS was 0.97±0.05, demonstrating that our estimator successfully approximated the GAS. In 66 out of 68 runs, the PC that showed the highest correlation with the GAS was the first PC. Since PCs are orthogonal, our method provides an estimator of the global fluctuations, which is uncorrelated to the remaining, network-specific fluctuations. Moreover, unlike the regression of the GAS, the regression of the PC-based global effect estimator does not introduce spurious anti-correlations beyond the decrease in seed-based correlation values allowed by the assumed additive model. After regressing this PC-based estimator out of the original time-series, we observed robust anti-correlations between resting-state fluctuations in the default-mode and the task-positive networks. We conclude that resting-state global fluctuations and network-specific fluctuations are uncorrelated, supporting a Resting-State Linear-Additive Model. In addition, we conclude that the network-specific resting-state fluctuations of the default-mode and task-positive networks show artifact-free anti-correlations.
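
    A minimal numpy sketch of the estimator described above, with simulated time series standing in for fMRI data: compute the global average signal (GAS), run PCA on the time series, select the principal component that correlates best with the GAS, and regress that component (rather than the GAS itself) out of the data. Array sizes and noise levels are illustrative assumptions.

    ```python
    import numpy as np

    rng = np.random.default_rng(2)

    # Simulated resting-state data: T time points x V voxels (illustrative sizes).
    T, V = 300, 500
    global_fluct = rng.standard_normal(T)
    data = 0.8 * global_fluct[:, None] + rng.standard_normal((T, V))

    gas = data.mean(axis=1)                       # global average signal

    # PCA of the time series; the columns of U*s are the PC time courses.
    centered = data - data.mean(axis=0)
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)
    pcs = U * s                                   # mutually orthogonal PC time courses

    # PC-based global effect estimator: the PC that correlates best with the GAS.
    corr = [abs(np.corrcoef(pc, gas)[0, 1]) for pc in pcs.T]
    g = pcs[:, int(np.argmax(corr))]

    # Regress the PC-based estimator (not the GAS itself) out of every voxel.
    beta = centered.T @ g / (g @ g)
    cleaned = centered - np.outer(g, beta)
    ```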

  19. The influence of iliotibial band syndrome history on running biomechanics examined via principal components analysis.

    PubMed

    Foch, Eric; Milner, Clare E

    2014-01-03

    Iliotibial band syndrome (ITBS) is a common knee overuse injury among female runners. Atypical discrete trunk and lower extremity biomechanics during running may be associated with the etiology of ITBS. Examining discrete data points limits the interpretation of a waveform to a single value. Characterizing entire kinematic and kinetic waveforms may provide additional insight into biomechanical factors associated with ITBS. Therefore, the purpose of this cross-sectional investigation was to determine whether female runners with previous ITBS exhibited differences in kinematics and kinetics compared to controls using a principal components analysis (PCA) approach. Forty participants comprised two groups: previous ITBS and controls. Principal component scores were retained for the first three principal components and were analyzed using independent t-tests. The retained principal components accounted for 93-99% of the total variance within each waveform. Runners with previous ITBS exhibited low principal component one scores for frontal plane hip angle. Principal component one accounted for the overall magnitude in hip adduction which indicated that runners with previous ITBS assumed less hip adduction throughout stance. No differences in the remaining retained principal component scores for the waveforms were detected among groups. A smaller hip adduction angle throughout the stance phase of running may be a compensatory strategy to limit iliotibial band strain. This running strategy may have persisted after ITBS symptoms subsided. © 2013 Published by Elsevier Ltd.
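
    A hedged sketch of the waveform-PCA workflow described above, applied to simulated frontal-plane hip angle curves: retain the first three principal components and compare groups with independent t-tests on the component scores. The group sizes, waveform shapes and group difference are invented for illustration only.

    ```python
    import numpy as np
    from scipy import stats
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(3)

    # Simulated hip adduction waveforms over stance (101 points), two groups of 20.
    t = np.linspace(0, 1, 101)
    controls = 10 * np.sin(np.pi * t) + rng.standard_normal((20, 101))
    prev_itbs = 8 * np.sin(np.pi * t) + rng.standard_normal((20, 101))  # smaller magnitude
    waveforms = np.vstack([controls, prev_itbs])
    group = np.array([0] * 20 + [1] * 20)

    # Retain the first three principal components of the waveforms.
    pca = PCA(n_components=3)
    scores = pca.fit_transform(waveforms)
    print("variance explained:", pca.explained_variance_ratio_.sum())

    # Independent t-tests on each retained principal component score.
    for k in range(3):
        t_stat, p = stats.ttest_ind(scores[group == 0, k], scores[group == 1, k])
        print(f"PC{k + 1}: t = {t_stat:.2f}, p = {p:.3f}")
    ```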

  20. Linearized radiative transfer models for retrieval of cloud parameters from EPIC/DSCOVR measurements

    NASA Astrophysics Data System (ADS)

    Molina García, Víctor; Sasi, Sruthy; Efremenko, Dmitry S.; Doicu, Adrian; Loyola, Diego

    2018-07-01

    In this paper, we describe several linearized radiative transfer models which can be used for the retrieval of cloud parameters from EPIC (Earth Polychromatic Imaging Camera) measurements. The approaches under examination are (1) the linearized forward approach, represented in this paper by the linearized discrete ordinate and matrix operator methods with matrix exponential, and (2) the forward-adjoint approach based on the discrete ordinate method with matrix exponential. To enhance the performance of the radiative transfer computations, the correlated k-distribution method and the Principal Component Analysis (PCA) technique are used. We provide a compact description of the proposed methods, as well as a numerical analysis of their accuracy and efficiency when simulating EPIC measurements in the oxygen A-band channel at 764 nm. We found that the computation time of the forward-adjoint approach using the correlated k-distribution method in conjunction with PCA is approximately 13 s for simultaneously computing the derivatives with respect to cloud optical thickness and cloud top height.

  1. Multi-segmental movements as a function of experience in karate.

    PubMed

    Zago, Matteo; Codari, Marina; Iaia, F Marcello; Sforza, Chiarella

    2017-08-01

Karate is a martial art that partly depends on subjective scoring of complex movements. Principal component analysis (PCA)-based methods can identify the fundamental synergies (principal movements) of the motor system, providing a quantitative global analysis of technique. In this study, we aimed to (i) describe the fundamental multi-joint synergies of a karate performance, under the hypothesis that these synergies are skill-dependent, and (ii) estimate each karateka's experience level, expressed as years of practice. A motion capture system recorded traditional karate techniques of 10 professional and amateur karateka. At any time point, the 3D coordinates of body markers produced posture vectors that were normalised, concatenated from all karateka and submitted to a first PCA. Five principal movements described both gross movement synergies and individual differences. A second PCA followed by linear regression estimated the years of practice using principal movements (eigenpostures and weighting curves) and centre of mass kinematics (error: 3.71 years; R2 = 0.91, P ≪ 0.001). Principal movements and eigenpostures varied among different karateka and as functions of experience. This approach provides a framework to develop visual tools for the analysis of motor synergies in karate, allowing detection of the multi-joint motor patterns that should be restored after an injury, or specifically trained to increase performance.
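
    A rough sketch of the two-stage pipeline described above, on random stand-in data: a first PCA on normalised, concatenated posture vectors yields "principal movements", and a second PCA followed by linear regression estimates years of practice. The marker counts, the way per-karateka features are summarised from the weighting curves (here, their standard deviation over time) and all numbers are assumptions for illustration, not the study's values.

    ```python
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(4)

    # Stand-in posture vectors: for each of 10 karateka, F frames of 3-D marker
    # coordinates flattened to rows of length 3*M (illustrative sizes only).
    n_karateka, F, M = 10, 200, 20
    postures = rng.standard_normal((n_karateka * F, 3 * M))
    years_of_practice = rng.uniform(1, 25, size=n_karateka)

    # First PCA on the concatenated posture vectors: the leading components play
    # the role of principal movements (eigenpostures).
    pm = PCA(n_components=5).fit(postures)
    weights = pm.transform(postures).reshape(n_karateka, F, 5)

    # Per-karateka features from the weighting curves (here: their std over time),
    # then a second PCA followed by linear regression on years of practice.
    features = weights.std(axis=1)                      # (n_karateka, 5)
    feat_pcs = PCA(n_components=3).fit_transform(features)
    model = LinearRegression().fit(feat_pcs, years_of_practice)
    print("R^2 on training data:", model.score(feat_pcs, years_of_practice))
    ```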

  2. Structured penalties for functional linear models-partially empirical eigenvectors for regression.

    PubMed

    Randolph, Timothy W; Harezlak, Jaroslaw; Feng, Ziding

    2012-01-01

    One of the challenges with functional data is incorporating geometric structure, or local correlation, into the analysis. This structure is inherent in the output from an increasing number of biomedical technologies, and a functional linear model is often used to estimate the relationship between the predictor functions and scalar responses. Common approaches to the problem of estimating a coefficient function typically involve two stages: regularization and estimation. Regularization is usually done via dimension reduction, projecting onto a predefined span of basis functions or a reduced set of eigenvectors (principal components). In contrast, we present a unified approach that directly incorporates geometric structure into the estimation process by exploiting the joint eigenproperties of the predictors and a linear penalty operator. In this sense, the components in the regression are 'partially empirical' and the framework is provided by the generalized singular value decomposition (GSVD). The form of the penalized estimation is not new, but the GSVD clarifies the process and informs the choice of penalty by making explicit the joint influence of the penalty and predictors on the bias, variance and performance of the estimated coefficient function. Laboratory spectroscopy data and simulations are used to illustrate the concepts.
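
    The estimator discussed above is analysed through the generalized singular value decomposition of the predictor matrix and the penalty operator; the penalized fit itself can be sketched directly. A minimal numpy example, assuming a second-difference penalty operator and synthetic curves (both are illustrative choices, not the paper's data or its preferred penalty).

    ```python
    import numpy as np

    rng = np.random.default_rng(5)

    # Functional predictors: n curves observed on p grid points, scalar responses y.
    n, p = 60, 100
    X = rng.standard_normal((n, p)).cumsum(axis=1)      # smooth-ish random curves
    beta_true = np.sin(np.linspace(0, np.pi, p))
    y = X @ beta_true + rng.standard_normal(n)

    # Linear penalty operator L: second differences, penalising rough coefficients.
    L = np.diff(np.eye(p), n=2, axis=0)

    # Penalized estimate: argmin ||y - X b||^2 + lam * ||L b||^2, solved via the
    # normal equations (the GSVD of (X, L) diagonalises exactly this problem).
    lam = 10.0
    beta_hat = np.linalg.solve(X.T @ X + lam * L.T @ L, X.T @ y)
    ```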

  3. An Intelligent Architecture Based on Field Programmable Gate Arrays Designed to Detect Moving Objects by Using Principal Component Analysis

    PubMed Central

    Bravo, Ignacio; Mazo, Manuel; Lázaro, José L.; Gardel, Alfredo; Jiménez, Pedro; Pizarro, Daniel

    2010-01-01

    This paper presents a complete implementation of the Principal Component Analysis (PCA) algorithm in Field Programmable Gate Array (FPGA) devices applied to high rate background segmentation of images. The classical sequential execution of different parts of the PCA algorithm has been parallelized. This parallelization has led to the specific development and implementation in hardware of the different stages of PCA, such as computation of the correlation matrix, matrix diagonalization using the Jacobi method and subspace projections of images. On the application side, the paper presents a motion detection algorithm, also entirely implemented on the FPGA, and based on the developed PCA core. This consists of dynamically thresholding the differences between the input image and the one obtained by expressing the input image using the PCA linear subspace previously obtained as a background model. The proposal achieves a high ratio of processed images (up to 120 frames per second) and high quality segmentation results, with a completely embedded and reliable hardware architecture based on commercial CMOS sensors and FPGA devices. PMID:22163406
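
    The FPGA pipeline above has a straightforward software analogue. A hedged numpy sketch of PCA-based background modelling and thresholded motion detection; it uses an SVD instead of the correlation-matrix and Jacobi diagonalisation implemented in hardware, and the image size, subspace dimension and threshold are arbitrary.

    ```python
    import numpy as np

    rng = np.random.default_rng(6)

    # Background training frames and one test frame (flattened grayscale images).
    h, w, n_frames = 32, 32, 100
    background = rng.normal(100, 2, size=(n_frames, h * w))
    test_frame = background[0].copy()
    test_frame[200:260] += 80                      # a "moving object" patch

    # PCA background model: mean frame plus the leading eigenimages.
    mean_frame = background.mean(axis=0)
    U, s, Vt = np.linalg.svd(background - mean_frame, full_matrices=False)
    eigenimages = Vt[:8]                           # background subspace (8 components)

    # Express the test frame in the PCA linear subspace and reconstruct it.
    coeffs = eigenimages @ (test_frame - mean_frame)
    reconstruction = mean_frame + eigenimages.T @ coeffs

    # Threshold the reconstruction error to flag pixels belonging to moving objects.
    motion_mask = np.abs(test_frame - reconstruction) > 25
    print("moving pixels detected:", int(motion_mask.sum()))
    ```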

  4. Liquid chromatography tandem mass spectrometry determination of chemical markers and principal component analysis of Vitex agnus-castus L. fruits (Verbenaceae) and derived food supplements.

    PubMed

    Mari, Angela; Montoro, Paola; Pizza, Cosimo; Piacente, Sonia

    2012-11-01

A validated analytical method for the quantitative determination of seven chemical markers occurring in a hydroalcoholic extract of Vitex agnus-castus fruits by liquid chromatography electrospray triple quadrupole tandem mass spectrometry (LC/ESI/(QqQ)MSMS) is reported. To carry out a comparative study, five commercial food supplements corresponding to hydroalcoholic extracts of V. agnus-castus fruits were analysed under the same chromatographic conditions as the crude extract. Principal component analysis (PCA), based only on the variation of the amount of the seven chemical markers, was applied in order to find similarities between the hydroalcoholic extract and the food supplements. A second PCA was carried out considering the whole spectroscopic data set deriving from liquid chromatography electrospray linear ion trap mass spectrometry (LC/ESI/(LIT)MS) analysis. High similarity between the two PCA results was observed, showing that either approach could be selected for future applications in the comparative analysis of food supplements and in quality control procedures. Copyright © 2012 Elsevier B.V. All rights reserved.

  5. Feature extraction across individual time series observations with spikes using wavelet principal component analysis.

    PubMed

    Røislien, Jo; Winje, Brita

    2013-09-20

    Clinical studies frequently include repeated measurements of individuals, often for long periods. We present a methodology for extracting common temporal features across a set of individual time series observations. In particular, the methodology explores extreme observations within the time series, such as spikes, as a possible common temporal phenomenon. Wavelet basis functions are attractive in this sense, as they are localized in both time and frequency domains simultaneously, allowing for localized feature extraction from a time-varying signal. We apply wavelet basis function decomposition of individual time series, with corresponding wavelet shrinkage to remove noise. We then extract common temporal features using linear principal component analysis on the wavelet coefficients, before inverse transformation back to the time domain for clinical interpretation. We demonstrate the methodology on a subset of a large fetal activity study aiming to identify temporal patterns in fetal movement (FM) count data in order to explore formal FM counting as a screening tool for identifying fetal compromise and thus preventing adverse birth outcomes. Copyright © 2013 John Wiley & Sons, Ltd.
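
    A sketch of the pipeline described above, using PyWavelets and scikit-learn on simulated series: wavelet decomposition of each individual time series, soft-threshold shrinkage, linear PCA on the stacked wavelet coefficients, and inverse transformation of the leading component back to the time domain. The wavelet family, decomposition level and threshold are placeholders, not the values used in the study.

    ```python
    import numpy as np
    import pywt
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(7)

    # Simulated individual time series sharing a spike around t = 64.
    n_subjects, n_time = 30, 128
    series = rng.standard_normal((n_subjects, n_time))
    series[:, 60:68] += 5.0                                  # shared spike feature

    # Wavelet decomposition of each series, with soft-threshold shrinkage to denoise.
    wavelet, level = "db4", 4
    coeff_list, sizes = [], None
    for x in series:
        coeffs = pywt.wavedec(x, wavelet, level=level)
        coeffs = [pywt.threshold(c, value=0.5, mode="soft") for c in coeffs]
        sizes = [len(c) for c in coeffs]
        coeff_list.append(np.concatenate(coeffs))
    W = np.array(coeff_list)

    # Linear PCA on the shrunken wavelet coefficients.
    pca = PCA(n_components=2)
    pca.fit(W)

    # Inverse-transform the first principal component back to the time domain
    # so the common temporal feature can be inspected clinically.
    split = np.split(pca.components_[0], np.cumsum(sizes)[:-1])
    common_feature = pywt.waverec(list(split), wavelet)
    ```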

  6. An intelligent architecture based on Field Programmable Gate Arrays designed to detect moving objects by using Principal Component Analysis.

    PubMed

    Bravo, Ignacio; Mazo, Manuel; Lázaro, José L; Gardel, Alfredo; Jiménez, Pedro; Pizarro, Daniel

    2010-01-01

    This paper presents a complete implementation of the Principal Component Analysis (PCA) algorithm in Field Programmable Gate Array (FPGA) devices applied to high rate background segmentation of images. The classical sequential execution of different parts of the PCA algorithm has been parallelized. This parallelization has led to the specific development and implementation in hardware of the different stages of PCA, such as computation of the correlation matrix, matrix diagonalization using the Jacobi method and subspace projections of images. On the application side, the paper presents a motion detection algorithm, also entirely implemented on the FPGA, and based on the developed PCA core. This consists of dynamically thresholding the differences between the input image and the one obtained by expressing the input image using the PCA linear subspace previously obtained as a background model. The proposal achieves a high ratio of processed images (up to 120 frames per second) and high quality segmentation results, with a completely embedded and reliable hardware architecture based on commercial CMOS sensors and FPGA devices.

  7. Identification of milk origin and process-induced changes in milk by stable isotope ratio mass spectrometry.

    PubMed

    Scampicchio, Matteo; Mimmo, Tanja; Capici, Calogero; Huck, Christian; Innocente, Nadia; Drusch, Stephan; Cesco, Stefano

    2012-11-14

Stable isotope values were used to develop a new analytical approach enabling the simultaneous identification of milk samples either processed with different heating regimens or from different geographical origins. The samples consisted of raw, pasteurized (HTST), and ultrapasteurized (UHT) milk from different Italian origins. The approach consisted of the analysis of the isotope ratios δ¹³C and δ¹⁵N for the milk samples and their fractions (fat, casein, and whey). The main finding of this work is that, as heat processing affects the composition of the milk fractions, changes in δ¹³C and δ¹⁵N were also observed. These changes were used as markers to develop pattern recognition maps based on principal component analysis and supervised classification models, such as linear discriminant analysis (LDA), multivariate regression (MLR), principal component regression (PCR), and partial least-squares (PLS). The results provide proof of concept that isotope ratio mass spectrometry can discriminate simultaneously between milk samples according to their geographical origin and type of processing.

  8. Loneliness Literacy Scale: Development and Evaluation of an Early Indicator for Loneliness Prevention.

    PubMed

    Honigh-de Vlaming, Rianne; Haveman-Nies, Annemien; Bos-Oude Groeniger, Inge; Hooft van Huysduynen, Eveline J C; de Groot, Lisette C P G M; Van't Veer, Pieter

    2014-01-01

To develop and evaluate the Loneliness Literacy Scale for the assessment of short-term outcomes of a loneliness prevention programme among Dutch elderly persons. Scale development was based on evidence from the literature and experiences from local stakeholders and representatives of the target group. The scale was pre-tested among 303 elderly persons aged 65 years and over. Principal component analysis and internal consistency analysis were used to confirm the scale structure, reduce the number of items and assess the reliability of the constructs. Linear regression analysis was conducted to evaluate the association between the literacy constructs and loneliness. The four constructs "motivation", "self-efficacy", "perceived social support" and "subjective norm" derived from principal component analysis captured 56 % of the original variance. Cronbach's coefficient α was above 0.7 for each construct. The constructs "self-efficacy" and "perceived social support" were positively, and "subjective norm" was negatively, associated with loneliness. To our knowledge, this is the first study to develop a short-term indicator for loneliness prevention. The indicator addresses the need to evaluate public health interventions closer to the intervention activities.

  9. Analytical optimal pulse shapes obtained with the aid of genetic algorithms

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Guerrero, Rubén D., E-mail: rdguerrerom@unal.edu.co; Arango, Carlos A.; Reyes, Andrés

    2015-09-28

We propose a methodology to design optimal pulses for achieving quantum optimal control on molecular systems. Our approach constrains pulse shapes to linear combinations of a fixed number of experimentally relevant pulse functions. Quantum optimal control is obtained by maximizing a multi-target fitness function using genetic algorithms. As a first application of the methodology, we generated an optimal pulse that successfully maximized the yield on a selected dissociation channel of a diatomic molecule. Our pulse is obtained as a linear combination of linearly chirped pulse functions. Data recorded along the evolution of the genetic algorithm contained important information regarding the interplay between radiative and diabatic processes. We performed a principal component analysis on these data to retrieve the most relevant processes along the optimal path. Our proposed methodology could be useful for performing quantum optimal control on more complex systems by employing a wider variety of pulse shape functions.

  10. Sufficient Forecasting Using Factor Models

    PubMed Central

    Fan, Jianqing; Xue, Lingzhou; Yao, Jiawei

    2017-01-01

    We consider forecasting a single time series when there is a large number of predictors and a possible nonlinear effect. The dimensionality was first reduced via a high-dimensional (approximate) factor model implemented by the principal component analysis. Using the extracted factors, we develop a novel forecasting method called the sufficient forecasting, which provides a set of sufficient predictive indices, inferred from high-dimensional predictors, to deliver additional predictive power. The projected principal component analysis will be employed to enhance the accuracy of inferred factors when a semi-parametric (approximate) factor model is assumed. Our method is also applicable to cross-sectional sufficient regression using extracted factors. The connection between the sufficient forecasting and the deep learning architecture is explicitly stated. The sufficient forecasting correctly estimates projection indices of the underlying factors even in the presence of a nonparametric forecasting function. The proposed method extends the sufficient dimension reduction to high-dimensional regimes by condensing the cross-sectional information through factor models. We derive asymptotic properties for the estimate of the central subspace spanned by these projection directions as well as the estimates of the sufficient predictive indices. We further show that the natural method of running multiple regression of target on estimated factors yields a linear estimate that actually falls into this central subspace. Our method and theory allow the number of predictors to be larger than the number of observations. We finally demonstrate that the sufficient forecasting improves upon the linear forecasting in both simulation studies and an empirical study of forecasting macroeconomic variables. PMID:29731537
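
    A bare-bones sketch of the first stage of this approach, on simulated data: extract factors from a large predictor panel by principal component analysis, then run a multiple regression of the target on the estimated factors (the linear forecast that, per the paper, lies in the central subspace). The sufficient-forecasting refinements and the projected PCA are not reproduced; all sizes are illustrative.

    ```python
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(8)

    # A large panel of predictors driven by a few latent factors.
    T, N, K = 200, 500, 3                      # time points, predictors, true factors
    factors = rng.standard_normal((T, K))
    loadings = rng.standard_normal((K, N))
    panel = factors @ loadings + rng.standard_normal((T, N))

    # Target depends (possibly nonlinearly) on the factors; a simple example.
    y = factors[:, 0] + 0.5 * factors[:, 1] ** 2 + 0.1 * rng.standard_normal(T)

    # Stage 1: estimate the factors by principal component analysis of the panel.
    est_factors = PCA(n_components=K).fit_transform(panel)

    # Linear forecast: multiple regression of the target on the estimated factors.
    linear_fit = LinearRegression().fit(est_factors[:-1], y[1:])
    print("in-sample R^2 of the linear factor forecast:",
          linear_fit.score(est_factors[:-1], y[1:]))
    ```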

  11. Differences in kinematic control of ankle joint motions in people with chronic ankle instability.

    PubMed

    Kipp, Kristof; Palmieri-Smith, Riann M

    2013-06-01

    People with chronic ankle instability display different ankle joint motions compared to healthy people. The purpose of this study was to investigate the strategies used to control ankle joint motions between a group of people with chronic ankle instability and a group of healthy, matched controls. Kinematic data were collected from 11 people with chronic ankle instability and 11 matched control subjects as they performed a single-leg land-and-cut maneuver. Three-dimensional ankle joint angles were calculated from 100 ms before, to 200 ms after landing. Kinematic control of the three rotational ankle joint degrees of freedom was investigated by simultaneously examining the three-dimensional co-variation of plantarflexion/dorsiflexion, toe-in/toe-out rotation, and inversion/eversion motions with principal component analysis. Group differences in the variance proportions of the first two principal components indicated that the angular co-variation between ankle joint motions was more linear in the control group, but more planar in the chronic ankle instability group. Frontal and transverse plane motions, in particular, contributed to the group differences in the linearity and planarity of angular co-variation. People with chronic ankle instability use a different kinematic control strategy to coordinate ankle joint motions during a single-leg landing task. Compared to the healthy group, the chronic ankle instability group's control strategy appeared to be more complex and involved joint-specific contributions that would tend to predispose this group to recurring episodes of instability. Copyright © 2013 Elsevier Ltd. All rights reserved.

  12. [Geographical distribution of left ventricular Tei index based on principal component analysis].

    PubMed

    Xu, Jinhui; Ge, Miao; He, Jinwei; Xue, Ranyin; Yang, Shaofang; Jiang, Jilin

    2014-11-01

To provide a scientific standard of the left ventricular Tei index for healthy people from various regions of China, and to lay a reliable foundation for the evaluation of left ventricular diastolic and systolic function. Correlation and principal component analysis were used to explore the left ventricular Tei index, based on data from 3 562 samples from 50 regions of China obtained by literature retrieval. The nine geographical factors were longitude (X₁), latitude (X₂), altitude (X₃), annual sunshine hours (X₄), annual average temperature (X₅), annual average relative humidity (X₆), annual precipitation (X₇), annual temperature range (X₈) and annual average wind speed (X₉). ArcGIS software was applied to calculate the spatial distribution regularities of the left ventricular Tei index. There is a significant correlation between healthy people's left ventricular Tei index and geographical factors; the correlation coefficients were -0.107 (r₁), -0.301 (r₂), -0.029 (r₃), -0.277 (r₄), -0.256 (r₅), -0.289 (r₆), -0.320 (r₇), -0.310 (r₈) and -0.117 (r₉), respectively. A linear equation between the Tei index and the geographical factors was obtained by regression analysis based on the three extracted principal components. The geographical distribution tendency chart for healthy people's left ventricular Tei index was fitted by ArcGIS spatial interpolation analysis. The geographical distribution of the left ventricular Tei index in China follows a certain pattern: the reference value in the North is higher than that in the South, while the value in the East is higher than that in the West.

  13. Relation between dietary pattern analysis (principal component analysis) and body mass index: a 5-year follow-up study in a Belgian military population.

    PubMed

    Mullie, Patrick; Clarys, P

    2016-02-01

    Increasing body mass index (BMI) has been related to many chronic diseases. Knowledge of nutritional determinants of BMI increase may be important to detect persons at risk. A longitudinal prospective study design was used in 805 Belgian soldiers. Daily nutrition was recorded with a validated food-frequency questionnaire. Weight and height were recorded from medical military data and principal component analysis was used to detect dietary patterns. During the 5 years follow-up, mean BMI increased from 25.8 (±3.3) kg/m(2) to 27.1 (±3.6) kg/m(2) (p<0.05). Consequently, the prevalence of being overweight and obesity increased from 46.2% and 9.6% to 51.6% and 19.9% (p<0.05), respectively. Mean (SD) weight gain differed between the BMI categories at baseline with a respective weight gain of 3.8 (±3.1) kg for normal weight at baseline, 4.2 (±3.2) kg for overweight and 5.1 (±3.4) kg for obesity (p for trend <0.05). Three dietary patterns were detected by principal component analysis: Meat, Sweet and Healthy dietary pattern. In energy-unadjusted and adjusted linear regressions, no dietary pattern was associated with BMI increase. No specific dietary pattern was related to BMI increase. Prevention of obesity should focus on total energy intake at all BMI categories. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/

  14. Principal shapes and squeezed limits in the effective field theory of large scale structure

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bertolini, Daniele; Solon, Mikhail P., E-mail: dbertolini@lbl.gov, E-mail: mpsolon@lbl.gov

    2016-11-01

We apply an orthogonalization procedure on the effective field theory of large scale structure (EFT of LSS) shapes, relevant for the angle-averaged bispectrum and non-Gaussian covariance of the matter power spectrum at one loop. Assuming natural-sized EFT parameters, this identifies a linear combination of EFT shapes—referred to as the principal shape—that gives the dominant contribution for the whole kinematic plane, with subdominant combinations suppressed by a few orders of magnitude. For the covariance, our orthogonal transformation is in excellent agreement with a principal component analysis applied to available data. Additionally we find that, for both observables, the coefficients of the principal shapes are well approximated by the EFT coefficients appearing in the squeezed limit, and are thus measurable from power spectrum response functions. Employing data from N-body simulations for the growth-only response, we measure the single EFT coefficient describing the angle-averaged bispectrum with O(10%) precision. These methods of shape orthogonalization and measurement of coefficients from response functions are valuable tools for developing the EFT of LSS framework, and can be applied to more general observables.

  15. Cognitive performance predicts treatment decisional abilities in mild to moderate dementia.

    PubMed

    Gurrera, R J; Moye, J; Karel, M J; Azar, A R; Armesto, J C

    2006-05-09

To examine the contribution of neuropsychological test performance to treatment decision-making capacity in community volunteers with mild to moderate dementia. The authors recruited volunteers (44 men, 44 women) with mild to moderate dementia from the community. Subjects completed a battery of 11 neuropsychological tests that assessed auditory and visual attention, logical memory, language, and executive function. To measure decision-making capacity, the authors administered the Capacity to Consent to Treatment Interview, the Hopemont Capacity Assessment Interview, and the MacArthur Competence Assessment Tool-Treatment. Each of these instruments individually scores four decisional abilities serving capacity: understanding, appreciation, reasoning, and expression of choice. The authors used principal components analysis to generate component scores for each ability across instruments, and to extract principal components for neuropsychological performance. Multiple linear regression analyses demonstrated that neuropsychological performance significantly predicted all four abilities. Specifically, it predicted 77.8% of the common variance for understanding, 39.4% for reasoning, 24.6% for appreciation, and 10.2% for expression of choice. Except for reasoning and appreciation, neuropsychological predictor (beta) profiles were unique for each ability. Neuropsychological performance substantially and differentially predicted capacity for treatment decisions in individuals with mild to moderate dementia. Relationships between elemental cognitive function and decisional capacity may differ in individuals whose decisional capacity is impaired by other disorders, such as mental illness.

  16. Cenesthopathy and Subjective Cognitive Complaints: An Exploratory Study in Schizophrenia.

    PubMed

    Jimeno, Natalia; Vargas, Martin L

    2018-01-01

Cenesthopathy is mainly associated with schizophrenia; however, its neurobiological basis remains unclear. The general objective was to explore clinical correlates of cenesthopathy and subjective cognitive complaints in schizophrenia. Participants (n = 30) meeting DSM-IV criteria for psychotic disorder were recruited from a psychiatry unit and assessed with the Association for Methodology and Documentation in Psychiatry (AMDP) system, the Positive and Negative Syndrome Scale, the Frankfurt Complaint Questionnaire (FCQ), and the Bonn Scale for the Assessment of Basic Symptoms (BSABS). For quantitative variables, means and Spearman correlation coefficients were calculated. Linear regression following the backward method and principal component analysis with varimax rotation were used. 83.3% of subjects (73.3% male; mean age 31.5 years) presented some type of cenesthopathy; all types of cenesthetic basic symptoms were found. Cenesthetic basic symptoms significantly correlated with the AMDP category "fear and anancasm," FCQ total score, and BSABS cognitive thought disturbances. In the regression analysis only 1 predictor, cognitive thought disturbances, entered the model. In the principal component analysis, a main component which accounted for 22.69% of the variance was found. Cenesthopathy, as assessed with the Bonn Scale (BSABS), is mainly associated with cognitive abnormalities including disturbances of thought initiative and mental intentionality, of receptive speech, and subjective retardation or pressure of thoughts. © 2018 S. Karger AG, Basel.

  17. Case study on prediction of remaining methane potential of landfilled municipal solid waste by statistical analysis of waste composition data.

    PubMed

    Sel, İlker; Çakmakcı, Mehmet; Özkaya, Bestamin; Suphi Altan, H

    2016-10-01

The main objective of this study was to develop a statistical model for easier and faster Biochemical Methane Potential (BMP) prediction of landfilled municipal solid waste by analyzing the waste composition of excavated samples from 12 sampling points and three waste depths, representing different landfilling ages of the closed and active sections of a sanitary landfill site located in İstanbul, Turkey. Results of Principal Component Analysis (PCA) were used as a decision support tool to evaluate and describe the waste composition variables. Four principal components were extracted, describing 76% of the data set variance. The most effective components for the data set were determined to be PCB, PO, T, D, W, FM, moisture and BMP. Multiple Linear Regression (MLR) models were built from the original compositional data and from transformed data to determine differences. It was observed that, even though residual plots were better for the transformed data, the R(2) and adjusted R(2) values were not improved significantly. The best preliminary BMP prediction models consisted of the D, W, T and FM waste fractions for both versions of the regression. Adjusted R(2) values of the raw and transformed models were determined as 0.69 and 0.57, respectively. Copyright © 2016 Elsevier Ltd. All rights reserved.

  18. Improved estimation of parametric images of cerebral glucose metabolic rate from dynamic FDG-PET using volume-wise principle component analysis

    NASA Astrophysics Data System (ADS)

    Dai, Xiaoqian; Tian, Jie; Chen, Zhe

    2010-03-01

Parametric images can represent both the spatial distribution and the quantification of the biological and physiological parameters of tracer kinetics. The linear least squares (LLS) method is a well-established linear regression method for generating parametric images by fitting compartment models with good computational efficiency. However, bias exists in LLS-based parameter estimates, owing to the noise present in tissue time activity curves (TTACs), which propagates as correlated error in the LLS linearized equations. To address this problem, a volume-wise principal component analysis (PCA) based method is proposed. In this method, the dynamic PET data are first properly pre-transformed to standardize the noise variance, as PCA is a data-driven technique and cannot itself separate signals from noise. Secondly, volume-wise PCA is applied to the PET data. The signals are mostly represented by the first few principal components (PCs), and the noise is left in the subsequent PCs. The noise-reduced data are then obtained from the first few PCs by applying 'inverse PCA', and are transformed back according to the pre-transformation used in the first step to maintain the scale of the original data set. Finally, the new data set is used to generate parametric images using the LLS estimation method. Compared with other noise-removal methods, the proposed method achieves high statistical reliability in the generated parametric images. The effectiveness of the method is demonstrated both with computer simulations and with a clinical dynamic FDG PET study.
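
    A hedged numpy sketch of the denoising idea on simulated dynamic PET data: pre-scale the frames to roughly standardise noise, apply volume-wise PCA, keep the first few principal components, invert the transform and the scaling, then fit a linear kinetic model by least squares. The Patlak-style two-parameter model below is an illustrative stand-in for the compartment model in the paper, and all sizes and rate constants are made up.

    ```python
    import numpy as np

    rng = np.random.default_rng(9)

    # Simulated dynamic PET: F frames x V voxels.
    F, V = 30, 1000
    t = np.linspace(1, 60, F)                       # frame mid-times (minutes)
    cp = np.exp(-0.1 * t) + 0.2                     # illustrative plasma input
    true_ki = rng.uniform(0.01, 0.05, V)
    tac = np.outer(np.cumsum(cp), true_ki) + 0.1 * cp[:, None]
    noisy = tac + 0.05 * rng.standard_normal((F, V))

    # Step 1: pre-transform to roughly standardise the noise variance per frame.
    scale = noisy.std(axis=1, keepdims=True)
    z = noisy / scale

    # Steps 2-3: volume-wise PCA, keep the first few PCs, apply "inverse PCA".
    mean_z = z.mean(axis=0)
    U, s, Vt = np.linalg.svd(z - mean_z, full_matrices=False)
    k = 3
    denoised = (U[:, :k] * s[:k]) @ Vt[:k] + mean_z
    denoised *= scale                               # undo the pre-transformation

    # Step 4: linear least-squares fit of a Patlak-style model to each voxel TAC.
    X = np.c_[np.cumsum(cp), cp]                    # design matrix shared by all voxels
    coef, *_ = np.linalg.lstsq(X, denoised, rcond=None)
    ki_image = coef[0]                              # slope ~ influx constant per voxel
    ```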

  19. Generalized shrunken type-GM estimator and its application

    NASA Astrophysics Data System (ADS)

    Ma, C. Z.; Du, Y. L.

    2014-03-01

The parameter estimation problem in the linear model is considered when multicollinearity and outliers exist simultaneously. A class of new robust biased estimators, the Generalized Shrunken Type-GM estimators, together with methods for their computation, is established by combining GM estimators with biased estimators, including the ridge, principal components and Liu estimators. A numerical example shows that the most attractive advantage of these new estimators is that they not only overcome multicollinearity in the coefficient matrix and the presence of outliers, but also control the influence of leverage points.

  20. Selective principal component regression analysis of fluorescence hyperspectral image to assess aflatoxin contamination in corn

    USDA-ARS?s Scientific Manuscript database

    Selective principal component regression analysis (SPCR) uses a subset of the original image bands for principal component transformation and regression. For optimal band selection before the transformation, this paper used genetic algorithms (GA). In this case, the GA process used the regression co...

  1. Similarities between principal components of protein dynamics and random diffusion

    NASA Astrophysics Data System (ADS)

    Hess, Berk

    2000-12-01

    Principal component analysis, also called essential dynamics, is a powerful tool for finding global, correlated motions in atomic simulations of macromolecules. It has become an established technique for analyzing molecular dynamics simulations of proteins. The first few principal components of simulations of large proteins often resemble cosines. We derive the principal components for high-dimensional random diffusion, which are almost perfect cosines. This resemblance between protein simulations and noise implies that for many proteins the time scales of current simulations are too short to obtain convergence of collective motions.
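
    The observation is easy to reproduce numerically. A short numpy sketch with arbitrary dimensions: simulate high-dimensional random diffusion as a cumulative sum of random steps, compute its principal components, and compare the projection onto the first component with a half-period cosine.

    ```python
    import numpy as np

    rng = np.random.default_rng(10)

    # High-dimensional random diffusion: cumulative sum of random steps.
    n_steps, n_dim = 5000, 100
    trajectory = rng.standard_normal((n_steps, n_dim)).cumsum(axis=0)

    # Principal component analysis of the trajectory.
    centered = trajectory - trajectory.mean(axis=0)
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)
    pc1 = centered @ Vt[0]                       # projection onto the first PC

    # Compare with a half-period cosine over the simulation time.
    cosine = np.cos(np.pi * np.arange(n_steps) / n_steps)
    similarity = abs(np.corrcoef(pc1, cosine)[0, 1])
    print(f"correlation of PC1 with a cosine: {similarity:.3f}")   # typically near 1
    ```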

  2. Directly Reconstructing Principal Components of Heterogeneous Particles from Cryo-EM Images

    PubMed Central

    Tagare, Hemant D.; Kucukelbir, Alp; Sigworth, Fred J.; Wang, Hongwei; Rao, Murali

    2015-01-01

Structural heterogeneity of particles can be investigated by their three-dimensional principal components. This paper addresses the question of whether, and with what algorithm, the three-dimensional principal components can be directly recovered from cryo-EM images. The first part of the paper extends the Fourier slice theorem to covariance functions, showing that the three-dimensional covariance, and hence the principal components, of a heterogeneous particle can indeed be recovered from two-dimensional cryo-EM images. The second part of the paper proposes a practical algorithm for reconstructing the principal components directly from cryo-EM images without the intermediate step of calculating covariances. This algorithm is based on maximizing the (posterior) likelihood using the Expectation-Maximization algorithm. The last part of the paper applies this algorithm to simulated data and to two real cryo-EM data sets: a data set of the 70S ribosome with and without Elongation Factor-G (EF-G), and a data set of the influenza virus RNA-dependent RNA Polymerase (RdRP). The first principal component of the 70S ribosome data set reveals the expected conformational changes of the ribosome as the EF-G binds and unbinds. The first principal component of the RdRP data set reveals a conformational change in the two dimers of the RdRP. PMID:26049077

  3. Dark-bright solitons in coupled nonlinear Schrödinger equations with unequal dispersion coefficients.

    PubMed

    Charalampidis, E G; Kevrekidis, P G; Frantzeskakis, D J; Malomed, B A

    2015-01-01

    We study a two-component nonlinear Schrödinger system with equal, repulsive cubic interactions and different dispersion coefficients in the two components. We consider states that have a dark solitary wave in one component. Treating it as a frozen one, we explore the possibility of the formation of bright-solitonic structures in the other component. We identify bifurcation points at which such states emerge in the bright component in the linear limit and explore their continuation into the nonlinear regime. An additional analytically tractable limit is found to be that of vanishing dispersion of the bright component. We numerically identify regimes of potential stability, not only of the single-peak ground state (the dark-bright soliton), but also of excited states with one or more zero crossings in the bright component. When the states are identified as unstable, direct numerical simulations are used to investigate the outcome of the instability development. Although our principal focus is on the homogeneous setting, we also briefly touch upon the counterintuitive impact of the potential presence of a parabolic trap on the states of interest.

  4. Local Prediction Models on Mid-Atlantic Ridge MORB by Principal Component Regression

    NASA Astrophysics Data System (ADS)

    Ling, X.; Snow, J. E.; Chin, W.

    2017-12-01

The isotopic compositions of the daughter isotopes of long-lived radioactive systems (Sr, Nd, Hf and Pb) can be used to map the scale and history of mantle heterogeneities beneath mid-ocean ridges. Our goal is to relate the multidimensional structure in the existing isotopic dataset to an underlying physical reality of mantle sources. The numerical technique of Principal Component Analysis is useful for reducing the linear dependence of the data to a minimal set of orthogonal eigenvectors encapsulating the information contained (cf. Agranier et al., 2005). The dataset used for this study covers almost all the MORBs along the Mid-Atlantic Ridge (MAR), from 54°S to 77°N and 8.8°W to 46.7°W, replicating the published dataset of Agranier et al. (2005) plus 53 basalt samples dredged and analyzed since then (data from PetDB). The principal components PC1 and PC2 account for 61.56% and 29.21%, respectively, of the total isotope-ratio variability. Samples with compositions similar to HIMU, EM and DM were identified to better understand the PCs. PC1 and PC2 account for HIMU and EM, whereas PC2 has limited control over the DM source. PC3 is more strongly controlled by the depleted mantle source than PC2. This means that all three principal components have a high degree of significance relevant to the established mantle sources. We also tested the relationship between mantle heterogeneity and sample locality. The K-means clustering algorithm is a type of unsupervised learning that finds groups in the data based on feature similarity. The PC factor scores of each sample are clustered into three groups. Clusters one and three alternate along the northern and southern MAR. Cluster two appears between 45.18°N and 0.79°N and between 27.9°W and 30.40°W, alternating with cluster one. The ridge has been preliminarily divided into 16 sections considering both the clusters and the ridge segments. The principal component regression models each section based on 6 isotope ratios and the PCs. The prediction residual is about 1-2 km, meaning that the combined isotope ratios are a strong predictor of geographic location along the ridge, a slightly surprising result. PCR is a robust and powerful method for both visualizing and manipulating the multidimensional representation of isotope data.
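
    A sketch of the workflow on synthetic stand-in data (the real isotope values come from PetDB and are not reproduced here): PCA of standardised isotope ratios, K-means clustering of the PC scores into three groups, and a principal component regression predicting position along the ridge from the scores. The synthetic latitude dependence of the ratios is an assumption for illustration only.

    ```python
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(11)

    # Synthetic stand-in for the MORB table: six isotope ratios per sample that
    # vary weakly with latitude along the ridge, plus measurement noise.
    n_samples = 300
    latitude = rng.uniform(-54, 77, n_samples)
    isotopes = np.column_stack([
        0.7025 + coef * latitude + 1e-4 * rng.standard_normal(n_samples)
        for coef in np.linspace(5e-6, 2e-5, 6)
    ])

    # PCA of the standardised isotope ratios.
    Z = (isotopes - isotopes.mean(axis=0)) / isotopes.std(axis=0)
    scores = PCA(n_components=3).fit_transform(Z)

    # K-means clustering of the PC factor scores into three groups.
    clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(scores)

    # Principal component regression: predict position along the ridge from the scores.
    pcr = LinearRegression().fit(scores, latitude)
    print("PCR R^2 for latitude:", round(pcr.score(scores, latitude), 3))
    ```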

  5. [Simultaneous separation and detection of principal component isomer and related substances of raw material drug of ammonium glycyrrhizinate by RP-HPLC and structure confirmation].

    PubMed

    Zhao, Yan-Yan; Liu, Li-Yan; Han, Yuan-Yuan; Li, Yue-Qiu; Wang, Yan; Shi, Min-Jian

    2013-08-01

A simple, fast and sensitive analytical method for the simultaneous separation and detection of 18alpha-glycyrrhizinic acid, 18beta-glycyrrhizinic acid, related substance A and related substance B by RP-HPLC, together with a drug quality standard, was established. The structures of the principal component isomer and related substances of the raw material drug of ammonium glycyrrhizinate were confirmed. The European Pharmacopoeia (EP 7.0), the British Pharmacopoeia 2012, the National Drug Standards of China (WS 1-XG-2002), and related domestic and international literature were consulted to select the composition of the mobile phase. The experimental parameters, including salt concentration, pH, amount of organic solvent, column temperature and flow rate, were optimized. Finally, the assay was conducted on a Durashell-C18 column (250 mm x 4.6 mm, 5 microm) with 0.01 mol x mL(-1) ammonium perchlorate (ammonia added to adjust the pH to 8.2)-methanol (48:52) as the mobile phase at a flow rate of 0.8 mL x min(-1), with the detection wavelength set at 254 nm. The column temperature was 50 degrees C and the injection volume was 10 microL. MS, NMR, UV and RP-HPLC were used to confirm the structures of the principal component isomer and related substances of the raw material drug of ammonium glycyrrhizinate. Under the optimized separation conditions, the calibration curves of 18alpha-glycyrrhizinic acid, 18beta-glycyrrhizinic acid, related substance A and related substance B showed good linearity within the concentration range of 0.50-100 microg x mL(-1) (r = 0.999 9). The detection limits for 18alpha-glycyrrhizinic acid, 18beta-glycyrrhizinic acid, related substance A and related substance B were 0.15, 0.10, 0.10 and 0.15 microg x mL(-1), respectively. The method is sensitive and reproducible, and the results are accurate and reliable. It can be used for the chiral resolution of 18alpha-glycyrrhizinic acid and 18beta-glycyrrhizinic acid, and for determining the content of the principal component and related substances of the raw material drug of ammonium glycyrrhizinate. It is concluded that the separation of the principal component isomer of the raw material drug of ammonium glycyrrhizinate, and the validity of the structure assignment for the substance with retention time 1.2 in the European Pharmacopoeia EP 7.0 and the British Pharmacopoeia 2012, remain open to question. This may be of practical value for the quality control of the raw material drug, preparations, and Chinese herbal medicine of ammonium glycyrrhizinate.

  6. Assessment of computer techniques for processing digital LANDSAT MSS data for lithological discrimination of Serra do Ramalho, State of Bahia

    NASA Technical Reports Server (NTRS)

    Paradella, W. R. (Principal Investigator); Vitorello, I.; Monteiro, M. D.

    1984-01-01

Enhancement techniques and thematic classifications were applied to the metasediments of the Bambui Super Group (Upper Proterozoic) in the region of Serra do Ramalho, SW of the state of Bahia. Linear contrast stretch, band ratios with contrast stretch, and color composites allow lithological discrimination. The effects of human activities and of vegetation cover mask and limit, in several ways, the lithological discrimination with digital MSS data. Principal component images and color composites of linearly contrast-stretched versions of these products show lithological discrimination through tonal gradations. This set of products allows the delineation of several metasedimentary sequences at a level superior to reconnaissance mapping. Supervised (maximum likelihood classifier) and unsupervised (K-Means classifier) classification of the limestone sequence, host to fluorite mineralization, show satisfactory results.

  7. New insights into the folding of a β-sheet miniprotein in a reduced space of collective hydrogen bond variables: application to a hydrodynamic analysis of the folding flow.

    PubMed

    Kalgin, Igor V; Caflisch, Amedeo; Chekmarev, Sergei F; Karplus, Martin

    2013-05-23

    A new analysis of the 20 μs equilibrium folding/unfolding molecular dynamics simulations of the three-stranded antiparallel β-sheet miniprotein (beta3s) in implicit solvent is presented. The conformation space is reduced in dimensionality by introduction of linear combinations of hydrogen bond distances as the collective variables making use of a specially adapted principal component analysis (PCA); i.e., to make structured conformations more pronounced, only the formed bonds are included in determining the principal components. It is shown that a three-dimensional (3D) subspace gives a meaningful representation of the folding behavior. The first component, to which eight native hydrogen bonds make the major contribution (four in each beta hairpin), is found to play the role of the reaction coordinate for the overall folding process, while the second and third components distinguish the structured conformations. The representative points of the trajectory in the 3D space are grouped into conformational clusters that correspond to locally stable conformations of beta3s identified in earlier work. A simplified kinetic network based on the three components is constructed, and it is complemented by a hydrodynamic analysis. The latter, making use of "passive tracers" in 3D space, indicates that the folding flow is much more complex than suggested by the kinetic network. A 2D representation of streamlines shows there are vortices which correspond to repeated local rearrangement, not only around minima of the free energy surface but also in flat regions between minima. The vortices revealed by the hydrodynamic analysis are apparently not evident in folding pathways generated by transition-path sampling. Making use of the fact that the values of the collective hydrogen bond variables are linearly related to the Cartesian coordinate space, the RMSD between clusters is determined. Interestingly, the transition rates show an approximate exponential correlation with distance in the hydrogen bond subspace. Comparison with the many published studies shows good agreement with the present analysis for the parts that can be compared, supporting the robust character of our understanding of this "hydrogen atom" of protein folding.

  8. Identifying Crucial Parameter Correlations Maintaining Bursting Activity

    PubMed Central

    Doloc-Mihu, Anca; Calabrese, Ronald L.

    2014-01-01

Recent experimental and computational studies suggest that linearly correlated sets of parameters (intrinsic and synaptic properties of neurons) allow central pattern-generating networks to produce and maintain their rhythmic activity regardless of changing internal and external conditions. To determine the role of correlated conductances in the robust maintenance of functional bursting activity, we used our existing database of half-center oscillator (HCO) model instances of the leech heartbeat CPG. From the database, we identified functional activity groups of burster (isolated neuron) and half-center oscillator model instances and realistic subgroups of each that showed burst characteristics (principally period and spike frequency) similar to the animal. To find linear correlations among the conductance parameters maintaining functional leech bursting activity, we applied Principal Component Analysis (PCA) to each of these four groups. PCA identified a set of three maximal conductances (leak current, Leak; a persistent K current, K2; and a persistent Na+ current, P) that correlate linearly for the two groups of burster instances but not for the HCO groups. Visualizations of HCO instances in a reduced space suggested that there might be non-linear relationships between these parameters for these instances. Experimental studies have shown that period is a key attribute influenced by modulatory inputs and temperature variations in heart interneurons. Thus, we explored the sensitivity of period to changes in the maximal conductances of Leak, K2, and P, and we found that for our realistic bursters the effect of these parameters on period could not be assessed because, when they were varied individually, bursting activity was not maintained. PMID:24945358

  9. An Introductory Application of Principal Components to Cricket Data

    ERIC Educational Resources Information Center

    Manage, Ananda B. W.; Scariano, Stephen M.

    2013-01-01

    Principal Component Analysis is widely used in applied multivariate data analysis, and this article shows how to motivate student interest in this topic using cricket sports data. Here, principal component analysis is successfully used to rank the cricket batsmen and bowlers who played in the 2012 Indian Premier League (IPL) competition. In…
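
    As a hedged illustration of PCA-based ranking in this spirit (the statistics columns and numbers below are invented placeholders, not the 2012 IPL data):

        import pandas as pd
        from sklearn.decomposition import PCA
        from sklearn.preprocessing import StandardScaler

        # Hypothetical batting table; real tournament statistics would replace this.
        batsmen = pd.DataFrame(
            {"runs": [420, 310, 505, 260],
             "average": [35.0, 28.2, 42.1, 21.7],
             "strike_rate": [128.4, 119.0, 135.2, 110.3]},
            index=["A", "B", "C", "D"])

        z = StandardScaler().fit_transform(batsmen)   # put variables on a common scale
        pc1 = PCA(n_components=1).fit_transform(z)[:, 0]

        # Rank by the first principal component score; flip the sign if the
        # loadings come out negative for the "good" statistics.
        print(batsmen.assign(pc1=pc1).sort_values("pc1", ascending=False))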

  10. Identifying apple surface defects using principal components analysis and artificial neural networks

    USDA-ARS?s Scientific Manuscript database

    Artificial neural networks and principal components were used to detect surface defects on apples in near-infrared images. Neural networks were trained and tested on sets of principal components derived from columns of pixels from images of apples acquired at two wavelengths (740 nm and 950 nm). I...

  11. Coordinate measuring system

    DOEpatents

    Carlisle, Keith [Discovery Bay, CA

    2003-04-08

    An apparatus and method is utilized to measure relative rigid body motion between two bodies by measuring linear motion in the principal axis and linear motion in an orthogonal axis. From such measurements it is possible to obtain displacement, departure from straightness, and angular displacement from the principal axis of a rigid body.

  12. Finding Planets in K2: A New Method of Cleaning the Data

    NASA Astrophysics Data System (ADS)

    Currie, Miles; Mullally, Fergal; Thompson, Susan E.

    2017-01-01

    We present a new method of removing systematic flux variations from K2 light curves by employing a pixel-level principal component analysis (PCA). This method decomposes each light curve into its principal components (eigenvectors), each with an associated eigenvalue whose magnitude reflects how much influence the basis vector has on the shape of the light curve. The method assumes that the most influential basis vectors correspond to the unwanted systematic variations in the light curve produced by K2’s constant motion. We correct the raw light curve by automatically fitting and removing the strongest principal components, which generally correspond to the flux variations that result from the motion of the star in the field of view. Our primary method of choosing how many principal components to correct for estimates the noise by measuring the scatter in the light curve after Savitzky-Golay detrending, which yields the combined photometric precision value (SG-CDPP value) used in classic Kepler. We calculate this value after correcting the raw light curve for each element in a list of cumulative sums of principal components, so that we have as many noise estimates as there are principal components. We then take the derivative of the list of SG-CDPP values and select the number of principal components corresponding to the point at which the derivative effectively goes to zero; this is the optimal number of principal components to exclude from the refitting of the light curve. We find that a pixel-level PCA is sufficient for cleaning unwanted systematic and natural noise from K2’s light curves. We present preliminary results and a basic comparison to other methods of reducing the noise from the flux variations.
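
    A rough sketch of the pixel-level correction idea, assuming a matrix of per-pixel flux time series for one target; the Savitzky-Golay/SG-CDPP selection of the number of components is only indicated by the fixed n_comp here, and all names and data are placeholders:

        import numpy as np
        from sklearn.decomposition import PCA

        # Placeholder data: (time, n_pixels) flux time series for one target.
        rng = np.random.default_rng(1)
        pixels = rng.normal(size=(2000, 50))
        raw_lc = pixels.sum(axis=1)                   # simple aperture light curve

        # Principal components of the pixel time series capture the common,
        # mostly motion-induced, systematics.
        n_comp = 5                                    # in practice chosen from the SG-CDPP curve
        basis = PCA(n_components=n_comp).fit_transform(pixels)

        # Fit the strongest components (plus an offset) to the raw light curve
        # and subtract the fitted systematics.
        design = np.column_stack([basis, np.ones(len(raw_lc))])
        coeffs, *_ = np.linalg.lstsq(design, raw_lc, rcond=None)
        corrected = raw_lc - design[:, :-1] @ coeffs[:-1]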

  13. How Does District Principal Evaluation Affect Learning-Centered Principal Leadership? Evidence from Michigan School Districts

    ERIC Educational Resources Information Center

    Sun, Min; Youngs, Peter

    2009-01-01

    This study used Hierarchical Multivariate Linear models to investigate relationships between principals' behaviors and district principal evaluation purpose, focus, and assessed leadership activities in 13 school districts in Michigan. The study found that principals were more likely to engage in learning-centered leadership behaviors when the…

  14. Directly reconstructing principal components of heterogeneous particles from cryo-EM images.

    PubMed

    Tagare, Hemant D; Kucukelbir, Alp; Sigworth, Fred J; Wang, Hongwei; Rao, Murali

    2015-08-01

    Structural heterogeneity of particles can be investigated by their three-dimensional principal components. This paper addresses the question of whether, and with what algorithm, the three-dimensional principal components can be directly recovered from cryo-EM images. The first part of the paper extends the Fourier slice theorem to covariance functions showing that the three-dimensional covariance, and hence the principal components, of a heterogeneous particle can indeed be recovered from two-dimensional cryo-EM images. The second part of the paper proposes a practical algorithm for reconstructing the principal components directly from cryo-EM images without the intermediate step of calculating covariances. This algorithm is based on maximizing the posterior likelihood using the Expectation-Maximization algorithm. The last part of the paper applies this algorithm to simulated data and to two real cryo-EM data sets: a data set of the 70S ribosome with and without Elongation Factor-G (EF-G), and a data set of the influenza virus RNA dependent RNA Polymerase (RdRP). The first principal component of the 70S ribosome data set reveals the expected conformational changes of the ribosome as the EF-G binds and unbinds. The first principal component of the RdRP data set reveals a conformational change in the two dimers of the RdRP. Copyright © 2015 Elsevier Inc. All rights reserved.

  15. Perturbational formulation of principal component analysis in molecular dynamics simulation.

    PubMed

    Koyama, Yohei M; Kobayashi, Tetsuya J; Tomoda, Shuji; Ueda, Hiroki R

    2008-10-01

    Conformational fluctuations of a molecule are important to its function since such intrinsic fluctuations enable the molecule to respond to the external environmental perturbations. For extracting large conformational fluctuations, which predict the primary conformational change by the perturbation, principal component analysis (PCA) has been used in molecular dynamics simulations. However, several versions of PCA, such as Cartesian coordinate PCA and dihedral angle PCA (dPCA), are limited to use with molecules with a single dominant state or proteins where the dihedral angle represents an important internal coordinate. Other PCAs with general applicability, such as the PCA using pairwise atomic distances, do not represent the physical meaning clearly. Therefore, a formulation that provides general applicability and clearly represents the physical meaning is yet to be developed. For developing such a formulation, we consider the conformational distribution change by the perturbation with arbitrary linearly independent perturbation functions. Within the second order approximation of the Kullback-Leibler divergence by the perturbation, the PCA can be naturally interpreted as a method for (1) decomposing a given perturbation into perturbations that independently contribute to the conformational distribution change or (2) successively finding the perturbation that induces the largest conformational distribution change. In this perturbational formulation of PCA, (i) the eigenvalue measures the Kullback-Leibler divergence from the unperturbed to perturbed distributions, (ii) the eigenvector identifies the combination of the perturbation functions, and (iii) the principal component determines the probability change induced by the perturbation. Based on this formulation, we propose a PCA using potential energy terms, and we designate it as potential energy PCA (PEPCA). The PEPCA provides both general applicability and clear physical meaning. For demonstrating its power, we apply the PEPCA to an alanine dipeptide molecule in vacuum as a minimal model of a nonsingle dominant conformational biomolecule. The first and second principal components clearly characterize two stable states and the transition state between them. Positive and negative components with larger absolute values of the first and second eigenvectors identify the electrostatic interactions, which stabilize or destabilize each stable state and the transition state. Our result therefore indicates that PCA can be applied, by carefully selecting the perturbation functions, not only to identify the molecular conformational fluctuation but also to predict the conformational distribution change by the perturbation beyond the limitation of the previous methods.
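
    One compact way to write the second-order structure described here is the exponential-tilting sketch below; the notation is chosen for illustration and is not taken verbatim from the paper. For a perturbation of the reference distribution p(x) by linearly independent functions f_i(x) with coefficients \lambda_i,

        p_\lambda(x) \;\propto\; p(x)\, \exp\!\Big(-\sum_i \lambda_i f_i(x)\Big),
        \qquad
        D_{\mathrm{KL}}\big(p \,\|\, p_\lambda\big) \;\approx\; \tfrac{1}{2}\, \boldsymbol{\lambda}^{\mathsf{T}} C\, \boldsymbol{\lambda},
        \qquad
        C_{ij} = \mathrm{Cov}_p\big(f_i, f_j\big),

    so that diagonalizing the covariance matrix C of the perturbation functions (a PCA over those functions) yields eigenvectors that combine the f_i into independently acting perturbations and eigenvalues that measure the induced Kullback-Leibler divergence, in line with items (i)-(iii) above.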

  16. Perturbational formulation of principal component analysis in molecular dynamics simulation

    NASA Astrophysics Data System (ADS)

    Koyama, Yohei M.; Kobayashi, Tetsuya J.; Tomoda, Shuji; Ueda, Hiroki R.

    2008-10-01

    Conformational fluctuations of a molecule are important to its function since such intrinsic fluctuations enable the molecule to respond to the external environmental perturbations. For extracting large conformational fluctuations, which predict the primary conformational change by the perturbation, principal component analysis (PCA) has been used in molecular dynamics simulations. However, several versions of PCA, such as Cartesian coordinate PCA and dihedral angle PCA (dPCA), are limited to use with molecules with a single dominant state or proteins where the dihedral angle represents an important internal coordinate. Other PCAs with general applicability, such as the PCA using pairwise atomic distances, do not represent the physical meaning clearly. Therefore, a formulation that provides general applicability and clearly represents the physical meaning is yet to be developed. For developing such a formulation, we consider the conformational distribution change by the perturbation with arbitrary linearly independent perturbation functions. Within the second order approximation of the Kullback-Leibler divergence by the perturbation, the PCA can be naturally interpreted as a method for (1) decomposing a given perturbation into perturbations that independently contribute to the conformational distribution change or (2) successively finding the perturbation that induces the largest conformational distribution change. In this perturbational formulation of PCA, (i) the eigenvalue measures the Kullback-Leibler divergence from the unperturbed to perturbed distributions, (ii) the eigenvector identifies the combination of the perturbation functions, and (iii) the principal component determines the probability change induced by the perturbation. Based on this formulation, we propose a PCA using potential energy terms, and we designate it as potential energy PCA (PEPCA). The PEPCA provides both general applicability and clear physical meaning. For demonstrating its power, we apply the PEPCA to an alanine dipeptide molecule in vacuum as a minimal model of a nonsingle dominant conformational biomolecule. The first and second principal components clearly characterize two stable states and the transition state between them. Positive and negative components with larger absolute values of the first and second eigenvectors identify the electrostatic interactions, which stabilize or destabilize each stable state and the transition state. Our result therefore indicates that PCA can be applied, by carefully selecting the perturbation functions, not only to identify the molecular conformational fluctuation but also to predict the conformational distribution change by the perturbation beyond the limitation of the previous methods.

  17. Resonance Raman Spectroscopy of human brain metastasis of lung cancer analyzed by blind source separation

    NASA Astrophysics Data System (ADS)

    Zhou, Yan; Liu, Cheng-Hui; Pu, Yang; Cheng, Gangge; Yu, Xinguang; Zhou, Lixin; Lin, Dongmei; Zhu, Ke; Alfano, Robert R.

    2017-02-01

    Resonance Raman (RR) spectroscopy offers a novel optical biopsy method for cancer discrimination by means of enhancement in Raman scattering. It is widely acknowledged that the RR spectrum of tissue is a superposition of the spectra of various key building-block molecules. In this study, the Resonance Raman (RR) spectra of human brain metastases of lung cancer and of normal brain tissue, excited at a selected visible wavelength of 532 nm, are used to explore spectral changes caused by tumor evolution. The potential application of RR spectra to human brain metastasis of lung cancer was investigated by blind source separation, namely Principal Component Analysis (PCA). PCA is a statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components (PCs). The results show significant RR spectral differences between metastatic and normal brain tissues analyzed by PCA. To evaluate the efficacy of this approach for cancer detection, a linear discriminant analysis (LDA) classifier is used to calculate the sensitivity and specificity, and receiver operating characteristic (ROC) curves are used to evaluate the performance of this criterion. Excellent values of sensitivity (0.97), specificity (close to 1.00), and area under the ROC curve (AUC, 0.99) are achieved under optimal conditions. This research demonstrates that RR spectroscopy is effective for detecting changes in tissue due to the development of brain metastasis of lung cancer. RR spectroscopy analyzed by blind source separation may thus have the potential to become a new addition to the diagnostic armamentarium.
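
    A hedged sketch of the PCA-LDA-ROC workflow described above; the spectra and labels are simulated stand-ins, not the clinical data, and the number of retained components is an arbitrary choice:

        import numpy as np
        from sklearn.decomposition import PCA
        from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
        from sklearn.metrics import roc_auc_score, roc_curve
        from sklearn.model_selection import cross_val_predict
        from sklearn.pipeline import make_pipeline

        # Placeholder: rows are RR spectra; labels 1 = metastatic, 0 = normal.
        rng = np.random.default_rng(2)
        X = rng.normal(size=(60, 800))
        y = rng.integers(0, 2, size=60)

        model = make_pipeline(PCA(n_components=10), LinearDiscriminantAnalysis())
        prob = cross_val_predict(model, X, y, cv=5, method="predict_proba")[:, 1]

        fpr, tpr, _ = roc_curve(y, prob)              # points of the ROC curve
        print("AUC:", roc_auc_score(y, prob))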

  18. 40 CFR 60.2998 - What are the principal components of the model rule?

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 40 Protection of Environment 6 2010-07-01 2010-07-01 false What are the principal components of... December 9, 2004 Model Rule-Use of Model Rule § 60.2998 What are the principal components of the model rule... management plan. (c) Operator training and qualification. (d) Emission limitations and operating limits. (e...

  19. 40 CFR 60.2570 - What are the principal components of the model rule?

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 40 Protection of Environment 6 2010-07-01 2010-07-01 false What are the principal components of... Construction On or Before November 30, 1999 Use of Model Rule § 60.2570 What are the principal components of... (k) of this section. (a) Increments of progress toward compliance. (b) Waste management plan. (c...

  20. The Relationship of Social Engagement and Social Support With Sense of Community.

    PubMed

    Tang, Fengyan; Chi, Iris; Dong, Xinqi

    2017-07-01

    We aimed to investigate the relationship of engagement in social and cognitive activities and social support with the sense of community (SOC) and its components among older Chinese Americans. The Sense of Community Index (SCI) was used to measure SOC and its four component factors: membership, influence, needs fulfillment, and emotional connection. Social engagement was assessed with 16 questions. Social support included positive support and negative strain. Principal component analysis was used to identify the SCI components. Linear regression analysis was used to detect the contribution of social engagement and social support to SOC and its components. After controlling for sociodemographics and self-rated health, social activity engagement and positive social support were positively related to SOC and its components. This study points to the importance of social activity engagement and positive support from family and friends in increasing the sense of community. © The Author 2017. Published by Oxford University Press on behalf of The Gerontological Society of America. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  1. Candidate gene analyses of 3-dimensional dentoalveolar phenotypes in subjects with malocclusion

    PubMed Central

    Weaver, Cole A.; Miller, Steven F.; da Fontoura, Clarissa S. G.; Wehby, George L.; Amendt, Brad A.; Holton, Nathan E.; Allareddy, Veeratrishul; Southard, Thomas E.; Moreno Uribe, Lina M.

    2017-01-01

    Introduction Genetic studies of malocclusion etiology have identified 4 deleterious mutations in the genes DUSP6, ARHGAP21, FGF23, and ADAMTS1 in familial Class III cases. Although these variants may have large impacts on Class III phenotypic expression, their low frequency (<1%) makes them unlikely to explain most malocclusions. Thus, much of the genetic variation underlying the dentofacial phenotypic variation associated with malocclusion remains unknown. In this study, we evaluated associations between common genetic variations in craniofacial candidate genes and 3-dimensional dentoalveolar phenotypes in patients with malocclusion. Methods Pretreatment dental casts or cone-beam computed tomographic images from 300 healthy subjects were digitized with 48 landmarks. The 3-dimensional coordinate data were submitted to a geometric morphometric approach along with principal component analysis to generate continuous phenotypes including symmetric and asymmetric components of dentoalveolar shape variation, fluctuating asymmetry, and size. The subjects were genotyped for 222 single-nucleotide polymorphisms in 82 genes/loci, and phenotype-genotype associations were tested via multivariate linear regression. Results Principal component analysis of symmetric variation identified 4 components that explained 68% of the total variance and depicted anteroposterior, vertical, and transverse dentoalveolar discrepancies. Suggestive associations (P < 0.05) were identified with PITX2, SNAI3, 11q22.2-q22.3, 4p16.1, ISL1, and FGF8. Principal component analysis for asymmetric variations identified 4 components that explained 51% of the total variance and captured left-to-right discrepancies resulting in midline deviations, unilateral crossbites, and ectopic eruptions. Suggestive associations were found with TBX1 AJUBA, SNAI3 SATB2, TP63, and 1p22.1. Fluctuating asymmetry was associated with BMP3 and LATS1. Associations for SATB2 and BMP3 with asymmetric variations remained significant after the Bonferroni correction (P < 0.00022). Suggestive associations were found for centroid size, a proxy for dentoalveolar size variation, with 4p16.1 and SNAI1. Conclusions Specific genetic pathways associated with 3-dimensional dentoalveolar phenotypic variation in malocclusions were identified. PMID:28257739

  2. Infrared and visible image fusion based on robust principal component analysis and compressed sensing

    NASA Astrophysics Data System (ADS)

    Li, Jun; Song, Minghui; Peng, Yuanxi

    2018-03-01

    Current infrared and visible image fusion methods do not achieve adequate information extraction, i.e., they cannot extract the target information from infrared images while retaining the background information from visible images. Moreover, most of them have high complexity and are time-consuming. This paper proposes an efficient image fusion framework for infrared and visible images on the basis of robust principal component analysis (RPCA) and compressed sensing (CS). The novel framework consists of three phases. First, RPCA decomposition is applied to the infrared and visible images to obtain their sparse and low-rank components, which represent the salient features and background information of the images, respectively. Second, the sparse and low-rank coefficients are fused by different strategies. On the one hand, the measurements of the sparse coefficients are obtained by the random Gaussian matrix, and they are then fused by the standard deviation (SD) based fusion rule. Next, the fused sparse component is obtained by reconstructing the result of the fused measurement using the fast continuous linearized augmented Lagrangian algorithm (FCLALM). On the other hand, the low-rank coefficients are fused using the max-absolute rule. Subsequently, the fused image is superposed by the fused sparse and low-rank components. For comparison, several popular fusion algorithms are tested experimentally. By comparing the fused results subjectively and objectively, we find that the proposed framework can extract the infrared targets while retaining the background information in the visible images. Thus, it exhibits state-of-the-art performance in terms of both fusion effects and timeliness.
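
    For the RPCA step, a minimal principal component pursuit sketch using the standard inexact-ALM iteration is given below; this is a generic solver with common default parameters, not the specific algorithm or settings used in the paper:

        import numpy as np

        def rpca(M, lam=None, mu=None, tol=1e-7, max_iter=500):
            """Split M into low-rank L plus sparse S (min ||L||_* + lam*||S||_1)."""
            M = np.asarray(M, dtype=float)
            m, n = M.shape
            lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))
            mu = mu if mu is not None else 0.25 * m * n / np.abs(M).sum()
            shrink = lambda X, t: np.sign(X) * np.maximum(np.abs(X) - t, 0.0)
            S = np.zeros_like(M)
            Y = np.zeros_like(M)
            for _ in range(max_iter):
                # Singular value thresholding updates the low-rank (background) part.
                U, sig, Vt = np.linalg.svd(M - S + Y / mu, full_matrices=False)
                L = (U * shrink(sig, 1.0 / mu)) @ Vt
                # Soft thresholding updates the sparse (salient-feature) part.
                S = shrink(M - L + Y / mu, lam / mu)
                R = M - L - S
                Y += mu * R
                if np.linalg.norm(R) <= tol * np.linalg.norm(M):
                    break
            return L, S

    In a fusion scheme along the lines described above, L of each input image would carry the background information and S the salient infrared detail, which are then fused by separate rules.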

  3. Determination and fingerprint analysis of steroidal saponins in roots of Liriope muscari (Decne.) L. H. Bailey by ultra high performance liquid chromatography coupled with ion trap time-of-flight mass spectrometry.

    PubMed

    Li, Yong-Wei; Qi, Jin; Wen-Zhang; Zhou, Shui-Ping; Yan-Wu; Yu, Bo-Yang

    2014-07-01

    Liriope muscari (Decne.) L. H. Bailey is a well-known traditional Chinese medicine used for treating cough and insomnia. There are few reports on the quality evaluation of this herb partly because the major steroid saponins are not readily identified by UV detectors and are not easily isolated due to the existence of many similar isomers. In this study, a qualitative and quantitative method was developed to analyze the major components in L. muscari (Decne.) L. H. Bailey roots. Sixteen components were deduced and identified primarily by the information obtained from ultra high performance liquid chromatography with ion-trap time-of-flight mass spectrometry. The method demonstrated the desired specificity, linearity, stability, precision, and accuracy for simultaneous determination of 15 constituents (13 steroidal glycosides, 25(R)-ruscogenin, and pentylbenzoate) in 26 samples from different origins. The fingerprint was established, and the evaluation was achieved using similarity analysis and principal component analysis of 15 fingerprint peaks from 26 samples by ultra high performance liquid chromatography. The results from similarity analysis were consistent with those of principal component analysis. All results suggest that the established method could be applied effectively to the determination of multi-ingredients and fingerprint analysis of steroid saponins for quality assessment and control of L. muscari (Decne.) L. H. Bailey. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  4. Attitude of medical students towards occupational safety and health: a multi-national study.

    PubMed

    Bhardwaj, M; Arteta, M; Batmunkh, T; Briceno Leonardo, L; Caraballo, Y; Carvalho, D; Dan, W; Erdogan, S; Brborovic, H; Gudrun, K; Ilse, U; Ingle, G K; Joshi, S K; Kishore, J; Khan, Z; Retneswari, M; Menses, C; Moraga, D; Njan, A; Okonkwo, F O; Ozlem, K; Ravichandran, S; Rosales, J; Rybacki, M; Sainnyambuu, M; Shathanapriya, K; Radon, K

    2015-01-01

    Work-related diseases contribute immensely to the global burden of disease. A better understanding of the attitudes of health care workers towards occupational safety and health (OSH) is important for planning. The aim was to assess the attitude of medical students towards OSH around the globe. A questionnaire assessing the attitude towards OSH was administered to medical and paramedical students of 21 medical universities across the globe. In the current study 1895 students, aged 18-36 years, from 17 countries were included. After a principal components analysis was performed, the associations of interest between the identified components and other sociodemographic characteristics were assessed by multivariate linear regression. Principal component analysis revealed 3 components. Students from lower- and lower-middle-income countries had a more positive attitude towards OSH, but the importance of OSH was still rated higher by students from upper-income countries. Although students from the Asian and African continents showed high interest in OSH, European and South-Central American students comparatively rated the importance of OSH to be higher. Paramedical students had a more positive attitude towards OSH than medical students. The attitude of students from lower-income and lower-middle-income countries towards the importance of OSH is negative. This attitude could be changed by recommending modifications to OSH courses that reflect the importance of OSH. Since paramedical students showed more interest in OSH than medical students, modifications to the existing health care system, giving paramedics a major role in OSH service delivery, are recommended.

  5. Intercomparison of air quality data using principal component analysis, and forecasting of PM₁₀ and PM₂.₅ concentrations using artificial neural networks, in Thessaloniki and Helsinki.

    PubMed

    Voukantsis, Dimitris; Karatzas, Kostas; Kukkonen, Jaakko; Räsänen, Teemu; Karppinen, Ari; Kolehmainen, Mikko

    2011-03-01

    In this paper we propose a methodology consisting of specific computational intelligence methods, i.e. principal component analysis and artificial neural networks, in order to inter-compare air quality and meteorological data, and to forecast the concentration levels for environmental parameters of interest (air pollutants). We apply these methods to data monitored in the urban areas of Thessaloniki and Helsinki, in Greece and Finland, respectively. For this purpose, we applied the principal component analysis method in order to inter-compare the patterns of air pollution in the two selected cities. Then, we proceeded with the development of air quality forecasting models for both studied areas. On this basis, we formulated and employed a novel hybrid scheme in the selection process of input variables for the forecasting models, involving a combination of linear regression and artificial neural networks (multi-layer perceptron) models. The latter were used for the forecasting of the daily mean concentrations of PM₁₀ and PM₂.₅ for the next day. Results demonstrated an index of agreement between measured and modelled daily averaged PM₁₀ concentrations of between 0.80 and 0.85, while the kappa index for the forecasting of the daily averaged PM₁₀ concentrations reached 60% for both cities. Compared with previous corresponding studies, these statistical parameters indicate an improved performance in forecasting air quality parameters. It was also found that the performance of the models for the forecasting of the daily mean concentrations of PM₁₀ was not substantially different for the two cities, despite the major differences between the two urban environments under consideration. Copyright © 2011 Elsevier B.V. All rights reserved.
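
    A simplified sketch of the forecasting side of such a scheme; the features and data are placeholders, and the paper's hybrid linear-regression/ANN input-selection step is not reproduced here:

        import numpy as np
        from sklearn.decomposition import PCA
        from sklearn.neural_network import MLPRegressor
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler

        # Placeholder predictors: today's pollutant and meteorological variables;
        # target: next day's daily mean PM10 concentration.
        rng = np.random.default_rng(3)
        X = rng.normal(size=(365, 12))
        y = 5.0 * X[:, 0] + rng.normal(size=365)

        model = make_pipeline(StandardScaler(),
                              PCA(n_components=6),
                              MLPRegressor(hidden_layer_sizes=(20,), max_iter=2000,
                                           random_state=0))
        model.fit(X[:300], y[:300])
        pred = model.predict(X[300:])
        print(np.corrcoef(pred, y[300:])[0, 1])       # crude skill check on held-out days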

  6. SU-F-J-138: An Extension of PCA-Based Respiratory Deformation Modeling Via Multi-Linear Decomposition

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Iliopoulos, AS; Sun, X; Pitsianis, N

    Purpose: To address and lift the limited degree of freedom (DoF) of globally bilinear motion components such as those based on principal components analysis (PCA), for encoding and modeling volumetric deformation motion. Methods: We provide a systematic approach to obtaining a multi-linear decomposition (MLD) and associated motion model from deformation vector field (DVF) data. We had previously introduced MLD for capturing multi-way relationships between DVF variables, without being restricted by the bilinear component format of PCA-based models. PCA-based modeling is commonly used for encoding patient-specific deformation as per planning 4D-CT images, and aiding on-board motion estimation during radiotherapy. However, the bilinear space-time decomposition inherently limits the DoF of such models by the small number of respiratory phases. While this limit is not reached in model studies using analytical or digital phantoms with low-rank motion, it compromises modeling power in the presence of relative motion, asymmetries and hysteresis, etc, which are often observed in patient data. Specifically, a low-DoF model will spuriously couple incoherent motion components, compromising its adaptability to on-board deformation changes. By the multi-linear format of extracted motion components, MLD-based models can encode higher-DoF deformation structure. Results: We conduct mathematical and experimental comparisons between PCA- and MLD-based models. A set of temporally-sampled analytical trajectories provides a synthetic, high-rank DVF; trajectories correspond to respiratory and cardiac motion factors, including different relative frequencies and spatial variations. Additionally, a digital XCAT phantom is used to simulate a lung lesion deforming incoherently with respect to the body, which adheres to a simple respiratory trend. In both cases, coupling of incoherent motion components due to a low model DoF is clearly demonstrated. Conclusion: Multi-linear decomposition can enable decoupling of distinct motion factors in high-rank DVF measurements. This may improve motion model expressiveness and adaptability to on-board deformation, aiding model-based image reconstruction for target verification. NIH Grant No. R01-184173.

  7. Free energy landscape of a biomolecule in dihedral principal component space: sampling convergence and correspondence between structures and minima.

    PubMed

    Maisuradze, Gia G; Leitner, David M

    2007-05-15

    Dihedral principal component analysis (dPCA) has recently been developed and shown to display complex features of the free energy landscape of a biomolecule that may be absent in the free energy landscape plotted in principal component space due to mixing of internal and overall rotational motion that can occur in principal component analysis (PCA) [Mu et al., Proteins: Struct Funct Bioinfo 2005;58:45-52]. Another difficulty in the implementation of PCA is sampling convergence, which we address here for both dPCA and PCA using a tetrapeptide as an example. We find that for both methods the sampling convergence can be reached over a similar time. Minima in the free energy landscape in the space of the two largest dihedral principal components often correspond to unique structures, though we also find some distinct minima to correspond to the same structure. 2007 Wiley-Liss, Inc.

  8. Focal points and principal solutions of linear Hamiltonian systems revisited

    NASA Astrophysics Data System (ADS)

    Šepitka, Peter; Šimon Hilscher, Roman

    2018-05-01

    In this paper we present a novel view on the principal (and antiprincipal) solutions of linear Hamiltonian systems, as well as on the focal points of their conjoined bases. We present a new and unified theory of principal (and antiprincipal) solutions at a finite point and at infinity, and apply it to obtain new representation of the multiplicities of right and left proper focal points of conjoined bases. We show that these multiplicities can be characterized by the abnormality of the system in a neighborhood of the given point and by the rank of the associated T-matrix from the theory of principal (and antiprincipal) solutions. We also derive some additional important results concerning the representation of T-matrices and associated normalized conjoined bases. The results in this paper are new even for completely controllable linear Hamiltonian systems. We also discuss other potential applications of our main results, in particular in the singular Sturmian theory.

  9. Authentication of virgin olive oil by a novel curve resolution approach combined with visible spectroscopy.

    PubMed

    Ferreiro-González, Marta; Barbero, Gerardo F; Álvarez, José A; Ruiz, Antonio; Palma, Miguel; Ayuso, Jesús

    2017-04-01

    Adulteration of olive oil is not only a major economic fraud but can also have major health implications for consumers. In this study, a combination of visible spectroscopy with a novel multivariate curve resolution method (CR), principal component analysis (PCA) and linear discriminant analysis (LDA) is proposed for the authentication of virgin olive oil (VOO) samples. VOOs are well-known products with the typical properties of a two-component system due to the two main groups of compounds that contribute to the visible spectra (chlorophylls and carotenoids). Application of the proposed CR method to VOO samples provided the two pure-component spectra for the aforementioned families of compounds. A correlation study of the real spectra and the resolved component spectra was carried out for different types of oil samples (n=118). LDA using the correlation coefficients as variables to discriminate samples allowed the authentication of 95% of virgin olive oil samples. Copyright © 2016 Elsevier Ltd. All rights reserved.

  10. Predicting ground contact events for a continuum of gait types: An application of targeted machine learning using principal component analysis.

    PubMed

    Osis, Sean T; Hettinga, Blayne A; Ferber, Reed

    2016-05-01

    An ongoing challenge in the application of gait analysis to clinical settings is the standardized detection of temporal events, with unobtrusive and cost-effective equipment, for a wide range of gait types. The purpose of the current study was to investigate a targeted machine learning approach for the prediction of timing for foot strike (or initial contact) and toe-off, using only kinematics for walking, forefoot running, and heel-toe running. Data were categorized by gait type and split into a training set (∼30%) and a validation set (∼70%). A principal component analysis was performed, and separate linear models were trained and validated for foot strike and toe-off, using ground reaction force data as a gold-standard for event timing. Results indicate the model predicted both foot strike and toe-off timing to within 20 ms of the gold-standard for more than 95% of cases in walking and running gaits. The machine learning approach continues to provide robust timing predictions for clinical use, and may offer a flexible methodology to handle new events and gait types. Copyright © 2016 Elsevier B.V. All rights reserved.
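
    A hedged sketch of the targeted approach, with PCA on the kinematic curves followed by a separate linear model per event; the arrays are simulated placeholders, and in the study the reference event times come from ground reaction forces:

        import numpy as np
        from sklearn.decomposition import PCA
        from sklearn.linear_model import LinearRegression
        from sklearn.pipeline import make_pipeline

        # Placeholder: each row is a flattened kinematic cycle (joint angles over
        # time); targets are foot-strike and toe-off times in milliseconds.
        rng = np.random.default_rng(4)
        X = rng.normal(size=(200, 300))
        t_fs = rng.uniform(0.0, 50.0, size=200)
        t_to = rng.uniform(600.0, 700.0, size=200)

        train, valid = slice(0, 60), slice(60, None)  # roughly the 30%/70% split described
        fs_model = make_pipeline(PCA(n_components=20), LinearRegression()).fit(X[train], t_fs[train])
        to_model = make_pipeline(PCA(n_components=20), LinearRegression()).fit(X[train], t_to[train])

        err = np.abs(fs_model.predict(X[valid]) - t_fs[valid])
        print("foot strikes within 20 ms:", float(np.mean(err <= 20.0)))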

  11. Support vector machine and principal component analysis for microarray data classification

    NASA Astrophysics Data System (ADS)

    Astuti, Widi; Adiwijaya

    2018-03-01

    Cancer is a leading cause of death worldwide, although a significant proportion of cases can be cured if detected early. In recent decades, a technology called the microarray has taken an important role in the diagnosis of cancer. By using data mining techniques, microarray data classification can be performed to improve the accuracy of cancer diagnosis compared to traditional techniques. Microarray data are characterized by small sample sizes but very high dimensionality. This poses a challenge for researchers to provide solutions for microarray data classification with high performance in both accuracy and running time. This research proposed the use of Principal Component Analysis (PCA) as a dimension reduction method along with a Support Vector Machine (SVM) optimized by kernel functions as a classifier for microarray data classification. The proposed scheme was applied on seven data sets using 5-fold cross validation, and evaluation and analysis were then conducted in terms of both accuracy and running time. The results showed that the scheme obtained 100% accuracy for the Ovarian and Lung Cancer data when the Linear and Cubic kernel functions are used. In terms of running time, PCA greatly reduced the running time for every data set.
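
    A minimal sketch of the PCA plus kernel-SVM scheme with 5-fold cross-validation; the data are a random stand-in for a microarray matrix, and kernel names follow scikit-learn, where the cubic kernel is a degree-3 polynomial:

        import numpy as np
        from sklearn.decomposition import PCA
        from sklearn.model_selection import cross_val_score
        from sklearn.pipeline import make_pipeline
        from sklearn.svm import SVC

        # Placeholder microarray matrix: few samples, many genes.
        rng = np.random.default_rng(5)
        X = rng.normal(size=(72, 7000))
        y = rng.integers(0, 2, size=72)

        for name, clf in [("linear", SVC(kernel="linear")),
                          ("cubic", SVC(kernel="poly", degree=3)),
                          ("rbf", SVC(kernel="rbf"))]:
            pipe = make_pipeline(PCA(n_components=30), clf)
            acc = cross_val_score(pipe, X, y, cv=5).mean()
            print(name, round(acc, 3))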

  12. Background recovery via motion-based robust principal component analysis with matrix factorization

    NASA Astrophysics Data System (ADS)

    Pan, Peng; Wang, Yongli; Zhou, Mingyuan; Sun, Zhipeng; He, Guoping

    2018-03-01

    Background recovery is a key technique in video analysis, but it still suffers from many challenges, such as camouflage, lighting changes, and diverse types of image noise. Robust principal component analysis (RPCA), which aims to recover a low-rank matrix and a sparse matrix, is a general framework for background recovery. The nuclear norm is widely used as a convex surrogate for the rank function in RPCA, which requires computing the singular value decomposition (SVD), a task that is increasingly costly as matrix sizes and ranks increase. However, matrix factorization greatly reduces the dimension of the matrix for which the SVD must be computed. Motion information has been shown to improve low-rank matrix recovery in RPCA, but this method still finds it difficult to handle original video data sets because of its batch-mode formulation and implementation. Hence, in this paper, we propose a motion-assisted RPCA model with matrix factorization (FM-RPCA) for background recovery. Moreover, an efficient linear alternating direction method of multipliers with a matrix factorization (FL-ADM) algorithm is designed for solving the proposed FM-RPCA model. Experimental results illustrate that the method provides stable results and is more efficient than the current state-of-the-art algorithms.

  13. Differentiation of tea varieties using UV-Vis spectra and pattern recognition techniques

    NASA Astrophysics Data System (ADS)

    Palacios-Morillo, Ana; Alcázar, Ángela.; de Pablos, Fernando; Jurado, José Marcos

    2013-02-01

    Tea, one of the most consumed beverages all over the world, is of great importance in the economies of a number of countries. Several methods have been developed to classify tea varieties or origins based on pattern recognition techniques applied to chemical data, such as metal profiles, amino acids, catechins and volatile compounds. Some of these analytical methods are tedious and expensive to apply in routine work. The use of UV-Vis spectral data as discriminant variables, which are highly influenced by the chemical composition, can be an alternative to these methods. UV-Vis spectra of methanol-water extracts of tea have been obtained in the interval 250-800 nm. Absorbances have been used as input variables. Principal component analysis was used to reduce the number of variables, and several pattern recognition methods, such as linear discriminant analysis, support vector machines and artificial neural networks, have been applied in order to differentiate the most common tea varieties. A successful classification model was built by combining principal component analysis and multilayer perceptron artificial neural networks, allowing the differentiation between tea varieties. This rapid and simple methodology can be applied to solve classification problems in the food industry, saving economic resources.

  14. Predictors of burnout among correctional mental health professionals.

    PubMed

    Gallavan, Deanna B; Newman, Jody L

    2013-02-01

    This study focused on the experience of burnout among a sample of correctional mental health professionals. We examined the relationship of a linear combination of optimism, work family conflict, and attitudes toward prisoners with two dimensions derived from the Maslach Burnout Inventory and the Professional Quality of Life Scale. Initially, three subscales from the Maslach Burnout Inventory and two subscales from the Professional Quality of Life Scale were subjected to principal components analysis with oblimin rotation in order to identify underlying dimensions among the subscales. This procedure resulted in two components accounting for approximately 75% of the variance (r = -.27). The first component was labeled Negative Experience of Work because it seemed to tap the experience of being emotionally spent, detached, and socially avoidant. The second component was labeled Positive Experience of Work and seemed to tap a sense of competence, success, and satisfaction in one's work. Two multiple regression analyses were subsequently conducted, in which Negative Experience of Work and Positive Experience of Work, respectively, were predicted from a linear combination of optimism, work family conflict, and attitudes toward prisoners. In the first analysis, 44% of the variance in Negative Experience of Work was accounted for, with work family conflict and optimism accounting for the most variance. In the second analysis, 24% of the variance in Positive Experience of Work was accounted for, with optimism and attitudes toward prisoners accounting for the most variance.

  15. Searching for the main anti-bacterial components in artificial Calculus bovis using UPLC and microcalorimetry coupled with multi-linear regression analysis.

    PubMed

    Zang, Qing-Ce; Wang, Jia-Bo; Kong, Wei-Jun; Jin, Cheng; Ma, Zhi-Jie; Chen, Jing; Gong, Qian-Feng; Xiao, Xiao-He

    2011-12-01

    The fingerprints of artificial Calculus bovis extracts from different solvents were established by ultra-performance liquid chromatography (UPLC) and the anti-bacterial activities of artificial C. bovis extracts on Staphylococcus aureus (S. aureus) growth were studied by microcalorimetry. The UPLC fingerprints were evaluated using hierarchical clustering analysis. Some quantitative parameters obtained from the thermogenic curves of S. aureus growth affected by artificial C. bovis extracts were analyzed using principal component analysis. The spectrum-effect relationships between UPLC fingerprints and anti-bacterial activities were investigated using multi-linear regression analysis. The results showed that peak 1 (taurocholate sodium), peak 3 (unknown compound), peak 4 (cholic acid), and peak 6 (chenodeoxycholic acid) are more significant than the other peaks, with standardized parameter estimates of 0.453, -0.166, 0.749, and 0.025, respectively. Thus, cholic acid, taurocholate sodium, and chenodeoxycholic acid might be the major anti-bacterial components in artificial C. bovis. Altogether, this work provides a general model of the combination of UPLC chromatography and anti-bacterial effect to study the spectrum-effect relationships of artificial C. bovis extracts, which can be used to discover the main anti-bacterial components in artificial C. bovis or other Chinese herbal medicines with anti-bacterial effects. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  16. Fast, Exact Bootstrap Principal Component Analysis for p > 1 million

    PubMed Central

    Fisher, Aaron; Caffo, Brian; Schwartz, Brian; Zipunnikov, Vadim

    2015-01-01

    Many have suggested a bootstrap procedure for estimating the sampling variability of principal component analysis (PCA) results. However, when the number of measurements per subject (p) is much larger than the number of subjects (n), calculating and storing the leading principal components from each bootstrap sample can be computationally infeasible. To address this, we outline methods for fast, exact calculation of bootstrap principal components, eigenvalues, and scores. Our methods leverage the fact that all bootstrap samples occupy the same n-dimensional subspace as the original sample. As a result, all bootstrap principal components are limited to the same n-dimensional subspace and can be efficiently represented by their low dimensional coordinates in that subspace. Several uncertainty metrics can be computed solely based on the bootstrap distribution of these low dimensional coordinates, without calculating or storing the p-dimensional bootstrap components. Fast bootstrap PCA is applied to a dataset of sleep electroencephalogram recordings (p = 900, n = 392), and to a dataset of brain magnetic resonance images (MRIs) (p ≈ 3 million, n = 352). For the MRI dataset, our method allows for standard errors for the first 3 principal components based on 1000 bootstrap samples to be calculated on a standard laptop in 47 minutes, as opposed to approximately 4 days with standard methods. PMID:27616801
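
    A compact sketch of the core idea for a centered p-by-n matrix Y with p much larger than n; this is a simplified reading of the approach, reporting bootstrap eigenvalues only and omitting refinements such as sign and ordering alignment of components across bootstrap samples:

        import numpy as np

        rng = np.random.default_rng(6)
        p, n, B = 100_000, 50, 200
        Y = rng.normal(size=(p, n))
        Y -= Y.mean(axis=1, keepdims=True)            # center across subjects

        # One thin SVD of the full data: Y = U @ np.diag(d) @ Vt, with U of size p x n.
        U, d, Vt = np.linalg.svd(Y, full_matrices=False)
        A = np.diag(d) @ Vt                           # n x n low-dimensional coordinates

        boot_eigvals = np.empty((B, n))
        for b in range(B):
            idx = rng.integers(0, n, size=n)          # resample subjects (columns)
            Ab = A[:, idx]
            Ab = Ab - Ab.mean(axis=1, keepdims=True)  # re-center inside the n-dim subspace
            # SVD of the small n x n matrix; the p-dimensional bootstrap PCs would be
            # U @ Ub, but they never need to be formed or stored.
            Ub, db, _ = np.linalg.svd(Ab, full_matrices=False)
            boot_eigvals[b] = db**2 / (n - 1)

        print(np.percentile(boot_eigvals[:, 0], [2.5, 97.5]))  # CI for the leading eigenvalue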

  17. Principal Workload: Components, Determinants and Coping Strategies in an Era of Standardization and Accountability

    ERIC Educational Resources Information Center

    Oplatka, Izhar

    2017-01-01

    Purpose: In order to fill the gap in theoretical and empirical knowledge about the characteristics of principal workload, the purpose of this paper is to explore the components of principal workload as well as its determinants and the coping strategies commonly used by principals to face this personal state. Design/methodology/approach:…

  18. A compact linear accelerator based on a scalable microelectromechanical-system RF-structure

    DOE PAGES

    Persaud, A.; Ji, Q.; Feinberg, E.; ...

    2017-06-08

    Here, a new approach for a compact radio-frequency (RF) accelerator structure is presented. The new accelerator architecture is based on the Multiple Electrostatic Quadrupole Array Linear Accelerator (MEQALAC) structure that was first developed in the 1980s. The MEQALAC utilized RF resonators producing the accelerating fields and providing for higher beam currents through parallel beamlets focused using arrays of electrostatic quadrupoles (ESQs). While the early work obtained ESQs with lateral dimensions on the order of a few centimeters, using a printed circuit board (PCB), we reduce the characteristic dimension to the millimeter regime, while massively scaling up the potential number of parallel beamlets. Using Microelectromechanical systems scalable fabrication approaches, we are working on further reducing the characteristic dimension to the sub-millimeter regime. The technology is based on RF-acceleration components and ESQs implemented in the PCB or silicon wafers where each beamlet passes through beam apertures in the wafer. The complete accelerator is then assembled by stacking these wafers. This approach has the potential for fast and inexpensive batch fabrication of the components and flexibility in system design for application specific beam energies and currents. For prototyping the accelerator architecture, the components have been fabricated using the PCB. In this paper, we present proof of concept results of the principal components using the PCB: RF acceleration and ESQ focusing. Finally, ongoing developments on implementing components in silicon and scaling of the accelerator technology to high currents and beam energies are discussed.

  19. A compact linear accelerator based on a scalable microelectromechanical-system RF-structure

    NASA Astrophysics Data System (ADS)

    Persaud, A.; Ji, Q.; Feinberg, E.; Seidl, P. A.; Waldron, W. L.; Schenkel, T.; Lal, A.; Vinayakumar, K. B.; Ardanuc, S.; Hammer, D. A.

    2017-06-01

    A new approach for a compact radio-frequency (RF) accelerator structure is presented. The new accelerator architecture is based on the Multiple Electrostatic Quadrupole Array Linear Accelerator (MEQALAC) structure that was first developed in the 1980s. The MEQALAC utilized RF resonators producing the accelerating fields and providing for higher beam currents through parallel beamlets focused using arrays of electrostatic quadrupoles (ESQs). While the early work obtained ESQs with lateral dimensions on the order of a few centimeters, using a printed circuit board (PCB), we reduce the characteristic dimension to the millimeter regime, while massively scaling up the potential number of parallel beamlets. Using Microelectromechanical systems scalable fabrication approaches, we are working on further reducing the characteristic dimension to the sub-millimeter regime. The technology is based on RF-acceleration components and ESQs implemented in the PCB or silicon wafers where each beamlet passes through beam apertures in the wafer. The complete accelerator is then assembled by stacking these wafers. This approach has the potential for fast and inexpensive batch fabrication of the components and flexibility in system design for application specific beam energies and currents. For prototyping the accelerator architecture, the components have been fabricated using the PCB. In this paper, we present proof of concept results of the principal components using the PCB: RF acceleration and ESQ focusing. Ongoing developments on implementing components in silicon and scaling of the accelerator technology to high currents and beam energies are discussed.

  20. A compact linear accelerator based on a scalable microelectromechanical-system RF-structure.

    PubMed

    Persaud, A; Ji, Q; Feinberg, E; Seidl, P A; Waldron, W L; Schenkel, T; Lal, A; Vinayakumar, K B; Ardanuc, S; Hammer, D A

    2017-06-01

    A new approach for a compact radio-frequency (RF) accelerator structure is presented. The new accelerator architecture is based on the Multiple Electrostatic Quadrupole Array Linear Accelerator (MEQALAC) structure that was first developed in the 1980s. The MEQALAC utilized RF resonators producing the accelerating fields and providing for higher beam currents through parallel beamlets focused using arrays of electrostatic quadrupoles (ESQs). While the early work obtained ESQs with lateral dimensions on the order of a few centimeters, using a printed circuit board (PCB), we reduce the characteristic dimension to the millimeter regime, while massively scaling up the potential number of parallel beamlets. Using Microelectromechanical systems scalable fabrication approaches, we are working on further reducing the characteristic dimension to the sub-millimeter regime. The technology is based on RF-acceleration components and ESQs implemented in the PCB or silicon wafers where each beamlet passes through beam apertures in the wafer. The complete accelerator is then assembled by stacking these wafers. This approach has the potential for fast and inexpensive batch fabrication of the components and flexibility in system design for application specific beam energies and currents. For prototyping the accelerator architecture, the components have been fabricated using the PCB. In this paper, we present proof of concept results of the principal components using the PCB: RF acceleration and ESQ focusing. Ongoing developments on implementing components in silicon and scaling of the accelerator technology to high currents and beam energies are discussed.

  1. Considering Horn's Parallel Analysis from a Random Matrix Theory Point of View.

    PubMed

    Saccenti, Edoardo; Timmerman, Marieke E

    2017-03-01

    Horn's parallel analysis is a widely used method for assessing the number of principal components and common factors. We discuss the theoretical foundations of parallel analysis for principal components based on a covariance matrix by making use of arguments from random matrix theory. In particular, we show that (i) for the first component, parallel analysis is an inferential method equivalent to the Tracy-Widom test, (ii) its use to test high-order eigenvalues is equivalent to the use of the joint distribution of the eigenvalues, and thus should be discouraged, and (iii) a formal test for higher-order components can be obtained based on a Tracy-Widom approximation. We illustrate the performance of the two testing procedures using simulated data generated under both a principal component model and a common factors model. For the principal component model, the Tracy-Widom test performs consistently in all conditions, while parallel analysis shows unpredictable behavior for higher-order components. For the common factor model, including major and minor factors, both procedures are heuristic approaches, with variable performance. We conclude that the Tracy-Widom procedure is preferred over parallel analysis for statistically testing the number of principal components based on a covariance matrix.
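
    For reference, a bare-bones version of Horn's parallel analysis for principal components of a covariance matrix; this is a sketch using the common 95th-percentile criterion, and the Tracy-Widom procedures discussed above are not implemented here:

        import numpy as np

        def parallel_analysis(X, n_sim=500, quantile=95, seed=0):
            """Retain leading components whose sample eigenvalue exceeds the chosen
            quantile of the corresponding eigenvalue from same-sized random data."""
            rng = np.random.default_rng(seed)
            n, p = X.shape
            obs = np.linalg.eigvalsh(np.cov(X, rowvar=False))[::-1]
            sim = np.empty((n_sim, p))
            for b in range(n_sim):
                Z = rng.normal(size=(n, p))
                sim[b] = np.linalg.eigvalsh(np.cov(Z, rowvar=False))[::-1]
            threshold = np.percentile(sim, quantile, axis=0)
            keep = obs > threshold
            # Count leading components up to the first one that fails the criterion.
            return p if keep.all() else int(np.argmax(~keep))

        rng = np.random.default_rng(1)
        X = rng.normal(size=(200, 10)) + 3.0 * rng.normal(size=(200, 1))  # one strong shared component
        print(parallel_analysis(X))

    On data with a single strong component this usually reports one retained component, although, as noted above, the behaviour for higher-order components can be erratic.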

  2. Short-term PV/T module temperature prediction based on PCA-RBF neural network

    NASA Astrophysics Data System (ADS)

    Li, Jiyong; Zhao, Zhendong; Li, Yisheng; Xiao, Jing; Tang, Yunfeng

    2018-02-01

    To address the non-linearity and large thermal inertia of temperature control in PV/T systems, short-term temperature prediction of the PV/T module is proposed, so that the PV/T system controller can act ahead of time based on the short-term forecast and thereby improve control performance. Based on an analysis of the correlation between PV/T module temperature and meteorological factors, and of the temperature at adjacent points in the time series, the principal component analysis (PCA) method is used to pre-process the original input sample data. Combined with RBF neural network modelling, the simulation results show that PCA pre-processing gives the network model higher prediction accuracy and stronger generalization performance than an RBF neural network without principal component extraction.

  3. A Review of Feature Extraction Software for Microarray Gene Expression Data

    PubMed Central

    Tan, Ching Siang; Ting, Wai Soon; Mohamad, Mohd Saberi; Chan, Weng Howe; Deris, Safaai; Ali Shah, Zuraini

    2014-01-01

    When gene expression data are too large to be processed, they are transformed into a reduced representation set of genes. Transforming large-scale gene expression data into a set of genes is called feature extraction. If the genes extracted are carefully chosen, this gene set can extract the relevant information from the large-scale gene expression data, allowing further analysis by using this reduced representation instead of the full size data. In this paper, we review numerous software applications that can be used for feature extraction. The software reviewed is mainly for Principal Component Analysis (PCA), Independent Component Analysis (ICA), Partial Least Squares (PLS), and Local Linear Embedding (LLE). A summary and sources of the software are provided in the last section for each feature extraction method. PMID:25250315

  4. Discrimination of rectal cancer through human serum using surface-enhanced Raman spectroscopy

    NASA Astrophysics Data System (ADS)

    Li, Xiaozhou; Yang, Tianyue; Li, Siqi; Zhang, Su; Jin, Lili

    2015-05-01

    In this paper, surface-enhanced Raman spectroscopy (SERS) was used to detect the changes in blood serum components that accompany rectal cancer. The differences in serum SERS data between rectal cancer patients and healthy controls were examined. Postoperative rectal cancer patients also participated in the comparison to monitor the effects of cancer treatments. The results show that there are significant variations at certain wavenumbers which indicates alteration of corresponding biological substances. Principal component analysis (PCA) and parameters of intensity ratios were used on the original SERS spectra for the extraction of featured variables. These featured variables then underwent linear discriminant analysis (LDA) and classification and regression tree (CART) for the discrimination analysis. Accuracies of 93.5 and 92.4 % were obtained for PCA-LDA and parameter-CART, respectively.

  5. Principal component analysis of Mn(salen) catalysts.

    PubMed

    Teixeira, Filipe; Mosquera, Ricardo A; Melo, André; Freire, Cristina; Cordeiro, M Natália D S

    2014-12-14

    The theoretical study of Mn(salen) catalysts has been traditionally performed under the assumption that Mn(acacen') (acacen' = 3,3'-(ethane-1,2-diylbis(azanylylidene))bis(prop-1-en-olate)) is an appropriate surrogate for the larger Mn(salen) complexes. In this work, the geometry and the electronic structure of several Mn(salen) and Mn(acacen') model complexes were studied using Density Functional Theory (DFT) at diverse levels of approximation, with the aim of understanding the effects of truncation, metal oxidation, axial coordination, substitution on the aromatic rings of the salen ligand and chirality of the diimine bridge, as well as the choice of the density functional and basis set. To achieve this goal, geometric and structural data, obtained from these calculations, were subjected to Principal Component Analysis (PCA) and PCA with orthogonal rotation of the components (rPCA). The results show the choice of basis set to be of paramount importance, accounting for up to 30% of the variance in the data, while the differences between salen and acacen' complexes account for about 9% of the variance in the data, and are mostly related to the conformation of the salen/acacen' ligand around the metal centre. Variations in the spin state and oxidation state of the metal centre also account for large fractions of the total variance (up to 10% and 9%, respectively). Other effects, such as the nature of the diimine bridge or the presence of an alkyl substituent in the 3,3 and 5,5 positions of the aldehyde moiety, were found to be less important in terms of explaining the variance within the data set. A matrix of discriminants was compiled using the loadings of the principal and rotated components that best performed in the classification of the entries in the data. The scores obtained from its application to the data set were used as independent variables for devising linear models of different properties, with satisfactory prediction capabilities.

  6. Gender classification of running subjects using full-body kinematics

    NASA Astrophysics Data System (ADS)

    Williams, Christina M.; Flora, Jeffrey B.; Iftekharuddin, Khan M.

    2016-05-01

    This paper proposes novel automated gender classification of subjects engaged in running activity. The machine learning techniques include a preprocessing step using principal component analysis, followed by classification with linear discriminant analysis, nonlinear support vector machines, and decision stumps with AdaBoost. The dataset consists of 49 subjects (25 males, 24 females, 2 trials each), all equipped with approximately 80 retroreflective markers. The trials capture the subject's entire body moving unrestrained through a capture volume at a self-selected running speed, thus producing highly realistic data. The classification accuracy using leave-one-out cross validation for the 49 subjects is improved from 66.33% using linear discriminant analysis to 86.74% using the nonlinear support vector machine. Results are further improved to 87.76% by implementing a nonlinear decision-stump classifier with AdaBoost. The experimental findings suggest that the linear classification approaches are inadequate in classifying gender for a large dataset with subjects running in a moderately uninhibited environment.
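
    A minimal sketch of the PCA-plus-nonlinear-SVM step with leave-one-out cross validation is given below, assuming scikit-learn; the marker-derived features and gender labels are synthetic stand-ins, not the study's dataset.

    ```python
    # Sketch: PCA preprocessing, RBF-kernel SVM, leave-one-out cross validation.
    # Features and labels are synthetic placeholders.
    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.decomposition import PCA
    from sklearn.svm import SVC
    from sklearn.model_selection import LeaveOneOut, cross_val_score

    rng = np.random.default_rng(2)
    X = rng.normal(size=(49, 240))          # 49 subjects, flattened kinematic features
    y = rng.integers(0, 2, size=49)         # 0 = female, 1 = male (synthetic labels)

    clf = make_pipeline(StandardScaler(), PCA(n_components=20), SVC(kernel="rbf", C=1.0))
    acc = cross_val_score(clf, X, y, cv=LeaveOneOut()).mean()
    print(f"leave-one-out accuracy: {acc:.3f}")
    ```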

  7. Fitting a Point Cloud to a 3d Polyhedral Surface

    NASA Astrophysics Data System (ADS)

    Popov, E. V.; Rotkov, S. I.

    2017-05-01

    The ability to measure parameters of large-scale objects in a contactless fashion has a tremendous potential in a number of industrial applications. However, this problem is usually associated with the ambiguous task of comparing two data sets specified in two different co-ordinate systems. This paper deals with the study of fitting a set of unorganized points to a polyhedral surface. The developed approach uses Principal Component Analysis (PCA) and the Stretched Grid Method (SGM) to replace a non-linear problem solution with several linear steps. The squared distance (SD) is the general criterion used to control the convergence of the set of points to the target surface. The described numerical experiment concerns the remote measurement of a large-scale aerial in the form of a frame with a parabolic shape. The experiment shows that the fitting process of a point cloud to a target surface converges in several linear steps. The method is applicable to the contactless remote measurement of the geometry of large-scale objects.
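
    The PCA step of such an approach, recovering the principal axes of an unorganized point cloud so that it can be expressed in a common frame with the target surface, can be sketched with plain NumPy; the cloud below is synthetic and the subsequent SGM fitting is not shown.

    ```python
    # Sketch: principal axes of a point cloud via PCA (synthetic data).
    import numpy as np

    rng = np.random.default_rng(3)
    points = rng.normal(size=(10000, 3)) * np.array([5.0, 2.0, 0.3])  # elongated cloud

    centroid = points.mean(axis=0)
    centered = points - centroid
    # Eigen-decomposition of the covariance matrix gives the principal axes.
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    axes = eigvecs[:, order]                # columns: principal axes, major axis first

    aligned = centered @ axes               # cloud expressed in its principal frame
    print(aligned.std(axis=0))              # spread along each principal axis
    ```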

  8. Cognitive performance predicts treatment decisional abilities in mild to moderate dementia

    PubMed Central

    Gurrera, R.J.; Moye, J.; Karel, M.J.; Azar, A.R.; Armesto, J.C.

    2016-01-01

    Objective To examine the contribution of neuropsychological test performance to treatment decision-making capacity in community volunteers with mild to moderate dementia. Methods The authors recruited volunteers (44 men, 44 women) with mild to moderate dementia from the community. Subjects completed a battery of 11 neuropsychological tests that assessed auditory and visual attention, logical memory, language, and executive function. To measure decision-making capacity, the authors administered the Capacity to Consent to Treatment Interview, the Hopemont Capacity Assessment Interview, and the MacArthur Competence Assessment Tool-Treatment. Each of these instruments individually scores four decisional abilities serving capacity: understanding, appreciation, reasoning, and expression of choice. The authors used principal components analysis to generate component scores for each ability across instruments, and to extract principal components for neuropsychological performance. Results Multiple linear regression analyses demonstrated that neuropsychological performance significantly predicted all four abilities. Specifically, it predicted 77.8% of the common variance for understanding, 39.4% for reasoning, 24.6% for appreciation, and 10.2% for expression of choice. Except for reasoning and appreciation, neuropsychological predictor (β) profiles were unique for each ability. Conclusions Neuropsychological performance substantially and differentially predicted capacity for treatment decisions in individuals with mild to moderate dementia. Relationships between elemental cognitive function and decisional capacity may differ in individuals whose decisional capacity is impaired by other disorders, such as mental illness. PMID:16682669

  9. Global and System-Specific Resting-State fMRI Fluctuations Are Uncorrelated: Principal Component Analysis Reveals Anti-Correlated Networks

    PubMed Central

    Carbonell, Felix; Bellec, Pierre

    2011-01-01

    Abstract The influence of the global average signal (GAS) on functional-magnetic resonance imaging (fMRI)–based resting-state functional connectivity is a matter of ongoing debate. The global average fluctuations increase the correlation between functional systems beyond the correlation that reflects their specific functional connectivity. Hence, removal of the GAS is a common practice for facilitating the observation of network-specific functional connectivity. This strategy relies on the implicit assumption of a linear-additive model according to which global fluctuations, irrespective of their origin, and network-specific fluctuations are super-positioned. However, removal of the GAS introduces spurious negative correlations between functional systems, bringing into question the validity of previous findings of negative correlations between fluctuations in the default-mode and the task-positive networks. Here we present an alternative method for estimating global fluctuations, immune to the complications associated with the GAS. Principal components analysis was applied to resting-state fMRI time-series. A global-signal effect estimator was defined as the principal component (PC) that correlated best with the GAS. The mean correlation coefficient between our proposed PC-based global effect estimator and the GAS was 0.97±0.05, demonstrating that our estimator successfully approximated the GAS. In 66 out of 68 runs, the PC that showed the highest correlation with the GAS was the first PC. Since PCs are orthogonal, our method provides an estimator of the global fluctuations, which is uncorrelated to the remaining, network-specific fluctuations. Moreover, unlike the regression of the GAS, the regression of the PC-based global effect estimator does not introduce spurious anti-correlations beyond the decrease in seed-based correlation values allowed by the assumed additive model. After regressing this PC-based estimator out of the original time-series, we observed robust anti-correlations between resting-state fluctuations in the default-mode and the task-positive networks. We conclude that resting-state global fluctuations and network-specific fluctuations are uncorrelated, supporting a Resting-State Linear-Additive Model. In addition, we conclude that the network-specific resting-state fluctuations of the default-mode and task-positive networks show artifact-free anti-correlations. PMID:22444074
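
    The PC-based global-effect estimator can be sketched in a few lines of NumPy: pick the principal component most correlated with the global average signal and regress it out of every voxel time series. The array sizes and data below are synthetic placeholders, not the study's fMRI runs.

    ```python
    # Sketch of a PC-based global signal estimator (synthetic data).
    import numpy as np

    rng = np.random.default_rng(4)
    ts = rng.normal(size=(200, 5000))        # 200 time points x 5000 voxels (synthetic)

    gas = ts.mean(axis=1)                    # global average signal
    centered = ts - ts.mean(axis=0)
    # PCA via SVD of the time-by-voxel matrix; columns of U are temporal PCs.
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)
    corrs = [abs(np.corrcoef(U[:, k], gas)[0, 1]) for k in range(10)]
    k_best = int(np.argmax(corrs))           # usually the first PC, per the study
    pc_global = U[:, k_best]

    # Regress the PC-based estimator out of each voxel time series.
    beta = pc_global @ centered / (pc_global @ pc_global)
    cleaned = centered - np.outer(pc_global, beta)
    print(k_best, round(corrs[k_best], 3))
    ```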

  10. XCOM intrinsic dimensionality for low-Z elements at diagnostic energies

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bornefalk, Hans

    2012-02-15

    Purpose: To determine the intrinsic dimensionality of linear attenuation coefficients (LACs) from XCOM for elements with low atomic number (Z = 1-20) at diagnostic x-ray energies (25-120 keV). H_0^q, the hypothesis that the space of LACs is spanned by q bases, is tested for various q-values. Methods: Principal component analysis is first applied and the LACs are projected onto the first q principal component bases. The residuals of the model values vs XCOM data are determined for all energies and atomic numbers. Heteroscedasticity invalidates the prerequisite of i.i.d. errors necessary for bootstrapping residuals. Instead, wild bootstrap is applied, which, by not mixing residuals, allows the effect of the non-i.i.d. residuals to be reflected in the result. Credible regions for the eigenvalues of the correlation matrix for the bootstrapped LAC data are determined. If subsequent credible regions for the eigenvalues overlap, the corresponding principal component is not considered to represent true data structure but noise. If this happens for eigenvalues l and l + 1, for any l ≤ q, H_0^q is rejected. Results: The largest value of q for which H_0^q is non-rejectable at the 5% level is q = 4. This indicates that the statistically significant intrinsic dimensionality of low-Z XCOM data at diagnostic energies is four. Conclusions: The method presented allows determination of the statistically significant dimensionality of any noisy linear subspace. Knowledge of such significant dimensionality is of interest for any method making assumptions on intrinsic dimensionality and evaluating results on noisy reference data. For LACs, knowledge of the low-Z dimensionality might be relevant when parametrization schemes are tuned to XCOM data. For x-ray imaging techniques based on the basis decomposition method (Alvarez and Macovski, Phys. Med. Biol. 21, 733-744, 1976), an underlying dimensionality of two is commonly assigned to the LAC of human tissue at diagnostic energies. The finding of a higher statistically significant dimensionality thus raises the question whether a higher assumed model dimensionality (now feasible with the advent of multibin x-ray systems) might also be practically relevant, i.e., whether better tissue characterization results can be obtained.

  11. Method of determining the optimal dilution ratio for fluorescence fingerprint of food constituents.

    PubMed

    Trivittayasil, Vipavee; Tsuta, Mizuki; Kokawa, Mito; Yoshimura, Masatoshi; Sugiyama, Junichi; Fujita, Kaori; Shibata, Mario

    2015-01-01

    Quantitative determination by fluorescence spectroscopy is possible because of the linear relationship between the intensity of emitted fluorescence and the fluorophore concentration. However, concentration quenching may cause the relationship to become nonlinear, and thus, the optimal dilution ratio has to be determined. In the case of fluorescence fingerprint (FF) measurement, fluorescence is measured under multiple wavelength conditions and a method of determining the optimal dilution ratio for multivariate data such as FFs has not been reported. In this study, the FFs of mixed solutions of tryptophan and epicatechin of different concentrations and composition ratios were measured. Principal component analysis was applied, and the resulting loading plots were found to contain useful information about each constituent. The optimal concentration ranges could be determined by identifying the linear region of the PC score plotted against total concentration.

  12. Esophageal cancer detection based on tissue surface-enhanced Raman spectroscopy and multivariate analysis

    NASA Astrophysics Data System (ADS)

    Feng, Shangyuan; Lin, Juqiang; Huang, Zufang; Chen, Guannan; Chen, Weisheng; Wang, Yue; Chen, Rong; Zeng, Haishan

    2013-01-01

    The capability of using silver nanoparticle based near-infrared surface-enhanced Raman scattering (SERS) spectroscopy combined with principal component analysis (PCA) and linear discriminant analysis (LDA) to differentiate esophageal cancer tissue from normal tissue was presented. Significant differences in Raman intensities of prominent SERS bands were observed between normal and cancer tissues. PCA-LDA multivariate analysis of the measured tissue SERS spectra achieved a diagnostic sensitivity of 90.9% and a specificity of 97.8%. This exploratory study demonstrated great potential for developing label-free tissue SERS analysis into a clinical tool for esophageal cancer detection.

  13. Spectral discrimination of serum from liver cancer and liver cirrhosis using Raman spectroscopy

    NASA Astrophysics Data System (ADS)

    Yang, Tianyue; Li, Xiaozhou; Yu, Ting; Sun, Ruomin; Li, Siqi

    2011-07-01

    In this paper, Raman spectra of human serum were measured, and the spectra were then analyzed by the multivariate statistical method of principal component analysis (PCA). Linear discriminant analysis (LDA) was then applied to the principal component scores to differentiate the disease groups as the diagnostic algorithm. An artificial neural network (ANN) was used for cross-validation. The diagnostic sensitivity and specificity of PCA-LDA are 88% and 79%, while those of PCA-ANN are 89% and 95%. These results indicate that modern analytical methods are useful tools for the analysis of serum spectra for disease diagnosis.

  14. The Influence Function of Principal Component Analysis by Self-Organizing Rule.

    PubMed

    Higuchi; Eguchi

    1998-07-28

    This article is concerned with a neural network approach to principal component analysis (PCA). An algorithm for PCA by the self-organizing rule has been proposed and its robustness observed through the simulation study by Xu and Yuille (1995). In this article, the robustness of the algorithm against outliers is investigated by using the theory of influence function. The influence function of the principal component vector is given in an explicit form. Through this expression, the method is shown to be robust against any directions orthogonal to the principal component vector. In addition, a statistic generated by the self-organizing rule is proposed to assess the influence of data in PCA.

  15. Use of multivariate analysis for determining sources of solutes found in wet atmospheric deposition in the United States

    USGS Publications Warehouse

    Hooper, R.P.; Peters, N.E.

    1989-01-01

    A principal-components analysis was performed on the major solutes in wet deposition collected from 194 stations in the United States and its territories. Approximately 90% of the components derived could be interpreted as falling into one of three categories - acid, salt, or an agricultural/soil association. The total mass, or the mass of any one solute, was apportioned among these components by multiple linear regression techniques. The use of multisolute components for determining trends or spatial distribution represents a substantial improvement over single-solute analysis in that these components are more directly related to the sources of the deposition. The geographic patterns displayed by the components in this analysis indicate a far more important role for acid deposition in the Southeast and intermountain regions of the United States than would be indicated by maps of sulfate or nitrate deposition alone. In the Northeast and Midwest, the acid component is not declining at most stations, as would be expected from trends in sulfate deposition, but is holding constant or increasing. This is due, in part, to a decline in the agriculture/soil factor throughout this region, which would help to neutralize the acidity.

  16. Use of principal-component, correlation, and stepwise multiple-regression analyses to investigate selected physical and hydraulic properties of carbonate-rock aquifers

    USGS Publications Warehouse

    Brown, C. Erwin

    1993-01-01

    Correlation analysis, in conjunction with principal-component and multiple-regression analyses, was applied to laboratory chemical and petrographic data to assess the usefulness of these techniques in evaluating selected physical and hydraulic properties of carbonate-rock aquifers in central Pennsylvania. Correlation and principal-component analyses were used to establish relations and associations among variables, to determine dimensions of property variation of samples, and to filter the variables containing similar information. Principal-component and correlation analyses showed that porosity is related to other measured variables and that permeability is most related to porosity and grain size. Four principal components are found to be significant in explaining the variance of the data. Stepwise multiple-regression analysis was used to see how well the measured variables could predict porosity and (or) permeability for this suite of rocks. The variation in permeability and porosity is not totally predicted by the other variables, but the regression is significant at the 5% significance level. © 1993.

  17. Linear dichroism of DNA: Characterization of the orientation distribution function caused by hydrodynamic shear

    DOE PAGES

    Sutherland, John C.

    2017-04-15

    Linear dichroism provides information on the orientation of chromophores part of, or bound to, an orientable molecule such as DNA. For molecular alignment induced by hydrodynamic shear, the principal axes orthogonal to the direction of alignment are not equivalent. Thus, the magnitude of the flow-induced change in absorption for light polarized parallel to the direction of flow can be more than a factor of two greater than the corresponding change for light polarized perpendicular to both that direction and the shear axis. The ratio of the two flow-induced changes in absorption, the dichroic increment ratio, is characterized using the orthogonal orientation model, which assumes that each absorbing unit is aligned parallel to one of the principal axes of the apparatus. The absorption of the alignable molecules is characterized by components parallel and perpendicular to the orientable axis of the molecule. The dichroic increment ratio indicates that for the alignment of DNA in rectangular flow cells, average alignment is not uniaxial, but for higher shear, as produced in a Couette cell, it can be. The results from the simple model are identical to those of tensor models for typical experimental configurations. Approaches for measuring the dichroic increment ratio with modern dichrometers are further discussed.

  18. Linear dichroism of DNA: Characterization of the orientation distribution function caused by hydrodynamic shear

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sutherland, John C.

    Linear dichroism provides information on the orientation of chromophores part of, or bound to, an orientable molecule such as DNA. For molecular alignment induced by hydrodynamic shear, the principal axes orthogonal to the direction of alignment are not equivalent. Thus, the magnitude of the flow-induced change in absorption for light polarized parallel to the direction of flow can be more than a factor of two greater than the corresponding change for light polarized perpendicular to both that direction and the shear axis. The ratio of the two flow-induced changes in absorption, the dichroic increment ratio, is characterized using the orthogonal orientation model, which assumes that each absorbing unit is aligned parallel to one of the principal axes of the apparatus. The absorption of the alignable molecules is characterized by components parallel and perpendicular to the orientable axis of the molecule. The dichroic increment ratio indicates that for the alignment of DNA in rectangular flow cells, average alignment is not uniaxial, but for higher shear, as produced in a Couette cell, it can be. The results from the simple model are identical to those of tensor models for typical experimental configurations. Approaches for measuring the dichroic increment ratio with modern dichrometers are further discussed.

  19. Linear dichroism of DNA: Characterization of the orientation distribution function caused by hydrodynamic shear.

    PubMed

    Sutherland, John C

    2017-04-15

    Linear dichroism provides information on the orientation of chromophores part of, or bound to, an orientable molecule such as DNA. For molecular alignment induced by hydrodynamic shear, the principal axes orthogonal to the direction of alignment are not equivalent. Thus, the magnitude of the flow-induced change in absorption for light polarized parallel to the direction of flow can be more than a factor of two greater than the corresponding change for light polarized perpendicular to both that direction and the shear axis. The ratio of the two flow-induced changes in absorption, the dichroic increment ratio, is characterized using the orthogonal orientation model, which assumes that each absorbing unit is aligned parallel to one of the principal axes of the apparatus. The absorption of the alignable molecules is characterized by components parallel and perpendicular to the orientable axis of the molecule. The dichroic increment ratio indicates that for the alignment of DNA in rectangular flow cells, average alignment is not uniaxial, but for higher shear, as produced in a Couette cell, it can be. The results from the simple model are identical to tensor models for typical experimental configurations. Approaches for measuring the dichroic increment ratio with modern dichrometers are discussed. Copyright © 2017. Published by Elsevier Inc.

  20. Genetic algorithm applied to the selection of factors in principal component-artificial neural networks: application to QSAR study of calcium channel antagonist activity of 1,4-dihydropyridines (nifedipine analogous).

    PubMed

    Hemmateenejad, Bahram; Akhond, Morteza; Miri, Ramin; Shamsipur, Mojtaba

    2003-01-01

    A QSAR algorithm, principal component-genetic algorithm-artificial neural network (PC-GA-ANN), has been applied to a set of newly synthesized calcium channel blockers, which are of special interest because of their role in cardiac diseases. A data set of 124 1,4-dihydropyridines bearing different ester substituents at the C-3 and C-5 positions of the dihydropyridine ring and nitroimidazolyl, phenylimidazolyl, and methylsulfonylimidazolyl groups at the C-4 position with known Ca(2+) channel binding affinities was employed in this study. Ten different sets of descriptors (837 descriptors) were calculated for each molecule. The principal component analysis was used to compress the descriptor groups into principal components. The most significant descriptors of each set were selected and used as input for the ANN. The genetic algorithm (GA) was used for the selection of the best set of extracted principal components. A feed forward artificial neural network with a back-propagation of error algorithm was used to process the nonlinear relationship between the selected principal components and biological activity of the dihydropyridines. A comparison between PC-GA-ANN and routine PC-ANN shows that the first model yields better prediction ability.

  1. Exploring functional data analysis and wavelet principal component analysis on ecstasy (MDMA) wastewater data.

    PubMed

    Salvatore, Stefania; Bramness, Jørgen G; Røislien, Jo

    2016-07-12

    Wastewater-based epidemiology (WBE) is a novel approach in drug use epidemiology which aims to monitor the extent of use of various drugs in a community. In this study, we investigate functional principal component analysis (FPCA) as a tool for analysing WBE data and compare it to traditional principal component analysis (PCA) and to wavelet principal component analysis (WPCA) which is more flexible temporally. We analysed temporal wastewater data from 42 European cities collected daily over one week in March 2013. The main temporal features of ecstasy (MDMA) were extracted using FPCA using both Fourier and B-spline basis functions with three different smoothing parameters, along with PCA and WPCA with different mother wavelets and shrinkage rules. The stability of FPCA was explored through bootstrapping and analysis of sensitivity to missing data. The first three principal components (PCs), functional principal components (FPCs) and wavelet principal components (WPCs) explained 87.5-99.6 % of the temporal variation between cities, depending on the choice of basis and smoothing. The extracted temporal features from PCA, FPCA and WPCA were consistent. FPCA using Fourier basis and common-optimal smoothing was the most stable and least sensitive to missing data. FPCA is a flexible and analytically tractable method for analysing temporal changes in wastewater data, and is robust to missing data. WPCA did not reveal any rapid temporal changes in the data not captured by FPCA. Overall the results suggest FPCA with Fourier basis functions and common-optimal smoothing parameter as the most accurate approach when analysing WBE data.

  2. 40 CFR 62.14505 - What are the principal components of this subpart?

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 40 Protection of Environment 8 2010-07-01 2010-07-01 false What are the principal components of this subpart? 62.14505 Section 62.14505 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY... components of this subpart? This subpart contains the eleven major components listed in paragraphs (a...

  3. Hierarchical Regularity in Multi-Basin Dynamics on Protein Landscapes

    NASA Astrophysics Data System (ADS)

    Matsunaga, Yasuhiro; Kostov, Konstantin S.; Komatsuzaki, Tamiki

    2004-04-01

    We analyze time series of potential energy fluctuations and principal components at several temperatures for two kinds of off-lattice 46-bead models that have two distinctive energy landscapes. The less-frustrated "funnel" energy landscape brings about stronger nonstationary behavior of the potential energy fluctuations at the folding temperature than the other, rather frustrated energy landscape at the collapse temperature. By combining principal component analysis with an embedding nonlinear time-series analysis, it is shown that the fast fluctuations with small amplitudes of 70-80% of the principal components cause the time series to become almost "random" in only 100 simulation steps. However, the stochastic feature of the principal components tends to be suppressed through a wide range of degrees of freedom at the transition temperature.

  4. Children's environmental chemical exposures in the USA, NHANES 2003-2012.

    PubMed

    Hendryx, Michael; Luo, Juhua

    2018-02-01

    Children are vulnerable to environmental chemical exposures, but little is known about the extent of multiple chemical exposures among children. We analyzed biomonitoring data from five cycles (2003-2012) of the National Health and Nutrition Examination Survey (NHANES) to describe multiple chemical exposures in US children, examine levels of chemical concentrations present over time, and examine differences in chemical exposures by selected demographic groups. We analyzed data for 36 chemical analytes across five chemical classes in a sample of 4299 children aged 6-18. Classes included metals, pesticides, phthalates, phenols, and polycyclic aromatic hydrocarbons. We calculated the number and percent of chemicals detected and tested for secular trends over time in chemical concentrations. We compared log concentrations among groups defined by age, sex, race/ethnicity, and poverty using multiple linear regression models and report adjusted geometric means. Among a smaller subgroup of 733 children with data across chemical classes, we calculated the linear correlations within and between classes and conducted a principal component analysis. The percentage of children with detectable concentrations of an individual chemical ranged from 26 to 100%; the average was 93%, and 29 of 36 were detected in more than 90% of children. Concentrations of most tested chemicals were either unchanged or declined from earlier to more recent years. Many differences in concentrations were present by age, sex, poverty, and race/ethnicity categories. Within and between class correlations were all significant and positive, and the principal component analysis suggested a one factor solution, indicating that children exposed to higher levels of one chemical were exposed to higher levels of other chemicals. In conclusion, children in the USA are exposed to multiple simultaneous chemicals at uneven risk across socioeconomic and demographic groups. Further efforts to understand the effects of multiple exposures on child health and development are warranted.

  5. Statistical analysis of aerosol species, trace gasses, and meteorology in Chicago.

    PubMed

    Binaku, Katrina; O'Brien, Timothy; Schmeling, Martina; Fosco, Tinamarie

    2013-09-01

    Both canonical correlation analysis (CCA) and principal component analysis (PCA) were applied to atmospheric aerosol and trace gas concentrations and meteorological data collected in Chicago during the summer months of 2002, 2003, and 2004. Concentrations of ammonium, calcium, nitrate, sulfate, and oxalate particulate matter, as well as the meteorological parameters temperature, wind speed, wind direction, and humidity, were subjected to CCA and PCA. Ozone and nitrogen oxide mixing ratios were also included in the data set. The purpose of the statistical analysis was to determine the extent of existing linear relationship(s), or lack thereof, between meteorological parameters and pollutant concentrations, in addition to reducing the dimensionality of the original data to determine sources of pollutants. In CCA, the first three canonical variate pairs derived were statistically significant at the 0.05 level. The canonical correlation between the first canonical variate pair was 0.821, while the correlations of the second and third canonical variate pairs were 0.562 and 0.461, respectively. The first canonical variate pair indicated that increasing temperatures resulted in high ozone mixing ratios, while the second canonical variate pair showed the influence of wind speed and humidity on local ammonium concentrations. No new information was uncovered in the third variate pair. Canonical loadings were also interpreted for information regarding relationships between the data sets. Four principal components (PCs), expressing 77.0% of the original data variance, were derived in PCA. Interpretation of the PCs suggested significant production and/or transport of secondary aerosols in the region (PC1). Furthermore, photochemical production of ozone and the influence of wind speed on pollutants were expressed (PC2), along with an overall measure of local meteorology (PC3). In summary, the combined CCA and PCA results were successful in uncovering linear relationships between meteorology and air pollutants in Chicago and aided in determining possible pollutant sources.
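
    For readers unfamiliar with CCA, the following is a minimal sketch of extracting canonical variate pairs between a meteorological block and a pollutant block, using scikit-learn on synthetic placeholder values rather than the Chicago measurements.

    ```python
    # Sketch: canonical correlation analysis between two variable blocks (synthetic).
    import numpy as np
    from sklearn.cross_decomposition import CCA

    rng = np.random.default_rng(5)
    met = rng.normal(size=(300, 4))      # temperature, wind speed, wind direction, humidity
    chem = rng.normal(size=(300, 7))     # aerosol species and trace-gas mixing ratios

    cca = CCA(n_components=3)
    U, V = cca.fit_transform(met, chem)  # canonical variate pairs
    for k in range(3):
        r = np.corrcoef(U[:, k], V[:, k])[0, 1]
        print(f"canonical correlation {k + 1}: {r:.3f}")
    ```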

  6. Testing the Sensory Drive Hypothesis: Geographic variation in echolocation frequencies of Geoffroy's horseshoe bat (Rhinolophidae: Rhinolophus clivosus)

    PubMed Central

    Catto, Sarah; Mutumi, Gregory L.; Finger, Nikita; Webala, Paul W.

    2017-01-01

    Geographic variation in sensory traits is usually influenced by adaptive processes because these traits are involved in crucial life-history aspects including orientation, communication, lineage recognition and mate choice. Studying this variation can therefore provide insights into lineage diversification. According to the Sensory Drive Hypothesis, lineage diversification may be driven by adaptation of sensory systems to local environments. It predicts that acoustic signals vary in association with local climatic conditions so that atmospheric attenuation is minimized and transmission of the signals maximized. To test this prediction, we investigated the influence of climatic factors (specifically relative humidity and temperature) on geographic variation in the resting frequencies of the echolocation pulses of Geoffroy’s horseshoe bat, Rhinolophus clivosus. If the evolution of phenotypic variation in this lineage tracks climate variation, human induced climate change may lead to decreases in detection volumes and a reduction in foraging efficiency. A complex non-linear interaction between relative humidity and temperature affects atmospheric attenuation of sound and principal components composed of these correlated variables were, therefore, used in a linear mixed effects model to assess their contribution to observed variation in resting frequencies. A principal component composed predominantly of mean annual temperature (factor loading of -0.8455) significantly explained a proportion of the variation in resting frequency across sites (P < 0.05). Specifically, at higher relative humidity (around 60%) prevalent across the distribution of R. clivosus, increasing temperature had a strong negative effect on resting frequency. Climatic factors thus strongly influence acoustic signal divergence in this lineage, supporting the prediction of the Sensory Drive Hypothesis. The predicted future increase in temperature due to climate change is likely to decrease the detection volume in echolocating bats and adversely impact their foraging efficiency. PMID:29186147

  7. Testing the Sensory Drive Hypothesis: Geographic variation in echolocation frequencies of Geoffroy's horseshoe bat (Rhinolophidae: Rhinolophus clivosus).

    PubMed

    Jacobs, David S; Catto, Sarah; Mutumi, Gregory L; Finger, Nikita; Webala, Paul W

    2017-01-01

    Geographic variation in sensory traits is usually influenced by adaptive processes because these traits are involved in crucial life-history aspects including orientation, communication, lineage recognition and mate choice. Studying this variation can therefore provide insights into lineage diversification. According to the Sensory Drive Hypothesis, lineage diversification may be driven by adaptation of sensory systems to local environments. It predicts that acoustic signals vary in association with local climatic conditions so that atmospheric attenuation is minimized and transmission of the signals maximized. To test this prediction, we investigated the influence of climatic factors (specifically relative humidity and temperature) on geographic variation in the resting frequencies of the echolocation pulses of Geoffroy's horseshoe bat, Rhinolophus clivosus. If the evolution of phenotypic variation in this lineage tracks climate variation, human induced climate change may lead to decreases in detection volumes and a reduction in foraging efficiency. A complex non-linear interaction between relative humidity and temperature affects atmospheric attenuation of sound and principal components composed of these correlated variables were, therefore, used in a linear mixed effects model to assess their contribution to observed variation in resting frequencies. A principal component composed predominantly of mean annual temperature (factor loading of -0.8455) significantly explained a proportion of the variation in resting frequency across sites (P < 0.05). Specifically, at higher relative humidity (around 60%) prevalent across the distribution of R. clivosus, increasing temperature had a strong negative effect on resting frequency. Climatic factors thus strongly influence acoustic signal divergence in this lineage, supporting the prediction of the Sensory Drive Hypothesis. The predicted future increase in temperature due to climate change is likely to decrease the detection volume in echolocating bats and adversely impact their foraging efficiency.

  8. Application of multi-scale wavelet entropy and multi-resolution Volterra models for climatic downscaling

    NASA Astrophysics Data System (ADS)

    Sehgal, V.; Lakhanpal, A.; Maheswaran, R.; Khosa, R.; Sridhar, Venkataramana

    2018-01-01

    This study proposes a wavelet-based multi-resolution modeling approach for statistical downscaling of GCM variables to mean monthly precipitation for five locations in the Krishna Basin, India. The climatic dataset from NCEP is used for training the proposed models (Jan. '69 to Dec. '94), which are then applied to the corresponding CanCM4 GCM variables to simulate precipitation for the validation (Jan. '95-Dec. '05) and forecast (Jan. '06-Dec. '35) periods. The observed precipitation data are obtained from the India Meteorological Department (IMD) gridded precipitation product at 0.25 degree spatial resolution. This paper proposes a novel Multi-Scale Wavelet Entropy (MWE) based approach for clustering climatic variables into suitable clusters using the k-means methodology. Principal Component Analysis (PCA) is used to obtain the representative Principal Components (PC) explaining 90-95% of the variance for each cluster. A multi-resolution non-linear approach combining Discrete Wavelet Transform (DWT) and Second Order Volterra (SoV) models is used to model the representative PCs to obtain the downscaled precipitation for each downscaling location (W-P-SoV model). The results establish that wavelet-based multi-resolution SoV models perform significantly better than traditional Multiple Linear Regression (MLR) and Artificial Neural Network (ANN) based frameworks. It is observed that the proposed MWE-based clustering and subsequent PCA help reduce the dimensionality of the input climatic variables, while capturing more variability compared to stand-alone k-means (no MWE). The proposed models perform better in estimating the number of precipitation events during the non-monsoon periods, whereas the models with clustering but without MWE over-estimate the rainfall during the dry season.

  9. An Efficient Method Coupling Kernel Principal Component Analysis with Adjoint-Based Optimal Control and Its Goal-Oriented Extensions

    NASA Astrophysics Data System (ADS)

    Thimmisetty, C.; Talbot, C.; Tong, C. H.; Chen, X.

    2016-12-01

    The representativeness of available data poses a significant fundamental challenge to the quantification of uncertainty in geophysical systems. Furthermore, the successful application of machine learning methods to geophysical problems involving data assimilation is inherently constrained by the extent to which obtainable data represent the problem considered. We show how the adjoint method, coupled with optimization based on methods of machine learning, can facilitate the minimization of an objective function defined on a space of significantly reduced dimension. By considering uncertain parameters as constituting a stochastic process, the Karhunen-Loeve expansion and its nonlinear extensions furnish an optimal basis with respect to which optimization using L-BFGS can be carried out. In particular, we demonstrate that kernel PCA can be coupled with adjoint-based optimal control methods to successfully determine the distribution of material parameter values for problems in the context of channelized deformable media governed by the equations of linear elasticity. Since certain subsets of the original data are characterized by different features, the convergence rate of the method in part depends on, and may be limited by, the observations used to furnish the kernel principal component basis. By determining appropriate weights for realizations of the stochastic random field, then, one may accelerate the convergence of the method. To this end, we present a formulation of Weighted PCA combined with a gradient-based scheme using automatic differentiation to iteratively re-weight observations concurrently with the determination of an optimal reduced set of control variables in the feature space. We demonstrate how improvements in the accuracy and computational efficiency of the weighted linear method can be achieved over existing unweighted kernel methods, and discuss nonlinear extensions of the algorithm.
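
    The kernel-PCA reduction at the core of this approach can be sketched as follows; the parameter-field realizations are synthetic, and the coupling to the forward/adjoint solver and L-BFGS optimizer is deliberately omitted.

    ```python
    # Sketch: kernel PCA as a nonlinear reduced basis for an uncertain field.
    # Realizations are synthetic; solver coupling is not shown.
    import numpy as np
    from sklearn.decomposition import KernelPCA

    rng = np.random.default_rng(6)
    realizations = rng.normal(size=(500, 1024))   # 500 realizations of a discretized field

    kpca = KernelPCA(n_components=10, kernel="rbf", gamma=1e-3, fit_inverse_transform=True)
    latent = kpca.fit_transform(realizations)     # reduced control variables (500 x 10)

    # A point in the reduced space can be mapped back to a full field, which would
    # then be evaluated by the forward/adjoint solver during optimization.
    field = kpca.inverse_transform(latent[:1])
    print(latent.shape, field.shape)
    ```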

  10. Principals' Perceptions Regarding Their Supervision and Evaluation

    ERIC Educational Resources Information Center

    Hvidston, David J.; Range, Bret G.; McKim, Courtney Ann

    2015-01-01

    This study examined the perceptions of principals concerning principal evaluation and supervisory feedback. Principals were asked two open-ended questions. Respondents included 82 principals in the Rocky Mountain region. The emerging themes were "Superintendent Performance," "Principal Evaluation Components," "Specific…

  11. Association of Cardiometabolic Genes with Arsenic Metabolism Biomarkers in American Indian Communities: The Strong Heart Family Study (SHFS)

    PubMed Central

    Balakrishnan, Poojitha; Vaidya, Dhananjay; Franceschini, Nora; Voruganti, V. Saroja; Gribble, Matthew O.; Haack, Karin; Laston, Sandra; Umans, Jason G.; Francesconi, Kevin A.; Goessler, Walter; North, Kari E.; Lee, Elisa; Yracheta, Joseph; Best, Lyle G.; MacCluer, Jean W.; Kent, Jack; Cole, Shelley A.; Navas-Acien, Ana

    2016-01-01

    Background: Metabolism of inorganic arsenic (iAs) is subject to inter-individual variability, which is explained partly by genetic determinants. Objectives: We investigated the association of genetic variants with arsenic species and principal components of arsenic species in the Strong Heart Family Study (SHFS). Methods: We examined variants previously associated with cardiometabolic traits (~ 200,000 from Illumina Cardio MetaboChip) or arsenic metabolism and toxicity (670) among 2,428 American Indian participants in the SHFS. Urine arsenic species were measured by high performance liquid chromatography–inductively coupled plasma mass spectrometry (HPLC-ICP-MS), and percent arsenic species [iAs, monomethylarsonate (MMA), and dimethylarsinate (DMA), divided by their sum × 100] were logit transformed. We created two orthogonal principal components that summarized iAs, MMA, and DMA and were also phenotypes for genetic analyses. Linear regression was performed for each phenotype, dependent on allele dosage of the variant. Models accounted for familial relatedness and were adjusted for age, sex, total arsenic levels, and population stratification. Single nucleotide polymorphism (SNP) associations were stratified by study site and were meta-analyzed. Bonferroni correction was used to account for multiple testing. Results: Variants at 10q24 were statistically significant for all percent arsenic species and principal components of arsenic species. The index SNP for iAs%, MMA%, and DMA% (rs12768205) and for the principal components (rs3740394, rs3740393) were located near AS3MT, whose gene product catalyzes methylation of iAs to MMA and DMA. Among the candidate arsenic variant associations, functional SNPs in AS3MT and 10q24 were most significant (p < 9.33 × 10–5). Conclusions: This hypothesis-driven association study supports the role of common variants in arsenic metabolism, particularly AS3MT and 10q24. Citation: Balakrishnan P, Vaidya D, Franceschini N, Voruganti VS, Gribble MO, Haack K, Laston S, Umans JG, Francesconi KA, Goessler W, North KE, Lee E, Yracheta J, Best LG, MacCluer JW, Kent J Jr., Cole SA, Navas-Acien A. 2017. Association of cardiometabolic genes with arsenic metabolism biomarkers in American Indian communities: the Strong Heart Family Study (SHFS). Environ Health Perspect 125:15–22; http://dx.doi.org/10.1289/EHP251 PMID:27352405

  12. Association of Cardiometabolic Genes with Arsenic Metabolism Biomarkers in American Indian Communities: The Strong Heart Family Study (SHFS).

    PubMed

    Balakrishnan, Poojitha; Vaidya, Dhananjay; Franceschini, Nora; Voruganti, V Saroja; Gribble, Matthew O; Haack, Karin; Laston, Sandra; Umans, Jason G; Francesconi, Kevin A; Goessler, Walter; North, Kari E; Lee, Elisa; Yracheta, Joseph; Best, Lyle G; MacCluer, Jean W; Kent, Jack; Cole, Shelley A; Navas-Acien, Ana

    2017-01-01

    Metabolism of inorganic arsenic (iAs) is subject to inter-individual variability, which is explained partly by genetic determinants. We investigated the association of genetic variants with arsenic species and principal components of arsenic species in the Strong Heart Family Study (SHFS). We examined variants previously associated with cardiometabolic traits (~ 200,000 from Illumina Cardio MetaboChip) or arsenic metabolism and toxicity (670) among 2,428 American Indian participants in the SHFS. Urine arsenic species were measured by high performance liquid chromatography-inductively coupled plasma mass spectrometry (HPLC-ICP-MS), and percent arsenic species [iAs, monomethylarsonate (MMA), and dimethylarsinate (DMA), divided by their sum × 100] were logit transformed. We created two orthogonal principal components that summarized iAs, MMA, and DMA and were also phenotypes for genetic analyses. Linear regression was performed for each phenotype, dependent on allele dosage of the variant. Models accounted for familial relatedness and were adjusted for age, sex, total arsenic levels, and population stratification. Single nucleotide polymorphism (SNP) associations were stratified by study site and were meta-analyzed. Bonferroni correction was used to account for multiple testing. Variants at 10q24 were statistically significant for all percent arsenic species and principal components of arsenic species. The index SNP for iAs%, MMA%, and DMA% (rs12768205) and for the principal components (rs3740394, rs3740393) were located near AS3MT, whose gene product catalyzes methylation of iAs to MMA and DMA. Among the candidate arsenic variant associations, functional SNPs in AS3MT and 10q24 were most significant (p < 9.33 × 10-5). This hypothesis-driven association study supports the role of common variants in arsenic metabolism, particularly AS3MT and 10q24. Citation: Balakrishnan P, Vaidya D, Franceschini N, Voruganti VS, Gribble MO, Haack K, Laston S, Umans JG, Francesconi KA, Goessler W, North KE, Lee E, Yracheta J, Best LG, MacCluer JW, Kent J Jr., Cole SA, Navas-Acien A. 2017. Association of cardiometabolic genes with arsenic metabolism biomarkers in American Indian communities: the Strong Heart Family Study (SHFS). Environ Health Perspect 125:15-22; http://dx.doi.org/10.1289/EHP251.

  13. Conformational states and folding pathways of peptides revealed by principal-independent component analyses.

    PubMed

    Nguyen, Phuong H

    2007-05-15

    Principal component analysis is a powerful method for projecting the multidimensional conformational space of peptides or proteins onto lower-dimensional subspaces in which the main conformations are present, making it easier to reveal the structures of molecules from, e.g., molecular dynamics simulation trajectories. However, the identification of all conformational states is still difficult if the subspaces consist of more than two dimensions. This is mainly because the principal components are not independent of each other, and states in the subspaces cannot be visualized. In this work, we propose a simple and fast scheme that allows one to obtain all conformational states in the subspaces. The basic idea is that, instead of directly identifying the states in the subspace spanned by principal components, we first transform this subspace into another subspace formed by components that are independent of one another. These independent components are obtained from the principal components by employing the independent component analysis method. Because of the independence between components, all states in this new subspace are defined as all possible combinations of the states obtained from each single independent component. This makes the conformational analysis much simpler. We test the performance of the method by analyzing the conformations of the glycine tripeptide and the alanine hexapeptide. The analyses show that our method is simple and quickly reveals all conformational states in the subspaces. The folding pathways between the identified states of the alanine hexapeptide are analyzed and discussed in some detail. 2007 Wiley-Liss, Inc.
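
    A hedged sketch of this two-step reduction (PCA onto a few components, then ICA to rotate them to independence) is shown below with scikit-learn on synthetic trajectory data; state assignment along each independent axis is done crudely by a median split.

    ```python
    # Sketch: PCA followed by ICA for conformational-state analysis (synthetic data).
    import numpy as np
    from sklearn.decomposition import PCA, FastICA

    rng = np.random.default_rng(7)
    coords = rng.normal(size=(20000, 90))   # frames x Cartesian degrees of freedom

    pcs = PCA(n_components=3).fit_transform(coords)                    # low-dimensional subspace
    ics = FastICA(n_components=3, random_state=0).fit_transform(pcs)   # independent components

    # One-dimensional state assignment along each independent component (median split);
    # the combined labels enumerate the candidate conformational states.
    labels = (ics > np.median(ics, axis=0)).astype(int)
    states = labels[:, 0] * 4 + labels[:, 1] * 2 + labels[:, 2]
    print(np.bincount(states))
    ```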

  14. YORP torque as the function of shape harmonics

    NASA Astrophysics Data System (ADS)

    Breiter, Sławomir; Michalska, Hanna

    2008-08-01

    The second-order analytical approximation of the mean Yarkovsky-O'Keefe-Radzievskii-Paddack (YORP) torque components is given as an explicit function of the shape spherical harmonics coefficients for a sufficiently regular minor body. The results are based upon a new expression for the insolation function, significantly simpler than in previous works. Linearized plane-parallel model of the temperature distribution derived from the insolation function allows us to take into account a non-zero conductivity. Final expressions for the three average components of the YORP torque related with rotation period, obliquity and precession are given in a form of the Legendre series of the cosine of obliquity. The series have good numerical properties and can be easily truncated according to the degree of the Legendre polynomials or associated functions, with first two terms playing the principal role.

  15. New Insights into the Folding of a β-Sheet Miniprotein in a Reduced Space of Collective Hydrogen Bond Variables: Application to a Hydrodynamic Analysis of the Folding Flow

    PubMed Central

    Kalgin, Igor V.; Caflisch, Amedeo; Chekmarev, Sergei F.; Karplus, Martin

    2013-01-01

    A new analysis of the 20 μs equilibrium folding/unfolding molecular dynamics simulations of the three-stranded antiparallel β-sheet miniprotein (beta3s) in implicit solvent is presented. The conformation space is reduced in dimensionality by introduction of linear combinations of hydrogen bond distances as the collective variables making use of a specially adapted Principal Component Analysis (PCA); i.e., to make structured conformations more pronounced, only the formed bonds are included in determining the principal components. It is shown that a three-dimensional (3D) subspace gives a meaningful representation of the folding behavior. The first component, to which eight native hydrogen bonds make the major contribution (four in each beta hairpin), is found to play the role of the reaction coordinate for the overall folding process, while the second and third components distinguish the structured conformations. The representative points of the trajectory in the 3D space are grouped into conformational clusters that correspond to locally stable conformations of beta3s identified in earlier work. A simplified kinetic network based on the three components is constructed and it is complemented by a hydrodynamic analysis. The latter, making use of “passive tracers” in 3D space, indicates that the folding flow is much more complex than suggested by the kinetic network. A 2D representation of streamlines shows there are vortices which correspond to repeated local rearrangement, not only around minima of the free energy surface, but also in flat regions between minima. The vortices revealed by the hydrodynamic analysis are apparently not evident in folding pathways generated by transition-path sampling. Making use of the fact that the values of the collective hydrogen bond variables are linearly related to the Cartesian coordinate space, the RMSD between clusters is determined. Interestingly, the transition rates show an approximate exponential correlation with distance in the hydrogen bond subspace. Comparison with the many published studies shows good agreement with the present analysis for the parts that can be compared, supporting the robust character of our understanding of this “hydrogen atom” of protein folding. PMID:23621790

  16. Combining multiple regression and principal component analysis for accurate predictions for column ozone in Peninsular Malaysia

    NASA Astrophysics Data System (ADS)

    Rajab, Jasim M.; MatJafri, M. Z.; Lim, H. S.

    2013-06-01

    This study encompasses columnar ozone modelling in peninsular Malaysia. A data set of eight atmospheric parameters [air surface temperature (AST), carbon monoxide (CO), methane (CH4), water vapour (H2Ovapour), skin surface temperature (SSKT), atmosphere temperature (AT), relative humidity (RH), and mean surface pressure (MSP)], retrieved from NASA's Atmospheric Infrared Sounder (AIRS) for the entire period (2003-2008), was employed to develop models to predict the value of columnar ozone (O3) in the study area. A combined method, based on multiple regression together with principal component analysis (PCA) modelling, was used to predict columnar ozone and to improve the prediction accuracy. Separate analyses were carried out for the north east monsoon (NEM) and south west monsoon (SWM) seasons. O3 was negatively correlated with CH4, H2Ovapour, RH, and MSP, whereas it was positively correlated with CO, AST, SSKT, and AT during both the NEM and SWM season periods. Multiple regression analysis was used to fit the columnar ozone data using the atmospheric parameters as predictors. A variable selection method based on high loadings of varimax-rotated principal components was used to acquire subsets of the predictor variables to be included in the linear regression model of the atmospheric parameters. It was found that an increase in columnar O3 is associated with an increase in the values of AST, SSKT, AT, and CO and with a drop in the levels of CH4, H2Ovapour, RH, and MSP. Fitting the best models for columnar O3 using eight of the independent variables gave about the same values of R (≈0.93) and R2 (≈0.86) for both the NEM and SWM seasons. The common variables that appeared in both regression equations were SSKT, CH4 and RH, and the principal precursor of columnar O3 in both the NEM and SWM seasons was SSKT.
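
    A rough sketch of a loadings-based variable selection followed by multiple linear regression is given below; it omits the varimax rotation for brevity, and the predictors and ozone values are synthetic placeholders rather than AIRS retrievals.

    ```python
    # Sketch: PCA-loading-based predictor selection, then multiple linear regression.
    # Synthetic data; varimax rotation of the components is omitted.
    import numpy as np
    from sklearn.preprocessing import StandardScaler
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(8)
    X = rng.normal(size=(400, 8))                                  # 8 atmospheric parameters
    y = X @ rng.normal(size=8) + rng.normal(scale=0.5, size=400)   # synthetic columnar O3

    Xs = StandardScaler().fit_transform(X)
    pca = PCA(n_components=3).fit(Xs)
    # For each retained component, keep the predictor with the largest absolute loading.
    selected = sorted({int(np.argmax(np.abs(load))) for load in pca.components_})

    reg = LinearRegression().fit(X[:, selected], y)
    print("selected predictor indices:", selected)
    print("R^2 on the training data:", round(reg.score(X[:, selected], y), 3))
    ```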

  17. [Assessment of the strength of tobacco control on creating smoke-free hospitals using principal components analysis].

    PubMed

    Liu, Hui-lin; Wan, Xia; Yang, Gong-huan

    2013-02-01

    To explore the relationship between the strength of tobacco control and the effectiveness of creating smoke-free hospitals, and to summarize the main factors that affect the program of creating smoke-free hospitals. A total of 210 hospitals from 7 provinces/municipalities directly under the central government were enrolled in this study using a stratified random sampling method. Principal component analysis and regression analysis were conducted to analyze the strength of tobacco control and the effectiveness of creating smoke-free hospitals. Two principal components were extracted from the strength-of-tobacco-control indicators, which respectively reflected the tobacco control policies and efforts, and the willingness and leadership of hospital managers regarding tobacco control. The regression analysis indicated that only the first principal component was significantly correlated with progress in creating smoke-free hospitals (P<0.001), i.e. hospitals with higher scores on the first principal component had better achievements in smoke-free environment creation. Tobacco control policies and efforts are critical in creating smoke-free hospitals. Principal component analysis provides a comprehensive and objective tool for evaluating the creation of smoke-free hospitals.

  18. Critical Factors Explaining the Leadership Performance of High-Performing Principals

    ERIC Educational Resources Information Center

    Hutton, Disraeli M.

    2018-01-01

    The study explored critical factors that explain leadership performance of high-performing principals and examined the relationship between these factors based on the ratings of school constituents in the public school system. The principal component analysis with the use of Varimax Rotation revealed that four components explain 51.1% of the…

  19. Molecular dynamics in principal component space.

    PubMed

    Michielssens, Servaas; van Erp, Titus S; Kutzner, Carsten; Ceulemans, Arnout; de Groot, Bert L

    2012-07-26

    A molecular dynamics algorithm in principal component space is presented. It is demonstrated that sampling can be improved without changing the ensemble by assigning masses to the principal components proportional to the inverse square root of the eigenvalues. The setup of the simulation requires no prior knowledge of the system; a short initial MD simulation to extract the eigenvectors and eigenvalues suffices. Independent measures indicated a 6-7 times faster sampling compared to a regular molecular dynamics simulation.
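
    The mass assignment itself is a one-line computation; the sketch below, with synthetic eigenvalues, only illustrates the scaling rule and not the full MD integration in principal component space.

    ```python
    # Sketch: masses proportional to the inverse square root of the PCA eigenvalues
    # (synthetic eigenvalues; overall scale factor left arbitrary).
    import numpy as np

    eigenvalues = np.array([4.0, 1.0, 0.25, 0.04])   # PCA eigenvalues (synthetic)
    masses = 1.0 / np.sqrt(eigenvalues)

    # For harmonic-like modes the frequency scales as sqrt(k / m), so choosing
    # m ~ 1/sqrt(lambda) narrows the spread of time scales across the components.
    print(masses)
    ```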

  20. Discriminating the Mineralogical Composition in Drill Cuttings Based on Absorption Spectra in the Terahertz Range.

    PubMed

    Miao, Xinyang; Li, Hao; Bao, Rima; Feng, Chengjing; Wu, Hang; Zhan, Honglei; Li, Yizhang; Zhao, Kun

    2017-02-01

    Understanding the geological units of a reservoir is essential to the development and management of the resource. In this paper, drill cuttings from several depths from an oilfield were studied using terahertz time domain spectroscopy (THz-TDS). Cluster analysis (CA) and principal component analysis (PCA) were employed to classify and analyze the cuttings. The cuttings were clearly classified based on CA and PCA methods, and the results were in agreement with the lithology. Moreover, calcite and dolomite have stronger absorption of a THz pulse than any other minerals, based on an analysis of the PC1 scores. Quantitative analyses of minor minerals were also realized by building a series of linear and non-linear models between contents and PC2 scores. The results prove THz technology to be a promising means for determining reservoir lithology as well as other properties, which will be a significant supplementary method in oil fields.

  1. Decoupled ARX and RBF Neural Network Modeling Using PCA and GA Optimization for Nonlinear Distributed Parameter Systems.

    PubMed

    Zhang, Ridong; Tao, Jili; Lu, Renquan; Jin, Qibing

    2018-02-01

    Modeling of distributed parameter systems is difficult because of their nonlinearity and infinite-dimensional characteristics. Based on principal component analysis (PCA), a hybrid modeling strategy that consists of a decoupled linear autoregressive exogenous (ARX) model and a nonlinear radial basis function (RBF) neural network model is proposed. The spatial-temporal output is first divided into a few dominant spatial basis functions and finite-dimensional temporal series by PCA. Then, a decoupled ARX model is designed to model the linear dynamics of the dominant modes of the time series. The nonlinear residual part is subsequently parameterized by RBFs, where a genetic algorithm is utilized to optimize their hidden layer structure and the parameters. Finally, the nonlinear spatial-temporal dynamic system is obtained after the time/space reconstruction. Simulation results of a catalytic rod and a heat conduction equation demonstrate the effectiveness of the proposed strategy compared to several other methods.
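
    The PCA-based time/space separation at the core of this strategy can be sketched as follows; this is a minimal illustration on a synthetic snapshot matrix, and the subsequent ARX/RBF modeling of the temporal series is omitted.

    import numpy as np

    # Snapshot matrix Y: rows = time steps, columns = spatial grid points (synthetic field)
    n_t, n_x = 400, 60
    x = np.linspace(0.0, 1.0, n_x)
    t = np.arange(n_t)[:, None]
    Y = np.sin(np.pi * x) * np.cos(0.05 * t) + 0.01 * np.random.randn(n_t, n_x)

    Y_mean = Y.mean(axis=0)
    U, S, Vt = np.linalg.svd(Y - Y_mean, full_matrices=False)

    k = 3                           # number of dominant spatial modes kept (assumed)
    phi = Vt[:k].T                  # spatial basis functions, shape (n_x, k)
    a = (Y - Y_mean) @ phi          # finite-dimensional temporal series, shape (n_t, k)

    # Time/space reconstruction of the field from the reduced model
    Y_hat = Y_mean + a @ phi.T
    print("relative reconstruction error:", np.linalg.norm(Y - Y_hat) / np.linalg.norm(Y))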

  2. Geographical identification of saffron (Crocus sativus L.) by linear discriminant analysis applied to the UV-visible spectra of aqueous extracts.

    PubMed

    D'Archivio, Angelo Antonio; Maggi, Maria Anna

    2017-03-15

    We attempted geographical classification of saffron using UV-visible spectroscopy, conventionally adopted for quality grading according to the ISO Normative 3632. We investigated 81 saffron samples produced in L'Aquila, Città della Pieve, Cascia, and Sardinia (Italy) and commercial products purchased in various supermarkets. Exploratory principal component analysis applied to the UV-vis spectra of saffron aqueous extracts revealed a clear differentiation of the samples belonging to different quality categories, but a poor separation according to the geographical origin of the spices. On the other hand, linear discriminant analysis based on 8 selected absorbance values, concentrated near 279, 305 and 328 nm, allowed a good distinction of the spices coming from different sites. Under severe validation conditions (30% and 50% of saffron samples in the evaluation set), correct predictions were 85 and 83%, respectively. Copyright © 2016 Elsevier Ltd. All rights reserved.

  3. Primary and aggregate color centers in proton irradiated LiF crystals and thin films for luminescent solid state detectors

    NASA Astrophysics Data System (ADS)

    Piccinini, M.; Ambrosini, F.; Ampollini, A.; Bonfigli, F.; Libera, S.; Picardi, L.; Ronsivalle, C.; Vincenti, M. A.; Montereali, R. M.

    2015-04-01

    Proton beams of 3 MeV energy, produced by the injector of a linear accelerator for proton therapy, were used to irradiate at room temperature lithium fluoride crystals and polycrystalline thin films grown by thermal evaporation. The irradiation fluence range was 10^11-10^15 protons/cm^2. The proton irradiation induced the stable formation of primary and aggregate color centers. Their formation was investigated by optical absorption and photoluminescence spectroscopy. The F2 and F3+ photoluminescence intensities, carefully measured in LiF crystals and thin films, show linear behaviours up to different maximum values of the irradiation fluence, after which a quenching is observed, depending on the nature of the samples (crystals and films). The Principal Component Analysis, applied to the absorption spectra of colored crystals, allowed the formation of more complex aggregate defects to be clearly identified in samples irradiated at the highest fluences.

  4. Exploring the CAESAR database using dimensionality reduction techniques

    NASA Astrophysics Data System (ADS)

    Mendoza-Schrock, Olga; Raymer, Michael L.

    2012-06-01

    The Civilian American and European Surface Anthropometry Resource (CAESAR) database containing over 40 anthropometric measurements on over 4000 humans has been extensively explored for pattern recognition and classification purposes using the raw, original data [1-4]. However, some of the anthropometric variables would be impossible to collect in an uncontrolled environment. Here, we explore the use of dimensionality reduction methods in concert with a variety of classification algorithms for gender classification using only those variables that are readily observable in an uncontrolled environment. Several dimensionality reduction techniques are employed to learn the underlying structure of the data. These techniques include linear projections such as the classical Principal Components Analysis (PCA) and non-linear (manifold learning) techniques, such as Diffusion Maps and the Isomap technique. This paper briefly describes all three techniques, and compares three different classifiers, Naïve Bayes, Adaboost, and Support Vector Machines (SVM), for gender classification in conjunction with each of these three dimensionality reduction approaches.
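
    A comparison of this kind can be sketched with scikit-learn as below; the data are synthetic stand-ins for the access-restricted CAESAR measurements, and diffusion maps are omitted because they have no standard scikit-learn implementation.

    from sklearn.datasets import make_classification
    from sklearn.decomposition import PCA
    from sklearn.manifold import Isomap
    from sklearn.naive_bayes import GaussianNB
    from sklearn.ensemble import AdaBoostClassifier
    from sklearn.svm import SVC
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.model_selection import cross_val_score

    # Placeholder "readily observable" measurements and gender labels
    X, y = make_classification(n_samples=600, n_features=12, n_informative=6, random_state=0)

    reducers = {"PCA": PCA(n_components=3), "Isomap": Isomap(n_components=3)}
    classifiers = {"NaiveBayes": GaussianNB(), "AdaBoost": AdaBoostClassifier(), "SVM": SVC()}

    for r_name, reducer in reducers.items():
        for c_name, clf in classifiers.items():
            pipe = make_pipeline(StandardScaler(), reducer, clf)
            score = cross_val_score(pipe, X, y, cv=5).mean()
            print(f"{r_name} + {c_name}: mean CV accuracy = {score:.3f}")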

  5. Pattern recognition and genetic algorithms for discrimination of orange juices and reduction of significant components from headspace solid-phase microextraction.

    PubMed

    Rinaldi, Maurizio; Gindro, Roberto; Barbeni, Massimo; Allegrone, Gianna

    2009-01-01

    Orange (Citrus sinensis L.) juice comprises a complex mixture of volatile components that are difficult to identify and quantify. Classification and discrimination of the varieties on the basis of the volatile composition could help to guarantee the quality of a juice and to detect possible adulteration of the product. To provide information on the amounts of volatile constituents in fresh-squeezed juices from four orange cultivars and to establish suitable discrimination rules to differentiate orange juices using new chemometric approaches. Fresh juices of four orange cultivars were analysed by headspace solid-phase microextraction (HS-SPME) coupled with GC-MS. Principal component analysis, linear discriminant analysis and heuristic methods, such as neural networks, allowed clustering of the data from HS-SPME analysis while genetic algorithms addressed the problem of data reduction. To check the quality of the results the chemometric techniques were also evaluated on a sample. Thirty volatile compounds were identified by HS-SPME and GC-MS analyses and their relative amounts calculated. Differences in composition of orange juice volatile components were observed. The chosen orange cultivars could be discriminated using neural networks, genetic relocation algorithms and linear discriminant analysis. Genetic algorithms applied to the data were also able to detect the most significant compounds. SPME is a useful technique to investigate orange juice volatile composition and a flexible chemometric approach is able to correctly separate the juices.

  6. [A study of Boletus bicolor from different areas using Fourier transform infrared spectrometry].

    PubMed

    Zhou, Zai-Jin; Liu, Gang; Ren, Xian-Pei

    2010-04-01

    It is hard to differentiate the same species of wild-growing mushrooms from different areas by macromorphological features. In this paper, Fourier transform infrared (FTIR) spectroscopy combined with principal component analysis was used to identify 58 samples of Boletus bicolor from five different areas. Based on the fingerprint infrared spectra of the Boletus bicolor samples, principal component analysis was conducted on the 58 spectra in the range of 1350-750 cm^-1 using the statistical software SPSS 13.0. The cumulative contribution of the first three principal components accounts for 88.87% of the variance, so they capture almost all the information in the samples. The two-dimensional projection plot of the first and second principal components shows a satisfactory clustering effect for the classification and discrimination of Boletus bicolor: all samples were divided into five groups with a classification accuracy of 98.3%. The study demonstrated that wild-growing Boletus bicolor from different areas can be identified, within a single species, by FTIR spectra combined with principal component analysis.

  7. HPLC-PDA Combined with Chemometrics for Quantitation of Active Components and Quality Assessment of Raw and Processed Fruits of Xanthium strumarium L.

    PubMed

    Jiang, Hai; Yang, Liu; Xing, Xudong; Yan, Meiling; Guo, Xinyue; Yang, Bingyou; Wang, Qiuhong; Kuang, Haixue

    2018-01-25

    As a valuable herbal medicine, the fruits of Xanthium strumarium L. (Xanthii Fructus) have been widely used in raw and processed forms to achieve different therapeutic effects in practice. In this study, a comprehensive strategy was proposed for evaluating the active components in 30 batches of raw and processed Xanthii Fructus (RXF and PXF) samples, based on high-performance liquid chromatography coupled with photodiode array detection (HPLC-PDA). Twelve common peaks were detected and eight compounds of caffeoylquinic acids were simultaneously quantified in RXF and PXF. All the analytes were detected with satisfactory linearity (R² > 0.9991) over wide concentration ranges. Simultaneously, the chemically latent information was revealed by hierarchical cluster analysis (HCA) and principal component analysis (PCA). The results suggest that there were significant differences between RXF and PXF from different regions in terms of the content of eight caffeoylquinic acids. Potential chemical markers for XF were found during processing by chemometrics.

  8. Multivariate analyses of crater parameters and the classification of craters

    NASA Technical Reports Server (NTRS)

    Siegal, B. S.; Griffiths, J. C.

    1974-01-01

    Multivariate analyses were performed on certain linear dimensions of six genetic types of craters. A total of 320 craters, consisting of laboratory fluidization craters, craters formed by chemical and nuclear explosives, terrestrial maars and other volcanic craters, and terrestrial meteorite impact craters, authenticated and probable, were analyzed in the first data set in terms of their mean rim crest diameter, mean interior relief, rim height, and mean exterior rim width. The second data set contained an additional 91 terrestrial craters of which 19 were of experimental percussive impact and 28 of volcanic collapse origin, and which was analyzed in terms of mean rim crest diameter, mean interior relief, and rim height. Principal component analyses were performed on the six genetic types of craters. Ninety per cent of the variation in the variables can be accounted for by two components. Ninety-nine per cent of the variation in the craters formed by chemical and nuclear explosives is explained by the first component alone.

  9. Further Improvements to Linear Mixed Models for Genome-Wide Association Studies

    PubMed Central

    Widmer, Christian; Lippert, Christoph; Weissbrod, Omer; Fusi, Nicolo; Kadie, Carl; Davidson, Robert; Listgarten, Jennifer; Heckerman, David

    2014-01-01

    We examine improvements to the linear mixed model (LMM) that better correct for population structure and family relatedness in genome-wide association studies (GWAS). LMMs rely on the estimation of a genetic similarity matrix (GSM), which encodes the pairwise similarity between every two individuals in a cohort. These similarities are estimated from single nucleotide polymorphisms (SNPs) or other genetic variants. Traditionally, all available SNPs are used to estimate the GSM. In empirical studies across a wide range of synthetic and real data, we find that modifications to this approach improve GWAS performance as measured by type I error control and power. Specifically, when only population structure is present, a GSM constructed from SNPs that well predict the phenotype in combination with principal components as covariates controls type I error and yields more power than the traditional LMM. In any setting, with or without population structure or family relatedness, a GSM consisting of a mixture of two component GSMs, one constructed from all SNPs and another constructed from SNPs that well predict the phenotype again controls type I error and yields more power than the traditional LMM. Software implementing these improvements and the experimental comparisons are available at http://microsoft.com/science. PMID:25387525
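
    A hedged sketch of the two ingredients described above, with synthetic genotypes in place of real SNP data: a GSM built from standardized variants, and a mixture of a GSM over all SNPs with one built only from phenotype-predictive SNPs. The "predictive" subset and the mixing weight are placeholders; in practice the weight would be fitted, for example by restricted maximum likelihood.

    import numpy as np

    rng = np.random.default_rng(0)
    n_individuals, n_snps = 200, 1000
    G = rng.binomial(2, 0.3, size=(n_individuals, n_snps)).astype(float)   # 0/1/2 genotype counts

    def gsm(genotypes):
        # Standardize each SNP, then form the pairwise genetic similarity matrix
        Z = (genotypes - genotypes.mean(axis=0)) / (genotypes.std(axis=0) + 1e-12)
        return Z @ Z.T / Z.shape[1]

    selected = rng.choice(n_snps, size=100, replace=False)   # stand-in for SNPs that predict the phenotype
    K_all, K_sel = gsm(G), gsm(G[:, selected])

    w = 0.5                                   # placeholder mixing weight
    K_mix = w * K_all + (1.0 - w) * K_sel     # mixture of the two component GSMs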

  10. Further Improvements to Linear Mixed Models for Genome-Wide Association Studies

    NASA Astrophysics Data System (ADS)

    Widmer, Christian; Lippert, Christoph; Weissbrod, Omer; Fusi, Nicolo; Kadie, Carl; Davidson, Robert; Listgarten, Jennifer; Heckerman, David

    2014-11-01

    We examine improvements to the linear mixed model (LMM) that better correct for population structure and family relatedness in genome-wide association studies (GWAS). LMMs rely on the estimation of a genetic similarity matrix (GSM), which encodes the pairwise similarity between every two individuals in a cohort. These similarities are estimated from single nucleotide polymorphisms (SNPs) or other genetic variants. Traditionally, all available SNPs are used to estimate the GSM. In empirical studies across a wide range of synthetic and real data, we find that modifications to this approach improve GWAS performance as measured by type I error control and power. Specifically, when only population structure is present, a GSM constructed from SNPs that well predict the phenotype in combination with principal components as covariates controls type I error and yields more power than the traditional LMM. In any setting, with or without population structure or family relatedness, a GSM consisting of a mixture of two component GSMs, one constructed from all SNPs and another constructed from SNPs that well predict the phenotype again controls type I error and yields more power than the traditional LMM. Software implementing these improvements and the experimental comparisons are available at http://microsoft.com/science.

  11. Further improvements to linear mixed models for genome-wide association studies.

    PubMed

    Widmer, Christian; Lippert, Christoph; Weissbrod, Omer; Fusi, Nicolo; Kadie, Carl; Davidson, Robert; Listgarten, Jennifer; Heckerman, David

    2014-11-12

    We examine improvements to the linear mixed model (LMM) that better correct for population structure and family relatedness in genome-wide association studies (GWAS). LMMs rely on the estimation of a genetic similarity matrix (GSM), which encodes the pairwise similarity between every two individuals in a cohort. These similarities are estimated from single nucleotide polymorphisms (SNPs) or other genetic variants. Traditionally, all available SNPs are used to estimate the GSM. In empirical studies across a wide range of synthetic and real data, we find that modifications to this approach improve GWAS performance as measured by type I error control and power. Specifically, when only population structure is present, a GSM constructed from SNPs that well predict the phenotype in combination with principal components as covariates controls type I error and yields more power than the traditional LMM. In any setting, with or without population structure or family relatedness, a GSM consisting of a mixture of two component GSMs, one constructed from all SNPs and another constructed from SNPs that well predict the phenotype again controls type I error and yields more power than the traditional LMM. Software implementing these improvements and the experimental comparisons are available at http://microsoft.com/science.

  12. A metric space for Type Ia supernova spectra: a new method to assess explosion scenarios

    NASA Astrophysics Data System (ADS)

    Sasdelli, Michele; Hillebrandt, W.; Kromer, M.; Ishida, E. E. O.; Röpke, F. K.; Sim, S. A.; Pakmor, R.; Seitenzahl, I. R.; Fink, M.

    2017-04-01

    Over the past years, Type Ia supernovae (SNe Ia) have become a major tool to determine the expansion history of the Universe, and considerable attention has been given to, both, observations and models of these events. However, until now, their progenitors are not known. The observed diversity of light curves and spectra seems to point at different progenitor channels and explosion mechanisms. Here, we present a new way to compare model predictions with observations in a systematic way. Our method is based on the construction of a metric space for SN Ia spectra by means of linear principal component analysis, taking care of missing and/or noisy data, and making use of partial least-squares regression to find correlations between spectral properties and photometric data. We investigate realizations of the three major classes of explosion models that are presently discussed: delayed-detonation Chandrasekhar-mass explosions, sub-Chandrasekhar-mass detonations and double-degenerate mergers, and compare them with data. We show that in the principal component space, all scenarios have observed counterparts, supporting the idea that different progenitors are likely. However, all classes of models face problems in reproducing the observed correlations between spectral properties and light curves and colours. Possible reasons are briefly discussed.

  13. Groundwater quality assessment and pollution source apportionment in an intensely exploited region of northern China.

    PubMed

    Zhang, Qianqian; Wang, Huiwei; Wang, Yanchao; Yang, Mingnan; Zhu, Liang

    2017-07-01

    Deterioration in groundwater quality has attracted wide social interest in China. In this study, groundwater quality was monitored during December 2014 at 115 sites in the Hutuo River alluvial-pluvial fan region of northern China. Results showed that 21.7% of NO3- samples and 51.3% of total hardness samples exceeded grade III of the national quality standards for Chinese groundwater. In addition, results of grey relational analysis (GRA) show that 64.3, 10.4, 21.7, and 3.6% of samples were within the I, II, IV, and V grades of groundwater in the Hutuo River region, respectively. The poor water quality in the study region is due to intense anthropogenic activities as well as aquifer vulnerability to contamination. Results of principal component analysis (PCA) revealed three major factors: (1) domestic wastewater and agricultural runoff pollution (anthropogenic activities), (2) water-rock interactions (natural processes), and (3) industrial wastewater pollution (anthropogenic activities). Using PCA and absolute principal component scores-multivariate linear regression (APCS-MLR), results show that domestic wastewater and agricultural runoff are the main sources of groundwater pollution in the Hutuo River alluvial-pluvial fan area. Thus, the most appropriate methods to prevent groundwater quality degradation are to improve capacities for wastewater treatment and to optimize fertilization strategies.
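
    The APCS-MLR apportionment step can be sketched as follows; this is a schematic of the usual Thurston-Spengler recipe with random data in place of the hydrochemical measurements, and the number of retained components is an assumption.

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(1)
    C = rng.lognormal(size=(115, 8))                  # 115 samples x 8 hydrochemical variables (placeholder)

    Z = (C - C.mean(axis=0)) / C.std(axis=0)          # standardized concentrations
    pca = PCA(n_components=3).fit(Z)
    scores = pca.transform(Z)

    # Absolute principal component scores: subtract the score of an artificial sample
    # whose concentration is zero for every variable.
    z0 = -C.mean(axis=0) / C.std(axis=0)
    apcs = scores - pca.transform(z0.reshape(1, -1))

    # Regress each measured variable on the APCS; the coefficients give the
    # contribution of each source (factor), the intercept the background level.
    for j in range(C.shape[1]):
        reg = LinearRegression().fit(apcs, C[:, j])
        print(f"variable {j}: source contributions {reg.coef_.round(3)}, background {reg.intercept_:.3f}")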

  14. Seasonal forecasting of high wind speeds over Western Europe

    NASA Astrophysics Data System (ADS)

    Palutikof, J. P.; Holt, T.

    2003-04-01

    As financial losses associated with extreme weather events escalate, there is interest from end users in the forestry and insurance industries, for example, in the development of seasonal forecasting models with a long lead time. This study uses exceedances of the 90th, 95th, and 99th percentiles of daily maximum wind speed over the period 1958 to present to derive predictands of winter wind extremes. The source data are the 6-hourly NCEP Reanalysis gridded surface wind fields. Predictor variables include principal components of Atlantic sea surface temperature and several indices of climate variability, including the NAO and SOI. Lead times of up to a year are considered, in monthly increments. Three regression techniques are evaluated: multiple linear regression (MLR), principal component regression (PCR), and partial least squares regression (PLS). PCR and PLS proved considerably superior to MLR, with much lower standard errors. PLS was chosen to formulate the predictive model since it offers more flexibility in experimental design and gave slightly better results than PCR. The results indicate that winter windiness can be predicted with considerable skill one year ahead for much of coastal Europe, but that this deteriorates rapidly in the hinterland. The experiment succeeded in highlighting PLS as a very useful method for developing more precise forecasting models, and in identifying areas of high predictability.
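
    The prediction step can be sketched with scikit-learn's PLSRegression as below; the predictor matrix (standing in for SST principal components and climate indices) and the wind-extreme predictand are synthetic placeholders, and the number of PLS components is an assumption.

    import numpy as np
    from sklearn.cross_decomposition import PLSRegression
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(2)
    X = rng.standard_normal((45, 12))        # 45 winters x 12 lagged predictors (placeholder)
    y = X[:, :3] @ np.array([1.0, -0.5, 0.3]) + 0.5 * rng.standard_normal(45)   # toy exceedance predictand

    pls = PLSRegression(n_components=3)
    print("cross-validated R^2:", cross_val_score(pls, X, y, cv=5).mean())

    pls.fit(X, y)
    forecast = pls.predict(X[-1:])           # prediction for the most recent set of predictors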

  15. Tool Wear Prediction in Ti-6Al-4V Machining through Multiple Sensor Monitoring and PCA Features Pattern Recognition.

    PubMed

    Caggiano, Alessandra

    2018-03-09

    Machining of titanium alloys is characterised by extremely rapid tool wear due to the high cutting temperature and the strong adhesion at the tool-chip and tool-workpiece interface, caused by the low thermal conductivity and high chemical reactivity of Ti alloys. With the aim of monitoring the tool conditions during dry turning of Ti-6Al-4V alloy, a machine learning procedure based on the acquisition and processing of cutting force, acoustic emission and vibration sensor signals during turning is implemented. A number of sensorial features are extracted from the acquired sensor signals in order to feed machine learning paradigms based on artificial neural networks. To reduce the large dimensionality of the sensorial features, an advanced feature extraction methodology based on Principal Component Analysis (PCA) is proposed. PCA made it possible to identify a smaller number of features (k = 2), the principal component scores, obtained through linear projection of the original d features into a new space of reduced dimensionality k = 2, sufficient to describe the variance of the data. By feeding artificial neural networks with the PCA features, an accurate diagnosis of tool flank wear (VBmax) was achieved, with predicted values very close to the measured tool wear values.
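
    The reduction-plus-regression step can be sketched as below; the sensorial feature matrix and the flank wear values are synthetic stand-ins, and the small network size is an assumption.

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.neural_network import MLPRegressor
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(3)
    X = rng.standard_normal((120, 15))                # 120 cuts x 15 sensorial features (placeholder)
    vb_max = 0.05 + 0.02 * X[:, 0] - 0.01 * X[:, 1] + 0.005 * rng.standard_normal(120)   # toy wear values

    # Project the d = 15 features onto k = 2 principal component scores,
    # then feed them to a small neural network that estimates VBmax.
    model = make_pipeline(StandardScaler(), PCA(n_components=2),
                          MLPRegressor(hidden_layer_sizes=(10,), max_iter=5000, random_state=0))
    print("cross-validated R^2:", cross_val_score(model, X, vb_max, cv=5).mean())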

  16. Tool Wear Prediction in Ti-6Al-4V Machining through Multiple Sensor Monitoring and PCA Features Pattern Recognition

    PubMed Central

    2018-01-01

    Machining of titanium alloys is characterised by extremely rapid tool wear due to the high cutting temperature and the strong adhesion at the tool-chip and tool-workpiece interface, caused by the low thermal conductivity and high chemical reactivity of Ti alloys. With the aim of monitoring the tool conditions during dry turning of Ti-6Al-4V alloy, a machine learning procedure based on the acquisition and processing of cutting force, acoustic emission and vibration sensor signals during turning is implemented. A number of sensorial features are extracted from the acquired sensor signals in order to feed machine learning paradigms based on artificial neural networks. To reduce the large dimensionality of the sensorial features, an advanced feature extraction methodology based on Principal Component Analysis (PCA) is proposed. PCA made it possible to identify a smaller number of features (k = 2), the principal component scores, obtained through linear projection of the original d features into a new space of reduced dimensionality k = 2, sufficient to describe the variance of the data. By feeding artificial neural networks with the PCA features, an accurate diagnosis of tool flank wear (VBmax) was achieved, with predicted values very close to the measured tool wear values. PMID:29522443

  17. Identification of fungal phytopathogens using Fourier transform infrared-attenuated total reflection spectroscopy and advanced statistical methods

    NASA Astrophysics Data System (ADS)

    Salman, Ahmad; Lapidot, Itshak; Pomerantz, Ami; Tsror, Leah; Shufan, Elad; Moreh, Raymond; Mordechai, Shaul; Huleihel, Mahmoud

    2012-01-01

    The early diagnosis of phytopathogens is of great importance; it could prevent large economic losses due to crops damaged by fungal diseases, and prevent unnecessary soil fumigation or the use of fungicides and bactericides, and thus prevent considerable environmental pollution. In this study, 18 isolates of three different fungal genera were investigated: six isolates of Colletotrichum coccodes, six isolates of Verticillium dahliae and six isolates of Fusarium oxysporum. Our main goal was to differentiate these fungi samples on the level of isolates, based on their infrared absorption spectra obtained using the Fourier transform infrared-attenuated total reflection (FTIR-ATR) sampling technique. Advanced statistical and mathematical methods, namely principal component analysis (PCA), linear discriminant analysis (LDA), and k-means, were applied to the spectra after preprocessing. Our results showed significant spectral differences between the various fungal genera examined. The use of k-means enabled classification between the genera with a 94.5% accuracy, whereas the use of PCA [3 principal components (PCs)] and LDA achieved a 99.7% success rate. However, on the level of isolates, the best differentiation results were obtained using PCA (9 PCs) and LDA for the lower wavenumber region (800-1775 cm^-1), with identification success rates of 87%, 85.5%, and 94.5% for Colletotrichum, Fusarium, and Verticillium strains, respectively.
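
    A minimal PCA + LDA pipeline of the kind described above might look as follows; the "spectra" are synthetic stand-ins for the FTIR-ATR data, and the number of retained PCs matches only the genus-level setting reported here.

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.pipeline import make_pipeline
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(4)
    n_per_class, n_wavenumbers = 30, 500
    spectra, labels = [], []
    for genus in range(3):                               # three fungal genera
        base = rng.standard_normal(n_wavenumbers)        # genus-specific "mean spectrum"
        spectra.append(base + 0.3 * rng.standard_normal((n_per_class, n_wavenumbers)))
        labels += [genus] * n_per_class
    X, y = np.vstack(spectra), np.array(labels)

    clf = make_pipeline(PCA(n_components=3), LinearDiscriminantAnalysis())
    print("genus-level CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())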

  18. How multi segmental patterns deviate in spastic diplegia from typical developed.

    PubMed

    Zago, Matteo; Sforza, Chiarella; Bona, Alessia; Cimolin, Veronica; Costici, Pier Francesco; Condoluci, Claudia; Galli, Manuela

    2017-10-01

    The relationship between gait features and coordination in children with Cerebral Palsy has not yet been sufficiently analyzed. Principal Component Analysis can help in understanding motion patterns by decomposing movement into its fundamental components (Principal Movements). This study aims at quantitatively characterizing the functional connections between multi-joint gait patterns in Cerebral Palsy. 65 children with spastic diplegia aged 10.6 (SD 3.7) years participated in standardized gait analysis trials; 31 typically developing adolescents aged 13.6 (4.4) years were also tested. To determine whether posture affects gait patterns, patients were split into a Crouch and a knee Hyperextension group according to the knee flexion angle in standing. 3D coordinates of hips, knees, ankles, metatarsal joints, pelvis and shoulders were submitted to Principal Component Analysis. Four Principal Movements accounted for 99% of global variance; components 1-3 explained major sagittal patterns, components 4-5 referred to movements in the frontal plane and component 6 to additional movement refinements. Dimensionality was higher in patients than in controls (p<0.01), and the Crouch group significantly differed from controls in the application of components 1 and 4-6 (p<0.05), while the knee Hyperextension group differed in components 1-2 and 5 (p<0.05). Compensatory strategies of children with Cerebral Palsy (interactions between main and secondary movement patterns) were objectively determined. Principal Movements can reduce the effort in interpreting gait reports, providing an immediate and quantitative picture of the connections between movement components. Copyright © 2017 Elsevier Ltd. All rights reserved.

  19. Tailored multivariate analysis for modulated enhanced diffraction

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Caliandro, Rocco; Guccione, Pietro; Nico, Giovanni

    2015-10-21

    Modulated enhanced diffraction (MED) is a technique allowing the dynamic structural characterization of crystalline materials subjected to an external stimulus, which is particularly suited for in situ and operando structural investigations at synchrotron sources. Contributions from the (active) part of the crystal system that varies synchronously with the stimulus can be extracted by an offline analysis, which can only be applied in the case of periodic stimuli and linear system responses. In this paper a new decomposition approach based on multivariate analysis is proposed. The standard principal component analysis (PCA) is adapted to treat MED data: specific figures of merit based on their scores and loadings are found, and the directions of the principal components obtained by PCA are modified to maximize such figures of merit. As a result, a general method to decompose MED data, called optimum constrained components rotation (OCCR), is developed, which produces very precise results on simulated data, even in the case of nonperiodic stimuli and/or nonlinear responses. The multivariate analysis approach is able to supply in one shot both the diffraction pattern related to the active atoms (through the OCCR loadings) and the time dependence of the system response (through the OCCR scores). When applied to real data, OCCR was able to supply only the latter information, as the former was hindered by changes in abundances of different crystal phases, which occurred besides structural variations in the specific case considered. To develop a decomposition procedure able to cope with this combined effect represents the next challenge in MED analysis.

  20. A reduction in ag/residential signature conflict using principal components analysis of LANDSAT temporal data

    NASA Technical Reports Server (NTRS)

    Williams, D. L.; Borden, F. Y.

    1977-01-01

    Methods to accurately delineate the types of land cover in the urban-rural transition zone of metropolitan areas were considered. The application of principal components analysis to multidate LANDSAT imagery was investigated as a means of reducing the overlap between residential and agricultural spectral signatures. The statistical concepts of principal components analysis were discussed, as well as the results of this analysis when applied to multidate LANDSAT imagery of the Washington, D.C. metropolitan area.

  1. Constrained Principal Component Analysis: Various Applications.

    ERIC Educational Resources Information Center

    Hunter, Michael; Takane, Yoshio

    2002-01-01

    Provides example applications of constrained principal component analysis (CPCA) that illustrate the method on a variety of contexts common to psychological research. Two new analyses, decompositions into finer components and fitting higher order structures, are presented, followed by an illustration of CPCA on contingency tables and the CPCA of…

  2. Planar Poincare chart - A planar graphic representation of the state of light polarization

    NASA Technical Reports Server (NTRS)

    Tedjojuwono, Ken K.; Hunter, William W., Jr.; Ocheltree, Stewart L.

    1989-01-01

    The planar Poincare chart, which represents the complete planar equivalence of the Poincare sphere, is proposed. The four sets of basic lines are drawn on two separate charts for the generalization and convenience of reading the scale. The chart indicates the rotation of the principal axes of linear birefringent material. The relationships between parameters of the two charts are given as 2xi-2phi (orientation angle of the major axis-ellipticity angle) pair and 2alpha-delta (angle of amplitude ratio-phase difference angle) pair. The results are useful for designing and analyzing polarization properties of optical components with birefringent properties.

  3. Construction of mathematical model for measuring material concentration by colorimetric method

    NASA Astrophysics Data System (ADS)

    Liu, Bing; Gao, Lingceng; Yu, Kairong; Tan, Xianghua

    2018-06-01

    This paper uses multiple linear regression to analyze the data of Problem C of the 2017 mathematical modeling contest. First, we established regression models for the concentrations of 5 substances, but only the regression model for the concentration of urea in milk passed the significance test. The regression model established from the second set of data passed the significance test, but it suffered from serious multicollinearity. We therefore improved the model by principal component analysis. The improved model is used to control the system so that material concentrations can be measured by a direct colorimetric method.
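
    The principal-component fix for multicollinearity can be sketched as follows; the absorbance matrix and concentration vector are synthetic stand-ins, and the number of retained components is an assumption.

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LinearRegression
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(5)
    base = rng.standard_normal((40, 2))                          # two underlying factors
    A = np.hstack([base, base @ rng.standard_normal((2, 4))])    # 6 strongly collinear "channels"
    c = base @ np.array([1.0, -2.0]) + 0.05 * rng.standard_normal(40)   # toy concentration

    # Principal component regression: regress on a few PCs instead of the
    # ill-conditioned original predictors, which removes the multicollinearity.
    pcr = make_pipeline(StandardScaler(), PCA(n_components=2), LinearRegression())
    pcr.fit(A, c)
    print("R^2 on training data:", pcr.score(A, c))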

  4. The pre-image problem in kernel methods.

    PubMed

    Kwok, James Tin-yau; Tsang, Ivor Wai-hung

    2004-11-01

    In this paper, we address the problem of finding the pre-image of a feature vector in the feature space induced by a kernel. This is of central importance in some kernel applications, such as on using kernel principal component analysis (PCA) for image denoising. Unlike the traditional method which relies on nonlinear optimization, our proposed method directly finds the location of the pre-image based on distance constraints in the feature space. It is noniterative, involves only linear algebra and does not suffer from numerical instability or local minimum problems. Evaluations on performing kernel PCA and kernel clustering on the USPS data set show much improved performance.
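
    For context, the snippet below shows where a pre-image is needed when denoising with kernel PCA. It uses scikit-learn's KernelPCA, whose inverse_transform learns an approximate pre-image map by kernel ridge regression; that is a different approximation from the distance-constraint method of this paper, and the toy data are arbitrary.

    import numpy as np
    from sklearn.decomposition import KernelPCA

    rng = np.random.default_rng(6)
    clean = np.repeat(rng.standard_normal((50, 1)), 16, axis=1)   # toy 16-dimensional "images"
    noisy = clean + 0.3 * rng.standard_normal(clean.shape)

    # Project onto a few kernel principal components, then map back to input
    # space: the mapping back is the pre-image problem.
    kpca = KernelPCA(n_components=4, kernel="rbf", gamma=0.1,
                     fit_inverse_transform=True, alpha=0.1)
    codes = kpca.fit_transform(noisy)
    denoised = kpca.inverse_transform(codes)     # approximate pre-images of the projected points

    print("mean abs. error before:", np.abs(noisy - clean).mean(),
          "after:", np.abs(denoised - clean).mean())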

  5. Sample-space-based feature extraction and class preserving projection for gene expression data.

    PubMed

    Wang, Wenjun

    2013-01-01

    In order to overcome the problems of high computational complexity and serious matrix singularity for feature extraction using Principal Component Analysis (PCA) and Fisher's Linear Discriminant Analysis (LDA) in high-dimensional data, sample-space-based feature extraction is presented, which transforms the computation procedure of feature extraction from gene space to sample space by representing the optimal transformation vector as a weighted sum of samples. The technique is used in the implementation of PCA, LDA, and Class Preserving Projection (CPP), a newly proposed method for discriminant feature extraction, and the experimental results on gene expression data demonstrate the effectiveness of the method.

  6. Discriminant analysis of resting-state functional connectivity patterns on the Grassmann manifold

    NASA Astrophysics Data System (ADS)

    Fan, Yong; Liu, Yong; Jiang, Tianzi; Liu, Zhening; Hao, Yihui; Liu, Haihong

    2010-03-01

    The functional networks, extracted from fMRI images using independent component analysis, have been demonstrated to be informative for distinguishing brain states of cognitive functions and neurological diseases. In this paper, we propose a novel algorithm for discriminant analysis of functional networks encoded by spatial independent components. The functional networks of each individual are used as bases for a linear subspace, referred to as a functional connectivity pattern, which facilitates a comprehensive characterization of temporal signals of fMRI data. The functional connectivity patterns of different individuals are analyzed on the Grassmann manifold by adopting a principal angle based subspace distance. In conjunction with a support vector machine classifier, a forward component selection technique is proposed to select independent components for constructing the most discriminative functional connectivity pattern. The discriminant analysis method has been applied to an fMRI-based schizophrenia study with 31 schizophrenia patients and 31 healthy individuals. The experimental results demonstrate that the proposed method not only achieves a promising classification performance for distinguishing schizophrenia patients from healthy controls, but also identifies discriminative functional networks that are informative for schizophrenia diagnosis.
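
    The subspace comparison underlying this approach can be sketched as follows: principal angles between two functional-connectivity subspaces, each spanned by a few independent-component maps (random placeholders here), computed from the SVD of the product of their orthonormal bases. Aggregating the angles into a single number is one common Grassmann distance, not necessarily the exact one used in the paper.

    import numpy as np

    rng = np.random.default_rng(7)
    n_voxels, n_components = 1000, 5
    A = rng.standard_normal((n_voxels, n_components))   # subject 1: spatial ICs as columns
    B = rng.standard_normal((n_voxels, n_components))   # subject 2

    # Orthonormal bases of the two subspaces
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)

    # Singular values of Qa^T Qb are the cosines of the principal angles
    cosines = np.clip(np.linalg.svd(Qa.T @ Qb, compute_uv=False), -1.0, 1.0)
    angles = np.arccos(cosines)
    distance = np.linalg.norm(angles)                    # one common subspace distance

    print("principal angles (rad):", np.round(angles, 3), "distance:", round(distance, 3))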

  7. The elliptical Gaussian wave transformation due to diffraction by an elliptical hologram

    NASA Astrophysics Data System (ADS)

    Janicijevic, L.

    1985-03-01

    Realized as an interferogram of a spherical and a cylindrical wave, the elliptical hologram is treated as a plane diffracting grating which produces Fresnel diffraction of a simple astigmatic Gaussian incident wave. It is shown that if the principal axes of the incident beam coincide with the principal axes of the hologram, the diffracted wave field is composed of three different astigmatic Gaussian waves, with their waists situated in parallel but distinct planes. The diffraction pattern, observed on a transverse screen, is the result of the interference of the three diffracted wave components. It consists of three systems of overlapped second-order curves, whose shape depends on the distance of the observation screen from the hologram, as well as on the parameters of the incident wave beam and the hologram. The results are specialized for gratings in the form of circular and linear holograms and for the case of a stigmatic Gaussian incident wave, as well as for the normal plane-wave incidence on the three mentioned types of hologram.

  8. School Climate, Principal Support and Collaboration among Portuguese Teachers

    ERIC Educational Resources Information Center

    Castro Silva, José; Amante, Lúcia; Morgado, José

    2017-01-01

    This article analyses the relationship between school principal support and teacher collaboration among Portuguese teachers. Data were collected from a random sample of 234 teachers in middle and secondary schools. The use of a combined approach using linear and multiple regression tests concluded that the school principal support, through the…

  9. A measure for objects clustering in principal component analysis biplot: A case study in inter-city buses maintenance cost data

    NASA Astrophysics Data System (ADS)

    Ginanjar, Irlandia; Pasaribu, Udjianna S.; Indratno, Sapto W.

    2017-03-01

    This article presents the application of the principal component analysis (PCA) biplot for the needs of data mining. It aims to simplify and objectify the clustering of objects in a PCA biplot; the novelty of the paper is a measure that can be used to make this clustering objective. The orthonormal eigenvectors are the coefficients of the principal component model and represent an association between the principal components and the initial variables. This association is a valid ground for clustering objects based on their principal-axis values: if m principal axes are used in the PCA, the objects can be classified into 2^m clusters. The inter-city buses are clustered based on maintenance cost data using a two-principal-axis PCA biplot, so the buses fall into four groups. The first group is the buses with high maintenance costs, especially for lube and brake canvass. The second group is the buses with high maintenance costs, especially for tire and filter. The third group is the buses with low maintenance costs, especially for lube and brake canvass. The fourth group is the buses with low maintenance costs, especially for tire and filter.
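
    The 2^m clustering rule can be sketched as follows, assuming each object is assigned to a cluster by the signs of its scores on the m retained principal axes; the maintenance-cost data are synthetic placeholders.

    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(8)
    costs = rng.lognormal(size=(40, 6))            # 40 buses x 6 maintenance-cost items (placeholder)

    m = 2
    scores = PCA(n_components=m).fit_transform((costs - costs.mean(0)) / costs.std(0))

    # Sign pattern of the m principal-axis scores encodes one of 2**m clusters
    cluster_id = ((scores > 0) * 2 ** np.arange(m)).sum(axis=1)
    print("cluster sizes:", np.bincount(cluster_id, minlength=2 ** m))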

  10. Simultaneous analysis of 11 main active components in Cirsium setosum based on HPLC-ESI-MS/MS and combined with statistical methods.

    PubMed

    Sun, Qian; Chang, Lu; Ren, Yanping; Cao, Liang; Sun, Yingguang; Du, Yingfeng; Shi, Xiaowei; Wang, Qiao; Zhang, Lantong

    2012-11-01

    A novel method based on high-performance liquid chromatography coupled with electrospray ionization tandem mass spectrometry was developed for simultaneous determination of the 11 major active components including ten flavonoids and one phenolic acid in Cirsium setosum. Separation was performed on a reversed-phase C(18) column with gradient elution of methanol and 0.1‰ acetic acid (v/v). The identification and quantification of the analytes were achieved on a hybrid quadrupole linear ion trap mass spectrometer. Multiple-reaction monitoring scanning was employed for quantification with switching electrospray ion source polarity between positive and negative modes in a single run. Full validation of the assay was carried out including linearity, precision, accuracy, stability, limits of detection and quantification. The results demonstrated that the method developed was reliable, rapid, and specific. The 25 batches of C. setosum samples from different sources were first determined using the developed method and the total contents of 11 analytes ranged from 1717.460 to 23028.258 μg/g. Among them, the content of linarin was highest, and its mean value was 7340.967 μg/g. Principal component analysis and hierarchical clustering analysis were performed to differentiate and classify the samples, which is helpful for comprehensive evaluation of the quality of C. setosum. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  11. Survey to Identify Substandard and Falsified Tablets in Several Asian Countries with Pharmacopeial Quality Control Tests and Principal Component Analysis of Handheld Raman Spectroscopy.

    PubMed

    Kakio, Tomoko; Nagase, Hitomi; Takaoka, Takashi; Yoshida, Naoko; Hirakawa, Junichi; Macha, Susan; Hiroshima, Takashi; Ikeda, Yukihiro; Tsuboi, Hirohito; Kimura, Kazuko

    2018-06-01

    The World Health Organization has warned that substandard and falsified medical products (SFs) can harm patients and fail to treat the diseases for which they were intended, and they affect every region of the world, leading to loss of confidence in medicines, health-care providers, and health systems. Therefore, the development of analytical procedures to detect SFs is extremely important. In this study, we investigated the quality of pharmaceutical tablets containing the antihypertensive candesartan cilexetil, collected in China, Indonesia, Japan, and Myanmar, using the Japanese pharmacopeial analytical procedures for quality control, together with principal component analysis (PCA) of Raman spectra obtained with a handheld Raman spectrometer. Some samples showed delayed dissolution and failed to meet the pharmacopeial specification, whereas others failed the assay test. These products appeared to be substandard. Principal component analysis showed that all Raman spectra could be explained in terms of two components: the amount of the active pharmaceutical ingredient and the kinds of excipients. The principal component analysis score plot indicated that one substandard product and the falsified tablets have similar principal components in their Raman spectra, in contrast to authentic products. The locations of samples within the PCA score plot varied according to the source country, suggesting that manufacturers in different countries use different excipients. Our results indicate that the handheld Raman device will be useful for detection of SFs in the field. Principal component analysis of the Raman data clarifies the difference in chemical properties between good quality products and SFs that circulate in the Asian market.

  12. Principal component analysis and the locus of the Fréchet mean in the space of phylogenetic trees.

    PubMed

    Nye, Tom M W; Tang, Xiaoxian; Weyenberg, Grady; Yoshida, Ruriko

    2017-12-01

    Evolutionary relationships are represented by phylogenetic trees, and a phylogenetic analysis of gene sequences typically produces a collection of these trees, one for each gene in the analysis. Analysis of samples of trees is difficult due to the multi-dimensionality of the space of possible trees. In Euclidean spaces, principal component analysis is a popular method of reducing high-dimensional data to a low-dimensional representation that preserves much of the sample's structure. However, the space of all phylogenetic trees on a fixed set of species does not form a Euclidean vector space, and methods adapted to tree space are needed. Previous work introduced the notion of a principal geodesic in this space, analogous to the first principal component. Here we propose a geometric object for tree space similar to the kth principal component in Euclidean space: the locus of the weighted Fréchet mean of k+1 vertex trees when the weights vary over the k-simplex. We establish some basic properties of these objects, in particular showing that they have dimension k, and propose algorithms for projection onto these surfaces and for finding the principal locus associated with a sample of trees. Simulation studies demonstrate that these algorithms perform well, and analyses of two datasets, containing Apicomplexa and African coelacanth genomes respectively, reveal important structure from the second principal components.

  13. Dimension Reduction With Extreme Learning Machine.

    PubMed

    Kasun, Liyanaarachchi Lekamalage Chamara; Yang, Yan; Huang, Guang-Bin; Zhang, Zhengyou

    2016-08-01

    Data may often contain noise or irrelevant information, which negatively affect the generalization capability of machine learning algorithms. The objective of dimension reduction algorithms, such as principal component analysis (PCA), non-negative matrix factorization (NMF), random projection (RP), and auto-encoder (AE), is to reduce the noise or irrelevant information of the data. The features of PCA (eigenvectors) and linear AE are not able to represent data as parts (e.g., the nose in a face image). On the other hand, NMF and non-linear AE are hampered by slow learning speed, and RP only represents a subspace of the original data. This paper introduces a dimension reduction framework which to some extent represents data as parts, has fast learning speed, and learns the between-class scatter subspace. To this end, this paper investigates a linear and non-linear dimension reduction framework referred to as extreme learning machine AE (ELM-AE) and sparse ELM-AE (SELM-AE). In contrast to tied weight AE, the hidden neurons in ELM-AE and SELM-AE need not be tuned, and their parameters (e.g., input weights in additive neurons) are initialized using orthogonal and sparse random weights, respectively. Experimental results on the USPS handwritten digit recognition data set, CIFAR-10 object recognition, and the NORB object recognition data set show the efficacy of linear and non-linear ELM-AE and SELM-AE in terms of discriminative capability, sparsity, training time, and normalized mean square error.

  14. Independent EEG Sources Are Dipolar

    PubMed Central

    Delorme, Arnaud; Palmer, Jason; Onton, Julie; Oostenveld, Robert; Makeig, Scott

    2012-01-01

    Independent component analysis (ICA) and blind source separation (BSS) methods are increasingly used to separate individual brain and non-brain source signals mixed by volume conduction in electroencephalographic (EEG) and other electrophysiological recordings. We compared results of decomposing thirteen 71-channel human scalp EEG datasets by 22 ICA and BSS algorithms, assessing the pairwise mutual information (PMI) in scalp channel pairs, the remaining PMI in component pairs, the overall mutual information reduction (MIR) effected by each decomposition, and decomposition ‘dipolarity’ defined as the number of component scalp maps matching the projection of a single equivalent dipole with less than a given residual variance. The least well-performing algorithm was principal component analysis (PCA); best performing were AMICA and other likelihood/mutual information based ICA methods. Though these and other commonly-used decomposition methods returned many similar components, across 18 ICA/BSS algorithms mean dipolarity varied linearly with both MIR and with PMI remaining between the resulting component time courses, a result compatible with an interpretation of many maximally independent EEG components as being volume-conducted projections of partially-synchronous local cortical field activity within single compact cortical domains. To encourage further method comparisons, the data and software used to prepare the results have been made available (http://sccn.ucsd.edu/wiki/BSSComparison). PMID:22355308

  15. Recognition of units in coarse, unconsolidated braided-stream deposits from geophysical log data with principal components analysis

    USGS Publications Warehouse

    Morin, R.H.

    1997-01-01

    Returns from drilling in unconsolidated cobble and sand aquifers commonly do not identify lithologic changes that may be meaningful for hydrogeologic investigations. Vertical resolution of saturated, Quaternary, coarse braided-stream deposits is significantly improved by interpreting natural gamma (G), epithermal neutron (N), and electromagnetically induced resistivity (IR) logs obtained from wells at the Capital Station site in Boise, Idaho. Interpretation of these geophysical logs is simplified because these sediments are derived largely from high-gamma-producing source rocks (granitics of the Boise River drainage), contain few clays, and have undergone little diagenesis. Analysis of G, N, and IR data from these deposits with principal components analysis provides an objective means to determine if units can be recognized within the braided-stream deposits. In particular, performing principal components analysis on G, N, and IR data from eight wells at Capital Station (1) allows the variable system dimensionality to be reduced from three to two by selecting the two eigenvectors with the greatest variance as axes for principal component scatterplots, (2) generates principal components with interpretable physical meanings, (3) distinguishes sand from cobble-dominated units, and (4) provides a means to distinguish between cobble-dominated units.

  16. [Complexity and its integrative effects of the time lags of environment factors affecting Larix gmelinii stem sap flow].

    PubMed

    Wang, Hui-Mei; Sun, Wei; Zu, Yuan-Gang; Wang, Wen-Jie

    2011-12-01

    Based on one year (2005) of observations at half-hour intervals of the stem sap flow of Larix gmelinii plantation trees planted in 1969 and the related environmental factors air humidity (RH), air temperature (T(air)), photosynthetically active radiation (PAR), soil temperature (T(soil)), and soil moisture (TDR), principal component analysis (PCA) and correlation analysis were applied to the time lag effect of the sap flow in different seasons (26 days per season) and over the whole year via dislocation (lag-shift) analysis, to explore the complexity and the integrative effects of the time lags of the environmental factors affecting stem sap flow. The results showed that the time lag effect varied markedly across seasons and environmental factors. In general, the time lag of PAR was 0.5-1 hour ahead of the sap flow, that of T(air) and RH was 0-2 hours ahead of or behind the sap flow, and the time lags of T(soil) and TDR were much longer or sometimes undetectable. Because of the complexity of the time lags, no evident improvements were observed in the linear correlations (R^2, slope, and intercept) when the time lags based on short-term (20 days) data were used to correct the time lags based on whole-year data. However, obvious improvements were found in the standardized and non-standardized correlation coefficients in stepwise multiple regressions, i.e., the time lag corrections could improve the effects of RH, but decreased the effects of PAR, T(air), and T(soil). PCA could be used to simplify this complexity. The first and the second principal components could represent over 75% of the information of all the environmental factors in different seasons and in the whole year. The time lags of both the first and the second principal components were 1-1.5 hours in advance of the sap flow, except in winter (no time lag effect).

  17. Biomarkers of furan exposure by metabolic profiling of rat urine with liquid chromatography-tandem mass spectrometry and principal component analysis.

    PubMed

    Kellert, Marco; Wagner, Silvia; Lutz, Ursula; Lutz, Werner K

    2008-03-01

    Furan has been found in a number of heated food items and is carcinogenic in the liver of rats and mice. Estimates of human exposure on the basis of concentrations measured in food are not reliable because of the volatility of furan. A biomarker approach is therefore indicated. We searched for metabolites excreted in the urine of male Fischer 344 rats treated by oral gavage with 40 mg of furan per kg of body weight. A control group received the vehicle oil only. Urine collected over two 24-h periods both before and after treatment was analyzed by a column-switching LC-MS/MS method. Data were acquired by a full scan survey scan in combination with information dependent acquisition of fragmentation spectra by the use of a linear ion trap. Areas of 449 peaks were extracted from the chromatograms and used for principal component analysis (PCA). The first principal component fully separated the samples of treated rats from the controls in the first post-treatment sampling period. Thirteen potential biomarkers selected from the corresponding loadings plot were reanalyzed using specific transitions in the MRM mode. Seven peaks that increased significantly upon treatment were further investigated as biomarkers of exposure. MS/MS information indicated conjugation with glutathione on the basis of the characteristic neutral loss of 129 for mercapturates. Adducts with the side chain amino group of lysine were characterized by a neutral loss of 171 for N-acetyl-L-lysine. Analysis of products of in vitro incubations of the reactive furan metabolite cis-2-butene-1,4-dial with the respective amino acid derivatives supported five structures, including a new 3-methylthio-pyrrole metabolite probably formed by beta-lyase reaction on a glutathione conjugate, followed by methylation of the thiol group. Our results demonstrate the potential of comprehensive mass spectrometric analysis of urine combined with multivariate analyses for metabolic profiling in search of biomarkers of exposure.

  18. Microglia Morphological Categorization in a Rat Model of Neuroinflammation by Hierarchical Cluster and Principal Components Analysis.

    PubMed

    Fernández-Arjona, María Del Mar; Grondona, Jesús M; Granados-Durán, Pablo; Fernández-Llebrez, Pedro; López-Ávalos, María D

    2017-01-01

    It is known that microglia morphology and function are closely related, but only a few studies have objectively described different morphological subtypes. To address this issue, morphological parameters of microglial cells were analyzed in a rat model of aseptic neuroinflammation. After the injection of a single dose of the enzyme neuraminidase (NA) within the lateral ventricle (LV), an acute inflammatory process occurs. Sections from NA-injected animals and sham controls were immunolabeled with the microglial marker IBA1, which highlights ramifications and features of the cell shape. Using images obtained by section scanning, individual microglial cells were sampled from various regions (septofimbrial nucleus, hippocampus and hypothalamus) at different times post-injection (2, 4 and 12 h). Each cell yielded a set of 15 morphological parameters by means of image analysis software. Five initial parameters (including fractal measures) were statistically different in cells from NA-injected rats (most of them IL-1β positive, i.e., M1-state) compared to those from control animals (none of them IL-1β positive, i.e., surveillant state). However, additional multimodal parameters were revealed to be more suitable for hierarchical cluster analysis (HCA). This method pointed to a classification of the microglia population into four clusters. Furthermore, a linear discriminant analysis (LDA) suggested three specific parameters to objectively classify any microglia by a decision tree. In addition, a principal components analysis (PCA) revealed two extra valuable variables that allowed microglia to be further classified into a total of eight sub-clusters or types. The spatio-temporal distribution of these different morphotypes in our rat inflammation model made it possible to relate specific morphotypes with microglial activation status and brain location. An objective method for microglia classification based on morphological parameters is proposed. Main points: Microglia undergo a quantifiable morphological change upon neuraminidase-induced inflammation. Hierarchical cluster and principal components analysis allow morphological classification of microglia. Brain location of microglia is a relevant factor.

  19. Chemical and microstructural characterizations of plasma polymer films by time-of-flight secondary ion mass spectrometry and principal component analysis

    NASA Astrophysics Data System (ADS)

    Cossement, Damien; Renaux, Fabian; Thiry, Damien; Ligot, Sylvie; Francq, Rémy; Snyders, Rony

    2015-11-01

    It is accepted that the macroscopic properties of functional plasma polymer films (PPF) are defined by their functional density and their crosslinking degree (χ), two quantities that most of the time follow opposite trends. While the PPF chemistry is relatively easy to evaluate, χ is much more challenging to quantify. This paper reviews the recent work developed in our group on the application of principal component analysis (PCA) to time-of-flight secondary ion mass spectrometric (ToF-SIMS) positive spectra in order to extract the relative cross-linking degree (χ) of PPF. NH2-, COOR- and SH-containing PPF synthesized in our group by plasma enhanced chemical vapor deposition (PECVD), varying the applied radiofrequency power (PRF), have been used as model surfaces. For the three plasma polymer families, the scores of the first computed principal component (PC1) highlighted significant differences in the chemical composition, supported by X-ray photoelectron spectroscopy (XPS) data. The most important fragments contributing to PC1 (loadings > 90%) were used to compute an average C/H ratio index for samples synthesized at low and high PRF. Since this ratio is an evaluation of χ, these data, in agreement with the literature, indicate an increase of χ with PRF, except for the SH-PPF. These results have been cross-checked by the evaluation of the functional properties of the plasma polymers, namely a linear correlation with the stability of NH2-PPF in ethanol and a correlation with the mechanical properties of the COOR-PPF. For the SH-PPF family, the peculiar evolution of χ is supported by the understanding of the growth mechanism of the PPF from plasma diagnostics. The whole set of data clearly demonstrates the potential of the PCA method for extracting information on the microstructure of plasma polymers from ToF-SIMS measurements.

  20. Microglia Morphological Categorization in a Rat Model of Neuroinflammation by Hierarchical Cluster and Principal Components Analysis

    PubMed Central

    Fernández-Arjona, María del Mar; Grondona, Jesús M.; Granados-Durán, Pablo; Fernández-Llebrez, Pedro; López-Ávalos, María D.

    2017-01-01

    It is known that microglia morphology and function are closely related, but only a few studies have objectively described distinct morphological subtypes. To address this issue, morphological parameters of microglial cells were analyzed in a rat model of aseptic neuroinflammation. The injection of a single dose of the enzyme neuraminidase (NA) into the lateral ventricle (LV) triggers an acute inflammatory process. Sections from NA-injected animals and sham controls were immunolabeled with the microglial marker IBA1, which highlights ramifications and features of the cell shape. Using images obtained by section scanning, individual microglial cells were sampled from various regions (septofimbrial nucleus, hippocampus and hypothalamus) at different times post-injection (2, 4 and 12 h). Each cell yielded a set of 15 morphological parameters obtained with image analysis software. Five initial parameters (including fractal measures) were statistically different in cells from NA-injected rats (most of them IL-1β positive, i.e., M1-state) compared to those from control animals (none of them IL-1β positive, i.e., surveillant state). However, additional multimodal parameters proved more suitable for hierarchical cluster analysis (HCA). This method classified the microglia population into four clusters. Furthermore, a linear discriminant analysis (LDA) suggested three specific parameters that objectively classify any microglia by a decision tree. In addition, a principal components analysis (PCA) revealed two additional valuable variables that allowed microglia to be further classified into a total of eight sub-clusters or types. The spatio-temporal distribution of these morphotypes in our rat inflammation model allowed specific morphotypes to be related to microglial activation status and brain location. An objective method for microglia classification based on morphological parameters is proposed. Main points: (i) microglia undergo a quantifiable morphological change upon neuraminidase-induced inflammation; (ii) hierarchical cluster and principal components analysis allow morphological classification of microglia; (iii) the brain location of microglia is a relevant factor. PMID:28848398
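
    To make the analysis pipeline concrete, the sketch below illustrates the general HCA-plus-PCA pattern described above on synthetic data; the 15-parameter matrix, the standardization step, the cluster count and the library choices are illustrative assumptions, not the authors' implementation.

```python
# Sketch of a hierarchical-cluster + PCA classification of cell morphologies,
# loosely following the workflow described in the abstract. The data matrix,
# parameter count and cluster number here are illustrative assumptions.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import AgglomerativeClustering

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 15))          # 300 cells x 15 morphological parameters (synthetic)

Xz = StandardScaler().fit_transform(X)  # parameters on very different scales -> standardize

# Hierarchical (Ward) clustering into four groups, as in the abstract's HCA step
clusters = AgglomerativeClustering(n_clusters=4, linkage="ward").fit_predict(Xz)

# PCA to look for extra discriminating directions beyond the cluster labels
pca = PCA(n_components=2)
scores = pca.fit_transform(Xz)
print("explained variance ratio:", pca.explained_variance_ratio_)
print("cells per cluster:", np.bincount(clusters))
```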

  1. Investigating the sex-related geometric variation of the human cranium.

    PubMed

    Bertsatos, Andreas; Papageorgopoulou, Christina; Valakos, Efstratios; Chovalopoulou, Maria-Eleni

    2018-01-29

    Accurate sexing methods are of great importance in forensic anthropology, since sex assessment is among the principal tasks when examining human skeletal remains. The present study explores a novel approach to identifying the most accurate metric traits of the human cranium for sex estimation, based on 80 ectocranial landmarks from 176 modern individuals of known age and sex from the Athens Collection. The purpose of the study is to identify the distance and angle measurements that can be used most effectively in sex assessment. Three-dimensional landmark coordinates were digitized with a Microscribe 3DX and analyzed in GNU Octave. An iterative linear discriminant analysis of all possible combinations of landmarks was performed for each unique set of the 3160 distances and 246,480 angles. Cross-validated correct classification, as well as multivariate DFA on the top-performing variables, identified 13 craniometric distances with over 85% classification accuracy and 7 angles with over 78%, with certain multivariate combinations yielding over 95%. Linear regression of these variables against centroid size was used to assess their relation to the size of the cranium. In contrast to the use of generalized Procrustes analysis (GPA) and principal component analysis (PCA), which constitute the common analytical workflow for such data, our method, although computationally intensive, produced easily applicable discriminant functions of high accuracy while at the same time exploring the maximum of cranial variability.
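
    The sketch below illustrates the general idea of an exhaustive, cross-validated LDA search over small subsets of inter-landmark measurements; the synthetic data, the two-distance subsets and the fold count are assumptions, and the authors' GNU Octave implementation is not reproduced here.

```python
# Sketch of an exhaustive, cross-validated LDA search over small sets of
# inter-landmark distances, in the spirit of the approach described above.
# The synthetic data and the choice of 2-distance subsets are assumptions.
from itertools import combinations
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n, n_dist = 176, 20                      # 176 individuals, 20 candidate distances (toy numbers)
sex = rng.integers(0, 2, size=n)         # 0 = female, 1 = male (synthetic labels)
D = rng.normal(size=(n, n_dist)) + 0.8 * sex[:, None] * rng.random(n_dist)

best = (0.0, None)
for subset in combinations(range(n_dist), 2):          # all 2-distance combinations
    acc = cross_val_score(LinearDiscriminantAnalysis(),
                          D[:, subset], sex, cv=5).mean()
    if acc > best[0]:
        best = (acc, subset)
print(f"best cross-validated accuracy {best[0]:.2%} with distances {best[1]}")
```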

  2. Analysis and Evaluation of the Characteristic Taste Components in Portobello Mushroom.

    PubMed

    Wang, Jinbin; Li, Wen; Li, Zhengpeng; Wu, Wenhui; Tang, Xueming

    2018-05-10

    To identify the characteristic taste components of the common cultivated brown (Portobello) mushroom, Agaricus bisporus, taste components in the stipe and pileus of Portobello mushroom harvested at different growth stages were extracted and identified, and principal component analysis (PCA) and taste active value (TAV) were used to reveal the characteristic taste components during each of the growth stages. In the stipe and pileus, 20 and 14 principal taste components were identified, respectively; these were considered the principal taste components of Portobello mushroom fruit bodies and included most amino acids and 5'-nucleotides. Some taste components present at high levels, such as lactic acid and citric acid, were not identified as principal taste components through PCA. However, owing to their high content, Portobello mushroom could be used as a source of organic acids. The PCA and TAV results revealed that 5'-GMP, glutamic acid, malic acid, alanine, proline, leucine, and aspartic acid were the characteristic taste components of Portobello mushroom fruit bodies. Portobello mushroom was also found to be rich in protein and amino acids, so it might also be useful in the formulation of nutraceuticals and functional food. These results could provide a theoretical basis for understanding and regulating the synthesis of characteristic flavor components in Portobello mushroom. © 2018 Institute of Food Technologists®.

  3. Applications of principal component analysis to breath air absorption spectra profiles classification

    NASA Astrophysics Data System (ADS)

    Kistenev, Yu. V.; Shapovalov, A. V.; Borisov, A. V.; Vrazhnov, D. A.; Nikolaev, V. V.; Nikiforova, O. Y.

    2015-12-01

    The results of numerical simulations applying principal component analysis to the breath-air absorption spectra of patients with pulmonary diseases are presented. Various methods of experimental data preprocessing are analyzed.

  4. Linear Discriminant Analysis Achieves High Classification Accuracy for the BOLD fMRI Response to Naturalistic Movie Stimuli

    PubMed Central

    Mandelkow, Hendrik; de Zwart, Jacco A.; Duyn, Jeff H.

    2016-01-01

    Naturalistic stimuli like movies evoke complex perceptual processes, which are of great interest in the study of human cognition by functional MRI (fMRI). However, conventional fMRI analysis based on statistical parametric mapping (SPM) and the general linear model (GLM) is hampered by a lack of accurate parametric models of the BOLD response to complex stimuli. In this situation, statistical machine-learning methods, a.k.a. multivariate pattern analysis (MVPA), have received growing attention for their ability to generate stimulus response models in a data-driven fashion. However, machine-learning methods typically require large amounts of training data as well as computational resources. In the past, this has largely limited their application to fMRI experiments involving small sets of stimulus categories and small regions of interest in the brain. By contrast, the present study compares several classification algorithms, namely Nearest Neighbor (NN), Gaussian Naïve Bayes (GNB), and (regularized) Linear Discriminant Analysis (LDA), in terms of their classification accuracy in discriminating the global fMRI response patterns evoked by a large number of naturalistic visual stimuli presented as a movie. Results show that LDA regularized by principal component analysis (PCA) achieved high classification accuracies, above 90% on average for single fMRI volumes acquired 2 s apart during a 300 s movie (chance level 0.7% = 2 s/300 s). The largest source of classification errors was autocorrelation in the BOLD signal compounded by the similarity of consecutive stimuli. All classifiers performed best when given input features from a large region of interest comprising around 25% of the voxels that responded significantly to the visual stimulus. Consistent with this, the most informative principal components represented widespread distributions of co-activated brain regions that were similar between subjects and may represent functional networks. In light of these results, the combination of naturalistic movie stimuli and classification analysis in fMRI experiments may prove to be a sensitive tool for the assessment of changes in natural cognitive processes under experimental manipulation. PMID:27065832
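
    As a rough illustration of "LDA regularized by PCA", the sketch below projects high-dimensional response patterns onto a few principal components before fitting LDA; the matrix sizes, component count and labels are placeholder assumptions, not the study's data.

```python
# Minimal sketch of LDA "regularized by PCA": project high-dimensional fMRI
# volumes onto a modest number of principal components before fitting LDA.
# Dimensions, component count and labels are illustrative, not from the study.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
n_volumes, n_voxels, n_stimuli = 150, 5000, 30
y = np.repeat(np.arange(n_stimuli), n_volumes // n_stimuli)   # stimulus label per volume
X = rng.normal(size=(n_volumes, n_voxels)) \
    + 0.5 * np.eye(n_stimuli)[y] @ rng.normal(size=(n_stimuli, n_voxels))

clf = make_pipeline(PCA(n_components=40), LinearDiscriminantAnalysis())
acc = cross_val_score(clf, X, y, cv=5).mean()
print(f"cross-validated accuracy: {acc:.2%}")
```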

  5. Determining quality of caviar from Caspian Sea based on Raman spectroscopy and using artificial neural networks.

    PubMed

    Mohamadi Monavar, H; Afseth, N K; Lozano, J; Alimardani, R; Omid, M; Wold, J P

    2013-07-15

    The purpose of this study was to evaluate the feasibility of Raman spectroscopy for predicting the purity of caviars. Ninety-three wild caviar samples of three different types, namely Beluga, Asetra and Sevruga, were analysed by Raman spectroscopy in the range 1995 cm(-1) to 545 cm(-1). In addition, 60 samples from combinations of every two types were examined. The chemical origin of the samples was identified by reference measurements on pure samples. Linear chemometric methods such as Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) were used for data visualisation and classification, which permitted clear distinction between different caviars. Non-linear methods such as Artificial Neural Networks (ANN) were used to classify caviar samples. Two different networks were tested in the classification: a Probabilistic Neural Network with Radial-Basis Function (PNN) and a Multilayer Feed-Forward Network with Back Propagation (BP-NN). In both cases, scores of principal components (PCs) were chosen as input nodes for the input layer in the PC-ANN models, in order to reduce the redundancy of the data and the training time. Leave One Out (LOO) cross validation was applied in order to check the performance of the networks. The PCA results indicated that features such as type and purity can be used to discriminate different caviar samples. These findings were also supported by LDA, with efficiency between 83.77% and 100%. They were further confirmed by the developed PC-ANN models, which classified pure caviar samples with 93.55% and 71.00% accuracy for the BP network and the PNN, respectively. In comparison, the LDA, PNN and BP-NN models for predicting caviar type achieved 90.3%, 73.1% and 91.4% accuracy, respectively. Partial least squares regression (PLSR) models were built under cross validation and tested with different independent data sets, yielding determination coefficients (R(2)) of 0.86, 0.83, 0.92 and 0.91 with root mean square errors (RMSE) of validation of 0.32, 0.11, 0.03 and 0.09 for fatty acids 16:0, 20:5, 22:6 and fat, respectively. Crown Copyright © 2013. Published by Elsevier B.V. All rights reserved.
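
    A minimal sketch of the PC-ANN idea (principal component scores feeding a small neural network, validated leave-one-out) is given below; the synthetic spectra, the component count and the network size are assumptions rather than the configuration used in the study.

```python
# Sketch of the PC-ANN idea: principal-component scores of spectra are used
# as the input layer of a small neural network classifier, validated with
# leave-one-out cross validation. Spectra, labels and network size are assumptions.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score, LeaveOneOut

rng = np.random.default_rng(3)
n_samples, n_wavenumbers = 90, 400            # toy Raman-like data
y = rng.integers(0, 3, size=n_samples)        # three caviar types (synthetic labels)
X = rng.normal(size=(n_samples, n_wavenumbers)) + np.eye(3)[y] @ rng.normal(size=(3, n_wavenumbers))

pc_ann = make_pipeline(StandardScaler(),
                       PCA(n_components=10),          # PC scores as network inputs
                       MLPClassifier(hidden_layer_sizes=(20,), max_iter=1000, random_state=0))
acc = cross_val_score(pc_ann, X, y, cv=LeaveOneOut()).mean()
print(f"leave-one-out accuracy: {acc:.2%}")
```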

  6. [The principal components analysis--method to classify the statistical variables with applications in medicine].

    PubMed

    Dascălu, Cristina Gena; Antohe, Magda Ecaterina

    2009-01-01

    Based on eigenvalue and eigenvector analysis, principal component analysis aims to identify, from a set of parameters, the subspace of main components that is sufficient to characterize the whole set. Interpreting the data as a cloud of points, we find through geometrical transformations the directions along which the cloud's dispersion is maximal: the lines that pass through the cloud's center of weight and have a maximal density of points around them (obtained by defining an appropriate criterion function and minimizing it). This method can be used successfully to simplify the statistical analysis of questionnaires, because it helps to select from a set of items only the most relevant ones, those that cover the variation of the whole data set. For instance, in the presented sample we started from a questionnaire with 28 items and, applying principal component analysis, we identified 7 principal components, or main items, a fact that significantly simplifies further statistical analysis.
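
    The eigenvalue/eigenvector view described above can be written out directly; the sketch below is a bare-bones PCA on a synthetic questionnaire-like matrix, with the 70% variance cut-off chosen purely for illustration.

```python
# Bare-bones PCA from the eigen-decomposition described above: directions of
# maximal dispersion are the eigenvectors of the covariance matrix, ordered
# by eigenvalue. Data here are synthetic questionnaire-like scores.
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(200, 28))                    # 200 respondents x 28 items (synthetic)
Xc = X - X.mean(axis=0)                           # centre on the cloud's centre of weight

cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)            # eigh: covariance matrix is symmetric
order = np.argsort(eigvals)[::-1]                 # sort by decreasing variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

explained = eigvals / eigvals.sum()
k = np.searchsorted(np.cumsum(explained), 0.70) + 1   # components covering ~70% of variance
scores = Xc @ eigvecs[:, :k]                           # projection of each respondent
print(f"{k} principal components explain {explained[:k].sum():.0%} of the variance")
```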

  7. On Using the Average Intercorrelation Among Predictor Variables and Eigenvector Orientation to Choose a Regression Solution.

    ERIC Educational Resources Information Center

    Mugrage, Beverly; And Others

    Three ridge regression solutions are compared with ordinary least squares regression and with principal components regression using all components. Ridge regression, particularly the Lawless-Wang solution, outperformed ordinary least squares regression and the principal components solution on the criteria of coefficient stability and closeness…
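
    For readers unfamiliar with the estimators being compared, the sketch below contrasts ordinary least squares, ridge regression and principal components regression on deliberately collinear synthetic data; the penalty value and retained component count are illustrative, and the Lawless-Wang choice of ridge parameter is not reproduced here.

```python
# Sketch comparing ordinary least squares, ridge regression and principal-
# components regression on collinear predictors. Data and settings are
# illustrative assumptions, not taken from the cited study.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.pipeline import make_pipeline
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(5)
n = 100
z = rng.normal(size=(n, 3))
X = np.hstack([z, z[:, :2] + 0.05 * rng.normal(size=(n, 2))])   # strongly collinear columns
y = X @ np.array([1.0, -2.0, 0.5, 1.5, -1.0]) + rng.normal(scale=0.5, size=n)

models = {
    "OLS":   LinearRegression(),
    "ridge": Ridge(alpha=1.0),
    "PCR":   make_pipeline(PCA(n_components=3), LinearRegression()),
}
for name, model in models.items():
    r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    print(f"{name:5s} cross-validated R^2 = {r2:.3f}")
```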

  8. A Note on McDonald's Generalization of Principal Components Analysis

    ERIC Educational Resources Information Center

    Shine, Lester C., II

    1972-01-01

    It is shown that McDonald's generalization of Classical Principal Components Analysis to groups of variables maximally channels the total variance of the original variables through the groups of variables acting as groups. An equation is obtained for determining the vectors of correlations of the L2 components with the original variables.…

  9. Principals' Leadership Behaviors as Perceived by Teachers in At-Risk Middle Schools

    ERIC Educational Resources Information Center

    Johnson, R. Anthony

    2011-01-01

    A need exists for greater understanding of teachers' (N = 530) perceptions of the leadership behaviors of principals in Title I middle schools (n = 13). The researcher used the "Audit of Principal Effectiveness" survey to collect data. The researcher also used Hierarchical Linear Modeling as the quantitative analysis.…

  10. CLUSFAVOR 5.0: hierarchical cluster and principal-component analysis of microarray-based transcriptional profiles

    PubMed Central

    Peterson, Leif E

    2002-01-01

    CLUSFAVOR (CLUSter and Factor Analysis with Varimax Orthogonal Rotation) 5.0 is a Windows-based computer program for hierarchical cluster and principal-component analysis of microarray-based transcriptional profiles. CLUSFAVOR 5.0 standardizes input data; sorts data according to gene-specific coefficient of variation, standard deviation, average and total expression, and Shannon entropy; performs hierarchical cluster analysis using nearest-neighbor, unweighted pair-group method using arithmetic averages (UPGMA), or furthest-neighbor joining methods, and Euclidean, correlation, or jack-knife distances; and performs principal-component analysis. PMID:12184816

  11. Forensic age estimation by morphometric analysis of the manubrium from 3D MR images.

    PubMed

    Martínez Vera, Naira P; Höller, Johannes; Widek, Thomas; Neumayer, Bernhard; Ehammer, Thomas; Urschler, Martin

    2017-08-01

    Forensic age estimation research based on skeletal structures focuses on patterns of growth and development using different bones. In this work, our aim was to study the growth-related evolution of the manubrium in living adolescents and young adults using magnetic resonance imaging (MRI), an image acquisition modality that does not involve ionizing radiation. In a first step, individual manubrium and subject features were correlated with age, which confirmed a statistically significant change of manubrium volume (M_vol: p < 0.01, adjusted R² = 0.50) and surface area (M_sur: p < 0.01, adjusted R² = 0.53) over the studied age range. Additionally, the shapes of the manubria were for the first time investigated using principal component analysis. The decomposition of the data into principal components allowed the contribution of each component to total shape variation to be analysed. With 13 principal components, approximately 96% of shape variation could be described (M_shp: p < 0.01, adjusted R² = 0.60). Multiple linear regression analysis modelled the relationship between the statistically best correlated variables and age. Models including manubrium shape and volume or surface area divided by subject height (Y ~ M_shp + M_sur/S_h: p < 0.01, adjusted R² = 0.71; Y ~ M_shp + M_vol/S_h: p < 0.01, adjusted R² = 0.72) presented a standard error of estimate of two years. In order to estimate the accuracy of these two manubrium-based age estimation models, cross-validation experiments predicting age on held-out test sets were performed. The median absolute difference between predicted and known chronological age was 1.18 years for the best performing model (Y ~ M_shp + M_sur/S_h: p < 0.01, R²_p = 0.67). In conclusion, despite limitations in determining legal majority age, manubrium morphometry analysis produced statistically significant results for skeletal age estimation, which indicates that this bone structure may be considered a new candidate in multi-factorial MRI-based age estimation. Copyright © 2017 Elsevier B.V. All rights reserved.

  12. Investigation of carbon dioxide emission in China by primary component analysis.

    PubMed

    Zhang, Jing; Wang, Cheng-Ming; Liu, Lian; Guo, Hang; Liu, Guo-Dong; Li, Yuan-Wei; Deng, Shi-Huai

    2014-02-15

    Principal component analysis (PCA) is employed to investigate the relationship between CO2 emissions (COEs) stemming from fossil fuel burning and cement manufacturing and their affecting factors. Eight affecting factors are chosen, namely Population (P); Urban Population (UP); the Output Values of Primary Industry (PIOV), Secondary Industry (SIOV), and Tertiary Industry (TIOV); and the Proportions of Primary Industry's Output Value (PPIOV), Secondary Industry's Output Value (PSIOV), and Tertiary Industry's Output Value (PTIOV). PCA is employed to eliminate the multicollinearity of the affecting factors. Two principal components, which explain 92.86% of the variance of the eight affecting factors, are chosen as variables in the regression analysis. Ordinary least squares regression is used to estimate multiple linear regression models in which COEs and the principal components serve as dependent and independent variables, respectively. The results are as follows. (1) Theoretically, the carbon intensities of PIOV, SIOV, and TIOV are 2573.4693, 552.7036, and 606.0791 kt per billion dollars, respectively. Incomplete statistical data, differing statistical standards, and the ideology of self-sufficiency and peasantry appear to explain why the carbon intensity of PIOV is higher than those of SIOV and TIOV in China. (2) PPIOV, PSIOV, and PTIOV influence the fluctuations of COE. The parameters of PPIOV, PSIOV, and PTIOV are -2706946.7564, 2557300.5450, and 3924767.9807 kt, respectively. As the economic structure of China is strongly tied to technology level, the period when PIOV held the leading position is characterized by lagging technology and economic development; thus, the influence of PPIOV takes a negative value. As increases in PSIOV and PTIOV are always followed by technological innovation and economic development, PSIOV and PTIOV have the opposite influence. (3) The carbon intensities of P and UP are 1.1029 and 1.7862 kt per thousand people, respectively. The carbon intensity of the rural population can be inferred to be lower than 1.1029 kt per thousand people. The characteristics of poverty and the use of bio-energy in rural areas result in a carbon intensity of the rural population that is lower than that of P. Copyright © 2013 Elsevier B.V. All rights reserved.

  13. EMPCA and Cluster Analysis of Quasar Spectra: Construction and Application to Simulated Spectra

    NASA Astrophysics Data System (ADS)

    Marrs, Adam; Leighly, Karen; Wagner, Cassidy; Macinnis, Francis

    2017-01-01

    Quasars have complex spectra with emission lines influenced by many factors. Therefore, to fully describe the spectrum requires specification of a large number of parameters, such as line equivalent width, blueshift, and ratios. Principal Component Analysis (PCA) aims to construct eigenvectors, or principal components, from the data with the goal of finding a few key parameters that can be used to predict the rest of the spectrum fairly well. Analysis of simulated quasar spectra was used to verify and justify our modified application of PCA. We used a variant of PCA called Weighted Expectation Maximization PCA (EMPCA; Bailey 2012) along with k-means cluster analysis to analyze simulated quasar spectra. Our approach combines both analytical methods to address two known problems with classical PCA. EMPCA uses weights to account for uncertainty and missing points in the spectra. K-means groups similar spectra together to address the nonlinearity of quasar spectra, specifically variance in blueshifts and widths of the emission lines. In producing and analyzing simulations, we first tested the effects of varying equivalent widths and blueshifts on the derived principal components, and explored the differences between standard PCA and EMPCA. We also tested the effects of varying signal-to-noise ratio. Next we used the results of fits to composite quasar spectra (see accompanying poster by Wagner et al.) to construct a set of realistic simulated spectra, and subjected those spectra to the EMPCA/k-means analysis. We concluded that our approach was validated when we found that the mean spectra from our k-means clusters, derived from PCA projection coefficients, reproduced the trends observed in the composite spectra. Furthermore, our method needed only two eigenvectors to identify both sets of correlations used to construct the simulations, as well as indicating the linear and nonlinear segments. Comparing this to regular PCA, which can require a dozen or more components, or to direct spectral analysis that may need measurement of 20 fit parameters, shows why the dual application of these two techniques is such a powerful tool.

  14. Identification of the isomers using principal component analysis (PCA) method

    NASA Astrophysics Data System (ADS)

    Kepceoǧlu, Abdullah; Gündoǧdu, Yasemin; Ledingham, Kenneth William David; Kilic, Hamdi Sukur

    2016-03-01

    In this work, we have carried out a detailed statistical analysis of experimental mass spectra from xylene isomers. Principal Component Analysis (PCA) was used to identify the isomers, which cannot be distinguished using conventional statistical methods for the interpretation of their mass spectra. Experiments were carried out using a linear TOF-MS coupled to a femtosecond laser system as the energy source for the ionisation processes. We have performed experiments and collected data which have been analysed and interpreted using PCA as a multivariate analysis of these spectra. This demonstrates the strength of the method in distinguishing isomers that cannot be identified through conventional analysis of the mass spectra obtained from dissociative ionisation of these molecules. The dependence of the PCA results on the laser pulse energy and the background pressure in the spectrometer is also presented in this work.

  15. Evaluation of the psychometric properties of the main meal quality index when applied in the UK population.

    PubMed

    Gorgulho, B M; Pot, G K; Marchioni, D M

    2017-05-01

    The aim of this study was to evaluate the validity and reliability of the Main Meal Quality Index when applied to the UK population. The indicator was developed to assess meal quality in different populations and is composed of 10 components: fruit; vegetables (excluding potatoes); the ratio of animal protein to total protein; fiber; carbohydrate; total fat; saturated fat; processed meat; sugary beverages and desserts; and energy density, resulting in a score range of 0-100 points. The performance of the indicator was assessed using strategies for evaluating content validity, construct validity, discriminant validity and reliability, including principal component analysis, linear regression models and Cronbach's alpha. The indicator showed good reliability. The Main Meal Quality Index was shown to be valid for use as an instrument to evaluate, monitor and compare the quality of meals consumed by adults in the United Kingdom.
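
    Since the abstract leans on Cronbach's alpha as its reliability measure, a small worked sketch of that statistic is given below on synthetic index components; the item matrix and its factor structure are assumptions made purely for illustration.

```python
# Sketch of a reliability check like the one described: Cronbach's alpha for
# the components of a composite index. The item matrix is synthetic.
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: respondents x items matrix of component scores."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()   # sum of per-item variances
    total_var = items.sum(axis=1).var(ddof=1)     # variance of the total score
    return k / (k - 1) * (1.0 - item_vars / total_var)

rng = np.random.default_rng(6)
common = rng.normal(size=(500, 1))                # shared "meal quality" factor (assumed)
items = common + 0.8 * rng.normal(size=(500, 10)) # 10 index components (synthetic)
print(f"Cronbach's alpha = {cronbach_alpha(items):.2f}")
```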

  16. The Complexity of Human Walking: A Knee Osteoarthritis Study

    PubMed Central

    Kotti, Margarita; Duffell, Lynsey D.; Faisal, Aldo A.; McGregor, Alison H.

    2014-01-01

    This study proposes a framework for deconstructing complex walking patterns to create a simple principal component space, and then checks whether the projection onto this space is suitable for identifying deviations from normality. We focus on knee osteoarthritis, the most common knee joint disease and the second leading cause of disability. Knee osteoarthritis affects over 250 million people worldwide. The motivation for projecting the highly dimensional movements to a lower dimensional and simpler space is our belief that motor behaviour can be understood by identifying a simplicity via projection to a low-dimensional principal component space, which may reflect the underlying mechanism. To study this, we recruited 180 subjects, 47 of whom reported that they had knee osteoarthritis. They were asked to walk several times along a walkway equipped with two force plates that capture their ground reaction forces along 3 axes, namely vertical, anterior-posterior, and medio-lateral, at 1000 Hz. Data in which the subject did not clearly strike the force plate were excluded, leaving 1–3 gait cycles per subject. To examine the complexity of human walking, we applied dimensionality reduction via Probabilistic Principal Component Analysis. The first principal component explains 34% of the variance in the data, whereas explaining over 80% of the variance requires 8 or more principal components. This demonstrates the complexity of the underlying structure of the ground reaction forces. To examine whether our musculoskeletal system generates movements that are distinguishable between normal and pathological subjects in a low-dimensional principal component space, we applied a Bayes classifier. For the tested cross-validated, subject-independent experimental protocol, the classification accuracy equals 82.62%. Also, a novel complexity measure is proposed, which can be used as an objective index to facilitate clinical decision making. This measure shows that knee osteoarthritis subjects exhibit more variability in the two-dimensional principal component space. PMID:25232949

  17. Characterization of CDOM of river waters in China using fluorescence excitation-emission matrix and regional integration techniques

    NASA Astrophysics Data System (ADS)

    Zhao, Ying; Song, Kaishan; Shang, Yingxin; Shao, Tiantian; Wen, Zhidan; Lv, Lili

    2017-08-01

    The spatial characteristics of fluorescent dissolved organic matter (FDOM) components in river waters in China were examined for the first time by excitation-emission matrix spectroscopy and fluorescence regional integration (FRI), with data collected during September to November between 2013 and 2015. One tyrosine-like (R1), one tryptophan-like (R2), one fulvic-like (R3), one microbial protein-like (R4), and one humic-like (R5) component were identified by the FRI method. Principal component analysis (PCA) was conducted to assess variations in the five FDOM components (FRi, i = 1, 2, 3, 4, 5) and the humification index for all 194 river water samples. The average fluorescence intensities of the five fluorescent components and the total fluorescence intensity FSUM varied spatially among the seven major river basins (Songhua, Liao, Hai, Yellow and Huai, Yangtze, Pearl, and Inflow Rivers) of China. When all the river water samples were pooled together, the fulvic-like FR3 and the humic-like FR5 showed a strong positive linear relationship (R2 = 0.90, n = 194), indicating that the two allochthonous FDOM components R3 and R5 may originate from similar sources. There is a moderately strong positive correlation between the tryptophan-like FR2 and the microbial protein-like FR4 (R2 = 0.71, n = 194), suggesting that parts of the two autochthonous FDOM components R2 and R4 are likely from common sources. However, the total allochthonous substance FR(3+5) and the total autochthonous substance FR(1+2+4) exhibited a weak correlation (R2 = 0.40, n = 194). Significant positive linear relationships of FR3 (R2 = 0.69, n = 194) and FR5 (R2 = 0.79, n = 194) with the chromophoric DOM (CDOM) absorption coefficient a(254) were observed, which demonstrates that the CDOM absorption was dominated by the allochthonous FDOM components R3 and R5.

  18. Principal Components Analysis of a JWST NIRSpec Detector Subsystem

    NASA Technical Reports Server (NTRS)

    Arendt, Richard G.; Fixsen, D. J.; Greenhouse, Matthew A.; Lander, Matthew; Lindler, Don; Loose, Markus; Moseley, S. H.; Mott, D. Brent; Rauscher, Bernard J.; Wen, Yiting; hide

    2013-01-01

    We present a principal component analysis (PCA) of a flight-representative James Webb Space Telescope Near-Infrared Spectrograph (NIRSpec) Detector Subsystem. Although our results are specific to NIRSpec and its T ≈ 40 K SIDECAR ASICs and 5 μm cutoff H2RG detector arrays, the underlying technical approach is more general. We describe how we measured the system's response to small environmental perturbations by modulating a set of bias voltages and the temperature. We used this information to compute the system's principal noise components. Together with information from the astronomical scene, we show how the zeroth principal component can be used to calibrate out the effects of small thermal and electrical instabilities to produce cosmetically cleaner images with significantly less correlated noise. Alternatively, if one were designing a new instrument, one could use a similar PCA approach to inform a set of environmental requirements (temperature stability, electrical stability, etc.) that enable the planned instrument to meet performance requirements.
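
    The general calibration idea of removing a dominant principal noise component can be sketched as below, where the leading ("zeroth") component of synthetic detector frames is projected out; the frame sizes and the drift model are assumptions, not NIRSpec data or the authors' pipeline.

```python
# Sketch of projecting out a dominant noise component: the first PC of a set
# of frames captures a correlated drift, which is then removed from each frame.
# Frame shapes and the synthetic drift are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(9)
n_frames, n_pix = 200, 1024
drift = np.sin(np.linspace(0, 6, n_frames))[:, None] * rng.normal(size=(1, n_pix))
frames = rng.normal(scale=0.1, size=(n_frames, n_pix)) + drift     # correlated drift + noise

mean = frames.mean(axis=0)
U, s, Vt = np.linalg.svd(frames - mean, full_matrices=False)
pc0 = Vt[0]                                    # dominant ("zeroth") principal component

cleaned = frames - ((frames - mean) @ pc0)[:, None] * pc0          # project PC0 out
print("std before:", frames.std(), "after:", cleaned.std())
```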

  19. Application of principal component analysis (PCA) as a sensory assessment tool for fermented food products.

    PubMed

    Ghosh, Debasree; Chattopadhyay, Parimal

    2012-06-01

    The objective of the work was to use quantitative descriptive analysis (QDA) to describe the sensory attributes of fermented food products prepared with the incorporation of lactic cultures. Panellists were selected and trained to evaluate various attributes, namely color and appearance, body texture, flavor, overall acceptability and acidity, of fermented food products such as cow milk curd and soymilk curd, idli, sauerkraut and probiotic ice cream. Principal component analysis (PCA) identified six significant principal components that accounted for more than 90% of the variance in the sensory attribute data. Overall product quality was modelled as a function of the principal components using multiple least squares regression (R² = 0.8). The results from PCA were further analyzed statistically by analysis of variance (ANOVA). These findings demonstrate the utility of quantitative descriptive analysis for identifying and measuring the fermented food product attributes that are important for consumer acceptability.

  20. Snapshot hyperspectral imaging probe with principal component analysis and confidence ellipse for classification

    NASA Astrophysics Data System (ADS)

    Lim, Hoong-Ta; Murukeshan, Vadakke Matham

    2017-06-01

    Hyperspectral imaging combines imaging and spectroscopy to provide detailed spectral information for each spatial point in the image. This gives a three-dimensional spatial-spatial-spectral datacube with hundreds of spectral images. Probe-based hyperspectral imaging systems have been developed so that they can be used in regions that conventional table-top platforms would find difficult to access. A fiber bundle, made up of specially arranged optical fibers, has recently been developed and integrated with a spectrograph-based hyperspectral imager. This forms a snapshot hyperspectral imaging probe, which is able to form a datacube from the information in each scan. Compared to other configurations, which require sequential scanning to form a datacube, the snapshot configuration is preferred in real-time applications where motion artifacts and pixel misregistration must be minimized. Principal component analysis is a dimension-reducing technique that can be applied in hyperspectral imaging to convert the spectral information into uncorrelated variables known as principal components. A confidence ellipse can be used to define the region of each class in the principal component feature space and hence for classification. This paper demonstrates the use of the snapshot hyperspectral imaging probe to acquire data from samples of different colors. The spectral library of each sample was acquired and then analyzed using principal component analysis. A confidence ellipse was then applied to the principal components of each sample and used as the classification criterion. The results show that the applied analysis can be used to classify the spectral data acquired with the snapshot hyperspectral imaging probe.
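
    The confidence-ellipse classification described above can be phrased as a Mahalanobis-distance test against a chi-square threshold in the two-dimensional PC space; the sketch below uses synthetic spectra and a 95% confidence level as assumptions, not the paper's data or implementation.

```python
# Sketch of classification by confidence ellipse in the first two principal
# components: a point belongs to a class if its Mahalanobis distance to the
# class mean falls inside the chi-square ellipse. Spectra here are synthetic.
import numpy as np
from scipy.stats import chi2
from sklearn.decomposition import PCA

rng = np.random.default_rng(7)
bands = 100
classes = {c: rng.normal(loc=mu, size=(40, bands))       # toy spectra per colour class
           for c, mu in {"red": 0.0, "green": 0.6, "blue": -0.6}.items()}
X = np.vstack(list(classes.values()))

pca = PCA(n_components=2).fit(X)
threshold = chi2.ppf(0.95, df=2)                          # 95% confidence ellipse in 2D

def inside_ellipse(point_pc, class_pc):
    mean = class_pc.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(class_pc, rowvar=False))
    d = point_pc - mean
    return d @ cov_inv @ d <= threshold                   # Mahalanobis distance test

test = pca.transform(rng.normal(loc=0.6, size=(1, bands)))[0]
for name, spectra in classes.items():
    print(name, inside_ellipse(test, pca.transform(spectra)))
```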

  1. Analysis of environmental variation in a Great Plains reservoir using principal components analysis and geographic information systems

    USGS Publications Warehouse

    Long, J.M.; Fisher, W.L.

    2006-01-01

    We present a method for spatial interpretation of environmental variation in a reservoir that integrates principal components analysis (PCA) of environmental data with geographic information systems (GIS). To illustrate our method, we used data from a Great Plains reservoir (Skiatook Lake, Oklahoma) with longitudinal variation in physicochemical conditions. We measured 18 physicochemical features, mapped them using GIS, and then calculated and interpreted four principal components. Principal component 1 (PC1) was readily interpreted as longitudinal variation in water chemistry, but the other principal components (PC2-4) were difficult to interpret. Site scores for PC1-4 were calculated in GIS by summing weighted overlays of the 18 measured environmental variables, with the factor loadings from the PCA as the weights. PC1-4 were then ordered into a landscape hierarchy, an emergent property of this technique, which enabled their interpretation. PC1 was interpreted as a reservoir-scale change in water chemistry, PC2 was a microhabitat variable of rip-rap substrate, PC3 identified coves/embayments and PC4 consisted of shoreline microhabitats related to slope. The use of GIS improved our ability to interpret the more obscure principal components (PC2-4), which made the spatial variability of the reservoir environment more apparent. This method is applicable to a variety of aquatic systems, can be accomplished using commercially available software programs, and allows for improved interpretation of the geographic environmental variability of a system compared to using typical PCA plots. © Copyright by the North American Lake Management Society 2006.

  2. Factors associated with successful transition among children with disabilities in eight European countries

    PubMed Central

    2017-01-01

    Introduction This research paper aims to assess factors reported by parents associated with the successful transition of children with complex additional support requirements who have undergone a transition between school environments in 8 European Union member states. Methods Quantitative data were collected from 306 parents within education systems from 8 EU member states (Bulgaria, Cyprus, Greece, Ireland, the Netherlands, Romania, Spain and the UK). The data were derived from an online questionnaire and consisted of 41 questions. Information was collected on: parental involvement in their child’s transition, child involvement in transition, child autonomy, school ethos, professionals’ involvement in transition and integrated working, such as joint assessment, cooperation and coordination between agencies. Survey questions designed on a Likert scale were included in the Principal Components Analysis (PCA); additional survey questions, along with the results from the PCA, were used to build a logistic regression model. Results Four principal components were identified, accounting for 48.86% of the variability in the data. Principal component 1 (PC1), ‘child inclusive ethos,’ contains 16.17% of the variation. Principal component 2 (PC2), which represents child autonomy and involvement, is responsible for 8.52% of the total variation. Principal component 3 (PC3) contains questions relating to parental involvement and contributed 12.26% of the overall variation. Principal component 4 (PC4), which involves transition planning and coordination, contributed 11.91% of the overall variation. Finally, the principal components were included in a logistic regression to evaluate the relationship between inclusion and a successful transition, as well as other factors that may have influenced transition. All four principal components were significantly associated with a successful transition, with PC1 having the largest effect (OR: 4.04, CI: 2.43–7.18, p<0.0001). Discussion To support a child with complex additional support requirements through the transition from special school to mainstream, governments and professionals need to ensure children with additional support requirements and their parents are at the centre of all decisions that affect them. It is important that professionals recognise the educational, psychological, social and cultural contexts of a child with additional support requirements and their family, providing a holistic approach and removing barriers to learning. PMID:28636649

  3. Factors associated with successful transition among children with disabilities in eight European countries.

    PubMed

    Ravenscroft, John; Wazny, Kerri; Davis, John M

    2017-01-01

    This research paper aims to assess factors reported by parents associated with the successful transition of children with complex additional support requirements who have undergone a transition between school environments in 8 European Union member states. Quantitative data were collected from 306 parents within education systems from 8 EU member states (Bulgaria, Cyprus, Greece, Ireland, the Netherlands, Romania, Spain and the UK). The data were derived from an online questionnaire and consisted of 41 questions. Information was collected on: parental involvement in their child's transition, child involvement in transition, child autonomy, school ethos, professionals' involvement in transition and integrated working, such as joint assessment, cooperation and coordination between agencies. Survey questions designed on a Likert scale were included in the Principal Components Analysis (PCA); additional survey questions, along with the results from the PCA, were used to build a logistic regression model. Four principal components were identified, accounting for 48.86% of the variability in the data. Principal component 1 (PC1), 'child inclusive ethos,' contains 16.17% of the variation. Principal component 2 (PC2), which represents child autonomy and involvement, is responsible for 8.52% of the total variation. Principal component 3 (PC3) contains questions relating to parental involvement and contributed 12.26% of the overall variation. Principal component 4 (PC4), which involves transition planning and coordination, contributed 11.91% of the overall variation. Finally, the principal components were included in a logistic regression to evaluate the relationship between inclusion and a successful transition, as well as other factors that may have influenced transition. All four principal components were significantly associated with a successful transition, with PC1 having the largest effect (OR: 4.04, CI: 2.43-7.18, p<0.0001). To support a child with complex additional support requirements through the transition from special school to mainstream, governments and professionals need to ensure children with additional support requirements and their parents are at the centre of all decisions that affect them. It is important that professionals recognise the educational, psychological, social and cultural contexts of a child with additional support requirements and their family, providing a holistic approach and removing barriers to learning.
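
    As a rough illustration of the reported analysis pattern (Likert items reduced by PCA, with component scores entered into a logistic regression), the sketch below runs on synthetic survey data; the item count, the four retained components and the outcome model are assumptions, not the study's data.

```python
# Sketch of the PCA-then-logistic-regression pattern described above:
# Likert-scale items are reduced to component scores, which are then used
# as predictors of a binary "successful transition" outcome. All data are synthetic.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(8)
n_parents, n_items = 306, 30
items = rng.integers(1, 6, size=(n_parents, n_items)).astype(float)  # 1-5 Likert responses

scores = PCA(n_components=4).fit_transform(StandardScaler().fit_transform(items))
p = 1 / (1 + np.exp(-(0.9 * scores[:, 0] - 0.4 * scores[:, 2])))     # synthetic outcome model
success = rng.random(n_parents) < p

model = LogisticRegression().fit(scores, success)
print("odds ratios per principal component:", np.exp(model.coef_).round(2))
```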

  4. Identifying Plant Part Composition of Forest Logging Residue Using Infrared Spectral Data and Linear Discriminant Analysis

    PubMed Central

    Acquah, Gifty E.; Via, Brian K.; Billor, Nedret; Fasina, Oladiran O.; Eckhardt, Lori G.

    2016-01-01

    As new markets, technologies and economies evolve in the low-carbon bioeconomy, forest logging residue, a largely untapped renewable resource, will play a vital role. The feedstock can, however, be variable depending on plant species and plant part component. This heterogeneity can influence the physical, chemical and thermochemical properties of the material, and thus the final yield and quality of products. Although it is challenging to control the compositional variability of a batch of feedstock, it is feasible to monitor this heterogeneity and make the necessary changes in process parameters. Such a system would be a first step towards optimization, quality assurance and cost-effectiveness of processes in the emerging biofuel/chemical industry. The objective of this study was therefore to qualitatively classify forest logging residue made up of different plant parts using both near infrared spectroscopy (NIRS) and Fourier transform infrared spectroscopy (FTIRS) together with linear discriminant analysis (LDA). Forest logging residue harvested from several Pinus taeda (loblolly pine) plantations in Alabama, USA, was classified into three plant part components: clean wood, wood and bark, and slash (i.e., limbs and foliage). Five-fold cross-validated linear discriminant functions had classification accuracies of over 96% for both NIRS- and FTIRS-based models. An extra factor/principal component (PC) was, however, needed to achieve this in FTIRS modeling. Analysis of factor loadings of both NIR and FTIR spectra showed that the statistically different amounts of cellulose in the three plant part components of logging residue contributed to their initial separation. This study demonstrated that NIR or FTIR spectroscopy coupled with PCA and LDA has the potential to be used as a high-throughput tool for classifying the plant part makeup of a batch of forest logging residue feedstock. Thus, NIR/FTIR could be employed as a tool to rapidly probe/monitor the variability of forest biomass so that the appropriate online adjustments to parameters can be made in time to ensure process optimization and product quality. PMID:27618901

  5. Patient phenotypes associated with outcomes after aneurysmal subarachnoid hemorrhage: a principal component analysis.

    PubMed

    Ibrahim, George M; Morgan, Benjamin R; Macdonald, R Loch

    2014-03-01

    Predictors of outcome after aneurysmal subarachnoid hemorrhage have been determined previously through hypothesis-driven methods that often exclude putative covariates and require a priori knowledge of potential confounders. Here, we apply a data-driven approach, principal component analysis, to identify baseline patient phenotypes that may predict neurological outcomes. Principal component analysis was performed on 120 subjects enrolled in a prospective randomized trial of clazosentan for the prevention of angiographic vasospasm. Correlation matrices were created using a combination of Pearson, polyserial, and polychoric regressions among 46 variables. Scores of significant components (with eigenvalues>1) were included in multivariate logistic regression models with incidence of severe angiographic vasospasm, delayed ischemic neurological deficit, and long-term outcome as outcomes of interest. Sixteen significant principal components accounting for 74.6% of the variance were identified. A single component dominated by the patients' initial hemodynamic status, World Federation of Neurosurgical Societies score, neurological injury, and initial neutrophil/leukocyte counts was significantly associated with poor outcome. Two additional components were associated with angiographic vasospasm, of which one was also associated with delayed ischemic neurological deficit. The first was dominated by the aneurysm-securing procedure, subarachnoid clot clearance, and intracerebral hemorrhage, whereas the second had high contributions from markers of anemia and albumin levels. Principal component analysis, a data-driven approach, identified patient phenotypes that are associated with worse neurological outcomes. Such data reduction methods may provide a better approximation of unique patient phenotypes and may inform clinical care as well as patient recruitment into clinical trials. http://www.clinicaltrials.gov. Unique identifier: NCT00111085.

  6. Principal components of wrist circumduction from electromagnetic surgical tracking.

    PubMed

    Rasquinha, Brian J; Rainbow, Michael J; Zec, Michelle L; Pichora, David R; Ellis, Randy E

    2017-02-01

    An electromagnetic (EM) surgical tracking system was used for a functionally calibrated kinematic analysis of wrist motion. Circumduction motions were tested for differences in subject gender and for differences in the sense of the circumduction as clockwise or counter-clockwise motion. Twenty subjects were instrumented for EM tracking. Flexion-extension motion was used to identify the functional axis. Subjects performed unconstrained wrist circumduction in a clockwise and counter-clockwise sense. Data were decomposed into orthogonal flexion-extension motions and radial-ulnar deviation motions. PCA was used to concisely represent motions. Nonparametric Wilcoxon tests were used to distinguish the groups. Flexion-extension motions were projected onto a direction axis with a root-mean-square error of [Formula: see text]. Using the first three principal components, there was no statistically significant difference in gender (all [Formula: see text]). For motion sense, radial-ulnar deviation distinguished the sense of circumduction in the first principal component ([Formula: see text]) and in the third principal component ([Formula: see text]); flexion-extension distinguished the sense in the second principal component ([Formula: see text]). The clockwise sense of circumduction could be distinguished by a multifactorial combination of components; there were no gender differences in this small population. These data constitute a baseline for normal wrist circumduction. The multifactorial PCA findings suggest that a higher-dimensional method, such as manifold analysis, may be a more concise way of representing circumduction in human joints.

  7. Resolvability of regional density structure and the road to direct density inversion - a principal-component approach to resolution analysis

    NASA Astrophysics Data System (ADS)

    Płonka, Agnieszka; Fichtner, Andreas

    2017-04-01

    Lateral density variations are the source of mass transport in the Earth at all scales, acting as drivers of convective motion. However, the density structure of the Earth remains largely unknown, since classic seismic observables and gravity provide only weak constraints with strong trade-offs. Current density models are therefore often based on velocity scaling, making strong assumptions on the origin of structural heterogeneities, which may not necessarily be correct. Our goal is to assess whether 3D density structure may be resolvable with emerging full-waveform inversion techniques. We have previously quantified the impact of regional-scale crustal density structure on seismic waveforms, concluding that reasonably sized density variations within the crust can leave a strong imprint on both travel times and amplitudes and, while this can produce significant biases in velocity and Q estimates, seismic waveform inversion for density may become feasible. In this study we perform principal component analyses of sensitivity kernels for P velocity, S velocity, and density. This is intended to establish the extent to which these kernels are linearly independent, i.e. the extent to which the different parameters may be constrained independently. We apply the method to data from 81 events around the Iberian Peninsula, recorded in total by 492 stations. The objective is to find a principal kernel that maximizes the sensitivity to density, potentially allowing density to be resolved as independently as possible. We find that surface (mostly Rayleigh) waves have significant sensitivity to density and that the trade-off with velocity is negligible. We also show preliminary results of the inversion.

  8. Introduction to uses and interpretation of principal component analyses in forest biology.

    Treesearch

    J. G. Isebrands; Thomas R. Crow

    1975-01-01

    The application of principal component analysis for interpretation of multivariate data sets is reviewed with emphasis on (1) reduction of the number of variables, (2) ordination of variables, and (3) applications in conjunction with multiple regression.

  9. Principal component analysis of phenolic acid spectra

    USDA-ARS?s Scientific Manuscript database

    Phenolic acids are common plant metabolites that exhibit bioactive properties and have applications in functional food and animal feed formulations. The ultraviolet (UV) and infrared (IR) spectra of four closely related phenolic acid structures were evaluated by principal component analysis (PCA) to...

  10. Optimal pattern synthesis for speech recognition based on principal component analysis

    NASA Astrophysics Data System (ADS)

    Korsun, O. N.; Poliyev, A. V.

    2018-02-01

    An algorithm for building an optimal pattern for automatic speech recognition, which increases the probability of correct recognition, is developed and presented in this work. The optimal pattern is formed by decomposing an initial pattern into principal components, which makes it possible to reduce the dimension of the multi-parameter optimization problem. In the next step, training samples are introduced and optimal estimates of the principal component decomposition coefficients are obtained by a numerical parameter optimization algorithm. Finally, we consider experimental results that show the improvement in speech recognition introduced by the proposed optimization algorithm.

  11. Facilitating in vivo tumor localization by principal component analysis based on dynamic fluorescence molecular imaging

    NASA Astrophysics Data System (ADS)

    Gao, Yang; Chen, Maomao; Wu, Junyu; Zhou, Yuan; Cai, Chuangjian; Wang, Daliang; Luo, Jianwen

    2017-09-01

    Fluorescence molecular imaging has been used to target tumors in mice with xenograft tumors. However, tumor imaging is largely distorted by the aggregation of fluorescent probes in the liver. A principal component analysis (PCA)-based strategy was applied to the in vivo dynamic fluorescence imaging results of three mice with xenograft tumors to facilitate tumor imaging, with the help of a tumor-specific fluorescent probe. Tumor-relevant features were extracted from the original images by PCA and represented by the principal component (PC) maps. The second principal component (PC2) map represented the tumor-related features, while the first principal component (PC1) map retained the original pharmacokinetic profiles, especially of the liver. The distribution patterns of the PC2 maps of the tumor-bearing mice were in good agreement with the actual tumor locations. The tumor-to-liver ratio and contrast-to-noise ratio were significantly higher on the PC2 map than on the original images, thus distinguishing the tumor from the nearby fluorescence noise of the liver. The results suggest that the PC2 map could serve as a bioimaging marker to facilitate in vivo tumor localization, and that dynamic fluorescence molecular imaging with PCA could be a valuable tool for future studies of in vivo tumor metabolism and progression.

  12. Geochemical differentiation processes for arc magma of the Sengan volcanic cluster, Northeastern Japan, constrained from principal component analysis

    NASA Astrophysics Data System (ADS)

    Ueki, Kenta; Iwamori, Hikaru

    2017-10-01

    In this study, with a view to understanding the structure of high-dimensional geochemical data and discussing the chemical processes at work in the evolution of arc magmas, we employed principal component analysis (PCA) to evaluate the compositional variations of volcanic rocks from the Sengan volcanic cluster of the Northeastern Japan Arc. We analyzed the trace element compositions of various arc volcanic rocks sampled from 17 different volcanoes in the cluster. The PCA results demonstrated that the first three principal components accounted for 86% of the geochemical variation in the magma of the Sengan region. Based on the relationships between the principal components and the major elements, the mass-balance relationships with respect to the contributions of minerals, the composition of plagioclase phenocrysts, the geothermal gradient, and the seismic velocity structure of the crust, the first, second, and third principal components appear to represent magma mixing, crystallization of olivine/pyroxene, and crystallization of plagioclase, respectively. These accounted for 59%, 20%, and 6%, respectively, of the variance over the entire compositional range, indicating that magma mixing accounts for the largest share of the geochemical variation of the arc magma. Our results indicate that crustal processes dominate the geochemical variation of magma in the Sengan volcanic cluster.

  13. Assessing women's lacrosse head impacts using finite element modelling.

    PubMed

    Clark, J Michio; Hoshizaki, T Blaine; Gilchrist, Michael D

    2018-04-01

    Recent studies have assessed the ability of helmets to reduce peak linear and rotational acceleration for women's lacrosse head impacts. However, such measures have shown low correlation with injury. Maximum principal strain, derived from the full loading curves, provides better injury prediction than peak linear and rotational acceleration, especially in compliant impact situations that create low-magnitude accelerations but long impact durations. The purpose of this study was to assess head and helmet impacts in women's lacrosse using finite element modelling. Linear and rotational acceleration loading curves from women's lacrosse impacts to a helmeted and an unhelmeted Hybrid III headform were input into the University College Dublin Brain Trauma Model. The finite element model was used to calculate maximum principal strain in the cerebrum. The results demonstrated that, for unhelmeted impacts, falls and ball impacts produce higher maximum principal strain values than stick and shoulder collisions. The strain values for falls and ball impacts were found to be within the range of concussion and traumatic brain injury. The results also showed that men's lacrosse helmets reduced maximum principal strain for follow-through slashing, falls and ball impacts. These findings are novel and demonstrate that, for high-risk events, maximum principal strain can be reduced by the use of helmets if the rules of the sport do not effectively manage such situations. Copyright © 2018 Elsevier Ltd. All rights reserved.

  14. Parameter expansion for estimation of reduced rank covariance matrices (Open Access publication)

    PubMed Central

    Meyer, Karin

    2008-01-01

    Parameter expanded and standard expectation maximisation algorithms are described for reduced rank estimation of covariance matrices by restricted maximum likelihood, fitting the leading principal components only. Convergence behaviour of these algorithms is examined for several examples and contrasted to that of the average information algorithm, and implications for practical analyses are discussed. It is shown that expectation maximisation type algorithms are readily adapted to reduced rank estimation and converge reliably. However, as is well known for the full rank case, the convergence is linear and thus slow. Hence, these algorithms are most useful in combination with the quadratically convergent average information algorithm, in particular in the initial stages of an iterative solution scheme. PMID:18096112

  15. Optical characteristics of fine and coarse particulates at Grand Canyon, Arizona

    NASA Astrophysics Data System (ADS)

    Malm, William C.; Johnson, Christopher E.

    The relationship between airborne particulate matter and atmospheric light extinction was examined using the multivariate techniques of principal component analysis and multiple linear regression on data gathered at the Grand Canyon, Arizona, from December 1979 to November 1981. Results showed that, on the average, fine sulfates were most strongly associated with light attenuation in the atmosphere. Other fine mass (nitrates, organics, soot and carbonaceous material) and coarse mass (primarily windblown dust) were much less associated with atmospheric extinction. Fine sulfate mass at the Grand Canyon was responsible for 63% of atmospheric light extinction while other fine mass and coarse mass were responsible for 17 and 20% of atmospheric extinction, respectively.

  16. The Raman spectrum character of skin tumor induced by UVB

    NASA Astrophysics Data System (ADS)

    Wu, Shulian; Hu, Liangjun; Wang, Yunxia; Li, Yongzeng

    2016-03-01

    In our study, the skin canceration process induced by UVB was analyzed from the perspective of the tissue spectrum. A home-made Raman spectral system with a millimetre-order excitation laser spot size, combined with multivariate statistical analysis, was used to monitor skin changes caused by UVB irradiation, and its discrimination performance was evaluated. Raman scattering signals of squamous cell carcinoma (SCC) and normal skin were acquired, and differences in the Raman spectra were revealed. Linear discriminant analysis (LDA) based on principal component analysis (PCA) was employed to generate diagnostic algorithms for the classification of SCC and normal skin. The results indicated that Raman spectroscopy combined with PCA-LDA demonstrates good potential for improving the diagnosis of skin cancers.
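    The PCA-LDA classification workflow mentioned above can be sketched as follows; the spectra, labels and number of retained components are illustrative assumptions rather than the study's actual data or settings.

    ```python
    # Sketch of a PCA-LDA diagnostic pipeline with cross-validated accuracy.
    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.decomposition import PCA
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(1)
    spectra = rng.normal(size=(120, 800))        # 120 Raman spectra x 800 wavenumber bins (synthetic)
    labels = np.repeat([0, 1], 60)               # 0 = normal skin, 1 = SCC (illustrative labels)
    spectra[labels == 1, :50] += 0.5             # inject a small class difference so the sketch is non-trivial

    model = make_pipeline(StandardScaler(), PCA(n_components=10), LinearDiscriminantAnalysis())
    accuracy = cross_val_score(model, spectra, labels, cv=5)
    print(f"cross-validated accuracy: {accuracy.mean():.2f}")
    ```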

  17. The study of esophageal cancer in an early stage by using Raman spectroscopy

    NASA Astrophysics Data System (ADS)

    Ishigaki, Mika; Taketani, Akinori; Maeda, Yasuhiro; Andriana, Bibin B.; Ishihara, Ryu; Sato, Hidetoshi

    2013-02-01

    Esophageal cancer is a disease with high mortality. To achieve a higher five-year survival rate after treatment, a method is needed to diagnose the cancer at an early stage and to support therapy. Raman spectroscopy is one of the most powerful techniques for this purpose. In the present study, we applied Raman spectroscopy to obtain ex vivo spectra of normal and early-stage tumor human esophageal samples. The result of principal component analysis indicates that the tumor tissue is associated with a decrease in tryptophan concentration. Furthermore, we can predict the tissue type with 80% accuracy using a linear discriminant analysis model built on the tryptophan bands.

  18. Two biased estimation techniques in linear regression: Application to aircraft

    NASA Technical Reports Server (NTRS)

    Klein, Vladislav

    1988-01-01

    Several ways for detection and assessment of collinearity in measured data are discussed. Because data collinearity usually results in poor least squares estimates, two estimation techniques which can limit a damaging effect of collinearity are presented. These two techniques, the principal components regression and mixed estimation, belong to a class of biased estimation techniques. Detection and assessment of data collinearity and the two biased estimation techniques are demonstrated in two examples using flight test data from longitudinal maneuvers of an experimental aircraft. The eigensystem analysis and parameter variance decomposition appeared to be a promising tool for collinearity evaluation. The biased estimators had far better accuracy than the results from the ordinary least squares technique.
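    A minimal sketch of principal components regression as a remedy for collinearity, in the spirit of the biased estimators discussed above; the regressors below are synthetic stand-ins, not the flight-test data.

    ```python
    # Sketch: principal components regression (PCR) versus ordinary least squares
    # on nearly collinear regressors.
    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(2)
    n = 200
    x1 = rng.normal(size=n)
    x2 = x1 + 0.01 * rng.normal(size=n)                    # nearly collinear with x1
    X = np.column_stack([x1, x2, rng.normal(size=n)])
    y = x1 + x2 + 0.5 * X[:, 2] + 0.1 * rng.normal(size=n)

    ols = LinearRegression().fit(X, y)
    pcr = make_pipeline(StandardScaler(), PCA(n_components=2), LinearRegression()).fit(X, y)

    print("OLS coefficients (inflated/unstable under collinearity):", ols.coef_)
    print("PCR R^2 using two principal components:", round(pcr.score(X, y), 3))
    ```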

  19. Evaluation of methodologies for assessing the overall diet: dietary quality scores and dietary pattern analysis.

    PubMed

    Ocké, Marga C

    2013-05-01

    This paper aims to describe different approaches for studying the overall diet with advantages and limitations. Studies of the overall diet have emerged because the relationship between dietary intake and health is very complex with all kinds of interactions. These cannot be captured well by studying single dietary components. Three main approaches to study the overall diet can be distinguished. The first method is researcher-defined scores or indices of diet quality. These are usually based on guidelines for a healthy diet or on diets known to be healthy. The second approach, using principal component or cluster analysis, is driven by the underlying dietary data. In principal component analysis, scales are derived based on the underlying relationships between food groups, whereas in cluster analysis, subgroups of the population are created with people that cluster together based on their dietary intake. A third approach includes methods that are driven by a combination of biological pathways and the underlying dietary data. Reduced rank regression defines linear combinations of food intakes that maximally explain nutrient intakes or intermediate markers of disease. Decision tree analysis identifies subgroups of a population whose members share dietary characteristics that influence (intermediate markers of) disease. It is concluded that all approaches have advantages and limitations and essentially answer different questions. The third approach is still more in an exploration phase, but seems to have great potential with complementary value. More insight into the utility of conducting studies on the overall diet can be gained if more attention is given to methodological issues.

  20. Tailored multivariate analysis for modulated enhanced diffraction

    DOE PAGES

    Caliandro, Rocco; Guccione, Pietro; Nico, Giovanni; ...

    2015-10-21

    Modulated enhanced diffraction (MED) is a technique allowing the dynamic structural characterization of crystalline materials subjected to an external stimulus, which is particularly suited for in situ and operando structural investigations at synchrotron sources. Contributions from the (active) part of the crystal system that varies synchronously with the stimulus can be extracted by an offline analysis, which can only be applied in the case of periodic stimuli and linear system responses. In this paper a new decomposition approach based on multivariate analysis is proposed. The standard principal component analysis (PCA) is adapted to treat MED data: specific figures of merit based on the scores and loadings are defined, and the directions of the principal components obtained by PCA are modified to maximize such figures of merit. As a result, a general method to decompose MED data, called optimum constrained components rotation (OCCR), is developed, which produces very precise results on simulated data, even in the case of non-periodic stimuli and/or non-linear responses. Furthermore, the multivariate analysis approach is able to supply in one shot both the diffraction pattern related to the active atoms (through the OCCR loadings) and the time dependence of the system response (through the OCCR scores). When applied to real data, however, OCCR was able to supply only the latter information, as the former was hindered by changes in the abundances of different crystal phases, which occurred alongside structural variations in the specific case considered. Developing a decomposition procedure able to cope with this combined effect represents the next challenge in MED analysis.

  1. Robust Segmentation of Planar and Linear Features of Terrestrial Laser Scanner Point Clouds Acquired from Construction Sites.

    PubMed

    Maalek, Reza; Lichti, Derek D; Ruwanpura, Janaka Y

    2018-03-08

    Automated segmentation of planar and linear features of point clouds acquired from construction sites is essential for the automatic extraction of building construction elements such as columns, beams and slabs. However, many planar and linear segmentation methods use scene-dependent similarity thresholds that may not provide generalizable solutions for all environments. In addition, outliers exist in construction site point clouds due to data artefacts caused by moving objects, occlusions and dust. To address these concerns, a novel method for robust classification and segmentation of planar and linear features is proposed. First, coplanar and collinear points are classified through a robust principal components analysis procedure. The classified points are then grouped using a new robust clustering method, the robust complete linkage method. A robust method is also proposed to extract the points of flat-slab floors and/or ceilings independent of the aforementioned stages to improve computational efficiency. The applicability of the proposed method is evaluated in eight datasets acquired from a complex laboratory environment and two construction sites at the University of Calgary. The precision, recall, and accuracy of the segmentation at both construction sites were 96.8%, 97.7% and 95%, respectively. These results demonstrate the suitability of the proposed method for robust segmentation of planar and linear features of contaminated datasets, such as those collected from construction sites.
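    The core idea behind classifying coplanar and collinear points with PCA can be sketched as follows: the eigenvalue spectrum of a neighbourhood's covariance matrix separates lines (one dominant eigenvalue) from planes (two). This is a simplified, non-robust illustration; the paper's robust PCA, robust clustering and thresholds are not reproduced.

    ```python
    # Sketch: classify a neighbourhood of 3-D points as linear, planar or
    # volumetric from the eigenvalues of its covariance matrix (non-robust version).
    import numpy as np

    def local_shape(points, linear_tol=0.05, planar_tol=0.05):
        """Classify an (n, 3) point neighbourhood by its normalised eigenvalue spectrum."""
        centred = points - points.mean(axis=0)
        evals = np.linalg.eigvalsh(np.cov(centred.T))[::-1]   # descending eigenvalues
        evals = evals / evals.sum()
        if evals[1] < linear_tol:     # only one significant direction of spread
            return "linear"
        if evals[2] < planar_tol:     # two significant directions, negligible third
            return "planar"
        return "volumetric"

    rng = np.random.default_rng(3)
    t = rng.uniform(size=(100, 1))
    beam = np.hstack([t, 0.001 * rng.normal(size=(100, 2))])                            # column/beam-like points
    slab = np.hstack([rng.uniform(size=(100, 2)), 0.001 * rng.normal(size=(100, 1))])   # slab-like points
    print(local_shape(beam), local_shape(slab))   # expected: linear planar
    ```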

  2. Robust Segmentation of Planar and Linear Features of Terrestrial Laser Scanner Point Clouds Acquired from Construction Sites

    PubMed Central

    Maalek, Reza; Lichti, Derek D; Ruwanpura, Janaka Y

    2018-01-01

    Automated segmentation of planar and linear features of point clouds acquired from construction sites is essential for the automatic extraction of building construction elements such as columns, beams and slabs. However, many planar and linear segmentation methods use scene-dependent similarity thresholds that may not provide generalizable solutions for all environments. In addition, outliers exist in construction site point clouds due to data artefacts caused by moving objects, occlusions and dust. To address these concerns, a novel method for robust classification and segmentation of planar and linear features is proposed. First, coplanar and collinear points are classified through a robust principal components analysis procedure. The classified points are then grouped using a new robust clustering method, the robust complete linkage method. A robust method is also proposed to extract the points of flat-slab floors and/or ceilings independent of the aforementioned stages to improve computational efficiency. The applicability of the proposed method is evaluated in eight datasets acquired from a complex laboratory environment and two construction sites at the University of Calgary. The precision, recall, and accuracy of the segmentation at both construction sites were 96.8%, 97.7% and 95%, respectively. These results demonstrate the suitability of the proposed method for robust segmentation of planar and linear features of contaminated datasets, such as those collected from construction sites. PMID:29518062

  3. Assessment of Supportive, Conflicted, and Controlling Dimensions of Family Functioning: A Principal Components Analysis of Family Environment Scale Subscales in a College Sample.

    ERIC Educational Resources Information Center

    Kronenberger, William G.; Thompson, Robert J., Jr.; Morrow, Catherine

    1997-01-01

    A principal components analysis of the Family Environment Scale (FES) (R. Moos and B. Moos, 1994) was performed using 113 undergraduates. Research supported 3 broad components encompassing the 10 FES subscales. These results supported previous research and the generalization of the FES to college samples. (SLD)

  4. Time series analysis of collective motions in proteins

    NASA Astrophysics Data System (ADS)

    Alakent, Burak; Doruker, Pemra; Çamurdan, Mehmet C.

    2004-01-01

    The dynamics of α-amylase inhibitor tendamistat around its native state is investigated using time series analysis of the principal components of the Cα atomic displacements obtained from molecular dynamics trajectories. Collective motion along a principal component is modeled as a homogeneous nonstationary process, which is the result of damped oscillations in local minima superimposed on a random walk. The motion in local minima is described by a stationary autoregressive moving average model, consisting of the frequency, damping factor, moving average parameters and random shock terms. Frequencies for the first 50 principal components are found to be in the 3-25 cm-1 range and are well correlated with the principal component indices and also with atomistic normal mode analysis results. Damping factors, though their correlation is less pronounced, decrease as principal component indices increase, indicating that low-frequency motions are less affected by friction. The existence of a positive moving average parameter indicates that the stochastic force term is likely to disturb the mode in opposite directions at two successive sampling times, showing the mode's tendency to stay close to a minimum. All four of these parameters affect the mean square fluctuations of a principal mode within a single minimum. The inter-minima transitions are described by a random walk model, which is driven by a random shock term considerably smaller than that for the intra-minimum motion. The principal modes are classified into three subspaces based on their dynamics: essential, semiconstrained, and constrained, at least in partial consistency with previous studies. The Gaussian-type distributions of the intermediate modes, called "semiconstrained" modes, are explained by asserting that this random walk behavior is not completely free but confined between energy barriers.
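    A hedged sketch of fitting an autoregressive moving average model to a single principal-component time series, as described above; the trajectory is synthetic, the model order is an assumption, and the extraction of frequency and damping from the AR roots is omitted.

    ```python
    # Sketch: fit an ARMA(2,1) model to one principal-component time series.
    import numpy as np
    from statsmodels.tsa.arima.model import ARIMA

    rng = np.random.default_rng(4)
    t = np.arange(2000)
    # synthetic damped oscillation plus noise as a stand-in for a PC projection
    pc1 = np.exp(-0.001 * t) * np.cos(2 * np.pi * t / 50) + 0.2 * rng.normal(size=t.size)

    # AR(2) can capture an oscillation frequency and damping; MA(1) the shock term
    result = ARIMA(pc1, order=(2, 0, 1)).fit()
    print(result.params)
    ```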

  5. Cortical Contribution to Linear, Non-linear and Frequency Components of Motor Variability Control during Standing.

    PubMed

    König Ignasiak, Niklas; Habermacher, Lars; Taylor, William R; Singh, Navrag B

    2017-01-01

    Motor variability is an inherent feature of all human movements and reflects the quality of functional task performance. Depending on the requirements of the motor task, the human sensory-motor system is thought to be able to flexibly govern the appropriate level of variability. However, it remains unclear which neurophysiological structures are responsible for the control of motor variability. In this study, we tested the contribution of cortical cognitive resources to the control of motor variability (in this case postural sway) using a dual-task paradigm, and furthermore observed potential changes in control strategy by evaluating Ia-afferent integration (H-reflex). Twenty healthy subjects were instructed to stand relaxed on a force plate with eyes open and closed, as well as while trying to minimize sway magnitude and while performing a "subtracting-sevens" cognitive task. In total, 25 linear and non-linear parameters were used to evaluate postural sway, and these were combined using a Principal Components procedure. The neurophysiological response of the Ia-afferent reflex loop was quantified using the Hoffman reflex. In order to assess the contribution of the H-reflex to the sway outcome in the different standing conditions, multiple mixed-model ANCOVAs were performed. The results suggest that subjects were unable to further minimize their sway, despite actively focusing on doing so. The dual-task had a destabilizing effect on postural sway, which could partly (by 4%) be counter-balanced by increasing reliance on Ia-afferent information. The effect of the dual-task was larger than the protective mechanism of increasing Ia-afferent information. We therefore conclude that cortical structures, as compared to peripheral reflex loops, play a dominant role in the control of motor variability.

  6. Local linear discriminant analysis framework using sample neighbors.

    PubMed

    Fan, Zizhu; Xu, Yong; Zhang, David

    2011-07-01

    The linear discriminant analysis (LDA) is a very popular linear feature extraction approach. The algorithms of LDA usually perform well under the following two assumptions. The first assumption is that the global data structure is consistent with the local data structure. The second assumption is that the input data classes are Gaussian distributions. However, in real-world applications, these assumptions are not always satisfied. In this paper, we propose an improved LDA framework, the local LDA (LLDA), which can perform well without needing to satisfy the above two assumptions. Our LLDA framework can effectively capture the local structure of samples. According to different types of local data structure, our LLDA framework incorporates several different forms of linear feature extraction approaches, such as the classical LDA and principal component analysis. The proposed framework includes two LLDA algorithms: a vector-based LLDA algorithm and a matrix-based LLDA (MLLDA) algorithm. MLLDA is directly applicable to image recognition, such as face recognition. Our algorithms need to train only a small portion of the whole training set before testing a sample. They are suitable for learning large-scale databases especially when the input data dimensions are very high and can achieve high classification accuracy. Extensive experiments show that the proposed algorithms can obtain good classification results.

  7. Binding affinity toward human prion protein of some anti-prion compounds - Assessment based on QSAR modeling, molecular docking and non-parametric ranking.

    PubMed

    Kovačević, Strahinja; Karadžić, Milica; Podunavac-Kuzmanović, Sanja; Jevrić, Lidija

    2018-01-01

    The present study is based on the quantitative structure-activity relationship (QSAR) analysis of binding affinity toward human prion protein (huPrP C ) of quinacrine, pyridine dicarbonitrile, diphenylthiazole and diphenyloxazole analogs applying different linear and non-linear chemometric regression techniques, including univariate linear regression, multiple linear regression, partial least squares regression and artificial neural networks. The QSAR analysis distinguished molecular lipophilicity as an important factor that contributes to the binding affinity. Principal component analysis was used in order to reveal similarities or dissimilarities among the studied compounds. The analysis of in silico absorption, distribution, metabolism, excretion and toxicity (ADMET) parameters was conducted. The ranking of the studied analogs on the basis of their ADMET parameters was done applying the sum of ranking differences, as a relatively new chemometric method. The main aim of the study was to reveal the most important molecular features whose changes lead to the changes in the binding affinities of the studied compounds. Another point of view on the binding affinity of the most promising analogs was established by application of molecular docking analysis. The results of the molecular docking were proven to be in agreement with the experimental outcome. Copyright © 2017 Elsevier B.V. All rights reserved.

  8. Burst and Principal Components Analyses of MEA Data Separates Chemicals by Class

    EPA Science Inventory

    Microelectrode arrays (MEAs) detect drug and chemical induced changes in action potential "spikes" in neuronal networks and can be used to screen chemicals for neurotoxicity. Analytical "fingerprinting," using Principal Components Analysis (PCA) on spike trains recorded from prim...

  9. EVALUATION OF ACID DEPOSITION MODELS USING PRINCIPAL COMPONENT SPACES

    EPA Science Inventory

    An analytical technique involving principal components analysis is proposed for use in the evaluation of acid deposition models. Relationships among model predictions are compared to those among measured data, rather than the more common one-to-one comparison of predictions to mea...

  10. Plant Invasions in China – Challenges and Chances

    PubMed Central

    Axmacher, Jan C.; Sang, Weiguo

    2013-01-01

    Invasive species cause serious environmental and economic harm and threaten global biodiversity. We set out to investigate how quickly invasive plant species are currently spreading in China and how their resulting distribution patterns are linked to socio-economic and environmental conditions. A comparison of the invasive plant species density (log species/log area) reported in 2008 with current data shows that invasive species were originally highly concentrated in the wealthy, southeastern coastal provinces of China, but they are currently rapidly spreading inland. Linear regression models based on the species density and turnover of invasive plants as dependent parameters and principal components representing key socio-economic and environmental parameters as predictors indicate strong positive links between invasive plant density and the overall phytodiversity and associated climatic parameters. Principal components representing socio-economic factors and endemic plant density also show significant positive links with invasive plant density. Urgent control and eradication measures are needed in China's coastal provinces to counteract the rapid inland spread of invasive plants. Strict controls of imports through seaports need to be accompanied by similarly strict controls of the developing horticultural trade and underpinned by awareness campaigns for China's increasingly affluent population to limit the arrival of new invaders. Furthermore, China needs to fully utilize its substantial native phytodiversity, rather than relying on exotics, in current large-scale afforestation projects and in the creation of urban green spaces. PMID:23691164

  11. Multivariate classification of the infrared spectra of cell and tissue samples

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Haaland, D.M.; Jones, H.D.; Thomas, E.V.

    1997-03-01

    Infrared microspectroscopy of biopsied canine lymph cells and tissue was performed to investigate the possibility of using IR spectra coupled with multivariate classification methods to classify the samples as normal, hyperplastic, or neoplastic (malignant). IR spectra were obtained in transmission mode through BaF2 windows and in reflection mode from samples prepared on gold-coated microscope slides. Cytology and histopathology samples were prepared by a variety of methods to identify the optimal methods of sample preparation. Cytospinning procedures that yielded a monolayer of cells on the BaF2 windows produced a limited set of IR transmission spectra. These transmission spectra were converted to absorbance and formed the basis for a classification rule that yielded 100% correct classification in a cross-validated context. Classifications of normal, hyperplastic, and neoplastic cell sample spectra were achieved by using both partial least-squares (PLS) and principal component regression (PCR) classification methods. Linear discriminant analysis applied to principal components obtained from the spectral data yielded a small number of misclassifications. PLS weight loading vectors yield valuable qualitative insight into the molecular changes that are responsible for the success of the infrared classification. These successful classification results show promise for assisting pathologists in the diagnosis of cell types and offer future potential for in vivo IR detection of some types of cancer. © 1997 Society for Applied Spectroscopy.

  12. Assets as a Socioeconomic Status Index: Categorical Principal Components Analysis vs. Latent Class Analysis.

    PubMed

    Sartipi, Majid; Nedjat, Saharnaz; Mansournia, Mohammad Ali; Baigi, Vali; Fotouhi, Akbar

    2016-11-01

    Some variables, such as socioeconomic status (SES), cannot be measured directly; instead, such 'latent variables' are measured indirectly by combining tangible items. There are different methods for measuring latent variables, such as the data reduction methods of Principal Components Analysis (PCA) and Latent Class Analysis (LCA). The purpose of our study was to measure an assets index - as a representative of SES - using the two methods of Non-Linear PCA (NLPCA) and LCA, and to compare them in order to choose the most appropriate model. This was a cross-sectional study in which 1995 respondents in Tehran completed questionnaires about their assets. The data were analyzed with SPSS 19 (CATPCA command) and SAS 9.2 (PROC LCA command) to estimate socioeconomic status, and the results were compared on the basis of the intra-class correlation coefficient (ICC). The 6 classes derived from LCA based on BIC were highly consistent with the 6 classes from CATPCA (categorical PCA) (ICC = 0.87, 95% CI: 0.86-0.88). There is no gold standard for measuring SES, so it is not possible to say definitively that one specific method is better than another. LCA is a complicated method that presents detailed information about latent variables and requires one assumption (local independence), while NLPCA is a simple method that requires more assumptions. Overall, NLPCA appears to be an acceptable method of analysis because of its simplicity and its high agreement with LCA.

  13. Raman exfoliative cytology for oral precancer diagnosis

    NASA Astrophysics Data System (ADS)

    Sahu, Aditi; Gera, Poonam; Pai, Venkatesh; Dubey, Abhishek; Tyagi, Gunjan; Waghmare, Mandavi; Pagare, Sandeep; Mahimkar, Manoj; Murali Krishna, C.

    2017-11-01

    Oral premalignant lesions (OPLs) such as leukoplakia, erythroplakia, and oral submucous fibrosis, often precede oral cancer. Screening and management of these premalignant conditions can improve prognosis. Raman spectroscopy has previously demonstrated potential in the diagnosis of oral premalignant conditions (in vivo), detected viral infection, and identified cancer in both oral and cervical exfoliated cells (ex vivo). The potential of Raman exfoliative cytology (REC) in identifying premalignant conditions was investigated. Oral exfoliated samples were collected from healthy volunteers (n=20), healthy volunteers with tobacco habits (n=20), and oral premalignant conditions (n=27, OPL) using Cytobrush. Spectra were acquired using Raman microprobe. Spectral acquisition parameters were: λex: 785 nm, laser power: 40 mW, acquisition time: 15 s, and average: 3. Postspectral acquisition, cell pellet was subjected to Pap staining. Multivariate analysis was carried out using principal component analysis and principal component-linear discriminant analysis using both spectra- and patient-wise approaches in three- and two-group models. OPLs could be identified with ˜77% (spectra-wise) and ˜70% (patient-wise) sensitivity in the three-group model while with 86% (spectra-wise) and 83% (patient-wise) in the two-group model. Use of histopathologically confirmed premalignant cases and better sampling devices may help in development of improved standard models and also enhance the sensitivity of the method. Future longitudinal studies can help validate potential of REC in screening and monitoring high-risk populations and prognosis prediction of premalignant lesions.

  14. Quantitative identification and source apportionment of anthropogenic heavy metals in marine sediment of Hong Kong

    NASA Astrophysics Data System (ADS)

    Zhou, Feng; Guo, Huaicheng; Liu, Lei

    2007-10-01

    Based on ten heavy metals collected twice annually at 59 sites from 1998 to 2004, enrichment factors (EFs), principal component analysis (PCA) and multivariate linear regression of absolute principal component scores (MLR-APCS) were used to identify and apportion the sources of anthropogenic heavy metals in marine sediment. EFs with Fe as the normalizer and local background values as the reference were tested and found suitable for Hong Kong; Zn, Ni, Pb, Cu, Cd, Hg and Cr mainly originated from anthropogenic sources, while Al, Mn and Fe were derived from rock weathering. Rotated PCA and GIS mapping further identified two types of anthropogenic sources and the regions they impact: (1) electronic industrial pollution, riparian runoff and vehicle exhaust affected the entire Victoria Harbour, inner Tolo Harbour, Eastern Buffer, inner Deep Bay and Cheung Chau; and (2) discharges from textile factories and paint influenced Tsuen Wan Bay, Kwun Tong typhoon shelter and Rambler Channel. In addition, MLR-APCS was successfully introduced to quantitatively determine the source contributions, with uncertainties mostly below 8%: the first anthropogenic source type was responsible for 50.0, 45.1, 86.6, 78.9 and 87.5% of the Zn, Pb, Cu, Cd and Hg, respectively, whereas 49.9% of the Ni and 58.4% of the Cr came from the second.
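    A simplified sketch of the MLR-APCS idea on synthetic data: compute absolute principal component scores by subtracting the score of an artificial zero-concentration sample, then regress each metal on them. The rotation and EF screening used in the study are not reproduced here.

    ```python
    # Sketch: multivariate linear regression on absolute principal component
    # scores (APCS) for source apportionment, on synthetic metal concentrations.
    import numpy as np
    from sklearn.preprocessing import StandardScaler
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(5)
    C = np.abs(rng.normal(loc=50.0, scale=10.0, size=(118, 7)))   # 118 samples x 7 metals (synthetic)

    scaler = StandardScaler().fit(C)
    pca = PCA(n_components=2).fit(scaler.transform(C))
    scores = pca.transform(scaler.transform(C))

    # absolute scores: subtract the score of an artificial all-zero-concentration sample
    z0_scores = pca.transform(scaler.transform(np.zeros((1, C.shape[1]))))
    apcs = scores - z0_scores

    # regress each metal on the APCS; coefficient x mean APCS approximates each source's contribution
    for j in range(C.shape[1]):
        reg = LinearRegression().fit(apcs, C[:, j])
        contributions = reg.coef_ * apcs.mean(axis=0)
        print(f"metal {j}: source contributions {np.round(contributions, 1)}, intercept {reg.intercept_:.1f}")
    ```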

  15. Multivariate optimization of a synergistic blend of oleoresin sage (Salvia officinalis L.) and ascorbyl palmitate to stabilize sunflower oil.

    PubMed

    Upadhyay, Rohit; Mishra, Hari Niwas

    2016-04-01

    The simultaneous optimization of a synergistic blend of oleoresin sage (SAG) and ascorbyl palmitate (AP) in sunflower oil (SO) was performed using a central composite rotatable design coupled with principal component analysis (PCA) and response surface methodology (RSM). The physicochemical parameters, viz. peroxide value, anisidine value, free fatty acids, induction period, total polar matter, antioxidant capacity and conjugated diene value, were considered as response variables. PCA reduced the original set of correlated responses to a few uncorrelated principal components (PCs). PC1 (eigenvalue 5.78; data variance explained 82.53%) was selected for optimization using RSM. The quadratic model adequately described the data (R2 = 0.91, p < 0.05) and the lack of fit was not significant (p > 0.05). The contour plot of the PC1 score indicated an optimal synergistic combination of 1289.19 and 218.06 ppm for SAG and AP, respectively. This combination of SAG and AP resulted in a shelf life of 320 days at 25 °C, estimated using a linear shelf-life prediction model. In conclusion, the versatility of the PCA-RSM approach allows easy interpretation in multiple-response optimization. This approach can be considered a useful guide for developing new oil blends stabilized with food additives from natural sources.

  16. Integrating Multiple Correlated Phenotypes for Genetic Association Analysis by Maximizing Heritability

    PubMed Central

    Zhou, Jin J.; Cho, Michael H.; Lange, Christoph; Lutz, Sharon; Silverman, Edwin K.; Laird, Nan M.

    2015-01-01

    Many correlated disease variables are analyzed jointly in genetic studies in the hope of increasing power to detect causal genetic variants. One approach involves assessing the relationship between each phenotype and each single nucleotide polymorphism (SNP) individually and using a Bonferroni correction for the effective number of tests conducted. Alternatively, one can apply a multivariate regression or a dimension reduction technique, such as principal component analysis (PCA), and test for the association with the principal components (PC) of the phenotypes rather than the individual phenotypes. Inspired by the previous approaches of combining phenotypes to maximize heritability at individual SNPs, in this paper, we propose to construct a maximally heritable phenotype (MaxH) by taking advantage of the estimated total heritability and co-heritability. The heritability and co-heritability only need to be estimated once, therefore our method is applicable to genome-wide scans. MaxH phenotype is a linear combination of the individual phenotypes with increased heritability and power over the phenotypes being combined. Simulations show that the heritability and power achieved agree well with the theory for large samples and two phenotypes. We compare our approach with commonly used methods and assess both the heritability and the power of the MaxH phenotype. Moreover we provide suggestions for how to choose the phenotypes for combination. An application of our approach to a COPD genome-wide association study shows the practical relevance. PMID:26111731
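    One way to read the construction of a maximally heritable linear combination is as a generalized eigenproblem, maximizing w'Gw / w'Pw for estimated genetic (G) and phenotypic (P) covariance matrices. This framing and the numbers below are illustrative assumptions, not the paper's exact estimation procedure.

    ```python
    # Sketch: weights of a maximally heritable linear combination via the
    # generalized eigenproblem G w = h2 P w (illustrative formulation and numbers).
    import numpy as np
    from scipy.linalg import eigh

    G = np.array([[0.30, 0.10],
                  [0.10, 0.20]])   # assumed genetic covariance (heritabilities and co-heritability)
    P = np.array([[1.00, 0.40],
                  [0.40, 1.00]])   # assumed phenotypic covariance

    h2, W = eigh(G, P)             # generalized symmetric eigenproblem; eigenvalues ascending
    w = W[:, -1]                   # weights giving the largest heritability
    print("maximum heritability:", round(h2[-1], 3), "weights:", np.round(w, 3))
    ```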

  17. Fast Principal-Component Analysis Reveals Convergent Evolution of ADH1B in Europe and East Asia

    PubMed Central

    Galinsky, Kevin J.; Bhatia, Gaurav; Loh, Po-Ru; Georgiev, Stoyan; Mukherjee, Sayan; Patterson, Nick J.; Price, Alkes L.

    2016-01-01

    Searching for genetic variants with unusual differentiation between subpopulations is an established approach for identifying signals of natural selection. However, existing methods generally require discrete subpopulations. We introduce a method that infers selection using principal components (PCs) by identifying variants whose differentiation along top PCs is significantly greater than the null distribution of genetic drift. To enable the application of this method to large datasets, we developed the FastPCA software, which employs recent advances in random matrix theory to accurately approximate top PCs while reducing time and memory cost from quadratic to linear in the number of individuals, a computational improvement of many orders of magnitude. We apply FastPCA to a cohort of 54,734 European Americans, identifying 5 distinct subpopulations spanning the top 4 PCs. Using the PC-based test for natural selection, we replicate previously known selected loci and identify three new genome-wide significant signals of selection, including selection in Europeans at ADH1B. The coding variant rs1229984∗T has previously been associated to a decreased risk of alcoholism and shown to be under selection in East Asians; we show that it is a rare example of independent evolution on two continents. We also detect selection signals at IGFBP3 and IGH, which have also previously been associated to human disease. PMID:26924531
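    The scaling idea behind FastPCA, approximating only the top PCs with a randomized solver, can be sketched with scikit-learn's randomized SVD backend; the genotype matrix below is synthetic and the FastPCA software itself is not used.

    ```python
    # Sketch: approximate the top principal components with a randomized SVD solver.
    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(6)
    genotypes = rng.integers(0, 3, size=(5000, 2000)).astype(float)   # individuals x SNPs, 0/1/2 allele counts (synthetic)

    pca = PCA(n_components=4, svd_solver="randomized", random_state=0).fit(genotypes)
    top_pcs = pca.transform(genotypes)          # per-individual coordinates along the top 4 PCs
    print(top_pcs.shape, np.round(pca.explained_variance_ratio_, 4))
    ```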

  18. Taste characteristics based quantitative and qualitative evaluation of ginseng adulteration.

    PubMed

    Cui, Shaoqing; Yang, Liangcheng; Wang, Jun; Wang, Xinlei

    2015-05-01

    Adulteration of American ginseng with Asian ginseng is common and has caused much harm to customers. Panel evaluation is commonly used to determine their differences, but it is subjective. Chemical instruments can identify critical compounds, but they are time-consuming and expensive. Therefore, a fast, accurate and convenient method is required. A taste sensing system, combining the advantages of the above two technologies, provides a novel potential technology for determining ginseng adulteration. The aim was to build appropriate models to distinguish and predict ginseng adulteration using taste characteristics. It was found that ginsenoside contents decreased linearly (R2 = 0.92) with mixing ratio. A biplot of the principal component analysis showed good performance in classifying samples, with the first two principal components explaining 89.7% of the variance, and it was noted that bitterness, astringency, aftertaste of bitterness and astringency, and saltiness drove the successful discrimination. After factor screening, bitterness, astringency, aftertaste of bitterness and saltiness were employed to build latent models. Bitterness, astringency and aftertaste of bitterness were demonstrated to be most effective in predicting the adulteration ratio, while bitterness and aftertaste of bitterness were most effective for ginsenoside content prediction. Taste characteristics of adulterated ginseng, considered as a taste fingerprint, can provide novel guidance for determining the adulteration of American ginseng with Asian ginseng. © 2014 Society of Chemical Industry.

  19. Application of near-infrared spectroscopy for the rapid quality assessment of Radix Paeoniae Rubra

    NASA Astrophysics Data System (ADS)

    Zhan, Hao; Fang, Jing; Tang, Liying; Yang, Hongjun; Li, Hua; Wang, Zhuju; Yang, Bin; Wu, Hongwei; Fu, Meihong

    2017-08-01

    Near-infrared (NIR) spectroscopy with multivariate analysis was used to quantify gallic acid, catechin, albiflorin, and paeoniflorin in Radix Paeoniae Rubra, and the feasibility of classifying samples originating from different areas was investigated. A new high-performance liquid chromatography method was developed and validated as the reference method for analyzing gallic acid, catechin, albiflorin, and paeoniflorin in Radix Paeoniae Rubra. Partial least squares (PLS), principal component regression (PCR), and stepwise multivariate linear regression (SMLR) were used to calibrate the regression models. Different data pretreatments, such as derivatives (1st and 2nd), multiplicative scatter correction, standard normal variate, Savitzky-Golay filtering, and Norris derivative filtering, were applied to remove systematic errors. The performance of the models was evaluated according to the root mean square error of calibration (RMSEC), root mean square error of prediction (RMSEP), root mean square error of cross-validation (RMSECV), and correlation coefficient (r). The results show that, compared to PCR and SMLR, PLS had lower RMSEC, RMSECV, and RMSEP and higher r for all four analytes. PLS coupled with proper pretreatments showed good performance in both fitting and prediction. Furthermore, the geographic origins of the Radix Paeoniae Rubra samples were partly distinguished by principal component analysis. This study shows that NIR with PLS is a reliable, inexpensive, and rapid tool for the quality assessment of Radix Paeoniae Rubra.
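    A hedged sketch of comparing PLS and PCR calibrations by cross-validated RMSE, analogous to the RMSEC/RMSECV/RMSEP comparison above; the spectra and reference values are synthetic placeholders.

    ```python
    # Sketch: compare PLS and PCR calibrations by 5-fold cross-validated RMSE.
    import numpy as np
    from sklearn.cross_decomposition import PLSRegression
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LinearRegression
    from sklearn.pipeline import make_pipeline
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(7)
    spectra = rng.normal(size=(150, 500))                            # 150 samples x 500 NIR wavelengths (synthetic)
    y = spectra[:, :5].sum(axis=1) + 0.1 * rng.normal(size=150)      # pseudo reference analyte content

    models = {
        "PLS": PLSRegression(n_components=5),
        "PCR": make_pipeline(PCA(n_components=5), LinearRegression()),
    }
    for name, model in models.items():
        rmse = -cross_val_score(model, spectra, y, cv=5, scoring="neg_root_mean_squared_error").mean()
        print(f"{name}: cross-validated RMSE = {rmse:.3f}")
    ```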

  20. Multivariate methods for indoor PM10 and PM2.5 modelling in naturally ventilated schools buildings

    NASA Astrophysics Data System (ADS)

    Elbayoumi, Maher; Ramli, Nor Azam; Md Yusof, Noor Faizah Fitri; Yahaya, Ahmad Shukri Bin; Al Madhoun, Wesam; Ul-Saufie, Ahmed Zia

    2014-09-01

    In this study, the concentrations of PM10, PM2.5, CO and CO2 and meteorological variables (wind speed, air temperature, and relative humidity) were employed to predict the annual and seasonal indoor concentrations of PM10 and PM2.5 using multivariate statistical methods. The data were collected in twelve naturally ventilated schools in the Gaza Strip (Palestine) from October 2011 to May 2012 (one academic year). Bivariate correlation analysis showed that indoor PM10 and PM2.5 were highly positively correlated with the outdoor concentrations of PM10 and PM2.5. Multiple linear regression (MLR) was then used for modelling, and R2 values were determined as 0.62 and 0.84 for indoor PM10 and PM2.5, respectively. The performance indicators of the MLR models showed that the annual models predicted PM10 and PM2.5 better than the seasonal models. In order to reduce the number of input variables, principal component analysis (PCA) and principal component regression (PCR) were applied using the annual data, giving predicted R2 values of 0.40 and 0.73 for PM10 and PM2.5, respectively. The PM10 models (MLR and PCR) tend to underestimate indoor PM10 concentrations, as they do not take into account the occupants' activities, which strongly affect indoor concentrations during class hours.

  1. Spectroscopic and Chemometric Analysis of Binary and Ternary Edible Oil Mixtures: Qualitative and Quantitative Study.

    PubMed

    Jović, Ozren; Smolić, Tomislav; Primožič, Ines; Hrenar, Tomica

    2016-04-19

    The aim of this study was to investigate the feasibility of FTIR-ATR spectroscopy coupled with the multivariate numerical methodology for qualitative and quantitative analysis of binary and ternary edible oil mixtures. Four pure oils (extra virgin olive oil, high oleic sunflower oil, rapeseed oil, and sunflower oil), as well as their 54 binary and 108 ternary mixtures, were analyzed using FTIR-ATR spectroscopy in combination with principal component and discriminant analysis, partial least-squares, and principal component regression. It was found that the composition of all 166 samples can be excellently represented using only the first three principal components describing 98.29% of total variance in the selected spectral range (3035-2989, 1170-1140, 1120-1100, 1093-1047, and 930-890 cm(-1)). Factor scores in 3D space spanned by these three principal components form a tetrahedral-like arrangement: pure oils being at the vertices, binary mixtures at the edges, and ternary mixtures on the faces of a tetrahedron. To confirm the validity of results, we applied several cross-validation methods. Quantitative analysis was performed by minimization of root-mean-square error of cross-validation values regarding the spectral range, derivative order, and choice of method (partial least-squares or principal component regression), which resulted in excellent predictions for test sets (R(2) > 0.99 in all cases). Additionally, experimentally more demanding gas chromatography analysis of fatty acid content was carried out for all specimens, confirming the results obtained by FTIR-ATR coupled with principal component analysis. However, FTIR-ATR provided a considerably better model for prediction of mixture composition than gas chromatography, especially for high oleic sunflower oil.

  2. Short communication: Discrimination between retail bovine milks with different fat contents using chemometrics and fatty acid profiling.

    PubMed

    Vargas-Bello-Pérez, Einar; Toro-Mujica, Paula; Enriquez-Hidalgo, Daniel; Fellenberg, María Angélica; Gómez-Cortés, Pilar

    2017-06-01

    We used a multivariate chemometric approach to differentiate or associate retail bovine milks with different fat contents and non-dairy beverages, using fatty acid profiles and statistical analysis. We collected samples of bovine milk (whole, semi-skim, and skim; n = 62) and non-dairy beverages (n = 27), and we analyzed them using gas-liquid chromatography. Principal component analysis of the fatty acid data yielded 3 significant principal components, which accounted for 72% of the total variance in the data set. Principal component 1 was related to saturated fatty acids (C4:0, C6:0, C8:0, C12:0, C14:0, C17:0, and C18:0) and monounsaturated fatty acids (C14:1 cis-9, C16:1 cis-9, C17:1 cis-9, and C18:1 trans-11); whole milk samples were clearly differentiated from the rest using this principal component. Principal component 2 differentiated semi-skim milk samples by n-3 fatty acid content (C20:3n-3, C20:5n-3, and C22:6n-3). Principal component 3 was related to C18:2 trans-9,trans-12 and C20:4n-6, and its lower scores were observed in skim milk and non-dairy beverages. A cluster analysis yielded 3 groups: group 1 consisted of only whole milk samples, group 2 was represented mainly by semi-skim milks, and group 3 included skim milk and non-dairy beverages. Overall, the present study showed that a multivariate chemometric approach is a useful tool for differentiating or associating retail bovine milks and non-dairy beverages using their fatty acid profile. Copyright © 2017 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  3. Use of multivariate statistics to identify unreliable data obtained using CASA.

    PubMed

    Martínez, Luis Becerril; Crispín, Rubén Huerta; Mendoza, Maximino Méndez; Gallegos, Oswaldo Hernández; Martínez, Andrés Aragón

    2013-06-01

    In order to identify unreliable data in a dataset of motility parameters from a pilot study acquired by a veterinarian with experience in boar semen handling, but without experience in the operation of a computer assisted sperm analysis (CASA) system, a multivariate graphical and statistical analysis was performed. Sixteen boar semen samples were aliquoted and then incubated with varying concentrations of progesterone from 0 to 3.33 µg/ml and analyzed in a CASA system. After standardization of the data, Chernoff faces were drawn for each measurement, and a principal component analysis (PCA) was used to reduce the dimensionality and pre-process the data before hierarchical clustering. The first twelve individual measurements showed abnormal features when Chernoff faces were drawn. PCA revealed that principal components 1 and 2 explained 63.08% of the variance in the dataset. Values of the principal components for each individual measurement of the semen samples were mapped to identify differences among treatments or among boars. Twelve individual measurements presented low values of principal component 1. Confidence ellipses on the map of principal components showed no statistically significant effects for treatment or boar. Hierarchical clustering performed on the first two principal components produced three clusters. Cluster 1 contained evaluations of the first two samples in each treatment, each one from a different boar. With the exception of one individual measurement, all other measurements in cluster 1 were the same as those observed as abnormal Chernoff faces. Unreliable data in cluster 1 are probably related to the operator's inexperience with a CASA system. These findings could be used to objectively evaluate the skill level of an operator of a CASA system. This may be particularly useful in the quality control of semen analysis using CASA systems.
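    The standardize, PCA, then hierarchical-clustering workflow described above can be sketched as follows on synthetic motility-parameter data; the dimensions and cluster count are assumptions, and the Chernoff-face step is omitted.

    ```python
    # Sketch: standardise, project onto the first two PCs, then cluster hierarchically.
    import numpy as np
    from scipy.cluster.hierarchy import linkage, fcluster
    from sklearn.preprocessing import StandardScaler
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(8)
    motility = rng.normal(size=(64, 9))                       # 64 measurements x 9 motility parameters (synthetic)

    scores = PCA(n_components=2).fit_transform(StandardScaler().fit_transform(motility))
    tree = linkage(scores, method="ward")                     # hierarchical clustering on the PC scores
    clusters = fcluster(tree, t=3, criterion="maxclust")      # cut the dendrogram into three clusters
    print(np.bincount(clusters)[1:])                          # cluster sizes; atypical measurements tend to isolate
    ```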

  4. [Spatial distribution characteristics of the physical and chemical properties of water in the Kunes River after the supply of snowmelt during spring].

    PubMed

    Liu, Xiang; Guo, Ling-Peng; Zhang, Fei-Yun; Ma, Jie; Mu, Shu-Yong; Zhao, Xin; Li, Lan-Hai

    2015-02-01

    Eight physical and chemical indicators related to water quality were monitored at nineteen sampling sites along the Kunes River at the end of the snowmelt season in spring. To investigate the spatial distribution characteristics of the water's physical and chemical properties, cluster analysis (CA), discriminant analysis (DA) and principal component analysis (PCA) were employed. The cluster analysis showed that the Kunes River could be divided into three reaches according to the similarities of the physical and chemical properties among sampling sites, representing the upstream, midstream and downstream sections of the river. The discriminant analysis demonstrated that the reliability of this classification was high, with DO, Cl- and BOD5 being the significant indexes leading to the classification. Three principal components were extracted by the principal component analysis, with a cumulative variance contribution of 86.90%. The principal component analysis also indicated that the water's physical and chemical properties were mostly affected by EC, ORP, NO3(-)-N, NH4(+)-N, Cl- and BOD5. The principal component scores at each sampling site showed that water quality was mainly influenced by DO upstream, by pH midstream, and by the remaining indicators downstream. The ranking of the comprehensive principal component scores revealed that water quality degraded from upstream to downstream, i.e., the upstream reach had the best water quality, followed by the midstream, while water quality downstream was the worst. This result corresponded exactly to the three reaches classified by cluster analysis. Anthropogenic activity and the accumulation of pollutants along the river were probably the main reasons for this spatial difference.

  5. Evidence for age-associated disinhibition of the wake drive provided by scoring principal components of the resting EEG spectrum in sleep-provoking conditions.

    PubMed

    Putilov, Arcady A; Donskaya, Olga G

    2016-01-01

    Age-associated changes in different bandwidths of the human electroencephalographic (EEG) spectrum are well documented, but their functional significance is poorly understood. This spectrum seems to represent summation of simultaneous influences of several sleep-wake regulatory processes. Scoring of its orthogonal (uncorrelated) principal components can help in separation of the brain signatures of these processes. In particular, the opposite age-associated changes were documented for scores on the two largest (1st and 2nd) principal components of the sleep EEG spectrum. A decrease of the first score and an increase of the second score can reflect, respectively, the weakening of the sleep drive and disinhibition of the opposing wake drive with age. In order to support the suggestion of age-associated disinhibition of the wake drive from the antagonistic influence of the sleep drive, we analyzed principal component scores of the resting EEG spectra obtained in sleep deprivation experiments with 81 healthy young adults aged between 19 and 26 and 40 healthy older adults aged between 45 and 66 years. At the second day of the sleep deprivation experiments, frontal scores on the 1st principal component of the EEG spectrum demonstrated an age-associated reduction of response to eyes closed relaxation. Scores on the 2nd principal component were either initially increased during wakefulness or less responsive to such sleep-provoking conditions (frontal and occipital scores, respectively). These results are in line with the suggestion of disinhibition of the wake drive with age. They provide an explanation of why older adults are less vulnerable to sleep deprivation than young adults.

  6. The Relationship of School-Based Parental Involvement with Student Achievement: A Comparison of Principal and Parent Survey Reports from PISA 2012

    ERIC Educational Resources Information Center

    Sebastian, James; Moon, Jeong-Mi; Cunningham, Matt

    2017-01-01

    This paper explores parental involvement using principal and parent survey reports to examine whether parents' involvement in their children's schools predicts academic achievement. Survey data from principals and parents of seven countries from the PISA 2012 database and hierarchical linear modelling were used to analyse between- and within-…

  7. Are Principal Background and School Processes Related to Teacher Job Satisfaction? A Multilevel Study Using Schools and Staffing Survey 2003-04

    ERIC Educational Resources Information Center

    Shen, Jianping; Leslie, Jeffrey M.; Spybrook, Jessaca K.; Ma, Xin

    2012-01-01

    Using nationally representative samples for public school teachers and principals, the authors inquired into whether principal background and school processes are related to teacher job satisfaction. Employing hierarchical linear modeling (HLM), the authors were able to control for background characteristics at both the teacher and school levels.…

  8. A composite measure to explore visual disability in primary progressive multiple sclerosis.

    PubMed

    Poretto, Valentina; Petracca, Maria; Saiote, Catarina; Mormina, Enricomaria; Howard, Jonathan; Miller, Aaron; Lublin, Fred D; Inglese, Matilde

    2017-01-01

    Optical coherence tomography (OCT) and magnetic resonance imaging (MRI) can provide complementary information on visual system damage in multiple sclerosis (MS). The objective of this paper is to determine whether a composite OCT/MRI score, reflecting cumulative damage along the entire visual pathway, can predict visual deficits in primary progressive multiple sclerosis (PPMS). Twenty-five PPMS patients and 20 age-matched controls underwent neuro-ophthalmologic evaluation, spectral-domain OCT, and 3T brain MRI. Differences between groups were assessed by a univariate general linear model, and principal component analysis (PCA) grouped the instrumental variables into main components. Linear regression analysis was used to assess the relationship between low-contrast visual acuity (LCVA), OCT/MRI-derived metrics and PCA-derived composite scores. PCA identified four main components explaining 80.69% of the data variance. Considering each variable independently, LCVA 1.25% was significantly predicted by ganglion cell-inner plexiform layer (GCIPL) thickness, thalamic volume and optic radiation (OR) lesion volume (adjusted R2 = 0.328, p = 0.00004; adjusted R2 = 0.187, p = 0.002; and adjusted R2 = 0.180, p = 0.002, respectively). The PCA composite score of global visual pathway damage independently predicted both LCVA 1.25% (adjusted R2 = 0.361, p = 0.00001) and LCVA 2.50% (adjusted R2 = 0.323, p = 0.00003). A multiparametric score represents a more comprehensive and effective tool for explaining visual disability in PPMS than any single instrumental metric.

  9. Application of principal component analysis to ecodiversity assessment of postglacial landscape (on the example of Debnica Kaszubska commune, Middle Pomerania)

    NASA Astrophysics Data System (ADS)

    Wojciechowski, Adam

    2017-04-01

    In order to assess ecodiversity understood as a comprehensive natural landscape factor (Jedicke 2001), it is necessary to apply research methods which recognize the environment in a holistic way. Principal component analysis may be considered as one of such methods as it allows to distinguish the main factors determining landscape diversity on the one hand, and enables to discover regularities shaping the relationships between various elements of the environment under study on the other hand. The procedure adopted to assess ecodiversity with the use of principal component analysis involves: a) determining and selecting appropriate factors of the assessed environment qualities (hypsometric, geological, hydrographic, plant, and others); b) calculating the absolute value of individual qualities for the basic areas under analysis (e.g. river length, forest area, altitude differences, etc.); c) principal components analysis and obtaining factor maps (maps of selected components); d) generating a resultant, detailed map and isolating several classes of ecodiversity. An assessment of ecodiversity with the use of principal component analysis was conducted in the test area of 299,67 km2 in Debnica Kaszubska commune. The whole commune is situated in the Weichselian glaciation area of high hypsometric and morphological diversity as well as high geo- and biodiversity. The analysis was based on topographical maps of the commune area in scale 1:25000 and maps of forest habitats. Consequently, nine factors reflecting basic environment elements were calculated: maximum height (m), minimum height (m), average height (m), the length of watercourses (km), the area of water reservoirs (m2), total forest area (ha), coniferous forests habitats area (ha), deciduous forest habitats area (ha), alder habitats area (ha). The values for individual factors were analysed for 358 grid cells of 1 km2. Based on the principal components analysis, four major factors affecting commune ecodiversity were distinguished: hypsometric component (PC1), deciduous forest habitats component (PC2), river valleys and alder habitats component (PC3), and lakes component (PC4). The distinguished factors characterise natural qualities of postglacial area and reflect well the role of the four most important groups of environment components in shaping ecodiversity of the area under study. The map of ecodiversity of Debnica Kaszubska commune was created on the basis of the first four principal component scores and then five classes of diversity were isolated: very low, low, average, high and very high. As a result of the assessment, five commune regions of very high ecodiversity were separated. These regions are also very attractive for tourists and valuable in terms of their rich nature which include protected areas such as Slupia Valley Landscape Park. The suggested method of ecodiversity assessment with the use of principal component analysis may constitute an alternative methodological proposition to other research methods used so far. Literature Jedicke E., 2001. Biodiversität, Geodiversität, Ökodiversität. Kriterien zur Analyse der Landschaftsstruktur - ein konzeptioneller Diskussionsbeitrag. Naturschutz und Landschaftsplanung, 33(2/3), 59-68.
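    A hedged sketch of steps b)-d) above: compute factor values per grid cell, run PCA, and bin a combined score into five ecodiversity classes. The nine factor names, the simple sum of PC scores and the quantile class breaks are placeholder assumptions, not the study's exact procedure.

    ```python
    # Sketch: PCA on per-cell environmental factors and a five-class diversity map.
    import numpy as np
    import pandas as pd
    from sklearn.preprocessing import StandardScaler
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(10)
    cells = pd.DataFrame(
        rng.uniform(size=(358, 9)),                 # 358 grid cells x 9 environmental factors (synthetic)
        columns=["h_max", "h_min", "h_mean", "river_len", "lake_area",
                 "forest_area", "conifer_area", "deciduous_area", "alder_area"],
    )

    scores = PCA(n_components=4).fit_transform(StandardScaler().fit_transform(cells))
    combined = scores.sum(axis=1)                   # one simple way to combine the first four PC scores
    classes = pd.qcut(combined, 5, labels=["very low", "low", "average", "high", "very high"])
    print(pd.Series(classes).value_counts())
    ```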

  10. Polarization locked vector solitons and axis instability in optical fiber.

    PubMed

    Cundiff, Steven T.; Collings, Brandon C.; Bergman, Keren

    2000-09-01

    We experimentally observe polarization-locked vector solitons in optical fiber. Polarization-locked vector solitons use nonlinearity to preserve their polarization state despite the presence of birefringence. To achieve conditions where the delicate balance between nonlinearity and birefringence can survive, we studied the polarization evolution of the pulses circulating in a laser constructed entirely of optical fiber. We observe two distinct states with fixed polarization. The first state occurs for very small values of birefringence and is elliptically polarized. We measure the relative phase between orthogonal components along the two principal axes to be +/-pi/2. The relative amplitude varies linearly with the magnitude of the birefringence. This state is a polarization-locked vector soliton. The second, linearly polarized, state occurs for larger values of birefringence and is due to the fast axis instability. We provide complete characterization of these states, and present a physical explanation of both states and of the stability of the polarization-locked vector solitons. (c) 2000 American Institute of Physics.

  11. Polarization locked vector solitons and axis instability in optical fiber

    NASA Astrophysics Data System (ADS)

    Cundiff, Steven T.; Collings, Brandon C.; Bergman, Keren

    2000-09-01

    We experimentally observe polarization-locked vector solitons in optical fiber. Polarization-locked vector solitons use nonlinearity to preserve their polarization state despite the presence of birefringence. To achieve conditions where the delicate balance between nonlinearity and birefringence can survive, we studied the polarization evolution of the pulses circulating in a laser constructed entirely of optical fiber. We observe two distinct states with fixed polarization. The first state occurs for very small values of birefringence and is elliptically polarized. We measure the relative phase between the orthogonal components along the two principal axes to be ±π/2. The relative amplitude varies linearly with the magnitude of the birefringence. This state is a polarization-locked vector soliton. The second, linearly polarized, state occurs for larger values of birefringence. The second state is due to the fast axis instability. We provide a complete characterization of these states, and present a physical explanation of both of these states and of the stability of the polarization-locked vector solitons.

  12. Comprehensive Analysis of Large Sets of Age-Related Physiological Indicators Reveals Rapid Aging around the Age of 55 Years.

    PubMed

    Lixie, Erin; Edgeworth, Jameson; Shamir, Lior

    2015-01-01

    While many studies show a correlation between chronological age and physiological indicators, the nature of this correlation is not fully understood. The aim of this study was to perform a comprehensive analysis of the correlation between chronological age and age-related physiological indicators. Physiological aging scores were deduced using principal component analysis from a large dataset of 1,227 variables measured in a cohort of 4,796 human subjects, and the correlation between the physiological aging scores and chronological age was assessed. Physiological age does not progress linearly or exponentially with chronological age: a more rapid physiological change is observed around the age of 55 years, followed by a mild decline until around the age of 70 years. These findings provide evidence that the progression of physiological age is not linear with that of chronological age, and that periods of mild change in physiological age are separated by periods of more rapid aging. © 2015 S. Karger AG, Basel.

  13. Statistical and clustering analysis for disturbances: A case study of voltage dips in wind farms

    DOE PAGES

    Garcia-Sanchez, Tania; Gomez-Lazaro, Emilio; Muljadi, Eduard; ...

    2016-01-28

    This study proposes and evaluates an alternative statistical methodology to analyze a large number of voltage dips. For a given voltage dip, a set of lengths is first identified to characterize the root mean square (rms) voltage evolution along the disturbance, deduced from partially linearized time intervals and trajectories. Principal component analysis and K-means clustering are then applied to identify rms-voltage patterns and to propose a reduced number of representative rms-voltage profiles from the linearized trajectories. This reduced group of averaged rms-voltage profiles enables the representation of a large number of disturbances and offers a visual and graphical representation of their evolution along the events, aspects that were not previously considered in other contributions. The complete process is evaluated on real voltage dips collected in intensive field-measurement campaigns carried out in a wind farm in Spain over several years. The results are included in this paper.
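
    A minimal sketch of the pattern-extraction step described above, assuming each dip has already been reduced to a fixed-length linearized rms-voltage trajectory; the array shapes and the number of clusters are illustrative choices, not values from the study.

    ```python
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.cluster import KMeans

    # Hypothetical input: one row per voltage dip, each row a linearized rms-voltage
    # trajectory resampled to a common number of points.
    rms_profiles = np.random.rand(500, 40)

    # Project the trajectories onto a few principal components to remove noise
    # and keep the dominant shape information.
    pca = PCA(n_components=5)
    scores = pca.fit_transform(rms_profiles)

    # Cluster the low-dimensional scores and map each cluster centre back to the
    # rms-voltage domain to obtain representative dip profiles.
    km = KMeans(n_clusters=6, n_init=10, random_state=0).fit(scores)
    representative_profiles = pca.inverse_transform(km.cluster_centers_)
    print(representative_profiles.shape)  # (6, 40)
    ```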

  14. Selection of a Geostatistical Method to Interpolate Soil Properties of the State Crop Testing Fields using Attributes of a Digital Terrain Model

    NASA Astrophysics Data System (ADS)

    Sahabiev, I. A.; Ryazanov, S. S.; Kolcova, T. G.; Grigoryan, B. R.

    2018-03-01

    The three most common techniques to interpolate soil properties at a field scale—ordinary kriging (OK), regression kriging with a multiple linear regression drift model (RK + MLR), and regression kriging with a principal component regression drift model (RK + PCR)—were examined. The results of the study were compiled into an algorithm for choosing the most appropriate soil mapping technique. Relief attributes were used as the auxiliary variables. When the spatial dependence of a target variable was strong, the OK method showed more accurate interpolation results, and the inclusion of the auxiliary data resulted in an insignificant improvement in prediction accuracy. According to the algorithm, the RK + PCR method effectively eliminates multicollinearity of the explanatory variables. However, if the number of predictors is less than ten, the probability of multicollinearity is reduced, and application of the PCR becomes unwarranted. In that case, multiple linear regression should be used instead.
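
    A minimal sketch of the drift step of RK + PCR, assuming terrain attributes are available at every soil sampling point; the attribute matrix is hypothetical, the number of retained components is arbitrary, and the subsequent ordinary kriging of the drift residuals is only indicated, not implemented.

    ```python
    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LinearRegression

    # Hypothetical data: terrain attributes (elevation, slope, aspect, curvature, ...)
    # at the sampled locations, and the measured soil property at those locations.
    terrain = np.random.rand(200, 8)
    soil_property = np.random.rand(200)

    # PCR drift model: orthogonal components remove multicollinearity among predictors.
    drift_model = make_pipeline(StandardScaler(), PCA(n_components=3), LinearRegression())
    drift_model.fit(terrain, soil_property)

    # The residuals of the drift model would then be interpolated by ordinary kriging
    # and added back to the drift prediction (kriging step not shown here).
    residuals = soil_property - drift_model.predict(terrain)
    print(residuals.std())
    ```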

  15. Post-mortem prediction of primal and selected retail cut weights of New Zealand lamb from carcass and animal characteristics.

    PubMed

    Ngo, L; Ho, H; Hunter, P; Quinn, K; Thomson, A; Pearson, G

    2016-02-01

    Post-mortem measurements (cold weight, grade and external carcass linear dimensions) as well as live animal data (age, breed, sex) were used to predict ovine primal and retail cut weights for 792 lamb carcases. Significant levels of variance could be explained using these predictors. The predictive power of those measurements on primal and retail cut weights was studied by using the results from principal component analysis and the absolute value of the t-statistics of the linear regression model. High prediction accuracy for primal cut weight was achieved (adjusted R² up to 0.95), as well as moderate accuracy for key retail cut weights: tenderloins (adj-R² = 0.60), loin (adj-R² = 0.62), French rack (adj-R² = 0.76) and rump (adj-R² = 0.75). The carcass cold weight had the best predictive power, with the accuracy increasing by around 10% after including the next three most significant variables. Copyright © 2015 Elsevier Ltd. All rights reserved.
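
    A minimal sketch of ranking predictors by the absolute t-statistics of a linear regression, in the spirit of the approach described above; the predictor names and data are hypothetical.

    ```python
    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    # Hypothetical carcass data: cold weight, two linear dimensions and age as
    # predictors of a primal cut weight.
    X = pd.DataFrame(np.random.rand(792, 4),
                     columns=["cold_weight", "carcass_length", "leg_length", "age"])
    y = 2.0 * X["cold_weight"] + 0.3 * X["carcass_length"] + np.random.normal(0, 0.1, 792)

    model = sm.OLS(y, sm.add_constant(X)).fit()

    # Rank predictors by the absolute value of their t-statistics (drop the intercept).
    importance = model.tvalues.drop("const").abs().sort_values(ascending=False)
    print(importance)
    print("adjusted R^2:", round(model.rsquared_adj, 3))
    ```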

  16. A HIERARCHICAL STOCHASTIC MODEL OF LARGE SCALE ATMOSPHERIC CIRCULATION PATTERNS AND MULTIPLE STATION DAILY PRECIPITATION

    EPA Science Inventory

    A stochastic model of weather states and concurrent daily precipitation at multiple precipitation stations is described. Four algorithms are investigated for classification of daily weather states: k-means, fuzzy clustering, principal components, and principal components coupled with ...

  17. Rosacea assessment by erythema index and principal component analysis segmentation maps

    NASA Astrophysics Data System (ADS)

    Kuzmina, Ilona; Rubins, Uldis; Saknite, Inga; Spigulis, Janis

    2017-12-01

    RGB images of rosacea were analyzed using segmentation maps of principal component analysis (PCA) and the erythema index (EI). Areas of the segmented clusters were compared to Clinician's Erythema Assessment (CEA) values given by two dermatologists. The results show that visible blood vessels are segmented more precisely on maps of the erythema index and the third principal component (PC3). In many cases, the distributions of clusters on the EI and PC3 maps are very similar. Mean values of the clusters' areas on these maps show a decrease in the area of blood vessels and erythema and an increase in the area of lighter skin after therapy for patients assessed as CEA = 2 on the first visit and CEA = 1 on the second visit. This study shows that EI and PC3 maps are more useful than maps of the first (PC1) and second (PC2) principal components for indicating vascular structures and erythema on the skin of rosacea patients and for therapy monitoring.

  18. Airborne electromagnetic data levelling using principal component analysis based on flight line difference

    NASA Astrophysics Data System (ADS)

    Zhang, Qiong; Peng, Cong; Lu, Yiming; Wang, Hao; Zhu, Kaiguang

    2018-04-01

    A novel technique is developed to level airborne geophysical data using principal component analysis based on flight-line differences. In this paper, flight-line differencing is introduced to enhance the features of the levelling error in airborne electromagnetic (AEM) data and to improve the correlation between pseudo tie lines. Levelling is therefore applied to the flight-line difference data instead of directly to the original AEM data. Pseudo tie lines are selected so that they are distributed across the profile direction and avoid anomalous regions. Since the levelling errors of the selected pseudo tie lines are highly correlated, principal component analysis is applied to extract the local levelling errors by reconstruction from the low-order principal components. The levelling errors of the original AEM data are then obtained through inverse differencing after spatial interpolation. This levelling method requires neither flying tie lines nor designing a levelling fitting function. The effectiveness of the method is demonstrated on survey data by comparing the results with those from tie-line levelling and flight-line correlation levelling.
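
    A minimal sketch of the low-order reconstruction idea used above: treat each pseudo tie line as one observation, keep only the first few principal components, and interpret the reconstruction as the correlated (levelling-error) part of the signal. The array shapes, the number of retained components, and the handling of the per-line mean are illustrative assumptions.

    ```python
    import numpy as np

    # Hypothetical input: flight-line difference values sampled where 30 pseudo tie
    # lines cross 200 flight lines (rows = pseudo tie lines).
    diff_data = np.random.rand(30, 200)

    # Centre each pseudo tie line (the per-line mean is kept as signal) and compute
    # principal components via SVD.
    centered = diff_data - diff_data.mean(axis=1, keepdims=True)
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)

    # Reconstruct from the first k components only; because levelling errors are
    # highly correlated across pseudo tie lines, they concentrate in these components.
    k = 3
    levelling_error = (U[:, :k] * s[:k]) @ Vt[:k, :]
    corrected = diff_data - levelling_error
    print(corrected.shape)
    ```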

  19. Multilevel sparse functional principal component analysis.

    PubMed

    Di, Chongzhi; Crainiceanu, Ciprian M; Jank, Wolfgang S

    2014-01-29

    We consider analysis of sparsely sampled multilevel functional data, where the basic observational unit is a function and the data have a natural hierarchy of basic units. An example is when functions are recorded at multiple visits for each subject. Multilevel functional principal component analysis (MFPCA; Di et al. 2009) was proposed for such data when functions are densely recorded. Here we consider the case when functions are sparsely sampled and may contain only a few observations per function. We exploit the multilevel structure of covariance operators and achieve data reduction by principal component decompositions at both the between- and within-subject levels. We address inherent methodological differences in the sparse sampling context to: 1) estimate the covariance operators; 2) estimate the functional principal component scores; 3) predict the underlying curves. Through simulations, the proposed method is shown to discover the dominant modes of variation and to reconstruct underlying curves well even in sparse settings. Our approach is illustrated by two applications, the Sleep Heart Health Study and eBay auctions.

  20. [Content of mineral elements of Gastrodia elata by principal components analysis].

    PubMed

    Li, Jin-ling; Zhao, Zhi; Liu, Hong-chang; Luo, Chun-li; Huang, Ming-jin; Luo, Fu-lai; Wang, Hua-lei

    2015-03-01

    To study the content of mineral elements and the principal components in Gastrodia elata. Mineral elements were determined by ICP and the data were analyzed with SPSS. K had the highest content, with an average of 15.31 g·kg⁻¹, and N had the second-highest average content, 8.99 g·kg⁻¹. The coefficients of variation of K and N were small, while that of Mn was the largest, at 51.39%. A highly significant positive correlation was found among N, P and K. Three principal components were selected by principal component analysis to evaluate the quality of G. elata. P, B, N, K, Cu, Mn, Fe and Mg were the characteristic elements of G. elata. The contents of K and N were higher and relatively stable, whereas the variation in Mn content was the largest. From the perspective of mineral elements, the quality of G. elata from Guizhou and Yunnan was better.

  1. Independent component analysis applied to long bunch beams in the Los Alamos Proton Storage Ring

    NASA Astrophysics Data System (ADS)

    Kolski, Jeffrey S.; Macek, Robert J.; McCrady, Rodney C.; Pang, Xiaoying

    2012-11-01

    Independent component analysis (ICA) is a powerful blind source separation (BSS) method. Compared to the typical BSS method, principal component analysis, ICA is more robust to noise, coupling, and nonlinearity. The conventional ICA application to turn-by-turn position data from multiple beam position monitors (BPMs) yields information about cross-BPM correlations. With this scheme, multi-BPM ICA has been used to measure the transverse betatron phase and amplitude functions, dispersion function, linear coupling, sextupole strength, and nonlinear beam dynamics. We apply ICA in a new way to slices along the bunch revealing correlations of particle motion within the beam bunch. We digitize beam signals of the long bunch at the Los Alamos Proton Storage Ring with a single device (BPM or fast current monitor) for an entire injection-extraction cycle. ICA of the digitized beam signals results in source signals, which we identify to describe varying betatron motion along the bunch, locations of transverse resonances along the bunch, measurement noise, characteristic frequencies of the digitizing oscilloscopes, and longitudinal beam structure.
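
    A minimal sketch of applying ICA to digitized beam signals using scikit-learn's FastICA, which is one common ICA implementation (the original work does not specify this particular library); the signal matrix and its dimensions are hypothetical.

    ```python
    import numpy as np
    from sklearn.decomposition import FastICA

    # Hypothetical input: a digitized beam signal sliced along the bunch, with one row
    # per longitudinal slice and one column per turn of the injection-extraction cycle.
    slices = np.random.rand(64, 2000)

    # Extract a handful of statistically independent source signals; each column of the
    # mixing matrix shows how strongly a source contributes to each slice along the bunch.
    ica = FastICA(n_components=8, random_state=0)
    sources = ica.fit_transform(slices.T)   # shape (turns, components)
    mixing = ica.mixing_                    # shape (slices, components)
    print(sources.shape, mixing.shape)
    ```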

  2. Visualizing Hyolaryngeal Mechanics in Swallowing Using Dynamic MRI

    PubMed Central

    Pearson, William G.; Zumwalt, Ann C.

    2013-01-01

    Introduction Coordinates of anatomical landmarks are captured using dynamic MRI to explore whether a proposed two-sling mechanism underlies hyolaryngeal elevation in pharyngeal swallowing. A principal components analysis (PCA) is applied to the coordinates to determine the covariant function of the proposed mechanism. Methods Dynamic MRI (dMRI) data were acquired from eleven healthy subjects during a repeated swallows task. Coordinates mapping the proposed mechanism are collected from each dynamic (frame) of a dynamic MRI swallowing series of a randomly selected subject in order to demonstrate shape changes in a single subject. Coordinates representing minimum and maximum hyolaryngeal elevation of all 11 subjects were also mapped to demonstrate shape changes of the system among all subjects. MorphoJ software was used to perform PCA and determine vectors of shape change (eigenvectors) for elements of the two-sling mechanism of hyolaryngeal elevation. Results For both single subject and group PCAs, hyolaryngeal elevation accounted for the first principal component of variation. For the single subject PCA, the first principal component accounted for 81.5% of the variance. For the between subjects PCA, the first principal component accounted for 58.5% of the variance. Eigenvectors and shape changes associated with this first principal component are reported. Discussion Eigenvectors indicate that two muscular slings and associated skeletal elements function as components of a covariant mechanism to elevate the hyolaryngeal complex. Morphological analysis is useful to model shape changes in the two-sling mechanism of hyolaryngeal elevation. PMID:25090608

  3. Obesity, metabolic syndrome, impaired fasting glucose, and microvascular dysfunction: a principal component analysis approach.

    PubMed

    Panazzolo, Diogo G; Sicuro, Fernando L; Clapauch, Ruth; Maranhão, Priscila A; Bouskela, Eliete; Kraemer-Aguiar, Luiz G

    2012-11-13

    We aimed to evaluate the multivariate association between functional microvascular variables and clinical-laboratorial-anthropometrical measurements. Data from 189 female subjects (34.0 ± 15.5 years, 30.5 ± 7.1 kg/m2), who were non-smokers, non-regular drug users, without a history of diabetes and/or hypertension, were analyzed by principal component analysis (PCA). PCA is a classical multivariate exploratory tool because it highlights common variation between variables allowing inferences about possible biological meaning of associations between them, without pre-establishing cause-effect relationships. In total, 15 variables were used for PCA: body mass index (BMI), waist circumference, systolic and diastolic blood pressure (BP), fasting plasma glucose, levels of total cholesterol, high-density lipoprotein cholesterol (HDL-c), low-density lipoprotein cholesterol (LDL-c), triglycerides (TG), insulin, C-reactive protein (CRP), and functional microvascular variables measured by nailfold videocapillaroscopy. Nailfold videocapillaroscopy was used for direct visualization of nutritive capillaries, assessing functional capillary density, red blood cell velocity (RBCV) at rest and peak after 1 min of arterial occlusion (RBCV(max)), and the time taken to reach RBCV(max) (TRBCV(max)). A total of 35% of subjects had metabolic syndrome, 77% were overweight/obese, and 9.5% had impaired fasting glucose. PCA was able to recognize that functional microvascular variables and clinical-laboratorial-anthropometrical measurements had a similar variation. The first five principal components explained most of the intrinsic variation of the data. For example, principal component 1 was associated with BMI, waist circumference, systolic BP, diastolic BP, insulin, TG, CRP, and TRBCV(max) varying in the same way. Principal component 1 also showed a strong association among HDL-c, RBCV, and RBCV(max), but in the opposite way. Principal component 3 was associated only with microvascular variables in the same way (functional capillary density, RBCV and RBCV(max)). Fasting plasma glucose appeared to be related to principal component 4 and did not show any association with microvascular reactivity. In non-diabetic female subjects, a multivariate scenario of associations between classic clinical variables strictly related to obesity and metabolic syndrome suggests a significant relationship between these diseases and microvascular reactivity.

  4. Decomposition-Based Failure Mode Identification Method for Risk-Free Design of Large Systems

    NASA Technical Reports Server (NTRS)

    Tumer, Irem Y.; Stone, Robert B.; Roberts, Rory A.; Clancy, Daniel (Technical Monitor)

    2002-01-01

    When designing products, it is crucial to assure failure- and risk-free operation in the intended operating environment. Failures are typically studied and eliminated as much as possible during the early stages of design. The few failures that go undetected result in unacceptable damage and losses in high-risk applications where public safety is of concern. Published NASA and NTSB accident reports point to a variety of components identified as sources of failures in the reported cases. In previous work, data from these reports were processed and placed in matrix form for all the system components and failure modes encountered, and then manipulated using matrix methods to determine similarities between the different components and failure modes. In this paper, these matrices are represented in the form of a linear combination of failure modes, mathematically formed using Principal Components Analysis (PCA) decomposition. The PCA decomposition results in a low-dimensionality representation of all failure modes and components of interest, represented in a transformed coordinate system. Such a representation opens the way for efficient pattern analysis and prediction of the failure modes with the highest potential risks on the final product, rather than making decisions based on the large space of component and failure mode data. The mathematics of the proposed method are explained first using a simple example problem. The method is then applied to component failure data gathered from helicopter accident reports to demonstrate its potential.
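
    A minimal sketch of representing a component-by-failure-mode matrix through a PCA decomposition, as outlined above; the incidence matrix is hypothetical and the number of retained components is an illustrative choice.

    ```python
    import numpy as np
    from sklearn.decomposition import PCA

    # Hypothetical incidence matrix: rows are system components, columns are failure
    # modes, entries count how often a mode was reported for a component.
    incidence = np.random.poisson(1.0, size=(40, 15)).astype(float)

    # Low-dimensional representation: each component becomes a point in the space
    # spanned by the leading principal directions of the failure-mode data.
    pca = PCA(n_components=3)
    component_coords = pca.fit_transform(incidence)

    # Components that lie close together in this space share similar failure-mode
    # patterns, which supports pattern analysis and risk prioritisation.
    print(component_coords.shape, pca.explained_variance_ratio_)
    ```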

  5. The factorial reliability of the Middlesex Hospital Questionnaire in normal subjects.

    PubMed

    Bagley, C

    1980-03-01

    The internal reliability of the Middlesex Hospital Questionnaire and its component subscales has been checked by means of principal components analyses of data on 256 normal subjects. The subscales (with the possible exception of Hysteria) were found to contribute to the general underlying factor of psychoneurosis. In general, the principal components analysis points to the reliability of the subscales, despite some item overlap.

  6. The Derivation of Job Compensation Index Values from the Position Analysis Questionnaire (PAQ). Report No. 6.

    ERIC Educational Resources Information Center

    McCormick, Ernest J.; And Others

    The study deals with the job component method of establishing compensation rates. The basic job analysis questionnaire used in the study was the Position Analysis Questionnaire (PAQ) (Form B). On the basis of a principal components analysis of PAQ data for a large sample (2,688) of jobs, a number of principal components (job dimensions) were…

  7. Brittle failure of rock: A review and general linear criterion

    NASA Astrophysics Data System (ADS)

    Labuz, Joseph F.; Zeng, Feitao; Makhnenko, Roman; Li, Yuan

    2018-07-01

    A failure criterion typically is phenomenological since few models exist to theoretically derive the mathematical function. Indeed, a successful failure criterion is a generalization of experimental data obtained from strength tests on specimens subjected to known stress states. For isotropic rock that exhibits a pressure dependence on strength, a popular failure criterion is a linear equation in major and minor principal stresses, independent of the intermediate principal stress. A general linear failure criterion called Paul-Mohr-Coulomb (PMC) contains all three principal stresses with three material constants: friction angles for axisymmetric compression ϕc and extension ϕe and isotropic tensile strength V0. PMC provides a framework to describe a nonlinear failure surface by a set of planes "hugging" the curved surface. Brittle failure of rock is reviewed and multiaxial test methods are summarized. Equations are presented to implement PMC for fitting strength data and determining the three material parameters. A piecewise linear approximation to a nonlinear failure surface is illustrated by fitting two planes with six material parameters to form either a 6- to 12-sided pyramid or a 6- to 12- to 6-sided pyramid. The particular nature of the failure surface is dictated by the experimental data.
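
    For reference, a sketch of the general linear form described above, written as a single plane in the three principal stresses; the relation on the coefficient sum simply expresses that the plane meets the hydrostatic axis at the isotropic tensile strength, while the full expressions for the coefficients in terms of ϕc and ϕe follow the PMC formulation in the paper and are not reproduced here.

    ```latex
    % General linear (planar) failure criterion in the principal stresses
    % \sigma_1, \sigma_2, \sigma_3, with material constants A, B, C:
    A\,\sigma_1 + B\,\sigma_2 + C\,\sigma_3 = 1
    % If the plane passes through the hydrostatic point
    % \sigma_1 = \sigma_2 = \sigma_3 = V_0 (the isotropic tensile strength), then
    A + B + C = \frac{1}{V_0}
    % with A, B, C otherwise fixed by the friction angles \phi_c and \phi_e.
    ```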

  8. Perceptions of the Principal Evaluation Process and Performance Criteria: A Qualitative Study of the Challenge of Principal Evaluation

    ERIC Educational Resources Information Center

    Faginski-Stark, Erica; Casavant, Christopher; Collins, William; McCandless, Jason; Tencza, Marilyn

    2012-01-01

    Recent federal and state mandates have tasked school systems to move beyond principal evaluation as a bureaucratic function and to re-imagine it as a critical component to improve principal performance and compel school renewal. This qualitative study investigated the district leaders' and principals' perceptions of the performance evaluation…

  9. 2L-PCA: a two-level principal component analyzer for quantitative drug design and its applications.

    PubMed

    Du, Qi-Shi; Wang, Shu-Qing; Xie, Neng-Zhong; Wang, Qing-Yan; Huang, Ri-Bo; Chou, Kuo-Chen

    2017-09-19

    A two-level principal component predictor (2L-PCA) was proposed based on the principal component analysis (PCA) approach. It can be used to quantitatively analyze various compounds and peptides with respect to their functions or their potential to become useful drugs. One level deals with the physicochemical properties of drug molecules, while the other deals with their structural fragments. The predictor has self-learning and feedback features to automatically improve its accuracy. It is anticipated that 2L-PCA will become a very useful tool for providing timely and useful clues during the process of drug development.

  10. Assessing the role of feed water constituents in irreversible membrane fouling of pilot-scale ultrafiltration drinking water treatment systems.

    PubMed

    Peiris, R H; Jaklewicz, M; Budman, H; Legge, R L; Moresoli, C

    2013-06-15

    A fluorescence excitation-emission matrix (EEM) approach, together with principal component analysis (PCA), was used for assessing hydraulically irreversible fouling of three pilot-scale ultrafiltration (UF) systems containing full-scale and bench-scale hollow fiber membrane modules in drinking water treatment. These systems were operated for at least three months with extensive cycles of permeation, combined back-pulsing and scouring, and chemical cleaning. The principal component (PC) scores generated from the PCA of the fluorescence EEMs were found to be related to humic substances (HS), protein-like and colloidal/particulate matter content. PC scores of HS- and protein-like matter of the UF feed water, when considered separately, showed reasonably good correlations with the rate of hydraulically irreversible fouling for long-term UF operations. In contrast, comparatively weaker correlations between the PC scores of colloidal/particulate matter and the rate of hydraulically irreversible fouling were obtained for all UF systems. Since individual correlations could not fully explain the evolution of the rate of irreversible fouling, multi-linear regression models were developed to relate the combined effect of HS-like, protein-like and colloidal/particulate matter PC scores to the rate of hydraulically irreversible fouling for each specific UF system. These multi-linear regression models revealed significant individual and combined contributions of HS- and protein-like matter to the rate of hydraulically irreversible fouling, with protein-like matter generally showing the greatest contribution. The contribution of colloidal/particulate matter to the rate of hydraulically irreversible fouling was not as significant. The addition of polyaluminum chloride, as a coagulant, to the UF feed appeared to have a positive impact in reducing hydraulically irreversible fouling by these constituents. The proposed approach has applications in quantifying the individual and synergistic contributions of major natural water constituents to the rate of hydraulically irreversible membrane fouling and shows potential for controlling UF irreversible fouling in the production of drinking water. Copyright © 2013 Elsevier Ltd. All rights reserved.

  11. Quantitative structure-activity relationship of the curcumin-related compounds using various regression methods

    NASA Astrophysics Data System (ADS)

    Khazaei, Ardeshir; Sarmasti, Negin; Seyf, Jaber Yousefi

    2016-03-01

    Quantitative structure-activity relationships were used to study a series of curcumin-related compounds with inhibitory effects on prostate cancer PC-3 cells, pancreas cancer Panc-1 cells, and colon cancer HT-29 cells. The sphere exclusion method was used to split the data set into training and test sets. Multiple linear regression, principal component regression and partial least squares were used as the regression methods. To investigate the effect of feature selection methods, stepwise selection, a genetic algorithm, and simulated annealing were used. In two cases (PC-3 cells and Panc-1 cells), the best models were generated by a combination of multiple linear regression and stepwise selection (PC-3 cells: r² = 0.86, q² = 0.82, pred_r² = 0.93, and r²m(test) = 0.43; Panc-1 cells: r² = 0.85, q² = 0.80, pred_r² = 0.71, and r²m(test) = 0.68). For the HT-29 cells, principal component regression with stepwise selection (r² = 0.69, q² = 0.62, pred_r² = 0.54, and r²m(test) = 0.41) was the best method. The QSAR study reveals descriptors which have a crucial role in the inhibitory properties of curcumin-like compounds. 6ChainCount, T_C_C_1, and T_O_O_7 are the most important descriptors with the greatest effect. To design and optimize novel, efficient curcumin-related compounds, it is useful to introduce heteroatoms such as nitrogen, oxygen, and sulfur atoms into the chemical structure (to reduce the contribution of the T_C_C_1 descriptor) and to increase the contributions of the 6ChainCount and T_O_O_7 descriptors. The models can be useful in the better design of novel curcumin-related compounds for use in the treatment of prostate, pancreas, and colon cancers.
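
    A minimal sketch of combining forward stepwise feature selection with multiple linear regression, in the spirit of the best-performing model above; scikit-learn's SequentialFeatureSelector stands in for the stepwise procedure, and the descriptor matrix and activity values are hypothetical.

    ```python
    import numpy as np
    from sklearn.feature_selection import SequentialFeatureSelector
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import train_test_split

    # Hypothetical QSAR data: 60 compounds described by 30 molecular descriptors,
    # with a measured inhibitory activity as the response.
    X = np.random.rand(60, 30)
    y = X[:, 0] * 1.5 - X[:, 3] * 0.8 + np.random.normal(0, 0.1, 60)

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

    # Forward stepwise selection of a small descriptor subset, then an MLR fit.
    selector = SequentialFeatureSelector(LinearRegression(), n_features_to_select=5,
                                         direction="forward", cv=5)
    selector.fit(X_train, y_train)
    mlr = LinearRegression().fit(selector.transform(X_train), y_train)

    print("selected descriptor indices:", np.flatnonzero(selector.get_support()))
    print("test r^2:", round(mlr.score(selector.transform(X_test), y_test), 2))
    ```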

  12. Dietary habits and growth: an urban/rural comparison in the Andean region of Apurimac, Peru.

    PubMed

    Andrissi, Laura; Mottini, Giovanni; Sebastiani, Valeria; Boldrini, Laura; Giuliani, Alessandro

    2013-01-01

    The efficacy of interventions against child malnutrition crucially depends on a myriad of factors other than simple food intake, which must be carefully studied in order to plan a balanced policy. The relation between dietary patterns and growth is at the very heart of the problem, especially since dietary pattern involves dimensions other than pure caloric intake. In this work we investigated the relations between dietary pattern and growth by comparing children from a rural and an urban area in Andean Peru, in terms of food habits and anthropometric variables, to develop a model usable in the context of interventions against malnutrition. A sample of 159 children (80 from the urban and 79 from the rural area), aged 4 to 120 months (72.7 ± 37.5 SD), was collected. The data were investigated by a multidimensional analysis (principal component analysis followed by an inferential approach) to correlate the hidden dimensions of the anthropometric and dietary observables. The correlations between these dimensions (in the form of principal components) were computed and contrasted with the effects of age and urban/rural environment. Caloric intake and growth were not linearly correlated in our data set. Moreover, the urban and rural environments showed very different patterns of both dietary and anthropometric variables, pointing to the marked effect of dietary habits and the demographic composition of the analyzed populations. The relation between malnutrition and overweight was, at the same time, shown to follow a strictly area-dependent distribution. We provide a proof-of-concept of the non-linear character of the relation between malnutrition (in terms of caloric intake) and growth, pointing to the need to calibrate interventions on food pattern, and not only on quantity, to counteract the effects of malnutrition on growth. Education toward a balanced diet must go hand-in-hand with intervention on caloric intake in order to prevent effects on health.

  13. Quantifying and visualizing variations in sets of images using continuous linear optimal transport

    NASA Astrophysics Data System (ADS)

    Kolouri, Soheil; Rohde, Gustavo K.

    2014-03-01

    Modern advancements in imaging devices have enabled us to explore the subcellular structure of living organisms and extract vast amounts of information. However, interpreting the biological information mined in the captured images is not a trivial task. Utilizing predetermined numerical features is usually the only hope for quantifying this information. Nonetheless, direct visual or biological interpretation of results obtained from these selected features is non-intuitive and difficult. In this paper, we describe an automatic method for modeling visual variations in a set of images, which allows for direct visual interpretation of the most significant differences, without the need for predefined features. The method is based on a linearized version of the continuous optimal transport (OT) metric, which provides a natural linear embedding for the image data set, in which linear combination of images leads to a visually meaningful image. This enables us to apply linear geometric data analysis techniques such as principal component analysis and linear discriminant analysis in the linearly embedded space and visualize the most prominent modes, as well as the most discriminant modes of variations, in the dataset. Using the continuous OT framework, we are able to analyze variations in shape and texture in a set of images utilizing each image at full resolution, that otherwise cannot be done by existing methods. The proposed method is applied to a set of nuclei images segmented from Feulgen stained liver tissues in order to investigate the major visual differences in chromatin distribution of Fetal-Type Hepatoblastoma (FHB) cells compared to the normal cells.

  14. Analytical framework for reconstructing heterogeneous environmental variables from mammal community structure.

    PubMed

    Louys, Julien; Meloro, Carlo; Elton, Sarah; Ditchfield, Peter; Bishop, Laura C

    2015-01-01

    We test the performance of two models that use mammalian communities to reconstruct multivariate palaeoenvironments. While both models exploit the correlation between mammal communities (defined in terms of functional groups) and arboreal heterogeneity, the first uses a multiple multivariate regression of community structure and arboreal heterogeneity, while the second uses a linear regression of the principal components of each ecospace. The success of these methods means the palaeoenvironment of a particular locality can be reconstructed in terms of the proportions of heavy, moderate, light, and absent tree canopy cover. The linear regression is less biased, and more precisely and accurately reconstructs heavy tree canopy cover than the multiple multivariate model. However, the multiple multivariate model performs better than the linear regression for all other canopy cover categories. Both models consistently perform better than randomly generated reconstructions. We apply both models to the palaeocommunity of the Upper Laetolil Beds, Tanzania. Our reconstructions indicate that there was very little heavy tree cover at this site (likely less than 10%), with the palaeo-landscape instead comprising a mixture of light and absent tree cover. These reconstructions help resolve the previous conflicting palaeoecological reconstructions made for this site. Copyright © 2014 Elsevier Ltd. All rights reserved.

  15. Use of AMMI and linear regression models to analyze genotype-environment interaction in durum wheat.

    PubMed

    Nachit, M M; Nachit, G; Ketata, H; Gauch, H G; Zobel, R W

    1992-03-01

    The joint durum wheat (Triticum turgidum L. var 'durum') breeding program of the International Maize and Wheat Improvement Center (CIMMYT) and the International Center for Agricultural Research in the Dry Areas (ICARDA) for the Mediterranean region employs extensive multilocation testing. Multilocation testing produces significant genotype-environment (GE) interaction that reduces the accuracy of estimating yield and selecting appropriate germ plasm. The sum of squares (SS) of the GE interaction was partitioned by linear regression techniques into joint, genotypic, and environmental regressions, and by the Additive Main effects and Multiplicative Interaction (AMMI) model into five significant Interaction Principal Component Axes (IPCA). The AMMI model was more effective in partitioning the interaction SS than the linear regression technique: the SS contained in the AMMI model was 6 times higher than the SS for all three regressions. Postdictive assessment recommended the use of the first five IPCA axes, while predictive assessment recommended AMMI1 (main effects plus IPCA1). After elimination of random variation, AMMI1 estimates of genotypic yields within sites were more precise than unadjusted means. This increased precision was equivalent to increasing the number of replications by a factor of 3.7.

  16. Maximally reliable spatial filtering of steady state visual evoked potentials.

    PubMed

    Dmochowski, Jacek P; Greaves, Alex S; Norcia, Anthony M

    2015-04-01

    Due to their high signal-to-noise ratio (SNR) and robustness to artifacts, steady state visual evoked potentials (SSVEPs) are a popular technique for studying neural processing in the human visual system. SSVEPs are conventionally analyzed at individual electrodes or linear combinations of electrodes which maximize some variant of the SNR. Here we exploit the fundamental assumption of evoked responses--reproducibility across trials--to develop a technique that extracts a small number of high SNR, maximally reliable SSVEP components. This novel spatial filtering method operates on an array of Fourier coefficients and projects the data into a low-dimensional space in which the trial-to-trial spectral covariance is maximized. When applied to two sample data sets, the resulting technique recovers physiologically plausible components (i.e., the recovered topographies match the lead fields of the underlying sources) while drastically reducing the dimensionality of the data (i.e., more than 90% of the trial-to-trial reliability is captured in the first four components). Moreover, the proposed technique achieves a higher SNR than that of the single-best electrode or the Principal Components. We provide a freely-available MATLAB implementation of the proposed technique, herein termed "Reliable Components Analysis". Copyright © 2015 Elsevier Inc. All rights reserved.

  17. Nonlinear Peculiar-Velocity Analysis and PCA

    NASA Astrophysics Data System (ADS)

    Dekel, Avishai; Eldar, Amiram; Silberman, Lior; Zehavi, Idit

    We allow for nonlinear effects in the likelihood analysis of peculiar velocities, and obtain ~35% lower values for the cosmological density parameter and for the amplitude of mass-density fluctuations. The power spectrum in the linear regime is assumed to be of the flat ΛCDM model (h = 0.65, n = 1) with only Ω_m free. Since the likelihood is driven by the nonlinear regime, we "break" the power spectrum at k_b ~ 0.2 (h^{-1} Mpc)^{-1} and fit a two-parameter power law at k > k_b. This allows for an unbiased fit in the linear regime. Tests using improved mock catalogs demonstrate a reduced bias and a better fit. We find for the Mark III and SFI data Ω_m = 0.35 ± 0.09 with σ_8 Ω_m^{0.6} = 0.55 ± 0.10 (90% errors). When allowing deviations from ΛCDM, we find an indication for a wiggle in the power spectrum in the form of an excess near k ~ 0.05 and a deficiency at k ~ 0.1 (h^{-1} Mpc)^{-1} - a "cold flow" which may be related to a feature indicated from redshift surveys and the second peak in the CMB anisotropy. A χ^2 test applied to principal modes demonstrates that the nonlinear procedure improves the goodness of fit. The Principal Component Analysis (PCA) helps identify spatial features of the data and fine-tune the theoretical and error models. We address the potential for optimal data compression using PCA.

  18. Numerical Linear Algebra.

    DTIC Science & Technology

    1980-09-08

    Research period: February 1979 through 31 March 1980. Title of Research: Numerical Linear Algebra. Principal Investigators: Gene H. Golub and James H. Wilkinson.

  19. Experimental Researches on the Durability Indicators and the Physiological Comfort of Fabrics using the Principal Component Analysis (PCA) Method

    NASA Astrophysics Data System (ADS)

    Hristian, L.; Ostafe, M. M.; Manea, L. R.; Apostol, L. L.

    2017-06-01

    The work examined the distribution of combed wool fabrics intended for the manufacture of outer garments in terms of the values of their durability and physiological comfort indices, using the mathematical model of Principal Component Analysis (PCA). Principal Component Analysis (PCA), as applied in this study, is a descriptive method of multivariate analysis of multi-dimensional data, and it aims to reduce, in a controlled way, the number of variables (columns) of the data matrix to as few as possible, ideally two or three. Therefore, based on the information about each group/assortment of fabrics, the goal is to replace the nine inter-correlated variables with only two or three new variables called components. The aim of PCA is to extract the smallest number of components that recover most of the total information contained in the initial data.

  20. Information extraction from multivariate images

    NASA Technical Reports Server (NTRS)

    Park, S. K.; Kegley, K. A.; Schiess, J. R.

    1986-01-01

    An overview of several multivariate image processing techniques is presented, with emphasis on techniques based upon the principal component transformation (PCT). Multiimages in various formats have a multivariate pixel value associated with each pixel location, which has been scaled and quantized into a gray-level vector; the bivariate distribution of gray levels indicates the extent to which two component images are correlated. The PCT of a multiimage decorrelates the multiimage to reduce its dimensionality and reveals its intercomponent dependencies if some off-diagonal elements are not small; for the purposes of display, the principal component images must be postprocessed into multiimage format. The principal component analysis of a multiimage is a statistical analysis based upon the PCT whose primary application is to determine the intrinsic component dimensionality of the multiimage. Computational considerations are also discussed.
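
    A minimal sketch of a principal component transformation applied to a multi-band image, illustrating the decorrelation and dimensionality reduction described above; the band stack is hypothetical.

    ```python
    import numpy as np
    from sklearn.decomposition import PCA

    # Hypothetical multiimage: 6 co-registered bands of a 128 x 128 scene.
    bands = np.random.rand(6, 128, 128)
    pixels = bands.reshape(6, -1).T          # one row per pixel, one column per band

    # The PCT decorrelates the bands; the leading components carry most of the variance.
    pca = PCA(n_components=3)
    pc_pixels = pca.fit_transform(pixels)

    # Reshape the principal component scores back into image form for display.
    pc_images = pc_pixels.T.reshape(3, 128, 128)
    print(pc_images.shape, pca.explained_variance_ratio_)
    ```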

  1. Psychometric evaluation of the Persian version of the Templer's Death Anxiety Scale in cancer patients.

    PubMed

    Soleimani, Mohammad Ali; Yaghoobzadeh, Ameneh; Bahrami, Nasim; Sharif, Saeed Pahlevan; Sharif Nia, Hamid

    2016-10-01

    In this study, 398 Iranian cancer patients completed the 15-item Templer's Death Anxiety Scale (TDAS). Tests of internal consistency, principal components analysis, and confirmatory factor analysis were conducted to assess the internal consistency and factorial validity of the Persian TDAS. The construct reliability statistic and average variance extracted were also calculated to measure construct reliability, convergent validity, and discriminant validity. Principal components analysis indicated a 3-component solution, which was generally supported in the confirmatory analysis. However, acceptable cutoffs for construct reliability, convergent validity, and discriminant validity were not fulfilled for the three subscales that were derived from the principal component analysis. This study demonstrated both the advantages and potential limitations of using the TDAS with Persian-speaking cancer patients.

  2. A study of machine learning regression methods for major elemental analysis of rocks using laser-induced breakdown spectroscopy

    NASA Astrophysics Data System (ADS)

    Boucher, Thomas F.; Ozanne, Marie V.; Carmosino, Marco L.; Dyar, M. Darby; Mahadevan, Sridhar; Breves, Elly A.; Lepore, Kate H.; Clegg, Samuel M.

    2015-05-01

    The ChemCam instrument on the Mars Curiosity rover is generating thousands of LIBS spectra and bringing interest in this technique to public attention. The key to interpreting Mars or any other type of LIBS data is a set of calibrations that relate laboratory standards to unknowns examined in other settings and enable predictions of chemical composition. Here, LIBS spectral data are analyzed using linear regression methods including partial least squares (PLS-1 and PLS-2), principal component regression (PCR), least absolute shrinkage and selection operator (lasso), elastic net, and linear support vector regression (SVR-Lin). These were compared against results from nonlinear regression methods including kernel principal component regression (K-PCR), polynomial kernel support vector regression (SVR-Py) and k-nearest neighbor (kNN) regression to discern the most effective models for interpreting chemical abundances from LIBS spectra of geological samples. The results were evaluated for 100 samples analyzed with 50 laser pulses at each of five locations averaged together. Wilcoxon signed-rank tests were employed to evaluate the statistical significance of differences among the nine models, using their predicted residual sum of squares (PRESS) to make comparisons. For MgO, SiO2, Fe2O3, CaO, and MnO, the sparse models outperform all the others except for linear SVR, while for Na2O, K2O, TiO2, and P2O5, the sparse methods produce inferior results, likely because their emission lines in this energy range have lower transition probabilities. The strong performance of the sparse methods in this study suggests that use of dimensionality-reduction techniques as a preprocessing step may improve the performance of the linear models. Nonlinear methods tend to overfit the data and predict less accurately, while the linear methods proved to be more generalizable with better predictive performance. These results are attributed to the high dimensionality of the data (6144 channels) relative to the small number of samples studied. The best-performing models were SVR-Lin for SiO2, MgO, Fe2O3, and Na2O, lasso for Al2O3, elastic net for MnO, and PLS-1 for CaO, TiO2, and K2O. Although differences in model performance between methods were identified, most of the models produce comparable results at p ≤ 0.05, and all techniques except kNN produced statistically indistinguishable results. It is likely that a combination of models could be used together to yield a lower total error of prediction, depending on the requirements of the user.
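
    A minimal sketch comparing a few of the linear calibration models named above (PLS, PCR and the lasso) by cross-validation; the spectra and oxide values are hypothetical, and the component counts and regularization strength are illustrative, not tuned values from the study.

    ```python
    import numpy as np
    from sklearn.cross_decomposition import PLSRegression
    from sklearn.decomposition import PCA
    from sklearn.linear_model import Lasso, LinearRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline

    # Hypothetical LIBS data: 100 spectra with 6144 channels, and one oxide abundance.
    spectra = np.random.rand(100, 6144)
    sio2 = np.random.rand(100) * 60

    models = {
        "PLS-1": PLSRegression(n_components=8),
        "PCR": make_pipeline(PCA(n_components=8), LinearRegression()),
        "lasso": Lasso(alpha=0.01, max_iter=10000),
    }
    for name, model in models.items():
        rmse = -cross_val_score(model, spectra, sio2, cv=5,
                                scoring="neg_root_mean_squared_error").mean()
        print(f"{name}: cross-validated RMSE = {rmse:.2f}")
    ```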

  3. Principal Component Clustering Approach to Teaching Quality Discriminant Analysis

    ERIC Educational Resources Information Center

    Xian, Sidong; Xia, Haibo; Yin, Yubo; Zhai, Zhansheng; Shang, Yan

    2016-01-01

    Teaching quality is the lifeline of higher education, and many universities have made effective achievements in evaluating it. In this paper, we establish a Students' Evaluation of Teaching (SET) discriminant analysis model and algorithm based on principal component clustering analysis. Additionally, we classify the SET…

  4. Analysis of the principal component algorithm in phase-shifting interferometry.

    PubMed

    Vargas, J; Quiroga, J Antonio; Belenguer, T

    2011-06-15

    We recently presented a new asynchronous demodulation method for phase-sampling interferometry. The method is based on the principal component analysis (PCA) technique. In that earlier work, the PCA method was derived heuristically. In this work, we present an in-depth analysis of the PCA demodulation method.
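
    A minimal sketch of PCA-based asynchronous demodulation, following the general idea of the method referred to above rather than the authors' exact implementation: stack the phase-shifted interferograms, remove the temporal mean, and treat the two leading spatial components as quadrature signals. The fringe pattern and phase shifts below are synthetic.

    ```python
    import numpy as np

    # Synthetic stack of M phase-shifted interferograms of size N x N.
    M, N = 10, 128
    x, y = np.meshgrid(np.linspace(-1, 1, N), np.linspace(-1, 1, N))
    true_phase = 4 * np.pi * (x**2 + y**2)
    shifts = np.random.uniform(0, 2 * np.pi, M)      # unknown, non-uniform phase shifts
    frames = np.array([np.cos(true_phase + d) for d in shifts])

    # Remove the temporal mean (background) and compute the principal components
    # over the pixel dimension via SVD.
    data = frames.reshape(M, -1)
    data -= data.mean(axis=0, keepdims=True)
    U, s, Vt = np.linalg.svd(data, full_matrices=False)

    # The two leading spatial components are approximately in quadrature, so the
    # wrapped phase follows from their ratio (up to sign and a global offset).
    pc1 = Vt[0].reshape(N, N)
    pc2 = Vt[1].reshape(N, N)
    wrapped_phase = np.arctan2(pc2, pc1)
    print(wrapped_phase.shape)
    ```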

  5. Psychometric Measurement Models and Artificial Neural Networks

    ERIC Educational Resources Information Center

    Sese, Albert; Palmer, Alfonso L.; Montano, Juan J.

    2004-01-01

    The study of measurement models in psychometrics by means of dimensionality reduction techniques such as Principal Components Analysis (PCA) is a very common practice. In recent times, an upsurge of interest in the study of artificial neural networks apt to computing a principal component extraction has been observed. Despite this interest, the…

  6. Burst and Principal Components Analyses of MEA Data for 16 Chemicals Describe at Least Three Effects Classes.

    EPA Science Inventory

    Microelectrode arrays (MEAs) detect drug- and chemical-induced changes in neuronal network function and have been used for neurotoxicity screening. As a proof-of-concept, the current study assessed the utility of analytical "fingerprinting" using Principal Components Analysis (P...

  7. Incremental principal component pursuit for video background modeling

    DOEpatents

    Rodriquez-Valderrama, Paul A.; Wohlberg, Brendt

    2017-03-14

    An incremental Principal Component Pursuit (PCP) algorithm for video background modeling that is able to process one frame at a time while adapting to changes in the background, with a computational complexity that allows for real-time processing; it has a low memory footprint and is robust to translational and rotational jitter.

  8. Cosmological Density and Power Spectrum from Peculiar Velocities: Nonlinear Corrections and Principal Component Analysis

    NASA Astrophysics Data System (ADS)

    Silberman, L.; Dekel, A.; Eldar, A.; Zehavi, I.

    2001-08-01

    We allow for nonlinear effects in the likelihood analysis of galaxy peculiar velocities and obtain ~35% lower values for the cosmological density parameter Ωm and for the amplitude of mass density fluctuations σ8 Ωm^0.6. This result is obtained under the assumption that the power spectrum in the linear regime is of the flat ΛCDM model (h = 0.65, n = 1, COBE normalized) with only Ωm as a free parameter. Since the likelihood is driven by the nonlinear regime, we "break" the power spectrum at kb ~ 0.2 (h^-1 Mpc)^-1 and fit a power law at k > kb. This allows for independent matching of the nonlinear behavior and an unbiased fit in the linear regime. The analysis assumes Gaussian fluctuations and errors and a linear relation between velocity and density. Tests using mock catalogs that properly simulate nonlinear effects demonstrate that this procedure results in a reduced bias and a better fit. We find for the Mark III and SFI data Ωm = 0.32 ± 0.06 and 0.37 ± 0.09, respectively, with σ8 Ωm^0.6 = 0.49 ± 0.06 and 0.63 ± 0.08, in agreement with constraints from other data. The quoted 90% errors include distance errors and cosmic variance, for fixed values of the other parameters. The improvement in the likelihood due to the nonlinear correction is very significant for Mark III and moderately significant for SFI. When allowing deviations from ΛCDM, we find an indication for a wiggle in the power spectrum: an excess near k ~ 0.05 (h^-1 Mpc)^-1 and a deficiency at k ~ 0.1 (h^-1 Mpc)^-1, or a "cold flow." This may be related to the wiggle seen in the power spectrum from redshift surveys and the second peak in the cosmic microwave background (CMB) anisotropy. A χ^2 test applied to modes of a principal component analysis (PCA) shows that the nonlinear procedure improves the goodness of fit and reduces a spatial gradient that was of concern in the purely linear analysis. The PCA allows us to address spatial features of the data and to evaluate and fine-tune the theoretical and error models. It demonstrates in particular that the models used are appropriate for the cosmological parameter estimation performed. We address the potential for optimal data compression using PCA.

  9. Dynamic competitive probabilistic principal components analysis.

    PubMed

    López-Rubio, Ezequiel; Ortiz-DE-Lazcano-Lobato, Juan Miguel

    2009-04-01

    We present a new neural model which extends the classical competitive learning (CL) by performing a Probabilistic Principal Components Analysis (PPCA) at each neuron. The model also has the ability to learn the number of basis vectors required to represent the principal directions of each cluster, so it overcomes a drawback of most local PCA models, where the dimensionality of a cluster must be fixed a priori. Experimental results are presented to show the performance of the network with multispectral image data.

  10. Application of the principal fractional meta-trigonometric functions for the solution of linear commensurate-order time-invariant fractional differential equations.

    PubMed

    Lorenzo, C F; Hartley, T T; Malti, R

    2013-05-13

    A new and simplified method for the solution of linear constant coefficient fractional differential equations of any commensurate order is presented. The solutions are based on the R-function and on specialized Laplace transform pairs derived from the principal fractional meta-trigonometric functions. The new method simplifies the solution of such fractional differential equations and presents the solutions in the form of real functions as opposed to fractional complex exponential functions, and thus is directly applicable to real-world physics.

  11. A principal components model of soundscape perception.

    PubMed

    Axelsson, Östen; Nilsson, Mats E; Berglund, Birgitta

    2010-11-01

    There is a need for a model that identifies underlying dimensions of soundscape perception, and which may guide measurement and improvement of soundscape quality. With the purpose to develop such a model, a listening experiment was conducted. One hundred listeners measured 50 excerpts of binaural recordings of urban outdoor soundscapes on 116 attribute scales. The average attribute scale values were subjected to principal components analysis, resulting in three components: Pleasantness, eventfulness, and familiarity, explaining 50, 18 and 6% of the total variance, respectively. The principal-component scores were correlated with physical soundscape properties, including categories of dominant sounds and acoustic variables. Soundscape excerpts dominated by technological sounds were found to be unpleasant, whereas soundscape excerpts dominated by natural sounds were pleasant, and soundscape excerpts dominated by human sounds were eventful. These relationships remained after controlling for the overall soundscape loudness (Zwicker's N(10)), which shows that 'informational' properties are substantial contributors to the perception of soundscape. The proposed principal components model provides a framework for future soundscape research and practice. In particular, it suggests which basic dimensions are necessary to measure, how to measure them by a defined set of attribute scales, and how to promote high-quality soundscapes.

  12. Simultaneous Determination of Multiple Classes of Hydrophilic and Lipophilic Components in Shuang-Huang-Lian Oral Liquid Formulations by UPLC-Triple Quadrupole Linear Ion Trap Mass Spectrometry.

    PubMed

    Liang, Jun; Sun, Hui-Min; Wang, Tian-Long

    2017-11-24

    The Shuang-Huang-Lian (SHL) oral liquid is a combined herbal prescription used in the treatment of acute upper respiratory tract infection, acute bronchitis and pneumonia. Multiple constituents are considered to be responsible for the therapeutic effects of SHL. However, the quantitation of multiple components from multiple classes is still unsatisfactory because of the high complexity of the constituents in SHL. In this study, an accurate, rapid, and specific UPLC-MS/MS method was established for simultaneous quantification of 18 compounds from multiple classes in SHL oral liquid formulations. Chromatographic separation was performed on a HSS T3 (1.8 μm, 2.1 mm × 100 mm) column, using a gradient mobile phase system of 0.1% formic acid in acetonitrile and 0.1% formic acid in water at a flow rate of 0.2 mL·min⁻¹; the run time was 23 min. The MS was operated in negative electrospray ionization (ESI⁻) mode for analysis of the 18 compounds using multiple reaction monitoring (MRM). The UPLC-ESI⁻-MRM-MS/MS method showed good linear relationships (R² > 0.999), repeatability (RSD < 3%), precision (RSD < 3%) and recovery (84.03-101.62%). The validated method was successfully used to determine multiple classes of hydrophilic and lipophilic components in the SHL oral liquids. Finally, principal component analysis (PCA) was used to classify and differentiate SHL oral liquid samples attributed to different manufacturers in China. The proposed UPLC-ESI⁻-MRM-MS/MS method coupled with PCA has been shown to be a simple and reliable method for quality evaluation of SHL oral liquids.

  13. Fusion of Modis and Palsar Principal Component Images Through Curvelet Transform for Land Cover Classification

    NASA Astrophysics Data System (ADS)

    Singh, Dharmendra; Kumar, Harish

    Earth observation satellites provide data that cover different portions of the electromagnetic spectrum at different spatial and spectral resolutions. The increasing availability of information products generated from satellite images is extending our ability to understand the patterns and dynamics of earth resource systems at all scales of inquiry. One of the most important applications is the generation of land cover classifications from satellite images for understanding the actual status of various land cover classes. The prospect for the use of satellite images in land cover classification is extremely promising, and the quality of satellite images available for land-use mapping is improving rapidly with the development of advanced sensor technology. Particularly noteworthy in this regard is the improved spatial and spectral resolution of the images captured by new satellite sensors such as MODIS, ASTER, Landsat 7, and SPOT 5. For the full exploitation of increasingly sophisticated multisource data, fusion techniques are being developed. Fused images may enhance interpretation capabilities. The images used for fusion have different temporal and spatial resolutions; therefore, the fused image provides a more complete view of the observed objects. One of the main aims of image fusion is to integrate different data in order to obtain more information than can be derived from each of the single-sensor data sets alone. A good example of this is the fusion of images acquired by different sensors with different spatial and spectral resolutions. Researchers have been applying fusion techniques for three decades and have proposed various useful methods. The high-quality synthesis of spectral information is well suited to, and has been implemented for, land cover classification. More recently, an underlying multiresolution analysis employing the discrete wavelet transform has been used in image fusion. It was found that multisensor image fusion is a tradeoff between the spectral information from a low-resolution multi-spectral image and the spatial information from a high-resolution multi-spectral image; with the wavelet-transform-based fusion method, it is easy to control this tradeoff. A newer transform, the curvelet transform, was introduced in recent years by Starck. A ridgelet transform is applied to square blocks of the detail frames of an undecimated wavelet decomposition, and the curvelet transform is thus obtained. Since the ridgelet transform possesses basis functions matching directional straight lines, the curvelet transform is capable of representing piecewise linear contours on multiple scales through few significant coefficients. This property leads to a better separation between geometric details and background noise, which may be easily reduced by thresholding the curvelet coefficients before they are used for fusion. The Terra and Aqua Moderate Resolution Imaging Spectroradiometer (MODIS) instrument provides high radiometric sensitivity (12 bit) in 36 spectral bands ranging in wavelength from 0.4 µm to 14.4 µm, and the data are freely available. Two bands are imaged at a nominal resolution of 250 m at nadir, five bands at 500 m, and the remaining 29 bands at 1 km. In this paper, band 1 (spatial resolution 250 m, bandwidth 620-670 nm) and band 2 (spatial resolution 250 m, bandwidth 842-876 nm) are considered, as these bands have special features for identifying agriculture and other land covers.
In January 2006, the Advanced Land Observing Satellite (ALOS) was successfully launched by the Japan Aerospace Exploration Agency (JAXA). The Phased Array type L-band SAR (PALSAR) sensor onboard the satellite acquires SAR imagery at a wavelength of 23.5 cm (frequency 1.27 GHz) with multimode and multipolarization observation capabilities. PALSAR can operate in several modes: the fine-beam single (FBS) polarization mode (HH), the fine-beam dual (FBD) polarization mode (HH/HV or VV/VH), the polarimetric (PLR) mode (HH/HV/VH/VV), and the ScanSAR (WB) mode (HH/VV) [15]. These capabilities make PALSAR imagery very attractive for a spatially and temporally consistent monitoring system. The key idea of Principal Component Analysis is that most of the information within all the bands can be compressed into a much smaller number of bands with little loss of information. It allows us to extract the low-dimensional subspaces that capture the main linear correlations among the high-dimensional image data. This facilitates viewing the explained variance or signal in the available imagery, allowing both gross and more subtle features in the imagery to be seen. In this paper we explore a fusion technique for enhancing the land cover classification of low-resolution satellite data, especially freely available satellite data. For this purpose, we fuse the PALSAR principal component data with the MODIS principal component data. Initially, MODIS band 1 and band 2 are considered and their principal components are computed; similarly, the PALSAR HH, HV and VV polarized data are considered and their principal components are computed. Consequently, the PALSAR principal component image is fused with the MODIS principal component image. The aim of this paper is to analyze the effect of fusing PALSAR data with MODIS data on the classification accuracy for major land cover types such as agriculture, water and urban areas. The curvelet transform has been applied for the fusion of these two satellite images, and the Minimum Distance classification technique has been applied to the resulting fused image. It is qualitatively and visually observed that the overall classification accuracy of the MODIS image is enhanced after fusion. This type of fusion technique may be quite helpful in the near future for using freely available satellite data to develop monitoring systems for different land cover classes on the earth.
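
    As a rough illustration of the principal component step described in this record (not the authors' full curvelet-fusion pipeline), the sketch below compresses a multi-band stack into its first principal component image, as would be done separately for the MODIS (band 1, band 2) and PALSAR (HH, HV, VV) inputs. The arrays `modis` and `palsar` are hypothetical placeholders assumed to be co-registered to a common grid.

```python
# Minimal sketch, assuming co-registered band stacks of shape (rows, cols, n_bands).
import numpy as np
from sklearn.decomposition import PCA

def first_pc_image(stack):
    """Return the first principal component of an (H, W, B) band stack as an (H, W) image."""
    h, w, b = stack.shape
    flat = stack.reshape(-1, b).astype(float)       # one row per pixel, one column per band
    pc1 = PCA(n_components=1).fit_transform(flat)   # scores on the first PC
    return pc1.reshape(h, w)

# modis_pc = first_pc_image(modis)    # MODIS bands 1-2 -> one PC image (hypothetical input)
# palsar_pc = first_pc_image(palsar)  # PALSAR HH/HV/VV -> one PC image (hypothetical input)
# These two PC images are what a curvelet-based fusion step would then combine.
```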

  14. Application of principal component analysis in protein unfolding: an all-atom molecular dynamics simulation study.

    PubMed

    Das, Atanu; Mukhopadhyay, Chaitali

    2007-10-28

    We have performed molecular dynamics (MD) simulation of the thermal denaturation of one protein and one peptide, ubiquitin and melittin. To identify the correlation in dynamics among various secondary structural fragments and also the individual contributions of different residues towards thermal unfolding, the principal component analysis method was applied to give new insight into protein dynamics by analyzing the contributions of the coefficients of the principal components. The cross-correlation matrix obtained from the MD simulation trajectory provided important information regarding the anisotropy of backbone dynamics that leads to unfolding. Unfolding of ubiquitin was found to be a three-state process, while that of melittin, though smaller and mostly helical, is more complicated.

  15. Application of principal component analysis in protein unfolding: An all-atom molecular dynamics simulation study

    NASA Astrophysics Data System (ADS)

    Das, Atanu; Mukhopadhyay, Chaitali

    2007-10-01

    We have performed molecular dynamics (MD) simulation of the thermal denaturation of one protein and one peptide, ubiquitin and melittin. To identify the correlation in dynamics among various secondary structural fragments and also the individual contributions of different residues towards thermal unfolding, the principal component analysis method was applied to give new insight into protein dynamics by analyzing the contributions of the coefficients of the principal components. The cross-correlation matrix obtained from the MD simulation trajectory provided important information regarding the anisotropy of backbone dynamics that leads to unfolding. Unfolding of ubiquitin was found to be a three-state process, while that of melittin, though smaller and mostly helical, is more complicated.
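
    For readers unfamiliar with trajectory PCA, the sketch below shows the generic calculation the two records above rely on: eigen-decomposing the covariance of coordinate fluctuations and reading per-residue contributions off the leading eigenvectors. The `coords` array is a hypothetical placeholder for an aligned trajectory, not the authors' ubiquitin or melittin data.

```python
# Minimal sketch, assuming `coords` has shape (n_frames, n_atoms * 3).
import numpy as np

def trajectory_pca(coords):
    centered = coords - coords.mean(axis=0)   # remove the average structure
    cov = np.cov(centered, rowvar=False)      # covariance of coordinate fluctuations
    evals, evecs = np.linalg.eigh(cov)        # symmetric eigendecomposition
    order = np.argsort(evals)[::-1]           # sort by decreasing variance
    return evals[order], evecs[:, order]

# evals, evecs = trajectory_pca(coords)
# evecs[:, 0] holds the coefficients of the first PC; grouping its entries by
# residue gives each residue's contribution to the dominant unfolding motion.
```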

  16. SAS program for quantitative stratigraphic correlation by principal components

    USGS Publications Warehouse

    Hohn, M.E.

    1985-01-01

    A SAS program is presented which constructs a composite section of stratigraphic events through principal components analysis. The variables in the analysis are stratigraphic sections and the observational units are the range limits of taxa. The program standardizes the data in each section, extracts eigenvectors, estimates missing range limits, and computes the composite section from the scores of events on the first principal component. Several types of diagnostic plots are provided as an option; these help one to identify conservative range limits or unrealistic estimates of missing values. Inspection of the graphs and eigenvalues allows one to evaluate the goodness of fit between the composite and the measured data. The program is easily extended to the creation of a rank-order composite.
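
    The following is a rough Python analogue of the workflow this SAS program implements (not the program itself): standardize each section, extract the first eigenvector, and order events by their scores on the first principal component. The `ranges` matrix is hypothetical, and the program's careful estimation of missing range limits is simplified here to a crude zero fill after standardization.

```python
# Sketch only: composite ordering of events from first-PC scores.
import numpy as np

def composite_section(ranges):
    """ranges: hypothetical (n_events, n_sections) matrix of range limits, NaN where missing."""
    col_mean = np.nanmean(ranges, axis=0)
    col_std = np.nanstd(ranges, axis=0)
    z = (ranges - col_mean) / col_std         # standardize within each section
    z = np.where(np.isnan(z), 0.0, z)         # crude stand-in for the program's missing-value estimation
    cov = np.cov(z, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)
    pc1 = evecs[:, np.argmax(evals)]
    scores = z @ pc1                          # event scores on the first principal component
    return np.argsort(scores)                 # composite ordering of events

# order = composite_section(ranges)  # indices of events from lowest to highest in the composite
```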

  17. A novel principal component analysis for spatially misaligned multivariate air pollution data.

    PubMed

    Jandarov, Roman A; Sheppard, Lianne A; Sampson, Paul D; Szpiro, Adam A

    2017-01-01

    We propose novel methods for predictive (sparse) PCA with spatially misaligned data. These methods identify principal component loading vectors that explain as much variability in the observed data as possible, while also ensuring the corresponding principal component scores can be predicted accurately by means of spatial statistics at locations where air pollution measurements are not available. This will make it possible to identify important mixtures of air pollutants and to quantify their health effects in cohort studies, where currently available methods cannot be used. We demonstrate the utility of predictive (sparse) PCA in simulated data and apply the approach to annual averages of particulate matter speciation data from national Environmental Protection Agency (EPA) regulatory monitors.

  18. Principals' Perceptions of Collegial Support as a Component of Administrative Inservice.

    ERIC Educational Resources Information Center

    Daresh, John C.

    To address the problem of increasing professional isolation of building administrators, the Principals' Inservice Project helps establish principals' collegial support groups across the nation. The groups are typically composed of 6 to 10 principals who meet at least once each month over a 2-year period. One collegial support group of seven…

  19. Training the Trainers: Learning to Be a Principal Supervisor

    ERIC Educational Resources Information Center

    Saltzman, Amy

    2017-01-01

    While most principal supervisors are former principals themselves, few come to the role with specific training in how to do the job effectively. For this reason, both the Washington, D.C., and Tulsa, Oklahoma, principal supervisor programs include a strong professional development component. In this article, the author takes a look inside these…

  20. Preliminary Geologic/spectral Analysis of LANDSAT-4 Thematic Mapper Data, Wind River/bighorn Basin Area, Wyoming

    NASA Technical Reports Server (NTRS)

    Lang, H. R.; Conel, J. E.; Paylor, E. D.

    1984-01-01

    A LIDQA evaluation for geologic applications of a LANDSAT TM scene covering the Wind River/Bighorn Basin area, Wyoming, is examined. This involves a quantitative assessment of data quality including spatial and spectral characteristics. Analysis is concentrated on the 6 visible, near infrared, and short wavelength infrared bands. Preliminary analysis demonstrates that: (1) principal component images derived from the correlation matrix provide the most useful geologic information; (2) to extract surface spectral reflectance, the TM radiance data must be calibrated, and scatterplots demonstrate that the TM data can be calibrated and that sensor response is essentially linear; and (3) low instrumental offset and gain settings result in spectral data that do not utilize the full dynamic range of the TM system.

  1. Modeling and analysis of several classes of self-oscillating inverters. I - State-plane representations. II - Model extension, classification, and duality relationships

    NASA Technical Reports Server (NTRS)

    Lee, F. C. Y.; Wilson, T. G.

    1982-01-01

    The present investigation is concerned with an important class of power conditioning networks, taking into account self-oscillating dc-to-square-wave transistor inverters. The considered circuits are widely used both as the principal power converting and processing means in many systems and as low-power analog-to-discrete-time converters for controlling the switching of the output-stage semiconductors in a variety of power conditioning systems. Aspects of piecewise-linear modeling are discussed, taking into consideration component models, and an equivalent-circuit model. Questions of singular point analysis and state plane representation are also investigated, giving attention to limit cycles, starting circuits, the region of attraction, a hard oscillator, and a soft oscillator.

  2. Use of Geochemistry Data Collected by the Mars Exploration Rover Spirit in Gusev Crater to Teach Geomorphic Zonation through Principal Components Analysis

    ERIC Educational Resources Information Center

    Rodrigue, Christine M.

    2011-01-01

    This paper presents a laboratory exercise used to teach principal components analysis (PCA) as a means of surface zonation. The lab was built around abundance data for 16 oxides and elements collected by the Mars Exploration Rover Spirit in Gusev Crater between Sol 14 and Sol 470. Students used PCA to reduce 15 of these into 3 components, which,…

  3. A Principal Components Analysis and Validation of the Coping with the College Environment Scale (CWCES)

    ERIC Educational Resources Information Center

    Ackermann, Margot Elise; Morrow, Jennifer Ann

    2008-01-01

    The present study describes the development and initial validation of the Coping with the College Environment Scale (CWCES). Participants included 433 college students who took an online survey. Principal Components Analysis (PCA) revealed six coping strategies: planning and self-management, seeking support from institutional resources, escaping…

  4. Wavelet based de-noising of breath air absorption spectra profiles for improved classification by principal component analysis

    NASA Astrophysics Data System (ADS)

    Kistenev, Yu. V.; Shapovalov, A. V.; Borisov, A. V.; Vrazhnov, D. A.; Nikolaev, V. V.; Nikiforova, O. Yu.

    2015-11-01

    The results of comparing different mother wavelets used for de-noising of model and experimental data, represented by profiles of absorption spectra of exhaled air, are presented. The impact of wavelet de-noising on the quality of classification by principal component analysis is also discussed.
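
    A hedged sketch of the preprocessing described above is given below: wavelet de-noising of a single absorption-spectrum profile before PCA-based classification, using the PyWavelets package. The mother wavelet ('db4'), decomposition level and threshold rule are illustrative choices, not the authors' settings, and the `spectra` array in the usage comment is hypothetical.

```python
import numpy as np
import pywt

def wavelet_denoise(spectrum, wavelet="db4", level=4):
    coeffs = pywt.wavedec(spectrum, wavelet, level=level)
    # universal threshold estimated from the finest-scale detail coefficients
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745
    thresh = sigma * np.sqrt(2 * np.log(len(spectrum)))
    coeffs = [coeffs[0]] + [pywt.threshold(c, thresh, mode="soft") for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)[: len(spectrum)]

# denoised = np.vstack([wavelet_denoise(s) for s in spectra])  # then feed the rows to PCA
```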

  5. Evaluation of skin melanoma in spectral range 450-950 nm using principal component analysis

    NASA Astrophysics Data System (ADS)

    Jakovels, D.; Lihacova, I.; Kuzmina, I.; Spigulis, J.

    2013-06-01

    Diagnostic potential of principal component analysis (PCA) of multi-spectral imaging data in the wavelength range 450-950 nm for distant skin melanoma recognition is discussed. Processing of the measured clinical data by means of PCA resulted in clear separation between malignant melanomas and pigmented nevi.

  6. 40 CFR 60.2998 - What are the principal components of the model rule?

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... the model rule? 60.2998 Section 60.2998 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) AIR PROGRAMS (CONTINUED) STANDARDS OF PERFORMANCE FOR NEW STATIONARY SOURCES Emission Guidelines... December 9, 2004 Model Rule-Use of Model Rule § 60.2998 What are the principal components of the model rule...

  7. 40 CFR 60.2998 - What are the principal components of the model rule?

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... the model rule? 60.2998 Section 60.2998 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) AIR PROGRAMS (CONTINUED) STANDARDS OF PERFORMANCE FOR NEW STATIONARY SOURCES Emission Guidelines... December 9, 2004 Model Rule-Use of Model Rule § 60.2998 What are the principal components of the model rule...

  8. 40 CFR 60.2998 - What are the principal components of the model rule?

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... the model rule? 60.2998 Section 60.2998 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) AIR PROGRAMS (CONTINUED) STANDARDS OF PERFORMANCE FOR NEW STATIONARY SOURCES Emission Guidelines... December 9, 2004 Model Rule-Use of Model Rule § 60.2998 What are the principal components of the model rule...

  9. 40 CFR 60.1580 - What are the principal components of the model rule?

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... the model rule? 60.1580 Section 60.1580 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) AIR PROGRAMS (CONTINUED) STANDARDS OF PERFORMANCE FOR NEW STATIONARY SOURCES Emission Guidelines..., 1999 Use of Model Rule § 60.1580 What are the principal components of the model rule? The model rule...

  10. 40 CFR 60.2998 - What are the principal components of the model rule?

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... the model rule? 60.2998 Section 60.2998 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) AIR PROGRAMS (CONTINUED) STANDARDS OF PERFORMANCE FOR NEW STATIONARY SOURCES Emission Guidelines... December 9, 2004 Model Rule-Use of Model Rule § 60.2998 What are the principal components of the model rule...

  11. Students' Perceptions of Teaching and Learning Practices: A Principal Component Approach

    ERIC Educational Resources Information Center

    Mukorera, Sophia; Nyatanga, Phocenah

    2017-01-01

    Students' attendance and engagement with teaching and learning practices are perceived as critical elements for academic performance. Even with stipulated attendance policies, students still choose not to engage. The study employed a principal component analysis to analyze first- and second-year students' perceptions of the importance of the 12…

  12. Principal Perspectives about Policy Components and Practices for Reducing Cyberbullying in Urban Schools

    ERIC Educational Resources Information Center

    Hunley-Jenkins, Keisha Janine

    2012-01-01

    This qualitative study explores large, urban, mid-western principal perspectives about cyberbullying and the policy components and practices that they have found effective and ineffective at reducing its occurrence and/or negative effect on their schools' learning environments. More specifically, the researcher was interested in learning more…

  13. Learning Principal Component Analysis by Using Data from Air Quality Networks

    ERIC Educational Resources Information Center

    Perez-Arribas, Luis Vicente; Leon-González, María Eugenia; Rosales-Conrado, Noelia

    2017-01-01

    With the final objective of using computational and chemometrics tools in chemistry studies, this paper shows the methodology and interpretation of Principal Component Analysis (PCA) using pollution data from different cities. This paper describes how students can obtain data on air quality and process such data for additional information…

  14. Applications of Nonlinear Principal Components Analysis to Behavioral Data.

    ERIC Educational Resources Information Center

    Hicks, Marilyn Maginley

    1981-01-01

    An empirical investigation of the statistical procedure entitled nonlinear principal components analysis was conducted on a known equation and on measurement data in order to demonstrate the procedure and examine its potential usefulness. This method was suggested by R. Gnanadesikan and based on an early paper of Karl Pearson. (Author/AL)

  15. Relationships between Association of Research Libraries (ARL) Statistics and Bibliometric Indicators: A Principal Components Analysis

    ERIC Educational Resources Information Center

    Hendrix, Dean

    2010-01-01

    This study analyzed 2005-2006 Web of Science bibliometric data from institutions belonging to the Association of Research Libraries (ARL) and corresponding ARL statistics to find any associations between indicators from the two data sets. Principal components analysis on 36 variables from 103 universities revealed obvious associations between…

  16. Principal component analysis for protein folding dynamics.

    PubMed

    Maisuradze, Gia G; Liwo, Adam; Scheraga, Harold A

    2009-01-09

    Protein folding is considered here by studying the dynamics of the folding of the triple beta-strand WW domain from the Formin-binding protein 28. Starting from the unfolded state and ending either in the native or nonnative conformational states, trajectories are generated with the coarse-grained united residue (UNRES) force field. The effectiveness of principal components analysis (PCA), an already established mathematical technique for finding global, correlated motions in atomic simulations of proteins, is evaluated here for coarse-grained trajectories. The problems related to PCA and their solutions are discussed. The folding and nonfolding of proteins are examined with free-energy landscapes. Detailed analyses of many folding and nonfolding trajectories at different temperatures show that PCA is very efficient for characterizing the general folding and nonfolding features of proteins. It is shown that the first principal component captures and describes in detail the dynamics of a system. Anomalous diffusion in the folding/nonfolding dynamics is examined by the mean-square displacement (MSD) and the fractional diffusion and fractional kinetic equations. The collisionless (or ballistic) behavior of a polypeptide undergoing Brownian motion along the first few principal components is accounted for.
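
    To make the free-energy-landscape analysis mentioned above concrete, the sketch below builds a landscape from the first two principal components of a folding trajectory via F = -kT ln P. This is a generic illustration under assumed inputs (`pc_scores` is a hypothetical array of PC1/PC2 scores), not the UNRES workflow itself.

```python
# Illustrative free-energy landscape over the first two PCs.
import numpy as np

def free_energy_landscape(pc_scores, bins=60, kT=1.0):
    """pc_scores: hypothetical (n_frames, 2) array of scores on PC1 and PC2."""
    hist, xedges, yedges = np.histogram2d(pc_scores[:, 0], pc_scores[:, 1], bins=bins)
    prob = hist / hist.sum()
    with np.errstate(divide="ignore"):
        free_energy = -kT * np.log(prob)      # F = -kT ln P, in units of kT; empty bins become +inf
    free_energy -= np.min(free_energy[np.isfinite(free_energy)])
    return free_energy, xedges, yedges

# fel, xe, ye = free_energy_landscape(pc_scores)
# Low-F basins correspond to folded/misfolded states visited along the trajectory.
```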

  17. Principal Component 2-D Long Short-Term Memory for Font Recognition on Single Chinese Characters.

    PubMed

    Tao, Dapeng; Lin, Xu; Jin, Lianwen; Li, Xuelong

    2016-03-01

    Chinese character font recognition (CCFR) has received increasing attention as intelligent applications based on optical character recognition become popular. However, traditional CCFR systems do not handle noisy data effectively. By analyzing in detail the basic strokes of Chinese characters, we propose that font recognition on a single Chinese character is a sequence classification problem, which can be effectively solved by recurrent neural networks. For robust CCFR, we integrate a principal component convolution layer with the 2-D long short-term memory (2DLSTM) and develop the principal component 2DLSTM (PC-2DLSTM) algorithm. PC-2DLSTM considers two aspects: 1) the principal component layer convolution operation helps remove the noise and obtain rational and complete font information, and 2) simultaneously, 2DLSTM handles the long-range contextual processing along scan directions, which helps capture the contrast between the character trajectory and the background. Experiments using the frequently used CCFR dataset suggest the effectiveness of PC-2DLSTM compared with other state-of-the-art font recognition methods.

  18. Dynamic of consumer groups and response of commodity markets by principal component analysis

    NASA Astrophysics Data System (ADS)

    Nobi, Ashadun; Alam, Shafiqul; Lee, Jae Woo

    2017-09-01

    This study investigates financial states and group dynamics by applying principal component analysis to the cross-correlation coefficients of the daily returns of commodity futures. The eigenvalues of the cross-correlation matrix in the 6-month timeframe display similar values during 2010-2011, but decline following 2012. A sharp drop in an eigenvalue implies a significant change in the market state. Three commodity sectors, energy, metals and agriculture, are projected into a two-dimensional space consisting of the first two principal components (PCs). We observe that they form three distinct clusters corresponding to the sectors. However, commodities with distinct features intermingled with one another and scattered during severe crises, such as the European sovereign debt crisis. We observe notable changes in the positions of the groups in the two-dimensional space during financial crises. By considering the first principal component (PC1) within the 6-month moving timeframe, we observe that commodities of the same group change states in a similar pattern, and that the change of states of one group can be used as a warning for the other groups.

  19. [Determination and principal component analysis of mineral elements based on ICP-OES in Nitraria roborowskii fruits from different regions].

    PubMed

    Yuan, Yuan-Yuan; Zhou, Yu-Bi; Sun, Jing; Deng, Juan; Bai, Ying; Wang, Jie; Lu, Xue-Feng

    2017-06-01

    The contents of mineral elements in Nitraria roborowskii samples from fifteen different regions were determined by inductively coupled plasma-atomic emission spectrometry (ICP-OES), and their elemental characteristics were analyzed by principal component analysis. The results indicated that 18 mineral elements were detected in N. roborowskii, whereas V could not be detected. The contents of Na, K and Ca were high. Ti showed the largest variation in content, while K showed the smallest. Four principal components were extracted from the original data; the cumulative variance contribution rate was 81.542%, and the variance contribution of the first principal component was 44.997%, indicating that Cr, Fe, P and Ca were the characteristic elements of N. roborowskii. Thus, the established method is simple and precise and can be used for the determination of mineral elements in N. roborowskii Kom. fruits. The elemental distribution characteristics among N. roborowskii fruits are related to their geographical origins, as clearly revealed by PCA. These results will provide a good basis for the comprehensive utilization of N. roborowskii. Copyright© by the Chinese Pharmaceutical Association.

  20. [Applications of three-dimensional fluorescence spectrum of dissolved organic matter to identification of red tide algae].

    PubMed

    Lü, Gui-Cai; Zhao, Wei-Hong; Wang, Jiang-Tao

    2011-01-01

    Identification techniques for 10 species of red tide algae often found in the coastal areas of China were developed by combining the three-dimensional fluorescence spectra of fluorescent dissolved organic matter (FDOM) from cultured red tide algae with principal component analysis. Based on the results of the principal component analysis, the first principal component loading spectrum of the three-dimensional fluorescence spectrum was chosen as the identification characteristic spectrum for red tide algae, and the phytoplankton fluorescence characteristic spectrum band was established. The 10 algal species were then tested using Bayesian discriminant analysis, with a correct identification rate of more than 92% for Pyrrophyta at the species level and of more than 75% for Bacillariophyta at the genus level, within which the correct identification rates for Phaeodactylum and Chaetoceros exceeded 90%. The results showed that the identification techniques for the 10 species of red tide algae, based on the three-dimensional fluorescence spectra of FDOM from cultured red tide algae combined with principal component analysis, work well.

  1. Stationary Wavelet-based Two-directional Two-dimensional Principal Component Analysis for EMG Signal Classification

    NASA Astrophysics Data System (ADS)

    Ji, Yi; Sun, Shanlin; Xie, Hong-Bo

    2017-06-01

    Discrete wavelet transform (WT) followed by principal component analysis (PCA) has been a powerful approach for the analysis of biomedical signals. Wavelet coefficients at various scales and channels are usually transformed into a one-dimensional array, causing issues such as the curse of dimensionality and the small sample size problem. In addition, the lack of time-shift invariance of the WT coefficients acts as noise and degrades classifier performance. In this study, we present a stationary wavelet-based two-directional two-dimensional principal component analysis (SW2D2PCA) method for the efficient and effective extraction of essential feature information from signals. Time-invariant multi-scale matrices are constructed in the first step. The two-directional two-dimensional principal component analysis then operates on the multi-scale matrices to reduce the dimension, rather than on vectors as in conventional PCA. Results are presented from an experiment to classify eight hand motions using 4-channel electromyographic (EMG) signals recorded in healthy subjects and amputees, which illustrates the efficiency and effectiveness of the proposed method for biomedical signal analysis.
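
    The sketch below shows the two-directional two-dimensional PCA step at the core of the method described above, operating on matrices rather than vectors; the stationary-wavelet construction of the multi-scale matrices is omitted. The `mats` array is a hypothetical stack of one matrix per observation, and the projection sizes are illustrative.

```python
# Minimal (2D)^2 PCA sketch: project each (rows, cols) matrix onto row- and column-eigenvectors.
import numpy as np

def two_directional_2dpca(mats, n_row, n_col):
    """mats: hypothetical array of shape (n_samples, rows, cols)."""
    centered = mats - mats.mean(axis=0)
    g_col = np.einsum("nij,nik->jk", centered, centered) / len(mats)  # (cols, cols) scatter
    g_row = np.einsum("nij,nkj->ik", centered, centered) / len(mats)  # (rows, rows) scatter
    _, vec_col = np.linalg.eigh(g_col)
    _, vec_row = np.linalg.eigh(g_row)
    x = vec_col[:, -n_col:]                    # top right-projection vectors
    z = vec_row[:, -n_row:]                    # top left-projection vectors
    # feature matrix Z^T A X for every sample, shape (n_samples, n_row, n_col)
    return np.einsum("ri,nrc,cj->nij", z, centered, x)

# features = two_directional_2dpca(mats, n_row=8, n_col=8).reshape(len(mats), -1)
```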

  2. Hyperspectral optical imaging of human iris in vivo: characteristics of reflectance spectra

    NASA Astrophysics Data System (ADS)

    Medina, José M.; Pereira, Luís M.; Correia, Hélder T.; Nascimento, Sérgio M. C.

    2011-07-01

    We report a hyperspectral imaging system to measure the reflectance spectra of real human irises with high spatial resolution. A set of ocular prostheses was used as the control condition. Reflectance data were decorrelated by principal-component analysis. The main conclusion is that the spectral complexity of the human iris is considerable: between 9 and 11 principal components are necessary to account for 99% of the cumulative variance in human irises. Correcting image misalignments associated with spontaneous ocular movements did not influence this result. The data also suggest a correlation between the first principal component and the different levels of melanin present in the irises. It was also found that although the radial and angular positions of the selected iridal areas did not affect the spectral characteristics of the first five principal components, they did affect the higher-order ones, suggesting a possible influence of iris texture. The results show that hyperspectral imaging of the iris, together with adequate spectroscopic analyses, provides more information than conventional colorimetric methods, making hyperspectral imaging suitable for the characterization of melanin and the noninvasive diagnosis of ocular diseases and iris color.
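
    The cumulative-variance criterion used above (how many components are needed to reach 99%) is easy to reproduce; the sketch below uses a random placeholder matrix standing in for an (n_pixels, n_wavelengths) set of reflectance spectra, so the printed number is only illustrative.

```python
import numpy as np
from sklearn.decomposition import PCA

# Placeholder data: substitute the real hyperspectral reflectance matrix here.
spectra = np.random.rand(500, 240)

pca = PCA().fit(spectra)
cumulative = np.cumsum(pca.explained_variance_ratio_)
n_components_99 = int(np.searchsorted(cumulative, 0.99)) + 1  # first index reaching 99%
print(f"{n_components_99} components account for 99% of the cumulative variance")
```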

  3. Seeing wholes: The concept of systems thinking and its implementation in school leadership

    NASA Astrophysics Data System (ADS)

    Shaked, Haim; Schechter, Chen

    2013-12-01

    Systems thinking (ST) is an approach advocating thinking about any given issue as a whole, emphasising the interrelationships between its components rather than the components themselves. This article aims to link ST and school leadership, claiming that ST may enable school principals to develop highly performing schools that can cope successfully with current challenges, which are more complex than ever before in today's era of accountability and high expectations. The article presents the concept of ST - its definition, components, history and applications. Thereafter, its connection to education and its contribution to school management are described. The article concludes by discussing practical processes including screening for ST-skilled principal candidates and developing ST skills among prospective and currently performing school principals, pinpointing three opportunities for skills acquisition: during preparatory programmes; during their first years on the job, supported by veteran school principals as mentors; and throughout their entire career. Such opportunities may not only provide school principals with ST skills but also improve their functioning throughout the aforementioned stages of professional development.

  4. A modified procedure for mixture-model clustering of regional geochemical data

    USGS Publications Warehouse

    Ellefsen, Karl J.; Smith, David B.; Horton, John D.

    2014-01-01

    A modified procedure is proposed for mixture-model clustering of regional-scale geochemical data. The key modification is the robust principal component transformation of the isometric log-ratio transforms of the element concentrations. This principal component transformation and the associated dimension reduction are applied before the data are clustered. The principal advantage of this modification is that it significantly improves the stability of the clustering. The principal disadvantage is that it requires subjective selection of the number of clusters and the number of principal components. To evaluate the efficacy of this modified procedure, it is applied to soil geochemical data that comprise 959 samples from the state of Colorado (USA) for which the concentrations of 44 elements are measured. The distributions of element concentrations that are derived from the mixture model and from the field samples are similar, indicating that the mixture model is a suitable representation of the transformed geochemical data. Each cluster and the associated distributions of the element concentrations are related to specific geologic and anthropogenic features. In this way, mixture model clustering facilitates interpretation of the regional geochemical data.
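
    A hedged sketch of the pipeline described above follows, with ordinary PCA standing in for the robust principal component transformation used in the paper and scikit-learn's Gaussian mixture model standing in for its mixture-model clustering. The `conc` matrix of strictly positive element concentrations is a hypothetical placeholder.

```python
# Sketch: isometric log-ratio transform -> PCA -> mixture-model clustering.
import numpy as np
from scipy.linalg import helmert
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture

def ilr(conc):
    """Isometric log-ratio transform of an (n_samples, n_elements) composition matrix."""
    clr = np.log(conc) - np.log(conc).mean(axis=1, keepdims=True)  # centered log-ratio
    v = helmert(conc.shape[1]).T                                   # orthonormal ilr basis, (D, D-1)
    return clr @ v

def cluster_geochemistry(conc, n_components=4, n_clusters=6):
    scores = PCA(n_components=n_components).fit_transform(ilr(conc))
    return GaussianMixture(n_components=n_clusters, covariance_type="full").fit_predict(scores)

# labels = cluster_geochemistry(conc)  # one cluster label per soil sample
```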

  5. Temporal evolution of financial-market correlations.

    PubMed

    Fenn, Daniel J; Porter, Mason A; Williams, Stacy; McDonald, Mark; Johnson, Neil F; Jones, Nick S

    2011-08-01

    We investigate financial market correlations using random matrix theory and principal component analysis. We use random matrix theory to demonstrate that correlation matrices of asset price changes contain structure that is incompatible with uncorrelated random price changes. We then identify the principal components of these correlation matrices and demonstrate that a small number of components accounts for a large proportion of the variability of the markets that we consider. We characterize the time-evolving relationships between the different assets by investigating the correlations between the asset price time series and principal components. Using this approach, we uncover notable changes that occurred in financial markets and identify the assets that were significantly affected by these changes. We show in particular that there was an increase in the strength of the relationships between several different markets following the 2007-2008 credit and liquidity crisis.

  6. Temporal evolution of financial-market correlations

    NASA Astrophysics Data System (ADS)

    Fenn, Daniel J.; Porter, Mason A.; Williams, Stacy; McDonald, Mark; Johnson, Neil F.; Jones, Nick S.

    2011-08-01

    We investigate financial market correlations using random matrix theory and principal component analysis. We use random matrix theory to demonstrate that correlation matrices of asset price changes contain structure that is incompatible with uncorrelated random price changes. We then identify the principal components of these correlation matrices and demonstrate that a small number of components accounts for a large proportion of the variability of the markets that we consider. We characterize the time-evolving relationships between the different assets by investigating the correlations between the asset price time series and principal components. Using this approach, we uncover notable changes that occurred in financial markets and identify the assets that were significantly affected by these changes. We show in particular that there was an increase in the strength of the relationships between several different markets following the 2007-2008 credit and liquidity crisis.
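
    As an illustration of the random-matrix comparison described in the two records above, the sketch below checks the eigenvalues of an empirical correlation matrix of returns against the Marchenko-Pastur upper edge, one common random-matrix benchmark (which may differ in detail from the authors' exact test). The `returns` array is hypothetical.

```python
# Sketch: flag eigenvalues incompatible with uncorrelated random price changes.
import numpy as np

def informative_eigenvalues(returns):
    """returns: hypothetical (n_times, n_assets) array of asset price changes."""
    t, n = returns.shape
    corr = np.corrcoef(returns, rowvar=False)
    evals = np.linalg.eigvalsh(corr)
    lambda_max = (1.0 + np.sqrt(n / t)) ** 2   # Marchenko-Pastur upper edge for unit-variance noise
    return evals[evals > lambda_max], lambda_max

# significant, bound = informative_eigenvalues(returns)
# Eigenvalues above `bound` correspond to principal components carrying genuine
# market structure rather than noise.
```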

  7. Supervised chemical pattern recognition in almond ( Prunus dulcis ) Portuguese PDO cultivars: PCA- and LDA-based triennial study.

    PubMed

    Barreira, João C M; Casal, Susana; Ferreira, Isabel C F R; Peres, António M; Pereira, José Alberto; Oliveira, M Beatriz P P

    2012-09-26

    Almonds harvested in three years in Trás-os-Montes (Portugal) were characterized to find differences among Protected Designation of Origin (PDO) Amêndoa Douro and commercial non-PDO cultivars. Nutritional parameters, fiber (neutral and acid detergent fibers, acid detergent lignin, and cellulose), fatty acids, triacylglycerols (TAG), and tocopherols were evaluated. Fat was the major component, followed by carbohydrates, protein, and moisture. Fatty acids were mostly detected as monounsaturated and polyunsaturated forms, with relevance of oleic and linoleic acids. Accordingly, 1,2,3-trioleoylglycerol and 1,2-dioleoyl-3-linoleoylglycerol were the major TAG. α-Tocopherol was the leading tocopherol. To verify statistical differences among PDO and non-PDO cultivars independent of the harvest year, data were analyzed through an analysis of variance, a principal component analysis, and a linear discriminant analysis (LDA). These differences identified classification parameters, providing an important tool for authenticity purposes. The best results were achieved with TAG analysis coupled with LDA, which proved its effectiveness to discriminate almond cultivars.

  8. Self organising maps for visualising and modelling

    PubMed Central

    2012-01-01

    The paper describes the motivation for SOMs (Self Organising Maps) and how they have become more accessible due to the wide availability of modern, more powerful, cost-effective computers. Their advantages compared with Principal Components Analysis and Partial Least Squares are discussed: they can be applied to non-linear data, are less dependent on least-squares solutions and the normality of errors, and are less influenced by outliers. In addition, there is a wide variety of intuitive methods for visualisation that allow full use of the map space. Modern problems in analytical chemistry, including applications to cultural heritage, environmental, metabolomic and biological studies, result in complex datasets. Methods for visualising maps are described, including best matching units, hit histograms, unified distance matrices and component planes. Supervised SOMs for classification, including multifactor data and variable selection, are discussed, as is their use in Quality Control. The paper is illustrated using four case studies, namely the near infrared spectroscopy of food, the thermal analysis of polymers, the metabolomic analysis of saliva using NMR, and on-line HPLC for pharmaceutical process monitoring. PMID:22594434
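
    For readers who want to try the map-based visualisations mentioned above, the sketch below uses the third-party MiniSom package (one common Python SOM implementation, not the toolkit used in the paper) to train a map on a scaled data matrix and build a hit histogram. The `data` array, grid size and iteration count are illustrative assumptions.

```python
# Sketch: train a SOM and count how many samples map to each unit (hit histogram).
import numpy as np
from minisom import MiniSom

def train_som(data, grid=(10, 10), iters=5000):
    """data: hypothetical (n_samples, n_variables) matrix, scaled beforehand."""
    som = MiniSom(grid[0], grid[1], data.shape[1], sigma=1.0, learning_rate=0.5)
    som.random_weights_init(data)
    som.train_random(data, iters)
    hits = np.zeros(grid)                      # hit histogram: samples per map unit
    for sample in data:
        hits[som.winner(sample)] += 1
    return som, hits
```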

  9. Extension of the quasistatic far-wing line shape theory to multicomponent anisotropic potentials

    NASA Technical Reports Server (NTRS)

    Ma, Q.; Tipping, R. H.

    1994-01-01

    The formalism developed previously for the calculation of the far-wing line shape function and the corresponding absorption coefficient using a single-component anisotropic interaction term and the binary collision and quasistatic approximations is generalized to multicomponent anisotropic potential functions. Explicit expressions are presented for several common cases, including the long-range dipole-dipole plus dipole-quadrupole interaction and a linear molecule interacting with a perturber atom. After determining the multicomponent functional representation for the interaction between CO2 and Ar from previously published data, we calculate the theoretical line shape function and the corresponding absorption due to the ν3 band of CO2 in the frequency range 2400-2580 cm-1 and compare our results with previous calculations carried out using a single-component anisotropic interaction, and with the results obtained assuming Lorentzian line shapes. The principal uncertainties in the present results, possible refinements of the theoretical formalism, and the applicability to other systems are discussed briefly.

  10. Counties eliminating racial disparities in colorectal cancer mortality.

    PubMed

    Rust, George; Zhang, Shun; Yu, Zhongyuan; Caplan, Lee; Jain, Sanjay; Ayer, Turgay; McRoy, Luceta; Levine, Robert S

    2016-06-01

    Although colorectal cancer (CRC) mortality rates are declining, racial-ethnic disparities in CRC mortality nationally are widening. Herein, the authors attempted to identify county-level variations in this pattern, and to characterize counties with improving disparity trends. The authors examined 20-year trends in US county-level black-white disparities in CRC age-adjusted mortality rates during the study period between 1989 and 2010. Using a mixed linear model, counties were grouped into mutually exclusive patterns of black-white racial disparity trends in age-adjusted CRC mortality across 20 three-year rolling average data points. County-level characteristics from census data and from the Area Health Resources File were normalized and entered into a principal component analysis. Multinomial logistic regression models were used to test the relation between these factors (clusters of related contextual variables) and the disparity trend pattern group for each county. Counties were grouped into 4 disparity trend pattern groups: 1) persistent disparity (parallel black and white trend lines); 2) diverging (widening disparity); 3) sustained equality; and 4) converging (moving from disparate outcomes toward equality). The initial principal component analysis clustered the 82 independent variables into a smaller number of components, 6 of which explained 47% of the county-level variation in disparity trend patterns. County-level variation in social determinants, health care workforce, and health systems all were found to contribute to variations in cancer mortality disparity trend patterns from 1990 through 2010. Counties sustaining equality over time or moving from disparities to equality in cancer mortality suggest that disparities are not inevitable, and provide hope that more communities can achieve optimal and equitable cancer outcomes for all. Cancer 2016;122:1735-48. © 2016 American Cancer Society.

  11. Performance-based measures associate with frailty in patients with end-stage liver disease

    PubMed Central

    Lai, Jennifer C.; Volk, Michael L; Strasburg, Debra; Alexander, Neil

    2016-01-01

    Background: Physical frailty, as measured by the Fried Frailty Index, is increasingly recognized as a critical determinant of outcomes in cirrhotics. However, its utility is limited by the inclusion of self-reported components. We aimed to identify performance-based measures associated with frailty in patients with cirrhosis. Methods: Cirrhotics ≥50 years underwent: 6-minute walk test (6MWT, cardiopulmonary endurance), chair stands in 30 seconds (muscle endurance), isometric knee extension (lower extremity strength), unipedal stance time (static balance), and maximal step length (dynamic balance/coordination). Linear regression associated each physical performance test with frailty. Principal components exploratory factor analysis evaluated the inter-relatedness of frailty and the 5 physical performance tests. Results: Of forty cirrhotics, with a median age of 64 years and a median Model for End-stage Liver Disease (MELD) score of 12, 10 (25%) were frail by Fried Frailty Index ≥3. Frail cirrhotics had poorer performance in 6MWT distance (231 vs. 338 meters), 30-second chair stands (7 vs. 10), isometric knee extension (86 vs. 122 Newton meters), and maximal step length (22 vs. 27 inches) [p≤0.02 for each]. Each physical performance test was significantly associated with frailty (p<0.01), even after adjustment for MELD or hepatic encephalopathy. Principal component factor analysis demonstrated substantial, but unique, clustering of each physical performance test to a single factor, frailty. Conclusion: Frailty in cirrhosis is a multi-dimensional construct that is distinct from liver dysfunction and incorporates endurance, strength, and balance. Our data provide specific targets for prehabilitation interventions aimed at reducing frailty in cirrhotics in preparation for liver transplantation. PMID:27495749

  12. Performance-Based Measures Associate With Frailty in Patients With End-Stage Liver Disease.

    PubMed

    Lai, Jennifer C; Volk, Michael L; Strasburg, Debra; Alexander, Neil

    2016-12-01

    Physical frailty, as measured by the Fried Frailty Index, is increasingly recognized as a critical determinant of outcomes in patients with cirrhosis. However, its utility is limited by the inclusion of self-reported components. We aimed to identify performance-based measures associated with frailty in patients with cirrhosis. Patients with cirrhosis, aged 50 years or older, underwent: 6-minute walk test (cardiopulmonary endurance), chair stands in 30 seconds (muscle endurance), isometric knee extension (lower extremity strength), unipedal stance time (static balance), and maximal step length (dynamic balance/coordination). Linear regression associated each physical performance test with frailty. Principal components exploratory factor analysis evaluated the interrelatedness of frailty and the 5 physical performance tests. Of 40 patients with cirrhosis, with a median age of 64 years and a median Model for End-stage Liver Disease (MELD) score of 12, 10 (25%) were frail by Fried Frailty Index ≥3. Frail patients with cirrhosis had poorer performance in 6-minute walk test distance (231 vs 338 m), 30-second chair stands (7 vs 10), isometric knee extension (86 vs 122 Newton meters), and maximal step length (22 vs 27 in.) (P ≤ 0.02 for each). Each physical performance test was significantly associated with frailty (P < 0.01), even after adjustment for MELD or hepatic encephalopathy. Principal component factor analysis demonstrated substantial, but unique, clustering of each physical performance test to a single factor, frailty. Frailty in cirrhosis is a multidimensional construct that is distinct from liver dysfunction and incorporates endurance, strength, and balance. Our data provide specific targets for prehabilitation interventions aimed at reducing frailty in patients with cirrhosis in preparation for liver transplantation.

  13. Evaluation of the temporal variations of air quality in Taipei City, Taiwan, from 1994 to 2003.

    PubMed

    Chang, Shuenn-Chin; Lee, Chung-Te

    2008-03-01

    Data collected from the five air-quality monitoring stations established by the Taiwan Environmental Protection Administration in Taipei City from 1994 to 2003 are analyzed to assess the temporal variations of air quality. Principal component analysis (PCA) is adopted to convert the originally measured pollutants into fewer independent components through linear combinations while still retaining the majority of the variance of the original data set. Two principal components (PCs) are retained, together explaining 82.73% of the total variance. PC1, which represents primary pollutants such as CO, NOx, and SO2, shows an obvious decrease over the last 10 years. PC2, which represents secondary pollutants such as ozone, displays a yearly increase over the time period when the reduction of primary pollutants is obvious. In order to track the effect of the control measures put forth by the authorities, 47 days of high PM10 concentrations caused by transboundary transport were eliminated in analyzing the long-term trend of PM10 in Taipei City. The temporal variations over the past 10 years show that the moderate peak in O3 demonstrates a significant upward trend even though the local primary pollutants have been well under control. Monthly variations of the PC scores demonstrate that primary pollution is significant from January to April, while ozone increases from April to August. The yearly variations of the PC scores show that PM10 has gradually shifted from a strong correlation with PC1 during the early years to become more related to PC2 in recent years. This implies that after a reduction of primary pollutants, the proportion of secondary aerosols in PM10 may increase. Thus, reducing the precursor concentrations of secondary aerosols will be an effective way to lower PM10 concentrations.

  14. Identifying maternal and infant factors associated with newborn size in rural Bangladesh by partial least squares (PLS) regression analysis

    PubMed Central

    Rahman, Md. Jahanur; Shamim, Abu Ahmed; Klemm, Rolf D. W.; Labrique, Alain B.; Rashid, Mahbubur; Christian, Parul; West, Keith P.

    2017-01-01

    Birth weight, length and circumferences of the head, chest and arm are key measures of newborn size and health in developing countries. We assessed maternal socio-demographic factors associated with multiple measures of newborn size in a large rural population in Bangladesh using partial least squares (PLS) regression method. PLS regression, combining features from principal component analysis and multiple linear regression, is a multivariate technique with an ability to handle multicollinearity while simultaneously handling multiple dependent variables. We analyzed maternal and infant data from singletons (n = 14,506) born during a double-masked, cluster-randomized, placebo-controlled maternal vitamin A or β-carotene supplementation trial in rural northwest Bangladesh. PLS regression results identified numerous maternal factors (parity, age, early pregnancy MUAC, living standard index, years of education, number of antenatal care visits, preterm delivery and infant sex) significantly (p<0.001) associated with newborn size. Among them, preterm delivery had the largest negative influence on newborn size (Standardized β = -0.29 − -0.19; p<0.001). Scatter plots of the scores of first two PLS components also revealed an interaction between newborn sex and preterm delivery on birth size. PLS regression was found to be more parsimonious than both ordinary least squares regression and principal component regression. It also provided more stable estimates than the ordinary least squares regression and provided the effect measure of the covariates with greater accuracy as it accounts for the correlation among the covariates and outcomes. Therefore, PLS regression is recommended when either there are multiple outcome measurements in the same study, or the covariates are correlated, or both situations exist in a dataset. PMID:29261760

  15. Identifying maternal and infant factors associated with newborn size in rural Bangladesh by partial least squares (PLS) regression analysis.

    PubMed

    Kabir, Alamgir; Rahman, Md Jahanur; Shamim, Abu Ahmed; Klemm, Rolf D W; Labrique, Alain B; Rashid, Mahbubur; Christian, Parul; West, Keith P

    2017-01-01

    Birth weight, length and circumferences of the head, chest and arm are key measures of newborn size and health in developing countries. We assessed maternal socio-demographic factors associated with multiple measures of newborn size in a large rural population in Bangladesh using partial least squares (PLS) regression method. PLS regression, combining features from principal component analysis and multiple linear regression, is a multivariate technique with an ability to handle multicollinearity while simultaneously handling multiple dependent variables. We analyzed maternal and infant data from singletons (n = 14,506) born during a double-masked, cluster-randomized, placebo-controlled maternal vitamin A or β-carotene supplementation trial in rural northwest Bangladesh. PLS regression results identified numerous maternal factors (parity, age, early pregnancy MUAC, living standard index, years of education, number of antenatal care visits, preterm delivery and infant sex) significantly (p<0.001) associated with newborn size. Among them, preterm delivery had the largest negative influence on newborn size (Standardized β = -0.29 - -0.19; p<0.001). Scatter plots of the scores of first two PLS components also revealed an interaction between newborn sex and preterm delivery on birth size. PLS regression was found to be more parsimonious than both ordinary least squares regression and principal component regression. It also provided more stable estimates than the ordinary least squares regression and provided the effect measure of the covariates with greater accuracy as it accounts for the correlation among the covariates and outcomes. Therefore, PLS regression is recommended when either there are multiple outcome measurements in the same study, or the covariates are correlated, or both situations exist in a dataset.
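
    The sketch below shows, in the spirit of the analysis in the two records above, how a PLS model can relate a matrix of correlated covariates to several outcome measures simultaneously; it is not the authors' model or data. `X` (covariates) and `Y` (newborn size measures) are hypothetical arrays, and the number of components is illustrative.

```python
# Sketch: PLS regression with multiple correlated outcomes.
import numpy as np
from sklearn.cross_decomposition import PLSRegression

def fit_pls(X, Y, n_components=2):
    """X: hypothetical (n, p) covariate matrix; Y: hypothetical (n, q) outcome matrix."""
    pls = PLSRegression(n_components=n_components)
    pls.fit(X, Y)
    # scores on the first two PLS components, e.g. for score scatter plots
    t_scores = pls.x_scores_[:, :2]
    return pls, t_scores

# pls, scores = fit_pls(X, Y)
# predictions = pls.predict(X_new)   # predicted size measures for new covariate values
```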

  16. QSAR modeling of flotation collectors using principal components extracted from topological indices.

    PubMed

    Natarajan, R; Nirdosh, Inderjit; Basak, Subhash C; Mills, Denise R

    2002-01-01

    Several topological indices were calculated for substituted-cupferrons that were tested as collectors for the froth flotation of uranium. The principal component analysis (PCA) was used for data reduction. Seven principal components (PC) were found to account for 98.6% of the variance among the computed indices. The principal components thus extracted were used in stepwise regression analyses to construct regression models for the prediction of separation efficiencies (Es) of the collectors. A two-parameter model with a correlation coefficient of 0.889 and a three-parameter model with a correlation coefficient of 0.913 were formed. PCs were found to be better than partition coefficient to form regression equations, and inclusion of an electronic parameter such as Hammett sigma or quantum mechanically derived electronic charges on the chelating atoms did not improve the correlation coefficient significantly. The method was extended to model the separation efficiencies of mercaptobenzothiazoles (MBT) and aminothiophenols (ATP) used in the flotation of lead and zinc ores, respectively. Five principal components were found to explain 99% of the data variability in each series. A three-parameter equation with correlation coefficient of 0.985 and a two-parameter equation with correlation coefficient of 0.926 were obtained for MBT and ATP, respectively. The amenability of separation efficiencies of chelating collectors to QSAR modeling using PCs based on topological indices might lead to the selection of collectors for synthesis and testing from a virtual database.

  17. County community health associations of net voting shift in the 2016 U.S. presidential election

    PubMed Central

    Stewart, Charles; Bhambhani, Vijeta

    2017-01-01

    Importance: In the U.S. presidential election of 2016, substantial shift in voting patterns occurred relative to previous elections. Although this shift has been associated with both education and race, the extent to which this shift was related to public health status is unclear. Objective: To determine the extent to which county community health was associated with changes in voting between the presidential elections of 2016 and 2012. Design: Ecological study with principal component analysis (PCA) using principal axis method to extract the components, then generalized linear regression. Setting: General community. Participants: All counties in the United States. Exposures: Physically unhealthy days, mentally unhealthy days, percent food insecure, teen birth rate, primary care physician visit rate, age-adjusted mortality rate, violent crime rate, average health care costs, percent diabetic, and percent overweight or obese. Main outcome: The percentage of Donald Trump votes in 2016 minus percentage of Mitt Romney votes in 2012 (“net voting shift”). Results: Complete public health data was available for 3,009 counties which were included in the analysis. The mean net voting shift was 5.4% (+/- 5.8%). Of these 3,009 counties, 2,641 (87.8%) had positive net voting shift (shifted towards Trump) and 368 counties (12.2%) had negative net voting shift (shifted away from Trump). The first principal component (“unhealthy score”) accounted for 68% of the total variance in the data. The unhealthy score included all health variables except primary care physician rate, violent crime rate, and health care costs. The mean unhealthy score for counties was 0.39 (SD 0.16). Higher normalized unhealthy score was associated with positive net voting shift (22.1% shift per unit unhealthy, p < 0.0001). This association was stronger in states that switched Electoral College votes from 2012 to 2016 than in other states (5.9% per unit unhealthy, p <0.0001). Conclusions and relevance: Substantial association exists between a shift toward voting for Donald Trump in 2016 relative to Mitt Romney in 2012 and measures of poor public health. Although these results do not demonstrate causality, these results suggest a possible role for health status in political choices. PMID:28968415

  18. An Empirical Cumulus Parameterization Scheme for a Global Spectral Model

    NASA Technical Reports Server (NTRS)

    Rajendran, K.; Krishnamurti, T. N.; Misra, V.; Tao, W.-K.

    2004-01-01

    Realistic vertical heating and drying profiles in a cumulus scheme are important for obtaining accurate weather forecasts. A new empirical cumulus parameterization scheme based on a procedure to improve the vertical distribution of heating and moistening over the tropics is developed. The empirical cumulus parameterization scheme (ECPS) utilizes profiles of Tropical Rainfall Measuring Mission (TRMM) based heating and moistening derived from the European Centre for Medium-Range Weather Forecasts (ECMWF) analysis. A dimension reduction technique through rotated principal component analysis (RPCA) is performed on the vertical profiles of heating (Q1) and drying (Q2) over the convective regions of the tropics, to obtain the dominant modes of variability. Analysis suggests that most of the variance associated with the observed profiles can be explained by retaining the first three modes. The ECPS then applies a statistical approach in which Q1 and Q2 are expressed as a linear combination of the first three dominant principal components, which distinctly explain variance in the troposphere as a function of the prevalent large-scale dynamics. The principal component (PC) score, which quantifies the contribution of each PC to the corresponding loading profile, is estimated through a multiple screening regression method that yields the PC score as a function of the large-scale variables. The profiles of Q1 and Q2 thus obtained are found to match well with the observed profiles. The impact of the ECPS is investigated in a series of short-range (1-3 day) prediction experiments using the Florida State University global spectral model (FSUGSM, T126L14). Comparisons between short-range ECPS forecasts and those with the modified Kuo scheme show a very marked improvement in skill for the ECPS forecasts. This improvement in forecast skill with the ECPS emphasizes the importance of incorporating realistic vertical distributions of heating and drying in the model cumulus scheme. This also suggests that, in the absence of explicit models for convection, the proposed statistical scheme improves the modeling of the vertical distribution of heating and moistening in areas of deep convection.
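
    The rotated-PCA step described above can be sketched as ordinary PCA of the vertical profiles followed by a varimax rotation of the retained loadings. The rotation routine below is a standard varimax implementation under assumed inputs (`profiles` is a hypothetical matrix of Q1 or Q2 profiles), not the authors' code or data.

```python
# Sketch: keep three PCA modes of vertical profiles and varimax-rotate their loadings.
import numpy as np
from sklearn.decomposition import PCA

def varimax(loadings, gamma=1.0, max_iter=100, tol=1e-6):
    """Standard varimax rotation of a (n_levels, k) loading matrix."""
    p, k = loadings.shape
    rot = np.eye(k)
    var_old = 0.0
    for _ in range(max_iter):
        lam = loadings @ rot
        u, s, vt = np.linalg.svd(
            loadings.T @ (lam ** 3 - (gamma / p) * lam @ np.diag(np.sum(lam ** 2, axis=0)))
        )
        rot = u @ vt
        if s.sum() < var_old * (1 + tol):
            break
        var_old = s.sum()
    return loadings @ rot

pca = PCA(n_components=3)
# scores = pca.fit_transform(profiles)              # profiles: hypothetical (n_samples, n_levels)
# rotated_loadings = varimax(pca.components_.T)     # (n_levels, 3) rotated vertical modes
```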

  19. [Vis-NIR spectroscopic pattern recognition combined with SG smoothing applied to breed screening of transgenic sugarcane].

    PubMed

    Liu, Gui-Song; Guo, Hao-Song; Pan, Tao; Wang, Ji-Hua; Cao, Gan

    2014-10-01

    Based on Savitzky-Golay (SG) smoothing screening, principal component analysis (PCA) combined separately with supervised linear discriminant analysis (LDA) and unsupervised hierarchical clustering analysis (HCA) was used for non-destructive visible and near-infrared (Vis-NIR) detection for breed screening of transgenic sugarcane. A random and stability-dependent framework of calibration, prediction, and validation was proposed. A total of 456 samples of sugarcane leaves at the elongating stage were collected from the field, composed of 306 transgenic (positive) samples containing the Bt and Bar genes and 150 non-transgenic (negative) samples. A total of 156 samples (50 negative and 106 positive) were randomly selected as the validation set; the remaining samples (100 negative and 200 positive, 300 samples in total) were used as the modeling set, and the modeling set was then subdivided into calibration (50 negative and 100 positive, 150 samples in total) and prediction sets (50 negative and 100 positive, 150 samples in total) 50 times. The number of SG smoothing points was expanded, while some higher-derivative modes were removed because of their small absolute values, and a total of 264 smoothing modes were used for screening. The pairwise combinations of the first three principal components were used, and the optimal combination of principal components was then selected according to model performance. Based on all divisions of the calibration and prediction sets and all SG smoothing modes, the SG-PCA-LDA and SG-PCA-HCA models were established, and the model parameters were optimized based on the average prediction performance over all divisions to ensure modeling stability. Finally, model validation was performed with the validation set. With SG smoothing, the modeling accuracy and stability of PCA-LDA and PCA-HCA were significantly improved. For the optimal SG-PCA-LDA model, the recognition rates for positive and negative validation samples were 94.3% and 96.0%, respectively; for the optimal SG-PCA-HCA model, they were 92.5% and 98.0%. Vis-NIR spectroscopic pattern recognition combined with SG smoothing could be used for the accurate recognition of transgenic sugarcane leaves, and provides a convenient screening method for transgenic sugarcane breeding.
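
    The SG-PCA-LDA chain described above can be sketched as follows: Savitzky-Golay smoothing of each spectrum, PCA for dimension reduction, then a supervised LDA classifier. The window length, polynomial order and number of PCs are illustrative, not the optimized settings from the paper, and `spectra`/`labels` are hypothetical placeholders.

```python
# Sketch of an SG-PCA-LDA pipeline for Vis-NIR spectra.
import numpy as np
from scipy.signal import savgol_filter
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def sg_pca_lda(spectra, labels, window=11, polyorder=2, n_pcs=3):
    """spectra: hypothetical (n_samples, n_wavelengths) array; labels: class of each sample."""
    smoothed = savgol_filter(spectra, window_length=window, polyorder=polyorder, axis=1)
    scores = PCA(n_components=n_pcs).fit_transform(smoothed)
    clf = LinearDiscriminantAnalysis().fit(scores, labels)
    return clf.score(scores, labels)   # training recognition rate; use held-out data in practice

# rate = sg_pca_lda(spectra, labels)
```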

  20. County community health associations of net voting shift in the 2016 U.S. presidential election.

    PubMed

    Wasfy, Jason H; Stewart, Charles; Bhambhani, Vijeta

    2017-01-01

    In the U.S. presidential election of 2016, a substantial shift in voting patterns occurred relative to previous elections. Although this shift has been associated with both education and race, the extent to which it was related to public health status is unclear. To determine the extent to which county community health was associated with changes in voting between the presidential elections of 2016 and 2012. Ecological study with principal component analysis (PCA) using the principal axis method to extract the components, followed by generalized linear regression. General community. All counties in the United States. Physically unhealthy days, mentally unhealthy days, percent food insecure, teen birth rate, primary care physician visit rate, age-adjusted mortality rate, violent crime rate, average health care costs, percent diabetic, and percent overweight or obese. The percentage of Donald Trump votes in 2016 minus the percentage of Mitt Romney votes in 2012 ("net voting shift"). Complete public health data were available for 3,009 counties, which were included in the analysis. The mean net voting shift was 5.4% (+/- 5.8%). Of these 3,009 counties, 2,641 (87.8%) had a positive net voting shift (shifted towards Trump) and 368 counties (12.2%) had a negative net voting shift (shifted away from Trump). The first principal component ("unhealthy score") accounted for 68% of the total variance in the data. The unhealthy score included all health variables except primary care physician rate, violent crime rate, and health care costs. The mean unhealthy score for counties was 0.39 (SD 0.16). A higher normalized unhealthy score was associated with a positive net voting shift (22.1% shift per unit unhealthy, p < 0.0001). This association was stronger in states that switched Electoral College votes from 2012 to 2016 than in other states (5.9% per unit unhealthy, p < 0.0001). A substantial association exists between a shift toward voting for Donald Trump in 2016 relative to Mitt Romney in 2012 and measures of poor public health. Although these results do not demonstrate causality, they suggest a possible role for health status in political choices.
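
    A rough sketch of the analysis pattern described above: the first principal component of standardized health indicators serves as a composite "unhealthy score", which is then regressed against net voting shift. Ordinary least squares stands in for the paper's generalized linear regression, plain PCA for its principal axis extraction, and all data are random placeholders.

        import numpy as np
        from sklearn.preprocessing import StandardScaler, MinMaxScaler
        from sklearn.decomposition import PCA
        from sklearn.linear_model import LinearRegression

        rng = np.random.default_rng(2)
        health = rng.normal(size=(3009, 10))                 # hypothetical county health indicators
        shift = rng.normal(loc=5.4, scale=5.8, size=3009)    # hypothetical net voting shift (%)

        z = StandardScaler().fit_transform(health)
        score = PCA(n_components=1).fit_transform(z)         # first PC as the composite score
        score = MinMaxScaler().fit_transform(score)          # normalize to a 0-1 "unhealthy score"

        model = LinearRegression().fit(score, shift)
        print("shift per unit unhealthy score (%):", model.coef_[0])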

  1. A PCA-Based method for determining craniofacial relationship and sexual dimorphism of facial shapes.

    PubMed

    Shui, Wuyang; Zhou, Mingquan; Maddock, Steve; He, Taiping; Wang, Xingce; Deng, Qingqiong

    2017-11-01

    Previous studies have used principal component analysis (PCA) to investigate the craniofacial relationship, as well as sex determination using facial factors. However, few studies have investigated the extent to which the choice of principal components (PCs) affects the analysis of craniofacial relationship and sexual dimorphism. In this paper, we propose a PCA-based method for visual and quantitative analysis, using 140 samples of 3D heads (70 male and 70 female), produced from computed tomography (CT) images. There are two parts to the method. First, skull and facial landmarks are manually marked to guide the model's registration so that dense corresponding vertices occupy the same relative position in every sample. Statistical shape spaces of the skull and face in dense corresponding vertices are constructed using PCA. Variations in these vertices, captured in every principal component (PC), are visualized to observe shape variability. The correlations of skull- and face-based PC scores are analysed, and linear regression is used to fit the craniofacial relationship. We compute the PC coefficients of a face based on this craniofacial relationship and the PC scores of a skull, and apply the coefficients to estimate a 3D face for the skull. To evaluate the accuracy of the computed craniofacial relationship, the mean and standard deviation of every vertex between the two models are computed, where these models are reconstructed using real PC scores and coefficients. Second, each PC in facial space is analysed for sex determination, for which support vector machines (SVMs) are used. We examined the correlation between PCs and sex, and explored the extent to which the choice of PCs affects the expression of sexual dimorphism. Our results suggest that skull- and face-based PCs can be used to describe the craniofacial relationship and that the accuracy of the method can be improved by using an increased number of face-based PCs. The results show that the accuracy of the sex classification is related to the choice of PCs. The highest sex classification rate is 91.43% using our method. Copyright © 2017 Elsevier Ltd. All rights reserved.
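
    A minimal sketch of the PCA-plus-linear-regression core of this kind of method: separate shape spaces for skulls and faces, a linear map from skull PC scores to face PC scores, and reconstruction of a face for a new skull with a per-vertex error summary. The flattened mesh arrays, the number of retained PCs, and the error measure are stand-ins, not the authors' settings.

        import numpy as np
        from sklearn.decomposition import PCA
        from sklearn.linear_model import LinearRegression

        rng = np.random.default_rng(3)
        skulls = rng.normal(size=(140, 3000))    # hypothetical registered skull meshes, flattened (x, y, z per vertex)
        faces = rng.normal(size=(140, 3000))     # hypothetical registered face meshes for the same subjects

        k = 20                                   # number of PCs retained per shape space (assumption)
        pca_skull = PCA(n_components=k).fit(skulls)
        pca_face = PCA(n_components=k).fit(faces)

        S = pca_skull.transform(skulls)          # skull PC scores
        F = pca_face.transform(faces)            # face PC scores

        # Linear craniofacial relationship: face scores as a linear function of skull scores.
        mapping = LinearRegression().fit(S, F)

        # Estimate a 3D face for a skull and measure the per-vertex error against the real face.
        new_skull = skulls[:1]
        face_hat = pca_face.inverse_transform(mapping.predict(pca_skull.transform(new_skull)))
        err = np.linalg.norm((face_hat - faces[:1]).reshape(-1, 3), axis=1)
        print("mean vertex error:", err.mean())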

  2. Determination and importance of temperature dependence of retention coefficient (RPHPLC) in QSAR model of nitrazepams' partition coefficient in bile acid micelles.

    PubMed

    Posa, Mihalj; Pilipović, Ana; Lalić, Mladena; Popović, Jovan

    2011-02-15

    A linear dependence between temperature (t) and the retention coefficient (k, reversed-phase HPLC) of bile acids is obtained. The parameters (a, intercept and b, slope) of the linear function k=f(t) correlate highly with the bile acids' structures. The investigated bile acids form linear congeneric groups on a principal component (calculated from k=f(t)) score plot that are in accordance with the conformations of the hydroxyl and oxo groups in the bile acid steroid skeleton. The partition coefficient (K(p)) of nitrazepam in bile acid micelles is investigated. Nitrazepam molecules incorporated in micelles show modified bioavailability (depot effect, higher permeability, etc.). Using the multiple linear regression method, QSAR models of nitrazepam's partition coefficient K(p) are derived at temperatures of 25°C and 37°C. The linear regression models at both temperatures include the experimentally obtained lipophilicity parameter (PC1 from the k=f(t) data) and in silico descriptors of molecular shape, while at the higher temperature molecular polarisation is also introduced. This indicates that the incorporation mechanism of nitrazepam in bile acid micelles changes at higher temperatures. QSAR models are also derived using the partial least squares method. The experimental k=f(t) parameters are shown to be significant predictive variables. Both QSAR models are validated using cross-validation and internal validation. The PLS models have slightly higher predictive capability than the MLR models. Copyright © 2010 Elsevier B.V. All rights reserved.

  3. Pattern Analysis of Dynamic Susceptibility Contrast-enhanced MR Imaging Demonstrates Peritumoral Tissue Heterogeneity

    PubMed Central

    Akbari, Hamed; Macyszyn, Luke; Da, Xiao; Wolf, Ronald L.; Bilello, Michel; Verma, Ragini; O’Rourke, Donald M.

    2014-01-01

    Purpose To augment the analysis of dynamic susceptibility contrast material–enhanced magnetic resonance (MR) images to uncover unique tissue characteristics that could potentially facilitate treatment planning through a better understanding of the peritumoral region in patients with glioblastoma. Materials and Methods Institutional review board approval was obtained for this study, with waiver of informed consent for retrospective review of medical records. Dynamic susceptibility contrast-enhanced MR imaging data were obtained for 79 patients, and principal component analysis was applied to the perfusion signal intensity. The first six principal components were sufficient to characterize more than 99% of variance in the temporal dynamics of blood perfusion in all regions of interest. The principal components were subsequently used in conjunction with a support vector machine classifier to create a map of heterogeneity within the peritumoral region, and the variance of this map served as the heterogeneity score. Results The calculated principal components allowed near-perfect separability of tissue that was likely highly infiltrated with tumor and tissue that was unlikely infiltrated with tumor. The heterogeneity map created by using the principal components showed a clear relationship between voxels judged by the support vector machine to be highly infiltrated and subsequent recurrence. The results demonstrated a significant correlation (r = 0.46, P < .0001) between the heterogeneity score and patient survival. The hazard ratio was 2.23 (95% confidence interval: 1.4, 3.6; P < .01) between patients with high and low heterogeneity scores on the basis of the median heterogeneity score. Conclusion Analysis of dynamic susceptibility contrast-enhanced MR imaging data by using principal component analysis can help identify imaging variables that can be subsequently used to evaluate the peritumoral region in glioblastoma. These variables are potentially indicative of tumor infiltration and may become useful tools in guiding therapy, as well as individualized prognostication. © RSNA, 2014 PMID:24955928
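
    A small sketch of the pattern described above under stated assumptions: PCA on perfusion time curves, a linear SVM trained on the resulting component scores, and a variance-based heterogeneity score over a peritumoral map. The six-component choice mirrors the study, but the probability-based map and all arrays are illustrative.

        import numpy as np
        from sklearn.decomposition import PCA
        from sklearn.svm import SVC

        rng = np.random.default_rng(4)
        curves = rng.normal(size=(5000, 45))       # hypothetical perfusion time curves (voxels x timepoints)
        labels = rng.integers(0, 2, size=5000)     # 1 = likely infiltrated, 0 = unlikely (from reference ROIs)

        pca = PCA(n_components=6).fit(curves)      # six PCs, mirroring the >99% variance choice in the study
        clf = SVC(kernel="linear", probability=True).fit(pca.transform(curves), labels)

        # Score peritumoral voxels and summarize the map's spatial heterogeneity as its variance.
        peritumoral = rng.normal(size=(800, 45))
        infiltration_map = clf.predict_proba(pca.transform(peritumoral))[:, 1]
        print("heterogeneity score:", infiltration_map.var())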

  4. Virtual directions in paleomagnetism: A global and rapid approach to evaluate the NRM components.

    NASA Astrophysics Data System (ADS)

    Ramón, Maria J.; Pueyo, Emilio L.; Oliva-Urcia, Belén; Larrasoaña, Juan C.

    2017-02-01

    We introduce a method and software to process demagnetization data for a rapid and integrative estimation of characteristic remanent magnetization (ChRM) components. The virtual directions (VIDI) of a paleomagnetic site are “all” possible directions that can be calculated from a given demagnetization routine of “n” steps (where m is the number of specimens at the site). If the ChRM can be defined for a site, it will be represented in the VIDI set. Directions can be calculated for successive steps using principal component analysis, both anchored to the origin (resultant virtual directions, RVD; m·(n²+n)/2) and not anchored (difference virtual directions, DVD; m·(n²-n)/2). The number of directions per specimen (of order n²) is very large and will enhance all ChRM components, with noisy regions where two components were fitted together (mixing their unblocking intervals). In the same way, resultant and difference virtual circles (RVC, DVC) are calculated. Virtual directions and circles are a global and objective approach to unravel the different natural remanent magnetization (NRM) components of a paleomagnetic site without any assumption. To better constrain the stable components, some filters can be applied, such as establishing an upper boundary on the MAD, removing samples with anomalous intensities, or requiring a minimum number of demagnetization steps (objective filters), or selecting a given unblocking interval (subjective, but based on expertise). On the other hand, the VPD program also allows the application of standard approaches (classic PCA fitting of directions and circles) and other ancillary methods (stacking routine, linearity spectrum analysis), giving an objective, global and robust idea of the demagnetization structure with minimal assumptions. Application of the VIDI method to natural cases (outcrops in the Pyrenees and u-channel data from a Roman dam infill in northern Spain) and their comparison to other approaches (classic end-point, demagnetization circle analysis, stacking routine and linearity spectrum analysis) allows validation of this technique. VIDI is a global approach and is especially useful for large data sets and rapid estimation of the NRM components.

  5. Comparative study of SVM methods combined with voxel selection for object category classification on fMRI data.

    PubMed

    Song, Sutao; Zhan, Zhichao; Long, Zhiying; Zhang, Jiacai; Yao, Li

    2011-02-16

    Support vector machine (SVM) has been widely used as an accurate and reliable method to decipher brain patterns from functional MRI (fMRI) data. Previous studies have not found a clear benefit for non-linear (polynomial kernel) SVM versus linear SVM. Here, a more effective non-linear SVM using a radial basis function (RBF) kernel is compared with linear SVM. Unlike traditional studies, which focused either merely on the evaluation of different types of SVM or on voxel selection methods, we aimed to investigate the overall performance of linear and RBF SVM for fMRI classification together with voxel selection schemes, in terms of classification accuracy and computation time. Six different voxel selection methods were employed to decide which voxels of the fMRI data would be included in SVM classifiers with linear and RBF kernels for classifying 4-category objects. The overall performances of the voxel selection and classification methods were then compared. Results showed that: (1) voxel selection had an important impact on the classification accuracy of the classifiers: in a relatively low-dimensional feature space, RBF SVM outperformed linear SVM significantly; in a relatively high-dimensional space, linear SVM performed better than its counterpart; (2) considering classification accuracy and computation time holistically, linear SVM with relatively more voxels as features and RBF SVM with a small set of voxels (after PCA) achieved better accuracy in less time. The present work provides the first empirical comparison of linear and RBF SVM for classification of fMRI data combined with voxel selection methods. Based on the findings, if only classification accuracy is of concern, RBF SVM with an appropriately small set of voxels and linear SVM with relatively more voxels are two suggested solutions; if computational time is the greater concern, RBF SVM with a relatively small set of voxels, keeping part of the principal components as features, is the better choice.
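
    A brief sketch comparing the two configurations highlighted above: a linear SVM on a relatively large, F-test-selected voxel set versus an RBF SVM on a small number of principal components. The voxel counts, component count, and random data are placeholders, not the study's settings.

        import numpy as np
        from sklearn.feature_selection import SelectKBest, f_classif
        from sklearn.decomposition import PCA
        from sklearn.svm import SVC
        from sklearn.pipeline import make_pipeline
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(5)
        X = rng.normal(size=(200, 2000))        # hypothetical fMRI activation patterns (trials x voxels)
        y = rng.integers(0, 4, size=200)        # one of 4 object categories

        # Linear SVM with a relatively large voxel set selected by a univariate F-test.
        linear_clf = make_pipeline(SelectKBest(f_classif, k=1000), SVC(kernel="linear"))

        # RBF SVM on a small feature set obtained by keeping a few principal components.
        rbf_clf = make_pipeline(SelectKBest(f_classif, k=500), PCA(n_components=20), SVC(kernel="rbf"))

        for name, clf in [("linear SVM", linear_clf), ("RBF SVM + PCA", rbf_clf)]:
            acc = cross_val_score(clf, X, y, cv=5).mean()
            print(f"{name}: {acc:.2f}")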

  6. Cultivating an Environment that Contributes to Teaching and Learning in Schools: High School Principals' Actions

    ERIC Educational Resources Information Center

    Lin, Mind-Dih

    2012-01-01

    Improving principal leadership is a vital component to the success of educational reform initiatives that seek to improve whole-school performance, as principal leadership often exercises positive but indirect effects on student learning. Because of the importance of principals within the field of school improvement, this article focuses on…

  7. Measuring Principals' Effectiveness: Results from New Jersey's First Year of Statewide Principal Evaluation. REL 2016-156

    ERIC Educational Resources Information Center

    Herrmann, Mariesa; Ross, Christine

    2016-01-01

    States and districts across the country are implementing new principal evaluation systems that include measures of the quality of principals' school leadership practices and measures of student achievement growth. Because these evaluation systems will be used for high-stakes decisions, it is important that the component measures of the evaluation…

  8. The Views of Novice and Late Career Principals Concerning Instructional and Organizational Leadership within Their Evaluation

    ERIC Educational Resources Information Center

    Hvidston, David J.; Range, Bret G.; McKim, Courtney Ann; Mette, Ian M.

    2015-01-01

    This study examined the perspectives of novice and late career principals concerning instructional and organizational leadership within their performance evaluations. An online survey was sent to 251 principals with a return rate of 49%. Instructional leadership components of the evaluation that were most important to all principals were:…

  9. Figures of merit for present and future dark energy probes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mortonson, Michael J.; Huterer, Dragan; Hu, Wayne

    2010-09-15

    We compare current and forecasted constraints on dynamical dark energy models from Type Ia supernovae and the cosmic microwave background using figures of merit based on the volume of the allowed dark energy parameter space. For a two-parameter dark energy equation of state that varies linearly with the scale factor, and assuming a flat universe, the area of the error ellipse can be reduced by a factor of ~10 relative to current constraints by future space-based supernova data and CMB measurements from the Planck satellite. If the dark energy equation of state is described by a more general basis of principal components, the expected improvement in volume-based figures of merit is much greater. While the forecasted precision for any single parameter is only a factor of 2-5 smaller than current uncertainties, the constraints on dark energy models bounded by -1 ≤ w ≤ 1 improve for approximately 6 independent dark energy parameters, resulting in a reduction of the total allowed volume of principal component parameter space by a factor of ~100. Typical quintessence models can be adequately described by just 2-3 of these parameters even given the precision of future data, leading to a more modest but still significant improvement. In addition to advances in supernova and CMB data, percent-level measurement of absolute distance and/or the expansion rate is required to ensure that dark energy constraints remain robust to variations in spatial curvature.

  10. Quarry identification of historical building materials by means of laser induced breakdown spectroscopy, X-ray fluorescence and chemometric analysis

    NASA Astrophysics Data System (ADS)

    Colao, F.; Fantoni, R.; Ortiz, P.; Vazquez, M. A.; Martin, J. M.; Ortiz, R.; Idris, N.

    2010-08-01

    To characterize historical building materials according to the geographic origin of the quarries from which they were mined, the relative contents of major and trace elements were determined by means of Laser Induced Breakdown Spectroscopy (LIBS) and X-ray Fluorescence (XRF) techniques. 48 different specimens were studied, and the entire sample set was divided into two groups: the first, used as the reference set, was composed of samples mined from eight different quarries located in Seville province; the second group was composed of specimens of unknown provenance collected in several historical buildings and churches in the city of Seville. Data reduction and analysis of the laser induced breakdown spectroscopy and X-ray fluorescence measurements were performed using a multivariate statistical approach, namely Linear Discriminant Analysis (LDA), Principal Component Analysis (PCA) and Soft Independent Modeling of Class Analogy (SIMCA). A clear separation among reference materials mined from different quarries was observed in Principal Component (PC) score plots; a supervised soft independent modeling of class analogy classification was then trained and run, aiming to assess the provenance of unknown samples according to their elemental content. The obtained results were compared with the provenance assignments made on the basis of petrographical description. This work gives experimental evidence that laser induced breakdown spectroscopy measurements on a relatively small set of elements are a fast and effective method for the purpose of origin identification.

  11. Application of Electronic Nose for Measuring Total Volatile Basic Nitrogen and Total Viable Counts in Packaged Pork During Refrigerated Storage.

    PubMed

    Li, Miaoyun; Wang, Haibiao; Sun, Lingxia; Zhao, Gaiming; Huang, Xianqing

    2016-04-01

    The objective of this study was to predict the total viable counts (TVC) and total volatile basic nitrogen (TVB-N) in pork using an electronic nose (E-nose), and to assess the freshness of chilled pork during storage using different packaging methods, including pallet packaging (PP), vacuum packaging (VP), and modified atmosphere packaging (MAP, 40% O2/40% CO2/20% N2). Principal component analysis (PCA) was used to analyze the E-nose signals, and the results showed that the relationships between the freshness of chilled pork and the E-nose signals could be distinguished in the loadings plots, and that samples of different freshness could be separated along the first 2 principal components. Multiple linear regression (MLR) was used to correlate TVC and TVB-N with the E-nose signals. High F and R2 values were obtained in the MLR output for TVB-N (F = 32.1, 21.6, and 24.2 for PP [R2 = 0.93], VP [R2 = 0.94], and MAP [R2 = 0.95], respectively) and TVC (F = 34.2, 46.4, and 7.8 for PP [R2 = 0.98], VP [R2 = 0.89], and MAP [R2 = 0.85], respectively). The results of this study suggest that it is possible to use E-nose technology to predict TVB-N and TVC for assessing the freshness of chilled pork during storage. © 2016 Institute of Food Technologists®
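
    A minimal sketch of the two steps named above, with random stand-ins for the sensor data: PCA of the E-nose signals for a two-component score plot, and a multiple linear regression of TVB-N on the raw signals. The sensor count and TVB-N values are hypothetical.

        import numpy as np
        from sklearn.decomposition import PCA
        from sklearn.linear_model import LinearRegression
        from sklearn.metrics import r2_score

        rng = np.random.default_rng(6)
        signals = rng.normal(size=(60, 10))                  # hypothetical E-nose sensor responses
        tvbn = rng.normal(loc=15.0, scale=5.0, size=60)      # hypothetical TVB-N values (mg/100 g)

        pcs = PCA(n_components=2).fit_transform(signals)     # first two PCs for a freshness score plot
        mlr = LinearRegression().fit(signals, tvbn)          # MLR of TVB-N on the sensor signals
        print("R2 of TVB-N model:", r2_score(tvbn, mlr.predict(signals)))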

  12. Determination of patellofemoral pain sub-groups and development of a method for predicting treatment outcome using running gait kinematics.

    PubMed

    Watari, Ricky; Kobsar, Dylan; Phinyomark, Angkoon; Osis, Sean; Ferber, Reed

    2016-10-01

    Not all patients with patellofemoral pain exhibit successful outcomes following exercise therapy. Thus, the ability to identify patellofemoral pain subgroups related to treatment response is important for the development of optimal therapeutic strategies to improve rehabilitation outcomes. The purpose of this study was to use baseline running gait kinematic and clinical outcome variables to retrospectively classify patellofemoral pain patients on treatment response. Forty-one individuals with patellofemoral pain who underwent a 6-week exercise intervention program were sub-grouped as treatment Responders (n=28) and Non-responders (n=13) based on self-reported measures of pain and function. Baseline three-dimensional running kinematics and self-reported measures underwent a linear discriminant analysis of the principal components of the variables to retrospectively classify participants based on treatment response. The significance of the discriminant function was verified with a Wilks' lambda test (α=0.05). The model selected 2 gait principal components and had a 78.1% classification accuracy. Overall, Non-responders exhibited greater ankle dorsiflexion, knee abduction and hip flexion during the swing phase and greater ankle inversion during the stance phase, compared to Responders. This is the first study to investigate an objective method that uses baseline kinematic and self-report outcome variables to classify patients on patellofemoral pain treatment outcome. This study represents a significant first step towards a method to help clinicians make evidence-informed decisions regarding optimal treatment strategies for patients with patellofemoral pain. Copyright © 2016 Elsevier Ltd. All rights reserved.

  13. Evaluating exposures to complex mixtures of chemicals during a new production process in the plastics industry.

    PubMed

    Meijster, Tim; Burstyn, Igor; Van Wendel De Joode, Berna; Posthumus, Maarten A; Kromhout, Hans

    2004-08-01

    The goal of this study was to monitor emission of chemicals at a factory where plastics products were fabricated by a new robotic (impregnated tape winding) production process. Stationary and personal air measurements were taken to determine which chemicals were released and at what concentrations. Principal component analyses (PCA) and linear regression were used to determine the emission sources of different chemicals found in the air samples. We showed that complex mixtures of chemicals were released, but most concentrations were below Dutch exposure limits. Based on the results of the principal component analyses, the chemicals found were divided into three groups. The first group consisted of short chain aliphatic hydrocarbons (C2-C6). The second group included larger hydrocarbons (C9-C11) and some cyclic hydrocarbons. The third group contained all aromatic and two aliphatic hydrocarbons. Regression analyses showed that emission of the first group of chemicals was associated with cleaning activities and the use of epoxy resins. The second and third group showed strong association with the type of tape used in the new tape winding process. High levels of CO and HCN (above exposure limits) were measured on one occasion when a different brand of impregnated polypropylene sulphide tape was used in the tape winding process. Plans exist to drastically increase production with the new tape winding process. This will cause exposure levels to rise and therefore further control measures should be installed to reduce release of these chemicals.

  14. Tomato seeds maturity detection system based on chlorophyll fluorescence

    NASA Astrophysics Data System (ADS)

    Li, Cuiling; Wang, Xiu; Meng, Zhijun

    2016-10-01

    Chlorophyll fluorescence intensity can be used as an indicator of seed maturity and quality. The chlorophyll fluorescence intensity of seed coats is measured to judge the level of chlorophyll content in seeds, and thus to judge seed maturity and quality. This research developed a detection system for tomato seed maturity based on chlorophyll fluorescence spectrum technology; the system included an excitation light source unit, a fluorescence signal acquisition unit and a data processing unit. The excitation light source unit consisted of two high-power LEDs, two radiators and two constant-current power supplies, and was designed to excite chlorophyll fluorescence in tomato seeds. The fluorescence signal acquisition unit was made up of a fluorescence spectrometer, an optical fiber, an optical fiber holder and a narrowband filter. The data processing unit mainly comprised a computer. Tomato fruits at the green ripe, discoloration, firm ripe and full ripe stages were harvested, and their seeds were collected directly. The developed system was used to collect fluorescence spectra of tomato seeds of different maturities. The principal component analysis (PCA) method was utilized to reduce the dimension of the spectral data and extract principal components, and PCA was combined with linear discriminant analysis (LDA) to establish a discriminant model of tomato seed maturity; the discriminant accuracy was greater than 90%. The results show that using chlorophyll fluorescence spectrum technology is feasible for seed maturity detection, and the developed tomato seed maturity detection system has high detection accuracy.

  15. Spatial distribution and source apportionment of water pollution in different administrative zones of Wen-Rui-Tang (WRT) river watershed, China.

    PubMed

    Yang, Liping; Mei, Kun; Liu, Xingmei; Wu, Laosheng; Zhang, Minghua; Xu, Jianming; Wang, Fan

    2013-08-01

    Water quality degradation in river systems has caused great concern all over the world. Identifying the spatial distribution and sources of water pollutants is the very first step for efficient water quality management. A set of water samples collected bimonthly at 12 monitoring sites in 2009 and 2010 were analyzed to determine the spatial distribution of critical parameters and to apportion the sources of pollutants in the Wen-Rui-Tang (WRT) river watershed, near the East China Sea. The 12 monitoring sites were divided into three administrative zones of urban, suburban, and rural, considering differences in land use and population density. Multivariate statistical methods [one-way analysis of variance, principal component analysis (PCA), and absolute principal component score-multiple linear regression (APCS-MLR)] were used to investigate the spatial distribution of water quality and to apportion the pollution sources. Results showed that most water quality parameters had no significant difference between the urban and suburban zones, whereas these two zones showed worse water quality than the rural zone. Based on the PCA and APCS-MLR analysis, urban domestic sewage and commercial/service pollution, suburban domestic sewage along with fluorine point-source pollution, and agricultural nonpoint-source pollution with rural domestic sewage were identified as the main pollution sources in the urban, suburban, and rural zones, respectively. Understanding the water pollution characteristics of different administrative zones could provide insights into effective water management policy-making, especially in areas that span various administrative zones.
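
    A compact sketch of the APCS-MLR receptor-modeling step named above: PCA on standardized concentrations, conversion of the scores to absolute principal component scores (APCS) via an artificial zero-concentration sample, and a multiple linear regression of each measured parameter on the APCS to estimate source contributions. The data, the number of retained components, and the contribution summary are illustrative assumptions.

        import numpy as np
        from sklearn.decomposition import PCA
        from sklearn.linear_model import LinearRegression

        rng = np.random.default_rng(7)
        X = rng.lognormal(size=(144, 12))                   # hypothetical water-quality measurements (samples x parameters)

        mean, std = X.mean(axis=0), X.std(axis=0)
        Z = (X - mean) / std                                # standardized concentrations

        k = 3                                               # number of retained components (assumed sources)
        pca = PCA(n_components=k).fit(Z)
        scores = pca.transform(Z)

        # Absolute principal component scores: subtract the score of an artificial sample
        # whose concentration is zero for every parameter.
        z0 = (np.zeros(X.shape[1]) - mean) / std
        apcs = scores - pca.transform(z0.reshape(1, -1))

        # Regress each parameter on the APCS; coefficients x APCS give estimated source contributions.
        for j in range(X.shape[1]):
            mlr = LinearRegression().fit(apcs, X[:, j])
            contrib = mlr.coef_ * apcs.mean(axis=0)         # mean contribution of each source
            print(f"parameter {j}: intercept {mlr.intercept_:.2f}, mean source contributions {contrib}")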

  16. [Simultaneous determination of principal components and related substances of raw material drug of ammonium glycyrrhizinate by reversed-phase high performance liquid chromatography].

    PubMed

    Zhao, Yanyan; Liu, Liyan; Han, Yuanyuan; Li, Yueqiu; Wang, Yan; Shi, Minjian

    2013-09-01

    An analytical method for the simultaneous determination of 18alpha-glycyrrhizic acid, 18beta-glycyrrhizinic acid and related substances A and B, for the drug quality standard, by reversed-phase high performance liquid chromatography (RP-HPLC) was established. The assay was carried out on a Durashell-C18 column (250 mm x 4.6 mm, 5 microm) with 10 mmol/L ammonium perchlorate (pH adjusted to 8.20 with ammonia)-methanol (48:52, v/v) as the mobile phase at a flow rate of 0.80 mL/min, and the detection wavelength was set at 254 nm. The column temperature was 50 degrees C and the injection volume was 10 microL. Under these separation conditions, the calibration curves of the analytes showed good linearity within the mass concentration range of 0.50-100 mg/L (r > 0.9999). The detection limits for 18alpha-glycyrrhizic acid, 18beta-glycyrrhizinic acid and related substances A and B were 0.15, 0.10, 0.10 and 0.15 mg/L, respectively. The average recoveries were between 97.32% and 99.33% (n = 3), with relative standard deviations (RSDs) between 0.05% and 1.06%. The method is sensitive and reproducible, and the results are accurate and reliable. The method can be used for the determination of the principal components and related substances of ammonium glycyrrhizinate for the quality control of the raw material drug.

  17. Rear shape in 3 dimensions summarized by principal component analysis is a good predictor of body condition score in Holstein dairy cows.

    PubMed

    Fischer, A; Luginbühl, T; Delattre, L; Delouard, J M; Faverdin, P

    2015-07-01

    Body condition is an indirect estimation of the level of body reserves, and its variation reflects cumulative variation in energy balance. It interacts with reproductive and health performance, which are important to consider in dairy production but not easy to monitor. The commonly used body condition score (BCS) is time consuming, subjective, and not very sensitive. The aim was therefore to develop and validate a method assessing BCS with 3-dimensional (3D) surfaces of the cow's rear. A camera captured 3D shapes 2 m from the floor in a weigh station at the milking parlor exit. The BCS was scored by 3 experts on the same day as 3D imaging. Four anatomical landmarks had to be identified manually on each 3D surface to define a space centered on the cow's rear. A set of 57 3D surfaces from 56 Holstein dairy cows was selected to cover a large BCS range (from 0.5 to 4.75 on a 0 to 5 scale) to calibrate 3D surfaces on BCS. After performing a principal component analysis on this data set, multiple linear regression was fitted on the coordinates of these surfaces in the principal components' space to assess BCS. The validation was performed on 2 external data sets: one with cows used for calibration, but at a different lactation stage, and one with cows not used for calibration. Additionally, 6 cows were scanned once and their surfaces processed 8 times each for repeatability and then these cows were scanned 8 times each the same day for reproducibility. The selected model showed perfect calibration and a good but weaker validation (root mean square error=0.31 for the data set with cows used for calibration; 0.32 for the data set with cows not used for calibration). Assessing BCS with 3D surfaces was 3 times more repeatable (standard error=0.075 versus 0.210 for BCS) and 2.8 times more reproducible than manually scored BCS (standard error=0.103 versus 0.280 for BCS). The prediction error was similar for both validation data sets, indicating that the method is not less efficient for cows not used for calibration. The major part of reproducibility error incorporates repeatability error. An automation of the anatomical landmarks identification is required, first to allow broadband measures of body condition and second to improve repeatability and consequently reproducibility. Assessing BCS using 3D imaging coupled with principal component analysis appears to be a very promising means of improving precision and feasibility of this trait measurement. Copyright © 2015 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
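
    A small sketch of the calibration step described above under stated assumptions: PCA of flattened 3D rear surfaces, a multiple linear regression of BCS on the surface coordinates in PC space, and a root-mean-square validation error. The surfaces, scores, and number of retained components are hypothetical.

        import numpy as np
        from sklearn.decomposition import PCA
        from sklearn.linear_model import LinearRegression

        rng = np.random.default_rng(8)
        surfaces = rng.normal(size=(57, 3000))     # hypothetical flattened 3D rear surfaces (cows x coordinates)
        bcs = rng.uniform(0.5, 4.75, size=57)      # hypothetical expert body condition scores

        pca = PCA(n_components=10).fit(surfaces)   # number of retained components is an assumption
        model = LinearRegression().fit(pca.transform(surfaces), bcs)

        # Predict BCS for new scans and report the validation error as an RMSE, as in the study.
        new_scans = rng.normal(size=(20, 3000))
        predicted = model.predict(pca.transform(new_scans))
        true_bcs = rng.uniform(0.5, 4.75, size=20)
        rmse = np.sqrt(np.mean((predicted - true_bcs) ** 2))
        print("validation RMSE:", rmse)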

  18. Checking Dimensionality in Item Response Models with Principal Component Analysis on Standardized Residuals

    ERIC Educational Resources Information Center

    Chou, Yeh-Tai; Wang, Wen-Chung

    2010-01-01

    Dimensionality is an important assumption in item response theory (IRT). Principal component analysis on standardized residuals has been used to check dimensionality, especially under the family of Rasch models. It has been suggested that an eigenvalue greater than 1.5 for the first eigenvalue signifies a violation of unidimensionality when there…

  19. Variable Neighborhood Search Heuristics for Selecting a Subset of Variables in Principal Component Analysis

    ERIC Educational Resources Information Center

    Brusco, Michael J.; Singh, Renu; Steinley, Douglas

    2009-01-01

    The selection of a subset of variables from a pool of candidates is an important problem in several areas of multivariate statistics. Within the context of principal component analysis (PCA), a number of authors have argued that subset selection is crucial for identifying those variables that are required for correct interpretation of the…

  20. Relaxation mode analysis of a peptide system: comparison with principal component analysis.

    PubMed

    Mitsutake, Ayori; Iijima, Hiromitsu; Takano, Hiroshi

    2011-10-28

    This article reports the first attempt to apply the relaxation mode analysis method to a simulation of a biomolecular system. In biomolecular systems, principal component analysis is a well-known method for analyzing the static properties of the fluctuations of structures obtained by a simulation and for classifying the structures into groups. On the other hand, relaxation mode analysis has been used to analyze the dynamic properties of homopolymer systems. In this article, a long Monte Carlo simulation of Met-enkephalin in the gas phase has been performed. The results are analyzed by the principal component analysis and relaxation mode analysis methods. We compare the results of both methods and show the effectiveness of the relaxation mode analysis.

  1. Matrix partitioning and EOF/principal component analysis of Antarctic Sea ice brightness temperatures

    NASA Technical Reports Server (NTRS)

    Murray, C. W., Jr.; Mueller, J. L.; Zwally, H. J.

    1984-01-01

    A field of measured anomalies of some physical variable relative to their time averages is partitioned in either the space domain or the time domain. Eigenvectors and corresponding principal components of the smaller-dimensioned covariance matrices associated with the partitioned data sets are calculated independently, then joined to approximate the eigenstructure of the larger covariance matrix associated with the unpartitioned data set. The accuracy of the approximation (fraction of the total variance in the field) and the magnitudes of the largest eigenvalues from the partitioned covariance matrices together determine the number of local EOFs and principal components to be joined at any particular level. The space-time distribution of Nimbus-5 ESMR sea ice measurements is analyzed.

  2. Fast principal component analysis for stacking seismic data

    NASA Astrophysics Data System (ADS)

    Wu, Juan; Bai, Min

    2018-04-01

    Stacking seismic data plays an indispensable role in many steps of the seismic data processing and imaging workflow. Optimal stacking of seismic data can help mitigate seismic noise and enhance the principal components to a great extent. Traditional average-based seismic stacking methods cannot obtain optimal performance when the ambient noise is extremely strong. We propose a principal component analysis (PCA) algorithm for stacking seismic data that is not sensitive to the noise level. Considering the computational bottleneck of the classic PCA algorithm in processing massive seismic data, we propose an efficient PCA algorithm to make the proposed method readily applicable for industrial applications. Two numerically designed examples and one real seismic data set are used to demonstrate the performance of the presented method.
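
    A toy sketch contrasting a plain average stack with a basic PCA/SVD-weighted stack of the kind this abstract builds on (the paper's fast algorithm itself is not reproduced here); the synthetic gather and the first-singular-vector weighting rule are assumptions.

        import numpy as np

        rng = np.random.default_rng(9)
        signal = np.sin(np.linspace(0, 20, 1000))                # hypothetical clean reflection signal
        gather = signal + 0.8 * rng.normal(size=(30, 1000))      # 30 noisy, aligned traces

        mean_stack = gather.mean(axis=0)                         # conventional average-based stack

        # PCA stack: the first left singular vector of the gather gives per-trace weights,
        # so the stack is the projection onto the dominant principal component of the ensemble.
        u, s, vt = np.linalg.svd(gather, full_matrices=False)
        weights = np.abs(u[:, 0]) / np.abs(u[:, 0]).sum()
        pca_stack = weights @ gather

        print("correlation with clean signal (mean, PCA):",
              np.corrcoef(mean_stack, signal)[0, 1], np.corrcoef(pca_stack, signal)[0, 1])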

  3. Multivariate analyses of salt stress and metabolite sensing in auto- and heterotroph Chenopodium cell suspensions.

    PubMed

    Wongchai, C; Chaidee, A; Pfeiffer, W

    2012-01-01

    Global warming increases plant salt stress via evaporation after irrigation, but how plant cells sense salt stress remains unknown. Here, we searched for correlation-based targets of salt stress sensing in Chenopodium rubrum cell suspension cultures. We proposed a linkage between the sensing of salt stress and the sensing of distinct metabolites. Consequently, we analysed various extracellular pH signals in autotroph and heterotroph cell suspensions. Our search included signals after 52 treatments: salt and osmotic stress, ion channel inhibitors (amiloride, quinidine), salt-sensing modulators (proline), amino acids, carboxylic acids and regulators (salicylic acid, 2,4-dichlorphenoxyacetic acid). Multivariate analyses revealed hierarchical clusters of signals and five principal components of extracellular proton flux. The principal component correlated with salt stress was an antagonism of γ-aminobutyric acid and salicylic acid, confirming involvement of acid-sensing ion channels (ASICs) in salt stress sensing. Proline, short non-substituted mono-carboxylic acids (C2-C6), lactic acid and amiloride characterised the four uncorrelated principal components of proton flux. The proline-associated principal component included an antagonism of 2,4-dichlorphenoxyacetic acid and a set of amino acids (hydrophobic, polar, acidic, basic). The five principal components captured 100% of the variance of extracellular proton flux. Thus, a bias-free, functional high-throughput screening was established to extract new clusters of response elements and potential signalling pathways, and to serve as a core for quantitative meta-analysis in plant biology. The eigenvectors reorient research, associating proline with development instead of salt stress, and the proof of existence of multiple components of proton flux can help to resolve the controversy about the acid growth theory. © 2011 German Botanical Society and The Royal Botanical Society of the Netherlands.

  4. Item response theory and factor analysis as a mean to characterize occurrence of response shift in a longitudinal quality of life study in breast cancer patients

    PubMed Central

    2014-01-01

    Background The occurrence of response shift (RS) in longitudinal health-related quality of life (HRQoL) studies, reflecting patient adaptation to disease, has already been demonstrated. Several methods have been developed to detect the three different types of response shift (RS), i.e. 1) recalibration RS, 2) reprioritization RS, and 3) reconceptualization RS. We investigated two complementary methods that characterize the occurrence of RS: factor analysis, comprising Principal Component Analysis (PCA) and Multiple Correspondence Analysis (MCA), and a method from Item Response Theory (IRT). Methods Breast cancer patients (n = 381) completed the EORTC QLQ-C30 and EORTC QLQ-BR23 questionnaires at baseline, immediately following surgery, and three and six months after surgery, according to the “then-test/post-test” design. Recalibration was explored using MCA and an IRT model, the Linear Logistic Model with Relaxed Assumptions (LLRA), together with the then-test method. Principal Component Analysis (PCA) was used to explore reconceptualization and reprioritization. Results MCA highlighted the main profiles of recalibration: patients with a high HRQoL level report a slightly worse HRQoL level retrospectively, and vice versa. The LLRA model indicated a downward or upward recalibration for each dimension. At six months, the recalibration effect was statistically significant for 11/22 dimensions of the QLQ-C30 and BR23 according to the LLRA model (p ≤ 0.001). Regarding the QLQ-C30, PCA indicated a reprioritization of symptom scales and reconceptualization via an increased correlation between functional scales. Conclusions Our findings demonstrate the usefulness of these analyses in characterizing the occurrence of RS. MCA and the IRT model had convergent results with the then-test method in characterizing the recalibration component of RS. PCA is an indirect method for investigating the reprioritization and reconceptualization components of RS. PMID:24606836

  5. Powerful Electromechanical Linear Actuator

    NASA Technical Reports Server (NTRS)

    Cowan, John R.; Myers, William N.

    1994-01-01

    Powerful electromechanical linear actuator designed to replace hydraulic actuator that provides incremental linear movements to large object and holds its position against heavy loads. Electromechanical actuator cleaner and simpler, and needs less maintenance. Two principal innovative features that distinguish new actuator are use of shaft-angle resolver as source of position feedback to electronic control subsystem and antibacklash gearing arrangement.

  6. Quality evaluation of Houttuynia cordata Thunb. by high performance liquid chromatography with photodiode-array detection (HPLC-DAD).

    PubMed

    Yang, Zhan-nan; Sun, Yi-ming; Luo, Shi-qiong; Chen, Jin-wu; Yu, Zheng-wen; Sun, Min

    2014-03-01

    A new, validated method developed for the simultaneous determination of 16 phenolics (chlorogenic acid, scopoletin, vitexin, rutin, afzelin, isoquercitrin, narirutin, kaempferitrin, quercitrin, quercetin, kaempferol, chrysosplenol D, vitexicarpin, 5-hydroxy-3,3',4',7-tetramethoxy flavonoids, 5-hydroxy-3,4',6,7-tetramethoxy flavonoids and kaempferol-3,7,4'-trimethyl ether) in Houttuynia cordata Thunb. was successfully applied to 35 batches of samples collected from different regions or at different times, and their total antioxidant activities (TAAs) were investigated. The aim was to develop a quality control method to simultaneously determine the major active components in H. cordata. The HPLC-DAD method was performed using a reversed-phase C18 column with a gradient elution system (acetonitrile-methanol-water) and simultaneous detection at 345 nm. Linear behavior was observed for all the analytes, with linear regression relationships (r(2)>0.999) over the concentration ranges investigated. The recoveries of the 16 phenolics ranged from 98.93% to 101.26%. The samples analyzed were differentiated and classified based on the contents of the 16 characteristic compounds and the TAA using hierarchical clustering analysis (HCA) and principal component analysis (PCA); samples with similar chemical profiles and TAAs were grouped together. There was some evidence that the active compounds, although they varied significantly, may possess uniform antioxidant activities and have potentially synergistic effects.

  7. Effects of pumice mining on soil quality

    NASA Astrophysics Data System (ADS)

    Cruz-Ruíz, A.; Cruz-Ruíz, E.; Vaca, R.; Del Aguila, P.; Lugo, J.

    2016-01-01

    Mexico is the world's fourth most important maize producer; hence, there is a need to maintain soil quality for sustainable production in the upcoming years. Pumice mining is a superficial operation that modifies large areas in central Mexico. The main aim was to assess the present state of agricultural soils differing in elapsed time since pumice mining (0-15 years) in a representative area of the Calimaya region in the State of Mexico. Study sites on 0, 1, 4, 10, and 15 year old reclaimed soils were compared with an adjacent undisturbed site. Our results indicate that gravimetric moisture content, water holding capacity, bulk density, available phosphorus, total nitrogen, soil organic carbon, microbial biomass carbon, and phosphatase and urease activities were greatly impacted by disturbance. A general trend of recovery towards the undisturbed condition with reclamation age was found after disturbance, with the recovery of soil total N being faster than that of soil organic C. The soil quality indicators were selected using principal component analysis (PCA), correlations and multiple linear regressions. The first three components together explain 76.4% of the total variability. The results revealed that the most appropriate indicators to diagnose the quality of these soils were urease activity, available phosphorus, bulk density and, to a lesser extent, total nitrogen. According to the linear score analysis and the additive index, the soils showed recuperation starting from 4 years after pumice extraction.

  8. Characterization of CDOM from urban waters in Northern-Northeastern China using excitation-emission matrix fluorescence and parallel factor analysis.

    PubMed

    Zhao, Ying; Song, Kaishan; Li, Sijia; Ma, Jianhang; Wen, Zhidan

    2016-08-01

    Chromophoric dissolved organic matter (CDOM) plays an important role in aquatic systems, but high concentrations of organic materials are considered pollutants. The fluorescent component characteristics of CDOM in urban waters sampled from Northern and Northeastern China were examined by excitation-emission matrix fluorescence and parallel factor analysis (EEM-PARAFAC) to investigate the source and compositional changes of CDOM across space and pollution levels. One humic-like component (C1), one tryptophan-like component (C2), and one tyrosine-like component (C3) were identified by PARAFAC. Mean fluorescence intensities of the three CDOM components varied spatially and by pollution level in cities of Northern and Northeastern China during July-August, 2013 and 2014. Principal components analysis (PCA) was conducted to identify the relative distribution of all water samples. Cluster analysis (CA) was also used to categorize the samples into groups of similar pollution levels within the study area. Strong positive linear relationships were revealed between the CDOM absorption coefficients a(254) (R² = 0.89, p < 0.01) and a(355) (R² = 0.94, p < 0.01) and the fluorescence intensity (Fmax) of the humic-like C1 component. A positive linear relationship (R² = 0.77) was also exhibited between dissolved organic carbon (DOC) and the Fmax of the humic-like C1 component, but a relatively weak correlation (R² = 0.56) was detected between DOC and the Fmax of the tryptophan-like component (C2). A strong positive correlation was observed between the Fmax of the tryptophan-like component (C2) and total nitrogen (TN) (R² = 0.78), while moderate correlations were observed with ammonium-N (NH4-N) (R² = 0.68) and chemical oxygen demand (CODMn) (R² = 0.52). Therefore, the fluorescence intensities of CDOM components can be used to monitor water quality in near real time, in contrast to traditional approaches. These results demonstrate that EEM-PARAFAC is useful for evaluating the dynamics of CDOM fluorescent components in urban waters from Northern and Northeastern China, and the method has potential applications for monitoring urban water quality in regions with various hydrological conditions and pollution levels.

  9. [The application of the multidimensional statistical methods in the evaluation of the influence of atmospheric pollution on the population's health].

    PubMed

    Surzhikov, V D; Surzhikov, D V

    2014-01-01

    The search for and measurement of causal relationships between exposure to air pollution and the health status of the population are based on systems analysis and risk assessment to improve the quality of research. For this purpose, modern statistical methods are applied, including tests of independence, principal component analysis and discriminant function analysis. As a result of the analysis, four main components were separated from all atmospheric pollutants: for diseases of the circulatory system, the main principal component is associated with concentrations of suspended solids, nitrogen dioxide, carbon monoxide and hydrogen fluoride; for respiratory diseases, the main principal component is closely associated with suspended solids, sulfur dioxide, nitrogen dioxide and charcoal black. The discriminant function was shown to serve as a measure of the level of air pollution.

  10. Priority of VHS Development Based in Potential Area using Principal Component Analysis

    NASA Astrophysics Data System (ADS)

    Meirawan, D.; Ana, A.; Saripudin, S.

    2018-02-01

    The current condition of VHS is still inadequate in quality, quantity and relevance. The purpose of this research is to analyse priorities for VHS development based on regional potential by using principal component analysis (PCA) in Bandung, Indonesia. This study used a descriptive qualitative analysis of secondary data based on principal component reduction. The method used is Principal Component Analysis (PCA) with the Minitab statistics software. The results indicate that the areas with the lowest values are the priorities for constructing new VHS, with major programs aligned with the development of regional potential. Based on the PCA scores, the main priorities for VHS development in Bandung are Saguling, which has the lowest PCA value (416.92) in area 1; Cihampelas, with the lowest PCA value in area 2; and Padalarang, with the lowest PCA value.

  11. Comparison of dimensionality reduction methods to predict genomic breeding values for carcass traits in pigs.

    PubMed

    Azevedo, C F; Nascimento, M; Silva, F F; Resende, M D V; Lopes, P S; Guimarães, S E F; Glória, L S

    2015-10-09

    A significant contribution of molecular genetics is the direct use of DNA information to identify genetically superior individuals, and genome-wide selection (GWS) can be used for this purpose. GWS consists of analyzing a large number of single nucleotide polymorphism markers widely distributed in the genome; however, because the number of markers is much larger than the number of genotyped individuals, and such markers are highly correlated, special statistical methods are required. Among these methods, independent component regression, principal component regression, partial least squares, and partial principal components stand out. Thus, the aim of this study was to propose an application of these dimensionality reduction methods to GWS of carcass traits in an F2 (Piau x commercial line) pig population. The results show similarities between the principal and independent component methods, which provided the most accurate genomic breeding value estimates for most carcass traits in pigs.
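
    A minimal sketch of principal component regression for genomic prediction, one of the dimensionality reduction methods listed above; the simulated genotype matrix, phenotype model, and number of retained components are all hypothetical.

        import numpy as np
        from sklearn.decomposition import PCA
        from sklearn.linear_model import LinearRegression
        from sklearn.model_selection import train_test_split

        rng = np.random.default_rng(10)
        M = rng.integers(0, 3, size=(400, 5000)).astype(float)    # hypothetical SNP genotypes coded 0/1/2
        y = M[:, :50].sum(axis=1) * 0.1 + rng.normal(size=400)    # hypothetical phenotypes with a genetic signal

        M_train, M_test, y_train, y_test = train_test_split(M, y, test_size=0.25, random_state=0)

        pca = PCA(n_components=100).fit(M_train)                  # far fewer predictors than markers
        pcr = LinearRegression().fit(pca.transform(M_train), y_train)

        gebv = pcr.predict(pca.transform(M_test))                 # genomic predictions for validation animals
        print("predictive correlation:", np.corrcoef(gebv, y_test)[0, 1])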

  12. Robust estimation for partially linear models with large-dimensional covariates

    PubMed Central

    Zhu, LiPing; Li, RunZe; Cui, HengJian

    2014-01-01

    We are concerned with robust estimation procedures to estimate the parameters in partially linear models with large-dimensional covariates. To enhance interpretability, we suggest implementing a nonconcave regularization method in the robust estimation procedure to select important covariates from the linear component. We establish consistency for both the linear and the nonlinear components when the covariate dimension diverges at the rate of o(n), where n is the sample size. We show that the robust estimate of the linear component performs asymptotically as well as its oracle counterpart, which assumes the baseline function and the unimportant covariates were known a priori. With a consistent estimator of the linear component, we estimate the nonparametric component by a robust local linear regression. It is proved that the robust estimate of the nonlinear component performs asymptotically as well as if the linear component were known in advance. Comprehensive simulation studies are carried out and an application is presented to examine the finite-sample performance of the proposed procedures. PMID:24955087

  13. Robust estimation for partially linear models with large-dimensional covariates.

    PubMed

    Zhu, LiPing; Li, RunZe; Cui, HengJian

    2013-10-01

    We are concerned with robust estimation procedures to estimate the parameters in partially linear models with large-dimensional covariates. To enhance interpretability, we suggest implementing a nonconcave regularization method in the robust estimation procedure to select important covariates from the linear component. We establish consistency for both the linear and the nonlinear components when the covariate dimension diverges at the rate of o(n), where n is the sample size. We show that the robust estimate of the linear component performs asymptotically as well as its oracle counterpart, which assumes the baseline function and the unimportant covariates were known a priori. With a consistent estimator of the linear component, we estimate the nonparametric component by a robust local linear regression. It is proved that the robust estimate of the nonlinear component performs asymptotically as well as if the linear component were known in advance. Comprehensive simulation studies are carried out and an application is presented to examine the finite-sample performance of the proposed procedures.

  14. Phases of female sexual response cycle among Malaysian women with infertility: a factor analysis study.

    PubMed

    Seen Heng, Yeoh; Sidi, Hatta; Nik Jaafar, Nik Ruzyanei; Razali, Rosdinom; Ram, Hari

    2013-04-01

    This cross-sectional study aimed to determine the construct of the phases of the female sexual response cycle (SRC) among women attending an infertility clinic in a Malaysian tertiary center. The sexual response phases were measured with a validated Malay version of the Female Sexual Function Index (FSFI). The correlation structure of the items of the SRC phases (i.e. desire, arousal, orgasm, satisfaction and pain) was determined using principal component analysis (PCA) with the varimax rotation method. The number of factors retained was decided using Kaiser's criterion. A total of 150 married women with a mean age of 32 years participated in this study. Factor loadings from PCA with varimax rotation divided the sexual domains into three components. The first construct comprised sexual arousal, lubrication and pain (suggesting a mechanical component). The second construct comprised orgasm and sexual satisfaction (suggesting physical achievement). Sexual desire, suggesting a psychological component, stood on its own as the third. The findings suggest that three constructs could be identified, in favor of the Basson model (a non-linear concept of the SRC) for Malaysian women's sexual functioning. Understanding this would help clinicians to strategize the treatment approach to sexual dysfunction in women with infertility. Copyright © 2013 Wiley Publishing Asia Pty Ltd.

  15. Performance-Based Preparation of Principals: A Framework for Improvement. A Special Report of the NASSP Consortium for the Performance-Based Preparation of Principals.

    ERIC Educational Resources Information Center

    National Association of Secondary School Principals, Reston, VA.

    Preparation programs for principals should have excellent academic and performance based components. In examining the nature of performance based principal preparation this report finds that school administration programs must bridge the gap between conceptual learning in the classroom and the requirements of professional practice. A number of…

  16. Principal component greenness transformation in multitemporal agricultural Landsat data

    NASA Technical Reports Server (NTRS)

    Abotteen, R. A.

    1978-01-01

    A data compression technique for multitemporal Landsat imagery which extracts phenological growth pattern information for agricultural crops is described. The principal component greenness transformation was applied to multitemporal agricultural Landsat data for information retrieval. The transformation was favorable for applications in agricultural Landsat data analysis because of its physical interpretability and its relation to the phenological growth of crops. It was also found that the first and second greenness eigenvector components define a temporal small-grain trajectory and nonsmall-grain trajectory, respectively.

  17. Prediction of genomic breeding values for dairy traits in Italian Brown and Simmental bulls using a principal component approach.

    PubMed

    Pintus, M A; Gaspa, G; Nicolazzi, E L; Vicario, D; Rossoni, A; Ajmone-Marsan, P; Nardone, A; Dimauro, C; Macciotta, N P P

    2012-06-01

    The large number of markers available compared with phenotypes represents one of the main issues in genomic selection. In this work, principal component analysis was used to reduce the number of predictors for calculating genomic breeding values (GEBV). Bulls of 2 cattle breeds farmed in Italy (634 Brown and 469 Simmental) were genotyped with the 54K Illumina beadchip (Illumina Inc., San Diego, CA). After data editing, 37,254 and 40,179 single nucleotide polymorphisms (SNP) were retained for Brown and Simmental, respectively. Principal component analysis carried out on the SNP genotype matrix extracted 2,257 and 3,596 new variables in the 2 breeds, respectively. Bulls were sorted by birth year to create reference and prediction populations. The effect of principal components on deregressed proofs in reference animals was estimated with a BLUP model. Results were compared with those obtained by using SNP genotypes as predictors with either the BLUP or Bayes_A method. Traits considered were milk, fat, and protein yields, fat and protein percentages, and somatic cell score. The GEBV were obtained for the prediction population by blending direct genomic prediction and pedigree indexes. No substantial differences were observed in squared correlations between GEBV and EBV in prediction animals among the 3 methods in the 2 breeds. The principal component analysis method allowed for a reduction of about 90% in the number of independent variables when predicting direct genomic values, with a substantial decrease in calculation time and without loss of accuracy. Copyright © 2012 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
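
    A simplified sketch of the dimension-reduction idea in this record, not the authors' pipeline: project a bulls-by-SNPs genotype matrix onto its leading principal components and regress (deregressed) proofs on the component scores, here with a ridge penalty as a rough stand-in for the BLUP step. The genotypes, proofs, SNP count, and reference/prediction split below are simulated placeholders.

```python
# Simulated sketch: PCA-reduced genotypes + ridge regression as a rough BLUP stand-in.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import Ridge

rng = np.random.default_rng(3)
n_bulls, n_snps = 634, 5_000                       # the paper retains ~37,000-40,000 SNPs
G = rng.integers(0, 3, size=(n_bulls, n_snps)).astype(float)   # 0/1/2 genotype codes
snp_effects = rng.normal(scale=0.05, size=n_snps)
y = G @ snp_effects + rng.normal(scale=1.0, size=n_bulls)      # proxy for deregressed proofs

# Older bulls as the reference population, younger bulls as the prediction population.
ref, pred = slice(0, 500), slice(500, n_bulls)

pca = PCA(n_components=0.99, svd_solver="full")    # keep components explaining 99% of variance
scores_ref = pca.fit_transform(G[ref])
scores_pred = pca.transform(G[pred])

model = Ridge(alpha=10.0).fit(scores_ref, y[ref])
gebv = model.predict(scores_pred)
r2 = np.corrcoef(gebv, y[pred])[0, 1] ** 2
print(f"{pca.n_components_} components retained; squared correlation = {r2:.2f}")
```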

  18. Identifying sources of emerging organic contaminants in a mixed use watershed using principal components analysis.

    PubMed

    Karpuzcu, M Ekrem; Fairbairn, David; Arnold, William A; Barber, Brian L; Kaufenberg, Elizabeth; Koskinen, William C; Novak, Paige J; Rice, Pamela J; Swackhamer, Deborah L

    2014-01-01

    Principal components analysis (PCA) was used to identify sources of emerging organic contaminants in the Zumbro River watershed in southeastern Minnesota. Two main principal components (PCs) were identified, which together explained more than 50% of the variance in the data. Principal Component 1 (PC1) was attributed to urban wastewater-derived sources, including municipal wastewater and residential septic tank effluents, while Principal Component 2 (PC2) was attributed to agricultural sources. The variances of the concentrations of cotinine, DEET and the prescription drugs carbamazepine, erythromycin and sulfamethoxazole were best explained by PC1, while the variances of the concentrations of the agricultural pesticides atrazine, metolachlor and acetochlor were best explained by PC2. The mixed-use compounds carbaryl, iprodione and daidzein did not specifically group with either PC1 or PC2. Furthermore, although caffeine and acetaminophen have historically been associated with human use, they could not be attributed to a single dominant land-use category (e.g., urban/residential or agricultural). Contributions from septic systems did not clarify the source for these two compounds, suggesting that additional sources, such as runoff from biosolid-amended soils, may exist. Based on these results, PCA may be a useful way to broadly categorize the sources of new and previously uncharacterized emerging contaminants or may help to clarify transport pathways in a given area. Acetaminophen and caffeine were not ideal markers for urban/residential contamination sources in the study area and may need to be reconsidered as such in other areas as well.
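
    A hedged sketch of this kind of source-apportionment workflow: standardize compound concentrations across samples, run PCA, and inspect the loadings to see which compounds group on each component. The compound names come from the abstract above; the concentration values, sample count, and mixing structure are simulated assumptions.

```python
# Simulated source-apportionment sketch: standardize, run PCA, inspect loadings.
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

compounds = ["cotinine", "DEET", "carbamazepine", "erythromycin", "sulfamethoxazole",
             "atrazine", "metolachlor", "acetochlor", "caffeine", "acetaminophen"]
rng = np.random.default_rng(4)
n_samples = 60
urban = rng.lognormal(mean=0.0, sigma=1.0, size=(n_samples, 1))   # latent urban signal
agric = rng.lognormal(mean=0.0, sigma=1.0, size=(n_samples, 1))   # latent agricultural signal
conc = rng.lognormal(mean=-1.0, sigma=0.5, size=(n_samples, len(compounds)))
conc[:, :5] += urban                               # wastewater-type compounds co-vary
conc[:, 5:8] += agric                              # pesticides co-vary
df = pd.DataFrame(conc, columns=compounds)

X = StandardScaler().fit_transform(df)
pca = PCA(n_components=2).fit(X)
loadings = pd.DataFrame(pca.components_.T, index=compounds, columns=["PC1", "PC2"])
print(loadings.round(2))
print("variance explained:", pca.explained_variance_ratio_.round(2))
```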

  19. Sparse modeling of spatial environmental variables associated with asthma

    PubMed Central

    Chang, Timothy S.; Gangnon, Ronald E.; Page, C. David; Buckingham, William R.; Tandias, Aman; Cowan, Kelly J.; Tomasallo, Carrie D.; Arndt, Brian G.; Hanrahan, Lawrence P.; Guilbert, Theresa W.

    2014-01-01

    Geographically distributed environmental factors influence the burden of diseases such as asthma. Our objective was to identify sparse environmental variables associated with asthma diagnosis gathered from a large electronic health record (EHR) dataset while controlling for spatial variation. An EHR dataset from the University of Wisconsin’s Family Medicine, Internal Medicine and Pediatrics Departments was obtained for 199,220 patients aged 5–50 years over a three-year period. Each patient’s home address was geocoded to one of 3,456 geographic census block groups. Over one thousand block group variables were obtained from a commercial database. We developed a Sparse Spatial Environmental Analysis (SASEA). Using this method, the environmental variables were first dimensionally reduced with sparse principal component analysis. Logistic thin plate regression spline modeling was then used to identify block group variables associated with asthma from sparse principal components. The addresses of patients from the EHR dataset were distributed throughout the majority of Wisconsin’s geography. Logistic thin plate regression spline modeling captured spatial variation of asthma. Four sparse principal components identified via model selection consisted of food at home, dog ownership, household size, and disposable income variables. In rural areas, dog ownership and renter-occupied housing units from significant sparse principal components were associated with asthma. Our main contribution is the incorporation of sparsity in spatial modeling. SASEA sequentially added sparse principal components to logistic thin plate regression spline modeling. This method allowed association of geographically distributed environmental factors with asthma using EHR and environmental datasets. SASEA can be applied to other diseases with environmental risk factors. PMID:25533437
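
    A greatly simplified sketch of the two-stage SASEA idea: reduce many block-group environmental variables with sparse PCA, then relate the sparse component scores to asthma diagnosis with a logistic model. The spatial thin plate regression spline term used in the paper is omitted here and replaced by a plain logistic regression, and all data, dimensions, and factor structure below are simulated placeholders.

```python
# Simulated two-stage sketch: sparse PCA for dimension reduction, then a plain
# logistic regression on the sparse component scores (the paper's spatial thin
# plate regression spline term is omitted here).
import numpy as np
from sklearn.decomposition import SparsePCA
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(5)
n_block_groups, n_env_vars, n_factors = 1_000, 100, 4   # the paper uses 3,456 block groups
latent = rng.normal(size=(n_block_groups, n_factors))
mixing = np.zeros((n_factors, n_env_vars))
for j in range(n_factors):                               # each latent factor drives 10 variables
    mixing[j, j * 10:(j + 1) * 10] = rng.normal(size=10)
E = latent @ mixing + rng.normal(scale=0.5, size=(n_block_groups, n_env_vars))
asthma = rng.binomial(1, 1.0 / (1.0 + np.exp(-latent[:, 0])))    # outcome tied to factor 1

E_std = StandardScaler().fit_transform(E)
spca = SparsePCA(n_components=n_factors, alpha=1.0, random_state=0)
scores = spca.fit_transform(E_std)

clf = LogisticRegression().fit(scores, asthma)
nonzero = [int(np.count_nonzero(c)) for c in spca.components_]
print("nonzero loadings per sparse component:", nonzero)
print("logistic coefficients:", clf.coef_.round(3))
```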

  20. Sparse modeling of spatial environmental variables associated with asthma.

    PubMed

    Chang, Timothy S; Gangnon, Ronald E; David Page, C; Buckingham, William R; Tandias, Aman; Cowan, Kelly J; Tomasallo, Carrie D; Arndt, Brian G; Hanrahan, Lawrence P; Guilbert, Theresa W

    2015-02-01

    Geographically distributed environmental factors influence the burden of diseases such as asthma. Our objective was to identify sparse environmental variables associated with asthma diagnosis gathered from a large electronic health record (EHR) dataset while controlling for spatial variation. An EHR dataset from the University of Wisconsin's Family Medicine, Internal Medicine and Pediatrics Departments was obtained for 199,220 patients aged 5-50 years over a three-year period. Each patient's home address was geocoded to one of 3,456 geographic census block groups. Over one thousand block group variables were obtained from a commercial database. We developed a Sparse Spatial Environmental Analysis (SASEA). Using this method, the environmental variables were first dimensionally reduced with sparse principal component analysis. Logistic thin plate regression spline modeling was then used to identify block group variables associated with asthma from sparse principal components. The addresses of patients from the EHR dataset were distributed throughout the majority of Wisconsin's geography. Logistic thin plate regression spline modeling captured spatial variation of asthma. Four sparse principal components identified via model selection consisted of food at home, dog ownership, household size, and disposable income variables. In rural areas, dog ownership and renter-occupied housing units from significant sparse principal components were associated with asthma. Our main contribution is the incorporation of sparsity in spatial modeling. SASEA sequentially added sparse principal components to logistic thin plate regression spline modeling. This method allowed association of geographically distributed environmental factors with asthma using EHR and environmental datasets. SASEA can be applied to other diseases with environmental risk factors. Copyright © 2014 Elsevier Inc. All rights reserved.
