Multilevel sparse functional principal component analysis.
Di, Chongzhi; Crainiceanu, Ciprian M; Jank, Wolfgang S
2014-01-29
We consider analysis of sparsely sampled multilevel functional data, where the basic observational unit is a function and data have a natural hierarchy of basic units. An example is when functions are recorded at multiple visits for each subject. Multilevel functional principal component analysis (MFPCA; Di et al. 2009) was proposed for such data when functions are densely recorded. Here we consider the case when functions are sparsely sampled and may contain only a few observations per function. We exploit the multilevel structure of covariance operators and achieve data reduction by principal component decompositions at both between and within subject levels. We address inherent methodological differences in the sparse sampling context to: 1) estimate the covariance operators; 2) estimate the functional principal component scores; 3) predict the underlying curves. Through simulations the proposed method is able to discover dominating modes of variations and reconstruct underlying curves well even in sparse settings. Our approach is illustrated by two applications, the Sleep Heart Health Study and eBay auctions.
Sparse modeling of spatial environmental variables associated with asthma
Chang, Timothy S.; Gangnon, Ronald E.; Page, C. David; Buckingham, William R.; Tandias, Aman; Cowan, Kelly J.; Tomasallo, Carrie D.; Arndt, Brian G.; Hanrahan, Lawrence P.; Guilbert, Theresa W.
2014-01-01
Geographically distributed environmental factors influence the burden of diseases such as asthma. Our objective was to identify sparse environmental variables associated with asthma diagnosis gathered from a large electronic health record (EHR) dataset while controlling for spatial variation. An EHR dataset from the University of Wisconsin’s Family Medicine, Internal Medicine and Pediatrics Departments was obtained for 199,220 patients aged 5–50 years over a three-year period. Each patient’s home address was geocoded to one of 3,456 geographic census block groups. Over one thousand block group variables were obtained from a commercial database. We developed a Sparse Spatial Environmental Analysis (SASEA). Using this method, the environmental variables were first dimensionally reduced with sparse principal component analysis. Logistic thin plate regression spline modeling was then used to identify block group variables associated with asthma from sparse principal components. The addresses of patients from the EHR dataset were distributed throughout the majority of Wisconsin’s geography. Logistic thin plate regression spline modeling captured spatial variation of asthma. Four sparse principal components identified via model selection consisted of food at home, dog ownership, household size, and disposable income variables. In rural areas, dog ownership and renter occupied housing units from significant sparse principal components were associated with asthma. Our main contribution is the incorporation of sparsity in spatial modeling. SASEA sequentially added sparse principal components to Logistic thin plate regression spline modeling. This method allowed association of geographically distributed environmental factors with asthma using EHR and environmental datasets. SASEA can be applied to other diseases with environmental risk factors. PMID:25533437
Sparse modeling of spatial environmental variables associated with asthma.
Chang, Timothy S; Gangnon, Ronald E; David Page, C; Buckingham, William R; Tandias, Aman; Cowan, Kelly J; Tomasallo, Carrie D; Arndt, Brian G; Hanrahan, Lawrence P; Guilbert, Theresa W
2015-02-01
Geographically distributed environmental factors influence the burden of diseases such as asthma. Our objective was to identify sparse environmental variables associated with asthma diagnosis gathered from a large electronic health record (EHR) dataset while controlling for spatial variation. An EHR dataset from the University of Wisconsin's Family Medicine, Internal Medicine and Pediatrics Departments was obtained for 199,220 patients aged 5-50years over a three-year period. Each patient's home address was geocoded to one of 3456 geographic census block groups. Over one thousand block group variables were obtained from a commercial database. We developed a Sparse Spatial Environmental Analysis (SASEA). Using this method, the environmental variables were first dimensionally reduced with sparse principal component analysis. Logistic thin plate regression spline modeling was then used to identify block group variables associated with asthma from sparse principal components. The addresses of patients from the EHR dataset were distributed throughout the majority of Wisconsin's geography. Logistic thin plate regression spline modeling captured spatial variation of asthma. Four sparse principal components identified via model selection consisted of food at home, dog ownership, household size, and disposable income variables. In rural areas, dog ownership and renter occupied housing units from significant sparse principal components were associated with asthma. Our main contribution is the incorporation of sparsity in spatial modeling. SASEA sequentially added sparse principal components to Logistic thin plate regression spline modeling. This method allowed association of geographically distributed environmental factors with asthma using EHR and environmental datasets. SASEA can be applied to other diseases with environmental risk factors. Copyright © 2014 Elsevier Inc. All rights reserved.
An efficient classification method based on principal component and sparse representation.
Zhai, Lin; Fu, Shujun; Zhang, Caiming; Liu, Yunxian; Wang, Lu; Liu, Guohua; Yang, Mingqiang
2016-01-01
As an important application in optical imaging, palmprint recognition is interfered by many unfavorable factors. An effective fusion of blockwise bi-directional two-dimensional principal component analysis and grouping sparse classification is presented. The dimension reduction and normalizing are implemented by the blockwise bi-directional two-dimensional principal component analysis for palmprint images to extract feature matrixes, which are assembled into an overcomplete dictionary in sparse classification. A subspace orthogonal matching pursuit algorithm is designed to solve the grouping sparse representation. Finally, the classification result is gained by comparing the residual between testing and reconstructed images. Experiments are carried out on a palmprint database, and the results show that this method has better robustness against position and illumination changes of palmprint images, and can get higher rate of palmprint recognition.
Li, Ziyi; Safo, Sandra E; Long, Qi
2017-07-11
Sparse principal component analysis (PCA) is a popular tool for dimensionality reduction, pattern recognition, and visualization of high dimensional data. It has been recognized that complex biological mechanisms occur through concerted relationships of multiple genes working in networks that are often represented by graphs. Recent work has shown that incorporating such biological information improves feature selection and prediction performance in regression analysis, but there has been limited work on extending this approach to PCA. In this article, we propose two new sparse PCA methods called Fused and Grouped sparse PCA that enable incorporation of prior biological information in variable selection. Our simulation studies suggest that, compared to existing sparse PCA methods, the proposed methods achieve higher sensitivity and specificity when the graph structure is correctly specified, and are fairly robust to misspecified graph structures. Application to a glioblastoma gene expression dataset identified pathways that are suggested in the literature to be related with glioblastoma. The proposed sparse PCA methods Fused and Grouped sparse PCA can effectively incorporate prior biological information in variable selection, leading to improved feature selection and more interpretable principal component loadings and potentially providing insights on molecular underpinnings of complex diseases.
A novel principal component analysis for spatially misaligned multivariate air pollution data.
Jandarov, Roman A; Sheppard, Lianne A; Sampson, Paul D; Szpiro, Adam A
2017-01-01
We propose novel methods for predictive (sparse) PCA with spatially misaligned data. These methods identify principal component loading vectors that explain as much variability in the observed data as possible, while also ensuring the corresponding principal component scores can be predicted accurately by means of spatial statistics at locations where air pollution measurements are not available. This will make it possible to identify important mixtures of air pollutants and to quantify their health effects in cohort studies, where currently available methods cannot be used. We demonstrate the utility of predictive (sparse) PCA in simulated data and apply the approach to annual averages of particulate matter speciation data from national Environmental Protection Agency (EPA) regulatory monitors.
Sparse principal component analysis in medical shape modeling
NASA Astrophysics Data System (ADS)
Sjöstrand, Karl; Stegmann, Mikkel B.; Larsen, Rasmus
2006-03-01
Principal component analysis (PCA) is a widely used tool in medical image analysis for data reduction, model building, and data understanding and exploration. While PCA is a holistic approach where each new variable is a linear combination of all original variables, sparse PCA (SPCA) aims at producing easily interpreted models through sparse loadings, i.e. each new variable is a linear combination of a subset of the original variables. One of the aims of using SPCA is the possible separation of the results into isolated and easily identifiable effects. This article introduces SPCA for shape analysis in medicine. Results for three different data sets are given in relation to standard PCA and sparse PCA by simple thresholding of small loadings. Focus is on a recent algorithm for computing sparse principal components, but a review of other approaches is supplied as well. The SPCA algorithm has been implemented using Matlab and is available for download. The general behavior of the algorithm is investigated, and strengths and weaknesses are discussed. The original report on the SPCA algorithm argues that the ordering of modes is not an issue. We disagree on this point and propose several approaches to establish sensible orderings. A method that orders modes by decreasing variance and maximizes the sum of variances for all modes is presented and investigated in detail.
Tipton, John; Hooten, Mevin B.; Goring, Simon
2017-01-01
Scientific records of temperature and precipitation have been kept for several hundred years, but for many areas, only a shorter record exists. To understand climate change, there is a need for rigorous statistical reconstructions of the paleoclimate using proxy data. Paleoclimate proxy data are often sparse, noisy, indirect measurements of the climate process of interest, making each proxy uniquely challenging to model statistically. We reconstruct spatially explicit temperature surfaces from sparse and noisy measurements recorded at historical United States military forts and other observer stations from 1820 to 1894. One common method for reconstructing the paleoclimate from proxy data is principal component regression (PCR). With PCR, one learns a statistical relationship between the paleoclimate proxy data and a set of climate observations that are used as patterns for potential reconstruction scenarios. We explore PCR in a Bayesian hierarchical framework, extending classical PCR in a variety of ways. First, we model the latent principal components probabilistically, accounting for measurement error in the observational data. Next, we extend our method to better accommodate outliers that occur in the proxy data. Finally, we explore alternatives to the truncation of lower-order principal components using different regularization techniques. One fundamental challenge in paleoclimate reconstruction efforts is the lack of out-of-sample data for predictive validation. Cross-validation is of potential value, but is computationally expensive and potentially sensitive to outliers in sparse data scenarios. To overcome the limitations that a lack of out-of-sample records presents, we test our methods using a simulation study, applying proper scoring rules including a computationally efficient approximation to leave-one-out cross-validation using the log score to validate model performance. The result of our analysis is a spatially explicit reconstruction of spatio-temporal temperature from a very sparse historical record.
Wang, Ya-Xuan; Gao, Ying-Lian; Liu, Jin-Xing; Kong, Xiang-Zhen; Li, Hai-Jun
2017-09-01
Identifying differentially expressed genes from the thousands of genes is a challenging task. Robust principal component analysis (RPCA) is an efficient method in the identification of differentially expressed genes. RPCA method uses nuclear norm to approximate the rank function. However, theoretical studies showed that the nuclear norm minimizes all singular values, so it may not be the best solution to approximate the rank function. The truncated nuclear norm is defined as the sum of some smaller singular values, which may achieve a better approximation of the rank function than nuclear norm. In this paper, a novel method is proposed by replacing nuclear norm of RPCA with the truncated nuclear norm, which is named robust principal component analysis regularized by truncated nuclear norm (TRPCA). The method decomposes the observation matrix of genomic data into a low-rank matrix and a sparse matrix. Because the significant genes can be considered as sparse signals, the differentially expressed genes are viewed as the sparse perturbation signals. Thus, the differentially expressed genes can be identified according to the sparse matrix. The experimental results on The Cancer Genome Atlas data illustrate that the TRPCA method outperforms other state-of-the-art methods in the identification of differentially expressed genes.
Randomized subspace-based robust principal component analysis for hyperspectral anomaly detection
NASA Astrophysics Data System (ADS)
Sun, Weiwei; Yang, Gang; Li, Jialin; Zhang, Dianfa
2018-01-01
A randomized subspace-based robust principal component analysis (RSRPCA) method for anomaly detection in hyperspectral imagery (HSI) is proposed. The RSRPCA combines advantages of randomized column subspace and robust principal component analysis (RPCA). It assumes that the background has low-rank properties, and the anomalies are sparse and do not lie in the column subspace of the background. First, RSRPCA implements random sampling to sketch the original HSI dataset from columns and to construct a randomized column subspace of the background. Structured random projections are also adopted to sketch the HSI dataset from rows. Sketching from columns and rows could greatly reduce the computational requirements of RSRPCA. Second, the RSRPCA adopts the columnwise RPCA (CWRPCA) to eliminate negative effects of sampled anomaly pixels and that purifies the previous randomized column subspace by removing sampled anomaly columns. The CWRPCA decomposes the submatrix of the HSI data into a low-rank matrix (i.e., background component), a noisy matrix (i.e., noise component), and a sparse anomaly matrix (i.e., anomaly component) with only a small proportion of nonzero columns. The algorithm of inexact augmented Lagrange multiplier is utilized to optimize the CWRPCA problem and estimate the sparse matrix. Nonzero columns of the sparse anomaly matrix point to sampled anomaly columns in the submatrix. Third, all the pixels are projected onto the complemental subspace of the purified randomized column subspace of the background and the anomaly pixels in the original HSI data are finally exactly located. Several experiments on three real hyperspectral images are carefully designed to investigate the detection performance of RSRPCA, and the results are compared with four state-of-the-art methods. Experimental results show that the proposed RSRPCA outperforms four comparison methods both in detection performance and in computational time.
NASA Astrophysics Data System (ADS)
Li, Jun; Song, Minghui; Peng, Yuanxi
2018-03-01
Current infrared and visible image fusion methods do not achieve adequate information extraction, i.e., they cannot extract the target information from infrared images while retaining the background information from visible images. Moreover, most of them have high complexity and are time-consuming. This paper proposes an efficient image fusion framework for infrared and visible images on the basis of robust principal component analysis (RPCA) and compressed sensing (CS). The novel framework consists of three phases. First, RPCA decomposition is applied to the infrared and visible images to obtain their sparse and low-rank components, which represent the salient features and background information of the images, respectively. Second, the sparse and low-rank coefficients are fused by different strategies. On the one hand, the measurements of the sparse coefficients are obtained by the random Gaussian matrix, and they are then fused by the standard deviation (SD) based fusion rule. Next, the fused sparse component is obtained by reconstructing the result of the fused measurement using the fast continuous linearized augmented Lagrangian algorithm (FCLALM). On the other hand, the low-rank coefficients are fused using the max-absolute rule. Subsequently, the fused image is superposed by the fused sparse and low-rank components. For comparison, several popular fusion algorithms are tested experimentally. By comparing the fused results subjectively and objectively, we find that the proposed framework can extract the infrared targets while retaining the background information in the visible images. Thus, it exhibits state-of-the-art performance in terms of both fusion effects and timeliness.
NASA Astrophysics Data System (ADS)
Dafu, Shen; Leihong, Zhang; Dong, Liang; Bei, Li; Yi, Kang
2017-07-01
The purpose of this study is to improve the reconstruction precision and better copy the color of spectral image surfaces. A new spectral reflectance reconstruction algorithm based on an iterative threshold combined with weighted principal component space is presented in this paper, and the principal component with weighted visual features is the sparse basis. Different numbers of color cards are selected as the training samples, a multispectral image is the testing sample, and the color differences in the reconstructions are compared. The channel response value is obtained by a Mega Vision high-accuracy, multi-channel imaging system. The results show that spectral reconstruction based on weighted principal component space is superior in performance to that based on traditional principal component space. Therefore, the color difference obtained using the compressive-sensing algorithm with weighted principal component analysis is less than that obtained using the algorithm with traditional principal component analysis, and better reconstructed color consistency with human eye vision is achieved.
NASA Astrophysics Data System (ADS)
Zhao, Fengjun; Liu, Junting; Qu, Xiaochao; Xu, Xianhui; Chen, Xueli; Yang, Xiang; Cao, Feng; Liang, Jimin; Tian, Jie
2014-12-01
To solve the multicollinearity issue and unequal contribution of vascular parameters for the quantification of angiogenesis, we developed a quantification evaluation method of vascular parameters for angiogenesis based on in vivo micro-CT imaging of hindlimb ischemic model mice. Taking vascular volume as the ground truth parameter, nine vascular parameters were first assembled into sparse principal components (PCs) to reduce the multicolinearity issue. Aggregated boosted trees (ABTs) were then employed to analyze the importance of vascular parameters for the quantification of angiogenesis via the loadings of sparse PCs. The results demonstrated that vascular volume was mainly characterized by vascular area, vascular junction, connectivity density, segment number and vascular length, which indicated they were the key vascular parameters for the quantification of angiogenesis. The proposed quantitative evaluation method was compared with both the ABTs directly using the nine vascular parameters and Pearson correlation, which were consistent. In contrast to the ABTs directly using the vascular parameters, the proposed method can select all the key vascular parameters simultaneously, because all the key vascular parameters were assembled into the sparse PCs with the highest relative importance.
Corrected confidence bands for functional data using principal components.
Goldsmith, J; Greven, S; Crainiceanu, C
2013-03-01
Functional principal components (FPC) analysis is widely used to decompose and express functional observations. Curve estimates implicitly condition on basis functions and other quantities derived from FPC decompositions; however these objects are unknown in practice. In this article, we propose a method for obtaining correct curve estimates by accounting for uncertainty in FPC decompositions. Additionally, pointwise and simultaneous confidence intervals that account for both model- and decomposition-based variability are constructed. Standard mixed model representations of functional expansions are used to construct curve estimates and variances conditional on a specific decomposition. Iterated expectation and variance formulas combine model-based conditional estimates across the distribution of decompositions. A bootstrap procedure is implemented to understand the uncertainty in principal component decomposition quantities. Our method compares favorably to competing approaches in simulation studies that include both densely and sparsely observed functions. We apply our method to sparse observations of CD4 cell counts and to dense white-matter tract profiles. Code for the analyses and simulations is publicly available, and our method is implemented in the R package refund on CRAN. Copyright © 2013, The International Biometric Society.
Corrected Confidence Bands for Functional Data Using Principal Components
Goldsmith, J.; Greven, S.; Crainiceanu, C.
2014-01-01
Functional principal components (FPC) analysis is widely used to decompose and express functional observations. Curve estimates implicitly condition on basis functions and other quantities derived from FPC decompositions; however these objects are unknown in practice. In this article, we propose a method for obtaining correct curve estimates by accounting for uncertainty in FPC decompositions. Additionally, pointwise and simultaneous confidence intervals that account for both model- and decomposition-based variability are constructed. Standard mixed model representations of functional expansions are used to construct curve estimates and variances conditional on a specific decomposition. Iterated expectation and variance formulas combine model-based conditional estimates across the distribution of decompositions. A bootstrap procedure is implemented to understand the uncertainty in principal component decomposition quantities. Our method compares favorably to competing approaches in simulation studies that include both densely and sparsely observed functions. We apply our method to sparse observations of CD4 cell counts and to dense white-matter tract profiles. Code for the analyses and simulations is publicly available, and our method is implemented in the R package refund on CRAN. PMID:23003003
Lin, Nan; Jiang, Junhai; Guo, Shicheng; Xiong, Momiao
2015-01-01
Due to the advancement in sensor technology, the growing large medical image data have the ability to visualize the anatomical changes in biological tissues. As a consequence, the medical images have the potential to enhance the diagnosis of disease, the prediction of clinical outcomes and the characterization of disease progression. But in the meantime, the growing data dimensions pose great methodological and computational challenges for the representation and selection of features in image cluster analysis. To address these challenges, we first extend the functional principal component analysis (FPCA) from one dimension to two dimensions to fully capture the space variation of image the signals. The image signals contain a large number of redundant features which provide no additional information for clustering analysis. The widely used methods for removing the irrelevant features are sparse clustering algorithms using a lasso-type penalty to select the features. However, the accuracy of clustering using a lasso-type penalty depends on the selection of the penalty parameters and the threshold value. In practice, they are difficult to determine. Recently, randomized algorithms have received a great deal of attentions in big data analysis. This paper presents a randomized algorithm for accurate feature selection in image clustering analysis. The proposed method is applied to both the liver and kidney cancer histology image data from the TCGA database. The results demonstrate that the randomized feature selection method coupled with functional principal component analysis substantially outperforms the current sparse clustering algorithms in image cluster analysis. PMID:26196383
Reconstruction Error and Principal Component Based Anomaly Detection in Hyperspectral Imagery
2014-03-27
2003), and (Jackson D. A., 1993). In 1933, Hotelling ( Hotelling , 1933), who coined the term ‘principal components,’ surmised that there was a...goodness of fit and multivariate quality control with the statistic Qi = (Xi(1×p) − X̂i(1×p) )(Xi(1×p) − X̂i(1×p) ) T (20) where, under the...sparsely targeted scenes through SNR or other methods. 5) Customize sorting and histogram construction methods in Multiple PCA to avoid redundancy
Integrative sparse principal component analysis of gene expression data.
Liu, Mengque; Fan, Xinyan; Fang, Kuangnan; Zhang, Qingzhao; Ma, Shuangge
2017-12-01
In the analysis of gene expression data, dimension reduction techniques have been extensively adopted. The most popular one is perhaps the PCA (principal component analysis). To generate more reliable and more interpretable results, the SPCA (sparse PCA) technique has been developed. With the "small sample size, high dimensionality" characteristic of gene expression data, the analysis results generated from a single dataset are often unsatisfactory. Under contexts other than dimension reduction, integrative analysis techniques, which jointly analyze the raw data of multiple independent datasets, have been developed and shown to outperform "classic" meta-analysis and other multidatasets techniques and single-dataset analysis. In this study, we conduct integrative analysis by developing the iSPCA (integrative SPCA) method. iSPCA achieves the selection and estimation of sparse loadings using a group penalty. To take advantage of the similarity across datasets and generate more accurate results, we further impose contrasted penalties. Different penalties are proposed to accommodate different data conditions. Extensive simulations show that iSPCA outperforms the alternatives under a wide spectrum of settings. The analysis of breast cancer and pancreatic cancer data further shows iSPCA's satisfactory performance. © 2017 WILEY PERIODICALS, INC.
NASA Astrophysics Data System (ADS)
He, Shiyuan; Wang, Lifan; Huang, Jianhua Z.
2018-04-01
With growing data from ongoing and future supernova surveys, it is possible to empirically quantify the shapes of SNIa light curves in more detail, and to quantitatively relate the shape parameters with the intrinsic properties of SNIa. Building such relationships is critical in controlling systematic errors associated with supernova cosmology. Based on a collection of well-observed SNIa samples accumulated in the past years, we construct an empirical SNIa light curve model using a statistical method called the functional principal component analysis (FPCA) for sparse and irregularly sampled functional data. Using this method, the entire light curve of an SNIa is represented by a linear combination of principal component functions, and the SNIa is represented by a few numbers called “principal component scores.” These scores are used to establish relations between light curve shapes and physical quantities such as intrinsic color, interstellar dust reddening, spectral line strength, and spectral classes. These relations allow for descriptions of some critical physical quantities based purely on light curve shape parameters. Our study shows that some important spectral feature information is being encoded in the broad band light curves; for instance, we find that the light curve shapes are correlated with the velocity and velocity gradient of the Si II λ6355 line. This is important for supernova surveys (e.g., LSST and WFIRST). Moreover, the FPCA light curve model is used to construct the entire light curve shape, which in turn is used in a functional linear form to adjust intrinsic luminosity when fitting distance models.
Improving Cluster Analysis with Automatic Variable Selection Based on Trees
2014-12-01
regression trees Daisy DISsimilAritY PAM partitioning around medoids PMA penalized multivariate analysis SPC sparse principal components UPGMA unweighted...unweighted pair-group average method ( UPGMA ). This method measures dissimilarities between all objects in two clusters and takes the average value
Large Covariance Estimation by Thresholding Principal Orthogonal Complements
Fan, Jianqing; Liao, Yuan; Mincheva, Martina
2012-01-01
This paper deals with the estimation of a high-dimensional covariance with a conditional sparsity structure and fast-diverging eigenvalues. By assuming sparse error covariance matrix in an approximate factor model, we allow for the presence of some cross-sectional correlation even after taking out common but unobservable factors. We introduce the Principal Orthogonal complEment Thresholding (POET) method to explore such an approximate factor structure with sparsity. The POET estimator includes the sample covariance matrix, the factor-based covariance matrix (Fan, Fan, and Lv, 2008), the thresholding estimator (Bickel and Levina, 2008) and the adaptive thresholding estimator (Cai and Liu, 2011) as specific examples. We provide mathematical insights when the factor analysis is approximately the same as the principal component analysis for high-dimensional data. The rates of convergence of the sparse residual covariance matrix and the conditional sparse covariance matrix are studied under various norms. It is shown that the impact of estimating the unknown factors vanishes as the dimensionality increases. The uniform rates of convergence for the unobserved factors and their factor loadings are derived. The asymptotic results are also verified by extensive simulation studies. Finally, a real data application on portfolio allocation is presented. PMID:24348088
Large Covariance Estimation by Thresholding Principal Orthogonal Complements.
Fan, Jianqing; Liao, Yuan; Mincheva, Martina
2013-09-01
This paper deals with the estimation of a high-dimensional covariance with a conditional sparsity structure and fast-diverging eigenvalues. By assuming sparse error covariance matrix in an approximate factor model, we allow for the presence of some cross-sectional correlation even after taking out common but unobservable factors. We introduce the Principal Orthogonal complEment Thresholding (POET) method to explore such an approximate factor structure with sparsity. The POET estimator includes the sample covariance matrix, the factor-based covariance matrix (Fan, Fan, and Lv, 2008), the thresholding estimator (Bickel and Levina, 2008) and the adaptive thresholding estimator (Cai and Liu, 2011) as specific examples. We provide mathematical insights when the factor analysis is approximately the same as the principal component analysis for high-dimensional data. The rates of convergence of the sparse residual covariance matrix and the conditional sparse covariance matrix are studied under various norms. It is shown that the impact of estimating the unknown factors vanishes as the dimensionality increases. The uniform rates of convergence for the unobserved factors and their factor loadings are derived. The asymptotic results are also verified by extensive simulation studies. Finally, a real data application on portfolio allocation is presented.
A Nonlinear Model for Gene-Based Gene-Environment Interaction.
Sa, Jian; Liu, Xu; He, Tao; Liu, Guifen; Cui, Yuehua
2016-06-04
A vast amount of literature has confirmed the role of gene-environment (G×E) interaction in the etiology of complex human diseases. Traditional methods are predominantly focused on the analysis of interaction between a single nucleotide polymorphism (SNP) and an environmental variable. Given that genes are the functional units, it is crucial to understand how gene effects (rather than single SNP effects) are influenced by an environmental variable to affect disease risk. Motivated by the increasing awareness of the power of gene-based association analysis over single variant based approach, in this work, we proposed a sparse principle component regression (sPCR) model to understand the gene-based G×E interaction effect on complex disease. We first extracted the sparse principal components for SNPs in a gene, then the effect of each principal component was modeled by a varying-coefficient (VC) model. The model can jointly model variants in a gene in which their effects are nonlinearly influenced by an environmental variable. In addition, the varying-coefficient sPCR (VC-sPCR) model has nice interpretation property since the sparsity on the principal component loadings can tell the relative importance of the corresponding SNPs in each component. We applied our method to a human birth weight dataset in Thai population. We analyzed 12,005 genes across 22 chromosomes and found one significant interaction effect using the Bonferroni correction method and one suggestive interaction. The model performance was further evaluated through simulation studies. Our model provides a system approach to evaluate gene-based G×E interaction.
Classical Testing in Functional Linear Models.
Kong, Dehan; Staicu, Ana-Maria; Maity, Arnab
2016-01-01
We extend four tests common in classical regression - Wald, score, likelihood ratio and F tests - to functional linear regression, for testing the null hypothesis, that there is no association between a scalar response and a functional covariate. Using functional principal component analysis, we re-express the functional linear model as a standard linear model, where the effect of the functional covariate can be approximated by a finite linear combination of the functional principal component scores. In this setting, we consider application of the four traditional tests. The proposed testing procedures are investigated theoretically for densely observed functional covariates when the number of principal components diverges. Using the theoretical distribution of the tests under the alternative hypothesis, we develop a procedure for sample size calculation in the context of functional linear regression. The four tests are further compared numerically for both densely and sparsely observed noisy functional data in simulation experiments and using two real data applications.
Classical Testing in Functional Linear Models
Kong, Dehan; Staicu, Ana-Maria; Maity, Arnab
2016-01-01
We extend four tests common in classical regression - Wald, score, likelihood ratio and F tests - to functional linear regression, for testing the null hypothesis, that there is no association between a scalar response and a functional covariate. Using functional principal component analysis, we re-express the functional linear model as a standard linear model, where the effect of the functional covariate can be approximated by a finite linear combination of the functional principal component scores. In this setting, we consider application of the four traditional tests. The proposed testing procedures are investigated theoretically for densely observed functional covariates when the number of principal components diverges. Using the theoretical distribution of the tests under the alternative hypothesis, we develop a procedure for sample size calculation in the context of functional linear regression. The four tests are further compared numerically for both densely and sparsely observed noisy functional data in simulation experiments and using two real data applications. PMID:28955155
The Models-3 Community Multi-scale Air Quality (CMAQ) model, first released by the USEPA in 1999 (Byun and Ching. 1999), continues to be developed and evaluated. The principal components of the CMAQ system include a comprehensive emission processor known as the Sparse Matrix O...
Iris recognition based on robust principal component analysis
NASA Astrophysics Data System (ADS)
Karn, Pradeep; He, Xiao Hai; Yang, Shuai; Wu, Xiao Hong
2014-11-01
Iris images acquired under different conditions often suffer from blur, occlusion due to eyelids and eyelashes, specular reflection, and other artifacts. Existing iris recognition systems do not perform well on these types of images. To overcome these problems, we propose an iris recognition method based on robust principal component analysis. The proposed method decomposes all training images into a low-rank matrix and a sparse error matrix, where the low-rank matrix is used for feature extraction. The sparsity concentration index approach is then applied to validate the recognition result. Experimental results using CASIA V4 and IIT Delhi V1iris image databases showed that the proposed method achieved competitive performances in both recognition accuracy and computational efficiency.
Clutter Mitigation in Echocardiography Using Sparse Signal Separation
Yavneh, Irad
2015-01-01
In ultrasound imaging, clutter artifacts degrade images and may cause inaccurate diagnosis. In this paper, we apply a method called Morphological Component Analysis (MCA) for sparse signal separation with the objective of reducing such clutter artifacts. The MCA approach assumes that the two signals in the additive mix have each a sparse representation under some dictionary of atoms (a matrix), and separation is achieved by finding these sparse representations. In our work, an adaptive approach is used for learning the dictionary from the echo data. MCA is compared to Singular Value Filtering (SVF), a Principal Component Analysis- (PCA-) based filtering technique, and to a high-pass Finite Impulse Response (FIR) filter. Each filter is applied to a simulated hypoechoic lesion sequence, as well as experimental cardiac ultrasound data. MCA is demonstrated in both cases to outperform the FIR filter and obtain results comparable to the SVF method in terms of contrast-to-noise ratio (CNR). Furthermore, MCA shows a lower impact on tissue sections while removing the clutter artifacts. In experimental heart data, MCA obtains in our experiments clutter mitigation with an average CNR improvement of 1.33 dB. PMID:26199622
Semi-blind sparse image reconstruction with application to MRFM.
Park, Se Un; Dobigeon, Nicolas; Hero, Alfred O
2012-09-01
We propose a solution to the image deconvolution problem where the convolution kernel or point spread function (PSF) is assumed to be only partially known. Small perturbations generated from the model are exploited to produce a few principal components explaining the PSF uncertainty in a high-dimensional space. Unlike recent developments on blind deconvolution of natural images, we assume the image is sparse in the pixel basis, a natural sparsity arising in magnetic resonance force microscopy (MRFM). Our approach adopts a Bayesian Metropolis-within-Gibbs sampling framework. The performance of our Bayesian semi-blind algorithm for sparse images is superior to previously proposed semi-blind algorithms such as the alternating minimization algorithm and blind algorithms developed for natural images. We illustrate our myopic algorithm on real MRFM tobacco virus data.
Using dynamic mode decomposition for real-time background/foreground separation in video
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kutz, Jose Nathan; Grosek, Jacob; Brunton, Steven
The technique of dynamic mode decomposition (DMD) is disclosed herein for the purpose of robustly separating video frames into background (low-rank) and foreground (sparse) components in real-time. Foreground/background separation is achieved at the computational cost of just one singular value decomposition (SVD) and one linear equation solve, thus producing results orders of magnitude faster than robust principal component analysis (RPCA). Additional techniques, including techniques for analyzing the video for multi-resolution time-scale components, and techniques for reusing computations to allow processing of streaming video in real time, are also described herein.
MR Image Reconstruction Using Block Matching and Adaptive Kernel Methods.
Schmidt, Johannes F M; Santelli, Claudio; Kozerke, Sebastian
2016-01-01
An approach to Magnetic Resonance (MR) image reconstruction from undersampled data is proposed. Undersampling artifacts are removed using an iterative thresholding algorithm applied to nonlinearly transformed image block arrays. Each block array is transformed using kernel principal component analysis where the contribution of each image block to the transform depends in a nonlinear fashion on the distance to other image blocks. Elimination of undersampling artifacts is achieved by conventional principal component analysis in the nonlinear transform domain, projection onto the main components and back-mapping into the image domain. Iterative image reconstruction is performed by interleaving the proposed undersampling artifact removal step and gradient updates enforcing consistency with acquired k-space data. The algorithm is evaluated using retrospectively undersampled MR cardiac cine data and compared to k-t SPARSE-SENSE, block matching with spatial Fourier filtering and k-t ℓ1-SPIRiT reconstruction. Evaluation of image quality and root-mean-squared-error (RMSE) reveal improved image reconstruction for up to 8-fold undersampled data with the proposed approach relative to k-t SPARSE-SENSE, block matching with spatial Fourier filtering and k-t ℓ1-SPIRiT. In conclusion, block matching and kernel methods can be used for effective removal of undersampling artifacts in MR image reconstruction and outperform methods using standard compressed sensing and ℓ1-regularized parallel imaging methods.
Structured Sparse Principal Components Analysis With the TV-Elastic Net Penalty.
de Pierrefeu, Amicie; Lofstedt, Tommy; Hadj-Selem, Fouad; Dubois, Mathieu; Jardri, Renaud; Fovet, Thomas; Ciuciu, Philippe; Frouin, Vincent; Duchesnay, Edouard
2018-02-01
Principal component analysis (PCA) is an exploratory tool widely used in data analysis to uncover the dominant patterns of variability within a population. Despite its ability to represent a data set in a low-dimensional space, PCA's interpretability remains limited. Indeed, the components produced by PCA are often noisy or exhibit no visually meaningful patterns. Furthermore, the fact that the components are usually non-sparse may also impede interpretation, unless arbitrary thresholding is applied. However, in neuroimaging, it is essential to uncover clinically interpretable phenotypic markers that would account for the main variability in the brain images of a population. Recently, some alternatives to the standard PCA approach, such as sparse PCA (SPCA), have been proposed, their aim being to limit the density of the components. Nonetheless, sparsity alone does not entirely solve the interpretability problem in neuroimaging, since it may yield scattered and unstable components. We hypothesized that the incorporation of prior information regarding the structure of the data may lead to improved relevance and interpretability of brain patterns. We therefore present a simple extension of the popular PCA framework that adds structured sparsity penalties on the loading vectors in order to identify the few stable regions in the brain images that capture most of the variability. Such structured sparsity can be obtained by combining, e.g., and total variation (TV) penalties, where the TV regularization encodes information on the underlying structure of the data. This paper presents the structured SPCA (denoted SPCA-TV) optimization framework and its resolution. We demonstrate SPCA-TV's effectiveness and versatility on three different data sets. It can be applied to any kind of structured data, such as, e.g., -dimensional array images or meshes of cortical surfaces. The gains of SPCA-TV over unstructured approaches (such as SPCA and ElasticNet PCA) or structured approach (such as GraphNet PCA) are significant, since SPCA-TV reveals the variability within a data set in the form of intelligible brain patterns that are easier to interpret and more stable across different samples.
An algorithm for extraction of periodic signals from sparse, irregularly sampled data
NASA Technical Reports Server (NTRS)
Wilcox, J. Z.
1994-01-01
Temporal gaps in discrete sampling sequences produce spurious Fourier components at the intermodulation frequencies of an oscillatory signal and the temporal gaps, thus significantly complicating spectral analysis of such sparsely sampled data. A new fast Fourier transform (FFT)-based algorithm has been developed, suitable for spectral analysis of sparsely sampled data with a relatively small number of oscillatory components buried in background noise. The algorithm's principal idea has its origin in the so-called 'clean' algorithm used to sharpen images of scenes corrupted by atmospheric and sensor aperture effects. It identifies as the signal's 'true' frequency that oscillatory component which, when passed through the same sampling sequence as the original data, produces a Fourier image that is the best match to the original Fourier space. The algorithm has generally met with succession trials with simulated data with a low signal-to-noise ratio, including those of a type similar to hourly residuals for Earth orientation parameters extracted from VLBI data. For eight oscillatory components in the diurnal and semidiurnal bands, all components with an amplitude-noise ratio greater than 0.2 were successfully extracted for all sequences and duty cycles (greater than 0.1) tested; the amplitude-noise ratios of the extracted signals were as low as 0.05 for high duty cycles and long sampling sequences. When, in addition to these high frequencies, strong low-frequency components are present in the data, the low-frequency components are generally eliminated first, by employing a version of the algorithm that searches for non-integer multiples of the discrete FET minimum frequency.
Szczesniak, Rhonda D; Li, Dan; Duan, Leo L; Altaye, Mekibib; Miodovnik, Menachem; Khoury, Jane C
2016-11-01
Objective To identify phenotypes of type 1 diabetes control and associations with maternal/neonatal characteristics based on blood pressure (BP), glucose, and insulin curves during gestation, using a novel functional data analysis approach that accounts for sparse longitudinal patterns of medical monitoring during pregnancy. Methods We performed a retrospective longitudinal cohort study of women with type 1 diabetes whose BP, glucose, and insulin requirements were monitored throughout gestation as part of a program-project grant. Scores from sparse functional principal component analysis (fPCA) were used to classify gestational profiles according to the degree of control for each monitored measure. Phenotypes created using fPCA were compared with respect to maternal and neonatal characteristics and outcome. Results Most of the gestational profile variation in the monitored measures was explained by the first principal component (82-94%). Profiles clustered into three subgroups of high, moderate, or low heterogeneity, relative to the overall mean response. Phenotypes were associated with baseline characteristics, longitudinal changes in glycohemoglobin A1 and weight, and to pregnancy-related outcomes. Conclusion Three distinct longitudinal patterns of glucose, insulin, and BP control were found. By identifying these phenotypes, interventions can be targeted for subgroups at highest risk for compromised outcome, to optimize diabetes management during pregnancy. Thieme Medical Publishers 333 Seventh Avenue, New York, NY 10001, USA.
Sparse PCA with Oracle Property.
Gu, Quanquan; Wang, Zhaoran; Liu, Han
In this paper, we study the estimation of the k -dimensional sparse principal subspace of covariance matrix Σ in the high-dimensional setting. We aim to recover the oracle principal subspace solution, i.e., the principal subspace estimator obtained assuming the true support is known a priori. To this end, we propose a family of estimators based on the semidefinite relaxation of sparse PCA with novel regularizations. In particular, under a weak assumption on the magnitude of the population projection matrix, one estimator within this family exactly recovers the true support with high probability, has exact rank- k , and attains a [Formula: see text] statistical rate of convergence with s being the subspace sparsity level and n the sample size. Compared to existing support recovery results for sparse PCA, our approach does not hinge on the spiked covariance model or the limited correlation condition. As a complement to the first estimator that enjoys the oracle property, we prove that, another estimator within the family achieves a sharper statistical rate of convergence than the standard semidefinite relaxation of sparse PCA, even when the previous assumption on the magnitude of the projection matrix is violated. We validate the theoretical results by numerical experiments on synthetic datasets.
Sparse PCA with Oracle Property
Gu, Quanquan; Wang, Zhaoran; Liu, Han
2014-01-01
In this paper, we study the estimation of the k-dimensional sparse principal subspace of covariance matrix Σ in the high-dimensional setting. We aim to recover the oracle principal subspace solution, i.e., the principal subspace estimator obtained assuming the true support is known a priori. To this end, we propose a family of estimators based on the semidefinite relaxation of sparse PCA with novel regularizations. In particular, under a weak assumption on the magnitude of the population projection matrix, one estimator within this family exactly recovers the true support with high probability, has exact rank-k, and attains a s/n statistical rate of convergence with s being the subspace sparsity level and n the sample size. Compared to existing support recovery results for sparse PCA, our approach does not hinge on the spiked covariance model or the limited correlation condition. As a complement to the first estimator that enjoys the oracle property, we prove that, another estimator within the family achieves a sharper statistical rate of convergence than the standard semidefinite relaxation of sparse PCA, even when the previous assumption on the magnitude of the projection matrix is violated. We validate the theoretical results by numerical experiments on synthetic datasets. PMID:25684971
NASA Astrophysics Data System (ADS)
Santarius, John; Navarro, Marcos; Michalak, Matthew; Fancher, Aaron; Kulcinski, Gerald; Bonomo, Richard
2016-10-01
A newly initiated research project will be described that investigates methods for detecting shielded special nuclear materials by combining multi-dimensional neutron sources, forward/adjoint calculations modeling neutron and gamma transport, and sparse data analysis of detector signals. The key tasks for this project are: (1) developing a radiation transport capability for use in optimizing adaptive-geometry, inertial-electrostatic confinement (IEC) neutron source/detector configurations for neutron pulses distributed in space and/or phased in time; (2) creating distributed-geometry, gas-target, IEC fusion neutron sources; (3) applying sparse data and noise reduction algorithms, such as principal component analysis (PCA) and wavelet transform analysis, to enhance detection fidelity; and (4) educating graduate and undergraduate students. Funded by DHS DNDO Project 2015-DN-077-ARI095.
An approach for quantitative image quality analysis for CT
NASA Astrophysics Data System (ADS)
Rahimi, Amir; Cochran, Joe; Mooney, Doug; Regensburger, Joe
2016-03-01
An objective and standardized approach to assess image quality of Compute Tomography (CT) systems is required in a wide variety of imaging processes to identify CT systems appropriate for a given application. We present an overview of the framework we have developed to help standardize and to objectively assess CT image quality for different models of CT scanners used for security applications. Within this framework, we have developed methods to quantitatively measure metrics that should correlate with feature identification, detection accuracy and precision, and image registration capabilities of CT machines and to identify strengths and weaknesses in different CT imaging technologies in transportation security. To that end we have designed, developed and constructed phantoms that allow for systematic and repeatable measurements of roughly 88 image quality metrics, representing modulation transfer function, noise equivalent quanta, noise power spectra, slice sensitivity profiles, streak artifacts, CT number uniformity, CT number consistency, object length accuracy, CT number path length consistency, and object registration. Furthermore, we have developed a sophisticated MATLAB based image analysis tool kit to analyze CT generated images of phantoms and report these metrics in a format that is standardized across the considered models of CT scanners, allowing for comparative image quality analysis within a CT model or between different CT models. In addition, we have developed a modified sparse principal component analysis (SPCA) method to generate a modified set of PCA components as compared to the standard principal component analysis (PCA) with sparse loadings in conjunction with Hotelling T2 statistical analysis method to compare, qualify, and detect faults in the tested systems.
NASA Astrophysics Data System (ADS)
Zhang, Han; Chen, Xuefeng; Du, Zhaohui; Li, Xiang; Yan, Ruqiang
2016-04-01
Fault information of aero-engine bearings presents two particular phenomena, i.e., waveform distortion and impulsive feature frequency band dispersion, which leads to a challenging problem for current techniques of bearing fault diagnosis. Moreover, although many progresses of sparse representation theory have been made in feature extraction of fault information, the theory also confronts inevitable performance degradation due to the fact that relatively weak fault information has not sufficiently prominent and sparse representations. Therefore, a novel nonlocal sparse model (coined NLSM) and its algorithm framework has been proposed in this paper, which goes beyond simple sparsity by introducing more intrinsic structures of feature information. This work adequately exploits the underlying prior information that feature information exhibits nonlocal self-similarity through clustering similar signal fragments and stacking them together into groups. Within this framework, the prior information is transformed into a regularization term and a sparse optimization problem, which could be solved through block coordinate descent method (BCD), is formulated. Additionally, the adaptive structural clustering sparse dictionary learning technique, which utilizes k-Nearest-Neighbor (kNN) clustering and principal component analysis (PCA) learning, is adopted to further enable sufficient sparsity of feature information. Moreover, the selection rule of regularization parameter and computational complexity are described in detail. The performance of the proposed framework is evaluated through numerical experiment and its superiority with respect to the state-of-the-art method in the field is demonstrated through the vibration signals of experimental rig of aircraft engine bearings.
Boundary layer noise subtraction in hydrodynamic tunnel using robust principal component analysis.
Amailland, Sylvain; Thomas, Jean-Hugh; Pézerat, Charles; Boucheron, Romuald
2018-04-01
The acoustic study of propellers in a hydrodynamic tunnel is of paramount importance during the design process, but can involve significant difficulties due to the boundary layer noise (BLN). Indeed, advanced denoising methods are needed to recover the acoustic signal in case of poor signal-to-noise ratio. The technique proposed in this paper is based on the decomposition of the wall-pressure cross-spectral matrix (CSM) by taking advantage of both the low-rank property of the acoustic CSM and the sparse property of the BLN CSM. Thus, the algorithm belongs to the class of robust principal component analysis (RPCA), which derives from the widely used principal component analysis. If the BLN is spatially decorrelated, the proposed RPCA algorithm can blindly recover the acoustical signals even for negative signal-to-noise ratio. Unfortunately, in a realistic case, acoustic signals recorded in a hydrodynamic tunnel show that the noise may be partially correlated. A prewhitening strategy is then considered in order to take into account the spatially coherent background noise. Numerical simulations and experimental results show an improvement in terms of BLN reduction in the large hydrodynamic tunnel. The effectiveness of the denoising method is also investigated in the context of acoustic source localization.
SD-SEM: sparse-dense correspondence for 3D reconstruction of microscopic samples.
Baghaie, Ahmadreza; Tafti, Ahmad P; Owen, Heather A; D'Souza, Roshan M; Yu, Zeyun
2017-06-01
Scanning electron microscopy (SEM) imaging has been a principal component of many studies in biomedical, mechanical, and materials sciences since its emergence. Despite the high resolution of captured images, they remain two-dimensional (2D). In this work, a novel framework using sparse-dense correspondence is introduced and investigated for 3D reconstruction of stereo SEM images. SEM micrographs from microscopic samples are captured by tilting the specimen stage by a known angle. The pair of SEM micrographs is then rectified using sparse scale invariant feature transform (SIFT) features/descriptors and a contrario RANSAC for matching outlier removal to ensure a gross horizontal displacement between corresponding points. This is followed by dense correspondence estimation using dense SIFT descriptors and employing a factor graph representation of the energy minimization functional and loopy belief propagation (LBP) as means of optimization. Given the pixel-by-pixel correspondence and the tilt angle of the specimen stage during the acquisition of micrographs, depth can be recovered. Extensive tests reveal the strength of the proposed method for high-quality reconstruction of microscopic samples. Copyright © 2017 Elsevier Ltd. All rights reserved.
Sparse PCA corrects for cell type heterogeneity in epigenome-wide association studies.
Rahmani, Elior; Zaitlen, Noah; Baran, Yael; Eng, Celeste; Hu, Donglei; Galanter, Joshua; Oh, Sam; Burchard, Esteban G; Eskin, Eleazar; Zou, James; Halperin, Eran
2016-05-01
In epigenome-wide association studies (EWAS), different methylation profiles of distinct cell types may lead to false discoveries. We introduce ReFACTor, a method based on principal component analysis (PCA) and designed for the correction of cell type heterogeneity in EWAS. ReFACTor does not require knowledge of cell counts, and it provides improved estimates of cell type composition, resulting in improved power and control for false positives in EWAS. Corresponding software is available at http://www.cs.tau.ac.il/~heran/cozygene/software/refactor.html.
Structured functional additive regression in reproducing kernel Hilbert spaces.
Zhu, Hongxiao; Yao, Fang; Zhang, Hao Helen
2014-06-01
Functional additive models (FAMs) provide a flexible yet simple framework for regressions involving functional predictors. The utilization of data-driven basis in an additive rather than linear structure naturally extends the classical functional linear model. However, the critical issue of selecting nonlinear additive components has been less studied. In this work, we propose a new regularization framework for the structure estimation in the context of Reproducing Kernel Hilbert Spaces. The proposed approach takes advantage of the functional principal components which greatly facilitates the implementation and the theoretical analysis. The selection and estimation are achieved by penalized least squares using a penalty which encourages the sparse structure of the additive components. Theoretical properties such as the rate of convergence are investigated. The empirical performance is demonstrated through simulation studies and a real data application.
Yielding physically-interpretable emulators - A Sparse PCA approach
NASA Astrophysics Data System (ADS)
Galelli, S.; Alsahaf, A.; Giuliani, M.; Castelletti, A.
2015-12-01
Projection-based techniques, such as Principal Orthogonal Decomposition (POD), are a common approach to surrogate high-fidelity process-based models by lower order dynamic emulators. With POD, the dimensionality reduction is achieved by using observations, or 'snapshots' - generated with the high-fidelity model -, to project the entire set of input and state variables of this model onto a smaller set of basis functions that account for most of the variability in the data. While reduction efficiency and variance control of POD techniques are usually very high, the resulting emulators are structurally highly complex and can hardly be given a physically meaningful interpretation as each basis is a projection of the entire set of inputs and states. In this work, we propose a novel approach based on Sparse Principal Component Analysis (SPCA) that combines the several assets of POD methods with the potential for ex-post interpretation of the emulator structure. SPCA reduces the number of non-zero coefficients in the basis functions by identifying a sparse matrix of coefficients. While the resulting set of basis functions may retain less variance of the snapshots, the presence of a few non-zero coefficients assists in the interpretation of the underlying physical processes. The SPCA approach is tested on the reduction of a 1D hydro-ecological model (DYRESM-CAEDYM) used to describe the main ecological and hydrodynamic processes in Tono Dam, Japan. An experimental comparison against a standard POD approach shows that SPCA achieves the same accuracy in emulating a given output variable - for the same level of dimensionality reduction - while yielding better insights of the main process dynamics.
Multiple View Zenith Angle Observations of Reflectance From Ponderosa Pine Stands
NASA Technical Reports Server (NTRS)
Johnson, Lee F.; Lawless, James G. (Technical Monitor)
1994-01-01
Reflectance factors (RF(lambda)) from dense and sparse ponderosa pine (Pinus ponderosa) stands, derived from radiance data collected in the solar principal plane by the Advanced Solid-State Array Spectro-radiometer (ASAS), were examined as a function of view zenith angle (theta(sub v)). RF(lambda) was maximized with theta(sub v) nearest the solar retrodirection, and minimized near the specular direction throughout the ASAS spectral region. The dense stand had much higher RF anisotropy (ma)dmurn RF is minimum RF) in the red region than did the sparse stand (relative differences of 5.3 vs. 2.75, respectively), as a function of theta(sub v), due to the shadow component in the canopy. Anisotropy in the near-infrared (NIR) was more similar between the two stands (2.5 in the dense stand and 2.25 in the sparse stand); the dense stand exhibited a greater hotspot effect than 20 the sparse stand in this spectral region. Two common vegetation transforms, the NIR/red ratio and the normalized difference vegetation index (NDVI), both showed a theta(sub v) dependence for the dense stand. Minimum values occurred near the retrodirection and maximum values occurred near the specular direction. Greater relative differences were noted for the NIR/red ratio (2.1) than for the NDVI (1.3). The sparse stand showed no obvious dependence on theta(sub v) for either transform, except for slightly elevated values toward the specular direction.
Cocco, S; Monasson, R; Sessak, V
2011-05-01
We consider the problem of inferring the interactions between a set of N binary variables from the knowledge of their frequencies and pairwise correlations. The inference framework is based on the Hopfield model, a special case of the Ising model where the interaction matrix is defined through a set of patterns in the variable space, and is of rank much smaller than N. We show that maximum likelihood inference is deeply related to principal component analysis when the amplitude of the pattern components ξ is negligible compared to √N. Using techniques from statistical mechanics, we calculate the corrections to the patterns to the first order in ξ/√N. We stress the need to generalize the Hopfield model and include both attractive and repulsive patterns in order to correctly infer networks with sparse and strong interactions. We present a simple geometrical criterion to decide how many attractive and repulsive patterns should be considered as a function of the sampling noise. We moreover discuss how many sampled configurations are required for a good inference, as a function of the system size N and of the amplitude ξ. The inference approach is illustrated on synthetic and biological data.
Structured functional additive regression in reproducing kernel Hilbert spaces
Zhu, Hongxiao; Yao, Fang; Zhang, Hao Helen
2013-01-01
Summary Functional additive models (FAMs) provide a flexible yet simple framework for regressions involving functional predictors. The utilization of data-driven basis in an additive rather than linear structure naturally extends the classical functional linear model. However, the critical issue of selecting nonlinear additive components has been less studied. In this work, we propose a new regularization framework for the structure estimation in the context of Reproducing Kernel Hilbert Spaces. The proposed approach takes advantage of the functional principal components which greatly facilitates the implementation and the theoretical analysis. The selection and estimation are achieved by penalized least squares using a penalty which encourages the sparse structure of the additive components. Theoretical properties such as the rate of convergence are investigated. The empirical performance is demonstrated through simulation studies and a real data application. PMID:25013362
Identification of spatially-localized initial conditions via sparse PCA
NASA Astrophysics Data System (ADS)
Dwivedi, Anubhav; Jovanovic, Mihailo
2017-11-01
Principal Component Analysis involves maximization of a quadratic form subject to a quadratic constraint on the initial flow perturbations and it is routinely used to identify the most energetic flow structures. For general flow configurations, principal components can be efficiently computed via power iteration of the forward and adjoint governing equations. However, the resulting flow structures typically have a large spatial support leading to a question of physical realizability. To obtain spatially-localized structures, we modify the quadratic constraint on the initial condition to include a convex combination with an additional regularization term which promotes sparsity in the physical domain. We formulate this constrained optimization problem as a nonlinear eigenvalue problem and employ an inverse power-iteration-based method to solve it. The resulting solution is guaranteed to converge to a nonlinear eigenvector which becomes increasingly localized as our emphasis on sparsity increases. We use several fluids examples to demonstrate that our method indeed identifies the most energetic initial perturbations that are spatially compact. This work was supported by Office of Naval Research through Grant Number N00014-15-1-2522.
Ping-Keng Jao; Yuan-Pin Lin; Yi-Hsuan Yang; Tzyy-Ping Jung
2015-08-01
An emerging challenge for emotion classification using electroencephalography (EEG) is how to effectively alleviate day-to-day variability in raw data. This study employed the robust principal component analysis (RPCA) to address the problem with a posed hypothesis that background or emotion-irrelevant EEG perturbations lead to certain variability across days and somehow submerge emotion-related EEG dynamics. The empirical results of this study evidently validated our hypothesis and demonstrated the RPCA's feasibility through the analysis of a five-day dataset of 12 subjects. The RPCA allowed tackling the sparse emotion-relevant EEG dynamics from the accompanied background perturbations across days. Sequentially, leveraging the RPCA-purified EEG trials from more days appeared to improve the emotion-classification performance steadily, which was not found in the case using the raw EEG features. Therefore, incorporating the RPCA with existing emotion-aware machine-learning frameworks on a longitudinal dataset of each individual may shed light on the development of a robust affective brain-computer interface (ABCI) that can alleviate ecological inter-day variability.
Zhao, Ruifeng; Zhou, Huarong; Qian, Yibing; Zhang, Jianjun
2006-06-01
A total of 16 quadrants of wetlands and surrounding lands in the mid- and lower reaches of Tarim River were surveyed, and the data about the characteristics of plant communities and environmental factors were collected and counted. By using PCA (principal component analysis) ordination and regression procedure, the distribution patterns of plant communities and the relationships between the characteristics of plant community structure and environmental factors were analyzed. The results showed that the distribution of the plant communities was closely related to soil moisture, salt, and nutrient contents. The accumulative contribution rate of soil moisture and salt contents in the first principal component accounted for 35.70%, and that of soil nutrient content in the second principal component reached 25.97%. There were 4 types of habitats for the plant community distribution, i. e., fenny--light salt--medium nutrient, moist--medium salt--medium nutrient, mesophytic--medium salt--low nutrient, and medium xerophytic-heavy salt--low nutrient. Along these habitats, swamp vegetation, meadow vegetation, riparian sparse forest, halophytic desert, and salinized shrub were distributed. In the wetlands and surrounding lands of mid- and lower reaches of Tarim River, the ecological dominance of the plant communities was markedly and unitary-linearly correlated with the compound gradient of soil moisture and salt contents. The relationships between species diversity, ecological dominance, and compound gradient of soil moisture and salt contents were significantly accorded to binary-linear regression model.
A method of vehicle license plate recognition based on PCANet and compressive sensing
NASA Astrophysics Data System (ADS)
Ye, Xianyi; Min, Feng
2018-03-01
The manual feature extraction of the traditional method for vehicle license plates has no good robustness to change in diversity. And the high feature dimension that is extracted with Principal Component Analysis Network (PCANet) leads to low classification efficiency. For solving these problems, a method of vehicle license plate recognition based on PCANet and compressive sensing is proposed. First, PCANet is used to extract the feature from the images of characters. And then, the sparse measurement matrix which is a very sparse matrix and consistent with Restricted Isometry Property (RIP) condition of the compressed sensing is used to reduce the dimensions of extracted features. Finally, the Support Vector Machine (SVM) is used to train and recognize the features whose dimension has been reduced. Experimental results demonstrate that the proposed method has better performance than Convolutional Neural Network (CNN) in the recognition and time. Compared with no compression sensing, the proposed method has lower feature dimension for the increase of efficiency.
Disentangling giant component and finite cluster contributions in sparse random matrix spectra.
Kühn, Reimer
2016-04-01
We describe a method for disentangling giant component and finite cluster contributions to sparse random matrix spectra, using sparse symmetric random matrices defined on Erdős-Rényi graphs as an example and test bed. Our methods apply to sparse matrices defined in terms of arbitrary graphs in the configuration model class, as long as they have finite mean degree.
TESTING HIGH-DIMENSIONAL COVARIANCE MATRICES, WITH APPLICATION TO DETECTING SCHIZOPHRENIA RISK GENES
Zhu, Lingxue; Lei, Jing; Devlin, Bernie; Roeder, Kathryn
2017-01-01
Scientists routinely compare gene expression levels in cases versus controls in part to determine genes associated with a disease. Similarly, detecting case-control differences in co-expression among genes can be critical to understanding complex human diseases; however statistical methods have been limited by the high dimensional nature of this problem. In this paper, we construct a sparse-Leading-Eigenvalue-Driven (sLED) test for comparing two high-dimensional covariance matrices. By focusing on the spectrum of the differential matrix, sLED provides a novel perspective that accommodates what we assume to be common, namely sparse and weak signals in gene expression data, and it is closely related with Sparse Principal Component Analysis. We prove that sLED achieves full power asymptotically under mild assumptions, and simulation studies verify that it outperforms other existing procedures under many biologically plausible scenarios. Applying sLED to the largest gene-expression dataset obtained from post-mortem brain tissue from Schizophrenia patients and controls, we provide a novel list of genes implicated in Schizophrenia and reveal intriguing patterns in gene co-expression change for Schizophrenia subjects. We also illustrate that sLED can be generalized to compare other gene-gene “relationship” matrices that are of practical interest, such as the weighted adjacency matrices. PMID:29081874
Zhu, Lingxue; Lei, Jing; Devlin, Bernie; Roeder, Kathryn
2017-09-01
Scientists routinely compare gene expression levels in cases versus controls in part to determine genes associated with a disease. Similarly, detecting case-control differences in co-expression among genes can be critical to understanding complex human diseases; however statistical methods have been limited by the high dimensional nature of this problem. In this paper, we construct a sparse-Leading-Eigenvalue-Driven (sLED) test for comparing two high-dimensional covariance matrices. By focusing on the spectrum of the differential matrix, sLED provides a novel perspective that accommodates what we assume to be common, namely sparse and weak signals in gene expression data, and it is closely related with Sparse Principal Component Analysis. We prove that sLED achieves full power asymptotically under mild assumptions, and simulation studies verify that it outperforms other existing procedures under many biologically plausible scenarios. Applying sLED to the largest gene-expression dataset obtained from post-mortem brain tissue from Schizophrenia patients and controls, we provide a novel list of genes implicated in Schizophrenia and reveal intriguing patterns in gene co-expression change for Schizophrenia subjects. We also illustrate that sLED can be generalized to compare other gene-gene "relationship" matrices that are of practical interest, such as the weighted adjacency matrices.
NASA Astrophysics Data System (ADS)
Wang, Hai-Yan; Song, Chao; Sha, Min; Liu, Jun; Li, Li-Ping; Zhang, Zheng-Yong
2018-05-01
Raman spectra and ultraviolet-visible absorption spectra of four different geographic origins of Radix Astragali were collected. These data were analyzed using kernel principal component analysis combined with sparse representation classification. The results showed that the recognition rate reached 70.44% using Raman spectra for data input and 90.34% using ultraviolet-visible absorption spectra for data input. A new fusion method based on Raman combined with ultraviolet-visible data was investigated and the recognition rate was increased to 96.43%. The experimental results suggested that the proposed data fusion method effectively improved the utilization rate of the original data.
Zeng, Dong; Xie, Qi; Cao, Wenfei; Lin, Jiahui; Zhang, Hao; Zhang, Shanli; Huang, Jing; Bian, Zhaoying; Meng, Deyu; Xu, Zongben; Liang, Zhengrong; Chen, Wufan
2017-01-01
Dynamic cerebral perfusion computed tomography (DCPCT) has the ability to evaluate the hemodynamic information throughout the brain. However, due to multiple 3-D image volume acquisitions protocol, DCPCT scanning imposes high radiation dose on the patients with growing concerns. To address this issue, in this paper, based on the robust principal component analysis (RPCA, or equivalently the low-rank and sparsity decomposition) model and the DCPCT imaging procedure, we propose a new DCPCT image reconstruction algorithm to improve low dose DCPCT and perfusion maps quality via using a powerful measure, called Kronecker-basis-representation tensor sparsity regularization, for measuring low-rankness extent of a tensor. For simplicity, the first proposed model is termed tensor-based RPCA (T-RPCA). Specifically, the T-RPCA model views the DCPCT sequential images as a mixture of low-rank, sparse, and noise components to describe the maximum temporal coherence of spatial structure among phases in a tensor framework intrinsically. Moreover, the low-rank component corresponds to the “background” part with spatial–temporal correlations, e.g., static anatomical contribution, which is stationary over time about structure, and the sparse component represents the time-varying component with spatial–temporal continuity, e.g., dynamic perfusion enhanced information, which is approximately sparse over time. Furthermore, an improved nonlocal patch-based T-RPCA (NL-T-RPCA) model which describes the 3-D block groups of the “background” in a tensor is also proposed. The NL-T-RPCA model utilizes the intrinsic characteristics underlying the DCPCT images, i.e., nonlocal self-similarity and global correlation. Two efficient algorithms using alternating direction method of multipliers are developed to solve the proposed T-RPCA and NL-T-RPCA models, respectively. Extensive experiments with a digital brain perfusion phantom, preclinical monkey data, and clinical patient data clearly demonstrate that the two proposed models can achieve more gains than the existing popular algorithms in terms of both quantitative and visual quality evaluations from low-dose acquisitions, especially as low as 20 mAs. PMID:28880164
Classification of vegetation types in military region
NASA Astrophysics Data System (ADS)
Gonçalves, Miguel; Silva, Jose Silvestre; Bioucas-Dias, Jose
2015-10-01
In decision-making process regarding planning and execution of military operations, the terrain is a determining factor. Aerial photographs are a source of vital information for the success of an operation in hostile region, namely when the cartographic information behind enemy lines is scarce or non-existent. The objective of present work is the development of a tool capable of processing aerial photos. The methodology implemented starts with feature extraction, followed by the application of an automatic selector of features. The next step, using the k-fold cross validation technique, estimates the input parameters for the following classifiers: Sparse Multinomial Logist Regression (SMLR), K Nearest Neighbor (KNN), Linear Classifier using Principal Component Expansion on the Joint Data (PCLDC) and Multi-Class Support Vector Machine (MSVM). These classifiers were used in two different studies with distinct objectives: discrimination of vegetation's density and identification of vegetation's main components. It was found that the best classifier on the first approach is the Sparse Logistic Multinomial Regression (SMLR). On the second approach, the implemented methodology applied to high resolution images showed that the better performance was achieved by KNN classifier and PCLDC. Comparing the two approaches there is a multiscale issue, in which for different resolutions, the best solution to the problem requires different classifiers and the extraction of different features.
LiDAR point classification based on sparse representation
NASA Astrophysics Data System (ADS)
Li, Nan; Pfeifer, Norbert; Liu, Chun
2017-04-01
In order to combine the initial spatial structure and features of LiDAR data for accurate classification. The LiDAR data is represented as a 4-order tensor. Sparse representation for classification(SRC) method is used for LiDAR tensor classification. It turns out SRC need only a few of training samples from each class, meanwhile can achieve good classification result. Multiple features are extracted from raw LiDAR points to generate a high-dimensional vector at each point. Then the LiDAR tensor is built by the spatial distribution and feature vectors of the point neighborhood. The entries of LiDAR tensor are accessed via four indexes. Each index is called mode: three spatial modes in direction X ,Y ,Z and one feature mode. Sparse representation for classification(SRC) method is proposed in this paper. The sparsity algorithm is to find the best represent the test sample by sparse linear combination of training samples from a dictionary. To explore the sparsity of LiDAR tensor, the tucker decomposition is used. It decomposes a tensor into a core tensor multiplied by a matrix along each mode. Those matrices could be considered as the principal components in each mode. The entries of core tensor show the level of interaction between the different components. Therefore, the LiDAR tensor can be approximately represented by a sparse tensor multiplied by a matrix selected from a dictionary along each mode. The matrices decomposed from training samples are arranged as initial elements in the dictionary. By dictionary learning, a reconstructive and discriminative structure dictionary along each mode is built. The overall structure dictionary composes of class-specified sub-dictionaries. Then the sparse core tensor is calculated by tensor OMP(Orthogonal Matching Pursuit) method based on dictionaries along each mode. It is expected that original tensor should be well recovered by sub-dictionary associated with relevant class, while entries in the sparse tensor associated with other classed should be nearly zero. Therefore, SRC use the reconstruction error associated with each class to do data classification. A section of airborne LiDAR points of Vienna city is used and classified into 6classes: ground, roofs, vegetation, covered ground, walls and other points. Only 6 training samples from each class are taken. For the final classification result, ground and covered ground are merged into one same class(ground). The classification accuracy for ground is 94.60%, roof is 95.47%, vegetation is 85.55%, wall is 76.17%, other object is 20.39%.
Superintendents' Perceptions of Online Education for Principals
ERIC Educational Resources Information Center
Rutledge, David
2017-01-01
The proliferation of online educational leadership programs has led many aspiring principal candidates to complete their entire degree or licensure program online. While the online education is convenient and flexible for the learner, there is sparse research to determine the acceptability of online degrees in educational leadership, especially…
Background recovery via motion-based robust principal component analysis with matrix factorization
NASA Astrophysics Data System (ADS)
Pan, Peng; Wang, Yongli; Zhou, Mingyuan; Sun, Zhipeng; He, Guoping
2018-03-01
Background recovery is a key technique in video analysis, but it still suffers from many challenges, such as camouflage, lighting changes, and diverse types of image noise. Robust principal component analysis (RPCA), which aims to recover a low-rank matrix and a sparse matrix, is a general framework for background recovery. The nuclear norm is widely used as a convex surrogate for the rank function in RPCA, which requires computing the singular value decomposition (SVD), a task that is increasingly costly as matrix sizes and ranks increase. However, matrix factorization greatly reduces the dimension of the matrix for which the SVD must be computed. Motion information has been shown to improve low-rank matrix recovery in RPCA, but this method still finds it difficult to handle original video data sets because of its batch-mode formulation and implementation. Hence, in this paper, we propose a motion-assisted RPCA model with matrix factorization (FM-RPCA) for background recovery. Moreover, an efficient linear alternating direction method of multipliers with a matrix factorization (FL-ADM) algorithm is designed for solving the proposed FM-RPCA model. Experimental results illustrate that the method provides stable results and is more efficient than the current state-of-the-art algorithms.
Sparse approximation of currents for statistics on curves and surfaces.
Durrleman, Stanley; Pennec, Xavier; Trouvé, Alain; Ayache, Nicholas
2008-01-01
Computing, processing, visualizing statistics on shapes like curves or surfaces is a real challenge with many applications ranging from medical image analysis to computational geometry. Modelling such geometrical primitives with currents avoids feature-based approach as well as point-correspondence method. This framework has been proved to be powerful to register brain surfaces or to measure geometrical invariants. However, if the state-of-the-art methods perform efficiently pairwise registrations, new numerical schemes are required to process groupwise statistics due to an increasing complexity when the size of the database is growing. Statistics such as mean and principal modes of a set of shapes often have a heavy and highly redundant representation. We propose therefore to find an adapted basis on which mean and principal modes have a sparse decomposition. Besides the computational improvement, this sparse representation offers a way to visualize and interpret statistics on currents. Experiments show the relevance of the approach on 34 sets of 70 sulcal lines and on 50 sets of 10 meshes of deep brain structures.
New machine-learning algorithms for prediction of Parkinson's disease
NASA Astrophysics Data System (ADS)
Mandal, Indrajit; Sairam, N.
2014-03-01
This article presents an enhanced prediction accuracy of diagnosis of Parkinson's disease (PD) to prevent the delay and misdiagnosis of patients using the proposed robust inference system. New machine-learning methods are proposed and performance comparisons are based on specificity, sensitivity, accuracy and other measurable parameters. The robust methods of treating Parkinson's disease (PD) includes sparse multinomial logistic regression, rotation forest ensemble with support vector machines and principal components analysis, artificial neural networks, boosting methods. A new ensemble method comprising of the Bayesian network optimised by Tabu search algorithm as classifier and Haar wavelets as projection filter is used for relevant feature selection and ranking. The highest accuracy obtained by linear logistic regression and sparse multinomial logistic regression is 100% and sensitivity, specificity of 0.983 and 0.996, respectively. All the experiments are conducted over 95% and 99% confidence levels and establish the results with corrected t-tests. This work shows a high degree of advancement in software reliability and quality of the computer-aided diagnosis system and experimentally shows best results with supportive statistical inference.
Hamy, Valentin; Dikaios, Nikolaos; Punwani, Shonit; Melbourne, Andrew; Latifoltojar, Arash; Makanyanga, Jesica; Chouhan, Manil; Helbren, Emma; Menys, Alex; Taylor, Stuart; Atkinson, David
2014-02-01
Motion correction in Dynamic Contrast Enhanced (DCE-) MRI is challenging because rapid intensity changes can compromise common (intensity based) registration algorithms. In this study we introduce a novel registration technique based on robust principal component analysis (RPCA) to decompose a given time-series into a low rank and a sparse component. This allows robust separation of motion components that can be registered, from intensity variations that are left unchanged. This Robust Data Decomposition Registration (RDDR) is demonstrated on both simulated and a wide range of clinical data. Robustness to different types of motion and breathing choices during acquisition is demonstrated for a variety of imaged organs including liver, small bowel and prostate. The analysis of clinically relevant regions of interest showed both a decrease of error (15-62% reduction following registration) in tissue time-intensity curves and improved areas under the curve (AUC60) at early enhancement. Copyright © 2013 The Authors. Published by Elsevier B.V. All rights reserved.
Robust Small Target Co-Detection from Airborne Infrared Image Sequences.
Gao, Jingli; Wen, Chenglin; Liu, Meiqin
2017-09-29
In this paper, a novel infrared target co-detection model combining the self-correlation features of backgrounds and the commonality features of targets in the spatio-temporal domain is proposed to detect small targets in a sequence of infrared images with complex backgrounds. Firstly, a dense target extraction model based on nonlinear weights is proposed, which can better suppress background of images and enhance small targets than weights of singular values. Secondly, a sparse target extraction model based on entry-wise weighted robust principal component analysis is proposed. The entry-wise weight adaptively incorporates structural prior in terms of local weighted entropy, thus, it can extract real targets accurately and suppress background clutters efficiently. Finally, the commonality of targets in the spatio-temporal domain are used to construct target refinement model for false alarms suppression and target confirmation. Since real targets could appear in both of the dense and sparse reconstruction maps of a single frame, and form trajectories after tracklet association of consecutive frames, the location correlation of the dense and sparse reconstruction maps for a single frame and tracklet association of the location correlation maps for successive frames have strong ability to discriminate between small targets and background clutters. Experimental results demonstrate that the proposed small target co-detection method can not only suppress background clutters effectively, but also detect targets accurately even if with target-like interference.
Nonlinear spike-and-slab sparse coding for interpretable image encoding.
Shelton, Jacquelyn A; Sheikh, Abdul-Saboor; Bornschein, Jörg; Sterne, Philip; Lücke, Jörg
2015-01-01
Sparse coding is a popular approach to model natural images but has faced two main challenges: modelling low-level image components (such as edge-like structures and their occlusions) and modelling varying pixel intensities. Traditionally, images are modelled as a sparse linear superposition of dictionary elements, where the probabilistic view of this problem is that the coefficients follow a Laplace or Cauchy prior distribution. We propose a novel model that instead uses a spike-and-slab prior and nonlinear combination of components. With the prior, our model can easily represent exact zeros for e.g. the absence of an image component, such as an edge, and a distribution over non-zero pixel intensities. With the nonlinearity (the nonlinear max combination rule), the idea is to target occlusions; dictionary elements correspond to image components that can occlude each other. There are major consequences of the model assumptions made by both (non)linear approaches, thus the main goal of this paper is to isolate and highlight differences between them. Parameter optimization is analytically and computationally intractable in our model, thus as a main contribution we design an exact Gibbs sampler for efficient inference which we can apply to higher dimensional data using latent variable preselection. Results on natural and artificial occlusion-rich data with controlled forms of sparse structure show that our model can extract a sparse set of edge-like components that closely match the generating process, which we refer to as interpretable components. Furthermore, the sparseness of the solution closely follows the ground-truth number of components/edges in the images. The linear model did not learn such edge-like components with any level of sparsity. This suggests that our model can adaptively well-approximate and characterize the meaningful generation process.
Nonlinear Spike-And-Slab Sparse Coding for Interpretable Image Encoding
Shelton, Jacquelyn A.; Sheikh, Abdul-Saboor; Bornschein, Jörg; Sterne, Philip; Lücke, Jörg
2015-01-01
Sparse coding is a popular approach to model natural images but has faced two main challenges: modelling low-level image components (such as edge-like structures and their occlusions) and modelling varying pixel intensities. Traditionally, images are modelled as a sparse linear superposition of dictionary elements, where the probabilistic view of this problem is that the coefficients follow a Laplace or Cauchy prior distribution. We propose a novel model that instead uses a spike-and-slab prior and nonlinear combination of components. With the prior, our model can easily represent exact zeros for e.g. the absence of an image component, such as an edge, and a distribution over non-zero pixel intensities. With the nonlinearity (the nonlinear max combination rule), the idea is to target occlusions; dictionary elements correspond to image components that can occlude each other. There are major consequences of the model assumptions made by both (non)linear approaches, thus the main goal of this paper is to isolate and highlight differences between them. Parameter optimization is analytically and computationally intractable in our model, thus as a main contribution we design an exact Gibbs sampler for efficient inference which we can apply to higher dimensional data using latent variable preselection. Results on natural and artificial occlusion-rich data with controlled forms of sparse structure show that our model can extract a sparse set of edge-like components that closely match the generating process, which we refer to as interpretable components. Furthermore, the sparseness of the solution closely follows the ground-truth number of components/edges in the images. The linear model did not learn such edge-like components with any level of sparsity. This suggests that our model can adaptively well-approximate and characterize the meaningful generation process. PMID:25954947
Precession missile feature extraction using sparse component analysis of radar measurements
NASA Astrophysics Data System (ADS)
Liu, Lihua; Du, Xiaoyong; Ghogho, Mounir; Hu, Weidong; McLernon, Des
2012-12-01
According to the working mode of the ballistic missile warning radar (BMWR), the radar return from the BMWR is usually sparse. To recognize and identify the warhead, it is necessary to extract the precession frequency and the locations of the scattering centers of the missile. This article first analyzes the radar signal model of the precessing conical missile during flight and develops the sparse dictionary which is parameterized by the unknown precession frequency. Based on the sparse dictionary, the sparse signal model is then established. A nonlinear least square estimation is first applied to roughly extract the precession frequency in the sparse dictionary. Based on the time segmented radar signal, a sparse component analysis method using the orthogonal matching pursuit algorithm is then proposed to jointly estimate the precession frequency and the scattering centers of the missile. Simulation results illustrate the validity of the proposed method.
A Systems Theory Approach to the District Central Office's Role in School-Level Improvement
ERIC Educational Resources Information Center
Mania-Singer, Jackie
2017-01-01
This qualitative case study used General Systems Theory and social network analysis to explore the relationships between the members of a district central office and principals of elementary schools within an urban school district in the Midwest. Findings revealed sparse relationships between members of the district central office and principals,…
Survival analysis with functional covariates for partial follow-up studies.
Fang, Hong-Bin; Wu, Tong Tong; Rapoport, Aaron P; Tan, Ming
2016-12-01
Predictive or prognostic analysis plays an increasingly important role in the era of personalized medicine to identify subsets of patients whom the treatment may benefit the most. Although various time-dependent covariate models are available, such models require that covariates be followed in the whole follow-up period. This article studies a new class of functional survival models where the covariates are only monitored in a time interval that is shorter than the whole follow-up period. This paper is motivated by the analysis of a longitudinal study on advanced myeloma patients who received stem cell transplants and T cell infusions after the transplants. The absolute lymphocyte cell counts were collected serially during hospitalization. Those patients are still followed up if they are alive after hospitalization, while their absolute lymphocyte cell counts cannot be measured after that. Another complication is that absolute lymphocyte cell counts are sparsely and irregularly measured. The conventional method using Cox model with time-varying covariates is not applicable because of the different lengths of observation periods. Analysis based on each single observation obviously underutilizes available information and, more seriously, may yield misleading results. This so-called partial follow-up study design represents increasingly common predictive modeling problem where we have serial multiple biomarkers up to a certain time point, which is shorter than the total length of follow-up. We therefore propose a solution to the partial follow-up design. The new method combines functional principal components analysis and survival analysis with selection of those functional covariates. It also has the advantage of handling sparse and irregularly measured longitudinal observations of covariates and measurement errors. Our analysis based on functional principal components reveals that it is the patterns of the trajectories of absolute lymphocyte cell counts, instead of the actual counts, that affect patient's disease-free survival time. © The Author(s) 2014.
Decentralized modal identification using sparse blind source separation
NASA Astrophysics Data System (ADS)
Sadhu, A.; Hazra, B.; Narasimhan, S.; Pandey, M. D.
2011-12-01
Popular ambient vibration-based system identification methods process information collected from a dense array of sensors centrally to yield the modal properties. In such methods, the need for a centralized processing unit capable of satisfying large memory and processing demands is unavoidable. With the advent of wireless smart sensor networks, it is now possible to process information locally at the sensor level, instead. The information at the individual sensor level can then be concatenated to obtain the global structure characteristics. A novel decentralized algorithm based on wavelet transforms to infer global structure mode information using measurements obtained using a small group of sensors at a time is proposed in this paper. The focus of the paper is on algorithmic development, while the actual hardware and software implementation is not pursued here. The problem of identification is cast within the framework of under-determined blind source separation invoking transformations of measurements to the time-frequency domain resulting in a sparse representation. The partial mode shape coefficients so identified are then combined to yield complete modal information. The transformations are undertaken using stationary wavelet packet transform (SWPT), yielding a sparse representation in the wavelet domain. Principal component analysis (PCA) is then performed on the resulting wavelet coefficients, yielding the partial mixing matrix coefficients from a few measurement channels at a time. This process is repeated using measurements obtained from multiple sensor groups, and the results so obtained from each group are concatenated to obtain the global modal characteristics of the structure.
Low-rank structure learning via nonconvex heuristic recovery.
Deng, Yue; Dai, Qionghai; Liu, Risheng; Zhang, Zengke; Hu, Sanqing
2013-03-01
In this paper, we propose a nonconvex framework to learn the essential low-rank structure from corrupted data. Different from traditional approaches, which directly utilizes convex norms to measure the sparseness, our method introduces more reasonable nonconvex measurements to enhance the sparsity in both the intrinsic low-rank structure and the sparse corruptions. We will, respectively, introduce how to combine the widely used ℓp norm (0 < p < 1) and log-sum term into the framework of low-rank structure learning. Although the proposed optimization is no longer convex, it still can be effectively solved by a majorization-minimization (MM)-type algorithm, with which the nonconvex objective function is iteratively replaced by its convex surrogate and the nonconvex problem finally falls into the general framework of reweighed approaches. We prove that the MM-type algorithm can converge to a stationary point after successive iterations. The proposed model is applied to solve two typical problems: robust principal component analysis and low-rank representation. Experimental results on low-rank structure learning demonstrate that our nonconvex heuristic methods, especially the log-sum heuristic recovery algorithm, generally perform much better than the convex-norm-based method (0 < p < 1) for both data with higher rank and with denser corruptions.
NASA Astrophysics Data System (ADS)
Gomez Gonzalez, C. A.; Absil, O.; Absil, P.-A.; Van Droogenbroeck, M.; Mawet, D.; Surdej, J.
2016-05-01
Context. Data processing constitutes a critical component of high-contrast exoplanet imaging. Its role is almost as important as the choice of a coronagraph or a wavefront control system, and it is intertwined with the chosen observing strategy. Among the data processing techniques for angular differential imaging (ADI), the most recent is the family of principal component analysis (PCA) based algorithms. It is a widely used statistical tool developed during the first half of the past century. PCA serves, in this case, as a subspace projection technique for constructing a reference point spread function (PSF) that can be subtracted from the science data for boosting the detectability of potential companions present in the data. Unfortunately, when building this reference PSF from the science data itself, PCA comes with certain limitations such as the sensitivity of the lower dimensional orthogonal subspace to non-Gaussian noise. Aims: Inspired by recent advances in machine learning algorithms such as robust PCA, we aim to propose a localized subspace projection technique that surpasses current PCA-based post-processing algorithms in terms of the detectability of companions at near real-time speed, a quality that will be useful for future direct imaging surveys. Methods: We used randomized low-rank approximation methods recently proposed in the machine learning literature, coupled with entry-wise thresholding to decompose an ADI image sequence locally into low-rank, sparse, and Gaussian noise components (LLSG). This local three-term decomposition separates the starlight and the associated speckle noise from the planetary signal, which mostly remains in the sparse term. We tested the performance of our new algorithm on a long ADI sequence obtained on β Pictoris with VLT/NACO. Results: Compared to a standard PCA approach, LLSG decomposition reaches a higher signal-to-noise ratio and has an overall better performance in the receiver operating characteristic space. This three-term decomposition brings a detectability boost compared to the full-frame standard PCA approach, especially in the small inner working angle region where complex speckle noise prevents PCA from discerning true companions from noise.
The application of low-rank and sparse decomposition method in the field of climatology
NASA Astrophysics Data System (ADS)
Gupta, Nitika; Bhaskaran, Prasad K.
2018-04-01
The present study reports a low-rank and sparse decomposition method that separates the mean and the variability of a climate data field. Until now, the application of this technique was limited only in areas such as image processing, web data ranking, and bioinformatics data analysis. In climate science, this method exactly separates the original data into a set of low-rank and sparse components, wherein the low-rank components depict the linearly correlated dataset (expected or mean behavior), and the sparse component represents the variation or perturbation in the dataset from its mean behavior. The study attempts to verify the efficacy of this proposed technique in the field of climatology with two examples of real world. The first example attempts this technique on the maximum wind-speed (MWS) data for the Indian Ocean (IO) region. The study brings to light a decadal reversal pattern in the MWS for the North Indian Ocean (NIO) during the months of June, July, and August (JJA). The second example deals with the sea surface temperature (SST) data for the Bay of Bengal region that exhibits a distinct pattern in the sparse component. The study highlights the importance of the proposed technique used for interpretation and visualization of climate data.
CT Image Sequence Restoration Based on Sparse and Low-Rank Decomposition
Gou, Shuiping; Wang, Yueyue; Wang, Zhilong; Peng, Yong; Zhang, Xiaopeng; Jiao, Licheng; Wu, Jianshe
2013-01-01
Blurry organ boundaries and soft tissue structures present a major challenge in biomedical image restoration. In this paper, we propose a low-rank decomposition-based method for computed tomography (CT) image sequence restoration, where the CT image sequence is decomposed into a sparse component and a low-rank component. A new point spread function of Weiner filter is employed to efficiently remove blur in the sparse component; a wiener filtering with the Gaussian PSF is used to recover the average image of the low-rank component. And then we get the recovered CT image sequence by combining the recovery low-rank image with all recovery sparse image sequence. Our method achieves restoration results with higher contrast, sharper organ boundaries and richer soft tissue structure information, compared with existing CT image restoration methods. The robustness of our method was assessed with numerical experiments using three different low-rank models: Robust Principle Component Analysis (RPCA), Linearized Alternating Direction Method with Adaptive Penalty (LADMAP) and Go Decomposition (GoDec). Experimental results demonstrated that the RPCA model was the most suitable for the small noise CT images whereas the GoDec model was the best for the large noisy CT images. PMID:24023764
A new species of Ceanothus from northern Baja California
Boyd, Steve; Keeley, Jon E.
2002-01-01
Ceanothus bolensis S. Boyd & J. Keeley is a new species in the subgenus Cerastes from northwestern Baja California, Mexico. It is well represented at elevations above 1000 m on Cerro Bola, a basaltic peak approximately 35 km south of the U.S./Mexican border. It is characterized by small, obovate to oblanceolate, cupped, essentially glabrous leaves with sparsely toothed margins, pale blue flowers, and globose fruits lacking horns. Principal components analysis on morphological traits shows it to be distinct from other members of Cerastes which are distributed away from the coast in southern California and Baja California, Mexico. These phenetic comparisons also suggest that Ceanothus otayensis should not be subsumed under C. crassifolius, as treated in the Jepson Manual, but rather should be retained at specific rank as well.
Gerard, Baudouin; Duvall, Jeremy R.; Lowe, Jason T.; Murillo, Tiffanie; Wei, Jingqiang; Akella, Lakshmi B.; Marcaurelle, Lisa A.
2011-01-01
We have implemented an aldol-based ‘build/couple/pair’ (B/C/P) strategy for the synthesis of stereochemically diverse 8-membered lactam and sultam scaffolds via SNAr cycloetherification. Each scaffold contains two handles, an amine and aryl bromide, for solid-phase diversification via N-capping and Pd-mediated cross coupling. A sparse matrix design strategy that achieves the dual objective of controlling physicochemical properties and selecting diverse library members was implemented. The production of two 8000-membered libraries is discussed including a full analysis of library purity and property distribution. Library diversity was evaluated in comparison to the Molecular Library Small Molecule Repository (MLSMR) through the use of a multi-fusion similarity (MFS) map and principal component analysis (PCA). PMID:21526820
Feature Selection and Pedestrian Detection Based on Sparse Representation.
Yao, Shihong; Wang, Tao; Shen, Weiming; Pan, Shaoming; Chong, Yanwen; Ding, Fei
2015-01-01
Pedestrian detection have been currently devoted to the extraction of effective pedestrian features, which has become one of the obstacles in pedestrian detection application according to the variety of pedestrian features and their large dimension. Based on the theoretical analysis of six frequently-used features, SIFT, SURF, Haar, HOG, LBP and LSS, and their comparison with experimental results, this paper screens out the sparse feature subsets via sparse representation to investigate whether the sparse subsets have the same description abilities and the most stable features. When any two of the six features are fused, the fusion feature is sparsely represented to obtain its important components. Sparse subsets of the fusion features can be rapidly generated by avoiding calculation of the corresponding index of dimension numbers of these feature descriptors; thus, the calculation speed of the feature dimension reduction is improved and the pedestrian detection time is reduced. Experimental results show that sparse feature subsets are capable of keeping the important components of these six feature descriptors. The sparse features of HOG and LSS possess the same description ability and consume less time compared with their full features. The ratios of the sparse feature subsets of HOG and LSS to their full sets are the highest among the six, and thus these two features can be used to best describe the characteristics of the pedestrian and the sparse feature subsets of the combination of HOG-LSS show better distinguishing ability and parsimony.
Gorban, A N; Mirkes, E M; Zinovyev, A
2016-12-01
Most of machine learning approaches have stemmed from the application of minimizing the mean squared distance principle, based on the computationally efficient quadratic optimization methods. However, when faced with high-dimensional and noisy data, the quadratic error functionals demonstrated many weaknesses including high sensitivity to contaminating factors and dimensionality curse. Therefore, a lot of recent applications in machine learning exploited properties of non-quadratic error functionals based on L 1 norm or even sub-linear potentials corresponding to quasinorms L p (0
An RFI Detection Algorithm for Microwave Radiometers Using Sparse Component Analysis
NASA Technical Reports Server (NTRS)
Mohammed-Tano, Priscilla N.; Korde-Patel, Asmita; Gholian, Armen; Piepmeier, Jeffrey R.; Schoenwald, Adam; Bradley, Damon
2017-01-01
Radio Frequency Interference (RFI) is a threat to passive microwave measurements and if undetected, can corrupt science retrievals. The sparse component analysis (SCA) for blind source separation has been investigated to detect RFI in microwave radiometer data. Various techniques using SCA have been simulated to determine detection performance with continuous wave (CW) RFI.
A joint sparse representation-based method for double-trial evoked potentials estimation.
Yu, Nannan; Liu, Haikuan; Wang, Xiaoyan; Lu, Hanbing
2013-12-01
In this paper, we present a novel approach to solving an evoked potentials estimating problem. Generally, the evoked potentials in two consecutive trials obtained by repeated identical stimuli of the nerves are extremely similar. In order to trace evoked potentials, we propose a joint sparse representation-based double-trial evoked potentials estimation method, taking full advantage of this similarity. The estimation process is performed in three stages: first, according to the similarity of evoked potentials and the randomness of a spontaneous electroencephalogram, the two consecutive observations of evoked potentials are considered as superpositions of the common component and the unique components; second, making use of their characteristics, the two sparse dictionaries are constructed; and finally, we apply the joint sparse representation method in order to extract the common component of double-trial observations, instead of the evoked potential in each trial. A series of experiments carried out on simulated and human test responses confirmed the superior performance of our method. © 2013 Elsevier Ltd. Published by Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Kumamoto, Yasuaki; Minamikawa, Takeo; Kawamura, Akinori; Matsumura, Junichi; Tsuda, Yuichiro; Ukon, Juichiro; Harada, Yoshinori; Tanaka, Hideo; Takamatsu, Tetsuro
2017-02-01
Nerve-sparing surgery is essential to avoid functional deficits of the limbs and organs. Raman scattering, a label-free, minimally invasive, and accurate modality, is one of the best candidate technologies to detect nerves for nerve-sparing surgery. However, Raman scattering imaging is too time-consuming to be employed in surgery. Here we present a rapid and accurate nerve visualization method using a multipoint Raman imaging technique that has enabled simultaneous spectra measurement from different locations (n=32) of a sample. Five sec is sufficient for measuring n=32 spectra with good S/N from a given tissue. Principal component regression discriminant analysis discriminated spectra obtained from peripheral nerves (n=863 from n=161 myelinated nerves) and connective tissue (n=828 from n=121 tendons) with sensitivity and specificity of 88.3% and 94.8%, respectively. To compensate the spatial information of a multipoint-Raman-derived tissue discrimination image that is too sparse to visualize nerve arrangement, we used morphological information obtained from a bright-field image. When merged with the sparse tissue discrimination image, a morphological image of a sample shows what portion of Raman measurement points in arbitrary structure is determined as nerve. Setting a nerve detection criterion on the portion of "nerve" points in the structure as 40% or more, myelinated nerves (n=161) and tendons (n=121) were discriminated with sensitivity and specificity of 97.5%. The presented technique utilizing a sparse multipoint Raman image and a bright-field image has enabled rapid, safe, and accurate detection of peripheral nerves.
NASA Astrophysics Data System (ADS)
Yang, Dong; Lu, Anxiang; Ren, Dong; Wang, Jihua
2017-11-01
This study explored the feasibility of rapid detection of biogenic amines (BAs) in cooked beef during the storage process using hyperspectral imaging technique combined with sparse representation (SR) algorithm. The hyperspectral images of samples were collected in the two spectral ranges of 400-1000 nm and 1000-1800 nm, separately. The spectral data were reduced dimensionality by SR and principal component analysis (PCA) algorithms, and then integrated the least square support vector machine (LS-SVM) to build the SR-LS-SVM and PC-LS-SVM models for the prediction of BAs values in cooked beef. The results showed that the SR-LS-SVM model exhibited the best predictive ability with determination coefficients (RP2) of 0.943 and root mean square errors (RMSEP) of 1.206 in the range of 400-1000 nm of prediction set. The SR and PCA algorithms were further combined to establish the best SR-PC-LS-SVM model for BAs prediction, which had high RP2of 0.969 and low RMSEP of 1.039 in the region of 400-1000 nm. The visual map of the BAs was generated using the best SR-PC-LS-SVM model with imaging process algorithms, which could be used to observe the changes of BAs in cooked beef more intuitively. The study demonstrated that hyperspectral imaging technique combined with sparse representation were able to detect effectively the BAs values in cooked beef during storage and the built SR-PC-LS-SVM model had a potential for rapid and accurate determination of freshness indexes in other meat and meat products.
Saunders, Kate; Bilderbeck, Amy; Palmius, Niclas; Goodwin, Guy; De Vos, Maarten
2017-01-01
Background We recently described a new questionnaire to monitor mood called mood zoom (MZ). MZ comprises 6 items assessing mood symptoms on a 7-point Likert scale; we had previously used standard principal component analysis (PCA) to tentatively understand its properties, but the presence of multiple nonzero loadings obstructed the interpretation of its latent variables. Objective The aim of this study was to rigorously investigate the internal properties and latent variables of MZ using an algorithmic approach which may lead to more interpretable results than PCA. Additionally, we explored three other widely used psychiatric questionnaires to investigate latent variable structure similarities with MZ: (1) Altman self-rating mania scale (ASRM), assessing mania; (2) quick inventory of depressive symptomatology (QIDS) self-report, assessing depression; and (3) generalized anxiety disorder (7-item) (GAD-7), assessing anxiety. Methods We elicited responses from 131 participants: 48 bipolar disorder (BD), 32 borderline personality disorder (BPD), and 51 healthy controls (HC), collected longitudinally (median [interquartile range, IQR]: 363 [276] days). Participants were requested to complete ASRM, QIDS, and GAD-7 weekly (all 3 questionnaires were completed on the Web) and MZ daily (using a custom-based smartphone app). We applied sparse PCA (SPCA) to determine the latent variables for the four questionnaires, where a small subset of the original items contributes toward each latent variable. Results We found that MZ had great consistency across the three cohorts studied. Three main principal components were derived using SPCA, which can be tentatively interpreted as (1) anxiety and sadness, (2) positive affect, and (3) irritability. The MZ principal component comprising anxiety and sadness explains most of the variance in BD and BPD, whereas the positive affect of MZ explains most of the variance in HC. The latent variables in ASRM were identical for the patient groups but different for HC; nevertheless, the latent variables shared common items across both the patient group and HC. On the contrary, QIDS had overall very different principal components across groups; sleep was a key element in HC and BD but was absent in BPD. In GAD-7, nervousness was the principal component explaining most of the variance in BD and HC. Conclusions This study has important implications for understanding self-reported mood. MZ has a consistent, intuitively interpretable latent variable structure and hence may be a good instrument for generic mood assessment. Irritability appears to be the key distinguishing latent variable between BD and BPD and might be useful for differential diagnosis. Anxiety and sadness are closely interlinked, a finding that might inform treatment effects to jointly address these covarying symptoms. Anxiety and nervousness appear to be amongst the cardinal latent variable symptoms in BD and merit close attention in clinical practice. PMID:28546141
Nonlocal low-rank and sparse matrix decomposition for spectral CT reconstruction
NASA Astrophysics Data System (ADS)
Niu, Shanzhou; Yu, Gaohang; Ma, Jianhua; Wang, Jing
2018-02-01
Spectral computed tomography (CT) has been a promising technique in research and clinics because of its ability to produce improved energy resolution images with narrow energy bins. However, the narrow energy bin image is often affected by serious quantum noise because of the limited number of photons used in the corresponding energy bin. To address this problem, we present an iterative reconstruction method for spectral CT using nonlocal low-rank and sparse matrix decomposition (NLSMD), which exploits the self-similarity of patches that are collected in multi-energy images. Specifically, each set of patches can be decomposed into a low-rank component and a sparse component, and the low-rank component represents the stationary background over different energy bins, while the sparse component represents the rest of the different spectral features in individual energy bins. Subsequently, an effective alternating optimization algorithm was developed to minimize the associated objective function. To validate and evaluate the NLSMD method, qualitative and quantitative studies were conducted by using simulated and real spectral CT data. Experimental results show that the NLSMD method improves spectral CT images in terms of noise reduction, artifact suppression and resolution preservation.
NASA Satellite Imagery Shows Sparse Population of Region Near Baja, California Earthquake
2010-04-09
This image from NASA Terra spacecraft shows where a magnitude 7.2 earthquake struck in Mexico Baja, California at shallow depth along the principal plate boundary between the North American and Pacific plates on April 4, 2010.
Lu, Ake Tzu-Hui; Austin, Erin; Bonner, Ashley; Huang, Hsin-Hsiung; Cantor, Rita M
2014-09-01
Machine learning methods (MLMs), designed to develop models using high-dimensional predictors, have been used to analyze genome-wide genetic and genomic data to predict risks for complex traits. We summarize the results from six contributions to our Genetic Analysis Workshop 18 working group; these investigators applied MLMs and data mining to analyses of rare and common genetic variants measured in pedigrees. To develop risk profiles, group members analyzed blood pressure traits along with single-nucleotide polymorphisms and rare variant genotypes derived from sequence and imputation analyses in large Mexican American pedigrees. Supervised MLMs included penalized regression with varying penalties, support vector machines, and permanental classification. Unsupervised MLMs included sparse principal components analysis and sparse graphical models. Entropy-based components analyses were also used to mine these data. None of the investigators fully capitalized on the genetic information provided by the complete pedigrees. Their approaches either corrected for the nonindependence of the individuals within the pedigrees or analyzed only those who were independent. Some methods allowed for covariate adjustment, whereas others did not. We evaluated these methods using a variety of metrics. Four contributors conducted primary analyses on the real data, and the other two research groups used the simulated data with and without knowledge of the underlying simulation model. One group used the answers to the simulated data to assess power and type I errors. Although the MLMs applied were substantially different, each research group concluded that MLMs have advantages over standard statistical approaches with these high-dimensional data. © 2014 WILEY PERIODICALS, INC.
An algorithm for separation of mixed sparse and Gaussian sources
Akkalkotkar, Ameya
2017-01-01
Independent component analysis (ICA) is a ubiquitous method for decomposing complex signal mixtures into a small set of statistically independent source signals. However, in cases in which the signal mixture consists of both nongaussian and Gaussian sources, the Gaussian sources will not be recoverable by ICA and will pollute estimates of the nongaussian sources. Therefore, it is desirable to have methods for mixed ICA/PCA which can separate mixtures of Gaussian and nongaussian sources. For mixtures of purely Gaussian sources, principal component analysis (PCA) can provide a basis for the Gaussian subspace. We introduce a new method for mixed ICA/PCA which we call Mixed ICA/PCA via Reproducibility Stability (MIPReSt). Our method uses a repeated estimations technique to rank sources by reproducibility, combined with decomposition of multiple subsamplings of the original data matrix. These multiple decompositions allow us to assess component stability as the size of the data matrix changes, which can be used to determinine the dimension of the nongaussian subspace in a mixture. We demonstrate the utility of MIPReSt for signal mixtures consisting of simulated sources and real-word (speech) sources, as well as mixture of unknown composition. PMID:28414814
An algorithm for separation of mixed sparse and Gaussian sources.
Akkalkotkar, Ameya; Brown, Kevin Scott
2017-01-01
Independent component analysis (ICA) is a ubiquitous method for decomposing complex signal mixtures into a small set of statistically independent source signals. However, in cases in which the signal mixture consists of both nongaussian and Gaussian sources, the Gaussian sources will not be recoverable by ICA and will pollute estimates of the nongaussian sources. Therefore, it is desirable to have methods for mixed ICA/PCA which can separate mixtures of Gaussian and nongaussian sources. For mixtures of purely Gaussian sources, principal component analysis (PCA) can provide a basis for the Gaussian subspace. We introduce a new method for mixed ICA/PCA which we call Mixed ICA/PCA via Reproducibility Stability (MIPReSt). Our method uses a repeated estimations technique to rank sources by reproducibility, combined with decomposition of multiple subsamplings of the original data matrix. These multiple decompositions allow us to assess component stability as the size of the data matrix changes, which can be used to determinine the dimension of the nongaussian subspace in a mixture. We demonstrate the utility of MIPReSt for signal mixtures consisting of simulated sources and real-word (speech) sources, as well as mixture of unknown composition.
NASA Astrophysics Data System (ADS)
Boucher, Thomas F.; Ozanne, Marie V.; Carmosino, Marco L.; Dyar, M. Darby; Mahadevan, Sridhar; Breves, Elly A.; Lepore, Kate H.; Clegg, Samuel M.
2015-05-01
The ChemCam instrument on the Mars Curiosity rover is generating thousands of LIBS spectra and bringing interest in this technique to public attention. The key to interpreting Mars or any other types of LIBS data are calibrations that relate laboratory standards to unknowns examined in other settings and enable predictions of chemical composition. Here, LIBS spectral data are analyzed using linear regression methods including partial least squares (PLS-1 and PLS-2), principal component regression (PCR), least absolute shrinkage and selection operator (lasso), elastic net, and linear support vector regression (SVR-Lin). These were compared against results from nonlinear regression methods including kernel principal component regression (K-PCR), polynomial kernel support vector regression (SVR-Py) and k-nearest neighbor (kNN) regression to discern the most effective models for interpreting chemical abundances from LIBS spectra of geological samples. The results were evaluated for 100 samples analyzed with 50 laser pulses at each of five locations averaged together. Wilcoxon signed-rank tests were employed to evaluate the statistical significance of differences among the nine models using their predicted residual sum of squares (PRESS) to make comparisons. For MgO, SiO2, Fe2O3, CaO, and MnO, the sparse models outperform all the others except for linear SVR, while for Na2O, K2O, TiO2, and P2O5, the sparse methods produce inferior results, likely because their emission lines in this energy range have lower transition probabilities. The strong performance of the sparse methods in this study suggests that use of dimensionality-reduction techniques as a preprocessing step may improve the performance of the linear models. Nonlinear methods tend to overfit the data and predict less accurately, while the linear methods proved to be more generalizable with better predictive performance. These results are attributed to the high dimensionality of the data (6144 channels) relative to the small number of samples studied. The best-performing models were SVR-Lin for SiO2, MgO, Fe2O3, and Na2O, lasso for Al2O3, elastic net for MnO, and PLS-1 for CaO, TiO2, and K2O. Although these differences in model performance between methods were identified, most of the models produce comparable results when p ≤ 0.05 and all techniques except kNN produced statistically-indistinguishable results. It is likely that a combination of models could be used together to yield a lower total error of prediction, depending on the requirements of the user.
Using Structural Equation Modeling To Fit Models Incorporating Principal Components.
ERIC Educational Resources Information Center
Dolan, Conor; Bechger, Timo; Molenaar, Peter
1999-01-01
Considers models incorporating principal components from the perspectives of structural-equation modeling. These models include the following: (1) the principal-component analysis of patterned matrices; (2) multiple analysis of variance based on principal components; and (3) multigroup principal-components analysis. Discusses fitting these models…
de Pierrefeu, Amicie; Fovet, Thomas; Hadj-Selem, Fouad; Löfstedt, Tommy; Ciuciu, Philippe; Lefebvre, Stephanie; Thomas, Pierre; Lopes, Renaud; Jardri, Renaud; Duchesnay, Edouard
2018-04-01
Despite significant progress in the field, the detection of fMRI signal changes during hallucinatory events remains difficult and time-consuming. This article first proposes a machine-learning algorithm to automatically identify resting-state fMRI periods that precede hallucinations versus periods that do not. When applied to whole-brain fMRI data, state-of-the-art classification methods, such as support vector machines (SVM), yield dense solutions that are difficult to interpret. We proposed to extend the existing sparse classification methods by taking the spatial structure of brain images into account with structured sparsity using the total variation penalty. Based on this approach, we obtained reliable classifying performances associated with interpretable predictive patterns, composed of two clearly identifiable clusters in speech-related brain regions. The variation in transition-to-hallucination functional patterns not only from one patient to another but also from one occurrence to the next (e.g., also depending on the sensory modalities involved) appeared to be the major difficulty when developing effective classifiers. Consequently, second, this article aimed to characterize the variability within the prehallucination patterns using an extension of principal component analysis with spatial constraints. The principal components (PCs) and the associated basis patterns shed light on the intrinsic structures of the variability present in the dataset. Such results are promising in the scope of innovative fMRI-guided therapy for drug-resistant hallucinations, such as fMRI-based neurofeedback. © 2018 Wiley Periodicals, Inc.
Signal Separation of Helicopter Radar Returns Using Wavelet-Based Sparse Signal Optimisation
2016-10-01
RR–0436 ABSTRACT A novel wavelet-based sparse signal representation technique is used to separate the main and tail rotor blade components of a...helicopter from the composite radar returns. The received signal consists of returns from the rotating main and tail rotor blades , the helicopter body...component signal com- prising of returns from the main body, the main and tail rotor hubs and blades . Temporal and Doppler characteristics of these
Origins of correlated spiking in the mammalian olfactory bulb
Gerkin, Richard C.; Tripathy, Shreejoy J.; Urban, Nathaniel N.
2013-01-01
Mitral/tufted (M/T) cells of the main olfactory bulb transmit odorant information to higher brain structures. The relative timing of action potentials across M/T cells has been proposed to encode this information and to be critical for the activation of downstream neurons. Using ensemble recordings from the mouse olfactory bulb in vivo, we measured how correlations between cells are shaped by stimulus (odor) identity, common respiratory drive, and other cells’ activity. The shared respiration cycle is the largest source of correlated firing, but even after accounting for all observable factors a residual positive noise correlation was observed. Noise correlation was maximal on a ∼100-ms timescale and was seen only in cells separated by <200 µm. This correlation is explained primarily by common activity in groups of nearby cells. Thus, M/T-cell correlation principally reflects respiratory modulation and sparse, local network connectivity, with odor identity accounting for a minor component. PMID:24082089
Thakur, Anil S.; Robin, Gautier; Guncar, Gregor; Saunders, Neil F. W.; Newman, Janet; Martin, Jennifer L.; Kobe, Bostjan
2007-01-01
Background Crystallization is a major bottleneck in the process of macromolecular structure determination by X-ray crystallography. Successful crystallization requires the formation of nuclei and their subsequent growth to crystals of suitable size. Crystal growth generally occurs spontaneously in a supersaturated solution as a result of homogenous nucleation. However, in a typical sparse matrix screening experiment, precipitant and protein concentration are not sampled extensively, and supersaturation conditions suitable for nucleation are often missed. Methodology/Principal Findings We tested the effect of nine potential heterogenous nucleating agents on crystallization of ten test proteins in a sparse matrix screen. Several nucleating agents induced crystal formation under conditions where no crystallization occurred in the absence of the nucleating agent. Four nucleating agents: dried seaweed; horse hair; cellulose and hydroxyapatite, had a considerable overall positive effect on crystallization success. This effect was further enhanced when these nucleating agents were used in combination with each other. Conclusions/Significance Our results suggest that the addition of heterogeneous nucleating agents increases the chances of crystal formation when using sparse matrix screens. PMID:17971854
Measuring Change in Leadership Identity and Problem Framing
ERIC Educational Resources Information Center
Young, Michelle D.; O'Doherty, Ann; Gooden, Mark A.; Goodnow, Elisabeth
2011-01-01
In recent years, significant attention has been directed to the need for more and better-prepared educational leaders (Young, Crow, Murphy & Ogawa, 2009). While many organizations prepare school principals, evidence of program impact is sparse (Orr & Pounder, 2008; Southern Regional Education Board, 2008). The study described in this…
Optshrink LR + S: accelerated fMRI reconstruction using non-convex optimal singular value shrinkage.
Aggarwal, Priya; Shrivastava, Parth; Kabra, Tanay; Gupta, Anubha
2017-03-01
This paper presents a new accelerated fMRI reconstruction method, namely, OptShrink LR + S method that reconstructs undersampled fMRI data using a linear combination of low-rank and sparse components. The low-rank component has been estimated using non-convex optimal singular value shrinkage algorithm, while the sparse component has been estimated using convex l 1 minimization. The performance of the proposed method is compared with the existing state-of-the-art algorithms on real fMRI dataset. The proposed OptShrink LR + S method yields good qualitative and quantitative results.
Krohn, M.D.; Milton, N.M.; Segal, D.; Enland, A.
1981-01-01
A principal component image enhancement has been effective in applying Landsat data to geologic mapping in a heavily forested area of E Virginia. The image enhancement procedure consists of a principal component transformation, a histogram normalization, and the inverse principal componnet transformation. The enhancement preserves the independence of the principal components, yet produces a more readily interpretable image than does a single principal component transformation. -from Authors
Sentürk, Damla; Dalrymple, Lorien S; Nguyen, Danh V
2014-11-30
We propose functional linear models for zero-inflated count data with a focus on the functional hurdle and functional zero-inflated Poisson (ZIP) models. Although the hurdle model assumes the counts come from a mixture of a degenerate distribution at zero and a zero-truncated Poisson distribution, the ZIP model considers a mixture of a degenerate distribution at zero and a standard Poisson distribution. We extend the generalized functional linear model framework with a functional predictor and multiple cross-sectional predictors to model counts generated by a mixture distribution. We propose an estimation procedure for functional hurdle and ZIP models, called penalized reconstruction, geared towards error-prone and sparsely observed longitudinal functional predictors. The approach relies on dimension reduction and pooling of information across subjects involving basis expansions and penalized maximum likelihood techniques. The developed functional hurdle model is applied to modeling hospitalizations within the first 2 years from initiation of dialysis, with a high percentage of zeros, in the Comprehensive Dialysis Study participants. Hospitalization counts are modeled as a function of sparse longitudinal measurements of serum albumin concentrations, patient demographics, and comorbidities. Simulation studies are used to study finite sample properties of the proposed method and include comparisons with an adaptation of standard principal components regression. Copyright © 2014 John Wiley & Sons, Ltd.
Dimension Reduction With Extreme Learning Machine.
Kasun, Liyanaarachchi Lekamalage Chamara; Yang, Yan; Huang, Guang-Bin; Zhang, Zhengyou
2016-08-01
Data may often contain noise or irrelevant information, which negatively affect the generalization capability of machine learning algorithms. The objective of dimension reduction algorithms, such as principal component analysis (PCA), non-negative matrix factorization (NMF), random projection (RP), and auto-encoder (AE), is to reduce the noise or irrelevant information of the data. The features of PCA (eigenvectors) and linear AE are not able to represent data as parts (e.g. nose in a face image). On the other hand, NMF and non-linear AE are maimed by slow learning speed and RP only represents a subspace of original data. This paper introduces a dimension reduction framework which to some extend represents data as parts, has fast learning speed, and learns the between-class scatter subspace. To this end, this paper investigates a linear and non-linear dimension reduction framework referred to as extreme learning machine AE (ELM-AE) and sparse ELM-AE (SELM-AE). In contrast to tied weight AE, the hidden neurons in ELM-AE and SELM-AE need not be tuned, and their parameters (e.g, input weights in additive neurons) are initialized using orthogonal and sparse random weights, respectively. Experimental results on USPS handwritten digit recognition data set, CIFAR-10 object recognition, and NORB object recognition data set show the efficacy of linear and non-linear ELM-AE and SELM-AE in terms of discriminative capability, sparsity, training time, and normalized mean square error.
Principal component regression analysis with SPSS.
Liu, R X; Kuang, J; Gong, Q; Hou, X L
2003-06-01
The paper introduces all indices of multicollinearity diagnoses, the basic principle of principal component regression and determination of 'best' equation method. The paper uses an example to describe how to do principal component regression analysis with SPSS 10.0: including all calculating processes of the principal component regression and all operations of linear regression, factor analysis, descriptives, compute variable and bivariate correlations procedures in SPSS 10.0. The principal component regression analysis can be used to overcome disturbance of the multicollinearity. The simplified, speeded up and accurate statistical effect is reached through the principal component regression analysis with SPSS.
Sparse representation of whole-brain fMRI signals for identification of functional networks.
Lv, Jinglei; Jiang, Xi; Li, Xiang; Zhu, Dajiang; Chen, Hanbo; Zhang, Tuo; Zhang, Shu; Hu, Xintao; Han, Junwei; Huang, Heng; Zhang, Jing; Guo, Lei; Liu, Tianming
2015-02-01
There have been several recent studies that used sparse representation for fMRI signal analysis and activation detection based on the assumption that each voxel's fMRI signal is linearly composed of sparse components. Previous studies have employed sparse coding to model functional networks in various modalities and scales. These prior contributions inspired the exploration of whether/how sparse representation can be used to identify functional networks in a voxel-wise way and on the whole brain scale. This paper presents a novel, alternative methodology of identifying multiple functional networks via sparse representation of whole-brain task-based fMRI signals. Our basic idea is that all fMRI signals within the whole brain of one subject are aggregated into a big data matrix, which is then factorized into an over-complete dictionary basis matrix and a reference weight matrix via an effective online dictionary learning algorithm. Our extensive experimental results have shown that this novel methodology can uncover multiple functional networks that can be well characterized and interpreted in spatial, temporal and frequency domains based on current brain science knowledge. Importantly, these well-characterized functional network components are quite reproducible in different brains. In general, our methods offer a novel, effective and unified solution to multiple fMRI data analysis tasks including activation detection, de-activation detection, and functional network identification. Copyright © 2014 Elsevier B.V. All rights reserved.
Groundwater discharge to lakes (GDL) - the disregarded component of lake nutrient budgets
NASA Astrophysics Data System (ADS)
Lewandowski, J.; Meinikmann, K.; Pöschke, F.; Nützmann, G.
2012-04-01
Eutrophication is a major threat to lakes in temperate climatic zones. It is necessary to determine the relevance of different nutrient sources to conduct effective management measures, to understand in-lake processes and to model future scenarios. A prerequisite for such nutrient budgets are water budgets. While most components of the water budget can be determined quite accurate the quantification of groundwater discharge to lakes (GDL) and surface water infiltration into the aquifer are much more difficult. For example, it is quite common to determine the groundwater component as residual in the water and nutrient budget which is extremely problematic since in that case all errors of the budget terms are summed up in the groundwater term. In total, we identified 10 different reasons for disregarding the groundwater path in nutrient budgets. We investigated the fate of the nutrients nitrogen and phosphorus on their pathway from the catchment through the reactive aquifer-lake interface into the lake. We reviewed the international literature and summarized numbers reported for GDL of nutrients. Since literature is quite sparse we also had a look at numbers reported for submarine groundwater discharge (SGD) of nutrients for which much more literature exists and which is despite some fundamental differences in principal comparable to GDL.
Center for Support of Mental Health Services in Isolated Rural Areas. Final Report.
ERIC Educational Resources Information Center
Ciarlo, James A.
In 1994, the University of Denver received a grant to develop and operate the Frontier Mental Health Services Resource Network (FMHSRN). FMHSRN's principal aim was to improve delivery of mental health services in sparsely populated "frontier" areas by providing technical assistance to frontier and rural audiences. Traditional…
Low-rank and Adaptive Sparse Signal (LASSI) Models for Highly Accelerated Dynamic Imaging
Ravishankar, Saiprasad; Moore, Brian E.; Nadakuditi, Raj Rao; Fessler, Jeffrey A.
2017-01-01
Sparsity-based approaches have been popular in many applications in image processing and imaging. Compressed sensing exploits the sparsity of images in a transform domain or dictionary to improve image recovery from undersampled measurements. In the context of inverse problems in dynamic imaging, recent research has demonstrated the promise of sparsity and low-rank techniques. For example, the patches of the underlying data are modeled as sparse in an adaptive dictionary domain, and the resulting image and dictionary estimation from undersampled measurements is called dictionary-blind compressed sensing, or the dynamic image sequence is modeled as a sum of low-rank and sparse (in some transform domain) components (L+S model) that are estimated from limited measurements. In this work, we investigate a data-adaptive extension of the L+S model, dubbed LASSI, where the temporal image sequence is decomposed into a low-rank component and a component whose spatiotemporal (3D) patches are sparse in some adaptive dictionary domain. We investigate various formulations and efficient methods for jointly estimating the underlying dynamic signal components and the spatiotemporal dictionary from limited measurements. We also obtain efficient sparsity penalized dictionary-blind compressed sensing methods as special cases of our LASSI approaches. Our numerical experiments demonstrate the promising performance of LASSI schemes for dynamic magnetic resonance image reconstruction from limited k-t space data compared to recent methods such as k-t SLR and L+S, and compared to the proposed dictionary-blind compressed sensing method. PMID:28092528
Low-Rank and Adaptive Sparse Signal (LASSI) Models for Highly Accelerated Dynamic Imaging.
Ravishankar, Saiprasad; Moore, Brian E; Nadakuditi, Raj Rao; Fessler, Jeffrey A
2017-05-01
Sparsity-based approaches have been popular in many applications in image processing and imaging. Compressed sensing exploits the sparsity of images in a transform domain or dictionary to improve image recovery fromundersampledmeasurements. In the context of inverse problems in dynamic imaging, recent research has demonstrated the promise of sparsity and low-rank techniques. For example, the patches of the underlying data are modeled as sparse in an adaptive dictionary domain, and the resulting image and dictionary estimation from undersampled measurements is called dictionary-blind compressed sensing, or the dynamic image sequence is modeled as a sum of low-rank and sparse (in some transform domain) components (L+S model) that are estimated from limited measurements. In this work, we investigate a data-adaptive extension of the L+S model, dubbed LASSI, where the temporal image sequence is decomposed into a low-rank component and a component whose spatiotemporal (3D) patches are sparse in some adaptive dictionary domain. We investigate various formulations and efficient methods for jointly estimating the underlying dynamic signal components and the spatiotemporal dictionary from limited measurements. We also obtain efficient sparsity penalized dictionary-blind compressed sensing methods as special cases of our LASSI approaches. Our numerical experiments demonstrate the promising performance of LASSI schemes for dynamicmagnetic resonance image reconstruction from limited k-t space data compared to recent methods such as k-t SLR and L+S, and compared to the proposed dictionary-blind compressed sensing method.
Method and product for phosphosilicate slurry for use in dentistry and related bone cements
Wagh, Arun S.; Primus, Carolyn
2006-08-01
The present invention is directed to magnesium phosphate ceramics and their methods of manufacture. The composition of the invention is produced by combining a mixture of a substantially dry powder component with a liquid component. The substantially dry powder component comprises a sparsely soluble oxide powder, an alkali metal phosphate powder, a sparsely soluble silicate powder, with the balance of the substantially dry powder component comprising at least one powder selected from the group consisting of bioactive powders, biocompatible powders, fluorescent powders, fluoride releasing powders, and radiopaque powders. The liquid component comprises a pH modifying agent, a monovalent alkali metal phosphate in aqueous solution, the balance of the liquid component being water. The use of calcined magnesium oxide as the oxide powder and hydroxylapatite as the bioactive powder produces a self-setting ceramic that is particularly suited for use in dental and orthopedic applications.
A Sparsity-Promoted Method Based on Majorization-Minimization for Weak Fault Feature Enhancement
Hao, Yansong; Song, Liuyang; Tang, Gang; Yuan, Hongfang
2018-01-01
Fault transient impulses induced by faulty components in rotating machinery usually contain substantial interference. Fault features are comparatively weak in the initial fault stage, which renders fault diagnosis more difficult. In this case, a sparse representation method based on the Majorzation-Minimization (MM) algorithm is proposed to enhance weak fault features and extract the features from strong background noise. However, the traditional MM algorithm suffers from two issues, which are the choice of sparse basis and complicated calculations. To address these challenges, a modified MM algorithm is proposed in which a sparse optimization objective function is designed firstly. Inspired by the Basis Pursuit (BP) model, the optimization function integrates an impulsive feature-preserving factor and a penalty function factor. Second, a modified Majorization iterative method is applied to address the convex optimization problem of the designed function. A series of sparse coefficients can be achieved through iterating, which only contain transient components. It is noteworthy that there is no need to select the sparse basis in the proposed iterative method because it is fixed as a unit matrix. Then the reconstruction step is omitted, which can significantly increase detection efficiency. Eventually, envelope analysis of the sparse coefficients is performed to extract weak fault features. Simulated and experimental signals including bearings and gearboxes are employed to validate the effectiveness of the proposed method. In addition, comparisons are made to prove that the proposed method outperforms the traditional MM algorithm in terms of detection results and efficiency. PMID:29597280
A Sparsity-Promoted Method Based on Majorization-Minimization for Weak Fault Feature Enhancement.
Ren, Bangyue; Hao, Yansong; Wang, Huaqing; Song, Liuyang; Tang, Gang; Yuan, Hongfang
2018-03-28
Fault transient impulses induced by faulty components in rotating machinery usually contain substantial interference. Fault features are comparatively weak in the initial fault stage, which renders fault diagnosis more difficult. In this case, a sparse representation method based on the Majorzation-Minimization (MM) algorithm is proposed to enhance weak fault features and extract the features from strong background noise. However, the traditional MM algorithm suffers from two issues, which are the choice of sparse basis and complicated calculations. To address these challenges, a modified MM algorithm is proposed in which a sparse optimization objective function is designed firstly. Inspired by the Basis Pursuit (BP) model, the optimization function integrates an impulsive feature-preserving factor and a penalty function factor. Second, a modified Majorization iterative method is applied to address the convex optimization problem of the designed function. A series of sparse coefficients can be achieved through iterating, which only contain transient components. It is noteworthy that there is no need to select the sparse basis in the proposed iterative method because it is fixed as a unit matrix. Then the reconstruction step is omitted, which can significantly increase detection efficiency. Eventually, envelope analysis of the sparse coefficients is performed to extract weak fault features. Simulated and experimental signals including bearings and gearboxes are employed to validate the effectiveness of the proposed method. In addition, comparisons are made to prove that the proposed method outperforms the traditional MM algorithm in terms of detection results and efficiency.
On the Fallibility of Principal Components in Research
ERIC Educational Resources Information Center
Raykov, Tenko; Marcoulides, George A.; Li, Tenglong
2017-01-01
The measurement error in principal components extracted from a set of fallible measures is discussed and evaluated. It is shown that as long as one or more measures in a given set of observed variables contains error of measurement, so also does any principal component obtained from the set. The error variance in any principal component is shown…
Wavelet-based localization of oscillatory sources from magnetoencephalography data.
Lina, J M; Chowdhury, R; Lemay, E; Kobayashi, E; Grova, C
2014-08-01
Transient brain oscillatory activities recorded with Eelectroencephalography (EEG) or magnetoencephalography (MEG) are characteristic features in physiological and pathological processes. This study is aimed at describing, evaluating, and illustrating with clinical data a new method for localizing the sources of oscillatory cortical activity recorded by MEG. The method combines time-frequency representation and an entropic regularization technique in a common framework, assuming that brain activity is sparse in time and space. Spatial sparsity relies on the assumption that brain activity is organized among cortical parcels. Sparsity in time is achieved by transposing the inverse problem in the wavelet representation, for both data and sources. We propose an estimator of the wavelet coefficients of the sources based on the maximum entropy on the mean (MEM) principle. The full dynamics of the sources is obtained from the inverse wavelet transform, and principal component analysis of the reconstructed time courses is applied to extract oscillatory components. This methodology is evaluated using realistic simulations of single-trial signals, combining fast and sudden discharges (spike) along with bursts of oscillating activity. The method is finally illustrated with a clinical application using MEG data acquired on a patient with a right orbitofrontal epilepsy.
Fault detection, isolation, and diagnosis of self-validating multifunctional sensors.
Yang, Jing-Li; Chen, Yin-Sheng; Zhang, Li-Li; Sun, Zhen
2016-06-01
A novel fault detection, isolation, and diagnosis (FDID) strategy for self-validating multifunctional sensors is presented in this paper. The sparse non-negative matrix factorization-based method can effectively detect faults by using the squared prediction error (SPE) statistic, and the variables contribution plots based on SPE statistic can help to locate and isolate the faulty sensitive units. The complete ensemble empirical mode decomposition is employed to decompose the fault signals to a series of intrinsic mode functions (IMFs) and a residual. The sample entropy (SampEn)-weighted energy values of each IMFs and the residual are estimated to represent the characteristics of the fault signals. Multi-class support vector machine is introduced to identify the fault mode with the purpose of diagnosing status of the faulty sensitive units. The performance of the proposed strategy is compared with other fault detection strategies such as principal component analysis, independent component analysis, and fault diagnosis strategies such as empirical mode decomposition coupled with support vector machine. The proposed strategy is fully evaluated in a real self-validating multifunctional sensors experimental system, and the experimental results demonstrate that the proposed strategy provides an excellent solution to the FDID research topic of self-validating multifunctional sensors.
Action Recognition Using Nonnegative Action Component Representation and Sparse Basis Selection.
Wang, Haoran; Yuan, Chunfeng; Hu, Weiming; Ling, Haibin; Yang, Wankou; Sun, Changyin
2014-02-01
In this paper, we propose using high-level action units to represent human actions in videos and, based on such units, a novel sparse model is developed for human action recognition. There are three interconnected components in our approach. First, we propose a new context-aware spatial-temporal descriptor, named locally weighted word context, to improve the discriminability of the traditionally used local spatial-temporal descriptors. Second, from the statistics of the context-aware descriptors, we learn action units using the graph regularized nonnegative matrix factorization, which leads to a part-based representation and encodes the geometrical information. These units effectively bridge the semantic gap in action recognition. Third, we propose a sparse model based on a joint l2,1-norm to preserve the representative items and suppress noise in the action units. Intuitively, when learning the dictionary for action representation, the sparse model captures the fact that actions from the same class share similar units. The proposed approach is evaluated on several publicly available data sets. The experimental results and analysis clearly demonstrate the effectiveness of the proposed approach.
Constraints on Fluctuations in Sparsely Characterized Biological Systems.
Hilfinger, Andreas; Norman, Thomas M; Vinnicombe, Glenn; Paulsson, Johan
2016-02-05
Biochemical processes are inherently stochastic, creating molecular fluctuations in otherwise identical cells. Such "noise" is widespread but has proven difficult to analyze because most systems are sparsely characterized at the single cell level and because nonlinear stochastic models are analytically intractable. Here, we exactly relate average abundances, lifetimes, step sizes, and covariances for any pair of components in complex stochastic reaction systems even when the dynamics of other components are left unspecified. Using basic mathematical inequalities, we then establish bounds for whole classes of systems. These bounds highlight fundamental trade-offs that show how efficient assembly processes must invariably exhibit large fluctuations in subunit levels and how eliminating fluctuations in one cellular component requires creating heterogeneity in another.
Constraints on Fluctuations in Sparsely Characterized Biological Systems
NASA Astrophysics Data System (ADS)
Hilfinger, Andreas; Norman, Thomas M.; Vinnicombe, Glenn; Paulsson, Johan
2016-02-01
Biochemical processes are inherently stochastic, creating molecular fluctuations in otherwise identical cells. Such "noise" is widespread but has proven difficult to analyze because most systems are sparsely characterized at the single cell level and because nonlinear stochastic models are analytically intractable. Here, we exactly relate average abundances, lifetimes, step sizes, and covariances for any pair of components in complex stochastic reaction systems even when the dynamics of other components are left unspecified. Using basic mathematical inequalities, we then establish bounds for whole classes of systems. These bounds highlight fundamental trade-offs that show how efficient assembly processes must invariably exhibit large fluctuations in subunit levels and how eliminating fluctuations in one cellular component requires creating heterogeneity in another.
Sparse representation based SAR vehicle recognition along with aspect angle.
Xing, Xiangwei; Ji, Kefeng; Zou, Huanxin; Sun, Jixiang
2014-01-01
As a method of representing the test sample with few training samples from an overcomplete dictionary, sparse representation classification (SRC) has attracted much attention in synthetic aperture radar (SAR) automatic target recognition (ATR) recently. In this paper, we develop a novel SAR vehicle recognition method based on sparse representation classification along with aspect information (SRCA), in which the correlation between the vehicle's aspect angle and the sparse representation vector is exploited. The detailed procedure presented in this paper can be summarized as follows. Initially, the sparse representation vector of a test sample is solved by sparse representation algorithm with a principle component analysis (PCA) feature-based dictionary. Then, the coefficient vector is projected onto a sparser one within a certain range of the vehicle's aspect angle. Finally, the vehicle is classified into a certain category that minimizes the reconstruction error with the novel sparse representation vector. Extensive experiments are conducted on the moving and stationary target acquisition and recognition (MSTAR) dataset and the results demonstrate that the proposed method performs robustly under the variations of depression angle and target configurations, as well as incomplete observation.
Principal Component and Linkage Analysis of Cardiovascular Risk Traits in the Norfolk Isolate
Cox, Hannah C.; Bellis, Claire; Lea, Rod A.; Quinlan, Sharon; Hughes, Roger; Dyer, Thomas; Charlesworth, Jac; Blangero, John; Griffiths, Lyn R.
2009-01-01
Objective(s) An individual's risk of developing cardiovascular disease (CVD) is influenced by genetic factors. This study focussed on mapping genetic loci for CVD-risk traits in a unique population isolate derived from Norfolk Island. Methods This investigation focussed on 377 individuals descended from the population founders. Principal component analysis was used to extract orthogonal components from 11 cardiovascular risk traits. Multipoint variance component methods were used to assess genome-wide linkage using SOLAR to the derived factors. A total of 285 of the 377 related individuals were informative for linkage analysis. Results A total of 4 principal components accounting for 83% of the total variance were derived. Principal component 1 was loaded with body size indicators; principal component 2 with body size, cholesterol and triglyceride levels; principal component 3 with the blood pressures; and principal component 4 with LDL-cholesterol and total cholesterol levels. Suggestive evidence of linkage for principal component 2 (h2 = 0.35) was observed on chromosome 5q35 (LOD = 1.85; p = 0.0008). While peak regions on chromosome 10p11.2 (LOD = 1.27; p = 0.005) and 12q13 (LOD = 1.63; p = 0.003) were observed to segregate with principal components 1 (h2 = 0.33) and 4 (h2 = 0.42), respectively. Conclusion(s): This study investigated a number of CVD risk traits in a unique isolated population. Findings support the clustering of CVD risk traits and provide interesting evidence of a region on chromosome 5q35 segregating with weight, waist circumference, HDL-c and total triglyceride levels. PMID:19339786
Seismic data interpolation and denoising by learning a tensor tight frame
NASA Astrophysics Data System (ADS)
Liu, Lina; Plonka, Gerlind; Ma, Jianwei
2017-10-01
Seismic data interpolation and denoising plays a key role in seismic data processing. These problems can be understood as sparse inverse problems, where the desired data are assumed to be sparsely representable within a suitable dictionary. In this paper, we present a new method based on a data-driven tight frame (DDTF) of Kronecker type (KronTF) that avoids the vectorization step and considers the multidimensional structure of data in a tensor-product way. It takes advantage of the structure contained in all different modes (dimensions) simultaneously. In order to overcome the limitations of a usual tensor-product approach we also incorporate data-driven directionality. The complete method is formulated as a sparsity-promoting minimization problem. It includes two main steps. In the first step, a hard thresholding algorithm is used to update the frame coefficients of the data in the dictionary; in the second step, an iterative alternating method is used to update the tight frame (dictionary) in each different mode. The dictionary that is learned in this way contains the principal components in each mode. Furthermore, we apply the proposed KronTF to seismic interpolation and denoising. Examples with synthetic and real seismic data show that the proposed method achieves better results than the traditional projection onto convex sets method based on the Fourier transform and the previous vectorized DDTF methods. In particular, the simple structure of the new frame construction makes it essentially more efficient.
Rank-sparsity constrained atlas construction and phenotyping
NASA Astrophysics Data System (ADS)
Clark, D. P.; Badea, C. T.
2015-03-01
Atlas construction is of great interest in the medical imaging community as a tool to visually and quantitatively characterize anatomic variability within a population. Because such atlases generally exhibit superior data fidelity relative to the individual data sets from which they are constructed, they have also proven invaluable in numerous informatics applications such as automated segmentation and classification, regularization of individual-specific reconstructions from undersampled data, and for characterizing physiologically relevant functional metrics. Perhaps the most valuable role of an anatomic atlas is not to define what is "normal," but, in fact, to recognize what is "abnormal." Here, we propose and demonstrate a novel anatomic atlas construction strategy that simultaneously recovers the average anatomy and the deviation from average in a visually meaningful way. The proposed approach treats the problem of atlas construction within the context of robust principal component analysis (RPCA) in which the redundant portion of the data (i.e. the low rank atlas) is separated from the spatially and gradient sparse portion of the data unique to each individual (i.e. the sparse variation). In this paper, we demonstrate the application of RPCA to the Shepp-Logan phantom, including several forms of variability encountered with in vivo data: population variability, class variability, contrast variability, and individual variability. We then present preliminary results produced by applying the proposed approach to in vivo, murine cardiac micro-CT data acquired in a model of right ventricle hypertrophy induced by pulmonary arteriole hypertension.
NASA Astrophysics Data System (ADS)
Ren, Ruizhi; Gu, Lingjia; Fu, Haoyang; Sun, Chenglin
2017-04-01
An effective super-resolution (SR) algorithm is proposed for actual spectral remote sensing images based on sparse representation and wavelet preprocessing. The proposed SR algorithm mainly consists of dictionary training and image reconstruction. Wavelet preprocessing is used to establish four subbands, i.e., low frequency, horizontal, vertical, and diagonal high frequency, for an input image. As compared to the traditional approaches involving the direct training of image patches, the proposed approach focuses on the training of features derived from these four subbands. The proposed algorithm is verified using different spectral remote sensing images, e.g., moderate-resolution imaging spectroradiometer (MODIS) images with different bands, and the latest Chinese Jilin-1 satellite images with high spatial resolution. According to the visual experimental results obtained from the MODIS remote sensing data, the SR images using the proposed SR algorithm are superior to those using a conventional bicubic interpolation algorithm or traditional SR algorithms without preprocessing. Fusion algorithms, e.g., standard intensity-hue-saturation, principal component analysis, wavelet transform, and the proposed SR algorithms are utilized to merge the multispectral and panchromatic images acquired by the Jilin-1 satellite. The effectiveness of the proposed SR algorithm is assessed by parameters such as peak signal-to-noise ratio, structural similarity index, correlation coefficient, root-mean-square error, relative dimensionless global error in synthesis, relative average spectral error, spectral angle mapper, and the quality index Q4, and its performance is better than that of the standard image fusion algorithms.
Maurer, Christian; Federolf, Peter; von Tscharner, Vinzenz; Stirling, Lisa; Nigg, Benno M
2012-05-01
Changes in gait kinematics have often been analyzed using pattern recognition methods such as principal component analysis (PCA). It is usually just the first few principal components that are analyzed, because they describe the main variability within a dataset and thus represent the main movement patterns. However, while subtle changes in gait pattern (for instance, due to different footwear) may not change main movement patterns, they may affect movements represented by higher principal components. This study was designed to test two hypotheses: (1) speed and gender differences can be observed in the first principal components, and (2) small interventions such as changing footwear change the gait characteristics of higher principal components. Kinematic changes due to different running conditions (speed - 3.1m/s and 4.9 m/s, gender, and footwear - control shoe and adidas MicroBounce shoe) were investigated by applying PCA and support vector machine (SVM) to a full-body reflective marker setup. Differences in speed changed the basic movement pattern, as was reflected by a change in the time-dependent coefficient derived from the first principal. Gender was differentiated by using the time-dependent coefficient derived from intermediate principal components. (Intermediate principal components are characterized by limb rotations of the thigh and shank.) Different shoe conditions were identified in higher principal components. This study showed that different interventions can be analyzed using a full-body kinematic approach. Within the well-defined vector space spanned by the data of all subjects, higher principal components should also be considered because these components show the differences that result from small interventions such as footwear changes. Crown Copyright © 2012. Published by Elsevier B.V. All rights reserved.
A physiologically motivated sparse, compact, and smooth (SCS) approach to EEG source localization.
Cao, Cheng; Akalin Acar, Zeynep; Kreutz-Delgado, Kenneth; Makeig, Scott
2012-01-01
Here, we introduce a novel approach to the EEG inverse problem based on the assumption that principal cortical sources of multi-channel EEG recordings may be assumed to be spatially sparse, compact, and smooth (SCS). To enforce these characteristics of solutions to the EEG inverse problem, we propose a correlation-variance model which factors a cortical source space covariance matrix into the multiplication of a pre-given correlation coefficient matrix and the square root of the diagonal variance matrix learned from the data under a Bayesian learning framework. We tested the SCS method using simulated EEG data with various SNR and applied it to a real ECOG data set. We compare the results of SCS to those of an established SBL algorithm.
NASA Astrophysics Data System (ADS)
Nagai, Toshiki; Mitsutake, Ayori; Takano, Hiroshi
2013-02-01
A new relaxation mode analysis method, which is referred to as the principal component relaxation mode analysis method, has been proposed to handle a large number of degrees of freedom of protein systems. In this method, principal component analysis is carried out first and then relaxation mode analysis is applied to a small number of principal components with large fluctuations. To reduce the contribution of fast relaxation modes in these principal components efficiently, we have also proposed a relaxation mode analysis method using multiple evolution times. The principal component relaxation mode analysis method using two evolution times has been applied to an all-atom molecular dynamics simulation of human lysozyme in aqueous solution. Slow relaxation modes and corresponding relaxation times have been appropriately estimated, demonstrating that the method is applicable to protein systems.
Immunogenicity is preferentially induced in sparse dendritic cell cultures.
Nasi, Aikaterini; Bollampalli, Vishnu Priya; Sun, Meng; Chen, Yang; Amu, Sylvie; Nylén, Susanne; Eidsmo, Liv; Rothfuchs, Antonio Gigliotti; Réthi, Bence
2017-03-09
We have previously shown that human monocyte-derived dendritic cells (DCs) acquired different characteristics in dense or sparse cell cultures. Sparsity promoted the development of IL-12 producing migratory DCs, whereas dense cultures increased IL-10 production. Here we analysed whether the density-dependent endogenous breaks could modulate DC-based vaccines. Using murine bone marrow-derived DC models we show that sparse cultures were essential to achieve several key functions required for immunogenic DC vaccines, including mobility to draining lymph nodes, recruitment and massive proliferation of antigen-specific CD4+ T cells, in addition to their TH1 polarization. Transcription analyses confirmed higher commitment in sparse cultures towards T cell activation, whereas DCs obtained from dense cultures up-regulated immunosuppressive pathway components and genes suggesting higher differentiation plasticity towards osteoclasts. Interestingly, we detected a striking up-regulation of fatty acid and cholesterol biosynthesis pathways in sparse cultures, suggesting an important link between DC immunogenicity and lipid homeostasis regulation.
Dong, Jianghu J; Wang, Liangliang; Gill, Jagbir; Cao, Jiguo
2017-01-01
This article is motivated by some longitudinal clinical data of kidney transplant recipients, where kidney function progression is recorded as the estimated glomerular filtration rates at multiple time points post kidney transplantation. We propose to use the functional principal component analysis method to explore the major source of variations of glomerular filtration rate curves. We find that the estimated functional principal component scores can be used to cluster glomerular filtration rate curves. Ordering functional principal component scores can detect abnormal glomerular filtration rate curves. Finally, functional principal component analysis can effectively estimate missing glomerular filtration rate values and predict future glomerular filtration rate values.
WAVELET-DOMAIN REGRESSION AND PREDICTIVE INFERENCE IN PSYCHIATRIC NEUROIMAGING
Reiss, Philip T.; Huo, Lan; Zhao, Yihong; Kelly, Clare; Ogden, R. Todd
2016-01-01
An increasingly important goal of psychiatry is the use of brain imaging data to develop predictive models. Here we present two contributions to statistical methodology for this purpose. First, we propose and compare a set of wavelet-domain procedures for fitting generalized linear models with scalar responses and image predictors: sparse variants of principal component regression and of partial least squares, and the elastic net. Second, we consider assessing the contribution of image predictors over and above available scalar predictors, in particular via permutation tests and an extension of the idea of confounding to the case of functional or image predictors. Using the proposed methods, we assess whether maps of a spontaneous brain activity measure, derived from functional magnetic resonance imaging, can meaningfully predict presence or absence of attention deficit/hyperactivity disorder (ADHD). Our results shed light on the role of confounding in the surprising outcome of the recent ADHD-200 Global Competition, which challenged researchers to develop algorithms for automated image-based diagnosis of the disorder. PMID:27330652
Functional Additive Mixed Models
Scheipl, Fabian; Staicu, Ana-Maria; Greven, Sonja
2014-01-01
We propose an extensive framework for additive regression models for correlated functional responses, allowing for multiple partially nested or crossed functional random effects with flexible correlation structures for, e.g., spatial, temporal, or longitudinal functional data. Additionally, our framework includes linear and nonlinear effects of functional and scalar covariates that may vary smoothly over the index of the functional response. It accommodates densely or sparsely observed functional responses and predictors which may be observed with additional error and includes both spline-based and functional principal component-based terms. Estimation and inference in this framework is based on standard additive mixed models, allowing us to take advantage of established methods and robust, flexible algorithms. We provide easy-to-use open source software in the pffr() function for the R-package refund. Simulations show that the proposed method recovers relevant effects reliably, handles small sample sizes well and also scales to larger data sets. Applications with spatially and longitudinally observed functional data demonstrate the flexibility in modeling and interpretability of results of our approach. PMID:26347592
A rich diversity of opercle bone shape among teleost fishes
Small, Clayton M.; Knope, Matthew L.
2017-01-01
The opercle is a prominent craniofacial bone supporting the gill cover in all bony fish and has been the subject of morphological, developmental, and genetic investigation. We surveyed the shapes of this bone among 110 families spanning the teleost tree and examined its pattern of occupancy in a principal component-based morphospace. Contrasting with expectations from the literature that suggest the local morphospace would be only sparsely occupied, we find primarily dense, broad filling of the morphological landscape, indicating rich diversity. Phylomorphospace plots suggest that dynamic evolution underlies the observed spatial patterning. Evolutionary transits through the morphospaces are sometimes long, and occur in a variety of directions. The trajectories seem to represent both evolutionary divergences and convergences, the latter supported by convevol analysis. We suggest that that this pattern of occupancy reflects the various adaptations of different groups of fishes, seemingly paralleling their diverse marine and freshwater ecologies and life histories. Opercle shape evolution within the acanthomorphs, spiny ray-finned fishes, appears to have been especially dynamic. PMID:29281662
Functional Additive Mixed Models.
Scheipl, Fabian; Staicu, Ana-Maria; Greven, Sonja
2015-04-01
We propose an extensive framework for additive regression models for correlated functional responses, allowing for multiple partially nested or crossed functional random effects with flexible correlation structures for, e.g., spatial, temporal, or longitudinal functional data. Additionally, our framework includes linear and nonlinear effects of functional and scalar covariates that may vary smoothly over the index of the functional response. It accommodates densely or sparsely observed functional responses and predictors which may be observed with additional error and includes both spline-based and functional principal component-based terms. Estimation and inference in this framework is based on standard additive mixed models, allowing us to take advantage of established methods and robust, flexible algorithms. We provide easy-to-use open source software in the pffr() function for the R-package refund. Simulations show that the proposed method recovers relevant effects reliably, handles small sample sizes well and also scales to larger data sets. Applications with spatially and longitudinally observed functional data demonstrate the flexibility in modeling and interpretability of results of our approach.
An Accurate Framework for Arbitrary View Pedestrian Detection in Images
NASA Astrophysics Data System (ADS)
Fan, Y.; Wen, G.; Qiu, S.
2018-01-01
We consider the problem of detect pedestrian under from images collected under various viewpoints. This paper utilizes a novel framework called locality-constrained affine subspace coding (LASC). Firstly, the positive training samples are clustered into similar entities which represent similar viewpoint. Then Principal Component Analysis (PCA) is used to obtain the shared feature of each viewpoint. Finally, the samples that can be reconstructed by linear approximation using their top- k nearest shared feature with a small error are regarded as a correct detection. No negative samples are required for our method. Histograms of orientated gradient (HOG) features are used as the feature descriptors, and the sliding window scheme is adopted to detect humans in images. The proposed method exploits the sparse property of intrinsic information and the correlations among the multiple-views samples. Experimental results on the INRIA and SDL human datasets show that the proposed method achieves a higher performance than the state-of-the-art methods in form of effect and efficiency.
Genetic evidence for an ethnic diversity in the susceptibility to Ménière's disease.
Ohmen, Jeffrey Douglass; White, Cory H; Li, Xin; Wang, Juemei; Fisher, Laurel M; Zhang, Huan; Derebery, Mary Jennifer; Friedman, Rick A
2013-09-01
Ménière's disease (MD) is a debilitating disorder of the inner ear characterized by cochlear and vestibular dysfunction. The cause of this disease is still unknown, and epidemiological data for MD are sparse. From the existing literature, women seem to be more susceptible than men, and Caucasians seem to be more susceptible than Asians. In this article, we characterize a large definite MD cohort for sex and age of onset of disease and use molecular genetic methodologies to characterize ethnicity. Medical record review for sex and age of onset. Ancestry analysis compared results from the principal component analysis of whole-genome genotype data from MD patients to self-identified ancestry in control samples. House Clinic in Los Angeles. Definitive MD patients. Our review of medical records for definitive MD patients reveals that women are more susceptible than men. We also find that men and women have nearly identical age of onset for disease. Lastly, interrogation of molecular genetic data with principal component analysis allowed detailed observations about the ethnic ancestry of our patients. Comparison of the ethnicity of MD patients presenting to our tertiary care clinic with the self-recollected ethnicity of all patients visiting the clinic revealed an ethnic bias, with Caucasians presenting at a higher frequency than expected and the remaining major ethnicities populating Los Angeles (Hispanics, Blacks, and Asians) presenting at a lower frequency than expected. To the best of our knowledge, this report is the first ethnic characterization of a large MD cohort from a large metropolitan region using molecular genetic data. Our data suggest that there is a bias in sex and ethnic susceptibility to this disease.
Schwartz, Matthias; Meyer, Björn; Wirnitzer, Bernhard; Hopf, Carsten
2015-03-01
Conventional mass spectrometry image preprocessing methods used for denoising, such as the Savitzky-Golay smoothing or discrete wavelet transformation, typically do not only remove noise but also weak signals. Recently, memory-efficient principal component analysis (PCA) in conjunction with random projections (RP) has been proposed for reversible compression and analysis of large mass spectrometry imaging datasets. It considers single-pixel spectra in their local context and consequently offers the prospect of using information from the spectra of adjacent pixels for denoising or signal enhancement. However, little systematic analysis of key RP-PCA parameters has been reported so far, and the utility and validity of this method for context-dependent enhancement of known medically or pharmacologically relevant weak analyte signals in linear-mode matrix-assisted laser desorption/ionization (MALDI) mass spectra has not been explored yet. Here, we investigate MALDI imaging datasets from mouse models of Alzheimer's disease and gastric cancer to systematically assess the importance of selecting the right number of random projections k and of principal components (PCs) L for reconstructing reproducibly denoised images after compression. We provide detailed quantitative data for comparison of RP-PCA-denoising with the Savitzky-Golay and wavelet-based denoising in these mouse models as a resource for the mass spectrometry imaging community. Most importantly, we demonstrate that RP-PCA preprocessing can enhance signals of low-intensity amyloid-β peptide isoforms such as Aβ1-26 even in sparsely distributed Alzheimer's β-amyloid plaques and that it enables enhanced imaging of multiply acetylated histone H4 isoforms in response to pharmacological histone deacetylase inhibition in vivo. We conclude that RP-PCA denoising may be a useful preprocessing step in biomarker discovery workflows.
Wavelet decomposition based principal component analysis for face recognition using MATLAB
NASA Astrophysics Data System (ADS)
Sharma, Mahesh Kumar; Sharma, Shashikant; Leeprechanon, Nopbhorn; Ranjan, Aashish
2016-03-01
For the realization of face recognition systems in the static as well as in the real time frame, algorithms such as principal component analysis, independent component analysis, linear discriminate analysis, neural networks and genetic algorithms are used for decades. This paper discusses an approach which is a wavelet decomposition based principal component analysis for face recognition. Principal component analysis is chosen over other algorithms due to its relative simplicity, efficiency, and robustness features. The term face recognition stands for identifying a person from his facial gestures and having resemblance with factor analysis in some sense, i.e. extraction of the principal component of an image. Principal component analysis is subjected to some drawbacks, mainly the poor discriminatory power and the large computational load in finding eigenvectors, in particular. These drawbacks can be greatly reduced by combining both wavelet transform decomposition for feature extraction and principal component analysis for pattern representation and classification together, by analyzing the facial gestures into space and time domain, where, frequency and time are used interchangeably. From the experimental results, it is envisaged that this face recognition method has made a significant percentage improvement in recognition rate as well as having a better computational efficiency.
The Relation between Factor Score Estimates, Image Scores, and Principal Component Scores
ERIC Educational Resources Information Center
Velicer, Wayne F.
1976-01-01
Investigates the relation between factor score estimates, principal component scores, and image scores. The three methods compared are maximum likelihood factor analysis, principal component analysis, and a variant of rescaled image analysis. (RC)
The Butterflies of Principal Components: A Case of Ultrafine-Grained Polyphase Units
NASA Astrophysics Data System (ADS)
Rietmeijer, F. J. M.
1996-03-01
Dusts in the accretion regions of chondritic interplanetary dust particles [IDPs] consisted of three principal components: carbonaceous units [CUs], carbon-bearing chondritic units [GUs] and carbon-free silicate units [PUs]. Among others, differences among chondritic IDP morphologies and variable bulk C/Si ratios reflect variable mixtures of principal components. The spherical shapes of the initially amorphous principal components remain visible in many chondritic porous IDPs but fusion was documented for CUs, GUs and PUs. The PUs occur as coarse- and ultrafine-grained units that include so called GEMS. Spherical principal components preserved in an IDP as recognisable textural units have unique proporties with important implications for their petrological evolution from pre-accretion processing to protoplanet alteration and dynamic pyrometamorphism. Throughout their lifetime the units behaved as closed-systems without chemical exchange with other units. This behaviour is reflected in their mineralogies while the bulk compositions of principal components define the environments wherein they were formed.
Power Enhancement in High Dimensional Cross-Sectional Tests
Fan, Jianqing; Liao, Yuan; Yao, Jiawei
2016-01-01
We propose a novel technique to boost the power of testing a high-dimensional vector H : θ = 0 against sparse alternatives where the null hypothesis is violated only by a couple of components. Existing tests based on quadratic forms such as the Wald statistic often suffer from low powers due to the accumulation of errors in estimating high-dimensional parameters. More powerful tests for sparse alternatives such as thresholding and extreme-value tests, on the other hand, require either stringent conditions or bootstrap to derive the null distribution and often suffer from size distortions due to the slow convergence. Based on a screening technique, we introduce a “power enhancement component”, which is zero under the null hypothesis with high probability, but diverges quickly under sparse alternatives. The proposed test statistic combines the power enhancement component with an asymptotically pivotal statistic, and strengthens the power under sparse alternatives. The null distribution does not require stringent regularity conditions, and is completely determined by that of the pivotal statistic. As specific applications, the proposed methods are applied to testing the factor pricing models and validating the cross-sectional independence in panel data models. PMID:26778846
Xu, Yuan; Ding, Kun; Huo, Chunlei; Zhong, Zisha; Li, Haichang; Pan, Chunhong
2015-01-01
Very high resolution (VHR) image change detection is challenging due to the low discriminative ability of change feature and the difficulty of change decision in utilizing the multilevel contextual information. Most change feature extraction techniques put emphasis on the change degree description (i.e., in what degree the changes have happened), while they ignore the change pattern description (i.e., how the changes changed), which is of equal importance in characterizing the change signatures. Moreover, the simultaneous consideration of the classification robust to the registration noise and the multiscale region-consistent fusion is often neglected in change decision. To overcome such drawbacks, in this paper, a novel VHR image change detection method is proposed based on sparse change descriptor and robust discriminative dictionary learning. Sparse change descriptor combines the change degree component and the change pattern component, which are encoded by the sparse representation error and the morphological profile feature, respectively. Robust change decision is conducted by multiscale region-consistent fusion, which is implemented by the superpixel-level cosparse representation with robust discriminative dictionary and the conditional random field model. Experimental results confirm the effectiveness of the proposed change detection technique. PMID:25918748
Foch, Eric; Milner, Clare E
2014-01-03
Iliotibial band syndrome (ITBS) is a common knee overuse injury among female runners. Atypical discrete trunk and lower extremity biomechanics during running may be associated with the etiology of ITBS. Examining discrete data points limits the interpretation of a waveform to a single value. Characterizing entire kinematic and kinetic waveforms may provide additional insight into biomechanical factors associated with ITBS. Therefore, the purpose of this cross-sectional investigation was to determine whether female runners with previous ITBS exhibited differences in kinematics and kinetics compared to controls using a principal components analysis (PCA) approach. Forty participants comprised two groups: previous ITBS and controls. Principal component scores were retained for the first three principal components and were analyzed using independent t-tests. The retained principal components accounted for 93-99% of the total variance within each waveform. Runners with previous ITBS exhibited low principal component one scores for frontal plane hip angle. Principal component one accounted for the overall magnitude in hip adduction which indicated that runners with previous ITBS assumed less hip adduction throughout stance. No differences in the remaining retained principal component scores for the waveforms were detected among groups. A smaller hip adduction angle throughout the stance phase of running may be a compensatory strategy to limit iliotibial band strain. This running strategy may have persisted after ITBS symptoms subsided. © 2013 Published by Elsevier Ltd.
Low-rank Atlas Image Analyses in the Presence of Pathologies
Liu, Xiaoxiao; Niethammer, Marc; Kwitt, Roland; Singh, Nikhil; McCormick, Matt; Aylward, Stephen
2015-01-01
We present a common framework, for registering images to an atlas and for forming an unbiased atlas, that tolerates the presence of pathologies such as tumors and traumatic brain injury lesions. This common framework is particularly useful when a sufficient number of protocol-matched scans from healthy subjects cannot be easily acquired for atlas formation and when the pathologies in a patient cause large appearance changes. Our framework combines a low-rank-plus-sparse image decomposition technique with an iterative, diffeomorphic, group-wise image registration method. At each iteration of image registration, the decomposition technique estimates a “healthy” version of each image as its low-rank component and estimates the pathologies in each image as its sparse component. The healthy version of each image is used for the next iteration of image registration. The low-rank and sparse estimates are refined as the image registrations iteratively improve. When that framework is applied to image-to-atlas registration, the low-rank image is registered to a pre-defined atlas, to establish correspondence that is independent of the pathologies in the sparse component of each image. Ultimately, image-to-atlas registrations can be used to define spatial priors for tissue segmentation and to map information across subjects. When that framework is applied to unbiased atlas formation, at each iteration, the average of the low-rank images from the patients is used as the atlas image for the next iteration, until convergence. Since each iteration’s atlas is comprised of low-rank components, it provides a population-consistent, pathology-free appearance. Evaluations of the proposed methodology are presented using synthetic data as well as simulated and clinical tumor MRI images from the brain tumor segmentation (BRATS) challenge from MICCAI 2012. PMID:26111390
NASA Astrophysics Data System (ADS)
Parekh, Ankit
Sparsity has become the basis of some important signal processing methods over the last ten years. Many signal processing problems (e.g., denoising, deconvolution, non-linear component analysis) can be expressed as inverse problems. Sparsity is invoked through the formulation of an inverse problem with suitably designed regularization terms. The regularization terms alone encode sparsity into the problem formulation. Often, the ℓ1 norm is used to induce sparsity, so much so that ℓ1 regularization is considered to be `modern least-squares'. The use of ℓ1 norm, as a sparsity-inducing regularizer, leads to a convex optimization problem, which has several benefits: the absence of extraneous local minima, well developed theory of globally convergent algorithms, even for large-scale problems. Convex regularization via the ℓ1 norm, however, tends to under-estimate the non-zero values of sparse signals. In order to estimate the non-zero values more accurately, non-convex regularization is often favored over convex regularization. However, non-convex regularization generally leads to non-convex optimization, which suffers from numerous issues: convergence may be guaranteed to only a stationary point, problem specific parameters may be difficult to set, and the solution is sensitive to the initialization of the algorithm. The first part of this thesis is aimed toward combining the benefits of non-convex regularization and convex optimization to estimate sparse signals more effectively. To this end, we propose to use parameterized non-convex regularizers with designated non-convexity and provide a range for the non-convex parameter so as to ensure that the objective function is strictly convex. By ensuring convexity of the objective function (sum of data-fidelity and non-convex regularizer), we can make use of a wide variety of convex optimization algorithms to obtain the unique global minimum reliably. The second part of this thesis proposes a non-linear signal decomposition technique for an important biomedical signal processing problem: the detection of sleep spindles and K-complexes in human sleep electroencephalography (EEG). We propose a non-linear model for the EEG consisting of three components: (1) a transient (sparse piecewise constant) component, (2) a low-frequency component, and (3) an oscillatory component. The oscillatory component admits a sparse time-frequency representation. Using a convex objective function, we propose a fast non-linear optimization algorithm to estimate the three components in the proposed signal model. The low-frequency and oscillatory components are then used to estimate the K-complexes and sleep spindles respectively. The proposed detection method is shown to outperform several state-of-the-art automated sleep spindles detection methods.
Immunogenicity is preferentially induced in sparse dendritic cell cultures
Nasi, Aikaterini; Bollampalli, Vishnu Priya; Sun, Meng; Chen, Yang; Amu, Sylvie; Nylén, Susanne; Eidsmo, Liv; Rothfuchs, Antonio Gigliotti; Réthi, Bence
2017-01-01
We have previously shown that human monocyte-derived dendritic cells (DCs) acquired different characteristics in dense or sparse cell cultures. Sparsity promoted the development of IL-12 producing migratory DCs, whereas dense cultures increased IL-10 production. Here we analysed whether the density-dependent endogenous breaks could modulate DC-based vaccines. Using murine bone marrow-derived DC models we show that sparse cultures were essential to achieve several key functions required for immunogenic DC vaccines, including mobility to draining lymph nodes, recruitment and massive proliferation of antigen-specific CD4+ T cells, in addition to their TH1 polarization. Transcription analyses confirmed higher commitment in sparse cultures towards T cell activation, whereas DCs obtained from dense cultures up-regulated immunosuppressive pathway components and genes suggesting higher differentiation plasticity towards osteoclasts. Interestingly, we detected a striking up-regulation of fatty acid and cholesterol biosynthesis pathways in sparse cultures, suggesting an important link between DC immunogenicity and lipid homeostasis regulation. PMID:28276533
FGWAS: Functional genome wide association analysis.
Huang, Chao; Thompson, Paul; Wang, Yalin; Yu, Yang; Zhang, Jingwen; Kong, Dehan; Colen, Rivka R; Knickmeyer, Rebecca C; Zhu, Hongtu
2017-10-01
Functional phenotypes (e.g., subcortical surface representation), which commonly arise in imaging genetic studies, have been used to detect putative genes for complexly inherited neuropsychiatric and neurodegenerative disorders. However, existing statistical methods largely ignore the functional features (e.g., functional smoothness and correlation). The aim of this paper is to develop a functional genome-wide association analysis (FGWAS) framework to efficiently carry out whole-genome analyses of functional phenotypes. FGWAS consists of three components: a multivariate varying coefficient model, a global sure independence screening procedure, and a test procedure. Compared with the standard multivariate regression model, the multivariate varying coefficient model explicitly models the functional features of functional phenotypes through the integration of smooth coefficient functions and functional principal component analysis. Statistically, compared with existing methods for genome-wide association studies (GWAS), FGWAS can substantially boost the detection power for discovering important genetic variants influencing brain structure and function. Simulation studies show that FGWAS outperforms existing GWAS methods for searching sparse signals in an extremely large search space, while controlling for the family-wise error rate. We have successfully applied FGWAS to large-scale analysis of data from the Alzheimer's Disease Neuroimaging Initiative for 708 subjects, 30,000 vertices on the left and right hippocampal surfaces, and 501,584 SNPs. Copyright © 2017 Elsevier Inc. All rights reserved.
Face Aging Effect Simulation Using Hidden Factor Analysis Joint Sparse Representation.
Yang, Hongyu; Huang, Di; Wang, Yunhong; Wang, Heng; Tang, Yuanyan
2016-06-01
Face aging simulation has received rising investigations nowadays, whereas it still remains a challenge to generate convincing and natural age-progressed face images. In this paper, we present a novel approach to such an issue using hidden factor analysis joint sparse representation. In contrast to the majority of tasks in the literature that integrally handle the facial texture, the proposed aging approach separately models the person-specific facial properties that tend to be stable in a relatively long period and the age-specific clues that gradually change over time. It then transforms the age component to a target age group via sparse reconstruction, yielding aging effects, which is finally combined with the identity component to achieve the aged face. Experiments are carried out on three face aging databases, and the results achieved clearly demonstrate the effectiveness and robustness of the proposed method in rendering a face with aging effects. In addition, a series of evaluations prove its validity with respect to identity preservation and aging effect generation.
Nonlinear Principal Components Analysis: Introduction and Application
ERIC Educational Resources Information Center
Linting, Marielle; Meulman, Jacqueline J.; Groenen, Patrick J. F.; van der Koojj, Anita J.
2007-01-01
The authors provide a didactic treatment of nonlinear (categorical) principal components analysis (PCA). This method is the nonlinear equivalent of standard PCA and reduces the observed variables to a number of uncorrelated principal components. The most important advantages of nonlinear over linear PCA are that it incorporates nominal and ordinal…
USDA-ARS?s Scientific Manuscript database
Selective principal component regression analysis (SPCR) uses a subset of the original image bands for principal component transformation and regression. For optimal band selection before the transformation, this paper used genetic algorithms (GA). In this case, the GA process used the regression co...
Similarities between principal components of protein dynamics and random diffusion
NASA Astrophysics Data System (ADS)
Hess, Berk
2000-12-01
Principal component analysis, also called essential dynamics, is a powerful tool for finding global, correlated motions in atomic simulations of macromolecules. It has become an established technique for analyzing molecular dynamics simulations of proteins. The first few principal components of simulations of large proteins often resemble cosines. We derive the principal components for high-dimensional random diffusion, which are almost perfect cosines. This resemblance between protein simulations and noise implies that for many proteins the time scales of current simulations are too short to obtain convergence of collective motions.
Directly Reconstructing Principal Components of Heterogeneous Particles from Cryo-EM Images
Tagare, Hemant D.; Kucukelbir, Alp; Sigworth, Fred J.; Wang, Hongwei; Rao, Murali
2015-01-01
Structural heterogeneity of particles can be investigated by their three-dimensional principal components. This paper addresses the question of whether, and with what algorithm, the three-dimensional principal components can be directly recovered from cryo-EM images. The first part of the paper extends the Fourier slice theorem to covariance functions showing that the three-dimensional covariance, and hence the principal components, of a heterogeneous particle can indeed be recovered from two-dimensional cryo-EM images. The second part of the paper proposes a practical algorithm for reconstructing the principal components directly from cryo-EM images without the intermediate step of calculating covariances. This algorithm is based on maximizing the (posterior) likelihood using the Expectation-Maximization algorithm. The last part of the paper applies this algorithm to simulated data and to two real cryo-EM data sets: a data set of the 70S ribosome with and without Elongation Factor-G (EF-G), and a data set of the inluenza virus RNA dependent RNA Polymerase (RdRP). The first principal component of the 70S ribosome data set reveals the expected conformational changes of the ribosome as the EF-G binds and unbinds. The first principal component of the RdRP data set reveals a conformational change in the two dimers of the RdRP. PMID:26049077
Efficient convolutional sparse coding
Wohlberg, Brendt
2017-06-20
Computationally efficient algorithms may be applied for fast dictionary learning solving the convolutional sparse coding problem in the Fourier domain. More specifically, efficient convolutional sparse coding may be derived within an alternating direction method of multipliers (ADMM) framework that utilizes fast Fourier transforms (FFT) to solve the main linear system in the frequency domain. Such algorithms may enable a significant reduction in computational cost over conventional approaches by implementing a linear solver for the most critical and computationally expensive component of the conventional iterative algorithm. The theoretical computational cost of the algorithm may be reduced from O(M.sup.3N) to O(MN log N), where N is the dimensionality of the data and M is the number of elements in the dictionary. This significant improvement in efficiency may greatly increase the range of problems that can practically be addressed via convolutional sparse representations.
Xie, Jianwen; Douglas, Pamela K; Wu, Ying Nian; Brody, Arthur L; Anderson, Ariana E
2017-04-15
Brain networks in fMRI are typically identified using spatial independent component analysis (ICA), yet other mathematical constraints provide alternate biologically-plausible frameworks for generating brain networks. Non-negative matrix factorization (NMF) would suppress negative BOLD signal by enforcing positivity. Spatial sparse coding algorithms (L1 Regularized Learning and K-SVD) would impose local specialization and a discouragement of multitasking, where the total observed activity in a single voxel originates from a restricted number of possible brain networks. The assumptions of independence, positivity, and sparsity to encode task-related brain networks are compared; the resulting brain networks within scan for different constraints are used as basis functions to encode observed functional activity. These encodings are then decoded using machine learning, by using the time series weights to predict within scan whether a subject is viewing a video, listening to an audio cue, or at rest, in 304 fMRI scans from 51 subjects. The sparse coding algorithm of L1 Regularized Learning outperformed 4 variations of ICA (p<0.001) for predicting the task being performed within each scan using artifact-cleaned components. The NMF algorithms, which suppressed negative BOLD signal, had the poorest accuracy compared to the ICA and sparse coding algorithms. Holding constant the effect of the extraction algorithm, encodings using sparser spatial networks (containing more zero-valued voxels) had higher classification accuracy (p<0.001). Lower classification accuracy occurred when the extracted spatial maps contained more CSF regions (p<0.001). The success of sparse coding algorithms suggests that algorithms which enforce sparsity, discourage multitasking, and promote local specialization may capture better the underlying source processes than those which allow inexhaustible local processes such as ICA. Negative BOLD signal may capture task-related activations. Copyright © 2017 Elsevier B.V. All rights reserved.
Simulation of Devices with Molecular Potentials
2013-12-22
10] W. R. Frensley, Wigner - function model of a resonant-tunneling semiconductor de- vice, Phys. Rev. B, 36 (1987), pp. 1570–1580. 6 [11] M. J...develop the principal investigator’s Wigner -Poisson code and extend that code to deal with longer devices and more complex barrier profiles. Over...Research Triangle Park, NC 27709-2211 Molecular Confirmation, Sparse Interpolation, Wigner -Poisson Equation, Parallel Algorithms REPORT DOCUMENTATION PAGE 11
An Introductory Application of Principal Components to Cricket Data
ERIC Educational Resources Information Center
Manage, Ananda B. W.; Scariano, Stephen M.
2013-01-01
Principal Component Analysis is widely used in applied multivariate data analysis, and this article shows how to motivate student interest in this topic using cricket sports data. Here, principal component analysis is successfully used to rank the cricket batsmen and bowlers who played in the 2012 Indian Premier League (IPL) competition. In…
Least Principal Components Analysis (LPCA): An Alternative to Regression Analysis.
ERIC Educational Resources Information Center
Olson, Jeffery E.
Often, all of the variables in a model are latent, random, or subject to measurement error, or there is not an obvious dependent variable. When any of these conditions exist, an appropriate method for estimating the linear relationships among the variables is Least Principal Components Analysis. Least Principal Components are robust, consistent,…
Identifying apple surface defects using principal components analysis and artifical neural networks
USDA-ARS?s Scientific Manuscript database
Artificial neural networks and principal components were used to detect surface defects on apples in near-infrared images. Neural networks were trained and tested on sets of principal components derived from columns of pixels from images of apples acquired at two wavelengths (740 nm and 950 nm). I...
Sparse approximation problem: how rapid simulated annealing succeeds and fails
NASA Astrophysics Data System (ADS)
Obuchi, Tomoyuki; Kabashima, Yoshiyuki
2016-03-01
Information processing techniques based on sparseness have been actively studied in several disciplines. Among them, a mathematical framework to approximately express a given dataset by a combination of a small number of basis vectors of an overcomplete basis is termed the sparse approximation. In this paper, we apply simulated annealing, a metaheuristic algorithm for general optimization problems, to sparse approximation in the situation where the given data have a planted sparse representation and noise is present. The result in the noiseless case shows that our simulated annealing works well in a reasonable parameter region: the planted solution is found fairly rapidly. This is true even in the case where a common relaxation of the sparse approximation problem, the G-relaxation, is ineffective. On the other hand, when the dimensionality of the data is close to the number of non-zero components, another metastable state emerges, and our algorithm fails to find the planted solution. This phenomenon is associated with a first-order phase transition. In the case of very strong noise, it is no longer meaningful to search for the planted solution. In this situation, our algorithm determines a solution with close-to-minimum distortion fairly quickly.
Revealing the Hidden Relationship by Sparse Modules in Complex Networks with a Large-Scale Analysis
Jiao, Qing-Ju; Huang, Yan; Liu, Wei; Wang, Xiao-Fan; Chen, Xiao-Shuang; Shen, Hong-Bin
2013-01-01
One of the remarkable features of networks is module that can provide useful insights into not only network organizations but also functional behaviors between their components. Comprehensive efforts have been devoted to investigating cohesive modules in the past decade. However, it is still not clear whether there are important structural characteristics of the nodes that do not belong to any cohesive module. In order to answer this question, we performed a large-scale analysis on 25 complex networks with different types and scales using our recently developed BTS (bintree seeking) algorithm, which is able to detect both cohesive and sparse modules in the network. Our results reveal that the sparse modules composed by the cohesively isolated nodes widely co-exist with the cohesive modules. Detailed analysis shows that both types of modules provide better characterization for the division of a network into functional units than merely cohesive modules, because the sparse modules possibly re-organize the nodes in the so-called cohesive modules, which lack obvious modular significance, into meaningful groups. Compared with cohesive modules, the sizes of sparse ones are generally smaller. Sparse modules are also found to have preferences in social and biological networks than others. PMID:23762457
Finding Planets in K2: A New Method of Cleaning the Data
NASA Astrophysics Data System (ADS)
Currie, Miles; Mullally, Fergal; Thompson, Susan E.
2017-01-01
We present a new method of removing systematic flux variations from K2 light curves by employing a pixel-level principal component analysis (PCA). This method decomposes the light curves into its principal components (eigenvectors), each with an associated eigenvalue, the value of which is correlated to how much influence the basis vector has on the shape of the light curve. This method assumes that the most influential basis vectors will correspond to the unwanted systematic variations in the light curve produced by K2’s constant motion. We correct the raw light curve by automatically fitting and removing the strongest principal components. The strongest principal components generally correspond to the flux variations that result from the motion of the star in the field of view. Our primary method of calculating the strongest principal components to correct for in the raw light curve estimates the noise by measuring the scatter in the light curve after using an algorithm for Savitsy-Golay detrending, which computes the combined photometric precision value (SG-CDPP value) used in classic Kepler. We calculate this value after correcting the raw light curve for each element in a list of cumulative sums of principal components so that we have as many noise estimate values as there are principal components. We then take the derivative of the list of SG-CDPP values and take the number of principal components that correlates to the point at which the derivative effectively goes to zero. This is the optimal number of principal components to exclude from the refitting of the light curve. We find that a pixel-level PCA is sufficient for cleaning unwanted systematic and natural noise from K2’s light curves. We present preliminary results and a basic comparison to other methods of reducing the noise from the flux variations.
Finding Imaging Patterns of Structural Covariance via Non-Negative Matrix Factorization
Sotiras, Aristeidis; Resnick, Susan M.; Davatzikos, Christos
2015-01-01
In this paper, we investigate the use of Non-Negative Matrix Factorization (NNMF) for the analysis of structural neuroimaging data. The goal is to identify the brain regions that co-vary across individuals in a consistent way, hence potentially being part of underlying brain networks or otherwise influenced by underlying common mechanisms such as genetics and pathologies. NNMF offers a directly data-driven way of extracting relatively localized co-varying structural regions, thereby transcending limitations of Principal Component Analysis (PCA), Independent Component Analysis (ICA) and other related methods that tend to produce dispersed components of positive and negative loadings. In particular, leveraging upon the well known ability of NNMF to produce parts-based representations of image data, we derive decompositions that partition the brain into regions that vary in consistent ways across individuals. Importantly, these decompositions achieve dimensionality reduction via highly interpretable ways and generalize well to new data as shown via split-sample experiments. We empirically validate NNMF in two data sets: i) a Diffusion Tensor (DT) mouse brain development study, and ii) a structural Magnetic Resonance (sMR) study of human brain aging. We demonstrate the ability of NNMF to produce sparse parts-based representations of the data at various resolutions. These representations seem to follow what we know about the underlying functional organization of the brain and also capture some pathological processes. Moreover, we show that these low dimensional representations favorably compare to descriptions obtained with more commonly used matrix factorization methods like PCA and ICA. PMID:25497684
Directly reconstructing principal components of heterogeneous particles from cryo-EM images.
Tagare, Hemant D; Kucukelbir, Alp; Sigworth, Fred J; Wang, Hongwei; Rao, Murali
2015-08-01
Structural heterogeneity of particles can be investigated by their three-dimensional principal components. This paper addresses the question of whether, and with what algorithm, the three-dimensional principal components can be directly recovered from cryo-EM images. The first part of the paper extends the Fourier slice theorem to covariance functions showing that the three-dimensional covariance, and hence the principal components, of a heterogeneous particle can indeed be recovered from two-dimensional cryo-EM images. The second part of the paper proposes a practical algorithm for reconstructing the principal components directly from cryo-EM images without the intermediate step of calculating covariances. This algorithm is based on maximizing the posterior likelihood using the Expectation-Maximization algorithm. The last part of the paper applies this algorithm to simulated data and to two real cryo-EM data sets: a data set of the 70S ribosome with and without Elongation Factor-G (EF-G), and a data set of the influenza virus RNA dependent RNA Polymerase (RdRP). The first principal component of the 70S ribosome data set reveals the expected conformational changes of the ribosome as the EF-G binds and unbinds. The first principal component of the RdRP data set reveals a conformational change in the two dimers of the RdRP. Copyright © 2015 Elsevier Inc. All rights reserved.
40 CFR 60.2998 - What are the principal components of the model rule?
Code of Federal Regulations, 2010 CFR
2010-07-01
... 40 Protection of Environment 6 2010-07-01 2010-07-01 false What are the principal components of... December 9, 2004 Model Rule-Use of Model Rule § 60.2998 What are the principal components of the model rule... management plan. (c) Operator training and qualification. (d) Emission limitations and operating limits. (e...
40 CFR 60.2570 - What are the principal components of the model rule?
Code of Federal Regulations, 2010 CFR
2010-07-01
... 40 Protection of Environment 6 2010-07-01 2010-07-01 false What are the principal components of... Construction On or Before November 30, 1999 Use of Model Rule § 60.2570 What are the principal components of... (k) of this section. (a) Increments of progress toward compliance. (b) Waste management plan. (c...
Maisuradze, Gia G; Leitner, David M
2007-05-15
Dihedral principal component analysis (dPCA) has recently been developed and shown to display complex features of the free energy landscape of a biomolecule that may be absent in the free energy landscape plotted in principal component space due to mixing of internal and overall rotational motion that can occur in principal component analysis (PCA) [Mu et al., Proteins: Struct Funct Bioinfo 2005;58:45-52]. Another difficulty in the implementation of PCA is sampling convergence, which we address here for both dPCA and PCA using a tetrapeptide as an example. We find that for both methods the sampling convergence can be reached over a similar time. Minima in the free energy landscape in the space of the two largest dihedral principal components often correspond to unique structures, though we also find some distinct minima to correspond to the same structure. 2007 Wiley-Liss, Inc.
High-Dimensional Sparse Factor Modeling: Applications in Gene Expression Genomics
Carvalho, Carlos M.; Chang, Jeffrey; Lucas, Joseph E.; Nevins, Joseph R.; Wang, Quanli; West, Mike
2010-01-01
We describe studies in molecular profiling and biological pathway analysis that use sparse latent factor and regression models for microarray gene expression data. We discuss breast cancer applications and key aspects of the modeling and computational methodology. Our case studies aim to investigate and characterize heterogeneity of structure related to specific oncogenic pathways, as well as links between aggregate patterns in gene expression profiles and clinical biomarkers. Based on the metaphor of statistically derived “factors” as representing biological “subpathway” structure, we explore the decomposition of fitted sparse factor models into pathway subcomponents and investigate how these components overlay multiple aspects of known biological activity. Our methodology is based on sparsity modeling of multivariate regression, ANOVA, and latent factor models, as well as a class of models that combines all components. Hierarchical sparsity priors address questions of dimension reduction and multiple comparisons, as well as scalability of the methodology. The models include practically relevant non-Gaussian/nonparametric components for latent structure, underlying often quite complex non-Gaussianity in multivariate expression patterns. Model search and fitting are addressed through stochastic simulation and evolutionary stochastic search methods that are exemplified in the oncogenic pathway studies. Supplementary supporting material provides more details of the applications, as well as examples of the use of freely available software tools for implementing the methodology. PMID:21218139
Lung dynamic MRI deblurring using low-rank decomposition and dictionary learning.
Gou, Shuiping; Wang, Yueyue; Wu, Jiaolong; Lee, Percy; Sheng, Ke
2015-04-01
Lung dynamic MRI (dMRI) has emerged to be an appealing tool to quantify lung motion for both planning and treatment guidance purposes. However, this modality can result in blurry images due to intrinsically low signal-to-noise ratio in the lung and spatial/temporal interpolation. The image blurring could adversely affect the image processing that depends on the availability of fine landmarks. The purpose of this study is to reduce dMRI blurring using image postprocessing. To enhance the image quality and exploit the spatiotemporal continuity of dMRI sequences, a low-rank decomposition and dictionary learning (LDDL) method was employed to deblur lung dMRI and enhance the conspicuity of lung blood vessels. Fifty frames of continuous 2D coronal dMRI frames using a steady state free precession sequence were obtained from five subjects including two healthy volunteer and three lung cancer patients. In LDDL, the lung dMRI was decomposed into sparse and low-rank components. Dictionary learning was employed to estimate the blurring kernel based on the whole image, low-rank or sparse component of the first image in the lung MRI sequence. Deblurring was performed on the whole image sequences using deconvolution based on the estimated blur kernel. The deblurring results were quantified using an automated blood vessel extraction method based on the classification of Hessian matrix filtered images. Accuracy of automated extraction was calculated using manual segmentation of the blood vessels as the ground truth. In the pilot study, LDDL based on the blurring kernel estimated from the sparse component led to performance superior to the other ways of kernel estimation. LDDL consistently improved image contrast and fine feature conspicuity of the original MRI without introducing artifacts. The accuracy of automated blood vessel extraction was on average increased by 16% using manual segmentation as the ground truth. Image blurring in dMRI images can be effectively reduced using a low-rank decomposition and dictionary learning method using kernels estimated by the sparse component.
Fast, Exact Bootstrap Principal Component Analysis for p > 1 million
Fisher, Aaron; Caffo, Brian; Schwartz, Brian; Zipunnikov, Vadim
2015-01-01
Many have suggested a bootstrap procedure for estimating the sampling variability of principal component analysis (PCA) results. However, when the number of measurements per subject (p) is much larger than the number of subjects (n), calculating and storing the leading principal components from each bootstrap sample can be computationally infeasible. To address this, we outline methods for fast, exact calculation of bootstrap principal components, eigenvalues, and scores. Our methods leverage the fact that all bootstrap samples occupy the same n-dimensional subspace as the original sample. As a result, all bootstrap principal components are limited to the same n-dimensional subspace and can be efficiently represented by their low dimensional coordinates in that subspace. Several uncertainty metrics can be computed solely based on the bootstrap distribution of these low dimensional coordinates, without calculating or storing the p-dimensional bootstrap components. Fast bootstrap PCA is applied to a dataset of sleep electroencephalogram recordings (p = 900, n = 392), and to a dataset of brain magnetic resonance images (MRIs) (p ≈ 3 million, n = 352). For the MRI dataset, our method allows for standard errors for the first 3 principal components based on 1000 bootstrap samples to be calculated on a standard laptop in 47 minutes, as opposed to approximately 4 days with standard methods. PMID:27616801
ERIC Educational Resources Information Center
Oplatka, Izhar
2017-01-01
Purpose: In order to fill the gap in theoretical and empirical knowledge about the characteristics of principal workload, the purpose of this paper is to explore the components of principal workload as well as its determinants and the coping strategies commonly used by principals to face this personal state. Design/methodology/approach:…
Supernova Photometric Lightcurve Classification
NASA Astrophysics Data System (ADS)
Zaidi, Tayeb; Narayan, Gautham
2016-01-01
This is a preliminary report on photometric supernova classification. We first explore the properties of supernova light curves, and attempt to restructure the unevenly sampled and sparse data from assorted datasets to allow for processing and classification. The data was primarily drawn from the Dark Energy Survey (DES) simulated data, created for the Supernova Photometric Classification Challenge. This poster shows a method for producing a non-parametric representation of the light curve data, and applying a Random Forest classifier algorithm to distinguish between supernovae types. We examine the impact of Principal Component Analysis to reduce the dimensionality of the dataset, for future classification work. The classification code will be used in a stage of the ANTARES pipeline, created for use on the Large Synoptic Survey Telescope alert data and other wide-field surveys. The final figure-of-merit for the DES data in the r band was 60% for binary classification (Type I vs II).Zaidi was supported by the NOAO/KPNO Research Experiences for Undergraduates (REU) Program which is funded by the National Science Foundation Research Experiences for Undergraduates Program (AST-1262829).
Ayubi, Erfan; Sani, Mohadeseh; Safiri, Saeid; Khedmati Morasae, Esmaeil; Almasi-Hashiani, Amir; Nazarzadeh, Milad
2017-07-01
The effect of socioeconomic status on adolescent smoking behaviors is unclear, and sparse studies are available about the potential association. The present study aimed to measure and explain socioeconomic inequality in smoking behavior among a sample of Iranian adolescents. In a cross-sectional survey, a multistage sample of adolescents ( n = 1,064) was recruited from high school students in Zanjan city, northwest of Iran. Principal component analysis was used to measure economic status of adolescents. Concentration index was used to measure socioeconomic inequality in smoking behavior, and then it was decomposed to reveal inequality contributors. Concentration index and its 95% confidence interval for never, experimental, and regular smoking behaviors were 0.004 [-0.03, 0.04], 0.05 [0.02, 0.11], and -0.10 [-0.04, -0.19], respectively. The contribution of economic status to measured inequality in experimental and regular smoking was 80.0% and 68.8%, respectively. Household economic status could be targeted as one of the relevant factors in the unequal distribution of smoking behavior among adolescents.
Image Fusion of CT and MR with Sparse Representation in NSST Domain
Qiu, Chenhui; Wang, Yuanyuan; Zhang, Huan
2017-01-01
Multimodal image fusion techniques can integrate the information from different medical images to get an informative image that is more suitable for joint diagnosis, preoperative planning, intraoperative guidance, and interventional treatment. Fusing images of CT and different MR modalities are studied in this paper. Firstly, the CT and MR images are both transformed to nonsubsampled shearlet transform (NSST) domain. So the low-frequency components and high-frequency components are obtained. Then the high-frequency components are merged using the absolute-maximum rule, while the low-frequency components are merged by a sparse representation- (SR-) based approach. And the dynamic group sparsity recovery (DGSR) algorithm is proposed to improve the performance of the SR-based approach. Finally, the fused image is obtained by performing the inverse NSST on the merged components. The proposed fusion method is tested on a number of clinical CT and MR images and compared with several popular image fusion methods. The experimental results demonstrate that the proposed fusion method can provide better fusion results in terms of subjective quality and objective evaluation. PMID:29250134
Image Fusion of CT and MR with Sparse Representation in NSST Domain.
Qiu, Chenhui; Wang, Yuanyuan; Zhang, Huan; Xia, Shunren
2017-01-01
Multimodal image fusion techniques can integrate the information from different medical images to get an informative image that is more suitable for joint diagnosis, preoperative planning, intraoperative guidance, and interventional treatment. Fusing images of CT and different MR modalities are studied in this paper. Firstly, the CT and MR images are both transformed to nonsubsampled shearlet transform (NSST) domain. So the low-frequency components and high-frequency components are obtained. Then the high-frequency components are merged using the absolute-maximum rule, while the low-frequency components are merged by a sparse representation- (SR-) based approach. And the dynamic group sparsity recovery (DGSR) algorithm is proposed to improve the performance of the SR-based approach. Finally, the fused image is obtained by performing the inverse NSST on the merged components. The proposed fusion method is tested on a number of clinical CT and MR images and compared with several popular image fusion methods. The experimental results demonstrate that the proposed fusion method can provide better fusion results in terms of subjective quality and objective evaluation.
Considering Horn's Parallel Analysis from a Random Matrix Theory Point of View.
Saccenti, Edoardo; Timmerman, Marieke E
2017-03-01
Horn's parallel analysis is a widely used method for assessing the number of principal components and common factors. We discuss the theoretical foundations of parallel analysis for principal components based on a covariance matrix by making use of arguments from random matrix theory. In particular, we show that (i) for the first component, parallel analysis is an inferential method equivalent to the Tracy-Widom test, (ii) its use to test high-order eigenvalues is equivalent to the use of the joint distribution of the eigenvalues, and thus should be discouraged, and (iii) a formal test for higher-order components can be obtained based on a Tracy-Widom approximation. We illustrate the performance of the two testing procedures using simulated data generated under both a principal component model and a common factors model. For the principal component model, the Tracy-Widom test performs consistently in all conditions, while parallel analysis shows unpredictable behavior for higher-order components. For the common factor model, including major and minor factors, both procedures are heuristic approaches, with variable performance. We conclude that the Tracy-Widom procedure is preferred over parallel analysis for statistically testing the number of principal components based on a covariance matrix.
Anomaly Detection in Moving-Camera Video Sequences Using Principal Subspace Analysis
Thomaz, Lucas A.; Jardim, Eric; da Silva, Allan F.; ...
2017-10-16
This study presents a family of algorithms based on sparse decompositions that detect anomalies in video sequences obtained from slow moving cameras. These algorithms start by computing the union of subspaces that best represents all the frames from a reference (anomaly free) video as a low-rank projection plus a sparse residue. Then, they perform a low-rank representation of a target (possibly anomalous) video by taking advantage of both the union of subspaces and the sparse residue computed from the reference video. Such algorithms provide good detection results while at the same time obviating the need for previous video synchronization. However,more » this is obtained at the cost of a large computational complexity, which hinders their applicability. Another contribution of this paper approaches this problem by using intrinsic properties of the obtained data representation in order to restrict the search space to the most relevant subspaces, providing computational complexity gains of up to two orders of magnitude. The developed algorithms are shown to cope well with videos acquired in challenging scenarios, as verified by the analysis of 59 videos from the VDAO database that comprises videos with abandoned objects in a cluttered industrial scenario.« less
Anomaly Detection in Moving-Camera Video Sequences Using Principal Subspace Analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Thomaz, Lucas A.; Jardim, Eric; da Silva, Allan F.
This study presents a family of algorithms based on sparse decompositions that detect anomalies in video sequences obtained from slow moving cameras. These algorithms start by computing the union of subspaces that best represents all the frames from a reference (anomaly free) video as a low-rank projection plus a sparse residue. Then, they perform a low-rank representation of a target (possibly anomalous) video by taking advantage of both the union of subspaces and the sparse residue computed from the reference video. Such algorithms provide good detection results while at the same time obviating the need for previous video synchronization. However,more » this is obtained at the cost of a large computational complexity, which hinders their applicability. Another contribution of this paper approaches this problem by using intrinsic properties of the obtained data representation in order to restrict the search space to the most relevant subspaces, providing computational complexity gains of up to two orders of magnitude. The developed algorithms are shown to cope well with videos acquired in challenging scenarios, as verified by the analysis of 59 videos from the VDAO database that comprises videos with abandoned objects in a cluttered industrial scenario.« less
Improved FastICA algorithm in fMRI data analysis using the sparsity property of the sources.
Ge, Ruiyang; Wang, Yubao; Zhang, Jipeng; Yao, Li; Zhang, Hang; Long, Zhiying
2016-04-01
As a blind source separation technique, independent component analysis (ICA) has many applications in functional magnetic resonance imaging (fMRI). Although either temporal or spatial prior information has been introduced into the constrained ICA and semi-blind ICA methods to improve the performance of ICA in fMRI data analysis, certain types of additional prior information, such as the sparsity, has seldom been added to the ICA algorithms as constraints. In this study, we proposed a SparseFastICA method by adding the source sparsity as a constraint to the FastICA algorithm to improve the performance of the widely used FastICA. The source sparsity is estimated through a smoothed ℓ0 norm method. We performed experimental tests on both simulated data and real fMRI data to investigate the feasibility and robustness of SparseFastICA and made a performance comparison between SparseFastICA, FastICA and Infomax ICA. Results of the simulated and real fMRI data demonstrated the feasibility and robustness of SparseFastICA for the source separation in fMRI data. Both the simulated and real fMRI experimental results showed that SparseFastICA has better robustness to noise and better spatial detection power than FastICA. Although the spatial detection power of SparseFastICA and Infomax did not show significant difference, SparseFastICA had faster computation speed than Infomax. SparseFastICA was comparable to the Infomax algorithm with a faster computation speed. More importantly, SparseFastICA outperformed FastICA in robustness and spatial detection power and can be used to identify more accurate brain networks than FastICA algorithm. Copyright © 2016 Elsevier B.V. All rights reserved.
The Influence Function of Principal Component Analysis by Self-Organizing Rule.
Higuchi; Eguchi
1998-07-28
This article is concerned with a neural network approach to principal component analysis (PCA). An algorithm for PCA by the self-organizing rule has been proposed and its robustness observed through the simulation study by Xu and Yuille (1995). In this article, the robustness of the algorithm against outliers is investigated by using the theory of influence function. The influence function of the principal component vector is given in an explicit form. Through this expression, the method is shown to be robust against any directions orthogonal to the principal component vector. In addition, a statistic generated by the self-organizing rule is proposed to assess the influence of data in PCA.
Brown, C. Erwin
1993-01-01
Correlation analysis in conjunction with principal-component and multiple-regression analyses were applied to laboratory chemical and petrographic data to assess the usefulness of these techniques in evaluating selected physical and hydraulic properties of carbonate-rock aquifers in central Pennsylvania. Correlation and principal-component analyses were used to establish relations and associations among variables, to determine dimensions of property variation of samples, and to filter the variables containing similar information. Principal-component and correlation analyses showed that porosity is related to other measured variables and that permeability is most related to porosity and grain size. Four principal components are found to be significant in explaining the variance of data. Stepwise multiple-regression analysis was used to see how well the measured variables could predict porosity and (or) permeability for this suite of rocks. The variation in permeability and porosity is not totally predicted by the other variables, but the regression is significant at the 5% significance level. ?? 1993.
Hemmateenejad, Bahram; Akhond, Morteza; Miri, Ramin; Shamsipur, Mojtaba
2003-01-01
A QSAR algorithm, principal component-genetic algorithm-artificial neural network (PC-GA-ANN), has been applied to a set of newly synthesized calcium channel blockers, which are of special interest because of their role in cardiac diseases. A data set of 124 1,4-dihydropyridines bearing different ester substituents at the C-3 and C-5 positions of the dihydropyridine ring and nitroimidazolyl, phenylimidazolyl, and methylsulfonylimidazolyl groups at the C-4 position with known Ca(2+) channel binding affinities was employed in this study. Ten different sets of descriptors (837 descriptors) were calculated for each molecule. The principal component analysis was used to compress the descriptor groups into principal components. The most significant descriptors of each set were selected and used as input for the ANN. The genetic algorithm (GA) was used for the selection of the best set of extracted principal components. A feed forward artificial neural network with a back-propagation of error algorithm was used to process the nonlinear relationship between the selected principal components and biological activity of the dihydropyridines. A comparison between PC-GA-ANN and routine PC-ANN shows that the first model yields better prediction ability.
Salvatore, Stefania; Bramness, Jørgen G; Røislien, Jo
2016-07-12
Wastewater-based epidemiology (WBE) is a novel approach in drug use epidemiology which aims to monitor the extent of use of various drugs in a community. In this study, we investigate functional principal component analysis (FPCA) as a tool for analysing WBE data and compare it to traditional principal component analysis (PCA) and to wavelet principal component analysis (WPCA) which is more flexible temporally. We analysed temporal wastewater data from 42 European cities collected daily over one week in March 2013. The main temporal features of ecstasy (MDMA) were extracted using FPCA using both Fourier and B-spline basis functions with three different smoothing parameters, along with PCA and WPCA with different mother wavelets and shrinkage rules. The stability of FPCA was explored through bootstrapping and analysis of sensitivity to missing data. The first three principal components (PCs), functional principal components (FPCs) and wavelet principal components (WPCs) explained 87.5-99.6 % of the temporal variation between cities, depending on the choice of basis and smoothing. The extracted temporal features from PCA, FPCA and WPCA were consistent. FPCA using Fourier basis and common-optimal smoothing was the most stable and least sensitive to missing data. FPCA is a flexible and analytically tractable method for analysing temporal changes in wastewater data, and is robust to missing data. WPCA did not reveal any rapid temporal changes in the data not captured by FPCA. Overall the results suggest FPCA with Fourier basis functions and common-optimal smoothing parameter as the most accurate approach when analysing WBE data.
40 CFR 62.14505 - What are the principal components of this subpart?
Code of Federal Regulations, 2010 CFR
2010-07-01
... 40 Protection of Environment 8 2010-07-01 2010-07-01 false What are the principal components of this subpart? 62.14505 Section 62.14505 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY... components of this subpart? This subpart contains the eleven major components listed in paragraphs (a...
Sparse regularization for force identification using dictionaries
NASA Astrophysics Data System (ADS)
Qiao, Baijie; Zhang, Xingwu; Wang, Chenxi; Zhang, Hang; Chen, Xuefeng
2016-04-01
The classical function expansion method based on minimizing l2-norm of the response residual employs various basis functions to represent the unknown force. Its difficulty lies in determining the optimum number of basis functions. Considering the sparsity of force in the time domain or in other basis space, we develop a general sparse regularization method based on minimizing l1-norm of the coefficient vector of basis functions. The number of basis functions is adaptively determined by minimizing the number of nonzero components in the coefficient vector during the sparse regularization process. First, according to the profile of the unknown force, the dictionary composed of basis functions is determined. Second, a sparsity convex optimization model for force identification is constructed. Third, given the transfer function and the operational response, Sparse reconstruction by separable approximation (SpaRSA) is developed to solve the sparse regularization problem of force identification. Finally, experiments including identification of impact and harmonic forces are conducted on a cantilever thin plate structure to illustrate the effectiveness and applicability of SpaRSA. Besides the Dirac dictionary, other three sparse dictionaries including Db6 wavelets, Sym4 wavelets and cubic B-spline functions can also accurately identify both the single and double impact forces from highly noisy responses in a sparse representation frame. The discrete cosine functions can also successfully reconstruct the harmonic forces including the sinusoidal, square and triangular forces. Conversely, the traditional Tikhonov regularization method with the L-curve criterion fails to identify both the impact and harmonic forces in these cases.
Latent feature decompositions for integrative analysis of multi-platform genomic data
Gregory, Karl B.; Momin, Amin A.; Coombes, Kevin R.; Baladandayuthapani, Veerabhadran
2015-01-01
Increased availability of multi-platform genomics data on matched samples has sparked research efforts to discover how diverse molecular features interact both within and between platforms. In addition, simultaneous measurements of genetic and epigenetic characteristics illuminate the roles their complex relationships play in disease progression and outcomes. However, integrative methods for diverse genomics data are faced with the challenges of ultra-high dimensionality and the existence of complex interactions both within and between platforms. We propose a novel modeling framework for integrative analysis based on decompositions of the large number of platform-specific features into a smaller number of latent features. Subsequently we build a predictive model for clinical outcomes accounting for both within- and between-platform interactions based on Bayesian model averaging procedures. Principal components, partial least squares and non-negative matrix factorization as well as sparse counterparts of each are used to define the latent features, and the performance of these decompositions is compared both on real and simulated data. The latent feature interactions are shown to preserve interactions between the original features and not only aid prediction but also allow explicit selection of outcome-related features. The methods are motivated by and applied to, a glioblastoma multiforme dataset from The Cancer Genome Atlas to predict patient survival times integrating gene expression, microRNA, copy number and methylation data. For the glioblastoma data, we find a high concordance between our selected prognostic genes and genes with known associations with glioblastoma. In addition, our model discovers several relevant cross-platform interactions such as copy number variation associated gene dosing and epigenetic regulation through promoter methylation. On simulated data, we show that our proposed method successfully incorporates interactions within and between genomic platforms to aid accurate prediction and variable selection. Our methods perform best when principal components are used to define the latent features. PMID:26146492
Hierarchical Regularity in Multi-Basin Dynamics on Protein Landscapes
NASA Astrophysics Data System (ADS)
Matsunaga, Yasuhiro; Kostov, Konstatin S.; Komatsuzaki, Tamiki
2004-04-01
We analyze time series of potential energy fluctuations and principal components at several temperatures for two kinds of off-lattice 46-bead models that have two distinctive energy landscapes. The less-frustrated "funnel" energy landscape brings about stronger nonstationary behavior of the potential energy fluctuations at the folding temperature than the other, rather frustrated energy landscape at the collapse temperature. By combining principal component analysis with an embedding nonlinear time-series analysis, it is shown that the fast fluctuations with small amplitudes of 70-80% of the principal components cause the time series to become almost "random" in only 100 simulation steps. However, the stochastic feature of the principal components tends to be suppressed through a wide range of degrees of freedom at the transition temperature.
Sparse dictionary learning for resting-state fMRI analysis
NASA Astrophysics Data System (ADS)
Lee, Kangjoo; Han, Paul Kyu; Ye, Jong Chul
2011-09-01
Recently, there has been increased interest in the usage of neuroimaging techniques to investigate what happens in the brain at rest. Functional imaging studies have revealed that the default-mode network activity is disrupted in Alzheimer's disease (AD). However, there is no consensus, as yet, on the choice of analysis method for the application of resting-state analysis for disease classification. This paper proposes a novel compressed sensing based resting-state fMRI analysis tool called Sparse-SPM. As the brain's functional systems has shown to have features of complex networks according to graph theoretical analysis, we apply a graph model to represent a sparse combination of information flows in complex network perspectives. In particular, a new concept of spatially adaptive design matrix has been proposed by implementing sparse dictionary learning based on sparsity. The proposed approach shows better performance compared to other conventional methods, such as independent component analysis (ICA) and seed-based approach, in classifying the AD patients from normal using resting-state analysis.
Principals' Perceptions Regarding Their Supervision and Evaluation
ERIC Educational Resources Information Center
Hvidston, David J.; Range, Bret G.; McKim, Courtney Ann
2015-01-01
This study examined the perceptions of principals concerning principal evaluation and supervisory feedback. Principals were asked two open-ended questions. Respondents included 82 principals in the Rocky Mountain region. The emerging themes were "Superintendent Performance," "Principal Evaluation Components," "Specific…
Elastic-Waveform Inversion with Compressive Sensing for Sparse Seismic Data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lin, Youzuo; Huang, Lianjie
2015-01-28
Accurate velocity models of compressional- and shear-waves are essential for geothermal reservoir characterization and microseismic imaging. Elastic-waveform inversion of multi-component seismic data can provide high-resolution inversion results of subsurface geophysical properties. However, the method requires seismic data acquired using dense source and receiver arrays. In practice, seismic sources and/or geophones are often sparsely distributed on the surface and/or in a borehole, such as 3D vertical seismic profiling (VSP) surveys. We develop a novel elastic-waveform inversion method with compressive sensing for inversion of sparse seismic data. We employ an alternating-minimization algorithm to solve the optimization problem of our new waveform inversionmore » method. We validate our new method using synthetic VSP data for a geophysical model built using geologic features found at the Raft River enhanced-geothermal-system (EGS) field. We apply our method to synthetic VSP data with a sparse source array and compare the results with those obtained with a dense source array. Our numerical results demonstrate that the velocity models produced with our new method using a sparse source array are almost as accurate as those obtained using a dense source array.« less
Dimensionality reduction of collective motion by principal manifolds
NASA Astrophysics Data System (ADS)
Gajamannage, Kelum; Butail, Sachit; Porfiri, Maurizio; Bollt, Erik M.
2015-01-01
While the existence of low-dimensional embedding manifolds has been shown in patterns of collective motion, the current battery of nonlinear dimensionality reduction methods is not amenable to the analysis of such manifolds. This is mainly due to the necessary spectral decomposition step, which limits control over the mapping from the original high-dimensional space to the embedding space. Here, we propose an alternative approach that demands a two-dimensional embedding which topologically summarizes the high-dimensional data. In this sense, our approach is closely related to the construction of one-dimensional principal curves that minimize orthogonal error to data points subject to smoothness constraints. Specifically, we construct a two-dimensional principal manifold directly in the high-dimensional space using cubic smoothing splines, and define the embedding coordinates in terms of geodesic distances. Thus, the mapping from the high-dimensional data to the manifold is defined in terms of local coordinates. Through representative examples, we show that compared to existing nonlinear dimensionality reduction methods, the principal manifold retains the original structure even in noisy and sparse datasets. The principal manifold finding algorithm is applied to configurations obtained from a dynamical system of multiple agents simulating a complex maneuver called predator mobbing, and the resulting two-dimensional embedding is compared with that of a well-established nonlinear dimensionality reduction method.
Disentangling multidimensional spatio-temporal data into their common and aberrant responses
Chang, Young Hwan; Korkola, James; Amin, Dhara N.; ...
2015-04-22
With the advent of high-throughput measurement techniques, scientists and engineers are starting to grapple with massive data sets and encountering challenges with how to organize, process and extract information into meaningful structures. Multidimensional spatio-temporal biological data sets such as time series gene expression with various perturbations over different cell lines, or neural spike trains across many experimental trials, have the potential to acquire insight about the dynamic behavior of the system. For this potential to be realized, we need a suitable representation to understand the data. A general question is how to organize the observed data into meaningful structures andmore » how to find an appropriate similarity measure. A natural way of viewing these complex high dimensional data sets is to examine and analyze the large-scale features and then to focus on the interesting details. Since the wide range of experiments and unknown complexity of the underlying system contribute to the heterogeneity of biological data, we develop a new method by proposing an extension of Robust Principal Component Analysis (RPCA), which models common variations across multiple experiments as the lowrank component and anomalies across these experiments as the sparse component. We show that the proposed method is able to find distinct subtypes and classify data sets in a robust way without any prior knowledge by separating these common responses and abnormal responses. Thus, the proposed method provides us a new representation of these data sets which has the potential to help users acquire new insight from data.« less
Finding imaging patterns of structural covariance via Non-Negative Matrix Factorization.
Sotiras, Aristeidis; Resnick, Susan M; Davatzikos, Christos
2015-03-01
In this paper, we investigate the use of Non-Negative Matrix Factorization (NNMF) for the analysis of structural neuroimaging data. The goal is to identify the brain regions that co-vary across individuals in a consistent way, hence potentially being part of underlying brain networks or otherwise influenced by underlying common mechanisms such as genetics and pathologies. NNMF offers a directly data-driven way of extracting relatively localized co-varying structural regions, thereby transcending limitations of Principal Component Analysis (PCA), Independent Component Analysis (ICA) and other related methods that tend to produce dispersed components of positive and negative loadings. In particular, leveraging upon the well known ability of NNMF to produce parts-based representations of image data, we derive decompositions that partition the brain into regions that vary in consistent ways across individuals. Importantly, these decompositions achieve dimensionality reduction via highly interpretable ways and generalize well to new data as shown via split-sample experiments. We empirically validate NNMF in two data sets: i) a Diffusion Tensor (DT) mouse brain development study, and ii) a structural Magnetic Resonance (sMR) study of human brain aging. We demonstrate the ability of NNMF to produce sparse parts-based representations of the data at various resolutions. These representations seem to follow what we know about the underlying functional organization of the brain and also capture some pathological processes. Moreover, we show that these low dimensional representations favorably compare to descriptions obtained with more commonly used matrix factorization methods like PCA and ICA. Copyright © 2014 Elsevier Inc. All rights reserved.
The disease complex of the gypsy moth. 1. Major components
R.W. Campbell; J.D. Podgwaite
1971-01-01
A study was undertaken to elucidate the impact of the various components of disease on natural populations of the gypsy moth, Porthetria dispar. Diseased larvae from both sparse and dense populations were examined and categorized on the basis of etiologic and nonetiologic mortality factors. Results indicated a significantly higher incidence of...
Nguyen, Phuong H
2007-05-15
Principal component analysis is a powerful method for projecting multidimensional conformational space of peptides or proteins onto lower dimensional subspaces in which the main conformations are present, making it easier to reveal the structures of molecules from e.g. molecular dynamics simulation trajectories. However, the identification of all conformational states is still difficult if the subspaces consist of more than two dimensions. This is mainly due to the fact that the principal components are not independent with each other, and states in the subspaces cannot be visualized. In this work, we propose a simple and fast scheme that allows one to obtain all conformational states in the subspaces. The basic idea is that instead of directly identifying the states in the subspace spanned by principal components, we first transform this subspace into another subspace formed by components that are independent of one other. These independent components are obtained from the principal components by employing the independent component analysis method. Because of independence between components, all states in this new subspace are defined as all possible combinations of the states obtained from each single independent component. This makes the conformational analysis much simpler. We test the performance of the method by analyzing the conformations of the glycine tripeptide and the alanine hexapeptide. The analyses show that our method is simple and quickly reveal all conformational states in the subspaces. The folding pathways between the identified states of the alanine hexapeptide are analyzed and discussed in some detail. 2007 Wiley-Liss, Inc.
Liu, Hui-lin; Wan, Xia; Yang, Gong-huan
2013-02-01
To explore the relationship between the strength of tobacco control and the effectiveness of creating smoke-free hospital, and summarize the main factors that affect the program of creating smoke-free hospitals. A total of 210 hospitals from 7 provinces/municipalities directly under the central government were enrolled in this study using stratified random sampling method. Principle component analysis and regression analysis were conducted to analyze the strength of tobacco control and the effectiveness of creating smoke-free hospitals. Two principal components were extracted in the strength of tobacco control index, which respectively reflected the tobacco control policies and efforts, and the willingness and leadership of hospital managers regarding tobacco control. The regression analysis indicated that only the first principal component was significantly correlated with the progression in creating smoke-free hospital (P<0.001), i.e. hospitals with higher scores on the first principal component had better achievements in smoke-free environment creation. Tobacco control policies and efforts are critical in creating smoke-free hospitals. The principal component analysis provides a comprehensive and objective tool for evaluating the creation of smoke-free hospitals.
A robust sparse-modeling framework for estimating schizophrenia biomarkers from fMRI.
Dillon, Keith; Calhoun, Vince; Wang, Yu-Ping
2017-01-30
Our goal is to identify the brain regions most relevant to mental illness using neuroimaging. State of the art machine learning methods commonly suffer from repeatability difficulties in this application, particularly when using large and heterogeneous populations for samples. We revisit both dimensionality reduction and sparse modeling, and recast them in a common optimization-based framework. This allows us to combine the benefits of both types of methods in an approach which we call unambiguous components. We use this to estimate the image component with a constrained variability, which is best correlated with the unknown disease mechanism. We apply the method to the estimation of neuroimaging biomarkers for schizophrenia, using task fMRI data from a large multi-site study. The proposed approach yields an improvement in both robustness of the estimate and classification accuracy. We find that unambiguous components incorporate roughly two thirds of the same brain regions as sparsity-based methods LASSO and elastic net, while roughly one third of the selected regions differ. Further, unambiguous components achieve superior classification accuracy in differentiating cases from controls. Unambiguous components provide a robust way to estimate important regions of imaging data. Copyright © 2016 Elsevier B.V. All rights reserved.
Critical Factors Explaining the Leadership Performance of High-Performing Principals
ERIC Educational Resources Information Center
Hutton, Disraeli M.
2018-01-01
The study explored critical factors that explain leadership performance of high-performing principals and examined the relationship between these factors based on the ratings of school constituents in the public school system. The principal component analysis with the use of Varimax Rotation revealed that four components explain 51.1% of the…
Molecular dynamics in principal component space.
Michielssens, Servaas; van Erp, Titus S; Kutzner, Carsten; Ceulemans, Arnout; de Groot, Bert L
2012-07-26
A molecular dynamics algorithm in principal component space is presented. It is demonstrated that sampling can be improved without changing the ensemble by assigning masses to the principal components proportional to the inverse square root of the eigenvalues. The setup of the simulation requires no prior knowledge of the system; a short initial MD simulation to extract the eigenvectors and eigenvalues suffices. Independent measures indicated a 6-7 times faster sampling compared to a regular molecular dynamics simulation.
Optimized principal component analysis on coronagraphic images of the fomalhaut system
DOE Office of Scientific and Technical Information (OSTI.GOV)
Meshkat, Tiffany; Kenworthy, Matthew A.; Quanz, Sascha P.
We present the results of a study to optimize the principal component analysis (PCA) algorithm for planet detection, a new algorithm complementing angular differential imaging and locally optimized combination of images (LOCI) for increasing the contrast achievable next to a bright star. The stellar point spread function (PSF) is constructed by removing linear combinations of principal components, allowing the flux from an extrasolar planet to shine through. The number of principal components used determines how well the stellar PSF is globally modeled. Using more principal components may decrease the number of speckles in the final image, but also increases themore » background noise. We apply PCA to Fomalhaut Very Large Telescope NaCo images acquired at 4.05 μm with an apodized phase plate. We do not detect any companions, with a model dependent upper mass limit of 13-18 M {sub Jup} from 4-10 AU. PCA achieves greater sensitivity than the LOCI algorithm for the Fomalhaut coronagraphic data by up to 1 mag. We make several adaptations to the PCA code and determine which of these prove the most effective at maximizing the signal-to-noise from a planet very close to its parent star. We demonstrate that optimizing the number of principal components used in PCA proves most effective for pulling out a planet signal.« less
A novel aliasing-free subband information fusion approach for wideband sparse spectral estimation
NASA Astrophysics Data System (ADS)
Luo, Ji-An; Zhang, Xiao-Ping; Wang, Zhi
2017-12-01
Wideband sparse spectral estimation is generally formulated as a multi-dictionary/multi-measurement (MD/MM) problem which can be solved by using group sparsity techniques. In this paper, the MD/MM problem is reformulated as a single sparse indicative vector (SIV) recovery problem at the cost of introducing an additional system error. Thus, the number of unknowns is reduced greatly. We show that the system error can be neglected under certain conditions. We then present a new subband information fusion (SIF) method to estimate the SIV by jointly utilizing all the frequency bins. With orthogonal matching pursuit (OMP) leveraging the binary property of SIV's components, we develop a SIF-OMP algorithm to reconstruct the SIV. The numerical simulations demonstrate the performance of the proposed method.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Clark, Darin P.; Badea, Cristian T., E-mail: cristian.badea@duke.edu; Lee, Chang-Lung
Purpose: X-ray computed tomography (CT) is widely used, both clinically and preclinically, for fast, high-resolution anatomic imaging; however, compelling opportunities exist to expand its use in functional imaging applications. For instance, spectral information combined with nanoparticle contrast agents enables quantification of tissue perfusion levels, while temporal information details cardiac and respiratory dynamics. The authors propose and demonstrate a projection acquisition and reconstruction strategy for 5D CT (3D + dual energy + time) which recovers spectral and temporal information without substantially increasing radiation dose or sampling time relative to anatomic imaging protocols. Methods: The authors approach the 5D reconstruction problem withinmore » the framework of low-rank and sparse matrix decomposition. Unlike previous work on rank-sparsity constrained CT reconstruction, the authors establish an explicit rank-sparse signal model to describe the spectral and temporal dimensions. The spectral dimension is represented as a well-sampled time and energy averaged image plus regularly undersampled principal components describing the spectral contrast. The temporal dimension is represented as the same time and energy averaged reconstruction plus contiguous, spatially sparse, and irregularly sampled temporal contrast images. Using a nonlinear, image domain filtration approach, the authors refer to as rank-sparse kernel regression, the authors transfer image structure from the well-sampled time and energy averaged reconstruction to the spectral and temporal contrast images. This regularization strategy strictly constrains the reconstruction problem while approximately separating the temporal and spectral dimensions. Separability results in a highly compressed representation for the 5D data in which projections are shared between the temporal and spectral reconstruction subproblems, enabling substantial undersampling. The authors solved the 5D reconstruction problem using the split Bregman method and GPU-based implementations of backprojection, reprojection, and kernel regression. Using a preclinical mouse model, the authors apply the proposed algorithm to study myocardial injury following radiation treatment of breast cancer. Results: Quantitative 5D simulations are performed using the MOBY mouse phantom. Twenty data sets (ten cardiac phases, two energies) are reconstructed with 88 μm, isotropic voxels from 450 total projections acquired over a single 360° rotation. In vivo 5D myocardial injury data sets acquired in two mice injected with gold and iodine nanoparticles are also reconstructed with 20 data sets per mouse using the same acquisition parameters (dose: ∼60 mGy). For both the simulations and the in vivo data, the reconstruction quality is sufficient to perform material decomposition into gold and iodine maps to localize the extent of myocardial injury (gold accumulation) and to measure cardiac functional metrics (vascular iodine). Their 5D CT imaging protocol represents a 95% reduction in radiation dose per cardiac phase and energy and a 40-fold decrease in projection sampling time relative to their standard imaging protocol. Conclusions: Their 5D CT data acquisition and reconstruction protocol efficiently exploits the rank-sparse nature of spectral and temporal CT data to provide high-fidelity reconstruction results without increased radiation dose or sampling time.« less
Spectrotemporal CT data acquisition and reconstruction at low dose
Clark, Darin P.; Lee, Chang-Lung; Kirsch, David G.; Badea, Cristian T.
2015-01-01
Purpose: X-ray computed tomography (CT) is widely used, both clinically and preclinically, for fast, high-resolution anatomic imaging; however, compelling opportunities exist to expand its use in functional imaging applications. For instance, spectral information combined with nanoparticle contrast agents enables quantification of tissue perfusion levels, while temporal information details cardiac and respiratory dynamics. The authors propose and demonstrate a projection acquisition and reconstruction strategy for 5D CT (3D + dual energy + time) which recovers spectral and temporal information without substantially increasing radiation dose or sampling time relative to anatomic imaging protocols. Methods: The authors approach the 5D reconstruction problem within the framework of low-rank and sparse matrix decomposition. Unlike previous work on rank-sparsity constrained CT reconstruction, the authors establish an explicit rank-sparse signal model to describe the spectral and temporal dimensions. The spectral dimension is represented as a well-sampled time and energy averaged image plus regularly undersampled principal components describing the spectral contrast. The temporal dimension is represented as the same time and energy averaged reconstruction plus contiguous, spatially sparse, and irregularly sampled temporal contrast images. Using a nonlinear, image domain filtration approach, the authors refer to as rank-sparse kernel regression, the authors transfer image structure from the well-sampled time and energy averaged reconstruction to the spectral and temporal contrast images. This regularization strategy strictly constrains the reconstruction problem while approximately separating the temporal and spectral dimensions. Separability results in a highly compressed representation for the 5D data in which projections are shared between the temporal and spectral reconstruction subproblems, enabling substantial undersampling. The authors solved the 5D reconstruction problem using the split Bregman method and GPU-based implementations of backprojection, reprojection, and kernel regression. Using a preclinical mouse model, the authors apply the proposed algorithm to study myocardial injury following radiation treatment of breast cancer. Results: Quantitative 5D simulations are performed using the MOBY mouse phantom. Twenty data sets (ten cardiac phases, two energies) are reconstructed with 88 μm, isotropic voxels from 450 total projections acquired over a single 360° rotation. In vivo 5D myocardial injury data sets acquired in two mice injected with gold and iodine nanoparticles are also reconstructed with 20 data sets per mouse using the same acquisition parameters (dose: ∼60 mGy). For both the simulations and the in vivo data, the reconstruction quality is sufficient to perform material decomposition into gold and iodine maps to localize the extent of myocardial injury (gold accumulation) and to measure cardiac functional metrics (vascular iodine). Their 5D CT imaging protocol represents a 95% reduction in radiation dose per cardiac phase and energy and a 40-fold decrease in projection sampling time relative to their standard imaging protocol. Conclusions: Their 5D CT data acquisition and reconstruction protocol efficiently exploits the rank-sparse nature of spectral and temporal CT data to provide high-fidelity reconstruction results without increased radiation dose or sampling time. PMID:26520724
[A study of Boletus bicolor from different areas using Fourier transform infrared spectrometry].
Zhou, Zai-Jin; Liu, Gang; Ren, Xian-Pei
2010-04-01
It is hard to differentiate the same species of wild growing mushrooms from different areas by macromorphological features. In this paper, Fourier transform infrared (FTIR) spectroscopy combined with principal component analysis was used to identify 58 samples of boletus bicolor from five different areas. Based on the fingerprint infrared spectrum of boletus bicolor samples, principal component analysis was conducted on 58 boletus bicolor spectra in the range of 1 350-750 cm(-1) using the statistical software SPSS 13.0. According to the result, the accumulated contributing ratio of the first three principal components accounts for 88.87%. They included almost all the information of samples. The two-dimensional projection plot using first and second principal component is a satisfactory clustering effect for the classification and discrimination of boletus bicolor. All boletus bicolor samples were divided into five groups with a classification accuracy of 98.3%. The study demonstrated that wild growing boletus bicolor at species level from different areas can be identified by FTIR spectra combined with principal components analysis.
Joint Feature Extraction and Classifier Design for ECG-Based Biometric Recognition.
Gutta, Sandeep; Cheng, Qi
2016-03-01
Traditional biometric recognition systems often utilize physiological traits such as fingerprint, face, iris, etc. Recent years have seen a growing interest in electrocardiogram (ECG)-based biometric recognition techniques, especially in the field of clinical medicine. In existing ECG-based biometric recognition methods, feature extraction and classifier design are usually performed separately. In this paper, a multitask learning approach is proposed, in which feature extraction and classifier design are carried out simultaneously. Weights are assigned to the features within the kernel of each task. We decompose the matrix consisting of all the feature weights into sparse and low-rank components. The sparse component determines the features that are relevant to identify each individual, and the low-rank component determines the common feature subspace that is relevant to identify all the subjects. A fast optimization algorithm is developed, which requires only the first-order information. The performance of the proposed approach is demonstrated through experiments using the MIT-BIH Normal Sinus Rhythm database.
An ultra-sparse code underliesthe generation of neural sequences in a songbird
NASA Astrophysics Data System (ADS)
Hahnloser, Richard H. R.; Kozhevnikov, Alexay A.; Fee, Michale S.
2002-09-01
Sequences of motor activity are encoded in many vertebrate brains by complex spatio-temporal patterns of neural activity; however, the neural circuit mechanisms underlying the generation of these pre-motor patterns are poorly understood. In songbirds, one prominent site of pre-motor activity is the forebrain robust nucleus of the archistriatum (RA), which generates stereotyped sequences of spike bursts during song and recapitulates these sequences during sleep. We show that the stereotyped sequences in RA are driven from nucleus HVC (high vocal centre), the principal pre-motor input to RA. Recordings of identified HVC neurons in sleeping and singing birds show that individual HVC neurons projecting onto RA neurons produce bursts sparsely, at a single, precise time during the RA sequence. These HVC neurons burst sequentially with respect to one another. We suggest that at each time in the RA sequence, the ensemble of active RA neurons is driven by a subpopulation of RA-projecting HVC neurons that is active only at that time. As a population, these HVC neurons may form an explicit representation of time in the sequence. Such a sparse representation, a temporal analogue of the `grandmother cell' concept for object recognition, eliminates the problem of temporal interference during sequence generation and learning attributed to more distributed representations.
How multi segmental patterns deviate in spastic diplegia from typical developed.
Zago, Matteo; Sforza, Chiarella; Bona, Alessia; Cimolin, Veronica; Costici, Pier Francesco; Condoluci, Claudia; Galli, Manuela
2017-10-01
The relationship between gait features and coordination in children with Cerebral Palsy is not sufficiently analyzed yet. Principal Component Analysis can help in understanding motion patterns decomposing movement into its fundamental components (Principal Movements). This study aims at quantitatively characterizing the functional connections between multi-joint gait patterns in Cerebral Palsy. 65 children with spastic diplegia aged 10.6 (SD 3.7) years participated in standardized gait analysis trials; 31 typically developing adolescents aged 13.6 (4.4) years were also tested. To determine if posture affects gait patterns, patients were split into Crouch and knee Hyperextension group according to knee flexion angle at standing. 3D coordinates of hips, knees, ankles, metatarsal joints, pelvis and shoulders were submitted to Principal Component Analysis. Four Principal Movements accounted for 99% of global variance; components 1-3 explained major sagittal patterns, components 4-5 referred to movements on frontal plane and component 6 to additional movement refinements. Dimensionality was higher in patients than in controls (p<0.01), and the Crouch group significantly differed from controls in the application of components 1 and 4-6 (p<0.05), while the knee Hyperextension group in components 1-2 and 5 (p<0.05). Compensatory strategies of children with Cerebral Palsy (interactions between main and secondary movement patterns), were objectively determined. Principal Movements can reduce the effort in interpreting gait reports, providing an immediate and quantitative picture of the connections between movement components. Copyright © 2017 Elsevier Ltd. All rights reserved.
Shen, Hong-Bin
2011-01-01
Modern science of networks has brought significant advances to our understanding of complex systems biology. As a representative model of systems biology, Protein Interaction Networks (PINs) are characterized by a remarkable modular structures, reflecting functional associations between their components. Many methods were proposed to capture cohesive modules so that there is a higher density of edges within modules than those across them. Recent studies reveal that cohesively interacting modules of proteins is not a universal organizing principle in PINs, which has opened up new avenues for revisiting functional modules in PINs. In this paper, functional clusters in PINs are found to be able to form unorthodox structures defined as bi-sparse module. In contrast to the traditional cohesive module, the nodes in the bi-sparse module are sparsely connected internally and densely connected with other bi-sparse or cohesive modules. We present a novel protocol called the BinTree Seeking (BTS) for mining both bi-sparse and cohesive modules in PINs based on Edge Density of Module (EDM) and matrix theory. BTS detects modules by depicting links and nodes rather than nodes alone and its derivation procedure is totally performed on adjacency matrix of networks. The number of modules in a PIN can be automatically determined in the proposed BTS approach. BTS is tested on three real PINs and the results demonstrate that functional modules in PINs are not dominantly cohesive but can be sparse. BTS software and the supporting information are available at: www.csbio.sjtu.edu.cn/bioinf/BTS/. PMID:22140454
NASA Technical Reports Server (NTRS)
Williams, D. L.; Borden, F. Y.
1977-01-01
Methods to accurately delineate the types of land cover in the urban-rural transition zone of metropolitan areas were considered. The application of principal components analysis to multidate LANDSAT imagery was investigated as a means of reducing the overlap between residential and agricultural spectral signatures. The statistical concepts of principal components analysis were discussed, as well as the results of this analysis when applied to multidate LANDSAT imagery of the Washington, D.C. metropolitan area.
Constrained Principal Component Analysis: Various Applications.
ERIC Educational Resources Information Center
Hunter, Michael; Takane, Yoshio
2002-01-01
Provides example applications of constrained principal component analysis (CPCA) that illustrate the method on a variety of contexts common to psychological research. Two new analyses, decompositions into finer components and fitting higher order structures, are presented, followed by an illustration of CPCA on contingency tables and the CPCA of…
NASA Astrophysics Data System (ADS)
Ginanjar, Irlandia; Pasaribu, Udjianna S.; Indratno, Sapto W.
2017-03-01
This article presents the application of the principal component analysis (PCA) biplot for the needs of data mining. This article aims to simplify and objectify the methods for objects clustering in PCA biplot. The novelty of this paper is to get a measure that can be used to objectify the objects clustering in PCA biplot. Orthonormal eigenvectors, which are the coefficients of a principal component model representing an association between principal components and initial variables. The existence of the association is a valid ground to objects clustering based on principal axes value, thus if m principal axes used in the PCA, then the objects can be classified into 2m clusters. The inter-city buses are clustered based on maintenance costs data by using two principal axes PCA biplot. The buses are clustered into four groups. The first group is the buses with high maintenance costs, especially for lube, and brake canvass. The second group is the buses with high maintenance costs, especially for tire, and filter. The third group is the buses with low maintenance costs, especially for lube, and brake canvass. The fourth group is buses with low maintenance costs, especially for tire, and filter.
Kakio, Tomoko; Nagase, Hitomi; Takaoka, Takashi; Yoshida, Naoko; Hirakawa, Junichi; Macha, Susan; Hiroshima, Takashi; Ikeda, Yukihiro; Tsuboi, Hirohito; Kimura, Kazuko
2018-06-01
The World Health Organization has warned that substandard and falsified medical products (SFs) can harm patients and fail to treat the diseases for which they were intended, and they affect every region of the world, leading to loss of confidence in medicines, health-care providers, and health systems. Therefore, development of analytical procedures to detect SFs is extremely important. In this study, we investigated the quality of pharmaceutical tablets containing the antihypertensive candesartan cilexetil, collected in China, Indonesia, Japan, and Myanmar, using the Japanese pharmacopeial analytical procedures for quality control, together with principal component analysis (PCA) of Raman spectrum obtained with handheld Raman spectrometer. Some samples showed delayed dissolution and failed to meet the pharmacopeial specification, whereas others failed the assay test. These products appeared to be substandard. Principal component analysis showed that all Raman spectra could be explained in terms of two components: the amount of the active pharmaceutical ingredient and the kinds of excipients. Principal component analysis score plot indicated one substandard, and the falsified tablets have similar principal components in Raman spectra, in contrast to authentic products. The locations of samples within the PCA score plot varied according to the source country, suggesting that manufacturers in different countries use different excipients. Our results indicate that the handheld Raman device will be useful for detection of SFs in the field. Principal component analysis of that Raman data clarify the difference in chemical properties between good quality products and SFs that circulate in the Asian market.
Principal component analysis and the locus of the Fréchet mean in the space of phylogenetic trees.
Nye, Tom M W; Tang, Xiaoxian; Weyenberg, Grady; Yoshida, Ruriko
2017-12-01
Evolutionary relationships are represented by phylogenetic trees, and a phylogenetic analysis of gene sequences typically produces a collection of these trees, one for each gene in the analysis. Analysis of samples of trees is difficult due to the multi-dimensionality of the space of possible trees. In Euclidean spaces, principal component analysis is a popular method of reducing high-dimensional data to a low-dimensional representation that preserves much of the sample's structure. However, the space of all phylogenetic trees on a fixed set of species does not form a Euclidean vector space, and methods adapted to tree space are needed. Previous work introduced the notion of a principal geodesic in this space, analogous to the first principal component. Here we propose a geometric object for tree space similar to the [Formula: see text]th principal component in Euclidean space: the locus of the weighted Fréchet mean of [Formula: see text] vertex trees when the weights vary over the [Formula: see text]-simplex. We establish some basic properties of these objects, in particular showing that they have dimension [Formula: see text], and propose algorithms for projection onto these surfaces and for finding the principal locus associated with a sample of trees. Simulation studies demonstrate that these algorithms perform well, and analyses of two datasets, containing Apicomplexa and African coelacanth genomes respectively, reveal important structure from the second principal components.
Fast Solution in Sparse LDA for Binary Classification
NASA Technical Reports Server (NTRS)
Moghaddam, Baback
2010-01-01
An algorithm that performs sparse linear discriminant analysis (Sparse-LDA) finds near-optimal solutions in far less time than the prior art when specialized to binary classification (of 2 classes). Sparse-LDA is a type of feature- or variable- selection problem with numerous applications in statistics, machine learning, computer vision, computational finance, operations research, and bio-informatics. Because of its combinatorial nature, feature- or variable-selection problems are NP-hard or computationally intractable in cases involving more than 30 variables or features. Therefore, one typically seeks approximate solutions by means of greedy search algorithms. The prior Sparse-LDA algorithm was a greedy algorithm that considered the best variable or feature to add/ delete to/ from its subsets in order to maximally discriminate between multiple classes of data. The present algorithm is designed for the special but prevalent case of 2-class or binary classification (e.g. 1 vs. 0, functioning vs. malfunctioning, or change versus no change). The present algorithm provides near-optimal solutions on large real-world datasets having hundreds or even thousands of variables or features (e.g. selecting the fewest wavelength bands in a hyperspectral sensor to do terrain classification) and does so in typical computation times of minutes as compared to days or weeks as taken by the prior art. Sparse LDA requires solving generalized eigenvalue problems for a large number of variable subsets (represented by the submatrices of the input within-class and between-class covariance matrices). In the general (fullrank) case, the amount of computation scales at least cubically with the number of variables and thus the size of the problems that can be solved is limited accordingly. However, in binary classification, the principal eigenvalues can be found using a special analytic formula, without resorting to costly iterative techniques. The present algorithm exploits this analytic form along with the inherent sequential nature of greedy search itself. Together this enables the use of highly-efficient partitioned-matrix-inverse techniques that result in large speedups of computation in both the forward-selection and backward-elimination stages of greedy algorithms in general.
USDA-ARS?s Scientific Manuscript database
In southern Bahia, cabruca is the agroforestry system in which cocoa is cultivated under the shade of sparse native forest trees. Aiming to characterize the tree component of this system and its management practices, we conducted an inventory of the non-cocoa trees in 16 ha of cabruca and do intervi...
HIGH DIMENSIONAL COVARIANCE MATRIX ESTIMATION IN APPROXIMATE FACTOR MODELS.
Fan, Jianqing; Liao, Yuan; Mincheva, Martina
2011-01-01
The variance covariance matrix plays a central role in the inferential theories of high dimensional factor models in finance and economics. Popular regularization methods of directly exploiting sparsity are not directly applicable to many financial problems. Classical methods of estimating the covariance matrices are based on the strict factor models, assuming independent idiosyncratic components. This assumption, however, is restrictive in practical applications. By assuming sparse error covariance matrix, we allow the presence of the cross-sectional correlation even after taking out common factors, and it enables us to combine the merits of both methods. We estimate the sparse covariance using the adaptive thresholding technique as in Cai and Liu (2011), taking into account the fact that direct observations of the idiosyncratic components are unavailable. The impact of high dimensionality on the covariance matrix estimation based on the factor structure is then studied.
HYPOTHESIS TESTING FOR HIGH-DIMENSIONAL SPARSE BINARY REGRESSION
Mukherjee, Rajarshi; Pillai, Natesh S.; Lin, Xihong
2015-01-01
In this paper, we study the detection boundary for minimax hypothesis testing in the context of high-dimensional, sparse binary regression models. Motivated by genetic sequencing association studies for rare variant effects, we investigate the complexity of the hypothesis testing problem when the design matrix is sparse. We observe a new phenomenon in the behavior of detection boundary which does not occur in the case of Gaussian linear regression. We derive the detection boundary as a function of two components: a design matrix sparsity index and signal strength, each of which is a function of the sparsity of the alternative. For any alternative, if the design matrix sparsity index is too high, any test is asymptotically powerless irrespective of the magnitude of signal strength. For binary design matrices with the sparsity index that is not too high, our results are parallel to those in the Gaussian case. In this context, we derive detection boundaries for both dense and sparse regimes. For the dense regime, we show that the generalized likelihood ratio is rate optimal; for the sparse regime, we propose an extended Higher Criticism Test and show it is rate optimal and sharp. We illustrate the finite sample properties of the theoretical results using simulation studies. PMID:26246645
Fast and low-dose computed laminography using compressive sensing based technique
NASA Astrophysics Data System (ADS)
Abbas, Sajid; Park, Miran; Cho, Seungryong
2015-03-01
Computed laminography (CL) is well known for inspecting microstructures in the materials, weldments and soldering defects in high density packed components or multilayer printed circuit boards. The overload problem on x-ray tube and gross failure of the radio-sensitive electronics devices during a scan are among important issues in CL which needs to be addressed. The sparse-view CL can be one of the viable option to overcome such issues. In this work a numerical aluminum welding phantom was simulated to collect sparsely sampled projection data at only 40 views using a conventional CL scanning scheme i.e. oblique scan. A compressive-sensing inspired total-variation (TV) minimization algorithm was utilized to reconstruct the images. It is found that the images reconstructed using sparse view data are visually comparable with the images reconstructed using full scan data set i.e. at 360 views on regular interval. We have quantitatively confirmed that tiny structures such as copper and tungsten slags, and copper flakes in the reconstructed images from sparsely sampled data are comparable with the corresponding structure present in the fully sampled data case. A blurring effect can be seen near the edges of few pores at the bottom of the reconstructed images from sparsely sampled data, despite the overall image quality is reasonable for fast and low-dose NDT.
Meyer, Karin; Kirkpatrick, Mark
2005-01-01
Principal component analysis is a widely used 'dimension reduction' technique, albeit generally at a phenotypic level. It is shown that we can estimate genetic principal components directly through a simple reparameterisation of the usual linear, mixed model. This is applicable to any analysis fitting multiple, correlated genetic effects, whether effects for individual traits or sets of random regression coefficients to model trajectories. Depending on the magnitude of genetic correlation, a subset of the principal component generally suffices to capture the bulk of genetic variation. Corresponding estimates of genetic covariance matrices are more parsimonious, have reduced rank and are smoothed, with the number of parameters required to model the dispersion structure reduced from k(k + 1)/2 to m(2k - m + 1)/2 for k effects and m principal components. Estimation of these parameters, the largest eigenvalues and pertaining eigenvectors of the genetic covariance matrix, via restricted maximum likelihood using derivatives of the likelihood, is described. It is shown that reduced rank estimation can reduce computational requirements of multivariate analyses substantially. An application to the analysis of eight traits recorded via live ultrasound scanning of beef cattle is given. PMID:15588566
Morin, R.H.
1997-01-01
Returns from drilling in unconsolidated cobble and sand aquifers commonly do not identify lithologic changes that may be meaningful for Hydrogeologic investigations. Vertical resolution of saturated, Quaternary, coarse braided-slream deposits is significantly improved by interpreting natural gamma (G), epithermal neutron (N), and electromagnetically induced resistivity (IR) logs obtained from wells at the Capital Station site in Boise, Idaho. Interpretation of these geophysical logs is simplified because these sediments are derived largely from high-gamma-producing source rocks (granitics of the Boise River drainage), contain few clays, and have undergone little diagenesis. Analysis of G, N, and IR data from these deposits with principal components analysis provides an objective means to determine if units can be recognized within the braided-stream deposits. In particular, performing principal components analysis on G, N, and IR data from eight wells at Capital Station (1) allows the variable system dimensionality to be reduced from three to two by selecting the two eigenvectors with the greatest variance as axes for principal component scatterplots, (2) generates principal components with interpretable physical meanings, (3) distinguishes sand from cobble-dominated units, and (4) provides a means to distinguish between cobble-dominated units.
Analysis and Evaluation of the Characteristic Taste Components in Portobello Mushroom.
Wang, Jinbin; Li, Wen; Li, Zhengpeng; Wu, Wenhui; Tang, Xueming
2018-05-10
To identify the characteristic taste components of the common cultivated mushroom (brown; Portobello), Agaricus bisporus, taste components in the stipe and pileus of Portobello mushroom harvested at different growth stages were extracted and identified, and principal component analysis (PCA) and taste active value (TAV) were used to reveal the characteristic taste components during the each of the growth stages of Portobello mushroom. In the stipe and pileus, 20 and 14 different principal taste components were identified, respectively, and they were considered as the principal taste components of Portobello mushroom fruit bodies, which included most amino acids and 5'-nucleotides. Some taste components that were found at high levels, such as lactic acid and citric acid, were not detected as Portobello mushroom principal taste components through PCA. However, due to their high content, Portobello mushroom could be used as a source of organic acids. The PCA and TAV results revealed that 5'-GMP, glutamic acid, malic acid, alanine, proline, leucine, and aspartic acid were the characteristic taste components of Portobello mushroom fruit bodies. Portobello mushroom was also found to be rich in protein and amino acids, so it might also be useful in the formulation of nutraceuticals and functional food. The results in this article could provide a theoretical basis for understanding and regulating the characteristic flavor components synthesis process of Portobello mushroom. © 2018 Institute of Food Technologists®.
NASA Astrophysics Data System (ADS)
Kistenev, Yu. V.; Shapovalov, A. V.; Borisov, A. V.; Vrazhnov, D. A.; Nikolaev, V. V.; Nikiforova, O. Y.
2015-12-01
The results of numerical simulation of application principal component analysis to absorption spectra of breath air of patients with pulmonary diseases are presented. Various methods of experimental data preprocessing are analyzed.
Piecewise multivariate modelling of sequential metabolic profiling data.
Rantalainen, Mattias; Cloarec, Olivier; Ebbels, Timothy M D; Lundstedt, Torbjörn; Nicholson, Jeremy K; Holmes, Elaine; Trygg, Johan
2008-02-19
Modelling the time-related behaviour of biological systems is essential for understanding their dynamic responses to perturbations. In metabolic profiling studies, the sampling rate and number of sampling points are often restricted due to experimental and biological constraints. A supervised multivariate modelling approach with the objective to model the time-related variation in the data for short and sparsely sampled time-series is described. A set of piecewise Orthogonal Projections to Latent Structures (OPLS) models are estimated, describing changes between successive time points. The individual OPLS models are linear, but the piecewise combination of several models accommodates modelling and prediction of changes which are non-linear with respect to the time course. We demonstrate the method on both simulated and metabolic profiling data, illustrating how time related changes are successfully modelled and predicted. The proposed method is effective for modelling and prediction of short and multivariate time series data. A key advantage of the method is model transparency, allowing easy interpretation of time-related variation in the data. The method provides a competitive complement to commonly applied multivariate methods such as OPLS and Principal Component Analysis (PCA) for modelling and analysis of short time-series data.
An atomistic fingerprint algorithm for learning ab initio molecular force fields
NASA Astrophysics Data System (ADS)
Tang, Yu-Hang; Zhang, Dongkun; Karniadakis, George Em
2018-01-01
Molecular fingerprints, i.e., feature vectors describing atomistic neighborhood configurations, is an important abstraction and a key ingredient for data-driven modeling of potential energy surface and interatomic force. In this paper, we present the density-encoded canonically aligned fingerprint algorithm, which is robust and efficient, for fitting per-atom scalar and vector quantities. The fingerprint is essentially a continuous density field formed through the superimposition of smoothing kernels centered on the atoms. Rotational invariance of the fingerprint is achieved by aligning, for each fingerprint instance, the neighboring atoms onto a local canonical coordinate frame computed from a kernel minisum optimization procedure. We show that this approach is superior over principal components analysis-based methods especially when the atomistic neighborhood is sparse and/or contains symmetry. We propose that the "distance" between the density fields be measured using a volume integral of their pointwise difference. This can be efficiently computed using optimal quadrature rules, which only require discrete sampling at a small number of grid points. We also experiment on the choice of weight functions for constructing the density fields and characterize their performance for fitting interatomic potentials. The applicability of the fingerprint is demonstrated through a set of benchmark problems.
Carpinus tibetana (Betulaceae), a new species from southeast Tibet, China
Lu, Zhiqiang; Li, Ying; Yang, Xiaoyue; Liu, Jianquan
2018-01-01
Abstract A new species Carpinus tibetana Z. Qiang Lu & J. Quan Liu from southeast Tibet is described and illustrated. The specimens of this new species were previously identified and placed under C. monbeigiana Hand.-Mazz. or C. mollicoma Hu. However, the specimens from southeast Tibet differ from those of C. monbeigiana from other regions with more lateral veins (19–24 vs 14–18) on each side of the midvein and dense pubescence on the abaxial leaf surface, while from those of C. mollicoma from other regions differ by nutlet with dense resinous glands and glabrous or sparsely villous at apex. Principal Component Analyses based on morphometric characters recognise the Tibetan populations as a separate group. Nuclear ribosomal ITS sequence variations show stable and distinct genetic divergences between the Tibetan populations and C. monbeigiana or C. mollicoma by two or three fixed nucleotide mutations. Phylogenetic analysis also identified three respective genetic clusters and the C. mollicoma cluster diverged early. In addition, the Tibetan populations show a disjunct geographic isolation from the other two species. Therefore, C. tibetana, based on the Tibetan populations, is here erected as a new species, distinctly different from C. monbeigiana and C. mollicoma. PMID:29750069
Keenan, Michael R; Smentkowski, Vincent S; Ulfig, Robert M; Oltman, Edward; Larson, David J; Kelly, Thomas F
2011-06-01
We demonstrate for the first time that multivariate statistical analysis techniques can be applied to atom probe tomography data to estimate the chemical composition of a sample at the full spatial resolution of the atom probe in three dimensions. Whereas the raw atom probe data provide the specific identity of an atom at a precise location, the multivariate results can be interpreted in terms of the probabilities that an atom representing a particular chemical phase is situated there. When aggregated to the size scale of a single atom (∼0.2 nm), atom probe spectral-image datasets are huge and extremely sparse. In fact, the average spectrum will have somewhat less than one total count per spectrum due to imperfect detection efficiency. These conditions, under which the variance in the data is completely dominated by counting noise, test the limits of multivariate analysis, and an extensive discussion of how to extract the chemical information is presented. Efficient numerical approaches to performing principal component analysis (PCA) on these datasets, which may number hundreds of millions of individual spectra, are put forward, and it is shown that PCA can be computed in a few seconds on a typical laptop computer.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dennis C. Smolarski, S.J.
Project Abstract This project was a continuation of work begun under a subcontract issued off of TSI-DOE Grant 1528746, awarded to the University of Illinois Urbana-Champaign. Dr. Anthony Mezzacappa is the Principal Investigator on the Illinois award. A separate award was issued to Santa Clara University to continue the collaboration during the time period May 2003 ? 2004. Smolarski continued to work on preconditioner technology and its interface with various iterative methods. He worked primarily with F. Dough Swesty (SUNY-Stony Brook) in continuing software development started in the 2002-03 academic year. Special attention was paid to the development and testingmore » of difference sparse approximate inverse preconditioners and their use in the solution of linear systems arising from radiation transport equations. The target was a high performance platform on which efficient implementation is a critical component of the overall effort. Smolarski also focused on the integration of the adaptive iterative algorithm, Chebycode, developed by Tom Manteuffel and Steve Ashby and adapted by Ryan Szypowski for parallel platforms, into the radiation transport code being developed at SUNY-Stony Brook.« less
Arbuscular mycorrhizas are present on Spitsbergen.
Newsham, K K; Eidesen, P B; Davey, M L; Axelsen, J; Courtecuisse, E; Flintrop, C; Johansson, A G; Kiepert, M; Larsen, S E; Lorberau, K E; Maurset, M; McQuilkin, J; Misiak, M; Pop, A; Thompson, S; Read, D J
2017-10-01
A previous study of 76 plant species on Spitsbergen in the High Arctic concluded that structures resembling arbuscular mycorrhizas were absent from roots. Here, we report a survey examining the roots of 13 grass and forb species collected from 12 sites on the island for arbuscular mycorrhizal (AM) colonisation. Of the 102 individuals collected, we recorded AM endophytes in the roots of 41 plants of 11 species (Alopecurus ovatus, Deschampsia alpina, Festuca rubra ssp. richardsonii, putative viviparous hybrids of Poa arctica and Poa pratensis, Poa arctica ssp. arctica, Trisetum spicatum, Coptidium spitsbergense, Ranunculus nivalis, Ranunculus pygmaeus, Ranunculus sulphureus and Taraxacum arcticum) sampled from 10 sites. Both coarse AM endophyte, with hyphae of 5-10 μm width, vesicles and occasional arbuscules, and fine endophyte, consisting of hyphae of 1-3 μm width and sparse arbuscules, were recorded in roots. Coarse AM hyphae, vesicles, arbuscules and fine endophyte hyphae occupied 1.0-30.7, 0.8-18.3, 0.7-11.9 and 0.7-12.8% of the root lengths of colonised plants, respectively. Principal component analysis indicated no associations between the abundances of AM structures in roots and edaphic factors. We conclude that the AM symbiosis is present in grass and forb roots on Spitsbergen.
Carpinus tibetana (Betulaceae), a new species from southeast Tibet, China.
Lu, Zhiqiang; Li, Ying; Yang, Xiaoyue; Liu, Jianquan
2018-01-01
A new species Carpinus tibetana Z. Qiang Lu & J. Quan Liu from southeast Tibet is described and illustrated. The specimens of this new species were previously identified and placed under C. monbeigiana Hand.-Mazz. or C. mollicoma Hu. However, the specimens from southeast Tibet differ from those of C. monbeigiana from other regions with more lateral veins (19-24 vs 14-18) on each side of the midvein and dense pubescence on the abaxial leaf surface, while from those of C. mollicoma from other regions differ by nutlet with dense resinous glands and glabrous or sparsely villous at apex. Principal Component Analyses based on morphometric characters recognise the Tibetan populations as a separate group. Nuclear ribosomal ITS sequence variations show stable and distinct genetic divergences between the Tibetan populations and C. monbeigiana or C. mollicoma by two or three fixed nucleotide mutations. Phylogenetic analysis also identified three respective genetic clusters and the C. mollicoma cluster diverged early. In addition, the Tibetan populations show a disjunct geographic isolation from the other two species. Therefore, C. tibetana , based on the Tibetan populations, is here erected as a new species, distinctly different from C. monbeigiana and C. mollicoma .
Statistical downscaling of GCM simulations to streamflow using relevance vector machine
NASA Astrophysics Data System (ADS)
Ghosh, Subimal; Mujumdar, P. P.
2008-01-01
General circulation models (GCMs), the climate models often used in assessing the impact of climate change, operate on a coarse scale and thus the simulation results obtained from GCMs are not particularly useful in a comparatively smaller river basin scale hydrology. The article presents a methodology of statistical downscaling based on sparse Bayesian learning and Relevance Vector Machine (RVM) to model streamflow at river basin scale for monsoon period (June, July, August, September) using GCM simulated climatic variables. NCEP/NCAR reanalysis data have been used for training the model to establish a statistical relationship between streamflow and climatic variables. The relationship thus obtained is used to project the future streamflow from GCM simulations. The statistical methodology involves principal component analysis, fuzzy clustering and RVM. Different kernel functions are used for comparison purpose. The model is applied to Mahanadi river basin in India. The results obtained using RVM are compared with those of state-of-the-art Support Vector Machine (SVM) to present the advantages of RVMs over SVMs. A decreasing trend is observed for monsoon streamflow of Mahanadi due to high surface warming in future, with the CCSR/NIES GCM and B2 scenario.
Contribution of non-negative matrix factorization to the classification of remote sensing images
NASA Astrophysics Data System (ADS)
Karoui, M. S.; Deville, Y.; Hosseini, S.; Ouamri, A.; Ducrot, D.
2008-10-01
Remote sensing has become an unavoidable tool for better managing our environment, generally by realizing maps of land cover using classification techniques. The classification process requires some pre-processing, especially for data size reduction. The most usual technique is Principal Component Analysis. Another approach consists in regarding each pixel of the multispectral image as a mixture of pure elements contained in the observed area. Using Blind Source Separation (BSS) methods, one can hope to unmix each pixel and to perform the recognition of the classes constituting the observed scene. Our contribution consists in using Non-negative Matrix Factorization (NMF) combined with sparse coding as a solution to BSS, in order to generate new images (which are at least partly separated images) using HRV SPOT images from Oran area, Algeria). These images are then used as inputs of a supervised classifier integrating textural information. The results of classifications of these "separated" images show a clear improvement (correct pixel classification rate improved by more than 20%) compared to classification of initial (i.e. non separated) images. These results show the contribution of NMF as an attractive pre-processing for classification of multispectral remote sensing imagery.
Dascălu, Cristina Gena; Antohe, Magda Ecaterina
2009-01-01
Based on the eigenvalues and the eigenvectors analysis, the principal component analysis has the purpose to identify the subspace of the main components from a set of parameters, which are enough to characterize the whole set of parameters. Interpreting the data for analysis as a cloud of points, we find through geometrical transformations the directions where the cloud's dispersion is maximal--the lines that pass through the cloud's center of weight and have a maximal density of points around them (by defining an appropriate criteria function and its minimization. This method can be successfully used in order to simplify the statistical analysis on questionnaires--because it helps us to select from a set of items only the most relevant ones, which cover the variations of the whole set of data. For instance, in the presented sample we started from a questionnaire with 28 items and, applying the principal component analysis we identified 7 principal components--or main items--fact that simplifies significantly the further data statistical analysis.
Perceptually controlled doping for audio source separation
NASA Astrophysics Data System (ADS)
Mahé, Gaël; Nadalin, Everton Z.; Suyama, Ricardo; Romano, João MT
2014-12-01
The separation of an underdetermined audio mixture can be performed through sparse component analysis (SCA) that relies however on the strong hypothesis that source signals are sparse in some domain. To overcome this difficulty in the case where the original sources are available before the mixing process, the informed source separation (ISS) embeds in the mixture a watermark, which information can help a further separation. Though powerful, this technique is generally specific to a particular mixing setup and may be compromised by an additional bitrate compression stage. Thus, instead of watermarking, we propose a `doping' method that makes the time-frequency representation of each source more sparse, while preserving its audio quality. This method is based on an iterative decrease of the distance between the distribution of the signal and a target sparse distribution, under a perceptual constraint. We aim to show that the proposed approach is robust to audio coding and that the use of the sparsified signals improves the source separation, in comparison with the original sources. In this work, the analysis is made only in instantaneous mixtures and focused on voice sources.
The HTM Spatial Pooler-A Neocortical Algorithm for Online Sparse Distributed Coding.
Cui, Yuwei; Ahmad, Subutai; Hawkins, Jeff
2017-01-01
Hierarchical temporal memory (HTM) provides a theoretical framework that models several key computational principles of the neocortex. In this paper, we analyze an important component of HTM, the HTM spatial pooler (SP). The SP models how neurons learn feedforward connections and form efficient representations of the input. It converts arbitrary binary input patterns into sparse distributed representations (SDRs) using a combination of competitive Hebbian learning rules and homeostatic excitability control. We describe a number of key properties of the SP, including fast adaptation to changing input statistics, improved noise robustness through learning, efficient use of cells, and robustness to cell death. In order to quantify these properties we develop a set of metrics that can be directly computed from the SP outputs. We show how the properties are met using these metrics and targeted artificial simulations. We then demonstrate the value of the SP in a complete end-to-end real-world HTM system. We discuss the relationship with neuroscience and previous studies of sparse coding. The HTM spatial pooler represents a neurally inspired algorithm for learning sparse representations from noisy data streams in an online fashion.
ERIC Educational Resources Information Center
Mugrage, Beverly; And Others
Three ridge regression solutions are compared with ordinary least squares regression and with principal components regression using all components. Ridge regression, particularly the Lawless-Wang solution, out-performed ordinary least squares regression and the principal components solution on the criteria of stability of coefficient and closeness…
A Note on McDonald's Generalization of Principal Components Analysis
ERIC Educational Resources Information Center
Shine, Lester C., II
1972-01-01
It is shown that McDonald's generalization of Classical Principal Components Analysis to groups of variables maximally channels the totalvariance of the original variables through the groups of variables acting as groups. An equation is obtained for determining the vectors of correlations of the L2 components with the original variables.…
Peterson, Leif E
2002-01-01
CLUSFAVOR (CLUSter and Factor Analysis with Varimax Orthogonal Rotation) 5.0 is a Windows-based computer program for hierarchical cluster and principal-component analysis of microarray-based transcriptional profiles. CLUSFAVOR 5.0 standardizes input data; sorts data according to gene-specific coefficient of variation, standard deviation, average and total expression, and Shannon entropy; performs hierarchical cluster analysis using nearest-neighbor, unweighted pair-group method using arithmetic averages (UPGMA), or furthest-neighbor joining methods, and Euclidean, correlation, or jack-knife distances; and performs principal-component analysis. PMID:12184816
The Complexity of Human Walking: A Knee Osteoarthritis Study
Kotti, Margarita; Duffell, Lynsey D.; Faisal, Aldo A.; McGregor, Alison H.
2014-01-01
This study proposes a framework for deconstructing complex walking patterns to create a simple principal component space before checking whether the projection to this space is suitable for identifying changes from the normality. We focus on knee osteoarthritis, the most common knee joint disease and the second leading cause of disability. Knee osteoarthritis affects over 250 million people worldwide. The motivation for projecting the highly dimensional movements to a lower dimensional and simpler space is our belief that motor behaviour can be understood by identifying a simplicity via projection to a low principal component space, which may reflect upon the underlying mechanism. To study this, we recruited 180 subjects, 47 of which reported that they had knee osteoarthritis. They were asked to walk several times along a walkway equipped with two force plates that capture their ground reaction forces along 3 axes, namely vertical, anterior-posterior, and medio-lateral, at 1000 Hz. Data when the subject does not clearly strike the force plate were excluded, leaving 1–3 gait cycles per subject. To examine the complexity of human walking, we applied dimensionality reduction via Probabilistic Principal Component Analysis. The first principal component explains 34% of the variance in the data, whereas over 80% of the variance is explained by 8 principal components or more. This proves the complexity of the underlying structure of the ground reaction forces. To examine if our musculoskeletal system generates movements that are distinguishable between normal and pathological subjects in a low dimensional principal component space, we applied a Bayes classifier. For the tested cross-validated, subject-independent experimental protocol, the classification accuracy equals 82.62%. Also, a novel complexity measure is proposed, which can be used as an objective index to facilitate clinical decision making. This measure proves that knee osteoarthritis subjects exhibit more variability in the two-dimensional principal component space. PMID:25232949
Pisharady, Pramod Kumar; Sotiropoulos, Stamatios N; Duarte-Carvajalino, Julio M; Sapiro, Guillermo; Lenglet, Christophe
2018-02-15
We present a sparse Bayesian unmixing algorithm BusineX: Bayesian Unmixing for Sparse Inference-based Estimation of Fiber Crossings (X), for estimation of white matter fiber parameters from compressed (under-sampled) diffusion MRI (dMRI) data. BusineX combines compressive sensing with linear unmixing and introduces sparsity to the previously proposed multiresolution data fusion algorithm RubiX, resulting in a method for improved reconstruction, especially from data with lower number of diffusion gradients. We formulate the estimation of fiber parameters as a sparse signal recovery problem and propose a linear unmixing framework with sparse Bayesian learning for the recovery of sparse signals, the fiber orientations and volume fractions. The data is modeled using a parametric spherical deconvolution approach and represented using a dictionary created with the exponential decay components along different possible diffusion directions. Volume fractions of fibers along these directions define the dictionary weights. The proposed sparse inference, which is based on the dictionary representation, considers the sparsity of fiber populations and exploits the spatial redundancy in data representation, thereby facilitating inference from under-sampled q-space. The algorithm improves parameter estimation from dMRI through data-dependent local learning of hyperparameters, at each voxel and for each possible fiber orientation, that moderate the strength of priors governing the parameter variances. Experimental results on synthetic and in-vivo data show improved accuracy with a lower uncertainty in fiber parameter estimates. BusineX resolves a higher number of second and third fiber crossings. For under-sampled data, the algorithm is also shown to produce more reliable estimates. Copyright © 2017 Elsevier Inc. All rights reserved.
Principal Components Analysis of a JWST NIRSpec Detector Subsystem
NASA Technical Reports Server (NTRS)
Arendt, Richard G.; Fixsen, D. J.; Greenhouse, Matthew A.; Lander, Matthew; Lindler, Don; Loose, Markus; Moseley, S. H.; Mott, D. Brent; Rauscher, Bernard J.; Wen, Yiting;
2013-01-01
We present principal component analysis (PCA) of a flight-representative James Webb Space Telescope NearInfrared Spectrograph (NIRSpec) Detector Subsystem. Although our results are specific to NIRSpec and its T - 40 K SIDECAR ASICs and 5 m cutoff H2RG detector arrays, the underlying technical approach is more general. We describe how we measured the systems response to small environmental perturbations by modulating a set of bias voltages and temperature. We used this information to compute the systems principal noise components. Together with information from the astronomical scene, we show how the zeroth principal component can be used to calibrate out the effects of small thermal and electrical instabilities to produce cosmetically cleaner images with significantly less correlated noise. Alternatively, if one were designing a new instrument, one could use a similar PCA approach to inform a set of environmental requirements (temperature stability, electrical stability, etc.) that enabled the planned instrument to meet performance requirements
Ghosh, Debasree; Chattopadhyay, Parimal
2012-06-01
The objective of the work was to use the method of quantitative descriptive analysis (QDA) to describe the sensory attributes of the fermented food products prepared with the incorporation of lactic cultures. Panellists were selected and trained to evaluate various attributes specially color and appearance, body texture, flavor, overall acceptability and acidity of the fermented food products like cow milk curd and soymilk curd, idli, sauerkraut and probiotic ice cream. Principal component analysis (PCA) identified the six significant principal components that accounted for more than 90% of the variance in the sensory attribute data. Overall product quality was modelled as a function of principal components using multiple least squares regression (R (2) = 0.8). The result from PCA was statistically analyzed by analysis of variance (ANOVA). These findings demonstrate the utility of quantitative descriptive analysis for identifying and measuring the fermented food product attributes that are important for consumer acceptability.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Waldmann, I. P., E-mail: ingo@star.ucl.ac.uk
2014-01-01
Independent component analysis (ICA) has recently been shown to be a promising new path in data analysis and de-trending of exoplanetary time series signals. Such approaches do not require or assume any prior or auxiliary knowledge about the data or instrument in order to de-convolve the astrophysical light curve signal from instrument or stellar systematic noise. These methods are often known as 'blind-source separation' (BSS) algorithms. Unfortunately, all BSS methods suffer from an amplitude and sign ambiguity of their de-convolved components, which severely limits these methods in low signal-to-noise (S/N) observations where their scalings cannot be determined otherwise. Here wemore » present a novel approach to calibrate ICA using sparse wavelet calibrators. The Amplitude Calibrated Independent Component Analysis (ACICA) allows for the direct retrieval of the independent components' scalings and the robust de-trending of low S/N data. Such an approach gives us an unique and unprecedented insight in the underlying morphology of a data set, which makes this method a powerful tool for exoplanetary data de-trending and signal diagnostics.« less
NASA Astrophysics Data System (ADS)
Lim, Hoong-Ta; Murukeshan, Vadakke Matham
2017-06-01
Hyperspectral imaging combines imaging and spectroscopy to provide detailed spectral information for each spatial point in the image. This gives a three-dimensional spatial-spatial-spectral datacube with hundreds of spectral images. Probe-based hyperspectral imaging systems have been developed so that they can be used in regions where conventional table-top platforms would find it difficult to access. A fiber bundle, which is made up of specially-arranged optical fibers, has recently been developed and integrated with a spectrograph-based hyperspectral imager. This forms a snapshot hyperspectral imaging probe, which is able to form a datacube using the information from each scan. Compared to the other configurations, which require sequential scanning to form a datacube, the snapshot configuration is preferred in real-time applications where motion artifacts and pixel misregistration can be minimized. Principal component analysis is a dimension-reducing technique that can be applied in hyperspectral imaging to convert the spectral information into uncorrelated variables known as principal components. A confidence ellipse can be used to define the region of each class in the principal component feature space and for classification. This paper demonstrates the use of the snapshot hyperspectral imaging probe to acquire data from samples of different colors. The spectral library of each sample was acquired and then analyzed using principal component analysis. Confidence ellipse was then applied to the principal components of each sample and used as the classification criteria. The results show that the applied analysis can be used to perform classification of the spectral data acquired using the snapshot hyperspectral imaging probe.
Pepper seed variety identification based on visible/near-infrared spectral technology
NASA Astrophysics Data System (ADS)
Li, Cuiling; Wang, Xiu; Meng, Zhijun; Fan, Pengfei; Cai, Jichen
2016-11-01
Pepper is a kind of important fruit vegetable, with the expansion of pepper hybrid planting area, detection of pepper seed purity is especially important. This research used visible/near infrared (VIS/NIR) spectral technology to detect the variety of single pepper seed, and chose hybrid pepper seeds "Zhuo Jiao NO.3", "Zhuo Jiao NO.4" and "Zhuo Jiao NO.5" as research sample. VIS/NIR spectral data of 80 "Zhuo Jiao NO.3", 80 "Zhuo Jiao NO.4" and 80 "Zhuo Jiao NO.5" pepper seeds were collected, and the original spectral data was pretreated with standard normal variable (SNV) transform, first derivative (FD), and Savitzky-Golay (SG) convolution smoothing methods. Principal component analysis (PCA) method was adopted to reduce the dimension of the spectral data and extract principal components, according to the distribution of the first principal component (PC1) along with the second principal component(PC2) in the twodimensional plane, similarly, the distribution of PC1 coupled with the third principal component(PC3), and the distribution of PC2 combined with PC3, distribution areas of three varieties of pepper seeds were divided in each twodimensional plane, and the discriminant accuracy of PCA was tested through observing the distribution area of samples' principal components in validation set. This study combined PCA and linear discriminant analysis (LDA) to identify single pepper seed varieties, results showed that with the FD preprocessing method, the discriminant accuracy of pepper seed varieties was 98% for validation set, it concludes that using VIS/NIR spectral technology is feasible for identification of single pepper seed varieties.
Long, J.M.; Fisher, W.L.
2006-01-01
We present a method for spatial interpretation of environmental variation in a reservoir that integrates principal components analysis (PCA) of environmental data with geographic information systems (GIS). To illustrate our method, we used data from a Great Plains reservoir (Skiatook Lake, Oklahoma) with longitudinal variation in physicochemical conditions. We measured 18 physicochemical features, mapped them using GIS, and then calculated and interpreted four principal components. Principal component 1 (PC1) was readily interpreted as longitudinal variation in water chemistry, but the other principal components (PC2-4) were difficult to interpret. Site scores for PC1-4 were calculated in GIS by summing weighted overlays of the 18 measured environmental variables, with the factor loadings from the PCA as the weights. PC1-4 were then ordered into a landscape hierarchy, an emergent property of this technique, which enabled their interpretation. PC1 was interpreted as a reservoir scale change in water chemistry, PC2 was a microhabitat variable of rip-rap substrate, PC3 identified coves/embayments and PC4 consisted of shoreline microhabitats related to slope. The use of GIS improved our ability to interpret the more obscure principal components (PC2-4), which made the spatial variability of the reservoir environment more apparent. This method is applicable to a variety of aquatic systems, can be accomplished using commercially available software programs, and allows for improved interpretation of the geographic environmental variability of a system compared to using typical PCA plots. ?? Copyright by the North American Lake Management Society 2006.
Giesen, E B W; Ding, M; Dalstra, M; van Eijden, T M G J
2003-09-01
As several morphological parameters of cancellous bone express more or less the same architectural measure, we applied principal components analysis to group these measures and correlated these to the mechanical properties. Cylindrical specimens (n = 24) were obtained in different orientations from embalmed mandibular condyles; the angle of the first principal direction and the axis of the specimen, expressing the orientation of the trabeculae, ranged from 10 degrees to 87 degrees. Morphological parameters were determined by a method based on Archimedes' principle and by micro-CT scanning, and the mechanical properties were obtained by mechanical testing. The principal components analysis was used to obtain a set of independent components to describe the morphology. This set was entered into linear regression analyses for explaining the variance in mechanical properties. The principal components analysis revealed four components: amount of bone, number of trabeculae, trabecular orientation, and miscellaneous. They accounted for about 90% of the variance in the morphological variables. The component loadings indicated that a higher amount of bone was primarily associated with more plate-like trabeculae, and not with more or thicker trabeculae. The trabecular orientation was most determinative (about 50%) in explaining stiffness, strength, and failure energy. The amount of bone was second most determinative and increased the explained variance to about 72%. These results suggest that trabecular orientation and amount of bone are important in explaining the anisotropic mechanical properties of the cancellous bone of the mandibular condyle.
2017-01-01
Introduction This research paper aims to assess factors reported by parents associated with the successful transition of children with complex additional support requirements that have undergone a transition between school environments from 8 European Union member states. Methods Quantitative data were collected from 306 parents within education systems from 8 EU member states (Bulgaria, Cyprus, Greece, Ireland, the Netherlands, Romania, Spain and the UK). The data were derived from an online questionnaire and consisted of 41 questions. Information was collected on: parental involvement in their child’s transition, child involvement in transition, child autonomy, school ethos, professionals’ involvement in transition and integrated working, such as, joint assessment, cooperation and coordination between agencies. Survey questions that were designed on a Likert-scale were included in the Principal Components Analysis (PCA), additional survey questions, along with the results from the PCA, were used to build a logistic regression model. Results Four principal components were identified accounting for 48.86% of the variability in the data. Principal component 1 (PC1), ‘child inclusive ethos,’ contains 16.17% of the variation. Principal component 2 (PC2), which represents child autonomy and involvement, is responsible for 8.52% of the total variation. Principal component 3 (PC3) contains questions relating to parental involvement and contributed to 12.26% of the overall variation. Principal component 4 (PC4), which involves transition planning and coordination, contributed to 11.91% of the overall variation. Finally, the principal components were included in a logistic regression to evaluate the relationship between inclusion and a successful transition, as well as whether other factors that may have influenced transition. All four principal components were significantly associated with a successful transition, with PC1 being having the most effect (OR: 4.04, CI: 2.43–7.18, p<0.0001). Discussion To support a child with complex additional support requirements through transition from special school to mainstream, governments and professionals need to ensure children with additional support requirements and their parents are at the centre of all decisions that affect them. It is important that professionals recognise the educational, psychological, social and cultural contexts of a child with additional support requirements and their families which will provide a holistic approach and remove barriers for learning. PMID:28636649
Ravenscroft, John; Wazny, Kerri; Davis, John M
2017-01-01
This research paper aims to assess factors reported by parents associated with the successful transition of children with complex additional support requirements that have undergone a transition between school environments from 8 European Union member states. Quantitative data were collected from 306 parents within education systems from 8 EU member states (Bulgaria, Cyprus, Greece, Ireland, the Netherlands, Romania, Spain and the UK). The data were derived from an online questionnaire and consisted of 41 questions. Information was collected on: parental involvement in their child's transition, child involvement in transition, child autonomy, school ethos, professionals' involvement in transition and integrated working, such as, joint assessment, cooperation and coordination between agencies. Survey questions that were designed on a Likert-scale were included in the Principal Components Analysis (PCA), additional survey questions, along with the results from the PCA, were used to build a logistic regression model. Four principal components were identified accounting for 48.86% of the variability in the data. Principal component 1 (PC1), 'child inclusive ethos,' contains 16.17% of the variation. Principal component 2 (PC2), which represents child autonomy and involvement, is responsible for 8.52% of the total variation. Principal component 3 (PC3) contains questions relating to parental involvement and contributed to 12.26% of the overall variation. Principal component 4 (PC4), which involves transition planning and coordination, contributed to 11.91% of the overall variation. Finally, the principal components were included in a logistic regression to evaluate the relationship between inclusion and a successful transition, as well as whether other factors that may have influenced transition. All four principal components were significantly associated with a successful transition, with PC1 being having the most effect (OR: 4.04, CI: 2.43-7.18, p<0.0001). To support a child with complex additional support requirements through transition from special school to mainstream, governments and professionals need to ensure children with additional support requirements and their parents are at the centre of all decisions that affect them. It is important that professionals recognise the educational, psychological, social and cultural contexts of a child with additional support requirements and their families which will provide a holistic approach and remove barriers for learning.
Ibrahim, George M; Morgan, Benjamin R; Macdonald, R Loch
2014-03-01
Predictors of outcome after aneurysmal subarachnoid hemorrhage have been determined previously through hypothesis-driven methods that often exclude putative covariates and require a priori knowledge of potential confounders. Here, we apply a data-driven approach, principal component analysis, to identify baseline patient phenotypes that may predict neurological outcomes. Principal component analysis was performed on 120 subjects enrolled in a prospective randomized trial of clazosentan for the prevention of angiographic vasospasm. Correlation matrices were created using a combination of Pearson, polyserial, and polychoric regressions among 46 variables. Scores of significant components (with eigenvalues>1) were included in multivariate logistic regression models with incidence of severe angiographic vasospasm, delayed ischemic neurological deficit, and long-term outcome as outcomes of interest. Sixteen significant principal components accounting for 74.6% of the variance were identified. A single component dominated by the patients' initial hemodynamic status, World Federation of Neurosurgical Societies score, neurological injury, and initial neutrophil/leukocyte counts was significantly associated with poor outcome. Two additional components were associated with angiographic vasospasm, of which one was also associated with delayed ischemic neurological deficit. The first was dominated by the aneurysm-securing procedure, subarachnoid clot clearance, and intracerebral hemorrhage, whereas the second had high contributions from markers of anemia and albumin levels. Principal component analysis, a data-driven approach, identified patient phenotypes that are associated with worse neurological outcomes. Such data reduction methods may provide a better approximation of unique patient phenotypes and may inform clinical care as well as patient recruitment into clinical trials. http://www.clinicaltrials.gov. Unique identifier: NCT00111085.
Principal components of wrist circumduction from electromagnetic surgical tracking.
Rasquinha, Brian J; Rainbow, Michael J; Zec, Michelle L; Pichora, David R; Ellis, Randy E
2017-02-01
An electromagnetic (EM) surgical tracking system was used for a functionally calibrated kinematic analysis of wrist motion. Circumduction motions were tested for differences in subject gender and for differences in the sense of the circumduction as clockwise or counter-clockwise motion. Twenty subjects were instrumented for EM tracking. Flexion-extension motion was used to identify the functional axis. Subjects performed unconstrained wrist circumduction in a clockwise and counter-clockwise sense. Data were decomposed into orthogonal flexion-extension motions and radial-ulnar deviation motions. PCA was used to concisely represent motions. Nonparametric Wilcoxon tests were used to distinguish the groups. Flexion-extension motions were projected onto a direction axis with a root-mean-square error of [Formula: see text]. Using the first three principal components, there was no statistically significant difference in gender (all [Formula: see text]). For motion sense, radial-ulnar deviation distinguished the sense of circumduction in the first principal component ([Formula: see text]) and in the third principal component ([Formula: see text]); flexion-extension distinguished the sense in the second principal component ([Formula: see text]). The clockwise sense of circumduction could be distinguished by a multifactorial combination of components; there were no gender differences in this small population. These data constitute a baseline for normal wrist circumduction. The multifactorial PCA findings suggest that a higher-dimensional method, such as manifold analysis, may be a more concise way of representing circumduction in human joints.
HIGH DIMENSIONAL COVARIANCE MATRIX ESTIMATION IN APPROXIMATE FACTOR MODELS
Fan, Jianqing; Liao, Yuan; Mincheva, Martina
2012-01-01
The variance covariance matrix plays a central role in the inferential theories of high dimensional factor models in finance and economics. Popular regularization methods of directly exploiting sparsity are not directly applicable to many financial problems. Classical methods of estimating the covariance matrices are based on the strict factor models, assuming independent idiosyncratic components. This assumption, however, is restrictive in practical applications. By assuming sparse error covariance matrix, we allow the presence of the cross-sectional correlation even after taking out common factors, and it enables us to combine the merits of both methods. We estimate the sparse covariance using the adaptive thresholding technique as in Cai and Liu (2011), taking into account the fact that direct observations of the idiosyncratic components are unavailable. The impact of high dimensionality on the covariance matrix estimation based on the factor structure is then studied. PMID:22661790
Introduction to uses and interpretation of principal component analyses in forest biology.
J. G. Isebrands; Thomas R. Crow
1975-01-01
The application of principal component analysis for interpretation of multivariate data sets is reviewed with emphasis on (1) reduction of the number of variables, (2) ordination of variables, and (3) applications in conjunction with multiple regression.
Principal component analysis of phenolic acid spectra
USDA-ARS?s Scientific Manuscript database
Phenolic acids are common plant metabolites that exhibit bioactive properties and have applications in functional food and animal feed formulations. The ultraviolet (UV) and infrared (IR) spectra of four closely related phenolic acid structures were evaluated by principal component analysis (PCA) to...
Optimal pattern synthesis for speech recognition based on principal component analysis
NASA Astrophysics Data System (ADS)
Korsun, O. N.; Poliyev, A. V.
2018-02-01
The algorithm for building an optimal pattern for the purpose of automatic speech recognition, which increases the probability of correct recognition, is developed and presented in this work. The optimal pattern forming is based on the decomposition of an initial pattern to principal components, which enables to reduce the dimension of multi-parameter optimization problem. At the next step the training samples are introduced and the optimal estimates for principal components decomposition coefficients are obtained by a numeric parameter optimization algorithm. Finally, we consider the experiment results that show the improvement in speech recognition introduced by the proposed optimization algorithm.
NASA Astrophysics Data System (ADS)
Gao, Yang; Chen, Maomao; Wu, Junyu; Zhou, Yuan; Cai, Chuangjian; Wang, Daliang; Luo, Jianwen
2017-09-01
Fluorescence molecular imaging has been used to target tumors in mice with xenograft tumors. However, tumor imaging is largely distorted by the aggregation of fluorescent probes in the liver. A principal component analysis (PCA)-based strategy was applied on the in vivo dynamic fluorescence imaging results of three mice with xenograft tumors to facilitate tumor imaging, with the help of a tumor-specific fluorescent probe. Tumor-relevant features were extracted from the original images by PCA and represented by the principal component (PC) maps. The second principal component (PC2) map represented the tumor-related features, and the first principal component (PC1) map retained the original pharmacokinetic profiles, especially of the liver. The distribution patterns of the PC2 map of the tumor-bearing mice were in good agreement with the actual tumor location. The tumor-to-liver ratio and contrast-to-noise ratio were significantly higher on the PC2 map than on the original images, thus distinguishing the tumor from its nearby fluorescence noise of liver. The results suggest that the PC2 map could serve as a bioimaging marker to facilitate in vivo tumor localization, and dynamic fluorescence molecular imaging with PCA could be a valuable tool for future studies of in vivo tumor metabolism and progression.
NASA Astrophysics Data System (ADS)
Ueki, Kenta; Iwamori, Hikaru
2017-10-01
In this study, with a view of understanding the structure of high-dimensional geochemical data and discussing the chemical processes at work in the evolution of arc magmas, we employed principal component analysis (PCA) to evaluate the compositional variations of volcanic rocks from the Sengan volcanic cluster of the Northeastern Japan Arc. We analyzed the trace element compositions of various arc volcanic rocks, sampled from 17 different volcanoes in a volcanic cluster. The PCA results demonstrated that the first three principal components accounted for 86% of the geochemical variation in the magma of the Sengan region. Based on the relationships between the principal components and the major elements, the mass-balance relationships with respect to the contributions of minerals, the composition of plagioclase phenocrysts, geothermal gradient, and seismic velocity structure in the crust, the first, the second, and the third principal components appear to represent magma mixing, crystallizations of olivine/pyroxene, and crystallizations of plagioclase, respectively. These represented 59%, 20%, and 6%, respectively, of the variance in the entire compositional range, indicating that magma mixing accounted for the largest variance in the geochemical variation of the arc magma. Our result indicated that crustal processes dominate the geochemical variation of magma in the Sengan volcanic cluster.
NASA Astrophysics Data System (ADS)
Moonon, Altan-Ulzii; Hu, Jianwen; Li, Shutao
2015-12-01
The remote sensing image fusion is an important preprocessing technique in remote sensing image processing. In this paper, a remote sensing image fusion method based on the nonsubsampled shearlet transform (NSST) with sparse representation (SR) is proposed. Firstly, the low resolution multispectral (MS) image is upsampled and color space is transformed from Red-Green-Blue (RGB) to Intensity-Hue-Saturation (IHS). Then, the high resolution panchromatic (PAN) image and intensity component of MS image are decomposed by NSST to high and low frequency coefficients. The low frequency coefficients of PAN and the intensity component are fused by the SR with the learned dictionary. The high frequency coefficients of intensity component and PAN image are fused by local energy based fusion rule. Finally, the fused result is obtained by performing inverse NSST and inverse IHS transform. The experimental results on IKONOS and QuickBird satellites demonstrate that the proposed method provides better spectral quality and superior spatial information in the fused image than other remote sensing image fusion methods both in visual effect and object evaluation.
NASA Astrophysics Data System (ADS)
Hyman, J. D.; Aldrich, G.; Viswanathan, H.; Makedonska, N.; Karra, S.
2016-08-01
We characterize how different fracture size-transmissivity relationships influence flow and transport simulations through sparse three-dimensional discrete fracture networks. Although it is generally accepted that there is a positive correlation between a fracture's size and its transmissivity/aperture, the functional form of that relationship remains a matter of debate. Relationships that assume perfect correlation, semicorrelation, and noncorrelation between the two have been proposed. To study the impact that adopting one of these relationships has on transport properties, we generate multiple sparse fracture networks composed of circular fractures whose radii follow a truncated power law distribution. The distribution of transmissivities are selected so that the mean transmissivity of the fracture networks are the same and the distributions of aperture and transmissivity in models that include a stochastic term are also the same. We observe that adopting a correlation between a fracture size and its transmissivity leads to earlier breakthrough times and higher effective permeability when compared to networks where no correlation is used. While fracture network geometry plays the principal role in determining where transport occurs within the network, the relationship between size and transmissivity controls the flow speed. These observations indicate DFN modelers should be aware that breakthrough times and effective permeabilities can be strongly influenced by such a relationship in addition to fracture and network statistics.
NASA Astrophysics Data System (ADS)
Hyman, J.; Aldrich, G. A.; Viswanathan, H. S.; Makedonska, N.; Karra, S.
2016-12-01
We characterize how different fracture size-transmissivity relationships influence flow and transport simulations through sparse three-dimensional discrete fracture networks. Although it is generally accepted that there is a positive correlation between a fracture's size and its transmissivity/aperture, the functional form of that relationship remains a matter of debate. Relationships that assume perfect correlation, semi-correlation, and non-correlation between the two have been proposed. To study the impact that adopting one of these relationships has on transport properties, we generate multiple sparse fracture networks composed of circular fractures whose radii follow a truncated power law distribution. The distribution of transmissivities are selected so that the mean transmissivity of the fracture networks are the same and the distributions of aperture and transmissivity in models that include a stochastic term are also the same.We observe that adopting a correlation between a fracture size and its transmissivity leads to earlier breakthrough times and higher effective permeability when compared to networks where no correlation is used. While fracture network geometry plays the principal role in determining where transport occurs within the network, the relationship between size and transmissivity controls the flow speed. These observations indicate DFN modelers should be aware that breakthrough times and effective permeabilities can be strongly influenced by such a relationship in addition to fracture and network statistics.
ERIC Educational Resources Information Center
Kronenberger, William G.; Thompson, Robert J., Jr.; Morrow, Catherine
1997-01-01
A principal components analysis of the Family Environment Scale (FES) (R. Moos and B. Moos, 1994) was performed using 113 undergraduates. Research supported 3 broad components encompassing the 10 FES subscales. These results supported previous research and the generalization of the FES to college samples. (SLD)
Time series analysis of collective motions in proteins
NASA Astrophysics Data System (ADS)
Alakent, Burak; Doruker, Pemra; ćamurdan, Mehmet C.
2004-01-01
The dynamics of α-amylase inhibitor tendamistat around its native state is investigated using time series analysis of the principal components of the Cα atomic displacements obtained from molecular dynamics trajectories. Collective motion along a principal component is modeled as a homogeneous nonstationary process, which is the result of the damped oscillations in local minima superimposed on a random walk. The motion in local minima is described by a stationary autoregressive moving average model, consisting of the frequency, damping factor, moving average parameters and random shock terms. Frequencies for the first 50 principal components are found to be in the 3-25 cm-1 range, which are well correlated with the principal component indices and also with atomistic normal mode analysis results. Damping factors, though their correlation is less pronounced, decrease as principal component indices increase, indicating that low frequency motions are less affected by friction. The existence of a positive moving average parameter indicates that the stochastic force term is likely to disturb the mode in opposite directions for two successive sampling times, showing the modes tendency to stay close to minimum. All these four parameters affect the mean square fluctuations of a principal mode within a single minimum. The inter-minima transitions are described by a random walk model, which is driven by a random shock term considerably smaller than that for the intra-minimum motion. The principal modes are classified into three subspaces based on their dynamics: essential, semiconstrained, and constrained, at least in partial consistency with previous studies. The Gaussian-type distributions of the intermediate modes, called "semiconstrained" modes, are explained by asserting that this random walk behavior is not completely free but between energy barriers.
Qi, Jin; Yang, Zhiyong
2014-01-01
Real-time human activity recognition is essential for human-robot interactions for assisted healthy independent living. Most previous work in this area is performed on traditional two-dimensional (2D) videos and both global and local methods have been used. Since 2D videos are sensitive to changes of lighting condition, view angle, and scale, researchers begun to explore applications of 3D information in human activity understanding in recently years. Unfortunately, features that work well on 2D videos usually don't perform well on 3D videos and there is no consensus on what 3D features should be used. Here we propose a model of human activity recognition based on 3D movements of body joints. Our method has three steps, learning dictionaries of sparse codes of 3D movements of joints, sparse coding, and classification. In the first step, space-time volumes of 3D movements of body joints are obtained via dense sampling and independent component analysis is then performed to construct a dictionary of sparse codes for each activity. In the second step, the space-time volumes are projected to the dictionaries and a set of sparse histograms of the projection coefficients are constructed as feature representations of the activities. Finally, the sparse histograms are used as inputs to a support vector machine to recognize human activities. We tested this model on three databases of human activities and found that it outperforms the state-of-the-art algorithms. Thus, this model can be used for real-time human activity recognition in many applications.
Burst and Principal Components Analyses of MEA Data Separates Chemicals by Class
Microelectrode arrays (MEAs) detect drug and chemical induced changes in action potential "spikes" in neuronal networks and can be used to screen chemicals for neurotoxicity. Analytical "fingerprinting," using Principal Components Analysis (PCA) on spike trains recorded from prim...
EVALUATION OF ACID DEPOSITION MODELS USING PRINCIPAL COMPONENT SPACES
An analytical technique involving principal components analysis is proposed for use in the evaluation of acid deposition models. elationships among model predictions are compared to those among measured data, rather than the more common one-to-one comparison of predictions to mea...
Fast and low-dose computed laminography using compressive sensing based technique
DOE Office of Scientific and Technical Information (OSTI.GOV)
Abbas, Sajid, E-mail: scho@kaist.ac.kr; Park, Miran, E-mail: scho@kaist.ac.kr; Cho, Seungryong, E-mail: scho@kaist.ac.kr
2015-03-31
Computed laminography (CL) is well known for inspecting microstructures in the materials, weldments and soldering defects in high density packed components or multilayer printed circuit boards. The overload problem on x-ray tube and gross failure of the radio-sensitive electronics devices during a scan are among important issues in CL which needs to be addressed. The sparse-view CL can be one of the viable option to overcome such issues. In this work a numerical aluminum welding phantom was simulated to collect sparsely sampled projection data at only 40 views using a conventional CL scanning scheme i.e. oblique scan. A compressive-sensing inspiredmore » total-variation (TV) minimization algorithm was utilized to reconstruct the images. It is found that the images reconstructed using sparse view data are visually comparable with the images reconstructed using full scan data set i.e. at 360 views on regular interval. We have quantitatively confirmed that tiny structures such as copper and tungsten slags, and copper flakes in the reconstructed images from sparsely sampled data are comparable with the corresponding structure present in the fully sampled data case. A blurring effect can be seen near the edges of few pores at the bottom of the reconstructed images from sparsely sampled data, despite the overall image quality is reasonable for fast and low-dose NDT.« less
Principal components analysis in clinical studies.
Zhang, Zhongheng; Castelló, Adela
2017-09-01
In multivariate analysis, independent variables are usually correlated to each other which can introduce multicollinearity in the regression models. One approach to solve this problem is to apply principal components analysis (PCA) over these variables. This method uses orthogonal transformation to represent sets of potentially correlated variables with principal components (PC) that are linearly uncorrelated. PCs are ordered so that the first PC has the largest possible variance and only some components are selected to represent the correlated variables. As a result, the dimension of the variable space is reduced. This tutorial illustrates how to perform PCA in R environment, the example is a simulated dataset in which two PCs are responsible for the majority of the variance in the data. Furthermore, the visualization of PCA is highlighted.
Complexity of free energy landscapes of peptides revealed by nonlinear principal component analysis.
Nguyen, Phuong H
2006-12-01
Employing the recently developed hierarchical nonlinear principal component analysis (NLPCA) method of Saegusa et al. (Neurocomputing 2004;61:57-70 and IEICE Trans Inf Syst 2005;E88-D:2242-2248), the complexities of the free energy landscapes of several peptides, including triglycine, hexaalanine, and the C-terminal beta-hairpin of protein G, were studied. First, the performance of this NLPCA method was compared with the standard linear principal component analysis (PCA). In particular, we compared two methods according to (1) the ability of the dimensionality reduction and (2) the efficient representation of peptide conformations in low-dimensional spaces spanned by the first few principal components. The study revealed that NLPCA reduces the dimensionality of the considered systems much better, than did PCA. For example, in order to get the similar error, which is due to representation of the original data of beta-hairpin in low dimensional space, one needs 4 and 21 principal components of NLPCA and PCA, respectively. Second, by representing the free energy landscapes of the considered systems as a function of the first two principal components obtained from PCA, we obtained the relatively well-structured free energy landscapes. In contrast, the free energy landscapes of NLPCA are much more complicated, exhibiting many states which are hidden in the PCA maps, especially in the unfolded regions. Furthermore, the study also showed that many states in the PCA maps are mixed up by several peptide conformations, while those of the NLPCA maps are more pure. This finding suggests that the NLPCA should be used to capture the essential features of the systems. (c) 2006 Wiley-Liss, Inc.
Jović, Ozren; Smolić, Tomislav; Primožič, Ines; Hrenar, Tomica
2016-04-19
The aim of this study was to investigate the feasibility of FTIR-ATR spectroscopy coupled with the multivariate numerical methodology for qualitative and quantitative analysis of binary and ternary edible oil mixtures. Four pure oils (extra virgin olive oil, high oleic sunflower oil, rapeseed oil, and sunflower oil), as well as their 54 binary and 108 ternary mixtures, were analyzed using FTIR-ATR spectroscopy in combination with principal component and discriminant analysis, partial least-squares, and principal component regression. It was found that the composition of all 166 samples can be excellently represented using only the first three principal components describing 98.29% of total variance in the selected spectral range (3035-2989, 1170-1140, 1120-1100, 1093-1047, and 930-890 cm(-1)). Factor scores in 3D space spanned by these three principal components form a tetrahedral-like arrangement: pure oils being at the vertices, binary mixtures at the edges, and ternary mixtures on the faces of a tetrahedron. To confirm the validity of results, we applied several cross-validation methods. Quantitative analysis was performed by minimization of root-mean-square error of cross-validation values regarding the spectral range, derivative order, and choice of method (partial least-squares or principal component regression), which resulted in excellent predictions for test sets (R(2) > 0.99 in all cases). Additionally, experimentally more demanding gas chromatography analysis of fatty acid content was carried out for all specimens, confirming the results obtained by FTIR-ATR coupled with principal component analysis. However, FTIR-ATR provided a considerably better model for prediction of mixture composition than gas chromatography, especially for high oleic sunflower oil.
NASA Astrophysics Data System (ADS)
Li, Jiangtong; Luo, Yongdao; Dai, Honglin
2018-01-01
Water is the source of life and the essential foundation of all life. With the development of industrialization, the phenomenon of water pollution is becoming more and more frequent, which directly affects the survival and development of human. Water quality detection is one of the necessary measures to protect water resources. Ultraviolet (UV) spectral analysis is an important research method in the field of water quality detection, which partial least squares regression (PLSR) analysis method is becoming predominant technology, however, in some special cases, PLSR's analysis produce considerable errors. In order to solve this problem, the traditional principal component regression (PCR) analysis method was improved by using the principle of PLSR in this paper. The experimental results show that for some special experimental data set, improved PCR analysis method performance is better than PLSR. The PCR and PLSR is the focus of this paper. Firstly, the principal component analysis (PCA) is performed by MATLAB to reduce the dimensionality of the spectral data; on the basis of a large number of experiments, the optimized principal component is extracted by using the principle of PLSR, which carries most of the original data information. Secondly, the linear regression analysis of the principal component is carried out with statistic package for social science (SPSS), which the coefficients and relations of principal components can be obtained. Finally, calculating a same water spectral data set by PLSR and improved PCR, analyzing and comparing two results, improved PCR and PLSR is similar for most data, but improved PCR is better than PLSR for data near the detection limit. Both PLSR and improved PCR can be used in Ultraviolet spectral analysis of water, but for data near the detection limit, improved PCR's result better than PLSR.
Vargas-Bello-Pérez, Einar; Toro-Mujica, Paula; Enriquez-Hidalgo, Daniel; Fellenberg, María Angélica; Gómez-Cortés, Pilar
2017-06-01
We used a multivariate chemometric approach to differentiate or associate retail bovine milks with different fat contents and non-dairy beverages, using fatty acid profiles and statistical analysis. We collected samples of bovine milk (whole, semi-skim, and skim; n = 62) and non-dairy beverages (n = 27), and we analyzed them using gas-liquid chromatography. Principal component analysis of the fatty acid data yielded 3 significant principal components, which accounted for 72% of the total variance in the data set. Principal component 1 was related to saturated fatty acids (C4:0, C6:0, C8:0, C12:0, C14:0, C17:0, and C18:0) and monounsaturated fatty acids (C14:1 cis-9, C16:1 cis-9, C17:1 cis-9, and C18:1 trans-11); whole milk samples were clearly differentiated from the rest using this principal component. Principal component 2 differentiated semi-skim milk samples by n-3 fatty acid content (C20:3n-3, C20:5n-3, and C22:6n-3). Principal component 3 was related to C18:2 trans-9,trans-12 and C20:4n-6, and its lower scores were observed in skim milk and non-dairy beverages. A cluster analysis yielded 3 groups: group 1 consisted of only whole milk samples, group 2 was represented mainly by semi-skim milks, and group 3 included skim milk and non-dairy beverages. Overall, the present study showed that a multivariate chemometric approach is a useful tool for differentiating or associating retail bovine milks and non-dairy beverages using their fatty acid profile. Copyright © 2017 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Use of multivariate statistics to identify unreliable data obtained using CASA.
Martínez, Luis Becerril; Crispín, Rubén Huerta; Mendoza, Maximino Méndez; Gallegos, Oswaldo Hernández; Martínez, Andrés Aragón
2013-06-01
In order to identify unreliable data in a dataset of motility parameters obtained from a pilot study acquired by a veterinarian with experience in boar semen handling, but without experience in the operation of a computer assisted sperm analysis (CASA) system, a multivariate graphical and statistical analysis was performed. Sixteen boar semen samples were aliquoted then incubated with varying concentrations of progesterone from 0 to 3.33 µg/ml and analyzed in a CASA system. After standardization of the data, Chernoff faces were pictured for each measurement, and a principal component analysis (PCA) was used to reduce the dimensionality and pre-process the data before hierarchical clustering. The first twelve individual measurements showed abnormal features when Chernoff faces were drawn. PCA revealed that principal components 1 and 2 explained 63.08% of the variance in the dataset. Values of principal components for each individual measurement of semen samples were mapped to identify differences among treatment or among boars. Twelve individual measurements presented low values of principal component 1. Confidence ellipses on the map of principal components showed no statistically significant effects for treatment or boar. Hierarchical clustering realized on two first principal components produced three clusters. Cluster 1 contained evaluations of the two first samples in each treatment, each one of a different boar. With the exception of one individual measurement, all other measurements in cluster 1 were the same as observed in abnormal Chernoff faces. Unreliable data in cluster 1 are probably related to the operator inexperience with a CASA system. These findings could be used to objectively evaluate the skill level of an operator of a CASA system. This may be particularly useful in the quality control of semen analysis using CASA systems.
Liu, Xiang; Guo, Ling-Peng; Zhang, Fei-Yun; Ma, Jie; Mu, Shu-Yong; Zhao, Xin; Li, Lan-Hai
2015-02-01
Eight physical and chemical indicators related to water quality were monitored from nineteen sampling sites along the Kunes River at the end of snowmelt season in spring. To investigate the spatial distribution characteristics of water physical and chemical properties, cluster analysis (CA), discriminant analysis (DA) and principal component analysis (PCA) are employed. The result of cluster analysis showed that the Kunes River could be divided into three reaches according to the similarities of water physical and chemical properties among sampling sites, representing the upstream, midstream and downstream of the river, respectively; The result of discriminant analysis demonstrated that the reliability of such a classification was high, and DO, Cl- and BOD5 were the significant indexes leading to this classification; Three principal components were extracted on the basis of the principal component analysis, in which accumulative variance contribution could reach 86.90%. The result of principal component analysis also indicated that water physical and chemical properties were mostly affected by EC, ORP, NO3(-) -N, NH4(+) -N, Cl- and BOD5. The sorted results of principal component scores in each sampling sites showed that the water quality was mainly influenced by DO in upstream, by pH in midstream, and by the rest of indicators in downstream. The order of comprehensive scores for principal components revealed that the water quality degraded from the upstream to downstream, i.e., the upstream had the best water quality, followed by the midstream, while the water quality at downstream was the worst. This result corresponded exactly to the three reaches classified using cluster analysis. Anthropogenic activity and the accumulation of pollutants along the river were probably the main reasons leading to this spatial difference.
Putilov, Arcady A; Donskaya, Olga G
2016-01-01
Age-associated changes in different bandwidths of the human electroencephalographic (EEG) spectrum are well documented, but their functional significance is poorly understood. This spectrum seems to represent summation of simultaneous influences of several sleep-wake regulatory processes. Scoring of its orthogonal (uncorrelated) principal components can help in separation of the brain signatures of these processes. In particular, the opposite age-associated changes were documented for scores on the two largest (1st and 2nd) principal components of the sleep EEG spectrum. A decrease of the first score and an increase of the second score can reflect, respectively, the weakening of the sleep drive and disinhibition of the opposing wake drive with age. In order to support the suggestion of age-associated disinhibition of the wake drive from the antagonistic influence of the sleep drive, we analyzed principal component scores of the resting EEG spectra obtained in sleep deprivation experiments with 81 healthy young adults aged between 19 and 26 and 40 healthy older adults aged between 45 and 66 years. At the second day of the sleep deprivation experiments, frontal scores on the 1st principal component of the EEG spectrum demonstrated an age-associated reduction of response to eyes closed relaxation. Scores on the 2nd principal component were either initially increased during wakefulness or less responsive to such sleep-provoking conditions (frontal and occipital scores, respectively). These results are in line with the suggestion of disinhibition of the wake drive with age. They provide an explanation of why older adults are less vulnerable to sleep deprivation than young adults.
NASA Astrophysics Data System (ADS)
Wojciechowski, Adam
2017-04-01
In order to assess ecodiversity understood as a comprehensive natural landscape factor (Jedicke 2001), it is necessary to apply research methods which recognize the environment in a holistic way. Principal component analysis may be considered as one of such methods as it allows to distinguish the main factors determining landscape diversity on the one hand, and enables to discover regularities shaping the relationships between various elements of the environment under study on the other hand. The procedure adopted to assess ecodiversity with the use of principal component analysis involves: a) determining and selecting appropriate factors of the assessed environment qualities (hypsometric, geological, hydrographic, plant, and others); b) calculating the absolute value of individual qualities for the basic areas under analysis (e.g. river length, forest area, altitude differences, etc.); c) principal components analysis and obtaining factor maps (maps of selected components); d) generating a resultant, detailed map and isolating several classes of ecodiversity. An assessment of ecodiversity with the use of principal component analysis was conducted in the test area of 299,67 km2 in Debnica Kaszubska commune. The whole commune is situated in the Weichselian glaciation area of high hypsometric and morphological diversity as well as high geo- and biodiversity. The analysis was based on topographical maps of the commune area in scale 1:25000 and maps of forest habitats. Consequently, nine factors reflecting basic environment elements were calculated: maximum height (m), minimum height (m), average height (m), the length of watercourses (km), the area of water reservoirs (m2), total forest area (ha), coniferous forests habitats area (ha), deciduous forest habitats area (ha), alder habitats area (ha). The values for individual factors were analysed for 358 grid cells of 1 km2. Based on the principal components analysis, four major factors affecting commune ecodiversity were distinguished: hypsometric component (PC1), deciduous forest habitats component (PC2), river valleys and alder habitats component (PC3), and lakes component (PC4). The distinguished factors characterise natural qualities of postglacial area and reflect well the role of the four most important groups of environment components in shaping ecodiversity of the area under study. The map of ecodiversity of Debnica Kaszubska commune was created on the basis of the first four principal component scores and then five classes of diversity were isolated: very low, low, average, high and very high. As a result of the assessment, five commune regions of very high ecodiversity were separated. These regions are also very attractive for tourists and valuable in terms of their rich nature which include protected areas such as Slupia Valley Landscape Park. The suggested method of ecodiversity assessment with the use of principal component analysis may constitute an alternative methodological proposition to other research methods used so far. Literature Jedicke E., 2001. Biodiversität, Geodiversität, Ökodiversität. Kriterien zur Analyse der Landschaftsstruktur - ein konzeptioneller Diskussionsbeitrag. Naturschutz und Landschaftsplanung, 33(2/3), 59-68.
A stochastic model of weather states and concurrent daily precipitation at multiple precipitation stations is described. our algorithms are invested for classification of daily weather states; k means, fuzzy clustering, principal components, and principal components coupled with ...
Rosacea assessment by erythema index and principal component analysis segmentation maps
NASA Astrophysics Data System (ADS)
Kuzmina, Ilona; Rubins, Uldis; Saknite, Inga; Spigulis, Janis
2017-12-01
RGB images of rosacea were analyzed using segmentation maps of principal component analysis (PCA) and erythema index (EI). Areas of segmented clusters were compared to Clinician's Erythema Assessment (CEA) values given by two dermatologists. The results show that visible blood vessels are segmented more precisely on maps of the erythema index and the third principal component (PC3). In many cases, a distribution of clusters on EI and PC3 maps are very similar. Mean values of clusters' areas on these maps show a decrease of the area of blood vessels and erythema and an increase of lighter skin area after the therapy for the patients with diagnosis CEA = 2 on the first visit and CEA=1 on the second visit. This study shows that EI and PC3 maps are more useful than the maps of the first (PC1) and second (PC2) principal components for indicating vascular structures and erythema on the skin of rosacea patients and therapy monitoring.
NASA Astrophysics Data System (ADS)
Zhang, Qiong; Peng, Cong; Lu, Yiming; Wang, Hao; Zhu, Kaiguang
2018-04-01
A novel technique is developed to level airborne geophysical data using principal component analysis based on flight line difference. In the paper, flight line difference is introduced to enhance the features of levelling error for airborne electromagnetic (AEM) data and improve the correlation between pseudo tie lines. Thus we conduct levelling to the flight line difference data instead of to the original AEM data directly. Pseudo tie lines are selected distributively cross profile direction, avoiding the anomalous regions. Since the levelling errors of selective pseudo tie lines show high correlations, principal component analysis is applied to extract the local levelling errors by low-order principal components reconstruction. Furthermore, we can obtain the levelling errors of original AEM data through inverse difference after spatial interpolation. This levelling method does not need to fly tie lines and design the levelling fitting function. The effectiveness of this method is demonstrated by the levelling results of survey data, comparing with the results from tie-line levelling and flight-line correlation levelling.
[Content of mineral elements of Gastrodia elata by principal components analysis].
Li, Jin-ling; Zhao, Zhi; Liu, Hong-chang; Luo, Chun-li; Huang, Ming-jin; Luo, Fu-lai; Wang, Hua-lei
2015-03-01
To study the content of mineral elements and the principal components in Gastrodia elata. Mineral elements were determined by ICP and the data was analyzed by SPSS. K element has the highest content-and the average content was 15.31 g x kg(-1). The average content of N element was 8.99 g x kg(-1), followed by K element. The coefficient of variation of K and N was small, but the Mn was the biggest with 51.39%. The highly significant positive correlation was found among N, P and K . Three principal components were selected by principal components analysis to evaluate the quality of G. elata. P, B, N, K, Cu, Mn, Fe and Mg were the characteristic elements of G. elata. The content of K and N elements was higher and relatively stable. The variation of Mn content was biggest. The quality of G. elata in Guizhou and Yunnan was better from the perspective of mineral elements.
Visualizing Hyolaryngeal Mechanics in Swallowing Using Dynamic MRI
Pearson, William G.; Zumwalt, Ann C.
2013-01-01
Introduction Coordinates of anatomical landmarks are captured using dynamic MRI to explore whether a proposed two-sling mechanism underlies hyolaryngeal elevation in pharyngeal swallowing. A principal components analysis (PCA) is applied to coordinates to determine the covariant function of the proposed mechanism. Methods Dynamic MRI (dMRI) data were acquired from eleven healthy subjects during a repeated swallows task. Coordinates mapping the proposed mechanism are collected from each dynamic (frame) of a dynamic MRI swallowing series of a randomly selected subject in order to demonstrate shape changes in a single subject. Coordinates representing minimum and maximum hyolaryngeal elevation of all 11 subjects were also mapped to demonstrate shape changes of the system among all subjects. MophoJ software was used to perform PCA and determine vectors of shape change (eigenvectors) for elements of the two-sling mechanism of hyolaryngeal elevation. Results For both single subject and group PCAs, hyolaryngeal elevation accounted for the first principal component of variation. For the single subject PCA, the first principal component accounted for 81.5% of the variance. For the between subjects PCA, the first principal component accounted for 58.5% of the variance. Eigenvectors and shape changes associated with this first principal component are reported. Discussion Eigenvectors indicate that two-muscle slings and associated skeletal elements function as components of a covariant mechanism to elevate the hyolaryngeal complex. Morphological analysis is useful to model shape changes in the two-sling mechanism of hyolaryngeal elevation. PMID:25090608
Panazzolo, Diogo G; Sicuro, Fernando L; Clapauch, Ruth; Maranhão, Priscila A; Bouskela, Eliete; Kraemer-Aguiar, Luiz G
2012-11-13
We aimed to evaluate the multivariate association between functional microvascular variables and clinical-laboratorial-anthropometrical measurements. Data from 189 female subjects (34.0 ± 15.5 years, 30.5 ± 7.1 kg/m2), who were non-smokers, non-regular drug users, without a history of diabetes and/or hypertension, were analyzed by principal component analysis (PCA). PCA is a classical multivariate exploratory tool because it highlights common variation between variables allowing inferences about possible biological meaning of associations between them, without pre-establishing cause-effect relationships. In total, 15 variables were used for PCA: body mass index (BMI), waist circumference, systolic and diastolic blood pressure (BP), fasting plasma glucose, levels of total cholesterol, high-density lipoprotein cholesterol (HDL-c), low-density lipoprotein cholesterol (LDL-c), triglycerides (TG), insulin, C-reactive protein (CRP), and functional microvascular variables measured by nailfold videocapillaroscopy. Nailfold videocapillaroscopy was used for direct visualization of nutritive capillaries, assessing functional capillary density, red blood cell velocity (RBCV) at rest and peak after 1 min of arterial occlusion (RBCV(max)), and the time taken to reach RBCV(max) (TRBCV(max)). A total of 35% of subjects had metabolic syndrome, 77% were overweight/obese, and 9.5% had impaired fasting glucose. PCA was able to recognize that functional microvascular variables and clinical-laboratorial-anthropometrical measurements had a similar variation. The first five principal components explained most of the intrinsic variation of the data. For example, principal component 1 was associated with BMI, waist circumference, systolic BP, diastolic BP, insulin, TG, CRP, and TRBCV(max) varying in the same way. Principal component 1 also showed a strong association among HDL-c, RBCV, and RBCV(max), but in the opposite way. Principal component 3 was associated only with microvascular variables in the same way (functional capillary density, RBCV and RBCV(max)). Fasting plasma glucose appeared to be related to principal component 4 and did not show any association with microvascular reactivity. In non-diabetic female subjects, a multivariate scenario of associations between classic clinical variables strictly related to obesity and metabolic syndrome suggests a significant relationship between these diseases and microvascular reactivity.
Visual Tracking via Sparse and Local Linear Coding.
Wang, Guofeng; Qin, Xueying; Zhong, Fan; Liu, Yue; Li, Hongbo; Peng, Qunsheng; Yang, Ming-Hsuan
2015-11-01
The state search is an important component of any object tracking algorithm. Numerous algorithms have been proposed, but stochastic sampling methods (e.g., particle filters) are arguably one of the most effective approaches. However, the discretization of the state space complicates the search for the precise object location. In this paper, we propose a novel tracking algorithm that extends the state space of particle observations from discrete to continuous. The solution is determined accurately via iterative linear coding between two convex hulls. The algorithm is modeled by an optimal function, which can be efficiently solved by either convex sparse coding or locality constrained linear coding. The algorithm is also very flexible and can be combined with many generic object representations. Thus, we first use sparse representation to achieve an efficient searching mechanism of the algorithm and demonstrate its accuracy. Next, two other object representation models, i.e., least soft-threshold squares and adaptive structural local sparse appearance, are implemented with improved accuracy to demonstrate the flexibility of our algorithm. Qualitative and quantitative experimental results demonstrate that the proposed tracking algorithm performs favorably against the state-of-the-art methods in dynamic scenes.
The factorial reliability of the Middlesex Hospital Questionnaire in normal subjects.
Bagley, C
1980-03-01
The internal reliability of the Middlesex Hospital Questionnaire and its component subscales has been checked by means of principal components analyses of data on 256 normal subjects. The subscales (with the possible exception of Hysteria) were found to contribute to the general underlying factor of psychoneurosis. In general, the principal components analysis points to the reliability of the subscales, despite some item overlap.
ERIC Educational Resources Information Center
McCormick, Ernest J.; And Others
The study deals with the job component method of establishing compensation rates. The basic job analysis questionnaire used in the study was the Position Analysis Questionnaire (PAQ) (Form B). On the basis of a principal components analysis of PAQ data for a large sample (2,688) of jobs, a number of principal components (job dimensions) were…
ERIC Educational Resources Information Center
Faginski-Stark, Erica; Casavant, Christopher; Collins, William; McCandless, Jason; Tencza, Marilyn
2012-01-01
Recent federal and state mandates have tasked school systems to move beyond principal evaluation as a bureaucratic function and to re-imagine it as a critical component to improve principal performance and compel school renewal. This qualitative study investigated the district leaders' and principals' perceptions of the performance evaluation…
2L-PCA: a two-level principal component analyzer for quantitative drug design and its applications.
Du, Qi-Shi; Wang, Shu-Qing; Xie, Neng-Zhong; Wang, Qing-Yan; Huang, Ri-Bo; Chou, Kuo-Chen
2017-09-19
A two-level principal component predictor (2L-PCA) was proposed based on the principal component analysis (PCA) approach. It can be used to quantitatively analyze various compounds and peptides about their functions or potentials to become useful drugs. One level is for dealing with the physicochemical properties of drug molecules, while the other level is for dealing with their structural fragments. The predictor has the self-learning and feedback features to automatically improve its accuracy. It is anticipated that 2L-PCA will become a very useful tool for timely providing various useful clues during the process of drug development.
Effect of noise in principal component analysis with an application to ozone pollution
NASA Astrophysics Data System (ADS)
Tsakiri, Katerina G.
This thesis analyzes the effect of independent noise in principal components of k normally distributed random variables defined by a covariance matrix. We prove that the principal components as well as the canonical variate pairs determined from joint distribution of original sample affected by noise can be essentially different in comparison with those determined from the original sample. However when the differences between the eigenvalues of the original covariance matrix are sufficiently large compared to the level of the noise, the effect of noise in principal components and canonical variate pairs proved to be negligible. The theoretical results are supported by simulation study and examples. Moreover, we compare our results about the eigenvalues and eigenvectors in the two dimensional case with other models examined before. This theory can be applied in any field for the decomposition of the components in multivariate analysis. One application is the detection and prediction of the main atmospheric factor of ozone concentrations on the example of Albany, New York. Using daily ozone, solar radiation, temperature, wind speed and precipitation data, we determine the main atmospheric factor for the explanation and prediction of ozone concentrations. A methodology is described for the decomposition of the time series of ozone and other atmospheric variables into the global term component which describes the long term trend and the seasonal variations, and the synoptic scale component which describes the short term variations. By using the Canonical Correlation Analysis, we show that solar radiation is the only main factor between the atmospheric variables considered here for the explanation and prediction of the global and synoptic scale component of ozone. The global term components are modeled by a linear regression model, while the synoptic scale components by a vector autoregressive model and the Kalman filter. The coefficient of determination, R2, for the prediction of the synoptic scale ozone component was found to be the highest when we consider the synoptic scale component of the time series for solar radiation and temperature. KEY WORDS: multivariate analysis; principal component; canonical variate pairs; eigenvalue; eigenvector; ozone; solar radiation; spectral decomposition; Kalman filter; time series prediction
Pure endmember extraction using robust kernel archetypoid analysis for hyperspectral imagery
NASA Astrophysics Data System (ADS)
Sun, Weiwei; Yang, Gang; Wu, Ke; Li, Weiyue; Zhang, Dianfa
2017-09-01
A robust kernel archetypoid analysis (RKADA) method is proposed to extract pure endmembers from hyperspectral imagery (HSI). The RKADA assumes that each pixel is a sparse linear mixture of all endmembers and each endmember corresponds to a real pixel in the image scene. First, it improves the re8gular archetypal analysis with a new binary sparse constraint, and the adoption of the kernel function constructs the principal convex hull in an infinite Hilbert space and enlarges the divergences between pairwise pixels. Second, the RKADA transfers the pure endmember extraction problem into an optimization problem by minimizing residual errors with the Huber loss function. The Huber loss function reduces the effects from big noises and outliers in the convergence procedure of RKADA and enhances the robustness of the optimization function. Third, the random kernel sinks for fast kernel matrix approximation and the two-stage algorithm for optimizing initial pure endmembers are utilized to improve its computational efficiency in realistic implementations of RKADA, respectively. The optimization equation of RKADA is solved by using the block coordinate descend scheme and the desired pure endmembers are finally obtained. Six state-of-the-art pure endmember extraction methods are employed to make comparisons with the RKADA on both synthetic and real Cuprite HSI datasets, including three geometrical algorithms vertex component analysis (VCA), alternative volume maximization (AVMAX) and orthogonal subspace projection (OSP), and three matrix factorization algorithms the preconditioning for successive projection algorithm (PreSPA), hierarchical clustering based on rank-two nonnegative matrix factorization (H2NMF) and self-dictionary multiple measurement vector (SDMMV). Experimental results show that the RKADA outperforms all the six methods in terms of spectral angle distance (SAD) and root-mean-square-error (RMSE). Moreover, the RKADA has short computational times in offline operations and shows significant improvement in identifying pure endmembers for ground objects with smaller spectrum differences. Therefore, the RKADA could be an alternative for pure endmember extraction from hyperspectral images.
Paleomagnetism of the Late Triassic Hound Island Volcanics: Revisited
Haeussler, Peter J.; Coe, Robert S.; Onstott, T.C.
1992-01-01
The collision and accretion of the Alexander terrane profoundly influenced the geologic history of Alaska and western Canada; however, the terrane's displacement history is only poorly constrained by sparse paleomagnetic studies. We studied the paleomagnetism of the Hound Island Volcanics in order to evaluate the location of the Alexander terrane in Late Triassic time. We collected 618 samples at 102 sites in and near the Keku Strait, Alaska, from the Late Triassic Hound Island Volcanics, the Permian Pybus Formation, and 23-Ma gabbroic intrusions. We found three components of magnetization in the Hound Island Volcanics. The high-temperature component (component A) resides in hematite and magnetite and was found only in highly oxidized lava flows in a geographically restricted area. We think it is primary, or acquired soon after eruption of the lavas, principally because the directions pass a fold test. The paleolatitude indicated by this component (19.2° ± 10.3°) is similar to those determined for various portions of Wrangellia, consistent with the geologic interpretation that the Alexander terrane was with the Wrangellia terrane in Late Triassic time. We found two overprint directions in the Hound Island Volcanics. Component B was acquired 23 m.y. ago due to intrusion of gabbroic dikes and sills. This interpretation is indicated by the similarity of upper-hemisphere directions in the Hound Island Volcanics to those in the gabbro. Component C, found in both the Hound Island Volcanics and the Permian Pybus Formation, is oriented northeast and down, fails a regional fold test, and was acquired after regional deformation around 90 to 100 Ma. This overprint direction yields a paleolatitude similar to, but slightly higher than, slightly older rocks from the Coast Plutonic Complex, suggesting that the Alexander terrane was displaced 17° in early Late Cretaceous time. The occurrence of these two separate overprinting events provides a satisfying explanation of the earlier puzzling results from the Hound Island Volcanics (Hillhouse and Grommé, 1980). Finally, great-circle analysis of the paleomagnetic data from the Pybus Formation suggests the Alexander terrane may have been in the northern hemisphere in Permian time.
Joint fMRI analysis and subject clustering using sparse dictionary learning
NASA Astrophysics Data System (ADS)
Kim, Seung-Jun; Dontaraju, Krishna K.
2017-08-01
Multi-subject fMRI data analysis methods based on sparse dictionary learning are proposed. In addition to identifying the component spatial maps by exploiting the sparsity of the maps, clusters of the subjects are learned by postulating that the fMRI volumes admit a subspace clustering structure. Furthermore, in order to tune the associated hyper-parameters systematically, a cross-validation strategy is developed based on entry-wise sampling of the fMRI dataset. Efficient algorithms for solving the proposed constrained dictionary learning formulations are developed. Numerical tests performed on synthetic fMRI data show promising results and provides insights into the proposed technique.
NASA Astrophysics Data System (ADS)
Hristian, L.; Ostafe, M. M.; Manea, L. R.; Apostol, L. L.
2017-06-01
The work pursued the distribution of combed wool fabrics destined to manufacturing of external articles of clothing in terms of the values of durability and physiological comfort indices, using the mathematical model of Principal Component Analysis (PCA). Principal Components Analysis (PCA) applied in this study is a descriptive method of the multivariate analysis/multi-dimensional data, and aims to reduce, under control, the number of variables (columns) of the matrix data as much as possible to two or three. Therefore, based on the information about each group/assortment of fabrics, it is desired that, instead of nine inter-correlated variables, to have only two or three new variables called components. The PCA target is to extract the smallest number of components which recover the most of the total information contained in the initial data.
Information extraction from multivariate images
NASA Technical Reports Server (NTRS)
Park, S. K.; Kegley, K. A.; Schiess, J. R.
1986-01-01
An overview of several multivariate image processing techniques is presented, with emphasis on techniques based upon the principal component transformation (PCT). Multiimages in various formats have a multivariate pixel value, associated with each pixel location, which has been scaled and quantized into a gray level vector, and the bivariate of the extent to which two images are correlated. The PCT of a multiimage decorrelates the multiimage to reduce its dimensionality and reveal its intercomponent dependencies if some off-diagonal elements are not small, and for the purposes of display the principal component images must be postprocessed into multiimage format. The principal component analysis of a multiimage is a statistical analysis based upon the PCT whose primary application is to determine the intrinsic component dimensionality of the multiimage. Computational considerations are also discussed.
Soleimani, Mohammad Ali; Yaghoobzadeh, Ameneh; Bahrami, Nasim; Sharif, Saeed Pahlevan; Sharif Nia, Hamid
2016-10-01
In this study, 398 Iranian cancer patients completed the 15-item Templer's Death Anxiety Scale (TDAS). Tests of internal consistency, principal components analysis, and confirmatory factor analysis were conducted to assess the internal consistency and factorial validity of the Persian TDAS. The construct reliability statistic and average variance extracted were also calculated to measure construct reliability, convergent validity, and discriminant validity. Principal components analysis indicated a 3-component solution, which was generally supported in the confirmatory analysis. However, acceptable cutoffs for construct reliability, convergent validity, and discriminant validity were not fulfilled for the three subscales that were derived from the principal component analysis. This study demonstrated both the advantages and potential limitations of using the TDAS with Persian-speaking cancer patients.
Principal Component Clustering Approach to Teaching Quality Discriminant Analysis
ERIC Educational Resources Information Center
Xian, Sidong; Xia, Haibo; Yin, Yubo; Zhai, Zhansheng; Shang, Yan
2016-01-01
Teaching quality is the lifeline of the higher education. Many universities have made some effective achievement about evaluating the teaching quality. In this paper, we establish the Students' evaluation of teaching (SET) discriminant analysis model and algorithm based on principal component clustering analysis. Additionally, we classify the SET…
Analysis of the principal component algorithm in phase-shifting interferometry.
Vargas, J; Quiroga, J Antonio; Belenguer, T
2011-06-15
We recently presented a new asynchronous demodulation method for phase-sampling interferometry. The method is based in the principal component analysis (PCA) technique. In the former work, the PCA method was derived heuristically. In this work, we present an in-depth analysis of the PCA demodulation method.
Psychometric Measurement Models and Artificial Neural Networks
ERIC Educational Resources Information Center
Sese, Albert; Palmer, Alfonso L.; Montano, Juan J.
2004-01-01
The study of measurement models in psychometrics by means of dimensionality reduction techniques such as Principal Components Analysis (PCA) is a very common practice. In recent times, an upsurge of interest in the study of artificial neural networks apt to computing a principal component extraction has been observed. Despite this interest, the…
Microelectrode arrays (MEAs) detect drug and chemical induced changes in neuronal network function and have been used for neurotoxicity screening. As a proof-•of-concept, the current study assessed the utility of analytical "fingerprinting" using Principal Components Analysis (P...
Incremental principal component pursuit for video background modeling
Rodriquez-Valderrama, Paul A.; Wohlberg, Brendt
2017-03-14
An incremental Principal Component Pursuit (PCP) algorithm for video background modeling that is able to process one frame at a time while adapting to changes in background, with a computational complexity that allows for real-time processing, having a low memory footprint and is robust to translational and rotational jitter.
NASA Astrophysics Data System (ADS)
Li, Zhengji; Teng, Qizhi; He, Xiaohai; Yue, Guihua; Wang, Zhengyong
2017-09-01
The parameter evaluation of reservoir rocks can help us to identify components and calculate the permeability and other parameters, and it plays an important role in the petroleum industry. Until now, computed tomography (CT) has remained an irreplaceable way to acquire the microstructure of reservoir rocks. During the evaluation and analysis, large samples and high-resolution images are required in order to obtain accurate results. Owing to the inherent limitations of CT, however, a large field of view results in low-resolution images, and high-resolution images entail a smaller field of view. Our method is a promising solution to these data collection limitations. In this study, a framework for sparse representation-based 3D volumetric super-resolution is proposed to enhance the resolution of 3D voxel images of reservoirs scanned with CT. A single reservoir structure and its downgraded model are divided into a large number of 3D cubes of voxel pairs and these cube pairs are used to calculate two overcomplete dictionaries and the sparse-representation coefficients in order to estimate the high frequency component. Future more, to better result, a new feature extract method with combine BM4D together with Laplacian filter are introduced. In addition, we conducted a visual evaluation of the method, and used the PSNR and FSIM to evaluate it qualitatively.
Dynamic competitive probabilistic principal components analysis.
López-Rubio, Ezequiel; Ortiz-DE-Lazcano-Lobato, Juan Miguel
2009-04-01
We present a new neural model which extends the classical competitive learning (CL) by performing a Probabilistic Principal Components Analysis (PPCA) at each neuron. The model also has the ability to learn the number of basis vectors required to represent the principal directions of each cluster, so it overcomes a drawback of most local PCA models, where the dimensionality of a cluster must be fixed a priori. Experimental results are presented to show the performance of the network with multispectral image data.
A principal components model of soundscape perception.
Axelsson, Östen; Nilsson, Mats E; Berglund, Birgitta
2010-11-01
There is a need for a model that identifies underlying dimensions of soundscape perception, and which may guide measurement and improvement of soundscape quality. With the purpose to develop such a model, a listening experiment was conducted. One hundred listeners measured 50 excerpts of binaural recordings of urban outdoor soundscapes on 116 attribute scales. The average attribute scale values were subjected to principal components analysis, resulting in three components: Pleasantness, eventfulness, and familiarity, explaining 50, 18 and 6% of the total variance, respectively. The principal-component scores were correlated with physical soundscape properties, including categories of dominant sounds and acoustic variables. Soundscape excerpts dominated by technological sounds were found to be unpleasant, whereas soundscape excerpts dominated by natural sounds were pleasant, and soundscape excerpts dominated by human sounds were eventful. These relationships remained after controlling for the overall soundscape loudness (Zwicker's N(10)), which shows that 'informational' properties are substantial contributors to the perception of soundscape. The proposed principal components model provides a framework for future soundscape research and practice. In particular, it suggests which basic dimensions are necessary to measure, how to measure them by a defined set of attribute scales, and how to promote high-quality soundscapes.
Lee, Young-Beom; Lee, Jeonghyeon; Tak, Sungho; Lee, Kangjoo; Na, Duk L; Seo, Sang Won; Jeong, Yong; Ye, Jong Chul
2016-01-15
Recent studies of functional connectivity MR imaging have revealed that the default-mode network activity is disrupted in diseases such as Alzheimer's disease (AD). However, there is not yet a consensus on the preferred method for resting-state analysis. Because the brain is reported to have complex interconnected networks according to graph theoretical analysis, the independency assumption, as in the popular independent component analysis (ICA) approach, often does not hold. Here, rather than using the independency assumption, we present a new statistical parameter mapping (SPM)-type analysis method based on a sparse graph model where temporal dynamics at each voxel position are described as a sparse combination of global brain dynamics. In particular, a new concept of a spatially adaptive design matrix has been proposed to represent local connectivity that shares the same temporal dynamics. If we further assume that local network structures within a group are similar, the estimation problem of global and local dynamics can be solved using sparse dictionary learning for the concatenated temporal data across subjects. Moreover, under the homoscedasticity variance assumption across subjects and groups that is often used in SPM analysis, the aforementioned individual and group analyses using sparse dictionary learning can be accurately modeled by a mixed-effect model, which also facilitates a standard SPM-type group-level inference using summary statistics. Using an extensive resting fMRI data set obtained from normal, mild cognitive impairment (MCI), and Alzheimer's disease patient groups, we demonstrated that the changes in the default mode network extracted by the proposed method are more closely correlated with the progression of Alzheimer's disease. Copyright © 2015 Elsevier Inc. All rights reserved.
Exarchakis, Georgios; Lücke, Jörg
2017-11-01
Sparse coding algorithms with continuous latent variables have been the subject of a large number of studies. However, discrete latent spaces for sparse coding have been largely ignored. In this work, we study sparse coding with latents described by discrete instead of continuous prior distributions. We consider the general case in which the latents (while being sparse) can take on any value of a finite set of possible values and in which we learn the prior probability of any value from data. This approach can be applied to any data generated by discrete causes, and it can be applied as an approximation of continuous causes. As the prior probabilities are learned, the approach then allows for estimating the prior shape without assuming specific functional forms. To efficiently train the parameters of our probabilistic generative model, we apply a truncated expectation-maximization approach (expectation truncation) that we modify to work with a general discrete prior. We evaluate the performance of the algorithm by applying it to a variety of tasks: (1) we use artificial data to verify that the algorithm can recover the generating parameters from a random initialization, (2) use image patches of natural images and discuss the role of the prior for the extraction of image components, (3) use extracellular recordings of neurons to present a novel method of analysis for spiking neurons that includes an intuitive discretization strategy, and (4) apply the algorithm on the task of encoding audio waveforms of human speech. The diverse set of numerical experiments presented in this letter suggests that discrete sparse coding algorithms can scale efficiently to work with realistic data sets and provide novel statistical quantities to describe the structure of the data.
Sparse modeling applied to patient identification for safety in medical physics applications
NASA Astrophysics Data System (ADS)
Lewkowitz, Stephanie
Every scheduled treatment at a radiation therapy clinic involves a series of safety protocol to ensure the utmost patient care. Despite safety protocol, on a rare occasion an entirely preventable medical event, an accident, may occur. Delivering a treatment plan to the wrong patient is preventable, yet still is a clinically documented error. This research describes a computational method to identify patients with a novel machine learning technique to combat misadministration. The patient identification program stores face and fingerprint data for each patient. New, unlabeled data from those patients are categorized according to the library. The categorization of data by this face-fingerprint detector is accomplished with new machine learning algorithms based on Sparse Modeling that have already begun transforming the foundation of Computer Vision. Previous patient recognition software required special subroutines for faces and different tailored subroutines for fingerprints. In this research, the same exact model is used for both fingerprints and faces, without any additional subroutines and even without adjusting the two hyperparameters. Sparse modeling is a powerful tool, already shown utility in the areas of super-resolution, denoising, inpainting, demosaicing, and sub-nyquist sampling, i.e. compressed sensing. Sparse Modeling is possible because natural images are inherently sparse in some bases, due to their inherent structure. This research chooses datasets of face and fingerprint images to test the patient identification model. The model stores the images of each dataset as a basis (library). One image at a time is removed from the library, and is classified by a sparse code in terms of the remaining library. The Locally Competitive Algorithm, a truly neural inspired Artificial Neural Network, solves the computationally difficult task of finding the sparse code for the test image. The components of the sparse representation vector are summed by ℓ1 pooling, and correct patient identification is consistently achieved 100% over 1000 trials, when either the face data or fingerprint data are implemented as a classification basis. The algorithm gets 100% classification when faces and fingerprints are concatenated into multimodal datasets. This suggests that 100% patient identification will be achievable in the clinal setting.
Das, Atanu; Mukhopadhyay, Chaitali
2007-10-28
We have performed molecular dynamics (MD) simulation of the thermal denaturation of one protein and one peptide-ubiquitin and melittin. To identify the correlation in dynamics among various secondary structural fragments and also the individual contribution of different residues towards thermal unfolding, principal component analysis method was applied in order to give a new insight to protein dynamics by analyzing the contribution of coefficients of principal components. The cross-correlation matrix obtained from MD simulation trajectory provided important information regarding the anisotropy of backbone dynamics that leads to unfolding. Unfolding of ubiquitin was found to be a three-state process, while that of melittin, though smaller and mostly helical, is more complicated.
NASA Astrophysics Data System (ADS)
Das, Atanu; Mukhopadhyay, Chaitali
2007-10-01
We have performed molecular dynamics (MD) simulation of the thermal denaturation of one protein and one peptide—ubiquitin and melittin. To identify the correlation in dynamics among various secondary structural fragments and also the individual contribution of different residues towards thermal unfolding, principal component analysis method was applied in order to give a new insight to protein dynamics by analyzing the contribution of coefficients of principal components. The cross-correlation matrix obtained from MD simulation trajectory provided important information regarding the anisotropy of backbone dynamics that leads to unfolding. Unfolding of ubiquitin was found to be a three-state process, while that of melittin, though smaller and mostly helical, is more complicated.
SAS program for quantitative stratigraphic correlation by principal components
Hohn, M.E.
1985-01-01
A SAS program is presented which constructs a composite section of stratigraphic events through principal components analysis. The variables in the analysis are stratigraphic sections and the observational units are range limits of taxa. The program standardizes data in each section, extracts eigenvectors, estimates missing range limits, and computes the composite section from scores of events on the first principal component. Provided is an option of several types of diagnostic plots; these help one to determine conservative range limits or unrealistic estimates of missing values. Inspection of the graphs and eigenvalues allow one to evaluate goodness of fit between the composite and measured data. The program is extended easily to the creation of a rank-order composite. ?? 1985.
NASA Astrophysics Data System (ADS)
Werth, Alexandra; Liakat, Sabbir; Dong, Anqi; Woods, Callie M.; Gmachl, Claire F.
2018-05-01
An integrating sphere is used to enhance the collection of backscattered light in a noninvasive glucose sensor based on quantum cascade laser spectroscopy. The sphere enhances signal stability by roughly an order of magnitude, allowing us to use a thermoelectrically (TE) cooled detector while maintaining comparable glucose prediction accuracy levels. Using a smaller TE-cooled detector reduces form factor, creating a mobile sensor. Principal component analysis has predicted principal components of spectra taken from human subjects that closely match the absorption peaks of glucose. These principal components are used as regressors in a linear regression algorithm to make glucose concentration predictions, over 75% of which are clinically accurate.
Principals' Perceptions of Collegial Support as a Component of Administrative Inservice.
ERIC Educational Resources Information Center
Daresh, John C.
To address the problem of increasing professional isolation of building administrators, the Principals' Inservice Project helps establish principals' collegial support groups across the nation. The groups are typically composed of 6 to 10 principals who meet at least once each month over a 2-year period. One collegial support group of seven…
Training the Trainers: Learning to Be a Principal Supervisor
ERIC Educational Resources Information Center
Saltzman, Amy
2017-01-01
While most principal supervisors are former principals themselves, few come to the role with specific training in how to do the job effectively. For this reason, both the Washington, D.C., and Tulsa, Oklahoma, principal supervisor programs include a strong professional development component. In this article, the author takes a look inside these…
ERIC Educational Resources Information Center
Rodrigue, Christine M.
2011-01-01
This paper presents a laboratory exercise used to teach principal components analysis (PCA) as a means of surface zonation. The lab was built around abundance data for 16 oxides and elements collected by the Mars Exploration Rover Spirit in Gusev Crater between Sol 14 and Sol 470. Students used PCA to reduce 15 of these into 3 components, which,…
ERIC Educational Resources Information Center
Ackermann, Margot Elise; Morrow, Jennifer Ann
2008-01-01
The present study describes the development and initial validation of the Coping with the College Environment Scale (CWCES). Participants included 433 college students who took an online survey. Principal Components Analysis (PCA) revealed six coping strategies: planning and self-management, seeking support from institutional resources, escaping…
NASA Astrophysics Data System (ADS)
Kistenev, Yu. V.; Shapovalov, A. V.; Borisov, A. V.; Vrazhnov, D. A.; Nikolaev, V. V.; Nikiforova, O. Yu.
2015-11-01
The comparison results of different mother wavelets used for de-noising of model and experimental data which were presented by profiles of absorption spectra of exhaled air are presented. The impact of wavelets de-noising on classification quality made by principal component analysis are also discussed.
Evaluation of skin melanoma in spectral range 450-950 nm using principal component analysis
NASA Astrophysics Data System (ADS)
Jakovels, D.; Lihacova, I.; Kuzmina, I.; Spigulis, J.
2013-06-01
Diagnostic potential of principal component analysis (PCA) of multi-spectral imaging data in the wavelength range 450- 950 nm for distant skin melanoma recognition is discussed. Processing of the measured clinical data by means of PCA resulted in clear separation between malignant melanomas and pigmented nevi.
ERIC Educational Resources Information Center
Linting, Marielle; Meulman, Jacqueline J.; Groenen, Patrick J. F.; van der Kooij, Anita J.
2007-01-01
Principal components analysis (PCA) is used to explore the structure of data sets containing linearly related numeric variables. Alternatively, nonlinear PCA can handle possibly nonlinearly related numeric as well as nonnumeric variables. For linear PCA, the stability of its solution can be established under the assumption of multivariate…
40 CFR 60.2998 - What are the principal components of the model rule?
Code of Federal Regulations, 2012 CFR
2012-07-01
... the model rule? 60.2998 Section 60.2998 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) AIR PROGRAMS (CONTINUED) STANDARDS OF PERFORMANCE FOR NEW STATIONARY SOURCES Emission Guidelines... December 9, 2004 Model Rule-Use of Model Rule § 60.2998 What are the principal components of the model rule...
40 CFR 60.2998 - What are the principal components of the model rule?
Code of Federal Regulations, 2014 CFR
2014-07-01
... the model rule? 60.2998 Section 60.2998 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) AIR PROGRAMS (CONTINUED) STANDARDS OF PERFORMANCE FOR NEW STATIONARY SOURCES Emission Guidelines... December 9, 2004 Model Rule-Use of Model Rule § 60.2998 What are the principal components of the model rule...
40 CFR 60.2998 - What are the principal components of the model rule?
Code of Federal Regulations, 2011 CFR
2011-07-01
... the model rule? 60.2998 Section 60.2998 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) AIR PROGRAMS (CONTINUED) STANDARDS OF PERFORMANCE FOR NEW STATIONARY SOURCES Emission Guidelines... December 9, 2004 Model Rule-Use of Model Rule § 60.2998 What are the principal components of the model rule...
40 CFR 60.1580 - What are the principal components of the model rule?
Code of Federal Regulations, 2010 CFR
2010-07-01
... the model rule? 60.1580 Section 60.1580 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) AIR PROGRAMS (CONTINUED) STANDARDS OF PERFORMANCE FOR NEW STATIONARY SOURCES Emission Guidelines..., 1999 Use of Model Rule § 60.1580 What are the principal components of the model rule? The model rule...
40 CFR 60.2998 - What are the principal components of the model rule?
Code of Federal Regulations, 2013 CFR
2013-07-01
... the model rule? 60.2998 Section 60.2998 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) AIR PROGRAMS (CONTINUED) STANDARDS OF PERFORMANCE FOR NEW STATIONARY SOURCES Emission Guidelines... December 9, 2004 Model Rule-Use of Model Rule § 60.2998 What are the principal components of the model rule...
Students' Perceptions of Teaching and Learning Practices: A Principal Component Approach
ERIC Educational Resources Information Center
Mukorera, Sophia; Nyatanga, Phocenah
2017-01-01
Students' attendance and engagement with teaching and learning practices is perceived as a critical element for academic performance. Even with stipulated attendance policies, students still choose not to engage. The study employed a principal component analysis to analyze first- and second-year students' perceptions of the importance of the 12…
ERIC Educational Resources Information Center
Hunley-Jenkins, Keisha Janine
2012-01-01
This qualitative study explores large, urban, mid-western principal perspectives about cyberbullying and the policy components and practices that they have found effective and ineffective at reducing its occurrence and/or negative effect on their schools' learning environments. More specifically, the researcher was interested in learning more…
Principal Component Analysis: Resources for an Essential Application of Linear Algebra
ERIC Educational Resources Information Center
Pankavich, Stephen; Swanson, Rebecca
2015-01-01
Principal Component Analysis (PCA) is a highly useful topic within an introductory Linear Algebra course, especially since it can be used to incorporate a number of applied projects. This method represents an essential application and extension of the Spectral Theorem and is commonly used within a variety of fields, including statistics,…
Learning Principal Component Analysis by Using Data from Air Quality Networks
ERIC Educational Resources Information Center
Perez-Arribas, Luis Vicente; Leon-González, María Eugenia; Rosales-Conrado, Noelia
2017-01-01
With the final objective of using computational and chemometrics tools in the chemistry studies, this paper shows the methodology and interpretation of the Principal Component Analysis (PCA) using pollution data from different cities. This paper describes how students can obtain data on air quality and process such data for additional information…
Applications of Nonlinear Principal Components Analysis to Behavioral Data.
ERIC Educational Resources Information Center
Hicks, Marilyn Maginley
1981-01-01
An empirical investigation of the statistical procedure entitled nonlinear principal components analysis was conducted on a known equation and on measurement data in order to demonstrate the procedure and examine its potential usefulness. This method was suggested by R. Gnanadesikan and based on an early paper of Karl Pearson. (Author/AL)
ERIC Educational Resources Information Center
Hendrix, Dean
2010-01-01
This study analyzed 2005-2006 Web of Science bibliometric data from institutions belonging to the Association of Research Libraries (ARL) and corresponding ARL statistics to find any associations between indicators from the two data sets. Principal components analysis on 36 variables from 103 universities revealed obvious associations between…
Stannard, David I.
1993-01-01
Eddy correlation measurements of sensible and latent heat flux are used with measurements of net radiation, soil heat flux, and other micrometeorological variables to develop the Penman-Monteith, Shuttleworth-Wallace, and modified Priestley-Taylor evapotranspiration models for use in a sparsely vegetated, semiarid rangeland. The Penman-Monteith model, a one-component model designed for use with dense crops, is not sufficiently accurate (r2 = 0.56 for hourly data and r2 = 0.60 for daily data). The Shuttleworth-Wallace model, a two-component logical extension of the Penman-Monteith model for use with sparse crops, performs significantly better (r2 = 0.78 for hourly data and r2 = 0.85 for daily data). The modified Priestley-Taylor model, a one-component simplified form of the Penman potential evapotranspiration model, surprisingly performs as well as the Shuttle worth-Wallace model. The rigorous Shuttleworth-Wallace model predicts that about one quarter of the vapor flux to the atmosphere is from bare-soil evaporation. Further, during daylight hours, the small leaves are sinks for sensible heat produced at the hot soil surface.
Beyond Low Rank + Sparse: Multi-scale Low Rank Matrix Decomposition
Ong, Frank; Lustig, Michael
2016-01-01
We present a natural generalization of the recent low rank + sparse matrix decomposition and consider the decomposition of matrices into components of multiple scales. Such decomposition is well motivated in practice as data matrices often exhibit local correlations in multiple scales. Concretely, we propose a multi-scale low rank modeling that represents a data matrix as a sum of block-wise low rank matrices with increasing scales of block sizes. We then consider the inverse problem of decomposing the data matrix into its multi-scale low rank components and approach the problem via a convex formulation. Theoretically, we show that under various incoherence conditions, the convex program recovers the multi-scale low rank components either exactly or approximately. Practically, we provide guidance on selecting the regularization parameters and incorporate cycle spinning to reduce blocking artifacts. Experimentally, we show that the multi-scale low rank decomposition provides a more intuitive decomposition than conventional low rank methods and demonstrate its effectiveness in four applications, including illumination normalization for face images, motion separation for surveillance videos, multi-scale modeling of the dynamic contrast enhanced magnetic resonance imaging and collaborative filtering exploiting age information. PMID:28450978
Principal component analysis for protein folding dynamics.
Maisuradze, Gia G; Liwo, Adam; Scheraga, Harold A
2009-01-09
Protein folding is considered here by studying the dynamics of the folding of the triple beta-strand WW domain from the Formin-binding protein 28. Starting from the unfolded state and ending either in the native or nonnative conformational states, trajectories are generated with the coarse-grained united residue (UNRES) force field. The effectiveness of principal components analysis (PCA), an already established mathematical technique for finding global, correlated motions in atomic simulations of proteins, is evaluated here for coarse-grained trajectories. The problems related to PCA and their solutions are discussed. The folding and nonfolding of proteins are examined with free-energy landscapes. Detailed analyses of many folding and nonfolding trajectories at different temperatures show that PCA is very efficient for characterizing the general folding and nonfolding features of proteins. It is shown that the first principal component captures and describes in detail the dynamics of a system. Anomalous diffusion in the folding/nonfolding dynamics is examined by the mean-square displacement (MSD) and the fractional diffusion and fractional kinetic equations. The collisionless (or ballistic) behavior of a polypeptide undergoing Brownian motion along the first few principal components is accounted for.
Principal Component 2-D Long Short-Term Memory for Font Recognition on Single Chinese Characters.
Tao, Dapeng; Lin, Xu; Jin, Lianwen; Li, Xuelong
2016-03-01
Chinese character font recognition (CCFR) has received increasing attention as the intelligent applications based on optical character recognition becomes popular. However, traditional CCFR systems do not handle noisy data effectively. By analyzing in detail the basic strokes of Chinese characters, we propose that font recognition on a single Chinese character is a sequence classification problem, which can be effectively solved by recurrent neural networks. For robust CCFR, we integrate a principal component convolution layer with the 2-D long short-term memory (2DLSTM) and develop principal component 2DLSTM (PC-2DLSTM) algorithm. PC-2DLSTM considers two aspects: 1) the principal component layer convolution operation helps remove the noise and get a rational and complete font information and 2) simultaneously, 2DLSTM deals with the long-range contextual processing along scan directions that can contribute to capture the contrast between character trajectory and background. Experiments using the frequently used CCFR dataset suggest the effectiveness of PC-2DLSTM compared with other state-of-the-art font recognition methods.
Dynamic of consumer groups and response of commodity markets by principal component analysis
NASA Astrophysics Data System (ADS)
Nobi, Ashadun; Alam, Shafiqul; Lee, Jae Woo
2017-09-01
This study investigates financial states and group dynamics by applying principal component analysis to the cross-correlation coefficients of the daily returns of commodity futures. The eigenvalues of the cross-correlation matrix in the 6-month timeframe displays similar values during 2010-2011, but decline following 2012. A sharp drop in eigenvalue implies the significant change of the market state. Three commodity sectors, energy, metals and agriculture, are projected into two dimensional spaces consisting of two principal components (PC). We observe that they form three distinct clusters in relation to various sectors. However, commodities with distinct features have intermingled with one another and scattered during severe crises, such as the European sovereign debt crises. We observe the notable change of the position of two dimensional spaces of groups during financial crises. By considering the first principal component (PC1) within the 6-month moving timeframe, we observe that commodities of the same group change states in a similar pattern, and the change of states of one group can be used as a warning for other group.
Yuan, Yuan-Yuan; Zhou, Yu-Bi; Sun, Jing; Deng, Juan; Bai, Ying; Wang, Jie; Lu, Xue-Feng
2017-06-01
The content of elements in fifteen different regions of Nitraria roborowskii samples were determined by inductively coupled plasma-atomic emission spectrometry(ICP-OES), and its elemental characteristics were analyzed by principal component analysis. The results indicated that 18 mineral elements were detected in N. roborowskii of which V cannot be detected. In addition, contents of Na, K and Ca showed high concentration. Ti showed maximum content variance, while K is minimum. Four principal components were gained from the original data. The cumulative variance contribution rate is 81.542% and the variance contribution of the first principal component was 44.997%, indicating that Cr, Fe, P and Ca were the characteristic elements of N. roborowskii.Thus, the established method was simple, precise and can be used for determination of mineral elements in N.roborowskii Kom. fruits. The elemental distribution characteristics among N.roborowskii fruits are related to geographical origins which were clearly revealed by PCA. All the results will provide good basis for comprehensive utilization of N.roborowskii. Copyright© by the Chinese Pharmaceutical Association.
Lü, Gui-Cai; Zhao, Wei-Hong; Wang, Jiang-Tao
2011-01-01
The identification techniques for 10 species of red tide algae often found in the coastal areas of China were developed by combining the three-dimensional fluorescence spectra of fluorescence dissolved organic matter (FDOM) from the cultured red tide algae with principal component analysis. Based on the results of principal component analysis, the first principal component loading spectrum of three-dimensional fluorescence spectrum was chosen as the identification characteristic spectrum for red tide algae, and the phytoplankton fluorescence characteristic spectrum band was established. Then the 10 algae species were tested using Bayesian discriminant analysis with a correct identification rate of more than 92% for Pyrrophyta on the level of species, and that of more than 75% for Bacillariophyta on the level of genus in which the correct identification rates were more than 90% for the phaeodactylum and chaetoceros. The results showed that the identification techniques for 10 species of red tide algae based on the three-dimensional fluorescence spectra of FDOM from the cultured red tide algae and principal component analysis could work well.
NASA Astrophysics Data System (ADS)
Ji, Yi; Sun, Shanlin; Xie, Hong-Bo
2017-06-01
Discrete wavelet transform (WT) followed by principal component analysis (PCA) has been a powerful approach for the analysis of biomedical signals. Wavelet coefficients at various scales and channels were usually transformed into a one-dimensional array, causing issues such as the curse of dimensionality dilemma and small sample size problem. In addition, lack of time-shift invariance of WT coefficients can be modeled as noise and degrades the classifier performance. In this study, we present a stationary wavelet-based two-directional two-dimensional principal component analysis (SW2D2PCA) method for the efficient and effective extraction of essential feature information from signals. Time-invariant multi-scale matrices are constructed in the first step. The two-directional two-dimensional principal component analysis then operates on the multi-scale matrices to reduce the dimension, rather than vectors in conventional PCA. Results are presented from an experiment to classify eight hand motions using 4-channel electromyographic (EMG) signals recorded in healthy subjects and amputees, which illustrates the efficiency and effectiveness of the proposed method for biomedical signal analysis.
Hyperspectral optical imaging of human iris in vivo: characteristics of reflectance spectra
NASA Astrophysics Data System (ADS)
Medina, José M.; Pereira, Luís M.; Correia, Hélder T.; Nascimento, Sérgio M. C.
2011-07-01
We report a hyperspectral imaging system to measure the reflectance spectra of real human irises with high spatial resolution. A set of ocular prosthesis was used as the control condition. Reflectance data were decorrelated by the principal-component analysis. The main conclusion is that spectral complexity of the human iris is considerable: between 9 and 11 principal components are necessary to account for 99% of the cumulative variance in human irises. Correcting image misalignments associated with spontaneous ocular movements did not influence this result. The data also suggests a correlation between the first principal component and different levels of melanin present in the irises. It was also found that although the spectral characteristics of the first five principal components were not affected by the radial and angular position of the selected iridal areas, they affect the higher-order ones, suggesting a possible influence of the iris texture. The results show that hyperspectral imaging in the iris, together with adequate spectroscopic analyses provide more information than conventional colorimetric methods, making hyperspectral imaging suitable for the characterization of melanin and the noninvasive diagnosis of ocular diseases and iris color.
Seeing wholes: The concept of systems thinking and its implementation in school leadership
NASA Astrophysics Data System (ADS)
Shaked, Haim; Schechter, Chen
2013-12-01
Systems thinking (ST) is an approach advocating thinking about any given issue as a whole, emphasising the interrelationships between its components rather than the components themselves. This article aims to link ST and school leadership, claiming that ST may enable school principals to develop highly performing schools that can cope successfully with current challenges, which are more complex than ever before in today's era of accountability and high expectations. The article presents the concept of ST - its definition, components, history and applications. Thereafter, its connection to education and its contribution to school management are described. The article concludes by discussing practical processes including screening for ST-skilled principal candidates and developing ST skills among prospective and currently performing school principals, pinpointing three opportunities for skills acquisition: during preparatory programmes; during their first years on the job, supported by veteran school principals as mentors; and throughout their entire career. Such opportunities may not only provide school principals with ST skills but also improve their functioning throughout the aforementioned stages of professional development.
A modified procedure for mixture-model clustering of regional geochemical data
Ellefsen, Karl J.; Smith, David B.; Horton, John D.
2014-01-01
A modified procedure is proposed for mixture-model clustering of regional-scale geochemical data. The key modification is the robust principal component transformation of the isometric log-ratio transforms of the element concentrations. This principal component transformation and the associated dimension reduction are applied before the data are clustered. The principal advantage of this modification is that it significantly improves the stability of the clustering. The principal disadvantage is that it requires subjective selection of the number of clusters and the number of principal components. To evaluate the efficacy of this modified procedure, it is applied to soil geochemical data that comprise 959 samples from the state of Colorado (USA) for which the concentrations of 44 elements are measured. The distributions of element concentrations that are derived from the mixture model and from the field samples are similar, indicating that the mixture model is a suitable representation of the transformed geochemical data. Each cluster and the associated distributions of the element concentrations are related to specific geologic and anthropogenic features. In this way, mixture model clustering facilitates interpretation of the regional geochemical data.
Hyman, Jeffrey De'Haven; Aldrich, Garrett Allen; Viswanathan, Hari S.; ...
2016-08-01
We characterize how different fracture size-transmissivity relationships influence flow and transport simulations through sparse three-dimensional discrete fracture networks. Although it is generally accepted that there is a positive correlation between a fracture's size and its transmissivity/aperture, the functional form of that relationship remains a matter of debate. Relationships that assume perfect correlation, semicorrelation, and noncorrelation between the two have been proposed. To study the impact that adopting one of these relationships has on transport properties, we generate multiple sparse fracture networks composed of circular fractures whose radii follow a truncated power law distribution. The distribution of transmissivities are selected somore » that the mean transmissivity of the fracture networks are the same and the distributions of aperture and transmissivity in models that include a stochastic term are also the same. We observe that adopting a correlation between a fracture size and its transmissivity leads to earlier breakthrough times and higher effective permeability when compared to networks where no correlation is used. While fracture network geometry plays the principal role in determining where transport occurs within the network, the relationship between size and transmissivity controls the flow speed. Lastly, these observations indicate DFN modelers should be aware that breakthrough times and effective permeabilities can be strongly influenced by such a relationship in addition to fracture and network statistics.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hyman, Jeffrey De'Haven; Aldrich, Garrett Allen; Viswanathan, Hari S.
We characterize how different fracture size-transmissivity relationships influence flow and transport simulations through sparse three-dimensional discrete fracture networks. Although it is generally accepted that there is a positive correlation between a fracture's size and its transmissivity/aperture, the functional form of that relationship remains a matter of debate. Relationships that assume perfect correlation, semicorrelation, and noncorrelation between the two have been proposed. To study the impact that adopting one of these relationships has on transport properties, we generate multiple sparse fracture networks composed of circular fractures whose radii follow a truncated power law distribution. The distribution of transmissivities are selected somore » that the mean transmissivity of the fracture networks are the same and the distributions of aperture and transmissivity in models that include a stochastic term are also the same. We observe that adopting a correlation between a fracture size and its transmissivity leads to earlier breakthrough times and higher effective permeability when compared to networks where no correlation is used. While fracture network geometry plays the principal role in determining where transport occurs within the network, the relationship between size and transmissivity controls the flow speed. Lastly, these observations indicate DFN modelers should be aware that breakthrough times and effective permeabilities can be strongly influenced by such a relationship in addition to fracture and network statistics.« less
Vision-based method for detecting driver drowsiness and distraction in driver monitoring system
NASA Astrophysics Data System (ADS)
Jo, Jaeik; Lee, Sung Joo; Jung, Ho Gi; Park, Kang Ryoung; Kim, Jaihie
2011-12-01
Most driver-monitoring systems have attempted to detect either driver drowsiness or distraction, although both factors should be considered for accident prevention. Therefore, we propose a new driver-monitoring method considering both factors. We make the following contributions. First, if the driver is looking ahead, drowsiness detection is performed; otherwise, distraction detection is performed. Thus, the computational cost and eye-detection error can be reduced. Second, we propose a new eye-detection algorithm that combines adaptive boosting, adaptive template matching, and blob detection with eye validation, thereby reducing the eye-detection error and processing time significantly, which is hardly achievable using a single method. Third, to enhance eye-detection accuracy, eye validation is applied after initial eye detection, using a support vector machine based on appearance features obtained by principal component analysis (PCA) and linear discriminant analysis (LDA). Fourth, we propose a novel eye state-detection algorithm that combines appearance features obtained using PCA and LDA, with statistical features such as the sparseness and kurtosis of the histogram from the horizontal edge image of the eye. Experimental results showed that the detection accuracies of the eye region and eye states were 99 and 97%, respectively. Both driver drowsiness and distraction were detected with a success rate of 98%.
Tang, Yunwei; Jing, Linhai; Li, Hui; Liu, Qingjie; Yan, Qi; Li, Xiuxia
2016-11-22
This study explores the ability of WorldView-2 (WV-2) imagery for bamboo mapping in a mountainous region in Sichuan Province, China. A large area of this place is covered by shadows in the image, and only a few sampled points derived were useful. In order to identify bamboos based on sparse training data, the sample size was expanded according to the reflectance of multispectral bands selected using the principal component analysis (PCA). Then, class separability based on the training data was calculated using a feature space optimization method to select the features for classification. Four regular object-based classification methods were applied based on both sets of training data. The results show that the k -nearest neighbor ( k -NN) method produced the greatest accuracy. A geostatistically-weighted k -NN classifier, accounting for the spatial correlation between classes, was then applied to further increase the accuracy. It achieved 82.65% and 93.10% of the producer's and user's accuracies respectively for the bamboo class. The canopy densities were estimated to explain the result. This study demonstrates that the WV-2 image can be used to identify small patches of understory bamboos given limited known samples, and the resulting bamboo distribution facilitates the assessments of the habitats of giant pandas.
NASA Astrophysics Data System (ADS)
Wan, Xiaoqing; Zhao, Chunhui; Gao, Bing
2017-11-01
The integration of an edge-preserving filtering technique in the classification of a hyperspectral image (HSI) has been proven effective in enhancing classification performance. This paper proposes an ensemble strategy for HSI classification using an edge-preserving filter along with a deep learning model and edge detection. First, an adaptive guided filter is applied to the original HSI to reduce the noise in degraded images and to extract powerful spectral-spatial features. Second, the extracted features are fed as input to a stacked sparse autoencoder to adaptively exploit more invariant and deep feature representations; then, a random forest classifier is applied to fine-tune the entire pretrained network and determine the classification output. Third, a Prewitt compass operator is further performed on the HSI to extract the edges of the first principal component after dimension reduction. Moreover, the regional growth rule is applied to the resulting edge logical image to determine the local region for each unlabeled pixel. Finally, the categories of the corresponding neighborhood samples are determined in the original classification map; then, the major voting mechanism is implemented to generate the final output. Extensive experiments proved that the proposed method achieves competitive performance compared with several traditional approaches.
Surtsey and Mount St. Helens: a comparison of early succession rates
NASA Astrophysics Data System (ADS)
del Moral, R.; Magnússon, B.
2014-04-01
Surtsey and Mount St. Helens are celebrated but very different volcanoes. Permanent plots allow for comparisons that reveal mechanisms that control succession and its rate and suggest general principles. We estimated rates from structure development, species composition using detrended correspondence analysis (DCA), changes in Euclidean distance (ED) of DCA vectors, and by principal components analysis (PCA) of DCA. On Surtsey, rates determined from DCA trajectory analyses decreased as follows: gull colony on lava with sand > gull colony on lava, no sand ≫ lava with sand > sand spit > block lava > tephra. On Mount St. Helens, plots on lahar deposits near woodlands were best developed. The succession rates of open meadows declined as follows: Lupinus-dominated pumice > protected ridge with Lupinus > other pumice and blasted sites > isolated lahar meadows > barren plain. Despite the prominent contrasts between the volcanoes, we found several common themes. Isolation restricted the number of colonists on Surtsey and to a lesser degree on Mount St. Helens. Nutrient input from outside the system was crucial. On Surtsey, seabirds fashioned very fertile substrates, while on Mount St. Helens wind brought a sparse nutrient rain, then Lupinus enhanced fertility to promote succession. Environmental stress limits succession in both cases. On Surtsey, bare lava, compacted tephra and infertile sands restrict development. On Mount St. Helens, exposure to wind and infertility slow succession.
Surtsey and Mount St. Helens: a comparison of early succession rates
NASA Astrophysics Data System (ADS)
del Moral, R.; Magnússon, B.
2013-12-01
Surtsey and Mount St. Helens are celebrated, but very different volcanoes. Permanent plots allow comparisons that reveal mechanisms that control succession and its rate and suggest general principles. We estimated rates from structure development, species composition using detrended correspondence analysis (DCA), changes in Euclidean distance (ED) of DCA vectors and by principal components analysis (PCA) of DCA. On Surtsey, rates determined from DCA trajectory analyses decreased as follows: gull colony on lava with sand > gull colony on lava, no sand ≫ lava with sand > sand spit > block lava > tephra. On Mount St. Helens, plots on lahar deposits near woodlands were best developed. The succession rates of open meadows declined as follows: Lupinus-dominated pumice > protected ridge with Lupinus > other pumice and blasted sites > isolated lahar meadows > barren plain. Despite the prominent contrasts between the volcanoes, common themes were revealed. Isolation restricted the number of colonists on Surtsey and to a lesser degree on Mount St. Helens. Nutrient input from outside the system was crucial. On Surtsey, seabirds fashioned very fertile substrates, while on Mount St. Helens wind brought a sparse nutrient rain, then Lupinus enhanced fertility to promote succession. Environmental stress limits succession in both cases. On Surtsey, bare lava, compacted tephra and infertile sands restrict development. On Mount St. Helens, exposure to wind and infertility slow succession.
Euclidean chemical spaces from molecular fingerprints: Hamming distance and Hempel's ravens.
Martin, Eric; Cao, Eddie
2015-05-01
Molecules are often characterized by sparse binary fingerprints, where 1s represent the presence of substructures and 0s represent their absence. Fingerprints are especially useful for similarity calculations, such as database searching or clustering, generally measuring similarity as the Tanimoto coefficient. In other cases, such as visualization, design of experiments, or latent variable regression, a low-dimensional Euclidian "chemical space" is more useful, where proximity between points reflects chemical similarity. A temptation is to apply principal components analysis (PCA) directly to these fingerprints to obtain a low dimensional continuous chemical space. However, Gower has shown that distances from PCA on bit vectors are proportional to the square root of Hamming distance. Unlike Tanimoto similarity, Hamming similarity (HS) gives equal weight to shared 0s as to shared 1s, that is, HS gives as much weight to substructures that neither molecule contains, as to substructures which both molecules contain. Illustrative examples show that proximity in the corresponding chemical space reflects mainly similar size and complexity rather than shared chemical substructures. These spaces are ill-suited for visualizing and optimizing coverage of chemical space, or as latent variables for regression. A more suitable alternative is shown to be Multi-dimensional scaling on the Tanimoto distance matrix, which produces a space where proximity does reflect structural similarity.
Aggregate Measures of Watershed Health from Reconstructed ...
Risk-based indices such as reliability, resilience, and vulnerability (R-R-V), have the potential to serve as watershed health assessment tools. Recent research has demonstrated the applicability of such indices for water quality (WQ) constituents such as total suspended solids and nutrients on an individual basis. However, the calculations can become tedious when time-series data for several WQ constituents have to be evaluated individually. Also, comparisons between locations with different sets of constituent data can prove difficult. In this study, data reconstruction using relevance vector machine algorithm was combined with dimensionality reduction via variational Bayesian noisy principal component analysis to reconstruct and condense sparse multidimensional WQ data sets into a single time series. The methodology allows incorporation of uncertainty in both the reconstruction and dimensionality-reduction steps. The R-R-V values were calculated using the aggregate time series at multiple locations within two Indiana watersheds. Results showed that uncertainty present in the reconstructed WQ data set propagates to the aggregate time series and subsequently to the aggregate R-R-V values as well. serving as motivating examples. Locations with different WQ constituents and different standards for impairment were successfully combined to provide aggregate measures of R-R-V values. Comparisons with individual constituent R-R-V values showed that v
iSAP: Interactive Sparse Astronomical Data Analysis Packages
NASA Astrophysics Data System (ADS)
Fourt, O.; Starck, J.-L.; Sureau, F.; Bobin, J.; Moudden, Y.; Abrial, P.; Schmitt, J.
2013-03-01
iSAP consists of three programs, written in IDL, which together are useful for spherical data analysis. MR/S (MultiResolution on the Sphere) contains routines for wavelet, ridgelet and curvelet transform on the sphere, and applications such denoising on the sphere using wavelets and/or curvelets, Gaussianity tests and Independent Component Analysis on the Sphere. MR/S has been designed for the PLANCK project, but can be used for many other applications. SparsePol (Polarized Spherical Wavelets and Curvelets) has routines for polarized wavelet, polarized ridgelet and polarized curvelet transform on the sphere, and applications such denoising on the sphere using wavelets and/or curvelets, Gaussianity tests and blind source separation on the Sphere. SparsePol has been designed for the PLANCK project. MS-VSTS (Multi-Scale Variance Stabilizing Transform on the Sphere), designed initially for the FERMI project, is useful for spherical mono-channel and multi-channel data analysis when the data are contaminated by a Poisson noise. It contains routines for wavelet/curvelet denoising, wavelet deconvolution, multichannel wavelet denoising and deconvolution.
Temporal evolution of financial-market correlations.
Fenn, Daniel J; Porter, Mason A; Williams, Stacy; McDonald, Mark; Johnson, Neil F; Jones, Nick S
2011-08-01
We investigate financial market correlations using random matrix theory and principal component analysis. We use random matrix theory to demonstrate that correlation matrices of asset price changes contain structure that is incompatible with uncorrelated random price changes. We then identify the principal components of these correlation matrices and demonstrate that a small number of components accounts for a large proportion of the variability of the markets that we consider. We characterize the time-evolving relationships between the different assets by investigating the correlations between the asset price time series and principal components. Using this approach, we uncover notable changes that occurred in financial markets and identify the assets that were significantly affected by these changes. We show in particular that there was an increase in the strength of the relationships between several different markets following the 2007-2008 credit and liquidity crisis.
Temporal evolution of financial-market correlations
NASA Astrophysics Data System (ADS)
Fenn, Daniel J.; Porter, Mason A.; Williams, Stacy; McDonald, Mark; Johnson, Neil F.; Jones, Nick S.
2011-08-01
We investigate financial market correlations using random matrix theory and principal component analysis. We use random matrix theory to demonstrate that correlation matrices of asset price changes contain structure that is incompatible with uncorrelated random price changes. We then identify the principal components of these correlation matrices and demonstrate that a small number of components accounts for a large proportion of the variability of the markets that we consider. We characterize the time-evolving relationships between the different assets by investigating the correlations between the asset price time series and principal components. Using this approach, we uncover notable changes that occurred in financial markets and identify the assets that were significantly affected by these changes. We show in particular that there was an increase in the strength of the relationships between several different markets following the 2007-2008 credit and liquidity crisis.
Non-linear principal component analysis applied to Lorenz models and to North Atlantic SLP
NASA Astrophysics Data System (ADS)
Russo, A.; Trigo, R. M.
2003-04-01
A non-linear generalisation of Principal Component Analysis (PCA), denoted Non-Linear Principal Component Analysis (NLPCA), is introduced and applied to the analysis of three data sets. Non-Linear Principal Component Analysis allows for the detection and characterisation of low-dimensional non-linear structure in multivariate data sets. This method is implemented using a 5-layer feed-forward neural network introduced originally in the chemical engineering literature (Kramer, 1991). The method is described and details of its implementation are addressed. Non-Linear Principal Component Analysis is first applied to a data set sampled from the Lorenz attractor (1963). It is found that the NLPCA approximations are more representative of the data than are the corresponding PCA approximations. The same methodology was applied to the less known Lorenz attractor (1984). However, the results obtained weren't as good as those attained with the famous 'Butterfly' attractor. Further work with this model is underway in order to assess if NLPCA techniques can be more representative of the data characteristics than are the corresponding PCA approximations. The application of NLPCA to relatively 'simple' dynamical systems, such as those proposed by Lorenz, is well understood. However, the application of NLPCA to a large climatic data set is much more challenging. Here, we have applied NLPCA to the sea level pressure (SLP) field for the entire North Atlantic area and the results show a slight imcrement of explained variance associated. Finally, directions for future work are presented.%}
Xiao, Keke; Chen, Yun; Jiang, Xie; Zhou, Yan
2017-03-01
An investigation was conducted for 20 different types of sludge in order to identify the key organic compounds in extracellular polymeric substances (EPS) that are important in assessing variations of sludge filterability. The different types of sludge varied in initial total solids (TS) content, organic composition and pre-treatment methods. For instance, some of the sludges were pre-treated by acid, ultrasonic, thermal, alkaline, or advanced oxidation technique. The Pearson's correlation results showed significant correlations between sludge filterability and zeta potential, pH, dissolved organic carbon, protein and polysaccharide in soluble EPS (SB EPS), loosely bound EPS (LB EPS) and tightly bound EPS (TB EPS). The principal component analysis (PCA) method was used to further explore correlations between variables and similarities among EPS fractions of different types of sludge. Two principal components were extracted: principal component 1 accounted for 59.24% of total EPS variations, while principal component 2 accounted for 25.46% of total EPS variations. Dissolved organic carbon, protein and polysaccharide in LB EPS showed higher eigenvector projection values than the corresponding compounds in SB EPS and TB EPS in principal component 1. Further characterization of fractionized key organic compounds in LB EPS was conducted with size-exclusion chromatography-organic carbon detection-organic nitrogen detection (LC-OCD-OND). A numerical multiple linear regression model was established to describe relationship between organic compounds in LB EPS and sludge filterability. Copyright © 2016 Elsevier Ltd. All rights reserved.
QSAR modeling of flotation collectors using principal components extracted from topological indices.
Natarajan, R; Nirdosh, Inderjit; Basak, Subhash C; Mills, Denise R
2002-01-01
Several topological indices were calculated for substituted-cupferrons that were tested as collectors for the froth flotation of uranium. The principal component analysis (PCA) was used for data reduction. Seven principal components (PC) were found to account for 98.6% of the variance among the computed indices. The principal components thus extracted were used in stepwise regression analyses to construct regression models for the prediction of separation efficiencies (Es) of the collectors. A two-parameter model with a correlation coefficient of 0.889 and a three-parameter model with a correlation coefficient of 0.913 were formed. PCs were found to be better than partition coefficient to form regression equations, and inclusion of an electronic parameter such as Hammett sigma or quantum mechanically derived electronic charges on the chelating atoms did not improve the correlation coefficient significantly. The method was extended to model the separation efficiencies of mercaptobenzothiazoles (MBT) and aminothiophenols (ATP) used in the flotation of lead and zinc ores, respectively. Five principal components were found to explain 99% of the data variability in each series. A three-parameter equation with correlation coefficient of 0.985 and a two-parameter equation with correlation coefficient of 0.926 were obtained for MBT and ATP, respectively. The amenability of separation efficiencies of chelating collectors to QSAR modeling using PCs based on topological indices might lead to the selection of collectors for synthesis and testing from a virtual database.
NASA Technical Reports Server (NTRS)
Interrante, Victoria
1997-01-01
The three-dimensional shape and relative depth of a smoothly curving layered transparent surface may be communicated particularly effectively when the surface is artistically enhanced with sparsely distributed opaque detail. This paper describes how the set of principal directions and principal curvatures specified by local geometric operators can be understood to define a natural 'flow' over the surface of an object, and can be used to guide the placement of the lines of a stroke texture that seeks to represent 3D shape information in a perceptually intuitive way. The driving application for this work is the visualization of layered isovalue surfaces in volume data, where the particular identity of an individual surface is not generally known a priori and observers will typically wish to view a variety of different level surfaces from the same distribution, superimposed over underlying opaque structures. By advecting an evenly distributed set of tiny opaque particles, and the empty space between them, via 3D line integral convolution through the vector field defined by the principal directions and principal curvatures of the level surfaces passing through each gridpoint of a 3D volume, it is possible to generate a single scan-converted solid stroke texture that may intuitively represent the essential shape information of any level surface in the volume. To generate longer strokes over more highly curved areas, where the directional information is both most stable and most relevant, and to simultaneously downplay the visual impact of directional information in the flatter regions, one may dynamically redefine the length of the filter kernel according to the magnitude of the maximum principal curvature of the level surface at the point around which it is applied.
Akbari, Hamed; Macyszyn, Luke; Da, Xiao; Wolf, Ronald L.; Bilello, Michel; Verma, Ragini; O’Rourke, Donald M.
2014-01-01
Purpose To augment the analysis of dynamic susceptibility contrast material–enhanced magnetic resonance (MR) images to uncover unique tissue characteristics that could potentially facilitate treatment planning through a better understanding of the peritumoral region in patients with glioblastoma. Materials and Methods Institutional review board approval was obtained for this study, with waiver of informed consent for retrospective review of medical records. Dynamic susceptibility contrast-enhanced MR imaging data were obtained for 79 patients, and principal component analysis was applied to the perfusion signal intensity. The first six principal components were sufficient to characterize more than 99% of variance in the temporal dynamics of blood perfusion in all regions of interest. The principal components were subsequently used in conjunction with a support vector machine classifier to create a map of heterogeneity within the peritumoral region, and the variance of this map served as the heterogeneity score. Results The calculated principal components allowed near-perfect separability of tissue that was likely highly infiltrated with tumor and tissue that was unlikely infiltrated with tumor. The heterogeneity map created by using the principal components showed a clear relationship between voxels judged by the support vector machine to be highly infiltrated and subsequent recurrence. The results demonstrated a significant correlation (r = 0.46, P < .0001) between the heterogeneity score and patient survival. The hazard ratio was 2.23 (95% confidence interval: 1.4, 3.6; P < .01) between patients with high and low heterogeneity scores on the basis of the median heterogeneity score. Conclusion Analysis of dynamic susceptibility contrast-enhanced MR imaging data by using principal component analysis can help identify imaging variables that can be subsequently used to evaluate the peritumoral region in glioblastoma. These variables are potentially indicative of tumor infiltration and may become useful tools in guiding therapy, as well as individualized prognostication. © RSNA, 2014 PMID:24955928
Grimbergen, M C M; van Swol, C F P; Kendall, C; Verdaasdonk, R M; Stone, N; Bosch, J L H R
2010-01-01
The overall quality of Raman spectra in the near-infrared region, where biological samples are often studied, has benefited from various improvements to optical instrumentation over the past decade. However, obtaining ample spectral quality for analysis is still challenging due to device requirements and short integration times required for (in vivo) clinical applications of Raman spectroscopy. Multivariate analytical methods, such as principal component analysis (PCA) and linear discriminant analysis (LDA), are routinely applied to Raman spectral datasets to develop classification models. Data compression is necessary prior to discriminant analysis to prevent or decrease the degree of over-fitting. The logical threshold for the selection of principal components (PCs) to be used in discriminant analysis is likely to be at a point before the PCs begin to introduce equivalent signal and noise and, hence, include no additional value. Assessment of the signal-to-noise ratio (SNR) at a certain peak or over a specific spectral region will depend on the sample measured. Therefore, the mean SNR over the whole spectral region (SNR(msr)) is determined in the original spectrum as well as for spectra reconstructed from an increasing number of principal components. This paper introduces a method of assessing the influence of signal and noise from individual PC loads and indicates a method of selection of PCs for LDA. To evaluate this method, two data sets with different SNRs were used. The sets were obtained with the same Raman system and the same measurement parameters on bladder tissue collected during white light cystoscopy (set A) and fluorescence-guided cystoscopy (set B). This method shows that the mean SNR over the spectral range in the original Raman spectra of these two data sets is related to the signal and noise contribution of principal component loads. The difference in mean SNR over the spectral range can also be appreciated since fewer principal components can reliably be used in the low SNR data set (set B) compared to the high SNR data set (set A). Despite the fact that no definitive threshold could be found, this method may help to determine the cutoff for the number of principal components used in discriminant analysis. Future analysis of a selection of spectral databases using this technique will allow optimum thresholds to be selected for different applications and spectral data quality levels.
Principal component reconstruction (PCR) for cine CBCT with motion learning from 2D fluoroscopy.
Gao, Hao; Zhang, Yawei; Ren, Lei; Yin, Fang-Fang
2018-01-01
This work aims to generate cine CT images (i.e., 4D images with high-temporal resolution) based on a novel principal component reconstruction (PCR) technique with motion learning from 2D fluoroscopic training images. In the proposed PCR method, the matrix factorization is utilized as an explicit low-rank regularization of 4D images that are represented as a product of spatial principal components and temporal motion coefficients. The key hypothesis of PCR is that temporal coefficients from 4D images can be reasonably approximated by temporal coefficients learned from 2D fluoroscopic training projections. For this purpose, we can acquire fluoroscopic training projections for a few breathing periods at fixed gantry angles that are free from geometric distortion due to gantry rotation, that is, fluoroscopy-based motion learning. Such training projections can provide an effective characterization of the breathing motion. The temporal coefficients can be extracted from these training projections and used as priors for PCR, even though principal components from training projections are certainly not the same for these 4D images to be reconstructed. For this purpose, training data are synchronized with reconstruction data using identical real-time breathing position intervals for projection binning. In terms of image reconstruction, with a priori temporal coefficients, the data fidelity for PCR changes from nonlinear to linear, and consequently, the PCR method is robust and can be solved efficiently. PCR is formulated as a convex optimization problem with the sum of linear data fidelity with respect to spatial principal components and spatiotemporal total variation regularization imposed on 4D image phases. The solution algorithm of PCR is developed based on alternating direction method of multipliers. The implementation is fully parallelized on GPU with NVIDIA CUDA toolbox and each reconstruction takes about a few minutes. The proposed PCR method is validated and compared with a state-of-art method, that is, PICCS, using both simulation and experimental data with the on-board cone-beam CT setting. The results demonstrated the feasibility of PCR for cine CBCT and significantly improved reconstruction quality of PCR from PICCS for cine CBCT. With a priori estimated temporal motion coefficients using fluoroscopic training projections, the PCR method can accurately reconstruct spatial principal components, and then generate cine CT images as a product of temporal motion coefficients and spatial principal components. © 2017 American Association of Physicists in Medicine.
ERIC Educational Resources Information Center
Lin, Mind-Dih
2012-01-01
Improving principal leadership is a vital component to the success of educational reform initiatives that seek to improve whole-school performance, as principal leadership often exercises positive but indirect effects on student learning. Because of the importance of principals within the field of school improvement, this article focuses on…
ERIC Educational Resources Information Center
Herrmann, Mariesa; Ross, Christine
2016-01-01
States and districts across the country are implementing new principal evaluation systems that include measures of the quality of principals' school leadership practices and measures of student achievement growth. Because these evaluation systems will be used for high-stakes decisions, it is important that the component measures of the evaluation…
ERIC Educational Resources Information Center
Hvidston, David J.; Range, Bret G.; McKim, Courtney Ann; Mette, Ian M.
2015-01-01
This study examined the perspectives of novice and late career principals concerning instructional and organizational leadership within their performance evaluations. An online survey was sent to 251 principals with a return rate of 49%. Instructional leadership components of the evaluation that were most important to all principals were:…
Men's Alcohol Expectancies at Selected Community Colleges
ERIC Educational Resources Information Center
Derby, Dustin C.
2011-01-01
Men's alcohol expectancies are an important cognitive-behavioral component of their consumption; yet, sparse research details such behaviors for men in two-year colleges. Selected for inclusion with the current study were 563 men from seven Illinois community colleges. Logistic regression analysis indicated four significant, positive relationships…
ERIC Educational Resources Information Center
Chou, Yeh-Tai; Wang, Wen-Chung
2010-01-01
Dimensionality is an important assumption in item response theory (IRT). Principal component analysis on standardized residuals has been used to check dimensionality, especially under the family of Rasch models. It has been suggested that an eigenvalue greater than 1.5 for the first eigenvalue signifies a violation of unidimensionality when there…
ERIC Educational Resources Information Center
Brusco, Michael J.; Singh, Renu; Steinley, Douglas
2009-01-01
The selection of a subset of variables from a pool of candidates is an important problem in several areas of multivariate statistics. Within the context of principal component analysis (PCA), a number of authors have argued that subset selection is crucial for identifying those variables that are required for correct interpretation of the…
Relaxation mode analysis of a peptide system: comparison with principal component analysis.
Mitsutake, Ayori; Iijima, Hiromitsu; Takano, Hiroshi
2011-10-28
This article reports the first attempt to apply the relaxation mode analysis method to a simulation of a biomolecular system. In biomolecular systems, the principal component analysis is a well-known method for analyzing the static properties of fluctuations of structures obtained by a simulation and classifying the structures into some groups. On the other hand, the relaxation mode analysis has been used to analyze the dynamic properties of homopolymer systems. In this article, a long Monte Carlo simulation of Met-enkephalin in gas phase has been performed. The results are analyzed by the principal component analysis and relaxation mode analysis methods. We compare the results of both methods and show the effectiveness of the relaxation mode analysis.
NASA Technical Reports Server (NTRS)
Murray, C. W., Jr.; Mueller, J. L.; Zwally, H. J.
1984-01-01
A field of measured anomalies of some physical variable relative to their time averages, is partitioned in either the space domain or the time domain. Eigenvectors and corresponding principal components of the smaller dimensioned covariance matrices associated with the partitioned data sets are calculated independently, then joined to approximate the eigenstructure of the larger covariance matrix associated with the unpartitioned data set. The accuracy of the approximation (fraction of the total variance in the field) and the magnitudes of the largest eigenvalues from the partitioned covariance matrices together determine the number of local EOF's and principal components to be joined by any particular level. The space-time distribution of Nimbus-5 ESMR sea ice measurement is analyzed.
Fast principal component analysis for stacking seismic data
NASA Astrophysics Data System (ADS)
Wu, Juan; Bai, Min
2018-04-01
Stacking seismic data plays an indispensable role in many steps of the seismic data processing and imaging workflow. Optimal stacking of seismic data can help mitigate seismic noise and enhance the principal components to a great extent. Traditional average-based seismic stacking methods cannot obtain optimal performance when the ambient noise is extremely strong. We propose a principal component analysis (PCA) algorithm for stacking seismic data without being sensitive to noise level. Considering the computational bottleneck of the classic PCA algorithm in processing massive seismic data, we propose an efficient PCA algorithm to make the proposed method readily applicable for industrial applications. Two numerically designed examples and one real seismic data are used to demonstrate the performance of the presented method.
Wongchai, C; Chaidee, A; Pfeiffer, W
2012-01-01
Global warming increases plant salt stress via evaporation after irrigation, but how plant cells sense salt stress remains unknown. Here, we searched for correlation-based targets of salt stress sensing in Chenopodium rubrum cell suspension cultures. We proposed a linkage between the sensing of salt stress and the sensing of distinct metabolites. Consequently, we analysed various extracellular pH signals in autotroph and heterotroph cell suspensions. Our search included signals after 52 treatments: salt and osmotic stress, ion channel inhibitors (amiloride, quinidine), salt-sensing modulators (proline), amino acids, carboxylic acids and regulators (salicylic acid, 2,4-dichlorphenoxyacetic acid). Multivariate analyses revealed hirarchical clusters of signals and five principal components of extracellular proton flux. The principal component correlated with salt stress was an antagonism of γ-aminobutyric and salicylic acid, confirming involvement of acid-sensing ion channels (ASICs) in salt stress sensing. Proline, short non-substituted mono-carboxylic acids (C2-C6), lactic acid and amiloride characterised the four uncorrelated principal components of proton flux. The proline-associated principal component included an antagonism of 2,4-dichlorphenoxyacetic acid and a set of amino acids (hydrophobic, polar, acidic, basic). The five principal components captured 100% of variance of extracellular proton flux. Thus, a bias-free, functional high-throughput screening was established to extract new clusters of response elements and potential signalling pathways, and to serve as a core for quantitative meta-analysis in plant biology. The eigenvectors reorient research, associating proline with development instead of salt stress, and the proof of existence of multiple components of proton flux can help to resolve controversy about the acid growth theory. © 2011 German Botanical Society and The Royal Botanical Society of the Netherlands.
Surzhikov, V D; Surzhikov, D V
2014-01-01
The search and measurement of causal relationships between exposure to air pollution and health state of the population is based on the system analysis and risk assessment to improve the quality of research. With this purpose there is applied the modern statistical analysis with the use of criteria of independence, principal component analysis and discriminate function analysis. As a result of analysis out of all atmospheric pollutants there were separated four main components: for diseases of the circulatory system main principal component is implied with concentrations of suspended solids, nitrogen dioxide, carbon monoxide, hydrogen fluoride, for the respiratory diseases the main c principal component is closely associated with suspended solids, sulfur dioxide and nitrogen dioxide, charcoal black. The discriminant function was shown to be used as a measure of the level of air pollution.
Priority of VHS Development Based in Potential Area using Principal Component Analysis
NASA Astrophysics Data System (ADS)
Meirawan, D.; Ana, A.; Saripudin, S.
2018-02-01
The current condition of VHS is still inadequate in quality, quantity and relevance. The purpose of this research is to analyse the development of VHS based on the development of regional potential by using principal component analysis (PCA) in Bandung, Indonesia. This study used descriptive qualitative data analysis using the principle of secondary data reduction component. The method used is Principal Component Analysis (PCA) analysis with Minitab Statistics Software tool. The results of this study indicate the value of the lowest requirement is a priority of the construction of development VHS with a program of majors in accordance with the development of regional potential. Based on the PCA score found that the main priority in the development of VHS in Bandung is in Saguling, which has the lowest PCA value of 416.92 in area 1, Cihampelas with the lowest PCA value in region 2 and Padalarang with the lowest PCA value.
Azevedo, C F; Nascimento, M; Silva, F F; Resende, M D V; Lopes, P S; Guimarães, S E F; Glória, L S
2015-10-09
A significant contribution of molecular genetics is the direct use of DNA information to identify genetically superior individuals. With this approach, genome-wide selection (GWS) can be used for this purpose. GWS consists of analyzing a large number of single nucleotide polymorphism markers widely distributed in the genome; however, because the number of markers is much larger than the number of genotyped individuals, and such markers are highly correlated, special statistical methods are widely required. Among these methods, independent component regression, principal component regression, partial least squares, and partial principal components stand out. Thus, the aim of this study was to propose an application of the methods of dimensionality reduction to GWS of carcass traits in an F2 (Piau x commercial line) pig population. The results show similarities between the principal and the independent component methods and provided the most accurate genomic breeding estimates for most carcass traits in pigs.
Fast Component Pursuit for Large-Scale Inverse Covariance Estimation.
Han, Lei; Zhang, Yu; Zhang, Tong
2016-08-01
The maximum likelihood estimation (MLE) for the Gaussian graphical model, which is also known as the inverse covariance estimation problem, has gained increasing interest recently. Most existing works assume that inverse covariance estimators contain sparse structure and then construct models with the ℓ 1 regularization. In this paper, different from existing works, we study the inverse covariance estimation problem from another perspective by efficiently modeling the low-rank structure in the inverse covariance, which is assumed to be a combination of a low-rank part and a diagonal matrix. One motivation for this assumption is that the low-rank structure is common in many applications including the climate and financial analysis, and another one is that such assumption can reduce the computational complexity when computing its inverse. Specifically, we propose an efficient COmponent Pursuit (COP) method to obtain the low-rank part, where each component can be sparse. For optimization, the COP method greedily learns a rank-one component in each iteration by maximizing the log-likelihood. Moreover, the COP algorithm enjoys several appealing properties including the existence of an efficient solution in each iteration and the theoretical guarantee on the convergence of this greedy approach. Experiments on large-scale synthetic and real-world datasets including thousands of millions variables show that the COP method is faster than the state-of-the-art techniques for the inverse covariance estimation problem when achieving comparable log-likelihood on test data.
ERIC Educational Resources Information Center
National Association of Secondary School Principals, Reston, VA.
Preparation programs for principals should have excellent academic and performance based components. In examining the nature of performance based principal preparation this report finds that school administration programs must bridge the gap between conceptual learning in the classroom and the requirements of professional practice. A number of…
Principal component greenness transformation in multitemporal agricultural Landsat data
NASA Technical Reports Server (NTRS)
Abotteen, R. A.
1978-01-01
A data compression technique for multitemporal Landsat imagery which extracts phenological growth pattern information for agricultural crops is described. The principal component greenness transformation was applied to multitemporal agricultural Landsat data for information retrieval. The transformation was favorable for applications in agricultural Landsat data analysis because of its physical interpretability and its relation to the phenological growth of crops. It was also found that the first and second greenness eigenvector components define a temporal small-grain trajectory and nonsmall-grain trajectory, respectively.
Labyrinth, An Abstract Model for Hypermedia Applications. Description of its Static Components.
ERIC Educational Resources Information Center
Diaz, Paloma; Aedo, Ignacio; Panetsos, Fivos
1997-01-01
The model for hypermedia applications called Labyrinth allows: (1) the design of platform-independent hypermedia applications; (2) the categorization, generalization and abstraction of sparse unstructured heterogeneous information in multiple and interconnected levels; (3) the creation of personal views in multiuser hyperdocuments for both groups…
Parts and Relations in Young Children's Shape-Based Object Recognition
ERIC Educational Resources Information Center
Augustine, Elaine; Smith, Linda B.; Jones, Susan S.
2011-01-01
The ability to recognize common objects from sparse information about geometric shape emerges during the same period in which children learn object names and object categories. Hummel and Biederman's (1992) theory of object recognition proposes that the geometric shapes of objects have two components--geometric volumes representing major object…
Dictionary Learning Algorithms for Sparse Representation
Kreutz-Delgado, Kenneth; Murray, Joseph F.; Rao, Bhaskar D.; Engan, Kjersti; Lee, Te-Won; Sejnowski, Terrence J.
2010-01-01
Algorithms for data-driven learning of domain-specific overcomplete dictionaries are developed to obtain maximum likelihood and maximum a posteriori dictionary estimates based on the use of Bayesian models with concave/Schur-concave (CSC) negative log priors. Such priors are appropriate for obtaining sparse representations of environmental signals within an appropriately chosen (environmentally matched) dictionary. The elements of the dictionary can be interpreted as concepts, features, or words capable of succinct expression of events encountered in the environment (the source of the measured signals). This is a generalization of vector quantization in that one is interested in a description involving a few dictionary entries (the proverbial “25 words or less”), but not necessarily as succinct as one entry. To learn an environmentally adapted dictionary capable of concise expression of signals generated by the environment, we develop algorithms that iterate between a representative set of sparse representations found by variants of FOCUSS and an update of the dictionary using these sparse representations. Experiments were performed using synthetic data and natural images. For complete dictionaries, we demonstrate that our algorithms have improved performance over other independent component analysis (ICA) methods, measured in terms of signal-to-noise ratios of separated sources. In the overcomplete case, we show that the true underlying dictionary and sparse sources can be accurately recovered. In tests with natural images, learned overcomplete dictionaries are shown to have higher coding efficiency than complete dictionaries; that is, images encoded with an over-complete dictionary have both higher compression (fewer bits per pixel) and higher accuracy (lower mean square error). PMID:12590811
Pintus, M A; Gaspa, G; Nicolazzi, E L; Vicario, D; Rossoni, A; Ajmone-Marsan, P; Nardone, A; Dimauro, C; Macciotta, N P P
2012-06-01
The large number of markers available compared with phenotypes represents one of the main issues in genomic selection. In this work, principal component analysis was used to reduce the number of predictors for calculating genomic breeding values (GEBV). Bulls of 2 cattle breeds farmed in Italy (634 Brown and 469 Simmental) were genotyped with the 54K Illumina beadchip (Illumina Inc., San Diego, CA). After data editing, 37,254 and 40,179 single nucleotide polymorphisms (SNP) were retained for Brown and Simmental, respectively. Principal component analysis carried out on the SNP genotype matrix extracted 2,257 and 3,596 new variables in the 2 breeds, respectively. Bulls were sorted by birth year to create reference and prediction populations. The effect of principal components on deregressed proofs in reference animals was estimated with a BLUP model. Results were compared with those obtained by using SNP genotypes as predictors with either the BLUP or Bayes_A method. Traits considered were milk, fat, and protein yields, fat and protein percentages, and somatic cell score. The GEBV were obtained for prediction population by blending direct genomic prediction and pedigree indexes. No substantial differences were observed in squared correlations between GEBV and EBV in prediction animals between the 3 methods in the 2 breeds. The principal component analysis method allowed for a reduction of about 90% in the number of independent variables when predicting direct genomic values, with a substantial decrease in calculation time and without loss of accuracy. Copyright © 2012 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Karpuzcu, M Ekrem; Fairbairn, David; Arnold, William A; Barber, Brian L; Kaufenberg, Elizabeth; Koskinen, William C; Novak, Paige J; Rice, Pamela J; Swackhamer, Deborah L
2014-01-01
Principal components analysis (PCA) was used to identify sources of emerging organic contaminants in the Zumbro River watershed in Southeastern Minnesota. Two main principal components (PCs) were identified, which together explained more than 50% of the variance in the data. Principal Component 1 (PC1) was attributed to urban wastewater-derived sources, including municipal wastewater and residential septic tank effluents, while Principal Component 2 (PC2) was attributed to agricultural sources. The variances of the concentrations of cotinine, DEET and the prescription drugs carbamazepine, erythromycin and sulfamethoxazole were best explained by PC1, while the variances of the concentrations of the agricultural pesticides atrazine, metolachlor and acetochlor were best explained by PC2. Mixed use compounds carbaryl, iprodione and daidzein did not specifically group with either PC1 or PC2. Furthermore, despite the fact that caffeine and acetaminophen have been historically associated with human use, they could not be attributed to a single dominant land use category (e.g., urban/residential or agricultural). Contributions from septic systems did not clarify the source for these two compounds, suggesting that additional sources, such as runoff from biosolid-amended soils, may exist. Based on these results, PCA may be a useful way to broadly categorize the sources of new and previously uncharacterized emerging contaminants or may help to clarify transport pathways in a given area. Acetaminophen and caffeine were not ideal markers for urban/residential contamination sources in the study area and may need to be reconsidered as such in other areas as well.
Hua, Yang; Liu, Zhanqiang
2018-05-24
Residual stresses of turned Inconel 718 surface along its axial and circumferential directions affect the fatigue performance of machined components. However, it has not been clear that the axial and circumferential directions are the principle residual stress direction. The direction of the maximum principal residual stress is crucial for the machined component service life. The present work aims to focuses on determining the direction and magnitude of principal residual stress and investigating its influence on fatigue performance of turned Inconel 718. The turning experimental results show that the principal residual stress magnitude is much higher than surface residual stress. In addition, both the principal residual stress and surface residual stress increase significantly as the feed rate increases. The fatigue test results show that the direction of the maximum principal residual stress increased by 7.4%, while the fatigue life decreased by 39.4%. The maximum principal residual stress magnitude diminished by 17.9%, whereas the fatigue life increased by 83.6%. The maximum principal residual stress has a preponderant influence on fatigue performance as compared to the surface residual stress. The maximum principal residual stress can be considered as a prime indicator for evaluation of the residual stress influence on fatigue performance of turned Inconel 718.
Principal component analysis for designed experiments.
Konishi, Tomokazu
2015-01-01
Principal component analysis is used to summarize matrix data, such as found in transcriptome, proteome or metabolome and medical examinations, into fewer dimensions by fitting the matrix to orthogonal axes. Although this methodology is frequently used in multivariate analyses, it has disadvantages when applied to experimental data. First, the identified principal components have poor generality; since the size and directions of the components are dependent on the particular data set, the components are valid only within the data set. Second, the method is sensitive to experimental noise and bias between sample groups. It cannot reflect the experimental design that is planned to manage the noise and bias; rather, it estimates the same weight and independence to all the samples in the matrix. Third, the resulting components are often difficult to interpret. To address these issues, several options were introduced to the methodology. First, the principal axes were identified using training data sets and shared across experiments. These training data reflect the design of experiments, and their preparation allows noise to be reduced and group bias to be removed. Second, the center of the rotation was determined in accordance with the experimental design. Third, the resulting components were scaled to unify their size unit. The effects of these options were observed in microarray experiments, and showed an improvement in the separation of groups and robustness to noise. The range of scaled scores was unaffected by the number of items. Additionally, unknown samples were appropriately classified using pre-arranged axes. Furthermore, these axes well reflected the characteristics of groups in the experiments. As was observed, the scaling of the components and sharing of axes enabled comparisons of the components beyond experiments. The use of training data reduced the effects of noise and bias in the data, facilitating the physical interpretation of the principal axes. Together, these introduced options result in improved generality and objectivity of the analytical results. The methodology has thus become more like a set of multiple regression analyses that find independent models that specify each of the axes.
B. Desta Fekedulegn; J.J. Colbert; R.R., Jr. Hicks; Michael E. Schuckers
2002-01-01
The theory and application of principal components regression, a method for coping with multicollinearity among independent variables in analyzing ecological data, is exhibited in detail. A concrete example of the complex procedures that must be carried out in developing a diagnostic growth-climate model is provided. We use tree radial increment data taken from breast...
ERIC Educational Resources Information Center
Rahayu, Sri; Sugiarto, Teguh; Madu, Ludiro; Holiawati; Subagyo, Ahmad
2017-01-01
This study aims to apply the model principal component analysis to reduce multicollinearity on variable currency exchange rate in eight countries in Asia against US Dollar including the Yen (Japan), Won (South Korea), Dollar (Hong Kong), Yuan (China), Bath (Thailand), Rupiah (Indonesia), Ringgit (Malaysia), Dollar (Singapore). It looks at yield…
Radiative Transfer Modeling and Retrievals for Advanced Hyperspectral Sensors
NASA Technical Reports Server (NTRS)
Liu, Xu; Zhou, Daniel K.; Larar, Allen M.; Smith, William L., Sr.; Mango, Stephen A.
2009-01-01
A novel radiative transfer model and a physical inversion algorithm based on principal component analysis will be presented. Instead of dealing with channel radiances, the new approach fits principal component scores of these quantities. Compared to channel-based radiative transfer models, the new approach compresses radiances into a much smaller dimension making both forward modeling and inversion algorithm more efficient.
Principal component analysis of Raman spectra for TiO2 nanoparticle characterization
NASA Astrophysics Data System (ADS)
Ilie, Alina Georgiana; Scarisoareanu, Monica; Morjan, Ion; Dutu, Elena; Badiceanu, Maria; Mihailescu, Ion
2017-09-01
The Raman spectra of anatase/rutile mixed phases of Sn doped TiO2 nanoparticles and undoped TiO2 nanoparticles, synthesised by laser pyrolysis, with nanocrystallite dimensions varying from 8 to 28 nm, was simultaneously processed with a self-written software that applies Principal Component Analysis (PCA) on the measured spectrum to verify the possibility of objective auto-characterization of nanoparticles from their vibrational modes. The photo-excited process of Raman scattering is very sensible to the material characteristics, especially in the case of nanomaterials, where more properties become relevant for the vibrational behaviour. We used PCA, a statistical procedure that performs eigenvalue decomposition of descriptive data covariance, to automatically analyse the sample's measured Raman spectrum, and to interfere the correlation between nanoparticle dimensions, tin and carbon concentration, and their Principal Component values (PCs). This type of application can allow an approximation of the crystallite size, or tin concentration, only by measuring the Raman spectrum of the sample. The study of loadings of the principal components provides information of the way the vibrational modes are affected by the nanoparticle features and the spectral area relevant for the classification.
Sebro, Ronnie; Hoffman, Thomas J.; Lange, Christoph; Rogus, John J.; Risch, Neil J.
2013-01-01
Population stratification leads to a predictable phenomenon—a reduction in the number of heterozygotes compared to that calculated assuming Hardy-Weinberg Equilibrium (HWE). We show that population stratification results in another phenomenon—an excess in the proportion of spouse-pairs with the same genotypes at all ancestrally informative markers, resulting in ancestrally related positive assortative mating. We use principal components analysis to show that there is evidence of population stratification within the Framingham Heart Study, and show that the first principal component correlates with a North-South European cline. We then show that the first principal component is highly correlated between spouses (r=0.58, p=0.0013), demonstrating that there is ancestrally related positive assortative mating among the Framingham Caucasian population. We also show that the single nucleotide polymorphisms loading most heavily on the first principal component show an excess of homozygotes within the spouses, consistent with similar ancestry-related assortative mating in the previous generation. This nonrandom mating likely affects genetic structure seen more generally in the North American population of European descent today, and decreases the rate of decay of linkage disequilibrium for ancestrally informative markers. PMID:20842694
Puri, Ritika; Khamrui, Kaushik; Khetra, Yogesh; Malhotra, Ravinder; Devraja, H C
2016-02-01
Promising development and expansion in the market of cham-cham, a traditional Indian dairy product is expected in the coming future with the organized production of this milk product by some large dairies. The objective of this study was to document the extent of variation in sensory properties of market samples of cham-cham collected from four different locations known for their excellence in cham-cham production and to find out the attributes that govern much of variation in sensory scores of this product using quantitative descriptive analysis (QDA) and principal component analysis (PCA). QDA revealed significant (p < 0.05) difference in sensory attributes of cham-cham among the market samples. PCA identified four significant principal components that accounted for 72.4 % of the variation in the sensory data. Factor scores of each of the four principal components which primarily correspond to sweetness/shape/dryness of interior, surface appearance/surface dryness, rancid and firmness attributes specify the location of each market sample along each of the axes in 3-D graphs. These findings demonstrate the utility of quantitative descriptive analysis for identifying and measuring attributes of cham-cham that contribute most to its sensory acceptability.
Mahler, Barbara J.
2008-01-01
The statistical analyses taken together indicate that the geochemistry at the freshwater-zone wells is more variable than that at the transition-zone wells. The geochemical variability at the freshwater-zone wells might result from dilution of ground water by meteoric water. This is indicated by relatively constant major ion molar ratios; a preponderance of positive correlations between SC, major ions, and trace elements; and a principal components analysis in which the major ions are strongly loaded on the first principal component. Much of the variability at three of the four transition-zone wells might result from the use of different laboratory analytical methods or reporting procedures during the period of sampling. This is reflected by a lack of correlation between SC and major ion concentrations at the transition-zone wells and by a principal components analysis in which the variability is fairly evenly distributed across several principal components. The statistical analyses further indicate that, although the transition-zone wells are less well connected to surficial hydrologic conditions than the freshwater-zone wells, there is some connection but the response time is longer.
Matsen IV, Frederick A.; Evans, Steven N.
2013-01-01
Principal components analysis (PCA) and hierarchical clustering are two of the most heavily used techniques for analyzing the differences between nucleic acid sequence samples taken from a given environment. They have led to many insights regarding the structure of microbial communities. We have developed two new complementary methods that leverage how this microbial community data sits on a phylogenetic tree. Edge principal components analysis enables the detection of important differences between samples that contain closely related taxa. Each principal component axis is a collection of signed weights on the edges of the phylogenetic tree, and these weights are easily visualized by a suitable thickening and coloring of the edges. Squash clustering outputs a (rooted) clustering tree in which each internal node corresponds to an appropriate “average” of the original samples at the leaves below the node. Moreover, the length of an edge is a suitably defined distance between the averaged samples associated with the two incident nodes, rather than the less interpretable average of distances produced by UPGMA, the most widely used hierarchical clustering method in this context. We present these methods and illustrate their use with data from the human microbiome. PMID:23505415
Time Management Ideas for Assistant Principals.
ERIC Educational Resources Information Center
Cronk, Jerry
1987-01-01
Prioritizing the use of time, effective communication, delegating authority, having detailed job descriptions, and good secretarial assistance are important components of time management for assistant principals. (MD)
McSherry, Wilfred
2006-07-01
The aim of this study was to generate a deeper understanding of the factors and forces that may inhibit or advance the concepts of spirituality and spiritual care within both nursing and health care. This manuscript presents a model that emerged from a qualitative study using grounded theory. Implementation and use of this model may assist all health care practitioners and organizations to advance the concepts of spirituality and spiritual care within their own sphere of practice. The model has been termed the principal components model because participants identified six components as being crucial to the advancement of spiritual health care. Grounded theory was used meaning that there was concurrent data collection and analysis. Theoretical sampling was used to develop the emerging theory. These processes, along with data analysis, open, axial and theoretical coding led to the identification of a core category and the construction of the principal components model. Fifty-three participants (24 men and 29 women) were recruited and all consented to be interviewed. The sample included nurses (n=24), chaplains (n=7), a social worker (n=1), an occupational therapist (n=1), physiotherapists (n=2), patients (n=14) and the public (n=4). The investigation was conducted in three phases to substantiate the emerging theory and the development of the model. The principal components model contained six components: individuality, inclusivity, integrated, inter/intra-disciplinary, innate and institution. A great deal has been written on the concepts of spirituality and spiritual care. However, rhetoric alone will not remove some of the intrinsic and extrinsic barriers that are inhibiting the advancement of the spiritual dimension in terms of theory and practice. An awareness of and adherence to the principal components model may assist nurses and health care professionals to engage with and overcome some of the structural, organizational, political and social variables that are impacting upon spiritual care.
Principal component analysis of the nonlinear coupling of harmonic modes in heavy-ion collisions
NASA Astrophysics Data System (ADS)
BoŻek, Piotr
2018-03-01
The principal component analysis of flow correlations in heavy-ion collisions is studied. The correlation matrix of harmonic flow is generalized to correlations involving several different flow vectors. The method can be applied to study the nonlinear coupling between different harmonic modes in a double differential way in transverse momentum or pseudorapidity. The procedure is illustrated with results from the hydrodynamic model applied to Pb + Pb collisions at √{sN N}=2760 GeV. Three examples of generalized correlations matrices in transverse momentum are constructed corresponding to the coupling of v22 and v4, of v2v3 and v5, or of v23,v33 , and v6. The principal component decomposition is applied to the correlation matrices and the dominant modes are calculated.
Analysis and improvement measures of flight delay in China
NASA Astrophysics Data System (ADS)
Zang, Yuhang
2017-03-01
Firstly, this paper establishes the principal component regression model to analyze the data quantitatively, based on principal component analysis to get the three principal component factors of flight delays. Then the least square method is used to analyze the factors and obtained the regression equation expression by substitution, and then found that the main reason for flight delays is airlines, followed by weather and traffic. Aiming at the above problems, this paper improves the controllable aspects of traffic flow control. For reasons of traffic flow control, an adaptive genetic queuing model is established for the runway terminal area. This paper, establish optimization method that fifteen planes landed simultaneously on the three runway based on Beijing capital international airport, comparing the results with the existing FCFS algorithm, the superiority of the model is proved.
NASA Astrophysics Data System (ADS)
Haneishi, Hideaki; Sakuda, Yasunori; Honda, Toshio
2002-06-01
Spectral reflectance of most reflective objects such as natural objects and color hardcopy is relatively smooth and can be approximated by several numbers of principal components with high accuracy. Though the subspace spanned by those principal components represents a space in which reflective objects can exist, it dos not provide the bound in which the samples distribute. In this paper we propose to represent the gamut of reflective objects in more distinct form, i.e., as a polyhedron in the subspace spanned by several principal components. Concept of the polyhedral gamut representation and its application to calculation of metamer ensemble are described. Color-mismatch volume caused by different illuminant and/or observer for a metamer ensemble is also calculated and compared with theoretical one.
Evaluation of Low-Voltage Distribution Network Index Based on Improved Principal Component Analysis
NASA Astrophysics Data System (ADS)
Fan, Hanlu; Gao, Suzhou; Fan, Wenjie; Zhong, Yinfeng; Zhu, Lei
2018-01-01
In order to evaluate the development level of the low-voltage distribution network objectively and scientifically, chromatography analysis method is utilized to construct evaluation index model of low-voltage distribution network. Based on the analysis of principal component and the characteristic of logarithmic distribution of the index data, a logarithmic centralization method is adopted to improve the principal component analysis algorithm. The algorithm can decorrelate and reduce the dimensions of the evaluation model and the comprehensive score has a better dispersion degree. The clustering method is adopted to analyse the comprehensive score because the comprehensive score of the courts is concentrated. Then the stratification evaluation of the courts is realized. An example is given to verify the objectivity and scientificity of the evaluation method.
Online signature recognition using principal component analysis and artificial neural network
NASA Astrophysics Data System (ADS)
Hwang, Seung-Jun; Park, Seung-Je; Baek, Joong-Hwan
2016-12-01
In this paper, we propose an algorithm for on-line signature recognition using fingertip point in the air from the depth image acquired by Kinect. We extract 10 statistical features from X, Y, Z axis, which are invariant to changes in shifting and scaling of the signature trajectories in three-dimensional space. Artificial neural network is adopted to solve the complex signature classification problem. 30 dimensional features are converted into 10 principal components using principal component analysis, which is 99.02% of total variances. We implement the proposed algorithm and test to actual on-line signatures. In experiment, we verify the proposed method is successful to classify 15 different on-line signatures. Experimental result shows 98.47% of recognition rate when using only 10 feature vectors.
Petrov, Andrii Y; Herbst, Michael; Andrew Stenger, V
2017-08-15
Rapid whole-brain dynamic Magnetic Resonance Imaging (MRI) is of particular interest in Blood Oxygen Level Dependent (BOLD) functional MRI (fMRI). Faster acquisitions with higher temporal sampling of the BOLD time-course provide several advantages including increased sensitivity in detecting functional activation, the possibility of filtering out physiological noise for improving temporal SNR, and freezing out head motion. Generally, faster acquisitions require undersampling of the data which results in aliasing artifacts in the object domain. A recently developed low-rank (L) plus sparse (S) matrix decomposition model (L+S) is one of the methods that has been introduced to reconstruct images from undersampled dynamic MRI data. The L+S approach assumes that the dynamic MRI data, represented as a space-time matrix M, is a linear superposition of L and S components, where L represents highly spatially and temporally correlated elements, such as the image background, while S captures dynamic information that is sparse in an appropriate transform domain. This suggests that L+S might be suited for undersampled task or slow event-related fMRI acquisitions because the periodic nature of the BOLD signal is sparse in the temporal Fourier transform domain and slowly varying low-rank brain background signals, such as physiological noise and drift, will be predominantly low-rank. In this work, as a proof of concept, we exploit the L+S method for accelerating block-design fMRI using a 3D stack of spirals (SoS) acquisition where undersampling is performed in the k z -t domain. We examined the feasibility of the L+S method to accurately separate temporally correlated brain background information in the L component while capturing periodic BOLD signals in the S component. We present results acquired in control human volunteers at 3T for both retrospective and prospectively acquired fMRI data for a visual activation block-design task. We show that a SoS fMRI acquisition with an acceleration of four and L+S reconstruction can achieve a brain coverage of 40 slices at 2mm isotropic resolution and 64 x 64 matrix size every 500ms. Copyright © 2017 Elsevier Inc. All rights reserved.
Leontovich, T A; Zvegintseva, E G
1985-10-01
Two principal classes of striatum long axonal neurons (sparsely ramified reticular cells and densely ramified dendritic cells) were analyzed quantitatively in four animal species: hedgehog, rabbit, dog and monkey. The cross section area, total dendritic length and the area of dendritic field were measured using "LEITZ-ASM" system. Classes of neurons studied were significantly different in dogs and monkeys, while no differences were noted between hedgehog and rabbit. Reticular neurons of different species varied much more than dendritic ones. Quantitative analysis has revealed the progressive increase in the complexity of dendritic tree in mammals from rabbit to monkey.
Jesse, Stephen; Kalinin, Sergei V
2009-02-25
An approach for the analysis of multi-dimensional, spectroscopic-imaging data based on principal component analysis (PCA) is explored. PCA selects and ranks relevant response components based on variance within the data. It is shown that for examples with small relative variations between spectra, the first few PCA components closely coincide with results obtained using model fitting, and this is achieved at rates approximately four orders of magnitude faster. For cases with strong response variations, PCA allows an effective approach to rapidly process, de-noise, and compress data. The prospects for PCA combined with correlation function analysis of component maps as a universal tool for data analysis and representation in microscopy are discussed.
The Artistic Nature of the High School Principal.
ERIC Educational Resources Information Center
Ritschel, Robert E.
The role of high school principals can be compared to that of composers of music. For instance, composers put musical components together into a coherent whole; similarly, principals organize high schools by establishing class schedules, assigning roles to subordinates, and maintaining a safe and orderly learning environment. Second, composers…
ERIC Educational Resources Information Center
Odegard-Koester, Melissa A.; Watkins, Paul
2016-01-01
The working relationship between principals and school counselors have received some attention in the literature, however, little empirical research exists that examines specifically the components that facilitate a collaborative working relationship between the principal and school counselor. This qualitative case study examined the unique…
The Retention and Attrition of Catholic School Principals
ERIC Educational Resources Information Center
Durow, W. Patrick; Brock, Barbara L.
2004-01-01
This article reports the results of a study of the retention of principals in Catholic elementary and secondary schools in one Midwestern diocese. Findings revealed that personal needs, career advancement, support from employer, and clearly defined role expectations were key factors in principals' retention decisions. A profile of components of…
Network dynamics underlying the formation of sparse, informative representations in the hippocampus.
Karlsson, Mattias P; Frank, Loren M
2008-12-24
During development, activity-dependent processes increase the specificity of neural responses to stimuli, but the role that this type of process plays in adult plasticity is unclear. We examined the dynamics of hippocampal activity as animals learned about new environments to understand how neural selectivity changes with experience. Hippocampal principal neurons fire when the animal is located in a particular subregion of its environment, and in any given environment the hippocampal representation is sparse: less than half of the neurons in areas CA1 and CA3 are active whereas the rest are essentially silent. Here we show that different dynamics govern the evolution of this sparsity in CA1 and upstream area CA3. CA1, but not CA3, produces twice as many spikes in novel compared with familiar environments. This high rate firing continues during sharp wave ripple events in a subsequent rest period. The overall CA1 population rate declines and the number of active cells decreases as the environment becomes familiar and task performance improves, but the decline in rate is not uniform across neurons. Instead, the activity of cells with initial peak spatial rates above approximately 12 Hz is enhanced, whereas the activity of cells with lower initial peak rates is suppressed. The result of these changes is that the active CA1 population comes to consist of a relatively small group of cells with strong spatial tuning. This process is not evident in CA3, indicating that a region-specific and long timescale process operates in CA1 to create a sparse, spatially informative population of neurons.
ERIC Educational Resources Information Center
Lawson, J. S.; Inglis, James
1984-01-01
A learning disability index (LDI) for the assessment of intellectual deficits on the Wechsler Intelligence Scale for Children-Revised (WISC-R) is described. The Factor II score coefficients derived from an unrotated principal components analysis of the WISC-R normative data, in combination with the individual's scaled scores, are used for this…
Perturbation analyses of intermolecular interactions
NASA Astrophysics Data System (ADS)
Koyama, Yohei M.; Kobayashi, Tetsuya J.; Ueda, Hiroki R.
2011-08-01
Conformational fluctuations of a protein molecule are important to its function, and it is known that environmental molecules, such as water molecules, ions, and ligand molecules, significantly affect the function by changing the conformational fluctuations. However, it is difficult to systematically understand the role of environmental molecules because intermolecular interactions related to the conformational fluctuations are complicated. To identify important intermolecular interactions with regard to the conformational fluctuations, we develop herein (i) distance-independent and (ii) distance-dependent perturbation analyses of the intermolecular interactions. We show that these perturbation analyses can be realized by performing (i) a principal component analysis using conditional expectations of truncated and shifted intermolecular potential energy terms and (ii) a functional principal component analysis using products of intermolecular forces and conditional cumulative densities. We refer to these analyses as intermolecular perturbation analysis (IPA) and distance-dependent intermolecular perturbation analysis (DIPA), respectively. For comparison of the IPA and the DIPA, we apply them to the alanine dipeptide isomerization in explicit water. Although the first IPA principal components discriminate two states (the α state and PPII (polyproline II) + β states) for larger cutoff length, the separation between the PPII state and the β state is unclear in the second IPA principal components. On the other hand, in the large cutoff value, DIPA eigenvalues converge faster than that for IPA and the top two DIPA principal components clearly identify the three states. By using the DIPA biplot, the contributions of the dipeptide-water interactions to each state are analyzed systematically. Since the DIPA improves the state identification and the convergence rate with retaining distance information, we conclude that the DIPA is a more practical method compared with the IPA. To test the feasibility of the DIPA for larger molecules, we apply the DIPA to the ten-residue chignolin folding in explicit water. The top three principal components identify the four states (native state, two misfolded states, and unfolded state) and their corresponding eigenfunctions identify important chignolin-water interactions to each state. Thus, the DIPA provides the practical method to identify conformational states and their corresponding important intermolecular interactions with distance information.
Perturbation analyses of intermolecular interactions.
Koyama, Yohei M; Kobayashi, Tetsuya J; Ueda, Hiroki R
2011-08-01
Conformational fluctuations of a protein molecule are important to its function, and it is known that environmental molecules, such as water molecules, ions, and ligand molecules, significantly affect the function by changing the conformational fluctuations. However, it is difficult to systematically understand the role of environmental molecules because intermolecular interactions related to the conformational fluctuations are complicated. To identify important intermolecular interactions with regard to the conformational fluctuations, we develop herein (i) distance-independent and (ii) distance-dependent perturbation analyses of the intermolecular interactions. We show that these perturbation analyses can be realized by performing (i) a principal component analysis using conditional expectations of truncated and shifted intermolecular potential energy terms and (ii) a functional principal component analysis using products of intermolecular forces and conditional cumulative densities. We refer to these analyses as intermolecular perturbation analysis (IPA) and distance-dependent intermolecular perturbation analysis (DIPA), respectively. For comparison of the IPA and the DIPA, we apply them to the alanine dipeptide isomerization in explicit water. Although the first IPA principal components discriminate two states (the α state and PPII (polyproline II) + β states) for larger cutoff length, the separation between the PPII state and the β state is unclear in the second IPA principal components. On the other hand, in the large cutoff value, DIPA eigenvalues converge faster than that for IPA and the top two DIPA principal components clearly identify the three states. By using the DIPA biplot, the contributions of the dipeptide-water interactions to each state are analyzed systematically. Since the DIPA improves the state identification and the convergence rate with retaining distance information, we conclude that the DIPA is a more practical method compared with the IPA. To test the feasibility of the DIPA for larger molecules, we apply the DIPA to the ten-residue chignolin folding in explicit water. The top three principal components identify the four states (native state, two misfolded states, and unfolded state) and their corresponding eigenfunctions identify important chignolin-water interactions to each state. Thus, the DIPA provides the practical method to identify conformational states and their corresponding important intermolecular interactions with distance information.
Inayama, T; Kashiwazaki, H; Sakamoto, M
1998-12-01
We tried to analyze synthetically teachers' view points associated with health education and roles of school lunch in primary education. For this purpose, a survey using an open-ended questionnaire consisting of eight items relating to health education in the school curriculum was carried out in 100 teachers of ten public primary schools. Subjects were asked to describe their view regarding the following eight items: 1) health and physical guidance education, 2) school lunch guidance education, 3) pupils' attitude toward their own health and nutrition, 4) health education, 5) role of school lunch in education, 6) future subjects of health education, 7) class room lesson related to school lunch, 8) guidance in case of pupil with unbalanced dieting and food avoidance. Subjects described their own opinions on an open-ended questionnaire response sheet. Keywords in individual descriptions were selected, rearranged and classified into categories according to their own meanings, and each of the selected keywords were used as the dummy variable. To assess individual opinions synthetically, a principal component analysis was then applied to the variables collected through the teachers' descriptions, and four factors were extracted. The results were as follows. 1) Four factors obtained from the repeated principal component analysis were summarized as; roles of health education and school lunch program (the first principal component), cooperation with nurse-teachers and those in charge of lunch service (the second principal component), time allocation for health education in home-room activity and lunch time (the third principal component) and contents of health education and school lunch guidance and their future plan (the fourth principal component). 2) Teachers regarded the role of school lunch in primary education as providing daily supply of nutrients, teaching of table manners and building up friendships with classmates, health education and food and nutrition education, and developing food preferences through eating lunch together with classmates. 3) Significant positive correlation was observed between "the teachers' opinion about the role of school lunch of providing opportunity to learn good behavior for food preferences through eating lunch together with classmates" and the first principal component "roles of health education and school lunch program" (r = 0.39, p < 0.01). The variable "the role of school lunch is health education and food and nutrition education" showed positive correlation with the principle component "cooperation with nurse-teachers and those in charge of lunch service" (r = 0.27, p < 0.01). Interesting relationships obtained were that teachers with longer educational experience tended to place importance in health education and food and nutrition education as the role of school lunch, and that male teachers regarded the roles of school lunch more importantly for future education in primary education than female teachers did.
Phenomenology of mixed states: a principal component analysis study.
Bertschy, G; Gervasoni, N; Favre, S; Liberek, C; Ragama-Pardos, E; Aubry, J-M; Gex-Fabry, M; Dayer, A
2007-12-01
To contribute to the definition of external and internal limits of mixed states and study the place of dysphoric symptoms in the psychopathology of mixed states. One hundred and sixty-five inpatients with major mood episodes were diagnosed as presenting with either pure depression, mixed depression (depression plus at least three manic symptoms), full mixed state (full depression and full mania), mixed mania (mania plus at least three depressive symptoms) or pure mania, using an adapted version of the Mini International Neuropsychiatric Interview (DSM-IV version). They were evaluated using a 33-item inventory of depressive, manic and mixed affective signs and symptoms. Principal component analysis without rotation yielded three components that together explained 43.6% of the variance. The first component (24.3% of the variance) contrasted typical depressive symptoms with typical euphoric, manic symptoms. The second component, labeled 'dysphoria', (13.8%) had strong positive loadings for irritability, distressing sensitivity to light and noise, impulsivity and inner tension. The third component (5.5%) included symptoms of insomnia. Median scores for the first component significantly decreased from the pure depression group to the pure mania group. For the dysphoria component, scores were highest among patients with full mixed states and decreased towards both patients with pure depression and those with pure mania. Principal component analysis revealed that dysphoria represents an important dimension of mixed states.
A Principle Component Analysis of Galaxy Properties from a Large, Gas-Selected Sample
Chang, Yu-Yen; Chao, Rikon; Wang, Wei-Hao; ...
2012-01-01
Disney emore » t al. (2008) have found a striking correlation among global parameters of H i -selected galaxies and concluded that this is in conflict with the CDM model. Considering the importance of the issue, we reinvestigate the problem using the principal component analysis on a fivefold larger sample and additional near-infrared data. We use databases from the Arecibo Legacy Fast Arecibo L -band Feed Array Survey for the gas properties, the Sloan Digital Sky Survey for the optical properties, and the Two Micron All Sky Survey for the near-infrared properties. We confirm that the parameters are indeed correlated where a single physical parameter can explain 83% of the variations. When color ( g - i ) is included, the first component still dominates but it develops a second principal component. In addition, the near-infrared color ( i - J ) shows an obvious second principal component that might provide evidence of the complex old star formation. Based on our data, we suggest that it is premature to pronounce the failure of the CDM model and it motivates more theoretical work.« less
Principal component analysis of dynamic fluorescence images for diagnosis of diabetic vasculopathy
NASA Astrophysics Data System (ADS)
Seo, Jihye; An, Yuri; Lee, Jungsul; Ku, Taeyun; Kang, Yujung; Ahn, Chulwoo; Choi, Chulhee
2016-04-01
Indocyanine green (ICG) fluorescence imaging has been clinically used for noninvasive visualizations of vascular structures. We have previously developed a diagnostic system based on dynamic ICG fluorescence imaging for sensitive detection of vascular disorders. However, because high-dimensional raw data were used, the analysis of the ICG dynamics proved difficult. We used principal component analysis (PCA) in this study to extract important elements without significant loss of information. We examined ICG spatiotemporal profiles and identified critical features related to vascular disorders. PCA time courses of the first three components showed a distinct pattern in diabetic patients. Among the major components, the second principal component (PC2) represented arterial-like features. The explained variance of PC2 in diabetic patients was significantly lower than in normal controls. To visualize the spatial pattern of PCs, pixels were mapped with red, green, and blue channels. The PC2 score showed an inverse pattern between normal controls and diabetic patients. We propose that PC2 can be used as a representative bioimaging marker for the screening of vascular diseases. It may also be useful in simple extractions of arterial-like features.
Convergence Speed of a Dynamical System for Sparse Recovery
NASA Astrophysics Data System (ADS)
Balavoine, Aurele; Rozell, Christopher J.; Romberg, Justin
2013-09-01
This paper studies the convergence rate of a continuous-time dynamical system for L1-minimization, known as the Locally Competitive Algorithm (LCA). Solving L1-minimization} problems efficiently and rapidly is of great interest to the signal processing community, as these programs have been shown to recover sparse solutions to underdetermined systems of linear equations and come with strong performance guarantees. The LCA under study differs from the typical L1 solver in that it operates in continuous time: instead of being specified by discrete iterations, it evolves according to a system of nonlinear ordinary differential equations. The LCA is constructed from simple components, giving it the potential to be implemented as a large-scale analog circuit. The goal of this paper is to give guarantees on the convergence time of the LCA system. To do so, we analyze how the LCA evolves as it is recovering a sparse signal from underdetermined measurements. We show that under appropriate conditions on the measurement matrix and the problem parameters, the path the LCA follows can be described as a sequence of linear differential equations, each with a small number of active variables. This allows us to relate the convergence time of the system to the restricted isometry constant of the matrix. Interesting parallels to sparse-recovery digital solvers emerge from this study. Our analysis covers both the noisy and noiseless settings and is supported by simulation results.
Jiang, Xi; Zhang, Xin; Zhu, Dajiang
2014-10-01
Alzheimer's disease (AD) is the most common type of dementia (accounting for 60% to 80%) and is the fifth leading cause of death for those people who are 65 or older. By 2050, one new case of AD in United States is expected to develop every 33 sec. Unfortunately, there is no available effective treatment that can stop or slow the death of neurons that causes AD symptoms. On the other hand, it is widely believed that AD starts before development of the associated symptoms, so its prestages, including mild cognitive impairment (MCI) or even significant memory concern (SMC), have received increasing attention, not only because of their potential as a precursor of AD, but also as a possible predictor of conversion to other neurodegenerative diseases. Although these prestages have been defined clinically, accurate/efficient diagnosis is still challenging. Moreover, brain functional abnormalities behind those alterations and conversions are still unclear. In this article, by developing novel sparse representations of whole-brain resting-state functional magnetic resonance imaging signals and by using the most updated Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset, we successfully identified multiple functional components simultaneously, and which potentially represent those intrinsic functional networks involved in the resting-state activities. Interestingly, these identified functional components contain all the resting-state networks obtained from traditional independent-component analysis. Moreover, by using the features derived from those functional components, it yields high classification accuracy for both AD (94%) and MCI (92%) versus normal controls. Even for SMC we can still have 92% accuracy.
Amygdala and Hippocampus Enlargement during Adolescence in Autism
ERIC Educational Resources Information Center
Groen, Wouter; Teluij, Michelle; Buitelaar, Jan; Tendolkar, Indira
2010-01-01
Objective: The amygdala and hippocampus are key components of the neural system mediating emotion perception and regulation and are thought to be involved in the pathophysiology of autism. Although some studies in children with autism suggest that there is an enlargement of amygdala and hippocampal volume, findings in adolescence are sparse.…
Rural Schools: Off the Beaten Path
ERIC Educational Resources Information Center
Gordon, Dan
2011-01-01
This article is the second of a two-part series on how schools in different types of communities meet the challenge of implementing technology. The emergence of technology as a critical component of education has presented rural districts with an invaluable tool for overcoming the problems created by sparse and remote populations. But rural…
Zuendorf, Gerhard; Kerrouche, Nacer; Herholz, Karl; Baron, Jean-Claude
2003-01-01
Principal component analysis (PCA) is a well-known technique for reduction of dimensionality of functional imaging data. PCA can be looked at as the projection of the original images onto a new orthogonal coordinate system with lower dimensions. The new axes explain the variance in the images in decreasing order of importance, showing correlations between brain regions. We used an efficient, stable and analytical method to work out the PCA of Positron Emission Tomography (PET) images of 74 normal subjects using [(18)F]fluoro-2-deoxy-D-glucose (FDG) as a tracer. Principal components (PCs) and their relation to age effects were investigated. Correlations between the projections of the images on the new axes and the age of the subjects were carried out. The first two PCs could be identified as being the only PCs significantly correlated to age. The first principal component, which explained 10% of the data set variance, was reduced only in subjects of age 55 or older and was related to loss of signal in and adjacent to ventricles and basal cisterns, reflecting expected age-related brain atrophy with enlarging CSF spaces. The second principal component, which accounted for 8% of the total variance, had high loadings from prefrontal, posterior parietal and posterior cingulate cortices and showed the strongest correlation with age (r = -0.56), entirely consistent with previously documented age-related declines in brain glucose utilization. Thus, our method showed that the effect of aging on brain metabolism has at least two independent dimensions. This method should have widespread applications in multivariate analysis of brain functional images. Copyright 2002 Wiley-Liss, Inc.
HT-FRTC: a fast radiative transfer code using kernel regression
NASA Astrophysics Data System (ADS)
Thelen, Jean-Claude; Havemann, Stephan; Lewis, Warren
2016-09-01
The HT-FRTC is a principal component based fast radiative transfer code that can be used across the electromagnetic spectrum from the microwave through to the ultraviolet to calculate transmittance, radiance and flux spectra. The principal components cover the spectrum at a very high spectral resolution, which allows very fast line-by-line, hyperspectral and broadband simulations for satellite-based, airborne and ground-based sensors. The principal components are derived during a code training phase from line-by-line simulations for a diverse set of atmosphere and surface conditions. The derived principal components are sensor independent, i.e. no extra training is required to include additional sensors. During the training phase we also derive the predictors which are required by the fast radiative transfer code to determine the principal component scores from the monochromatic radiances (or fluxes, transmittances). These predictors are calculated for each training profile at a small number of frequencies, which are selected by a k-means cluster algorithm during the training phase. Until recently the predictors were calculated using a linear regression. However, during a recent rewrite of the code the linear regression was replaced by a Gaussian Process (GP) regression which resulted in a significant increase in accuracy when compared to the linear regression. The HT-FRTC has been trained with a large variety of gases, surface properties and scatterers. Rayleigh scattering as well as scattering by frozen/liquid clouds, hydrometeors and aerosols have all been included. The scattering phase function can be fully accounted for by an integrated line-by-line version of the Edwards-Slingo spherical harmonics radiation code or approximately by a modification to the extinction (Chou scaling).
Spectral decomposition of asteroid Itokawa based on principal component analysis
NASA Astrophysics Data System (ADS)
Koga, Sumire C.; Sugita, Seiji; Kamata, Shunichi; Ishiguro, Masateru; Hiroi, Takahiro; Tatsumi, Eri; Sasaki, Sho
2018-01-01
The heliocentric stratification of asteroid spectral types may hold important information on the early evolution of the Solar System. Asteroid spectral taxonomy is based largely on principal component analysis. However, how the surface properties of asteroids, such as the composition and age, are projected in the principal-component (PC) space is not understood well. We decompose multi-band disk-resolved visible spectra of the Itokawa surface with principal component analysis (PCA) in comparison with main-belt asteroids. The obtained distribution of Itokawa spectra projected in the PC space of main-belt asteroids follows a linear trend linking the Q-type and S-type regions and is consistent with the results of space-weathering experiments on ordinary chondrites and olivine, suggesting that this trend may be a space-weathering-induced spectral evolution track for S-type asteroids. Comparison with space-weathering experiments also yield a short average surface age (< a few million years) for Itokawa, consistent with the cosmic-ray-exposure time of returned samples from Itokawa. The Itokawa PC score distribution exhibits asymmetry along the evolution track, strongly suggesting that space weathering has begun saturated on this young asteroid. The freshest spectrum found on Itokawa exhibits a clear sign for space weathering, indicating again that space weathering occurs very rapidly on this body. We also conducted PCA on Itokawa spectra alone and compared the results with space-weathering experiments. The obtained results indicate that the first principal component of Itokawa surface spectra is consistent with spectral change due to space weathering and that the spatial variation in the degree of space weathering is very large (a factor of three in surface age), which would strongly suggest the presence of strong regional/local resurfacing process(es) on this small asteroid.
Cross-correlation matrix analysis of Chinese and American bank stocks in subprime crisis
NASA Astrophysics Data System (ADS)
Zhu, Shi-Zhao; Li, Xin-Li; Nie, Sen; Zhang, Wen-Qing; Yu, Gao-Feng; Han, Xiao-Pu; Wang, Bing-Hong
2015-05-01
In order to study the universality of the interactions among different markets, we analyze the cross-correlation matrix of the price of the Chinese and American bank stocks. We then find that the stock prices of the emerging market are more correlated than that of the developed market. Considering that the values of the components for the eigenvector may be positive or negative, we analyze the differences between two markets in combination with the endogenous and exogenous events which influence the financial markets. We find that the sparse pattern of components of eigenvectors out of the threshold value has no change in American bank stocks before and after the subprime crisis. However, it changes from sparse to dense for Chinese bank stocks. By using the threshold value to exclude the external factors, we simulate the interactions in financial markets. Project supported by the National Natural Science Foundation of China (Grant Nos. 11275186, 91024026, and FOM2014OF001) and the University of Shanghai for Science and Technology (USST) of Humanities and Social Sciences, China (Grant Nos. USST13XSZ05 and 11YJA790231).
Sambo, Francesco; de Oca, Marco A Montes; Di Camillo, Barbara; Toffolo, Gianna; Stützle, Thomas
2012-01-01
Reverse engineering is the problem of inferring the structure of a network of interactions between biological variables from a set of observations. In this paper, we propose an optimization algorithm, called MORE, for the reverse engineering of biological networks from time series data. The model inferred by MORE is a sparse system of nonlinear differential equations, complex enough to realistically describe the dynamics of a biological system. MORE tackles separately the discrete component of the problem, the determination of the biological network topology, and the continuous component of the problem, the strength of the interactions. This approach allows us both to enforce system sparsity, by globally constraining the number of edges, and to integrate a priori information about the structure of the underlying interaction network. Experimental results on simulated and real-world networks show that the mixed discrete/continuous optimization approach of MORE significantly outperforms standard continuous optimization and that MORE is competitive with the state of the art in terms of accuracy of the inferred networks.
A sparse grid based method for generative dimensionality reduction of high-dimensional data
NASA Astrophysics Data System (ADS)
Bohn, Bastian; Garcke, Jochen; Griebel, Michael
2016-03-01
Generative dimensionality reduction methods play an important role in machine learning applications because they construct an explicit mapping from a low-dimensional space to the high-dimensional data space. We discuss a general framework to describe generative dimensionality reduction methods, where the main focus lies on a regularized principal manifold learning variant. Since most generative dimensionality reduction algorithms exploit the representer theorem for reproducing kernel Hilbert spaces, their computational costs grow at least quadratically in the number n of data. Instead, we introduce a grid-based discretization approach which automatically scales just linearly in n. To circumvent the curse of dimensionality of full tensor product grids, we use the concept of sparse grids. Furthermore, in real-world applications, some embedding directions are usually more important than others and it is reasonable to refine the underlying discretization space only in these directions. To this end, we employ a dimension-adaptive algorithm which is based on the ANOVA (analysis of variance) decomposition of a function. In particular, the reconstruction error is used to measure the quality of an embedding. As an application, the study of large simulation data from an engineering application in the automotive industry (car crash simulation) is performed.
NASA Astrophysics Data System (ADS)
Chattopadhyay, Goutami; Chattopadhyay, Surajit; Chakraborthy, Parthasarathi
2012-07-01
The present study deals with daily total ozone concentration time series over four metro cities of India namely Kolkata, Mumbai, Chennai, and New Delhi in the multivariate environment. Using the Kaiser-Meyer-Olkin measure, it is established that the data set under consideration are suitable for principal component analysis. Subsequently, by introducing rotated component matrix for the principal components, the predictors suitable for generating artificial neural network (ANN) for daily total ozone prediction are identified. The multicollinearity is removed in this way. Models of ANN in the form of multilayer perceptron trained through backpropagation learning are generated for all of the study zones, and the model outcomes are assessed statistically. Measuring various statistics like Pearson correlation coefficients, Willmott's indices, percentage errors of prediction, and mean absolute errors, it is observed that for Mumbai and Kolkata the proposed ANN model generates very good predictions. The results are supported by the linearly distributed coordinates in the scatterplots.
NASA Astrophysics Data System (ADS)
Seo, Jihye; An, Yuri; Lee, Jungsul; Choi, Chulhee
2015-03-01
Indocyanine green (ICG), a near-infrared fluorophore, has been used in visualization of vascular structure and non-invasive diagnosis of vascular disease. Although many imaging techniques have been developed, there are still limitations in diagnosis of vascular diseases. We have recently developed a minimally invasive diagnostics system based on ICG fluorescence imaging for sensitive detection of vascular insufficiency. In this study, we used principal component analysis (PCA) to examine ICG spatiotemporal profile and to obtain pathophysiological information from ICG dynamics. Here we demonstrated that principal components of ICG dynamics in both feet showed significant differences between normal control and diabetic patients with vascula complications. We extracted the PCA time courses of the first three components and found distinct pattern in diabetic patient. We propose that PCA of ICG dynamics reveal better classification performance compared to fluorescence intensity analysis. We anticipate that specific feature of spatiotemporal ICG dynamics can be useful in diagnosis of various vascular diseases.
Leadership Coaching: A Multiple-Case Study of Urban Public Charter School Principals' Experiences
ERIC Educational Resources Information Center
Lackritz, Anne D.
2017-01-01
This multi-case study seeks to understand the experiences of New York City and Washington, DC public charter school principals who have experienced leadership coaching, a component of leadership development, beyond their novice years. The research questions framing this study address how experienced public charter school principals describe the…
The View from the Principal's Office: An Observation Protocol Boosts Literacy :eadership
ERIC Educational Resources Information Center
Novak, Sandi; Houck, Bonnie
2016-01-01
The Minnesota Elementary School Principals' Association offered Minnesota principals professional learning that placed a high priority on literacy instruction and developing a collegial culture. A key component is the literacy classroom visit, an observation protocol used to gather data to determine the status of literacy teaching and student…
ERIC Educational Resources Information Center
Agnew, David W.
2011-01-01
Public school principals must meet many challenges and make decisions concerning financial obligations while providing the best learning environment for students. A major challenge to principals is implementing technological components successfully while providing teachers the 21st century instructional skills needed to enhance students'…
Differential principal component analysis of ChIP-seq.
Ji, Hongkai; Li, Xia; Wang, Qian-fei; Ning, Yang
2013-04-23
We propose differential principal component analysis (dPCA) for analyzing multiple ChIP-sequencing datasets to identify differential protein-DNA interactions between two biological conditions. dPCA integrates unsupervised pattern discovery, dimension reduction, and statistical inference into a single framework. It uses a small number of principal components to summarize concisely the major multiprotein synergistic differential patterns between the two conditions. For each pattern, it detects and prioritizes differential genomic loci by comparing the between-condition differences with the within-condition variation among replicate samples. dPCA provides a unique tool for efficiently analyzing large amounts of ChIP-sequencing data to study dynamic changes of gene regulation across different biological conditions. We demonstrate this approach through analyses of differential chromatin patterns at transcription factor binding sites and promoters as well as allele-specific protein-DNA interactions.
Three dimensional empirical mode decomposition analysis apparatus, method and article manufacture
NASA Technical Reports Server (NTRS)
Gloersen, Per (Inventor)
2004-01-01
An apparatus and method of analysis for three-dimensional (3D) physical phenomena. The physical phenomena may include any varying 3D phenomena such as time varying polar ice flows. A repesentation of the 3D phenomena is passed through a Hilbert transform to convert the data into complex form. A spatial variable is separated from the complex representation by producing a time based covariance matrix. The temporal parts of the principal components are produced by applying Singular Value Decomposition (SVD). Based on the rapidity with which the eigenvalues decay, the first 3-10 complex principal components (CPC) are selected for Empirical Mode Decomposition into intrinsic modes. The intrinsic modes produced are filtered in order to reconstruct the spatial part of the CPC. Finally, a filtered time series may be reconstructed from the first 3-10 filtered complex principal components.
Liang, Xuedong; Liu, Canmian; Li, Zhi
2017-01-01
In connection with the sustainable development of scenic spots, this paper, with consideration of resource conditions, economic benefits, auxiliary industry scale and ecological environment, establishes a comprehensive measurement model of the sustainable capacity of scenic spots; optimizes the index system by principal components analysis to extract principal components; assigns the weight of principal components by entropy method; analyzes the sustainable capacity of scenic spots in each province of China comprehensively in combination with TOPSIS method and finally puts forward suggestions aid decision-making. According to the study, this method provides an effective reference for the study of the sustainable development of scenic spots and is very significant for considering the sustainable development of scenic spots and auxiliary industries to establish specific and scientific countermeasures for improvement. PMID:29271947
The variance needed to accurately describe jump height from vertical ground reaction force data.
Richter, Chris; McGuinness, Kevin; O'Connor, Noel E; Moran, Kieran
2014-12-01
In functional principal component analysis (fPCA) a threshold is chosen to define the number of retained principal components, which corresponds to the amount of preserved information. A variety of thresholds have been used in previous studies and the chosen threshold is often not evaluated. The aim of this study is to identify the optimal threshold that preserves the information needed to describe a jump height accurately utilizing vertical ground reaction force (vGRF) curves. To find an optimal threshold, a neural network was used to predict jump height from vGRF curve measures generated using different fPCA thresholds. The findings indicate that a threshold from 99% to 99.9% (6-11 principal components) is optimal for describing jump height, as these thresholds generated significantly lower jump height prediction errors than other thresholds.
Liang, Xuedong; Liu, Canmian; Li, Zhi
2017-12-22
In connection with the sustainable development of scenic spots, this paper, with consideration of resource conditions, economic benefits, auxiliary industry scale and ecological environment, establishes a comprehensive measurement model of the sustainable capacity of scenic spots; optimizes the index system by principal components analysis to extract principal components; assigns the weight of principal components by entropy method; analyzes the sustainable capacity of scenic spots in each province of China comprehensively in combination with TOPSIS method and finally puts forward suggestions aid decision-making. According to the study, this method provides an effective reference for the study of the sustainable development of scenic spots and is very significant for considering the sustainable development of scenic spots and auxiliary industries to establish specific and scientific countermeasures for improvement.
Hou, Guangjin; Gupta, Rupal; Polenova, Tatyana; Vega, Alexander J
2014-02-01
Proton chemical shifts are a rich probe of structure and hydrogen bonding environments in organic and biological molecules. Until recently, measurements of 1 H chemical shift tensors have been restricted to either solid systems with sparse proton sites or were based on the indirect determination of anisotropic tensor components from cross-relaxation and liquid-crystal experiments. We have introduced an MAS approach that permits site-resolved determination of CSA tensors of protons forming chemical bonds with labeled spin-1/2 nuclei in fully protonated solids with multiple sites, including organic molecules and proteins. This approach, originally introduced for the measurements of chemical shift tensors of amide protons, is based on three RN -symmetry based experiments, from which the principal components of the 1 H CS tensor can be reliably extracted by simultaneous triple fit of the data. In this article, we expand our approach to a much more challenging system involving aliphatic and aromatic protons. We start with a review of the prior work on experimental-NMR and computational-quantum-chemical approaches for the measurements of 1 H chemical shift tensors and for relating these to the electronic structures. We then present our experimental results on U- 13 C, 15 N-labeled histdine demonstrating that 1 H chemical shift tensors can be reliably determined for the 1 H 15 N and 1 H 13 C spin pairs in cationic and neutral forms of histidine. Finally, we demonstrate that the experimental 1 H(C) and 1 H(N) chemical shift tensors are in agreement with Density Functional Theory calculations, therefore establishing the usefulness of our method for characterization of structure and hydrogen bonding environment in organic and biological solids.
NASA Technical Reports Server (NTRS)
Robertson, Franklin R.; Bosilovich, Michael G.; Roberts, Jason B.
2016-01-01
Vertically integrated atmospheric moisture transport from ocean to land [vertically integrated atmospheric moisture flux convergence (VMFC)] is a dynamic component of the global climate system but remains problematic in atmospheric reanalyses, with current estimates having significant multidecadal global trends differing even in sign. Continual evolution of the global observing system, particularly stepwise improvements in satellite observations, has introduced discrete changes in the ability of data assimilation to correct systematic model biases, manifesting as nonphysical variability. Land surface models (LSMs) forced with observed precipitation P and near-surface meteorology and radiation provide estimates of evapotranspiration (ET). Since variability of atmospheric moisture storage is small on interannual and longer time scales, VMFC equals P minus ET is a good approximation and LSMs can provide an alternative estimate. However, heterogeneous density of rain gauge coverage, especially the sparse coverage over tropical continents, remains a serious concern. Rotated principal component analysis (RPCA) with prefiltering of VMFC to isolate the artificial variability is used to investigate artifacts in five reanalysis systems. This procedure, although ad hoc, enables useful VMFC corrections over global land. The P minus ET estimates from seven different LSMs are evaluated and subsequently used to confirm the efficacy of the RPCA-based adjustments. Global VMFC trends over the period 1979-2012 ranging from 0.07 to minus 0.03 millimeters per day per decade are reduced by the adjustments to 0.016 millimeters per day per decade, much closer to the LSM P minus ET estimate (0.007 millimeters per day per decade). Neither is significant at the 90 percent level. ENSO (El Nino-Southern Oscillation)-related modulation of VMFC and P minus ET remains the largest global interannual signal, with mean LSM and adjusted reanalysis time series correlating at 0.86.
Components of Air Pollution and Cognitive Function in Middle-aged and Older Adults in Los Angeles
Gatto, Nicole M.; Henderson, Victor W.; Hodis, Howard N.; St John, Jan A.; Lurmann, Fred; Chen, Jiu-Chiuan; Mack, Wendy J.
2014-01-01
While experiments in animals demonstrate neurotoxic effects of particulate matter (PM) and ozone (O3), epidemiologic evidence is sparse regarding the relationship between different constituencies of air pollution mixtures and cognitive function in adults. We examined cross-sectional associations between various ambient air pollutants [O3, PM2.5 and nitrogen dioxide (NO2)] and six measures of cognitive function and global cognition among healthy, cognitively intact individuals (n=1,496, mean age 60.5 years) residing in the Los Angeles Basin. Air pollution exposures were assigned to each residential address in 2000–06 using a geographic information system that included monitoring data. A neuropsychological battery was used to assess cognitive function; a principal components analysis defined six domain-specific functions and a measure of global cognitive function was created. Regression models estimated effects of air pollutants on cognitive function, adjusting for age, gender, race, education, income, study and mood. Increasing exposure to PM2.5 was associated with lower verbal learning (β = −0.32 per 10 ug/m3 PM2.5, 95% CI = −0.63, 0.00; p = 0.05). Ambient exposure to NO2 >20 ppb tended to be associated with lower logical memory. Compared to the lowest level of exposure to ambient O3, exposure above 49 ppb was associated with lower executive function. Including carotid artery intima-media thickness, a measure of subclinical atherosclerosis, in models as a possible mediator did not attenuate effect estimates. This study provides support for cross-sectional associations between increasing levels of ambient O3, PM2.5 and NO2 and measures of domain-specific cognitive abilities. PMID:24148924
Components of air pollution and cognitive function in middle-aged and older adults in Los Angeles.
Gatto, Nicole M; Henderson, Victor W; Hodis, Howard N; St John, Jan A; Lurmann, Fred; Chen, Jiu-Chiuan; Mack, Wendy J
2014-01-01
While experiments in animals demonstrate neurotoxic effects of particulate matter (PM) and ozone (O3), epidemiologic evidence is sparse regarding the relationship between different constituencies of air pollution mixtures and cognitive function in adults. We examined cross-sectional associations between various ambient air pollutants [O3, PM2.5 and nitrogen dioxide (NO2)] and six measures of cognitive function and global cognition among healthy, cognitively intact individuals (n=1496, mean age 60.5 years) residing in the Los Angeles Basin. Air pollution exposures were assigned to each residential address in 2000-06 using a geographic information system that included monitoring data. A neuropsychological battery was used to assess cognitive function; a principal components analysis defined six domain-specific functions and a measure of global cognitive function was created. Regression models estimated effects of air pollutants on cognitive function, adjusting for age, gender, race, education, income, study and mood. Increasing exposure to PM2.5 was associated with lower verbal learning (β=-0.32 per 10 μg/m(3) PM2.5, 95% CI=-0.63, 0.00; p=0.05). Ambient exposure to NO2 >20 ppb tended to be associated with lower logical memory. Compared to the lowest level of exposure to ambient O3, exposure above 49 ppb was associated with lower executive function. Including carotid artery intima-media thickness, a measure of subclinical atherosclerosis, in models as a possible mediator did not attenuate effect estimates. This study provides support for cross-sectional associations between increasing levels of ambient O3, PM2.5 and NO2 and measures of domain-specific cognitive abilities. Copyright © 2014 Elsevier Inc. All rights reserved.
The challenge of precise orbit determination for STSAT-2C using extremely sparse SLR data
NASA Astrophysics Data System (ADS)
Kim, Young-Rok; Park, Eunseo; Kucharski, Daniel; Lim, Hyung-Chul; Kim, Byoungsoo
2016-03-01
The Science and Technology Satellite (STSAT)-2C is the first Korean satellite equipped with a laser retro-reflector array for satellite laser ranging (SLR). SLR is the only on-board tracking source for precise orbit determination (POD) of STSAT-2C. However, POD for the STSAT-2C is a challenging issue, as the laser measurements of the satellite are extremely sparse, largely due to the inaccurate two-line element (TLE)-based orbit predictions used by the SLR tracking stations. In this study, POD for the STSAT-2C using extremely sparse SLR data is successfully implemented, and new laser-based orbit predictions are obtained. The NASA/GSFC GEODYN II software and seven-day arcs are used for the SLR data processing of two years of normal points from March 2013 to May 2015. To compensate for the extremely sparse laser tracking, the number of estimation parameters are minimized, and only the atmospheric drag coefficients are estimated with various intervals. The POD results show that the weighted root mean square (RMS) post-fit residuals are less than 10 m, and the 3D day boundaries vary from 30 m to 3 km. The average four-day orbit overlaps are less than 20/330/20 m for the radial/along-track/cross-track components. The quality of the new laser-based prediction is verified by SLR observations, and the SLR residuals show better results than those of previous TLE-based predictions. This study demonstrates that POD for the STSAT-2C can be successfully achieved against extreme sparseness of SLR data, and the results can deliver more accurate predictions.
Richard Tran Mills; Jitendra Kumar; Forrest M. Hoffman; William W. Hargrove; Joseph P. Spruce; Steven P. Norman
2013-01-01
We investigated the use of principal components analysis (PCA) to visualize dominant patterns and identify anomalies in a multi-year land surface phenology data set (231 m à 231 m normalized difference vegetation index (NDVI) values derived from the Moderate Resolution Imaging Spectroradiometer (MODIS)) used for detecting threats to forest health in the conterminous...
Multivariate analysis of light scattering spectra of liquid dairy products
NASA Astrophysics Data System (ADS)
Khodasevich, M. A.
2010-05-01
Visible light scattering spectra from the surface layer of samples of commercial liquid dairy products are recorded with a colorimeter. The principal component method is used to analyze these spectra. Vectors representing the samples of dairy products in a multidimensional space of spectral counts are projected onto a three-dimensional subspace of principal components. The magnitudes of these projections are found to depend on the type of dairy product.
James R. Wallis
1965-01-01
Written in Fortran IV and MAP, this computer program can handle up to 120 variables, and retain 40 principal components. It can perform simultaneous regression of up to 40 criterion variables upon the varimax rotated factor weight matrix. The columns and rows of all output matrices are labeled by six-character alphanumeric names. Data input can be from punch cards or...
Dihedral angle principal component analysis of molecular dynamics simulations.
Altis, Alexandros; Nguyen, Phuong H; Hegger, Rainer; Stock, Gerhard
2007-06-28
It has recently been suggested by Mu et al. [Proteins 58, 45 (2005)] to use backbone dihedral angles instead of Cartesian coordinates in a principal component analysis of molecular dynamics simulations. Dihedral angles may be advantageous because internal coordinates naturally provide a correct separation of internal and overall motion, which was found to be essential for the construction and interpretation of the free energy landscape of a biomolecule undergoing large structural rearrangements. To account for the circular statistics of angular variables, a transformation from the space of dihedral angles {phi(n)} to the metric coordinate space {x(n)=cos phi(n),y(n)=sin phi(n)} was employed. To study the validity and the applicability of the approach, in this work the theoretical foundations underlying the dihedral angle principal component analysis (dPCA) are discussed. It is shown that the dPCA amounts to a one-to-one representation of the original angle distribution and that its principal components can readily be characterized by the corresponding conformational changes of the peptide. Furthermore, a complex version of the dPCA is introduced, in which N angular variables naturally lead to N eigenvalues and eigenvectors. Applying the methodology to the construction of the free energy landscape of decaalanine from a 300 ns molecular dynamics simulation, a critical comparison of the various methods is given.
Dihedral angle principal component analysis of molecular dynamics simulations
NASA Astrophysics Data System (ADS)
Altis, Alexandros; Nguyen, Phuong H.; Hegger, Rainer; Stock, Gerhard
2007-06-01
It has recently been suggested by Mu et al. [Proteins 58, 45 (2005)] to use backbone dihedral angles instead of Cartesian coordinates in a principal component analysis of molecular dynamics simulations. Dihedral angles may be advantageous because internal coordinates naturally provide a correct separation of internal and overall motion, which was found to be essential for the construction and interpretation of the free energy landscape of a biomolecule undergoing large structural rearrangements. To account for the circular statistics of angular variables, a transformation from the space of dihedral angles {φn} to the metric coordinate space {xn=cosφn,yn=sinφn} was employed. To study the validity and the applicability of the approach, in this work the theoretical foundations underlying the dihedral angle principal component analysis (dPCA) are discussed. It is shown that the dPCA amounts to a one-to-one representation of the original angle distribution and that its principal components can readily be characterized by the corresponding conformational changes of the peptide. Furthermore, a complex version of the dPCA is introduced, in which N angular variables naturally lead to N eigenvalues and eigenvectors. Applying the methodology to the construction of the free energy landscape of decaalanine from a 300ns molecular dynamics simulation, a critical comparison of the various methods is given.
The rate of change in declining steroid hormones: a new parameter of healthy aging in men?
Walther, Andreas; Philipp, Michel; Lozza, Niclà; Ehlert, Ulrike
2016-09-20
Research on healthy aging in men has increasingly focused on age-related hormonal changes. Testosterone (T) decline is primarily investigated, while age-related changes in other sex steroids (dehydroepiandrosterone [DHEA], estradiol [E2], progesterone [P]) are mostly neglected. An integrated hormone parameter reflecting aging processes in men has yet to be identified. 271 self-reporting healthy men between 40 and 75 provided both psychometric data and saliva samples for hormone analysis. Correlation analysis between age and sex steroids revealed negative associations for the four sex steroids (T, DHEA, E2, and P). Principal component analysis including ten salivary analytes identified a principal component mainly unifying the variance of the four sex steroid hormones. Subsequent principal component analysis including the four sex steroids extracted the principal component of declining steroid hormones (DSH). Moderation analysis of the association between age and DSH revealed significant moderation effects for psychosocial factors such as depression, chronic stress and perceived general health. In conclusion, these results provide further evidence that sex steroids decline in aging men and that the integrated hormone parameter DSH and its rate of change can be used as biomarkers for healthy aging in men. Furthermore, the negative association of age and DSH is moderated by psychosocial factors.
The rate of change in declining steroid hormones: a new parameter of healthy aging in men?
Walther, Andreas; Philipp, Michel; Lozza, Niclà; Ehlert, Ulrike
2016-01-01
Research on healthy aging in men has increasingly focused on age-related hormonal changes. Testosterone (T) decline is primarily investigated, while age-related changes in other sex steroids (dehydroepiandrosterone [DHEA], estradiol [E2], progesterone [P]) are mostly neglected. An integrated hormone parameter reflecting aging processes in men has yet to be identified. 271 self-reporting healthy men between 40 and 75 provided both psychometric data and saliva samples for hormone analysis. Correlation analysis between age and sex steroids revealed negative associations for the four sex steroids (T, DHEA, E2, and P). Principal component analysis including ten salivary analytes identified a principal component mainly unifying the variance of the four sex steroid hormones. Subsequent principal component analysis including the four sex steroids extracted the principal component of declining steroid hormones (DSH). Moderation analysis of the association between age and DSH revealed significant moderation effects for psychosocial factors such as depression, chronic stress and perceived general health. In conclusion, these results provide further evidence that sex steroids decline in aging men and that the integrated hormone parameter DSH and its rate of change can be used as biomarkers for healthy aging in men. Furthermore, the negative association of age and DSH is moderated by psychosocial factors. PMID:27589836
Fleming, Brandon J.; LaMotte, Andrew E.; Sekellick, Andrew J.
2013-01-01
Hydrogeologic regions in the fractured rock area of Maryland were classified using geographic information system tools with principal components and cluster analyses. A study area consisting of the 8-digit Hydrologic Unit Code (HUC) watersheds with rivers that flow through the fractured rock area of Maryland and bounded by the Fall Line was further subdivided into 21,431 catchments from the National Hydrography Dataset Plus. The catchments were then used as a common hydrologic unit to compile relevant climatic, topographic, and geologic variables. A principal components analysis was performed on 10 input variables, and 4 principal components that accounted for 83 percent of the variability in the original data were identified. A subsequent cluster analysis grouped the catchments based on four principal component scores into six hydrogeologic regions. Two crystalline rock hydrogeologic regions, including large parts of the Washington, D.C. and Baltimore metropolitan regions that represent over 50 percent of the fractured rock area of Maryland, are distinguished by differences in recharge, Precipitation minus Potential Evapotranspiration, sand content in soils, and groundwater contributions to streams. This classification system will provide a georeferenced digital hydrogeologic framework for future investigations of groundwater availability in the fractured rock area of Maryland.
NASA Technical Reports Server (NTRS)
Liu, Xu; Smith, William L.; Zhou, Daniel K.; Larar, Allen
2005-01-01
Modern infrared satellite sensors such as Atmospheric Infrared Sounder (AIRS), Cosmic Ray Isotope Spectrometer (CrIS), Thermal Emission Spectrometer (TES), Geosynchronous Imaging Fourier Transform Spectrometer (GIFTS) and Infrared Atmospheric Sounding Interferometer (IASI) are capable of providing high spatial and spectral resolution infrared spectra. To fully exploit the vast amount of spectral information from these instruments, super fast radiative transfer models are needed. This paper presents a novel radiative transfer model based on principal component analysis. Instead of predicting channel radiance or transmittance spectra directly, the Principal Component-based Radiative Transfer Model (PCRTM) predicts the Principal Component (PC) scores of these quantities. This prediction ability leads to significant savings in computational time. The parameterization of the PCRTM model is derived from properties of PC scores and instrument line shape functions. The PCRTM is very accurate and flexible. Due to its high speed and compressed spectral information format, it has great potential for super fast one-dimensional physical retrievals and for Numerical Weather Prediction (NWP) large volume radiance data assimilation applications. The model has been successfully developed for the National Polar-orbiting Operational Environmental Satellite System Airborne Sounder Testbed - Interferometer (NAST-I) and AIRS instruments. The PCRTM model performs monochromatic radiative transfer calculations and is able to include multiple scattering calculations to account for clouds and aerosols.
Relationship between regional population and healthcare delivery in Japan.
Niga, Takeo; Mori, Maiko; Kawahara, Kazuo
2016-01-01
In order to address regional inequality in healthcare delivery in Japan, healthcare districts were established in 1985. However, regional healthcare delivery has now become a national issue because of population migration and the aging population. In this study, the state of healthcare delivery at the district level is examined by analyzing population, the number of physicians, and the number of hospital beds. The results indicate a continuing disparity in healthcare delivery among districts. We find that the rate of change in population has a strong positive correlation with that in the number of physicians and a weak positive correlation with that in the number of hospital beds. In addition, principal component analysis is performed on three variables: the rate of change in population, the number of physicians per capita, and the number of hospital beds per capita. This analysis suggests that the two principal components contribute 90.1% of the information. The first principal component is thought to show the effect of the regulations on hospital beds. The second principal component is thought to show the capacity to recruit physicians. This study indicates that an adjustment to the regulations on hospital beds as well as physician allocation by public funds may be key to resolving the impending issue of regionally disproportionate healthcare delivery.
Performance evaluation of PCA-based spike sorting algorithms.
Adamos, Dimitrios A; Kosmidis, Efstratios K; Theophilidis, George
2008-09-01
Deciphering the electrical activity of individual neurons from multi-unit noisy recordings is critical for understanding complex neural systems. A widely used spike sorting algorithm is being evaluated for single-electrode nerve trunk recordings. The algorithm is based on principal component analysis (PCA) for spike feature extraction. In the neuroscience literature it is generally assumed that the use of the first two or most commonly three principal components is sufficient. We estimate the optimum PCA-based feature space by evaluating the algorithm's performance on simulated series of action potentials. A number of modifications are made to the open source nev2lkit software to enable systematic investigation of the parameter space. We introduce a new metric to define clustering error considering over-clustering more favorable than under-clustering as proposed by experimentalists for our data. Both the program patch and the metric are available online. Correlated and white Gaussian noise processes are superimposed to account for biological and artificial jitter in the recordings. We report that the employment of more than three principal components is in general beneficial for all noise cases considered. Finally, we apply our results to experimental data and verify that the sorting process with four principal components is in agreement with a panel of electrophysiology experts.
Fluorescence fingerprint as an instrumental assessment of the sensory quality of tomato juices.
Trivittayasil, Vipavee; Tsuta, Mizuki; Imamura, Yoshinori; Sato, Tsuneo; Otagiri, Yuji; Obata, Akio; Otomo, Hiroe; Kokawa, Mito; Sugiyama, Junichi; Fujita, Kaori; Yoshimura, Masatoshi
2016-03-15
Sensory analysis is an important standard for evaluating food products. However, as trained panelists and time are required for the process, the potential of using fluorescence fingerprint as a rapid instrumental method to approximate sensory characteristics was explored in this study. Thirty-five out of 44 descriptive sensory attributes were found to show a significant difference between samples (analysis of variance test). Principal component analysis revealed that principal component 1 could capture 73.84 and 75.28% variance for aroma category and combined flavor and taste category respectively. Fluorescence fingerprints of tomato juices consisted of two visible peaks at excitation/emission wavelengths of 290/350 and 315/425 nm and a long narrow emission peak at 680 nm. The 680 nm peak was only clearly observed in juices obtained from tomatoes cultivated to be eaten raw. The ability to predict overall sensory profiles was investigated by using principal component 1 as a regression target. Fluorescence fingerprint could predict principal component 1 of both aroma and combined flavor and taste with a coefficient of determination above 0.8. The results obtained in this study indicate the potential of using fluorescence fingerprint as an instrumental method for assessing sensory characteristics of tomato juices. © 2015 Society of Chemical Industry.
Zhang, Xiaolei; Liu, Fei; He, Yong; Li, Xiaoli
2012-01-01
Hyperspectral imaging in the visible and near infrared (VIS-NIR) region was used to develop a novel method for discriminating different varieties of commodity maize seeds. Firstly, hyperspectral images of 330 samples of six varieties of maize seeds were acquired using a hyperspectral imaging system in the 380–1,030 nm wavelength range. Secondly, principal component analysis (PCA) and kernel principal component analysis (KPCA) were used to explore the internal structure of the spectral data. Thirdly, three optimal wavelengths (523, 579 and 863 nm) were selected by implementing PCA directly on each image. Then four textural variables including contrast, homogeneity, energy and correlation were extracted from gray level co-occurrence matrix (GLCM) of each monochromatic image based on the optimal wavelengths. Finally, several models for maize seeds identification were established by least squares-support vector machine (LS-SVM) and back propagation neural network (BPNN) using four different combinations of principal components (PCs), kernel principal components (KPCs) and textural features as input variables, respectively. The recognition accuracy achieved in the PCA-GLCM-LS-SVM model (98.89%) was the most satisfactory one. We conclude that hyperspectral imaging combined with texture analysis can be implemented for fast classification of different varieties of maize seeds. PMID:23235456
Tang, Yunwei; Jing, Linhai; Li, Hui; Liu, Qingjie; Yan, Qi; Li, Xiuxia
2016-01-01
This study explores the ability of WorldView-2 (WV-2) imagery for bamboo mapping in a mountainous region in Sichuan Province, China. A large area of this place is covered by shadows in the image, and only a few sampled points derived were useful. In order to identify bamboos based on sparse training data, the sample size was expanded according to the reflectance of multispectral bands selected using the principal component analysis (PCA). Then, class separability based on the training data was calculated using a feature space optimization method to select the features for classification. Four regular object-based classification methods were applied based on both sets of training data. The results show that the k-nearest neighbor (k-NN) method produced the greatest accuracy. A geostatistically-weighted k-NN classifier, accounting for the spatial correlation between classes, was then applied to further increase the accuracy. It achieved 82.65% and 93.10% of the producer’s and user’s accuracies respectively for the bamboo class. The canopy densities were estimated to explain the result. This study demonstrates that the WV-2 image can be used to identify small patches of understory bamboos given limited known samples, and the resulting bamboo distribution facilitates the assessments of the habitats of giant pandas. PMID:27879661
Dixit, Anshuman; Verkhivker, Gennady M.
2012-01-01
Deciphering functional mechanisms of the Hsp90 chaperone machinery is an important objective in cancer biology aiming to facilitate discovery of targeted anti-cancer therapies. Despite significant advances in understanding structure and function of molecular chaperones, organizing molecular principles that control the relationship between conformational diversity and functional mechanisms of the Hsp90 activity lack a sufficient quantitative characterization. We combined molecular dynamics simulations, principal component analysis, the energy landscape model and structure-functional analysis of Hsp90 regulatory interactions to systematically investigate functional dynamics of the molecular chaperone. This approach has identified a network of conserved regions common to the Hsp90 chaperones that could play a universal role in coordinating functional dynamics, principal collective motions and allosteric signaling of Hsp90. We have found that these functional motifs may be utilized by the molecular chaperone machinery to act collectively as central regulators of Hsp90 dynamics and activity, including the inter-domain communications, control of ATP hydrolysis, and protein client binding. These findings have provided support to a long-standing assertion that allosteric regulation and catalysis may have emerged via common evolutionary routes. The interaction networks regulating functional motions of Hsp90 may be determined by the inherent structural architecture of the molecular chaperone. At the same time, the thermodynamics-based “conformational selection” of functional states is likely to be activated based on the nature of the binding partner. This mechanistic model of Hsp90 dynamics and function is consistent with the notion that allosteric networks orchestrating cooperative protein motions can be formed by evolutionary conserved and sparsely connected residue clusters. Hence, allosteric signaling through a small network of distantly connected residue clusters may be a rather general functional requirement encoded across molecular chaperones. The obtained insights may be useful in guiding discovery of allosteric Hsp90 inhibitors targeting protein interfaces with co-chaperones and protein binding clients. PMID:22624053
ERIC Educational Resources Information Center
Watson, Pat; And Others
Survey responses from over half of Oklahoma City's 2,500 teachers indicated their views of the effectiveness and leadership of the city's 94 school principals. The survey's 82 items were selected from ideas suggested in the principal effectiveness literature and from the leadership component of Oklahoma City's prinipal evaluation forms. The…
ERIC Educational Resources Information Center
Klinker, JoAnn Franklin; Hackmann, Donald G.
High school principals confront ethical dilemmas daily. This report describes a study that examined how MetLife/NASSP secondary principals of the year made ethical decisions conforming to three dispositions from Standard 5 of the ISLLC standards and whether they could identify processes used to reach those decisions through Rest's Four Component…
The Middle Management Paradox of the Urban High School Assistant Principal: Making It Happen
ERIC Educational Resources Information Center
Jubilee, Sabriya Kaleen
2013-01-01
Scholars of transformational leadership literature assert that school-based management teams are a vital component in transforming schools. Many of these works focus heavily on the roles of principals and teachers, ignoring the contribution of Assistant Principals (APs). More attention is now being given to the unique role that Assistant…
E-Mentoring for New Principals: A Case Study of a Mentoring Program
ERIC Educational Resources Information Center
Russo, Erin D.
2013-01-01
This descriptive case study includes both new principals and their mentor principals engaged in e-mentoring activities. This study examines the components of a school district's mentoring program in order to make sense of e-mentoring technology. The literature review highlights mentoring practices in education, and also draws upon e-mentoring…
Salvatore, Stefania; Røislien, Jo; Baz-Lomba, Jose A; Bramness, Jørgen G
2017-03-01
Wastewater-based epidemiology is an alternative method for estimating the collective drug use in a community. We applied functional data analysis, a statistical framework developed for analysing curve data, to investigate weekly temporal patterns in wastewater measurements of three prescription drugs with known abuse potential: methadone, oxazepam and methylphenidate, comparing them to positive and negative control drugs. Sewage samples were collected in February 2014 from a wastewater treatment plant in Oslo, Norway. The weekly pattern of each drug was extracted by fitting of generalized additive models, using trigonometric functions to model the cyclic behaviour. From the weekly component, the main temporal features were then extracted using functional principal component analysis. Results are presented through the functional principal components (FPCs) and corresponding FPC scores. Clinically, the most important weekly feature of the wastewater-based epidemiology data was the second FPC, representing the difference between average midweek level and a peak during the weekend, representing possible recreational use of a drug in the weekend. Estimated scores on this FPC indicated recreational use of methylphenidate, with a high weekend peak, but not for methadone and oxazepam. The functional principal component analysis uncovered clinically important temporal features of the weekly patterns of the use of prescription drugs detected from wastewater analysis. This may be used as a post-marketing surveillance method to monitor prescription drugs with abuse potential. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.
Espeland, Mark A; Bray, George A; Neiberg, Rebecca; Rejeski, W Jack; Knowler, William C; Lang, Wei; Cheskin, Lawrence J; Williamson, Don; Lewis, C Beth; Wing, Rena
2009-10-01
To demonstrate how principal components analysis can be used to describe patterns of weight changes in response to an intensive lifestyle intervention. Principal components analysis was applied to monthly percent weight changes measured on 2,485 individuals enrolled in the lifestyle arm of the Action for Health in Diabetes (Look AHEAD) clinical trial. These individuals were 45 to 75 years of age, with type 2 diabetes and body mass indices greater than 25 kg/m(2). Associations between baseline characteristics and weight loss patterns were described using analyses of variance. Three components collectively accounted for 97.0% of total intrasubject variance: a gradually decelerating weight loss (88.8%), early versus late weight loss (6.6%), and a mid-year trough (1.6%). In agreement with previous reports, each of the baseline characteristics we examined had statistically significant relationships with weight loss patterns. As examples, males tended to have a steeper trajectory of percent weight loss and to lose weight more quickly than women. Individuals with higher hemoglobin A(1c) (glycosylated hemoglobin; HbA(1c)) tended to have a flatter trajectory of percent weight loss and to have mid-year troughs in weight loss compared to those with lower HbA(1c). Principal components analysis provided a coherent description of characteristic patterns of weight changes and is a useful vehicle for identifying their correlates and potentially for predicting weight control outcomes.
Research on distributed heterogeneous data PCA algorithm based on cloud platform
NASA Astrophysics Data System (ADS)
Zhang, Jin; Huang, Gang
2018-05-01
Principal component analysis (PCA) of heterogeneous data sets can solve the problem that centralized data scalability is limited. In order to reduce the generation of intermediate data and error components of distributed heterogeneous data sets, a principal component analysis algorithm based on heterogeneous data sets under cloud platform is proposed. The algorithm performs eigenvalue processing by using Householder tridiagonalization and QR factorization to calculate the error component of the heterogeneous database associated with the public key to obtain the intermediate data set and the lost information. Experiments on distributed DBM heterogeneous datasets show that the model method has the feasibility and reliability in terms of execution time and accuracy.
Sullivan, Karen A; Lurie, Janine K
2017-01-01
The study examined the component structure of the Neurobehavioral Symptom Inventory (NSI) under five different models. The evaluated models comprised the full NSI (NSI-22) and the NSI-20 (NSI minus two orphan items). A civilian nonclinical sample was used. The 575 volunteers were predominantly university students who screened negative for mild TBI. The study design was cross-sectional, with questionnaires administered online. The main measure was the Neurobehavioral Symptom Inventory. Subscale, total and embedded validity scores were derived (the Validity-10, the LOW6, and the NIM5). In both models, the principal components analysis yielded two intercorrelated components (psychological and somatic/sensory) with acceptable internal consistency (alphas > 0.80). In this civilian nonclinical sample, the NSI had two underlying components. These components represent psychological and somatic/sensory neurobehavioral symptoms.
NASA Astrophysics Data System (ADS)
Kim, Young-Pil; Hong, Mi-Young; Shon, Hyun Kyong; Chegal, Won; Cho, Hyun Mo; Moon, Dae Won; Kim, Hak-Sung; Lee, Tae Geol
2008-12-01
Interaction between streptavidin and biotin on poly(amidoamine) (PAMAM) dendrimer-activated surfaces and on self-assembled monolayers (SAMs) was quantitatively studied by using time-of-flight secondary ion mass spectrometry (ToF-SIMS). The surface protein density was systematically varied as a function of protein concentration and independently quantified using the ellipsometry technique. Principal component analysis (PCA) and principal component regression (PCR) were used to identify a correlation between the intensities of the secondary ion peaks and the surface protein densities. From the ToF-SIMS and ellipsometry results, a good linear correlation of protein density was found. Our study shows that surface protein densities are higher on dendrimer-activated surfaces than on SAMs surfaces due to the spherical property of the dendrimer, and that these surface protein densities can be easily quantified with high sensitivity in a label-free manner by ToF-SIMS.
Exploring patterns enriched in a dataset with contrastive principal component analysis.
Abid, Abubakar; Zhang, Martin J; Bagaria, Vivek K; Zou, James
2018-05-30
Visualization and exploration of high-dimensional data is a ubiquitous challenge across disciplines. Widely used techniques such as principal component analysis (PCA) aim to identify dominant trends in one dataset. However, in many settings we have datasets collected under different conditions, e.g., a treatment and a control experiment, and we are interested in visualizing and exploring patterns that are specific to one dataset. This paper proposes a method, contrastive principal component analysis (cPCA), which identifies low-dimensional structures that are enriched in a dataset relative to comparison data. In a wide variety of experiments, we demonstrate that cPCA with a background dataset enables us to visualize dataset-specific patterns missed by PCA and other standard methods. We further provide a geometric interpretation of cPCA and strong mathematical guarantees. An implementation of cPCA is publicly available, and can be used for exploratory data analysis in many applications where PCA is currently used.
Variability search in M 31 using principal component analysis and the Hubble Source Catalogue
NASA Astrophysics Data System (ADS)
Moretti, M. I.; Hatzidimitriou, D.; Karampelas, A.; Sokolovsky, K. V.; Bonanos, A. Z.; Gavras, P.; Yang, M.
2018-06-01
Principal component analysis (PCA) is being extensively used in Astronomy but not yet exhaustively exploited for variability search. The aim of this work is to investigate the effectiveness of using the PCA as a method to search for variable stars in large photometric data sets. We apply PCA to variability indices computed for light curves of 18 152 stars in three fields in M 31 extracted from the Hubble Source Catalogue. The projection of the data into the principal components is used as a stellar variability detection and classification tool, capable of distinguishing between RR Lyrae stars, long-period variables (LPVs) and non-variables. This projection recovered more than 90 per cent of the known variables and revealed 38 previously unknown variable stars (about 30 per cent more), all LPVs except for one object of uncertain variability type. We conclude that this methodology can indeed successfully identify candidate variable stars.
A Genealogical Interpretation of Principal Components Analysis
McVean, Gil
2009-01-01
Principal components analysis, PCA, is a statistical method commonly used in population genetics to identify structure in the distribution of genetic variation across geographical location and ethnic background. However, while the method is often used to inform about historical demographic processes, little is known about the relationship between fundamental demographic parameters and the projection of samples onto the primary axes. Here I show that for SNP data the projection of samples onto the principal components can be obtained directly from considering the average coalescent times between pairs of haploid genomes. The result provides a framework for interpreting PCA projections in terms of underlying processes, including migration, geographical isolation, and admixture. I also demonstrate a link between PCA and Wright's fst and show that SNP ascertainment has a largely simple and predictable effect on the projection of samples. Using examples from human genetics, I discuss the application of these results to empirical data and the implications for inference. PMID:19834557
Spatial and temporal variability of hyperspectral signatures of terrain
NASA Astrophysics Data System (ADS)
Jones, K. F.; Perovich, D. K.; Koenig, G. G.
2008-04-01
Electromagnetic signatures of terrain exhibit significant spatial heterogeneity on a range of scales as well as considerable temporal variability. A statistical characterization of the spatial heterogeneity and spatial scaling algorithms of terrain electromagnetic signatures are required to extrapolate measurements to larger scales. Basic terrain elements including bare soil, grass, deciduous, and coniferous trees were studied in a quasi-laboratory setting using instrumented test sites in Hanover, NH and Yuma, AZ. Observations were made using a visible and near infrared spectroradiometer (350 - 2500 nm) and hyperspectral camera (400 - 1100 nm). Results are reported illustrating: i) several difference scenes; ii) a terrain scene time series sampled over an annual cycle; and iii) the detection of artifacts in scenes. A principal component analysis indicated that the first three principal components typically explained between 90 and 99% of the variance of the 30 to 40-channel hyperspectral images. Higher order principal components of hyperspectral images are useful for detecting artifacts in scenes.
A Spectral Algorithm for Envelope Reduction of Sparse Matrices
NASA Technical Reports Server (NTRS)
Barnard, Stephen T.; Pothen, Alex; Simon, Horst D.
1993-01-01
The problem of reordering a sparse symmetric matrix to reduce its envelope size is considered. A new spectral algorithm for computing an envelope-reducing reordering is obtained by associating a Laplacian matrix with the given matrix and then sorting the components of a specified eigenvector of the Laplacian. This Laplacian eigenvector solves a continuous relaxation of a discrete problem related to envelope minimization called the minimum 2-sum problem. The permutation vector computed by the spectral algorithm is a closest permutation vector to the specified Laplacian eigenvector. Numerical results show that the new reordering algorithm usually computes smaller envelope sizes than those obtained from the current standard algorithms such as Gibbs-Poole-Stockmeyer (GPS) or SPARSPAK reverse Cuthill-McKee (RCM), in some cases reducing the envelope by more than a factor of two.
2011-01-01
Background Hemorrhagic fever with renal syndrome (HFRS) is an important infectious disease caused by different species of hantaviruses. As a rodent-borne disease with a seasonal distribution, external environmental factors including climate factors may play a significant role in its transmission. The city of Shenyang is one of the most seriously endemic areas for HFRS. Here, we characterized the dynamic temporal trend of HFRS, and identified climate-related risk factors and their roles in HFRS transmission in Shenyang, China. Methods The annual and monthly cumulative numbers of HFRS cases from 2004 to 2009 were calculated and plotted to show the annual and seasonal fluctuation in Shenyang. Cross-correlation and autocorrelation analyses were performed to detect the lagged effect of climate factors on HFRS transmission and the autocorrelation of monthly HFRS cases. Principal component analysis was constructed by using climate data from 2004 to 2009 to extract principal components of climate factors to reduce co-linearity. The extracted principal components and autocorrelation terms of monthly HFRS cases were added into a multiple regression model called principal components regression model (PCR) to quantify the relationship between climate factors, autocorrelation terms and transmission of HFRS. The PCR model was compared to a general multiple regression model conducted only with climate factors as independent variables. Results A distinctly declining temporal trend of annual HFRS incidence was identified. HFRS cases were reported every month, and the two peak periods occurred in spring (March to May) and winter (November to January), during which, nearly 75% of the HFRS cases were reported. Three principal components were extracted with a cumulative contribution rate of 86.06%. Component 1 represented MinRH0, MT1, RH1, and MWV1; component 2 represented RH2, MaxT3, and MAP3; and component 3 represented MaxT2, MAP2, and MWV2. The PCR model was composed of three principal components and two autocorrelation terms. The association between HFRS epidemics and climate factors was better explained in the PCR model (F = 446.452, P < 0.001, adjusted R2 = 0.75) than in the general multiple regression model (F = 223.670, P < 0.000, adjusted R2 = 0.51). Conclusion The temporal distribution of HFRS in Shenyang varied in different years with a distinctly declining trend. The monthly trends of HFRS were significantly associated with local temperature, relative humidity, precipitation, air pressure, and wind velocity of the different previous months. The model conducted in this study will make HFRS surveillance simpler and the control of HFRS more targeted in Shenyang. PMID:22133347
Preferences for Key Ethical Principles that Guide Business School Students
ERIC Educational Resources Information Center
Guyette, Roger; Piotrowski, Chris
2010-01-01
Business ethics is presently a major component of the business school curriculum. Although there has been much attention focused on the impact of such coursework on instilling ethical decision-making (Nguyen et al., 2008), there is sparse research on how business students view the major ethical principles that serve as the foundation of business…
Underdetermined blind separation of three-way fluorescence spectra of PAHs in water
NASA Astrophysics Data System (ADS)
Yang, Ruifang; Zhao, Nanjing; Xiao, Xue; Zhu, Wei; Chen, Yunan; Yin, Gaofang; Liu, Jianguo; Liu, Wenqing
2018-06-01
In this work, underdetermined blind decomposition method is developed to recognize individual components from the three-way fluorescent spectra of their mixtures by using sparse component analysis (SCA). The mixing matrix is estimated from the mixtures using fuzzy data clustering algorithm together with the scatters corresponding to local energy maximum value in the time-frequency domain, and the spectra of object components are recovered by pseudo inverse technique. As an example, using this method three and four pure components spectra can be blindly extracted from two samples of their mixture, with similarities between resolved and reference spectra all above 0.80. This work opens a new and effective path to realize monitoring PAHs in water by three-way fluorescence spectroscopy technique.
A multifactor approach to forecasting Romanian gross domestic product (GDP) in the short run.
Armeanu, Daniel; Andrei, Jean Vasile; Lache, Leonard; Panait, Mirela
2017-01-01
The purpose of this paper is to investigate the application of a generalized dynamic factor model (GDFM) based on dynamic principal components analysis to forecasting short-term economic growth in Romania. We have used a generalized principal components approach to estimate a dynamic model based on a dataset comprising 86 economic and non-economic variables that are linked to economic output. The model exploits the dynamic correlations between these variables and uses three common components that account for roughly 72% of the information contained in the original space. We show that it is possible to generate reliable forecasts of quarterly real gross domestic product (GDP) using just the common components while also assessing the contribution of the individual variables to the dynamics of real GDP. In order to assess the relative performance of the GDFM to standard models based on principal components analysis, we have also estimated two Stock-Watson (SW) models that were used to perform the same out-of-sample forecasts as the GDFM. The results indicate significantly better performance of the GDFM compared with the competing SW models, which empirically confirms our expectations that the GDFM produces more accurate forecasts when dealing with large datasets.
A multifactor approach to forecasting Romanian gross domestic product (GDP) in the short run
Armeanu, Daniel; Lache, Leonard; Panait, Mirela
2017-01-01
The purpose of this paper is to investigate the application of a generalized dynamic factor model (GDFM) based on dynamic principal components analysis to forecasting short-term economic growth in Romania. We have used a generalized principal components approach to estimate a dynamic model based on a dataset comprising 86 economic and non-economic variables that are linked to economic output. The model exploits the dynamic correlations between these variables and uses three common components that account for roughly 72% of the information contained in the original space. We show that it is possible to generate reliable forecasts of quarterly real gross domestic product (GDP) using just the common components while also assessing the contribution of the individual variables to the dynamics of real GDP. In order to assess the relative performance of the GDFM to standard models based on principal components analysis, we have also estimated two Stock-Watson (SW) models that were used to perform the same out-of-sample forecasts as the GDFM. The results indicate significantly better performance of the GDFM compared with the competing SW models, which empirically confirms our expectations that the GDFM produces more accurate forecasts when dealing with large datasets. PMID:28742100
Xiao, Hong; Tian, Huai-Yu; Gao, Li-Dong; Liu, Hai-Ning; Duan, Liang-Song; Basta, Nicole; Cazelles, Bernard; Li, Xiu-Jun; Lin, Xiao-Ling; Wu, Hong-Wei; Chen, Bi-Yun; Yang, Hui-Suo; Xu, Bing; Grenfell, Bryan
2014-01-01
China has the highest incidence of hemorrhagic fever with renal syndrome (HFRS) worldwide. Reported cases account for 90% of the total number of global cases. By 2010, approximately 1.4 million HFRS cases had been reported in China. This study aimed to explore the effect of the rodent reservoir, and natural and socioeconomic variables, on the transmission pattern of HFRS. Data on monthly HFRS cases were collected from 2006 to 2010. Dynamic rodent monitoring data, normalized difference vegetation index (NDVI) data, climate data, and socioeconomic data were also obtained. Principal component analysis was performed, and the time-lag relationships between the extracted principal components and HFRS cases were analyzed. Polynomial distributed lag (PDL) models were used to fit and forecast HFRS transmission. Four principal components were extracted. Component 1 (F1) represented rodent density, the NDVI, and monthly average temperature. Component 2 (F2) represented monthly average rainfall and monthly average relative humidity. Component 3 (F3) represented rodent density and monthly average relative humidity. The last component (F4) represented gross domestic product and the urbanization rate. F2, F3, and F4 were significantly correlated, with the monthly HFRS incidence with lags of 4 months (r = -0.289, P<0.05), 5 months (r = -0.523, P<0.001), and 0 months (r = -0.376, P<0.01), respectively. F1 was correlated with the monthly HFRS incidence, with a lag of 4 months (r = 0.179, P = 0.192). Multivariate PDL modeling revealed that the four principal components were significantly associated with the transmission of HFRS. The monthly trend in HFRS cases was significantly associated with the local rodent reservoir, climatic factors, the NDVI, and socioeconomic conditions present during the previous months. The findings of this study may facilitate the development of early warning systems for the control and prevention of HFRS and similar diseases.
Multivariate classification of small order watersheds in the Quabbin Reservoir Basin, Massachusetts
Lent, R.M.; Waldron, M.C.; Rader, J.C.
1998-01-01
A multivariate approach was used to analyze hydrologic, geologic, geographic, and water-chemistry data from small order watersheds in the Quabbin Reservoir Basin in central Massachusetts. Eighty three small order watersheds were delineated and landscape attributes defining hydrologic, geologic, and geographic features of the watersheds were compiled from geographic information system data layers. Principal components analysis was used to evaluate 11 chemical constituents collected bi-weekly for 1 year at 15 surface-water stations in order to subdivide the basin into subbasins comprised of watersheds with similar water quality characteristics. Three principal components accounted for about 90 percent of the variance in water chemistry data. The principal components were defined as a biogeochemical variable related to wetland density, an acid-neutralization variable, and a road-salt variable related to density of primary roads. Three subbasins were identified. Analysis of variance and multiple comparisons of means were used to identify significant differences in stream water chemistry and landscape attributes among subbasins. All stream water constituents were significantly different among subbasins. Multiple regression techniques were used to relate stream water chemistry to landscape attributes. Important differences in landscape attributes were related to wetlands, slope, and soil type.A multivariate approach was used to analyze hydrologic, geologic, geographic, and water-chemistry data from small order watersheds in the Quabbin Reservoir Basin in central Massachusetts. Eighty three small order watersheds were delineated and landscape attributes defining hydrologic, geologic, and geographic features of the watersheds were compiled from geographic information system data layers. Principal components analysis was used to evaluate 11 chemical constituents collected bi-weekly for 1 year at 15 surface-water stations in order to subdivide the basin into subbasins comprised of watersheds with similar water quality characteristics. Three principal components accounted for about 90 percent of the variance in water chemistry data. The principal components were defined as a biogeochemical variable related to wetland density, an acid-neutralization variable, and a road-salt variable related to density of primary roads. Three subbasins were identified. Analysis of variance and multiple comparisons of means were used to identify significant differences in stream water chemistry and landscape attributes among subbasins. All stream water constituents were significantly different among subbasins. Multiple regression techniques were used to relate stream water chemistry to landscape attributes. Important differences in landscape attributes were related to wetlands, slope, and soil type.
Influential Observations in Principal Factor Analysis.
ERIC Educational Resources Information Center
Tanaka, Yutaka; Odaka, Yoshimasa
1989-01-01
A method is proposed for detecting influential observations in iterative principal factor analysis. Theoretical influence functions are derived for two components of the common variance decomposition. The major mathematical tool is the influence function derived by Tanaka (1988). (SLD)
ERIC Educational Resources Information Center
Steinley, Douglas; Brusco, Michael J.; Henson, Robert
2012-01-01
A measure of "clusterability" serves as the basis of a new methodology designed to preserve cluster structure in a reduced dimensional space. Similar to principal component analysis, which finds the direction of maximal variance in multivariate space, principal cluster axes find the direction of maximum clusterability in multivariate space.…
ERIC Educational Resources Information Center
Yan, Zi; Sin, Kuen-fung
2015-01-01
This study aimed at providing explanation and prediction of principals' inclusive education intentions and practices under the framework of the Theory of Planned Behaviour (TPB). A sample of 209 principals from Hong Kong schools was surveyed using five scales that were developed to assess the five components of TPB: attitude, subjective norm,…
Record of C4 Photosynthesis Through the Late Neogene and Pleistocene
NASA Astrophysics Data System (ADS)
Cerling, T. E.
2016-12-01
C4 photosynthesis is an adaptation to the low atmospheric carbon dioxide concentrations experienced in the Neogene; it is found principally in tropical to sub-tropical/temperate regions where temperatures are high in the growing season. Although C4 photosynthesis makes up about 50% of Net Primary Productivity in tropical regions, its macroscopic fossil record is extremely sparse. Therefore, inferences to its significance in local ecosystems are based primarily on stable isotopes, with phytoliths become more important as phytolith morphology becomes better associated with plant structure and classification. Stable isotopes have been the principal recorder for understanding the history of C4 photosynthesis; however, different materials record different aspects of the C4 contribution to ecosystem structure and thus are telling different parts of the same story. With the fossil record so poorly known, we often assume similar ecosystem structures and functions as we observe in modern analogues. It is likely that large evolutionary changes have taken place within C4 plants as they went from < 1% tropical NPP to > 50% tropical NPP in the late Neogene.
Fransen, Katrien; Vanbeselaere, Norbert; De Cuyper, Bert; Vande Broek, Gert; Boen, Filip
2014-01-01
Although coaches and players recognise the importance of leaders within the team, research on athlete leadership is sparse. The present study expands knowledge of athlete leadership by extending the current leadership classification and exploring the importance of the team captain as formal leader of the team. An online survey was completed by 4,451 participants (31% females and 69% males) within nine different team sports in Flanders (Belgium). Players (N = 3,193) and coaches (N = 1,258) participated on all different levels in their sports. Results revealed that the proposed additional role of motivational leader was perceived as clearly distinct from the already established roles (task, social and external leader). Furthermore, almost half of the participants (44%) did not perceive their captain as the principal leader on any of the four roles. These findings underline the fact that the leadership qualities attributed to the captain as the team's formal leader are overrated. It can be concluded that leadership is spread throughout the team; informal leaders rather than the captain take the lead, both on and off the field.
Igawa, Satomi; Kishibe, Mari; Honma, Masaru; Murakami, Masamoto; Mizuno, Yuki; Suga, Yasushi; Seishima, Mariko; Ohguchi, Yuka; Akiyama, Masashi; Hirose, Kenji; Ishida-Yamamoto, Akemi; Iizuka, Hajime
2013-10-01
Atopic dermatitis (AD), Netherton syndrome (NS) and peeling skin syndrome type B (PSS) may show some clinical phenotypic overlap. Corneodesmosomes are crucial for maintaining stratum corneum integrity and the components' localization can be visualized by immunostaining tape-stripped corneocytes. In normal skin, they are detected at the cell periphery. To determine whether AD, NS, PSS and ichthyosis vulgaris (IV) have differences in the corneodesmosomal components' distribution and corneocytes surface areas. Corneocytes were tape-stripped from a control group (n=12) and a disease group (37 AD cases, 3 IV cases, 4 NS cases, and 3 PSS cases), and analyzed with immunofluorescent microscopy. The distribution patterns of corneodesmosomal components: desmoglein 1, corneodesmosin, and desmocollin 1 were classified into four types: peripheral, sparse diffuse, dense diffuse and partial diffuse. Corneocyte surface areas were also measured. The corneodesmosome staining patterns were abnormal in the disease group. Other than in the 3 PSS cases, all three components showed similar patterns in each category. In lesional AD skin, the dense diffuse pattern was prominent. A high rate of the partial diffuse pattern, loss of linear cell-cell contacts, and irregular stripping manners were unique to NS. Only in PSS was corneodesmosin staining virtually absent. The corneocyte surface areas correlated significantly with the rate of combined sparse and dense diffuse patterns of desmoglein 1. This method may be used to assess abnormally differentiated corneocytes in AD and other diseases tested. In PSS samples, tape stripping analysis may serve as a non-invasive diagnostic test. Copyright © 2013 Japanese Society for Investigative Dermatology. Published by Elsevier Ireland Ltd. All rights reserved.
The risk of misclassifying subjects within principal component based asset index
2014-01-01
The asset index is often used as a measure of socioeconomic status in empirical research as an explanatory variable or to control confounding. Principal component analysis (PCA) is frequently used to create the asset index. We conducted a simulation study to explore how accurately the principal component based asset index reflects the study subjects’ actual poverty level, when the actual poverty level is generated by a simple factor analytic model. In the simulation study using the PC-based asset index, only 1% to 4% of subjects preserved their real position in a quintile scale of assets; between 44% to 82% of subjects were misclassified into the wrong asset quintile. If the PC-based asset index explained less than 30% of the total variance in the component variables, then we consistently observed more than 50% misclassification across quintiles of the index. The frequency of misclassification suggests that the PC-based asset index may not provide a valid measure of poverty level and should be used cautiously as a measure of socioeconomic status. PMID:24987446
Machine learning of frustrated classical spin models. I. Principal component analysis
NASA Astrophysics Data System (ADS)
Wang, Ce; Zhai, Hui
2017-10-01
This work aims at determining whether artificial intelligence can recognize a phase transition without prior human knowledge. If this were successful, it could be applied to, for instance, analyzing data from the quantum simulation of unsolved physical models. Toward this goal, we first need to apply the machine learning algorithm to well-understood models and see whether the outputs are consistent with our prior knowledge, which serves as the benchmark for this approach. In this work, we feed the computer data generated by the classical Monte Carlo simulation for the X Y model in frustrated triangular and union jack lattices, which has two order parameters and exhibits two phase transitions. We show that the outputs of the principal component analysis agree very well with our understanding of different orders in different phases, and the temperature dependences of the major components detect the nature and the locations of the phase transitions. Our work offers promise for using machine learning techniques to study sophisticated statistical models, and our results can be further improved by using principal component analysis with kernel tricks and the neural network method.
Dong, Fengxia; Mitchell, Paul D; Colquhoun, Jed
2015-01-01
Measuring farm sustainability performance is a crucial component for improving agricultural sustainability. While extensive assessments and indicators exist that reflect the different facets of agricultural sustainability, because of the relatively large number of measures and interactions among them, a composite indicator that integrates and aggregates over all variables is particularly useful. This paper describes and empirically evaluates a method for constructing a composite sustainability indicator that individually scores and ranks farm sustainability performance. The method first uses non-negative polychoric principal component analysis to reduce the number of variables, to remove correlation among variables and to transform categorical variables to continuous variables. Next the method applies common-weight data envelope analysis to these principal components to individually score each farm. The method solves weights endogenously and allows identifying important practices in sustainability evaluation. An empirical application to Wisconsin cranberry farms finds heterogeneity in sustainability practice adoption, implying that some farms could adopt relevant practices to improve the overall sustainability performance of the industry. Copyright © 2014 Elsevier Ltd. All rights reserved.
Guo, Zhiqiang; Wang, Huaiqing; Yang, Jie; Miller, David J
2015-01-01
In this paper, we propose and implement a hybrid model combining two-directional two-dimensional principal component analysis ((2D)2PCA) and a Radial Basis Function Neural Network (RBFNN) to forecast stock market behavior. First, 36 stock market technical variables are selected as the input features, and a sliding window is used to obtain the input data of the model. Next, (2D)2PCA is utilized to reduce the dimension of the data and extract its intrinsic features. Finally, an RBFNN accepts the data processed by (2D)2PCA to forecast the next day's stock price or movement. The proposed model is used on the Shanghai stock market index, and the experiments show that the model achieves a good level of fitness. The proposed model is then compared with one that uses the traditional dimension reduction method principal component analysis (PCA) and independent component analysis (ICA). The empirical results show that the proposed model outperforms the PCA-based model, as well as alternative models based on ICA and on the multilayer perceptron.
Guo, Zhiqiang; Wang, Huaiqing; Yang, Jie; Miller, David J.
2015-01-01
In this paper, we propose and implement a hybrid model combining two-directional two-dimensional principal component analysis ((2D)2PCA) and a Radial Basis Function Neural Network (RBFNN) to forecast stock market behavior. First, 36 stock market technical variables are selected as the input features, and a sliding window is used to obtain the input data of the model. Next, (2D)2PCA is utilized to reduce the dimension of the data and extract its intrinsic features. Finally, an RBFNN accepts the data processed by (2D)2PCA to forecast the next day's stock price or movement. The proposed model is used on the Shanghai stock market index, and the experiments show that the model achieves a good level of fitness. The proposed model is then compared with one that uses the traditional dimension reduction method principal component analysis (PCA) and independent component analysis (ICA). The empirical results show that the proposed model outperforms the PCA-based model, as well as alternative models based on ICA and on the multilayer perceptron. PMID:25849483
Comparison of AIS Versus TMS Data Collected over the Virginia Piedmont
NASA Technical Reports Server (NTRS)
Bell, R.; Evans, C. S.
1985-01-01
The Airborne Imaging Spectrometer (AIS, NS001 Thematic Mapper Simlulator (TMS), and Zeiss camera collected remotely sensed data simultaneously on October 27, 1983, at an altitude of 6860 meters (22,500 feet). AIS data were collected in 32 channels covering 1200 to 1500 nm. A simple atmospheric correction was applied to the AIS data, after which spectra for four cover types were plotted. Spectra for these ground cover classes showed a telescoping effect for the wavelength endpoints. Principal components were extracted from the shortwave region of the AIS (1200 to 1280 nm), full spectrum AIS (1200 to 1500 nm) and TMS (450 to 12,500 nm) to create three separate three-component color image composites. A comparison of the TMS band 5 (1000 to 1300 nm) to the six principal components from the shortwave AIS region (1200 to 1280 nm) showed improved visual discrimination of ground cover types. Contrast of color image composites created from principal components showed the AIS composites to exhibit a clearer demarcation between certain ground cover types but subtle differences within other regions of the imagery were not as readily seen.
Research on Air Quality Evaluation based on Principal Component Analysis
NASA Astrophysics Data System (ADS)
Wang, Xing; Wang, Zilin; Guo, Min; Chen, Wei; Zhang, Huan
2018-01-01
Economic growth has led to environmental capacity decline and the deterioration of air quality. Air quality evaluation as a fundamental of environmental monitoring and air pollution control has become increasingly important. Based on the principal component analysis (PCA), this paper evaluates the air quality of a large city in Beijing-Tianjin-Hebei Area in recent 10 years and identifies influencing factors, in order to provide reference to air quality management and air pollution control.
Principal components analysis of the photoresponse nonuniformity of a matrix detector.
Ferrero, Alejandro; Alda, Javier; Campos, Joaquín; López-Alonso, Jose Manuel; Pons, Alicia
2007-01-01
The principal component analysis is used to identify and quantify spatial distributions of relative photoresponse as a function of the exposure time for a visible CCD array. The analysis shows a simple way to define an invariant photoresponse nonuniformity and compare it with the definition of this invariant pattern as the one obtained for long exposure times. Experimental data of radiant exposure from levels of irradiance obtained in a stable and well-controlled environment are used.
Catanuto, Giuseppe; Taher, Wafa; Rocco, Nicola; Catalano, Francesca; Allegra, Dario; Milotta, Filippo Luigi Maria; Stanco, Filippo; Gallo, Giovanni; Nava, Maurizio Bruno
2018-03-20
Breast shape is defined utilizing mainly qualitative assessment (full, flat, ptotic) or estimates, such as volume or distances between reference points, that cannot describe it reliably. We will quantitatively describe breast shape with two parameters derived from a statistical methodology denominated principal component analysis (PCA). We created a heterogeneous dataset of breast shapes acquired with a commercial infrared 3-dimensional scanner on which PCA was performed. We plotted on a Cartesian plane the two highest values of PCA for each breast (principal components 1 and 2). Testing of the methodology on a preoperative and postoperative surgical case and test-retest was performed by two operators. The first two principal components derived from PCA are able to characterize the shape of the breast included in the dataset. The test-retest demonstrated that different operators are able to obtain very similar values of PCA. The system is also able to identify major changes in the preoperative and postoperative stages of a two-stage reconstruction. Even minor changes were correctly detected by the system. This methodology can reliably describe the shape of a breast. An expert operator and a newly trained operator can reach similar results in a test/re-testing validation. Once developed and after further validation, this methodology could be employed as a good tool for outcome evaluation, auditing, and benchmarking.
Nesakumar, Noel; Baskar, Chanthini; Kesavan, Srinivasan; Rayappan, John Bosco Balaguru; Alwarappan, Subbiah
2018-05-22
The moisture content of beetroot varies during long-term cold storage. In this work, we propose a strategy to identify the moisture content and age of beetroot using principal component analysis coupled Fourier transform infrared spectroscopy (FTIR). Frequent FTIR measurements were recorded directly from the beetroot sample surface over a period of 34 days for analysing its moisture content employing attenuated total reflectance in the spectral ranges of 2614-4000 and 1465-1853 cm -1 with a spectral resolution of 8 cm -1 . In order to estimate the transmittance peak height (T p ) and area under the transmittance curve [Formula: see text] over the spectral ranges of 2614-4000 and 1465-1853 cm -1 , Gaussian curve fitting algorithm was performed on FTIR data. Principal component and nonlinear regression analyses were utilized for FTIR data analysis. Score plot over the ranges of 2614-4000 and 1465-1853 cm -1 allowed beetroot quality discrimination. Beetroot quality predictive models were developed by employing biphasic dose response function. Validation experiment results confirmed that the accuracy of the beetroot quality predictive model reached 97.5%. This research work proves that FTIR spectroscopy in combination with principal component analysis and beetroot quality predictive models could serve as an effective tool for discriminating moisture content in fresh, half and completely spoiled stages of beetroot samples and for providing status alerts.
Fine structure of the low-frequency spectra of heart rate and blood pressure
Kuusela, Tom A; Kaila, Timo J; Kähönen, Mika
2003-01-01
Background The aim of this study was to explore the principal frequency components of the heart rate and blood pressure variability in the low frequency (LF) and very low frequency (VLF) band. The spectral composition of the R–R interval (RRI) and systolic arterial blood pressure (SAP) in the frequency range below 0.15 Hz were carefully analyzed using three different spectral methods: Fast Fourier transform (FFT), Wigner-Ville distribution (WVD), and autoregression (AR). All spectral methods were used to create time–frequency plots to uncover the principal spectral components that are least dependent on time. The accurate frequencies of these components were calculated from the pole decomposition of the AR spectral density after determining the optimal model order – the most crucial factor when using this method – with the help of FFT and WVD methods. Results Spectral analysis of the RRI and SAP of 12 healthy subjects revealed that there are always at least three spectral components below 0.15 Hz. The three principal frequency components are 0.026 ± 0.003 (mean ± SD) Hz, 0.076 ± 0.012 Hz, and 0.117 ± 0.016 Hz. These principal components vary only slightly over time. FFT-based coherence and phase-function analysis suggests that the second and third components are related to the baroreflex control of blood pressure, since the phase difference between SAP and RRI was negative and almost constant, whereas the origin of the first component is different since no clear SAP–RRI phase relationship was found. Conclusion The above data indicate that spontaneous fluctuations in heart rate and blood pressure within the standard low-frequency range of 0.04–0.15 Hz typically occur at two frequency components rather than only at one as widely believed, and these components are not harmonically related. This new observation in humans can help explain divergent results in the literature concerning spontaneous low-frequency oscillations. It also raises methodological and computational questions regarding the usability and validity of the low-frequency spectral band when estimating sympathetic activity and baroreflex gain. PMID:14552660
Fine structure of the low-frequency spectra of heart rate and blood pressure.
Kuusela, Tom A; Kaila, Timo J; Kähönen, Mika
2003-10-13
The aim of this study was to explore the principal frequency components of the heart rate and blood pressure variability in the low frequency (LF) and very low frequency (VLF) band. The spectral composition of the R-R interval (RRI) and systolic arterial blood pressure (SAP) in the frequency range below 0.15 Hz were carefully analyzed using three different spectral methods: Fast Fourier transform (FFT), Wigner-Ville distribution (WVD), and autoregression (AR). All spectral methods were used to create time-frequency plots to uncover the principal spectral components that are least dependent on time. The accurate frequencies of these components were calculated from the pole decomposition of the AR spectral density after determining the optimal model order--the most crucial factor when using this method--with the help of FFT and WVD methods. Spectral analysis of the RRI and SAP of 12 healthy subjects revealed that there are always at least three spectral components below 0.15 Hz. The three principal frequency components are 0.026 +/- 0.003 (mean +/- SD) Hz, 0.076 +/- 0.012 Hz, and 0.117 +/- 0.016 Hz. These principal components vary only slightly over time. FFT-based coherence and phase-function analysis suggests that the second and third components are related to the baroreflex control of blood pressure, since the phase difference between SAP and RRI was negative and almost constant, whereas the origin of the first component is different since no clear SAP-RRI phase relationship was found. The above data indicate that spontaneous fluctuations in heart rate and blood pressure within the standard low-frequency range of 0.04-0.15 Hz typically occur at two frequency components rather than only at one as widely believed, and these components are not harmonically related. This new observation in humans can help explain divergent results in the literature concerning spontaneous low-frequency oscillations. It also raises methodological and computational questions regarding the usability and validity of the low-frequency spectral band when estimating sympathetic activity and baroreflex gain.
Principal component analysis on a torus: Theory and application to protein dynamics.
Sittel, Florian; Filk, Thomas; Stock, Gerhard
2017-12-28
A dimensionality reduction method for high-dimensional circular data is developed, which is based on a principal component analysis (PCA) of data points on a torus. Adopting a geometrical view of PCA, various distance measures on a torus are introduced and the associated problem of projecting data onto the principal subspaces is discussed. The main idea is that the (periodicity-induced) projection error can be minimized by transforming the data such that the maximal gap of the sampling is shifted to the periodic boundary. In a second step, the covariance matrix and its eigendecomposition can be computed in a standard manner. Adopting molecular dynamics simulations of two well-established biomolecular systems (Aib 9 and villin headpiece), the potential of the method to analyze the dynamics of backbone dihedral angles is demonstrated. The new approach allows for a robust and well-defined construction of metastable states and provides low-dimensional reaction coordinates that accurately describe the free energy landscape. Moreover, it offers a direct interpretation of covariances and principal components in terms of the angular variables. Apart from its application to PCA, the method of maximal gap shifting is general and can be applied to any other dimensionality reduction method for circular data.
Principal component analysis on a torus: Theory and application to protein dynamics
NASA Astrophysics Data System (ADS)
Sittel, Florian; Filk, Thomas; Stock, Gerhard
2017-12-01
A dimensionality reduction method for high-dimensional circular data is developed, which is based on a principal component analysis (PCA) of data points on a torus. Adopting a geometrical view of PCA, various distance measures on a torus are introduced and the associated problem of projecting data onto the principal subspaces is discussed. The main idea is that the (periodicity-induced) projection error can be minimized by transforming the data such that the maximal gap of the sampling is shifted to the periodic boundary. In a second step, the covariance matrix and its eigendecomposition can be computed in a standard manner. Adopting molecular dynamics simulations of two well-established biomolecular systems (Aib9 and villin headpiece), the potential of the method to analyze the dynamics of backbone dihedral angles is demonstrated. The new approach allows for a robust and well-defined construction of metastable states and provides low-dimensional reaction coordinates that accurately describe the free energy landscape. Moreover, it offers a direct interpretation of covariances and principal components in terms of the angular variables. Apart from its application to PCA, the method of maximal gap shifting is general and can be applied to any other dimensionality reduction method for circular data.
Quantification of localized vertebral deformities using a sparse wavelet-based shape model.
Zewail, R; Elsafi, A; Durdle, N
2008-01-01
Medical experts often examine hundreds of spine x-ray images to determine existence of various pathologies. Common pathologies of interest are anterior osteophites, disc space narrowing, and wedging. By careful inspection of the outline shapes of the vertebral bodies, experts are able to identify and assess vertebral abnormalities with respect to the pathology under investigation. In this paper, we present a novel method for quantification of vertebral deformation using a sparse shape model. Using wavelets and Independent component analysis (ICA), we construct a sparse shape model that benefits from the approximation power of wavelets and the capability of ICA to capture higher order statistics in wavelet space. The new model is able to capture localized pathology-related shape deformations, hence it allows for quantification of vertebral shape variations. We investigate the capability of the model to predict localized pathology related deformations. Next, using support-vector machines, we demonstrate the diagnostic capabilities of the method through the discrimination of anterior osteophites in lumbar vertebrae. Experiments were conducted using a set of 150 contours from digital x-ray images of lumbar spine. Each vertebra is labeled as normal or abnormal. Results reported in this work focus on anterior osteophites as the pathology of interest.
Brodsky, V Y; Malchenko, L A; Konchenko, D S; Zvezdina, N D; Dubovaya, T K
2016-08-01
Primary cultures of rat hepatocytes were studied in serum-free media. Ultradian protein synthesis rhythm was used as a marker of cell synchronization in the population. Addition of glutamic acid (0.2 mg/ml) to the medium of nonsynchronous sparse cultures resulted in detection of a common protein synthesis rhythm, hence in synchronization of the cells. The antagonist of glutamic acid metabotropic receptors MCPG (0.01 mg/ml) added together with glutamic acid abolished the synchronization effect; in sparse cultures, no rhythm was detected. Feeding rats with glutamic acid (30 mg with food) resulted in protein synthesis rhythm in sparse cultures obtained from the rats. After feeding without glutamic acid, linear kinetics of protein synthesis was revealed. Thus, glutamic acid, a component of blood as a non-neural transmitter, can synchronize the activity of hepatocytes and can form common rhythm of protein synthesis in vitro and in vivo. This effect is realized via receptors. Mechanisms of cell-cell communication are discussed on analyzing effects of non-neural functions of neurotransmitters. Glutamic acid is used clinically in humans. Hence, a previously unknown function of this drug is revealed.
Zhang, Zhilin; Jung, Tzyy-Ping; Makeig, Scott; Rao, Bhaskar D
2013-02-01
Fetal ECG (FECG) telemonitoring is an important branch in telemedicine. The design of a telemonitoring system via a wireless body area network with low energy consumption for ambulatory use is highly desirable. As an emerging technique, compressed sensing (CS) shows great promise in compressing/reconstructing data with low energy consumption. However, due to some specific characteristics of raw FECG recordings such as nonsparsity and strong noise contamination, current CS algorithms generally fail in this application. This paper proposes to use the block sparse Bayesian learning framework to compress/reconstruct nonsparse raw FECG recordings. Experimental results show that the framework can reconstruct the raw recordings with high quality. Especially, the reconstruction does not destroy the interdependence relation among the multichannel recordings. This ensures that the independent component analysis decomposition of the reconstructed recordings has high fidelity. Furthermore, the framework allows the use of a sparse binary sensing matrix with much fewer nonzero entries to compress recordings. Particularly, each column of the matrix can contain only two nonzero entries. This shows that the framework, compared to other algorithms such as current CS algorithms and wavelet algorithms, can greatly reduce code execution in CPU in the data compression stage.
NASA Astrophysics Data System (ADS)
Wang, Zhuozheng; Deller, J. R.; Fleet, Blair D.
2016-01-01
Acquired digital images are often corrupted by a lack of camera focus, faulty illumination, or missing data. An algorithm is presented for fusion of multiple corrupted images of a scene using the lifting wavelet transform. The method employs adaptive fusion arithmetic based on matrix completion and self-adaptive regional variance estimation. Characteristics of the wavelet coefficients are used to adaptively select fusion rules. Robust principal component analysis is applied to low-frequency image components, and regional variance estimation is applied to high-frequency components. Experiments reveal that the method is effective for multifocus, visible-light, and infrared image fusion. Compared with traditional algorithms, the new algorithm not only increases the amount of preserved information and clarity but also improves robustness.
ECOPASS - a multivariate model used as an index of growth performance of poplar clones
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ceulemans, R.; Impens, I.
The model (ECOlogical PASSport) reported was constructed by principal component analysis from a combination of biochemical, anatomical/morphological and ecophysiological gas exchange parameters measured on 5 fast growing poplar clones. Productivity data were 10 selected trees in 3 plantations in Belgium and given as m.a.i.(b.a.). The model is shown to be able to reflect not only genetic origin and the relative effects of the different parameters of the clones, but also their production potential. Multiple regression analysis of the 4 principal components showed a high cumulative correlation (96%) between the 3 components related to ecophysiological, biochemical and morphological parameters, and productivity;more » the ecophysiological component alone correlated 85% with productivity.« less
Linkage Analysis of Urine Arsenic Species Patterns in the Strong Heart Family Study
Gribble, Matthew O.; Voruganti, Venkata Saroja; Cole, Shelley A.; Haack, Karin; Balakrishnan, Poojitha; Laston, Sandra L.; Tellez-Plaza, Maria; Francesconi, Kevin A.; Goessler, Walter; Umans, Jason G.; Thomas, Duncan C.; Gilliland, Frank; North, Kari E.; Franceschini, Nora; Navas-Acien, Ana
2015-01-01
Arsenic toxicokinetics are important for disease risks in exposed populations, but genetic determinants are not fully understood. We examined urine arsenic species patterns measured by HPLC-ICPMS among 2189 Strong Heart Study participants 18 years of age and older with data on ∼400 genome-wide microsatellite markers spaced ∼10 cM and arsenic speciation (683 participants from Arizona, 684 from Oklahoma, and 822 from North and South Dakota). We logit-transformed % arsenic species (% inorganic arsenic, %MMA, and %DMA) and also conducted principal component analyses of the logit % arsenic species. We used inverse-normalized residuals from multivariable-adjusted polygenic heritability analysis for multipoint variance components linkage analysis. We also examined the contribution of polymorphisms in the arsenic metabolism gene AS3MT via conditional linkage analysis. We localized a quantitative trait locus (QTL) on chromosome 10 (LOD 4.12 for %MMA, 4.65 for %DMA, and 4.84 for the first principal component of logit % arsenic species). This peak was partially but not fully explained by measured AS3MT variants. We also localized a QTL for the second principal component of logit % arsenic species on chromosome 5 (LOD 4.21) that was not evident from considering % arsenic species individually. Some other loci were suggestive or significant for 1 geographical area but not overall across all areas, indicating possible locus heterogeneity. This genome-wide linkage scan suggests genetic determinants of arsenic toxicokinetics to be identified by future fine-mapping, and illustrates the utility of principal component analysis as a novel approach that considers % arsenic species jointly. PMID:26209557
Modified neural networks for rapid recovery of tokamak plasma parameters for real time control
NASA Astrophysics Data System (ADS)
Sengupta, A.; Ranjan, P.
2002-07-01
Two modified neural network techniques are used for the identification of the equilibrium plasma parameters of the Superconducting Steady State Tokamak I from external magnetic measurements. This is expected to ultimately assist in a real time plasma control. As different from the conventional network structure where a single network with the optimum number of processing elements calculates the outputs, a multinetwork system connected in parallel does the calculations here in one of the methods. This network is called the double neural network. The accuracy of the recovered parameters is clearly more than the conventional network. The other type of neural network used here is based on the statistical function parametrization combined with a neural network. The principal component transformation removes linear dependences from the measurements and a dimensional reduction process reduces the dimensionality of the input space. This reduced and transformed input set, rather than the entire set, is fed into the neural network input. This is known as the principal component transformation-based neural network. The accuracy of the recovered parameters in the latter type of modified network is found to be a further improvement over the accuracy of the double neural network. This result differs from that obtained in an earlier work where the double neural network showed better performance. The conventional network and the function parametrization methods have also been used for comparison. The conventional network has been used for an optimization of the set of magnetic diagnostics. The effective set of sensors, as assessed by this network, are compared with the principal component based network. Fault tolerance of the neural networks has been tested. The double neural network showed the maximum resistance to faults in the diagnostics, while the principal component based network performed poorly. Finally the processing times of the methods have been compared. The double network and the principal component network involve the minimum computation time, although the conventional network also performs well enough to be used in real time.
Strale, Mathieu; Krysinska, Karolina; Overmeiren, Gaëtan Van; Andriessen, Karl
2017-06-01
This study investigated the geographic distribution of suicide and railway suicide in Belgium over 2008--2013 on local (i.e., district or arrondissement) level. There were differences in the regional distribution of suicide and railway suicides in Belgium over the study period. Principal component analysis identified three groups of correlations among population variables and socio-economic indicators, such as population density, unemployment, and age group distribution, on two components that helped explaining the variance of railway suicide at a local (arrondissement) level. This information is of particular importance to prevent suicides in high-risk areas on the Belgian railway network.
Efficient and Robust Signal Approximations
2009-05-01
otherwise. Remark. Permutation matrices are both orthogonal and doubly- stochastic [62]. We will now show how to further simplify the Robust Coding...reporting burden for the collection of information is estimated to average 1 hour per response, including the time for reviewing instructions, searching...Standard Form 298 (Rev. 8-98) Prescribed by ANSI Std Z39-18 Keywords: signal processing, image compression, independent component analysis , sparse
Understanding Discrepancies in Rater Judgement on National-Level Oral Examination Tasks
ERIC Educational Resources Information Center
Ang-Aw, Hui Teng; Goh, Christine Chuen Meng
2011-01-01
The oral examination is an important component of the high-stakes "O" Level examination in Singapore taken by 16-17 year olds whose first language may or may not be English. In spite of this, there has been sparse research into the examination. This paper reports findings of an exploratory study which attempted to determine whether there…
Woody debris dynamics in Interior West forests and woodlands
John D. Shaw; James Long; Raffaella Marzano; Matteo Garbarino
2012-01-01
Managers are interested in the dynamics of down woody material because of its role as a fuel component, a feature of wildlife habitat, a carbon pool, and other characteristics. We analyzed nearly 9,000 plots from the Interior West, spanning the range from sparse juniper and mesquite woodland to dense spruce-fir forests, in order to characterize down woody material as...
ERIC Educational Resources Information Center
Rosa, Victor M.
2013-01-01
Purpose: The purpose of this study was to determine the extent to which California public high school principals perceive the WASC Self-Study Process as a valuable tool for bringing about school improvement. The study specifically examines the principals' perceptions of five components within the Self-Study Process: (1) The creation of the…
Applications of matched field processing to damage detection in composite wind turbine blades
NASA Astrophysics Data System (ADS)
Tippmann, Jeffery D.; Lanza di Scalea, Francesco
2015-03-01
There are many structures serving vital infrastructure, energy, and national security purposes. Inspecting the components and areas of the structure most prone to failure during maintenance operations by using non- destructive evaluation methods has been essential in avoiding costly, but preventable, catastrophic failures. In many cases, the inspections are performed by introducing acoustic, ultrasonic, or even thermographic waves into the structure and then evaluating the response. Sometimes the structure, or a component, is not accessible for active inspection methods. Because of this, there is a growing interest to use passive methods, such as using ambient noise, or sources of opportunity, to produce a passive impulse response function similar to the active approach. Several matched field processing techniques most notably used in oceanography and seismology applications are examined in more detail. While sparse array imaging in structures has been studied for years, all methods studied previously have used an active interrogation approach. Here, structural damage detection is studied by use of the reconstructed impulse response functions in ambient noise within sparse array imaging techniques, such as matched-field processing. This has been studied in experiments on a 9-m wind turbine blade.
Predefined Redundant Dictionary for Effective Depth Maps Representation
NASA Astrophysics Data System (ADS)
Sebai, Dorsaf; Chaieb, Faten; Ghorbel, Faouzi
2016-01-01
The multi-view video plus depth (MVD) video format consists of two components: texture and depth map, where a combination of these components enables a receiver to generate arbitrary virtual views. However, MVD presents a very voluminous video format that requires a compression process for storage and especially for transmission. Conventional codecs are perfectly efficient for texture images compression but not for intrinsic depth maps properties. Depth images indeed are characterized by areas of smoothly varying grey levels separated by sharp discontinuities at the position of object boundaries. Preserving these characteristics is important to enable high quality view synthesis at the receiver side. In this paper, sparse representation of depth maps is discussed. It is shown that a significant gain in sparsity is achieved when particular mixed dictionaries are used for approximating these types of images with greedy selection strategies. Experiments are conducted to confirm the effectiveness at producing sparse representations, and competitiveness, with respect to candidate state-of-art dictionaries. Finally, the resulting method is shown to be effective for depth maps compression and represents an advantage over the ongoing 3D high efficiency video coding compression standard, particularly at medium and high bitrates.
Survey of the Heritability and Sparse Architecture of Gene Expression Traits across Human Tissues.
Wheeler, Heather E; Shah, Kaanan P; Brenner, Jonathon; Garcia, Tzintzuni; Aquino-Michaels, Keston; Cox, Nancy J; Nicolae, Dan L; Im, Hae Kyung
2016-11-01
Understanding the genetic architecture of gene expression traits is key to elucidating the underlying mechanisms of complex traits. Here, for the first time, we perform a systematic survey of the heritability and the distribution of effect sizes across all representative tissues in the human body. We find that local h2 can be relatively well characterized with 59% of expressed genes showing significant h2 (FDR < 0.1) in the DGN whole blood cohort. However, current sample sizes (n ≤ 922) do not allow us to compute distal h2. Bayesian Sparse Linear Mixed Model (BSLMM) analysis provides strong evidence that the genetic contribution to local expression traits is dominated by a handful of genetic variants rather than by the collective contribution of a large number of variants each of modest size. In other words, the local architecture of gene expression traits is sparse rather than polygenic across all 40 tissues (from DGN and GTEx) examined. This result is confirmed by the sparsity of optimal performing gene expression predictors via elastic net modeling. To further explore the tissue context specificity, we decompose the expression traits into cross-tissue and tissue-specific components using a novel Orthogonal Tissue Decomposition (OTD) approach. Through a series of simulations we show that the cross-tissue and tissue-specific components are identifiable via OTD. Heritability and sparsity estimates of these derived expression phenotypes show similar characteristics to the original traits. Consistent properties relative to prior GTEx multi-tissue analysis results suggest that these traits reflect the expected biology. Finally, we apply this knowledge to develop prediction models of gene expression traits for all tissues. The prediction models, heritability, and prediction performance R2 for original and decomposed expression phenotypes are made publicly available (https://github.com/hakyimlab/PrediXcan).
From measurements to metrics: PCA-based indicators of cyber anomaly
NASA Astrophysics Data System (ADS)
Ahmed, Farid; Johnson, Tommy; Tsui, Sonia
2012-06-01
We present a framework of the application of Principal Component Analysis (PCA) to automatically obtain meaningful metrics from intrusion detection measurements. In particular, we report the progress made in applying PCA to analyze the behavioral measurements of malware and provide some preliminary results in selecting dominant attributes from an arbitrary number of malware attributes. The results will be useful in formulating an optimal detection threshold in the principal component space, which can both validate and augment existing malware classifiers.
NASA Astrophysics Data System (ADS)
Jakovels, Dainis; Lihacova, Ilze; Kuzmina, Ilona; Spigulis, Janis
2013-11-01
Non-invasive and fast primary diagnostics of pigmented skin lesions is required due to frequent incidence of skin cancer - melanoma. Diagnostic potential of principal component analysis (PCA) for distant skin melanoma recognition is discussed. Processing of the measured clinical multi-spectral images (31 melanomas and 94 nonmalignant pigmented lesions) in the wavelength range of 450-950 nm by means of PCA resulted in 87 % sensitivity and 78 % specificity for separation between malignant melanomas and pigmented nevi.
NASA Astrophysics Data System (ADS)
Penttilä, Antti; Martikainen, Julia; Gritsevich, Maria; Muinonen, Karri
2018-02-01
Meteorite samples are measured with the University of Helsinki integrating-sphere UV-vis-NIR spectrometer. The resulting spectra of 30 meteorites are compared with selected spectra from the NASA Planetary Data System meteorite spectra database. The spectral measurements are transformed with the principal component analysis, and it is shown that different meteorite types can be distinguished from the transformed data. The motivation is to improve the link between asteroid spectral observations and meteorite spectral measurements.
Multiscale characterization and analysis of shapes
Prasad, Lakshman; Rao, Ramana
2002-01-01
An adaptive multiscale method approximates shapes with continuous or uniformly and densely sampled contours, with the purpose of sparsely and nonuniformly discretizing the boundaries of shapes at any prescribed resolution, while at the same time retaining the salient shape features at that resolution. In another aspect, a fundamental geometric filtering scheme using the Constrained Delaunay Triangulation (CDT) of polygonized shapes creates an efficient parsing of shapes into components that have semantic significance dependent only on the shapes' structure and not on their representations per se. A shape skeletonization process generalizes to sparsely discretized shapes, with the additional benefit of prunability to filter out irrelevant and morphologically insignificant features. The skeletal representation of characters of varying thickness and the elimination of insignificant and noisy spurs and branches from the skeleton greatly increases the robustness, reliability and recognition rates of character recognition algorithms.