Huard, Edouard; Derelle, Sophie; Jaeck, Julien; Nghiem, Jean; Haïdar, Riad; Primot, Jérôme
2018-03-05
A challenging point in predicting the image quality of infrared imaging systems is the evaluation of the detector modulation transfer function (MTF). In this paper, we present a linear method to obtain a 2D continuous MTF from sparse spectral data. In this method, an object with a predictable sparse spatial spectrum is imaged by the focal plane array. The sparse data are then processed to recover the 2D continuous MTF under the hypothesis that all pixels have an identical spatial response. The linearity of the processing is a key point, as it allows the error bars of the resulting detector MTF to be estimated directly. The test bench is presented along with measurement tests on a 25 μm pitch InGaAs detector.
Prediction of siRNA potency using sparse logistic regression.
Hu, Wei; Hu, John
2014-06-01
RNA interference (RNAi) can modulate gene expression at post-transcriptional as well as transcriptional levels. Short interfering RNA (siRNA) serves as a trigger for the RNAi gene inhibition mechanism, and therefore is a crucial intermediate step in RNAi. There have been extensive studies to identify the sequence characteristics of potent siRNAs. One such study built a linear model using LASSO (Least Absolute Shrinkage and Selection Operator) to measure the contribution of each siRNA sequence feature. This model is simple and interpretable, but it requires a large number of nonzero weights. We have introduced a novel technique, sparse logistic regression, to build a linear model using single-position specific nucleotide compositions that has the same prediction accuracy as the LASSO-based linear model. The weights in our new model share the same general trend as those in the previous model, but only 25 of the 84 weights are nonzero, a 54% reduction compared to the previous model. In contrast to the LASSO-based linear model, our model suggests that only a few positions influence the efficacy of the siRNA, namely the 5' and 3' ends and the seed region of siRNA sequences. We also employed sparse logistic regression to build a linear model using dual-position specific nucleotide compositions, a task LASSO is not able to accomplish well due to the problem's high dimensionality. Our results demonstrate the superiority of sparse logistic regression over LASSO as a technique for both feature selection and regression in the context of siRNA design.
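To make the approach above concrete, here is a minimal sketch (not the authors' code) of L1-penalized logistic regression on one-hot single-position nucleotide features; the sequence length, the 84-feature encoding, and the regularization strength are illustrative assumptions.

```python
# Sketch: L1-penalized (sparse) logistic regression on one-hot
# single-position nucleotide features, loosely mirroring the setup
# described above. All data are synthetic; 21 positions x 4
# nucleotides = 84 features is an assumption matching the abstract.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_seq, n_pos = 500, 21
seqs = rng.integers(0, 4, size=(n_seq, n_pos))          # 0..3 = A,C,G,U

# One-hot encode: one column per (position, nucleotide) pair.
X = np.zeros((n_seq, n_pos * 4))
for i, s in enumerate(seqs):
    X[i, np.arange(n_pos) * 4 + s] = 1.0

# Synthetic "potency" labels driven by a few positions only.
w_true = np.zeros(n_pos * 4)
w_true[[0, 1, 80, 81, 30]] = [2.0, -1.5, 1.8, -2.2, 1.0]
y = (X @ w_true + rng.normal(0, 1, n_seq) > 0).astype(int)

# The L1 penalty drives most of the 84 weights to exactly zero.
clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
clf.fit(X, y)
print("nonzero weights:", np.count_nonzero(clf.coef_), "of", X.shape[1])
```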
Sparse Bayesian Learning for Identifying Imaging Biomarkers in AD Prediction
Shen, Li; Qi, Yuan; Kim, Sungeun; Nho, Kwangsik; Wan, Jing; Risacher, Shannon L.; Saykin, Andrew J.
2010-01-01
We apply sparse Bayesian learning methods, automatic relevance determination (ARD) and predictive ARD (PARD), to Alzheimer’s disease (AD) classification to make accurate predictions and, at the same time, identify critical imaging markers relevant to AD. ARD is one of the most successful Bayesian feature selection methods. PARD is a powerful Bayesian feature selection method that provides sparse models that are easy to interpret. PARD selects the model with the best estimate of predictive performance instead of choosing the one with the largest marginal model likelihood. A comparative study with the support vector machine (SVM) shows that ARD/PARD in general outperform SVM in terms of prediction accuracy. An additional comparison with surface-based general linear model (GLM) analysis shows that the regions with the strongest signals are identified by both GLM and ARD/PARD. While the GLM P-map returns significant regions all over the cortex, ARD/PARD provide a small number of relevant and meaningful imaging markers with predictive power, including both cortical and subcortical measures.
Prediction of gene expression with cis-SNPs using mixed models and regularization methods.
Zeng, Ping; Zhou, Xiang; Huang, Shuiping
2017-05-11
It has been shown that gene expression in human tissues is heritable, so predicting gene expression using only SNPs becomes possible. The prediction of gene expression can offer important insights into the genetic architecture of individual functionally associated SNPs and further interpretation of the molecular basis underlying human diseases. We compared three types of methods for predicting gene expression using only cis-SNPs: a polygenic model, i.e. the linear mixed model (LMM); two sparse models, i.e. Lasso and the elastic net (ENET); and a hybrid of the LMM and sparse models, i.e. the Bayesian sparse linear mixed model (BSLMM). These kinds of prediction methods make very different assumptions about the underlying genetic architecture. The methods were evaluated using simulations under various scenarios and were applied to the Geuvadis gene expression data. The simulations showed that the four prediction methods (i.e. Lasso, ENET, LMM and BSLMM) behaved best when their respective modeling assumptions were satisfied, but BSLMM performed robustly across a range of scenarios. According to the R^2 of these models in the Geuvadis data, the four methods performed quite similarly. We did not observe any clustering or enrichment of predictive genes (defined as genes with R^2 ≥ 0.05) across the chromosomes, nor any clear relationship between the proportion of predictive genes and the proportion of genes in each chromosome. However, an interesting finding in the Geuvadis data was that highly predictive genes (e.g. R^2 ≥ 0.30) may have sparse genetic architectures, since Lasso, ENET and BSLMM outperformed LMM for these genes; this observation was validated in another gene expression data set. We further showed that the predictive genes were enriched in approximately independent LD blocks. In summary, gene expression can be predicted with only cis-SNPs using well-developed prediction models, and the predictive genes are enriched in approximately independent LD blocks. The prediction of gene expression can shed light on the functional interpretation of SNPs identified in GWASs.
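As a minimal illustration of this kind of comparison (LMM and BSLMM omitted), the sketch below evaluates cross-validated R^2 for Lasso and elastic net on synthetic genotypes; the sample size, SNP count, and effect sparsity are assumptions.

```python
# Sketch: compare cross-validated R^2 of Lasso vs. elastic net for
# predicting expression from synthetic cis-SNP genotypes. The sparse
# genetic architecture here (5 causal SNPs of 200) is an assumption.
import numpy as np
from sklearn.linear_model import LassoCV, ElasticNetCV
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n, p = 300, 200
X = rng.binomial(2, 0.3, size=(n, p)).astype(float)   # genotypes 0/1/2
beta = np.zeros(p)
beta[rng.choice(p, 5, replace=False)] = rng.normal(0, 0.5, 5)
y = X @ beta + rng.normal(0, 1.0, n)

for name, model in [("lasso", LassoCV(cv=5)),
                    ("enet", ElasticNetCV(l1_ratio=0.5, cv=5))]:
    r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    print(f"{name}: mean cross-validated R^2 = {r2:.3f}")
```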
NASA Astrophysics Data System (ADS)
Yihaa Roodhiyah, Lisa’; Tjong, Tiffany; Nurhasan; Sutarno, D.
2018-04-01
In recent research, the linear matrices of the vector finite element method in two-dimensional (2-D) magnetotelluric (MT) response modeling were solved by a non-sparse direct solver in TE mode. Nevertheless, that approach has weaknesses to be improved, especially its accuracy at low frequencies (10^-3 Hz-10^-5 Hz), which had not yet been achieved, and its high computational cost on dense meshes. In this work, a sparse direct solver is used instead of a non-sparse direct solver to overcome these weaknesses. A sparse direct solver is advantageous for solving the linear matrices of the vector finite element method because the matrices are symmetric and sparse. The sparse direct solver has been validated against analytical solutions for a homogeneous half-space model and a vertical contact model. The validation results show that the sparse direct solver is more stable than the non-sparse direct solver in computing the linear problem of the vector finite element method, especially at low frequencies. In the end, accurate 2-D MT response modeling at low frequencies (10^-3 Hz-10^-5 Hz) was achieved with efficient array memory allocation and less computational time.
Amesos2 and Belos: Direct and Iterative Solvers for Large Sparse Linear Systems
Bavier, Eric; Hoemmen, Mark; Rajamanickam, Sivasankaran; ...
2012-01-01
Solvers for large sparse linear systems come in two categories: direct and iterative. Amesos2, a package in the Trilinos software project, provides direct methods, and Belos, another Trilinos package, provides iterative methods. Amesos2 offers a common interface to many different sparse matrix factorization codes, and can handle any implementation of sparse matrices and vectors, via an easy-to-extend C++ traits interface. It can also factor matrices whose entries have arbitrary “Scalar” type, enabling extended-precision and mixed-precision algorithms. Belos includes many different iterative methods for solving large sparse linear systems and least-squares problems. Unlike competing iterative solver libraries, Belos completely decouples the algorithms from the implementations of the underlying linear algebra objects. This lets Belos exploit the latest hardware without changes to the code. Belos favors algorithms that solve higher-level problems, such as multiple simultaneous linear systems and sequences of related linear systems, faster than standard algorithms. The package also supports extended-precision and mixed-precision algorithms. Together, Amesos2 and Belos form a complete suite of sparse linear solvers.
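The direct-versus-iterative split can be illustrated outside Trilinos. The sketch below uses SciPy as a stand-in for Amesos2/Belos, solving one sparse system by sparse LU factorization and by GMRES.

```python
# Sketch: direct (sparse LU) vs. iterative (GMRES) solution of one
# sparse linear system, illustrating the two solver categories.
# SciPy is a stand-in; Amesos2/Belos are C++ Trilinos packages.
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 1000
# 1-D Poisson-like tridiagonal test matrix.
A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csc")
b = np.ones(n)

x_direct = spla.splu(A).solve(b)     # factor once, then solve
x_iter, info = spla.gmres(A, b)      # Krylov iteration
assert info == 0, "GMRES did not converge"
print("max difference:", np.max(np.abs(x_direct - x_iter)))
```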
Sparse partial least squares regression for simultaneous dimension reduction and variable selection
Chun, Hyonho; Keleş, Sündüz
2010-01-01
Partial least squares regression has been an alternative to ordinary least squares for handling multicollinearity in several areas of scientific research since the 1960s. It has recently gained much attention in the analysis of high dimensional genomic data. We show that known asymptotic consistency of the partial least squares estimator for a univariate response does not hold with the very large p and small n paradigm. We derive a similar result for a multivariate response regression with partial least squares. We then propose a sparse partial least squares formulation which aims simultaneously to achieve good predictive performance and variable selection by producing sparse linear combinations of the original predictors. We provide an efficient implementation of sparse partial least squares regression and compare it with well-known variable selection and dimension reduction approaches via simulation experiments. We illustrate the practical utility of sparse partial least squares regression in a joint analysis of gene expression and genomewide binding data.
Sparse 4D TomoSAR imaging in the presence of non-linear deformation
NASA Astrophysics Data System (ADS)
Khwaja, Ahmed Shaharyar; Çetin, Müjdat
2018-04-01
In this paper, we present a sparse four-dimensional tomographic synthetic aperture radar (4D TomoSAR) imaging scheme that can estimate elevation and linear as well as non-linear seasonal deformation rates of scatterers using the interferometric phase. Unlike existing sparse processing techniques that use fixed dictionaries based on a linear deformation model, we use a variable dictionary for the non-linear deformation in the form of seasonal sinusoidal deformation, in addition to the fixed dictionary for the linear deformation. We estimate the amplitude of the sinusoidal deformation using an optimization method and create the variable dictionary using the estimated amplitude. We show preliminary results using simulated data that demonstrate the soundness of our proposed technique for sparse 4D TomoSAR imaging in the presence of non-linear deformation.
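A toy sketch of the fixed-plus-variable dictionary idea: the seasonal (sinusoidal) amplitude is estimated by a grid search, a dictionary of candidate linear rates is then built with that amplitude, and a sparse solver selects the active atom. The wavelength, grids, and noise level are illustrative assumptions, not the authors' parameters.

```python
# Toy sketch: sparse TomoSAR-style estimation with a fixed dictionary for
# linear deformation plus a variable dictionary term for seasonal
# (sinusoidal) deformation. All constants are illustrative assumptions.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(2)
wavelength = 31.0                      # mm (X-band, illustrative)
c = 4 * np.pi / wavelength
t = np.linspace(0.0, 2.0, 30)          # acquisition times in years
v_true, amp_true = 5.0, 3.0            # mm/yr and mm

g = np.exp(1j * c * (v_true * t + amp_true * np.sin(2 * np.pi * t)))
g += 0.05 * (rng.normal(size=t.size) + 1j * rng.normal(size=t.size))

v_grid = np.linspace(-10, 10, 81)
amp_grid = np.linspace(0, 5, 26)

def dictionary(amp):
    # One atom per candidate rate, all sharing the seasonal term.
    return np.exp(1j * c * (np.outer(t, v_grid)
                            + amp * np.sin(2 * np.pi * t)[:, None]))

# Step 1: estimate the sinusoidal amplitude by maximizing correlation.
scores = [np.abs(dictionary(a).conj().T @ g).max() for a in amp_grid]
amp_hat = amp_grid[int(np.argmax(scores))]

# Step 2: sparse recovery over the rate dictionary (real/imag stacked).
D = dictionary(amp_hat)
D_ri = np.vstack([D.real, D.imag])
g_ri = np.concatenate([g.real, g.imag])
coef = Lasso(alpha=0.05, fit_intercept=False,
             max_iter=10000).fit(D_ri, g_ri).coef_
print(f"amp_hat={amp_hat:.2f} mm, "
      f"v_hat={v_grid[np.argmax(np.abs(coef))]:.2f} mm/yr")
```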
Modeling Alzheimer's disease cognitive scores using multi-task sparse group lasso.
Liu, Xiaoli; Goncalves, André R; Cao, Peng; Zhao, Dazhe; Banerjee, Arindam
2018-06-01
Alzheimer's disease (AD) is a severe neurodegenerative disorder characterized by loss of memory and reduction in cognitive functions due to progressive degeneration of neurons and their connections, eventually leading to death. In this paper, we consider the problem of simultaneously predicting several different cognitive scores associated with categorizing subjects as normal, mild cognitive impairment (MCI), or Alzheimer's disease (AD) in a multi-task learning framework, using features extracted from brain images obtained from ADNI (Alzheimer's Disease Neuroimaging Initiative). To solve the problem, we present a multi-task sparse group lasso (MT-SGL) framework, which estimates sparse features coupled across tasks, and can work with loss functions associated with any Generalized Linear Model. Through comparisons with a variety of baseline models using multiple evaluation metrics, we illustrate the promising predictive performance of MT-SGL on ADNI, along with its ability to identify brain regions more likely to help characterize Alzheimer's disease progression.
Are V1 Simple Cells Optimized for Visual Occlusions? A Comparative Study
Bornschein, Jörg; Henniges, Marc; Lücke, Jörg
2013-01-01
Simple cells in primary visual cortex were famously found to respond to low-level image components such as edges. Sparse coding and independent component analysis (ICA) emerged as the standard computational models for simple cell coding because they linked their receptive fields to the statistics of visual stimuli. However, a salient feature of image statistics, occlusions of image components, is not considered by these models. Here we ask if occlusions have an effect on the predicted shapes of simple cell receptive fields. We use a comparative approach to answer this question and investigate two models for simple cells: a standard linear model and an occlusive model. For both models we simultaneously estimate optimal receptive fields, sparsity and stimulus noise. The two models are identical except for their component superposition assumption. We find the image encoding and receptive fields predicted by the models to differ significantly. While both models predict many Gabor-like fields, the occlusive model predicts a much sparser encoding and high percentages of ‘globular’ receptive fields. This relatively new center-surround type of simple-cell response has been observed since reverse correlation came into use in experimental studies. While high percentages of ‘globular’ fields can be obtained using specific choices of sparsity and overcompleteness in linear sparse coding, no or only low proportions are reported in the vast majority of studies on linear models (including all ICA models). Likewise, for the linear model investigated here with optimal sparsity, only low proportions of ‘globular’ fields are observed. In comparison, the occlusive model robustly infers high proportions and can match the experimentally observed high proportions of ‘globular’ fields well. Our computational study, therefore, suggests that ‘globular’ fields may be evidence for an optimal encoding of visual occlusions in primary visual cortex.
Ding, Weifu; Zhang, Jiangshe; Leung, Yee
2016-10-01
In this paper, we predict air pollutant concentrations using a feedforward artificial neural network inspired by the mechanism of the human brain, as a useful alternative to traditional statistical modeling techniques. The network is trained with sparse-response back-propagation, in which only a small number of neurons respond to the specified stimulus simultaneously; this yields a high convergence rate for the trained network, in addition to low energy consumption and better generalization. Our method is evaluated on Hong Kong air-monitoring data and corresponding meteorological variables, with five air quality parameters gathered at four monitoring stations in Hong Kong over four years (2012-2015). Our results show that the proposed training method outperforms traditional linear regression algorithms and a feedforward artificial neural network trained with traditional back-propagation in terms of prediction precision, effectiveness, and generalization.
Chen, Ray-Bing; Wang, Weichung; Jeff Wu, C. F.
2017-04-12
A numerical method, called OBSM, was recently proposed which employs overcomplete basis functions to achieve sparse representations. While the method can handle non-stationary response without the need of inverting large covariance matrices, it lacks the capability to quantify uncertainty in predictions. We address this issue by proposing a Bayesian approach which first imposes a normal prior on the large space of linear coefficients, then applies the MCMC algorithm to generate posterior samples for predictions. From these samples, Bayesian credible intervals can then be obtained to assess prediction uncertainty. A key application for the proposed method is the efficient construction of sequential designs. Several sequential design procedures with different infill criteria are proposed based on the generated posterior samples. Numerical studies show that the proposed schemes are capable of solving problems of positive point identification, optimization, and surrogate fitting.
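The Bayesian machinery can be sketched in a simplified conjugate setting: with a normal prior on the coefficients of a (here small, not overcomplete) basis and Gaussian noise, the posterior is available in closed form and can be sampled directly in place of MCMC; the basis and hyperparameters below are assumptions.

```python
# Sketch: Bayesian linear model with a normal prior on basis-expansion
# coefficients; posterior samples yield credible intervals for
# predictions. A conjugate closed form stands in for MCMC here.
import numpy as np

rng = np.random.default_rng(3)
x = np.sort(rng.uniform(0, 1, 40))
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.1, x.size)

# Basis: Gaussian bumps (an assumed stand-in for overcomplete bases).
centers = np.linspace(0, 1, 15)
Phi = np.exp(-0.5 * ((x[:, None] - centers) / 0.1) ** 2)

sigma2, tau2 = 0.1 ** 2, 1.0 ** 2        # noise and prior variances
S = np.linalg.inv(Phi.T @ Phi / sigma2 + np.eye(15) / tau2)
mu = S @ Phi.T @ y / sigma2              # posterior mean

samples = rng.multivariate_normal(mu, S, size=2000)   # posterior draws
x_new = np.linspace(0, 1, 100)
Phi_new = np.exp(-0.5 * ((x_new[:, None] - centers) / 0.1) ** 2)
preds = samples @ Phi_new.T              # (2000, 100) predictions
lo, hi = np.percentile(preds, [2.5, 97.5], axis=0)
print("mean 95% credible band width:", (hi - lo).mean())
```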
Benzi, Michele; Evans, Thomas M.; Hamilton, Steven P.; ...
2017-03-05
Here, we consider hybrid deterministic-stochastic iterative algorithms for the solution of large, sparse linear systems. Starting from a convergent splitting of the coefficient matrix, we analyze various types of Monte Carlo acceleration schemes applied to the original preconditioned Richardson (stationary) iteration. We expect that these methods will have considerable potential for resiliency to faults when implemented on massively parallel machines. We also establish sufficient conditions for the convergence of the hybrid schemes, and we investigate different types of preconditioners including sparse approximate inverses. Numerical experiments on linear systems arising from the discretization of partial differential equations are presented.
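For reference, a minimal sketch of the underlying preconditioned Richardson iteration that such Monte Carlo schemes accelerate; the Jacobi preconditioner and test matrix are illustrative choices, and the stochastic acceleration itself is not shown.

```python
# Sketch: preconditioned Richardson iteration
#   x_{k+1} = x_k + M^{-1} (b - A x_k)
# with a Jacobi (diagonal) preconditioner M. Monte Carlo acceleration
# would augment the deterministic correction step shown here.
import numpy as np
import scipy.sparse as sp

n = 200
A = sp.diags([-1.0, 4.0, -1.0], [-1, 0, 1], shape=(n, n), format="csr")
b = np.ones(n)
M_inv = 1.0 / A.diagonal()                 # Jacobi preconditioner

x = np.zeros(n)
for k in range(500):
    r = b - A @ x
    if np.linalg.norm(r) < 1e-10 * np.linalg.norm(b):
        break
    x += M_inv * r                         # preconditioned correction
print(f"stopped at iteration {k}, "
      f"residual {np.linalg.norm(b - A @ x):.2e}")
```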
STRONG ORACLE OPTIMALITY OF FOLDED CONCAVE PENALIZED ESTIMATION.
Fan, Jianqing; Xue, Lingzhou; Zou, Hui
2014-06-01
Folded concave penalization methods have been shown to enjoy the strong oracle property for high-dimensional sparse estimation. However, a folded concave penalization problem usually has multiple local solutions, and the oracle property is established only for one of the unknown local solutions. A challenging fundamental issue remains: it is not clear whether the local optimum computed by a given optimization algorithm possesses these nice theoretical properties. To close this important theoretical gap, which has stood for over a decade, we provide a unified theory showing explicitly how to obtain the oracle solution via the local linear approximation algorithm. For a folded concave penalized estimation problem, we show that as long as the problem is localizable and the oracle estimator is well behaved, we can obtain the oracle estimator by using the one-step local linear approximation. In addition, once the oracle estimator is obtained, the local linear approximation algorithm converges, namely it produces the same estimator in the next iteration. The general theory is demonstrated on four classical sparse estimation problems, i.e., sparse linear regression, sparse logistic regression, sparse precision matrix estimation and sparse quantile regression.
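The one-step local linear approximation can be sketched concretely for SCAD: the folded concave penalty is linearized at an initial lasso estimate, yielding a weighted lasso solved here by column rescaling. The weight floor and synthetic data are assumptions.

```python
# Sketch: one-step local linear approximation (LLA) for SCAD-penalized
# linear regression. The concave penalty is linearized at an initial
# lasso fit, giving a weighted lasso solved via column rescaling.
import numpy as np
from sklearn.linear_model import Lasso

def scad_deriv(beta, lam, a=3.7):
    b = np.abs(beta)
    return np.where(b <= lam, lam, np.maximum(a * lam - b, 0.0) / (a - 1))

rng = np.random.default_rng(4)
n, p = 200, 50
X = rng.normal(size=(n, p))
beta_true = np.zeros(p); beta_true[:3] = [3.0, -2.0, 1.5]
y = X @ beta_true + rng.normal(0, 1.0, n)

lam = 0.2
beta0 = Lasso(alpha=lam, fit_intercept=False).fit(X, y).coef_  # initial fit
w = scad_deriv(beta0, lam) / lam        # LLA weights (0 => unpenalized)
w = np.maximum(w, 1e-4)                 # floor to keep rescaling finite
# Weighted lasso via substitution beta_j = gamma_j / w_j.
beta1 = Lasso(alpha=lam, fit_intercept=False).fit(X / w, y).coef_ / w
print("nonzeros:", np.flatnonzero(beta1))
```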
Exploring Deep Learning and Sparse Matrix Format Selection
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhao, Y.; Liao, C.; Shen, X.
We propose to explore the use of deep neural networks (DNNs) to address these long-standing barriers. The recent rapid progress of DNN technology has created a large impact in many fields, significantly improving prediction accuracy over traditional machine-learning techniques in image classification, speech recognition, machine translation, and so on. To some degree, these tasks resemble the decision-making in many HPC tasks, including the aforementioned format selection for SpMV and linear solver selection. For instance, sparse matrix format selection is akin to image classification, such as telling whether an image contains a dog or a cat; in both problems, the right decisions are primarily determined by the spatial patterns of the elements in an input. For image classification, the patterns are of pixels, and for sparse matrix format selection, they are of non-zero elements. DNNs can be naturally applied if we regard a sparse matrix as an image and the format or solver selection as a classification problem.
Thin-film sparse boundary array design for passive acoustic mapping during ultrasound therapy.
Coviello, Christian M; Kozick, Richard J; Hurrell, Andrew; Smith, Penny Probert; Coussios, Constantin-C
2012-10-01
A new 2-D hydrophone array for ultrasound therapy monitoring is presented, along with a novel algorithm for passive acoustic mapping using a sparse weighted aperture. The array is constructed using existing polyvinylidene fluoride (PVDF) ultrasound sensor technology, and is utilized for its broadband characteristics and its high receive sensitivity. For most 2-D arrays, high-resolution imagery is desired, which requires a large aperture at the cost of a large number of elements. The proposed array's geometry is sparse, with elements only on the boundary of the rectangular aperture. The missing information from the interior is filled in using linear imaging techniques. After receiving acoustic emissions during ultrasound therapy, the algorithm applies an apodization to the sparse aperture to limit side lobes and then reconstructs acoustic activity with high spatiotemporal resolution. Experiments verify the theoretical point spread function, and cavitation maps in agar phantoms correspond closely to predicted areas, demonstrating the validity of the array and methodology.
Liu, Zhenqiu; Sun, Fengzhu; McGovern, Dermot P
2017-01-01
Feature selection and prediction are the most important tasks for big data mining. The common strategies for feature selection in big data mining are L1, SCAD and MC+. However, none of the existing algorithms optimizes L0, which penalizes the number of nonzero features directly. In this paper, we develop a novel sparse generalized linear model (GLM) with L0 approximation for feature selection and prediction with big omics data. The proposed approach approximates the L0 optimization directly. Even though the original L0 problem is non-convex, it is approximated by sequential convex optimizations with the proposed algorithm. The proposed method is easy to implement, requiring only a few lines of code. Novel adaptive ridge algorithms (L0ADRIDGE) for L0-penalized GLM with ultra-high-dimensional big data are developed. The proposed approach outperforms other cutting-edge regularization methods, including SCAD and MC+, in simulations. When applied to integrated analysis of mRNA, microRNA, and methylation data from TCGA ovarian cancer, multilevel gene signatures associated with suboptimal debulking are identified simultaneously. The biological significance and potential clinical importance of these genes are further explored. The developed software, L0ADRIDGE, in MATLAB is available at https://github.com/liuzqx/L0adridge.
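The adaptive-ridge idea behind this kind of L0 approximation can be sketched generically (this is not the authors' L0ADRIDGE code): each ridge pass reweights the penalty on coefficient j by 1/(beta_j^2 + eps^2), so the effective penalty approaches a count of nonzeros.

```python
# Sketch: adaptive ridge iterations approximating an L0 penalty for a
# linear model. With w_j = 1/(beta_j^2 + eps^2), the effective penalty
# lam * sum(w_j beta_j^2) approaches lam * ||beta||_0 as iterates settle.
import numpy as np

rng = np.random.default_rng(5)
n, p = 150, 40
X = rng.normal(size=(n, p))
beta_true = np.zeros(p); beta_true[[2, 7, 19]] = [2.5, -3.0, 1.8]
y = X @ beta_true + rng.normal(0, 0.5, n)

lam, eps = 1.0, 1e-3
w = np.ones(p)
for _ in range(50):
    beta = np.linalg.solve(X.T @ X + lam * np.diag(w), X.T @ y)
    w = 1.0 / (beta ** 2 + eps ** 2)      # reweight toward L0
beta[np.abs(beta) < 1e-3] = 0.0           # final hard threshold
print("selected features:", np.flatnonzero(beta))
```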
Nakamura, Kengo; Yasutaka, Tetsuo; Kuwatani, Tatsu; Komai, Takeshi
2017-11-01
In this study, we applied sparse multiple linear regression (SMLR) analysis to clarify the relationships between soil properties and adsorption characteristics for a range of soils across Japan, and to identify easily obtained physical and chemical soil properties that could be used to predict the K and n values of cadmium, lead and fluorine. A model was first constructed to predict the K and n values from nine soil parameters (pH, cation exchange capacity, specific surface area, total carbon, soil organic matter from loss on ignition, water holding capacity, and the ratios of sand, silt and clay). The K and n values of cadmium, lead and fluorine for 17 soil samples were used to verify the SMLR models by the root mean square error values obtained from 512 combinations of soil parameters. The SMLR analysis indicated that fluorine adsorption to soil may be associated with organic matter, whereas cadmium or lead adsorption is more likely to be influenced by soil pH and ignition loss (IL). We found that an accurate K value can be predicted from more than three soil parameters for most soils. Approximately 65% of the predicted values were between 33% and 300% of their measured values for the K value; 76% of the predicted values were within ±30% of their measured values for the n value. Our findings suggest that the adsorption properties of lead, cadmium and fluorine to soil can be predicted from soil physical and chemical properties using the presented models.
Parameter Estimation and Model Selection for Indoor Environments Based on Sparse Observations
NASA Astrophysics Data System (ADS)
Dehbi, Y.; Loch-Dehbi, S.; Plümer, L.
2017-09-01
This paper presents a novel method for parameter estimation and model selection for the reconstruction of indoor environments based on sparse observations. While most approaches to the reconstruction of indoor models rely on dense observations, we predict scenes of the interior with high accuracy in the absence of indoor measurements. We use a model-based top-down approach and incorporate strong but profound prior knowledge. The latter includes probability density functions for model parameters and sparse observations such as room areas and the building footprint. The floorplan model is characterized by linear and bi-linear relations with discrete and continuous parameters. We focus on the stochastic estimation of model parameters based on a topological model derived by combinatorial reasoning in a first step. A Gauss-Markov model is applied for estimation and simulation of the model parameters. Symmetries are represented and exploited during the estimation process. Background knowledge as well as observations are incorporated in a maximum likelihood estimation, and model selection is performed with AIC/BIC. The likelihood is also used for the detection and correction of potential errors in the topological model. Estimation results are presented and discussed.
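The AIC/BIC selection step can be sketched for Gaussian linear models, where both criteria reduce to the residual sum of squares plus a complexity term; the candidate feature sets and data are illustrative.

```python
# Sketch: comparing candidate linear models with AIC/BIC under Gaussian
# errors: score = n*log(RSS/n) + penalty*k, with penalty 2 (AIC) or
# log(n) (BIC). Candidate feature sets are illustrative.
import numpy as np

rng = np.random.default_rng(6)
n = 100
X = rng.normal(size=(n, 4))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(0, 0.5, n)

for cols in [[0], [0, 1], [0, 1, 2, 3]]:
    Xc = X[:, cols]
    beta, *_ = np.linalg.lstsq(Xc, y, rcond=None)
    rss = np.sum((y - Xc @ beta) ** 2)
    k = len(cols)
    aic = n * np.log(rss / n) + 2 * k
    bic = n * np.log(rss / n) + np.log(n) * k
    print(f"features {cols}: AIC={aic:.1f}  BIC={bic:.1f}")
```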
NASA Astrophysics Data System (ADS)
Boucher, Thomas F.; Ozanne, Marie V.; Carmosino, Marco L.; Dyar, M. Darby; Mahadevan, Sridhar; Breves, Elly A.; Lepore, Kate H.; Clegg, Samuel M.
2015-05-01
The ChemCam instrument on the Mars Curiosity rover is generating thousands of LIBS spectra and bringing interest in this technique to public attention. The key to interpreting Mars or any other type of LIBS data is calibrations that relate laboratory standards to unknowns examined in other settings and enable predictions of chemical composition. Here, LIBS spectral data are analyzed using linear regression methods including partial least squares (PLS-1 and PLS-2), principal component regression (PCR), least absolute shrinkage and selection operator (lasso), elastic net, and linear support vector regression (SVR-Lin). These were compared against results from nonlinear regression methods including kernel principal component regression (K-PCR), polynomial kernel support vector regression (SVR-Py) and k-nearest neighbor (kNN) regression to discern the most effective models for interpreting chemical abundances from LIBS spectra of geological samples. The results were evaluated for 100 samples analyzed with 50 laser pulses at each of five locations averaged together. Wilcoxon signed-rank tests were employed to evaluate the statistical significance of differences among the nine models, using their predicted residual sum of squares (PRESS) to make comparisons. For MgO, SiO2, Fe2O3, CaO, and MnO, the sparse models outperform all the others except for linear SVR, while for Na2O, K2O, TiO2, and P2O5, the sparse methods produce inferior results, likely because their emission lines in this energy range have lower transition probabilities. The strong performance of the sparse methods in this study suggests that using dimensionality-reduction techniques as a preprocessing step may improve the performance of the linear models. Nonlinear methods tend to overfit the data and predict less accurately, while the linear methods proved to be more generalizable with better predictive performance. These results are attributed to the high dimensionality of the data (6144 channels) relative to the small number of samples studied. The best-performing models were SVR-Lin for SiO2, MgO, Fe2O3, and Na2O, lasso for Al2O3, elastic net for MnO, and PLS-1 for CaO, TiO2, and K2O. Although these differences in model performance between methods were identified, most of the models produce comparable results when p ≤ 0.05, and all techniques except kNN produced statistically indistinguishable results. It is likely that a combination of models could be used together to yield a lower total error of prediction, depending on the requirements of the user.
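The PRESS-plus-Wilcoxon protocol can be sketched generically for two regression models: accumulate squared out-of-fold prediction errors, then test the paired errors; the models and synthetic data below are stand-ins, not the LIBS setup.

```python
# Sketch: comparing two regression models by cross-validated PRESS
# (predicted residual sum of squares) and a Wilcoxon signed-rank test
# on paired squared errors. Models and data are illustrative stand-ins.
import numpy as np
from scipy.stats import wilcoxon
from sklearn.cross_decomposition import PLSRegression
from sklearn.linear_model import Lasso
from sklearn.model_selection import KFold

rng = np.random.default_rng(7)
n, p = 100, 300                        # few samples, many channels
X = rng.normal(size=(n, p))
y = X[:, :10] @ rng.normal(0, 1, 10) + rng.normal(0, 0.5, n)

models = {"PLS": PLSRegression(n_components=5), "lasso": Lasso(alpha=0.1)}
errs = {name: np.empty(n) for name in models}
for tr, te in KFold(5, shuffle=True, random_state=0).split(X):
    for name, m in models.items():
        m.fit(X[tr], y[tr])
        errs[name][te] = (np.ravel(m.predict(X[te])) - y[te]) ** 2

for name, e in errs.items():
    print(f"{name}: PRESS = {e.sum():.1f}")
print("Wilcoxon p-value:", wilcoxon(errs["PLS"], errs["lasso"]).pvalue)
```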
Saravanan, Chandra; Shao, Yihan; Baer, Roi; Ross, Philip N; Head-Gordon, Martin
2003-04-15
A sparse matrix multiplication scheme with multiatom blocks is reported, a tool that can be very useful for developing linear-scaling methods with atom-centered basis functions. Compared to conventional element-by-element sparse matrix multiplication schemes, efficiency is gained by the use of the highly optimized basic linear algebra subroutines (BLAS). However, some sparsity is lost in the multiatom blocking scheme because these matrix blocks will in general contain negligible elements. As a result, there is an optimal block size that minimizes CPU time by balancing these two effects. In calculations on linear alkanes, polyglycines, estane polymers, and water clusters the optimal block size is found to be between 40 and 100 basis functions, where about 55-75% of the machine peak performance was achieved on an IBM RS6000 workstation. In these calculations, the blocked sparse matrix multiplications can be 10 times faster than a standard element-by-element sparse matrix package.
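The multiatom-block idea can be sketched with a dictionary of dense blocks: only stored (nonzero) blocks participate, and each block product is a dense matrix multiplication that NumPy delegates to BLAS. The block size and fill fraction are illustrative.

```python
# Sketch: block-sparse matrix multiplication. Matrices are stored as
# dicts mapping (block_row, block_col) -> dense block; each block
# product is a dense matmul (delegated to BLAS via NumPy), and absent
# blocks (the sparsity) are skipped entirely.
import numpy as np

def block_sparse_matmul(A, B, nblocks):
    C = {}
    for (i, j), Ab in A.items():
        for k in range(nblocks):
            Bb = B.get((j, k))
            if Bb is not None:
                C[(i, k)] = C.get((i, k), 0) + Ab @ Bb
    return C

rng = np.random.default_rng(8)
bs, nb = 64, 8                         # block size, blocks per dimension
A = {(i, j): rng.normal(size=(bs, bs))
     for i in range(nb) for j in range(nb) if rng.random() < 0.2}
B = {(i, j): rng.normal(size=(bs, bs))
     for i in range(nb) for j in range(nb) if rng.random() < 0.2}
C = block_sparse_matmul(A, B, nb)
print("stored blocks:", len(A), len(B), "->", len(C))
```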
Sparse Logistic Regression for Diagnosis of Liver Fibrosis in Rat by Using SCAD-Penalized Likelihood
Yan, Fang-Rong; Lin, Jin-Guan; Liu, Yu
2011-01-01
The objective of the present study is to determine the quantitative relationship between the progression of liver fibrosis and the levels of certain serum markers using a mathematical model. We provide sparse logistic regression with a smoothly clipped absolute deviation (SCAD) penalized function to diagnose liver fibrosis in rats. Not only does it give a sparse solution with high accuracy, it also provides the user with precise classification probabilities along with the class information. In a simulated case and an experimental case, the proposed method is comparable to stepwise linear discriminant analysis (SLDA) and to sparse logistic regression with the least absolute shrinkage and selection operator (LASSO) penalty, as assessed by receiver operating characteristic (ROC) analysis with Bayesian bootstrap estimation of the area under the curve (AUC) for diagnostic sensitivity on the selected variables. Results show that the new approach provides a good correlation between serum marker levels and liver fibrosis induced by thioacetamide (TAA) in rats. Meanwhile, this approach might also be used to predict the development of liver cirrhosis.
A General Sparse Tensor Framework for Electronic Structure Theory
Manzer, Samuel; Epifanovsky, Evgeny; Krylov, Anna I.; ...
2017-01-24
Linear-scaling algorithms must be developed in order to extend the domain of applicability of electronic structure theory to molecules of any desired size. However, the increasing complexity of modern linear-scaling methods makes code development and maintenance a significant challenge. A major contributor to this difficulty is the lack of robust software abstractions for handling block-sparse tensor operations. We therefore report the development of a highly efficient symbolic block-sparse tensor library in order to provide access to high-level software constructs to treat such problems. Our implementation supports arbitrary multi-dimensional sparsity in all input and output tensors. We avoid cumbersome machine-generated code by implementing all functionality as a high-level symbolic C++ language library, and demonstrate that our implementation attains very high performance for linear-scaling sparse tensor contractions.
AZTEC. Parallel Iterative method Software for Solving Linear Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hutchinson, S.; Shadid, J.; Tuminaro, R.
1995-07-01
AZTEC is an iterative-solver library that greatly simplifies the parallelization process when solving the linear system of equations Ax=b, where A is a user-supplied n x n sparse matrix, b is a user-supplied vector of length n, and x is a vector of length n to be computed. AZTEC is intended as a software tool for users who want to avoid cumbersome parallel programming details but who have large sparse linear systems that require an efficiently utilized parallel processing system. A collection of data transformation tools is provided that allows easy creation of distributed sparse unstructured matrices for parallel solution.
Nonlinear Estimation With Sparse Temporal Measurements
2016-09-01
The Kalman filter, the extended Kalman filter (EKF), and the unscented Kalman filter (UKF) are commonly used in practical applications. The Kalman filter is an optimal estimator for linear systems; the EKF and UKF are sub-optimal approximations of it. The EKF uses a first-order Taylor series expansion to linearize the system. The propagated covariance is compared for similarity with a Monte Carlo propagation, and the similarity of the covariance matrices is shown to predict filter performance.
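The covariance-similarity check can be sketched for a single nonlinear propagation step, comparing the EKF's linearized covariance with a Monte Carlo sample covariance; the polar-to-Cartesian map is a standard textbook example, not the report's system.

```python
# Sketch: EKF-style linearized covariance propagation vs. Monte Carlo
# through a nonlinear map (polar -> Cartesian). Large disagreement
# between the two covariances signals that the linearization is poor.
import numpy as np

def f(x):            # x = [range, bearing]
    return np.array([x[0] * np.cos(x[1]), x[0] * np.sin(x[1])])

def jacobian(x):
    r, th = x
    return np.array([[np.cos(th), -r * np.sin(th)],
                     [np.sin(th),  r * np.cos(th)]])

mean = np.array([10.0, 0.5])
P = np.diag([0.01, 0.09])            # large bearing uncertainty

J = jacobian(mean)
P_ekf = J @ P @ J.T                  # first-order (EKF) propagation

rng = np.random.default_rng(9)
xs = rng.multivariate_normal(mean, P, size=100_000)
ys = np.array([f(x) for x in xs])
P_mc = np.cov(ys.T)                  # Monte Carlo reference

diff = np.linalg.norm(P_ekf - P_mc) / np.linalg.norm(P_mc)
print(f"relative covariance difference: {diff:.2%}")
```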
Pinski, Peter; Riplinger, Christoph; Valeev, Edward F; Neese, Frank
2015-07-21
In this work, a systematic infrastructure is described that formalizes concepts implicit in previous work and greatly simplifies computer implementation of reduced-scaling electronic structure methods. The key concept is sparse representation of tensors using chains of sparse maps between two index sets. Sparse map representation can be viewed as a generalization of compressed sparse row, a common representation of a sparse matrix, to tensor data. By combining few elementary operations on sparse maps (inversion, chaining, intersection, etc.), complex algorithms can be developed, illustrated here by a linear-scaling transformation of three-center Coulomb integrals based on our compact code library that implements sparse maps and operations on them. The sparsity of the three-center integrals arises from spatial locality of the basis functions and domain density fitting approximation. A novel feature of our approach is the use of differential overlap integrals computed in linear-scaling fashion for screening products of basis functions. Finally, a robust linear scaling domain based local pair natural orbital second-order Møller-Plesset (DLPNO-MP2) method is described based on the sparse map infrastructure that only depends on a minimal number of cutoff parameters that can be systematically tightened to approach 100% of the canonical MP2 correlation energy. With default truncation thresholds, DLPNO-MP2 recovers more than 99.9% of the canonical resolution of the identity MP2 (RI-MP2) energy while still showing a very early crossover with respect to the computational effort. Based on extensive benchmark calculations, relative energies are reproduced with an error of typically <0.2 kcal/mol. The efficiency of the local MP2 (LMP2) method can be drastically improved by carrying out the LMP2 iterations in a basis of pair natural orbitals. While the present work focuses on local electron correlation, it is of much broader applicability to computation with sparse tensors in quantum chemistry and beyond.
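The elementary sparse-map operations named above (inversion, chaining, intersection) can be sketched with plain dictionaries of index sets; this is a conceptual illustration, not the authors' C++ library, and the example maps are invented.

```python
# Sketch: sparse maps as dicts from an index to a set of related indices,
# with the elementary operations (inversion, chaining, intersection)
# from which more complex sparsity logic is composed.
def invert(m):
    out = {}
    for i, js in m.items():
        for j in js:
            out.setdefault(j, set()).add(i)
    return out

def chain(m1, m2):
    # i -> union of m2[j] over j in m1[i]
    return {i: set().union(*(m2.get(j, set()) for j in js))
            for i, js in m1.items()}

def intersect(m1, m2):
    return {i: m1[i] & m2[i] for i in m1.keys() & m2.keys()}

# Invented example: atoms -> nearby shells, shells -> auxiliary functions.
atom_to_shell = {0: {0, 1}, 1: {1, 2}}
shell_to_aux = {0: {10, 11}, 1: {11, 12}, 2: {13}}
print(chain(atom_to_shell, shell_to_aux))   # atom -> aux functions
print(invert(atom_to_shell))                # shell -> atoms
print(intersect(atom_to_shell, {0: {1}, 1: {2, 9}}))
```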
Sparse matrix methods based on orthogonality and conjugacy
NASA Technical Reports Server (NTRS)
Lawson, C. L.
1973-01-01
A matrix having a high percentage of zero elements is called sparse. In the solution of systems of linear equations or linear least-squares problems involving large sparse matrices, significant savings in computational cost can be achieved by taking advantage of the sparsity. The conjugate gradient algorithm and a set of related algorithms are described.
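For concreteness, a minimal conjugate gradient implementation for a symmetric positive definite sparse system is sketched below; the tridiagonal test matrix is an assumption.

```python
# Sketch: conjugate gradient for a symmetric positive definite sparse
# system, exploiting sparsity through matrix-vector products only.
import numpy as np
import scipy.sparse as sp

def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    x = np.zeros_like(b)
    r = b - A @ x
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

n = 500
A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csr")
b = np.ones(n)
x = conjugate_gradient(A, b)
print("residual:", np.linalg.norm(b - A @ x))
```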
Abraham, Gad; Kowalczyk, Adam; Zobel, Justin; Inouye, Michael
2013-02-01
A central goal of medical genetics is to accurately predict complex disease from genotypes. Here, we present a comprehensive analysis of simulated and real data using lasso and elastic-net penalized support-vector machine models, a mixed-effects linear model, a polygenic score, and unpenalized logistic regression. In simulation, the sparse penalized models achieved lower false-positive rates and higher precision than the other methods for detecting causal SNPs. The common practice of prefiltering SNP lists for subsequent penalized modeling was examined and shown to substantially reduce the ability to recover the causal SNPs. Using genome-wide SNP profiles across eight complex diseases within cross-validation, lasso and elastic-net models achieved substantially better predictive ability in celiac disease, type 1 diabetes, and Crohn's disease, and had equivalent predictive ability in the rest, with the results in celiac disease strongly replicating between independent datasets. We investigated the effect of linkage disequilibrium on the predictive models, showing that the penalized methods leverage this information to their advantage, compared with methods that assume SNP independence. Our findings show that sparse penalized approaches are robust across different disease architectures, producing as good as or better phenotype predictions and variance explained. This has fundamental ramifications for the selection and future development of methods to genetically predict human disease.
Exhaustive Search for Sparse Variable Selection in Linear Regression
NASA Astrophysics Data System (ADS)
Igarashi, Yasuhiko; Takenaka, Hikaru; Nakanishi-Ohno, Yoshinori; Uemura, Makoto; Ikeda, Shiro; Okada, Masato
2018-04-01
We propose a K-sparse exhaustive search (ES-K) method and a K-sparse approximate exhaustive search method (AES-K) for selecting variables in linear regression. With these methods, K-sparse combinations of variables are tested exhaustively, assuming that the optimal combination of explanatory variables is K-sparse. By collecting the results of exhaustively computing ES-K, various approximate methods for selecting sparse variables can be summarized as a density of states. With this density of states, we can compare different methods for selecting sparse variables, such as relaxation and sampling. For large problems, where the combinatorial explosion of explanatory variables is crucial, the AES-K method enables the density of states to be reconstructed effectively using the replica-exchange Monte Carlo method and the multiple histogram method. Applying the ES-K and AES-K methods to type Ia supernova data, we confirmed the conventional understanding in astronomy when an appropriate K is given beforehand. However, we found it difficult to determine K from the data alone. Using virtual measurement and analysis, we argue that this is caused by data shortage.
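A direct sketch of the ES-K idea for an assumed K: fit ordinary least squares on every K-subset of variables and rank the subsets by residual error (the replica-exchange machinery of AES-K is not shown).

```python
# Sketch: K-sparse exhaustive search (ES-K). Every K-combination of
# explanatory variables is fit by least squares; combinations are then
# ranked by residual sum of squares. Problem sizes are illustrative.
import itertools
import numpy as np

rng = np.random.default_rng(10)
n, p, K = 80, 12, 2
X = rng.normal(size=(n, p))
y = 1.5 * X[:, 3] - 2.0 * X[:, 8] + rng.normal(0, 0.3, n)

results = []
for cols in itertools.combinations(range(p), K):
    Xc = X[:, cols]
    beta, *_ = np.linalg.lstsq(Xc, y, rcond=None)
    rss = float(np.sum((y - Xc @ beta) ** 2))
    results.append((rss, cols))

results.sort()                      # density-of-states-style ranking
for rss, cols in results[:3]:
    print(f"variables {cols}: RSS = {rss:.2f}")
```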
High-SNR spectrum measurement based on Hadamard encoding and sparse reconstruction
NASA Astrophysics Data System (ADS)
Wang, Zhaoxin; Yue, Jiang; Han, Jing; Li, Long; Jin, Yong; Gao, Yuan; Li, Baoming
2017-12-01
The denoising capabilities of the H-matrix and the cyclic S-matrix based on sparse reconstruction, as employed in the Pixel of Focal Plane Coded Visible Spectrometer for spectrum measurement, are investigated, where the spectrum is sparse in a known basis. In the measurement process, the digital micromirror device plays an important role, implementing the Hadamard coding. In contrast with Hadamard transform spectrometry, which is based on shift invariance, this spectrometer may have the advantage of high efficiency. Simulations and experiments show that the nonlinear solution with sparse reconstruction has a better signal-to-noise ratio than the linear solution, and that the H-matrix outperforms the cyclic S-matrix whether the reconstruction method is nonlinear or linear.
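The encoding-and-reconstruction contrast can be sketched in one dimension: measure a sparse spectrum through a Hadamard matrix, then compare the plain linear inverse with a sparse (l1) reconstruction. The problem size and noise level are assumptions.

```python
# Sketch: Hadamard-encoded measurement of a sparse spectrum, comparing
# the linear inverse (H^T y / n) with a sparse l1 reconstruction.
import numpy as np
from scipy.linalg import hadamard
from sklearn.linear_model import Lasso

rng = np.random.default_rng(11)
n = 64
H = hadamard(n).astype(float)
x = np.zeros(n)
x[[5, 20, 41]] = [1.0, 0.6, 0.8]           # sparse spectrum
y = H @ x + rng.normal(0, 0.5, n)          # noisy encoded measurements

x_lin = H.T @ y / n                        # linear inverse (H^T H = n I)
x_sparse = Lasso(alpha=0.05, fit_intercept=False).fit(H, y).coef_

for name, xh in [("linear", x_lin), ("sparse", x_sparse)]:
    print(f"{name} reconstruction error: {np.linalg.norm(xh - x):.3f}")
```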
Iterative algorithms for large sparse linear systems on parallel computers
NASA Technical Reports Server (NTRS)
Adams, L. M.
1982-01-01
Algorithms are developed for assembling in parallel the sparse systems of linear equations that result from finite difference or finite element discretizations of elliptic partial differential equations, such as those that arise in structural engineering. Parallel linear stationary iterative algorithms and parallel preconditioned conjugate gradient algorithms are developed for solving these systems. In addition, a model for comparing parallel algorithms on array architectures is developed, and results of this model for the algorithms are given.
Huang, Jian; Zhang, Cun-Hui
2013-01-01
The ℓ1-penalized method, or the Lasso, has emerged as an important tool for the analysis of large data sets. Many important results have been obtained for the Lasso in linear regression which have led to a deeper understanding of high-dimensional statistical problems. In this article, we consider a class of weighted ℓ1-penalized estimators for convex loss functions of a general form, including the generalized linear models. We study the estimation, prediction, selection and sparsity properties of the weighted ℓ1-penalized estimator in sparse, high-dimensional settings where the number of predictors p can be much larger than the sample size n. Adaptive Lasso is considered as a special case. A multistage method is developed to approximate concave regularized estimation by applying an adaptive Lasso recursively. We provide prediction and estimation oracle inequalities for single- and multi-stage estimators, a general selection consistency theorem, and an upper bound for the dimension of the Lasso estimator. Important models including the linear regression, logistic regression and log-linear models are used throughout to illustrate the applications of the general results.
LSRN: A PARALLEL ITERATIVE SOLVER FOR STRONGLY OVER- OR UNDERDETERMINED SYSTEMS
Meng, Xiangrui; Saunders, Michael A.; Mahoney, Michael W.
2014-01-01
We describe a parallel iterative least squares solver named LSRN that is based on random normal projection. LSRN computes the min-length solution to min_{x∈ℝ^n} ‖Ax − b‖_2, where A ∈ ℝ^{m×n} with m ≫ n or m ≪ n, and where A may be rank-deficient. Tikhonov regularization may also be included. Since A is involved only in matrix-matrix and matrix-vector multiplications, it can be a dense or sparse matrix or a linear operator, and LSRN automatically speeds up when A is sparse or a fast linear operator. The preconditioning phase consists of a random normal projection, which is embarrassingly parallel, and a singular value decomposition of size ⌈γ min(m, n)⌉ × min(m, n), where γ is moderately larger than 1, e.g., γ = 2. We prove that the preconditioned system is well-conditioned, with a strong concentration result on the extreme singular values, and hence that the number of iterations is fully predictable when we apply LSQR or the Chebyshev semi-iterative method. As we demonstrate, the Chebyshev method is particularly efficient for solving large problems on clusters with high communication cost. Numerical results show that on a shared-memory machine, LSRN is very competitive with LAPACK’s DGELSD and a fast randomized least squares solver called Blendenpik on large dense problems, and it outperforms the least squares solver from SuiteSparseQR on sparse problems without sparsity patterns that can be exploited to reduce fill-in. Further experiments show that LSRN scales well on an Amazon Elastic Compute Cloud cluster.
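The core preconditioning step of LSRN can be sketched for the m >> n case: sketch A with a Gaussian projection, take the SVD of the small sketch, and run LSQR on the preconditioned operator. This is a NumPy/SciPy illustration of the idea, not the parallel LSRN code.

```python
# Sketch: LSRN-style randomized preconditioning for an overdetermined
# least-squares problem (m >> n). A Gaussian sketch G A is decomposed,
# N = V Sigma^{-1} preconditions A, and LSQR solves min ||A N z - b||.
import numpy as np
from scipy.sparse.linalg import LinearOperator, lsqr

rng = np.random.default_rng(12)
m, n, gamma = 5000, 50, 2.0
A = rng.normal(size=(m, n))
b = A @ rng.normal(size=n) + 0.01 * rng.normal(size=m)

s = int(gamma * n)
G = rng.normal(size=(s, m)) / np.sqrt(s)   # random normal projection
_, sig, Vt = np.linalg.svd(G @ A, full_matrices=False)
N = Vt.T / sig                             # preconditioner, shape (n, n)

AN = LinearOperator((m, n), matvec=lambda z: A @ (N @ z),
                    rmatvec=lambda r: N.T @ (A.T @ r))
z = lsqr(AN, b)[0]
x = N @ z
print("residual:", np.linalg.norm(A @ x - b))
```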
Sparse network-based models for patient classification using fMRI
Rosa, Maria J.; Portugal, Liana; Hahn, Tim; Fallgatter, Andreas J.; Garrido, Marta I.; Shawe-Taylor, John; Mourao-Miranda, Janaina
2015-01-01
Pattern recognition applied to whole-brain neuroimaging data, such as functional Magnetic Resonance Imaging (fMRI), has proved successful at discriminating psychiatric patients from healthy participants. However, predictive patterns obtained from whole-brain voxel-based features are difficult to interpret in terms of the underlying neurobiology. Many psychiatric disorders, such as depression and schizophrenia, are thought to be brain connectivity disorders. Therefore, pattern recognition based on network models might provide deeper insights and potentially more powerful predictions than whole-brain voxel-based approaches. Here, we build a novel sparse network-based discriminative modeling framework, based on Gaussian graphical models and L1-norm regularized linear Support Vector Machines (SVM). In addition, the proposed framework is optimized in terms of both predictive power and reproducibility/stability of the patterns. Our approach aims to provide better pattern interpretation than voxel-based whole-brain approaches by yielding stable brain connectivity patterns that underlie discriminative changes in brain function between the groups. We illustrate our technique by classifying patients with major depressive disorder (MDD) and healthy participants, in two (event- and block-related) fMRI datasets acquired while participants performed a gender discrimination and emotional task, respectively, during the visualization of emotionally valent faces.
Low-Rank Correction Methods for Algebraic Domain Decomposition Preconditioners
Li, Ruipeng; Saad, Yousef
2017-08-01
This study presents a parallel preconditioning method for distributed sparse linear systems, based on an approximate inverse of the original matrix, that adopts a general framework of distributed sparse matrices and exploits domain decomposition (DD) and low-rank corrections. The DD approach decouples the matrix and, once inverted, a low-rank approximation is applied by exploiting the Sherman-Morrison-Woodbury formula, which yields two variants of the preconditioning methods. The low-rank expansion is computed by the Lanczos procedure with reorthogonalizations. Numerical experiments indicate that, when combined with Krylov subspace accelerators, this preconditioner can be efficient and robust for solving symmetric sparse linear systems. Comparisons with pARMS, a DD-based parallel incomplete LU (ILU) preconditioning method, are presented for solving Poisson's equation and linear elasticity problems.
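The Sherman-Morrison-Woodbury mechanism at the heart of the low-rank correction can be sketched directly: if A = B + U V^T with B easy to factor, solves with A need only solves with B and one small dense system. The matrix sizes and rank are illustrative.

```python
# Sketch: solving (B + U V^T) x = b via Sherman-Morrison-Woodbury, the
# low-rank correction mechanism referenced above:
#   x = B^{-1} b - B^{-1} U (I + V^T B^{-1} U)^{-1} V^T B^{-1} b
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

rng = np.random.default_rng(13)
n, k = 400, 10
B = sp.diags([-1.0, 4.0, -1.0], [-1, 0, 1], shape=(n, n), format="csc")
U = rng.normal(size=(n, k)) / np.sqrt(n)
V = rng.normal(size=(n, k)) / np.sqrt(n)
b = rng.normal(size=n)

lu = spla.splu(B)                          # factor the easy part once
Binv_b = lu.solve(b)
Binv_U = lu.solve(U)
small = np.eye(k) + V.T @ Binv_U           # k x k capacitance matrix
x = Binv_b - Binv_U @ np.linalg.solve(small, V.T @ Binv_b)

A = B.toarray() + U @ V.T
print("error vs dense solve:", np.linalg.norm(x - np.linalg.solve(A, b)))
```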
Improving the energy efficiency of sparse linear system solvers on multicore and manycore systems.
Anzt, H; Quintana-Ortí, E S
2014-06-28
While most recent breakthroughs in scientific research rely on complex simulations carried out in large-scale supercomputers, the power draft and energy spent for this purpose is increasingly becoming a limiting factor to this trend. In this paper, we provide an overview of the current status in energy-efficient scientific computing by reviewing different technologies used to monitor power draft as well as power- and energy-saving mechanisms available in commodity hardware. For the particular domain of sparse linear algebra, we analyse the energy efficiency of a broad collection of hardware architectures and investigate how algorithmic and implementation modifications can improve the energy performance of sparse linear system solvers, without negatively impacting their performance.
New machine-learning algorithms for prediction of Parkinson's disease
NASA Astrophysics Data System (ADS)
Mandal, Indrajit; Sairam, N.
2014-03-01
This article presents enhanced prediction accuracy for the diagnosis of Parkinson's disease (PD), aimed at preventing delay and misdiagnosis of patients, using the proposed robust inference system. New machine-learning methods are proposed, and performance comparisons are based on specificity, sensitivity, accuracy and other measurable parameters. The robust methods for diagnosing Parkinson's disease (PD) include sparse multinomial logistic regression, rotation forest ensembles with support vector machines and principal component analysis, artificial neural networks, and boosting methods. A new ensemble method, comprising a Bayesian network optimized by a Tabu search algorithm as the classifier and Haar wavelets as the projection filter, is used for relevant feature selection and ranking. The highest accuracy obtained by linear logistic regression and sparse multinomial logistic regression is 100%, with sensitivity and specificity of 0.983 and 0.996, respectively. All experiments are conducted at 95% and 99% confidence levels, and the results are established with corrected t-tests. This work shows a high degree of advancement in software reliability and quality of the computer-aided diagnosis system, and experimentally shows best results with supportive statistical inference.
NASA Astrophysics Data System (ADS)
Kuai, Xiao-yan; Sun, Hai-xin; Qi, Jie; Cheng, En; Xu, Xiao-ka; Guo, Yu-hui; Chen, You-gan
2014-06-01
In this paper, we investigate the performance of an adaptive modulation (AM) orthogonal frequency division multiplexing (OFDM) system in underwater acoustic (UWA) communications. The aim is to solve the problem of large feedback overhead for channel state information (CSI) on every subcarrier. A novel CSI feedback scheme is proposed based on the theory of compressed sensing (CS): the receiver feeds back only the sparse channel parameters. Additionally, prediction of the channel state every several symbols is proposed to make AM practical, and we describe a linear channel prediction algorithm used in adaptive transmission. The system has been tested in a real underwater acoustic channel. Linear channel prediction makes AM transmission techniques more feasible for acoustic channel communications. Simulation and experiment show that significant improvements can be obtained in both bit error rate (BER) and throughput with the AM scheme compared with a fixed quadrature phase shift keying (QPSK) modulation scheme. Moreover, the performance with standard CS outperforms the discrete cosine transform (DCT) method.
Speech coding at low to medium bit rates
NASA Astrophysics Data System (ADS)
Leblanc, Wilfred Paul
1992-09-01
Improved search techniques coupled with improved codebook design methodologies are proposed to improve the performance of conventional code-excited linear predictive coders for speech. Improved methods for quantizing the short-term filter are developed by applying a tree search algorithm and joint codebook design to multistage vector quantization. Joint codebook design procedures are developed to design locally optimal multistage codebooks. Weighting during centroid computation is introduced to improve the outlier performance of the multistage vector quantizer. Multistage vector quantization is shown to be robust both to input characteristics and in the presence of channel errors. Spectral distortions of about 1 dB are obtained at rates of 22-28 bits/frame. Structured codebook design procedures for excitation in code-excited linear predictive coders are compared to general codebook design procedures. Little is lost by using significant structure in the excitation codebooks, while the search complexity is greatly reduced. Sparse multistage configurations are proposed for reducing computational complexity and memory size. Improved search procedures are applied to code-excited linear prediction, attempting joint optimization of the short-term filter, the adaptive codebook, and the excitation. Improvements in signal-to-noise ratio of 1-2 dB are realized in practice.
Efficient convolutional sparse coding
Wohlberg, Brendt
2017-06-20
Computationally efficient algorithms may be applied for fast dictionary learning solving the convolutional sparse coding problem in the Fourier domain. More specifically, efficient convolutional sparse coding may be derived within an alternating direction method of multipliers (ADMM) framework that utilizes fast Fourier transforms (FFT) to solve the main linear system in the frequency domain. Such algorithms may enable a significant reduction in computational cost over conventional approaches by implementing a linear solver for the most critical and computationally expensive component of the conventional iterative algorithm. The theoretical computational cost of the algorithm may be reduced from O(M³N) to O(MN log N), where N is the dimensionality of the data and M is the number of elements in the dictionary. This significant improvement in efficiency may greatly increase the range of problems that can practically be addressed via convolutional sparse representations.
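The critical step is the frequency-domain linear solve: at each frequency f the system matrix is a rank-one update of a scaled identity, (d_f d_f^H + rho*I) x_f = b_f, so the Sherman-Morrison formula gives the solution in closed form. A minimal sketch for a single-channel signal follows (variable names and shapes are assumptions, not the patented implementation):

import numpy as np

def freq_domain_solve(D, B, rho):
    # Solve (d_f d_f^H + rho*I) x_f = b_f independently per frequency.
    # D, B: (M, N) arrays holding the DFTs of the M dictionary filters and
    # M right-hand sides; cost is O(M*N) after the O(M*N*log N) FFTs.
    c = (np.conj(D) * B).sum(axis=0) / (rho + (np.abs(D) ** 2).sum(axis=0))
    return (B - D * c) / rho

rng = np.random.default_rng(0)
M, N = 16, 256
D = np.fft.fft(rng.normal(size=(M, N)), axis=1)
B = np.fft.fft(rng.normal(size=(M, N)), axis=1)
X = freq_domain_solve(D, B, rho=1.0)

This closed-form solve is what replaces the O(M³N) direct factorization and yields the O(MN log N) total cost quoted above.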
Hine, N D M; Haynes, P D; Mostofi, A A; Payne, M C
2010-09-21
We present calculations of formation energies of defects in an ionic solid (Al₂O₃) extrapolated to the dilute limit, corresponding to a simulation cell of infinite size. The large-scale calculations required for this extrapolation are enabled by developments in the approach to parallel sparse matrix algebra operations, which are central to linear-scaling density-functional theory calculations. The computational cost of manipulating sparse matrices, whose sizes are determined by the large number of basis functions present, is greatly reduced with this new approach. We present details of the sparse algebra scheme implemented in the ONETEP code using hierarchical sparsity patterns, and demonstrate its use in calculations on a wide range of systems, involving thousands of atoms on hundreds to thousands of parallel processes.
Summer Proceedings 2016: The Center for Computing Research at Sandia National Laboratories
DOE Office of Scientific and Technical Information (OSTI.GOV)
Carleton, James Brian; Parks, Michael L.
Solving sparse linear systems from the discretization of elliptic partial differential equations (PDEs) is an important building block in many engineering applications. Sparse direct solvers can solve general linear systems, but are usually slower and use much more memory than effective iterative solvers. To overcome these two disadvantages, a hierarchical solver (LoRaSp) based on H2-matrices was introduced in [22]. Here, we have developed a parallel version of the algorithm in LoRaSp to solve large sparse matrices on distributed memory machines. On a single processor, the factorization time of our parallel solver scales almost linearly with the problem size for three-dimensional problems, as opposed to the quadratic scalability of many existing sparse direct solvers. Moreover, our solver leads to almost constant numbers of iterations when used as a preconditioner for Poisson problems. On more than one processor, our algorithm has significant speedups compared to sequential runs. With this parallel algorithm, we are able to solve large problems much faster than many existing packages, as demonstrated by the numerical experiments.
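To illustrate the role such a hierarchical factorization plays, the sketch below uses scipy's incomplete-LU factorization as a stand-in preconditioner for conjugate gradients on a 2D Poisson matrix (an analogy only; LoRaSp's H2-based factorization is a different and more scalable construction):

import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 50
T = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(n, n))
A = (sp.kron(sp.identity(n), T) + sp.kron(T, sp.identity(n))).tocsc()  # 5-point Poisson
b = np.ones(A.shape[0])

ilu = spla.spilu(A, drop_tol=1e-3)            # stand-in for the hierarchical solver
M = spla.LinearOperator(A.shape, ilu.solve)   # apply the factorization as preconditioner
x, info = spla.cg(A, b, M=M)                  # info == 0 on convergence

A good preconditioner keeps the CG iteration count nearly constant as the problem grows, which is the behavior reported above for Poisson problems.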
Sparse signals recovered by non-convex penalty in quasi-linear systems.
Cui, Angang; Li, Haiyang; Wen, Meng; Peng, Jigen
2018-01-01
The goal of compressed sensing is to reconstruct a sparse signal from a few linear measurements, far fewer than the dimension of the ambient space of the signal. However, many real-life applications in physics and the biomedical sciences carry strongly nonlinear structures, and the linear model is no longer suitable. Compared with compressed sensing in the linear setting, this nonlinear compressed sensing is much more difficult (in fact an NP-hard combinatorial problem) because of the discrete and discontinuous nature of the ℓ0-norm and the nonlinearity. To facilitate sparse signal recovery, we assume in this paper that the nonlinear models have a smooth quasi-linear nature, and we study a non-convex fraction function in this quasi-linear compressed sensing. We propose an iterative fraction thresholding algorithm to solve the associated regularization problem for all values of the regularization parameter. By varying this parameter, our algorithm can obtain promising results, which is one of its advantages over some state-of-the-art algorithms. Numerical experiments show that our method performs much better than some state-of-the-art methods.
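The skeleton of such an iterative thresholding scheme is a proximal-gradient loop; the sketch below uses the familiar soft threshold (the ℓ1 proximal operator) as a placeholder where the paper's fraction-function threshold would be substituted:

import numpy as np

def soft_threshold(u, t):
    # Placeholder shrinkage; the paper derives a fraction-function threshold instead.
    return np.sign(u) * np.maximum(np.abs(u) - t, 0.0)

def iterative_thresholding(A, y, lam, n_iter=200):
    # Minimize ||A x - y||^2 + lam * penalty(x) by proximal gradient steps.
    L = np.linalg.norm(A, 2) ** 2      # Lipschitz constant of the smooth part
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = soft_threshold(x - (A.T @ (A @ x - y)) / L, lam / L)
    return x

In the quasi-linear setting of the paper, the measurement operator additionally depends smoothly on x, so the gradient step would be evaluated at the current iterate.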
NASA Technical Reports Server (NTRS)
Kincaid, D. R.; Young, D. M.
1984-01-01
Adapting and designing mathematical software to achieve optimum performance on the CYBER 205 is discussed. Comments and observations are made in light of recent work done on modifying the ITPACK software package and on writing new software for vector supercomputers. The goal was to develop very efficient vector algorithms and software for solving large sparse linear systems using iterative methods.
Poly-Omic Prediction of Complex Traits: OmicKriging
Wheeler, Heather E.; Aquino-Michaels, Keston; Gamazon, Eric R.; Trubetskoy, Vassily V.; Dolan, M. Eileen; Huang, R. Stephanie; Cox, Nancy J.; Im, Hae Kyung
2014-01-01
High-confidence prediction of complex traits such as disease risk or drug response is an ultimate goal of personalized medicine. Although genome-wide association studies have discovered thousands of well-replicated polymorphisms associated with a broad spectrum of complex traits, the combined predictive power of these associations for any given trait is generally too low to be of clinical relevance. We propose a novel systems approach to complex trait prediction, which leverages and integrates similarity in genetic, transcriptomic, or other omics-level data. We translate the omic similarity into phenotypic similarity using a method called Kriging, commonly used in geostatistics and machine learning. Our method called OmicKriging emphasizes the use of a wide variety of systems-level data, such as those increasingly made available by comprehensive surveys of the genome, transcriptome, and epigenome, for complex trait prediction. Furthermore, our OmicKriging framework allows easy integration of prior information on the function of subsets of omics-level data from heterogeneous sources without the sometimes heavy computational burden of Bayesian approaches. Using seven disease datasets from the Wellcome Trust Case Control Consortium (WTCCC), we show that OmicKriging allows simple integration of sparse and highly polygenic components yielding comparable performance at a fraction of the computing time of a recently published Bayesian sparse linear mixed model method. Using a cellular growth phenotype, we show that integrating mRNA and microRNA expression data substantially increases performance over either dataset alone. Using clinical statin response, we show improved prediction over existing methods. PMID:24799323
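The core Kriging step is a single linear solve against the omic similarity matrix; the following minimal sketch (hypothetical variable names, not the OmicKriging package API) predicts test phenotypes as similarity-weighted averages of training phenotypes:

import numpy as np

def krige(K_train, K_cross, y_train, lam=0.1):
    # K_train: (n, n) omic similarity among training individuals.
    # K_cross: (m, n) similarity of m test individuals to the n training ones.
    # A small ridge term lam keeps the solve well conditioned.
    n = K_train.shape[0]
    alpha = np.linalg.solve(K_train + lam * np.eye(n), y_train)
    return K_cross @ alpha

In the poly-omic setting, the similarity matrix would be a weighted combination of matrices (e.g. a genetic relationship matrix plus an expression-correlation matrix), with the weights chosen by cross-validation.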
User's Manual for PCSMS (Parallel Complex Sparse Matrix Solver). Version 1.
NASA Technical Reports Server (NTRS)
Reddy, C. J.
2000-01-01
PCSMS (Parallel Complex Sparse Matrix Solver) is a computer code written to make use of existing real sparse direct solvers to solve complex, sparse matrix linear equations. PCSMS converts complex matrices into real matrices and uses real, sparse direct matrix solvers to factor and solve the real matrices. The solution vector is then converted back to complex numbers. Though this utility is written for Silicon Graphics (SGI) real sparse matrix solution routines, it is general in nature and can be easily modified to work with any real sparse matrix solver. The User's Manual is written to acquaint the user with the installation and operation of the code. Driver routines are given to aid users in integrating PCSMS routines into their own codes.
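The real-equivalent formulation behind such a conversion is standard: writing A = Ar + i*Ai and b = br + i*bi, the complex system becomes a 2x2-block real system. A minimal sketch with scipy (illustrating the idea, not the PCSMS code itself):

import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def solve_complex_as_real(A, b):
    # [ Ar  -Ai ] [ xr ]   [ br ]
    # [ Ai   Ar ] [ xi ] = [ bi ]
    K = sp.bmat([[A.real, -A.imag], [A.imag, A.real]], format='csc')
    z = spla.spsolve(K, np.concatenate([b.real, b.imag]))
    n = A.shape[0]
    return z[:n] + 1j * z[n:]

rng = np.random.default_rng(0)
A = (sp.random(200, 200, density=0.05, random_state=0)
     + 1j * sp.random(200, 200, density=0.05, random_state=1)
     + 4 * sp.identity(200))
x = solve_complex_as_real(A.tocsr(), rng.normal(size=200))

The doubled real system is twice the dimension but stays sparse, so an existing real direct solver can factor it unchanged.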
A linear geospatial streamflow modeling system for data sparse environments
Asante, Kwabena O.; Arlan, Guleid A.; Pervez, Md Shahriar; Rowland, James
2008-01-01
In many river basins around the world, inaccessibility of flow data is a major obstacle to water resource studies and operational monitoring. This paper describes a geospatial streamflow modeling system which is parameterized with global terrain, soils and land cover data and run operationally with satellite‐derived precipitation and evapotranspiration datasets. Simple linear methods transfer water through the subsurface, overland and river flow phases, and the resulting flows are expressed in terms of standard deviations from mean annual flow. In sample applications, the modeling system was used to simulate flow variations in the Congo, Niger, Nile, Zambezi, Orange and Lake Chad basins between 1998 and 2005, and the resulting flows were compared with mean monthly values from the open‐access Global River Discharge Database. While the uncalibrated model cannot predict the absolute magnitude of flow, it can quantify flow anomalies in terms of relative departures from mean flow. Most of the severe flood events identified in the flow anomalies were independently verified by the Dartmouth Flood Observatory (DFO) and the Emergency Disaster Database (EM‐DAT). Despite its limitations, the modeling system is valuable for rapid characterization of the relative magnitude of flood hazards and seasonal flow changes in data sparse settings.
Sampling schemes and parameter estimation for nonlinear Bernoulli-Gaussian sparse models
NASA Astrophysics Data System (ADS)
Boudineau, Mégane; Carfantan, Hervé; Bourguignon, Sébastien; Bazot, Michael
2016-06-01
We address the sparse approximation problem in the case where the data are approximated by a linear combination of a small number of elementary signals, each of these signals depending non-linearly on additional parameters. Sparsity is explicitly expressed through a Bernoulli-Gaussian hierarchical model in a Bayesian framework. Posterior mean estimates are computed using Markov chain Monte Carlo algorithms. We generalize the partially marginalized Gibbs sampler proposed in the linear case in [1], and build a hybrid Hastings-within-Gibbs algorithm in order to account for the nonlinear parameters. All model parameters are then estimated in an unsupervised procedure. The resulting method is evaluated on a sparse spectral analysis problem. It is shown to converge more efficiently than the classical joint estimation procedure, with only a slight increase of the computational cost per iteration, consequently reducing the global cost of the estimation procedure.
Sparse brain network using penalized linear regression
NASA Astrophysics Data System (ADS)
Lee, Hyekyoung; Lee, Dong Soo; Kang, Hyejin; Kim, Boong-Nyun; Chung, Moo K.
2011-03-01
Sparse partial correlation is a useful connectivity measure for brain networks when it is difficult to compute the exact partial correlation in the small-n large-p setting. In this paper, we formulate the problem of estimating partial correlation as a sparse linear regression with an l1-norm penalty. The method is applied to a brain network consisting of parcellated regions of interest (ROIs) obtained from FDG-PET images of autism spectrum disorder (ASD) children and pediatric control (PedCon) subjects. To validate the results, we check the reproducibility of the obtained brain networks by leave-one-out cross-validation and compare the clustered structures derived from the brain networks of ASD and PedCon.
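A compact way to realize this, sketched below under assumed parameter choices, is "neighborhood selection": regress each ROI on all the others with an l1 penalty and keep an edge only when it is selected in both directions:

import numpy as np
from sklearn.linear_model import Lasso

def sparse_partial_corr_network(X, alpha=0.1):
    # X: (subjects, ROIs) matrix of regional measures.
    n, p = X.shape
    B = np.zeros((p, p))
    for j in range(p):
        others = np.arange(p) != j
        B[j, others] = Lasso(alpha=alpha).fit(X[:, others], X[:, j]).coef_
    return (B != 0) & (B.T != 0)          # AND rule symmetrizes the edge set

The penalty weight alpha controls sparsity and would be tuned, as in the paper, by checking the reproducibility of the resulting network under leave-one-out cross-validation.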
Linear and nonlinear trending and prediction for AVHRR time series data
NASA Technical Reports Server (NTRS)
Smid, J.; Volf, P.; Slama, M.; Palus, M.
1995-01-01
The variability of the AVHRR calibration coefficient in time was analyzed using algorithms of linear and non-linear time series analysis. Specifically, we used spline trend modeling, autoregressive process analysis, an incremental neural network learning algorithm and redundancy functional testing. The analysis performed on the available AVHRR data sets revealed that (1) the calibration data have nonlinear dependencies, (2) the calibration data depend strongly on the target temperature, (3) both the calibration coefficients and the temperature time series can be modeled, to a first approximation, as autonomous dynamical systems, and (4) the high-frequency residuals of the analyzed data sets are best modeled as an autoregressive process of order 10. We have dealt with a nonlinear identification problem and the problem of noise filtering (data smoothing). System identification and filtering are significant problems for AVHRR data sets. The algorithms outlined in this study can be used for future EOS missions. Prediction and smoothing algorithms for time series of calibration data provide a functional characterization of the data. These algorithms can be particularly useful when calibration data are incomplete or sparse.
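For the autoregressive component, a minimal least-squares fit of an order-10 AR model to a residual series might look as follows (a generic sketch, not the authors' code):

import numpy as np

def fit_ar(x, order=10):
    # Fit x[t] = a[0]*x[t-1] + ... + a[order-1]*x[t-order] + e[t].
    rows = np.array([x[t - order:t][::-1] for t in range(order, len(x))])
    a, *_ = np.linalg.lstsq(rows, x[order:], rcond=None)
    return a

def predict_next(x, a):
    # One-step-ahead prediction from the most recent samples.
    return x[-len(a):][::-1] @ a

Such a predictor can fill gaps when calibration data are sparse, which is the use case highlighted above.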
Automated Dynamic Demand Response Implementation on a Micro-grid
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kuppannagari, Sanmukh R.; Kannan, Rajgopal; Chelmis, Charalampos
In this paper, we describe a system for real-time automated Dynamic and Sustainable Demand Response with sparse-data consumption prediction, implemented on the University of Southern California campus microgrid. Supply-side approaches to resolving energy supply-load imbalance do not work at high levels of renewable energy penetration. Dynamic Demand Response (D2R) is a widely used demand-side technique to dynamically adjust electricity consumption during peak load periods. Our D2R system consists of accurate machine-learning-based energy consumption forecasting models that work with sparse data, coupled with fast and sustainable load curtailment optimization algorithms that provide the ability to dynamically adapt to changing supply-load imbalances in near real-time. Our Sustainable DR (SDR) algorithms attempt to distribute customer curtailment evenly across sub-intervals during a DR event and avoid expensive demand peaks during a few sub-intervals. They also ensure that each customer is penalized fairly in order to achieve the targeted curtailment. We develop near linear-time constant-factor approximation algorithms along with Polynomial Time Approximation Schemes (PTAS) for SDR curtailment that minimize the curtailment error, defined as the difference between the target and achieved curtailment values. Our SDR curtailment problem is formulated as an Integer Linear Program that optimally matches customers to curtailment strategies during a DR event while also explicitly accounting for customer strategy switching overhead as a constraint. We demonstrate the results of our D2R system using real data from experiments performed on the USC smartgrid and show that 1) our prediction algorithms can very accurately predict energy consumption even with noisy or missing data and 2) our curtailment algorithms deliver DR with extremely low curtailment errors in the 0.01-0.05 kWh range.
Human mobility prediction from region functions with taxi trajectories.
Wang, Minjie; Yang, Su; Sun, Yi; Gao, Jun
2017-01-01
People in cities nowadays suffer from increasingly severe traffic jams, due in part to limited awareness of how collective human mobility is affected by urban planning. Besides, understanding how region functions shape human mobility is critical for business planning but remains unsolved so far. This study aims to discover the association between region functions and the resulting human mobility. We establish a linear regression model to predict the traffic flows of Beijing based on an input referred to as a bag of POIs. By solving the predictor in the sense of sparse representation, we find that the average prediction precision is over 74% and that each type of POI contributes differently to the predictor, which reveals which region functions attract people to visit and to what extent. Based on these findings, predicted human mobility could be taken into account when planning new regions and region functions.
Survey of the Heritability and Sparse Architecture of Gene Expression Traits across Human Tissues.
Wheeler, Heather E; Shah, Kaanan P; Brenner, Jonathon; Garcia, Tzintzuni; Aquino-Michaels, Keston; Cox, Nancy J; Nicolae, Dan L; Im, Hae Kyung
2016-11-01
Understanding the genetic architecture of gene expression traits is key to elucidating the underlying mechanisms of complex traits. Here, for the first time, we perform a systematic survey of the heritability and the distribution of effect sizes across all representative tissues in the human body. We find that local h² can be relatively well characterized, with 59% of expressed genes showing significant h² (FDR < 0.1) in the DGN whole blood cohort. However, current sample sizes (n ≤ 922) do not allow us to compute distal h². Bayesian Sparse Linear Mixed Model (BSLMM) analysis provides strong evidence that the genetic contribution to local expression traits is dominated by a handful of genetic variants rather than by the collective contribution of a large number of variants each of modest size. In other words, the local architecture of gene expression traits is sparse rather than polygenic across all 40 tissues (from DGN and GTEx) examined. This result is confirmed by the sparsity of the optimal performing gene expression predictors obtained via elastic net modeling. To further explore tissue context specificity, we decompose the expression traits into cross-tissue and tissue-specific components using a novel Orthogonal Tissue Decomposition (OTD) approach. Through a series of simulations we show that the cross-tissue and tissue-specific components are identifiable via OTD. Heritability and sparsity estimates of these derived expression phenotypes show similar characteristics to the original traits. Consistent properties relative to prior GTEx multi-tissue analysis results suggest that these traits reflect the expected biology. Finally, we apply this knowledge to develop prediction models of gene expression traits for all tissues. The prediction models, heritability estimates, and prediction performance R² for original and decomposed expression phenotypes are made publicly available (https://github.com/hakyimlab/PrediXcan).
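The elastic net step mentioned above can be sketched in a few lines (simulated genotypes and an assumed sparse architecture, purely for illustration):

import numpy as np
from sklearn.linear_model import ElasticNetCV

rng = np.random.default_rng(0)
G = rng.binomial(2, 0.3, size=(500, 200)).astype(float)   # SNP dosages (0/1/2)
beta = np.zeros(200)
beta[:3] = [0.8, -0.5, 0.4]                               # a handful of causal variants
expr = G @ beta + rng.normal(scale=0.5, size=500)         # simulated expression trait

fit = ElasticNetCV(l1_ratio=0.5, cv=5).fit(G, expr)
print((fit.coef_ != 0).sum(), "variants selected")

When the true architecture is sparse, the fitted predictor concentrates its weight on a few variants, matching the BSLMM-based conclusion above.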
Visual Tracking via Sparse and Local Linear Coding.
Wang, Guofeng; Qin, Xueying; Zhong, Fan; Liu, Yue; Li, Hongbo; Peng, Qunsheng; Yang, Ming-Hsuan
2015-11-01
The state search is an important component of any object tracking algorithm. Numerous algorithms have been proposed, but stochastic sampling methods (e.g., particle filters) are arguably one of the most effective approaches. However, the discretization of the state space complicates the search for the precise object location. In this paper, we propose a novel tracking algorithm that extends the state space of particle observations from discrete to continuous. The solution is determined accurately via iterative linear coding between two convex hulls. The algorithm is modeled by an optimal function, which can be efficiently solved by either convex sparse coding or locality constrained linear coding. The algorithm is also very flexible and can be combined with many generic object representations. Thus, we first use sparse representation to achieve an efficient searching mechanism of the algorithm and demonstrate its accuracy. Next, two other object representation models, i.e., least soft-threshold squares and adaptive structural local sparse appearance, are implemented with improved accuracy to demonstrate the flexibility of our algorithm. Qualitative and quantitative experimental results demonstrate that the proposed tracking algorithm performs favorably against the state-of-the-art methods in dynamic scenes.
SPARSKIT: A basic tool kit for sparse matrix computations
NASA Technical Reports Server (NTRS)
Saad, Youcef
1990-01-01
Presented here are the main features of a tool package for manipulating and working with sparse matrices. One of the goals of the package is to provide basic tools to facilitate the exchange of software and data between researchers in sparse matrix computations. The starting point is the Harwell/Boeing collection of matrices for which the authors provide a number of tools. Among other things, the package provides programs for converting data structures, printing simple statistics on a matrix, plotting a matrix profile, and performing linear algebra operations with sparse matrices.
Estimating the size of an open population using sparse capture-recapture data.
Huggins, Richard; Stoklosa, Jakub; Roach, Cameron; Yip, Paul
2018-03-01
Sparse capture-recapture data from open populations are difficult to analyze using currently available frequentist statistical methods. However, in closed capture-recapture experiments, the Chao sparse estimator (Chao, 1989, Biometrics 45, 427-438) may be used to estimate population sizes when there are few recaptures. Here, we extend the Chao (1989) closed population size estimator to the open population setting by using linear regression and extrapolation techniques. We conduct a small simulation study and apply the models to several sparse capture-recapture data sets. © 2017, The International Biometric Society.
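For reference, the closed-population estimator being extended uses only the counts of individuals captured exactly once (f1) and exactly twice (f2); a minimal implementation of this standard formula:

def chao_estimate(d, f1, f2):
    # d: number of distinct individuals captured.
    # Classic Chao lower bound; a bias-corrected form is used when f2 == 0.
    if f2 > 0:
        return d + f1 ** 2 / (2 * f2)
    return d + f1 * (f1 - 1) / 2

The open-population extension described above combines estimates of this kind with linear regression and extrapolation over the course of the study.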
Regional flow duration curves: Geostatistical techniques versus multivariate regression
Pugliese, Alessio; Farmer, William H.; Castellarin, Attilio; Archfield, Stacey A.; Vogel, Richard M.
2016-01-01
A period-of-record flow duration curve (FDC) represents the relationship between the magnitude and frequency of daily streamflows. Prediction of FDCs is of great importance for locations characterized by sparse or missing streamflow observations. We present a detailed comparison of two methods which are capable of predicting an FDC at ungauged basins: (1) an adaptation of the geostatistical method, Top-kriging, employing a linear weighted average of dimensionless empirical FDCs, standardised with a reference streamflow value; and (2) regional multiple linear regression of streamflow quantiles, perhaps the most common method for the prediction of FDCs at ungauged sites. In particular, Top-kriging relies on a metric for expressing the similarity between catchments computed as the negative deviation of the FDC from a reference streamflow value, which we termed total negative deviation (TND). Comparisons of these two methods are made in 182 largely unregulated river catchments in the southeastern U.S. using a three-fold cross-validation algorithm. Our results reveal that the two methods perform similarly throughout flow-regimes, with average Nash-Sutcliffe Efficiencies of 0.566 and 0.662 (0.883 and 0.829 on log-transformed quantiles) for the geostatistical and the linear regression models, respectively. Differences in the reproduction of FDCs occurred mostly for low flows with exceedance probability (i.e. duration) above 0.98.
Parallel iterative methods for sparse linear and nonlinear equations
NASA Technical Reports Server (NTRS)
Saad, Youcef
1989-01-01
As three-dimensional models gain importance, iterative methods will become almost mandatory. Among these, preconditioned Krylov subspace methods have been viewed as the most efficient and reliable for solving linear as well as nonlinear systems of equations. Several different approaches have been taken to adapt iterative methods for supercomputers. Some of these approaches are discussed, and the methods that deal more specifically with general unstructured sparse matrices, such as those arising from finite element methods, are emphasized.
Sparse Substring Pattern Set Discovery Using Linear Programming Boosting
NASA Astrophysics Data System (ADS)
Kashihara, Kazuaki; Hatano, Kohei; Bannai, Hideo; Takeda, Masayuki
In this paper, we consider finding a small set of substring patterns which classifies the given documents well. We formulate the problem as a 1-norm soft margin optimization problem where each dimension corresponds to a substring pattern. We then solve this problem by using LPBoost and an optimal substring discovery algorithm. Since the problem is a linear program, the resulting solution is likely to be sparse, which is useful for feature selection. We evaluate the proposed method on real data such as movie reviews.
Nonlinear spike-and-slab sparse coding for interpretable image encoding.
Shelton, Jacquelyn A; Sheikh, Abdul-Saboor; Bornschein, Jörg; Sterne, Philip; Lücke, Jörg
2015-01-01
Sparse coding is a popular approach to model natural images but has faced two main challenges: modelling low-level image components (such as edge-like structures and their occlusions) and modelling varying pixel intensities. Traditionally, images are modelled as a sparse linear superposition of dictionary elements, where the probabilistic view of this problem is that the coefficients follow a Laplace or Cauchy prior distribution. We propose a novel model that instead uses a spike-and-slab prior and a nonlinear combination of components. With the prior, our model can easily represent exact zeros, e.g. for the absence of an image component such as an edge, as well as a distribution over non-zero pixel intensities. With the nonlinearity (the nonlinear max combination rule), the idea is to target occlusions; dictionary elements correspond to image components that can occlude each other. There are major consequences of the model assumptions made by both (non)linear approaches, thus the main goal of this paper is to isolate and highlight the differences between them. Parameter optimization is analytically and computationally intractable in our model, thus as a main contribution we design an exact Gibbs sampler for efficient inference, which we can apply to higher-dimensional data using latent variable preselection. Results on natural and artificial occlusion-rich data with controlled forms of sparse structure show that our model can extract a sparse set of edge-like components that closely match the generating process, which we refer to as interpretable components. Furthermore, the sparseness of the solution closely follows the ground-truth number of components/edges in the images. The linear model did not learn such edge-like components with any level of sparsity. This suggests that our model can adaptively approximate and characterize the meaningful generation process well.
1-norm support vector novelty detection and its sparseness.
Zhang, Li; Zhou, WeiDa
2013-12-01
This paper proposes a 1-norm support vector novelty detection (SVND) method and discusses its sparseness. 1-norm SVND is formulated as a linear programming problem and uses two techniques for inducing sparseness, namely the 1-norm regularization and the hinge loss function. We also find two upper bounds on the sparseness of 1-norm SVND, namely the exact support vector (ESV) bound and the kernel Gram matrix rank bound. The ESV bound indicates that 1-norm SVND has a sparser representation model than SVND. The kernel Gram matrix rank bound can loosely estimate the sparseness of 1-norm SVND. Experimental results show that 1-norm SVND is feasible and effective. Copyright © 2013 Elsevier Ltd. All rights reserved.
Sparse array angle estimation using reduced-dimension ESPRIT-MUSIC in MIMO radar.
Zhang, Chaozhu; Pang, Yucai
2013-01-01
Sparse linear arrays provide better performance than filled linear arrays in terms of angle estimation and resolution, with reduced size and low cost. However, they are subject to manifold ambiguity. In this paper, both the transmit array and the receive array are sparse linear arrays in a bistatic MIMO radar. First, we present an ESPRIT-MUSIC method in which the ESPRIT algorithm is used to obtain ambiguous angle estimates. The disambiguation algorithm uses a MUSIC-based procedure to identify the true direction cosine estimate from a set of ambiguous candidate estimates. The paired transmit and receive angles can be estimated and the manifold ambiguity resolved. However, the proposed algorithm has high computational complexity due to the requirement of a two-dimensional search. Further, the Reduced-Dimension ESPRIT-MUSIC (RD-ESPRIT-MUSIC) is proposed to reduce the complexity of the algorithm, demanding only a one-dimensional search. Simulation results demonstrate the effectiveness of the method.
NASA Astrophysics Data System (ADS)
Pohle, Ina; Glendell, Miriam; Stutter, Marc I.; Helliwell, Rachel C.
2017-04-01
An understanding of catchment response to climate and land use change at a regional scale is necessary for the assessment of mitigation and adaptation options addressing diffuse nutrient pollution. It is well documented that the physicochemical properties of a river ecosystem respond to change in a non-linear fashion. This is particularly important when threshold water concentrations, relevant to national and EU legislation, are exceeded. Large scale (regional) model assessments required for regulatory purposes must represent the key processes and mechanisms that are more readily understood in catchments with water quantity and water quality data monitored at high spatial and temporal resolution. While daily discharge data are available for most catchments in Scotland, nitrate and phosphorus are mostly available on a monthly basis only, as typified by regulatory monitoring. However, high resolution (hourly to daily) water quantity and water quality data exist for a limited number of research catchments. To successfully implement adaptation measures across Scotland, an upscaling from data-rich to data-sparse catchments is required. In addition, the widespread availability of spatial datasets affecting hydrological and biogeochemical responses (e.g. soils, topography/geomorphology, land use, vegetation etc.) provide an opportunity to transfer predictions between data-rich and data-sparse areas by linking processes and responses to catchment attributes. Here, we develop a framework of catchment typologies as a prerequisite for transferring information from data-rich to data-sparse catchments by focusing on how hydrological catchment similarity can be used as an indicator of grouped behaviours in water quality response. As indicators of hydrological catchment similarity we use flow indices derived from observed discharge data across Scotland as well as hydrological model parameters. For the latter, we calibrated the lumped rainfall-runoff model TUWModel using multiple objective functions. The relationships between indicators of hydrological catchment similarity, physical catchment characteristics and nitrate and phosphorus concentrations in rivers are then investigated using multivariate statistics. This understanding of the relationship between catchment characteristics, hydrological processes and water quality will allow us to implement more efficient regulatory water quality monitoring strategies, to improve existing water quality models and to model mitigation and adaptation scenarios to global change in data-sparse catchments.
Ejlerskov, Katrine T.; Jensen, Signe M.; Christensen, Line B.; Ritz, Christian; Michaelsen, Kim F.; Mølgaard, Christian
2014-01-01
For 3-year-old children suitable methods to estimate body composition are sparse. We aimed to develop predictive equations for estimating fat-free mass (FFM) from bioelectrical impedance (BIA) and anthropometry, using dual-energy X-ray absorptiometry (DXA) as the reference method, with data from 99 healthy 3-year-old Danish children. Predictive equations were derived from two multiple linear regression models, a comprehensive model (height²/resistance (RI), six anthropometric measurements) and a simple model (RI, height, weight). Their uncertainty was quantified by means of a 10-fold cross-validation approach. The prediction error of FFM was 3.0% for both equations (root mean square error: 360 and 356 g, respectively). The derived equations produced BIA-based predictions of FFM and FM near DXA scan results. We suggest that the predictive equations can be applied in similar population samples aged 2–4 years. The derived equations may prove useful for studies linking body composition to early risk factors and early onset of obesity. PMID:24463487
Mapping visual stimuli to perceptual decisions via sparse decoding of mesoscopic neural activity.
Sajda, Paul
2010-01-01
In this talk I will describe our work investigating sparse decoding of neural activity, given a realistic mapping of the visual scene to neuronal spike trains generated by a model of primary visual cortex (V1). We use a linear decoder which imposes sparsity via an L1 norm. The decoder can be viewed as a decoding neuron (linear summation followed by a sigmoidal nonlinearity) in which there are relatively few non-zero synaptic weights. We find: (1) the best decoding performance is for a representation that is sparse in both space and time, (2) decoding of a temporal code results in better performance than a rate code and is also a better fit to the psychophysical data, (3) the number of neurons required for decoding increases monotonically as signal-to-noise in the stimulus decreases, with as little as 1% of the neurons required for decoding at the highest signal-to-noise levels, and (4) sparse decoding results in a more accurate decoding of the stimulus and is a better fit to psychophysical performance than a distributed decoding, for example one imposed by an L2 norm. We conclude that sparse coding is well-justified from a decoding perspective in that it results in a minimum number of neurons and maximum accuracy when sparse representations can be decoded from the neural dynamics.
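A minimal version of such a decoder (a sketch with simulated spike counts, not the V1 model used in the talk) is an ℓ1-penalized logistic regression, i.e. a linear-sigmoidal readout with few nonzero synaptic weights:

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
counts = rng.poisson(2.0, size=(400, 300))            # trials x neurons spike counts
stim = (counts[:, :3].sum(axis=1) > 6).astype(int)    # only 3 neurons are informative

decoder = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
decoder.fit(counts, stim)
print("nonzero weights:", int((decoder.coef_ != 0).sum()))

Shrinking C strengthens the sparsity penalty, trading accuracy against the number of neurons read out, analogous to the neuron-count trade-off discussed above.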
Wang, Yubo; Veluvolu, Kalyana C
2017-06-14
It is often difficult to analyze biological signals because of their nonlinear and non-stationary characteristics. This necessitates the use of time-frequency decomposition methods for analyzing the subtle changes in these signals that are often connected to an underlying phenomenon. This paper presents a new approach to analyzing the time-varying characteristics of such signals by employing a simple truncated Fourier series model, namely the band-limited multiple Fourier linear combiner (BMFLC). In contrast to earlier designs, we first identify the sparsity imposed on the signal model in order to reformulate the model as a sparse linear regression model. The coefficients of the proposed model are then estimated by a convex optimization algorithm. The performance of the proposed method was analyzed with benchmark test signals. An energy-ratio metric is employed to quantify the spectral performance, and the results show that the proposed Sparse-BMFLC method has a high mean energy ratio (0.9976) and outperforms existing methods such as the short-time Fourier transform (STFT), the continuous wavelet transform (CWT) and the BMFLC Kalman Smoother. Furthermore, the proposed method provides an overall reduction of 6.22% in reconstruction error.
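The reformulation can be sketched as follows (assumed test signal and frequency grid; the paper's optimization may differ in detail): build the band-limited Fourier dictionary over a dense grid and estimate the combiner weights with an ℓ1-penalized regression:

import numpy as np
from sklearn.linear_model import Lasso

fs, T = 100.0, 4.0
t = np.arange(0.0, T, 1.0 / fs)
x = np.sin(2 * np.pi * 7.0 * t) + 0.5 * np.sin(2 * np.pi * 11.5 * t)

freqs = np.arange(2.0, 20.0, 0.5)                 # band-limited frequency grid
Phi = np.hstack([np.sin(2 * np.pi * np.outer(t, freqs)),
                 np.cos(2 * np.pi * np.outer(t, freqs))])   # BMFLC dictionary

w = Lasso(alpha=0.01, fit_intercept=False).fit(Phi, x).coef_  # sparse weights
x_hat = Phi @ w                                    # time-frequency reconstruction

Only a few grid frequencies receive nonzero weight, which is what makes the estimated time-frequency content interpretable.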
Short-term memory capacity in networks via the restricted isometry property.
Charles, Adam S; Yap, Han Lun; Rozell, Christopher J
2014-06-01
Cortical networks are hypothesized to rely on transient network activity to support short-term memory (STM). In this letter, we study the capacity of randomly connected recurrent linear networks for performing STM when the input signals are approximately sparse in some basis. We leverage results from compressed sensing to provide rigorous nonasymptotic recovery guarantees, quantifying the impact of the input sparsity level, the input sparsity basis, and the network characteristics on the system capacity. Our analysis demonstrates that network memory capacities can scale superlinearly with the number of nodes and in some situations can achieve STM capacities that are much larger than the network size. We provide perfect recovery guarantees for finite sequences and recovery bounds for infinite sequences. The latter analysis predicts that network STM systems may have an optimal recovery length that balances errors due to omission and recall mistakes. Furthermore, we show that the conditions yielding optimal STM capacity can be embodied in several network topologies, including networks with sparse or dense connectivities.
Real-time model learning using Incremental Sparse Spectrum Gaussian Process Regression.
Gijsberts, Arjan; Metta, Giorgio
2013-05-01
Novel applications in unstructured and non-stationary human environments require robots that learn from experience and adapt autonomously to changing conditions. Predictive models therefore not only need to be accurate, but should also be updated incrementally in real-time and require minimal human intervention. Incremental Sparse Spectrum Gaussian Process Regression is an algorithm that is targeted specifically for use in this context. Rather than developing a novel algorithm from the ground up, the method is based on the thoroughly studied Gaussian Process Regression algorithm, therefore ensuring a solid theoretical foundation. Non-linearity and a bounded update complexity are achieved simultaneously by means of a finite dimensional random feature mapping that approximates a kernel function. As a result, the computational cost for each update remains constant over time. Finally, algorithmic simplicity and support for automated hyperparameter optimization ensures convenience when employed in practice. Empirical validation on a number of synthetic and real-life learning problems confirms that the performance of Incremental Sparse Spectrum Gaussian Process Regression is superior with respect to the popular Locally Weighted Projection Regression, while computational requirements are found to be significantly lower. The method is therefore particularly suited for learning with real-time constraints or when computational resources are limited. Copyright © 2012 Elsevier Ltd. All rights reserved.
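The sparse-spectrum idea can be sketched with a random Fourier feature map approximating an RBF kernel plus a regularized linear solve (a batch version under assumed hyperparameters; the incremental algorithm updates this solve in constant time per sample rather than refitting):

import numpy as np

class SparseSpectrumRegressor:
    def __init__(self, n_features=100, lengthscale=1.0, noise=0.1, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(scale=1.0 / lengthscale, size=(n_features, 1))
        self.b = rng.uniform(0.0, 2.0 * np.pi, n_features)
        self.noise = noise

    def _features(self, X):
        # Random Fourier features approximating an RBF kernel.
        return np.sqrt(2.0 / len(self.b)) * np.cos(X @ self.W.T + self.b)

    def fit(self, X, y):
        P = self._features(X)
        A = P.T @ P + self.noise ** 2 * np.eye(P.shape[1])
        self.w = np.linalg.solve(A, P.T @ y)
        return self

    def predict(self, X):
        return self._features(X) @ self.w

X = np.linspace(-3, 3, 200).reshape(-1, 1)
y = np.sin(2 * X[:, 0]) + 0.1 * np.random.default_rng(1).normal(size=200)
model = SparseSpectrumRegressor().fit(X, y)

Because the feature dimension is fixed, each update costs a constant amount regardless of how many samples have been seen, which is what bounds the per-update complexity.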
Structured sparse linear graph embedding.
Wang, Haixian
2012-03-01
Subspace learning is a core issue in pattern recognition and machine learning. Linear graph embedding (LGE) is a general framework for subspace learning. In this paper, we propose a structured sparse extension to LGE (SSLGE) by introducing a structured sparsity-inducing norm into LGE. Specifically, SSLGE casts the projection bases learning into a regression-type optimization problem, and the structured sparsity regularization is then applied to the regression coefficients. The regularization selects a subset of features and meanwhile encodes high-order information reflecting a priori structure information of the data. The SSLGE technique provides a unified framework for discovering structured sparse subspaces. Computationally, by using a variational equality and the Procrustes transformation, SSLGE is efficiently solved with closed-form updates. Experimental results on face images show the effectiveness of the proposed method. Copyright © 2011 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Bellugi, D. G.; Tennant, C.; Larsen, L.
2016-12-01
Catchment and climate heterogeneity complicate prediction of runoff across time and space, and the resulting parameter uncertainty can lead to large accumulated errors in hydrologic models, particularly in ungauged basins. Recently, data-driven modeling approaches have been shown to avoid the accumulated uncertainty associated with many physically-based models, providing an appealing alternative for hydrologic prediction. However, the effectiveness of different methods in hydrologically and geomorphically distinct catchments, and the robustness of these methods to changing climate and changing hydrologic processes, remain to be tested. Here, we evaluate the use of machine learning techniques to predict daily runoff across time and space using only essential climatic forcing (e.g. precipitation, temperature, and potential evapotranspiration) time series as model input. Model training and testing were done using a high-quality dataset of daily runoff and climate forcing data spanning 25+ years for 600+ minimally-disturbed catchments (drainage area range 5-25,000 km², median size 336 km²) that cover a wide range of climatic and physical characteristics. Preliminary results using Support Vector Regression (SVR) suggest that in some catchments this nonlinear regression technique can accurately predict daily runoff, while the same approach fails in other catchments, indicating that the representation of climate inputs and/or catchment filter characteristics in the model structure needs further refinement to increase performance. We bolster this analysis by using Sparse Identification of Nonlinear Dynamics (a sparse symbolic regression technique) to uncover the governing equations that describe runoff processes in catchments where SVR performed well and in ones where it performed poorly, thereby enabling inference about governing processes. This provides a robust means of examining how catchment complexity influences runoff prediction skill, and represents a contribution towards the integration of data-driven inference and physically-based models.
Ghosh, A
1988-08-01
Lanczos and conjugate gradient algorithms are important in computational linear algebra. In this paper, a parallel pipelined realization of these algorithms on a ring of optical linear algebra processors is described. The flow of data is designed to minimize the idle times of the optical multiprocessor and the redundancy of computations. The effects of optical round-off errors on the solutions obtained by the optical Lanczos and conjugate gradient algorithms are analyzed, and it is shown that optical preconditioning can improve the accuracy of these algorithms substantially. Algorithms for optical preconditioning and results of numerical experiments on solving linear systems of equations arising from partial differential equations are discussed. Since the Lanczos algorithm is used mostly with sparse matrices, a folded storage scheme to represent sparse matrices on spatial light modulators is also described.
Sparse principal component analysis in medical shape modeling
NASA Astrophysics Data System (ADS)
Sjöstrand, Karl; Stegmann, Mikkel B.; Larsen, Rasmus
2006-03-01
Principal component analysis (PCA) is a widely used tool in medical image analysis for data reduction, model building, and data understanding and exploration. While PCA is a holistic approach where each new variable is a linear combination of all original variables, sparse PCA (SPCA) aims at producing easily interpreted models through sparse loadings, i.e. each new variable is a linear combination of a subset of the original variables. One of the aims of using SPCA is the possible separation of the results into isolated and easily identifiable effects. This article introduces SPCA for shape analysis in medicine. Results for three different data sets are given in relation to standard PCA and sparse PCA by simple thresholding of small loadings. Focus is on a recent algorithm for computing sparse principal components, but a review of other approaches is supplied as well. The SPCA algorithm has been implemented using Matlab and is available for download. The general behavior of the algorithm is investigated, and strengths and weaknesses are discussed. The original report on the SPCA algorithm argues that the ordering of modes is not an issue. We disagree on this point and propose several approaches to establish sensible orderings. A method that orders modes by decreasing variance and maximizes the sum of variances for all modes is presented and investigated in detail.
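As a quick contrast between holistic and sparse loadings (using scikit-learn's SparsePCA, which implements a related but not identical algorithm to the one reviewed here):

import numpy as np
from sklearn.decomposition import PCA, SparsePCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 30))
X[:, :5] += 2.0 * rng.normal(size=(200, 1))     # one localized mode of variation

pca = PCA(n_components=3).fit(X)
spca = SparsePCA(n_components=3, alpha=1.0, random_state=0).fit(X)
print("nonzero loadings, PCA:      ", (np.abs(pca.components_) > 1e-8).sum(axis=1))
print("nonzero loadings, SparsePCA:", (spca.components_ != 0).sum(axis=1))

The sparse modes involve only a subset of the original variables, which is what makes the isolated and easily identifiable effects discussed above possible.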
Piecewise multivariate modelling of sequential metabolic profiling data.
Rantalainen, Mattias; Cloarec, Olivier; Ebbels, Timothy M D; Lundstedt, Torbjörn; Nicholson, Jeremy K; Holmes, Elaine; Trygg, Johan
2008-02-19
Modelling the time-related behaviour of biological systems is essential for understanding their dynamic responses to perturbations. In metabolic profiling studies, the sampling rate and number of sampling points are often restricted due to experimental and biological constraints. A supervised multivariate modelling approach with the objective to model the time-related variation in the data for short and sparsely sampled time-series is described. A set of piecewise Orthogonal Projections to Latent Structures (OPLS) models are estimated, describing changes between successive time points. The individual OPLS models are linear, but the piecewise combination of several models accommodates modelling and prediction of changes which are non-linear with respect to the time course. We demonstrate the method on both simulated and metabolic profiling data, illustrating how time related changes are successfully modelled and predicted. The proposed method is effective for modelling and prediction of short and multivariate time series data. A key advantage of the method is model transparency, allowing easy interpretation of time-related variation in the data. The method provides a competitive complement to commonly applied multivariate methods such as OPLS and Principal Component Analysis (PCA) for modelling and analysis of short time-series data.
Multi-linear sparse reconstruction for SAR imaging based on higher-order SVD
NASA Astrophysics Data System (ADS)
Gao, Yu-Fei; Gui, Guan; Cong, Xun-Chao; Yang, Yue; Zou, Yan-Bin; Wan, Qun
2017-12-01
This paper focuses on spotlight synthetic aperture radar (SAR) imaging of point-scattering targets based on tensor modeling. In real-world scenarios, scatterers usually follow a block-sparse distribution, a structural feature that has scarcely been exploited in previous studies of SAR imaging. Our work takes advantage of this property of the target scene and constructs a multi-linear sparse reconstruction algorithm for SAR imaging. This research introduces multi-linear block sparsity into higher-order singular value decomposition (SVD) together with a dictionary-construction procedure. Simulation experiments on ideal point targets show the robustness of the proposed algorithm to the noise and sidelobe disturbances that often degrade the imaging quality of conventional methods. The computational resource requirement is further investigated in this paper: the complexity analysis shows that the present method consumes fewer resources than the classic matching pursuit method. Imaging results on practical measured data also demonstrate the effectiveness of the algorithm developed in this paper.
NASA Astrophysics Data System (ADS)
Stoykov, S.; Atanassov, E.; Margenov, S.
2016-10-01
Many scientific applications involve sparse or dense matrix operations, such as solving linear systems, matrix-matrix products, eigensolvers, etc. In structural nonlinear dynamics, the computation of periodic responses and the determination of the stability of the solution are of primary interest. The shooting method is widely used for obtaining periodic responses of nonlinear systems. The method involves operations with both sparse and dense matrices, and one of its computationally expensive steps is the multiplication of sparse by dense matrices. In the current work, a new algorithm for sparse-matrix-by-dense-matrix products is presented. The algorithm takes into account the structure of the sparse matrix, which is obtained by space discretization of the nonlinear Mindlin plate equation of motion by the finite element method. The algorithm is developed to use the vector engine of Intel Xeon Phi coprocessors. It is compared with the standard sparse-by-dense matrix multiplication algorithm and the one developed by Intel MKL, and it is shown that better algorithms can be developed by considering the properties of the sparse matrix.
Locality-preserving sparse representation-based classification in hyperspectral imagery
NASA Astrophysics Data System (ADS)
Gao, Lianru; Yu, Haoyang; Zhang, Bing; Li, Qingting
2016-10-01
This paper proposes to combine locality-preserving projections (LPP) and sparse representation (SR) for hyperspectral image classification. The LPP is first used to reduce the dimensionality of all the training and testing data by finding the optimal linear approximations to the eigenfunctions of the Laplace Beltrami operator on the manifold, where the high-dimensional data lies. Then, SR codes the projected testing pixels as sparse linear combinations of all the training samples to classify the testing pixels by evaluating which class leads to the minimum approximation error. The integration of LPP and SR represents an innovative contribution to the literature. The proposed approach, called locality-preserving SR-based classification, addresses the imbalance between high dimensionality of hyperspectral data and the limited number of training samples. Experimental results on three real hyperspectral data sets demonstrate that the proposed approach outperforms the original counterpart, i.e., SR-based classification.
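The SR classification step can be sketched as follows (using a Lasso solve as a stand-in for the exact sparse coder; all names are assumptions): code a projected test pixel over all training samples, then assign the class whose samples give the smallest reconstruction residual:

import numpy as np
from sklearn.linear_model import Lasso

def src_classify(D, labels, y, alpha=0.01):
    # D: (features, training samples) dictionary; labels: class of each column.
    c = Lasso(alpha=alpha, fit_intercept=False, max_iter=5000).fit(D, y).coef_
    residual = {k: np.linalg.norm(y - D[:, labels == k] @ c[labels == k])
                for k in np.unique(labels)}
    return min(residual, key=residual.get)

In the proposed pipeline, the rows of D and the test vector y would first be reduced by the LPP projection before this coding step.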
Color Sparse Representations for Image Processing: Review, Models, and Prospects.
Barthélemy, Quentin; Larue, Anthony; Mars, Jérôme I
2015-11-01
Sparse representations have been extended to deal with color images composed of three channels. A review of dictionary-learning-based sparse representations for color images is given here, detailing the differences between the models and comparing their results on real and simulated data. These models are considered in a unifying framework based on the degrees of freedom of the linear filtering/transformation of the color channels. Moreover, this framework shows that the scalar quaternionic linear model is equivalent to constrained matrix-based color filtering, which highlights the filtering implicitly applied through this model. Based on this reformulation, a new color filtering model is introduced, using unconstrained filters. In this model, the spatial morphologies of color images are encoded by atoms, and colors are encoded by color filters. Color variability is no longer captured by increasing the dictionary size, but by the color filters, which gives an efficient color representation.
Pisharady, Pramod Kumar; Sotiropoulos, Stamatios N; Duarte-Carvajalino, Julio M; Sapiro, Guillermo; Lenglet, Christophe
2018-02-15
We present a sparse Bayesian unmixing algorithm BusineX: Bayesian Unmixing for Sparse Inference-based Estimation of Fiber Crossings (X), for estimation of white matter fiber parameters from compressed (under-sampled) diffusion MRI (dMRI) data. BusineX combines compressive sensing with linear unmixing and introduces sparsity to the previously proposed multiresolution data fusion algorithm RubiX, resulting in a method for improved reconstruction, especially from data with lower number of diffusion gradients. We formulate the estimation of fiber parameters as a sparse signal recovery problem and propose a linear unmixing framework with sparse Bayesian learning for the recovery of sparse signals, the fiber orientations and volume fractions. The data is modeled using a parametric spherical deconvolution approach and represented using a dictionary created with the exponential decay components along different possible diffusion directions. Volume fractions of fibers along these directions define the dictionary weights. The proposed sparse inference, which is based on the dictionary representation, considers the sparsity of fiber populations and exploits the spatial redundancy in data representation, thereby facilitating inference from under-sampled q-space. The algorithm improves parameter estimation from dMRI through data-dependent local learning of hyperparameters, at each voxel and for each possible fiber orientation, that moderate the strength of priors governing the parameter variances. Experimental results on synthetic and in-vivo data show improved accuracy with a lower uncertainty in fiber parameter estimates. BusineX resolves a higher number of second and third fiber crossings. For under-sampled data, the algorithm is also shown to produce more reliable estimates. Copyright © 2017 Elsevier Inc. All rights reserved.
Local structure preserving sparse coding for infrared target recognition
Han, Jing; Yue, Jiang; Zhang, Yi; Bai, Lianfa
2017-01-01
Sparse coding performs well in image classification. However, robust target recognition requires many comprehensive template images, and the sparse learning process is complex. We incorporate sparsity into a template matching concept to construct a local sparse structure matching (LSSM) model for general infrared target recognition. A local structure preserving sparse coding (LSPSc) formulation is proposed to simultaneously preserve the local sparse and structural information of objects. By adding a spatial local structure constraint into the classical sparse coding algorithm, LSPSc can improve the stability of sparse representation for targets and inhibit background interference in infrared images. Furthermore, a kernel LSPSc (K-LSPSc) formulation is proposed, which extends LSPSc to the kernel space to weaken the influence of the linear structure constraint in nonlinear natural data. Because of their anti-interference and fault-tolerant capabilities, both LSPSc- and K-LSPSc-based LSSM can implement target identification based on a simple template set, which needs only several images containing enough local sparse structures to learn a sufficient sparse structure dictionary of a target class. Specifically, this LSSM approach has stable performance in target detection under scene, shape and occlusion variations. High performance is demonstrated on several datasets, indicating robust infrared target recognition in diverse environments and imaging conditions. PMID:28323824
Effects of radiation type and delivery mode on a radioresistant eukaryote Cryptococcus neoformans
Shuryak, Igor; Bryan, Ruth A.; Broitman, Jack; Marino, Stephen A.; Morgenstern, Alfred; Apostolidis, Christos; Dadachova, Ekaterina
2015-01-01
Introduction: Most research on radioresistant fungi, particularly on human pathogens such as Cryptococcus neoformans, involves sparsely-ionizing radiation. Consequently, fungal responses to densely-ionizing radiation, which can be harnessed to treat life-threatening fungal infections, remain incompletely understood. Methods: We addressed this issue by quantifying and comparing the effects of densely-ionizing α-particles (delivered either by external beam or by ²¹³Bi-labeled monoclonal antibodies) and sparsely-ionizing ¹³⁷Cs γ-rays on Cryptococcus neoformans. Results: The best-fit linear-quadratic parameters for clonogenic survival were the following: α = 0.24×10⁻² Gy⁻¹ for γ-rays and 1.07×10⁻² Gy⁻¹ for external-beam α-particles, and β = 1.44×10⁻⁵ Gy⁻² for both radiation types. Fungal cell killing by radiolabeled antibodies was consistent with predictions based on the α-particle dose to the cell nucleus and the linear-quadratic parameters for external-beam α-particles. The estimated RBE (for α-particles vs γ-rays) at low doses was 4.47 for the initial portion of the α-particle track, and 7.66 for the Bragg peak. Non-radiological antibody effects accounted for up to 23% of cell death. Conclusions: These results quantify the degree of C. neoformans resistance to densely-ionizing radiations, and show how this resistance can be overcome with fungus-specific radiolabeled antibodies. PMID:25800676
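The survival curve described by these parameters is straightforward to evaluate. Below is a minimal sketch (not from the paper) that plugs the reported linear-quadratic constants into S(D) = exp(−αD − βD²) and recovers the low-dose RBE as the ratio of the α coefficients; the dose grid is illustrative.

```python
import numpy as np

# Linear-quadratic clonogenic survival: S(D) = exp(-alpha*D - beta*D^2),
# using the best-fit constants quoted in the abstract above.
ALPHA_GAMMA = 0.24e-2   # Gy^-1, 137Cs gamma-rays
ALPHA_ALPHA = 1.07e-2   # Gy^-1, external-beam alpha-particles
BETA = 1.44e-5          # Gy^-2, shared by both radiation types

def survival(dose_gy, alpha, beta=BETA):
    """Surviving fraction under the linear-quadratic model."""
    return np.exp(-alpha * dose_gy - beta * dose_gy**2)

doses = np.linspace(0.0, 400.0, 5)               # Gy, illustrative grid
print("gamma:", survival(doses, ALPHA_GAMMA).round(3))
print("alpha:", survival(doses, ALPHA_ALPHA).round(3))

# At low doses the beta*D^2 term is negligible, so the ratio of alpha
# coefficients approximates the RBE (compare the reported 4.47).
print("low-dose RBE ~", round(ALPHA_ALPHA / ALPHA_GAMMA, 2))
```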
Commentary: Ethical Issues of Current Health-Protection Policies on Low-Dose Ionizing Radiation
Socol, Yehoshua; Dobrzyński, Ludwik; Doss, Mohan; Feinendegen, Ludwig E.; Janiak, Marek K.; Miller, Mark L.; Sanders, Charles L.; Scott, Bobby R.; Ulsh, Brant; Vaiserman, Alexander
2014-01-01
The linear no-threshold (LNT) model of ionizing-radiation-induced cancer is based on the assumption that every radiation dose increment constitutes increased cancer risk for humans. The risk is hypothesized to increase linearly as the total dose increases. While this model is the basis for radiation safety regulations, its scientific validity has been questioned and debated for many decades. The recent memorandum of the International Commission on Radiological Protection admits that the LNT-model predictions at low doses are “speculative, unproven, undetectable and ‘phantom’.” Moreover, numerous experimental, ecological, and epidemiological studies show that low doses of sparsely-ionizing radiation, or of sparsely-ionizing plus highly-ionizing radiation, may be beneficial to human health (hormesis/adaptive response). The present LNT-model-based regulations impose excessive costs on society. For example, the median-cost medical program is 5000 times more cost-efficient in saving lives than controlling radiation emissions. There are also lives lost: e.g., following the Fukushima accident, more than 1000 disaster-related yet non-radiogenic premature deaths were officially registered among the population evacuated due to radiation concerns. Additional negative impacts of LNT-model-inspired radiophobia include: refusal of some patients to undergo potentially life-saving medical imaging; discouragement of the study of low-dose radiation therapies; motivation for radiological terrorism and promotion of nuclear proliferation. PMID:24910586
Many-core graph analytics using accelerated sparse linear algebra routines
NASA Astrophysics Data System (ADS)
Kozacik, Stephen; Paolini, Aaron L.; Fox, Paul; Kelmelis, Eric
2016-05-01
Graph analytics is a key component in identifying emerging trends and threats in many real-world applications. Large-scale graph analytics frameworks provide a convenient and highly scalable platform for developing algorithms to analyze large datasets. Although conceptually scalable, these techniques exhibit poor performance on modern computational hardware. Another model of graph computation has emerged that promises improved performance and scalability by using abstract linear algebra operations as the basis for graph analysis, as laid out by the GraphBLAS standard. By using sparse linear algebra as the basis, existing highly efficient algorithms can be adapted to perform computations on the graph. This approach, however, is often less intuitive to graph analytics experts, who are accustomed to vertex-centric APIs such as Giraph, GraphX, and Tinkerpop. We are developing an implementation of the high-level operations supported by these APIs in terms of linear algebra operations. This implementation is backed by many-core implementations of the fundamental GraphBLAS operations required, and offers the advantages of both the intuitive programming model of a vertex-centric API and the performance of a sparse linear algebra implementation. This technology can reduce the number of nodes required, as well as the run-time for a graph analysis problem, enabling customers to perform more complex analysis with less hardware at lower cost. All of this can be accomplished without requiring the customer to make any changes to their analytics code, thanks to the compatibility with existing graph APIs.
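The core GraphBLAS idea, that graph traversal is sparse linear algebra, fits in a few lines. A hedged sketch (a generic breadth-first search, not the authors' library; the tiny edge list is illustrative): each BFS level expansion is one sparse matrix-vector product followed by a mask of already-visited vertices.

```python
import numpy as np
import scipy.sparse as sp

# BFS as repeated sparse matvecs over an adjacency matrix.
edges = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4)]
n = 5
rows, cols = zip(*edges)
A = sp.csr_matrix((np.ones(len(edges)), (rows, cols)), shape=(n, n))
A = A + A.T                                    # undirected adjacency matrix

frontier = np.zeros(n); frontier[0] = 1.0      # start BFS at vertex 0
visited = frontier > 0
hops = 0
while frontier.any():
    reached = (A.T @ frontier) > 0             # one-hop expansion as a matvec
    frontier = (reached & ~visited).astype(float)
    visited |= reached
    if frontier.any():
        hops += 1
print("BFS hops from vertex 0:", hops, "| all visited:", bool(visited.all()))
```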
Two conditions for equivalence of 0-norm solution and 1-norm solution in sparse representation.
Li, Yuanqing; Amari, Shun-Ichi
2010-07-01
In sparse representation, two important sparse solutions, the 0-norm and 1-norm solutions, have received much attention. The 0-norm solution is the sparsest; however, it is not easy to obtain. Although the 1-norm solution may not be the sparsest, it can be obtained easily by linear programming. In many cases, the 0-norm solution can be obtained by finding the 1-norm solution, and many discussions exist on the equivalence of the two sparse solutions. This paper analyzes two conditions for the equivalence of the two sparse solutions. The first condition is necessary and sufficient, but difficult to verify. The second condition is only necessary, not sufficient, but easy to verify. In this paper, we analyze the second condition within the stochastic framework and propose a variant. We then prove that the equivalence of the two sparse solutions holds with high probability under the variant of the second condition. Furthermore, in the limit case where the 0-norm solution is extremely sparse, the second condition is also a sufficient condition with probability 1.
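The 1-norm route mentioned in the abstract is a plain linear program. A minimal, generic basis-pursuit sketch (not the paper's analysis): split x = u − v with u, v ≥ 0, minimize the total of the parts subject to Ax = b, and check that the recovered support matches the planted 0-norm solution.

```python
import numpy as np
from scipy.optimize import linprog

# Basis pursuit: min ||x||_1 s.t. Ax = b, as an LP via the standard
# splitting x = u - v with u, v >= 0.
rng = np.random.default_rng(0)
m, n, k = 20, 50, 3
A = rng.standard_normal((m, n))
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
b = A @ x_true

c = np.ones(2 * n)                      # objective: sum(u) + sum(v) = ||x||_1
A_eq = np.hstack([A, -A])               # A(u - v) = b
res = linprog(c, A_eq=A_eq, b_eq=b, bounds=(0, None))
x_hat = res.x[:n] - res.x[n:]
print("recovered support:", np.flatnonzero(np.abs(x_hat) > 1e-8))
print("true support:     ", np.flatnonzero(x_true))
```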
Computational efficiency improvements for image colorization
NASA Astrophysics Data System (ADS)
Yu, Chao; Sharma, Gaurav; Aly, Hussein
2013-03-01
We propose an efficient algorithm for colorization of greyscale images. As in prior work, colorization is posed as an optimization problem: a user specifies the color for a few scribbles drawn on the greyscale image and the color image is obtained by propagating color information from the scribbles to surrounding regions, while maximizing the local smoothness of colors. In this formulation, colorization is obtained by solving a large sparse linear system, which normally requires substantial computation and memory resources. Our algorithm improves the computational performance through three innovations over prior colorization implementations. First, the linear system is solved iteratively without explicitly constructing the sparse matrix, which significantly reduces the required memory. Second, we formulate each iteration in terms of integral images obtained by dynamic programming, reducing repetitive computation. Third, we use a coarse-to-fine framework, where a lower resolution subsampled image is first colorized and this low resolution color image is upsampled to initialize the colorization process at the fine level. The improvements we develop provide significant speedup and memory savings compared to the conventional approach of solving the linear system directly using off-the-shelf sparse solvers, and allow us to colorize images with typical sizes encountered in realistic applications on typical commodity computing platforms.
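The first innovation, iterating without assembling the sparse matrix, is the standard matrix-free pattern. A hedged sketch (a 1D smoothing operator stands in for the colorization Laplacian; none of this is the paper's code): SciPy's LinearOperator supplies the matrix action to a conjugate-gradient solver.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

# Matrix-free iterative solve: the matrix action is a function, so the
# sparse system matrix is never explicitly stored.
n = 1000

def matvec(x):
    y = 2.0 * x
    y[:-1] -= x[1:]
    y[1:] -= x[:-1]
    return y + 1e-3 * x                  # small shift keeps it positive definite

A = LinearOperator((n, n), matvec=matvec)
b = np.random.default_rng(1).standard_normal(n)
x, info = cg(A, b, atol=1e-10)
print("converged:", info == 0, "| residual:", np.linalg.norm(matvec(x) - b))
```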
Incomplete Sparse Approximate Inverses for Parallel Preconditioning
Anzt, Hartwig; Huckle, Thomas K.; Bräckle, Jürgen; ...
2017-10-28
In this study, we propose a new preconditioning method that can be seen as a generalization of block-Jacobi methods, or as a simplification of the sparse approximate inverse (SAI) preconditioners. The “Incomplete Sparse Approximate Inverses” (ISAI) are particularly efficient for the solution of sparse triangular linear systems of equations, which arise, for example, in the context of incomplete factorization preconditioning. ISAI preconditioners can be generated via an algorithm providing fine-grained parallelism, which makes them attractive for hardware with a high concurrency level. Finally, in a study covering a large number of matrices, we identify the ISAI preconditioner as an attractive alternative to exact triangular solves in the context of incomplete factorization preconditioning.
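As a point of reference for the generalization the authors describe, here is the block-Jacobi special case in miniature (a generic illustration, not the ISAI algorithm; matrix, block size, and tolerances are arbitrary choices): small diagonal blocks are inverted exactly and applied as the preconditioner inside GMRES.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import LinearOperator, gmres

# Block-Jacobi preconditioning: invert small diagonal blocks exactly and
# apply them as an approximation of A^{-1}.
rng = np.random.default_rng(2)
n, bs = 120, 4
A = (sp.random(n, n, density=0.05, random_state=2) + sp.eye(n) * n).tocsr()

blocks = [np.linalg.inv(A[i:i + bs, i:i + bs].toarray()) for i in range(0, n, bs)]

def apply_prec(x):
    y = np.empty_like(x)
    for j, B in enumerate(blocks):
        s = j * bs
        y[s:s + bs] = B @ x[s:s + bs]     # each block solve is independent
    return y

M = LinearOperator((n, n), matvec=apply_prec)
b = rng.standard_normal(n)
x, info = gmres(A, b, M=M, atol=1e-10)
print("converged:", info == 0)
```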
Zhang, Shang; Dong, Yuhan; Fu, Hongyan; Huang, Shao-Lun; Zhang, Lin
2018-02-22
The miniaturization of spectrometers can broaden the application area of spectrometry and has great academic and industrial value. Among various miniaturization approaches, filter-based miniaturization is a promising implementation that utilizes broadband filters with distinct transmission functions. Mathematically, filter-based spectral reconstruction can be modeled as solving a system of linear equations. In this paper, we propose an algorithm of spectral reconstruction based on sparse optimization and dictionary learning. To verify the feasibility of the reconstruction algorithm, we design and implement a simple prototype of a filter-based miniature spectrometer. The experimental results demonstrate that sparse optimization is well applicable to spectral reconstruction whether the spectra are directly sparse or not. As for the non-directly sparse spectra, their sparsity can be enhanced by dictionary learning. In conclusion, the proposed approach has a bright application prospect in fabricating a practical miniature spectrometer.
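The linear-equations view of filter-based reconstruction is easy to prototype. A hedged toy (synthetic filter transmissions and a planted line spectrum; no numbers here come from the paper): the detector readings y = T s are inverted for a sparse spectrum s with an l1 penalty.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Filter-based spectral reconstruction as sparse recovery: each row of T
# is one broadband filter's transmission sampled on the wavelength grid,
# y holds the detector readings, and s is the (sparse) spectrum we seek.
rng = np.random.default_rng(3)
n_filters, n_wavelengths = 24, 100
T = rng.uniform(0, 1, (n_filters, n_wavelengths))      # synthetic transmissions
s_true = np.zeros(n_wavelengths)
s_true[[20, 45, 70]] = [1.0, 0.6, 0.8]                 # sparse line spectrum
y = T @ s_true

s_hat = Lasso(alpha=1e-3, positive=True, max_iter=50_000).fit(T, y).coef_
print("recovered peaks:", np.flatnonzero(s_hat > 0.05))
```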
Masuda, Y; Misztal, I; Legarra, A; Tsuruta, S; Lourenco, D A L; Fragomeni, B O; Aguilar, I
2017-01-01
This paper evaluates an efficient implementation to multiply the inverse of a numerator relationship matrix for genotyped animals by a vector. The computation is required for solving mixed model equations in single-step genomic BLUP (ssGBLUP) with the preconditioned conjugate gradient (PCG). The inverse can be decomposed into sparse matrices that are blocks of the sparse inverse of a numerator relationship matrix including genotyped animals and their ancestors. These blocks were rapidly calculated with Henderson's rule and stored as sparse matrices in memory. The multiplication was implemented as a series of sparse matrix-vector products. Diagonal elements of the inverse, which were required as preconditioners in PCG, were approximated with a Monte Carlo method using 1,000 samples. The efficient implementation was compared with explicit inversion using 3 data sets including about 15,000, 81,000, and 570,000 genotyped animals selected from populations with 213,000, 8.2 million, and 10.7 million pedigree animals, respectively. The explicit inversion required 1.8 GB, 49 GB, and 2,415 GB (estimated) of memory, respectively, and 42 s, 56 min, and 13.5 d (estimated), respectively, for the computations. The efficient implementation required <1 MB, 2.9 GB, and 2.3 GB of memory, respectively, and <1 s, 3 min, and 5 min, respectively, for setting up. Only <1 s was required for the multiplication in each PCG iteration for any data set. When the equations in ssGBLUP are solved with the PCG algorithm, this multiplication is no longer a limiting factor in the computations.
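The core trick, replacing a huge explicit inverse by repeated sparse matrix-vector products inside PCG, looks like this in a generic setting (a sketch with a synthetic sparse factor, not the ssGBLUP matrices; sizes and densities are illustrative):

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import LinearOperator, cg

# PCG needs only matrix-vector products, so a dense explicit inverse can
# be replaced by a chain of sparse matvecs with the matrix factors.
rng = np.random.default_rng(4)
n = 3000
L = (sp.eye(n) * 2 + sp.random(n, n, density=5e-4, random_state=4)).tocsr()
A_action = LinearOperator((n, n), matvec=lambda v: L @ (L.T @ v))   # A = L L^T

b = rng.standard_normal(n)
x, info = cg(A_action, b, atol=1e-8)

dense_bytes = n * n * 8                      # what an explicit inverse would store
sparse_bytes = L.data.nbytes + L.indices.nbytes + L.indptr.nbytes
print(f"converged: {info == 0}; dense {dense_bytes/1e6:.0f} MB "
      f"vs sparse {sparse_bytes/1e6:.2f} MB")
```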
Multivariate quadrature for representing cloud condensation nuclei activity of aerosol populations
Fierce, Laura; McGraw, Robert L.
2017-07-26
Sparse representations of atmospheric aerosols are needed for efficient regional- and global-scale chemical transport models. Here we introduce a new framework for representing aerosol distributions, based on the quadrature method of moments. Given a set of moment constraints, we show how linear programming, combined with an entropy-inspired cost function, can be used to construct optimized quadrature representations of aerosol distributions. The sparse representations derived from this approach accurately reproduce cloud condensation nuclei (CCN) activity for realistically complex distributions simulated by a particle-resolved model. Additionally, the linear programming techniques described in this study can be used to bound key aerosol properties, such as the number concentration of CCN. Unlike the commonly used sparse representations, such as modal and sectional schemes, the maximum-entropy approach described here is not constrained to pre-determined size bins or assumed distribution shapes. This study is a first step toward a particle-based aerosol scheme that will track multivariate aerosol distributions with sufficient computational efficiency for large-scale simulations.
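The mechanics of a moment-constrained linear program are compact enough to sketch. A hedged toy (the grid, the linear cost, and the reference distribution are all illustrative choices, not the paper's): nonnegative weights on a size grid must reproduce the first six moments, and the LP's vertex solution is automatically sparse, with at most as many nonzero weights as constraints.

```python
import numpy as np
from scipy.optimize import linprog

# Quadrature-style sparse representation: choose nonnegative weights w on
# a size grid so the first few moments of the distribution are matched.
rng = np.random.default_rng(5)
d = np.linspace(0.01, 1.0, 200)                       # particle-size grid
true_w = np.exp(-0.5 * ((np.log(d) + 2) / 0.5) ** 2)  # lognormal-ish population
true_w /= true_w.sum()

orders = np.arange(6)                                  # moment orders 0..5
A_eq = np.vstack([d**k for k in orders])
b_eq = A_eq @ true_w

c = -np.log(d)           # entropy-inspired linear cost (illustrative choice)
res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
w = res.x
print("moments matched:", np.allclose(A_eq @ w, b_eq))
print("nonzero weights in sparse representation:", np.count_nonzero(w > 1e-12))
```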
A linear recurrent kernel online learning algorithm with sparse updates.
Fan, Haijin; Song, Qing
2014-02-01
In this paper, we propose a recurrent kernel algorithm with selectively sparse updates for online learning. The algorithm introduces a linear recurrent term in the estimation of the current output, which makes past information reusable for updating the algorithm in the form of a recurrent gradient term. To ensure that the reuse of this recurrent gradient indeed accelerates the convergence speed, a novel hybrid recurrent training is proposed to switch learning of the recurrent information on or off according to the magnitude of the current training error. Furthermore, the algorithm includes a data-dependent adaptive learning rate which can provide guaranteed system weight convergence at each training iteration. The learning rate is set to zero when the training violates the derived convergence conditions, which makes the algorithm's updating process sparse. Theoretical analyses of the weight convergence are presented and experimental results show the good performance of the proposed algorithm in terms of convergence speed and estimation accuracy. Copyright © 2013 Elsevier Ltd. All rights reserved.
A general parallel sparse-blocked matrix multiply for linear scaling SCF theory
NASA Astrophysics Data System (ADS)
Challacombe, Matt
2000-06-01
A general approach to the parallel sparse-blocked matrix-matrix multiply is developed in the context of linear scaling self-consistent-field (SCF) theory. The data-parallel message passing method uses non-blocking communication to overlap computation and communication. The space-filling curve heuristic is used to achieve data locality for sparse matrix elements that decay with “separation”. Load balance is achieved by solving the bin packing problem for blocks with variable size. With this new method as the kernel, parallel performance of the simplified density matrix minimization (SDMM) for solution of the SCF equations is investigated for RHF/6-31G** water clusters and RHF/3-21G estane globules. Sustained rates above 5.7 GFLOPS for the SDMM have been achieved for (H₂O)₂₀₀ with 95 Origin 2000 processors. Scalability is found to be limited by load imbalance, which increases with decreasing granularity, due primarily to the inhomogeneous distribution of variable block sizes.
Molecular cancer classification using a meta-sample-based regularized robust coding method.
Wang, Shu-Lin; Sun, Liuchao; Fang, Jianwen
2014-01-01
Previous studies have demonstrated that machine learning based molecular cancer classification using gene expression profiling (GEP) data is promising for the clinical diagnosis and treatment of cancer. Novel classification methods with high efficiency and prediction accuracy are still needed to deal with the high dimensionality and small sample size of typical GEP data. Recently the sparse representation (SR) method has been successfully applied to cancer classification. Nevertheless, its efficiency needs to be improved when analyzing large-scale GEP data. In this paper we present the meta-sample-based regularized robust coding classification (MRRCC), a novel effective cancer classification technique that combines the idea of the meta-sample-based cluster method with the regularized robust coding (RRC) method. It assumes that the coding residual and the coding coefficients are respectively independent and identically distributed. Similar to meta-sample-based SR classification (MSRC), MRRCC extracts a set of meta-samples from the training samples, and then encodes a testing sample as a sparse linear combination of these meta-samples. The representation fidelity is measured by the l2-norm or l1-norm of the coding residual. Extensive experiments on publicly available GEP datasets demonstrate that the proposed method is more efficient while its prediction accuracy is equivalent to existing MSRC-based methods and better than other state-of-the-art dimension reduction based methods.
Orthogonal sparse linear discriminant analysis
NASA Astrophysics Data System (ADS)
Liu, Zhonghua; Liu, Gang; Pu, Jiexin; Wang, Xiaohong; Wang, Haijun
2018-03-01
Linear discriminant analysis (LDA) is a linear feature extraction approach that has received much attention. Researchers have built extensively on LDA, and many variants have been proposed; however, these variants do not fully resolve the inherent problems of LDA. The major disadvantages of the classical LDA are as follows. First, it is sensitive to outliers and noise. Second, only the global discriminant structure is preserved, while the local discriminant information is ignored. In this paper, we present a new orthogonal sparse linear discriminant analysis (OSLDA) algorithm. The k-nearest-neighbour graph is first constructed to preserve the locality discriminant information of sample points. Then, an L2,1-norm constraint on the projection matrix is employed as the loss function, which makes the proposed method robust to outliers in the data. Extensive experiments have been performed on several standard public image databases, and the results demonstrate the effectiveness of the proposed OSLDA algorithm.
NASA Astrophysics Data System (ADS)
Vishnukumar, S.; Wilscy, M.
2017-12-01
In this paper, we propose a single image Super-Resolution (SR) method based on Compressive Sensing (CS) and Improved Total Variation (TV) Minimization Sparse Recovery. In the CS framework, the low-resolution (LR) image is treated as the compressed version of the high-resolution (HR) image. Dictionary training and sparse recovery are the two phases of the method. The K-Singular Value Decomposition (K-SVD) method is used for dictionary training, and the dictionary represents HR image patches in a sparse manner. Here, only the interpolated version of the LR image is used for training, thereby exploiting the structural self-similarity inherent in the LR image. In the sparse recovery phase, the sparse representation coefficients of LR image patches with respect to the trained dictionary are derived using the Improved TV Minimization method. The HR image can then be reconstructed as a linear combination of the dictionary atoms weighted by the sparse coefficients. The experimental results show that the proposed method gives better results quantitatively as well as qualitatively on both natural and remote sensing images. The reconstructed images have better visual quality since edges and other sharp details are preserved.
Sparse Regression as a Sparse Eigenvalue Problem
NASA Technical Reports Server (NTRS)
Moghaddam, Baback; Gruber, Amit; Weiss, Yair; Avidan, Shai
2008-01-01
We extend the l0-norm "subspectral" algorithms for sparse-LDA [5] and sparse-PCA [6] to general quadratic costs such as MSE in linear (kernel) regression. The resulting "Sparse Least Squares" (SLS) problem is also NP-hard, by way of its equivalence to a rank-1 sparse eigenvalue problem (e.g., binary sparse-LDA [7]). Specifically, for a general quadratic cost we use a highly efficient technique for direct eigenvalue computation using partitioned matrix inverses which leads to dramatic ×10³ speed-ups over standard eigenvalue decomposition. This increased efficiency mitigates the O(n⁴) scaling behaviour that up to now has limited the previous algorithms' utility for high-dimensional learning problems. Moreover, the new computation prioritizes the role of the less-myopic backward elimination stage, which becomes more efficient than forward selection. Similarly, branch-and-bound search for Exact Sparse Least Squares (ESLS) also benefits from partitioned matrix inverse techniques. Our Greedy Sparse Least Squares (GSLS) generalizes Natarajan's algorithm [9], also known as Order-Recursive Matching Pursuit (ORMP). Specifically, the forward half of GSLS is exactly equivalent to ORMP but more efficient. By including the backward pass, which only doubles the computation, we can achieve lower MSE than ORMP. Experimental comparisons to the state-of-the-art LARS algorithm [3] show forward-GSLS is faster, more accurate and more flexible in terms of choice of regularization.
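For readers who want to try the forward half, ORMP is what scikit-learn ships as orthogonal matching pursuit. A minimal, generic demo (not the GSLS code; sizes and coefficients are arbitrary): plant a 3-sparse coefficient vector and check that greedy forward selection finds its support.

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

# Greedy forward selection for sparse least squares: pick one atom at a
# time, re-fitting the residual at each step.
rng = np.random.default_rng(6)
A = rng.standard_normal((80, 200))
x = np.zeros(200)
x[[5, 60, 150]] = [1.5, -2.0, 0.7]
y = A @ x + 0.01 * rng.standard_normal(80)

omp = OrthogonalMatchingPursuit(n_nonzero_coefs=3).fit(A, y)
print("selected atoms:", np.flatnonzero(omp.coef_))   # expect {5, 60, 150}
```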
Algorithms for solving large sparse systems of simultaneous linear equations on vector processors
NASA Technical Reports Server (NTRS)
David, R. E.
1984-01-01
Very efficient algorithms for solving large sparse systems of simultaneous linear equations have been developed for serial processing computers. These involve a reordering of matrix rows and columns in order to obtain a near triangular pattern of nonzero elements. Then an LU factorization is developed to represent the matrix inverse in terms of a sequence of elementary Gaussian eliminations, or pivots. In this paper it is shown how these algorithms are adapted for efficient implementation on vector processors. Results obtained on the CYBER 200 Model 205 are presented for a series of large test problems which show the comparative advantages of the triangularization and vector processing algorithms.
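The reorder-then-factor pipeline described here survives almost unchanged in modern libraries. A small sketch using SciPy's SuperLU interface (a serial stand-in for the vectorized solvers discussed, with an illustrative tridiagonal test matrix): a fill-reducing column permutation is chosen, the LU factors encode the sequence of pivots, and subsequent solves reuse them.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu

# Sparse LU with fill-reducing reordering: factor once, then each solve
# replays the stored sequence of elementary eliminations.
n = 5000
A = sp.diags([-1.0, 4.0, -1.0], [-1, 0, 1], shape=(n, n), format="csc")
lu = splu(A, permc_spec="COLAMD")        # column reordering before factorization
b = np.ones(n)
x = lu.solve(b)
print("residual:", np.linalg.norm(A @ x - b))
```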
The challenge of precise orbit determination for STSAT-2C using extremely sparse SLR data
NASA Astrophysics Data System (ADS)
Kim, Young-Rok; Park, Eunseo; Kucharski, Daniel; Lim, Hyung-Chul; Kim, Byoungsoo
2016-03-01
The Science and Technology Satellite (STSAT)-2C is the first Korean satellite equipped with a laser retro-reflector array for satellite laser ranging (SLR). SLR is the only on-board tracking source for precise orbit determination (POD) of STSAT-2C. However, POD for the STSAT-2C is a challenging issue, as the laser measurements of the satellite are extremely sparse, largely due to the inaccurate two-line element (TLE)-based orbit predictions used by the SLR tracking stations. In this study, POD for the STSAT-2C using extremely sparse SLR data is successfully implemented, and new laser-based orbit predictions are obtained. The NASA/GSFC GEODYN II software and seven-day arcs are used for the SLR data processing of two years of normal points from March 2013 to May 2015. To compensate for the extremely sparse laser tracking, the number of estimation parameters is minimized, and only the atmospheric drag coefficients are estimated, with various intervals. The POD results show that the weighted root mean square (RMS) post-fit residuals are less than 10 m, and the 3D orbit differences at day boundaries vary from 30 m to 3 km. The average four-day orbit overlaps are less than 20/330/20 m for the radial/along-track/cross-track components. The quality of the new laser-based prediction is verified by SLR observations, and the SLR residuals show better results than those of previous TLE-based predictions. This study demonstrates that POD for the STSAT-2C can be successfully achieved despite the extreme sparseness of the SLR data, and the results can deliver more accurate predictions.
Sparse Modeling of Human Actions from Motion Imagery
2011-09-02
is here developed. Spatio-temporal features that characterize local changes in the image are first extracted. This is followed by the learning of a...video comes from the optimal sparse linear combination of the learned basis vectors (action primitives) representing the actions. A low...computational cost deep-layer model learning the inter-class correlations of the data is added for increasing discriminative power. In spite of its simplicity
WAVELET-DOMAIN REGRESSION AND PREDICTIVE INFERENCE IN PSYCHIATRIC NEUROIMAGING
Reiss, Philip T.; Huo, Lan; Zhao, Yihong; Kelly, Clare; Ogden, R. Todd
2016-01-01
An increasingly important goal of psychiatry is the use of brain imaging data to develop predictive models. Here we present two contributions to statistical methodology for this purpose. First, we propose and compare a set of wavelet-domain procedures for fitting generalized linear models with scalar responses and image predictors: sparse variants of principal component regression and of partial least squares, and the elastic net. Second, we consider assessing the contribution of image predictors over and above available scalar predictors, in particular via permutation tests and an extension of the idea of confounding to the case of functional or image predictors. Using the proposed methods, we assess whether maps of a spontaneous brain activity measure, derived from functional magnetic resonance imaging, can meaningfully predict presence or absence of attention deficit/hyperactivity disorder (ADHD). Our results shed light on the role of confounding in the surprising outcome of the recent ADHD-200 Global Competition, which challenged researchers to develop algorithms for automated image-based diagnosis of the disorder. PMID:27330652
Sparse/DCT (S/DCT) two-layered representation of prediction residuals for video coding.
Kang, Je-Won; Gabbouj, Moncef; Kuo, C-C Jay
2013-07-01
In this paper, we propose a cascaded sparse/DCT (S/DCT) two-layer representation of prediction residuals, and implement this idea on top of the state-of-the-art high efficiency video coding (HEVC) standard. First, a dictionary is adaptively trained to contain featured patterns of residual signals so that a high portion of the energy in a structured residual can be efficiently coded via sparse coding. It is observed that the sparse representation alone is less effective in terms of rate-distortion (R-D) performance, due to the side-information overhead at higher bit rates. To overcome this problem, the DCT representation is cascaded at the second stage. It is applied to the remaining signal to improve coding efficiency. The two representations successfully complement each other. Experimental results demonstrate that the proposed algorithm outperforms the HEVC reference codec HM5.0 under the Common Test Condition.
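The cascade itself is simple to prototype. A hedged toy (random dictionary, illustrative thresholds, no entropy coding, so not the paper's codec): stage one takes a few matching-pursuit atoms from a dictionary, stage two codes the remainder with a thresholded DCT.

```python
import numpy as np
from scipy.fft import dctn, idctn

# Two-layer residual coding sketch: sparse layer first, DCT layer second.
rng = np.random.default_rng(7)
D = rng.standard_normal((64, 256))
D /= np.linalg.norm(D, axis=0)                  # unit-norm atoms
block = rng.standard_normal((8, 8))             # a residual block

# Stage 1: a few greedy matching-pursuit passes, kept deliberately tiny.
r = block.ravel().copy()
code = np.zeros(256)
for _ in range(4):
    k = np.argmax(np.abs(D.T @ r))              # best-correlated atom
    code[k] += D[:, k] @ r
    r = block.ravel() - D @ code

# Stage 2: DCT of the remaining signal, keeping only large coefficients.
C = dctn(r.reshape(8, 8), norm="ortho")
C[np.abs(C) < 0.5] = 0.0
recon = (D @ code).reshape(8, 8) + idctn(C, norm="ortho")
print("coding error:", np.linalg.norm(block - recon))
```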
Hieke, Stefanie; Benner, Axel; Schlenk, Richard F; Schumacher, Martin; Bullinger, Lars; Binder, Harald
2016-08-30
High-throughput technology allows for genome-wide measurements at different molecular levels for the same patient, e.g. single nucleotide polymorphisms (SNPs) and gene expression. Correspondingly, it might be beneficial to also integrate complementary information from different molecular levels when building multivariable risk prediction models for a clinical endpoint, such as treatment response or survival. Unfortunately, such a high-dimensional modeling task will often be complicated by a limited overlap of molecular measurements at different levels between patients, i.e. measurements from all molecular levels are available only for a smaller proportion of patients. We propose a sequential strategy for building clinical risk prediction models that integrate genome-wide measurements from two molecular levels in a complementary way. To deal with partial overlap, we develop an imputation approach that allows us to use all available data. This approach is investigated in two acute myeloid leukemia applications combining gene expression with either SNP or DNA methylation data. After obtaining a sparse risk prediction signature e.g. from SNP data, an automatically selected set of prognostic SNPs, by componentwise likelihood-based boosting, imputation is performed for the corresponding linear predictor by a linking model that incorporates e.g. gene expression measurements. The imputed linear predictor is then used for adjustment when building a prognostic signature from the gene expression data. For evaluation, we consider stability, as quantified by inclusion frequencies across resampling data sets. Despite an extremely small overlap in the application example with gene expression and SNPs, several genes are seen to be more stably identified when taking the (imputed) linear predictor from the SNP data into account. In the application with gene expression and DNA methylation, prediction performance with respect to survival also indicates that the proposed approach might work well. We consider imputation of linear predictor values to be a feasible and sensible approach for dealing with partial overlap in complementary integrative analysis of molecular measurements at different levels. More generally, these results indicate that a complementary strategy for integrating different molecular levels can result in more stable risk prediction signatures, potentially providing a more reliable insight into the underlying biology.
Statistical prediction with Kanerva's sparse distributed memory
NASA Technical Reports Server (NTRS)
Rogers, David
1989-01-01
A new viewpoint of the processing performed by Kanerva's sparse distributed memory (SDM) is presented. In conditions of near- or over-capacity, where the associative-memory behavior of the model breaks down, the processing performed by the model can be interpreted as that of a statistical predictor. Mathematical results are presented which serve as the framework for a new statistical viewpoint of sparse distributed memory and for which the standard formulation of SDM is a special case. This viewpoint suggests possible enhancements to the SDM model, including a procedure for improving the predictiveness of the system based on Holland's work with genetic algorithms, and a method for improving the capacity of SDM even when used as an associative memory.
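To make the memory model concrete, here is a minimal numpy sparse distributed memory (a generic textbook construction with illustrative sizes, not Kanerva's exact parameters): random hard locations, counter updates inside a Hamming radius on write, and summed, thresholded counters on read.

```python
import numpy as np

# Minimal sparse distributed memory: hard locations are random binary
# addresses; a write updates counters at all locations within the access
# radius, and a read sums those counters and thresholds the result.
rng = np.random.default_rng(8)
n_bits, n_locs, radius = 256, 2000, 112

addresses = rng.integers(0, 2, (n_locs, n_bits))
counters = np.zeros((n_locs, n_bits), dtype=int)

def activate(addr):
    return (addresses != addr).sum(axis=1) <= radius   # Hamming-distance ball

def write(addr, data):
    counters[activate(addr)] += 2 * data - 1           # +1 for 1-bits, -1 for 0-bits

def read(addr):
    return (counters[activate(addr)].sum(axis=0) > 0).astype(int)

pattern = rng.integers(0, 2, n_bits)
write(pattern, pattern)                                # auto-association
noisy = pattern.copy()
noisy[:20] ^= 1                                        # flip 20 bits
print("fraction of bits recovered:", (read(noisy) == pattern).mean())
```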
Label consistent K-SVD: learning a discriminative dictionary for recognition.
Jiang, Zhuolin; Lin, Zhe; Davis, Larry S
2013-11-01
A label consistent K-SVD (LC-KSVD) algorithm to learn a discriminative dictionary for sparse coding is presented. In addition to using class labels of training data, we also associate label information with each dictionary item (columns of the dictionary matrix) to enforce discriminability in sparse codes during the dictionary learning process. More specifically, we introduce a new label consistency constraint called "discriminative sparse-code error" and combine it with the reconstruction error and the classification error to form a unified objective function. The optimal solution is efficiently obtained using the K-SVD algorithm. Our algorithm learns a single overcomplete dictionary and an optimal linear classifier jointly. The incremental dictionary learning algorithm is presented for the situation of limited memory resources. It yields dictionaries so that feature points with the same class labels have similar sparse codes. Experimental results demonstrate that our algorithm outperforms many recently proposed sparse-coding techniques for face, action, scene, and object category recognition under the same learning conditions.
Data Assimilation and Propagation of Uncertainty in Multiscale Cardiovascular Simulation
NASA Astrophysics Data System (ADS)
Schiavazzi, Daniele; Marsden, Alison
2015-11-01
Cardiovascular modeling is the application of computational tools to predict hemodynamics. State-of-the-art techniques couple a 3D incompressible Navier-Stokes solver with a boundary circulation model and can predict local and peripheral hemodynamics, analyze the post-operative performance of surgical designs, and complement clinical data collection, minimizing invasive and risky measurement practices. The ability of these tools to make useful predictions is directly related to their accuracy in representing measured physiologies. Tuning of model parameters is therefore a topic of paramount importance and should include clinical data uncertainty, revealing how this uncertainty will affect the predictions. We propose a fully Bayesian, multi-level approach to data assimilation of uncertain clinical data in multiscale circulation models. To reduce the computational cost, we use a stable, condensed approximation of the 3D model built by linear sparse regression of the pressure/flow rate relationship at the outlets. Finally, we consider the problem of non-invasively propagating the uncertainty in model parameters to the resulting hemodynamics and compare Monte Carlo simulation with Stochastic Collocation approaches based on Polynomial or Multi-resolution Chaos expansions.
Joint sparse representation for robust multimodal biometrics recognition.
Shekhar, Sumit; Patel, Vishal M; Nasrabadi, Nasser M; Chellappa, Rama
2014-01-01
Traditional biometric recognition systems rely on a single biometric signature for authentication. While the advantage of using multiple sources of information for establishing the identity has been widely recognized, computational models for multimodal biometrics recognition have only recently received attention. We propose a multimodal sparse representation method, which represents the test data by a sparse linear combination of training data, while constraining the observations from different modalities of the test subject to share their sparse representations. Thus, we simultaneously take into account correlations as well as coupling information among biometric modalities. A multimodal quality measure is also proposed to weigh each modality as it gets fused. Furthermore, we also kernelize the algorithm to handle nonlinearity in data. The optimization problem is solved using an efficient alternative direction method. Various experiments show that the proposed method compares favorably with competing fusion-based methods.
Three-dimensional unstructured grid Euler computations using a fully-implicit, upwind method
NASA Technical Reports Server (NTRS)
Whitaker, David L.
1993-01-01
A method has been developed to solve the Euler equations on a three-dimensional unstructured grid composed of tetrahedra. The method uses an upwind flow solver with a linearized, backward-Euler time integration scheme. Each time step results in a sparse linear system of equations which is solved by an iterative, sparse matrix solver. Local-time stepping, switched evolution relaxation (SER), preconditioning and reuse of the Jacobian are employed to accelerate the convergence rate. Implicit boundary conditions were found to be extremely important for fast convergence. Numerical experiments have shown that convergence rates comparable to that of a multigrid, central-difference scheme are achievable on the same mesh. Results are presented for several grids about an ONERA M6 wing.
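The solver pattern in this abstract, linearize, form a sparse system per time step, hand it to an iterative solver, is worth seeing in miniature. A hedged sketch (a 1D diffusion stencil stands in for the Euler-equation Jacobian, and GMRES plays the iterative sparse solver; all sizes are illustrative):

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import gmres

# One linearized backward-Euler step solves (I/dt - J) dU = R(U), where J
# is the sparse Jacobian of the spatial residual R.
n, dt = 400, 0.1
J = sp.diags([1.0, -2.0, 1.0], [-1, 0, 1], shape=(n, n), format="csr")
U = np.exp(-((np.linspace(0.0, 1.0, n) - 0.5) / 0.05) ** 2)   # initial bump

I = sp.eye(n, format="csr")
for step in range(10):
    R = J @ U                               # residual of the semi-discrete system
    dU, info = gmres(I / dt - J, R, atol=1e-10)
    assert info == 0                        # iterative solve converged
    U = U + dU                              # advance one implicit time step

print("peak after 10 implicit steps:", round(float(U.max()), 4))
```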
Completing sparse and disconnected protein-protein network by deep learning.
Huang, Lei; Liao, Li; Wu, Cathy H
2018-03-22
Protein-protein interaction (PPI) prediction remains a central task in systems biology to achieve a better and holistic understanding of cellular and intracellular processes. Recently, an increasing number of computational methods have shifted from pair-wise prediction to network-level prediction. Many of the existing network-level methods predict PPIs under the assumption that the training network should be connected. However, this assumption greatly affects the prediction power and limits the application area, because the current gold-standard PPI networks are usually very sparse and disconnected. Therefore, how to effectively predict PPIs based on a training network that is sparse and disconnected remains a challenge. In this work, we developed a novel PPI prediction method based on a deep learning neural network and the regularized Laplacian kernel. We use a neural network with an autoencoder-like architecture to implicitly simulate the evolutionary processes of a PPI network. Neurons of the output layer correspond to proteins and are labeled with values (1 for interaction and 0 otherwise) from the adjacency matrix of a sparse disconnected training PPI network. Unlike an autoencoder, neurons at the input layer are given all-zero input, reflecting an assumption of no a priori knowledge about PPIs, and hidden layers of smaller sizes mimic the ancient interactome at different times during evolution. After the training step, an evolved PPI network whose rows are outputs of the neural network can be obtained. We then predict PPIs by applying the regularized Laplacian kernel to the transition matrix that is built upon the evolved PPI network. The results from cross-validation experiments show that the PPI prediction accuracies for yeast data and human data, measured as AUC, are increased by up to 8.4% and 14.9%, respectively, as compared to the baseline. Moreover, the evolved PPI network can also help us leverage complementary information from the disconnected training network and multiple heterogeneous data sources. Tested on the yeast data with six heterogeneous feature kernels, the results show our method can further improve the prediction performance by up to 2%, which is very close to an upper bound that is obtained by an Approximate Bayesian Computation based sampling method. The proposed evolution deep neural network, coupled with the regularized Laplacian kernel, is an effective tool for completing sparse and disconnected PPI networks and for facilitating integration of heterogeneous data sources.
Computing Gröbner Bases within Linear Algebra
NASA Astrophysics Data System (ADS)
Suzuki, Akira
In this paper, we present an alternative algorithm to compute Gröbner bases, which is based on computations in sparse linear algebra. Both S-polynomial computations and monomial reductions are carried out as linear algebra operations, simultaneously, in this algorithm, so it can be implemented in any computational system that can handle linear algebra. For a given ideal in a polynomial ring, it calculates a Gröbner basis together with an appropriate corresponding term order.
Hunt, Jonathan J.; Dayan, Peter; Goodhill, Geoffrey J.
2013-01-01
Receptive fields acquired through unsupervised learning of sparse representations of natural scenes have similar properties to primary visual cortex (V1) simple cell receptive fields. However, what drives in vivo development of receptive fields remains controversial. The strongest evidence for the importance of sensory experience in visual development comes from receptive field changes in animals reared with abnormal visual input. However, most sparse coding accounts have considered only normal visual input and the development of monocular receptive fields. Here, we applied three sparse coding models to binocular receptive field development across six abnormal rearing conditions. In every condition, the changes in receptive field properties previously observed experimentally were matched to a similar and highly faithful degree by all the models, suggesting that early sensory development can indeed be understood in terms of an impetus towards sparsity. As previously predicted in the literature, we found that asymmetries in inter-ocular correlation across orientations lead to orientation-specific binocular receptive fields. Finally we used our models to design a novel stimulus that, if present during rearing, is predicted by the sparsity principle to lead robustly to radically abnormal receptive fields. PMID:23675290
Sparse regressions for predicting and interpreting subcellular localization of multi-label proteins.
Wan, Shibiao; Mak, Man-Wai; Kung, Sun-Yuan
2016-02-24
Predicting protein subcellular localization is indispensable for inferring protein functions. Recent studies have focused on predicting not only single-location proteins, but also multi-location proteins. Almost all of the high-performing predictors proposed recently use gene ontology (GO) terms to construct feature vectors for classification. Despite their high performance, their prediction decisions are difficult to interpret because of the large number of GO terms involved. This paper proposes using sparse regressions to exploit GO information for both predicting and interpreting subcellular localization of single- and multi-location proteins. Specifically, we compared two multi-label sparse regression algorithms, namely multi-label LASSO (mLASSO) and multi-label elastic net (mEN), for large-scale predictions of protein subcellular localization. Both algorithms can yield sparse and interpretable solutions. By using the one-vs-rest strategy, mLASSO and mEN identified 87 and 429 out of more than 8,000 GO terms, respectively, which play essential roles in determining subcellular localization. More interestingly, many of the GO terms selected by mEN are from the biological process and molecular function categories, suggesting that the GO terms of these categories also play vital roles in the prediction. With these essential GO terms, one can determine not only where a protein is located, but also why it resides there. Experimental results show that the output of both mEN and mLASSO is interpretable and that they perform significantly better than existing state-of-the-art predictors. Moreover, mEN selects more features and performs better than mLASSO on a stringent human benchmark dataset. For readers' convenience, an online server called SpaPredictor for both mLASSO and mEN is available at http://bioinfo.eie.polyu.edu.hk/SpaPredictorServer/.
Burant, Aniela; Lowry, Gregory V; Karamalidis, Athanasios K
2016-02-01
Treatment and reuse of brines produced from energy extraction activities requires aqueous solubility data for organic compounds in saline solutions. The presence of salts decreases the aqueous solubility of organic compounds (i.e., the salting-out effect), which can be modeled using the Setschenow equation, whose validity has not previously been assessed at high salt concentrations. In this study, we used solid-phase microextraction to determine Setschenow constants for selected organic compounds in aqueous solutions up to 2-5 M NaCl, 1.5-2 M CaCl2, and in Na-Ca binary electrolyte solutions to assess additivity of the constants. These compounds exhibited log-linear behavior up to these high NaCl concentrations. Log-linear decreases in solubility with increasing salt concentration were observed up to 1.5-2 M CaCl2 for all compounds, adding to a sparse database of CaCl2 Setschenow constants. Setschenow constants were additive in binary electrolyte mixtures. New models to predict CaCl2 and KCl Setschenow constants from NaCl Setschenow constants were developed, and these successfully predicted the solubility of the compounds measured in this study. Overall, the data show that the Setschenow equation is valid for a wide range of salinity conditions typically found in energy-related technologies. Copyright © 2015 Elsevier Ltd. All rights reserved.
Comparison of l₁-Norm SVR and Sparse Coding Algorithms for Linear Regression.
Zhang, Qingtian; Hu, Xiaolin; Zhang, Bo
2015-08-01
Support vector regression (SVR) is a popular function estimation technique based on Vapnik's concept of the support vector machine. Among many variants, the l1-norm SVR is known to be good at selecting useful features when the features are redundant. Sparse coding (SC) is a technique widely used in many areas, and a number of efficient algorithms are available. Both l1-norm SVR and SC can be used for linear regression. In this brief, the close connection between the l1-norm SVR and SC is revealed and some typical algorithms are compared for linear regression. The results show that the SC algorithms outperform the Newton linear programming algorithm, an efficient l1-norm SVR algorithm, in efficiency. The algorithms are then used to design radial basis function (RBF) neural networks. Experiments on some benchmark data sets demonstrate the high efficiency of the SC algorithms. In particular, one of the SC algorithms, orthogonal matching pursuit, is two orders of magnitude faster than a well-known RBF network design algorithm, the orthogonal least squares algorithm.
Convergence Speed of a Dynamical System for Sparse Recovery
NASA Astrophysics Data System (ADS)
Balavoine, Aurele; Rozell, Christopher J.; Romberg, Justin
2013-09-01
This paper studies the convergence rate of a continuous-time dynamical system for L1-minimization, known as the Locally Competitive Algorithm (LCA). Solving L1-minimization problems efficiently and rapidly is of great interest to the signal processing community, as these programs have been shown to recover sparse solutions to underdetermined systems of linear equations and come with strong performance guarantees. The LCA under study differs from the typical L1 solver in that it operates in continuous time: instead of being specified by discrete iterations, it evolves according to a system of nonlinear ordinary differential equations. The LCA is constructed from simple components, giving it the potential to be implemented as a large-scale analog circuit. The goal of this paper is to give guarantees on the convergence time of the LCA system. To do so, we analyze how the LCA evolves as it is recovering a sparse signal from underdetermined measurements. We show that under appropriate conditions on the measurement matrix and the problem parameters, the path the LCA follows can be described as a sequence of linear differential equations, each with a small number of active variables. This allows us to relate the convergence time of the system to the restricted isometry constant of the matrix. Interesting parallels to sparse-recovery digital solvers emerge from this study. Our analysis covers both the noisy and noiseless settings and is supported by simulation results.
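The LCA dynamics are short enough to simulate directly. A hedged sketch (the textbook form of the ODE with a soft-threshold activation and forward-Euler integration; all sizes and constants are illustrative, and this is not the paper's analysis code): the state obeys u̇ = (1/τ)(Φᵀy − u − (ΦᵀΦ − I)a), with output a the thresholded state.

```python
import numpy as np

# Locally Competitive Algorithm: continuous-time dynamics whose fixed
# points solve the L1-regularized least-squares problem.
rng = np.random.default_rng(9)
m, n, lam, tau, dt = 50, 100, 0.1, 1.0, 0.01
Phi = rng.standard_normal((m, n)) / np.sqrt(m)
x = np.zeros(n)
x[[7, 33, 80]] = [1.0, -1.2, 0.8]
y = Phi @ x

def thresh(u):                       # soft-threshold activation
    return np.sign(u) * np.maximum(np.abs(u) - lam, 0.0)

u = np.zeros(n)
G = Phi.T @ Phi - np.eye(n)          # lateral inhibition between active nodes
for _ in range(5000):                # forward-Euler integration of the ODE
    a = thresh(u)
    u += (dt / tau) * (Phi.T @ y - u - G @ a)

print("active nodes:", np.flatnonzero(np.abs(thresh(u)) > 1e-3))
```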
Massively parallel sparse matrix function calculations with NTPoly
NASA Astrophysics Data System (ADS)
Dawson, William; Nakajima, Takahito
2018-04-01
We present NTPoly, a massively parallel library for computing the functions of sparse, symmetric matrices. The theory of matrix functions is a well-developed framework with a wide range of applications including differential equations, graph theory, and electronic structure calculations. One particularly important application area is diagonalization-free methods in quantum chemistry. When the input and output of the matrix function are sparse, methods based on polynomial expansions can be used to compute matrix functions in linear time. We present a library based on these methods that can compute a variety of matrix functions. Distributed memory parallelization is based on a communication-avoiding sparse matrix multiplication algorithm. OpenMP task parallelization is utilized to implement hybrid parallelization. We describe NTPoly's interface and show how it can be integrated with programs written in many different programming languages. We demonstrate the merits of NTPoly by performing large scale calculations on the K computer.
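One classic diagonalization-free polynomial method of the kind such libraries implement is density-matrix purification. A hedged sketch in SciPy (a generic McWeeny iteration on a toy tight-binding Hamiltonian, not NTPoly's algorithm; spectral bounds and drop threshold are illustrative):

```python
import numpy as np
import scipy.sparse as sp

# McWeeny purification: P <- 3P^2 - 2P^3 drives eigenvalues in [0,1]
# toward {0,1}, yielding the density matrix without diagonalization.
n = 200
H = sp.diags([-1.0, 0.0, -1.0], [-1, 0, 1], shape=(n, n), format="csr")
mu = 0.0                                        # chemical potential (half filling)
emin, emax = -2.1, 2.1                          # bounds on the spectrum of H
P = ((mu * sp.eye(n) - H) / (emax - emin) + 0.5 * sp.eye(n)).tocsr()

for _ in range(30):
    P2 = P @ P
    P = 3 * P2 - 2 * (P2 @ P)
    P.data[np.abs(P.data) < 1e-10] = 0.0        # drop tiny entries: keep it sparse
    P.eliminate_zeros()

print("idempotency error:", abs((P @ P - P).toarray()).max())
print("electron count (trace):", round(P.diagonal().sum(), 2))   # ~n/2 here
```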
A Case Study on a Combination NDVI Forecasting Model Based on the Entropy Weight Method
DOE Office of Scientific and Technical Information (OSTI.GOV)
Huang, Shengzhi; Ming, Bo; Huang, Qiang
It is critically meaningful to accurately predict NDVI (Normalized Difference Vegetation Index), which helps guide regional ecological remediation and environmental management. In this study, a combination forecasting model (CFM) was proposed to improve the performance of NDVI predictions in the Yellow River Basin (YRB) based on three individual forecasting models, i.e., the Multiple Linear Regression (MLR), Artificial Neural Network (ANN), and Support Vector Machine (SVM) models. The entropy weight method was employed to determine the weight coefficient for each individual model depending on its predictive performance. Results showed that: (1) ANN exhibits the highest fitting capability among the four forecasting models in the calibration period, whilst its generalization ability becomes weak in the validation period; MLR has a poor performance in both calibration and validation periods; the predicted results of CFM in the calibration period have the highest stability; (2) CFM generally outperforms all individual models in the validation period, and can improve the reliability and stability of predicted results through combining the strengths while reducing the weaknesses of individual models; (3) the performances of all forecasting models are better in dense vegetation areas than in sparse vegetation areas.
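One common formulation of the entropy weight method is easy to sketch; the exact weighting scheme in the paper may differ, and the data below are synthetic. The idea: build a proportion matrix from each model's errors, compute the information entropy of each column, and give models with lower entropy (more informative error structure) larger weights in the combination.

```python
import numpy as np

# Entropy-weight combination of three individual forecasts (toy data).
rng = np.random.default_rng(10)
truth = rng.uniform(0.2, 0.8, 60)                       # NDVI-like series
preds = {
    "MLR": truth + rng.normal(0, 0.08, 60),
    "ANN": truth + rng.normal(0, 0.03, 60),
    "SVM": truth + rng.normal(0, 0.05, 60),
}

E = np.column_stack([np.abs(p - truth) for p in preds.values()]) + 1e-12
P = E / E.sum(axis=0)                                   # column-wise proportions
entropy = -(P * np.log(P)).sum(axis=0) / np.log(len(truth))
divergence = 1.0 - entropy                              # degree of divergence
w = divergence / divergence.sum()                       # entropy weights

combo = sum(wi * p for wi, p in zip(w, preds.values()))
print("weights:", dict(zip(preds, w.round(3))))
print("combined RMSE:", np.sqrt(np.mean((combo - truth) ** 2)).round(4))
```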
NASA Astrophysics Data System (ADS)
Riplinger, Christoph; Pinski, Peter; Becker, Ute; Valeev, Edward F.; Neese, Frank
2016-01-01
Domain based local pair natural orbital coupled cluster theory with single-, double-, and perturbative triple excitations (DLPNO-CCSD(T)) is a highly efficient local correlation method. It is known to be accurate and robust and can be used in a black box fashion in order to obtain coupled cluster quality total energies for large molecules with several hundred atoms. While previous implementations showed near linear scaling up to a few hundred atoms, several nonlinear scaling steps limited the applicability of the method for very large systems. In this work, these limitations are overcome and a linear scaling DLPNO-CCSD(T) method for closed shell systems is reported. The new implementation is based on the concept of sparse maps that was introduced in Part I of this series [P. Pinski, C. Riplinger, E. F. Valeev, and F. Neese, J. Chem. Phys. 143, 034108 (2015)]. Using the sparse map infrastructure, all essential computational steps (integral transformation and storage, initial guess, pair natural orbital construction, amplitude iterations, triples correction) are achieved in a linear scaling fashion. In addition, a number of additional algorithmic improvements are reported that lead to significant speedups of the method. The new, linear-scaling DLPNO-CCSD(T) implementation typically is 7 times faster than the previous implementation and consumes 4 times less disk space for large three-dimensional systems. For linear systems, the performance gains and memory savings are substantially larger. Calculations with more than 20 000 basis functions and 1000 atoms are reported in this work. In all cases, the time required for the coupled cluster step is comparable to or lower than for the preceding Hartree-Fock calculation, even if this is carried out with the efficient resolution-of-the-identity and chain-of-spheres approximations. The new implementation even reduces the error in absolute correlation energies by about a factor of two, compared to the already accurate previous implementation.
Natural image sequences constrain dynamic receptive fields and imply a sparse code.
Häusler, Chris; Susemihl, Alex; Nawrot, Martin P
2013-11-06
In their natural environment, animals experience a complex and dynamic visual scenery. Under such natural stimulus conditions, neurons in the visual cortex employ a spatially and temporally sparse code. For the input scenario of natural still images, previous work demonstrated that unsupervised feature learning combined with the constraint of sparse coding can predict physiologically measured receptive fields of simple cells in the primary visual cortex. This convincingly indicated that the mammalian visual system is adapted to the natural spatial input statistics. Here, we extend this approach to the time domain in order to predict dynamic receptive fields that can account for both spatial and temporal sparse activation in biological neurons. We rely on temporal restricted Boltzmann machines and suggest a novel temporal autoencoding training procedure. When tested on a dynamic multi-variate benchmark dataset this method outperformed existing models of this class. Learning features on a large dataset of natural movies allowed us to model spatio-temporal receptive fields for single neurons. They resemble temporally smooth transformations of previously obtained static receptive fields and are thus consistent with existing theories. A neuronal spike response model demonstrates how the dynamic receptive field facilitates temporal and population sparseness. We discuss the potential mechanisms and benefits of a spatially and temporally sparse representation of natural visual input. Copyright © 2013 The Authors. Published by Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Zhu, Hao
Sparsity plays an instrumental role in a plethora of scientific fields, including statistical inference for variable selection, parsimonious signal representations, and solving under-determined systems of linear equations - which has led to the ground-breaking result of compressive sampling (CS). This Thesis leverages exciting ideas of sparse signal reconstruction to develop sparsity-cognizant algorithms, and analyze their performance. The vision is to devise tools exploiting the 'right' form of sparsity for the 'right' application domain of multiuser communication systems, array signal processing systems, and the emerging challenges in the smart power grid. Two important power system monitoring tasks are addressed first by capitalizing on the hidden sparsity. To robustify power system state estimation, a sparse outlier model is leveraged to capture the possible corruption in every datum, while the problem nonconvexity due to nonlinear measurements is handled using the semidefinite relaxation technique. Different from existing iterative methods, the proposed algorithm approximates well the global optimum regardless of the initialization. In addition, for enhanced situational awareness, a novel sparse overcomplete representation is introduced to capture (possibly multiple) line outages, and real-time algorithms are developed for solving the combinatorially complex identification problem. The proposed algorithms exhibit near-optimal performance while incurring only linear complexity in the number of lines, which makes it possible to quickly bring contingencies to attention. This Thesis also accounts for two basic issues in CS, namely fully-perturbed models and the finite alphabet property. The sparse total least-squares (S-TLS) approach is proposed to furnish CS algorithms for fully-perturbed linear models, leading to statistically optimal and computationally efficient solvers. The S-TLS framework is well motivated for grid-based sensing applications and exhibits higher accuracy than existing sparse algorithms. On the other hand, exploiting the finite alphabet of unknown signals emerges naturally in communication systems, along with sparsity coming from the low activity of each user. Compared to approaches accounting for only one of the two, joint exploitation of both leads to statistically optimal detectors with improved error performance.
Effect of missing data on multitask prediction methods.
de la Vega de León, Antonio; Chen, Beining; Gillet, Valerie J
2018-05-22
There has been a growing interest in multitask prediction in chemoinformatics, helped by the increasing use of deep neural networks in this field. This technique is applied to multitarget data sets, where compounds have been tested against different targets, with the aim of developing models to predict a profile of biological activities for a given compound. However, multitarget data sets tend to be sparse; i.e., not all compound-target combinations have experimental values. There has been little research on the effect of missing data on the performance of multitask methods. We have used two complete data sets to simulate sparseness by removing data from the training set. Different schemes for removing the data were compared. These sparse sets were used to train two different multitask methods, deep neural networks and Macau, which is a Bayesian probabilistic matrix factorization technique. Results from both methods were remarkably similar and showed that the decrease in performance due to missing data is small at first, accelerating only once large amounts of data have been removed. This work provides a first approximation to assess how much data is required to produce good performance in multitask prediction exercises.
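As a rough illustration of the sparseness-simulation step described above (a minimal Python sketch, not the authors' code; the activity matrix Y and the removal fractions are synthetic stand-ins):

import numpy as np

rng = np.random.default_rng(0)
Y = rng.normal(size=(500, 12))           # complete compound-by-target activity matrix

def remove_at_random(Y, fraction):
    """Simulate sparseness: mask a given fraction of compound-target values."""
    mask = rng.random(Y.shape) < fraction
    Y_sparse = Y.copy()
    Y_sparse[mask] = np.nan              # NaN marks a missing experimental value
    return Y_sparse

for frac in (0.1, 0.5, 0.9):
    print(frac, np.isnan(remove_at_random(Y, frac)).mean())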
Sparse Additive Ordinary Differential Equations for Dynamic Gene Regulatory Network Modeling.
Wu, Hulin; Lu, Tao; Xue, Hongqi; Liang, Hua
2014-04-02
The gene regulation network (GRN) is a high-dimensional complex system, which can be represented by various mathematical or statistical models. The ordinary differential equation (ODE) model is one of the popular dynamic GRN models. High-dimensional linear ODE models have been proposed to identify GRNs, but with a limitation of the linear regulation effect assumption. In this article, we propose a sparse additive ODE (SA-ODE) model, coupled with ODE estimation methods and adaptive group LASSO techniques, to model dynamic GRNs that could flexibly deal with nonlinear regulation effects. The asymptotic properties of the proposed method are established and simulation studies are performed to validate the proposed approach. An application example for identifying the nonlinear dynamic GRN of T-cell activation is used to illustrate the usefulness of the proposed method.
NASA Astrophysics Data System (ADS)
Orović, Irena; Stanković, Srdjan; Amin, Moeness
2013-05-01
A modified robust two-dimensional compressive sensing algorithm for reconstruction of sparse time-frequency representation (TFR) is proposed. The ambiguity function domain is assumed to be the domain of observations. The two-dimensional Fourier bases are used to linearly relate the observations to the sparse TFR, in lieu of the Wigner distribution. We assume that a set of available samples in the ambiguity domain is heavily corrupted by an impulsive type of noise. Consequently, the problem of sparse TFR reconstruction cannot be tackled using standard compressive sensing optimization algorithms. We introduce a two-dimensional L-statistics based modification into the transform domain representation. It provides suitable initial conditions that will produce efficient convergence of the reconstruction algorithm. This approach applies sorting and weighting operations to discard an expected amount of samples corrupted by noise. The remaining samples serve as observations used in sparse reconstruction of the time-frequency signal representation. The efficiency of the proposed approach is demonstrated on numerical examples that comprise both cases of monocomponent and multicomponent signals.
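The sorting-and-discarding step lends itself to a compact illustration. The following Python sketch is a simplified, real-valued stand-in (not the authors' algorithm): a random matrix replaces the two-dimensional Fourier bases, the largest-magnitude observations are discarded as presumed impulse-noise casualties, and orthogonal matching pursuit performs the sparse reconstruction:

import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

rng = np.random.default_rng(1)
m, n, k = 128, 256, 5
Phi = rng.normal(size=(m, n)) / np.sqrt(m)        # stand-in for the 2D Fourier bases
x = np.zeros(n)
x[rng.choice(n, k, replace=False)] = rng.normal(size=k)
y = Phi @ x
bad = rng.choice(m, 10, replace=False)
y[bad] += 50 * rng.normal(size=10)                # heavy impulsive corruption

# L-statistics step: sort by magnitude, discard the expected share of corrupted samples
keep = np.argsort(np.abs(y))[: int(0.85 * m)]
omp = OrthogonalMatchingPursuit(n_nonzero_coefs=k, fit_intercept=False)
x_hat = omp.fit(Phi[keep], y[keep]).coef_         # sparse reconstruction from kept samples
print(np.flatnonzero(x_hat), np.flatnonzero(x))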
Hierarchical Bayesian sparse image reconstruction with application to MRFM.
Dobigeon, Nicolas; Hero, Alfred O; Tourneret, Jean-Yves
2009-09-01
This paper presents a hierarchical Bayesian model to reconstruct sparse images when the observations are obtained from linear transformations and corrupted by an additive white Gaussian noise. Our hierarchical Bayes model is well suited to such naturally sparse image applications as it seamlessly accounts for properties such as sparsity and positivity of the image via appropriate Bayes priors. We propose a prior that is based on a weighted mixture of a positive exponential distribution and a mass at zero. The prior has hyperparameters that are tuned automatically by marginalization over the hierarchical Bayesian model. To overcome the complexity of the posterior distribution, a Gibbs sampling strategy is proposed. The Gibbs samples can be used to estimate the image to be recovered, e.g., by maximizing the estimated posterior distribution. In our fully Bayesian approach, the posteriors of all the parameters are available. Thus, our algorithm provides more information than other previously proposed sparse reconstruction methods that only give a point estimate. The performance of the proposed hierarchical Bayesian sparse reconstruction method is illustrated on synthetic data and real data collected from a tobacco virus sample using a prototype MRFM instrument.
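The weighted mixture prior is straightforward to sample from directly; a minimal sketch (illustrative parameter values, not those of the paper):

import numpy as np

rng = np.random.default_rng(2)

def sample_prior(shape, w=0.9, a=1.0):
    """Weighted mixture prior: mass at zero with probability w,
    positive exponential (scale a) otherwise."""
    values = rng.exponential(scale=a, size=shape)
    values[rng.random(shape) < w] = 0.0
    return values

image = sample_prior((32, 32))
print((image == 0).mean())               # empirical sparsity, close to w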
NASA Astrophysics Data System (ADS)
Bouchet, L.; Amestoy, P.; Buttari, A.; Rouet, F.-H.; Chauvin, M.
2013-02-01
Nowadays, analyzing and reducing the ever larger astronomical datasets is becoming a crucial challenge, especially for long cumulated observation times. The INTEGRAL/SPI X/γ-ray spectrometer is an instrument for which it is essential to process many exposures at the same time in order to increase the low signal-to-noise ratio of the weakest sources. In this context, the conventional methods for data reduction are inefficient and sometimes not feasible at all. Processing several years of data simultaneously requires computing not only the solution of a large system of equations, but also the associated uncertainties. We aim at reducing the computation time and the memory usage. Since the SPI transfer function is sparse, we have used some popular methods for the solution of large sparse linear systems; we briefly review these methods. We use the Multifrontal Massively Parallel Solver (MUMPS) to compute the solution of the system of equations. We also need to compute the variance of the solution, which amounts to computing selected entries of the inverse of the sparse matrix corresponding to our linear system. This can be achieved through one of the latest features of the MUMPS software that has been partly motivated by this work. In this paper we provide a brief presentation of this feature and evaluate its effectiveness on astrophysical problems requiring the processing of large datasets simultaneously, such as the study of the entire emission of the Galaxy. We used these algorithms to solve the large sparse systems arising from SPI data processing and to obtain both their solutions and the associated variances. In conclusion, thanks to these newly developed tools, processing large datasets arising from SPI is now feasible with both a reasonable execution time and a low memory usage.
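In the absence of MUMPS, the same two computations (the solution plus selected entries of the inverse) can be emulated with SciPy's sparse factorization; the system below is a synthetic stand-in, and the unit-vector solves mimic what MUMPS' selected-inverse feature computes directly:

import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

rng = np.random.default_rng(3)
n = 1000
A = sp.random(n, n, density=1e-3, random_state=3) + n * sp.eye(n)
A = ((A + A.T) * 0.5).tocsc()            # symmetric, well-conditioned stand-in system
b = rng.normal(size=n)

solve = spla.factorized(A)               # factor once, reuse for many right-hand sides
x = solve(b)

# Variances of selected solution entries: diagonal entries of A^{-1}, here obtained
# by unit-vector solves (MUMPS computes such selected inverse entries directly).
for i in (0, 10, 99):
    e = np.zeros(n); e[i] = 1.0
    print(i, solve(e)[i])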
Zhang, Guoqing; Sun, Huaijiang; Xia, Guiyu; Sun, Quansen
2016-07-07
Sparse representation based classification (SRC) has been developed and has shown great potential for real-world applications. Based on SRC, Yang et al. [10] devised an SRC-steered discriminative projection (SRC-DP) method. However, as a linear algorithm, SRC-DP cannot handle data with a highly nonlinear distribution. The kernel sparse representation-based classifier (KSRC) is a non-linear extension of SRC and can remedy this drawback. KSRC requires a predetermined kernel function, and selecting the kernel function and its parameters is difficult. Recently, multiple kernel learning for SRC (MKL-SRC) [22] has been proposed to learn a kernel from a set of base kernels. However, MKL-SRC only considers the within-class reconstruction residual while ignoring the between-class relationship when learning the kernel weights. In this paper, we propose a novel multiple kernel sparse representation-based classifier (MKSRC), and then we use it as a criterion to design a multiple kernel sparse representation based orthogonal discriminative projection method (MK-SR-ODP). The proposed algorithm aims at learning a projection matrix and a corresponding kernel from the given base kernels such that in the low-dimensional subspace the between-class reconstruction residual is maximized and the within-class reconstruction residual is minimized. Furthermore, to achieve a minimum overall loss by performing recognition in the learned low-dimensional subspace, we introduce cost information into the dimensionality reduction method. The solutions for the proposed method can be efficiently found based on the trace ratio optimization method [33]. Extensive experimental results demonstrate the superiority of the proposed algorithm when compared with the state-of-the-art methods.
Solving large tomographic linear systems: size reduction and error estimation
NASA Astrophysics Data System (ADS)
Voronin, Sergey; Mikesell, Dylan; Slezak, Inna; Nolet, Guust
2014-10-01
We present a new approach to reduce a sparse, linear system of equations associated with tomographic inverse problems. We begin by making a modification to the commonly used compressed sparse-row format, whereby our format is tailored to the sparse structure of finite-frequency (volume) sensitivity kernels in seismic tomography. Next, we cluster the sparse matrix rows to divide a large matrix into smaller subsets representing ray paths that are geographically close. Singular value decomposition of each subset allows us to project the data onto a subspace associated with the largest eigenvalues of the subset. After projection we reject those data that have a signal-to-noise ratio (SNR) below a chosen threshold. Clustering in this way assures that the sparse nature of the system is minimally affected by the projection. Moreover, our approach allows for a precise estimation of the noise affecting the data while also giving us the ability to identify outliers. We illustrate the method by reducing large matrices computed for global tomographic systems with cross-correlation body wave delays, as well as with surface wave phase velocity anomalies. For a massive matrix computed for 3.7 million Rayleigh wave phase velocity measurements, imposing a threshold of 1 for the SNR, we condensed the matrix size from 1103 to 63 Gbyte. For a global data set of multiple-frequency P wave delays from 60 well-distributed deep earthquakes we obtain a reduction to 5.9 per cent. This type of reduction allows one to avoid loss of information due to underparametrizing models. Alternatively, if data have to be rejected to fit the system into computer memory, it assures that the most important data are preserved.
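The projection-and-rejection step for one cluster can be sketched as follows (synthetic kernel rows and an assumed noise level sigma; not the authors' code):

import numpy as np

rng = np.random.default_rng(4)
A = rng.normal(size=(200, 50))           # sensitivity-kernel rows of one geographic cluster
d = A @ rng.normal(size=50) + 0.1 * rng.normal(size=200)
sigma = 0.1                              # assumed standard deviation of the data noise

U, s, Vt = np.linalg.svd(A, full_matrices=False)
d_proj = U.T @ d                         # data rotated into the singular basis
keep = np.abs(d_proj) / sigma > 1.0      # reject projected data with SNR below 1

A_red = np.diag(s[keep]) @ Vt[keep]      # condensed subsystem
d_red = d_proj[keep]
print(A.shape, "->", A_red.shape)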
Performance bounds for modal analysis using sparse linear arrays
NASA Astrophysics Data System (ADS)
Li, Yuanxin; Pezeshki, Ali; Scharf, Louis L.; Chi, Yuejie
2017-05-01
We study the performance of modal analysis using sparse linear arrays (SLAs) such as nested and co-prime arrays, in both first-order and second-order measurement models. We treat SLAs as constructed from a subset of sensors in a dense uniform linear array (ULA), and characterize the performance loss of SLAs with respect to the ULA due to using far fewer sensors. In particular, we claim that, given the same aperture, SLAs achieve comparable performance in terms of the Cramér-Rao bound (CRB) for modal analysis only with more snapshots: roughly the number of snapshots used by the ULA times the compression ratio in the number of sensors. This is shown analytically for the case with one undamped mode, as well as empirically via extensive numerical experiments for more complex scenarios. Moreover, the misspecified CRB proposed by Richmond and Horowitz is also studied, where SLAs suffer more performance loss than their ULA counterpart.
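For concreteness, a sketch of how a two-level nested array (one kind of SLA) is drawn from a dense ULA; the parameters are illustrative:

import numpy as np

def nested_array(n1, n2, d=1):
    """Two-level nested array: a dense inner ULA of n1 sensors plus a sparse
    outer ULA of n2 sensors with spacing (n1 + 1) * d."""
    inner = np.arange(1, n1 + 1) * d
    outer = np.arange(1, n2 + 1) * (n1 + 1) * d
    return np.union1d(inner, outer)

pos = nested_array(3, 3)                 # 6 sensors spanning the aperture of a 12-sensor ULA
diffs = np.unique(np.abs(pos[:, None] - pos[None, :]))
print(pos)                               # [ 1  2  3  4  8 12]
print(diffs)                             # filled difference coarray: 0, 1, ..., 12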
Augmented l1 and Nuclear-Norm Models with a Globally Linearly Convergent Algorithm. Revision 1
2012-10-17
[Fragmentary excerpt; only figure residue survived extraction.] Nonzero signal entries were sampled from the standard Gaussian distribution (Figure 2) or the Bernoulli distribution (Figure 3); both tests had the same sensing setup. Figure 3 shows the convergence of the primal and dual variables y(k) of three algorithms on a Bernoulli sparse x0. Comparing the two tests, convergence was faster on the Bernoulli sparse signal.
NoGOA: predicting noisy GO annotations using evidences and sparse representation.
Yu, Guoxian; Lu, Chang; Wang, Jun
2017-07-21
Gene Ontology (GO) is a community effort to represent functional features of gene products. GO annotations (GOA) provide functional associations between GO terms and gene products. Due to resource limitations, only a small portion of annotations are manually checked by curators, and the others are electronically inferred. Although quality control techniques have been applied to ensure the quality of annotations, the community consistently reports that there are still considerable noisy (or incorrect) annotations. Given the wide application of annotations, however, how to identify noisy annotations is an important yet seldom studied open problem. We introduce a novel approach called NoGOA to predict noisy annotations. NoGOA applies sparse representation on the gene-term association matrix to reduce the impact of noisy annotations, and takes advantage of sparse representation coefficients to measure the semantic similarity between genes. It then preliminarily predicts noisy annotations of a gene based on aggregated votes from semantic neighborhood genes of that gene. Next, NoGOA estimates the ratio of noisy annotations for each evidence code based on direct annotations in GOA files archived at different periods, then weights entries of the association matrix via the estimated ratios and propagates weights to ancestors of direct annotations using the GO hierarchy. Finally, it integrates the evidence-weighted association matrix and aggregated votes to predict noisy annotations. Experiments on archived GOA files of six model species (H. sapiens, A. thaliana, S. cerevisiae, G. gallus, B. taurus and M. musculus) demonstrate that NoGOA achieves significantly better results than other related methods and that removing noisy annotations improves the performance of gene function prediction. The comparative study justifies the effectiveness of integrating evidence codes with sparse representation for predicting noisy GO annotations. Codes and datasets are available at http://mlda.swu.edu.cn/codes.php?name=NoGOA.
Predict Brain MR Image Registration via Sparse Learning of Appearance and Transformation
Wang, Qian; Kim, Minjeong; Shi, Yonghong; Wu, Guorong; Shen, Dinggang
2014-01-01
We propose a new approach to register the subject image with the template by leveraging a set of intermediate images that are pre-aligned to the template. We argue that, if points in the subject and the intermediate images share similar local appearances, they may have common correspondence in the template. In this way, we learn the sparse representation of a certain subject point to reveal several similar candidate points in the intermediate images. Each selected intermediate candidate can bridge the correspondence from the subject point to the template space, thus predicting the transformation associated with the subject point at a confidence level that relates to the learned sparse coefficient. Following this strategy, we first predict transformations at selected key points, and retain multiple predictions on each key point, instead of allowing only a single correspondence. Then, by utilizing all key points and their predictions with varying confidences, we adaptively reconstruct the dense transformation field that warps the subject to the template. We further embed the prediction-reconstruction protocol above into a multi-resolution hierarchy. Finally, we refine the estimated transformation field using an existing registration method. We apply our method to registering brain MR images, and conclude that the proposed framework substantially improves registration performance. PMID:25476412
Distant failure prediction for early stage NSCLC by analyzing PET with sparse representation
NASA Astrophysics Data System (ADS)
Hao, Hongxia; Zhou, Zhiguo; Wang, Jing
2017-03-01
Positron emission tomography (PET) imaging has been widely explored for treatment outcome prediction. Radiomics-driven methods provide a new insight to quantitatively explore underlying information from PET images. However, it is still a challenging problem to automatically extract clinically meaningful features for prognosis. In this work, we develop a PET-guided distant failure predictive model for early stage non-small cell lung cancer (NSCLC) patients after stereotactic ablative radiotherapy (SABR) by using sparse representation. The proposed method does not need precalculated features and can learn intrinsically distinctive features contributing to classification of patients with distant failure. The proposed framework includes two main parts: 1) intra-tumor heterogeneity description; and 2) dictionary pair learning based sparse representation. Tumor heterogeneity is initially captured through an anisotropic kernel and represented as a set of concatenated vectors, which forms the sample gallery. Then, given a test tumor image, its identity (i.e., distant failure or not) is classified by applying the dictionary pair learning based sparse representation. We evaluate the proposed approach on 48 NSCLC patients treated by SABR at our institute. Experimental results show that the proposed approach can achieve an area under the receiver operating characteristic curve (AUC) of 0.70 with a sensitivity of 69.87% and a specificity of 69.51% using a five-fold cross validation.
An efficient implementation of a high-order filter for a cubed-sphere spectral element model
NASA Astrophysics Data System (ADS)
Kang, Hyun-Gyu; Cheong, Hyeong-Bin
2017-03-01
A parallel-scalable, isotropic, scale-selective spatial filter was developed for the cubed-sphere spectral element model on the sphere. The filter equation is a high-order elliptic (Helmholtz) equation based on the spherical Laplacian operator, which is transformed into cubed-sphere local coordinates. The Laplacian operator is discretized on the computational domain, i.e., on each cell, by the spectral element method with Gauss-Lobatto Lagrange interpolating polynomials (GLLIPs) as the orthogonal basis functions. On the global domain, the discrete filter equation yielded a linear system represented by a highly sparse matrix. The density of this matrix increases quadratically (linearly) with the order of GLLIP (order of the filter), and the linear system is solved in only O(Ng) operations, where Ng is the total number of grid points. The solution, obtained by a row reduction method, demonstrated the typical accuracy and convergence rate of the cubed-sphere spectral element method. To achieve computational efficiency on parallel computers, the linear system was treated by an inverse matrix method (a sparse matrix-vector multiplication). The density of the inverse matrix was lowered to only a few times that of the original sparse matrix without degrading the accuracy of the solution. For better computational efficiency, a local-domain high-order filter was introduced: the filter equation is applied to multiple cells, and then only the central cell is used to reconstruct the filtered field. The parallel efficiency of applying the inverse matrix method to the global- and local-domain filter was evaluated by the scalability on a distributed-memory parallel computer. The scale-selective performance of the filter was demonstrated on Earth topography. The usefulness of the filter as a hyper-viscosity for the vorticity equation was also demonstrated.
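A one-dimensional periodic analogue conveys the structure of such a filter (a sketch, not the cubed-sphere implementation; the filter order p and the cutoff wavenumber are illustrative):

import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 256
dx = 2 * np.pi / n
L = sp.diags([1.0, -2.0, 1.0], [-1, 0, 1], shape=(n, n), format="lil")
L[0, -1] = L[-1, 0] = 1.0                # periodic 1D Laplacian, stand-in for the spherical one
L = (L / dx**2).tocsc()

p = 4                                    # filter order (half the order of the elliptic operator)
alpha = 32.0 ** (-2 * p)                 # places the cutoff near wavenumber 32
A = sp.eye(n, format="csc") + alpha * (-L) ** p   # high-order Helmholtz-type operator

rng = np.random.default_rng(5)
f = np.sin(np.arange(n) * dx) + 0.3 * rng.normal(size=n)
u = spla.spsolve(A, f)                   # filtered field: small scales damped, large scales kept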
Dimension Reduction With Extreme Learning Machine.
Kasun, Liyanaarachchi Lekamalage Chamara; Yang, Yan; Huang, Guang-Bin; Zhang, Zhengyou
2016-08-01
Data may often contain noise or irrelevant information, which negatively affect the generalization capability of machine learning algorithms. The objective of dimension reduction algorithms, such as principal component analysis (PCA), non-negative matrix factorization (NMF), random projection (RP), and auto-encoder (AE), is to reduce the noise or irrelevant information of the data. The features of PCA (eigenvectors) and linear AE are not able to represent data as parts (e.g. nose in a face image). On the other hand, NMF and non-linear AE are hampered by slow learning speed, and RP only represents a subspace of the original data. This paper introduces a dimension reduction framework which to some extent represents data as parts, has fast learning speed, and learns the between-class scatter subspace. To this end, this paper investigates a linear and non-linear dimension reduction framework referred to as extreme learning machine AE (ELM-AE) and sparse ELM-AE (SELM-AE). In contrast to tied weight AE, the hidden neurons in ELM-AE and SELM-AE need not be tuned, and their parameters (e.g., input weights in additive neurons) are initialized using orthogonal and sparse random weights, respectively. Experimental results on the USPS handwritten digit, CIFAR-10 object, and NORB object recognition data sets show the efficacy of linear and non-linear ELM-AE and SELM-AE in terms of discriminative capability, sparsity, training time, and normalized mean square error.
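The essence of ELM-AE is that the input weights are random (orthogonal below; sparse random weights would give SELM-AE) and only the output weights are solved for, by ridge regression. A minimal sketch with synthetic data and illustrative hyperparameters:

import numpy as np

rng = np.random.default_rng(6)
X = rng.normal(size=(1000, 64))          # data matrix (e.g. flattened patches)
h, C = 32, 1e3                           # hidden width and ridge parameter

W, _ = np.linalg.qr(rng.normal(size=(64, h)))   # random orthogonal input weights (ELM-AE)
b = rng.normal(size=h)
H = np.tanh(X @ W + b)                   # random hidden layer, never tuned

beta = np.linalg.solve(H.T @ H + np.eye(h) / C, H.T @ X)   # ridge solution of H @ beta ~ X
X_reduced = X @ beta.T                   # 32-dimensional embedding used downstream
print(X_reduced.shape)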
New methods for sampling sparse populations
Anna Ringvall
2007-01-01
To improve surveys of sparse objects, methods that use auxiliary information have been suggested. Guided transect sampling uses prior information, e.g., from aerial photographs, for the layout of survey strips. Instead of being laid out straight, the strips will wind between potentially more interesting areas. 3P sampling (probability proportional to prediction) uses...
Sparse distributed memory: understanding the speed and robustness of expert memory
Brogliato, Marcelo S.; Chada, Daniel M.; Linhares, Alexandre
2014-01-01
How can experts, sometimes in exacting detail, almost immediately and very precisely recall memory items from a vast repertoire? We are interested in models from theoretical neuroscience that could explain the speed and robustness of an expert's recollection. The approach is based on Sparse Distributed Memory, which has been shown to be plausible, both in a neuroscientific and in a psychological manner, in a number of ways. A crucial characteristic concerns the limits of human recollection, the "tip-of-the-tongue" memory event, which is found at a non-linearity in the model. We expand the theoretical framework, deriving an optimization formula to solve this non-linearity. Numerical results demonstrate how a higher frequency of rehearsal, through work or study, immediately increases the robustness and speed associated with expert memory. PMID:24808842
Image super-resolution via sparse representation.
Yang, Jianchao; Wright, John; Huang, Thomas S; Ma, Yi
2010-11-01
This paper presents a new approach to single-image super-resolution, based on sparse signal representation. Research on image statistics suggests that image patches can be well-represented as a sparse linear combination of elements from an appropriately chosen over-complete dictionary. Inspired by this observation, we seek a sparse representation for each patch of the low-resolution input, and then use the coefficients of this representation to generate the high-resolution output. Theoretical results from compressed sensing suggest that under mild conditions, the sparse representation can be correctly recovered from the downsampled signals. By jointly training two dictionaries for the low- and high-resolution image patches, we can enforce the similarity of sparse representations between the low resolution and high resolution image patch pair with respect to their own dictionaries. Therefore, the sparse representation of a low resolution image patch can be applied with the high resolution image patch dictionary to generate a high resolution image patch. The learned dictionary pair is a more compact representation of the patch pairs, compared to previous approaches, which simply sample a large amount of image patch pairs, reducing the computational cost substantially. The effectiveness of such a sparsity prior is demonstrated for both general image super-resolution and the special case of face hallucination. In both cases, our algorithm generates high-resolution images that are competitive or even superior in quality to images produced by other similar SR methods. In addition, the local sparse modeling of our approach is naturally robust to noise, and therefore the proposed algorithm can handle super-resolution with noisy inputs in a more unified framework.
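The patch-level pipeline reduces to two steps: sparse-code the low-resolution patch against the LR dictionary, then synthesize with the HR dictionary. In this sketch the coupled dictionaries are random stand-ins for the jointly trained pair, and the regularization weight is illustrative:

import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(7)
n_atoms = 512
D_l = rng.normal(size=(64, n_atoms))     # low-res patch dictionary (8x8 patches)
D_h = rng.normal(size=(256, n_atoms))    # coupled high-res dictionary (16x16 patches)
D_l /= np.linalg.norm(D_l, axis=0)

a_true = np.zeros(n_atoms)
a_true[[3, 97, 300]] = [1.0, -0.5, 2.0]  # a patch that truly is a sparse combination
y_l = D_l @ a_true                       # observed low-resolution patch

code = Lasso(alpha=0.01, fit_intercept=False, max_iter=5000).fit(D_l, y_l).coef_
x_h = D_h @ code                         # high-res patch from the shared sparse code
print(np.flatnonzero(np.abs(code) > 1e-3))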
Sparse RNA folding revisited: space-efficient minimum free energy structure prediction.
Will, Sebastian; Jabbari, Hosna
2016-01-01
RNA secondary structure prediction by energy minimization is the central computational tool for the analysis of structural non-coding RNAs and their interactions. Sparsification has been successfully applied to improve the time efficiency of various structure prediction algorithms while guaranteeing the same result; however, for many such folding problems, space efficiency is of even greater concern, particularly for long RNA sequences. So far, space-efficient sparsified RNA folding with fold reconstruction was solved only for simple base-pair-based pseudo-energy models. Here, we revisit the problem of space-efficient free energy minimization. Whereas the space-efficient minimization of the free energy has been sketched before, the reconstruction of the optimum structure has not even been discussed. We show that this reconstruction is not possible in a trivial extension of the method for simple energy models. Then, we present the time- and space-efficient sparsified free energy minimization algorithm SparseMFEFold that guarantees MFE structure prediction. In particular, this novel algorithm provides efficient fold reconstruction based on dynamically garbage-collected trace arrows. The complexity of our algorithm depends on two parameters, the number of candidates Z and the number of trace arrows T; both are bounded by [Formula: see text], but are typically much smaller. The time complexity of RNA folding is reduced from [Formula: see text] to [Formula: see text]; the space complexity, from [Formula: see text] to [Formula: see text]. Our empirical results show more than 80% space savings over RNAfold [Vienna RNA package] on the long RNAs from the RNA STRAND database (≥2500 bases). The presented technique is intentionally generalizable to complex prediction algorithms; due to their high space demands, algorithms like pseudoknot prediction and RNA-RNA-interaction prediction are expected to profit even more strongly than "standard" MFE folding. SparseMFEFold is free software, available at http://www.bioinf.uni-leipzig.de/~will/Software/SparseMFEFold.
Characteristics of voxel prediction power in full-brain Granger causality analysis of fMRI data
NASA Astrophysics Data System (ADS)
Garg, Rahul; Cecchi, Guillermo A.; Rao, A. Ravishankar
2011-03-01
Functional neuroimaging research is moving from the study of "activations" to the study of "interactions" among brain regions. Granger causality analysis provides a powerful technique to model spatio-temporal interactions among brain regions. We apply this technique to full-brain fMRI data without aggregating any voxel data into regions of interest (ROIs). We circumvent the problem of dimensionality using sparse regression from machine learning. On a simple finger-tapping experiment we found that (1) a small number of voxels in the brain have very high prediction power, explaining the future time course of other voxels in the brain; (2) these voxels occur in small sized clusters (of size 1-4 voxels) distributed throughout the brain; (3) albeit small, these clusters overlap with most of the clusters identified with the non-temporal General Linear Model (GLM); and (4) the method identifies clusters which, while not determined by the task and not detectable by GLM, still influence brain activity.
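A lagged sparse regression of this kind can be sketched in a few lines; the data below are synthetic, with two planted predictor voxels standing in for the high-prediction-power clusters, and the lasso weight is illustrative:

import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(8)
T, V, target = 200, 500, 0
X = rng.normal(size=(T, V))              # voxel time courses (synthetic)
X[1:, target] = 0.8 * X[:-1, 5] - 0.6 * X[:-1, 77] + 0.1 * rng.normal(size=T - 1)

model = Lasso(alpha=0.1).fit(X[:-1], X[1:, target])   # lag-1 sparse regression
print(np.flatnonzero(model.coef_))       # voxels with prediction power (the planted 5 and 77)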
Systematic sparse matrix error control for linear scaling electronic structure calculations.
Rubensson, Emanuel H; Sałek, Paweł
2005-11-30
Efficient truncation criteria used in multiatom blocked sparse matrix operations for ab initio calculations are proposed. As system size increases, so does the need to stay on top of errors and still achieve high performance. A variant of a blocked sparse matrix algebra to achieve strict error control with good performance is proposed. The presented idea is that the condition to drop a certain submatrix should depend not only on the magnitude of that particular submatrix, but also on which other submatrices are dropped. The decision to remove a certain submatrix is based on the contribution the removal would cause to the error in the chosen norm. We study the effect of an accumulated truncation error in iterative algorithms like trace-correcting density matrix purification. One way to reduce the initial exponential growth of this error is presented. The presented error control for a sparse blocked matrix toolbox allows for achieving optimal performance by performing only the operations needed to maintain the requested level of accuracy. Copyright 2005 Wiley Periodicals, Inc.
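The key idea, that a submatrix is dropped only if the accumulated error of all dropped submatrices stays within the requested tolerance, can be sketched as follows (Frobenius norm as the chosen norm; the blocks are synthetic and the scheme is a simplification, not the paper's toolbox):

import numpy as np

def truncate_blocks(blocks, tau):
    """Drop submatrix blocks so the accumulated Frobenius truncation error stays
    below tau; smallest contributions are dropped first."""
    order = sorted((np.linalg.norm(B), key) for key, B in blocks.items())
    err2 = 0.0
    for nrm, key in order:
        if err2 + nrm**2 > tau**2:       # this block no longer fits in the error budget
            break
        err2 += nrm**2
        del blocks[key]
    return np.sqrt(err2)

rng = np.random.default_rng(9)
blocks = {(i, j): 10.0 ** (-abs(i - j)) * rng.normal(size=(4, 4))
          for i in range(6) for j in range(6)}
err = truncate_blocks(blocks, tau=1e-2)
print(len(blocks), err)                  # remaining blocks and the guaranteed error bound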
Insights from Classifying Visual Concepts with Multiple Kernel Learning
Binder, Alexander; Nakajima, Shinichi; Kloft, Marius; Müller, Christina; Samek, Wojciech; Brefeld, Ulf; Müller, Klaus-Robert; Kawanabe, Motoaki
2012-01-01
Combining information from various image features has become a standard technique in concept recognition tasks. However, the optimal way of fusing the resulting kernel functions is usually unknown in practical applications. Multiple kernel learning (MKL) techniques allow one to determine an optimal linear combination of such similarity matrices. Classical approaches to MKL promote sparse mixtures. Unfortunately, 1-norm regularized MKL variants are often observed to be outperformed by an unweighted sum kernel. The main contributions of this paper are the following: we apply a recently developed non-sparse MKL variant to state-of-the-art concept recognition tasks from the application domain of computer vision. We provide insights on benefits and limits of non-sparse MKL and compare it against its direct competitors, the sum-kernel SVM and sparse MKL. We report empirical results for the PASCAL VOC 2009 Classification and ImageCLEF2010 Photo Annotation challenge data sets. Data sets (kernel matrices) as well as further information are available at http://doc.ml.tu-berlin.de/image_mkl/ (Accessed 2012 Jun 25). PMID:22936970
Computing row and column counts for sparse QR and LU factorization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gilbert, John R.; Li, Xiaoye S.; Ng, Esmond G.
2001-01-01
We present algorithms to determine the number of nonzeros in each row and column of the factors of a sparse matrix, for both the QR factorization and the LU factorization with partial pivoting. The algorithms use only the nonzero structure of the input matrix, and run in time nearly linear in the number of nonzeros in that matrix. They may be used to set up data structures or schedule parallel operations in advance of the numerical factorization. The row and column counts we compute are upper bounds on the actual counts. If the input matrix is strong Hall and there is no coincidental numerical cancellation, the counts are exact for QR factorization and are the tightest bounds possible for LU factorization. These algorithms are based on our earlier work on computing row and column counts for sparse Cholesky factorization, plus an efficient method to compute the column elimination tree of a sparse matrix without explicitly forming the product of the matrix and its transpose.
Yang, Su; Shi, Shixiong; Hu, Xiaobing; Wang, Minjie
2015-01-01
Spatial-temporal correlations among the data play an important role in traffic flow prediction. Correspondingly, traffic modeling and prediction based on big data analytics emerges due to the city-scale interactions among traffic flows. A new methodology based on sparse representation is proposed to reveal the spatial-temporal dependencies among traffic flows so as to simplify the correlations among traffic data for the prediction task at a given sensor. Three important findings are observed in the experiments: (1) Only traffic flows immediately prior to the present time affect the formation of current traffic flows, which implies that the traditional high-order predictors can be reduced to a 1st-order model. (2) The spatial context relevant to a given prediction task is more complex than what is assumed to exist locally and can spread out to the whole city. (3) The spatial context varies with the target sensor undergoing prediction and enlarges as the prediction time lag increases. Because the scope of human mobility is subject to travel time, identifying the varying spatial context against time lag is crucial for prediction. Since sparse representation can capture the varying spatial context to adapt to the prediction task, it outperforms traditional methods, whose inputs are confined to data from a fixed number of nearby sensors. As the spatial-temporal context for any prediction task is fully detected from the traffic data in an automated manner, with no additional information about network topology needed, the method scales well to large networks. PMID:26496370
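The 1st-order sparse predictor at a given sensor reduces to a lasso regression on the previous time step of all sensors city-wide, with the nonzero coefficients delineating the spatial context. A synthetic sketch (two planted upstream sensors; on real data the recovered context would enlarge with the lag, per finding 3):

import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(10)
T, S, target = 2000, 300, 42
F = rng.normal(size=(T, S))              # city-wide traffic flow readings (synthetic)
F[1:, target] = 0.5 * F[:-1, 10] + 0.3 * F[:-1, 250] + 0.2 * rng.normal(size=T - 1)

# Finding (1): a 1st-order model suffices, so regress on the previous time step only.
model = Lasso(alpha=0.05).fit(F[:-1], F[1:, target])
context = np.flatnonzero(model.coef_)    # the spatial context, discovered automatically
print(context)                           # planted sensors 10 and 250, however distant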
Pisharady, Pramod Kumar; Sotiropoulos, Stamatios N; Sapiro, Guillermo; Lenglet, Christophe
2017-09-01
We propose a sparse Bayesian learning algorithm for improved estimation of white matter fiber parameters from compressed (under-sampled q-space) multi-shell diffusion MRI data. The multi-shell data is represented in a dictionary form using a non-monoexponential decay model of diffusion, based on continuous gamma distribution of diffusivities. The fiber volume fractions with predefined orientations, which are the unknown parameters, form the dictionary weights. These unknown parameters are estimated with a linear un-mixing framework, using a sparse Bayesian learning algorithm. A localized learning of hyperparameters at each voxel and for each possible fiber orientations improves the parameter estimation. Our experiments using synthetic data from the ISBI 2012 HARDI reconstruction challenge and in-vivo data from the Human Connectome Project demonstrate the improvements.
Interaction Models for Functional Regression.
Usset, Joseph; Staicu, Ana-Maria; Maity, Arnab
2016-02-01
A functional regression model with a scalar response and multiple functional predictors is proposed that accommodates two-way interactions in addition to their main effects. The proposed estimation procedure models the main effects using penalized regression splines, and the interaction effect by a tensor product basis. Extensions to generalized linear models and data observed on sparse grids or with measurement error are presented. A hypothesis testing procedure for the functional interaction effect is described. The proposed method can be easily implemented through existing software. Numerical studies show that fitting an additive model in the presence of interaction leads to both poor estimation performance and lost prediction power, while fitting an interaction model where there is in fact no interaction leads to negligible losses. The methodology is illustrated on the AneuRisk65 study data.
NASA Astrophysics Data System (ADS)
Lin, Chuang; Wang, Binghui; Jiang, Ning; Farina, Dario
2018-04-01
Objective. This paper proposes a novel simultaneous and proportional multiple degree of freedom (DOF) myoelectric control method for active prostheses. Approach. The approach is based on non-negative matrix factorization (NMF) of surface EMG signals with the inclusion of sparseness constraints. By applying a sparseness constraint to the control signal matrix, it is possible to extract the basis information from arbitrary movements (quasi-unsupervised approach) for multiple DOFs concurrently. Main Results. In online testing based on target hitting, able-bodied subjects reached a greater throughput (TP) when using sparse NMF (SNMF) than with classic NMF or with linear regression (LR). Accordingly, the completion time (CT) was shorter for SNMF than NMF or LR. The same observations were made in two patients with unilateral limb deficiencies. Significance. The addition of sparseness constraints to NMF allows for a quasi-unsupervised approach to myoelectric control with superior results with respect to previous methods for the simultaneous and proportional control of multi-DOF. The proposed factorization algorithm allows robust simultaneous and proportional control, is superior to previous supervised algorithms, and, because of minimal supervision, paves the way to online adaptation in myoelectric control.
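A rough stand-in for the factorization step using scikit-learn's NMF, with an l1 penalty pushing the time-varying control signals toward sparseness (the paper's exact sparseness constraint and solver may differ; the EMG data and hyperparameters are illustrative):

import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(11)
E = np.abs(rng.normal(size=(5000, 8)))   # rectified 8-channel surface EMG (time x channels)

model = NMF(n_components=2, init="nndsvda", alpha_W=0.5, alpha_H=0.0,
            l1_ratio=1.0, max_iter=500)  # l1 penalty on W: sparse control signals
C = model.fit_transform(E)               # (time x DOF) control signals
S = model.components_                    # (DOF x channel) synergy matrix
print((C == 0).mean())                   # fraction of exactly-zero control samples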
One-shot 3D scanning by combining sparse landmarks with dense gradient information
NASA Astrophysics Data System (ADS)
Di Martino, Matías; Flores, Jorge; Ferrari, José A.
2018-06-01
Scene understanding is one of the most challenging and popular problems in the field of robotics and computer vision, and the estimation of 3D information is at the core of most of these applications. In order to retrieve the 3D structure of a test surface we propose a single-shot approach that combines dense gradient information with sparse absolute measurements. To that end, we designed a colored pattern that codes fine horizontal and vertical fringes, with sparse corner landmarks. By measuring the deformation (bending) of horizontal and vertical fringes, we are able to estimate surface local variations (i.e. its gradient field). Then, sparse corner landmarks are detected and matched to infer sparse absolute information about the test surface height. Local gradient information is combined with the sparse absolute values, which work as anchors to guide the integration process. We show that this can be mathematically done in a very compact and intuitive way by properly defining a Poisson-like partial differential equation. Then we address in detail how the problem can be formulated in a discrete domain and how it can be practically solved by straightforward linear numerical solvers. Finally, validation experiments are presented.
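In discrete form, the Poisson-like integration amounts to a sparse least-squares system: one equation per measured gradient plus heavily weighted equations at the sparse anchors. A sketch with synthetic gradients and two hypothetical landmarks (not the authors' implementation):

import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import lsqr

H, W = 40, 40
rng = np.random.default_rng(12)
gx = rng.normal(size=(H, W - 1))         # horizontal gradients from fringe bending
gy = rng.normal(size=(H - 1, W))         # vertical gradients
anchors = {(0, 0): 0.0, (20, 20): 1.5}   # hypothetical absolute heights at landmarks

idx = lambda i, j: i * W + j
rows, cols, vals, rhs = [], [], [], []
r = 0
for i in range(H):                       # z[i, j+1] - z[i, j] = gx[i, j]
    for j in range(W - 1):
        rows += [r, r]; cols += [idx(i, j + 1), idx(i, j)]; vals += [1.0, -1.0]
        rhs.append(gx[i, j]); r += 1
for i in range(H - 1):                   # z[i+1, j] - z[i, j] = gy[i, j]
    for j in range(W):
        rows += [r, r]; cols += [idx(i + 1, j), idx(i, j)]; vals += [1.0, -1.0]
        rhs.append(gy[i, j]); r += 1
for (i, j), h in anchors.items():        # heavily weighted anchor equations
    rows.append(r); cols.append(idx(i, j)); vals.append(100.0)
    rhs.append(100.0 * h); r += 1

A = sp.csr_matrix((vals, (rows, cols)), shape=(r, H * W))
z = lsqr(A, np.array(rhs))[0].reshape(H, W)   # least-squares surface height map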
NASA Astrophysics Data System (ADS)
Zhang, G.; Lu, D.; Ye, M.; Gunzburger, M.
2011-12-01
Markov Chain Monte Carlo (MCMC) methods have been widely used in many fields of uncertainty analysis to estimate the posterior distributions of parameters and credible intervals of predictions in the Bayesian framework. However, in practice, MCMC may be computationally unaffordable due to slow convergence and the excessive number of forward model executions required, especially when the forward model is expensive to compute. Both disadvantages arise from the curse of dimensionality, i.e., the posterior distribution is usually a multivariate function of parameters. Recently, the sparse grid method has been demonstrated to be an effective technique for coping with high-dimensional interpolation or integration problems. Thus, in order to accelerate the forward model and avoid the slow convergence of MCMC, we propose a new method for uncertainty analysis based on sparse grid interpolation and quasi-Monte Carlo sampling. First, we construct a polynomial approximation of the forward model in the parameter space by using sparse grid interpolation. This approximation then defines an accurate surrogate posterior distribution that can be evaluated repeatedly at minimal computational cost. Second, instead of using MCMC, a quasi-Monte Carlo method is applied to draw samples in the parameter space. Then, the desired probability density function of each prediction is approximated by accumulating the posterior density values of all the samples according to the prediction values. Our method has the following advantages: (1) the polynomial approximation of the forward model on the sparse grid provides a very efficient evaluation of the surrogate posterior distribution; (2) the quasi-Monte Carlo method retains the same accuracy in approximating the PDF of predictions but avoids all disadvantages of MCMC. The proposed method is applied to a controlled numerical experiment of groundwater flow modeling. The results show that our method attains the same accuracy much more efficiently than traditional MCMC.
A novel principal component analysis for spatially misaligned multivariate air pollution data.
Jandarov, Roman A; Sheppard, Lianne A; Sampson, Paul D; Szpiro, Adam A
2017-01-01
We propose novel methods for predictive (sparse) PCA with spatially misaligned data. These methods identify principal component loading vectors that explain as much variability in the observed data as possible, while also ensuring the corresponding principal component scores can be predicted accurately by means of spatial statistics at locations where air pollution measurements are not available. This will make it possible to identify important mixtures of air pollutants and to quantify their health effects in cohort studies, where currently available methods cannot be used. We demonstrate the utility of predictive (sparse) PCA in simulated data and apply the approach to annual averages of particulate matter speciation data from national Environmental Protection Agency (EPA) regulatory monitors.
A Systematic Approach for Obtaining Performance on Matrix-Like Operations
NASA Astrophysics Data System (ADS)
Veras, Richard Michael
Scientific Computation plays a critical role in the scientific process because it allows us to ask complex queries and test predictions that would otherwise be infeasible to perform experimentally. Because of its power, Scientific Computing has helped drive advances in many fields ranging from Engineering and Physics to Biology and Sociology to Economics and Drug Development and even to Machine Learning and Artificial Intelligence. Common among these domains is the desire for timely computational results, thus a considerable amount of human expert effort is spent towards obtaining performance for these scientific codes. However, this is no easy task because each of these domains presents its own unique set of challenges to software developers, such as domain-specific operations, structurally complex data and ever-growing datasets. Compounding these problems are the myriad constantly changing, complex and unique hardware platforms that an expert must target. Unfortunately, an expert is typically forced to reproduce their effort across multiple problem domains and hardware platforms. In this thesis, we demonstrate the automatic generation of expert-level high-performance scientific codes for Dense Linear Algebra (DLA), Structured Mesh (Stencil), Sparse Linear Algebra and Graph Analytics. In particular, this thesis seeks to address the issue of obtaining performance on many complex platforms for a certain class of matrix-like operations that span across many scientific, engineering and social fields. We do this by automating a method used for obtaining high performance in DLA and extending it to structured, sparse and scale-free domains. We argue that it is through the use of the underlying structure found in the data from these domains that enables this process. Thus, obtaining performance for most operations does not occur in isolation of the data being operated on, but instead depends significantly on the structure of the data.
A survey of packages for large linear systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wu, Kesheng; Milne, Brent
2000-02-11
This paper evaluates portable software packages for the iterative solution of very large sparse linear systems on parallel architectures. While we cannot hope to tell individual users which package will best suit their needs, we do hope that our systematic evaluation provides essential unbiased information about the packages, and the evaluation process may serve as an example of how to evaluate these packages. The information contained here includes feature comparisons, usability evaluations and performance characterizations. This review is primarily focused on self-contained packages that can be easily integrated into an existing program and are capable of computing solutions to very large sparse linear systems of equations. More specifically, it concentrates on portable parallel linear system solution packages that provide iterative solution schemes and related preconditioning schemes, because iterative methods are more frequently used than competing schemes such as direct methods. The eight packages evaluated are: Aztec, BlockSolve, ISIS++, LINSOL, P-SPARSLIB, PARASOL, PETSc, and PINEAPL. Among the eight portable parallel iterative linear system solvers reviewed, we recommend PETSc and Aztec for most application programmers because they have well-designed user interfaces, extensive documentation and very responsive user support. Both PETSc and Aztec are written in the C language and are callable from Fortran. For those users interested in using Fortran 90, PARASOL is a good alternative. ISIS++ is a good alternative for those who prefer the C++ language. Both PARASOL and ISIS++ are relatively new and are continuously evolving. Thus their user interface may change. In general, those packages written in Fortran 77 are more cumbersome to use because the user may need to directly deal with a number of arrays of varying sizes. Languages like C++ and Fortran 90 offer more convenient data encapsulation mechanisms which make it easier to implement a clean and intuitive user interface. In addition to reviewing these portable parallel iterative solver packages, we also provide a more cursory assessment of a range of related packages, from specialized parallel preconditioners to direct methods for sparse linear systems.
Mandibular bone changes in 24 years and skeletal fracture prediction.
Jonasson, G; Sundh, V; Hakeberg, M; Hassani-Nejad, A; Lissner, L; Ahlqwist, M
2013-03-01
The objectives of the investigation were to describe changes in mandibular bone structure with aging and to compare the usefulness of cortical and trabecular bone for fracture prediction. From 1968 to 1993, 1,003 women were examined. With the help of panoramic radiographs, cortex thickness was measured and cortex was categorized as: normal, moderately, or severely eroded. The trabeculation was assessed as sparse, mixed, or dense. Visually, the mandibular compact and trabecular bone transformed gradually during the 24 years. The compact bone became more porous, the intertrabecular spaces increased, and the radiographic image of the trabeculae seemed less mineralized. Cortex thickness increased up to the age of 50 and decreased significantly thereafter. At all examinations, the sparse trabeculation group had more fractures (71-78 %) than the non-sparse group (27-31 %), whereas the severely eroded compact group showed more fractures than the less eroded groups only in 1992/1993, 24 years later. Sparse trabecular pattern was associated with future fractures both in perimenopausal and older women (relative risk (RR), 1.47-4.37) and cortical erosion in older women (RR, 1.35-1.55). RR for future fracture associated with a severely eroded cortex increased to 4.98 for cohort 1930 in 1992/1993. RR for future fracture associated with sparse trabeculation increased to 11.43 for cohort 1922 in 1992/1993. Dental radiographs contain enough information to identify women most at risk of future fracture. When observing sparse mandibular trabeculation, dentists can identify 40-69 % of women at risk for future fractures, depending on participant age at examination.
Zhang, Wanhong; Zhou, Tong
2015-01-01
Motivation: Identifying gene regulatory networks (GRNs), which consist of a large number of interacting units, has become a problem of paramount importance in systems biology. In many situations, causal interaction relationships among these units must be reconstructed from measured expression data and other a priori information. Though numerous classical methods have been developed to unravel the interactions of GRNs, these methods either have high computational complexity or low estimation accuracy. Note that great similarities exist between identifying the genes that directly regulate a specific gene and reconstructing a sparse vector, which often amounts to determining the number, location and magnitude of the nonzero entries of an unknown vector by solving an underdetermined system of linear equations y = Φx. Based on these similarities, we propose a novel sparse reconstruction framework to identify the structure of a GRN, so as to increase the accuracy of causal regulation estimates as well as to reduce their computational complexity. Results: In this paper, a sparse reconstruction framework is proposed on the basis of steady-state experiment data to identify GRN structure. Unlike traditional methods, this approach is well suited to large-scale underdetermined problems of inferring a sparse vector. We investigate how to combine noisy steady-state experiment data and a sparse reconstruction algorithm to identify causal relationships. The efficiency of this method is tested on an artificial linear network, a mitogen-activated protein kinase (MAPK) pathway network and the in silico networks of the DREAM challenges. The performance of the suggested approach is compared with two state-of-the-art algorithms, the widely adopted total least-squares (TLS) method and the available results of the DREAM project. The results show that, at a lower computational cost, the proposed method can significantly enhance estimation accuracy and greatly reduce false positive and negative errors. Furthermore, numerical calculations demonstrate that the proposed algorithm may have faster convergence speed and smaller fluctuation than other methods when either estimate error or estimate bias is considered. PMID:26207991
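Recovering one column of the network (the direct regulators of a single target gene) is then a textbook sparse-recovery problem; a sketch with synthetic steady-state data, using orthogonal matching pursuit as a stand-in for the paper's algorithm:

import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

rng = np.random.default_rng(13)
m, n, k = 30, 100, 4                     # experiments, candidate regulators, true in-degree
Phi = rng.normal(size=(m, n))            # steady-state expression data (synthetic)
x = np.zeros(n)
x[rng.choice(n, k, replace=False)] = rng.normal(size=k)
y = Phi @ x + 0.01 * rng.normal(size=m)  # underdetermined: far fewer experiments than genes

omp = OrthogonalMatchingPursuit(n_nonzero_coefs=k, fit_intercept=False).fit(Phi, y)
print(np.flatnonzero(omp.coef_), np.flatnonzero(x))   # inferred vs. planted regulators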
Pratapa, Phanisri P.; Suryanarayana, Phanish; Pask, John E.
2015-12-01
We employ Anderson extrapolation to accelerate the classical Jacobi iterative method for large, sparse linear systems. Specifically, we utilize extrapolation at periodic intervals within the Jacobi iteration to develop the Alternating Anderson–Jacobi (AAJ) method. We verify the accuracy and efficacy of AAJ in a range of test cases, including nonsymmetric systems of equations. We demonstrate that AAJ possesses a favorable scaling with system size that is accompanied by a small prefactor, even in the absence of a preconditioner. In particular, we show that AAJ is able to accelerate the classical Jacobi iteration by over four orders of magnitude, with speed-ups that increase as the system gets larger. Moreover, we find that AAJ significantly outperforms the Generalized Minimal Residual (GMRES) method in the range of problems considered here, with the relative performance again improving with size of the system. As a result, the proposed method represents a simple yet efficient technique that is particularly attractive for large-scale parallel solutions of linear systems of equations.
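A compact sketch of the alternating scheme (not the reference implementation; the weight omega, history length m, and extrapolation period p are illustrative, and the Anderson step here uses plain rather than Jacobi-preconditioned residuals):

import numpy as np
import scipy.sparse as sp

def aaj(A, b, omega=0.5, m=5, p=4, tol=1e-8, maxit=2000):
    """Weighted Jacobi sweeps with an Anderson extrapolation over the last m
    residuals every p-th iteration."""
    Dinv = 1.0 / A.diagonal()
    x = np.zeros_like(b)
    X, R = [], []                        # short histories of iterates and residuals
    for it in range(maxit):
        r = b - A @ x
        if np.linalg.norm(r) < tol * np.linalg.norm(b):
            return x, it
        X.append(x.copy()); R.append(r.copy())
        X, R = X[-m - 1:], R[-m - 1:]
        if (it + 1) % p == 0 and len(R) > 1:
            dX = np.diff(np.array(X), axis=0).T   # n x (history - 1)
            dR = np.diff(np.array(R), axis=0).T
            gamma = np.linalg.lstsq(dR, r, rcond=None)[0]
            x = x + omega * r - (dX + omega * dR) @ gamma   # Anderson extrapolation
        else:
            x = x + omega * (Dinv * r)                      # plain weighted Jacobi sweep
    return x, maxit

n = 500
A = sp.diags([-1.0, 2.5, -1.0], [-1, 0, 1], shape=(n, n), format="csr")
x, iters = aaj(A, np.ones(n))
print(iters, np.linalg.norm(A @ x - 1.0))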
Fast solution of elliptic partial differential equations using linear combinations of plane waves.
Pérez-Jordá, José M
2016-02-01
Given an arbitrary elliptic partial differential equation (PDE), a procedure for obtaining its solution is proposed based on the method of Ritz: the solution is written as a linear combination of plane waves and the coefficients are obtained by variational minimization. The PDE to be solved is cast as a system of linear equations Ax=b, where the matrix A is not sparse, which prevents the straightforward application of standard iterative methods. This lack of sparsity can be circumvented by means of a recursive bisection approach based on the fast Fourier transform, which makes it possible to implement fast versions of some stationary iterative methods (such as Gauss-Seidel) consuming O(N log N) memory and executing an iteration in O(N log² N) time, N being the number of plane waves used. In a similar way, fast versions of Krylov subspace methods and multigrid methods can also be implemented. These procedures are tested on Poisson's equation expressed in adaptive coordinates. It is found that the best results are obtained with the GMRES method using a multigrid preconditioner with Gauss-Seidel relaxation steps.
Archer, A.W.; Maples, C.G.
1989-01-01
Numerous departures from ideal relationships are revealed by Monte Carlo simulations of widely accepted binomial coefficients. For example, simulations incorporating varying levels of matrix sparseness (presence of zeros indicating lack of data) and computation of expected values reveal that not only are all common coefficients influenced by zero data, but also that some coefficients do not discriminate between sparse or dense matrices (few zero data). Such coefficients computationally merge mutually shared and mutually absent information and do not exploit all the information incorporated within the standard 2 × 2 contingency table; therefore, the commonly used formulae for such coefficients are more complicated than the actual range of values produced. Other coefficients do differentiate between mutual presences and absences; however, a number of these coefficients do not demonstrate a linear relationship to matrix sparseness. Finally, simulations using nonrandom matrices with known degrees of row-by-row similarities signify that several coefficients either do not display a reasonable range of values or are nonlinear with respect to known relationships within the data. Analyses with nonrandom matrices yield clues as to the utility of certain coefficients for specific applications. For example, coefficients such as Jaccard, Dice, and Baroni-Urbani and Buser are useful if correction of sparseness is desired, whereas the Russell-Rao coefficient is useful when sparseness correction is not desired. © 1989 International Association for Mathematical Geology.
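The distinctions above are visible directly in the 2 × 2 contingency-table formulas. The sketch below contrasts coefficients that ignore mutual absences (Jaccard, Dice) with ones that fold them in (simple matching, Russell-Rao), on a sparse pair of presence/absence vectors; the densities are illustrative:

import numpy as np

def coefficients(u, v):
    """Similarity coefficients from the 2 x 2 contingency table of two
    presence/absence vectors."""
    a = np.sum(u & v)                    # mutual presences
    b = np.sum(u & ~v)
    c = np.sum(~u & v)
    d = np.sum(~u & ~v)                  # mutual absences (inflated in sparse matrices)
    return {"jaccard": a / (a + b + c),
            "dice": 2 * a / (2 * a + b + c),
            "simple_matching": (a + d) / (a + b + c + d),
            "russell_rao": a / (a + b + c + d)}

rng = np.random.default_rng(14)
u = rng.random(200) < 0.05               # sparse presence/absence rows
v = rng.random(200) < 0.05
print(coefficients(u, v))                # simple matching is high purely from shared zeros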
Semi-implicit integration factor methods on sparse grids for high-dimensional systems
NASA Astrophysics Data System (ADS)
Wang, Dongyong; Chen, Weitao; Nie, Qing
2015-07-01
Numerical methods for partial differential equations in high-dimensional spaces are often limited by the curse of dimensionality. Though the sparse grid technique, based on a one-dimensional hierarchical basis through tensor products, is popular for handling challenges such as those associated with spatial discretization, the stability conditions on time step size due to temporal discretization, such as those associated with high-order derivatives in space and stiff reactions, remain. Here, we incorporate the sparse grids with the implicit integration factor method (IIF) that is advantageous in terms of stability conditions for systems containing stiff reactions and diffusions. We combine IIF, in which the reaction is treated implicitly and the diffusion is treated explicitly and exactly, with various sparse grid techniques based on the finite element and finite difference methods and a multi-level combination approach. The overall method is found to be efficient in terms of both storage and computational time for solving a wide range of PDEs in high dimensions. In particular, the IIF with the sparse grid combination technique is flexible and effective in solving systems that may include cross-derivatives and non-constant diffusion coefficients. Extensive numerical simulations in both linear and nonlinear systems in high dimensions, along with applications of diffusive logistic equations and Fokker-Planck equations, demonstrate the accuracy, efficiency, and robustness of the new methods, indicating potential broad applications of the sparse grid-based integration factor method.
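A minimal 1D sketch of the integration-factor idea in Python: diffusion is propagated exactly in Fourier space, while the (diffusive logistic) reaction is treated explicitly here for brevity; the IIF described above treats the reaction implicitly for better stiffness handling:

import numpy as np

N, L, D, dt = 128, 2 * np.pi, 0.1, 1e-2
xg = np.linspace(0.0, L, N, endpoint=False)
kk = 2 * np.pi * np.fft.fftfreq(N, d=L / N)
E = np.exp(-D * kk ** 2 * dt)                 # exact diffusion propagator
u = 0.5 + 0.4 * np.cos(xg)                    # initial state
reaction = lambda v: v * (1.0 - v)            # logistic reaction term
for _ in range(500):
    u = np.fft.ifft(E * np.fft.fft(u + dt * reaction(u))).real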
Sparse representation of whole-brain fMRI signals for identification of functional networks.
Lv, Jinglei; Jiang, Xi; Li, Xiang; Zhu, Dajiang; Chen, Hanbo; Zhang, Tuo; Zhang, Shu; Hu, Xintao; Han, Junwei; Huang, Heng; Zhang, Jing; Guo, Lei; Liu, Tianming
2015-02-01
There have been several recent studies that used sparse representation for fMRI signal analysis and activation detection based on the assumption that each voxel's fMRI signal is linearly composed of sparse components. Previous studies have employed sparse coding to model functional networks in various modalities and scales. These prior contributions inspired the exploration of whether/how sparse representation can be used to identify functional networks in a voxel-wise way and on the whole brain scale. This paper presents a novel, alternative methodology of identifying multiple functional networks via sparse representation of whole-brain task-based fMRI signals. Our basic idea is that all fMRI signals within the whole brain of one subject are aggregated into a big data matrix, which is then factorized into an over-complete dictionary basis matrix and a reference weight matrix via an effective online dictionary learning algorithm. Our extensive experimental results have shown that this novel methodology can uncover multiple functional networks that can be well characterized and interpreted in spatial, temporal and frequency domains based on current brain science knowledge. Importantly, these well-characterized functional network components are quite reproducible in different brains. In general, our methods offer a novel, effective and unified solution to multiple fMRI data analysis tasks including activation detection, de-activation detection, and functional network identification.
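A rough analogue of the factorization step, using scikit-learn's online dictionary learner with random data standing in for the aggregated fMRI matrix (rows play the role of voxel time series; a column of the code matrix across voxels then plays the role of one spatial network map):

import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

X = np.random.default_rng(0).normal(size=(2000, 50))   # stand-in: voxels x time
learner = MiniBatchDictionaryLearning(n_components=80, alpha=1.0,
                                      batch_size=256, random_state=0)
codes = learner.fit_transform(X)     # sparse weight matrix (voxels x components)
dictionary = learner.components_     # over-complete basis (components x time)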
Li, Ziyi; Safo, Sandra E; Long, Qi
2017-07-11
Sparse principal component analysis (PCA) is a popular tool for dimensionality reduction, pattern recognition, and visualization of high dimensional data. It has been recognized that complex biological mechanisms occur through concerted relationships of multiple genes working in networks that are often represented by graphs. Recent work has shown that incorporating such biological information improves feature selection and prediction performance in regression analysis, but there has been limited work on extending this approach to PCA. In this article, we propose two new sparse PCA methods called Fused and Grouped sparse PCA that enable incorporation of prior biological information in variable selection. Our simulation studies suggest that, compared to existing sparse PCA methods, the proposed methods achieve higher sensitivity and specificity when the graph structure is correctly specified, and are fairly robust to misspecified graph structures. Application to a glioblastoma gene expression dataset identified pathways that are suggested in the literature to be related with glioblastoma. The proposed sparse PCA methods Fused and Grouped sparse PCA can effectively incorporate prior biological information in variable selection, leading to improved feature selection and more interpretable principal component loadings and potentially providing insights on molecular underpinnings of complex diseases.
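For comparison, plain (graph-agnostic) sparse PCA is available in scikit-learn; the sketch below uses random stand-in data and does not incorporate the pathway structure that the proposed Fused and Grouped variants add:

import numpy as np
from sklearn.decomposition import SparsePCA

X = np.random.default_rng(1).normal(size=(100, 500))   # samples x genes
spca = SparsePCA(n_components=5, alpha=2.0, random_state=0)
scores = spca.fit_transform(X)
loadings = spca.components_   # sparse loadings: most entries exactly zero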
Solving very large, sparse linear systems on mesh-connected parallel computers
NASA Technical Reports Server (NTRS)
Opsahl, Torstein; Reif, John
1987-01-01
The implementation of Pan and Reif's Parallel Nested Dissection (PND) algorithm on mesh connected parallel computers is described. This is the first known algorithm that allows very large, sparse linear systems of equations to be solved efficiently in polylog time using a small number of processors. How the processor bound of PND can be matched to the number of processors available on a given parallel computer by slowing down the algorithm by constant factors is described. Also, for the important class of problems where G(A) is a grid graph, a unique memory mapping that reduces the inter-processor communication requirements of PND to those that can be executed on mesh connected parallel machines is detailed. A description of an implementation on the Goodyear Massively Parallel Processor (MPP), located at Goddard is given. Also, a detailed discussion of data mappings and performance issues is given.
Sparse Coding and Counting for Robust Visual Tracking
Liu, Risheng; Wang, Jing; Shang, Xiaoke; Wang, Yiyang; Su, Zhixun; Cai, Yu
2016-01-01
In this paper, we propose a novel sparse coding and counting method under a Bayesian framework for visual tracking. In contrast to existing methods, the proposed method employs the combination of L0 and L1 norms to regularize the linear coefficients of incrementally updated linear basis. The sparsity constraint enables the tracker to effectively handle difficult challenges, such as occlusion or image corruption. To achieve real-time processing, we propose a fast and efficient numerical algorithm for solving the proposed model. Although it is an NP-hard problem, the proposed accelerated proximal gradient (APG) approach is guaranteed to converge to a solution quickly. In addition, we provide a closed-form solution for the combined L0 and L1 regularized representation to obtain better sparsity. Experimental results on challenging video sequences demonstrate that the proposed method achieves state-of-the-art results both in accuracy and speed. PMID:27992474
SparseBeads data: benchmarking sparsity-regularized computed tomography
NASA Astrophysics Data System (ADS)
Jørgensen, Jakob S.; Coban, Sophia B.; Lionheart, William R. B.; McDonald, Samuel A.; Withers, Philip J.
2017-12-01
Sparsity regularization (SR) such as total variation (TV) minimization allows accurate image reconstruction in x-ray computed tomography (CT) from fewer projections than analytical methods. Exactly how few projections suffice and how this number may depend on the image remain poorly understood. Compressive sensing connects the critical number of projections to the image sparsity but does not cover CT; empirical results, however, suggest a similar connection. The present work establishes for real CT data a connection between gradient sparsity and the sufficient number of projections for accurate TV-regularized reconstruction. A collection of 48 x-ray CT datasets called SparseBeads was designed for benchmarking SR reconstruction algorithms. Beadpacks comprising glass beads of five different sizes as well as mixtures were scanned in a micro-CT scanner to provide structured datasets with variable image sparsity levels, number of projections and noise levels to allow the systematic assessment of parameters affecting performance of SR reconstruction algorithms. Using the SparseBeads data, TV-regularized reconstruction quality was assessed as a function of numbers of projections and gradient sparsity. The critical number of projections for satisfactory TV-regularized reconstruction increased almost linearly with the gradient sparsity. This establishes a quantitative guideline from which one may predict how few projections to acquire based on expected sample sparsity level, as an aid in planning of dose- or time-critical experiments. The results are expected to hold for samples of similar characteristics, i.e. consisting of few, distinct phases with relatively simple structure. Such cases are plentiful in porous media, composite materials, foams, as well as non-destructive testing and metrology. For samples of other characteristics the proposed methodology may be used to investigate similar relations.
The Joker: A Custom Monte Carlo Sampler for Binary-star and Exoplanet Radial Velocity Data
NASA Astrophysics Data System (ADS)
Price-Whelan, Adrian M.; Hogg, David W.; Foreman-Mackey, Daniel; Rix, Hans-Walter
2017-03-01
Given sparse or low-quality radial velocity measurements of a star, there are often many qualitatively different stellar or exoplanet companion orbit models that are consistent with the data. The consequent multimodality of the likelihood function leads to extremely challenging search, optimization, and Markov chain Monte Carlo (MCMC) posterior sampling over the orbital parameters. Here we create a custom Monte Carlo sampler for sparse or noisy radial velocity measurements of two-body systems that can produce posterior samples for orbital parameters even when the likelihood function is poorly behaved. The six standard orbital parameters for a binary system can be split into four nonlinear parameters (period, eccentricity, argument of pericenter, phase) and two linear parameters (velocity amplitude, barycenter velocity). We capitalize on this by building a sampling method in which we densely sample the prior probability density function (pdf) in the nonlinear parameters and perform rejection sampling using a likelihood function marginalized over the linear parameters. With sparse or uninformative data, the sampling obtained by this rejection sampling is generally multimodal and dense. With informative data, the sampling becomes effectively unimodal but too sparse: in these cases we follow the rejection sampling with standard MCMC. The method produces correct samplings in orbital parameters for data that include as few as three epochs. The Joker can therefore be used to produce proper samplings of multimodal pdfs, which are still informative and can be used in hierarchical (population) modeling. We give some examples that show how the posterior pdf depends sensitively on the number and time coverage of the observations and their uncertainties.
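A stripped-down Python sketch of the sampling strategy, with several simplifications: circular orbits, only two nonlinear parameters (period, phase), synthetic data, and a profile likelihood standing in for the full analytic marginalization over the two linear parameters:

import numpy as np

rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 100.0, 5))                 # sparse epochs
v = 3.0 * np.sin(2 * np.pi * t / 17.0) + 1.0 + rng.normal(0.0, 0.5, t.size)
sigma = 0.5

# Densely sample the prior over the nonlinear parameters.
P = np.exp(rng.uniform(np.log(5.0), np.log(50.0), 20000))
phi = rng.uniform(0.0, 2 * np.pi, P.size)
log_like = np.empty(P.size)
for j in range(P.size):
    # Linear parameters (amplitude, barycenter velocity) handled by least squares.
    M = np.column_stack([np.sin(2 * np.pi * t / P[j] + phi[j]), np.ones_like(t)])
    coef, *_ = np.linalg.lstsq(M, v, rcond=None)
    log_like[j] = -0.5 * np.sum(((v - M @ coef) / sigma) ** 2)

# Rejection step: survivors form a (typically multimodal) posterior sampling.
keep = np.log(rng.uniform(size=P.size)) < log_like - log_like.max()
posterior_periods = P[keep]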
Ray, J.; Lee, J.; Yadav, V.; ...
2014-08-20
We present a sparse reconstruction scheme that can also be used to ensure non-negativity when fitting wavelet-based random field models to limited observations in non-rectangular geometries. The method is relevant when multiresolution fields are estimated using linear inverse problems. Examples include the estimation of emission fields for many anthropogenic pollutants using atmospheric inversion or hydraulic conductivity in aquifers from flow measurements. The scheme is based on three new developments. Firstly, we extend an existing sparse reconstruction method, Stagewise Orthogonal Matching Pursuit (StOMP), to incorporate prior information on the target field. Secondly, we develop an iterative method that uses StOMP to impose non-negativity on the estimated field. Finally, we devise a method, based on compressive sensing, to limit the estimated field within an irregularly shaped domain. We demonstrate the method on the estimation of fossil-fuel CO2 (ffCO2) emissions in the lower 48 states of the US. The application uses a recently developed multiresolution random field model and synthetic observations of ffCO2 concentrations from a limited set of measurement sites. We find that our method for limiting the estimated field within an irregularly shaped region is about a factor of 10 faster than conventional approaches. It also reduces the overall computational cost by a factor of two. Further, the sparse reconstruction scheme imposes non-negativity without introducing strong nonlinearities, such as those introduced by employing log-transformed fields, and thus reaps the benefits of simplicity and computational speed that are characteristic of linear inverse problems.
Application of Fast Multipole Methods to the NASA Fast Scattering Code
NASA Technical Reports Server (NTRS)
Dunn, Mark H.; Tinetti, Ana F.
2008-01-01
The NASA Fast Scattering Code (FSC) is a versatile noise prediction program designed to conduct aeroacoustic noise reduction studies. The equivalent source method is used to solve an exterior Helmholtz boundary value problem with an impedance type boundary condition. The solution process in FSC v2.0 requires direct manipulation of a large, dense system of linear equations, limiting the applicability of the code to small scales and/or moderate excitation frequencies. Recent advances in the use of Fast Multipole Methods (FMM) for solving scattering problems, coupled with sparse linear algebra techniques, suggest that a substantial reduction in computer resource utilization over conventional solution approaches can be obtained. Implementation of the single level FMM (SLFMM) and a variant of the Conjugate Gradient Method (CGM) into the FSC is discussed in this paper. The culmination of this effort, FSC v3.0, was used to generate solutions for three configurations of interest. Benchmarking against previously obtained simulations indicate that a twenty-fold reduction in computational memory and up to a four-fold reduction in computer time have been achieved on a single processor.
Energy conserving, linear scaling Born-Oppenheimer molecular dynamics.
Cawkwell, M J; Niklasson, Anders M N
2012-10-07
Born-Oppenheimer molecular dynamics simulations with long-term conservation of the total energy and a computational cost that scales linearly with system size have been obtained simultaneously. Linear scaling with a low pre-factor is achieved using density matrix purification with sparse matrix algebra and a numerical threshold on matrix elements. The extended Lagrangian Born-Oppenheimer molecular dynamics formalism [A. M. N. Niklasson, Phys. Rev. Lett. 100, 123004 (2008)] yields microcanonical trajectories with the approximate forces obtained from the linear scaling method that exhibit no systematic drift over hundreds of picoseconds and which are indistinguishable from trajectories computed using exact forces.
The Dropout Learning Algorithm
Baldi, Pierre; Sadowski, Peter
2014-01-01
Dropout is a recently introduced algorithm for training neural networks by randomly dropping units during training to prevent their co-adaptation. A mathematical analysis of some of the static and dynamic properties of dropout is provided using Bernoulli gating variables, general enough to accommodate dropout on units or connections, and with variable rates. The framework allows a complete analysis of the ensemble averaging properties of dropout in linear networks, which is useful to understand the non-linear case. The ensemble averaging properties of dropout in non-linear logistic networks result from three fundamental equations: (1) the approximation of the expectations of logistic functions by normalized geometric means, for which bounds and estimates are derived; (2) the algebraic equality between the normalized geometric mean of logistic functions and the logistic of the means, which mathematically characterizes logistic functions; and (3) the linearity of the means with respect to sums, as well as products of independent variables. The results are also extended to other classes of transfer functions, including rectified linear functions. Approximation errors tend to cancel each other and do not accumulate. Dropout can also be connected to stochastic neurons and used to predict firing rates, and to backpropagation by viewing the backward propagation as ensemble averaging in a dropout linear network. Moreover, the convergence properties of dropout can be understood in terms of stochastic gradient descent. Finally, for the regularization properties of dropout, the expectation of the dropout gradient is the gradient of the corresponding approximation ensemble, regularized by an adaptive weight decay term with a propensity for self-consistent variance minimization and sparse representations. PMID:24771879
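A minimal sketch of the Bernoulli gating described above; the 1/p rescaling is the common "inverted dropout" convention, so that the deterministic test-time pass approximates the ensemble average:

import numpy as np

rng = np.random.default_rng(0)

def dropout_layer(h, p=0.5, train=True):
    # Bernoulli gating on units; E[output] matches the deterministic layer.
    if train:
        mask = (rng.random(h.shape) < p).astype(h.dtype)
        return h * mask / p
    return h             # test time: approximates the dropout ensemble mean

W = rng.normal(size=(32, 64))
x = rng.normal(size=64)
hidden = dropout_layer(np.maximum(W @ x, 0.0))   # rectified linear units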
Liu, Hongcheng; Yao, Tao; Li, Runze; Ye, Yinyu
2017-11-01
This paper concerns the folded concave penalized sparse linear regression (FCPSLR), a class of popular sparse recovery methods. Although FCPSLR yields desirable recovery performance when solved globally, computing a global solution is NP-complete. Despite some existing statistical performance analyses on local minimizers or on specific FCPSLR-based learning algorithms, it remains an open question whether local solutions that are known to admit fully polynomial-time approximation schemes (FPTAS) may already be sufficient to ensure the statistical performance, and whether that statistical performance can be non-contingent on the specific designs of computing procedures. To address the questions, this paper presents the following threefold results: (i) any local solution (stationary point) is a sparse estimator, under some conditions on the parameters of the folded concave penalties. (ii) Perhaps more importantly, any local solution satisfying a significant subspace second-order necessary condition (S³ONC), which is weaker than the second-order KKT condition, yields a bounded error in approximating the true parameter with high probability. In addition, if the minimal signal strength is sufficient, the S³ONC solution likely recovers the oracle solution. This result also explicates that the goal of improving the statistical performance is consistent with the optimization criteria of minimizing the suboptimality gap in solving the non-convex programming formulation of FCPSLR. (iii) We apply (ii) to the special case of FCPSLR with the minimax concave penalty (MCP) and show that under the restricted eigenvalue condition, any S³ONC solution with a better objective value than the Lasso solution entails the strong oracle property. In addition, such a solution generates a model error (ME) comparable to the optimal but exponential-time sparse estimator given a sufficient sample size, while the worst-case ME is comparable to the Lasso in general. Furthermore, solutions satisfying the S³ONC can be computed by an FPTAS.
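For reference, a sketch of the minimax concave penalty used in (iii); the flattening beyond gamma*lam is what avoids the systematic over-shrinkage of large coefficients that the l1 (Lasso) penalty incurs:

import numpy as np

def mcp_penalty(t, lam, gamma):
    # MCP: follows lam*|t| near zero, tapers quadratically, then flattens
    # at the constant gamma*lam**2/2 (gamma > 1).
    a = np.abs(t)
    return np.where(a <= gamma * lam,
                    lam * a - a ** 2 / (2.0 * gamma),
                    gamma * lam ** 2 / 2.0)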
Compressive Sensing with Cross-Validation and Stop-Sampling for Sparse Polynomial Chaos Expansions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Huan, Xun; Safta, Cosmin; Sargsyan, Khachik
Compressive sensing is a powerful technique for recovering sparse solutions of underdetermined linear systems, which is often encountered in uncertainty quantification analysis of expensive and high-dimensional physical models. We perform numerical investigations employing several compressive sensing solvers that target the unconstrained LASSO formulation, with a focus on linear systems that arise in the construction of polynomial chaos expansions. With core solvers of l1_ls, SpaRSA, CGIST, FPC_AS, and ADMM, we develop techniques to mitigate overfitting through an automated selection of regularization constant based on cross-validation, and a heuristic strategy to guide the stop-sampling decision. Practical recommendations on parameter settings for these techniques are provided and discussed. The overall method is applied to a series of numerical examples of increasing complexity, including large eddy simulations of supersonic turbulent jet-in-crossflow involving a 24-dimensional input. Through empirical phase-transition diagrams and convergence plots, we illustrate sparse recovery performance under structures induced by polynomial chaos, accuracy and computational tradeoffs between polynomial bases of different degrees, and practicability of conducting compressive sensing for a realistic, high-dimensional physical application. Across test cases studied in this paper, we find ADMM to have demonstrated empirical advantages through consistent lower errors and faster computational times.
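A minimal sketch of cross-validated selection of the regularization constant, using scikit-learn's LassoCV on a random underdetermined system; the paper's dedicated solvers (l1_ls, SpaRSA, CGIST, FPC_AS, ADMM) and its stop-sampling heuristic are not reproduced here:

import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)
A = rng.normal(size=(80, 500))            # 80 model runs, 500 basis columns
coeffs = np.zeros(500)
coeffs[:8] = rng.normal(size=8)           # sparse ground truth
y = A @ coeffs + 0.01 * rng.normal(size=80)
fit = LassoCV(cv=5, n_alphas=50, max_iter=5000).fit(A, y)   # CV picks alpha
recovered = fit.coef_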
Using Perturbed QR Factorizations To Solve Linear Least-Squares Problems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Avron, Haim; Ng, Esmond G.; Toledo, Sivan
2008-03-21
We propose and analyze a new tool to help solve sparse linear least-squares problems min_x ||Ax - b||_2. Our method is based on a sparse QR factorization of a low-rank perturbation Â of A. More precisely, we show that the R factor of Â is an effective preconditioner for the least-squares problem min_x ||Ax - b||_2, when solved using LSQR. We propose applications for the new technique. When A is rank deficient we can add rows to ensure that the preconditioner is well-conditioned without column pivoting. When A is sparse except for a few dense rows we can drop these dense rows from A to obtain Â. Another application is solving an updated or downdated problem. If R is a good preconditioner for the original problem A, it is a good preconditioner for the updated/downdated problem Â. We can also solve what-if scenarios, where we want to find the solution if a column of the original matrix is changed/removed. We present a spectral theory that analyzes the generalized spectrum of the pencil (A*A, R*R) and analyze the applications.
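A dense stand-in sketch of the idea in Python (the paper works with sparse QR): perturb A by dropping a few troublesome rows, take the R factor of the perturbation, and use it as a right preconditioner for LSQR:

import numpy as np
from scipy.sparse.linalg import LinearOperator, lsqr

rng = np.random.default_rng(0)
A = rng.normal(size=(200, 50))
A[:3] *= 50.0                            # three hypothetically troublesome rows
b = rng.normal(size=200)

A_pert = A.copy()
A_pert[:3] = 0.0                         # low-rank perturbation: drop those rows
R = np.linalg.qr(A_pert, mode='r')       # R factor of the perturbed matrix

# Solve min ||A R^{-1} y - b||_2 with LSQR, then recover x = R^{-1} y.
op = LinearOperator((200, 50),
                    matvec=lambda y: A @ np.linalg.solve(R, y),
                    rmatvec=lambda z: np.linalg.solve(R.T, A.T @ z))
y = lsqr(op, b, atol=1e-12, btol=1e-12)[0]
x = np.linalg.solve(R, y)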
Cloud-In-Cell modeling of shocked particle-laden flows at a "SPARSE" cost
NASA Astrophysics Data System (ADS)
Taverniers, Soren; Jacobs, Gustaaf; Sen, Oishik; Udaykumar, H. S.
2017-11-01
A common tool for enabling process-scale simulations of shocked particle-laden flows is Eulerian-Lagrangian Particle-Source-In-Cell (PSIC) modeling, where each particle is traced in its Lagrangian frame and treated as a mathematical point. Its dynamics are governed by Stokes drag corrected for high Reynolds and Mach numbers. The computational burden is often reduced further through a "Cloud-In-Cell" (CIC) approach which amalgamates groups of physical particles into computational "macro-particles". CIC does not account for subgrid particle fluctuations, leading to erroneous predictions of cloud dynamics. A Subgrid Particle-Averaged Reynolds-Stress Equivalent (SPARSE) model is proposed that incorporates subgrid interphase velocity and temperature perturbations. A bivariate Gaussian source distribution, whose covariance captures the cloud's deformation to first order, accounts for the particles' momentum and energy influence on the carrier gas. SPARSE is validated by conducting tests on the interaction of a particle cloud with the accelerated flow behind a shock. The cloud's average dynamics and its deformation over time predicted with SPARSE converge to their counterparts computed with reference PSIC models as the number of Gaussians is increased from 1 to 16. This work was supported by AFOSR Grant No. FA9550-16-1-0008.
Temporally-Constrained Group Sparse Learning for Longitudinal Data Analysis in Alzheimer’s Disease
Jie, Biao; Liu, Mingxia; Liu, Jun
2016-01-01
Sparse learning has been widely investigated for analysis of brain images to assist the diagnosis of Alzheimer’s disease (AD) and its prodromal stage, i.e., mild cognitive impairment (MCI). However, most existing sparse learning-based studies only adopt cross-sectional analysis methods, where the sparse model is learned using data from a single time-point. Actually, multiple time-points of data are often available in brain imaging applications, which can be used in some longitudinal analysis methods to better uncover the disease progression patterns. Accordingly, in this paper we propose a novel temporally-constrained group sparse learning method aiming for longitudinal analysis with multiple time-points of data. Specifically, we learn a sparse linear regression model by using the imaging data from multiple time-points, where a group regularization term is first employed to group the weights for the same brain region across different time-points together. Furthermore, to reflect the smooth changes between data derived from adjacent time-points, we incorporate two smoothness regularization terms into the objective function, i.e., one fused smoothness term which requires that the differences between two successive weight vectors from adjacent time-points should be small, and another output smoothness term which requires the differences between outputs of two successive models from adjacent time-points should also be small. We develop an efficient optimization algorithm to solve the proposed objective function. Experimental results on ADNI database demonstrate that, compared with conventional sparse learning-based methods, our proposed method can achieve improved regression performance and also help in discovering disease-related biomarkers. PMID:27093313
NASA Astrophysics Data System (ADS)
Parekh, Ankit
Sparsity has become the basis of some important signal processing methods over the last ten years. Many signal processing problems (e.g., denoising, deconvolution, non-linear component analysis) can be expressed as inverse problems. Sparsity is invoked through the formulation of an inverse problem with suitably designed regularization terms. The regularization terms alone encode sparsity into the problem formulation. Often, the ℓ1 norm is used to induce sparsity, so much so that ℓ1 regularization is considered to be 'modern least-squares'. The use of the ℓ1 norm, as a sparsity-inducing regularizer, leads to a convex optimization problem, which has several benefits: the absence of extraneous local minima and a well-developed theory of globally convergent algorithms, even for large-scale problems. Convex regularization via the ℓ1 norm, however, tends to under-estimate the non-zero values of sparse signals. In order to estimate the non-zero values more accurately, non-convex regularization is often favored over convex regularization. However, non-convex regularization generally leads to non-convex optimization, which suffers from numerous issues: convergence may be guaranteed only to a stationary point, problem-specific parameters may be difficult to set, and the solution is sensitive to the initialization of the algorithm. The first part of this thesis is aimed toward combining the benefits of non-convex regularization and convex optimization to estimate sparse signals more effectively. To this end, we propose to use parameterized non-convex regularizers with designated non-convexity and provide a range for the non-convex parameter so as to ensure that the objective function is strictly convex. By ensuring convexity of the objective function (sum of data-fidelity and non-convex regularizer), we can make use of a wide variety of convex optimization algorithms to obtain the unique global minimum reliably. The second part of this thesis proposes a non-linear signal decomposition technique for an important biomedical signal processing problem: the detection of sleep spindles and K-complexes in human sleep electroencephalography (EEG). We propose a non-linear model for the EEG consisting of three components: (1) a transient (sparse piecewise constant) component, (2) a low-frequency component, and (3) an oscillatory component. The oscillatory component admits a sparse time-frequency representation. Using a convex objective function, we propose a fast non-linear optimization algorithm to estimate the three components in the proposed signal model. The low-frequency and oscillatory components are then used to estimate the K-complexes and sleep spindles, respectively. The proposed detection method is shown to outperform several state-of-the-art automated sleep spindle detection methods.
Relationships between milk culture results and milk yield in Norwegian dairy cattle.
Reksen, O; Sølverød, L; Østerås, O
2007-10-01
Associations between test-day milk yield and positive milk cultures for Staphylococcus aureus, Streptococcus spp., and other mastitis pathogens or a negative milk culture for mastitis pathogens were assessed in quarter milk samples from randomly sampled cows selected without regard to current or previous udder health status. Staphylococcus aureus was dichotomized according to sparse (< or =1,500 cfu/mL of milk) or rich (>1,500 cfu/mL of milk) growth of the bacteria. Quarter milk samples were obtained on 1 to 4 occasions from 2,740 cows in 354 Norwegian dairy herds, resulting in a total of 3,430 samplings. Measures of test-day milk yield were obtained monthly and related to 3,547 microbiological diagnoses at the cow level. Mixed model linear regression models incorporating an autoregressive covariance structure accounting for repeated test-day milk yields within cow and random effects at the herd and sample level were used to quantify the effect of positive milk cultures on test-day milk yields. Identical models were run separately for first-parity, second-parity, and third-parity or older cows. Fixed effects were days in milk, the natural logarithm of days in milk, sparse and rich growth of Staph. aureus (1/0), Streptococcus spp. (1/0), other mastitis pathogens (1/0), calving season, time of test-day milk yields relative to time of microbiological diagnosis (test day relative to time of diagnosis), and the interaction terms between microbiological diagnosis and test day relative to time of diagnosis. The models were run with the logarithmically transformed composite milk somatic cell count excluded and included. Rich growth of Staph. aureus was associated with decreased production levels in first-parity cows. An interaction between rich growth of Staph. aureus and test day relative to time of diagnosis also predicted a decline in milk production in third-parity or older cows. Interaction between sparse growth of Staph. aureus and test day relative to time of diagnosis predicted declining test-day milk yields in first-parity cows. Sparse growth of Staph. aureus was associated with high milk yields in third-parity or older cows after including the logarithmically transformed composite milk somatic cell count in the model, which illustrates that lower production levels are related to elevated somatic cell counts in high-producing cows. The same association with test-day milk yield was found among Streptococcus spp.-positive pluriparous cows.
Predicting Solar Activity Using Machine-Learning Methods
NASA Astrophysics Data System (ADS)
Bobra, M.
2017-12-01
Of all the activity observed on the Sun, two of the most energetic events are flares and coronal mass ejections. However, we do not, as of yet, fully understand the physical mechanism that triggers solar eruptions. A machine-learning algorithm, which is favorable in cases where the amount of data is large, is one way to [1] empirically determine the signatures of this mechanism in solar image data and [2] use them to predict solar activity. In this talk, we discuss the application of various machine learning algorithms - specifically, a Support Vector Machine, a sparse linear regression (Lasso), and Convolutional Neural Network - to image data from the photosphere, chromosphere, transition region, and corona taken by instruments aboard the Solar Dynamics Observatory in order to predict solar activity on a variety of time scales. Such an approach may be useful since, at the present time, there are no physical models of flares available for real-time prediction. We discuss our results (Bobra and Couvidat, 2015; Bobra and Ilonidis, 2016; Jonas et al., 2017) as well as other attempts to predict flares using machine-learning (e.g. Ahmed et al., 2013; Nishizuka et al. 2017) and compare these results with the more traditional techniques used by the NOAA Space Weather Prediction Center (Crown, 2012). We also discuss some of the challenges in using machine-learning algorithms for space science applications.
Numerical methods in Markov chain modeling
NASA Technical Reports Server (NTRS)
Philippe, Bernard; Saad, Youcef; Stewart, William J.
1989-01-01
Several methods for computing stationary probability distributions of Markov chains are described and compared. The main linear algebra problem consists of computing an eigenvector of a sparse, usually nonsymmetric, matrix associated with a known eigenvalue. It can also be cast as a problem of solving a homogeneous singular linear system. Several methods based on combinations of Krylov subspace techniques are presented. The performance of these methods on some realistic problems are compared.
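A minimal dense sketch of the main linear algebra problem stated above; for large sparse nonsymmetric chains the same eigenvector would be computed with Krylov methods (e.g. scipy.sparse.linalg.eigs) rather than a dense eigensolver:

import numpy as np

# Stationary distribution pi of a small chain: pi P = pi, i.e. an eigenvector
# of P^T for the known eigenvalue 1, equivalently the homogeneous singular
# system (P^T - I) pi = 0.
P = np.array([[0.9, 0.1, 0.0],
              [0.2, 0.7, 0.1],
              [0.0, 0.3, 0.7]])
vals, vecs = np.linalg.eig(P.T)
pi = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
pi /= pi.sum()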
ML 3.0 smoothed aggregation user's guide.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sala, Marzio; Hu, Jonathan Joseph; Tuminaro, Raymond Stephen
2004-05-01
ML is a multigrid preconditioning package intended to solve linear systems of equations Ax = b where A is a user supplied n x n sparse matrix, b is a user supplied vector of length n and x is a vector of length n to be computed. ML should be used on large sparse linear systems arising from partial differential equation (PDE) discretizations. While technically any linear system can be considered, ML should be used on linear systems that correspond to things that work well with multigrid methods (e.g. elliptic PDEs). ML can be used as a stand-alone package or to generate preconditioners for a traditional iterative solver package (e.g. Krylov methods). We have supplied support for working with the AZTEC 2.1 and AZTECOO iterative package [15]. However, other solvers can be used by supplying a few functions. This document describes one specific algebraic multigrid approach: smoothed aggregation. This approach is used within several specialized multigrid methods: one for the eddy current formulation for Maxwell's equations, and a multilevel and domain decomposition method for symmetric and non-symmetric systems of equations (like elliptic equations, or compressible and incompressible fluid dynamics problems). Other methods exist within ML but are not described in this document. Examples are given illustrating the problem definition and exercising multigrid options.
ML 3.1 smoothed aggregation user's guide.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sala, Marzio; Hu, Jonathan Joseph; Tuminaro, Raymond Stephen
2004-10-01
ML is a multigrid preconditioning package intended to solve linear systems of equations Ax = b where A is a user supplied n x n sparse matrix, b is a user supplied vector of length n and x is a vector of length n to be computed. ML should be used on large sparse linear systems arising from partial differential equation (PDE) discretizations. While technically any linear system can be considered, ML should be used on linear systems that correspond to things that work well with multigrid methods (e.g. elliptic PDEs). ML can be used as a stand-alone package or to generate preconditioners for a traditional iterative solver package (e.g. Krylov methods). We have supplied support for working with the Aztec 2.1 and AztecOO iterative package [16]. However, other solvers can be used by supplying a few functions. This document describes one specific algebraic multigrid approach: smoothed aggregation. This approach is used within several specialized multigrid methods: one for the eddy current formulation for Maxwell's equations, and a multilevel and domain decomposition method for symmetric and nonsymmetric systems of equations (like elliptic equations, or compressible and incompressible fluid dynamics problems). Other methods exist within ML but are not described in this document. Examples are given illustrating the problem definition and exercising multigrid options.
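The usage pattern described in these guides (build a smoothed-aggregation hierarchy, then hand it to a Krylov solver) can be sketched with PyAMG, a different package from ML but built on the same smoothed-aggregation approach:

import numpy as np
import pyamg

A = pyamg.gallery.poisson((200, 200), format='csr')   # elliptic PDE test matrix
ml = pyamg.smoothed_aggregation_solver(A)             # build the SA hierarchy
b = np.random.rand(A.shape[0])
x = ml.solve(b, tol=1e-8, accel='cg')                 # SA-preconditioned Krylov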
Liang, Yujie; Ying, Rendong; Lu, Zhenqi; Liu, Peilin
2014-01-01
In the design phase of sensor arrays during array signal processing, the estimation performance and system cost are largely determined by array aperture size. In this article, we address the problem of joint direction-of-arrival (DOA) estimation with distributed sparse linear arrays (SLAs) and propose an off-grid synchronous approach based on distributed compressed sensing to obtain larger array aperture. We focus on the complex source distribution in the practical applications and classify the sources into common and innovation parts according to whether a signal of source can impinge on all the SLAs or a specific one. For each SLA, we construct a corresponding virtual uniform linear array (ULA) to create the relationship of random linear map between the signals respectively observed by these two arrays. The signal ensembles including the common/innovation sources for different SLAs are abstracted as a joint spatial sparsity model. And we use the minimization of concatenated atomic norm via semidefinite programming to solve the problem of joint DOA estimation. Joint calculation of the signals observed by all the SLAs exploits their redundancy caused by the common sources and decreases the requirement of array size. The numerical results illustrate the advantages of the proposed approach. PMID:25420150
Bit error rate tester using fast parallel generation of linear recurring sequences
Pierson, Lyndon G.; Witzke, Edward L.; Maestas, Joseph H.
2003-05-06
A fast method for generating linear recurring sequences by parallel linear recurring sequence generators (LRSGs) with a feedback circuit optimized to balance minimum propagation delay against maximal sequence period. Parallel generation of linear recurring sequences requires decimating the sequence (creating small contiguous sections of the sequence in each LRSG). A companion matrix form is selected depending on whether the LFSR is right-shifting or left-shifting. The companion matrix is completed by selecting a primitive irreducible polynomial with 1's most closely grouped in a corner of the companion matrix. A decimation matrix is created by raising the companion matrix to the (n*k)th power, where k is the number of parallel LRSGs and n is the number of bits to be generated at a time by each LRSG. Companion matrices with 1's closely grouped in a corner will yield sparse decimation matrices. A feedback circuit comprised of XOR logic gates implements the decimation matrix in hardware. Sparse decimation matrices can be implemented with a minimum number of XOR gates, and therefore a minimum propagation delay through the feedback circuit. The LRSG of the invention is particularly well suited to use as a bit error rate tester on high speed communication lines because it permits the receiver to synchronize to the transmitted pattern within 2n bits.
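A small Python sketch of the decimation-matrix construction over GF(2); the tap vector below encodes a hypothetical degree-4 primitive polynomial (x^4 + x + 1) and is for illustration only:

import numpy as np

def companion_gf2(taps):
    # Right-shifting companion matrix over GF(2); `taps` is the feedback column.
    n = len(taps)
    C = np.zeros((n, n), dtype=np.uint8)
    C[1:, :-1] = np.eye(n - 1, dtype=np.uint8)
    C[:, -1] = taps
    return C

def matpow_gf2(C, e):
    # Square-and-multiply exponentiation with all arithmetic mod 2.
    R = np.eye(C.shape[0], dtype=np.uint8)
    while e:
        if e & 1:
            R = (R @ C) % 2
        C = (C @ C) % 2
        e >>= 1
    return R

C = companion_gf2(np.array([1, 1, 0, 0], dtype=np.uint8))
D = matpow_gf2(C, 4 * 2)   # decimation matrix for k = 2 LRSGs, n = 4 bits each

A sparse D then translates directly into few XOR gates in the feedback circuit.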
Reduced kernel recursive least squares algorithm for aero-engine degradation prediction
NASA Astrophysics Data System (ADS)
Zhou, Haowen; Huang, Jinquan; Lu, Feng
2017-10-01
Kernel adaptive filters (KAFs) generate a radial basis function (RBF) network that grows linearly with the number of training samples, and therefore lack sparseness. To deal with this drawback, traditional sparsification techniques select a subset of the original training data based on a certain criterion to train the network and discard the redundant data directly. Although these methods curb the growth of the network effectively, information conveyed by the redundant samples is omitted, which may lead to accuracy degradation. In this paper, we present a novel online sparsification method which requires much less training time without sacrificing accuracy. Specifically, a reduced kernel recursive least squares (RKRLS) algorithm is developed based on the reduction technique and linear independence. Unlike conventional methods, our novel methodology employs the redundant data to update the coefficients of the existing network. Due to the effective utilization of the redundant data, the novel algorithm achieves a better accuracy performance, although the network size is significantly reduced. Experiments on time series prediction and online regression demonstrate that the RKRLS algorithm requires much less computational consumption and maintains satisfactory accuracy performance. Finally, we propose an enhanced multi-sensor prognostic model based on RKRLS and a Hidden Markov Model (HMM) for remaining useful life (RUL) estimation. A case study on a turbofan degradation dataset is performed to evaluate the performance of the novel prognostic approach.
A Sparse Matrix Approach for Simultaneous Quantification of Nystagmus and Saccade
NASA Technical Reports Server (NTRS)
Kukreja, Sunil L.; Stone, Lee; Boyle, Richard D.
2012-01-01
The vestibulo-ocular reflex (VOR) consists of two intermingled non-linear subsystems, namely nystagmus and saccade. Typically, nystagmus is analysed using a single sufficiently long signal or a concatenation of them. Saccade information is discarded rather than analysed, because the data segments are too short to provide consistent, minimum-variance estimates. This paper presents a novel sparse matrix approach to system identification of the VOR. It allows for the simultaneous estimation of both nystagmus and saccade signals. We show via simulation of the VOR that our technique provides consistent and unbiased estimates in the presence of output additive noise.
Optshrink LR + S: accelerated fMRI reconstruction using non-convex optimal singular value shrinkage.
Aggarwal, Priya; Shrivastava, Parth; Kabra, Tanay; Gupta, Anubha
2017-03-01
This paper presents a new accelerated fMRI reconstruction method, namely, the OptShrink LR + S method, which reconstructs undersampled fMRI data using a linear combination of low-rank and sparse components. The low-rank component has been estimated using a non-convex optimal singular value shrinkage algorithm, while the sparse component has been estimated using convex l1 minimization. The performance of the proposed method is compared with existing state-of-the-art algorithms on a real fMRI dataset. The proposed OptShrink LR + S method yields good qualitative and quantitative results.
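A generic sketch of the low-rank-plus-sparse split in Python; convex singular value thresholding stands in for the paper's non-convex OptShrink shrinkage, and elementwise soft thresholding for the l1 step:

import numpy as np

def lr_plus_s(Y, lam=0.1, tau=1.0, iters=100):
    # Alternate singular-value shrinkage (low-rank part) with soft
    # thresholding (sparse part) until the split stabilizes.
    S = np.zeros_like(Y)
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(Y - S, full_matrices=False)
        L = (U * np.maximum(s - tau, 0.0)) @ Vt
        resid = Y - L
        S = np.sign(resid) * np.maximum(np.abs(resid) - lam, 0.0)
    return L, S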
On A Nonlinear Generalization of Sparse Coding and Dictionary Learning
Xie, Yuchen; Ho, Jeffrey; Vemuri, Baba
2013-01-01
Existing dictionary learning algorithms are based on the assumption that the data are vectors in a Euclidean vector space ℝ^d, and the dictionary is learned from the training data using the vector space structure of ℝ^d and its Euclidean L2-metric. However, in many applications, features and data often originate from a Riemannian manifold that does not support a global linear (vector space) structure. Furthermore, the extrinsic viewpoint of existing dictionary learning algorithms becomes inappropriate for modeling and incorporating the intrinsic geometry of the manifold that is potentially important and critical to the application. This paper proposes a novel framework for sparse coding and dictionary learning for data on a Riemannian manifold, and it shows that the existing sparse coding and dictionary learning methods can be considered as special (Euclidean) cases of the more general framework proposed here. We show that both the dictionary and sparse coding can be effectively computed for several important classes of Riemannian manifolds, and we validate the proposed method using two well-known classification problems in computer vision and medical imaging analysis. PMID:24129583
Exact recovery of sparse multiple measurement vectors by [Formula: see text]-minimization.
Wang, Changlong; Peng, Jigen
2018-01-01
The joint sparse recovery problem is a generalization of the single measurement vector problem widely studied in compressed sensing. It aims to recover a set of jointly sparse vectors, i.e., those that have nonzero entries concentrated at a common location. Meanwhile [Formula: see text]-minimization subject to matrices is widely used in a large number of algorithms designed for this problem, i.e., [Formula: see text]-minimization [Formula: see text] The main contribution of this paper is therefore two theoretical results about this technique. The first proves that in every multiple system of linear equations there exists a constant [Formula: see text] such that the original unique sparse solution can also be recovered from a minimization in [Formula: see text] quasi-norm subject to matrices whenever [Formula: see text]. The second gives an analytic expression for such [Formula: see text]. Finally, we display the results of one example to confirm the validity of our conclusions, and we use numerical experiments to show that our results increase the efficiency of algorithms designed for [Formula: see text]-minimization.
SPReM: Sparse Projection Regression Model For High-dimensional Linear Regression
Sun, Qiang; Zhu, Hongtu; Liu, Yufeng; Ibrahim, Joseph G.
2014-01-01
The aim of this paper is to develop a sparse projection regression modeling (SPReM) framework to perform multivariate regression modeling with a large number of responses and a multivariate covariate of interest. We propose two novel heritability ratios to simultaneously perform dimension reduction, response selection, estimation, and testing, while explicitly accounting for correlations among multivariate responses. Our SPReM is devised to specifically address the low statistical power issue of many standard statistical approaches, such as the Hotelling's T2 test statistic or a mass univariate analysis, for high-dimensional data. We formulate the estimation problem of SPReM as a novel sparse unit rank projection (SURP) problem and propose a fast optimization algorithm for SURP. Furthermore, we extend SURP to the sparse multi-rank projection (SMURP) by adopting a sequential SURP approximation. Theoretically, we have systematically investigated the convergence properties of SURP and the convergence rate of SURP estimates. Our simulation results and real data analysis have shown that SPReM outperforms other state-of-the-art methods. PMID:26527844
Sparseness- and continuity-constrained seismic imaging
NASA Astrophysics Data System (ADS)
Herrmann, Felix J.
2005-04-01
Non-linear solution strategies to the least-squares seismic inverse-scattering problem with sparseness and continuity constraints are proposed. Our approach is designed to (i) deal with substantial amounts of additive noise (SNR < 0 dB); (ii) use the sparseness and locality (both in position and angle) of directional basis functions (such as curvelets and contourlets) on the model: the reflectivity; and (iii) exploit the near invariance of these basis functions under the normal operator, i.e., the scattering-followed-by-imaging operator. Signal-to-noise ratio and the continuity along the imaged reflectors are significantly enhanced by formulating the solution of the seismic inverse problem in terms of an optimization problem. During the optimization, sparseness on the basis and continuity along the reflectors are imposed by jointly minimizing the l1- and anisotropic diffusion/total-variation norms on the coefficients and reflectivity, respectively. [Joint work with Peyman P. Moghaddam, carried out as part of the SINBAD project, with financial support secured through ITF (the Industry Technology Facilitator) from the following organizations: BG Group, BP, ExxonMobil, and SHELL. Additional funding came from the NSERC Discovery Grants 22R81254.]
Non-uniform sampling: post-Fourier era of NMR data collection and processing.
Kazimierczuk, Krzysztof; Orekhov, Vladislav
2015-11-01
The invention of multidimensional techniques in the 1970s revolutionized NMR, making it the general tool of structural analysis of molecules and materials. In the most straightforward approach, the signal sampling in the indirect dimensions of a multidimensional experiment is performed in the same manner as in the direct dimension, i.e. with a grid of equally spaced points. This results in lengthy experiments with a resolution often far from optimum. To circumvent this problem, numerous sparse-sampling techniques have been developed in the last three decades, including two traditionally distinct approaches: the radial sampling and non-uniform sampling. This mini review discusses the sparse signal sampling and reconstruction techniques from the point of view of an underdetermined linear algebra problem that arises when a full, equally spaced set of sampled points is replaced with sparse sampling. Additional assumptions that are introduced to solve the problem, as well as the shape of the undersampled Fourier transform operator (visualized as so-called point spread function), are shown to be the main differences between various sparse-sampling methods.
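One common way to pose the resulting underdetermined problem is iterative soft thresholding: promote a sparse spectrum, then re-impose the measured points. A toy Python sketch (a generic reconstruction, not any specific NMR package's algorithm):

import numpy as np

def ist_reconstruct(y, mask, lam=0.5, iters=200):
    # y: zero-filled measurements; mask: True where a point was sampled.
    x = y.copy()
    for _ in range(iters):
        X = np.fft.fft(x)
        shrink = np.maximum(1.0 - lam / np.maximum(np.abs(X), 1e-12), 0.0)
        x = np.fft.ifft(X * shrink)          # soft-threshold the spectrum
        x[mask] = y[mask]                    # data consistency on sampled points
    return np.fft.fft(x)

rng = np.random.default_rng(0)
n = 512
full = np.exp(2j * np.pi * 50 * np.arange(n) / n) \
     + 0.5 * np.exp(2j * np.pi * 120 * np.arange(n) / n)
mask = rng.random(n) < 0.25                  # 25% non-uniform sampling
spectrum = ist_reconstruct(np.where(mask, full, 0.0), mask)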
Multi-objective based spectral unmixing for hyperspectral images
NASA Astrophysics Data System (ADS)
Xu, Xia; Shi, Zhenwei
2017-02-01
Sparse hyperspectral unmixing assumes that each observed pixel can be expressed by a linear combination of several pure spectra in an a priori library. Sparse unmixing is challenging, since it is usually transformed to an NP-hard l0 norm based optimization problem. Existing methods usually utilize a relaxation of the original l0 norm. However, the relaxation may bring in sensitive weighting parameters and additional approximation error. In this paper, we propose a novel multi-objective based algorithm to solve the sparse unmixing problem without any relaxation. We transform sparse unmixing into a multi-objective optimization problem, which contains two correlative objectives: minimizing the reconstruction error and controlling the endmember sparsity. To improve the efficiency of multi-objective optimization, a population-based randomly flipping strategy is designed. Moreover, we theoretically prove that the proposed method is able to recover a guaranteed approximate solution from the spectral library within limited iterations. The proposed method can directly deal with the l0 norm via binary coding for the spectral signatures in the library. Experiments on both synthetic and real hyperspectral datasets demonstrate the effectiveness of the proposed method.
Robust visual tracking via multiscale deep sparse networks
NASA Astrophysics Data System (ADS)
Wang, Xin; Hou, Zhiqiang; Yu, Wangsheng; Xue, Yang; Jin, Zefenfen; Dai, Bo
2017-04-01
In visual tracking, deep learning with offline pretraining can extract more intrinsic and robust features, and has achieved significant success in mitigating tracking drift in complicated environments. However, offline pretraining requires numerous auxiliary training datasets and is considerably time-consuming for tracking tasks. To solve these problems, a multiscale sparse networks-based tracker (MSNT) under the particle filter framework is proposed. Based on stacked sparse autoencoders and rectifier linear units, the tracker has a flexible and adjustable architecture without the offline pretraining process and exploits robust and powerful features effectively only through online training on limited labeled data. Meanwhile, the tracker builds four deep sparse networks of different scales, according to the target's profile type. During tracking, the tracker selects the matched tracking network adaptively in accordance with the initial target's profile type. It preserves the inherent structural information more efficiently than single-scale networks. Additionally, a corresponding update strategy is proposed to improve the robustness of the tracker. Extensive experimental results on a large scale benchmark dataset show that the proposed method performs favorably against state-of-the-art methods in challenging environments.
Optical fringe-reflection deflectometry with sparse representation
NASA Astrophysics Data System (ADS)
Xiao, Yong-Liang; Li, Sikun; Zhang, Qican; Zhong, Jianxin; Su, Xianyu; You, Zhisheng
2018-05-01
Optical fringe-reflection deflectometry is an attractive scratch detection technique for specular surfaces owing to its unparalleled local sensitivity. Full-field surface topography is obtained from a measured normal field using gradient integration. However, an ideal measured gradient field may not be available for deflectometry reconstruction in practice. Both the non-integrability condition and various kinds of image noise distributions, which are present in the indirect measured gradient field, may lead to ambiguity about the scratches on specular surfaces. In order to reduce misjudgment of scratches, sparse representation is introduced into the Southwell curl equation for deflectometry. The curl can be represented as a linear combination of the given redundant dictionary for curl and the sparsest solution for gradient refinement. The non-integrability condition and noise perturbation can be overcome with sparse representation for gradient refinement. Numerical simulations demonstrate that the accuracy of scratch judgment can be enhanced with sparse representation compared to standard least-squares integration. Preliminary experiments are performed with the application of practical measured deflectometric data to verify the validity of the algorithm.
Joint sparse coding based spatial pyramid matching for classification of color medical image.
Shi, Jun; Li, Yi; Zhu, Jie; Sun, Haojie; Cai, Yin
2015-04-01
Although color medical images are important in clinical practice, they are usually converted to grayscale for further processing in pattern recognition, resulting in loss of rich color information. The sparse coding based linear spatial pyramid matching (ScSPM) and its variants are popular for grayscale image classification, but cannot extract color information. In this paper, we propose a joint sparse coding based SPM (JScSPM) method for the classification of color medical images. A joint dictionary can represent both the color information in each color channel and the correlation between channels. Consequently, the joint sparse codes calculated from a joint dictionary can carry color information, and therefore this method can easily transform a feature descriptor originally designed for grayscale images to a color descriptor. A color hepatocellular carcinoma histological image dataset was used to evaluate the performance of the proposed JScSPM algorithm. Experimental results show that JScSPM provides significant improvements as compared with the majority voting based ScSPM and the original ScSPM for color medical image classification.
Jonasson, Grethe; Sundh, Valter; Ahlqwist, Margareta; Hakeberg, Magnus; Björkelund, Cecilia; Lissner, Lauren
2011-10-01
Bone structure is the key to the understanding of fracture risk. The hypothesis tested in this prospective study is that dense mandibular trabeculation predicts low fracture risk, whereas sparse trabeculation is predictive of high fracture risk. Out of 731 women from the Prospective Population Study of Women in Gothenburg with dental examinations at baseline 1968, 222 had their first fracture in the follow-up period until 2006. Mandibular trabeculation was defined as dense, mixed dense plus sparse, and sparse based on panoramic radiographs from 1968 and/or 1980. Time to fracture was ascertained and used as the dependent variable in three Cox proportional hazards regression analyses. The first analysis covered 12 years of follow-up with self-reported endpoints; the second covered 26 years of follow-up with hospital verified endpoints; and the third combined the two follow-up periods, totaling 38 years. Mandibular trabeculation was the main independent variable predicting incident fractures, with age, physical activity, alcohol consumption and body mass index as covariates. The Kaplan-Meier curve indicated a graded association between trabecular density and fracture risk. During the whole period covered, the hazard ratio of future fracture for sparse trabeculation compared to mixed trabeculation was 2.9 (95% CI: 2.2-3.8, p<0.0001), and for dense versus mixed trabeculation was 0.21 (95% CI: 0.1-0.4, p<0.0001). The trabecular pattern was a highly significant predictor of future fracture risk. Our findings imply that dentists, using ordinary dental radiographs, can identify women at high risk for future fractures at 38-54 years of age, often long before the first fracture occurs.
NASA Astrophysics Data System (ADS)
Noh, Hae Young; Kiremidjian, Anne S.
2011-04-01
This paper introduces a data compression method using the K-SVD algorithm and its application to experimental ambient vibration data for structural health monitoring purposes. Because many damage diagnosis algorithms that use system identification require vibration measurements of multiple locations, it is necessary to transmit long threads of data. In wireless sensor networks for structural health monitoring, however, data transmission is often a major source of battery consumption. Therefore, reducing the amount of data to transmit can significantly lengthen the battery life and reduce maintenance cost. The K-SVD algorithm was originally developed in information theory for sparse signal representation. This algorithm creates an optimal over-complete set of bases, referred to as a dictionary, using singular value decomposition (SVD) and represents the data as sparse linear combinations of these bases using the orthogonal matching pursuit (OMP) algorithm. Since ambient vibration data are stationary, we can segment them and represent each segment sparsely. Then only the dictionary and the sparse vectors of the coefficients need to be transmitted wirelessly for restoration of the original data. We applied this method to ambient vibration data measured from a four-story steel moment resisting frame. The results show that the method can compress the data efficiently and restore the data with very little error.
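The transmit-only-codes idea can be tried with off-the-shelf tools. The sketch below substitutes scikit-learn's dictionary learner for the paper's K-SVD and codes each segment with orthogonal matching pursuit; the synthetic signal, segment length, atom count, and sparsity level are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

rng = np.random.default_rng(0)
t = np.arange(20000) / 1000.0
signal = np.sin(2 * np.pi * 3.1 * t) + 0.3 * np.sin(2 * np.pi * 7.4 * t) \
         + 0.05 * rng.standard_normal(t.size)

seg_len = 100
segments = signal[: len(signal) // seg_len * seg_len].reshape(-1, seg_len)

learner = MiniBatchDictionaryLearning(
    n_components=200,                 # over-complete: 200 atoms for length-100 segments
    transform_algorithm="omp",        # sparse coding by orthogonal matching pursuit
    transform_n_nonzero_coefs=8,      # transmit only 8 coefficients per segment
    random_state=0,
)
codes = learner.fit(segments).transform(segments)
restored = codes @ learner.components_   # receiver side: dictionary + sparse codes

err = np.linalg.norm(restored - segments) / np.linalg.norm(segments)
print(f"relative restoration error: {err:.3e}")
print(f"nonzeros per segment: {np.count_nonzero(codes, axis=1).mean():.1f}")
```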
Hypothesis Testing for High-Dimensional Sparse Binary Regression
Mukherjee, Rajarshi; Pillai, Natesh S.; Lin, Xihong
2015-01-01
In this paper, we study the detection boundary for minimax hypothesis testing in the context of high-dimensional, sparse binary regression models. Motivated by genetic sequencing association studies for rare variant effects, we investigate the complexity of the hypothesis testing problem when the design matrix is sparse. We observe a new phenomenon in the behavior of the detection boundary which does not occur in the case of Gaussian linear regression. We derive the detection boundary as a function of two components: a design matrix sparsity index and signal strength, each of which is a function of the sparsity of the alternative. For any alternative, if the design matrix sparsity index is too high, any test is asymptotically powerless irrespective of the magnitude of signal strength. For binary design matrices whose sparsity index is not too high, our results are parallel to those in the Gaussian case. In this context, we derive detection boundaries for both dense and sparse regimes. For the dense regime, we show that the generalized likelihood ratio test is rate optimal; for the sparse regime, we propose an extended Higher Criticism Test and show it is rate optimal and sharp. We illustrate the finite sample properties of the theoretical results using simulation studies. PMID:26246645
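For readers unfamiliar with the Higher Criticism construction that the sparse-regime test extends, a minimal version of the classical statistic is sketched below; the paper's extended variant for sparse binary designs differs in its calibration, so this is only the baseline idea.

```python
import numpy as np

def higher_criticism(pvals, alpha0=0.5):
    """Classical HC: max standardized gap between empirical and uniform CDFs."""
    p = np.sort(np.asarray(pvals))
    n = p.size
    i = np.arange(1, n + 1)
    hc = np.sqrt(n) * (i / n - p) / np.sqrt(p * (1.0 - p) + 1e-12)
    return hc[: max(1, int(alpha0 * n))].max()   # restrict to smallest p-values

rng = np.random.default_rng(1)
null_p = rng.uniform(size=1000)                  # global null: uniform p-values
signal_p = null_p.copy()
signal_p[:20] = rng.uniform(0, 1e-4, size=20)    # a sparse set of strong signals
print(higher_criticism(null_p), higher_criticism(signal_p))
```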
The application of low-rank and sparse decomposition method in the field of climatology
NASA Astrophysics Data System (ADS)
Gupta, Nitika; Bhaskaran, Prasad K.
2018-04-01
The present study reports a low-rank and sparse decomposition method that separates the mean and the variability of a climate data field. Until now, the application of this technique was limited to areas such as image processing, web data ranking, and bioinformatics data analysis. In climate science, this method exactly separates the original data into a set of low-rank and sparse components, wherein the low-rank component depicts the linearly correlated dataset (expected or mean behavior), and the sparse component represents the variation or perturbation of the dataset from its mean behavior. The study attempts to verify the efficacy of this proposed technique in the field of climatology with two real-world examples. The first example applies the technique to the maximum wind-speed (MWS) data for the Indian Ocean (IO) region. The study brings to light a decadal reversal pattern in the MWS for the North Indian Ocean (NIO) during the months of June, July, and August (JJA). The second example deals with the sea surface temperature (SST) data for the Bay of Bengal region, which exhibits a distinct pattern in the sparse component. The study highlights the importance of the proposed technique for the interpretation and visualization of climate data.
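As a concrete illustration of the decomposition itself (not of the climate analysis), the sketch below uses a GoDec-style alternation between a rank truncation and a soft threshold; the rank, threshold, and toy data are assumptions for demonstration.

```python
import numpy as np

def lowrank_sparse(X, rank=1, thresh=0.8, iters=50):
    """Split X into a low-rank part L (mean behavior) and a sparse part S."""
    S = np.zeros_like(X)
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(X - S, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank]               # best rank-r fit
        R = X - L
        S = np.sign(R) * np.maximum(np.abs(R) - thresh, 0.0)   # soft threshold
    return L, S

rng = np.random.default_rng(2)
base = np.outer(np.sin(np.linspace(0, 6, 120)), rng.standard_normal(40))  # rank-1 "mean"
spikes = np.zeros_like(base)
spikes[rng.integers(0, 120, 30), rng.integers(0, 40, 30)] = 3.0           # "variability"
L, S = lowrank_sparse(base + spikes)
print(np.linalg.norm(L - base) / np.linalg.norm(base), np.count_nonzero(S))
```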
On the sparseness of 1-norm support vector machines.
Zhang, Li; Zhou, Weida
2010-04-01
Empirical evidence suggests that 1-norm Support Vector Machines (1-norm SVMs) have good sparseness; however, how sparse 1-norm SVMs can actually be, and whether their representation is sparser than that of standard SVMs, has remained unclear. In this paper we analyze the sparseness of 1-norm SVMs. Two upper bounds on the number of nonzero coefficients in the decision function of 1-norm SVMs are presented. First, the number of nonzero coefficients in 1-norm SVMs is at most the number of exact support vectors lying on the +1 and -1 discriminating surfaces, while that in standard SVMs is equal to the number of support vectors, which implies that 1-norm SVMs have better sparseness than standard SVMs. Second, the number of nonzero coefficients is at most the rank of the sample matrix. A brief review of the geometry of linear programming and the primal steepest edge pricing simplex method is given, which allows us to prove the two upper bounds and evaluate their tightness by experiments. Experimental results on toy data sets and the UCI data sets illustrate our analysis.
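The sparseness comparison is easy to probe empirically. The sketch below uses scikit-learn's l1-penalized linear SVM as a stand-in for a 1-norm SVM and counts nonzero primal weights against a standard SVM's support vectors; note the paper's bounds concern coefficients in the decision function, so this is only a loose illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC, SVC

X, y = make_classification(n_samples=300, n_features=40, n_informative=5,
                           random_state=0)

l1_svm = LinearSVC(penalty="l1", loss="squared_hinge", dual=False, C=0.1).fit(X, y)
std_svm = SVC(kernel="linear", C=0.1).fit(X, y)

print("nonzero weights (1-norm SVM):", np.count_nonzero(l1_svm.coef_))
print("support vectors (standard SVM):", std_svm.n_support_.sum())
```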
Orthogonal Procrustes Analysis for Dictionary Learning in Sparse Linear Representation.
Grossi, Giuliano; Lanzarotti, Raffaella; Lin, Jianyi
2017-01-01
In the sparse representation model, the design of overcomplete dictionaries plays a key role for the effectiveness and applicability in different domains. Recent research has produced several dictionary learning approaches, being proven that dictionaries learnt by data examples significantly outperform structured ones, e.g. wavelet transforms. In this context, learning consists in adapting the dictionary atoms to a set of training signals in order to promote a sparse representation that minimizes the reconstruction error. Finding the best fitting dictionary remains a very difficult task, leaving the question still open. A well-established heuristic method for tackling this problem is an iterative alternating scheme, adopted for instance in the well-known K-SVD algorithm. Essentially, it consists in repeating two stages; the former promotes sparse coding of the training set and the latter adapts the dictionary to reduce the error. In this paper we present R-SVD, a new method that, while maintaining the alternating scheme, adopts the Orthogonal Procrustes analysis to update the dictionary atoms suitably arranged into groups. Comparative experiments on synthetic data prove the effectiveness of R-SVD with respect to well known dictionary learning algorithms such as K-SVD, ILS-DLA and the online method OSDL. Moreover, experiments on natural data such as ECG compression, EEG sparse representation, and image modeling confirm R-SVD's robustness and wide applicability.
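The building block R-SVD adds to the alternating scheme is the closed-form solution of the orthogonal Procrustes problem, min over orthogonal Q of ||A - QB||_F, obtained from one SVD. The sketch below shows only this solve on synthetic data; how R-SVD groups and rotates dictionary atoms is the paper's contribution and is not reproduced here.

```python
import numpy as np

def procrustes_rotation(A, B):
    """Return the orthogonal Q minimizing ||A - Q @ B||_F."""
    U, _, Vt = np.linalg.svd(A @ B.T)
    return U @ Vt

rng = np.random.default_rng(3)
B = rng.standard_normal((5, 20))                 # e.g. a group of dictionary atoms
Q_true, _ = np.linalg.qr(rng.standard_normal((5, 5)))
A = Q_true @ B                                   # rotated copy of the group
Q = procrustes_rotation(A, B)
print(np.allclose(Q, Q_true), np.linalg.norm(A - Q @ B))
```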
Multilevel sparse functional principal component analysis.
Di, Chongzhi; Crainiceanu, Ciprian M; Jank, Wolfgang S
2014-01-29
We consider analysis of sparsely sampled multilevel functional data, where the basic observational unit is a function and data have a natural hierarchy of basic units. An example is when functions are recorded at multiple visits for each subject. Multilevel functional principal component analysis (MFPCA; Di et al. 2009) was proposed for such data when functions are densely recorded. Here we consider the case when functions are sparsely sampled and may contain only a few observations per function. We exploit the multilevel structure of covariance operators and achieve data reduction by principal component decompositions at both the between- and within-subject levels. We address inherent methodological differences in the sparse sampling context to: 1) estimate the covariance operators; 2) estimate the functional principal component scores; 3) predict the underlying curves. Through simulations, the proposed method is shown to discover the dominating modes of variation and to reconstruct underlying curves well even in sparse settings. Our approach is illustrated by two applications, the Sleep Heart Health Study and eBay auctions.
Lee, Ching-Pei; Lin, Chih-Jen
2014-04-01
Linear rankSVM is one of the widely used methods for learning to rank. Although its performance may be inferior to nonlinear methods such as kernel rankSVM and gradient boosting decision trees, linear rankSVM is useful for quickly producing a baseline model. Furthermore, following its recent development for classification, linear rankSVM may give competitive performance for large and sparse data. A great deal of work has studied linear rankSVM, with a focus on computational efficiency when the number of preference pairs is large. In this letter, we systematically study existing works, discuss their advantages and disadvantages, and propose an efficient algorithm. We discuss different implementation issues and extensions with detailed experiments. Finally, we develop a robust linear rankSVM tool for public use.
FPGA implementation of sparse matrix algorithm for information retrieval
NASA Astrophysics Data System (ADS)
Bojanic, Slobodan; Jevtic, Ruzica; Nieto-Taladriz, Octavio
2005-06-01
Text data retrieval requires a tremendous amount of processing time because of the size of the data and the complexity of information retrieval algorithms. In this paper, a solution to this problem is proposed via hardware-supported information retrieval algorithms. Reconfigurable computing accommodates frequent hardware modifications through its tailorable hardware and exploits parallelism for a given application through reconfigurable and flexible hardware units; the degree of parallelism can be tuned to the data. In this work we implemented the standard BLAS (basic linear algebra subprogram) sparse matrix format named Compressed Sparse Row (CSR), which is shown to be more efficient in terms of storage requirements and query-processing time than other sparse matrix formats for information retrieval. Although the inverted index has been treated as the de facto standard for information retrieval for years, an alternative approach that stores the index of a text collection in a sparse matrix structure has gained attention. This approach performs query processing using sparse matrix-vector multiplication and, owing to parallelization, achieves substantial efficiency gains over the sequential inverted index. The parallel implementations of the information retrieval kernel presented in this work target the Virtex II Field Programmable Gate Array (FPGA) board from Xilinx; using FPGAs to achieve high performance is a recent development in scientific applications. Computational results are compared to implementations on other platforms. The design achieves a high level of parallelism for the overall function while retaining highly optimised hardware within the processing unit.
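The retrieval kernel being accelerated is just a sparse matrix-vector product over a term-document index. A tiny software rendition with SciPy's CSR format is below; the vocabulary, documents, and binary scoring are toy assumptions.

```python
import numpy as np
import scipy.sparse as sp

vocab = {"sparse": 0, "matrix": 1, "retrieval": 2, "fpga": 3, "kernel": 4}
docs = ["sparse matrix retrieval", "fpga kernel", "sparse fpga matrix kernel"]

rows, cols = [], []
for d, text in enumerate(docs):
    for term in text.split():
        rows.append(d)
        cols.append(vocab[term])
index = sp.csr_matrix((np.ones(len(rows)), (rows, cols)),
                      shape=(len(docs), len(vocab)))   # term-document index in CSR

query = np.zeros(len(vocab))
for term in "sparse fpga".split():
    query[vocab[term]] = 1.0

print(index @ query)   # one SpMV -> a score per document: [1. 1. 2.]
```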
Sparse matrix-vector multiplication on network-on-chip
NASA Astrophysics Data System (ADS)
Sun, C.-C.; Götze, J.; Jheng, H.-Y.; Ruan, S.-J.
2010-12-01
In this paper, we present an idea for performing matrix-vector multiplication using a Network-on-Chip (NoC) architecture. In traditional IC design, on-chip communications have been implemented with dedicated point-to-point interconnections, so regular local data transfer is the central concept of many parallel implementations. However, in the parallel implementation of sparse matrix-vector multiplication (SMVM), which is the main step of all iterative algorithms for solving systems of linear equations, the required data transfers depend on the sparsity structure of the matrix and can be extremely irregular. Using the NoC architecture makes it possible to deal with arbitrary structure of the data transfers, i.e. with the irregular structure of the sparse matrices. So far, we have implemented the proposed SMVM-NoC architecture with sizes 4×4 and 5×5 in IEEE 754 single-precision floating point using FPGA.
A Shifted Block Lanczos Algorithm 1: The Block Recurrence
NASA Technical Reports Server (NTRS)
Grimes, Roger G.; Lewis, John G.; Simon, Horst D.
1990-01-01
In this paper we describe a block Lanczos algorithm that is used as the key building block of a software package for the extraction of eigenvalues and eigenvectors of large sparse symmetric generalized eigenproblems. The software package comprises a version of the block Lanczos algorithm specialized for spectrally transformed eigenproblems, an adaptive strategy for choosing shifts, and efficient codes for factoring large sparse symmetric indefinite matrices. This paper describes the algorithmic details of our block Lanczos recurrence, which uses a novel combination of block generalizations of several features that had previously been investigated only independently. In particular, new forms of partial reorthogonalization, selective reorthogonalization and local reorthogonalization are used, as is a new algorithm for obtaining the M-orthogonal factorization of a matrix. The heuristic shifting strategy, the integration with sparse linear equation solvers and numerical experience with the code are described in a companion paper.
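The spectral transformation the package is built around is available off the shelf in SciPy, whose eigsh routine factors (A - sigma M) once and runs a Lanczos recurrence on the shift-inverted operator. A toy example on a standard tridiagonal eigenproblem (not the authors' block code):

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

n = 2000
A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csc")
M = sp.identity(n, format="csc")

sigma = 0.5                          # find eigenvalues nearest this shift
vals, vecs = eigsh(A, k=6, M=M, sigma=sigma, which="LM")  # shift-invert Lanczos
print(np.sort(vals))
```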
A Fast Gradient Method for Nonnegative Sparse Regression With Self-Dictionary
NASA Astrophysics Data System (ADS)
Gillis, Nicolas; Luce, Robert
2018-01-01
A nonnegative matrix factorization (NMF) can be computed efficiently under the separability assumption, which asserts that all the columns of the given input data matrix belong to the cone generated by a (small) subset of them. The provably most robust methods to identify these conic basis columns are based on nonnegative sparse regression and self dictionaries, and require the solution of large-scale convex optimization problems. In this paper we study a particular nonnegative sparse regression model with self dictionary. As opposed to previously proposed models, this model yields a smooth optimization problem where the sparsity is enforced through linear constraints. We show that the Euclidean projection on the polyhedron defined by these constraints can be computed efficiently, and propose a fast gradient method to solve our model. We compare our algorithm with several state-of-the-art methods on synthetic data sets and real-world hyperspectral images.
DOLPHIn—Dictionary Learning for Phase Retrieval
NASA Astrophysics Data System (ADS)
Tillmann, Andreas M.; Eldar, Yonina C.; Mairal, Julien
2016-12-01
We propose a new algorithm to learn a dictionary for reconstructing and sparsely encoding signals from measurements without phase. Specifically, we consider the task of estimating a two-dimensional image from squared-magnitude measurements of a complex-valued linear transformation of the original image. Several recent phase retrieval algorithms exploit underlying sparsity of the unknown signal in order to improve recovery performance. In this work, we consider such a sparse signal prior in the context of phase retrieval, when the sparsifying dictionary is not known in advance. Our algorithm jointly reconstructs the unknown signal - possibly corrupted by noise - and learns a dictionary such that each patch of the estimated image can be sparsely represented. Numerical experiments demonstrate that our approach can obtain significantly better reconstructions for phase retrieval problems with noise than methods that cannot exploit such "hidden" sparsity. Moreover, on the theoretical side, we provide a convergence result for our method.
Zhao, Tuo; Liu, Han
2016-01-01
We propose an accelerated path-following iterative shrinkage thresholding algorithm (APISTA) for solving high-dimensional sparse nonconvex learning problems. The main difference between APISTA and the path-following iterative shrinkage thresholding algorithm (PISTA) is that APISTA exploits an additional coordinate descent subroutine to boost computational performance. Such a modification, though simple, has a profound impact: APISTA not only enjoys the same theoretical guarantee as PISTA, i.e., it attains a linear rate of convergence to a unique sparse local optimum with good statistical properties, but also significantly outperforms PISTA in empirical benchmarks. As an application, we apply APISTA to solve a family of nonconvex optimization problems motivated by estimating sparse semiparametric graphical models. APISTA allows us to obtain new statistical recovery results that do not exist in the existing literature. Thorough numerical results are provided to back up our theory. PMID:28133430
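For orientation, the non-accelerated convex ancestor of PISTA/APISTA is plain ISTA for l1-regularized least squares, sketched below; path-following, nonconvex penalties, and the coordinate descent subroutine are what the paper adds on top.

```python
import numpy as np

def ista(A, b, lam, iters=500):
    L = np.linalg.norm(A, 2) ** 2                  # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        z = x - A.T @ (A @ x - b) / L              # gradient step on 0.5||Ax-b||^2
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return x

rng = np.random.default_rng(4)
A = rng.standard_normal((100, 400))
x_true = np.zeros(400)
x_true[rng.choice(400, 8, replace=False)] = 3.0
b = A @ x_true + 0.01 * rng.standard_normal(100)
x_hat = ista(A, b, lam=0.5)
print(np.count_nonzero(np.abs(x_hat) > 1e-6))      # recovered support size
```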
Belilovsky, Eugene; Gkirtzou, Katerina; Misyrlis, Michail; Konova, Anna B; Honorio, Jean; Alia-Klein, Nelly; Goldstein, Rita Z; Samaras, Dimitris; Blaschko, Matthew B
2015-12-01
We explore various sparse regularization techniques for analyzing fMRI data, such as the ℓ1 norm (often called LASSO in the context of a squared loss function), elastic net, and the recently introduced k-support norm. Employing sparsity regularization allows us to handle the curse of dimensionality, a problem commonly found in fMRI analysis. In this work we consider sparse regularization in both the regression and classification settings. We perform experiments on fMRI scans from cocaine-addicted as well as healthy control subjects. We show that in many cases, use of the k-support norm leads to better predictive performance, solution stability, and interpretability as compared to other standard approaches. We additionally analyze the advantages of using the absolute loss function versus the standard squared loss, finding that the absolute loss leads to significantly better predictive performance for the regularization methods tested in almost all cases. Our results support the use of the k-support norm for fMRI analysis and, on the clinical side, the generalizability of the I-RISA model of cocaine addiction.
Multiple Sparse Representations Classification
Plenge, Esben; Klein, Stefan S.; Niessen, Wiro J.; Meijering, Erik
2015-01-01
Sparse representations classification (SRC) is a powerful technique for pixelwise classification of images and is increasingly being used for a wide variety of image analysis tasks. The method uses sparse representation and learned redundant dictionaries to classify image pixels. In this empirical study we propose to further leverage the redundancy of the learned dictionaries to achieve a more accurate classifier. In conventional SRC, each image pixel is associated with a small patch surrounding it. Using these patches, a dictionary is trained for each class in a supervised fashion. Commonly, redundant/overcomplete dictionaries are trained and image patches are sparsely represented by a linear combination of only a few of the dictionary elements. Given a set of trained dictionaries, a new patch is sparse coded using each of them, and subsequently assigned to the class whose dictionary yields the minimum residual energy. We propose a generalization of this scheme. The method, which we call multiple sparse representations classification (mSRC), is based on the observation that an overcomplete, class-specific dictionary is capable of generating multiple accurate and independent estimates of a patch belonging to the class. Instead of finding a single sparse representation of a patch for each dictionary, we therefore find multiple representations, and the corresponding residual energies provide an enhanced statistic that is used to improve classification. We demonstrate the efficacy of mSRC for three example applications: pixelwise classification of texture images, lumen segmentation in carotid artery magnetic resonance imaging (MRI), and bifurcation point detection in carotid artery MRI. We compare our method with conventional SRC, K-nearest neighbor, and support vector machine classifiers. The results show that mSRC outperforms SRC and the other reference methods. In addition, we present an extensive evaluation of the effect of the main mSRC parameters: patch size, dictionary size, and sparsity level. PMID:26177106
Scalable domain decomposition solvers for stochastic PDEs in high performance computing
Desai, Ajit; Khalil, Mohammad; Pettit, Chris; ...
2017-09-21
Stochastic spectral finite element models of practical engineering systems may involve solutions of linear systems, or linearized systems for non-linear problems, with billions of unknowns. For stochastic modeling, it is therefore essential to design robust, parallel and scalable algorithms that can efficiently utilize high-performance computing to tackle such large-scale systems. Domain decomposition based iterative solvers can handle such systems; although these algorithms exhibit excellent scalability, significant algorithmic and implementational challenges remain in extending them to solve extreme-scale stochastic systems on emerging computing platforms. Intrusive polynomial chaos expansion based domain decomposition algorithms are extended here to concurrently handle high resolution in both the spatial and stochastic domains using an in-house implementation. Sparse iterative solvers with efficient preconditioners are employed to solve the resulting global and subdomain-level local systems through multi-level iterative solvers. We also use parallel sparse matrix-vector operations to reduce the floating-point operations and memory requirements. Numerical and parallel scalabilities of these algorithms are presented for the diffusion equation with a spatially varying diffusion coefficient modeled by a non-Gaussian stochastic process. Scalability of the solvers with respect to the number of random variables is also investigated.
Broadband implementation of coprime linear microphone arrays for direction of arrival estimation.
Bush, Dane; Xiang, Ning
2015-07-01
Coprime arrays represent a form of sparse sensing that can achieve narrow beams using relatively few elements, exceeding the spatial Nyquist sampling limit. The purpose of this paper is to expand on and experimentally validate coprime array theory in an acoustic implementation. Two nested sparse uniform linear subarrays with coprime numbers of elements (M and N) each produce grating lobes that overlap with one another completely in just one direction. When the subarray outputs are combined, it is possible to retain the shared beam while mostly canceling the other, superfluous grating lobes. In this way a small number of microphones (N+M-1) creates a narrow beam at higher frequencies, comparable to a densely populated uniform linear array of MN microphones. In this work beampatterns are simulated for a range of single frequencies as well as bands of frequencies. Narrowband experimental beampatterns are shown to correspond with simulated results even at frequencies other than the array's design frequency, and narrowband side lobe locations are shown to correspond to the theoretical values. Side lobes in the directional pattern are mitigated by increasing the bandwidth of the analyzed signals. Direction of arrival estimation is also implemented for two simultaneous noise sources in a free-field condition.
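The cancellation argument is easy to visualize numerically: simulate each subarray's beampattern and multiply them, so only the shared mainlobe survives. Element counts, spacing, and the product-processing combination below are illustrative assumptions.

```python
import numpy as np

M, N = 4, 5                        # coprime element counts
d = 0.5                            # base spacing in wavelengths
theta = np.linspace(-90, 90, 2001)
u = np.sin(np.deg2rad(theta))

def beampattern(positions, u):
    steering = np.exp(2j * np.pi * positions[:, None] * u[None, :])
    return np.abs(steering.sum(axis=0)) / positions.size

pos1 = N * d * np.arange(M)        # M elements at spacing N*d -> grating lobes
pos2 = M * d * np.arange(N)        # N elements at spacing M*d -> grating lobes
combined = beampattern(pos1, u) * beampattern(pos2, u)   # shared lobe survives
print("peak at", theta[np.argmax(combined)], "deg")
```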
Low-count PET image restoration using sparse representation
NASA Astrophysics Data System (ADS)
Li, Tao; Jiang, Changhui; Gao, Juan; Yang, Yongfeng; Liang, Dong; Liu, Xin; Zheng, Hairong; Hu, Zhanli
2018-04-01
In the field of positron emission tomography (PET), reconstructed images are often blurry and contain noise. These problems are primarily caused by the low resolution of the projection data. Improving the hardware is an expensive solution, and we therefore attempted to develop a solution based on optimizing several related algorithms in both the reconstruction and image post-processing domains. As sparse representation techniques have become widespread, they are increasingly applied to this problem. In this paper, we propose a new sparse method to process low-resolution PET images. Two dictionaries (D1 for low-resolution PET images and D2 for high-resolution PET images) are learned from a group of real PET image data sets. Of these two dictionaries, D1 is used to obtain a sparse representation for each patch of the input PET image, and a high-resolution PET image is then generated from this sparse representation using D2. Experimental results indicate that the proposed method exhibits a stable and superior ability to enhance image resolution and recover image details; quantitatively, it achieves better performance than traditional methods. The proposed strategy is a new and efficient approach for improving the quality of PET images.
Scalability improvements to NRLMOL for DFT calculations of large molecules
NASA Astrophysics Data System (ADS)
Diaz, Carlos Manuel
Advances in high performance computing (HPC) have provided a way to treat large, computationally demanding tasks using thousands of processors. With the development of more powerful HPC architectures, the need to create efficient and scalable code has grown more important. Electronic structure calculations are valuable in understanding experimental observations and are routinely used for new materials predictions. For electronic structure calculations, the memory and computation time grow with the number of atoms; memory requirements scale as N², where N is the number of atoms. While recent advances in HPC offer platforms with large numbers of cores, the limited amount of memory available on a given node and the poor scalability of electronic structure codes hinder efficient usage of these platforms. This thesis presents developments to overcome these bottlenecks in order to study large systems. These developments, which are implemented in the NRLMOL electronic structure code, involve the use of sparse matrix storage formats and of linear algebra with sparse and distributed matrices. These developments, along with other related work, now allow ground state density functional calculations using up to 25,000 basis functions and excited state calculations using up to 17,000 basis functions while utilizing all cores on a node. An example on a light-harvesting triad molecule is described. Finally, future plans to further improve the scalability are presented.
Jang, In Sock; Dienstmann, Rodrigo; Margolin, Adam A; Guinney, Justin
2015-01-01
Complex mechanisms involving genomic aberrations in numerous proteins and pathways are believed to be a key cause of many diseases such as cancer. With recent advances in genomics, elucidating the molecular basis of cancer at a patient level is now feasible, and has led to personalized treatment strategies whereby a patient is treated according to his or her genomic profile. However, there is growing recognition that existing treatment modalities are overly simplistic, and do not fully account for the deep genomic complexity associated with sensitivity or resistance to cancer therapies. To overcome these limitations, large-scale pharmacogenomic screens of cancer cell lines--in conjunction with modern statistical learning approaches--have been used to explore the genetic underpinnings of drug response. While these analyses have demonstrated the ability to infer genetic predictors of compound sensitivity, to date most modeling approaches have been data-driven, i.e., they do not explicitly incorporate domain-specific knowledge (priors) in the process of learning a model. While a purely data-driven approach offers an unbiased perspective of the data--and may yield unexpected or novel insights--this strategy introduces challenges for both model interpretability and accuracy. In this study, we propose a novel prior-incorporated sparse regression model in which the choice of informative predictor sets is carried out by knowledge-driven priors (gene sets) in a stepwise fashion. Under regularization in a linear regression model, our algorithm is able to incorporate prior biological knowledge across the predictive variables, thereby improving the interpretability of the final model with no loss--and often an improvement--in predictive performance. We evaluate the performance of our algorithm compared to well-known regularization methods such as LASSO, Ridge and Elastic net regression in the Cancer Cell Line Encyclopedia (CCLE) and Genomics of Drug Sensitivity in Cancer (Sanger) pharmacogenomics datasets, demonstrating that incorporation of the biological priors selected by our model confers improved predictability and interpretability, despite using far fewer predictors, over existing state-of-the-art methods.
Impact of view reduction in CT on radiation dose for patients
NASA Astrophysics Data System (ADS)
Parcero, E.; Flores, L.; Sánchez, M. G.; Vidal, V.; Verdú, G.
2017-08-01
Iterative methods have become a hot research topic in computed tomography (CT) imaging because of their capacity to solve the reconstruction problem from a limited number of projections, which allows radiation exposure to be reduced during data acquisition. Reconstruction time and the high radiation dose imposed on patients are the two major drawbacks of CT. To address them effectively, we adapted the method for sparse linear equations and sparse least squares (LSQR), combined with soft threshold filtering (STF) and the fast iterative shrinkage-thresholding algorithm (FISTA), to computed tomography reconstruction. The feasibility of the proposed methods is demonstrated numerically.
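A compact way to see the two ingredients interact is to alternate a partial LSQR solve with a soft-threshold filter on a toy underdetermined system; the operator, threshold, and iteration counts below are illustrative assumptions rather than the paper's CT setup.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import lsqr

rng = np.random.default_rng(5)
m, n = 96, 256                                   # "few views" regime: m < n
x_true = np.zeros(n)
x_true[rng.choice(n, 10, replace=False)] = 1.0
A = sp.random(m, n, density=0.05, random_state=5, format="csr",
              data_rvs=rng.standard_normal)
b = A @ x_true

x = np.zeros(n)
tau = 0.1
for _ in range(20):
    x = x + lsqr(A, b - A @ x, iter_lim=5)[0]            # partial LSQR step
    x = np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)    # soft threshold (STF)
print(np.linalg.norm(x - x_true) / np.linalg.norm(x_true))
```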
An algebraic equation solution process formulated in anticipation of banded linear equations.
DOT National Transportation Integrated Search
1971-01-01
A general method for the solution of large, sparsely banded, positive-definite, coefficient matrices is presented. The goal in developing the method was to produce an efficient and reliable solution process and to provide the user-programmer with a p...
A comparison of SuperLU solvers on the intel MIC architecture
NASA Astrophysics Data System (ADS)
Tuncel, Mehmet; Duran, Ahmet; Celebi, M. Serdar; Akaydin, Bora; Topkaya, Figen O.
2016-10-01
In many science and engineering applications, problems may reduce to solving a sparse linear system AX=B. For example, SuperLU_MCDT, a linear solver, was used for the large penta-diagonal matrices of 2D problems and hepta-diagonal matrices of 3D problems arising from incompressible blood flow simulation (see [1]). It is important to test the status and potential improvements of state-of-the-art solvers on new technologies. In this work, sequential, multithreaded and distributed versions of SuperLU solvers (see [2]) are examined on the Intel Xeon Phi coprocessors using the offload programming model at the EURORA cluster of CINECA in Italy. We consider a portfolio of test matrices containing patterned matrices from UFMM ([3]) and randomly located matrices. This architecture can benefit from high parallelism and large vectors. We find that sequential SuperLU gained up to a 45% performance improvement from offload programming, depending on the sparse matrix type and the size of the transferred and processed data.
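SciPy wraps sequential SuperLU, so the factor-and-solve pattern benchmarked in the paper can be reproduced in a few lines; the penta-diagonal test matrix below is a toy stand-in for the systems mentioned in the abstract.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu

n = 10000
A = sp.diags([1.0, -2.0, 7.0, -2.0, 1.0], [-2, -1, 0, 1, 2],
             shape=(n, n), format="csc")   # penta-diagonal, diagonally dominant
b = np.ones(n)

lu = splu(A)                               # sparse LU factorization via SuperLU
x = lu.solve(b)
print(np.linalg.norm(A @ x - b))           # residual of the solve
```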
Multistatic Array Sampling Scheme for Fast Near-Field Image Reconstruction
2016-01-01
reconstruction. The array topology samples the scene on a regular grid of phase centers, using a tiling of Boundary Arrays (BAs). Following a simple correction...hardware. Fig. 1 depicts the multistatic array topology. As seen, the topology is a tiled arrangement of Boundary Arrays (BAs). The BA is a well-known...sparse array layout comprised of two linear transmit arrays, and two linear receive arrays [6]. A slightly different tiled arrangement of BAs was used
NASA Technical Reports Server (NTRS)
Park, K. C.; Belvin, W. Keith
1990-01-01
A general form for the first-order representation of the continuous second-order linear structural-dynamics equations is introduced to derive a corresponding form of first-order continuous Kalman filtering equations. Time integration of the resulting equations is carried out via a set of linear multistep integration formulas. It is shown that a judicious combined selection of computational paths and the undetermined matrices introduced in the general form of the first-order linear structural systems leads to a class of second-order discrete Kalman filtering equations involving only symmetric sparse N x N solution matrices.
An Improved Sparse Representation over Learned Dictionary Method for Seizure Detection.
Li, Junhui; Zhou, Weidong; Yuan, Shasha; Zhang, Yanli; Li, Chengcheng; Wu, Qi
2016-02-01
Automatic seizure detection has played an important role in the monitoring, diagnosis and treatment of epilepsy. In this paper, a patient-specific method is proposed for seizure detection in long-term intracranial electroencephalogram (EEG) recordings. The method is based on sparse representation with online dictionary learning and an elastic net constraint. The online learned dictionary can sparsely represent the testing samples more accurately, and the elastic net constraint, which combines the l1-norm and l2-norm, not only makes the coefficients sparse but also avoids the over-fitting problem. First, the EEG signals are preprocessed using wavelet filtering and differential filtering, and a kernel function is applied to make the samples closer to linearly separable. Then the dictionaries of seizure and nonseizure are respectively learned from the original ictal and interictal training samples with an online dictionary optimization algorithm and combined into the training dictionary. After that, the test samples are sparsely coded over the learned dictionary and the residuals associated with the ictal and interictal sub-dictionaries are calculated. Eventually, the test samples are classified into two distinct categories, seizure or nonseizure, by comparing the reconstruction residuals. An average segment-based sensitivity of 95.45%, specificity of 99.08%, and event-based sensitivity of 94.44%, with a false detection rate of 0.23/h and an average latency of -5.14 s, have been achieved with our proposed method.
Monitoring Marine Weather Systems Using Quikscat and TRMM Data
NASA Technical Reports Server (NTRS)
Liu, W.; Tang, W.; Datta, A.; Hsu, C.
1999-01-01
We neither understand nor are able to predict marine storms, particularly tropical cyclones, sufficiently well, because ground-based measurements are sparse and operational numerical weather prediction models have neither sufficient spatial resolution nor accurate parameterization of the physics.
Sparse orthogonal population representation of spatial context in the retrosplenial cortex.
Mao, Dun; Kandler, Steffen; McNaughton, Bruce L; Bonin, Vincent
2017-08-15
Sparse orthogonal coding is a key feature of hippocampal neural activity, which is believed to increase episodic memory capacity and to assist in navigation. Some retrosplenial cortex (RSC) neurons convey distributed spatial and navigational signals, but place-field representations such as those observed in the hippocampus have not been reported. Combining cellular Ca2+ imaging in the RSC of mice with a head-fixed locomotion assay, we identified a population of RSC neurons, located predominantly in superficial layers, whose ensemble activity closely resembles that of hippocampal CA1 place cells during the same task. Like CA1 place cells, these RSC neurons fire in sequences during movement, and show narrowly tuned firing fields that form a sparse, orthogonal code correlated with location. RSC 'place' cell activity is robust to environmental manipulations, showing partial remapping similar to that observed in CA1. This population code for spatial context may assist the RSC in its role in memory and/or navigation. Neurons in the retrosplenial cortex (RSC) encode spatial and navigational signals; here the authors use calcium imaging to show that, similar to the hippocampus, RSC neurons also exhibit place cell-like activity in a sparse orthogonal representation, partially anchored to the allocentric cues on the linear track.
A coarse-to-fine approach for medical hyperspectral image classification with sparse representation
NASA Astrophysics Data System (ADS)
Chang, Lan; Zhang, Mengmeng; Li, Wei
2017-10-01
A coarse-to-fine approach with sparse representation is proposed for medical hyperspectral image classification in this work. A segmentation technique with different scales is employed to exploit the edges of the input image, where coarse super-pixel patches provide global classification information while fine ones provide further detail. Unlike common RGB images, a hyperspectral image has many bands, allowing the cluster centers to be adjusted with higher precision. After segmentation, each super-pixel is classified by the recently developed sparse representation-based classification (SRC), which assigns a label to the testing samples in one local patch by means of a sparse linear combination of all the training samples. Furthermore, segmentation with multiple scales is employed because a single scale is not suitable for the complicated distribution of medical hyperspectral imagery. Finally, the classification results for different super-pixel sizes are fused by a fusion strategy, offering at least two benefits: (1) the final result is clearly superior to that of segmentation with a single scale, and (2) the fusion process significantly simplifies the choice of scales. Experimental results using real medical hyperspectral images demonstrate that the proposed method outperforms the state-of-the-art SRC.
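The SRC step referred to above reduces to coding a test sample over each class's training samples and picking the class with the smallest reconstruction residual. A bare-bones rendition with orthogonal matching pursuit as the sparse coder is below; the two-class toy data and sparsity level are assumptions.

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

rng = np.random.default_rng(9)
class_dicts = {
    0: rng.standard_normal((30, 40)) + 1.0,   # columns = class-0 training samples
    1: rng.standard_normal((30, 40)) - 1.0,   # columns = class-1 training samples
}

def src_predict(x, dicts, n_nonzero=5):
    residuals = {}
    for label, D in dicts.items():
        omp = OrthogonalMatchingPursuit(n_nonzero_coefs=n_nonzero,
                                        fit_intercept=False).fit(D, x)
        residuals[label] = np.linalg.norm(x - D @ omp.coef_)
    return min(residuals, key=residuals.get)  # smallest residual wins

x_test = rng.standard_normal(30) + 1.0        # drawn near class 0
print(src_predict(x_test, class_dicts))
```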
NASA Astrophysics Data System (ADS)
Kaporin, I. E.
2012-02-01
In order to precondition a sparse symmetric positive definite matrix, its approximate inverse is examined, which is represented as the product of two sparse mutually adjoint triangular matrices. In this way, the solution of the corresponding system of linear algebraic equations (SLAE) by applying the preconditioned conjugate gradient method (CGM) is reduced to performing only elementary vector operations and calculating sparse matrix-vector products. A method for constructing the above preconditioner is described and analyzed. The triangular factor has a fixed sparsity pattern and is optimal in the sense that the preconditioned matrix has a minimum K-condition number. The use of polynomial preconditioning based on Chebyshev polynomials makes it possible to considerably reduce the amount of scalar product operations (at the cost of an insignificant increase in the total number of arithmetic operations). The possibility of an efficient massively parallel implementation of the resulting method for solving SLAEs is discussed. For a sequential version of this method, the results obtained by solving 56 test problems from the Florida sparse matrix collection (which are large-scale and ill-conditioned) are presented. These results show that the method is highly reliable and has low computational costs.
CT Image Sequence Restoration Based on Sparse and Low-Rank Decomposition
Gou, Shuiping; Wang, Yueyue; Wang, Zhilong; Peng, Yong; Zhang, Xiaopeng; Jiao, Licheng; Wu, Jianshe
2013-01-01
Blurry organ boundaries and soft tissue structures present a major challenge in biomedical image restoration. In this paper, we propose a low-rank decomposition-based method for computed tomography (CT) image sequence restoration, where the CT image sequence is decomposed into a sparse component and a low-rank component. A new point spread function for the Wiener filter is employed to efficiently remove blur in the sparse component, and Wiener filtering with a Gaussian PSF is used to recover the average image of the low-rank component. We then obtain the recovered CT image sequence by combining the recovered low-rank image with the recovered sparse image sequence. Our method achieves restoration results with higher contrast, sharper organ boundaries and richer soft tissue structure information, compared with existing CT image restoration methods. The robustness of our method was assessed in numerical experiments using three different low-rank models: Robust Principal Component Analysis (RPCA), Linearized Alternating Direction Method with Adaptive Penalty (LADMAP) and Go Decomposition (GoDec). Experimental results demonstrated that the RPCA model was the most suitable for CT images with small noise, whereas the GoDec model was the best for CT images with large noise. PMID:24023764
Reconstruction of finite-valued sparse signals
NASA Astrophysics Data System (ADS)
Keiper, Sandra; Kutyniok, Gitta; Lee, Dae Gwan; Pfander, Götz
2017-08-01
The need to reconstruct discrete-valued sparse signals from few measurements, that is, to solve an underdetermined system of linear equations, appears frequently in science and engineering. Such signals appear, for example, in error correcting codes as well as massive Multiple-Input Multiple-Output (MIMO) channels and wideband spectrum sensing. A particular example is given by wireless communications, where the transmitted signals are sequences of bits, i.e., with entries in {0, 1}. Whereas classical compressed sensing algorithms do not incorporate the additional knowledge of the discrete nature of the signal, classical lattice decoding approaches do not utilize sparsity constraints. In this talk, we present an approach that incorporates a discrete-values prior into basis pursuit. In particular, we address finite-valued sparse signals, i.e., sparse signals with entries in a finite alphabet. We introduce an equivalent null space characterization and show that the phase transition takes place earlier than with the classical basis pursuit approach. We further discuss robustness of the algorithm and show that the nonnegative case is very different from the bipolar one. One of our findings is that the positioning of the zero in the alphabet, i.e., whether it is a boundary element or not, is crucial.
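For a nonnegative alphabet the basic idea can be written as a linear program: with x constrained to [0, 1]^n the l1 norm is simply sum(x), so basis pursuit with a box prior is solvable by any LP solver. This toy sketch (not the paper's exact formulation) recovers a {0, 1}-valued sparse signal and rounds the LP solution.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(6)
n, m, k = 120, 50, 10
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = 1.0
A = rng.standard_normal((m, n))
b = A @ x_true

# min sum(x)  s.t.  A x = b,  0 <= x <= 1   (box-constrained basis pursuit)
res = linprog(c=np.ones(n), A_eq=A, b_eq=b, bounds=[(0.0, 1.0)] * n,
              method="highs")
print("exact recovery:", np.array_equal(np.round(res.x), x_true))
```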
Parameter Estimation for a Turbulent Buoyant Jet Using Approximate Bayesian Computation
NASA Astrophysics Data System (ADS)
Christopher, Jason D.; Wimer, Nicholas T.; Hayden, Torrey R. S.; Lapointe, Caelan; Grooms, Ian; Rieker, Gregory B.; Hamlington, Peter E.
2016-11-01
Approximate Bayesian Computation (ABC) is a powerful tool that allows sparse experimental or other "truth" data to be used for the prediction of unknown model parameters in numerical simulations of real-world engineering systems. In this presentation, we introduce the ABC approach and then use ABC to predict unknown inflow conditions in simulations of a two-dimensional (2D) turbulent, high-temperature buoyant jet. For this test case, truth data are obtained from a simulation with known boundary conditions and problem parameters. Using spatially-sparse temperature statistics from the 2D buoyant jet truth simulation, we show that the ABC method provides accurate predictions of the true jet inflow temperature. The success of the ABC approach in the present test suggests that ABC is a useful and versatile tool for engineering fluid dynamics research.
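The mechanism is simple enough to state in a few lines of rejection sampling: propose a parameter from the prior, run the forward model, and accept if a summary statistic lands within a tolerance of the observed data. The Gaussian toy model below replaces the buoyant-jet simulation, which is far too costly for a sketch.

```python
import numpy as np

rng = np.random.default_rng(7)
theta_true = 2.5
truth = rng.normal(theta_true, 1.0, size=15)     # sparse "truth" measurements
s_obs = truth.mean()                             # summary statistic

accepted = []
for _ in range(20000):
    theta = rng.uniform(0.0, 5.0)                # draw from the prior
    sim = rng.normal(theta, 1.0, size=15)        # forward model
    if abs(sim.mean() - s_obs) < 0.1:            # ABC tolerance test
        accepted.append(theta)

print(f"posterior mean ~ {np.mean(accepted):.2f} (true {theta_true})")
```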
Sparse Zero-Sum Games as Stable Functional Feature Selection
Sokolovska, Nataliya; Teytaud, Olivier; Rizkalla, Salwa; Clément, Karine; Zucker, Jean-Daniel
2015-01-01
In large-scale systems biology applications, features are structured in hidden functional categories whose predictive power is identical. Feature selection, therefore, can lead not only to a problem with a reduced dimensionality, but also reveal some knowledge on functional classes of variables. In this contribution, we propose a framework based on a sparse zero-sum game which performs a stable functional feature selection. In particular, the approach is based on feature subsets ranking by a thresholding stochastic bandit. We provide a theoretical analysis of the introduced algorithm. We illustrate by experiments on both synthetic and real complex data that the proposed method is competitive from the predictive and stability viewpoints. PMID:26325268
Deep neural network-based domain adaptation for classification of remote sensing images
NASA Astrophysics Data System (ADS)
Ma, Li; Song, Jiazhen
2017-10-01
We investigate the effectiveness of deep neural networks for cross-domain classification of remote sensing images in this paper. In the network, class centroid alignment is utilized as a domain adaptation strategy, making the network able to transfer knowledge from the source domain to the target domain on a per-class basis. Since predicted labels of the target data must be used to estimate the centroid of each class, we use overall centroid alignment as a coarse domain adaptation method to improve the estimation accuracy. In addition, the rectified linear unit is used as the activation function to produce sparse features, which may improve the separation capability. The proposed network provides both aligned features and an adaptive classifier, and obtains label-free classification of target domain data. The experimental results using Hyperion, NCALM, and WorldView-2 remote sensing images demonstrate the effectiveness of the proposed approach.
Data mining for materials design: A computational study of single molecule magnet
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dam, Hieu Chi; Pham, Tien Lam
2014-01-28
We develop a method that combines data mining and first principles calculation to guide the design of distorted cubane Mn⁴⁺Mn³⁺₃ single molecule magnets. The essential idea of the method is a process consisting of sparse regressions and cross-validation for analyzing the calculated data of the materials. The method allows us to demonstrate that the exchange coupling between Mn⁴⁺ and Mn³⁺ ions can be predicted from the electronegativities of the constituent ligands and the structural features of the molecule by a linear regression model with high accuracy. The relations between the structural features and magnetic properties of the materials are quantitatively and consistently evaluated and presented by a graph. We also discuss the properties of the materials and guide the material design based on the obtained results.
A LANDSAT study of ephemeral and perennial rangeland vegetation and soils
NASA Technical Reports Server (NTRS)
Bentley, R. G., Jr. (Principal Investigator); Salmon-Drexler, B. C.; Bonner, W. J.; Vincent, R. K.
1976-01-01
The author has identified the following significant results. Several methods of computer processing were applied to LANDSAT data for mapping vegetation characteristics of perennial rangeland in Montana and ephemeral rangeland in Arizona. The choice of optimal processing technique depended on the prescribed mapping task and site conditions. Single-channel level slicing and ratioing of channels were used for simple enhancement. Predictive models for mapping percent vegetation cover, based on data from field spectra and LANDSAT data, were generated by multiple linear regression of six unique LANDSAT spectral ratios. Ratio gating logic and maximum likelihood classification were applied successfully to recognize plant communities in Montana. Maximum likelihood classification did little to improve recognition of terrain features when compared to a single-channel density slice in sparsely vegetated Arizona. LANDSAT was found to be more sensitive to differences between plant communities based on percentages of vigorous vegetation than to actual physical or spectral differences among plant species.
Computational Methods for Sparse Solution of Linear Inverse Problems
2009-03-01
One advantage of this approach is that the algorithms take advantage of fast matrix-vector multiplications. An implementation is available as pdco and SolveBP in the... M. A. Saunders, "PDCO: primal-dual interior-point method for convex objectives," Systems Optimization Laboratory, Stanford University, Tech. Rep.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liu, W; Sawant, A; Ruan, D
2016-06-15
Purpose: Surface photogrammetry (e.g. VisionRT, C-Rad) provides a noninvasive way to obtain high-frequency measurement for patient motion monitoring in radiotherapy. This work aims to develop a real-time surface reconstruction method on the acquired point clouds, whose acquisitions are subject to noise and missing measurements. In contrast to existing surface reconstruction methods that are usually computationally expensive, the proposed method reconstructs continuous surfaces with comparable accuracy in real-time. Methods: The key idea in our method is to solve and propagate a sparse linear relationship from the point cloud (measurement) manifold to the surface (reconstruction) manifold, taking advantage of the similarity in local geometric topology in both manifolds. With consistent point cloud acquisition, we propose a sparse regression (SR) model to directly approximate the target point cloud as a sparse linear combination from the training set, building the point correspondences by the iterative closest point (ICP) method. To accommodate changing noise levels and/or presence of inconsistent occlusions, we further propose a modified sparse regression (MSR) model to account for the large and sparse error built by ICP, with a Laplacian prior. We evaluated our method on both clinical acquired point clouds under consistent conditions and simulated point clouds with inconsistent occlusions. The reconstruction accuracy was evaluated w.r.t. root-mean-squared-error, by comparing the reconstructed surfaces against those from the variational reconstruction method. Results: On clinical point clouds, both the SR and MSR models achieved sub-millimeter accuracy, with mean reconstruction time reduced from 82.23 seconds to 0.52 seconds and 0.94 seconds, respectively. On simulated point cloud with inconsistent occlusions, the MSR model has demonstrated its advantage in achieving consistent performance despite the introduced occlusions. Conclusion: We have developed a real-time and robust surface reconstruction method on point clouds acquired by photogrammetry systems. It serves an important enabling step for real-time motion tracking in radiotherapy. This work is supported in part by NIH grant R01 CA169102-02.
Liu, Wenyang; Cheung, Yam; Sawant, Amit; Ruan, Dan
2016-01-01
Purpose: To develop a robust and real-time surface reconstruction method on point clouds captured from a 3D surface photogrammetry system. Methods: The authors have developed a robust and fast surface reconstruction method on point clouds acquired by the photogrammetry system, without explicitly solving the partial differential equation required by a typical variational approach. Taking advantage of the overcomplete nature of the acquired point clouds, their method solves and propagates a sparse linear relationship from the point cloud manifold to the surface manifold, assuming both manifolds share similar local geometry. With relatively consistent point cloud acquisitions, the authors propose a sparse regression (SR) model to directly approximate the target point cloud as a sparse linear combination from the training set, assuming that the point correspondences built by the iterative closest point (ICP) is reasonably accurate and have residual errors following a Gaussian distribution. To accommodate changing noise levels and/or presence of inconsistent occlusions during the acquisition, the authors further propose a modified sparse regression (MSR) model to model the potentially large and sparse error built by ICP with a Laplacian prior. The authors evaluated the proposed method on both clinical point clouds acquired under consistent acquisition conditions and on point clouds with inconsistent occlusions. The authors quantitatively evaluated the reconstruction performance with respect to root-mean-squared-error, by comparing its reconstruction results against that from the variational method. Results: On clinical point clouds, both the SR and MSR models have achieved sub-millimeter reconstruction accuracy and reduced the reconstruction time by two orders of magnitude to a subsecond reconstruction time. On point clouds with inconsistent occlusions, the MSR model has demonstrated its advantage in achieving consistent and robust performance despite the introduced occlusions. Conclusions: The authors have developed a fast and robust surface reconstruction method on point clouds captured from a 3D surface photogrammetry system, with demonstrated sub-millimeter reconstruction accuracy and subsecond reconstruction time. It is suitable for real-time motion tracking in radiotherapy, with clear surface structures for better quantifications. PMID:27147347
Efficient Implementation of an Optimal Interpolator for Large Spatial Data Sets
NASA Technical Reports Server (NTRS)
Memarsadeghi, Nargess; Mount, David M.
2007-01-01
Scattered data interpolation is a problem of interest in numerous areas such as electronic imaging, smooth surface modeling, and computational geometry. Our motivation arises from applications in geology and mining, which often involve large scattered data sets and a demand for high accuracy. The method of choice is ordinary kriging, because it is a best linear unbiased estimator. Unfortunately, this interpolant is computationally very expensive to compute exactly. For n scattered data points, computing the value of a single interpolant involves solving a dense linear system of size roughly n x n, which is infeasible for large n. In practice, kriging is solved approximately by local approaches that consider only a relatively small number of points that lie close to the query point. There are many problems with this local approach, however. The first is that determining the proper neighborhood size is tricky, and is usually solved by ad hoc methods such as selecting a fixed number of nearest neighbors or all the points lying within a fixed radius. Such fixed neighborhood sizes may not work well for all query points, depending on the local density of the point distribution. Local methods also suffer from the problem that the resulting interpolant is not continuous. Meyer showed that while kriging produces smooth continuous surfaces, it has zero-order continuity along its borders. Thus, at interface boundaries where the neighborhood changes, the interpolant behaves discontinuously. Therefore, it is important to consider and solve the global system for each interpolant. However, solving such large dense systems for each query point is impractical. Recently a more principled approach to approximating kriging has been proposed based on a technique called covariance tapering. The computational problems arise from the fact that the covariance functions used in kriging have global support. Our implementations combine, utilize, and enhance a number of different approaches that have been introduced in the literature for solving large linear systems for interpolation of scattered data points. For very large systems, exact methods such as Gaussian elimination are impractical since they require O(n^3) time and O(n^2) storage. As Billings et al. suggested, we use an iterative approach; in particular, we use the SYMMLQ method for solving the large but sparse ordinary kriging systems that result from tapering. The main technical issue that needs to be overcome in our algorithmic solution is that the points' covariance matrix for kriging should be symmetric positive definite. The goal of tapering is to obtain a sparse approximate representation of the covariance matrix while maintaining its positive definiteness. Furrer et al. used tapering to obtain a sparse linear system of the form Ax = b, where A is the tapered symmetric positive definite covariance matrix; thus, Cholesky factorization could be used to solve their linear systems, and they implemented an efficient sparse Cholesky decomposition method. They also showed that if these tapers are used for a limited class of covariance models, the solution of the tapered system converges to the solution of the original system. The matrix A in the ordinary kriging system, while symmetric, is not positive definite, so their approach is not applicable to the ordinary kriging system. Therefore, we use tapering only to obtain a sparse linear system, and then use SYMMLQ to solve the ordinary kriging system.
We show that solving large kriging systems becomes practical via tapering and iterative methods, and results in lower estimation errors compared to traditional local approaches, and significant memory savings compared to the original global system. We also developed a more efficient variant of the sparse SYMMLQ method for large ordinary kriging systems. This approach adaptively finds the correct local neighborhood for each query point in the interpolation process.
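The computational pattern described above (taper the covariance so the ordinary kriging system becomes sparse, then solve the resulting symmetric indefinite system iteratively) can be sketched compactly. The following is a minimal illustration rather than the authors' implementation: the exponential covariance, the Wendland-type taper, the taper range, and the synthetic data are all assumptions, and since SciPy does not ship SYMMLQ, the closely related MINRES solver for symmetric indefinite systems stands in for it.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.sparse import lil_matrix, csr_matrix
from scipy.sparse.linalg import minres

rng = np.random.default_rng(0)
n = 2000
pts = rng.uniform(0.0, 100.0, size=(n, 2))             # scattered sample locations
z = np.sin(pts[:, 0] / 10.0) + 0.1 * rng.standard_normal(n)  # observed values

def cov(h, sill=1.0, length=15.0):
    return sill * np.exp(-h / length)                  # exponential covariance model

def taper(h, theta=20.0):
    # Wendland-type taper with compact support: zero beyond distance theta
    t = np.clip(1.0 - h / theta, 0.0, None)
    return (t ** 4) * (4.0 * h / theta + 1.0)

# Assemble the tapered (sparse) ordinary kriging system [[C, 1], [1^T, 0]]
tree = cKDTree(pts)
pairs = tree.query_pairs(r=20.0, output_type='ndarray')  # only near pairs survive the taper
h = np.linalg.norm(pts[pairs[:, 0]] - pts[pairs[:, 1]], axis=1)
A = lil_matrix((n + 1, n + 1))
A[np.arange(n), np.arange(n)] = cov(0.0)
A[pairs[:, 0], pairs[:, 1]] = cov(h) * taper(h)
A[pairs[:, 1], pairs[:, 0]] = cov(h) * taper(h)
A[n, :n] = 1.0                                         # unbiasedness constraint
A[:n, n] = 1.0
A = csr_matrix(A)                                      # symmetric but indefinite

query = np.array([50.0, 50.0])
hq = np.linalg.norm(pts - query, axis=1)
b = np.append(cov(hq) * taper(hq), 1.0)
sol, info = minres(A, b)                               # MINRES as a SYMMLQ stand-in
weights = sol[:n]                                      # kriging weights for each sample
print('kriging estimate at query:', weights @ z, '| solver flag:', info)
```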
Deep ensemble learning of sparse regression models for brain disease diagnosis.
Suk, Heung-Il; Lee, Seong-Whan; Shen, Dinggang
2017-04-01
Recent studies on brain imaging analysis have highlighted the core role of machine learning techniques in computer-assisted intervention for brain disease diagnosis. Among the various machine-learning techniques, sparse regression models have proved effective in handling high-dimensional data with a small number of training samples, a situation especially common in medical problems. In the meantime, deep learning methods have achieved great success, outperforming state-of-the-art results in various applications. In this paper, we propose a novel framework that combines the two conceptually different methods of sparse regression and deep learning for Alzheimer's disease/mild cognitive impairment diagnosis and prognosis. Specifically, we first train multiple sparse regression models, each with a different value of the regularization control parameter. Our multiple sparse regression models thus potentially select different feature subsets from the original feature set, and thereby have different power to predict the response values, i.e., the clinical label and clinical scores in our work. By regarding the response values from our sparse regression models as target-level representations, we then build a deep convolutional neural network for clinical decision making, which we call a 'Deep Ensemble Sparse Regression Network.' To the best of our knowledge, this is the first work that combines sparse regression models with a deep neural network. In our experiments with the ADNI cohort, we validated the effectiveness of the proposed method by achieving the highest diagnostic accuracies in three classification tasks. We also rigorously analyzed our results and compared them with previous studies on the ADNI cohort in the literature. PMID:28167394
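The two-stage idea is easy to prototype: a bank of sparse regressors trained at different regularization strengths, whose predictions become target-level representations for a second-stage learner. The sketch below assumes scikit-learn and synthetic data; a logistic regression stands in for the paper's deep convolutional network, and the alpha grid is illustrative.

```python
import numpy as np
from sklearn.linear_model import Lasso, LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 500))              # high-dimensional features, few samples
w = np.zeros(500); w[:10] = 1.0                  # only 10 truly informative features
score = X @ w + 0.5 * rng.standard_normal(200)   # latent clinical score
y = (score > 0).astype(int)                      # binary diagnostic label

X_tr, X_te, s_tr, s_te, y_tr, y_te = train_test_split(X, score, y, random_state=0)

# Stage 1: sparse regression models, one per regularization value; each selects
# a (possibly different) feature subset and predicts the clinical score.
alphas = [0.01, 0.05, 0.1, 0.5, 1.0]
models = [Lasso(alpha=a).fit(X_tr, s_tr) for a in alphas]
Z_tr = np.column_stack([m.predict(X_tr) for m in models])  # target-level representations
Z_te = np.column_stack([m.predict(X_te) for m in models])

# Stage 2: a classifier over the stacked target-level representations
# (a stand-in for the deep network used in the paper).
clf = LogisticRegression().fit(Z_tr, y_tr)
print('ensemble test accuracy:', clf.score(Z_te, y_te))
```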
Inequality across consonantal contrasts in speech perception: evidence from mismatch negativity.
Cornell, Sonia A; Lahiri, Aditi; Eulitz, Carsten
2013-06-01
The precise structure of speech sound representations is still a matter of debate. In the present neurobiological study, we compared predictions about differential sensitivity to speech contrasts between models that assume full specification of all phonological information in the mental lexicon with those assuming sparse representations (only contrastive or otherwise not predictable information is stored). In a passive oddball paradigm, we studied the contrast sensitivity as reflected in the mismatch negativity (MMN) response to changes in the manner of articulation, as well as place of articulation of consonants in intervocalic positions of nonwords (manner of articulation: [edi ~ eni], [ezi ~ eni]; place of articulation: [edi ~ egi]). Models that assume full specification of all phonological information in the mental lexicon posit equal MMNs within each contrast (symmetric MMNs), that is, changes from standard [edi] to deviant [eni] elicit a similar MMN response as changes from standard [eni] to deviant [edi]. In contrast, models that assume sparse representations predict that only the [ezi] ~ [eni] reversals will evoke symmetric MMNs because of their conflicting fully specified manner features. Asymmetric MMNs are predicted, however, for the reversals of [edi] ~ [eni] and [edi] ~ [egi] because either a manner or place property in each pair is not fully specified in the mental lexicon. Our results show a pattern of symmetric and asymmetric MMNs that is in line with predictions of the featurally underspecified lexicon model that assumes sparse phonological representations. We conclude that the brain refers to underspecified phonological representations during speech perception.
Structural performance analysis and redesign
NASA Technical Reports Server (NTRS)
Whetstone, W. D.
1978-01-01
Program performs stress, buckling, and vibrational analysis of large, linear, finite-element systems in excess of 50,000 degrees of freedom. Cost, execution time, and storage requirements are kept reasonable through the use of sparse matrix solution techniques and other computational and data management procedures designed for problems of very large size.
Feature Modeling in Underwater Environments Using Sparse Linear Combinations
2010-01-01
[Abstract fragment] ...the nose of the torpedo obviously has a different optical depth than the tail and points in between. Our chosen PSF does not consider this...
Monitoring NEON terrestrial sites phenology with daily MODIS BRDF/albedo product and landsat data
USDA-ARS's Scientific Manuscript database
The MODerate resolution Imaging Spectroradiometer (MODIS) Bidirectional Reflectance Distribution Function (BRDF) and albedo products (MCD43) have already been in production for more than a decade. The standard product makes use of a linear “kernel-driven” RossThick-LiSparse Reciprocal (RTLSR) BRDF m...
NASA Astrophysics Data System (ADS)
Gao, Yuan; Ma, Jiayi; Yuille, Alan L.
2017-05-01
This paper addresses the problem of face recognition when there are only a few labeled examples of the face that we wish to recognize, or even just one. Moreover, these examples are typically corrupted by nuisance variables, both linear (i.e., additive nuisance variables such as bad lighting or the wearing of glasses) and non-linear (i.e., non-additive pixel-wise nuisance variables such as expression changes). The small number of labeled examples means that it is hard to remove these nuisance variables between the training and testing faces to obtain good recognition performance. To address the problem we propose a method called Semi-Supervised Sparse Representation based Classification (S^3RC). This is based on recent work on sparsity in which faces are represented in terms of two dictionaries: a gallery dictionary consisting of one or more examples of each person, and a variation dictionary representing linear nuisance variables (e.g., different lighting conditions, different glasses). The main idea is that (i) we use the variation dictionary to characterize the linear nuisance variables via the sparsity framework, and then (ii) prototype face images are estimated as a gallery dictionary via a Gaussian Mixture Model (GMM), with mixed labeled and unlabeled samples in a semi-supervised manner, to deal with the non-linear nuisance variations between labeled and unlabeled samples. We have performed experiments with insufficient labeled samples, even when there is only a single labeled sample per person. Our results on the AR, Multi-PIE, CAS-PEAL, and LFW databases demonstrate that the proposed method delivers significantly improved performance over existing methods.
NASA Astrophysics Data System (ADS)
Tamamitsu, Miu; Zhang, Yibo; Wang, Hongda; Wu, Yichen; Ozcan, Aydogan
2018-02-01
The Sparsity of the Gradient (SoG) is a robust autofocusing criterion for holography, in which the gradient modulus of the complex refocused hologram is calculated and a sparsity metric is applied to it. Here, we compare two different choices of sparsity metric used in SoG, specifically the Gini index (GI) and the Tamura coefficient (TC), for holographic autofocusing on dense/connected and sparse samples. We provide a theoretical analysis predicting that for uniformly distributed image data, TC and GI exhibit similar behavior, while for naturally sparse images containing few high-valued signal entries and many low-valued noisy background pixels, TC is more sensitive to distribution changes in the signal and more resistant to background noise. These predictions are confirmed by experimental results using SoG-based holographic autofocusing on dense and connected samples (such as stained breast tissue sections) as well as highly sparse samples (such as isolated Giardia lamblia cysts). Through these experiments, we found that the TC- and GI-based criteria (ToG and GoG, respectively) offer almost identical autofocusing performance on dense and connected samples, whereas for naturally sparse samples, GoG should be calculated on a relatively small region of interest (ROI) closely surrounding the object, while ToG offers more flexibility in choosing a larger ROI containing more background pixels.
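For readers who want to experiment with the two metrics, the following minimal sketch computes the Gini index and the Tamura coefficient on the gradient modulus of an image, following their standard definitions; the random test image is only a stand-in for a refocused hologram amplitude.

```python
import numpy as np

def gradient_modulus(img):
    gy, gx = np.gradient(img.astype(float))
    return np.sqrt(gx ** 2 + gy ** 2)

def gini_index(x):
    # Standard Gini index on sorted magnitudes: 1 for maximally sparse data
    c = np.sort(np.abs(x).ravel())
    n = c.size
    k = np.arange(1, n + 1)
    return 1.0 - 2.0 * np.sum((c / c.sum()) * (n - k + 0.5) / n)

def tamura_coefficient(x):
    # TC = sqrt(standard deviation / mean) of the magnitudes
    x = np.abs(x).ravel()
    return np.sqrt(x.std() / x.mean())

# Autofocusing would evaluate these metrics over a stack of refocused holograms
# and pick the depth that maximizes the chosen sparsity measure.
img = np.random.default_rng(0).random((64, 64))   # stand-in refocused amplitude
g = gradient_modulus(img)
print('GoG:', gini_index(g), '| ToG:', tamura_coefficient(g))
```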
NASA Astrophysics Data System (ADS)
Koga-Vicente, A.; Friedel, M. J.
2010-12-01
Every year thousands of people are affected by flood and landslide hazards caused by rainstorms. The problem is more serious in tropical developing countries because of their susceptibility, a result of the high amount of energy available to form storms, and their high vulnerability due to poor economic and social conditions. Predictive models of hazards are important tools to manage this kind of risk. In this study, a comparison of two different modeling approaches was made for predicting hydrometeorological hazards in 12 cities on the coast of São Paulo, Brazil, from 1994 to 2003. In the first approach, an empirical multiple linear regression (MLR) model was developed and used; the second approach used a type of unsupervised nonlinear artificial neural network called a self-organized map (SOM). Using twenty-three independent variables of susceptibility (precipitation, soil type, slope, elevation, and regional atmospheric system scale) and vulnerability (distribution and total population, income and educational characteristics, poverty intensity, and human development index), binary hazard responses were obtained. Model performance assessed by cross-validation indicated that the MLR and SOM model accuracies were about 67% and 80%, respectively. Prediction accuracy can be improved by the addition of information, but the SOM approach is preferred because of sparse data and highly nonlinear relations among the independent variables.
NASA Astrophysics Data System (ADS)
Bryan, Karin R.; Nardin, William; Mullarney, Julia C.; Fagherazzi, Sergio
2017-09-01
Mangroves are halophytic plants common in tropical and sub-tropical environments. Their roots and pneumatophores strongly affect intertidal hydrodynamics and related sediment transport. Here, we investigate the role tree and root structures may play in altering tidal currents and the effect of these currents on the development of intertidal landscapes in mangrove-dominated environments. We use a one-dimensional Delft3D model, forced using typical intertidal slopes and vegetation characteristics from two sites with contrasting slope on Cù Lao Dung within the Mekong Delta in Vietnam, to examine the vegetation controls on tidal currents and suspended sediment transport as the tides propagate into the forest. Model results show that vegetation characteristics at the seaward fringe determine the shape of the cross-shore bottom profile, with sparse vegetation leading to profiles that are close to linear, whereas dense vegetation results in a convex intertidal topography. Examples showing different profile developments are provided from a variety of published studies, ranging from linear profiles at sandier sites to distinctive convex profiles at muddier sites. As expected, profile differences in the model are caused by increased dissipation due to the enhanced drag exerted by vegetation; however, the reduction of flow shoreward in sparsely vegetated or non-vegetated cases was similar, indicating that shallowing of the profile and slope effects play a dominant role in dissipation. Here, tidal velocities measured in the field using transects of Acoustic Doppler Current Profilers confirm that cross-shore tidal currents diminish quickly as they move over the fringe of the forest; they then stay fairly consistent within the outer few hundred meters of the forest, indicating that the fringing environment is likely a region of deposition. An understanding of how vegetation controls the development of topography is critical to predicting the resilience of these sensitive intertidal areas to changes in inundation caused by sea-level rise.
Parameter Estimation for a Pulsating Turbulent Buoyant Jet Using Approximate Bayesian Computation
NASA Astrophysics Data System (ADS)
Christopher, Jason; Wimer, Nicholas; Lapointe, Caelan; Hayden, Torrey; Grooms, Ian; Rieker, Greg; Hamlington, Peter
2017-11-01
Approximate Bayesian Computation (ABC) is a powerful tool that allows sparse experimental or other "truth" data to be used for the prediction of unknown parameters, such as flow properties and boundary conditions, in numerical simulations of real-world engineering systems. Here we introduce the ABC approach and then use ABC to predict unknown inflow conditions in simulations of a two-dimensional (2D) turbulent, high-temperature buoyant jet. For this test case, truth data are obtained from a direct numerical simulation (DNS) with known boundary conditions and problem parameters, while the ABC procedure utilizes lower fidelity large eddy simulations. Using spatially-sparse statistics from the 2D buoyant jet DNS, we show that the ABC method provides accurate predictions of true jet inflow parameters. The success of the ABC approach in the present test suggests that ABC is a useful and versatile tool for predicting flow information, such as boundary conditions, that can be difficult to determine experimentally.
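A minimal rejection-ABC loop captures the idea: draw candidate parameters from a prior, run a (cheap) simulator, and keep candidates whose sparse summary statistics fall within a tolerance of the truth data. Everything below (the stand-in simulator, the prior range, the tolerance) is an illustrative assumption, not the configuration used in the study.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate(inflow_temp):
    """Stand-in for a lower-fidelity simulation: returns sparse summary statistics."""
    x = np.linspace(0.0, 1.0, 8)                  # sparse measurement locations
    return inflow_temp * np.exp(-3.0 * x) + 0.05 * rng.standard_normal(8)

truth = simulate(450.0)                           # pretend DNS "truth" statistics

accepted = []
for _ in range(20000):
    theta = rng.uniform(300.0, 600.0)             # prior over the unknown inflow parameter
    if np.linalg.norm(simulate(theta) - truth) < 0.5:   # distance tolerance
        accepted.append(theta)

print('posterior mean inflow parameter: %.1f (accepted %d samples)'
      % (np.mean(accepted), len(accepted)))
```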
Vidyasagar, Mathukumalli
2015-01-01
This article reviews several techniques from machine learning that can be used to study the problem of identifying a small number of features, from among tens of thousands of measured features, that can accurately predict a drug response. Prediction problems are divided into two categories: sparse classification and sparse regression. In classification, the clinical parameter to be predicted is binary, whereas in regression, the parameter is a real number. Well-known methods for both classes of problems are briefly discussed. These include the SVM (support vector machine) for classification and various algorithms such as ridge regression, LASSO (least absolute shrinkage and selection operator), and EN (elastic net) for regression. In addition, several well-established methods that do not directly fall into machine learning theory are also reviewed, including neural networks, PAM (pattern analysis for microarrays), SAM (significance analysis for microarrays), GSEA (gene set enrichment analysis), and k-means clustering. Several references indicative of the application of these methods to cancer biology are discussed.
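The review's division between sparse classification and sparse regression is easy to see in code. The sketch below, assuming scikit-learn and synthetic expression-like data, fits an L1-regularized logistic model for a binary response and LASSO/elastic net models for a real-valued response, then counts the selected features; all parameters are illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression, Lasso, ElasticNet

rng = np.random.default_rng(0)
X = rng.standard_normal((120, 2000))            # many features, few samples
w = np.zeros(2000); w[:5] = 2.0                 # few truly predictive features
y_reg = X @ w + rng.standard_normal(120)        # real-valued drug response
y_clf = (y_reg > 0).astype(int)                 # binarized response

# Sparse classification: L1-penalized logistic regression
sparse_clf = LogisticRegression(penalty='l1', solver='liblinear', C=0.1).fit(X, y_clf)
# Sparse regression: LASSO and elastic net
lasso = Lasso(alpha=0.2).fit(X, y_reg)
enet = ElasticNet(alpha=0.2, l1_ratio=0.5).fit(X, y_reg)

for name, coefs in [('L1-logistic', sparse_clf.coef_.ravel()),
                    ('LASSO', lasso.coef_), ('elastic net', enet.coef_)]:
    print(name, 'selected', int(np.sum(coefs != 0)), 'of 2000 features')
```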
Mapping High Dimensional Sparse Customer Requirements into Product Configurations
NASA Astrophysics Data System (ADS)
Jiao, Yao; Yang, Yu; Zhang, Hongshan
2017-10-01
Mapping customer requirements into product configurations is a crucial step in product design, yet customers express their needs ambiguously and locally due to a lack of domain knowledge. The data mining process applied to customer requirements may therefore yield fragmentary information with high-dimensional sparsity, exposing the mapping procedure to uncertainty and complexity. Expert Judgment is widely applied against that background since it imposes no formal requirements for systematic or structured data; however, there are concerns about the repeatability of, and bias in, Expert Judgment. In this study, an integrated method combining an adjusted Locally Linear Embedding (LLE) and a Naïve Bayes (NB) classifier is proposed to map high-dimensional sparse customer requirements to product configurations. The integrated method adjusts classical LLE to preprocess the high-dimensional sparse dataset so as to satisfy the prerequisites of NB for classifying different customer requirements into corresponding product configurations. Compared with Expert Judgment, the adjusted LLE with NB performs much better in a real-world Tablet PC design case, both in accuracy and in robustness.
Experiments with conjugate gradient algorithms for homotopy curve tracking
NASA Technical Reports Server (NTRS)
Irani, Kashmira M.; Ribbens, Calvin J.; Watson, Layne T.; Kamat, Manohar P.; Walker, Homer F.
1991-01-01
There are algorithms for finding zeros or fixed points of nonlinear systems of equations that are globally convergent for almost all starting points, i.e., with probability one. The essence of all such algorithms is the construction of an appropriate homotopy map and then tracking some smooth curve in the zero set of this homotopy map. HOMPACK is a mathematical software package implementing globally convergent homotopy algorithms with three different techniques for tracking a homotopy zero curve, and has separate routines for dense and sparse Jacobian matrices. The HOMPACK algorithms for sparse Jacobian matrices use a preconditioned conjugate gradient algorithm for the computation of the kernel of the homotopy Jacobian matrix, a required linear algebra step for homotopy curve tracking. Here, variants of the conjugate gradient algorithm are implemented in the context of homotopy curve tracking and compared with Craig's preconditioned conjugate gradient method used in HOMPACK. The test problems used include actual large scale, sparse structural mechanics problems.
Conjugate gradient type methods for linear systems with complex symmetric coefficient matrices
NASA Technical Reports Server (NTRS)
Freund, Roland
1989-01-01
We consider conjugate gradient type methods for the solution of large sparse linear systems Ax = b with complex symmetric coefficient matrices A = A^T. Such linear systems arise in important applications, such as the numerical solution of the complex Helmholtz equation. Furthermore, most complex non-Hermitian linear systems which occur in practice are actually complex symmetric. We investigate conjugate gradient type iterations which are based on a variant of the nonsymmetric Lanczos algorithm for complex symmetric matrices. We propose a new approach with iterates defined by a quasi-minimal residual property. The resulting algorithm presents several advantages over the standard biconjugate gradient method. We also include some remarks on the obvious approach to general complex linear systems by solving equivalent real linear systems for the real and imaginary parts of x. Finally, numerical experiments for linear systems arising from the complex Helmholtz equation are reported.
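As a small companion example, not taken from the paper: a one-dimensional finite-difference discretization of the complex Helmholtz equation yields a complex symmetric, non-Hermitian tridiagonal system, which can be handed to SciPy's QMR implementation; the grid size and complex wavenumber below are arbitrary.

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import qmr

n = 400
h = 1.0 / (n + 1)
k = 40.0 + 2.0j                                   # complex wavenumber (absorbing medium)
main = (2.0 / h**2 - k**2) * np.ones(n, dtype=complex)   # discretized -u'' - k^2 u
off = (-1.0 / h**2) * np.ones(n - 1, dtype=complex)
A = diags([off, main, off], [-1, 0, 1], format='csr')
# A equals its (non-conjugated) transpose: complex symmetric, non-Hermitian.

b = np.ones(n, dtype=complex)
x, info = qmr(A, b)
print('solver flag:', info, '| residual norm:', np.linalg.norm(A @ x - b))
```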
Abnormality detection of mammograms by discriminative dictionary learning on DSIFT descriptors.
Tavakoli, Nasrin; Karimi, Maryam; Nejati, Mansour; Karimi, Nader; Reza Soroushmehr, S M; Samavi, Shadrokh; Najarian, Kayvan
2017-07-01
Detection and classification of breast lesions using mammographic images is one of the most difficult tasks in medical image processing. A number of learning and non-learning methods have been proposed for detecting and classifying these lesions. However, the accuracy of the detection/classification still needs improvement. In this paper we propose a powerful classification method based on sparse learning to diagnose breast cancer in mammograms. For this purpose, a supervised discriminative dictionary learning approach is applied to dense scale invariant feature transform (DSIFT) features. A linear classifier is also simultaneously learned with the dictionary, which can effectively classify the sparse representations. Our experimental results show the superior performance of our method compared to existing approaches.
He, Bo; Liu, Yang; Dong, Diya; Shen, Yue; Yan, Tianhong; Nian, Rui
2015-08-13
In this paper, a novel iterative sparse extended information filter (ISEIF) is proposed to solve the simultaneous localization and mapping (SLAM) problem, which is crucial for autonomous vehicles. The proposed algorithm solves the measurement update equations with iterative methods adaptively to reduce linearization errors; while keeping the scalability advantage, it improves the consistency and accuracy of SEIF. Simulations and practical experiments were carried out with both a land car benchmark and an autonomous underwater vehicle. Comparisons between iterative SEIF (ISEIF), standard EKF, and SEIF are presented. All of the results convincingly show that ISEIF yields more consistent and accurate estimates than SEIF and preserves the scalability advantage over EKF as well. PMID:26287194
An implementation of the look-ahead Lanczos algorithm for non-Hermitian matrices
NASA Technical Reports Server (NTRS)
Freund, Roland W.; Gutknecht, Martin H.; Nachtigal, Noel M.
1991-01-01
The nonsymmetric Lanczos method can be used to compute eigenvalues of large sparse non-Hermitian matrices or to solve large sparse non-Hermitian linear systems. However, the original Lanczos algorithm is susceptible to possible breakdowns and potential instabilities. An implementation is presented of a look-ahead version of the Lanczos algorithm that, except for the very special situation of an incurable breakdown, overcomes these problems by skipping over those steps in which a breakdown or near-breakdown would occur in the standard process. The proposed algorithm can handle look-ahead steps of any length and requires the same number of matrix-vector products and inner products as the standard Lanczos process without look-ahead.
Acceleration of GPU-based Krylov solvers via data transfer reduction
Anzt, Hartwig; Tomov, Stanimire; Luszczek, Piotr; ...
2015-04-08
Krylov subspace iterative solvers are often the method of choice when solving large sparse linear systems. At the same time, hardware accelerators such as graphics processing units continue to offer significant floating point performance gains for matrix and vector computations through easy-to-use libraries of computational kernels. However, as these libraries are usually composed of a well optimized but limited set of linear algebra operations, applications that use them often fail to reduce certain data communications, and hence fail to leverage the full potential of the accelerator. In this study, we target the acceleration of Krylov subspace iterative methods for graphics processing units, and in particular the Biconjugate Gradient Stabilized solver. We show that significant improvement can be achieved by reformulating the method to reduce data communications through application-specific kernels instead of using the generic BLAS kernels, e.g., as provided by NVIDIA's cuBLAS library, and by designing a graphics processing unit specific sparse matrix-vector product kernel that is able to use the graphics processing unit's computing power more efficiently. Furthermore, we derive a model estimating the performance improvement, and use experimental data to validate the expected runtime savings. Finally, considering that the derived implementation achieves significantly higher performance, we assert that similar optimizations addressing algorithm structure, as well as the sparse matrix-vector product, are crucial for the subsequent development of high-performance graphics processing unit accelerated Krylov subspace iterative methods.
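A CPU-side sketch of the algorithmic setting, assuming SciPy: BiCGSTAB applied to a sparse nonsymmetric system. The GPU-specific gains described above come from fusing BLAS-1 operations and customizing the sparse matrix-vector kernel, which generic library calls like these do not attempt.

```python
import numpy as np
from scipy.sparse import random as sprand, eye
from scipy.sparse.linalg import bicgstab

rng = np.random.default_rng(0)
n = 5000
# Random sparse nonsymmetric matrix, shifted to be diagonally dominant
A = sprand(n, n, density=1e-3, random_state=0, format='csr') + 4.0 * eye(n, format='csr')
b = rng.standard_normal(n)

x, info = bicgstab(A, b)                       # Biconjugate Gradient Stabilized
print('flag:', info, '| residual:', np.linalg.norm(A @ x - b))
```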
Guo, Qiang; Qi, Liangang
2017-04-10
In the coexistence of multiple types of interfering signals, the performance of interference suppression methods based on the time and frequency domains degrades seriously, and techniques using an antenna array require a sufficiently large array and entail high hardware costs. To better combat multi-type interference for GNSS receivers, this paper proposes a cascaded multi-type interference mitigation method combining improved double chain quantum genetic matching pursuit (DCQGMP)-based sparse decomposition and an MPDR beamformer. The key idea behind the proposed method is that the multiple types of interfering signals can be excised by taking advantage of their sparse features in different domains. In the first stage, the single-tone (multi-tone) and linear chirp interfering signals are canceled by sparse decomposition according to their sparsity in an over-complete dictionary. In order to improve the timeliness of matching pursuit (MP)-based sparse decomposition, a DCQGMP is introduced by combining an improved double chain quantum genetic algorithm (DCQGA) and the MP algorithm, and the DCQGMP algorithm is extended to handle multi-channel signals according to the correlation among the signals in different channels. In the second stage, the minimum power distortionless response (MPDR) beamformer is utilized to nullify the residual interferences (e.g., wideband Gaussian noise interferences). Several simulation results show that the proposed method can not only improve the interference-mitigation degrees of freedom (DoF) of the array antenna, but also effectively deal with interference arriving from the same direction as the GNSS signal, provided it can be sparsely represented in the over-complete dictionary. Moreover, it does not introduce serious distortions into the navigation signal. PMID:28394290
SparseCT: interrupted-beam acquisition and sparse reconstruction for radiation dose reduction
NASA Astrophysics Data System (ADS)
Koesters, Thomas; Knoll, Florian; Sodickson, Aaron; Sodickson, Daniel K.; Otazo, Ricardo
2017-03-01
State-of-the-art low-dose CT methods reduce the x-ray tube current and use iterative reconstruction methods to denoise the resulting images. However, due to compromises between denoising and image quality, only moderate dose reductions up to 30-40% are accepted in clinical practice. An alternative approach is to reduce the number of x-ray projections and use compressed sensing to reconstruct the full-tube-current undersampled data. This idea was recognized in the early days of compressed sensing and proposals for CT dose reduction appeared soon afterwards. However, no practical means of undersampling has yet been demonstrated in the challenging environment of a rapidly rotating CT gantry. In this work, we propose a moving multislit collimator as a practical incoherent undersampling scheme for compressed sensing CT and evaluate its application for radiation dose reduction. The proposed collimator is composed of narrow slits and moves linearly along the slice dimension (z), to interrupt the incident beam in different slices for each x-ray tube angle (θ). The reduced projection dataset is then reconstructed using a sparse approach, where 3D image gradients are employed to enforce sparsity. The effects of the collimator slits on the beam profile were measured and represented as a continuous slice profile. SparseCT was tested using retrospective undersampling and compared against commercial current-reduction techniques on phantoms and in vivo studies. Initial results suggest that SparseCT may enable higher performance than current-reduction, particularly for high dose reduction factors.
SPARSE: quadratic time simultaneous alignment and folding of RNAs without sequence-based heuristics
Will, Sebastian; Otto, Christina; Miladi, Milad; Möhl, Mathias; Backofen, Rolf
2015-01-01
Motivation: RNA-Seq experiments have revealed a multitude of novel ncRNAs. The gold standard for their analysis based on simultaneous alignment and folding suffers from extreme time complexity of O(n^6). Subsequently, numerous faster ‘Sankoff-style’ approaches have been suggested. Commonly, the performance of such methods relies on sequence-based heuristics that restrict the search space to optimal or near-optimal sequence alignments; however, the accuracy of sequence-based methods breaks down for RNAs with sequence identities below 60%. Alignment approaches like LocARNA that do not require sequence-based heuristics have been limited to high complexity (≥ quartic time). Results: Breaking this barrier, we introduce the novel Sankoff-style algorithm ‘sparsified prediction and alignment of RNAs based on their structure ensembles (SPARSE)’, which runs in quadratic time without sequence-based heuristics. To achieve this low complexity, on par with sequence alignment algorithms, SPARSE features strong sparsification based on structural properties of the RNA ensembles. Following PMcomp, SPARSE gains further speed-up from lightweight energy computation. Although all existing lightweight Sankoff-style methods restrict Sankoff’s original model by disallowing loop deletions and insertions, SPARSE transfers the Sankoff algorithm to the lightweight energy model completely for the first time. Compared with LocARNA, SPARSE achieves similar alignment and better folding quality in significantly less time (speedup: 3.7). At similar run-time, it aligns low sequence identity instances substantially more accurately than RAF, which uses sequence-based heuristics. Availability and implementation: SPARSE is freely available at http://www.bioinf.uni-freiburg.de/Software/SPARSE. Contact: backofen@informatik.uni-freiburg.de Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25838465
Dimension-Factorized Range Migration Algorithm for Regularly Distributed Array Imaging
Guo, Qijia; Wang, Jie; Chang, Tianying
2017-01-01
The two-dimensional planar MIMO array is a popular approach for millimeter wave imaging applications. As a promising practical alternative, sparse MIMO arrays have been devised to reduce the number of antenna elements and transmitting/receiving channels with predictable and acceptable loss in image quality. In this paper, a high precision three-dimensional imaging algorithm is proposed for MIMO arrays of the regularly distributed type, especially the sparse varieties. Termed the Dimension-Factorized Range Migration Algorithm, the new imaging approach factorizes the conventional MIMO Range Migration Algorithm into multiple operations across the sparse dimensions. The thinner the sparse dimensions of the array, the more efficient the new algorithm will be. Advantages of the proposed approach are demonstrated by comparison with the conventional MIMO Range Migration Algorithm and its non-uniform fast Fourier transform based variant in terms of all the important characteristics of the approaches, especially the anti-noise capability. The computation cost is analyzed as well to evaluate the efficiency quantitatively. PMID:29113083
Online learning control using adaptive critic designs with sparse kernel machines.
Xu, Xin; Hou, Zhongsheng; Lian, Chuanqiang; He, Haibo
2013-05-01
In the past decade, adaptive critic designs (ACDs), including heuristic dynamic programming (HDP), dual heuristic programming (DHP), and their action-dependent variants, have been widely studied to realize online learning control of dynamical systems. However, because neural networks with manually designed features are commonly used to deal with continuous state and action spaces, the generalization capability and learning efficiency of previous ACDs still need to be improved. In this paper, a novel framework of ACDs with sparse kernel machines is presented by integrating kernel methods into the critic of ACDs. To improve the generalization capability as well as the computational efficiency of kernel machines, a sparsification method based on approximate linear dependence analysis is used. Using the sparse kernel machines, two kernel-based ACD algorithms, that is, kernel HDP (KHDP) and kernel DHP (KDHP), are proposed and their performance is analyzed both theoretically and empirically. Because of the representation learning and generalization capability of sparse kernel machines, KHDP and KDHP can obtain much better performance than previous HDP and DHP with manually designed neural networks. Simulation and experimental results of two nonlinear control problems, that is, a continuous-action inverted pendulum problem and a ball and plate control problem, demonstrate the effectiveness of the proposed kernel ACD methods.
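The sparsification step can be sketched directly: approximate linear dependence admits a sample into the kernel dictionary only when it cannot be represented by the current dictionary within a tolerance. The RBF kernel, tolerance, and data below are illustrative assumptions.

```python
import numpy as np

def rbf(a, b, gamma=0.5):
    return np.exp(-gamma * np.sum((a - b) ** 2))

def ald_dictionary(samples, nu=0.1):
    """Grow a kernel dictionary via the approximate-linear-dependence test."""
    dictionary = [samples[0]]
    for x in samples[1:]:
        K = np.array([[rbf(u, v) for v in dictionary] for u in dictionary])
        k = np.array([rbf(u, x) for u in dictionary])
        a = np.linalg.solve(K + 1e-9 * np.eye(len(dictionary)), k)
        delta = rbf(x, x) - k @ a          # residual of best linear representation
        if delta > nu:                     # not representable: add to dictionary
            dictionary.append(x)
    return np.array(dictionary)

samples = np.random.default_rng(0).standard_normal((200, 2))
print('dictionary size:', len(ald_dictionary(samples)), 'of', len(samples), 'samples')
```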
NASA Astrophysics Data System (ADS)
Cao, Faxian; Yang, Zhijing; Ren, Jinchang; Ling, Wing-Kuen; Zhao, Huimin; Marshall, Stephen
2017-12-01
Although sparse multinomial logistic regression (SMLR) provides a useful tool for sparse classification, it suffers from inefficacy in dealing with high-dimensional features and from manually set initial regressor values. This has significantly constrained its applications for hyperspectral image (HSI) classification. In order to tackle these two drawbacks, an extreme sparse multinomial logistic regression (ESMLR) is proposed for effective classification of HSI. First, the HSI dataset is projected to a new feature space with randomly generated weight and bias. Second, an optimization model is established by the Lagrange multiplier method and the dual principle to automatically determine a good initial regressor for SMLR by minimizing the training error and the regressor value. Furthermore, the extended multi-attribute profiles (EMAPs) are utilized for extracting both the spectral and spatial features. A combinational linear multiple features learning (MFL) method is proposed to further enhance the features extracted by ESMLR and EMAPs. Finally, the logistic regression via variable splitting and augmented Lagrangian (LORSAL) is adopted in the proposed framework to reduce the computational time. Experiments conducted on two well-known HSI datasets, namely the Indian Pines dataset and the Pavia University dataset, demonstrate the fast and robust performance of the proposed ESMLR framework.
A General Family of Limited Information Goodness-of-Fit Statistics for Multinomial Data
ERIC Educational Resources Information Center
Joe, Harry; Maydeu-Olivares, Alberto
2010-01-01
Maydeu-Olivares and Joe (J. Am. Stat. Assoc. 100:1009-1020, 2005; Psychometrika 71:713-732, 2006) introduced classes of chi-square tests for (sparse) multidimensional multinomial data based on low-order marginal proportions. Our extension provides general conditions under which quadratic forms in linear functions of cell residuals are…
NASA Technical Reports Server (NTRS)
Whetstone, W. D.
1976-01-01
The functions and operating rules of the SPAR system, which is a group of computer programs used primarily to perform stress, buckling, and vibrational analyses of linear finite element systems, were given. The following subject areas were discussed: basic information, structure definition, format system, matrix processors, utility programs, static solutions, stresses, sparse matrix eigensolver, dynamic response, graphics, and substructure processors.
Matthew S. Lobdell; Patrick G. Thompson
2017-01-01
Quercus oglethorpensis (Oglethorpe oak) is an endangered species native to the southeastern United States. It is threatened by land use changes, competition, and chestnut blight disease caused by Cryphonectria parasitica. The species is distributed sparsely over a linear distance of ca. 950 km. Its range includes several...
Zhang, Li; Zhou, WeiDa
2013-12-01
This paper deals with fast methods for training a 1-norm support vector machine (SVM). First, we define a specific class of linear programs with many sparse constraints, i.e., row-column sparse constraint linear programming (RCSC-LP). By nature, the 1-norm SVM is a sort of RCSC-LP. In order to construct subproblems for RCSC-LP and solve them, a family of row-column generation (RCG) methods is introduced. RCG methods belong to a category of decomposition techniques and perform row and column generation in a parallel fashion. Specifically, for the 1-norm SVM, the maximum size of the subproblems of RCG is identical to the number of support vectors (SVs). We also introduce a semi-deleting rule for RCG methods and prove the convergence of RCG methods when using the semi-deleting rule. Experimental results on toy data and real-world datasets illustrate that it is efficient to use RCG to train the 1-norm SVM, especially when the number of SVs is small.
MO-AB-BRA-10: Cancer Therapy Outcome Prediction Based On Dempster-Shafer Theory and PET Imaging
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lian, C; University of Rouen, QuantIF - EA 4108 LITIS, 76000 Rouen; Li, H
2015-06-15
Purpose: In cancer therapy, utilizing FDG-18 PET image-based features for accurate outcome prediction is challenging because of 1) the limited discriminative information within a small number of PET image sets, and 2) fluctuating feature characteristics caused by the inferior spatial resolution and system noise of PET imaging. In this study, we proposed a new Dempster-Shafer theory (DST) based approach, evidential low-dimensional transformation with feature selection (ELT-FS), to accurately predict cancer therapy outcome with both PET imaging features and clinical characteristics. Methods: First, a specific loss function with a sparse penalty was developed to learn an adaptive low-rank distance metric for representing the dissimilarity between different patients' feature vectors. By minimizing this loss function, a linear low-dimensional transformation of the input features was achieved. Imprecise features were simultaneously excluded by applying an l2,1-norm regularization of the learnt dissimilarity metric in the loss function. Finally, the learnt dissimilarity metric was applied in an evidential K-nearest-neighbor (EK-NN) classifier to predict treatment outcome. Results: Twenty-five patients with stage II–III non-small-cell lung cancer and thirty-six patients with esophageal squamous cell carcinomas treated with chemo-radiotherapy were collected. For the two groups of patients, 52 and 29 features, respectively, were utilized. The leave-one-out cross-validation (LOOCV) protocol was used for evaluation. Compared to three existing linear transformation methods (PCA, LDA, NCA), the proposed ELT-FS leads to higher prediction accuracy on the training and testing sets both for lung-cancer patients (100±0.0, 88.0±33.17) and for esophageal-cancer patients (97.46±1.64, 83.33±37.8). The ELT-FS also provides superior class separation in both test data sets. Conclusion: A novel DST-based approach has been proposed to predict cancer treatment outcome using PET image features and clinical characteristics. A specific loss function has been designed for robust accommodation of feature set incertitude and imprecision, facilitating adaptive learning of the dissimilarity metric for the EK-NN classifier.
Classical Testing in Functional Linear Models.
Kong, Dehan; Staicu, Ana-Maria; Maity, Arnab
2016-01-01
We extend four tests common in classical regression - Wald, score, likelihood ratio and F tests - to functional linear regression, for testing the null hypothesis, that there is no association between a scalar response and a functional covariate. Using functional principal component analysis, we re-express the functional linear model as a standard linear model, where the effect of the functional covariate can be approximated by a finite linear combination of the functional principal component scores. In this setting, we consider application of the four traditional tests. The proposed testing procedures are investigated theoretically for densely observed functional covariates when the number of principal components diverges. Using the theoretical distribution of the tests under the alternative hypothesis, we develop a procedure for sample size calculation in the context of functional linear regression. The four tests are further compared numerically for both densely and sparsely observed noisy functional data in simulation experiments and using two real data applications. PMID:28955155
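A compact numerical sketch of the testing recipe: project centered, densely observed curves onto their leading principal components, regress the scalar response on the scores, and apply an ordinary F-test of no association. The number of components, the grid, and the data-generating model below are illustrative.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, m = 150, 100                                   # subjects, grid points
t = np.linspace(0, 1, m)
basis = np.vstack([np.sin(np.pi * t), np.cos(np.pi * t), np.sin(2 * np.pi * t)])
X = rng.standard_normal((n, 3)) @ basis           # densely observed functional covariates
beta = np.sin(np.pi * t)                          # true coefficient function
y = X @ beta / m + 0.1 * rng.standard_normal(n)   # scalar response (approximate integral)

# FPCA via SVD of the centered curves; keep the K leading component scores
Xc = X - X.mean(0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
K = 3
scores = U[:, :K] * s[:K]                         # FPC scores used as regressors

# F-test: full model (scores + intercept) against the intercept-only null
Z = np.column_stack([np.ones(n), scores])
rss1 = np.sum((y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]) ** 2)
rss0 = np.sum((y - y.mean()) ** 2)
F = ((rss0 - rss1) / K) / (rss1 / (n - K - 1))
print('F = %.2f, p = %.3g' % (F, stats.f.sf(F, K, n - K - 1)))
```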
NASA Astrophysics Data System (ADS)
Chen, Y.-M.; Koniges, A. E.; Anderson, D. V.
1989-10-01
The biconjugate gradient method (BCG) provides an attractive alternative to the usual conjugate gradient algorithms for the solution of sparse systems of linear equations with nonsymmetric and indefinite matrix operators. A preconditioned algorithm is given whose form resembles the incomplete L-U conjugate gradient scheme (ILUCG2) previously presented. Although the BCG scheme requires the storage of two additional vectors, it converges in significantly fewer iterations (often half as many), while the number of calculations per iteration remains essentially the same.
A Mixed-Integer Linear Programming Problem which is Efficiently Solvable.
Leiserson, Charles; Saxe, James B.
1987-10-01
[Abstract fragments recovered from garbled OCR:] ...integer programming versions of the problem are not achievable in general for sparse instances of Problem M. ...relaxes each edge (i, j) by computing x_j ← min(x_j, x_i + a_ij); a simple analysis [13] indicates why the Bellman-Ford algorithm works.
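The recovered fragment describes the classic relaxation step, which is worth stating cleanly; a minimal Bellman-Ford sketch over a sparse edge list follows (the example graph is illustrative).

```python
import math

def bellman_ford(n, edges, source=0):
    """edges: list of (i, j, a_ij) triples; returns shortest-path distances."""
    x = [math.inf] * n
    x[source] = 0.0
    for _ in range(n - 1):                 # n - 1 relaxation passes suffice
        for i, j, a in edges:
            if x[i] + a < x[j]:
                x[j] = x[i] + a            # the relaxation x_j <- min(x_j, x_i + a_ij)
    return x

# Sparse edge list; each pass costs O(|E|), so sparse instances are cheap.
print(bellman_ford(4, [(0, 1, 2.0), (1, 2, -1.0), (0, 2, 4.0), (2, 3, 1.0)]))
```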
LAPACKrc: Fast linear algebra kernels/solvers for FPGA accelerators
NASA Astrophysics Data System (ADS)
Gonzalez, Juan; Núñez, Rafael C.
2009-07-01
We present LAPACKrc, a family of FPGA-based linear algebra solvers able to achieve more than 100x speedup per commodity processor on certain problems. LAPACKrc subsumes some of the LAPACK and ScaLAPACK functionalities, and it also incorporates sparse direct and iterative matrix solvers. Current LAPACKrc prototypes demonstrate between 40x-150x speedup compared against top-of-the-line hardware/software systems. A technology roadmap is in place to validate current performance of LAPACKrc in HPC applications, and to increase the computational throughput by factors of hundreds within the next few years.
Elongation cutoff technique armed with quantum fast multipole method for linear scaling.
Korchowiec, Jacek; Lewandowski, Jakub; Makowski, Marcin; Gu, Feng Long; Aoki, Yuriko
2009-11-30
A linear-scaling implementation of the elongation cutoff technique (ELG/C) that speeds up Hartree-Fock (HF) self-consistent field calculations is presented. The cutoff method avoids the known bottleneck of the conventional HF scheme, that is, diagonalization, because it operates within the low dimension subspace of the whole atomic orbital space. The efficiency of ELG/C is illustrated for two model systems. The obtained results indicate that the ELG/C is a very efficient sparse matrix algebra scheme.
Typed Linear Chain Conditional Random Fields and Their Application to Intrusion Detection
NASA Astrophysics Data System (ADS)
Elfers, Carsten; Horstmann, Mirko; Sohr, Karsten; Herzog, Otthein
Intrusion detection in computer networks faces the problem of a large number of both false alarms and unrecognized attacks. To improve the precision of detection, various machine learning techniques have been proposed. However, one critical issue is that reference data containing serious intrusions are very sparse. In this paper we present an inference process with linear chain conditional random fields that aims to solve this problem by using domain knowledge about the alerts of different intrusion sensors, represented in an ontology.
NASA Technical Reports Server (NTRS)
Szyld, D. B.
1984-01-01
A brief description of the Model of the World Economy implemented at the Institute for Economic Analysis is presented, together with our experience in converting the software to vector code. For each time period, the model is reduced to a linear system of over 2000 variables. The matrix of coefficients has a bordered block diagonal structure, and we show how some of the matrix operations can be carried out on all diagonal blocks at once.
Krylov Subspace Methods for Complex Non-Hermitian Linear Systems. Thesis
NASA Technical Reports Server (NTRS)
Freund, Roland W.
1991-01-01
We consider Krylov subspace methods for the solution of large sparse linear systems Ax = b with complex non-Hermitian coefficient matrices. Such linear systems arise in important applications, such as inverse scattering, numerical solution of time-dependent Schrodinger equations, underwater acoustics, eddy current computations, numerical computations in quantum chromodynamics, and numerical conformal mapping. Typically, the resulting coefficient matrices A exhibit special structures, such as complex symmetry, or they are shifted Hermitian matrices. In this paper, we first describe a Krylov subspace approach with iterates defined by a quasi-minimal residual property, the QMR method, for solving general complex non-Hermitian linear systems. Then, we study special Krylov subspace methods designed for the two families of complex symmetric respectively shifted Hermitian linear systems. We also include some results concerning the obvious approach to general complex linear systems by solving equivalent real linear systems for the real and imaginary parts of x. Finally, numerical experiments for linear systems arising from the complex Helmholtz equation are reported.
Low photon count based digital holography for quadratic phase cryptography.
Muniraj, Inbarasan; Guo, Changliang; Malallah, Ra'ed; Ryle, James P; Healy, John J; Lee, Byung-Geun; Sheridan, John T
2017-07-15
Recently, the vulnerability of the linear canonical transform-based double random phase encryption system to attack has been demonstrated. To alleviate this, we present for the first time, to the best of our knowledge, a method for securing a two-dimensional scene using a quadratic phase encoding system operating in the photon-counted imaging (PCI) regime. Position-phase-shifting digital holography is applied to record the photon-limited encrypted complex samples. The reconstruction of the complex wavefront involves four sparse (undersampled) dataset intensity measurements (interferograms) at two different positions. Computer simulations validate that the photon-limited sparse-encrypted data has adequate information to authenticate the original data set. Finally, security analysis, employing iterative phase retrieval attacks, has been performed.
An implementation of the look-ahead Lanczos algorithm for non-Hermitian matrices, part 1
NASA Technical Reports Server (NTRS)
Freund, Roland W.; Gutknecht, Martin H.; Nachtigal, Noel M.
1990-01-01
The nonsymmetric Lanczos method can be used to compute eigenvalues of large sparse non-Hermitian matrices or to solve large sparse non-Hermitian linear systems. However, the original Lanczos algorithm is susceptible to possible breakdowns and potential instabilities. We present an implementation of a look-ahead version of the Lanczos algorithm which overcomes these problems by skipping over those steps in which a breakdown or near-breakdown would occur in the standard process. The proposed algorithm can handle look-ahead steps of any length and is not restricted to steps of length 2, as earlier implementations are. Also, our implementation has the feature that it requires roughly the same number of inner products as the standard Lanczos process without look-ahead.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, Chao; Pouransari, Hadi; Rajamanickam, Sivasankaran
We present a parallel hierarchical solver for general sparse linear systems on distributed-memory machines. For large-scale problems, this fully algebraic algorithm is faster and more memory-efficient than sparse direct solvers because it exploits the low-rank structure of fill-in blocks. Depending on the accuracy of low-rank approximations, the hierarchical solver can be used either as a direct solver or as a preconditioner. The parallel algorithm is based on data decomposition and requires only local communication for updating boundary data on every processor. Moreover, the computation-to-communication ratio of the parallel algorithm is approximately the volume-to-surface-area ratio of the subdomain owned by every processor. We also provide various numerical results to demonstrate the versatility and scalability of the parallel algorithm.
NASA Astrophysics Data System (ADS)
Yang, Dong; Lu, Anxiang; Ren, Dong; Wang, Jihua
2017-11-01
This study explored the feasibility of rapid detection of biogenic amines (BAs) in cooked beef during storage using hyperspectral imaging combined with a sparse representation (SR) algorithm. Hyperspectral images of the samples were collected in two spectral ranges, 400-1000 nm and 1000-1800 nm, separately. The dimensionality of the spectral data was reduced with SR and principal component analysis (PCA), and the reduced features were combined with a least squares support vector machine (LS-SVM) to build SR-LS-SVM and PC-LS-SVM models for predicting BA values in cooked beef. The results showed that the SR-LS-SVM model exhibited the best predictive ability, with a determination coefficient (RP2) of 0.943 and a root mean square error of prediction (RMSEP) of 1.206 in the 400-1000 nm range on the prediction set. The SR and PCA algorithms were further combined to establish the best model, SR-PC-LS-SVM, which achieved a high RP2 of 0.969 and a low RMSEP of 1.039 in the 400-1000 nm region. A visual map of the BAs was generated using the best SR-PC-LS-SVM model together with image-processing algorithms, allowing changes of BAs in cooked beef to be observed more intuitively. The study demonstrated that hyperspectral imaging combined with sparse representation can effectively detect BA values in cooked beef during storage, and that the SR-PC-LS-SVM model has potential for rapid and accurate determination of freshness indexes in other meat and meat products.
Oh, Ein; Yoo, Tae Keun; Park, Eun-Cheol
2013-09-13
Blindness due to diabetic retinopathy (DR) is the major disability in diabetic patients. Although early management has been shown to prevent vision loss, diabetic patients have a low rate of routine ophthalmologic examination. Hence, we developed and validated sparse learning models with the aim of identifying the risk of DR in diabetic patients. Health records from the Korea National Health and Nutrition Examination Surveys (KNHANES) V-1 were used. The prediction models for DR were constructed using data from 327 diabetic patients, and were validated internally on 163 patients in the KNHANES V-1. External validation was performed using 562 diabetic patients in the KNHANES V-2. The learning models, including ridge, elastic net, and LASSO, were compared to the traditional indicators of DR. Considering the Bayesian information criterion, LASSO predicted DR most efficiently. In both internal and external validation, LASSO was significantly superior to the traditional indicators as measured by the area under the receiver operating characteristic curve (AUC). LASSO showed an AUC of 0.81 and an accuracy of 73.6% in the internal validation, and an AUC of 0.82 and an accuracy of 75.2% in the external validation. The sparse learning model using LASSO was effective in analyzing the epidemiological underlying patterns of DR. This is the first study to develop a machine learning model to predict DR risk using health records. LASSO can be an excellent choice when both discriminative power and variable selection are important in the analysis of high-dimensional electronic health records.
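As a minimal illustration of the modeling approach (not the study's code or data), the following sketch fits an L1-penalized logistic regression on synthetic stand-in features and reports a held-out AUC; the regularization strength C and the data generator are arbitrary assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the health-record features (n=490 mimics 327+163).
X, y = make_classification(n_samples=490, n_features=30, n_informative=6,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=163, random_state=0)

# The L1 penalty drives most coefficients to exactly zero -> variable selection.
model = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
model.fit(X_tr, y_tr)

auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
print(f"validation AUC = {auc:.2f}, "
      f"selected features = {int(np.sum(model.coef_ != 0))}")
```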
NASA Astrophysics Data System (ADS)
Ma, Sangback
In this paper we compare various parallel preconditioners, such as Point-SSOR (Symmetric Successive OverRelaxation), ILU(0) (Incomplete LU) in the Wavefront ordering, ILU(0) in the Multi-Color ordering, Multi-Color Block SOR (Successive OverRelaxation), SPAI (SParse Approximate Inverse), and pARMS (Parallel Algebraic Recursive Multilevel Solver), for solving large sparse linear systems arising from two-dimensional PDEs (Partial Differential Equations) on structured grids. Point-SSOR is well known, and ILU(0) is one of the most popular preconditioners, but it is inherently serial. ILU(0) in the Wavefront ordering maximizes the parallelism available in the natural order, but the lengths of the wavefronts are often nonuniform. ILU(0) in the Multi-Color ordering is a simple way of achieving parallelism of order N, where N is the order of the matrix, but its convergence rate often deteriorates compared to that of the natural ordering. We have chosen the Multi-Color Block SOR preconditioner combined with a direct sparse matrix solver, since for the Laplacian matrix the SOR method is known to have a nondeteriorating rate of convergence when used with the Multi-Color ordering. By using the block version we expect to minimize interprocessor communication. SPAI computes the sparse approximate inverse directly by the least squares method. Finally, ARMS is a preconditioner that recursively exploits the concept of independent sets, and pARMS is its parallel version. Experiments were conducted for Finite Difference and Finite Element discretizations of five two-dimensional PDEs with large mesh sizes up to a million on an IBM p595 machine with distributed memory. Our matrices are real positive, i.e., the real parts of their eigenvalues are positive. We have used GMRES(m) as our outer iterative method, so that the convergence of GMRES(m) for our test matrices is mathematically guaranteed. Interprocessor communication was done using MPI (Message Passing Interface) primitives. The results show that in general ILU(0) in the Multi-Color ordering and ILU(0) in the Wavefront ordering outperform the other methods, but for symmetric and nearly symmetric 5-point matrices Multi-Color Block SOR gives the best performance, except for a few cases with a small number of processors.
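For a single-processor flavor of one such combination, the sketch below runs GMRES(m) on a 5-point Laplacian with an incomplete-LU preconditioner via SciPy. Note that SciPy's spilu is a threshold-based ILU standing in for ILU(0), and the grid size and drop tolerance are arbitrary choices, not values from the paper.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 64                                        # grid is n x n
I = sp.identity(n)
T = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(n, n))
A = (sp.kron(I, T) + sp.kron(T, I)).tocsc()   # 5-point Laplacian
b = np.ones(A.shape[0])

ilu = spla.spilu(A, drop_tol=1e-4, fill_factor=10)   # incomplete LU factors
M = spla.LinearOperator(A.shape, ilu.solve)          # preconditioner M ~ A^{-1}

x, info = spla.gmres(A, b, M=M, restart=30)          # GMRES(30)
print("converged" if info == 0 else f"gmres info={info}",
      "| residual =", np.linalg.norm(b - A @ x))
```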
Blind compressive sensing dynamic MRI
Lingala, Sajan Goud; Jacob, Mathews
2013-01-01
We propose a novel blind compressive sensing (BCS) framework to recover dynamic magnetic resonance images from undersampled measurements. This scheme models the dynamic signal as a sparse linear combination of temporal basis functions, chosen from a large dictionary. In contrast to classical compressed sensing, the BCS scheme simultaneously estimates the dictionary and the sparse coefficients from the undersampled measurements. Apart from the sparsity of the coefficients, the key difference of the BCS scheme from current low-rank methods is the non-orthogonal nature of the dictionary basis functions. Since the number of degrees of freedom of the BCS model is smaller than that of the low-rank methods, it provides improved reconstructions at high acceleration rates. We formulate the reconstruction as a constrained optimization problem; the objective function is the linear combination of a data consistency term and a sparsity-promoting ℓ1 prior on the coefficients. The Frobenius norm dictionary constraint is used to avoid scale ambiguity. We introduce a simple and efficient majorize-minimize algorithm, which decouples the original criterion into three simpler subproblems. An alternating minimization strategy is used, where we cycle through the minimization of the three simpler problems. This algorithm is considerably faster than approaches that alternate between sparse coding and dictionary estimation, as well as the extension of the K-SVD dictionary learning scheme. The use of the ℓ1 penalty and the Frobenius norm dictionary constraint enables the attenuation of insignificant basis functions, compared with the ℓ0 norm and column norm constraints assumed in most dictionary learning algorithms; this is especially important since the number of basis functions that can be reliably estimated is restricted by the available measurements. We also observe that the proposed scheme is more robust to local minima than the K-SVD method, which relies on greedy sparse coding. Our phase transition experiments demonstrate that the BCS scheme provides much better recovery rates than classical Fourier-based CS schemes, while being only marginally worse than the dictionary-aware setting. Since the overhead of additionally estimating the dictionary is low, this method can be very useful in dynamic MRI applications where the signal is not sparse in known dictionaries. We demonstrate the utility of the BCS scheme in accelerating contrast-enhanced dynamic data, and observe superior reconstruction performance in comparison to existing low-rank and compressed sensing schemes. PMID:23542951
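A toy rendition of the BCS idea (not the paper's majorize-minimize algorithm or its Fourier sampling model) alternates a gradient step on a Frobenius-constrained temporal dictionary with an ISTA soft-thresholding step on the sparse spatial coefficients, fitting only the sampled entries of the data matrix; the mask, problem sizes, and step sizes are invented.

```python
import numpy as np

rng = np.random.default_rng(0)
nx, nt, R = 100, 40, 8                    # pixels, frames, dictionary size
Y = rng.standard_normal((nx, nt))         # stand-in for the dynamic data
M = rng.random((nx, nt)) < 0.3            # sampling mask ("measurements")

U = rng.standard_normal((nx, R))          # sparse spatial coefficients
V = rng.standard_normal((R, nt))          # temporal basis functions (dictionary)
lam, step = 0.05, 1e-2

def soft(z, t):                           # soft-thresholding operator
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

for _ in range(500):
    Rres = M * (U @ V - Y)                # residual on sampled entries only
    V -= step * (U.T @ Rres)              # gradient step on the dictionary
    V /= max(np.linalg.norm(V, "fro"), 1.0)      # Frobenius-norm ball constraint
    Rres = M * (U @ V - Y)                # refresh residual after the update
    U = soft(U - step * (Rres @ V.T), step * lam)  # ISTA step on coefficients

print("masked fit error:", float(np.linalg.norm(M * (U @ V - Y))))
```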
TENSOR DECOMPOSITIONS AND SPARSE LOG-LINEAR MODELS
Johndrow, James E.; Bhattacharya, Anirban; Dunson, David B.
2017-01-01
Contingency table analysis routinely relies on log-linear models, with latent structure analysis providing a common alternative. Latent structure models lead to a reduced rank tensor factorization of the probability mass function for multivariate categorical data, while log-linear models achieve dimensionality reduction through sparsity. Little is known about the relationship between these notions of dimensionality reduction in the two paradigms. We derive several results relating the support of a log-linear model to nonnegative ranks of the associated probability tensor. Motivated by these findings, we propose a new collapsed Tucker class of tensor decompositions, which bridge existing PARAFAC and Tucker decompositions, providing a more flexible framework for parsimoniously characterizing multivariate categorical data. Taking a Bayesian approach to inference, we illustrate empirical advantages of the new decompositions. PMID:29332971
Technical note: an R package for fitting sparse neural networks with application in animal breeding.
Wang, Yangfan; Mi, Xue; Rosa, Guilherme J M; Chen, Zhihui; Lin, Ping; Wang, Shi; Bao, Zhenmin
2018-05-04
Neural networks (NNs) have emerged as a new tool for genomic selection (GS) in animal breeding. However, the properties of NNs used in GS for the prediction of phenotypic outcomes are not well characterized, due to the over-parameterization of NNs and the difficulty of using whole-genome marker sets as high-dimensional NN input. In this note, we describe an R package called snnR that finds an optimal sparse structure of a NN by minimizing the squared error subject to a penalty on the L1-norm of the parameters (weights and biases), thereby addressing the over-parameterization problem. We also demonstrate the feasibility and effectiveness of models fitted with the snnR package in several example cases. In comparison with the R package brnn (Bayesian regularized single-layer NNs), with both using the entries of a genotype matrix or a genomic relationship matrix as inputs, snnR greatly improves computational efficiency and prediction ability for GS in animal breeding, because snnR implements a sparse NN with many hidden layers.
Multi-Source Multi-Target Dictionary Learning for Prediction of Cognitive Decline.
Zhang, Jie; Li, Qingyang; Caselli, Richard J; Thompson, Paul M; Ye, Jieping; Wang, Yalin
2017-06-01
Alzheimer's Disease (AD) is the most common type of dementia. Identifying correct biomarkers may determine pre-symptomatic AD subjects and enable early intervention. Recently, multi-task sparse feature learning has been successfully applied to many computer vision and biomedical informatics problems. It aims to improve generalization performance by exploiting features shared among different tasks. However, most existing algorithms are formulated as supervised learning schemes, which suffer from either insufficient feature numbers or missing label information. To address these challenges, we formulate an unsupervised framework for multi-task sparse feature learning based on a novel dictionary learning algorithm. To solve the unsupervised learning problem, we propose a two-stage Multi-Source Multi-Target Dictionary Learning (MMDL) algorithm. In stage 1, we propose a multi-source dictionary learning method to utilize the common and individual sparse features in different time slots. In stage 2, supported by a rigorous theoretical analysis, we develop a multi-task learning method to solve the missing label problem. Empirical studies on an N = 3970 longitudinal brain image data set, which involves 2 sources and 5 targets, demonstrate the improved prediction accuracy and speed of MMDL in comparison with other state-of-the-art algorithms.
NASA Astrophysics Data System (ADS)
Li, Jun; Song, Minghui; Peng, Yuanxi
2018-03-01
Current infrared and visible image fusion methods do not achieve adequate information extraction: they cannot extract the target information from infrared images while retaining the background information from visible images. Moreover, most of them have high complexity and are time-consuming. This paper proposes an efficient image fusion framework for infrared and visible images based on robust principal component analysis (RPCA) and compressed sensing (CS). The framework consists of three phases. First, RPCA decomposition is applied to the infrared and visible images to obtain their sparse and low-rank components, which represent the salient features and background information of the images, respectively. Second, the sparse and low-rank coefficients are fused by different strategies. On the one hand, measurements of the sparse coefficients are obtained with a random Gaussian matrix and fused by a standard deviation (SD) based fusion rule; the fused sparse component is then obtained by reconstructing the fused measurements with the fast continuous linearized augmented Lagrangian algorithm (FCLALM). On the other hand, the low-rank coefficients are fused using the max-absolute rule. Finally, the fused image is superposed from the fused sparse and low-rank components. For comparison, several popular fusion algorithms are tested experimentally. Comparing the fused results subjectively and objectively, we find that the proposed framework extracts the infrared targets while retaining the background information in the visible images, exhibiting state-of-the-art performance in terms of both fusion quality and speed.
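The framework's first phase can be sketched with the widely used inexact augmented-Lagrange-multiplier RPCA iteration, which splits a matrix into low-rank and sparse parts; the parameter defaults below (lam = 1/sqrt(max(m, n)) and the mu schedule) are common textbook choices, not necessarily the paper's, and the test image is synthetic.

```python
import numpy as np

def shrink(X, tau):                       # entrywise soft-thresholding
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svt(X, tau):                          # singular-value thresholding
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(shrink(s, tau)) @ Vt

def rpca(D, iters=200, tol=1e-7):
    m, n = D.shape
    lam = 1.0 / np.sqrt(max(m, n))
    norm2 = np.linalg.norm(D, 2)
    Y = D / max(norm2, np.abs(D).max() / lam)   # dual-variable initialization
    mu, mu_max, rho = 1.25 / norm2, 1e7 * norm2, 1.5
    L = np.zeros_like(D); S = np.zeros_like(D)
    for _ in range(iters):
        L = svt(D - S + Y / mu, 1.0 / mu)       # low-rank (background) update
        S = shrink(D - L + Y / mu, lam / mu)    # sparse (salient) update
        resid = D - L - S
        Y += mu * resid
        mu = min(mu * rho, mu_max)
        if np.linalg.norm(resid) <= tol * norm2:
            break
    return L, S

bg = np.outer(np.linspace(0, 1, 60), np.linspace(1, 2, 80))  # rank-1 "background"
sal = np.zeros_like(bg); sal[25:30, 35:45] = 3.0             # "salient target"
L, S = rpca(bg + sal)
print("rank(L) =", np.linalg.matrix_rank(L, tol=1e-3),
      " nnz(S) =", int((np.abs(S) > 1e-3).sum()))
```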
2013-01-01
Background Significant efforts have been made to address the problem of identifying short genes in prokaryotic genomes. However, most known methods are not effective in detecting short genes. Because of the limited information contained in short DNA sequences, it is very difficult to accurately distinguish between protein-coding and non-coding sequences in prokaryotic genomes. We have developed a new Iteratively Adaptive Sparse Partial Least Squares (IASPLS) algorithm as the classifier to improve the accuracy of the identification process. Results For testing, we chose the short coding and non-coding sequences from seven prokaryotic organisms. We used seven feature sets (including GC content, Z-curve, etc.) of short genes. In comparison with the GeneMarkS, Metagene, Orphelia, and Heuristic Approach methods, our model achieved the best prediction performance in identification of short prokaryotic genes. Even when we focused on the very short length group ([60-100 nt)), our model provided sensitivity as high as 83.44% and specificity as high as 92.8%. These values are two or three times higher than those of three of the other methods, while Metagene fails to recognize genes in this length range. The experiments also proved that IASPLS improves identification accuracy in comparison with other widely used classifiers, i.e., logistic regression, Random Forest (RF) and K nearest neighbors (KNN). The accuracy using IASPLS was improved by 5.90% or more in comparison with the other methods. In addition to the improvements in accuracy, IASPLS required ten times less computer time than KNN or RF. Conclusions We conclude that our method is preferable for application as an automated method of short gene classification. Its linearity and easily optimized parameters make it practicable for predicting short genes of newly sequenced or under-studied species. Reviewers This article was reviewed by Alexey Kondrashov, Rajeev Azad (nominated by Dr J. Peter Gogarten) and Yuriy Fofanov (nominated by Dr Janet Siefert). PMID:24067167
Sentürk, Damla; Dalrymple, Lorien S; Nguyen, Danh V
2014-11-30
We propose functional linear models for zero-inflated count data, with a focus on the functional hurdle and functional zero-inflated Poisson (ZIP) models. While the hurdle model assumes the counts come from a mixture of a degenerate distribution at zero and a zero-truncated Poisson distribution, the ZIP model considers a mixture of a degenerate distribution at zero and a standard Poisson distribution. We extend the generalized functional linear model framework with a functional predictor and multiple cross-sectional predictors to model counts generated by a mixture distribution. We propose an estimation procedure for functional hurdle and ZIP models, called penalized reconstruction, geared towards error-prone and sparsely observed longitudinal functional predictors. The approach relies on dimension reduction and pooling of information across subjects, involving basis expansions and penalized maximum likelihood techniques. The developed functional hurdle model is applied to modeling hospitalizations within the first 2 years from initiation of dialysis, with a high percentage of zeros, in the Comprehensive Dialysis Study participants. Hospitalization counts are modeled as a function of sparse longitudinal measurements of serum albumin concentrations, patient demographics, and comorbidities. Simulation studies are used to study finite sample properties of the proposed method and include comparisons with an adaptation of standard principal components regression. Copyright © 2014 John Wiley & Sons, Ltd.
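For reference, the two mixtures being contrasted have the following standard textbook forms, with mixing weight \pi and Poisson rate \lambda (generic notation, not the paper's):

```latex
% Hurdle: zeros from a point mass; positives from a zero-truncated Poisson.
\[
P(Y=y) =
\begin{cases}
\pi, & y = 0,\\
(1-\pi)\,\dfrac{e^{-\lambda}\lambda^{y}}{y!\,(1-e^{-\lambda})}, & y = 1,2,\ldots
\end{cases}
\]
% ZIP: zeros arise from both the point mass and the Poisson component.
\[
P(Y=y) =
\begin{cases}
\pi + (1-\pi)\,e^{-\lambda}, & y = 0,\\
(1-\pi)\,\dfrac{e^{-\lambda}\lambda^{y}}{y!}, & y = 1,2,\ldots
\end{cases}
\]
```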
Prophinder: a computational tool for prophage prediction in prokaryotic genomes.
Lima-Mendez, Gipsi; Van Helden, Jacques; Toussaint, Ariane; Leplae, Raphaël
2008-03-15
Prophinder is a prophage prediction tool coupled with a prediction database, a web server and a web service. Predicted prophages will help to fill the gaps in the currently sparse phage sequence space, which should cover an estimated 100 million species. Systematic and reliable predictions will enable further studies of the contribution of prophages to the bacteriophage gene pool and a better understanding of gene shuffling between prophages and phages infecting the same host. Software is available at http://aclame.ulb.ac.be/prophinder
ERIC Educational Resources Information Center
Shannon, Kyle M.; Gage, Gregory J.; Jankovic, Aleksandra; Wilson, W. Jeffrey; Marzullo, Timothy C.
2014-01-01
The earthworm is ideal for studying action potential conduction velocity in a classroom setting, as its simple linear anatomy allows easy axon length measurements and the worm's sparse coding allows single action potentials to be easily identified. The earthworm has two giant fiber systems (lateral and medial) with different conduction velocities…
An M-step preconditioned conjugate gradient method for parallel computation
NASA Technical Reports Server (NTRS)
Adams, L.
1983-01-01
This paper describes a preconditioned conjugate gradient method that can be effectively implemented on both vector machines and parallel arrays to solve sparse symmetric and positive definite systems of linear equations. The implementation on the CYBER 203/205 and on the Finite Element Machine is discussed and results obtained using the method on these machines are given.
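The surrounding baseline is the textbook preconditioned conjugate gradient iteration, sketched below with a simple Jacobi (diagonal) preconditioner standing in for the paper's m-step preconditioner; the test matrix and tolerances are illustrative.

```python
import numpy as np
import scipy.sparse as sp

def pcg(A, b, M_inv, tol=1e-8, maxit=500):
    """Preconditioned CG for symmetric positive definite A; M_inv applies M^{-1}."""
    x = np.zeros_like(b)
    r = b - A @ x
    z = M_inv(r)                      # apply the preconditioner
    p = z.copy()
    rz = r @ z
    for k in range(maxit):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol * np.linalg.norm(b):
            return x, k + 1
        z = M_inv(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p     # update search direction
        rz = rz_new
    return x, maxit

n = 50
T = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(n, n))
A = (sp.kron(sp.identity(n), T) + sp.kron(T, sp.identity(n))).tocsr()
d = A.diagonal()                      # Jacobi preconditioner data
x, iters = pcg(A, np.ones(A.shape[0]), lambda r: r / d)
print("PCG iterations:", iters)
```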
Sparse Representation of Smooth Linear Operators
1990-08-01
[Extraction-garbled excerpt. The recoverable content defines R_{k,2} as the orthogonal complement of S_{k,2} in S_{k+1,2}, where S_m is defined by Eq. (2.3) of the report, and cites: L. Greengard and J. Strain, "The fast Gauss transform," Technical report, Department of Computer Science, Yale University, 1989.]
Evaluating Sparse Linear System Solvers on Scalable Parallel Architectures
2008-10-01
[Front-matter fragment: a list of figures showing residual histories of the WSO banded preconditioner for the test problems 2D 54019 HIGHK, Appu, ASIC 680k, and BUNDLE1.]
An information theoretic approach of designing sparse kernel adaptive filters.
Liu, Weifeng; Park, Il; Principe, José C
2009-12-01
This paper discusses an information theoretic approach to designing sparse kernel adaptive filters. To determine which data are useful to learn and to remove redundant ones, a subjective information measure called surprise is introduced. Surprise captures the amount of information a datum contains that is transferable to a learning system. Based on this concept, we propose a systematic sparsification scheme, which can drastically reduce the time and space complexity without harming the performance of kernel adaptive filters. Nonlinear regression, short-term chaotic time-series prediction, and long-term time-series forecasting examples are presented.
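A toy kernel least-mean-squares filter with a crude novelty rule (admit a sample only when its prediction error is large and it lies far from the existing centers) conveys the flavor of criterion-based sparsification; this simple rule and its thresholds are stand-ins for the paper's surprise measure, not its definition.

```python
import numpy as np

def gauss(x, C, sigma=0.5):
    """Gaussian kernel between point x and the rows of center matrix C."""
    return np.exp(-np.sum((C - x) ** 2, axis=1) / (2 * sigma ** 2))

rng = np.random.default_rng(1)
eta, err_thresh, dist_thresh = 0.2, 0.05, 0.1
centers, weights = [], []

for _ in range(2000):                        # online nonlinear regression
    x = rng.uniform(-1, 1, size=2)
    y = np.sin(3 * x[0]) * x[1] + 0.01 * rng.standard_normal()
    if centers:
        C = np.array(centers)
        y_hat = np.dot(weights, gauss(x, C))
        dmin = np.min(np.sum((C - x) ** 2, axis=1))
    else:
        y_hat, dmin = 0.0, np.inf
    e = y - y_hat
    if abs(e) > err_thresh and dmin > dist_thresh:   # "informative" datum
        centers.append(x)                    # grow the dictionary (KLMS step)
        weights.append(eta * e)
    # otherwise the datum is treated as redundant and discarded

print("dictionary size after 2000 samples:", len(centers))
```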
NASA Astrophysics Data System (ADS)
Dong, Jian; Kudo, Hiroyuki
2017-03-01
Compressed sensing (CS) is attracting growing interest in sparse-view computed tomography (CT) image reconstruction. The most standard CS approach is total variation (TV) minimization. However, images reconstructed by TV often suffer from distortions, especially in practical CT reconstructions, in the form of patchy artifacts, jagged edges, and loss of image textures. Most existing CS approaches, including TV, improve image quality by applying linear transforms to the object image, but linear transforms usually fail to account for discontinuities such as edges and image textures, which is considered the key reason for these distortions. Discussion of nonlinear-filter-based image processing has a long history, and it is well established that nonlinear filters yield better results than linear filters in tasks such as denoising. The median root prior was first utilized by Alenius as a nonlinear transform in CT image reconstruction, with significant gains. Subsequently, Zhang developed nonlocal-means-based CS. It is gradually becoming clear that nonlinear-transform-based CS is superior to linear-transform-based CS in improving image quality, although to our knowledge this has not been clearly concluded in any previous paper. In this work, we investigated the image-quality differences between conventional TV minimization and nonlinear sparsifying-transform-based CS, as well as among different nonlinear sparsifying-transform-based CS methods, in sparse-view CT image reconstruction. Additionally, we accelerated the implementation of the nonlinear sparsifying-transform-based CS algorithm.
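As a minimal concrete instance of TV minimization (applied to denoising rather than CT, with a ||x - y||^2 data term standing in for a projection-domain fidelity term), the sketch below runs gradient descent on a smoothed isotropic TV objective; eps, lam, the step size, and the crude wrap-around boundary handling are all simplifying assumptions.

```python
import numpy as np

def tv_grad(x, eps=1e-3):
    """Gradient of a smoothed isotropic TV seminorm (crude wrap-around edges)."""
    dx = np.diff(x, axis=0, append=x[-1:, :])
    dy = np.diff(x, axis=1, append=x[:, -1:])
    mag = np.sqrt(dx ** 2 + dy ** 2 + eps)
    px, py = dx / mag, dy / mag
    # negative divergence of the normalized gradient field
    return -((px - np.roll(px, 1, axis=0)) + (py - np.roll(py, 1, axis=1)))

rng = np.random.default_rng(0)
truth = np.zeros((64, 64)); truth[20:44, 20:44] = 1.0   # piecewise constant
y = truth + 0.3 * rng.standard_normal(truth.shape)      # noisy observation

x, lam, step = y.copy(), 0.15, 0.2
for _ in range(200):
    x -= step * ((x - y) + lam * tv_grad(x))            # gradient descent

print("noisy RMSE:", float(np.sqrt(np.mean((y - truth) ** 2))))
print("TV    RMSE:", float(np.sqrt(np.mean((x - truth) ** 2))))
```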
Parallel Finite Element Domain Decomposition for Structural/Acoustic Analysis
NASA Technical Reports Server (NTRS)
Nguyen, Duc T.; Tungkahotara, Siroj; Watson, Willie R.; Rajan, Subramaniam D.
2005-01-01
A domain decomposition (DD) formulation for solving sparse linear systems of equations resulting from finite element analysis is presented. The formulation incorporates mixed direct and iterative equation-solving strategies and other novel algorithmic ideas that are optimized to take advantage of sparsity and to exploit modern computer architecture, such as memory and parallel computing. The most time-consuming part of the formulation is identified, and the critical roles of direct sparse and iterative solvers within the framework of the formulation are discussed. Experiments on several computer platforms using several complex test matrices are conducted using software based on the formulation. Small-scale structural examples are used to validate the steps in the formulation, and large-scale (1,000,000+ unknowns) duct acoustic examples are used to evaluate performance on ORIGIN 2000 processors and on a cluster of 6 PCs (running under the Windows environment). Statistics show that the formulation is efficient in both sequential and parallel computing environments, and that it is significantly faster and consumes less memory than a formulation based on one of the best available commercial parallel sparse solvers.
Sparse Gaussian elimination with controlled fill-in on a shared memory multiprocessor
NASA Technical Reports Server (NTRS)
Alaghband, Gita; Jordan, Harry F.
1989-01-01
It is shown that in sparse matrices arising from electronic circuits, it is possible to do computations on many diagonal elements simultaneously. A technique for obtaining an ordered compatible set directly from the ordered incompatible table is given. The ordering is based on the Markowitz number of the pivot candidates. This technique generates a set of compatible pivots with the property of generating few fills. A novel heuristic algorithm is presented that combines the idea of an order-compatible set with a limited binary tree search to generate several sets of compatible pivots in linear time. An elimination set for reducing the matrix is generated and selected on the basis of a minimum Markowitz sum number. The parallel pivoting technique presented is a stepwise algorithm and can be applied to any submatrix of the original matrix. Thus, it is not a preordering of the sparse matrix and is applied dynamically as the decomposition proceeds. Parameters are suggested to obtain a balance between parallelism and fill-ins. Results of applying the proposed algorithms on several large application matrices using the HEP multiprocessor (Kowalik, 1985) are presented and analyzed.
Discriminative Bayesian Dictionary Learning for Classification.
Akhtar, Naveed; Shafait, Faisal; Mian, Ajmal
2016-12-01
We propose a Bayesian approach to learn discriminative dictionaries for sparse representation of data. The proposed approach infers probability distributions over the atoms of a discriminative dictionary using a finite approximation of the Beta Process. It also computes sets of Bernoulli distributions that associate class labels to the learned dictionary atoms. This association signifies the selection probabilities of the dictionary atoms in the expansion of class-specific data. Furthermore, the non-parametric character of the proposed approach allows it to infer the correct size of the dictionary. We exploit the aforementioned Bernoulli distributions in separately learning a linear classifier. The classifier uses the same hierarchical Bayesian model as the dictionary, which we present along with the analytical inference solution for Gibbs sampling. For classification, a test instance is first sparsely encoded over the learned dictionary and the codes are fed to the classifier. We performed experiments for face and action recognition, and for object and scene-category classification, using five public datasets, and compared the results with state-of-the-art discriminative sparse representation approaches. Experiments show that the proposed Bayesian approach consistently outperforms the existing approaches.
Safo, Sandra E; Li, Shuzhao; Long, Qi
2018-03-01
Integrative analysis of high dimensional omics data is becoming increasingly popular. At the same time, incorporating known functional relationships among variables in analysis of omics data has been shown to help elucidate underlying mechanisms for complex diseases. In this article, our goal is to assess association between transcriptomic and metabolomic data from a Predictive Health Institute (PHI) study that includes healthy adults at a high risk of developing cardiovascular diseases. Adopting a strategy that is both data-driven and knowledge-based, we develop statistical methods for sparse canonical correlation analysis (CCA) with incorporation of known biological information. Our proposed methods use prior network structural information among genes and among metabolites to guide selection of relevant genes and metabolites in sparse CCA, providing insight on the molecular underpinning of cardiovascular disease. Our simulations demonstrate that the structured sparse CCA methods outperform several existing sparse CCA methods in selecting relevant genes and metabolites when structural information is informative and are robust to mis-specified structural information. Our analysis of the PHI study reveals that a number of gene and metabolic pathways including some known to be associated with cardiovascular diseases are enriched in the set of genes and metabolites selected by our proposed approach. © 2017, The International Biometric Society.
SPARSE: quadratic time simultaneous alignment and folding of RNAs without sequence-based heuristics.
Will, Sebastian; Otto, Christina; Miladi, Milad; Möhl, Mathias; Backofen, Rolf
2015-08-01
RNA-Seq experiments have revealed a multitude of novel ncRNAs. The gold standard for their analysis based on simultaneous alignment and folding suffers from extreme time complexity of O(n^6). Subsequently, numerous faster 'Sankoff-style' approaches have been suggested. Commonly, the performance of such methods relies on sequence-based heuristics that restrict the search space to optimal or near-optimal sequence alignments; however, the accuracy of sequence-based methods breaks down for RNAs with sequence identities below 60%. Alignment approaches like LocARNA that do not require sequence-based heuristics have been limited to high complexity (O(n^4) quartic time). Breaking this barrier, we introduce the novel Sankoff-style algorithm 'sparsified prediction and alignment of RNAs based on their structure ensembles (SPARSE)', which runs in quadratic time without sequence-based heuristics. To achieve this low complexity, on par with sequence alignment algorithms, SPARSE features strong sparsification based on structural properties of the RNA ensembles. Following PMcomp, SPARSE gains further speed-up from lightweight energy computation. Although all existing lightweight Sankoff-style methods restrict Sankoff's original model by disallowing loop deletions and insertions, SPARSE transfers the Sankoff algorithm to the lightweight energy model completely for the first time. Compared with LocARNA, SPARSE achieves similar alignment and better folding quality in significantly less time (speedup: 3.7). At similar run time, it aligns low sequence identity instances substantially more accurately than RAF, which uses sequence-based heuristics. © The Author 2015. Published by Oxford University Press.
A range-based predictive localization algorithm for WSID networks
NASA Astrophysics Data System (ADS)
Liu, Yuan; Chen, Junjie; Li, Gang
2017-11-01
Most studies on localization algorithms are conducted on the sensor networks with densely distributed nodes. However, the non-localizable problems are prone to occur in the network with sparsely distributed sensor nodes. To solve this problem, a range-based predictive localization algorithm (RPLA) is proposed in this paper for the wireless sensor networks syncretizing the RFID (WSID) networks. The Gaussian mixture model is established to predict the trajectory of a mobile target. Then, the received signal strength indication is used to reduce the residence area of the target location based on the approximate point-in-triangulation test algorithm. In addition, collaborative localization schemes are introduced to locate the target in the non-localizable situations. Simulation results verify that the RPLA achieves accurate localization for the network with sparsely distributed sensor nodes. The localization accuracy of the RPLA is 48.7% higher than that of the APIT algorithm, 16.8% higher than that of the single Gaussian model-based algorithm and 10.5% higher than that of the Kalman filtering-based algorithm.
NASA Astrophysics Data System (ADS)
Mahrooghy, Majid; Ashraf, Ahmed B.; Daye, Dania; Mies, Carolyn; Rosen, Mark; Feldman, Michael; Kontos, Despina
2014-03-01
We evaluate the prognostic value of sparse representation-based features by applying the K-SVD algorithm to multiparametric kinetic, textural, and morphologic features in breast dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI). K-SVD is an iterative dimensionality reduction method that optimally reduces the initial feature space by updating the dictionary columns jointly with the sparse representation coefficients. By using K-SVD, we therefore not only provide a sparse representation of the features, condensing the information into a few coefficients, but also reduce the dimensionality. The extracted K-SVD features are evaluated with a machine learning pipeline including a logistic regression classifier for the task of classifying high versus low breast cancer recurrence risk, as determined by a validated gene expression assay. The features are evaluated using ROC curve analysis and leave-one-out cross validation for different sparse representation and dimensionality reduction numbers. Optimal sparse representation is obtained when the number of dictionary elements is 4 (K=4) and the maximum number of non-zero coefficients is 2 (L=2). We compare K-SVD with ANOVA-based feature selection for the same prognostic features. The ROC results show that the AUCs of the K-SVD based (K=4, L=2), ANOVA-based, and original features (i.e., no dimensionality reduction) are 0.78, 0.71, and 0.68, respectively. From these results, it can be inferred that by using a sparse representation of the originally extracted multiparametric, high-dimensional data, we can condense the information into a few coefficients with high predictive value. In addition, the dimensionality reduction introduced by K-SVD can prevent models from over-fitting.
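A bare-bones K-SVD loop, using the abstract's optimal setting of K=4 atoms and L=2 nonzero coefficients, can be sketched as alternating OMP sparse coding with rank-1 SVD updates of each atom; the data here are random stand-ins for the extracted DCE-MRI features, and the iteration count is arbitrary.

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

rng = np.random.default_rng(0)
n_features, n_samples, K, L = 20, 200, 4, 2
Y = rng.standard_normal((n_features, n_samples))  # stand-in feature matrix

D = rng.standard_normal((n_features, K))
D /= np.linalg.norm(D, axis=0)                    # unit-norm atoms

for _ in range(10):
    # Sparse coding: at most L nonzero coefficients per sample (via OMP).
    omp = OrthogonalMatchingPursuit(n_nonzero_coefs=L, fit_intercept=False)
    X = omp.fit(D, Y).coef_.T                     # codes, shape (K, n_samples)
    # Dictionary update: rank-1 refit of each atom over the samples using it.
    for k in range(K):
        users = np.flatnonzero(X[k])
        if users.size == 0:
            continue
        E = Y[:, users] - D @ X[:, users] + np.outer(D[:, k], X[k, users])
        U, s, Vt = np.linalg.svd(E, full_matrices=False)
        D[:, k] = U[:, 0]                         # new atom
        X[k, users] = s[0] * Vt[0]                # matching coefficients

print("final residual:", float(np.linalg.norm(Y - D @ X)))
```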
Nolan, Bernard T.; Malone, Robert W.; Doherty, John E.; Barbash, Jack E.; Ma, Liwang; Shaner, Dale L.
2015-01-01
CONCLUSIONS: Although the observed data were sparse, they substantially reduced prediction uncertainty in unsampled regions of pesticide breakthrough curves. Nitrate evidently functioned as a surrogate for soil hydraulic data in well-drained loam soils conducive to conservative transport of nitrogen. Pesticide properties and macropore parameters could most benefit from improved characterization further to reduce model misfit and prediction uncertainty.
NASA Astrophysics Data System (ADS)
Zhou, Ya-Tong; Fan, Yu; Chen, Zi-Yi; Sun, Jian-Cheng
2017-05-01
The contribution of this work is twofold: (1) a multimodality prediction method for chaotic time series based on the Gaussian process mixture (GPM) model is proposed, which employs a divide-and-conquer strategy. It automatically divides the chaotic time series into multiple modalities with different extrinsic patterns and intrinsic characteristics, and thus can more precisely fit the chaotic time series. (2) An effective sparse hard-cut expectation maximization (SHC-EM) learning algorithm for the GPM model is proposed to improve prediction performance. SHC-EM replaces a large learning sample set with fewer pseudo inputs, accelerating model learning based on these pseudo inputs. Experiments on Lorenz and Chua time series demonstrate that the proposed method yields not only accurate multimodality predictions but also prediction confidence intervals. SHC-EM outperforms traditional variational learning in terms of both prediction accuracy and speed. In addition, SHC-EM is more robust and less susceptible to noise than variational learning. Supported by the National Natural Science Foundation of China under Grant No 60972106, the China Postdoctoral Science Foundation under Grant No 2014M561053, the Humanity and Social Science Foundation of Ministry of Education of China under Grant No 15YJA630108, and the Hebei Province Natural Science Foundation under Grant No E2016202341.
Deng, Lei; Fan, Chao; Zeng, Zhiwen
2017-12-28
Direct prediction of the three-dimensional (3D) structures of proteins from one-dimensional (1D) sequences is a challenging problem. Significant structural characteristics such as solvent accessibility and contact number are essential for deriving restraints in modeling protein folding and protein 3D structure. Thus, accurately predicting these features is a critical step for 3D protein structure building. In this study, we present DeepSacon, a computational method that can effectively predict protein solvent accessibility and contact number using a deep neural network, built on a stacked autoencoder with a dropout method. The results demonstrate that our proposed DeepSacon achieves a significant improvement in prediction quality compared with the state-of-the-art methods. We obtain 0.70 three-state accuracy for solvent accessibility, and 0.33 15-state accuracy and 0.74 Pearson Correlation Coefficient (PCC) for the contact number on the 5729 monomeric soluble globular protein dataset. We also evaluate the performance on the CASP11 benchmark dataset; DeepSacon achieves 0.68 three-state accuracy and 0.69 PCC for solvent accessibility and contact number, respectively. We have shown that DeepSacon can reliably predict solvent accessibility and contact number with a stacked sparse autoencoder and a dropout approach.
NASA Astrophysics Data System (ADS)
Sakata, Ayaka; Xu, Yingying
2018-03-01
We analyse a linear regression problem with nonconvex regularization called smoothly clipped absolute deviation (SCAD) under an overcomplete Gaussian basis for Gaussian random data. We propose an approximate message passing (AMP) algorithm accounting for the nonconvex regularization, namely SCAD-AMP, and analytically show that the stability condition corresponds to the de Almeida-Thouless condition in the spin-glass literature. Through asymptotic analysis, we show the correspondence between the density evolution of SCAD-AMP and the replica symmetric (RS) solution. Numerical experiments confirm that for a sufficiently large system size, SCAD-AMP achieves the optimal performance predicted by the replica method. Through replica analysis, a phase transition between the replica symmetric and replica symmetry breaking (RSB) regions is found in the parameter space of SCAD. The appearance of the RS region for a nonconvex penalty is a significant advantage, indicating a region where the landscape of the optimization problem is smooth. Furthermore, we analytically show that the statistical representation performance of the SCAD penalty is better than that of the convex ℓ1 penalty.
Adaptive compressed sensing of remote-sensing imaging based on the sparsity prediction
NASA Astrophysics Data System (ADS)
Yang, Senlin; Li, Xilong; Chong, Xin
2017-10-01
The conventional compressive sensing works with non-adaptive linear projections, and the number of measurements is usually set empirically; as a result, the quality of image reconstruction is often affected. Firstly, block-based compressed sensing (BCS) with the conventional selection of compressive measurements is reviewed. Then an estimation method for image sparsity is proposed based on the two-dimensional discrete cosine transform (2D DCT). Given an energy threshold, the DCT coefficients are energy-normalized and sorted in descending order, and the sparsity of the image is obtained from the proportion of dominant coefficients. Finally, simulation results show that the method estimates image sparsity effectively and provides a practical basis for selecting the number of compressive measurements. Since that selection is based on the sparsity estimated with the given energy threshold, the proposed method can ensure the quality of image reconstruction.
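The described estimator is easy to sketch directly: take the 2D DCT, normalize and sort the coefficient energies, and count how many coefficients are needed to reach the energy threshold. The 0.99 threshold and the test images below are assumed examples, not values from the paper.

```python
import numpy as np
from scipy.fft import dctn

def estimate_sparsity(img, energy_thresh=0.99):
    """Fraction of DCT coefficients needed to reach the energy threshold."""
    e = np.sort((dctn(img, norm="ortho") ** 2).ravel())[::-1]  # energies, desc.
    e = e / e.sum()                                            # normalization
    k = int(np.searchsorted(np.cumsum(e), energy_thresh)) + 1
    return k / img.size

rng = np.random.default_rng(0)
smooth = np.outer(np.sin(np.linspace(0, 3, 64)), np.cos(np.linspace(0, 2, 64)))
noise = rng.standard_normal((64, 64))
print("smooth image:", estimate_sparsity(smooth))   # small fraction -> sparse
print("noise image :", estimate_sparsity(noise))    # close to 1 -> not sparse
```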
NASA Astrophysics Data System (ADS)
Vitanovski, Dime; Tsymbal, Alexey; Ionasec, Razvan; Georgescu, Bogdan; Zhou, Shaohua K.; Hornegger, Joachim; Comaniciu, Dorin
2011-03-01
Congenital heart defect (CHD) is the most common birth defect and a frequent cause of death for children. Tetralogy of Fallot (ToF) is the most frequently occurring CHD, affecting in particular the pulmonary valve and trunk. Emerging interventional methods enable percutaneous pulmonary valve implantation, which constitutes an alternative to open heart surgery. While minimally invasive methods become common practice, imaging and non-invasive assessment tools become crucial components of the clinical setting. Cardiac computed tomography (CT) and cardiac magnetic resonance imaging (cMRI) are techniques with complementary properties, able to acquire the multiple non-invasive and accurate scans required for advanced evaluation and therapy planning. In contrast to CT, which covers the full 4D information over the cardiac cycle, cMRI often acquires partial information, for example only one 3D scan of the whole heart in the end-diastolic phase and two 2D planes (long and short axes) over the whole cardiac cycle. Data acquired in this way is called sparse cMRI. In this paper, we propose a regression-based approach for the reconstruction of the full 4D pulmonary trunk model from sparse MRI. The reconstruction approach is based on learning a distance function between the sparse MRI to be completed and the 4D CT data with full information used as the training set. The distance is based on the intrinsic Random Forest similarity, learnt for the corresponding regression problem of predicting coordinates of unseen mesh points. Extensive experiments performed on 80 cardiac CT and MR sequences demonstrated an average speed of 10 seconds and an accuracy of 0.1053 mm mean absolute error for the proposed approach. Using the case retrieval workflow and local nearest-neighbour regression with the learnt distance function appears to be competitive with "black box" regression with immediate prediction of coordinates, while providing transparency to the predictions made.
Large-region acoustic source mapping using a movable array and sparse covariance fitting.
Zhao, Shengkui; Tuna, Cagdas; Nguyen, Thi Ngoc Tho; Jones, Douglas L
2017-01-01
Large-region acoustic source mapping is important for city-scale noise monitoring. Approaches using a single-position measurement scheme to scan large regions using small arrays cannot provide clean acoustic source maps, while deploying large arrays spanning the entire region of interest is prohibitively expensive. A multiple-position measurement scheme is applied to scan large regions at multiple spatial positions using a movable array of small size. Based on the multiple-position measurement scheme, a sparse-constrained multiple-position vectorized covariance matrix fitting approach is presented. In the proposed approach, the overall sample covariance matrix of the incoherent virtual array is first estimated using the multiple-position array data and then vectorized using the Khatri-Rao (KR) product. A linear model is then constructed for fitting the vectorized covariance matrix and a sparse-constrained reconstruction algorithm is proposed for recovering source powers from the model. The user parameter settings are discussed. The proposed approach is tested on a 30 m × 40 m region and a 60 m × 40 m region using simulated and measured data. Much cleaner acoustic source maps and lower sound pressure level errors are obtained compared to the beamforming approaches and the previous sparse approach [Zhao, Tuna, Nguyen, and Jones, Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Processing (ICASSP) (2016)].
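A single-position toy version of the covariance-fitting step (omitting the paper's multi-position virtual-array bookkeeping and its specific reconstruction algorithm) vectorizes the sample covariance via the Khatri-Rao identity vec(A diag(p) A^H) = (conj(A) KR A) p and recovers nonnegative source powers with an l1-regularized nonnegative fit; the array geometry, grid, and Lasso penalty are invented.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
m, n_grid, n_snap = 8, 60, 400
pos = np.arange(m) * 0.5                      # half-wavelength line array
theta = np.linspace(-np.pi / 2, np.pi / 2, n_grid)
A = np.exp(2j * np.pi * np.outer(pos, np.sin(theta)))  # steering matrix

src = [15, 40]                                # true source grid indices
powers = np.array([1.0, 0.5])
S = np.sqrt(powers / 2)[:, None] * (rng.standard_normal((2, n_snap))
                                    + 1j * rng.standard_normal((2, n_snap)))
X = A[:, src] @ S                             # source contributions
X += 0.1 * (rng.standard_normal((m, n_snap))
            + 1j * rng.standard_normal((m, n_snap)))   # sensor noise

Rhat = X @ X.conj().T / n_snap                # sample covariance matrix
# Khatri-Rao model: vec(R) = (conj(A) KR A) p for powers p >= 0.
Phi = np.column_stack([np.kron(A[:, k].conj(), A[:, k]) for k in range(n_grid)])
y = Rhat.reshape(-1, order="F")
Phi_r = np.vstack([Phi.real, Phi.imag])       # stack real/imag for a real fit
y_r = np.concatenate([y.real, y.imag])

fit = Lasso(alpha=1e-3, positive=True, fit_intercept=False,
            max_iter=50000).fit(Phi_r, y_r)
print("largest recovered peaks:", np.sort(np.argsort(fit.coef_)[-2:]))
```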
Dynamic Textures Modeling via Joint Video Dictionary Learning.
Wei, Xian; Li, Yuanxiang; Shen, Hao; Chen, Fang; Kleinsteuber, Martin; Wang, Zhongfeng
2017-04-06
Video representation is an important and challenging task in the computer vision community. In this paper, we consider the problem of modeling and classifying video sequences of dynamic scenes which could be modeled in a dynamic textures (DT) framework. At first, we assume that image frames of a moving scene can be modeled as a Markov random process. We propose a sparse coding framework, named joint video dictionary learning (JVDL), to model a video adaptively. By treating the sparse coefficients of image frames over a learned dictionary as the underlying "states", we learn an efficient and robust linear transition matrix between two adjacent frames of sparse events in time series. Hence, a dynamic scene sequence is represented by an appropriate transition matrix associated with a dictionary. In order to ensure the stability of JVDL, we impose several constraints on such transition matrix and dictionary. The developed framework is able to capture the dynamics of a moving scene by exploring both sparse properties and the temporal correlations of consecutive video frames. Moreover, such learned JVDL parameters can be used for various DT applications, such as DT synthesis and recognition. Experimental results demonstrate the strong competitiveness of the proposed JVDL approach in comparison with state-of-the-art video representation methods. Especially, it performs significantly better in dealing with DT synthesis and recognition on heavily corrupted data.
Xie, Jianwen; Douglas, Pamela K; Wu, Ying Nian; Brody, Arthur L; Anderson, Ariana E
2017-04-15
Brain networks in fMRI are typically identified using spatial independent component analysis (ICA), yet other mathematical constraints provide alternate biologically-plausible frameworks for generating brain networks. Non-negative matrix factorization (NMF) would suppress negative BOLD signal by enforcing positivity. Spatial sparse coding algorithms (L1 Regularized Learning and K-SVD) would impose local specialization and a discouragement of multitasking, where the total observed activity in a single voxel originates from a restricted number of possible brain networks. The assumptions of independence, positivity, and sparsity to encode task-related brain networks are compared; the resulting brain networks within scan for different constraints are used as basis functions to encode observed functional activity. These encodings are then decoded using machine learning, by using the time series weights to predict within scan whether a subject is viewing a video, listening to an audio cue, or at rest, in 304 fMRI scans from 51 subjects. The sparse coding algorithm of L1 Regularized Learning outperformed 4 variations of ICA (p<0.001) for predicting the task being performed within each scan using artifact-cleaned components. The NMF algorithms, which suppressed negative BOLD signal, had the poorest accuracy compared to the ICA and sparse coding algorithms. Holding constant the effect of the extraction algorithm, encodings using sparser spatial networks (containing more zero-valued voxels) had higher classification accuracy (p<0.001). Lower classification accuracy occurred when the extracted spatial maps contained more CSF regions (p<0.001). The success of sparse coding algorithms suggests that algorithms which enforce sparsity, discourage multitasking, and promote local specialization may capture better the underlying source processes than those which allow inexhaustible local processes such as ICA. Negative BOLD signal may capture task-related activations. Copyright © 2017 Elsevier B.V. All rights reserved.
Stenehjem, J S; Veierød, M B; Nilsen, L T; Ghiasvand, R; Johnsen, B; Grimsrud, T K; Babigumira, R; Støer, N C; Rees, J R; Robsahm, T E
2018-06-01
Breslow thickness is the most important prognostic factor of localized cutaneous melanoma (CM), but associations with anthropometric factors have been sparsely and incompletely investigated. We examine pre-diagnostic body mass index (BMI), body surface area (BSA), height, weight and weight change in relation to Breslow thickness, overall and by anatomical site and histological subtype, and further assess possible non-linear associations between these anthropometric factors and Breslow thickness. CMs in the Janus Cohort were identified 1972-2014. Linear regression was used to estimate geometric mean ratios (GMRs) of Breslow thickness with 95% confidence intervals (CIs) according to anthropometric factors. Restricted cubic splines in generalized linear models predicted adjusted mean Breslow thickness, and were used to assess possible non-linear relationships. Among 2570 CM cases, obese cases had a GMR of 1.16 (95% CI: 1.04, 1.30) of Breslow thickness versus normal-weight cases. For BSA and weight, quintile 5 showed GMRs of 1.13 (95% CI: 1.00, 1.27) and 1.17 (95% CI: 1.03, 1.33) of Breslow thickness versus quintile 1, respectively. Associations seemed restricted to superficial spreading melanomas and CMs on the trunk and lower limbs. The associations plateaued at an adjusted mean Breslow thickness of about 2.5 mm (BMI 29 kg/m2, BSA 2.05 m2 and weight 90 kg), before declining for the highest values. No associations were found for height and weight change. This large case series of incident CM demonstrated positive associations between BMI, BSA, weight and Breslow thickness, and suggested that behavioral or other mechanisms apply at high values. This article is protected by copyright. All rights reserved.
Forest/non-forest mapping using inventory data and satellite imagery
Ronald E. McRoberts
2002-01-01
For two study areas in Minnesota, USA, one heavily forested and one sparsely forested, maps of predicted proportion forest area were created using Landsat Thematic Mapper imagery, forest inventory plot data, and two prediction techniques, logistic regression and a k-Nearest Neighbours technique. The maps were used to increase the precision of forest area estimates by...
USDA-ARS?s Scientific Manuscript database
Assimilation of remotely sensed soil moisture data (SM-DA) to correct soil water stores of rainfall-runoff models has shown skill in improving streamflow prediction. In the case of large and sparsely monitored catchments, SM-DA is a particularly attractive tool. Within this context, we assimilate act...
Deep and Structured Robust Information Theoretic Learning for Image Analysis.
Deng, Yue; Bao, Feng; Deng, Xuesong; Wang, Ruiping; Kong, Youyong; Dai, Qionghai
2016-07-07
This paper presents a robust information theoretic (RIT) model to reduce the uncertainties, i.e., missing and noisy labels, in general discriminative data representation tasks. The fundamental pursuit of our model is to simultaneously learn a transformation function and a discriminative classifier that maximize the mutual information of data and their labels in the latent space. In this general paradigm, we discuss three types of RIT implementations: with linear subspace embedding, with deep transformation, and with structured sparse learning. In practice, the RIT and deep RIT models are exploited to solve the image categorization task, whose performance is verified on various benchmark datasets. The structured sparse RIT is further applied to a medical image analysis task, brain MRI segmentation, which allows group-level feature selection on the brain tissues.
Blind source separation by sparse decomposition
NASA Astrophysics Data System (ADS)
Zibulevsky, Michael; Pearlmutter, Barak A.
2000-04-01
The blind source separation problem is to extract the underlying source signals from a set of their linear mixtures, where the mixing matrix is unknown. This situation is common, e.g., in acoustics, radio, and medical signal processing. We exploit the property that the sources have a sparse representation in a corresponding signal dictionary. Such a dictionary may consist of wavelets, wavelet packets, etc., or be obtained by learning from a given family of signals. Starting from the maximum a posteriori framework, which is applicable to the case of more sources than mixtures, we derive a few other categories of objective functions that provide faster and more robust computations when there are an equal number of sources and mixtures. Our experiments with artificial signals and with musical sounds demonstrate significantly better separation than other known techniques.
Enhancing sparsity of Hermite polynomial expansions by iterative rotations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yang, Xiu; Lei, Huan; Baker, Nathan A.
2016-02-01
Compressive sensing has become a powerful addition to uncertainty quantification in recent years. This paper identifies new bases for random variables through linear mappings such that the representation of the quantity of interest is more sparse with new basis functions associated with the new random variables. This sparsity increases both the efficiency and accuracy of the compressive sensing-based uncertainty quantification method. Specifically, we consider rotation-based linear mappings which are determined iteratively for Hermite polynomial expansions. We demonstrate the effectiveness of the new method with applications in solving stochastic partial differential equations and high-dimensional (O(100)) problems.
Mannocci, Laura; Roberts, Jason J; Miller, David L; Halpin, Patrick N
2017-06-01
As human activities expand beyond national jurisdictions to the high seas, there is an increasing need to consider anthropogenic impacts to species inhabiting these waters. The current scarcity of scientific observations of cetaceans in the high seas impedes the assessment of population-level impacts of these activities. We developed plausible density estimates to facilitate a quantitative assessment of anthropogenic impacts on cetacean populations in these waters. Our study region extended from a well-surveyed region within the U.S. Exclusive Economic Zone into a large region of the western North Atlantic sparsely surveyed for cetaceans. We modeled densities of 15 cetacean taxa with available line transect survey data and habitat covariates and extrapolated predictions to sparsely surveyed regions. We formulated models to reduce the extent of extrapolation beyond covariate ranges, and constrained them to model simple and generalizable relationships. To evaluate confidence in the predictions, we mapped where predictions were made outside sampled covariate ranges, examined alternate models, and compared predicted densities with maps of sightings from sources that could not be integrated into our models. Confidence levels in model results depended on the taxon and geographic area and highlighted the need for additional surveying in environmentally distinct areas. With application of necessary caution, our density estimates can inform management needs in the high seas, such as the quantification of potential cetacean interactions with military training exercises, shipping, fisheries, and deep-sea mining and be used to delineate areas of special biological significance in international waters. Our approach is generally applicable to other marine taxa and geographic regions for which management will be implemented but data are sparse. © 2016 The Authors. Conservation Biology published by Wiley Periodicals, Inc. on behalf of Society for Conservation Biology.
The fastclime Package for Linear Programming and Large-Scale Precision Matrix Estimation in R.
Pang, Haotian; Liu, Han; Vanderbei, Robert
2014-02-01
We develop an R package fastclime for solving a family of regularized linear programming (LP) problems. Our package efficiently implements the parametric simplex algorithm, which provides a scalable and sophisticated tool for solving large-scale linear programs. As an illustrative example, one use of our LP solver is to implement an important sparse precision matrix estimation method called CLIME (Constrained L1 Minimization Estimator). Compared with existing packages for this problem such as clime and flare, our package has three advantages: (1) it efficiently calculates the full piecewise-linear regularization path; (2) it provides an accurate dual certificate as a stopping criterion; (3) it is completely coded in C and is highly portable. This package is designed to be useful to statisticians and machine learning researchers for solving a wide range of problems.
Multimodal Task-Driven Dictionary Learning for Image Classification
2015-12-18
Bahrampour, Soheil; Nasrabadi, Nasser M.; Ray, Asok; Jenkins, W. Kenneth
Dictionary learning algorithms have been successfully used for both reconstructive and discriminative tasks, where an input signal is represented with a sparse linear combination of dictionary atoms. While these methods are
Regularized two-step brain activity reconstruction from spatiotemporal EEG data
NASA Astrophysics Data System (ADS)
Alecu, Teodor I.; Voloshynovskiy, Sviatoslav; Pun, Thierry
2004-10-01
We aim to use EEG source localization in the framework of a Brain Computer Interface project. We propose here a new reconstruction procedure targeting source (or, equivalently, mental task) differentiation. EEG data can be thought of as a collection of time-continuous streams from sparse locations. The measured electric potential on one electrode is the result of the superposition of synchronized synaptic activity from sources in the whole brain volume. Consequently, the EEG inverse problem is a highly underdetermined (and ill-posed) problem. Moreover, each source contribution is linear with respect to its amplitude but non-linear with respect to its localization and orientation. To overcome these drawbacks we propose a novel two-step inversion procedure based on a two-scale division of the solution space. The first step uses a coarse discretization and has the sole purpose of globally identifying the active regions, via a sparse approximation algorithm. The second step is applied only to the retained regions and makes use of a fine discretization of the space, aiming to detail the brain activity. The local configuration of sources is recovered using an iterative stochastic estimator with adaptive joint minimum-energy and directional-consistency constraints.
NASA Technical Reports Server (NTRS)
Zubair, Mohammad; Nielsen, Eric; Luitjens, Justin; Hammond, Dana
2016-01-01
In the field of computational fluid dynamics, the Navier-Stokes equations are often solved using an unstructured-grid approach to accommodate geometric complexity. Implicit solution methodologies for such spatial discretizations generally require frequent solution of large tightly-coupled systems of block-sparse linear equations. The multicolor point-implicit solver used in the current work typically requires a significant fraction of the overall application run time. In this work, an efficient implementation of the solver for graphics processing units is proposed. Several factors present unique challenges to achieving an efficient implementation in this environment. These include the variable amount of parallelism available in different kernel calls, indirect memory access patterns, low arithmetic intensity, and the requirement to support variable block sizes. In this work, the solver is reformulated to use standard sparse and dense Basic Linear Algebra Subprograms (BLAS) functions. However, numerical experiments show that the performance of the BLAS functions available in existing CUDA libraries is suboptimal for matrices representative of those encountered in actual simulations. Instead, optimized versions of these functions are developed. Depending on block size, the new implementations show performance gains of up to 7x over the existing CUDA library functions.
Laplace Inversion of Low-Resolution NMR Relaxometry Data Using Sparse Representation Methods
Berman, Paula; Levi, Ofer; Parmet, Yisrael; Saunders, Michael; Wiesman, Zeev
2013-01-01
Low-resolution nuclear magnetic resonance (LR-NMR) relaxometry is a powerful tool that can be harnessed for characterizing constituents in complex materials. Conversion of the relaxation signal into a continuous distribution of relaxation components is an ill-posed inverse Laplace transform problem. The most common numerical method implemented today for dealing with this kind of problem is based on L2-norm regularization. However, sparse representation methods via L1 regularization and convex optimization are a relatively new approach for effective analysis and processing of digital images and signals. In this article, a numerical optimization method for analyzing LR-NMR data by including non-negativity constraints and L1 regularization and by applying a convex optimization solver PDCO, a primal-dual interior method for convex objectives, that allows general linear constraints to be treated as linear operators is presented. The integrated approach includes validation of analyses by simulations, testing repeatability of experiments, and validation of the model and its statistical assumptions. The proposed method provides better resolved and more accurate solutions when compared with those suggested by existing tools. © 2013 Wiley Periodicals, Inc. Concepts Magn Reson Part A 42A: 72–88, 2013. PMID:23847452
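A minimal sketch of the optimization being described, assuming illustrative grids and using projected proximal-gradient iterations in place of the PDCO interior method (for x >= 0 the L1 penalty reduces to a linear term):

    import numpy as np

    t = np.linspace(0.001, 3.0, 400)                  # acquisition times (s)
    T2 = np.logspace(-3, 0.5, 120)                    # candidate relaxation times (s)
    A = np.exp(-t[:, None] / T2[None, :])             # discretized Laplace kernel

    x_true = np.zeros(T2.size); x_true[[30, 80]] = [1.0, 0.6]
    b = A @ x_true + 0.01 * np.random.default_rng(2).normal(size=t.size)

    lam = 0.05
    eta = 1.0 / np.linalg.norm(A, 2) ** 2             # step below the Lipschitz bound
    x = np.zeros(T2.size)
    for _ in range(5000):                             # x >= 0, so ||x||_1 = sum(x)
        x = np.maximum(x - eta * (A.T @ (A @ x - b) + lam), 0.0)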
Mniszewski, S M; Cawkwell, M J; Wall, M E; Mohd-Yusof, J; Bock, N; Germann, T C; Niklasson, A M N
2015-10-13
We present an algorithm for the calculation of the density matrix that for insulators scales linearly with system size and parallelizes efficiently on multicore, shared memory platforms with small and controllable numerical errors. The algorithm is based on an implementation of the second-order spectral projection (SP2) algorithm [ Niklasson, A. M. N. Phys. Rev. B 2002 , 66 , 155115 ] in sparse matrix algebra with the ELLPACK-R data format. We illustrate the performance of the algorithm within self-consistent tight binding theory by total energy calculations of gas phase poly(ethylene) molecules and periodic liquid water systems containing up to 15,000 atoms on up to 16 CPU cores. We consider algorithm-specific performance aspects, such as local vs nonlocal memory access and the degree of matrix sparsity. Comparisons to sparse matrix algebra implementations using off-the-shelf libraries on multicore CPUs, graphics processing units (GPUs), and the Intel many integrated core (MIC) architecture are also presented. The accuracy and stability of the algorithm are illustrated with long duration Born-Oppenheimer molecular dynamics simulations of 1000 water molecules and a 303 atom Trp cage protein solvated by 2682 water molecules.
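A minimal dense-NumPy sketch of the SP2 recursion itself (the paper's contribution is its sparse ELLPACK-R implementation, which this does not attempt); the Hamiltonian and occupation count are synthetic:

    import numpy as np

    def sp2_density_matrix(H, nocc, iters=60):
        emin, emax = np.linalg.eigvalsh(H)[[0, -1]]   # spectral bounds
        I = np.eye(H.shape[0])
        X = (emax * I - H) / (emax - emin)            # map spectrum into [0, 1]
        for _ in range(iters):
            X2 = X @ X
            X = X2 if np.trace(X) > nocc else 2 * X - X2  # purify toward idempotency
        return X                                      # density matrix, trace ~ nocc

    rng = np.random.default_rng(3)
    B = rng.normal(size=(50, 50))
    P = sp2_density_matrix((B + B.T) / 2, nocc=25)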
NASA Astrophysics Data System (ADS)
Sides, Scott; Jamroz, Ben; Crockett, Robert; Pletzer, Alexander
2012-02-01
Self-consistent field theory (SCFT) for dense polymer melts has been highly successful in describing complex morphologies in block copolymers. Field-theoretic simulations such as these are able to access large length and time scales that are difficult or impossible for particle-based simulations such as molecular dynamics. The modified diffusion equations that arise as a consequence of the coarse-graining procedure in the SCF theory can be efficiently solved with a pseudo-spectral (PS) method that uses fast-Fourier transforms on uniform Cartesian grids. However, PS methods can be difficult to apply in many block copolymer SCFT simulations (e.g., confinement, interface adsorption) in which small spatial regions might require finer resolution than most of the simulation grid. Progress on using new solver algorithms to address these problems will be presented. The Tech-X Chompst project aims at marrying the best of adaptive mesh refinement with linear matrix solver algorithms. The Tech-X code PolySwift++ is an SCFT simulation platform that leverages ongoing development in coupling Chombo, a package for solving PDEs via block-structured AMR calculations and embedded boundaries, with PETSc, a toolkit that includes a large assortment of sparse linear solvers.
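For contrast with the AMR approach, a minimal sketch of the uniform-grid pseudo-spectral step for the modified diffusion equation dq/ds = lap(q) - w*q, with an illustrative potential field:

    import numpy as np

    def ps_step(q, w, ds, L):
        n = q.shape[0]
        k = 2 * np.pi * np.fft.fftfreq(n, d=L / n)
        k2 = k[:, None] ** 2 + k[None, :] ** 2        # 2D Laplacian symbol
        half = np.exp(-0.5 * ds * w)                  # Strang splitting of the w term
        return half * np.fft.ifft2(np.exp(-ds * k2) * np.fft.fft2(half * q)).real

    n, L = 64, 10.0
    x = np.linspace(0, L, n, endpoint=False)
    w = np.cos(2 * np.pi * x / L)[:, None] * np.ones(n)  # toy chemical potential field
    q = np.ones((n, n))
    for _ in range(100):                              # march along the chain contour
        q = ps_step(q, w, ds=0.01, L=L)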
Xu, Jiansong; Potenza, Marc N.; Calhoun, Vince D.; Zhang, Rubin; Yip, Sarah W.; Wall, John T.; Pearlson, Godfrey D.; Worhunsky, Patrick D.; Garrison, Kathleen A.; Moran, Joseph M.
2016-01-01
Functional magnetic resonance imaging (fMRI) studies regularly use univariate general-linear-model-based analyses (GLM). Their findings are often inconsistent across different studies, perhaps because of several fundamental brain properties including functional heterogeneity, balanced excitation and inhibition (E/I), and sparseness of neuronal activities. These properties stipulate heterogeneous neuronal activities in the same voxels and likely limit the sensitivity and specificity of GLM. This paper selectively reviews findings of histological and electrophysiological studies and fMRI spatial independent component analysis (sICA) and reports new findings by applying sICA to two existing datasets. The extant and new findings consistently demonstrate several novel features of brain functional organization not revealed by GLM. They include overlap of large-scale functional networks (FNs) and their concurrent opposite modulations, and no significant modulations in activity of most FNs across the whole brain during any task conditions. These novel features of brain functional organization are highly consistent with the brain’s properties of functional heterogeneity, balanced E/I, and sparseness of neuronal activity, and may help reconcile inconsistent GLM findings. PMID:27592153
Reconstruction of Complex Network based on the Noise via QR Decomposition and Compressed Sensing.
Li, Lixiang; Xu, Dafei; Peng, Haipeng; Kurths, Jürgen; Yang, Yixian
2017-11-08
It is generally known that the states of network nodes are stable and have strong correlations in a linear network system. We find that without the control input, the method of compressed sensing cannot succeed in reconstructing complex networks in which the states of nodes are generated through the linear network system. However, noise can drive the dynamics between nodes to break the stability of the system state. Therefore, a new method integrating QR decomposition and compressed sensing is proposed to solve the reconstruction problem of complex networks under the assistance of the input noise. The state matrix of the system is decomposed by QR decomposition. We construct the measurement matrix with the aid of Gaussian noise so that the sparse input matrix can be reconstructed by compressed sensing. We also discover that noise can build a bridge between the dynamics and the topological structure. Experiments are presented to show that the proposed method is more accurate and more efficient in reconstructing four model networks and six real networks than compressed sensing alone. In addition, the proposed method can reconstruct not only sparse complex networks, but also dense complex networks.
Line-Constrained Camera Location Estimation in Multi-Image Stereomatching.
Donné, Simon; Goossens, Bart; Philips, Wilfried
2017-08-23
Stereomatching is an effective way of acquiring dense depth information from a scene when active measurements are not possible. So-called lightfield methods take a snapshot from many camera locations along a defined trajectory (usually uniformly linear or on a regular grid; we will assume a linear trajectory) and use this information to compute accurate depth estimates. However, they require the locations for each of the snapshots to be known: the disparity of an object between images is related to both the distance of the camera to the object and the distance between the camera positions for both images. Existing solutions use sparse feature matching for camera location estimation. In this paper, we propose a novel method that uses dense correspondences to do the same, leveraging an existing depth estimation framework to also yield the camera locations along the line. We illustrate the effectiveness of the proposed technique for camera location estimation both visually for the rectification of epipolar plane images and quantitatively with its effect on the resulting depth estimation. Our proposed approach yields a valid alternative to sparse techniques, while still being executed in a reasonable time on a graphics card due to its highly parallelizable nature.
Allawala, Altan; Marston, J B
2016-11-01
We investigate the Fokker-Planck description of the equal-time statistics of the three-dimensional Lorenz attractor with additive white noise. The invariant measure is found by computing the zero (or null) mode of the linear Fokker-Planck operator as a problem of sparse linear algebra. Two variants are studied: a self-adjoint construction of the linear operator and the replacement of diffusion with hyperdiffusion. We also access the low-order statistics of the system by a perturbative expansion in equal-time cumulants. A comparison is made to statistics obtained by the standard approach of accumulation via direct numerical simulation. Theoretical and computational aspects of the Fokker-Planck and cumulant expansion methods are discussed.
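A minimal 1D analogue of the null-mode computation, assuming an Ornstein-Uhlenbeck process instead of the Lorenz system: discretize the Fokker-Planck operator as a sparse matrix and ask an iterative eigensolver for the eigenpair nearest zero:

    import numpy as np
    import scipy.sparse as sp
    from scipy.sparse.linalg import eigs

    n, xmax, D = 400, 5.0, 1.0
    x, h = np.linspace(-xmax, xmax, n, retstep=True)

    # L p = d/dx(x p) + D d2p/dx2, centered differences, crude absorbing ends
    drift = sp.diags([x[1:] / (2 * h), -x[:-1] / (2 * h)], [1, -1])
    diff = D * sp.diags([1.0, -2.0, 1.0], [-1, 0, 1], shape=(n, n)) / h**2
    Lop = (drift + diff).tocsc()

    vals, vecs = eigs(Lop, k=1, sigma=1e-6)           # shift-invert near zero
    p = np.abs(vecs[:, 0].real)
    p /= p.sum() * h                                  # stationary density; compare
    # against the exact Gaussian ~ exp(-x**2 / (2 * D))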
DOE Office of Scientific and Technical Information (OSTI.GOV)
Spotz, William F.
PyTrilinos is a set of Python interfaces to compiled Trilinos packages. This collection supports serial and parallel dense linear algebra, serial and parallel sparse linear algebra, direct and iterative linear solution techniques, algebraic and multilevel preconditioners, nonlinear solvers and continuation algorithms, eigensolvers and partitioning algorithms. Also included are a variety of related utility functions and classes, including distributed I/O, coloring algorithms and matrix generation. PyTrilinos vector objects are compatible with the popular NumPy Python package. As a Python front end to compiled libraries, PyTrilinos takes advantage of the flexibility and ease of use of Python, and the efficiency of the underlying C++, C and Fortran numerical kernels. This paper covers recent, previously unpublished advances in the PyTrilinos package.
Swetnam, Tyson L.; O'Connor, Christopher D.; Lynch, Ann M.
2016-01-01
A significant concern about Metabolic Scaling Theory (MST) in real forests relates to consistent differences between the values of power law scaling exponents of tree primary size measures used to estimate mass and those predicted by MST. Here we consider why observed scaling exponents for diameter and height relationships deviate from MST predictions across three semi-arid conifer forests in relation to: (1) tree condition and physical form, (2) the level of inter-tree competition (e.g. open vs closed stand structure), (3) increasing tree age, and (4) differences in site productivity. Scaling exponent values derived from non-linear least-squares regression for trees in excellent condition (n = 381) were above the MST prediction at the 95% confidence level, while the exponent for trees in good condition were no different than MST (n = 926). Trees that were in fair or poor condition, characterized as diseased, leaning, or sparsely crowned had exponent values below MST predictions (n = 2,058), as did recently dead standing trees (n = 375). Exponent value of the mean-tree model that disregarded tree condition (n = 3,740) was consistent with other studies that reject MST scaling. Ostensibly, as stand density and competition increase trees exhibited greater morphological plasticity whereby the majority had characteristically fair or poor growth forms. Fitting by least-squares regression biases the mean-tree model scaling exponent toward values that are below MST idealized predictions. For 368 trees from Arizona with known establishment dates, increasing age had no significant impact on expected scaling. We further suggest height to diameter ratios below MST relate to vertical truncation caused by limitation in plant water availability. Even with environmentally imposed height limitation, proportionality between height and diameter scaling exponents were consistent with the predictions of MST. PMID:27391084
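A minimal sketch of the fitting step described above: a height-diameter power law h = a*d**b estimated by non-linear least squares on synthetic data (the generating exponent here is the MST-like 2/3):

    import numpy as np
    from scipy.optimize import curve_fit

    rng = np.random.default_rng(4)
    d = rng.uniform(5, 80, size=500)                          # stem diameters (cm)
    h = 1.3 * d ** (2 / 3) * rng.lognormal(0, 0.1, size=500)  # heights with scatter

    popt, pcov = curve_fit(lambda d, a, b: a * d**b, d, h, p0=(1.0, 0.67))
    a_hat, b_hat = popt
    ci_b = b_hat + np.array([-1.96, 1.96]) * np.sqrt(pcov[1, 1])  # rough 95% CI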
Compressive sampling by artificial neural networks for video
NASA Astrophysics Data System (ADS)
Szu, Harold; Hsu, Charles; Jenkins, Jeffrey; Reinhardt, Kitt
2011-06-01
We describe a smart surveillance strategy for handling novelty changes. Current sensors seem to keep everything, redundant or not. The Human Visual System's Hubel-Wiesel (wavelet) edge detection mechanism pays attention to changes in movement, which naturally produce organized sparseness because a stagnant edge is not reported to the brain's visual cortex by retinal neurons. Sparseness is defined as an ordered set of ones (movement or not) relative to zeros that could be pseudo-orthogonal among themselves, and thus suited for fault tolerant storage and retrieval by means of Associative Memory (AM). The firing is sparse at the change locations. Unlike purely random sparse masks adopted in medical Compressive Sensing, these organized ones have an additional benefit of using the image changes to make retrievable graphical indexes. We coined this organized sparseness Compressive Sampling: sensing but skipping over redundancy without altering the original image. Thus, we illustrate with video the survival tactics that animals roaming the Earth use daily. They acquire nothing but the space-time changes that are important to satisfy specific prey-predator relationships. We have noticed a similarity between mathematical Compressive Sensing and this biological mechanism used for survival. We have designed a hardware implementation of the Human Visual System's Compressive Sampling scheme. To speed up further, our mixed-signal circuit design of frame differencing is built into on-chip processing hardware. A CMOS trans-conductance amplifier is designed here to generate a linear current output using a pair of differential input voltages from two photon detectors for change detection, one for the previous value and the other for the subsequent value ("write" synaptic weight by Hebbian outer products; "read" by inner product and pointwise nonlinear threshold), to localize and track the threat targets.
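A minimal software analogue of the frame-differencing front end (the paper builds it in analog hardware): keep only pixels whose change exceeds a threshold, yielding an organized-sparse event mask:

    import numpy as np

    def change_mask(prev, curr, thresh=10):
        diff = np.abs(curr.astype(np.int16) - prev.astype(np.int16))
        return diff > thresh                          # sparse set of "firing" pixels

    rng = np.random.default_rng(5)
    prev = rng.integers(0, 200, (120, 160), dtype=np.uint8)  # headroom below 255
    curr = prev.copy(); curr[40:60, 70:90] += 30             # a moving object region
    coords = np.argwhere(change_mask(prev, curr))     # only change events are kept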
Correlated activity supports efficient cortical processing
Hung, Chou P.; Cui, Ding; Chen, Yueh-peng; Lin, Chia-pei; Levine, Matthew R.
2015-01-01
Visual recognition is a computational challenge that is thought to occur via efficient coding. An important concept is sparseness, a measure of coding efficiency. The prevailing view is that sparseness supports efficiency by minimizing redundancy and correlations in spiking populations. Yet, we recently reported that “choristers”, neurons that behave more similarly (have correlated stimulus preferences and spontaneous coincident spiking), carry more generalizable object information than uncorrelated neurons (“soloists”) in macaque inferior temporal (IT) cortex. The rarity of choristers (as low as 6% of IT neurons) indicates that they were likely missed in previous studies. Here, we report that correlation strength is distinct from sparseness (choristers are not simply broadly tuned neurons), that choristers are located in non-granular output layers, and that correlated activity predicts human visual search efficiency. These counterintuitive results suggest that a redundant correlational structure supports efficient processing and behavior. PMID:25610392
Polarization-independent tunable spectral slicing filter in Ti:LiNbO3.
Rabelo, Renato C; Eknoyan, Ohannes; Taylor, Henry F
2011-02-01
A two-port polarization-independent tunable spectral slicing filter at the 1530 nm wavelength regime is presented. The design utilizes an asymmetric interferometer with a sparse index grating along its arms. The sparse grating makes it possible to select equally spaced frequency channels from an incident WDM signal and to place nulls between them to coincide with the signal comb frequency. The number of selected channels and nulls between them depends on the number of coupling regions used in the sparse grating. The free spectral range depends on the spacing between the coupling regions. The Z-transform method is used to synthesize the filter and determine the spectral response. The operation of a device with six coupling regions is demonstrated, and good agreement with theoretical predictions is obtained. A 3 dB bandwidth of ∼1 nm and thermal tuning over a range of ∼13 nm are measured.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liu, Wenyang; Cheung, Yam; Sawant, Amit
2016-05-15
Purpose: To develop a robust and real-time surface reconstruction method on point clouds captured from a 3D surface photogrammetry system. Methods: The authors have developed a robust and fast surface reconstruction method on point clouds acquired by the photogrammetry system, without explicitly solving the partial differential equation required by a typical variational approach. Taking advantage of the overcomplete nature of the acquired point clouds, their method solves and propagates a sparse linear relationship from the point cloud manifold to the surface manifold, assuming both manifolds share similar local geometry. With relatively consistent point cloud acquisitions, the authors propose a sparse regression (SR) model to directly approximate the target point cloud as a sparse linear combination from the training set, assuming that the point correspondences built by the iterative closest point (ICP) are reasonably accurate and have residual errors following a Gaussian distribution. To accommodate changing noise levels and/or presence of inconsistent occlusions during the acquisition, the authors further propose a modified sparse regression (MSR) model to model the potentially large and sparse error built by ICP with a Laplacian prior. The authors evaluated the proposed method on both clinical point clouds acquired under consistent acquisition conditions and on point clouds with inconsistent occlusions. The authors quantitatively evaluated the reconstruction performance with respect to root-mean-squared error, by comparing reconstruction results against those from the variational method. Results: On clinical point clouds, both the SR and MSR models have achieved sub-millimeter reconstruction accuracy and reduced the reconstruction time by two orders of magnitude to a subsecond reconstruction time. On point clouds with inconsistent occlusions, the MSR model has demonstrated its advantage in achieving consistent and robust performance despite the introduced occlusions. Conclusions: The authors have developed a fast and robust surface reconstruction method on point clouds captured from a 3D surface photogrammetry system, with demonstrated sub-millimeter reconstruction accuracy and subsecond reconstruction time. It is suitable for real-time motion tracking in radiotherapy, with clear surface structures for better quantifications.
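A minimal sketch of the SR idea, assuming correspondences are already built and flattening each cloud to a vector; Lasso stands in for the authors' solver and all data are synthetic:

    import numpy as np
    from sklearn.linear_model import Lasso

    rng = np.random.default_rng(6)
    n_train, n_points = 40, 500
    train = rng.normal(size=(n_train, n_points * 3))  # training clouds, flattened xyz
    target = 0.7 * train[3] + 0.3 * train[17] \
             + 0.005 * rng.normal(size=n_points * 3)  # new acquisition

    fit = Lasso(alpha=1e-3, fit_intercept=False).fit(train.T, target)
    recon = train.T @ fit.coef_                       # reconstructed surface
    rmse = np.sqrt(np.mean((recon - target) ** 2))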
A Distributed Learning Method for ℓ1-Regularized Kernel Machine over Wireless Sensor Networks
Ji, Xinrong; Hou, Cuiqin; Hou, Yibin; Gao, Fang; Wang, Shulong
2016-01-01
In wireless sensor networks, centralized learning methods have very high communication costs and energy consumption. These are caused by the need to transmit scattered training examples from various sensor nodes to the central fusion center where a classifier or a regression machine is trained. To reduce the communication cost, a distributed learning method for a kernel machine that incorporates ℓ1 norm regularization (ℓ1-regularized) is investigated, and a novel distributed learning algorithm for the ℓ1-regularized kernel minimum mean squared error (KMSE) machine is proposed. The proposed algorithm relies on in-network processing and a collaboration that transmits the sparse model only between single-hop neighboring nodes. This paper evaluates the proposed algorithm with respect to the prediction accuracy, the sparse rate of model, the communication cost and the number of iterations on synthetic and real datasets. The simulation results show that the proposed algorithm can obtain approximately the same prediction accuracy as that obtained by the batch learning method. Moreover, it is significantly superior in terms of the sparse rate of model and communication cost, and it can converge with fewer iterations. Finally, an experiment conducted on a wireless sensor network (WSN) test platform further shows the advantages of the proposed algorithm with respect to communication cost. PMID:27376298
Multi-Source Multi-Target Dictionary Learning for Prediction of Cognitive Decline
Zhang, Jie; Li, Qingyang; Caselli, Richard J.; Thompson, Paul M.; Ye, Jieping; Wang, Yalin
2017-01-01
Alzheimer’s Disease (AD) is the most common type of dementia. Identifying correct biomarkers may determine pre-symptomatic AD subjects and enable early intervention. Recently, Multi-task sparse feature learning has been successfully applied to many computer vision and biomedical informatics researches. It aims to improve the generalization performance by exploiting the shared features among different tasks. However, most of the existing algorithms are formulated as a supervised learning scheme. Its drawback is with either insufficient feature numbers or missing label information. To address these challenges, we formulate an unsupervised framework for multi-task sparse feature learning based on a novel dictionary learning algorithm. To solve the unsupervised learning problem, we propose a two-stage Multi-Source Multi-Target Dictionary Learning (MMDL) algorithm. In stage 1, we propose a multi-source dictionary learning method to utilize the common and individual sparse features in different time slots. In stage 2, supported by a rigorous theoretical analysis, we develop a multi-task learning method to solve the missing label problem. Empirical studies on an N = 3970 longitudinal brain image data set, which involves 2 sources and 5 targets, demonstrate the improved prediction accuracy and speed efficiency of MMDL in comparison with other state-of-the-art algorithms. PMID:28943731
NASA Astrophysics Data System (ADS)
Humphries, T.; Winn, J.; Faridani, A.
2017-08-01
Recent work in CT image reconstruction has seen increasing interest in the use of total variation (TV) and related penalties to regularize problems involving reconstruction from undersampled or incomplete data. Superiorization is a recently proposed heuristic which provides an automatic procedure to ‘superiorize’ an iterative image reconstruction algorithm with respect to a chosen objective function, such as TV. Under certain conditions, the superiorized algorithm is guaranteed to find a solution that is as satisfactory as any found by the original algorithm with respect to satisfying the constraints of the problem; this solution is also expected to be superior with respect to the chosen objective. Most work on superiorization has used reconstruction algorithms which assume a linear measurement model, which in the case of CT corresponds to data generated from a monoenergetic x-ray beam. Many CT systems generate x-rays from a polyenergetic spectrum, however, in which the measured data represent an integral of object attenuation over all energies in the spectrum. This inconsistency with the linear model produces the well-known beam hardening artifacts, which impair analysis of CT images. In this work we superiorize an iterative algorithm for reconstruction from polyenergetic data, using both TV and an anisotropic TV (ATV) penalty. We apply the superiorized algorithm in numerical phantom experiments modeling both sparse-view and limited-angle scenarios. In our experiments, the superiorized algorithm successfully finds solutions which are as constraints-compatible as those found by the original algorithm, with significantly reduced TV and ATV values. The superiorized algorithm thus produces images with greatly reduced sparse-view and limited-angle artifacts, which are also largely free of the beam hardening artifacts that would be present if a superiorized version of a monoenergetic algorithm were used.
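A minimal sketch of superiorization under a monoenergetic (linear) model, which omits the paper's polyenergetic physics: a Landweber feasibility step interleaved with diminishing TV-descent perturbations:

    import numpy as np

    def tv_grad(u, eps=1e-8):                         # gradient of smoothed isotropic TV
        gx = np.zeros_like(u); gy = np.zeros_like(u)
        gx[:-1, :] = u[1:, :] - u[:-1, :]
        gy[:, :-1] = u[:, 1:] - u[:, :-1]
        m = np.sqrt(gx**2 + gy**2 + eps)
        g = -(gx + gy) / m
        g[1:, :] += gx[:-1, :] / m[:-1, :]
        g[:, 1:] += gy[:, :-1] / m[:, :-1]
        return g

    rng = np.random.default_rng(7)
    n = 32
    truth = np.zeros((n, n)); truth[8:24, 8:24] = 1.0
    A = rng.normal(size=(600, n * n)) / n             # stand-in for a sparse-view matrix
    b = A @ truth.ravel()

    x, beta = np.zeros(n * n), 1.0
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    for _ in range(200):
        g = tv_grad(x.reshape(n, n)).ravel()
        if np.linalg.norm(g) > 0:
            x -= beta * g / np.linalg.norm(g)         # superiorization perturbation
        beta *= 0.99                                  # summable perturbation sizes
        x += step * (A.T @ (b - A @ x))               # feasibility (Landweber) step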
Ziyatdinov, Andrey; Vázquez-Santiago, Miquel; Brunel, Helena; Martinez-Perez, Angel; Aschard, Hugues; Soria, Jose Manuel
2018-02-27
Quantitative trait locus (QTL) mapping in genetic data often involves analysis of correlated observations, which need to be accounted for to avoid false association signals. This is commonly performed by modeling such correlations as random effects in linear mixed models (LMMs). The R package lme4 is a well-established tool that implements major LMM features using sparse matrix methods; however, it is not fully adapted for QTL mapping association and linkage studies. In particular, two LMM features are lacking in the base version of lme4: the definition of random effects by custom covariance matrices; and parameter constraints, which are essential in advanced QTL models. Apart from applications in linkage studies of related individuals, such functionalities are of high interest for association studies in situations where multiple covariance matrices need to be modeled, a scenario not covered by many genome-wide association study (GWAS) software. To address the aforementioned limitations, we developed a new R package lme4qtl as an extension of lme4. First, lme4qtl contributes new models for genetic studies within a single tool integrated with lme4 and its companion packages. Second, lme4qtl offers a flexible framework for scenarios with multiple levels of relatedness and becomes efficient when covariance matrices are sparse. We showed the value of our package using real family-based data in the Genetic Analysis of Idiopathic Thrombophilia 2 (GAIT2) project. Our software lme4qtl enables QTL mapping models with a versatile structure of random effects and efficient computation for sparse covariances. lme4qtl is available at https://github.com/variani/lme4qtl .
Visual properties and memorising scenes: Effects of image-space sparseness and uniformity.
Lukavský, Jiří; Děchtěrenko, Filip
2017-10-01
Previous studies have demonstrated that humans have a remarkable capacity to memorise a large number of scenes. The research on memorability has shown that memory performance can be predicted by the content of an image. We explored how remembering an image is affected by the image properties within the context of the reference set, including the extent to which it is different from its neighbours (image-space sparseness) and if it belongs to the same category as its neighbours (uniformity). We used a reference set of 2,048 scenes (64 categories), evaluated pairwise scene similarity using deep features from a pretrained convolutional neural network (CNN), and calculated the image-space sparseness and uniformity for each image. We ran three memory experiments, varying the memory workload with experiment length and colour/greyscale presentation. We measured the sensitivity and criterion value changes as a function of image-space sparseness and uniformity. Across all three experiments, we found separate effects of 1) sparseness on memory sensitivity, and 2) uniformity on the recognition criterion. People better remembered (and correctly rejected) images that were more separated from others. People tended to make more false alarms and fewer miss errors in images from categorically uniform portions of the image-space. We propose that both image-space properties affect human decisions when recognising images. Additionally, we found that colour presentation did not yield better memory performance over grayscale images.
A model to predict stream water temperature across the conterminous USA
Catalina Segura; Peter Caldwell; Ge Sun; Steve McNulty; Yang Zhang
2014-01-01
Stream water temperature (ts) is a critical water quality parameter for aquatic ecosystems. However, ts records are sparse or nonexistent in many river systems. In this work, we present an empirical model to predict ts at the site scale across the USA. The model, derived using data from 171 reference sites selected from the Geospatial Attributes of Gages for Evaluating...
ERIC Educational Resources Information Center
Chermahini, Soghra Akbari; Hommel, Bernhard
2010-01-01
Human creativity has been claimed to rely on the neurotransmitter dopamine, but evidence is still sparse. We studied whether individual performance (N=117) in divergent thinking (alternative uses task) and convergent thinking (remote association task) can be predicted by the individual spontaneous eye blink rate (EBR), a clinical marker of…
Qi, Miao; Wang, Ting; Yi, Yugen; Gao, Na; Kong, Jun; Wang, Jianzhong
2017-04-01
Feature selection has been regarded as an effective tool to help researchers understand the generating process of data. For mining the synthesis mechanism of microporous AlPOs, this paper proposes a novel feature selection method by joint l2,1-norm and Fisher discrimination constraints (JNFDC). In order to obtain a more effective feature subset, the proposed method proceeds in two steps. The first step is to rank the features according to sparse and discriminative constraints. The second step is to establish a predictive model with the ranked features, and select the most significant features in light of their contribution to improving the predictive accuracy. To the best of our knowledge, JNFDC is the first work which employs sparse representation theory to explore the synthesis mechanism of six kinds of pore rings. Numerical simulations demonstrate that our proposed method can select significant features affecting the specified structural property and improve the predictive accuracy. Moreover, comparison results show that JNFDC can obtain better predictive performance than some other state-of-the-art feature selection methods. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
Compressed learning and its applications to subcellular localization.
Zheng, Zhong-Long; Guo, Li; Jia, Jiong; Xie, Chen-Mao; Zeng, Wen-Cai; Yang, Jie
2011-09-01
One of the main challenges faced by biological applications is to predict protein subcellular localization accurately in an automatic fashion. To achieve this, a wide variety of machine learning methods have been proposed in recent years. Most of them focus on finding the optimal classification scheme, and fewer take the simplification of the complexity of biological systems into account. Traditionally, such bio-data are analyzed by first performing feature selection before classification. Motivated by CS (Compressed Sensing) theory, we propose a methodology which performs compressed learning with a sparseness criterion such that feature selection and dimension reduction are merged into one analysis. The proposed methodology decreases the complexity of the biological system while increasing protein subcellular localization accuracy. Experimental results are quite encouraging, indicating that the aforementioned sparse methods are quite promising in dealing with complicated biological problems, such as predicting the subcellular localization of Gram-negative bacterial proteins.
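A minimal sketch of the compressed-learning pipeline on synthetic stand-ins for protein feature vectors, with a Gaussian rather than binary sensing matrix:

    import numpy as np
    from sklearn.random_projection import GaussianRandomProjection
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(8)
    X = rng.normal(size=(600, 2000))                  # high-dimensional features
    w = np.zeros(2000); w[:20] = 1.0                  # few informative dimensions
    y = (X @ w + 0.5 * rng.normal(size=600) > 0).astype(int)

    Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
    proj = GaussianRandomProjection(n_components=200, random_state=0).fit(Xtr)
    clf = LogisticRegression(max_iter=2000).fit(proj.transform(Xtr), ytr)
    acc = clf.score(proj.transform(Xte), yte)         # selection + reduction merged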
NASA Technical Reports Server (NTRS)
Rajkumar, T.; Aragon, Cecilia; Bardina, Jorge; Britten, Roy
2002-01-01
A fast, reliable way of predicting aerodynamic coefficients is produced using a neural network optimized by a genetic algorithm. Basic aerodynamic coefficients (e.g. lift, drag, pitching moment) are modelled as functions of angle of attack and Mach number. The neural network is first trained on a relatively rich set of data from wind tunnel tests or numerical simulations to learn an overall model. Most of the aerodynamic parameters can be well-fitted using polynomial functions. A new set of data, which can be relatively sparse, is then supplied to the network to produce a new model consistent with the previous model and the new data. Because the new model interpolates realistically between the sparse test data points, it is suitable for use in piloted simulations. The genetic algorithm is used to choose a neural network architecture to give best results, avoiding over- and under-fitting of the test data.
Principles for problem aggregation and assignment in medium scale multiprocessors
NASA Technical Reports Server (NTRS)
Nicol, David M.; Saltz, Joel H.
1987-01-01
One of the most important issues in parallel processing is the mapping of workload to processors. This paper considers a large class of problems having a high degree of potential fine grained parallelism, and execution requirements that are either not predictable, or are too costly to predict. The main issues in mapping such a problem onto medium scale multiprocessors are those of aggregation and assignment. We study a method of parameterized aggregation that makes few assumptions about the workload. The mapping of aggregate units of work onto processors is uniform, and exploits locality of workload intensity to balance the unknown workload. In general, a finer aggregate granularity leads to a better balance at the price of increased communication/synchronization costs; the aggregation parameters can be adjusted to find a reasonable granularity. The effectiveness of this scheme is demonstrated on three model problems: an adaptive one-dimensional fluid dynamics problem with message passing, a sparse triangular linear system solver on both a shared memory and a message-passing machine, and a two-dimensional time-driven battlefield simulation employing message passing. Using the model problems, the tradeoffs are studied between balanced workload and the communication/synchronization costs. Finally, an analytical model is used to explain why the method balances workload and minimizes the variance in system behavior.
Data Structures for Extreme Scale Computing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kahan, Simon
As computing problems of national importance grow, the government meets the increased demand by funding the development of ever larger systems. The overarching goal of the work supported in part by this grant is to increase efficiency of programming and performing computations on these large computing systems. In past work, we have demonstrated that some of these computations once thought to require expensive hardware designs and/or complex, special-purpose programming may be executed efficiently on low-cost commodity cluster computing systems using a general-purpose "latency-tolerant" programming framework. One important developed application of the ideas underlying this framework is graph database technology supporting social network pattern matching used by US intelligence agencies to more quickly identify potential terrorist threats. This database application has been spun out by the Pacific Northwest National Laboratory, a Department of Energy Laboratory, into a commercial start-up, Trovares Inc. We explore an alternative application of the same underlying ideas to a well-studied challenge arising in engineering: solving unstructured sparse linear equations. Solving these equations is key to predicting the behavior of large electronic circuits before they are fabricated. Predicting that behavior ahead of fabrication means that designs can be optimized and errors corrected ahead of the expense of manufacture.
Seismoelectric data processing for surface surveys of shallow targets
Haines, S.S.; Guitton, A.; Biondi, B.
2007-01-01
The utility of the seismoelectric method relies on the development of methods to extract the signal of interest from background and source-generated coherent noise that may be several orders of magnitude stronger. We compare data processing approaches to develop a sequence of preprocessing and signal/noise separation and to quantify the noise level from which we can extract signal events. Our preferred sequence begins with the removal of power line harmonic noise and the use of frequency filters to minimize random and source-generated noise. Mapping to the linear Radon domain with an inverse process incorporating a sparseness constraint provides good separation of signal from noise, though it is ineffective on noise that shows the same dip as the signal. Similarly, the seismoelectric signal and noise do not separate cleanly in the Fourier domain, so f-k filtering cannot remove all of the source-generated noise and it also disrupts signal amplitude patterns. We find that prediction-error filters provide the most effective method to separate signal and noise, while also preserving amplitude information, assuming that adequate pattern models can be determined for the signal and noise. These Radon-domain and prediction-error-filter methods successfully separate signal from noise up to 33 dB stronger in our test data. © 2007 Society of Exploration Geophysicists.
Uncertainty and Sensitivity Analysis of Afterbody Radiative Heating Predictions for Earth Entry
NASA Technical Reports Server (NTRS)
West, Thomas K., IV; Johnston, Christopher O.; Hosder, Serhat
2016-01-01
The objective of this work was to perform sensitivity analysis and uncertainty quantification for afterbody radiative heating predictions of the Stardust capsule during Earth entry at peak afterbody radiation conditions. The radiation environment in the afterbody region poses significant challenges for accurate uncertainty quantification and sensitivity analysis due to the complexity of the flow physics, computational cost, and large number of uncertain variables. In this study, first a sparse collocation non-intrusive polynomial chaos approach along with global non-linear sensitivity analysis was used to identify the most significant uncertain variables and reduce the dimensions of the stochastic problem. Then, a total order stochastic expansion was constructed over only the important parameters for an efficient and accurate estimate of the uncertainty in radiation. Based on previous work, 388 uncertain parameters were considered in the radiation model, which came from the thermodynamics, flow field chemistry, and radiation modeling. The sensitivity analysis showed that only four of these variables contributed significantly to afterbody radiation uncertainty, accounting for almost 95% of the uncertainty. These included the electronic-impact excitation rate for N between level 2 and level 5 and rates of three chemical reactions influencing N, N(+), O, and O(+) number densities in the flow field.
Three-Dimensional Nacelle Aeroacoustics Code With Application to Impedance Eduction
NASA Technical Reports Server (NTRS)
Watson, Willie R.
2000-01-01
A three-dimensional nacelle acoustics code that accounts for uniform mean flow and variable surface impedance liners is developed. The code is linked to a commercial version of the NASA-developed General Purpose Solver (for solution of linear systems of equations) in order to obtain the capability to study high frequency waves that may require millions of grid points for resolution. Detailed, single-processor statistics for the performance of the solver in rigid and soft-wall ducts are presented. Over the range of frequencies of current interest in nacelle liner research, noise attenuation levels predicted from the code were in excellent agreement with those predicted from mode theory. The equation solver is memory efficient, requiring only a small fraction of the memory available on modern computers. As an application, the code is combined with an optimization algorithm and used to educe the impedance spectrum of a ceramic liner. The primary problem with using the code to perform optimization studies at frequencies above 1 kHz is the excessive CPU time (a major portion of which is matrix assembly). The study recommends that research be directed toward development of a rapid sparse assembler and exploitation of the multiprocessor capability of the solver to further reduce CPU time.
3-D in vitro estimation of temperature using the change in backscattered ultrasonic energy.
Arthur, R Martin; Basu, Debomita; Guo, Yuzheng; Trobaugh, Jason W; Moros, Eduardo G
2010-08-01
Temperature imaging with a non-invasive modality to monitor the heating of tumors during hyperthermia treatment is an attractive alternative to sparse invasive measurement. Previously, we predicted monotonic changes in backscattered energy (CBE) of ultrasound with temperature for certain sub-wavelength scatterers. We also measured CBE values similar to our predictions in bovine liver, turkey breast muscle, and pork rib muscle in 2-D in vitro studies and in nude mice during 2-D in vivo studies. To extend these studies to three dimensions, we compensated for motion and measured CBE in turkey breast muscle. 3-D data sets were assembled from images formed by a phased-array imager with a 7.5-MHz linear probe moved in 0.6-mm steps in elevation during uniform heating from 37 to 45 degrees C in 0.5 degrees C increments. We used cross-correlation as a similarity measure in RF signals to automatically track feature displacement as a function of temperature. Feature displacement was non-rigid. Envelopes of image regions, compensated for non-rigid motion, were found with the Hilbert transform then smoothed with a 3 x 3 running average filter before forming the backscattered energy at each pixel. CBE in 3-D motion-compensated images was nearly linear with an average sensitivity of 0.30 dB/degree C. 3-D estimation of temperature in separate tissue regions had errors with a maximum standard deviation of about 0.5 degrees C over 1-cm³ volumes. Success of CBE temperature estimation based on 3-D non-rigid tracking and compensation for real and apparent motion of image features could serve as the foundation for the eventual generation of 3-D temperature maps in soft tissue in a non-invasive, convenient, and low-cost way in clinical hyperthermia.
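A minimal sketch of the backscattered-energy computation on one already motion-compensated RF frame pair; the frames and the 5% amplitude change are synthetic:

    import numpy as np
    from scipy.signal import hilbert
    from scipy.ndimage import uniform_filter

    def backscattered_energy(rf):
        env = np.abs(hilbert(rf, axis=0))             # envelope along each A-line
        return uniform_filter(env**2, size=3)         # 3x3 running-average smoothing

    rng = np.random.default_rng(9)
    rf_ref = rng.normal(size=(256, 128))              # reference frame at 37 degrees C
    rf_hot = 1.05 * rf_ref                            # toy frame at higher temperature
    cbe_db = 10 * np.log10(backscattered_energy(rf_hot)
                           / backscattered_energy(rf_ref))
    # dividing mean CBE by the ~0.30 dB/degree C sensitivity estimates the change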
M-step preconditioned conjugate gradient methods
NASA Technical Reports Server (NTRS)
Adams, L.
1983-01-01
Preconditioned conjugate gradient methods for solving sparse symmetric and positive definite systems of linear equations are described. Necessary and sufficient conditions are given for when these preconditioners can be used, and an analysis of their effectiveness is given. Efficient computer implementations of these methods are discussed, and results on the CYBER 203 and the Finite Element Machine under construction at NASA Langley Research Center are included.
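A minimal sketch of the idea, assuming an m-step Jacobi-type preconditioner (the report's analysis covers when such preconditioners are admissible); the tridiagonal SPD test matrix is illustrative:

    import numpy as np

    def m_step_jacobi(A, r, m):                       # approximately solve M z = r
        d = np.diag(A)
        z = r / d                                     # first sweep from z = 0
        for _ in range(m - 1):
            z = z + (r - A @ z) / d
        return z

    def pcg(A, b, m=3, tol=1e-10, maxit=500):
        x = np.zeros_like(b)
        r = b - A @ x
        z = m_step_jacobi(A, r, m)
        p, rz = z.copy(), r @ z
        for _ in range(maxit):
            Ap = A @ p
            alpha = rz / (p @ Ap)
            x += alpha * p
            r -= alpha * Ap
            if np.linalg.norm(r) < tol:
                break
            z = m_step_jacobi(A, r, m)
            rz_new = r @ z
            p = z + (rz_new / rz) * p
            rz = rz_new
        return x

    n = 100                                           # 1D Laplacian, SPD
    A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    x = pcg(A, np.ones(n))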
NASA Astrophysics Data System (ADS)
Zeng, X.
2015-12-01
A large number of model executions are required to obtain alternative conceptual models' predictions and their posterior probabilities in Bayesian model averaging (BMA). The posterior model probability is estimated through a model's marginal likelihood and prior probability. The heavy computation burden hinders the implementation of BMA prediction, especially for an elaborate marginal likelihood estimator. To overcome the computation burden of BMA, an adaptive sparse grid (SG) stochastic collocation method is used to build surrogates for alternative conceptual models through a numerical experiment on a synthetic groundwater model. BMA predictions depend on model posterior weights (or marginal likelihoods), and this study also evaluated four marginal likelihood estimators: the arithmetic mean estimator (AME), harmonic mean estimator (HME), stabilized harmonic mean estimator (SHME), and thermodynamic integration estimator (TIE). The results demonstrate that TIE is accurate in estimating conceptual models' marginal likelihoods, and BMA-TIE has better predictive performance than the other BMA predictions. TIE estimates a conceptual model's marginal likelihood with high stability: marginal likelihoods repeatedly estimated by TIE show significantly less variability than those from the other estimators. In addition, the SG surrogates efficiently facilitate BMA predictions, especially for BMA-TIE. The number of model executions needed for building surrogates is 4.13%, 6.89%, 3.44%, and 0.43% of the model executions required by BMA-AME, BMA-HME, BMA-SHME, and BMA-TIE, respectively.
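For two of the four estimators the comparison is easy to state in code; a minimal sketch in log space with toy log-likelihood draws (TIE and SHME need tempered chains and are omitted):

    import numpy as np
    from scipy.special import logsumexp

    def ame(loglik_prior):                            # E_prior[L], averaged in log space
        return logsumexp(loglik_prior) - np.log(len(loglik_prior))

    def hme(loglik_post):                             # 1 / E_post[1 / L]
        return -(logsumexp(-loglik_post) - np.log(len(loglik_post)))

    rng = np.random.default_rng(10)
    print(ame(-0.5 * rng.chisquare(5, 10000)),        # toy draws, prior samples
          hme(-0.5 * rng.chisquare(2, 10000)))        # toy draws, posterior samples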
Energy budgets and resistances to energy transport in sparsely vegetated rangeland
Nichols, W.D.
1992-01-01
Partitioning available energy between plants and bare soil in sparsely vegetated rangelands will allow hydrologists and others to gain a greater understanding of water use by native vegetation, especially phreatophytes. Standard methods of conducting energy budget studies result in measurements of latent and sensible heat fluxes above the plant canopy which therefore include the energy fluxes from both the canopy and the soil. One-dimensional theoretical numerical models have been proposed recently for the partitioning of energy in sparse crops. Bowen ratio and other micrometeorological data collected over phreatophytes growing in areas of shallow ground water in central Nevada were used to evaluate the feasibility of using these models, which are based on surface and within-canopy aerodynamic resistances, to determine heat and water vapor transport in sparsely vegetated rangelands. The models appear to provide reasonably good estimates of sensible heat flux from the soil and latent heat flux from the canopy. Estimates of latent heat flux from the soil were less satisfactory. Sensible heat flux from the canopy was not well predicted by the present resistance formulations. Also, estimates of total above-canopy fluxes were not satisfactory when using a single value for above-canopy bulk aerodynamic resistance. © 1992.
Brodsky, V Y; Malchenko, L A; Konchenko, D S; Zvezdina, N D; Dubovaya, T K
2016-08-01
Primary cultures of rat hepatocytes were studied in serum-free media. Ultradian protein synthesis rhythm was used as a marker of cell synchronization in the population. Addition of glutamic acid (0.2 mg/ml) to the medium of nonsynchronous sparse cultures resulted in detection of a common protein synthesis rhythm, hence in synchronization of the cells. The antagonist of glutamic acid metabotropic receptors MCPG (0.01 mg/ml), added together with glutamic acid, abolished the synchronization effect; in sparse cultures, no rhythm was detected. Feeding rats with glutamic acid (30 mg with food) resulted in protein synthesis rhythm in sparse cultures obtained from the rats. After feeding without glutamic acid, linear kinetics of protein synthesis was revealed. Thus, glutamic acid, acting as a non-neural transmitter present in blood, can synchronize the activity of hepatocytes and form a common rhythm of protein synthesis in vitro and in vivo. This effect is realized via receptors. Mechanisms of cell-cell communication are discussed in the context of the non-neural functions of neurotransmitters. Glutamic acid is used clinically in humans; hence, a previously unknown function of this drug is revealed.
FDD Massive MIMO Channel Estimation With Arbitrary 2D-Array Geometry
NASA Astrophysics Data System (ADS)
Dai, Jisheng; Liu, An; Lau, Vincent K. N.
2018-05-01
This paper addresses the problem of downlink channel estimation in frequency-division duplexing (FDD) massive multiple-input multiple-output (MIMO) systems. The existing methods usually exploit hidden sparsity under a discrete Fourier transform (DFT) basis to estimate the downlink channel. However, there are at least two shortcomings of these DFT-based methods: 1) they are applicable to uniform linear arrays (ULAs) only, since the DFT basis requires a special structure of ULAs, and 2) they always suffer from a performance loss due to the leakage of energy over some DFT bins. To deal with the above shortcomings, we introduce an off-grid model for downlink channel sparse representation with arbitrary 2D-array antenna geometry, and propose an efficient sparse Bayesian learning (SBL) approach for the sparse channel recovery and off-grid refinement. The main idea of the proposed off-grid method is to consider the sampled grid points as adjustable parameters. Utilizing an inexact block majorization-minimization (MM) algorithm, the grid points are refined iteratively to minimize the off-grid gap. Finally, we further extend the solution to uplink-aided channel estimation by exploiting the angular reciprocity between downlink and uplink channels, which brings enhanced recovery performance.
Regression-based adaptive sparse polynomial dimensional decomposition for sensitivity analysis
NASA Astrophysics Data System (ADS)
Tang, Kunkun; Congedo, Pietro; Abgrall, Remi
2014-11-01
Polynomial dimensional decomposition (PDD) is employed in this work for global sensitivity analysis and uncertainty quantification of stochastic systems subject to a large number of random input variables. Due to the intimate structure between PDD and Analysis-of-Variance, PDD is able to provide a simpler and more direct evaluation of the Sobol' sensitivity indices, when compared to polynomial chaos (PC). Unfortunately, the number of PDD terms grows exponentially with respect to the size of the input random vector, which makes the computational cost of the standard method unaffordable for real engineering applications. In order to address this curse of dimensionality, this work proposes a variance-based adaptive strategy aiming to build a cheap meta-model by sparse-PDD with PDD coefficients computed by regression. During this adaptive procedure, the model representation by PDD only contains few terms, so that the cost to resolve repeatedly the linear system of the least-square regression problem is negligible. The size of the final sparse-PDD representation is much smaller than the full PDD, since only significant terms are eventually retained. Consequently, far fewer calls to the deterministic model are required to compute the final PDD coefficients.
An embedded system for face classification in infrared video using sparse representation
NASA Astrophysics Data System (ADS)
Saavedra M., Antonio; Pezoa, Jorge E.; Zarkesh-Ha, Payman; Figueroa, Miguel
2017-09-01
We propose a platform for robust face recognition in Infrared (IR) images using Compressive Sensing (CS). In line with CS theory, the classification problem is solved using a sparse representation framework, where test images are modeled by means of a linear combination of the training set. Because the training set constitutes an over-complete dictionary, we identify new images by finding their sparsest representation based on the training set, using standard l1-minimization algorithms. Unlike conventional face-recognition algorithms, feature extraction is performed using random projections with a precomputed binary matrix, as proposed in the CS literature. This random sampling reduces the effects of noise and occlusions such as facial hair, eyeglasses, and disguises, which are notoriously challenging in IR images. Thus, the performance of our framework is robust to these noise and occlusion factors, achieving an average accuracy of approximately 90% when the UCHThermalFace database is used for training and testing purposes. We implemented our framework on a high-performance embedded digital system, where the computation of the sparse representation of IR images was performed by dedicated hardware using a deeply pipelined architecture on a Field-Programmable Gate Array (FPGA).
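A minimal sketch of the sparse-representation classification rule itself, on synthetic data and with Lasso as the l1 solver: code the test vector over the training dictionary, then pick the class with the smallest class-restricted residual:

    import numpy as np
    from sklearn.linear_model import Lasso

    rng = np.random.default_rng(11)
    n_classes, per_class, dim = 5, 8, 120
    labels = np.repeat(np.arange(n_classes), per_class)
    D = rng.normal(size=(dim, n_classes * per_class))
    D /= np.linalg.norm(D, axis=0)                    # unit-norm dictionary atoms

    test = D[:, labels == 2] @ rng.uniform(0.2, 1.0, per_class)  # from class 2

    coef = Lasso(alpha=0.01, fit_intercept=False).fit(D, test).coef_
    resid = [np.linalg.norm(test - D[:, labels == c] @ coef[labels == c])
             for c in range(n_classes)]
    pred = int(np.argmin(resid))                      # expected: 2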
NASA Technical Reports Server (NTRS)
Oliker, Leonid; Heber, Gerd; Biswas, Rupak
2000-01-01
The Conjugate Gradient (CG) algorithm is perhaps the best-known iterative technique to solve sparse linear systems that are symmetric and positive definite. A sparse matrix-vector multiply (SPMV) usually accounts for most of the floating-point operations within a CG iteration. In this paper, we investigate the effects of various ordering and partitioning strategies on the performance of parallel CG and SPMV using different programming paradigms and architectures. Results show that for this class of applications, ordering significantly improves overall performance, that cache reuse may be more important than reducing communication, and that it is possible to achieve message passing performance using shared memory constructs through careful data ordering and distribution. However, a multi-threaded implementation of CG on the Tera MTA does not require special ordering or partitioning to obtain high efficiency and scalability.
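One concrete ordering strategy of the kind studied, sketched with SciPy: reverse Cuthill-McKee reduces matrix bandwidth, which tends to improve cache reuse in the SPMV kernel (the sizes below are illustrative):

    import numpy as np
    import scipy.sparse as sp
    from scipy.sparse.csgraph import reverse_cuthill_mckee

    rng = np.random.default_rng(12)
    A = sp.random(2000, 2000, density=0.002, random_state=0)
    A = ((A + A.T) > 0).astype(float).tocsr()         # symmetric sparsity pattern

    perm = reverse_cuthill_mckee(A, symmetric_mode=True)
    B = A[perm][:, perm]                              # reordered matrix

    def bandwidth(M):
        coo = M.tocoo()
        return np.max(np.abs(coo.row - coo.col))

    print(bandwidth(A), bandwidth(B))                 # bandwidth drops after RCM
    y = B @ rng.normal(size=2000)                     # the SPMV kernel itself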
Improved parallel data partitioning by nested dissection with applications to information retrieval.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wolf, Michael M.; Chevalier, Cedric; Boman, Erik Gunnar
The computational work in many information retrieval and analysis algorithms is based on sparse linear algebra. Sparse matrix-vector multiplication is a common kernel in many of these computations. Thus, an important related combinatorial problem in parallel computing is how to distribute the matrix and the vectors among processors so as to minimize the communication cost. We focus on minimizing the total communication volume while keeping the computation balanced across processes. In [1], the first two authors presented a new 2D partitioning method, the nested dissection partitioning algorithm. In this paper, we improve on that algorithm and show that it is a good option for data partitioning in information retrieval. We also show partitioning time can be substantially reduced by using the SCOTCH software, and quality improves in some cases, too.
Incremental Structured Dictionary Learning for Video Sensor-Based Object Tracking
Xue, Ming; Yang, Hua; Zheng, Shibao; Zhou, Yi; Yu, Zhenghua
2014-01-01
To tackle robust object tracking for video sensor-based applications, an online discriminative algorithm based on incremental discriminative structured dictionary learning (IDSDL-VT) is presented. In our framework, a discriminative dictionary combining positive, negative, and trivial patches is designed to sparsely represent the overlapped target patches. Then, a local update (LU) strategy is proposed for sparse coefficient learning. To formulate the training and classification process, a multiple linear classifier group based on a K-combined voting (KCV) function is proposed. As the dictionary evolves, the classifiers are retrained to adapt to target appearance variations in a timely manner. Qualitative and quantitative evaluations on challenging image sequences, compared with state-of-the-art algorithms, demonstrate that the proposed tracking algorithm achieves more favorable performance. We also illustrate its relay application in visual sensor networks. PMID:24549252
Evaluation of generalized degrees of freedom for sparse estimation by replica method
NASA Astrophysics Data System (ADS)
Sakata, A.
2016-12-01
We develop a method to evaluate the generalized degrees of freedom (GDF) for linear regression with sparse regularization. The GDF is a key factor in model selection, and thus its evaluation is useful in many modelling applications. An analytical expression for the GDF is derived using the replica method in the large-system-size limit with random Gaussian predictors. The resulting formula has a universal form that is independent of the type of regularization, providing us with a simple interpretation. Within the framework of replica symmetric (RS) analysis, GDF has a physical meaning as the effective fraction of non-zero components. The validity of our method in the RS phase is supported by the consistency of our results with previous mathematical results. The analytical results in the RS phase are calculated numerically using the belief propagation algorithm.
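As a concrete special case of the GDF, for the Lasso the generalized degrees of freedom is known to equal the expected number of non-zero coefficients (Zou, Hastie, and Tibshirani, 2007), consistent with the "effective fraction of non-zero components" interpretation above. A quick numerical check with illustrative sizes:

    import numpy as np
    from sklearn.linear_model import Lasso

    rng = np.random.default_rng(1)
    n, p = 200, 50
    X = rng.normal(size=(n, p))           # random Gaussian predictors, as in the abstract
    beta = np.zeros(p); beta[:5] = 2.0    # sparse ground truth
    y = X @ beta + rng.normal(size=n)

    fit = Lasso(alpha=0.1).fit(X, y)
    df_estimate = np.sum(fit.coef_ != 0)  # GDF estimate for the Lasso: number of non-zeros
    print("estimated generalized degrees of freedom:", df_estimate)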
Source term identification in atmospheric modelling via sparse optimization
NASA Astrophysics Data System (ADS)
Adam, Lukas; Branda, Martin; Hamburger, Thomas
2015-04-01
Inverse modelling plays an important role in identifying the amount of harmful substances released into the atmosphere during major incidents such as power plant accidents or volcanic eruptions. Another possible application of inverse modelling lies in monitoring CO2 emission limits, where only observations at certain places are available and the task is to estimate the total releases at given locations. This gives rise to minimizing the discrepancy between the observations and the model predictions. There are two standard ways of solving such problems. In the first one, this discrepancy is regularized by adding additional terms. Such terms may include Tikhonov regularization, distance from a priori information, or a smoothing term. The resulting, usually quadratic, problem is then solved via standard optimization solvers. The second approach assumes that the error term has a (normal) distribution and makes use of Bayesian modelling to identify the source term. Instead of following the above-mentioned approaches, we utilize techniques from the field of compressive sensing. Such techniques look for the sparsest solution (the solution with the smallest number of nonzeros) of a linear system, where a maximal allowed error term may be added to this system. Even though this field is well developed, with many possible solution techniques, most of them do not consider even the simplest constraints which are naturally present in atmospheric modelling. One such example is the nonnegativity of release amounts. We believe that the concept of a sparse solution is natural in both the problem of identifying the source location and that of identifying the time process of the source release. In the first case, it is usually assumed that there are only a few release points and the task is to find them. In the second case, the time window is usually much longer than the duration of the actual release. In both cases, the optimal solution should contain a large number of zeros, giving rise to the concept of sparsity. In the paper, we summarize several optimization techniques used for finding sparse solutions and propose their modifications to handle selected constraints such as nonnegativity constraints and simple linear constraints, for example on the minimal or maximal amount of total release. These techniques range from successive convex approximations to the solution of one nonconvex problem. On simple examples, we explain these techniques and compare them in terms of implementation simplicity, approximation capability, and convergence properties. Finally, these methods are applied to the European Tracer Experiment (ETEX) data and the results are compared with state-of-the-art techniques such as regularized least squares or the Bayesian approach. The obtained results show that these techniques perform surprisingly well. This research is supported by the EEA/Norwegian Financial Mechanism under project 7F14287 STRADI.
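A minimal sketch of sparse recovery with the nonnegativity constraint discussed above, using scikit-learn's Lasso with positive=True; the source-receptor matrix and release profile are synthetic stand-ins for an atmospheric transport model:

    import numpy as np
    from sklearn.linear_model import Lasso

    rng = np.random.default_rng(2)

    # y = G @ s + noise: observations at a few receptors, many candidate release times.
    n_obs, n_times = 40, 200
    G = np.abs(rng.normal(size=(n_obs, n_times)))   # synthetic source-receptor sensitivities
    s_true = np.zeros(n_times)
    s_true[[50, 51, 52]] = [3.0, 5.0, 2.0]          # short release within a long time window
    y = G @ s_true + 0.05 * rng.normal(size=n_obs)

    # l1-regularized least squares with nonnegative release amounts.
    model = Lasso(alpha=0.01, positive=True, fit_intercept=False, max_iter=50000).fit(G, y)
    print("non-zero release times:", np.flatnonzero(model.coef_))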
A Galerkin discretisation-based identification for parameters in nonlinear mechanical systems
NASA Astrophysics Data System (ADS)
Liu, Zuolin; Xu, Jian
2018-04-01
In the paper, a new parameter identification method is proposed for mechanical systems. Based on the idea of the Galerkin finite-element method, the displacement over the time history is approximated by piecewise linear functions, and the second-order terms in the model equation are eliminated by integration by parts. In this way, a loss function in integral form is derived. Unlike existing methods, this loss function is a quadratic sum of integrals over the whole time history. The loss function can then be minimised with the traditional least-squares algorithm for linear systems, or its iterative counterpart for nonlinear ones. The method can effectively identify parameters in linear and arbitrary nonlinear mechanical systems. Simulation results show that even under conditions of sparse data or low sampling frequency, the method still guarantees high accuracy in identifying linear and nonlinear parameters.
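The following sketch illustrates the general idea of casting parameter identification as linear least squares on a damped oscillator. Note that it uses finite-difference derivatives rather than the paper's Galerkin weak form, so it is a simplified stand-in, and all parameter values are invented:

    import numpy as np

    # Simulate a damped oscillator x'' + c x' + k x = f(t) with c=0.4, k=4.0.
    dt, T = 0.001, 10.0
    t = np.arange(0, T, dt)
    f = np.sin(1.3 * t)
    c_true, k_true = 0.4, 4.0
    x = np.zeros_like(t); v = 0.0
    for i in range(len(t) - 1):          # symplectic Euler integration of the ODE
        a = f[i] - c_true * v - k_true * x[i]
        v += dt * a
        x[i + 1] = x[i] + dt * v

    # Least-squares identification: [x' x] @ [c k]^T = f - x''.
    xd = np.gradient(x, dt)
    xdd = np.gradient(xd, dt)
    A = np.column_stack([xd, x])
    theta, *_ = np.linalg.lstsq(A, f - xdd, rcond=None)
    print("identified c, k:", theta)     # close to (0.4, 4.0)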
Prediction of brain maturity based on cortical thickness at different spatial resolutions.
Khundrakpam, Budhachandra S; Tohka, Jussi; Evans, Alan C
2015-05-01
Several studies using magnetic resonance imaging (MRI) scans have shown developmental trajectories of cortical thickness. Cognitive milestones happen concurrently with these structural changes, and a delay in such changes has been implicated in developmental disorders such as attention-deficit/hyperactivity disorder (ADHD). Accurate estimation of individuals' brain maturity, therefore, is critical in establishing a baseline for normal brain development against which neurodevelopmental disorders can be assessed. In this study, cortical thickness derived from structural MRI scans of a large longitudinal dataset of normally growing children and adolescents (n=308) was used to build a highly accurate predictive model for estimating chronological age (cross-validated correlation up to R=0.84). Unlike previous studies, which used kernelized approaches in building prediction models, we used an elastic net penalized linear regression model capable of producing a spatially sparse, yet accurate predictive model of chronological age. Upon investigating different scales of cortical parcellation from 78 to 10,240 brain parcels, we observed that the accuracy of the estimated age improved with increased spatial scale of brain parcellation, with the best estimations obtained for spatial resolutions of 2560 and 10,240 brain parcels. The top predictors of brain maturity were found in highly localized sensorimotor and association areas. The results of our study demonstrate that cortical thickness can be used to estimate individuals' brain maturity with high accuracy, and the estimated ages relate to functional and behavioural measures, underscoring the relevance and scope of the study in the understanding of biological maturity. Copyright © 2015 Elsevier Inc. All rights reserved.
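A minimal sketch of the elastic-net strategy described above, with synthetic stand-ins for cortical-thickness features and chronological age (the sizes echo the study but the data are random):

    import numpy as np
    from sklearn.linear_model import ElasticNetCV
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(3)
    n_subj, n_parcels = 308, 2560
    X = rng.normal(size=(n_subj, n_parcels))              # cortical thickness per parcel
    w = np.zeros(n_parcels)
    w[rng.choice(n_parcels, 40, replace=False)] = rng.normal(size=40)
    age = X @ w + rng.normal(scale=0.5, size=n_subj)      # synthetic "chronological age"

    # The elastic-net penalty yields a spatially sparse, linear predictive model.
    model = ElasticNetCV(l1_ratio=0.5, cv=5, n_alphas=20)
    r2 = cross_val_score(model, X, age, cv=5, scoring='r2')
    print("cross-validated R^2:", r2.mean())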
NASA Technical Reports Server (NTRS)
Park, Sang C.; Carnahan, Timothy M.; Cohen, Lester M.; Congedo, Cherie B.; Eisenhower, Michael J.; Ousley, Wes; Weaver, Andrew; Yang, Kan
2017-01-01
The JWST Optical Telescope Element (OTE) assembly is the largest optically stable infrared-optimized telescope currently being manufactured and assembled, and is scheduled for launch in 2018. The JWST OTE, including the 18-segment primary mirror, secondary mirror, and the Aft Optics Subsystem (AOS), is designed to be passively cooled and to operate near 45 K. These optical elements are supported by a complex composite backplane structure. As part of the structural distortion model validation efforts, a series of tests is planned during the cryogenic vacuum test of the fully integrated flight hardware at NASA JSC Chamber A. Successful completion of the thermal-distortion test phases depends heavily on accurate temperature knowledge of the OTE structural members. However, the current temperature sensor allocations during the cryo-vac test may not have sufficient fidelity to provide accurate knowledge of the temperature distributions within the composite structure. A method based on an inverse distance relationship among the sensors and thermal model nodes was developed to improve the thermal data provided for the nanometer-scale WaveFront Error (WFE) predictions. The Linear Distance Weighted Interpolation (LDWI) method was developed to augment the thermal model predictions based on the sparse sensor information. This paper covers the development of the LDWI method using test data from the earlier pathfinder cryo-vac tests, along with notional and as-tested WFE predictions from structural finite-element model cases that characterize the accuracy of the method.
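A minimal sketch of inverse-distance-weighted interpolation of the general kind LDWI builds on, estimating temperatures at thermal-model nodes from sparse sensors; the coordinates, distance exponent, and synthetic field are illustrative, not the flight method's specifics:

    import numpy as np

    def idw(sensor_xyz, sensor_T, node_xyz, power=1.0, eps=1e-9):
        """Inverse-distance-weighted temperature estimate at each model node."""
        d = np.linalg.norm(node_xyz[:, None, :] - sensor_xyz[None, :, :], axis=2)
        w = 1.0 / (d + eps) ** power       # closer sensors get larger weights
        return (w @ sensor_T) / w.sum(axis=1)

    rng = np.random.default_rng(4)
    sensors = rng.uniform(0, 1, (12, 3))   # sparse sensor locations
    T = 45.0 + 2.0 * sensors[:, 0]         # synthetic temperature field near 45 K
    nodes = rng.uniform(0, 1, (1000, 3))   # thermal-model node locations
    print(idw(sensors, T, nodes)[:5])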
An empirical investigation of methods for nonsymmetric linear systems
NASA Technical Reports Server (NTRS)
Sherman, A. H.
1981-01-01
The present investigation is concerned with a comparison of methods for solving linear algebraic systems which arise from finite difference discretizations of the elliptic convection-diffusion equation in a planar region Omega with Dirichlet boundary conditions. Such linear systems are typically of the form Ax = b where A is an N x N sparse nonsymmetric matrix. In a discussion of discretizations, it is assumed that a regular rectilinear mesh of width h has been imposed on Omega. The discretizations considered include central differences, upstream differences, and modified upstream differences. Six methods for solving Ax = b are considered. Three variants of Gaussian elimination have been chosen as representatives of state-of-the-art software for direct methods under different assumptions about pivoting. Three iterative methods are also included.
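For a modern flavor of the same comparison, a sketch that solves a small upstream-differenced convection-diffusion system with one direct and one iterative SciPy solver (the discretization and sizes are illustrative):

    import numpy as np
    import scipy.sparse as sp
    from scipy.sparse.linalg import splu, gmres

    # Upstream-differenced 1D convection-diffusion: -u'' + c u' = 1, mesh width h.
    n, c = 500, 50.0
    h = 1.0 / (n + 1)
    A = sp.diags([-1.0 / h**2 - c / h, 2.0 / h**2 + c / h, -1.0 / h**2],
                 [-1, 0, 1], shape=(n, n), format='csc')  # sparse nonsymmetric matrix
    b = np.ones(n)

    x_direct = splu(A).solve(b)        # direct: sparse Gaussian elimination (LU)
    x_iter, info = gmres(A, b)         # iterative: Krylov subspace method
    print("residuals:", np.linalg.norm(b - A @ x_direct),
          np.linalg.norm(b - A @ x_iter))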
Stability and stabilisation of a class of networked dynamic systems
NASA Astrophysics Data System (ADS)
Liu, H. B.; Wang, D. Q.
2018-04-01
We investigate the stability and stabilisation of a linear time-invariant networked heterogeneous system with arbitrarily connected subsystems. A new linear matrix inequality based necessary and sufficient condition for stability is derived, based on which a stabilisation procedure is provided. The obtained conditions efficiently utilise the block-diagonal structure of the system parameter matrices and the sparseness of the subsystem connection matrix. Moreover, a sufficient condition dependent only on each individual subsystem is also presented for the stabilisation of large-scale networked systems. Numerical simulations show that these conditions are computationally effective in the analysis and synthesis of large-scale networked systems.
Compressed modes for variational problems in mathematics and physics.
Ozolins, Vidvuds; Lai, Rongjie; Caflisch, Russel; Osher, Stanley
2013-11-12
This article describes a general formalism for obtaining spatially localized ("sparse") solutions to a class of problems in mathematical physics, which can be recast as variational optimization problems, such as the important case of Schrödinger's equation in quantum mechanics. Sparsity is achieved by adding an L1 regularization term to the variational principle, which is shown to yield solutions with compact support ("compressed modes"). Linear combinations of these modes approximate the eigenvalue spectrum and eigenfunctions in a systematically improvable manner, and the localization properties of compressed modes make them an attractive choice for use with efficient numerical algorithms that scale linearly with the problem size.
SIMD Optimization of Linear Expressions for Programmable Graphics Hardware
Bajaj, Chandrajit; Ihm, Insung; Min, Jungki; Oh, Jinsang
2009-01-01
The increased programmability of graphics hardware allows efficient graphical processing unit (GPU) implementations of a wide range of general computations on commodity PCs. An important factor in such implementations is how to fully exploit the SIMD computing capacities offered by modern graphics processors. Linear expressions in the form of ȳ = Ax̄ + b̄, where A is a matrix, and x̄, ȳ and b̄ are vectors, constitute one of the most basic operations in many scientific computations. In this paper, we propose a SIMD code optimization technique that enables efficient shader codes to be generated for evaluating linear expressions. It is shown that performance can be improved considerably by efficiently packing arithmetic operations into four-wide SIMD instructions through reordering of the operations in linear expressions. We demonstrate that the presented technique can be used effectively for programming both vertex and pixel shaders for a variety of mathematical applications, including integrating differential equations and solving a sparse linear system of equations using iterative methods. PMID:19946569
A Machine Learning Approach to Predicted Bathymetry
NASA Astrophysics Data System (ADS)
Wood, W. T.; Elmore, P. A.; Petry, F.
2017-12-01
Recent and on-going efforts have shown how machine learning (ML) techniques, incorporating more, and more disparate, data than can be interpreted manually, can predict seafloor properties, with uncertainty, where they have not been measured directly. We examine here a ML approach to predicted bathymetry. Our approach employs a paradigm of global bathymetry as an integral component of global geology. From a marine geology and geophysics perspective, the bathymetry is the thickness of one layer in an ensemble of layers that inter-relate to varying extents vertically and geospatially. The nature of the multidimensional relationships in these layers between bathymetry, gravity, magnetic field, age, and many other global measures is typically geospatially dependent and non-linear. The advantage of using ML is that these relationships need not be stated explicitly, nor do they need to be approximated with a transfer function: the machine learns them via the data. Fundamentally, ML operates by brute-force searching for multidimensional correlations between desired, but sparsely known, data values (in this case water depth) and a multitude of (geologic) predictors. Predictors include quantities known extensively, such as remotely sensed measurements (e.g., gravity and magnetics), distance from spreading ridges, trenches, etc., and spatial statistics based on these quantities. Estimating bathymetry from an approximate transfer function is inherently model-limited as well as data-limited, since complex relationships are explicitly ruled out. ML is a purely data-driven approach, so only the extent and quality of the available observations limit prediction accuracy. This allows for a system in which new data, of a wide variety of types, can be quickly and easily assimilated into updated bathymetry predictions with quantitative posterior uncertainties.
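A minimal sketch of the data-driven idea, regressing water depth on geophysical predictors with a random forest; the abstract does not commit to a specific ML algorithm, and all features and relationships below are synthetic placeholders:

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(5)
    n = 5000
    gravity = rng.normal(size=n)
    magnetics = rng.normal(size=n)
    ridge_dist = rng.uniform(0, 2000, n)

    # Synthetic nonlinear, multidimensional relationship to depth.
    depth = (2500 + 800 * np.tanh(gravity) + 0.5 * ridge_dist
             + 50 * magnetics**2 + rng.normal(scale=100, size=n))

    X = np.column_stack([gravity, magnetics, ridge_dist])
    Xtr, Xte, ytr, yte = train_test_split(X, depth, random_state=0)
    rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(Xtr, ytr)
    print("held-out R^2:", rf.score(Xte, yte))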
Shi, Weiwei; Bugrim, Andrej; Nikolsky, Yuri; Nikolskya, Tatiana; Brennan, Richard J
2008-01-01
The ideal toxicity biomarker combines prediction (detection prior to traditional pathological signs of injury), accuracy (high sensitivity and specificity), and a mechanistic relationship to the endpoint measured (biological relevance). Gene expression-based toxicity biomarkers ("signatures") have shown good predictive power and accuracy, but are difficult to interpret biologically. We have compared different statistical methods of feature selection with knowledge-based approaches, using GeneGo's database of canonical pathway maps, to generate gene sets for the classification of renal tubule toxicity. The gene set selection algorithms include four univariate analyses: t-statistics, fold-change, B-statistics, and RankProd, and their combination and overlap for the identification of differentially expressed probes. Enrichment analysis following the results of the four univariate analyses, the Hotelling T-square test, and, finally, out-of-bag selection, a variant of cross-validation, were used to identify canonical pathway maps (sets of genes coordinately involved in key biological processes) with classification power. Differentially expressed genes identified by the different statistical univariate analyses all generated reasonably performing classifiers of tubule toxicity. Maps identified by enrichment analysis or Hotelling T-square had lower classification power, but highlighted perturbed lipid homeostasis as a common discriminator of nephrotoxic treatments. The out-of-bag method yielded the best functionally integrated classifier. The map "ephrins signaling" performed comparably to a classifier derived using sparse linear programming, a machine learning algorithm, and represents a signaling network specifically involved in renal tubule development and integrity. Such functional descriptors of toxicity promise to better integrate predictive toxicogenomics with mechanistic analysis, facilitating the interpretation and risk assessment of predictive genomic investigations.
Signal processing using sparse derivatives with applications to chromatograms and ECG
NASA Astrophysics Data System (ADS)
Ning, Xiaoran
In this thesis, we investigate sparsity in the derivative domain, focusing on signals that possess sparse derivatives up to order M (M > 0). We formulate penalty functions and optimization problems that capture properties related to sparse derivatives and search for fast, computationally efficient solvers; the resulting algorithms are applied to two real-world applications. In the first application, we provide an algorithm which jointly addresses the problems of chromatogram baseline correction and noise reduction. The series of chromatogram peaks is modeled as sparse with sparse derivatives, and the baseline is modeled as a low-pass signal. A convex optimization problem is formulated so as to encapsulate these non-parametric models. To account for the positivity of chromatogram peaks, an asymmetric penalty function is utilized alongside symmetric penalty functions. A robust, computationally efficient, iterative algorithm is developed that is guaranteed to converge to the unique optimal solution. The approach, termed Baseline Estimation And Denoising with Sparsity (BEADS), is evaluated and compared with two state-of-the-art methods using both simulated and real chromatogram data, with promising results. In the second application, a novel electrocardiography (ECG) enhancement algorithm is designed, also based on sparse derivatives. In real medical environments, ECG signals are often contaminated by various kinds of noise or artifacts, for example morphological changes due to motion artifact and non-stationary noise due to muscular contraction (EMG). Some of these contaminations severely affect the usefulness of ECG signals, especially when computer-aided algorithms are utilized. By solving the proposed convex l1 optimization problem, artifacts are reduced by modeling the clean ECG signal as a sum of two signals whose second- and third-order derivatives (differences) are sparse, respectively. Finally, the algorithm is applied to a QRS detection system and validated using the MIT-BIH Arrhythmia database (109,452 annotations), resulting in a sensitivity of Se = 99.87% and a positive predictivity of +P = 99.88%.
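A quick illustration of the modeling assumption behind the thesis, showing that a piecewise-linear signal has a sparse second-order difference; the solvers described above exploit exactly this structure:

    import numpy as np

    # Piecewise-linear signal: its 2nd-order difference is non-zero only at breakpoints.
    t = np.arange(200)
    x = np.piecewise(t.astype(float),
                     [t < 80, (t >= 80) & (t < 140), t >= 140],
                     [lambda s: 0.5 * s,
                      lambda s: 40 - 0.2 * (s - 80),
                      lambda s: 28 + (s - 140)])
    d2 = np.diff(x, n=2)
    print("non-zero 2nd differences:", np.sum(np.abs(d2) > 1e-8), "of", d2.size)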
Affordable and personalized lighting using inverse modeling and virtual sensors
NASA Astrophysics Data System (ADS)
Basu, Chandrayee; Chen, Benjamin; Richards, Jacob; Dhinakaran, Aparna; Agogino, Alice; Martin, Rodney
2014-03-01
Wireless sensor networks (WSN) have great potential to enable personalized intelligent lighting systems while reducing building energy use by 50%-70%. As a result, WSN systems are being increasingly integrated into state-of-the-art intelligent lighting systems. In the future these systems will enable participation of lighting loads as ancillary services. However, such systems can be expensive to install and lack the plug-and-play quality necessary for user-friendly commissioning. In this paper we present an integrated system of wireless sensor platforms and modeling software to enable affordable and user-friendly intelligent lighting. It requires approximately 60% fewer sensor deployments compared to current commercial systems. This reduction in sensor deployments has been achieved by optimally replacing the actual photo-sensors with real-time discrete predictive inverse models. Spatially sparse and clustered sub-hourly photo-sensor data captured by the WSN platforms are used to develop and validate a piecewise-linear regression of indoor light distribution. This deterministic data-driven model accounts for sky conditions and solar position. The optimal placement of photo-sensors is performed iteratively to achieve the best predictability of the light field desired for indoor lighting control. Using two weeks of daylight and artificial light training data acquired at the Sustainability Base at NASA Ames, the model was able to predict the light level at seven monitored workstations with 80%-95% accuracy. We estimate that 10% adoption of this intelligent wireless sensor system in commercial buildings could save 0.2-0.25 quadrillion BTU of energy nationwide.
NASA Astrophysics Data System (ADS)
Liu, Y.; Zheng, L.; Pau, G. S. H.
2016-12-01
A careful assessment of the risk associated with geologic CO2 storage is critical to the deployment of large-scale storage projects. While numerical modeling is an indispensable tool for risk assessment, there is an increasing need to consider and address uncertainties in the numerical models. However, uncertainty analyses have been significantly hindered by the computational complexity of the model. As a remedy, reduced-order models (ROM), which serve as computationally efficient surrogates for high-fidelity models (HFM), have been employed. The ROM is constructed at the expense of an initial set of HFM simulations, and afterwards can be relied upon to predict the model output values at minimal cost. The ROM presented here is part of the National Risk Assessment Program (NRAP) and is intended to predict the water quality change in groundwater in response to hypothetical CO2 and brine leakage. The HFM from which the ROM is derived is a multiphase flow and reactive transport model, with a 3-D heterogeneous flow field and complex chemical reactions including aqueous complexation, mineral dissolution/precipitation, adsorption/desorption via surface complexation, and cation exchange. Reduced-order modeling techniques based on polynomial basis expansion, such as polynomial chaos expansion (PCE), are widely used in the literature. However, the accuracy of such ROMs can be affected by the sparse structure of the coefficients of the expansion. Failing to identify vanishing polynomial coefficients introduces unnecessary sampling errors, the accumulation of which deteriorates the accuracy of the ROMs. To address this issue, we treat the PCE as a sparse Bayesian learning (SBL) problem, and the sparsity is obtained by detecting and including only the non-zero PCE coefficients one at a time, iteratively selecting the most contributing coefficients. The computational complexity due to predicting the entire 3-D concentration fields is further mitigated by a dimension-reduction procedure, proper orthogonal decomposition (POD). Our numerical results show that utilizing the sparse structure and POD significantly enhances the accuracy and efficiency of the ROMs, laying the basis for further analyses that necessitate a large number of model simulations.
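A minimal sketch of casting a sparse PCE as a sparse Bayesian learning problem: probabilists' Hermite polynomials form the basis for a Gaussian input, and scikit-learn's ARDRegression prunes vanishing coefficients. Standard ARD replaces the paper's one-at-a-time coefficient selection, and all sizes are illustrative:

    import numpy as np
    from numpy.polynomial.hermite_e import hermeval
    from sklearn.linear_model import ARDRegression

    rng = np.random.default_rng(6)
    n, degree = 300, 8
    xi = rng.normal(size=n)                  # standard Gaussian input variable

    # Design matrix of probabilists' Hermite polynomials He_0..He_degree.
    Psi = np.column_stack([hermeval(xi, np.eye(degree + 1)[k])
                           for k in range(degree + 1)])

    # Model output whose true PCE is sparse: y = 1 + 2*He_2(xi) + noise.
    y = 1.0 + 2.0 * Psi[:, 2] + 0.01 * rng.normal(size=n)

    ard = ARDRegression(fit_intercept=False).fit(Psi, y)
    print("PCE coefficients:", np.round(ard.coef_, 3))  # near-zero except orders 0 and 2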
NASA Astrophysics Data System (ADS)
Kibler, K. M.; Alipour, M.
2017-12-01
Diversion hydropower has been shown to significantly alter river flow regimes by dewatering diversion bypass reaches. Data scarcity is one of the foremost challenges to establishing environmental flow regimes below diversion hydropower dams, especially in regions of sparse hydro-meteorological observation. Herein, we test two prediction strategies for generating daily flows in rivers developed with diversion hydropower: a catchment similarity model, and a rainfall-runoff model selected by multi-objective optimization based on soft data. While both methods are designed for ungauged rivers embedded within large regions of sparse hydrologic observation, one is more complex and computationally intensive. The objective of this study is to assess the benefit of using complex modeling tools in data-sparse landscapes to support design of environmental flow regimes. Models were tested in gauged catchments and then used to simulate a 28-year record of daily flows in 32 ungauged rivers. After perturbing flows with the hydropower diversion, we detect alteration using Indicators of Hydrologic Alteration (IHA) metrics and compare outcomes of the two modeling approaches. The catchment similarity model simulates low flows well (Nash-Sutcliffe efficiency (NSE) = 0.91), but poorly represents moderate to high flows (overall NSE = 0.25). The multi-objective rainfall-runoff model performs well overall (NSE = 0.72). Both models agree that flow magnitudes and variability consistently decrease following diversion as temporally dynamic flows are replaced by static minimal flows. Mean duration of events sustained below the pre-diversion Q75 and mean hydrograph rise and fall rates increase. While we see broad areas of agreement, significant effects and thresholds vary between models, particularly in the representation of moderate flows. Thus, use of simplified streamflow models may bias detected alterations or inadequately characterize pre-regulation flow regimes, providing inaccurate information as a basis for flow regime design. As an alternative, the multi-objective framework can be applied globally, and is robust to common challenges of flow prediction in ungauged rivers, such as equifinality and hydrologic dissimilarity of reference catchments.
Pseudotumor Cerebri Resulting in Empty Sella Syndrome and Multiple Pituitary Hormone Deficiencies
2017-09-16
of chronic headaches, back pain, decreased energy, and frequent nausea and vomiting. His growth velocity had slowed over the previous 3 years. On...exam, he had a eunuchoid body habitus without gynecomastia. He had sparse axillary hair, Tanner II pubic hair, and a phallus smaller than expected for...notable progression of puberty and linear growth acceleration. Subsequently, physiologic hydrocortisone replacement therapy resulted in resolution of
DOE Office of Scientific and Technical Information (OSTI.GOV)
2014-01-17
This library is an implementation of the Sparse Approximate Matrix Multiplication (SpAMM) algorithm. It provides a matrix data type and an approximate matrix product, which exhibits linear-scaling computational complexity for matrices with decay. The product error and the performance of the multiply can be tuned by choosing an appropriate tolerance. The library can be compiled for serial execution, or for parallel execution on shared-memory systems with an OpenMP-capable compiler.
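A minimal sketch of the SpAMM idea: multiply quadrant blocks recursively and skip any block pair whose contribution is bounded by ||A_blk||·||B_blk|| <= tol. This pure-Python recursion on dense NumPy quadrants is for illustration only and assumes a square, power-of-two dimension:

    import numpy as np

    def spamm(A, B, tol, leaf=16):
        """Approximate A @ B, skipping block pairs with a small norm product."""
        if np.linalg.norm(A) * np.linalg.norm(B) <= tol:
            return np.zeros((A.shape[0], B.shape[1]))   # ||A @ B|| <= ||A||*||B|| <= tol
        n = A.shape[0]
        if n <= leaf:
            return A @ B
        h = n // 2
        C = np.zeros((n, n))
        for i in (0, 1):
            for j in (0, 1):
                for k in (0, 1):
                    C[i*h:(i+1)*h, j*h:(j+1)*h] += spamm(
                        A[i*h:(i+1)*h, k*h:(k+1)*h],
                        B[k*h:(k+1)*h, j*h:(j+1)*h], tol, leaf)
        return C

    # Matrices with exponential off-diagonal decay, as targeted by SpAMM.
    n = 128
    i, j = np.indices((n, n))
    A = np.exp(-0.5 * np.abs(i - j))
    C = spamm(A, A, tol=1e-8)
    print("max error:", np.abs(C - A @ A).max())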
Tensor-GMRES method for large sparse systems of nonlinear equations
NASA Technical Reports Server (NTRS)
Feng, Dan; Pulliam, Thomas H.
1994-01-01
This paper introduces a tensor-Krylov method, the tensor-GMRES method, for large sparse systems of nonlinear equations. This method is a coupling of tensor model formation and solution techniques for nonlinear equations with Krylov subspace projection techniques for unsymmetric systems of linear equations. Traditional tensor methods for nonlinear equations are based on a quadratic model of the nonlinear function, a standard linear model augmented by a simple second order term. These methods are shown to be significantly more efficient than standard methods both on nonsingular problems and on problems where the Jacobian matrix at the solution is singular. A major disadvantage of the traditional tensor methods is that the solution of the tensor model requires the factorization of the Jacobian matrix, which may not be suitable for problems where the Jacobian matrix is large and has a 'bad' sparsity structure for an efficient factorization. We overcome this difficulty by forming and solving the tensor model using an extension of a Newton-GMRES scheme. Like traditional tensor methods, we show that the new tensor method has significant computational advantages over the analogous Newton counterpart. Consistent with Krylov subspace based methods, the new tensor method does not depend on the factorization of the Jacobian matrix. As a matter of fact, the Jacobian matrix is never needed explicitly.
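The Newton-GMRES scheme that the tensor method extends is available in SciPy; a minimal sketch on a small nonlinear system follows (the tensor model's second-order term is not included):

    import numpy as np
    from scipy.optimize import newton_krylov

    def F(x):
        """A simple sparse nonlinear system: tridiagonal coupling plus a cubic term."""
        y = 3.0 * x + x**3 + 1.0
        y[1:] -= x[:-1]
        y[:-1] -= x[1:]
        return y

    # Jacobian-free Newton-GMRES: Jacobian-vector products are approximated by
    # finite differences, so the Jacobian is never formed or factorized.
    x = newton_krylov(F, np.zeros(1000), method='gmres', f_tol=1e-10)
    print("residual norm:", np.linalg.norm(F(x)))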
Negre, Christian F A; Mniszewski, Susan M; Cawkwell, Marc J; Bock, Nicolas; Wall, Michael E; Niklasson, Anders M N
2016-07-12
We present a reduced complexity algorithm to compute the inverse overlap factors required to solve the generalized eigenvalue problem in a quantum-based molecular dynamics (MD) simulation. Our method is based on the recursive, iterative refinement of an initial guess of Z (inverse square root of the overlap matrix S). The initial guess of Z is obtained beforehand by using either an approximate divide-and-conquer technique or dynamical methods, propagated within an extended Lagrangian dynamics from previous MD time steps. With this formulation, we achieve long-term stability and energy conservation even under the incomplete, approximate, iterative refinement of Z. Linear-scaling performance is obtained using numerically thresholded sparse matrix algebra based on the ELLPACK-R sparse matrix data format, which also enables efficient shared-memory parallelization. As we show in this article using self-consistent density-functional-based tight-binding MD, our approach is faster than conventional methods based on the diagonalization of overlap matrix S for systems as small as a few hundred atoms, substantially accelerating quantum-based simulations even for molecular structures of intermediate size. For a 4158-atom water-solvated polyalanine system, we find an average speedup factor of 122 for the computation of Z in each MD step.
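A minimal sketch of iterative refinement of Z toward S^(-1/2), here via a coupled Newton-Schulz iteration on dense NumPy arrays; the paper's implementation instead uses thresholded sparse algebra and an initial guess propagated between MD steps:

    import numpy as np

    def inv_sqrt(S, tol=1e-12, max_iter=100):
        """Coupled Newton-Schulz iteration: Y_k -> S^(1/2), Z_k -> S^(-1/2)."""
        s = np.linalg.norm(S, 2)               # scale so the iteration converges
        Y = S / s
        Z = np.eye(S.shape[0])
        for _ in range(max_iter):
            T = 0.5 * (3.0 * np.eye(S.shape[0]) - Z @ Y)
            Y, Z = Y @ T, T @ Z
            if np.linalg.norm(Z @ (S / s) @ Z - np.eye(S.shape[0])) < tol:
                break
        return Z / np.sqrt(s)

    # Random symmetric positive definite "overlap" matrix.
    rng = np.random.default_rng(7)
    M = rng.normal(size=(50, 50))
    S = M @ M.T + 50 * np.eye(50)
    Z = inv_sqrt(S)
    print("||Z S Z - I|| =", np.linalg.norm(Z @ S @ Z - np.eye(50)))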
Dose-shaping using targeted sparse optimization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sayre, George A.; Ruan, Dan
2013-07-15
Purpose: Dose volume histograms (DVHs) are common tools in radiation therapy treatment planning to characterize plan quality. As statistical metrics, DVHs provide a compact summary of the underlying plan at the cost of losing spatial information: the same or similar dose-volume histograms can arise from substantially different spatial dose maps. This is exactly the reason why physicians and physicists scrutinize dose maps even after they satisfy all DVH endpoints numerically. However, up to this point, little has been done to control spatial phenomena, such as the spatial distribution of hot spots, which has significant clinical implications. To this end, the authors propose a novel objective function that enables a more direct tradeoff between target coverage, organ-sparing, and planning target volume (PTV) homogeneity, and present findings from four prostate cases, a pancreas case, and a head-and-neck case to illustrate the advantages and general applicability of the method. Methods: In designing the energy minimization objective (E_tot^sparse), the authors utilized the following robust cost functions: (1) an asymmetric linear well function to allow differential penalties for underdose, relaxation of prescription dose, and overdose in the PTV; (2) a two-piece linear function to heavily penalize high dose and mildly penalize low and intermediate dose in organs-at-risk (OARs); and (3) a total variation energy, i.e., the L1 norm applied to the first-order approximation of the dose gradient in the PTV. By minimizing a weighted sum of these robust costs, general conformity to the dose prescription and dose-gradient prescription is achieved while encouraging prescription violations to follow a Laplace distribution. In contrast, conventional quadratic objectives are associated with a Gaussian distribution of violations, which is less forgiving of large violations of prescription than the Laplace distribution. As a result, the proposed objective E_tot^sparse improves the tradeoff between planning goals by 'sacrificing' voxels that have already been violated to improve PTV coverage, PTV homogeneity, and/or OAR-sparing. In doing so, overall plan quality is increased, since these large violations only arise if a net reduction in E_tot^sparse occurs as a result. For example, large violations of the dose prescription in the PTV in E_tot^sparse-optimized plans will naturally localize to voxels in and around PTV-OAR overlaps, where OAR-sparing may be increased without compromising target coverage. The authors compared the results of the method and the corresponding clinical plans using analyses of DVH plots, dose maps, and two quantitative metrics that quantify PTV homogeneity and overdose. These metrics do not penalize underdose, since E_tot^sparse-optimized plans were planned such that their target coverage was similar to or better than that of the clinical plans. Finally, plan deliverability was assessed with the 2D modulation index. Results: The proposed method was implemented using IBM's CPLEX optimization package (ILOG CPLEX, Sunnyvale, CA) and required 1-4 min to solve with a 12-core Intel i7 processor. In the testing procedure, the authors optimized for several points on the Pareto surface of four 7-field 6 MV prostate cases that were optimized for different levels of PTV homogeneity and OAR-sparing. The generated results were compared against each other and the clinical plan by analyzing their DVH plots and dose maps.
After developing intuition by planning the four prostate cases, which had relatively few tradeoffs, the authors applied the method to a 7-field 6 MV pancreas case and a 9-field 6 MV head-and-neck case to test its potential impact on more challenging cases. The authors found that the formulation: (1) provided excellent flexibility for balancing OAR-sparing with PTV homogeneity; and (2) permitted the dose planner more control over the evolution of the PTV's spatial dose distribution than conventional objective functions. In particular, E_tot^sparse-optimized plans for the pancreas case and head-and-neck case exhibited substantially improved sparing of the spinal cord and parotid glands, respectively, while maintaining or improving sparing for other OARs and markedly improving PTV homogeneity. Plan deliverability for E_tot^sparse-optimized plans was shown to be better than for their associated clinical plans, according to the two-dimensional modulation index. Conclusions: These results suggest that the formulation may be used to improve dose-shaping and OAR-sparing for complicated disease sites, such as the pancreas or head and neck. Furthermore, the objective function and constraints are linear and constitute a linear program, which converges to the global minimum quickly and can be easily implemented in treatment planning software. Thus, the authors expect fast translation of the method to the clinic, where it may have a positive impact on plan quality for challenging disease sites.
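A minimal sketch of the kind of linear program such asymmetric linear penalties produce, for a toy "dose" problem with differential underdose/overdose weights; the beamlet matrix, prescription, and weights are invented for illustration:

    import numpy as np
    from scipy.optimize import linprog

    rng = np.random.default_rng(8)
    n_vox, n_beam = 30, 12
    A = np.abs(rng.normal(size=(n_vox, n_beam)))   # dose = A @ w for beamlet weights w
    p = np.ones(n_vox)                             # prescription dose per voxel
    w_under, w_over = 10.0, 1.0                    # asymmetric linear well penalties

    # Variables: [w, u (underdose), o (overdose)]; minimize w_under*sum(u) + w_over*sum(o)
    # subject to u >= p - A w and o >= A w - p, all variables nonnegative.
    c = np.concatenate([np.zeros(n_beam), w_under * np.ones(n_vox), w_over * np.ones(n_vox)])
    A_ub = np.block([[-A, -np.eye(n_vox), np.zeros((n_vox, n_vox))],   # p - A w - u <= 0
                     [ A, np.zeros((n_vox, n_vox)), -np.eye(n_vox)]])  # A w - p - o <= 0
    b_ub = np.concatenate([-p, p])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0, None))
    print("optimal objective:", res.fun)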
A network model of successive partitioning-limited solute diffusion through the stratum corneum.
Schumm, Phillip; Scoglio, Caterina M; van der Merwe, Deon
2010-02-07
As the most exposed point of contact with the external environment, the skin is an important barrier to many chemical exposures, including medications, potentially toxic chemicals and cosmetics. Traditional dermal absorption models treat the stratum corneum lipids as a homogeneous medium through which solutes diffuse according to Fick's first law of diffusion. This approach does not explain non-linear absorption and irregular distribution patterns within the stratum corneum lipids as observed in experimental data. A network model, based on successive partitioning-limited solute diffusion through the stratum corneum, where the lipid structure is represented by a large, sparse, and regular network whose nodes have variable characteristics, offers an alternative, efficient, and flexible approach to dermal absorption modeling that simulates non-linear absorption data patterns. Four model versions are presented: two linear models, which have unlimited node capacities, and two non-linear models, which have limited node capacities. The non-linear model outputs produce absorption-to-dose relationships that are best characterized quantitatively by power equations, similar to the equations used to describe non-linear experimental data.
Fast Solution in Sparse LDA for Binary Classification
NASA Technical Reports Server (NTRS)
Moghaddam, Baback
2010-01-01
An algorithm that performs sparse linear discriminant analysis (Sparse-LDA) finds near-optimal solutions in far less time than the prior art when specialized to binary classification (of 2 classes). Sparse-LDA is a type of feature- or variable-selection problem with numerous applications in statistics, machine learning, computer vision, computational finance, operations research, and bio-informatics. Because of their combinatorial nature, feature- or variable-selection problems are NP-hard or computationally intractable in cases involving more than 30 variables or features. Therefore, one typically seeks approximate solutions by means of greedy search algorithms. The prior Sparse-LDA algorithm was a greedy algorithm that considered the best variable or feature to add to or delete from its subsets in order to maximally discriminate between multiple classes of data. The present algorithm is designed for the special but prevalent case of 2-class or binary classification (e.g., 1 vs. 0, functioning vs. malfunctioning, or change vs. no change). The present algorithm provides near-optimal solutions on large real-world datasets having hundreds or even thousands of variables or features (e.g., selecting the fewest wavelength bands in a hyperspectral sensor to do terrain classification) and does so in typical computation times of minutes, as compared to days or weeks taken by the prior art. Sparse-LDA requires solving generalized eigenvalue problems for a large number of variable subsets (represented by the submatrices of the input within-class and between-class covariance matrices). In the general (full-rank) case, the amount of computation scales at least cubically with the number of variables, and thus the size of the problems that can be solved is limited accordingly. However, in binary classification, the principal eigenvalues can be found using a special analytic formula, without resorting to costly iterative techniques. The present algorithm exploits this analytic form along with the inherent sequential nature of greedy search itself. Together, this enables the use of highly efficient partitioned-matrix-inverse techniques that result in large speedups of computation in both the forward-selection and backward-elimination stages of greedy algorithms in general.
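A minimal sketch of greedy forward selection for two-class LDA, scoring each candidate subset by the Fisher discriminant value. This plain version re-solves a small linear system per candidate; the speedups described above come from the analytic eigenvalue formula and partitioned-matrix-inverse updates, which are omitted here:

    import numpy as np

    def fisher_score(Sw, dm, idx):
        """Two-class discriminant value dm' Sw^{-1} dm on a feature subset."""
        ix = np.ix_(idx, idx)
        return dm[idx] @ np.linalg.solve(Sw[ix], dm[idx])

    rng = np.random.default_rng(9)
    n, p, k = 200, 40, 5
    X0 = rng.normal(size=(n, p))
    X1 = rng.normal(size=(n, p)); X1[:, :3] += 1.5     # 3 truly informative features
    Sw = np.cov(X0, rowvar=False) + np.cov(X1, rowvar=False)  # within-class scatter
    dm = X1.mean(0) - X0.mean(0)                       # between-class mean difference

    selected = []
    for _ in range(k):                                 # greedy forward selection
        cands = [f for f in range(p) if f not in selected]
        best = max(cands, key=lambda f: fisher_score(Sw, dm, selected + [f]))
        selected.append(best)
    print("selected features:", selected)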
LRSSLMDA: Laplacian Regularized Sparse Subspace Learning for MiRNA-Disease Association prediction
Huang, Li
2017-01-01
Predicting novel microRNA (miRNA)-disease associations is clinically significant due to miRNAs' potential roles as diagnostic biomarkers and therapeutic targets for various human diseases. Previous studies have demonstrated the viability of utilizing different types of biological data to computationally infer new disease-related miRNAs. Yet researchers face the challenge of how to effectively integrate diverse datasets and make reliable predictions. In this study, we presented a computational model named Laplacian Regularized Sparse Subspace Learning for MiRNA-Disease Association prediction (LRSSLMDA), which projected miRNAs'/diseases' statistical feature profile and graph theoretical feature profile to a common subspace. It used Laplacian regularization to preserve the local structures of the training data and an L1-norm constraint to select important miRNA/disease features for prediction. The strength of dimensionality reduction enabled the model to be easily extended to much higher dimensional datasets than those exploited in this study. Experimental results showed that LRSSLMDA outperformed ten previous models: the AUC of 0.9178 in global leave-one-out cross validation (LOOCV) and the AUC of 0.8418 in local LOOCV indicated the model's superior prediction accuracy; and the average AUC of 0.9181 ± 0.0004 in 5-fold cross validation justified its accuracy and stability. In addition, three types of case studies further demonstrated its predictive power. Potential miRNAs related to Colon Neoplasms, Lymphoma, Kidney Neoplasms, Esophageal Neoplasms and Breast Neoplasms were predicted by LRSSLMDA. Respectively, 98%, 88%, 96%, 98% and 98% of the top 50 predictions were validated by experimental evidence. Therefore, we conclude that LRSSLMDA would be a valuable computational tool for miRNA-disease association prediction. PMID:29253885
A sparse representation-based approach for copy-move image forgery detection in smooth regions
NASA Astrophysics Data System (ADS)
Abdessamad, Jalila; ElAdel, Asma; Zaied, Mourad
2017-03-01
Copy-move image forgery is the act of cloning a restricted region of an image and pasting it once or multiple times within that same image. This procedure intends to cover a certain feature, typically a person or an object, in the processed image, or to emphasize it through duplication. The consequences of this malicious operation can be unexpectedly harmful. Hence, the present paper proposes a new approach that automatically detects copy-move forgery (CMF). In particular, this work addresses a common open issue in the CMF research literature: detecting CMF within smooth areas. The proposed approach represents image blocks as a sparse linear combination of pre-learned bases (a mixture of texture- and color-wise small patches), which allows a robust description of smooth patches. The reported experimental results demonstrate the effectiveness of the proposed approach in identifying the forged regions in CM attacks.
Archetypal Analysis for Sparse Representation-Based Hyperspectral Sub-Pixel Quantification
NASA Astrophysics Data System (ADS)
Drees, L.; Roscher, R.
2017-05-01
This paper focuses on the quantification of land cover fractions in an urban area of Berlin, Germany, using simulated hyperspectral EnMAP data with a spatial resolution of 30 m × 30 m. For this, sparse representation is applied, where each pixel with unknown surface characteristics is expressed as a weighted linear combination of elementary spectra with known land cover class. The elementary spectra are determined from image reference data using simplex volume maximization, which is a fast heuristic technique for archetypal analysis. In the experiments, the estimation of class fractions based on the archetypal spectral library is compared to the estimation obtained with a manually designed spectral library by means of reconstruction error, mean absolute error of the fraction estimates, sum of fractions, and the number of elementary spectra used. We show that a collection of archetypes can be an adequate and efficient alternative to a manually designed spectral library with respect to these criteria.
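A minimal sketch of sub-pixel fraction estimation by nonnegative least squares against a small endmember library, with fractions renormalized to sum to one; the paper additionally derives the library itself via simplex volume maximization, which is not shown:

    import numpy as np
    from scipy.optimize import nnls

    rng = np.random.default_rng(10)
    n_bands, n_endmembers = 100, 4
    E = np.abs(rng.normal(size=(n_bands, n_endmembers)))   # endmember spectra (columns)

    f_true = np.array([0.6, 0.3, 0.1, 0.0])                # true land-cover fractions
    pixel = E @ f_true + 0.01 * rng.normal(size=n_bands)   # mixed pixel spectrum

    f, _ = nnls(E, pixel)          # nonnegative weights over the spectral library
    f /= f.sum()                   # enforce sum-to-one for interpretable fractions
    print("estimated fractions:", np.round(f, 3))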
Ensemble of sparse classifiers for high-dimensional biological data.
Kim, Sunghan; Scalzo, Fabien; Telesca, Donatello; Hu, Xiao
2015-01-01
Biological data are often high in dimension while the number of samples is small. In such cases, the performance of classification can be improved by reducing the dimension of data, which is referred to as feature selection. Recently, a novel feature selection method has been proposed utilising the sparsity of high-dimensional biological data where a small subset of features accounts for most variance of the dataset. In this study we propose a new classification method for high-dimensional biological data, which performs both feature selection and classification within a single framework. Our proposed method utilises a sparse linear solution technique and the bootstrap aggregating algorithm. We tested its performance on four public mass spectrometry cancer datasets along with two other conventional classification techniques such as Support Vector Machines and Adaptive Boosting. The results demonstrate that our proposed method performs more accurate classification across various cancer datasets than those conventional classification techniques.
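A minimal sketch of combining sparse linear classifiers with bootstrap aggregating in scikit-learn; L1-penalized logistic regression stands in for the study's specific sparse linear solver:

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import BaggingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    # High-dimensional, small-sample data, as is typical for mass spectrometry.
    X, y = make_classification(n_samples=100, n_features=2000, n_informative=20,
                               random_state=0)

    sparse_clf = LogisticRegression(penalty='l1', solver='liblinear', C=0.5)
    bagged = BaggingClassifier(sparse_clf, n_estimators=25, random_state=0)
    print("CV accuracy:", cross_val_score(bagged, X, y, cv=5).mean())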
Non-Convex Sparse and Low-Rank Based Robust Subspace Segmentation for Data Mining.
Cheng, Wenlong; Zhao, Mingbo; Xiong, Naixue; Chui, Kwok Tai
2017-07-15
Parsimony, including sparsity and low-rank, has shown great importance for data mining in social networks, particularly in tasks such as segmentation and recognition. Traditionally, such modeling approaches rely on an iterative algorithm that minimizes an objective function with convex l1-norm or nuclear-norm constraints. However, the results obtained by convex optimization are usually suboptimal to solutions of the original sparse or low-rank problems. In this paper, a novel robust subspace segmentation algorithm is proposed by integrating lp-norm and Schatten p-norm constraints. The so-obtained affinity graph can better capture the local geometrical structure and the global information of the data. As a consequence, the algorithm is more generative, discriminative, and robust. An efficient linearized alternating direction method is derived to realize the model. Extensive segmentation experiments are conducted on public datasets. The proposed algorithm is revealed to be more effective and robust compared to five existing algorithms.
Exploiting Multiple Levels of Parallelism in Sparse Matrix-Matrix Multiplication
Azad, Ariful; Ballard, Grey; Buluc, Aydin; ...
2016-11-08
Sparse matrix-matrix multiplication (or SpGEMM) is a key primitive for many high-performance graph algorithms as well as for some linear solvers, such as algebraic multigrid. The scaling of existing parallel implementations of SpGEMM is heavily bound by communication. Even though 3D (or 2.5D) algorithms have been proposed and theoretically analyzed in the flat MPI model on Erdös-Rényi matrices, those algorithms had not been implemented in practice and their complexities had not been analyzed for the general case. In this work, we present the first implementation of the 3D SpGEMM formulation that exploits multiple (intranode and internode) levels of parallelism, achieving significant speedups over the state-of-the-art publicly available codes at all levels of concurrency. We extensively evaluate our implementation and identify bottlenecks that should be subject to further research.
Sparse Bayesian learning for DOA estimation with mutual coupling.
Dai, Jisheng; Hu, Nan; Xu, Weichao; Chang, Chunqi
2015-10-16
Sparse Bayesian learning (SBL) has given renewed interest to the problem of direction-of-arrival (DOA) estimation. It is generally assumed that the measurement matrix in SBL is precisely known. Unfortunately, this assumption may be invalid in practice due to the imperfect manifold caused by unknown or misspecified mutual coupling. This paper describes a modified SBL method for joint estimation of DOAs and mutual coupling coefficients with uniform linear arrays (ULAs). Unlike the existing method that only uses stationary priors, our new approach utilizes a hierarchical form of the Student t prior to enforce the sparsity of the unknown signal more heavily. We also provide a distinct Bayesian inference for the expectation-maximization (EM) algorithm, which can update the mutual coupling coefficients more efficiently. Another difference is that our method uses an additional singular value decomposition (SVD) to reduce the computational complexity of the signal reconstruction process and the sensitivity to the measurement noise.
Regression analysis of sparse asynchronous longitudinal data.
Cao, Hongyuan; Zeng, Donglin; Fine, Jason P
2015-09-01
We consider estimation of regression models for sparse asynchronous longitudinal observations, where time-dependent responses and covariates are observed intermittently within subjects. Unlike with synchronous data, where the response and covariates are observed at the same time point, with asynchronous data the observation times are mismatched. Simple kernel-weighted estimating equations are proposed for generalized linear models with either time-invariant or time-dependent coefficients, under smoothness assumptions for the covariate processes similar to those for synchronous data. For models with either time-invariant or time-dependent coefficients, the estimators are consistent and asymptotically normal, but converge at slower rates than those achieved with synchronous data. Simulation studies show that the methods perform well with realistic sample sizes and may be superior to a naive application of methods for synchronous data based on an ad hoc last-value-carried-forward approach. The practical utility of the methods is illustrated on data from a study on human immunodeficiency virus.
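A minimal sketch of a kernel-weighted estimate of a time-varying coefficient, pairing mismatched response and covariate times through Gaussian kernel weights; the data-generating process, bandwidth, and estimator form are simplified for illustration:

    import numpy as np

    rng = np.random.default_rng(11)
    n_sub = 200
    beta = lambda t: 1.0 + np.sin(2 * np.pi * t)     # true time-varying coefficient

    # Asynchronous observations: response and covariate measured at different times.
    t_y = rng.uniform(0, 1, n_sub)
    t_x = rng.uniform(0, 1, n_sub)
    x = rng.normal(size=n_sub)
    y = beta(t_y) * x + 0.1 * rng.normal(size=n_sub)

    def beta_hat(t0, h=0.1):
        """Kernel-weighted least squares estimate of beta(t0)."""
        w = (np.exp(-0.5 * ((t_y - t0) / h) ** 2)
             * np.exp(-0.5 * ((t_x - t0) / h) ** 2))   # downweight mismatched times
        return np.sum(w * x * y) / np.sum(w * x * x)

    print("beta(0.25): true", beta(0.25), "estimated", beta_hat(0.25))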
Do microbial processes regulate the stability of a coral atoll's enclosed pelagic ecosystem?
Complex marine ecosystems contain multiple feedback cycles that can cause unexpected responses to perturbations. To better predict these responses, complicated models are increasingly being developed to enable the study of feedback cycles. However, the sparseness of ecological da...
Optimal test selection for prediction uncertainty reduction
Mullins, Joshua; Mahadevan, Sankaran; Urbina, Angel
2016-12-02
Economic factors and experimental limitations often lead to sparse and/or imprecise data used for the calibration and validation of computational models. This paper addresses resource allocation for calibration and validation experiments, in order to maximize their effectiveness within given resource constraints. When observation data are used for model calibration, the quality of the inferred parameter descriptions is directly affected by the quality and quantity of the data. This paper characterizes parameter uncertainty within a probabilistic framework, which enables the uncertainty to be systematically reduced with additional data. The validation assessment is also uncertain in the presence of sparse and imprecise data; therefore, this paper proposes an approach for quantifying the resulting validation uncertainty. Since calibration and validation uncertainty affect the prediction of interest, the proposed framework explores the decision of cost versus importance of data in terms of the impact on the prediction uncertainty. Often, calibration and validation tests may be performed for different input scenarios, and this paper shows how the calibration and validation results from different conditions may be integrated into the prediction. Then, a constrained discrete optimization formulation that selects the number of tests of each type (calibration or validation at given input conditions) is proposed. Furthermore, the proposed test selection methodology is demonstrated on a microelectromechanical system (MEMS) example.
NASA Technical Reports Server (NTRS)
Nguyen, Duc T.; Mohammed, Ahmed Ali; Kadiam, Subhash
2010-01-01
Solving large (and sparse) systems of simultaneous linear equations has been (and continues to be) a major challenging problem for many real-world engineering/science applications [1-2]. For many practical/large-scale problems, the sparse, Symmetrical and Positive Definite (SPD) system of linear equations can be conveniently represented in matrix notation as [A] {x} = {b}, where the square coefficient matrix [A] and the Right-Hand-Side (RHS) vector {b} are known. The unknown solution vector {x} can be efficiently solved for by the following step-by-step procedures [1-2]: Reordering phase, Matrix Factorization phase, Forward solution phase, and Backward solution phase. In this research work, a Game-Based Learning (GBL) approach has been developed to help engineering students understand crucial details of the matrix reordering and factorization phases. A "chess-like" game has been developed that can be played by either a single player or two players. Through this "chess-like" open-ended game, the players/learners will not only understand the key concepts involved in reordering algorithms (based on existing algorithms), but also have the opportunity to "discover new algorithms" that are better than existing algorithms. Implementing the proposed "chess-like" game for the matrix reordering and factorization phases can be enhanced by FLASH [3] computer environments, where computer simulation with animated human voice, sound effects, visual/graphical/colorful displays of matrix tables, score (or monetary) awards for the best game players, etc. can all be exploited. Preliminary demonstrations of the developed GBL approach can be viewed by anyone who has access to the internet web-site [4]!
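A small demonstration of why the reordering phase matters, using SciPy's reverse Cuthill-McKee ordering to shrink the bandwidth of a sparse symmetric matrix before factorization (a stand-in for the reordering algorithms the game teaches):

    import numpy as np
    import scipy.sparse as sp
    from scipy.sparse.csgraph import reverse_cuthill_mckee

    def bandwidth(M):
        coo = M.tocoo()
        return int(np.max(np.abs(coo.row - coo.col)))

    # A random sparse symmetric positive definite matrix.
    n = 200
    A = sp.random(n, n, density=0.02, random_state=0)
    A = (A + A.T + 4 * sp.eye(n)).tocsr()

    perm = reverse_cuthill_mckee(A, symmetric_mode=True)
    A_perm = A[perm][:, perm]          # apply the ordering to rows and columns
    print("bandwidth before:", bandwidth(A), "after:", bandwidth(A_perm))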
NASA Astrophysics Data System (ADS)
Alipour, M.; Kibler, K. M.
2017-12-01
Despite advances in flow prediction, managers of ungauged rivers located within broad regions of sparse hydrometeorologic observation still lack prescriptive methods robust to the data challenges of such regions. We propose a multi-objective streamflow prediction framework for regions of minimum observation to select models that balance runoff efficiency with choice of accurate parameter values. We supplement sparse observed data with uncertain or low-resolution information incorporated as 'soft' a priori parameter estimates. The performance of the proposed framework is tested against traditional single-objective and constrained single-objective calibrations in two catchments in a remote area of southwestern China. We find that the multi-objective approach performs well with respect to runoff efficiency in both catchments (NSE = 0.74 and 0.72), within the range of efficiencies returned by other models (NSE = 0.67 - 0.78). However, soil moisture capacity estimated by the multi-objective model resonates with a priori estimates (parameter residuals of 61 cm versus 289 and 518 cm for maximum soil moisture capacity in one catchment, and 20 cm versus 246 and 475 cm in the other; parameter residuals of 0.48 versus 0.65 and 0.7 for soil moisture distribution shape factor in one catchment, and 0.91 versus 0.79 and 1.24 in the other). Thus, optimization to a multi-criteria objective function led to very different representations of soil moisture capacity as compared to models selected by single-objective calibration, without compromising runoff efficiency. These different soil moisture representations may translate into considerably different hydrological behaviors. The proposed approach thus offers a preliminary step towards greater process understanding in regions of severe data limitations. For instance, the multi-objective framework may be an adept tool to discern between models of similar efficiency and select those that provide the "right answers for the right reasons". Managers may feel more confident to utilize such models to predict flows in fully ungauged areas.
Abdolali, Fatemeh; Zoroofi, Reza Aghaeizadeh; Otake, Yoshito; Sato, Yoshinobu
2017-02-01
Accurate detection of maxillofacial cysts is an essential step for diagnosis, monitoring and planning therapeutic intervention. Cysts can be of various sizes and shapes and existing detection methods lead to poor results. Customizing automatic detection systems to gain sufficient accuracy in clinical practice is highly challenging. For this purpose, integrating the engineering knowledge in efficient feature extraction is essential. This paper presents a novel framework for maxillofacial cyst detection. A hybrid methodology based on surface and texture information is introduced. The proposed approach consists of three main steps: first, each cystic lesion is segmented with high accuracy; then, in the second and third steps, feature extraction and classification are performed. Contourlet and SPHARM coefficients are utilized as texture and shape features which are fed into the classifier. Two different classifiers are used in this study, i.e. support vector machine and sparse discriminant analysis. Generally, SPHARM coefficients are estimated by the iterative residual fitting (IRF) algorithm, which is based on a stepwise regression method. In order to improve the accuracy of IRF estimation, a method based on extra orthogonalization is employed to reduce linear dependency. We have utilized a ground-truth dataset consisting of cone beam CT images of 96 patients, belonging to three maxillofacial cyst categories: radicular cyst, dentigerous cyst and keratocystic odontogenic tumor. Using orthogonalized SPHARM, the residual sum of squares is decreased, which leads to a more accurate estimation. Analysis of the results based on statistical measures such as specificity, sensitivity, positive predictive value and negative predictive value is reported. A classification rate of 96.48% is achieved using sparse discriminant analysis and orthogonalized SPHARM features. Classification accuracy improved by at least 8.94% with respect to conventional features. This study demonstrated that our proposed methodology can improve the computer assisted diagnosis (CAD) performance by incorporating more discriminative features. Using orthogonalized SPHARM is promising in computerized cyst detection and may have a significant impact in future CAD systems. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Sankaran, Sethuraman; Humphrey, Jay D.; Marsden, Alison L.
2013-01-01
Computational models for vascular growth and remodeling (G&R) are used to predict the long-term response of vessels to changes in pressure, flow, and other mechanical loading conditions. Accurate predictions of these responses are essential for understanding numerous disease processes. Such models require reliable inputs of numerous parameters, including material properties and growth rates, which are often experimentally derived, and inherently uncertain. While earlier methods have used a brute force approach, systematic uncertainty quantification in G&R models promises to provide much better information. In this work, we introduce an efficient framework for uncertainty quantification and optimal parameter selection, and illustrate it via several examples. First, an adaptive sparse grid stochastic collocation scheme is implemented in an established G&R solver to quantify parameter sensitivities, and near-linear scaling with the number of parameters is demonstrated. This non-intrusive and parallelizable algorithm is compared with standard sampling algorithms such as Monte-Carlo. Second, we determine optimal arterial wall material properties by applying robust optimization. We couple the G&R simulator with an adaptive sparse grid collocation approach and a derivative-free optimization algorithm. We show that an artery can achieve optimal homeostatic conditions over a range of alterations in pressure and flow; robustness of the solution is enforced by including uncertainty in loading conditions in the objective function. We then show that homeostatic intramural and wall shear stress is maintained for a wide range of material properties, though the time it takes to achieve this state varies. We also show that the intramural stress is robust and lies within 5% of its mean value for realistic variability of the material parameters. We observe that prestretch of elastin and collagen are most critical to maintaining homeostasis, while values of the material properties are most critical in determining response time. Finally, we outline several challenges to the G&R community for future work. We suggest that these tools provide the first systematic and efficient framework to quantify uncertainties and optimally identify G&R model parameters. PMID:23626380
Zhe, Shandian; Xu, Zenglin; Qi, Yuan; Yu, Peng
2014-01-01
A key step for Alzheimer's disease (AD) study is to identify associations between genetic variations and intermediate phenotypes (e.g., brain structures). At the same time, it is crucial to develop a noninvasive means for AD diagnosis. Although these two tasks (association discovery and disease diagnosis) have been treated separately by a variety of approaches, they are tightly coupled due to their common biological basis. We hypothesize that the two tasks can potentially benefit each other by a joint analysis, because (i) the association study discovers correlated biomarkers from different data sources, which may help improve diagnosis accuracy, and (ii) the disease status may help identify disease-sensitive associations between genetic variations and MRI features. Based on this hypothesis, we present a new sparse Bayesian approach for joint association study and disease diagnosis. In this approach, common latent features are extracted from different data sources based on sparse projection matrices and used to predict multiple disease severity levels based on Gaussian process ordinal regression; in return, the disease status is used to guide the discovery of relationships between the data sources. The sparse projection matrices not only reveal the associations but also select groups of biomarkers related to AD. To learn the model from data, we develop an efficient variational expectation maximization algorithm. Simulation results demonstrate that our approach achieves higher accuracy in both predicting ordinal labels and discovering associations between data sources than alternative methods. We apply our approach to an imaging genetics dataset of AD. Our joint analysis approach not only identifies meaningful and interesting associations between genetic variations, brain structures, and AD status, but also achieves significantly higher accuracy for predicting ordinal AD stages than the competing methods.
Spatial Bayesian Latent Factor Regression Modeling of Coordinate-based Meta-analysis Data
Montagna, Silvia; Wager, Tor; Barrett, Lisa Feldman; Johnson, Timothy D.; Nichols, Thomas E.
2017-01-01
Now over 20 years old, functional MRI (fMRI) has a large and growing literature that is best synthesised with meta-analytic tools. As most authors do not share image data, only the peak activation coordinates (foci) reported in the paper are available for Coordinate-Based Meta-Analysis (CBMA). Neuroimaging meta-analysis is used to 1) identify areas of consistent activation; and 2) build a predictive model of task type or cognitive process for new studies (reverse inference). To simultaneously address these aims, we propose a Bayesian point process hierarchical model for CBMA. We model the foci from each study as a doubly stochastic Poisson process, where the study-specific log intensity function is characterised as a linear combination of a high-dimensional basis set. A sparse representation of the intensities is guaranteed through latent factor modeling of the basis coefficients. Within our framework, it is also possible to account for the effect of study-level covariates (meta-regression), significantly expanding the capabilities of the current neuroimaging meta-analysis methods available. We apply our methodology to synthetic data and neuroimaging meta-analysis datasets. PMID:28498564
A High Performance Block Eigensolver for Nuclear Configuration Interaction Calculations
Aktulga, Hasan Metin; Afibuzzaman, Md.; Williams, Samuel; ...
2017-06-01
As on-node parallelism increases and the performance gap between the processor and the memory system widens, achieving high performance in large-scale scientific applications requires an architecture-aware design of algorithms and solvers. We focus on the eigenvalue problem arising in nuclear Configuration Interaction (CI) calculations, where a few extreme eigenpairs of a sparse symmetric matrix are needed. Here, we consider a block iterative eigensolver whose main computational kernels are the multiplication of a sparse matrix with multiple vectors (SpMM), and tall-skinny matrix operations. We then present techniques to significantly improve the SpMM and the transpose operation SpMM^T by using the compressed sparse blocks (CSB) format. We achieve 3-4× speedup on the requisite operations over good implementations with the commonly used compressed sparse row (CSR) format. We develop a performance model that allows us to correctly estimate the performance of our SpMM kernel implementations, and we identify cache bandwidth as a potential performance bottleneck beyond DRAM. We also analyze and optimize the performance of LOBPCG kernels (inner product and linear combinations on multiple vectors) and show up to 15× speedup over using high performance BLAS libraries for these operations. The resulting high performance LOBPCG solver achieves 1.4× to 1.8× speedup over the existing Lanczos solver on a series of CI computations on high-end multicore architectures (Intel Xeons). We also analyze the performance of our techniques on an Intel Xeon Phi Knights Corner (KNC) processor.
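A quick way to see why SpMM beats repeated SpMV is to compute the same block product both ways. The sketch below is a hypothetical Python example using SciPy's CSR format in place of the paper's CSB format; matrix size, density, and block width are illustrative.

```python
# Contrast repeated sparse matrix-vector products with one blocked SpMM.
import numpy as np
import scipy.sparse as sp

rng = np.random.default_rng(0)
A = sp.random(2000, 2000, density=1e-3, format="csr", random_state=0)
A = A + A.T                          # symmetric, as in CI calculations
X = rng.standard_normal((2000, 8))   # block of 8 vectors

# One vector at a time: 8 separate sweeps over the nonzeros of A.
Y_spmv = np.column_stack([A @ X[:, j] for j in range(X.shape[1])])

# Blocked SpMM: a single sweep reuses each nonzero for all 8 vectors,
# which is the source of the memory-bandwidth savings reported above.
Y_spmm = A @ X

print(np.allclose(Y_spmv, Y_spmm))  # True
```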
Signal-Preserving Erratic Noise Attenuation via Iterative Robust Sparsity-Promoting Filter
Zhao, Qiang; Du, Qizhen; Gong, Xufei; ...
2018-04-06
Thresholding filters operating in a sparse domain are highly effective in removing Gaussian random noise under a Gaussian distribution assumption. Erratic noise, which designates non-Gaussian noise consisting of large isolated events with known or unknown distribution, also needs to be taken into account explicitly. However, conventional sparse-domain thresholding filters based on the least-squares (LS) criterion are severely sensitive to data with high-amplitude, non-Gaussian noise, i.e., the erratic noise, which makes the suppression of this type of noise extremely challenging. In this paper, we present a robust sparsity-promoting denoising model in which the LS criterion is replaced by the Huber criterion to weaken the effects of erratic noise. Random and erratic noise are distinguished by a data-adaptive parameter: random noise is described by its mean square, while erratic noise is downweighted through a damped weight. Unlike conventional sparse-domain thresholding filters, defining the misfit between noisy data and recovered signal via the Huber criterion results in a nonlinear optimization problem. With the help of theoretical pseudoseismic data, an iterative robust sparsity-promoting filter is proposed to transform the nonlinear optimization problem into a linear LS problem through an iterative procedure. The main advantage of this transformation is that the nonlinear denoising filter can be solved by conventional LS solvers. Finally, tests with several data sets demonstrate that the proposed denoising filter can successfully attenuate the erratic noise without damaging useful signal when compared with conventional denoising approaches based on the LS criterion.
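Transforming the nonlinear Huber-misfit problem into a sequence of linear LS problems is the classical iteratively reweighted least-squares idea. The following is a minimal sketch under that reading, with a hypothetical forward operator G, data d, and threshold eps; the sparsity-promoting term of the paper's full model is omitted.

```python
# Huber-criterion fitting via iteratively reweighted least squares (IRLS):
# each pass downweights large, erratic-noise-like residuals and re-solves
# an ordinary weighted LS problem with a conventional solver.
import numpy as np

def huber_irls(G, d, eps=1.0, n_iter=20):
    m = np.linalg.lstsq(G, d, rcond=None)[0]   # plain LS start
    for _ in range(n_iter):
        r = d - G @ m
        # Huber weights: w = 1 for small residuals, eps/|r| for large ones.
        w = np.where(np.abs(r) <= eps, 1.0, eps / np.abs(r))
        sw = np.sqrt(w)
        m = np.linalg.lstsq(G * sw[:, None], d * sw, rcond=None)[0]
    return m

rng = np.random.default_rng(1)
G = rng.standard_normal((200, 10))
m_true = rng.standard_normal(10)
d = G @ m_true + 0.05 * rng.standard_normal(200)
d[:5] += 50.0                        # a few large, isolated "erratic" events
print(np.linalg.norm(huber_irls(G, d) - m_true))  # small despite outliers
```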
Noniterative MAP reconstruction using sparse matrix representations.
Cao, Guangzhi; Bouman, Charles A; Webb, Kevin J
2009-09-01
We present a method for noniterative maximum a posteriori (MAP) tomographic reconstruction which is based on the use of sparse matrix representations. Our approach is to precompute and store the inverse matrix required for MAP reconstruction. This approach has generally not been used in the past because the inverse matrix is typically large and fully populated (i.e., not sparse). In order to overcome this problem, we introduce two new ideas. The first idea is a novel theory for the lossy source coding of matrix transformations which we refer to as matrix source coding. This theory is based on a distortion metric that reflects the distortions produced in the final matrix-vector product, rather than the distortions in the coded matrix itself. The resulting algorithms are shown to require orthonormal transformations of both the measurement data and the matrix rows and columns before quantization and coding. The second idea is a method for efficiently storing and computing the required orthonormal transformations, which we call a sparse-matrix transform (SMT). The SMT is a generalization of the classical FFT in that it uses butterflies to compute an orthonormal transform; but unlike an FFT, the SMT uses the butterflies in an irregular pattern, and is numerically designed to best approximate the desired transforms. We demonstrate the potential of the noniterative MAP reconstruction with examples from optical tomography. The method requires offline computation to encode the inverse transform. However, once these offline computations are completed, the noniterative MAP algorithm is shown to reduce both storage and computation by well over two orders of magnitude, as compared to linear iterative reconstruction methods.
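To make the storage-versus-computation trade concrete, here is a minimal Python sketch of the precomputed-inverse idea, assuming a generic quadratic (Tikhonov-style) prior; the matrix source coding and sparse-matrix transform stages that make storing the inverse practical at tomographic scale are not reproduced.

```python
# Offline: form and store the matrix mapping measurements to the MAP
# estimate once. Online: each reconstruction is a single matvec.
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((120, 80))   # hypothetical forward model
R = np.eye(80)                       # quadratic prior (regularization)
lam = 0.1

# Offline computation: H = (A^T A + lam R)^{-1} A^T.
H = np.linalg.solve(A.T @ A + lam * R, A.T)

# Online reconstruction: one matrix-vector product per measurement vector.
x_true = rng.standard_normal(80)
y = A @ x_true + 0.01 * rng.standard_normal(120)
x_map = H @ y
print(np.linalg.norm(x_map - x_true) / np.linalg.norm(x_true))
```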
Deploying temporary networks for upscaling of sparse network stations
NASA Astrophysics Data System (ADS)
Coopersmith, Evan J.; Cosh, Michael H.; Bell, Jesse E.; Kelly, Victoria; Hall, Mark; Palecki, Michael A.; Temimi, Marouane
2016-10-01
Soil observation networks at the national scale play an integral role in hydrologic modeling, drought assessment, agricultural decision support, and our ability to understand climate change. Understanding soil moisture variability is necessary to apply these measurements to model calibration, business and consumer applications, or even human health issues. The installation of soil moisture sensors as sparse, national networks is necessitated by limited financial resources. However, this results in the incomplete sampling of the local heterogeneity of soil type, vegetation cover, topography, and the fine spatial distribution of precipitation events. To this end, temporary networks can be installed in the areas surrounding a permanent installation within a sparse network. The temporary networks deployed in this study provide a more representative average at the 3 km and 9 km scales, localized about the permanent gauge. The value of such temporary networks is demonstrated at test sites in Millbrook, New York and Crossville, Tennessee. The capacity of a single U.S. Climate Reference Network (USCRN) sensor set to approximate the average of a temporary network at the 3 km and 9 km scales using a simple linear scaling function is tested. The capacity of a temporary network to provide reliable estimates with diminishing numbers of sensors, the temporal stability of those networks, and ultimately, the relationship of the variability of those networks to soil moisture conditions at the permanent sensor are investigated. In this manner, this work demonstrates the single-season installation of a temporary network as a mechanism to characterize the soil moisture variability at a permanent gauge within a sparse network.
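A minimal sketch of such a linear scaling function follows, assuming hypothetical daily series for the permanent gauge and the temporary-network average; the fitted line then upscales the permanent record once the temporary sensors are removed.

```python
# Calibrate a single permanent sensor against the temporary-network
# average, then reuse the fit to upscale the permanent record alone.
import numpy as np

rng = np.random.default_rng(3)
permanent = 0.15 + 0.1 * rng.random(90)   # daily soil moisture, one sensor
network_avg = 0.8 * permanent + 0.04 + 0.01 * rng.standard_normal(90)

# Fit network_avg ~ a * permanent + b on days when both are available.
a, b = np.polyfit(permanent, network_avg, deg=1)

# After the temporary network is removed, apply the scaling function.
upscaled = a * permanent + b
print(a, b, np.corrcoef(upscaled, network_avg)[0, 1])
```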
NELasso: Group-Sparse Modeling for Characterizing Relations Among Named Entities in News Articles.
Tariq, Amara; Karim, Asim; Foroosh, Hassan
2017-10-01
Named entities such as people, locations, and organizations play a vital role in characterizing online content. They often reflect information of interest and are frequently used in search queries. Although named entities can be detected reliably from textual content, extracting relations among them is more challenging, yet useful in various applications (e.g., news recommending systems). In this paper, we present a novel model and system for learning semantic relations among named entities from collections of news articles. We model each named entity occurrence with sparse structured logistic regression, and consider the words (predictors) to be grouped based on background semantics. This sparse group LASSO approach forces the weights of word groups that do not influence the prediction towards zero. The resulting sparse structure is utilized for defining the type and strength of relations. Our unsupervised system yields a named entities' network where each relation is typed, quantified, and characterized in context. These relations are the key to understanding news material over time and customizing newsfeeds for readers. Extensive evaluation of our system on articles from TIME magazine and BBC News shows that the learned relations correlate with static semantic relatedness measures like WLM, and capture the evolving relationships among named entities over time.
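The group-zeroing behavior comes from the proximal operator of the group LASSO penalty. Below is a minimal numpy sketch of that operator with hypothetical weights and groups; the paper's full sparse structured logistic regression is not reproduced.

```python
# Proximal operator of t * sum_g ||w_g||_2: each group of word weights is
# shrunk toward zero, and whole weak groups are set exactly to zero.
import numpy as np

def group_soft_threshold(w, groups, t):
    out = w.copy()
    for g in groups:
        norm = np.linalg.norm(w[g])
        out[g] = 0.0 if norm <= t else (1.0 - t / norm) * w[g]
    return out

w = np.array([0.9, 1.1, -0.8, 0.05, -0.02, 0.4, -0.3])
groups = [np.arange(0, 3), np.arange(3, 5), np.arange(5, 7)]
print(group_soft_threshold(w, groups, t=0.2))
# The weak middle group is zeroed out; the strong groups only shrink.
```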
ADDITIVITY ASSESSMENT OF TRIHALOMETHANE MIXTURES BY PROPORTIONAL RESPONSE ADDITION
If additivity is known or assumed, the toxicity of a chemical mixture may be predicted from the dose response curves of the individual chemicals comprising the mixture. As single chemical data are abundant and mixture data sparse, mixture risk methods that utilize single chemical...
Graph cuts via l1 norm minimization.
Bhusnurmath, Arvind; Taylor, Camillo J
2008-10-01
Graph cuts have become an increasingly important tool for solving a number of energy minimization problems in computer vision and other fields. In this paper, the graph cut problem is reformulated as an unconstrained l1 norm minimization that can be solved effectively using interior point methods. This reformulation exposes connections between the graph cuts and other related continuous optimization problems. Eventually the problem is reduced to solving a sequence of sparse linear systems involving the Laplacian of the underlying graph. The proposed procedure exploits the structure of these linear systems in a manner that is easily amenable to parallel implementations. Experimental results obtained by applying the procedure to graphs derived from image processing problems are provided.
An implementation of the look-ahead Lanczos algorithm for non-Hermitian matrices, part 2
NASA Technical Reports Server (NTRS)
Freund, Roland W.; Nachtigal, Noel M.
1990-01-01
It is shown how the look-ahead Lanczos process (combined with a quasi-minimal residual (QMR) approach) can be used to develop a robust black box solver for large sparse non-Hermitian linear systems. Details of an implementation of the resulting QMR algorithm are presented. It is demonstrated that the QMR method is closely related to the biconjugate gradient (BCG) algorithm; however, unlike BCG, the QMR algorithm has smooth convergence curves and good numerical properties. We report numerical experiments with our implementation of the look-ahead Lanczos algorithm, both for eigenvalue problems and linear systems. Also, program listings of FORTRAN implementations of the look-ahead algorithm and the QMR method are included.
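QMR's descendants survive in standard libraries, so the black-box usage pattern is easy to demonstrate. The sketch below calls SciPy's qmr on a hypothetical nonsymmetric tridiagonal system; it illustrates the solver family, not the paper's FORTRAN implementation.

```python
# QMR-type Krylov solve on a sparse non-Hermitian system.
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import qmr

n = 500
# Nonsymmetric tridiagonal matrix: diffusion plus one-sided convection.
A = sp.diags([-1.3, 2.0, -0.7], [-1, 0, 1], shape=(n, n), format="csc")
b = np.ones(n)

x, info = qmr(A, b)
print(info, np.linalg.norm(A @ x - b))  # info == 0 means convergence
```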
NASA Astrophysics Data System (ADS)
Imamura, Seigo; Ono, Kenji; Yokokawa, Mitsuo
2016-07-01
Ensemble computing, which is an instance of capacity computing, is an effective computing scenario for exascale parallel supercomputers. In ensemble computing, there are multiple linear systems associated with a common coefficient matrix. We improve the performance of iterative solvers for multiple vectors by solving them at the same time, that is, by solving for the product of the matrices. We implemented several iterative methods and compared their performance. The maximum performance on Sparc VIIIfx was 7.6 times higher than that of a naïve implementation. Finally, to deal with the different convergence processes of linear systems, we introduced a control method to eliminate the calculation of already converged vectors.
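The sketch below illustrates the shared-coefficient-matrix setting in Python. For brevity it uses one direct factorization reused across all right-hand sides rather than the blocked iterative methods described above, but the payoff is the same: the expensive work on A is done once and amortized over the whole ensemble. The matrix and right-hand sides are hypothetical.

```python
# Several linear systems share one coefficient matrix, so factor A once
# and apply the triangular solves to all right-hand sides together.
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu

n, k = 1000, 16
A = sp.diags([-1.0, 2.1, -1.0], [-1, 0, 1], shape=(n, n), format="csc")
B = np.random.default_rng(4).standard_normal((n, k))  # k right-hand sides

lu = splu(A)      # factor A once
X = lu.solve(B)   # forward/backward solves for all k vectors at once

print(np.allclose(A @ X, B))  # True
```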
GPU-accelerated element-free reverse-time migration with Gauss points partition
NASA Astrophysics Data System (ADS)
Zhou, Zhen; Jia, Xiaofeng; Qiang, Xiaodong
2018-06-01
An element-free method (EFM) has been demonstrated successfully in elasticity, heat conduction and fatigue crack growth problems. We present the theory of EFM and its numerical applications in seismic modelling and reverse time migration (RTM). Compared with the finite difference method and the finite element method, the EFM has unique advantages: (1) independence of grids in computation and (2) lower expense and more flexibility (because only the information of the nodes and the boundary of the concerned area is required). However, in EFM, due to improper computation and storage of some large sparse matrices, such as the mass matrix and the stiffness matrix, the method is difficult to apply to seismic modelling and RTM for a large velocity model. To solve the problem of storage and computation efficiency, we propose a concept of Gauss points partition and utilise the graphics processing unit to improve the computational efficiency. We employ the compressed sparse row format to compress the intermediate large sparse matrices and attempt to simplify the operations by solving the linear equations with CULA solver. To improve the computation efficiency further, we introduce the concept of the lumped mass matrix. Numerical experiments indicate that the proposed method is accurate and more efficient than the regular EFM.
Configurable hardware integrate and fire neurons for sparse approximation.
Shapero, Samuel; Rozell, Christopher; Hasler, Paul
2013-09-01
Sparse approximation is an important optimization problem in signal and image processing applications. A Hopfield-Network-like system of integrate and fire (IF) neurons is proposed as a solution, using the Locally Competitive Algorithm (LCA) to solve an overcomplete L1 sparse approximation problem. A scalable system architecture is described, including IF neurons with a nonlinear firing function, and current-based synapses to provide linear computation. A network of 18 neurons with 12 inputs is implemented on the RASP 2.9v chip, a Field Programmable Analog Array (FPAA) with directly programmable floating gate elements. The system uses over 1400 floating gates, making it the largest system programmed on an FPAA to date. The circuit successfully reproduced the outputs of a digital optimization program, converging to within 4.8% RMS, with an objective cost only 1.7% higher on average. The active circuit consumed 559 μA of current at 2.4 V and converged on solutions in 25 μs, with measurement of the converged spike rate taking an additional 1 ms. Extrapolating the scaling trends to an N=1000 node system, the spiking LCA compares favorably with state-of-the-art digital solutions, and analog solutions using a non-spiking approach. Copyright © 2013 Elsevier Ltd. All rights reserved.
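For readers unfamiliar with the LCA, the rate-based dynamics that the spiking circuit approximates can be written in a few lines of numpy. The sketch below follows the standard LCA formulation from the literature, with a hypothetical 12-input, 18-neuron dictionary matching the implemented network's dimensions; all constants are illustrative.

```python
# Rate-based LCA: leaky integration of inputs with lateral inhibition,
# with a soft-threshold nonlinearity producing the sparse code.
import numpy as np

rng = np.random.default_rng(5)
Phi = rng.standard_normal((12, 18))
Phi /= np.linalg.norm(Phi, axis=0)            # unit-norm dictionary atoms
a_true = np.zeros(18); a_true[[2, 9]] = 1.0
s = Phi @ a_true                              # 12-dimensional input signal

lam, tau, dt = 0.1, 1.0, 0.05
u = np.zeros(18)                              # membrane-like internal states
for _ in range(400):
    a = np.sign(u) * np.maximum(np.abs(u) - lam, 0.0)   # soft threshold
    # Drive toward Phi^T s with lateral inhibition (Phi^T Phi - I) a.
    u += dt * (Phi.T @ s - u - (Phi.T @ Phi - np.eye(18)) @ a) / tau

a = np.sign(u) * np.maximum(np.abs(u) - lam, 0.0)
print(np.nonzero(a)[0])  # active coefficients concentrate on atoms 2 and 9
```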
NASA Astrophysics Data System (ADS)
Sun, Yankui; Li, Shan; Sun, Zhongyang
2017-01-01
We propose a framework for automated detection of dry age-related macular degeneration (AMD) and diabetic macular edema (DME) from retina optical coherence tomography (OCT) images, based on sparse coding and dictionary learning. The study aims to improve the classification performance of state-of-the-art methods. First, our method presents a general approach to automatically align and crop retina regions; then it obtains global representations of images by using sparse coding and a spatial pyramid; finally, a multiclass linear support vector machine classifier is employed for classification. We apply two datasets for validating our algorithm: Duke spectral domain OCT (SD-OCT) dataset, consisting of volumetric scans acquired from 45 subjects: 15 normal subjects, 15 AMD patients, and 15 DME patients; and clinical SD-OCT dataset, consisting of 678 OCT retina scans acquired from clinics in Beijing: 168, 297, and 213 OCT images for AMD, DME, and normal retinas, respectively. For the former dataset, our classifier correctly identifies 100%, 100%, and 93.33% of the volumes with DME, AMD, and normal subjects, respectively, and thus performs much better than the conventional method; for the latter dataset, our classifier leads to a correct classification rate of 99.67%, 99.67%, and 100.00% for DME, AMD, and normal images, respectively.
NASA Astrophysics Data System (ADS)
An, Zhe; Rey, Daniel; Ye, Jingxin; Abarbanel, Henry D. I.
2017-01-01
The problem of forecasting the behavior of a complex dynamical system through analysis of observational time-series data becomes difficult when the system expresses chaotic behavior and the measurements are sparse in space and/or time. Despite the fact that this situation is quite typical across many fields, including numerical weather prediction, the issue of whether the available observations are "sufficient" for generating successful forecasts is still not well understood. An analysis by Whartenby et al. (2013) found that in the context of the nonlinear shallow water equations on a β plane, standard nudging techniques require observing approximately 70 % of the full set of state variables. Here we examine the same system using a method introduced by Rey et al. (2014a), which generalizes standard nudging methods to utilize time-delayed measurements. We show that in certain circumstances, it provides a sizable reduction in the number of observations required to construct accurate estimates and high-quality predictions. In particular, we find that this estimate of 70 % can be reduced to about 33 % using time delays, and even further if Lagrangian drifter locations are also used as measurements.
Sparse Event Modeling with Hierarchical Bayesian Kernel Methods
2016-01-05
The research objective of this proposal was to develop a predictive Bayesian kernel approach to model count data based on several predictive variables. Such an approach, which we refer to as the Poisson Bayesian kernel model, is able to model the rate of occurrence of... which adds specificity to the model and can make nonlinear data more manageable. Early results show that the...
Rosenfeld, Adar; Dorman, Michael; Schwartz, Joel; Novack, Victor; Just, Allan C; Kloog, Itai
2017-11-01
Meteorological stations measure air temperature (Ta) accurately with high temporal resolution, but usually suffer from limited spatial resolution due to their sparse distribution across rural, undeveloped or less populated areas. Remote sensing satellite-based measurements provide daily surface temperature (Ts) data in high spatial and temporal resolution and can improve the estimation of daily Ta. In this study we developed spatiotemporally resolved models which allow us to predict three daily parameters: Ta Max (daytime), 24 h mean, and Ta Min (nighttime) on a fine 1 km grid across the state of Israel. We used and compared both the Aqua and Terra MODIS satellites. We used linear mixed effect models, IDW (inverse distance weighted) interpolations and thin plate splines (using a smooth nonparametric function of longitude and latitude) to first calibrate between Ts and Ta in those locations where we have available data for both and used that calibration to fill in neighboring cells without surface monitors or missing Ts. Out-of-sample ten-fold cross validation (CV) was used to quantify the accuracy of our predictions. Our model performance was excellent for both days with and without available Ts observations for both Aqua and Terra (CV Aqua R² results for min 0.966, mean 0.986, and max 0.967; CV Terra R² results for min 0.965, mean 0.987, and max 0.968). Our research shows that daily min, mean and max Ta can be reliably predicted using daily MODIS Ts data even across Israel, with high accuracy even for days without Ta or Ts data. These predictions can be used as three separate Ta exposures in epidemiology studies for better diurnal exposure assessment. Copyright © 2017 Elsevier Inc. All rights reserved.
Luenser, Arne; Kussmann, Jörg; Ochsenfeld, Christian
2016-09-28
We present a (sub)linear-scaling algorithm to determine indirect nuclear spin-spin coupling constants at the Hartree-Fock and Kohn-Sham density functional levels of theory. Employing efficient integral algorithms and sparse algebra routines, an overall (sub)linear scaling behavior can be obtained for systems with a non-vanishing HOMO-LUMO gap. Calculations on systems with over 1000 atoms and 20 000 basis functions illustrate the performance and accuracy of our reference implementation. Specifically, we demonstrate that linear algebra dominates the runtime of conventional algorithms for 10 000 basis functions and above. Attainable speedups of our method exceed 6 × in total runtime and 10 × in the linear algebra steps for the tested systems. Furthermore, a convergence study of spin-spin couplings of an aminopyrazole peptide upon inclusion of the water environment is presented: using the new method it is shown that large solvent spheres are necessary to converge spin-spin coupling values.
Kawaguchi, Atsushi; Yamashita, Fumio
2017-10-01
This article proposes a procedure for describing the relationship between high-dimensional data sets, such as multimodal brain images and genetic data. We propose a supervised technique to incorporate the clinical outcome to determine a score, which is a linear combination of variables with hierarchical structures to multimodalities. This approach is expected to obtain interpretable and predictive scores. The proposed method was applied to a study of Alzheimer's disease (AD). We propose a diagnostic method for AD that involves using whole-brain magnetic resonance imaging (MRI) and positron emission tomography (PET), and we select effective brain regions for the diagnostic probability and investigate the genome-wide association with the regions using single nucleotide polymorphisms (SNPs). The two-step dimension reduction method, which we previously introduced, was considered applicable to such a study and allows us to partially incorporate the proposed method. We show that the proposed method offers classification functions with feasibility and reasonable prediction accuracy based on the receiver operating characteristic (ROC) analysis and reasonable regions of the brain and genomes. Our simulation study based on the synthetic structured data set showed that the proposed method outperformed the original method and provided the characteristic for the supervised feature. © The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Hydrological flow predictions in ungauged and sparsely gauged watersheds use regionalization or classification of hydrologically similar watersheds to develop empirical relationships between hydrologic, climatic, and watershed variables. The watershed classifications may be based...
Assessment of Mercury in Fish Tissue from Select Lakes of Northeastern Oregon
A fish tissue study was conducted in five northeastern Oregon reservoirs to evaluate mercury concentrations in an area where elevated atmospheric mercury deposition had been predicted by a national EPA model, but where tissue data were sparse. The study targeted resident predator...
Kussmann, Jörg; Ochsenfeld, Christian
2007-11-28
A density matrix-based time-dependent self-consistent field (D-TDSCF) method for the calculation of dynamic polarizabilities and first hyperpolarizabilities using the Hartree-Fock and Kohn-Sham density functional theory approaches is presented. The D-TDSCF method allows us to reduce the asymptotic scaling behavior of the computational effort from cubic to linear for systems with a nonvanishing band gap. The linear scaling is achieved by combining a density matrix-based reformulation of the TDSCF equations with linear-scaling schemes for the formation of Fock- or Kohn-Sham-type matrices. In our reformulation only potentially linear-scaling matrices enter the formulation and efficient sparse algebra routines can be employed. Furthermore, the corresponding formulas for the first hyperpolarizabilities are given in terms of zeroth- and first-order one-particle reduced density matrices according to Wigner's (2n+1) rule. The scaling behavior of our method is illustrated for first exemplary calculations with systems of up to 1011 atoms and 8899 basis functions.
Kussmann, Jörg; Ochsenfeld, Christian
2007-08-07
Details of a new density matrix-based formulation for calculating nuclear magnetic resonance chemical shifts at both Hartree-Fock and density functional theory levels are presented. For systems with a nonvanishing highest occupied molecular orbital-lowest unoccupied molecular orbital gap, the method allows us to reduce the asymptotic scaling order of the computational effort from cubic to linear, so that molecular systems with 1000 and more atoms can be tackled with today's computers. The key feature is a reformulation of the coupled-perturbed self-consistent field (CPSCF) theory in terms of the one-particle density matrix (D-CPSCF), which avoids entirely the use of canonical MOs. By means of a direct solution for the required perturbed density matrices and the adaptation of linear-scaling integral contraction schemes, the overall scaling of the computational effort is reduced to linear. A particular focus of our formulation is to ensure numerical stability when sparse-algebra routines are used to obtain an overall linear-scaling behavior.
Ji, Jiadong; He, Di; Feng, Yang; He, Yong; Xue, Fuzhong; Xie, Lei
2017-10-01
A complex disease is usually driven by a number of genes interwoven into networks, rather than a single gene product. Network comparison or differential network analysis has become an important means of revealing the underlying mechanism of pathogenesis and identifying clinical biomarkers for disease classification. Most studies, however, are limited to network correlations that mainly capture the linear relationship among genes, or rely on the assumption of a parametric probability distribution of gene measurements. They are restrictive in real application. We propose a new Joint density based non-parametric Differential Interaction Network Analysis and Classification (JDINAC) method to identify differential interaction patterns of network activation between two groups. At the same time, JDINAC uses the network biomarkers to build a classification model. The novelty of JDINAC lies in its potential to capture non-linear relations between molecular interactions using high-dimensional sparse data as well as to adjust confounding factors, without the need of the assumption of a parametric probability distribution of gene measurements. Simulation studies demonstrate that JDINAC provides more accurate differential network estimation and lower classification error than that achieved by other state-of-the-art methods. We apply JDINAC to a Breast Invasive Carcinoma dataset, which includes 114 patients who have both tumor and matched normal samples. The hub genes and differential interaction patterns identified were consistent with existing experimental studies. Furthermore, JDINAC discriminated the tumor and normal sample with high accuracy by virtue of the identified biomarkers. JDINAC provides a general framework for feature selection and classification using high-dimensional sparse omics data. R scripts available at https://github.com/jijiadong/JDINAC. lxie@iscb.org. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
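The non-parametric, density-based view of an interaction can be illustrated compactly. The sketch below is a hypothetical Python example that estimates per-class joint densities for one gene pair with kernel density estimation and uses the log density ratio as a differential-interaction feature; the full JDINAC procedure (cross-validated penalized pooling over all gene pairs) is omitted.

```python
# For one gene pair, fit a joint density per class and score new samples
# by the log density ratio, a non-linear differential-interaction feature.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(6)
# Class 0: independent genes; class 1: strongly correlated pair.
x0 = rng.multivariate_normal([0, 0], [[1, 0.0], [0.0, 1]], size=300).T
x1 = rng.multivariate_normal([0, 0], [[1, 0.9], [0.9, 1]], size=300).T

kde0, kde1 = gaussian_kde(x0), gaussian_kde(x1)

# Feature for new samples: log f1(x) - log f0(x).
x_new = rng.multivariate_normal([0, 0], [[1, 0.9], [0.9, 1]], size=5).T
feature = np.log(kde1(x_new)) - np.log(kde0(x_new))
print(feature)  # mostly positive: the pair behaves like class 1
```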
NASA Astrophysics Data System (ADS)
Elkurdi, Yousef; Fernández, David; Souleimanov, Evgueni; Giannacopoulos, Dennis; Gross, Warren J.
2008-04-01
The Finite Element Method (FEM) is a computationally intensive scientific and engineering analysis tool that has diverse applications ranging from structural engineering to electromagnetic simulation. The trends in floating-point performance are moving in favor of Field-Programmable Gate Arrays (FPGAs), hence interest has grown in the scientific community to exploit this technology. We present an architecture and implementation of an FPGA-based sparse matrix-vector multiplier (SMVM) for use in the iterative solution of large, sparse systems of equations arising from FEM applications. FEM matrices display specific sparsity patterns that can be exploited to improve the efficiency of hardware designs. Our architecture exploits FEM matrix sparsity structure to achieve a balance between performance and hardware resource requirements by relying on external SDRAM for data storage while utilizing the FPGA's computational resources in a stream-through systolic approach. The architecture is based on a pipelined linear array of processing elements (PEs) coupled with a hardware-oriented matrix striping algorithm and a partitioning scheme which enables it to process arbitrarily big matrices without changing the number of PEs in the architecture. Therefore, this architecture is only limited by the amount of external RAM available to the FPGA. The implemented SMVM-pipeline prototype contains 8 PEs and is clocked at 110 MHz, obtaining a peak performance of 1.76 GFLOPS. For 8 GB/s of memory bandwidth typical of recent FPGA systems, this architecture can achieve 1.5 GFLOPS sustained performance. Using multiple instances of the pipeline, linear scaling of the peak and sustained performance can be achieved. Our stream-through architecture provides the added advantage of enabling an iterative implementation of the SMVM computation required by iterative solution techniques such as the conjugate gradient method, avoiding initialization time due to data loading and setup inside the FPGA internal memory.
LiDAR point classification based on sparse representation
NASA Astrophysics Data System (ADS)
Li, Nan; Pfeifer, Norbert; Liu, Chun
2017-04-01
To combine the initial spatial structure and features of LiDAR data for accurate classification, the LiDAR data are represented as a 4-order tensor, and a sparse representation for classification (SRC) method is used. SRC needs only a few training samples from each class and still achieves good classification results. Multiple features are extracted from the raw LiDAR points to generate a high-dimensional vector at each point; the LiDAR tensor is then built from the spatial distribution and feature vectors of the point neighborhood. The entries of the LiDAR tensor are accessed via four indexes, each called a mode: three spatial modes in directions X, Y, Z and one feature mode. To explore the sparsity of the LiDAR tensor, the Tucker decomposition is used: it decomposes a tensor into a core tensor multiplied by a matrix along each mode. Those matrices can be considered the principal components in each mode, while the entries of the core tensor show the level of interaction between the different components. Therefore, the LiDAR tensor can be approximately represented by a sparse tensor multiplied by a matrix selected from a dictionary along each mode. The matrices decomposed from the training samples are arranged as the initial elements of the dictionary; by dictionary learning, a reconstructive and discriminative structured dictionary along each mode is built. The overall structured dictionary is composed of class-specific sub-dictionaries. The sparse core tensor is then calculated by a tensor OMP (Orthogonal Matching Pursuit) method based on the dictionaries along each mode. It is expected that the original tensor is well recovered by the sub-dictionary associated with the relevant class, while entries in the sparse tensor associated with other classes are nearly zero; SRC therefore uses the per-class reconstruction error for classification. A section of airborne LiDAR points over the city of Vienna is classified into six classes: ground, roofs, vegetation, covered ground, walls and other points, with only six training samples taken from each class. For the final classification result, ground and covered ground are merged into one class (ground). The classification accuracy is 94.60% for ground, 95.47% for roofs, 85.55% for vegetation, 76.17% for walls, and 20.39% for other objects.
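Stripped of the tensor machinery, the SRC decision rule is simply minimum reconstruction error over class sub-dictionaries. The sketch below is a toy vector version under that simplification, with least-squares coding standing in for tensor OMP; dictionary sizes and data are hypothetical.

```python
# SRC decision rule: code a test sample with each class's sub-dictionary
# and assign the class whose atoms reconstruct it with the smallest error.
import numpy as np

rng = np.random.default_rng(7)
n_classes, atoms_per_class, dim = 3, 6, 20

# Class-specific sub-dictionaries (6 training samples per class).
subdicts = [rng.standard_normal((dim, atoms_per_class))
            for _ in range(n_classes)]

def src_classify(y):
    errors = []
    for D in subdicts:
        c, *_ = np.linalg.lstsq(D, y, rcond=None)   # code over this class
        errors.append(np.linalg.norm(y - D @ c))    # reconstruction error
    return int(np.argmin(errors))

# A sample generated from class 1's sub-dictionary is recovered best by
# that sub-dictionary, so the minimum-error rule labels it class 1.
y = subdicts[1] @ rng.standard_normal(atoms_per_class)
print(src_classify(y))  # 1
```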
Genetic diversity and trait genomic prediction in a pea diversity panel.
Burstin, Judith; Salloignon, Pauline; Chabert-Martinello, Marianne; Magnin-Robert, Jean-Bernard; Siol, Mathieu; Jacquin, Françoise; Chauveau, Aurélie; Pont, Caroline; Aubert, Grégoire; Delaitre, Catherine; Truntzer, Caroline; Duc, Gérard
2015-02-21
Pea (Pisum sativum L.), a major pulse crop grown for its protein-rich seeds, is an important component of agroecological cropping systems in diverse regions of the world. New breeding challenges imposed by global climate change and new regulations urge pea breeders to undertake more efficient methods of selection and better take advantage of the large genetic diversity present in the Pisum sativum genepool. Diversity studies conducted so far in pea used Simple Sequence Repeat (SSR) and Retrotransposon Based Insertion Polymorphism (RBIP) markers. Recently, SNP marker panels have been developed that will be useful for genetic diversity assessment and marker-assisted selection. A collection of diverse pea accessions, including landraces and cultivars of garden, field or fodder peas as well as wild peas, was characterised at the molecular level using newly developed SNP markers, as well as SSR markers and RBIP markers. The three types of markers were used to describe the structure of the collection and revealed different pictures of the genetic diversity among the collection. SSR showed the fastest rate of evolution and RBIP the slowest, pointing to their contrasted modes of evolution. SNP markers were then used to predict three phenotypes recorded for the collection: the date of flowering (BegFlo), the number of seeds per plant (Nseed) and thousand seed weight (TSW). Different statistical methods were tested, including the LASSO (Least Absolute Shrinkage and Selection Operator), PLS (Partial Least Squares), SPLS (Sparse Partial Least Squares), Bayes A, Bayes B and GBLUP (Genomic Best Linear Unbiased Prediction) methods, and the structure of the collection was taken into account in the prediction. Despite a limited number of 331 markers used for prediction, TSW was reliably predicted. The development of marker assisted selection has not reached its full potential in pea until now. This paper shows that the high-throughput SNP arrays that are being developed will most probably allow for a more efficient selection in this species.
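As a concrete illustration of marker-based trait prediction, the sketch below fits a LASSO model to simulated genotypes and reports cross-validated accuracy, mirroring the 331-marker setting in spirit; the data, effect sizes, and penalty are hypothetical, and the paper's correction for population structure is omitted.

```python
# Genomic prediction of a quantitative trait from SNP genotypes via LASSO.
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(8)
n_acc, n_snp = 300, 331                 # accessions x SNP markers
X = rng.integers(0, 3, size=(n_acc, n_snp)).astype(float)  # 0/1/2 genotypes
beta = np.zeros(n_snp)
beta[rng.choice(n_snp, 10, replace=False)] = 1.0            # 10 causal SNPs
y = X @ beta + rng.standard_normal(n_acc)   # simulated trait, e.g. TSW

model = Lasso(alpha=0.1)
scores = cross_val_score(model, X, y, cv=10, scoring="r2")
print(scores.mean())   # cross-validated predictive ability
```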
AITRAC: Augmented Interactive Transient Radiation Analysis by Computer. User's information manual
DOE Office of Scientific and Technical Information (OSTI.GOV)
Not Available
1977-10-01
AITRAC is a program designed for on-line, interactive, DC, and transient analysis of electronic circuits. The program solves the linear and nonlinear simultaneous equations which characterize the mathematical models used to predict circuit response. The program features 100-external-node, 200-branch capability; a conversational, free-format input language; built-in junction, FET, MOS, and switch models; a sparse matrix algorithm with extended-precision H matrix and T vector calculations, for fast and accurate execution; linear transconductances: beta, GM, MU, ZM; accurate and fast radiation effects analysis; a special interface for user-defined equations; selective control of multiple outputs; graphical outputs in wide and narrow formats; and on-line parameter modification capability. The user describes the problem by entering the circuit topology and part parameters. The program then automatically generates and solves the circuit equations, providing the user with printed or plotted output. The circuit topology and/or part values may then be changed by the user, and a new analysis requested. Circuit descriptions may be saved on disk files for storage and later use. The program contains built-in standard models for resistors, voltage and current sources, capacitors, inductors including mutual couplings, switches, junction diodes and transistors, FETs, and MOS devices. Nonstandard models may be constructed from standard models or by using the special equations interface. Time functions may be described by straight-line segments or by sine, damped sine, and exponential functions. 42 figures, 1 table. (RWR)
Inequality across Consonantal Contrasts in Speech Perception: Evidence from Mismatch Negativity
ERIC Educational Resources Information Center
Cornell, Sonia A.; Lahiri, Aditi; Eulitz, Carsten
2013-01-01
The precise structure of speech sound representations is still a matter of debate. In the present neurobiological study, we compared predictions about differential sensitivity to speech contrasts between models that assume full specification of all phonological information in the mental lexicon with those assuming sparse representations (only…
Regulators and consultants alike are routinely tasked with predicting potential future impacts to ground water resources from leaking underground storage tank (LUST) sites. Site data is usually sparse, variable, and uncertain at best. However, this type of data can be evaluated ...
Daily mapping of 30m LAI and NDVI for grape yield prediction in California vineyards
USDA-ARS?s Scientific Manuscript database
Wine grape quality and quantity are affected by vine growing conditions during critical phenological stages. Field observations of vine growth stages are normally very sparse and cannot capture the spatial variability of vine conditions. Remote sensing data acquired from visible and near infrared ba...
A Higher Order Iterative Method for Computing the Drazin Inverse
Soleymani, F.; Stanimirović, Predrag S.
2013-01-01
A method with a high convergence rate for finding approximate inverses of nonsingular matrices is suggested and established analytically. An extension of the introduced computational scheme to general square matrices is defined. The extended method could be used for finding the Drazin inverse. The application of the scheme to large sparse test matrices, alongside its use in preconditioning linear systems of equations, will be presented to clarify the contribution of the paper. PMID:24222747
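For orientation, the family of schemes in question generalizes the classical second-order Newton-Schulz iteration, shown below in a minimal numpy sketch with a hypothetical well-conditioned test matrix; the paper's higher-order variant and its Drazin-inverse extension are not reproduced.

```python
# Newton-Schulz iteration for approximate matrix inversion:
# X <- X (2I - A X), quadratically convergent from a suitable start.
import numpy as np

rng = np.random.default_rng(9)
n = 50
A = np.eye(n) + 0.1 * rng.standard_normal((n, n))  # well-conditioned test

# Standard convergent initialization: X0 = A^T / (||A||_1 * ||A||_inf).
X = A.T / (np.linalg.norm(A, 1) * np.linalg.norm(A, np.inf))
for _ in range(30):
    X = X @ (2 * np.eye(n) - A @ X)

print(np.linalg.norm(A @ X - np.eye(n)))   # near machine precision
```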
Simulation of Locking Space Truss Deployments for a Large Deployable Sparse Aperture Reflector
2015-03-01
Only fragments of the abstract are recoverable: meshes are not generated automatically and must be defined by the engineer, though the meshing process is quite swift; Figure 13 of the report shows a linear hexahedral element; and, because part of the computation remains serial, COMSOL's partially parallelized algorithms are not sped up simply as a function of cores added.
Communication: Analysing kinetic transition networks for rare events.
Stevenson, Jacob D; Wales, David J
2014-07-28
The graph transformation approach is a recently proposed method for computing mean first passage times, rates, and committor probabilities for kinetic transition networks. Here we compare the performance to existing linear algebra methods, focusing on large, sparse networks. We show that graph transformation provides a much more robust framework, succeeding when numerical precision issues cause the other methods to fail completely. These are precisely the situations that correspond to rare event dynamics for which the graph transformation was introduced.
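The linear algebra baseline being compared against is a direct solve of the mean-first-passage-time equations. The sketch below shows it for a hypothetical three-state Markov chain; it is exactly this solve that loses numerical precision on metastable (rare-event) networks, which motivates the graph transformation approach.

```python
# Mean first passage times into an absorbing target state satisfy
# (I - Q) t = 1 over the transient states of a Markov chain.
import numpy as np

# Transition matrix over states {0, 1, 2}; state 2 is the target.
P = np.array([[0.90, 0.09, 0.01],
              [0.05, 0.90, 0.05],
              [0.00, 0.00, 1.00]])

Q = P[:2, :2]    # dynamics restricted to the non-target states
t = np.linalg.solve(np.eye(2) - Q, np.ones(2))
print(t)         # mean first passage times from states 0 and 1
```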
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hutchinson, S.A.; Shadid, J.N.; Tuminaro, R.S.
1995-10-01
Aztec is an iterative library that greatly simplifies the parallelization process when solving the linear systems of equations Ax = b where A is a user supplied n x n sparse matrix, b is a user supplied vector of length n and x is a vector of length n to be computed. Aztec is intended as a software tool for users who want to avoid cumbersome parallel programming details but who have large sparse linear systems which require an efficiently utilized parallel processing system. A collection of data transformation tools are provided that allow for easy creation of distributed sparse unstructured matrices for parallel solution. Once the distributed matrix is created, computation can be performed on any of the parallel machines running Aztec: nCUBE 2, IBM SP2 and Intel Paragon, MPI platforms as well as standard serial and vector platforms. Aztec includes a number of Krylov iterative methods such as conjugate gradient (CG), generalized minimum residual (GMRES) and stabilized biconjugate gradient (BICGSTAB) to solve systems of equations. These Krylov methods are used in conjunction with various preconditioners such as polynomial or domain decomposition methods using LU or incomplete LU factorizations within subdomains. Although the matrix A can be general, the package has been designed for matrices arising from the approximation of partial differential equations (PDEs). In particular, the Aztec package is oriented toward systems arising from PDE applications.
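The Krylov-plus-preconditioner pattern Aztec packages is easy to mimic with SciPy, shown below as a hypothetical GMRES solve preconditioned by an incomplete LU factorization; the test matrix is an illustrative PDE-style tridiagonal system, not an Aztec benchmark.

```python
# GMRES accelerated by an incomplete LU (ILU) preconditioner.
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import gmres, spilu, LinearOperator

n = 400
A = sp.diags([-1.4, 2.0, -0.6], [-1, 0, 1], shape=(n, n), format="csc")
b = np.ones(n)

ilu = spilu(A, drop_tol=1e-4)                 # incomplete LU factorization
M = LinearOperator((n, n), matvec=ilu.solve)  # apply M^{-1} via ILU solves

x, info = gmres(A, b, M=M)
print(info, np.linalg.norm(A @ x - b))        # info == 0: converged
```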
Semi-automatic sparse preconditioners for high-order finite element methods on non-uniform meshes
NASA Astrophysics Data System (ADS)
Austin, Travis M.; Brezina, Marian; Jamroz, Ben; Jhurani, Chetan; Manteuffel, Thomas A.; Ruge, John
2012-05-01
High-order finite elements often have a higher accuracy per degree of freedom than the classical low-order finite elements. However, in the context of implicit time-stepping methods, high-order finite elements present challenges to the construction of efficient simulations due to the high cost of inverting the denser finite element matrix. There are many cases where simulations are limited by the memory required to store the matrix and/or the algorithmic components of the linear solver. We are particularly interested in preconditioned Krylov methods for linear systems generated by discretization of elliptic partial differential equations with high-order finite elements. Using a preconditioner like Algebraic Multigrid can be costly in terms of memory due to the need to store matrix information at the various levels. We present a novel method for defining a preconditioner for systems generated by high-order finite elements that is based on a much sparser system than the original high-order finite element system. We investigate the performance for non-uniform meshes on a cube and a cubed sphere mesh, showing that the sparser preconditioner is more efficient and uses significantly less memory. Finally, we explore new methods to construct the sparse preconditioner and examine their effectiveness for non-uniform meshes. We compare results to a direct use of Algebraic Multigrid as a preconditioner and to a two-level additive Schwarz method.
Enjilela, Esmaeil; Lee, Ting-Yim; Hsieh, Jiang; Wisenberg, Gerald; Teefy, Patrick; Yadegari, Andrew; Bagur, Rodrigo; Islam, Ali; Branch, Kelley; So, Aaron
2018-03-01
We implemented and validated a compressed sensing (CS) based algorithm for reconstructing dynamic contrast-enhanced (DCE) CT images of the heart from sparsely sampled X-ray projections. DCE CT imaging of the heart was performed on five normal and ischemic pigs after contrast injection. DCE images were reconstructed with filtered backprojection (FBP) and CS from all projections (984-view) and 1/3 of all projections (328-view), and with CS from 1/4 of all projections (246-view). Myocardial perfusion (MP) measurements with each protocol were compared to those with the reference 984-view FBP protocol. Both the 984-view CS and 328-view CS protocols were in good agreement with the reference protocol: the Pearson correlation coefficients of 984-view CS and 328-view CS determined from linear regression analyses were 0.98 and 0.99, respectively, and the corresponding mean biases of MP measurement determined from Bland-Altman analyses were 2.7 and 1.2 ml/min/100g. When only 328 projections were used for image reconstruction, CS was more accurate than FBP for MP measurement with respect to 984-view FBP. However, CS failed to generate MP maps comparable to those with 984-view FBP when only 246 projections were used for image reconstruction. DCE heart images reconstructed from one-third of a full projection set with CS were minimally affected by aliasing artifacts, leading to accurate MP measurements with the effective dose reduced to just 33% of that of the conventional full-view FBP method. The proposed CS sparse-view image reconstruction method could facilitate the implementation of sparse-view dynamic acquisition for ultra-low dose CT MP imaging.
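The abstract does not spell out the reconstruction algorithm; as a generic stand-in for CS recovery from undersampled linear measurements, here is a minimal ISTA (iterative soft-thresholding) sketch with invented problem sizes:

```python
import numpy as np

def ista(A, y, lam=0.01, n_iter=500):
    """Iterative soft-thresholding for min 0.5*||Ax - y||^2 + lam*||x||_1."""
    L = np.linalg.norm(A, 2) ** 2            # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        g = x - A.T @ (A @ x - y) / L        # gradient step
        x = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)  # shrinkage
    return x

rng = np.random.default_rng(1)
n, m, k = 256, 96, 8                          # signal length, measurements, sparsity
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_hat = ista(A, A @ x_true)
print("recovery error:", np.linalg.norm(x_hat - x_true))
```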
Retrieval of Understory NDVI in Sparse Boreal Forests By MODIS Brdf Data
NASA Astrophysics Data System (ADS)
Yang, W.; Kobayashi, H.; Suzuki, R.; Nasahara, K. N.
2014-12-01
Global products of leaf area index (LAI) usually show large uncertainties in sparsely vegetated areas, because the understory contribution is not negligible in reflectance modeling at low to intermediate canopy cover. Many efforts have therefore been made to include understory properties in LAI estimation algorithms. Compared with the conventional data-bank approach, estimating forest understory properties from satellite data is better suited to long-term studies at global or continental scales. However, the existing remote sensing methods based on multi-angular observations are complicated to implement. Alternatively, we propose a simple method to retrieve understory NDVI (NDVIu) for sparse boreal forests. The method exploits the property that the bi-directional variation of NDVIu is much smaller than that of the canopy-level NDVI. To retrieve NDVIu for a given pixel, linear extrapolation is applied using the pixels within a 5×5 target-pixel-centered window, with NDVI values reconstructed from MODIS BRDF data for eight different solar-view angles. NDVIu is estimated as the average of the NDVI values at the position where the stand NDVI has the smallest angular variation. Validation with a noise-free simulation dataset yielded high agreement between estimated and true NDVIu, with R² and RMSE of 0.99 and 0.03, respectively. With the MODIS BRDF data, the NDVIu estimate was close to the in situ measured value (0.61 vs. 0.66), and the retrieved NDVIu showed reasonable seasonal patterns over 2010-2013. These results imply a potential application of the retrieved NDVIu to improve the estimation of overstory LAI for sparse boreal forests.
Xi, Jianing; Wang, Minghui; Li, Ao
2018-06-05
Discovery of mutated driver genes is one of the primary objectives in studying tumorigenesis. To discover relatively infrequently mutated driver genes from somatic mutation data, many existing methods incorporate an interaction network as prior information. However, these network-based methods do not exploit prior information from mRNA expression patterns, which have also proven highly informative of cancer progression. To incorporate prior information from both the interaction network and mRNA expression, we propose a robust and sparse co-regularized nonnegative matrix factorization to discover driver genes from mutation data. Our framework also applies Frobenius-norm regularization to overcome overfitting, and a sparsity-inducing penalty to obtain sparse scores in the gene representations, of which the top-scored genes are selected as driver candidates. Evaluation experiments against known benchmark genes indicate that the performance of our method benefits from both types of prior information. Our method also outperforms the existing network-based methods and detects some driver genes that the competing methods miss. In summary, the proposed method improves driver gene discovery by effectively incorporating prior information from the interaction network and mRNA expression patterns into a robust and sparse co-regularized matrix factorization framework.
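The authors' co-regularized factorization is not reproduced here; the sketch below is plain sparse NMF with an L1 penalty on the gene factor and top-scored genes taken as candidates, on made-up data:

```python
import numpy as np
from sklearn.decomposition import NMF

# Toy nonnegative mutation matrix: 100 samples x 40 genes.
X = np.random.default_rng(0).poisson(0.3, size=(100, 40)).astype(float)

# Plain sparse NMF with an L1 penalty on the gene factor H; the paper's
# method additionally co-regularizes with network and expression priors.
model = NMF(n_components=5, init='nndsvda', l1_ratio=1.0,
            alpha_H=0.1, max_iter=500)
W = model.fit_transform(X)
H = model.components_

# Top-scored genes in each factor serve as driver candidates.
print(np.argsort(H, axis=1)[:, -3:])
```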
Superresolving Black Hole Images with Full-Closure Sparse Modeling
NASA Astrophysics Data System (ADS)
Crowley, Chelsea; Akiyama, Kazunori; Fish, Vincent
2018-01-01
It is believed that almost all galaxies have black holes at their centers. Imaging a black hole is a primary objective in answering scientific questions relating to relativistic accretion and jet formation. The Event Horizon Telescope (EHT) is set to capture images of two nearby black holes: Sagittarius A*, at the center of the Milky Way galaxy roughly 26,000 light years away, and the black hole at the center of M87 (Virgo A), a large elliptical galaxy 50 million light years away. Sparse imaging techniques have shown great promise for reconstructing high-fidelity superresolved images of black holes from simulated data. Previous work has included the effects of atmospheric phase errors and thermal noise, but not the systematic amplitude errors that arise due to miscalibration. We explore a full-closure imaging technique with sparse modeling that uses closure amplitudes and closure phases to improve the imaging process. This new technique can successfully handle data with systematic amplitude errors. Applying our technique to synthetic EHT data of M87, we find that full-closure sparse modeling reconstructs images better than traditional methods and recovers key structural information on the source, such as the shape and size of the predicted photon ring. These results suggest that our new approach will provide superior imaging performance for data from the EHT and other interferometric arrays.
Su, Hai; Xing, Fuyong; Yang, Lin
2016-01-01
Successful diagnostic and prognostic stratification, treatment outcome prediction, and therapy planning depend on reproducible and accurate pathology analysis. Computer aided diagnosis (CAD) is a useful tool to help doctors make better decisions in cancer diagnosis and treatment. Accurate cell detection is often an essential prerequisite for subsequent cellular analysis. The major challenge of robust brain tumor nuclei/cell detection is to handle significant variations in cell appearance and to split touching cells. In this paper, we present an automatic cell detection framework using sparse reconstruction and adaptive dictionary learning. The main contributions of our method are: 1) a sparse reconstruction based approach to split touching cells; 2) an adaptive dictionary learning method used to handle cell appearance variations. The proposed method has been extensively tested on a data set with more than 2000 cells extracted from 32 whole-slide scanned images. The automatic cell detection results are compared with the manually annotated ground truth and other state-of-the-art cell detection algorithms. The proposed method achieves the best cell detection accuracy with an F1 score of 0.96. PMID:26812706
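As a hedged sketch of the two named ingredients (adaptive dictionary learning plus sparse reconstruction), using scikit-learn on synthetic stand-ins for image patches:

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning, sparse_encode

# Stand-ins for vectorized image patches around candidate cell centers.
patches = np.random.default_rng(0).standard_normal((500, 64))

# Learn an adaptive dictionary, then sparsely encode new patches against it;
# the reconstruction residual can then be used to score and split detections.
dico = MiniBatchDictionaryLearning(n_components=32, alpha=1.0,
                                   batch_size=64, random_state=0)
D = dico.fit(patches).components_
codes = sparse_encode(patches[:10], D, algorithm='lasso_lars', alpha=0.5)
recon = codes @ D
print("mean residual:", np.linalg.norm(patches[:10] - recon, axis=1).mean())
```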
Zhuang, Chengxu; Wang, Yulong; Yamins, Daniel; Hu, Xiaolin
2017-01-01
Visual information in the visual cortex is processed in a hierarchical manner. Recent studies show that higher visual areas, such as V2, V3, and V4, respond more vigorously to images with naturalistic higher-order statistics than to images lacking them. This property is a functional signature of higher areas, as it is much weaker or even absent in the primary visual cortex (V1). However, the mechanism underlying this signature remains elusive. We studied this problem using computational models. In several typical hierarchical visual models including the AlexNet, VggNet, and SHMAX, this signature was found to be prominent in higher layers but much weaker in lower layers. By changing both the model structure and experimental settings, we found that the signature strongly correlated with sparse firing of units in higher layers but not with any other factors, including model structure, training algorithm (supervised or unsupervised), receptive field size, and property of training stimuli. The results suggest an important role of sparse neuronal activity underlying this special feature of higher visual areas. PMID:29163117
A note on the blind deconvolution of multiple sparse signals from unknown subspaces
NASA Astrophysics Data System (ADS)
Cosse, Augustin
2017-08-01
This note studies the recovery of multiple sparse signals, x_n ∈ ℝ^L, n = 1, …, N, from the knowledge of their convolution with an unknown point spread function h ∈ ℝ^L. When the point spread function is known to be nonzero, |h[k]| > 0, this blind deconvolution problem can be relaxed into a linear, ill-posed inverse problem in the vector concatenating the unknown inputs x_n together with the inverse of the filter, d ∈ ℝ^L, where d[k] := 1/h[k]. When prior information is given on the input subspaces, the resulting overdetermined linear system can be solved efficiently via least squares (see Ling et al. 2016). When no information is given on those subspaces, and the inputs are only known to be sparse, it still remains possible to recover these inputs along with the filter by considering an additional ℓ1 penalty. This note certifies exact recovery of both the unknown PSF and the unknown sparse inputs, from the knowledge of their convolutions, as soon as the number of inputs N and the dimension of each input L satisfy L ≳ N and N ≳ T²_max, up to log factors. Here T_max = max_n{T_n}, and T_n, n = 1, …, N, denote the support sizes of the inputs x_n. Our proof combines the recent results on blind deconvolution via least squares, to certify invertibility of the linear map encoding the convolutions, with the construction of a dual certificate following the structure first suggested in Candès et al. 2007. Unlike in these papers, however, it is not possible to rely on the norm ||(A*_T A_T)^(-1)|| to certify recovery. We instead use a combination of the Schur complement and Neumann series to compute an expression for the inverse (A*_T A_T)^(-1). Given this expression, it is possible to show that the poorly scaled blocks in (A*_T A_T)^(-1) are multiplied by the better scaled ones, or vanish, in the construction of the certificate. Recovery is certified with high probability on the choice of the supports and the distribution of the signs of each input x_n on its support. The paper follows the line of previous work by Wang et al. 2016, where the authors guarantee recovery for subgaussian × Bernoulli inputs satisfying 𝔼|x_n[k]| ∈ [1/10, 1] as soon as N ≳ L. Example applications include seismic imaging with unknown source, marine seismic data deghosting, magnetic resonance autocalibration, and multi-channel estimation in communications. Numerical experiments are provided along with a discussion on the tightness of the sample complexity.
Sparse feature selection for classification and prediction of metastasis in endometrial cancer.
Ahsen, Mehmet Eren; Boren, Todd P; Singh, Nitin K; Misganaw, Burook; Mutch, David G; Moore, Kathleen N; Backes, Floor J; McCourt, Carolyn K; Lea, Jayanthi S; Miller, David S; White, Michael A; Vidyasagar, Mathukumalli
2017-03-27
Metastasis via pelvic and/or para-aortic lymph nodes is a major risk factor for endometrial cancer. Lymph-node resection ameliorates risk but is associated with significant co-morbidities. Incidence in patients with stage I disease is 4-22%, but no mechanism exists to accurately predict it. Therefore, national guidelines for primary staging surgery include pelvic and para-aortic lymph node dissection for all patients whose tumor exceeds 2 cm in diameter. We sought to identify a robust molecular signature that can accurately classify risk of lymph node metastasis in endometrial cancer patients. 86 tumors matched for age and race, and evenly distributed between lymph node-positive and lymph node-negative cases, were selected as a training cohort. Genomic micro-RNA expression was profiled for each sample to serve as the predictive feature matrix. An independent set of 28 tumor samples was collected and similarly characterized to serve as a test cohort. A feature selection algorithm was designed for applications where the number of samples is far smaller than the number of measured features per sample. A predictive miRNA expression signature was developed using this algorithm, which was then used to predict the metastatic status of the independent test cohort. A weighted classifier, using 18 micro-RNAs, achieved 100% accuracy on the training cohort. When applied to the testing cohort, the classifier correctly predicted 90% of node-positive cases and 80% of node-negative cases (FDR = 6.25%). The results indicate that evaluating the proposed quantitative sparse-feature classifier in clinical trials may lead to significant improvements in the prediction of lymphatic metastases in endometrial cancer patients.
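The feature selection algorithm is the authors' own; as a generic illustration of sparse selection in the same samples-much-smaller-than-features regime, an L1-penalized logistic regression sketch on synthetic data:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Toy stand-in: 86 training tumors x 500 miRNA features, binary node status.
rng = np.random.default_rng(0)
X = rng.standard_normal((86, 500))
y = (X[:, :5].sum(axis=1) + 0.5 * rng.standard_normal(86)) > 0

# An L1 penalty drives most weights to exactly zero, leaving a small
# signature; C controls how many features survive.
clf = LogisticRegression(penalty='l1', solver='liblinear', C=0.1)
clf.fit(StandardScaler().fit_transform(X), y)
print("selected features:", np.flatnonzero(clf.coef_))
```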
Tong, Tong; Gao, Qinquan; Guerrero, Ricardo; Ledig, Christian; Chen, Liang; Rueckert, Daniel; Initiative, Alzheimer's Disease Neuroimaging
2017-01-01
Identifying mild cognitive impairment (MCI) subjects who will progress to Alzheimer's disease (AD) is not only crucial in clinical practice, but also has significant potential to enrich clinical trials. The purpose of this study is to develop an effective biomarker for accurate prediction of MCI-to-AD conversion from magnetic resonance images. We propose a novel grading biomarker for the prediction of MCI-to-AD conversion. First, we comprehensively study the effects of several important factors on performance in the prediction task, including registration accuracy, age correction, feature selection, and the selection of training data. Based on these studies, a grading biomarker is then calculated for each MCI subject using sparse representation techniques. Finally, the grading biomarker is combined with age and cognitive measures to provide a more accurate prediction of MCI-to-AD conversion. Using the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset, the proposed global grading biomarker achieved an area under the receiver operating characteristic curve (AUC) in the range of 79-81% for the prediction of MCI-to-AD conversion within three years in ten-fold cross-validation. The classification AUC further increases to 84-92% when age and cognitive measures are combined with the proposed grading biomarker. The accuracy of the proposed biomarker benefits from the contributions of several factors: a trade-off registration level to align images to the template space, the removal of the normal aging effect, the selection of discriminative voxels, the calculation of the grading biomarker using AD and normal control groups, and the integration of the sparse representation technique with the combination of cognitive measures. The evaluation on the ADNI dataset shows the efficacy of the proposed biomarker and demonstrates a significant contribution to the accurate prediction of MCI-to-AD conversion.
An Efficient Scheme for Updating Sparse Cholesky Factors
NASA Technical Reports Server (NTRS)
Raghavan, Padma
2002-01-01
Raghavan had earlier developed the software package DSCPACK, which can be used for solving sparse linear systems where the coefficient matrix is symmetric and positive definite (this project was not funded by NASA but by agencies such as NSF). DSCPACK-S is the serial code and DSCPACK-P is a parallel implementation suitable for multiprocessors or networks of workstations with message passing using MPI. The main algorithm used is the Cholesky factorization of a sparse symmetric positive definite matrix A = LLᵀ. The code can also compute the factorization A = LDLᵀ. The complexity of the software arises from several factors relating to the sparsity of the matrix A. A sparse N × N matrix A typically has fewer than cN nonzeros, where c is a small constant; if the matrix were dense, it would have O(N²) nonzeros. The most complicated part of such sparse Cholesky factorization relates to fill-in, i.e., zeros in the original matrix that become nonzeros in the factor L. An efficient implementation depends to a large extent on complex data structures and on techniques from graph theory to reduce, identify, and manage fill. DSCPACK is based on an efficient multifrontal implementation with fill-managing algorithms and implementation arising from earlier research by Raghavan and others. Sparse Cholesky factorization is typically a four-step process: (1) ordering to compute a fill-reducing numbering, (2) symbolic factorization to determine the nonzero structure of L, (3) numeric factorization to compute L, and (4) triangular solution to solve Ly = b and Lᵀx = y. The first two steps are symbolic and are performed using the graph of the matrix. The numeric factorization step is of dominant cost, and there are several schemes for improving performance by exploiting the nested and dense structure of groups of columns in the factor. The latter are aimed at better utilization of the cache-memory hierarchy on modern processors to prevent cache misses and provide execution rates (operations/second) close to the peak rates for dense matrix computations. DSCPACK is currently being used in an application at NASA directed by J. Newman and M. James. We propose the implementation of efficient schemes for updating the LLᵀ or LDLᵀ factors computed in DSCPACK-S to meet the computational requirements of their project. A brief description is provided in the next section.
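A minimal sketch of the four-step pipeline using scipy stand-ins (RCM as the fill-reducing ordering rather than DSCPACK's own, and SuperLU with natural column ordering in place of multifrontal Cholesky):

```python
from scipy import sparse
from scipy.sparse.csgraph import reverse_cuthill_mckee
from scipy.sparse.linalg import splu

# Sparse SPD test matrix: 2-D Laplacian on a 30 x 30 grid.
n = 30
T = sparse.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n))
A = (sparse.kron(sparse.eye(n), T) + sparse.kron(T, sparse.eye(n))).tocsr()

# Step (1): a fill-reducing ordering (bandwidth-reducing RCM here).
perm = reverse_cuthill_mckee(A, symmetric_mode=True)
A_perm = A[perm, :][:, perm].tocsc()

# Steps (2)-(4) are bundled inside the factorization object; fill-in
# shows up as the extra nonzeros in the factor relative to A.
lu = splu(A_perm, permc_spec='NATURAL')
print("nnz(A):", A.nnz, " nnz(L):", lu.L.nnz)
```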
Haley, Benjamin M.; Paunesku, Tatjana; Grdina, David J.; ...
2015-12-09
The US government regulates allowable radiation exposures relying, in large part, on the seventh report from the committee to estimate the Biological Effects of Ionizing Radiation (BEIR VII), which estimated that most contemporary exposures (protracted or low-dose) carry a 1.5-fold lower risk of carcinogenesis and mortality per Gy than the acute exposures of atomic bomb survivors. This correction is known as the dose and dose rate effectiveness factor for the life span study of atomic bomb survivors (DDREF LSS). It was calculated by applying a linear-quadratic dose response model to data from Japanese atomic bomb survivors and a limited number of animal studies.
Discretized energy minimization in a wave guide with point sources
NASA Technical Reports Server (NTRS)
Propst, G.
1994-01-01
An anti-noise problem on a finite time interval is solved by minimization of a quadratic functional on the Hilbert space of square-integrable controls. To this end, the one-dimensional wave equation with point sources and pointwise reflecting boundary conditions is decomposed into a system for the two propagating components of waves. Well-posedness of this system is proved for a class of data that includes piecewise linear initial conditions and piecewise constant forcing functions. It is shown that for such data the optimal piecewise constant control is the solution of a sparse linear system. Methods for its computational treatment are presented, as well as examples of their applicability. The convergence of discrete approximations to the general optimization problem is demonstrated by finite element methods.
NASA Astrophysics Data System (ADS)
Ali, Kashif; Akbar, Muhammad Zubair; Iqbal, Muhammad Farooq; Ashraf, Muhammad
2014-10-01
The paper deals with the study of heat and mass transfer in an unsteady viscous incompressible water-based nanofluid (containing titanium dioxide nanoparticles) between two orthogonally moving porous coaxial disks with suction. A combination of an iterative method (successive over-relaxation) and a direct method is employed for solving the sparse systems of linear algebraic equations arising from the finite-difference discretization of the linearized self-similar ODEs. It is observed that the rate of mass transfer at the disks decreases with the permeability Reynolds number, whether the disks are approaching or receding. The findings of the present investigation may be beneficial to the electronics industry in maintaining electronic components under effective and safe operational conditions.
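As an illustration of the iterative half of that combination, a minimal successive over-relaxation loop on a small diagonally dominant stand-in for the finite-difference system:

```python
import numpy as np

def sor(A, b, omega=1.5, tol=1e-10, max_iter=10000):
    """Successive over-relaxation for Ax = b (nonzero diagonal assumed)."""
    x = np.zeros_like(b, dtype=float)
    for _ in range(max_iter):
        x_old = x.copy()
        for i in range(len(b)):
            # Sweep using already-updated entries below the diagonal.
            sigma = A[i, :i] @ x[:i] + A[i, i + 1:] @ x_old[i + 1:]
            x[i] = (1 - omega) * x_old[i] + omega * (b[i] - sigma) / A[i, i]
        if np.linalg.norm(x - x_old, np.inf) < tol:
            break
    return x

# Small diagonally dominant system standing in for the FD system.
A = np.array([[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]])
b = np.array([1.0, 2.0, 3.0])
print(sor(A, b), np.linalg.solve(A, b))
```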
Complexity transitions in global algorithms for sparse linear systems over finite fields
NASA Astrophysics Data System (ADS)
Braunstein, A.; Leone, M.; Ricci-Tersenghi, F.; Zecchina, R.
2002-09-01
We study the computational complexity of a very basic problem, namely that of finding solutions to a very large set of random linear equations in a finite Galois field modulo q. Using tools from statistical mechanics we are able to identify phase transitions in the structure of the solution space and to connect them to the changes in the performance of a global algorithm, namely Gaussian elimination. Crossing phase boundaries produces a dramatic increase in memory and CPU requirements necessary for the algorithms. In turn, this causes the saturation of the upper bounds for the running time. We illustrate the results on the specific problem of integer factorization, which is of central interest for deciphering messages encrypted with the RSA cryptosystem.
A Permutation Approach for Selecting the Penalty Parameter in Penalized Model Selection
Sabourin, Jeremy A; Valdar, William; Nobel, Andrew B
2015-01-01
We describe a simple, computationally efficient, permutation-based procedure for selecting the penalty parameter in LASSO-penalized regression. The procedure, permutation selection, is intended for applications where variable selection is the primary focus, and can be applied in a variety of structural settings, including that of generalized linear models. We briefly discuss connections between permutation selection and existing theory for the LASSO. In addition, we present a simulation study and an analysis of real biomedical data sets in which permutation selection is compared with selection based on the following: cross-validation (CV), the Bayesian information criterion (BIC), Scaled Sparse Linear Regression, and a selection method based on recently developed testing procedures for the LASSO. PMID:26243050
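A sketch of the core idea under simplifying assumptions: for centered, standardized data the lasso solution is all-zero exactly when λ ≥ ||Xᵀy||∞/n, so permuting y yields a null distribution of such penalties. The authors' procedure may differ in details.

```python
import numpy as np

def permutation_lambda(X, y, n_perm=100, quantile=0.5, seed=0):
    """Quantile of the per-permutation 'null' penalties: the smallest
    lambda that zeroes all lasso coefficients when y is permuted."""
    rng = np.random.default_rng(seed)
    n = len(y)
    lams = [np.abs(X.T @ rng.permutation(y)).max() / n for _ in range(n_perm)]
    return np.quantile(lams, quantile)

rng = np.random.default_rng(1)
X = rng.standard_normal((100, 20)); X -= X.mean(axis=0)
y = X[:, 0] + 0.5 * rng.standard_normal(100); y -= y.mean()
print(permutation_lambda(X, y))
```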
Performance Models for the Spike Banded Linear System Solver
Manguoglu, Murat; Saied, Faisal; Sameh, Ahmed; ...
2011-01-01
With the availability of large-scale parallel platforms comprised of tens of thousands of processors and beyond, there is significant impetus for the development of scalable parallel sparse linear system solvers and preconditioners. An integral part of this design process is the development of performance models capable of predicting performance and providing accurate cost models for the solvers and preconditioners. There has been some work in the past on characterizing the performance of the iterative solvers themselves. In this paper, we investigate the problem of characterizing the performance and scalability of banded preconditioners. Recent work has demonstrated the superior convergence properties and robustness of banded preconditioners compared to state-of-the-art ILU-family preconditioners as well as algebraic multigrid preconditioners. Furthermore, when used in conjunction with efficient banded solvers, banded preconditioners are capable of significantly faster time-to-solution. Our banded solver, the Truncated Spike algorithm, is specifically designed for parallel performance and tolerance to deep memory hierarchies. Its regular structure is also highly amenable to accurate performance characterization. Using these characteristics, we derive the following results in this paper: (i) we develop parallel formulations of the Truncated Spike solver, (ii) we develop a highly accurate pseudo-analytical parallel performance model for our solver, and (iii) we show the excellent prediction capabilities of our model, based on which we argue the high scalability of our solver. Our pseudo-analytical performance model is based on analytical performance characterization of each phase of our solver. These analytical models are then parameterized using actual runtime information on target platforms. An important consequence of our performance models is that they reveal underlying performance bottlenecks in both serial and parallel formulations. All of our results are validated on diverse heterogeneous multiclusters, platforms for which performance prediction is particularly challenging. Finally, we predict the scalability of the Spike algorithm using up to 65,536 cores with our model. This paper extends the results presented at the Ninth International Symposium on Parallel and Distributed Computing.
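Spike is a parallel algorithm; for reference, the serial banded solve it accelerates looks like this in scipy (band storage and sizes are our own):

```python
import numpy as np
from scipy.linalg import solve_banded

# Tridiagonal system in LAPACK band storage: rows hold the
# superdiagonal, main diagonal, and subdiagonal.
n = 6
ab = np.zeros((3, n))
ab[0, 1:] = -1.0   # superdiagonal
ab[1, :] = 4.0     # main diagonal
ab[2, :-1] = -1.0  # subdiagonal
b = np.ones(n)

# Serial banded solve; Spike partitions the band into blocks solved
# independently plus a small reduced system coupling the partitions.
x = solve_banded((1, 1), ab, b)
print(x)
```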
Tipton, John; Hooten, Mevin B.; Goring, Simon
2017-01-01
Scientific records of temperature and precipitation have been kept for several hundred years, but for many areas, only a shorter record exists. To understand climate change, there is a need for rigorous statistical reconstructions of the paleoclimate using proxy data. Paleoclimate proxy data are often sparse, noisy, indirect measurements of the climate process of interest, making each proxy uniquely challenging to model statistically. We reconstruct spatially explicit temperature surfaces from sparse and noisy measurements recorded at historical United States military forts and other observer stations from 1820 to 1894. One common method for reconstructing the paleoclimate from proxy data is principal component regression (PCR). With PCR, one learns a statistical relationship between the paleoclimate proxy data and a set of climate observations that are used as patterns for potential reconstruction scenarios. We explore PCR in a Bayesian hierarchical framework, extending classical PCR in a variety of ways. First, we model the latent principal components probabilistically, accounting for measurement error in the observational data. Next, we extend our method to better accommodate outliers that occur in the proxy data. Finally, we explore alternatives to the truncation of lower-order principal components using different regularization techniques. One fundamental challenge in paleoclimate reconstruction efforts is the lack of out-of-sample data for predictive validation. Cross-validation is of potential value, but is computationally expensive and potentially sensitive to outliers in sparse data scenarios. To overcome the limitations that a lack of out-of-sample records presents, we test our methods using a simulation study, applying proper scoring rules including a computationally efficient approximation to leave-one-out cross-validation using the log score to validate model performance. The result of our analysis is a spatially explicit reconstruction of spatio-temporal temperature from a very sparse historical record.
NASA Astrophysics Data System (ADS)
Bogiatzis, P.; Ishii, M.; Davis, T. A.
2016-12-01
Seismic tomography inverse problems are among the largest high-dimensional parameter estimation tasks in Earth science. We show how combinatorics and graph theory can be used to analyze the structure of such problems, and to effectively decompose them into smaller ones that can be solved efficiently by means of the least squares method. In combination with recent high performance direct sparse algorithms, this reduction in dimensionality allows for an efficient computation of the model resolution and covariance matrices using limited resources. Furthermore, we show that a new sparse singular value decomposition method can be used to obtain the complete spectrum of the singular values. This procedure provides the means for more objective regularization and further dimensionality reduction of the problem. We apply this methodology to a moderate size, non-linear seismic tomography problem to image the structure of the crust and the upper mantle beneath Japan using local deep earthquakes recorded by the High Sensitivity Seismograph Network stations.
Yang, Yan; Onishi, Takeo; Hiramatsu, Ken
2014-01-01
Simulation results of the widely used temperature index snowmelt model are greatly influenced by input air temperature data. Spatially sparse air temperature data remain the main factor inducing uncertainties and errors in that model, which limits its applications. To address this problem, we created new air temperature data using linear regression relationships formulated from MODIS land surface temperature data. The Soil and Water Assessment Tool model, which includes an improved temperature index snowmelt module, was chosen to test the newly created data. The performance of the newly created data was assessed by evaluating daily snowmelt simulations in three test basins of the Amur River; the coefficient of determination (R²) and Nash-Sutcliffe efficiency (NSE) were used for evaluation. The results indicate that MODIS land surface temperature data can be used as a new source for air temperature data creation. This will improve snow simulation using the temperature index model in areas with sparse air temperature observations. PMID:25165746
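A minimal sketch of the data-creation step, assuming hypothetical paired LST/station samples (all numbers invented):

```python
import numpy as np

# Hypothetical paired samples: MODIS LST vs. station air temperature (deg C).
lst = np.array([-8.2, -3.1, 0.5, 6.4, 12.9, 18.3, 21.7])
t_air = np.array([-10.5, -4.8, -0.7, 4.9, 11.2, 16.8, 20.1])

# Fit T_air = a * LST + b, then apply it to LST pixels to synthesize
# air temperature where no station observations exist.
a, b = np.polyfit(lst, t_air, 1)
print(a * np.array([2.0, 9.5]) + b)
```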
Regression analysis of sparse asynchronous longitudinal data
Cao, Hongyuan; Zeng, Donglin; Fine, Jason P.
2015-01-01
We consider estimation of regression models for sparse asynchronous longitudinal observations, where time-dependent responses and covariates are observed intermittently within subjects. Unlike with synchronous data, where the response and covariates are observed at the same time point, with asynchronous data the observation times are mismatched. Simple kernel-weighted estimating equations are proposed for generalized linear models with either time-invariant or time-dependent coefficients, under smoothness assumptions for the covariate processes similar to those for synchronous data. For models with either time-invariant or time-dependent coefficients, the estimators are consistent and asymptotically normal but converge at slower rates than those achieved with synchronous data. Simulation studies show that the methods perform well with realistic sample sizes and may be superior to a naive application of methods for synchronous data based on an ad hoc last-value-carried-forward approach. The practical utility of the methods is illustrated on data from a study on human immunodeficiency virus. PMID:26568699
Multiprocessor sparse L/U decomposition with controlled fill-in
NASA Technical Reports Server (NTRS)
Alaghband, G.; Jordan, H. F.
1985-01-01
Generation of the maximal compatibles of pivot elements for a class of small sparse matrices is studied. The algorithm involves a binary tree search and has a complexity exponential in the order of the matrix. Different strategies for selection of a set of compatible pivots based on the Markowitz criterion are investigated. The competing issues of parallelism and fill-in generation are studied and results are provided. A technique for obtaining an ordered compatible set directly from the ordered incompatible table is given. This technique generates a set of compatible pivots with the property of generating few fills. A new heuristic algorithm is then proposed that combines the idea of an ordered compatible set with a limited binary tree search to generate several sets of compatible pivots in linear time. Finally, an elimination set to reduce the matrix is selected. Parameters are suggested to obtain a balance between parallelism and fill-ins. Results of applying the proposed algorithms to several large application matrices are presented and analyzed.
Compressed digital holography: from micro towards macro
NASA Astrophysics Data System (ADS)
Schretter, Colas; Bettens, Stijn; Blinder, David; Pesquet-Popescu, Béatrice; Cagnazzo, Marco; Dufaux, Frédéric; Schelkens, Peter
2016-09-01
signal processing methods from software-driven computer engineering and applied mathematics. The compressed sensing theory in particular established a practical framework for reconstructing the scene content using few linear combinations of complex measurements and a sparse prior for regularizing the solution. Compressed sensing found direct applications in digital holography for microscopy. Indeed, the wave propagation phenomenon in free space mixes in a natural way the spatial distribution of point sources from the 3-dimensional scene. As the 3-dimensional scene is mapped to a 2-dimensional hologram, the hologram samples form a compressed representation of the scene as well. This overview paper discusses contributions in the field of compressed digital holography at the micro scale. Then, an outlook on future extensions towards the real-size macro scale is discussed. Thanks to advances in sensor technologies, increasing computing power and the recent improvements in sparse digital signal processing, holographic modalities are on the verge of practical high-quality visualization at a macroscopic scale, where much higher resolution holograms must be acquired and processed on the computer.
Discovering governing equations from data by sparse identification of nonlinear dynamical systems
Brunton, Steven L.; Proctor, Joshua L.; Kutz, J. Nathan
2016-01-01
Extracting governing equations from data is a central challenge in many diverse areas of science and engineering. Data are abundant whereas models often remain elusive, as in climate science, neuroscience, ecology, finance, and epidemiology, to name only a few examples. In this work, we combine sparsity-promoting techniques and machine learning with nonlinear dynamical systems to discover governing equations from noisy measurement data. The only assumption about the structure of the model is that there are only a few important terms that govern the dynamics, so that the equations are sparse in the space of possible functions; this assumption holds for many physical systems in an appropriate basis. In particular, we use sparse regression to determine the fewest terms in the dynamic governing equations required to accurately represent the data. This results in parsimonious models that balance accuracy with model complexity to avoid overfitting. We demonstrate the algorithm on a wide range of problems, from simple canonical systems, including linear and nonlinear oscillators and the chaotic Lorenz system, to the fluid vortex shedding behind an obstacle. The fluid example illustrates the ability of this method to discover the underlying dynamics of a system that took experts in the community nearly 30 years to resolve. We also show that this method generalizes to parameterized systems and systems that are time-varying or have external forcing. PMID:27035946
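The sparse regression step can be implemented as sequential thresholded least squares, as in the authors' published examples; a compact sketch that recovers the Lorenz system from noise-free simulated derivatives (the threshold, library, and trajectory are our choices):

```python
import numpy as np
from scipy.integrate import solve_ivp

def lorenz(t, s, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = s
    return [sigma * (y - x), x * (rho - z) - y, x * y - beta * z]

# Simulate a trajectory; use exact derivatives for simplicity.
t = np.linspace(0, 20, 8000)
X = solve_ivp(lorenz, (0, 20), [-8.0, 7.0, 27.0], t_eval=t).y.T
dX = np.array([lorenz(0, s) for s in X])

# Candidate function library: polynomials up to degree 2.
x, y, z = X.T
Theta = np.column_stack([np.ones_like(x), x, y, z,
                         x * x, x * y, x * z, y * y, y * z, z * z])

# Sequential thresholded least squares: regress, zero out small terms,
# refit on the surviving terms, and repeat.
Xi = np.linalg.lstsq(Theta, dX, rcond=None)[0]
for _ in range(10):
    Xi[np.abs(Xi) < 0.1] = 0.0
    for k in range(dX.shape[1]):
        big = np.abs(Xi[:, k]) >= 0.1
        Xi[big, k] = np.linalg.lstsq(Theta[:, big], dX[:, k], rcond=None)[0]
print(np.round(Xi.T, 2))  # rows recover the Lorenz coefficients
```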
Ordering Unstructured Meshes for Sparse Matrix Computations on Leading Parallel Systems
NASA Technical Reports Server (NTRS)
Oliker, Leonid; Li, Xiaoye; Heber, Gerd; Biswas, Rupak
2000-01-01
The ability of computers to solve hitherto intractable problems and simulate complex processes using mathematical models makes them an indispensable part of modern science and engineering. Computer simulations of large-scale realistic applications usually require solving a set of non-linear partial differential equations (PDEs) over a finite region. For example, one thrust area in the DOE Grand Challenge projects is to design future accelerators such as the Spallation Neutron Source (SNS). Our colleagues at SLAC need to model complex RFQ cavities with large aspect ratios. Unstructured grids are currently used to resolve the small features in a large computational domain; dynamic mesh adaptation will be added in the future for additional efficiency. The PDEs for electromagnetics are discretized by the FEM method, which leads to a generalized eigenvalue problem Kx = λMx, where K and M are the stiffness and mass matrices, and are very sparse. In a typical cavity model, the number of degrees of freedom is about one million. For such large eigenproblems, direct solution techniques quickly reach the memory limits. Instead, the most widely used methods are Krylov subspace methods, such as Lanczos or Jacobi-Davidson. In all the Krylov-based algorithms, sparse matrix-vector multiplication (SPMV) must be performed repeatedly. Therefore, the efficiency of SPMV usually determines the eigensolver speed. SPMV is also one of the most heavily used kernels in large-scale numerical simulations.
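Since SPMV is the kernel whose efficiency the ordering work targets, here is the textbook CSR matrix-vector product for reference, in a naive Python rendering:

```python
import numpy as np

def csr_matvec(data, indices, indptr, x):
    """y = A @ x with A stored in compressed sparse row (CSR) form."""
    y = np.zeros(len(indptr) - 1)
    for row in range(len(y)):
        # Nonzeros of this row live in data[indptr[row]:indptr[row+1]].
        for j in range(indptr[row], indptr[row + 1]):
            y[row] += data[j] * x[indices[j]]
    return y

# A = [[4, 0, 1], [0, 3, 0], [2, 0, 5]] in CSR form.
data = np.array([4.0, 1.0, 3.0, 2.0, 5.0])
indices = np.array([0, 2, 1, 0, 2])
indptr = np.array([0, 2, 3, 5])
print(csr_matvec(data, indices, indptr, np.array([1.0, 1.0, 1.0])))
```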
NASA Astrophysics Data System (ADS)
Miao, Minmin; Zeng, Hong; Wang, Aimin; Zhao, Fengkui; Liu, Feixiang
2017-09-01
Electroencephalogram (EEG)-based motor imagery (MI) brain-computer interfaces (BCIs) have shown their effectiveness for the control of rehabilitation devices designed for large body parts of patients with neurologic impairments. In order to validate the feasibility of using EEG to decode the MI of a single index finger and to construct a BCI-enhanced finger rehabilitation system, we collected EEG data during right-hand index finger MI and the rest state for five healthy subjects and propose a pattern recognition approach for classifying these two mental states. First, Fisher's linear discriminant criterion and power spectral density analysis were used to analyze the event-related desynchronization patterns. Second, both band power and approximate entropy were extracted as features. Third, aiming to eliminate abnormal samples in the dictionary and improve the classification performance of the conventional sparse representation-based classification (SRC) method, we propose a novel dictionary-cleaned sparse representation-based classification (DCSRC) method for final classification. The experimental results show that the proposed DCSRC method gives better classification accuracies than SRC, with an average classification accuracy of 81.32% across the five subjects. It is thus demonstrated that single right-hand index finger MI can be decoded from the sensorimotor rhythms, and that the feature patterns of index finger MI and the rest state can be well recognized for robotic exoskeleton initiation.
Learning Incoherent Sparse and Low-Rank Patterns from Multiple Tasks
Chen, Jianhui; Liu, Ji; Ye, Jieping
2013-01-01
We consider the problem of learning incoherent sparse and low-rank patterns from multiple tasks. Our approach is based on a linear multi-task learning formulation, in which the sparse and low-rank patterns are induced by a cardinality regularization term and a low-rank constraint, respectively. This formulation is non-convex; we convert it into its convex surrogate, which can be routinely solved via semidefinite programming for small-size problems. We propose to employ the general projected gradient scheme to efficiently solve such a convex surrogate; however, in the optimization formulation, the objective function is non-differentiable and the feasible domain is non-trivial. We present the procedures for computing the projected gradient and ensuring the global convergence of the projected gradient scheme. The computation of the projected gradient involves a constrained optimization problem; we show that the optimal solution to such a problem can be obtained via solving an unconstrained optimization subproblem and a Euclidean projection subproblem. We also present two projected gradient algorithms and analyze their rates of convergence in detail. In addition, we illustrate the use of the presented projected gradient algorithms for the proposed multi-task learning formulation using the least squares loss. Experimental results on a collection of real-world data sets demonstrate the effectiveness of the proposed multi-task learning formulation and the efficiency of the proposed projected gradient algorithms. PMID:24077658
Discriminative Transfer Subspace Learning via Low-Rank and Sparse Representation.
Xu, Yong; Fang, Xiaozhao; Wu, Jian; Li, Xuelong; Zhang, David
2016-02-01
In this paper, we address the problem of unsupervised domain transfer learning in which no labels are available in the target domain. We use a transformation matrix to transfer both the source and target data to a common subspace, where each target sample can be represented by a combination of source samples such that the samples from different domains can be well interlaced. In this way, the discrepancy of the source and target domains is reduced. By imposing joint low-rank and sparse constraints on the reconstruction coefficient matrix, the global and local structures of data can be preserved. To enlarge the margins between different classes as much as possible and provide more freedom to diminish the discrepancy, a flexible linear classifier (projection) is obtained by learning a non-negative label relaxation matrix that allows the strict binary label matrix to relax into a slack variable matrix. Our method can avoid a potentially negative transfer by using a sparse matrix to model the noise and, thus, is more robust to different types of noise. We formulate our problem as a constrained low-rankness and sparsity minimization problem and solve it by the inexact augmented Lagrange multiplier method. Extensive experiments on various visual domain adaptation tasks show the superiority of the proposed method over the state-of-the-art methods. The MATLAB code of our method will be publicly available at http://www.yongxu.org/lunwen.html.
Structure-preserving and rank-revealing QR-factorizations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bischof, C.H.; Hansen, P.C.
1991-11-01
The rank-revealing QR-factorization (RRQR-factorization) is a special QR-factorization that is guaranteed to reveal the numerical rank of the matrix under consideration. This makes the RRQR-factorization a useful tool in the numerical treatment of many rank-deficient problems in numerical linear algebra. In this paper, a framework is presented for the efficient implementation of RRQR algorithms, in particular for sparse matrices. A sparse RRQR algorithm should seek to preserve the structure and sparsity of the matrix as much as possible while retaining the ability to capture the numerical rank safely. To this end, the paper proposes to compute an initial QR-factorization using a restricted pivoting strategy guarded by incremental condition estimation (ICE), and then applies the algorithm suggested by Chan and Foster to this QR-factorization. The column exchange strategy used in the initial QR-factorization exploits the fact that certain column exchanges do not change the sparsity structure, and computes a sparse QR-factorization that is a good approximation of the sought-after RRQR-factorization. Due to quantities produced by ICE, the Chan/Foster RRQR algorithm can be implemented very cheaply, thus verifying that the sought-after RRQR-factorization has indeed been computed. Experimental results on a model problem show that the initial QR-factorization is indeed very likely to produce the RRQR-factorization.
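The ICE-guarded restricted pivoting strategy is specific to the paper; as a baseline, plain QR with column pivoting already reveals numerical rank through the decay of |R[i, i]|:

```python
import numpy as np
from scipy.linalg import qr

# A numerically rank-deficient matrix (rank 3 by construction).
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 3)) @ rng.standard_normal((3, 8))

# QR with column pivoting; the decay of |R[i, i]| reveals the rank.
Q, R, piv = qr(A, pivoting=True)
d = np.abs(np.diag(R))
print("diag(R):", np.round(d, 6))
print("numerical rank:", int(np.sum(d > 1e-10 * d[0])))
```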
2014-01-01
and proportional correctors. The weighting function evaluates nearby data samples to determine the utility of each correction style, eliminating the... sparse methods may be of use. As for other multi-fidelity techniques, true cokriging in the style described by geo-statisticians [93] is beyond the... sampling style between sampling points predicted to fall near the contour and sampling points predicted to be farther from the contour but with
An, Zhe; Rey, Daniel; Ye, Jingxin; ...
2017-01-16
The problem of forecasting the behavior of a complex dynamical system through analysis of observational time-series data becomes difficult when the system expresses chaotic behavior and the measurements are sparse, in space and/or time. Despite the fact that this situation is quite typical across many fields, including numerical weather prediction, the issue of whether the available observations are "sufficient" for generating successful forecasts is still not well understood. An analysis by Whartenby et al. (2013) found that in the context of the nonlinear shallow water equations on a β plane, standard nudging techniques require observing approximately 70 % of the full set of state variables. Here we examine the same system using a method introduced by Rey et al. (2014a), which generalizes standard nudging methods to utilize time-delayed measurements. We show that in certain circumstances, it provides a sizable reduction in the number of observations required to construct accurate estimates and high-quality predictions. In particular, we find that this estimate of 70 % can be reduced to about 33 % using time delays, and even further if Lagrangian drifter locations are also used as measurements.
Multi-Scale Three-Dimensional Variational Data Assimilation System for Coastal Ocean Prediction
NASA Technical Reports Server (NTRS)
Li, Zhijin; Chao, Yi; Li, P. Peggy
2012-01-01
A multi-scale three-dimensional variational data assimilation system (MS-3DVAR) has been formulated, and the associated software system has been developed, for improving high-resolution coastal ocean prediction. The system improves coastal ocean prediction skill and has been used in support of operational coastal ocean forecasting systems and field experiments. It was developed to improve the capability of assimilating, simultaneously and effectively, sparse vertical profiles and high-resolution remote sensing surface measurements into coastal ocean models, as well as constraining model biases. In this system, the cost function is decomposed into two separate units for the large- and small-scale components, respectively. As such, data assimilation is implemented sequentially from large to small scales, the background error covariance is constructed to be scale-dependent, and a scale-dependent dynamic balance is incorporated. This scheme allows effectively constraining large scales and model bias by assimilating sparse vertical profiles, and small scales by assimilating high-resolution surface measurements. MS-3DVAR thus enhances the capability of traditional 3DVAR for assimilating highly heterogeneously distributed observations, such as along-track satellite altimetry data, and in particular maximizes the extraction of information from limited numbers of vertical profile observations.
Taft, L M; Evans, R S; Shyu, C R; Egger, M J; Chawla, N; Mitchell, J A; Thornton, S N; Bray, B; Varner, M
2009-04-01
The IOM report, Preventing Medication Errors, emphasizes the overall lack of knowledge of the incidence of adverse drug events (ADE). Operating rooms, emergency departments and intensive care units are known to have a higher incidence of ADE. Labor and delivery (L&D) is an emergency care unit that could have an increased risk of ADE, yet reported rates remain low and under-reporting is suspected. Risk factor identification with electronic pattern recognition techniques could improve ADE detection rates. The objective of the present study is to apply the Synthetic Minority Over-sampling Technique (SMOTE) as an enhanced sampling method in a sparse dataset to generate prediction models that identify ADE in women admitted for labor and delivery, based on patient risk factors and comorbidities. By creating synthetic cases with the SMOTE algorithm and using a 10-fold cross-validation technique, we demonstrated improved performance of the Naïve Bayes and decision tree algorithms: the true positive rate (TPR) of 0.32 in the raw dataset increased to 0.67 in the 800% over-sampled dataset. Enhanced performance from classification algorithms can be attained with the use of synthetic minority-class oversampling techniques in sparse clinical datasets. Predictive models created in this manner can be used to develop evidence-based ADE monitoring systems.
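A minimal sketch of the oversampling step using the imbalanced-learn implementation of SMOTE (data shapes and prevalence are invented; the study's 800% oversampling corresponds to a different sampling_strategy setting than the default balancing shown here):

```python
import numpy as np
from collections import Counter
from imblearn.over_sampling import SMOTE

# Imbalanced toy data standing in for the sparse ADE labels.
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 10))
y = (rng.random(1000) < 0.03).astype(int)   # ~3% positive class

# Generate synthetic minority cases by interpolating between neighbors,
# then train any classifier on the balanced set.
X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)
print(Counter(y), Counter(y_res))
```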
Joint Probabilistic Reasoning About Coreference and Relations of Universal Schema
2017-10-01
…containing Barack and Michelle Obama state that they are married. A variety of one-shot and iterative methods have addressed the alignment problem [25] … speed. Most of the computation time for these linear models is spent on the dot-product between the sparse features of an example and the weights … of the model. In some cases, it is clear that the use of all of these features is excessive and the example can be correctly classified without such…
Incomplete Gröbner basis as a preconditioner for polynomial systems
NASA Astrophysics Data System (ADS)
Sun, Yang; Tao, Yu-Hui; Bai, Feng-Shan
2009-04-01
Preconditioning plays a critical role in numerical methods for large, sparse linear systems, and the same is true for nonlinear algebraic systems. In this paper an incomplete Gröbner basis (IGB) is proposed as a preconditioner for homotopy methods for polynomial systems of equations; it transforms a deficient system into a system with the same finite solutions but smaller degree. The reduced system can thus be solved faster. Numerical results show the efficiency of the preconditioner.
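As a point of reference, here is the complete Gröbner-basis computation that the paper's incomplete variant truncates early, done with SymPy; the example system is an assumption, and SymPy's public API only exposes the complete computation:

```python
# Illustration: a full Groebner basis computed with SymPy. The paper's
# "incomplete" basis stops the Buchberger-style reduction early; the
# complete lex-order basis shown here triangularizes the system.
from sympy import groebner, symbols

x, y = symbols('x y')
system = [x**2 + y**2 - 4, x * y - 1]  # assumed example system

G = groebner(system, x, y, order='lex')
print(G)  # the last polynomial is univariate in y, so the system's finite
          # solutions can be read off by back-substitution.
```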
A Compressed Sensing Based Ultra-Wideband Communication System
2009-06-01
…principle, most of the processing at the receiver can be moved to the transmitter, where energy consumption and computation are sufficient for many advanced … extended to continuous-time signals. We use ∗ to denote convolution in a linear time-invariant (LTI) system. Assume that there is an analog … [Figure residue: receiver block diagram with UWB pulse generator, channel, filter, low-rate A/D (5 GHz radio waves, 125 MHz sampling), and sparse processing of the recovered bit sequence; the garbled expression is the standard compressed-sensing recovery $\hat{\theta} = \arg\min_{\theta} \lVert\theta\rVert_1$ subject to $y = \Phi\Psi\theta$.]
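A hedged illustration of the sparse recovery step, using orthogonal matching pursuit as a greedy substitute for the ℓ1 program reconstructed above; the matrix sizes and sparsity level are assumptions:

```python
# Sparse recovery demo: recover a k-sparse vector from m << n random
# projections, via orthogonal matching pursuit instead of l1 minimization
# (dimensions and sparsity are illustrative).
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

rng = np.random.default_rng(0)
n, m, k = 256, 64, 5                # signal length, measurements, sparsity

theta = np.zeros(n)                 # k-sparse coefficient vector
theta[rng.choice(n, k, replace=False)] = rng.standard_normal(k)

A = rng.standard_normal((m, n)) / np.sqrt(m)  # sensing matrix (Phi @ Psi)
y = A @ theta                                 # compressed measurements

omp = OrthogonalMatchingPursuit(n_nonzero_coefs=k).fit(A, y)
print("recovery error:", np.linalg.norm(omp.coef_ - theta))
```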
A variant of nested dissection for solving n by n grid problems
NASA Technical Reports Server (NTRS)
George, A.; Poole, W. G., Jr.; Voigt, R. G.
1976-01-01
Nested dissection orderings are known to be very effective for solving the sparse positive definite linear systems which arise from n by n grid problems. In this paper, nested dissection is shown to be the limiting case of incomplete nested dissection, an ordering obtained by terminating the dissection process prematurely. Analyses of the arithmetic and storage requirements for incomplete nested dissection are given, and the ordering is shown to be competitive with nested dissection under certain conditions.
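A toy sketch of the ordering idea (complete nested dissection, not the paper's incomplete variant): recursively split the grid with a separator, order the two halves first, and number the separator last so its fill-in is confined:

```python
# Sketch of a nested dissection ordering for an n x n grid (pure Python,
# illustrative; production codes use graph partitioners such as METIS).
import numpy as np

def nested_dissection(rows, cols):
    """Return grid points in nested dissection order: recurse on the two
    halves first, then number the separator (a middle row or column) last."""
    if len(rows) <= 2 or len(cols) <= 2:        # small block: order directly
        return [(r, c) for r in rows for c in cols]
    if len(rows) >= len(cols):                  # split along the longer side
        mid = len(rows) // 2
        sep = [(rows[mid], c) for c in cols]
        a = nested_dissection(rows[:mid], cols)
        b = nested_dissection(rows[mid + 1:], cols)
    else:
        mid = len(cols) // 2
        sep = [(r, cols[mid]) for r in rows]
        a = nested_dissection(rows, cols[:mid])
        b = nested_dissection(rows, cols[mid + 1:])
    return a + b + sep                          # separator eliminated last

n = 7
order = nested_dissection(list(range(n)), list(range(n)))
perm = np.array([r * n + c for r, c in order])
print(len(perm) == n * n, perm[:8])             # a permutation of the grid points
```

Stopping the recursion earlier (a larger base-case block) is, in spirit, the premature termination that yields an incomplete ordering.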
An accurate method for solving a class of fractional Sturm-Liouville eigenvalue problems
NASA Astrophysics Data System (ADS)
Kashkari, Bothayna S. H.; Syam, Muhammed I.
2018-06-01
This article is devoted to a theoretical and numerical study of the eigenvalues of the nonsingular fractional second-order Sturm-Liouville problem. We implement a fractional-order Legendre Tau method to approximate the eigenvalues. This method transforms the Sturm-Liouville problem into a sparse nonsingular linear system, which is solved using the continuation method. Theoretical results for the considered problem are provided and proved, and numerical results are presented to show the efficiency of the proposed method.
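For contrast with the fractional case, here is a minimal finite-difference sketch of the classical integer-order problem -u'' = λu on (0, π) with Dirichlet conditions, whose exact eigenvalues are 1, 4, 9, …; this illustrates the baseline problem only, not the paper's Legendre Tau method:

```python
# Classical Sturm-Liouville sketch: -u'' = lambda*u on (0, pi), u(0)=u(pi)=0,
# discretized by second-order finite differences (exact eigenvalues 1, 4, 9, ...).
import numpy as np
from scipy.linalg import eigh_tridiagonal

N = 500
h = np.pi / (N + 1)
d = np.full(N, 2.0) / h**2        # main diagonal of the -d^2/dx^2 matrix
e = np.full(N - 1, -1.0) / h**2   # off-diagonals

eigvals = eigh_tridiagonal(d, e, select='i', select_range=(0, 4))[0]
print(np.round(eigvals, 3))       # approx [1, 4, 9, 16, 25]
```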
Improved Personalized Recommendation Based on Causal Association Rule and Collaborative Filtering
ERIC Educational Resources Information Center
Lei, Wu; Qing, Fang; Zhou, Jin
2016-01-01
User evaluations of resources on a recommender system are usually limited, which leads to an extremely sparse user rating matrix and greatly reduces the accuracy of personalized recommendation, especially for new users or new items. This paper presents a recommendation method based on rating prediction using causal association rules.…
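A minimal sketch of the collaborative-filtering half of such a method on a sparse rating matrix (zeros mean unrated); the toy matrix and neighborhood size are assumptions, and the paper's causal association rules are not reproduced:

```python
# Minimal user-based collaborative filtering on a sparse rating matrix.
import numpy as np

R = np.array([[5, 3, 0, 1],        # tiny toy rating matrix, users x items
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [0, 0, 5, 4]], dtype=float)

def predict(R, user, item, k=2):
    """Predict R[user, item] from the k most similar users who rated it."""
    norms = np.linalg.norm(R, axis=1)
    sims = R @ R[user] / (norms * norms[user] + 1e-12)   # cosine similarity
    raters = [u for u in np.argsort(-sims)
              if u != user and R[u, item] > 0][:k]
    if not raters:
        return R[R[:, item] > 0, item].mean()            # fall back to item mean
    w = sims[raters]
    return float(w @ R[raters, item] / (w.sum() + 1e-12))

print(predict(R, user=0, item=2))  # estimate user 0's rating of item 2
```

The sparsity problem the abstract names is visible even here: with few raters per item, the neighborhood is small and predictions degrade, which is what the rule-based component is meant to mitigate.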
Incoming Population: Where Will the People Live? Coping with Growth.
ERIC Educational Resources Information Center
Siegler, Theodore R.
The guide describes an assessment procedure that can be used by sparsely populated communities located near a potential development to help predict where the incoming population will choose to live and shop. First, a numerical model, the "gravity model," is presented which utilizes community size and the distance from the community to…
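A hedged sketch of the gravity-model idea just mentioned: each nearby community attracts incoming residents in proportion to its size divided by a power of its distance to the development site. The exponent 2 and all numbers below are illustrative assumptions, not values from the guide:

```python
# Gravity-model sketch: attraction proportional to size / distance**2,
# normalized to shares of the incoming population (all values assumed).
sizes = {"Town A": 5000, "Town B": 1200, "Town C": 800}   # populations
dist = {"Town A": 30.0, "Town B": 10.0, "Town C": 8.0}    # miles to site

attraction = {t: sizes[t] / dist[t] ** 2 for t in sizes}
total = sum(attraction.values())
for town, a in attraction.items():
    print(f"{town}: {a / total:.1%} of incoming residents expected")
```

Note how the small, close community can out-draw the large, distant one; that distance-dominance is the model's central prediction.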
The current study uses case studies of model-estimated regional precipitation and wet ion deposition to estimate errors in corresponding regional values derived from the means of site-specific values within regions of interest located in the eastern US. The mean of model-estimate...
Psychosocial Predictors of Women's Physical Health in Middle Adulthood.
ERIC Educational Resources Information Center
Thomas, Sandra P.
Although health is a key element in one's experience of middle adulthood as a time of productivity and personal fulfillment, research on psychosocial factors predictive of mid-life health is sparse, especially for women. Psychosocial variables are not only highly salient to health, but also are potentially modifiable by women themselves. This…
NASA Astrophysics Data System (ADS)
Friedel, Michael; Buscema, Massimo
2016-04-01
Aquatic ecosystem models can potentially be used to understand the influence of stresses on catchment resource quality. Given that catchment responses are functions of natural and anthropogenic stresses reflected in sparse, spatiotemporal biological, physical, and chemical measurements, an ecosystem is difficult to model using statistical or numerical methods. We propose an artificial adaptive systems approach to modeling ecosystems. First, an unsupervised machine-learning (ML) network is trained on the set of available sparse and disparate data variables. Second, an evolutionary algorithm with genetic doping is applied to reduce the number of ecosystem variables to an optimal set. Third, the optimal set of ecosystem variables is used to retrain the ML network. Fourth, a stochastic cross-validation approach is applied to quantify and compare the nonlinear uncertainty in selected predictions of the original and reduced models. Results are presented for aquatic ecosystems (tens of thousands of square kilometers) undergoing landscape change: the Upper Illinois River Basin and the Central Colorado Assessment Project Area in the USA, and the Southland region of New Zealand.
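A toy sketch of the evolutionary variable-selection step (the second step above): a plain genetic algorithm searches for a small feature subset. The dataset, fitness function, and GA settings are all assumptions, and the abstract's "genetic doping" refinement is not reproduced:

```python
# Toy genetic-algorithm feature selection: evolve boolean masks over the
# variables, scoring each subset by cross-validated fit minus a size penalty.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=300, n_features=30, n_informative=5,
                       noise=5.0, random_state=0)
rng = np.random.default_rng(0)

def fitness(mask):
    if mask.sum() == 0:
        return -np.inf
    score = cross_val_score(LinearRegression(), X[:, mask], y, cv=3).mean()
    return score - 0.01 * mask.sum()          # penalize large subsets

pop = rng.random((20, X.shape[1])) < 0.3      # initial random subsets
for _ in range(25):
    scores = np.array([fitness(m) for m in pop])
    parents = pop[np.argsort(scores)[-10:]]   # keep the fittest half
    children = parents[rng.integers(10, size=10)].copy()
    children ^= rng.random(children.shape) < 0.05   # mutation
    pop = np.vstack([parents, children])

best = pop[np.argmax([fitness(m) for m in pop])]
print("selected variables:", np.flatnonzero(best))
```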