Masuda, Y; Misztal, I; Legarra, A; Tsuruta, S; Lourenco, D A L; Fragomeni, B O; Aguilar, I
2017-01-01
This paper evaluates an efficient implementation for multiplying the inverse of the numerator relationship matrix for genotyped animals (A22^-1) by a vector (q). The computation is required for solving the mixed model equations in single-step genomic BLUP (ssGBLUP) with the preconditioned conjugate gradient (PCG) method. The inverse can be decomposed into sparse matrices that are blocks of the sparse inverse of the numerator relationship matrix (A^-1) including genotyped animals and their ancestors. The elements of A^-1 were rapidly calculated with Henderson's rule and stored as sparse matrices in memory. The product A22^-1 q was implemented as a series of sparse matrix-vector multiplications. Diagonal elements of A22^-1, which were required as preconditioners in PCG, were approximated with a Monte Carlo method using 1,000 samples. The efficient implementation was compared with explicit inversion of A22 on 3 data sets including about 15,000, 81,000, and 570,000 genotyped animals selected from populations with 213,000, 8.2 million, and 10.7 million pedigree animals, respectively. The explicit inversion required 1.8 GB, 49 GB, and 2,415 GB (estimated) of memory and 42 s, 56 min, and 13.5 d (estimated) of computing time, respectively. The efficient implementation required <1 MB, 2.9 GB, and 2.3 GB of memory and <1 s, 3 min, and 5 min for setup, respectively. Less than 1 s was required for the multiplication in each PCG iteration for all data sets. When the equations in ssGBLUP are solved with the PCG algorithm, A22^-1 is no longer a limiting factor in the computations.
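A minimal SciPy sketch of the block identity behind this implementation, A22^-1 q = A^22 q - A^21 (A^11)^-1 (A^12 q), assuming the four blocks of the sparse inverse A^-1 (partitioned into nongenotyped and genotyped animals) have already been assembled with Henderson's rule; the function name and the direct inner solve are illustrative (the inner system could equally be solved iteratively or by sparse Cholesky):

```python
import numpy as np
from scipy.sparse.linalg import splu

def a22_inverse_times_q(A11, A12, A21, A22, q):
    """Multiply A22^-1 by q using only blocks of the sparse A^-1.

    A11, A12, A21, A22 are the blocks of A^-1 (scipy.sparse matrices),
    with A11 the nongenotyped-by-nongenotyped block.
    """
    t = A12 @ q                      # sparse matrix-vector product
    x = splu(A11.tocsc()).solve(t)   # (A^11)^-1 (A^12 q), sparse solve
    return A22 @ q - A21 @ x         # A^22 q - A^21 x
```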
Yang, C L; Wei, H Y; Adler, A; Soleimani, M
2013-06-01
Electrical impedance tomography (EIT) is a fast and cost-effective technique to provide a tomographic conductivity image of a subject from boundary current-voltage data. This paper proposes a time- and memory-efficient method for solving a large-scale 3D EIT inverse problem using a parallel conjugate gradient (CG) algorithm. A 3D EIT system with a large number of measurements produces a large Jacobian matrix, which causes difficulties in storage and in the inversion process. One of the challenges in 3D EIT is to decrease the reconstruction time and memory usage while retaining image quality. Firstly, a sparse matrix reduction technique is proposed that uses thresholding to set very small values of the Jacobian matrix to zero. Converting the Jacobian matrix to a sparse format eliminates the zero elements, which reduces the memory requirement. Secondly, a block-wise CG method for parallel reconstruction has been developed. The proposed method has been tested using simulated data as well as experimental test samples. A sparse Jacobian with block-wise CG enables the large-scale EIT problem to be solved efficiently. Image quality measures are presented to quantify the effect of sparse matrix reduction on the reconstruction results.
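A hedged sketch of the Jacobian thresholding step described above; the threshold value and names are illustrative, and the point is only that zeroed entries cost nothing once the matrix is held in a compressed sparse format:

```python
import numpy as np
from scipy import sparse

def sparsify_jacobian(J, threshold=1e-6):
    Js = J.copy()
    Js[np.abs(Js) < threshold] = 0.0  # drop near-zero sensitivities
    return sparse.csr_matrix(Js)      # zeros are not stored in CSR
```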
Sparse Regression as a Sparse Eigenvalue Problem
NASA Technical Reports Server (NTRS)
Moghaddam, Baback; Gruber, Amit; Weiss, Yair; Avidan, Shai
2008-01-01
We extend the l0-norm "subspectral" algorithms for sparse-LDA [5] and sparse-PCA [6] to general quadratic costs such as MSE in linear (kernel) regression. The resulting "Sparse Least Squares" (SLS) problem is also NP-hard, by way of its equivalence to a rank-1 sparse eigenvalue problem (e.g., binary sparse-LDA [7]). Specifically, for a general quadratic cost we use a highly efficient technique for direct eigenvalue computation using partitioned matrix inverses, which leads to dramatic 10^3-fold speed-ups over standard eigenvalue decomposition. This increased efficiency mitigates the O(n^4) scaling behaviour that has until now limited the previous algorithms' utility for high-dimensional learning problems. Moreover, the new computation prioritizes the role of the less myopic backward elimination stage, which becomes more efficient than forward selection. Similarly, branch-and-bound search for Exact Sparse Least Squares (ESLS) also benefits from partitioned matrix inverse techniques. Our Greedy Sparse Least Squares (GSLS) generalizes Natarajan's algorithm [9], also known as Order-Recursive Matching Pursuit (ORMP). Specifically, the forward half of GSLS is exactly equivalent to ORMP but more efficient. By including the backward pass, which only doubles the computation, we can achieve lower MSE than ORMP. Experimental comparisons to the state-of-the-art LARS algorithm [3] show that forward-GSLS is faster, more accurate, and more flexible in terms of the choice of regularization.
An efficient implementation of a high-order filter for a cubed-sphere spectral element model
NASA Astrophysics Data System (ADS)
Kang, Hyun-Gyu; Cheong, Hyeong-Bin
2017-03-01
A parallel-scalable, isotropic, scale-selective spatial filter was developed for the cubed-sphere spectral element model on the sphere. The filter equation is a high-order elliptic (Helmholtz) equation based on the spherical Laplacian operator, which is transformed into cubed-sphere local coordinates. The Laplacian operator is discretized on the computational domain, i.e., on each cell, by the spectral element method with Gauss-Lobatto Lagrange interpolating polynomials (GLLIPs) as the orthogonal basis functions. On the global domain, the discrete filter equation yields a linear system represented by a highly sparse matrix. The density of this matrix increases quadratically (linearly) with the order of GLLIP (order of the filter), and the linear system is solved in only O(Ng) operations, where Ng is the total number of grid points. The solution, obtained by a row reduction method, demonstrated the typical accuracy and convergence rate of the cubed-sphere spectral element method. To achieve computational efficiency on parallel computers, the linear system was treated by an inverse matrix method (a sparse matrix-vector multiplication). The density of the inverse matrix was lowered to only a few times that of the original sparse matrix without degrading the accuracy of the solution. For better computational efficiency, a local-domain high-order filter was introduced: the filter equation is applied to multiple cells, and then only the central cell is used to reconstruct the filtered field. The parallel efficiency of applying the inverse matrix method to the global- and local-domain filters was evaluated by the scalability on a distributed-memory parallel computer. The scale-selective performance of the filter was demonstrated on Earth topography. The usefulness of the filter as a hyper-viscosity for the vorticity equation was also demonstrated.
Noniterative MAP reconstruction using sparse matrix representations.
Cao, Guangzhi; Bouman, Charles A; Webb, Kevin J
2009-09-01
We present a method for noniterative maximum a posteriori (MAP) tomographic reconstruction which is based on the use of sparse matrix representations. Our approach is to precompute and store the inverse matrix required for MAP reconstruction. This approach has generally not been used in the past because the inverse matrix is typically large and fully populated (i.e., not sparse). In order to overcome this problem, we introduce two new ideas. The first idea is a novel theory for the lossy source coding of matrix transformations, which we refer to as matrix source coding. This theory is based on a distortion metric that reflects the distortions produced in the final matrix-vector product, rather than the distortions in the coded matrix itself. The resulting algorithms are shown to require orthonormal transformations of both the measurement data and the matrix rows and columns before quantization and coding. The second idea is a method for efficiently storing and computing the required orthonormal transformations, which we call a sparse-matrix transform (SMT). The SMT is a generalization of the classical FFT in that it uses butterflies to compute an orthonormal transform; but unlike an FFT, the SMT uses the butterflies in an irregular pattern, and is numerically designed to best approximate the desired transforms. We demonstrate the potential of the noniterative MAP reconstruction with examples from optical tomography. The method requires offline computation to encode the inverse transform. However, once these offline computations are completed, the noniterative MAP algorithm is shown to reduce both storage and computation by well over two orders of magnitude, compared to linear iterative reconstruction methods.
NASA Astrophysics Data System (ADS)
Kaporin, I. E.
2012-02-01
In order to precondition a sparse symmetric positive definite matrix, its approximate inverse is examined, which is represented as the product of two sparse mutually adjoint triangular matrices. In this way, the solution of the corresponding system of linear algebraic equations (SLAE) by applying the preconditioned conjugate gradient method (CGM) is reduced to performing only elementary vector operations and calculating sparse matrix-vector products. A method for constructing the above preconditioner is described and analyzed. The triangular factor has a fixed sparsity pattern and is optimal in the sense that the preconditioned matrix has a minimum K-condition number. The use of polynomial preconditioning based on Chebyshev polynomials makes it possible to considerably reduce the amount of scalar product operations (at the cost of an insignificant increase in the total number of arithmetic operations). The possibility of an efficient massively parallel implementation of the resulting method for solving SLAEs is discussed. For a sequential version of this method, the results obtained by solving 56 test problems from the Florida sparse matrix collection (which are large-scale and ill-conditioned) are presented. These results show that the method is highly reliable and has low computational costs.
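A minimal sketch of how such a factorized sparse approximate inverse is applied inside the preconditioned CG iteration, assuming the sparse triangular factor G with G G^T ≈ A^-1 has already been constructed (the construction itself, with its fixed sparsity pattern and minimized K-condition number, is the subject of the paper); the function name and SciPy wiring are illustrative:

```python
from scipy.sparse.linalg import cg, LinearOperator

def fsai_pcg(A, b, G):
    # Preconditioning step: z = G (G^T r) -- two sparse mat-vecs,
    # no triangular solves, hence well suited to parallel hardware.
    M = LinearOperator(A.shape, matvec=lambda r: G @ (G.T @ r))
    return cg(A, b, M=M)  # returns (solution, convergence flag)
```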
NASA Astrophysics Data System (ADS)
He, Xingyu; Tong, Ningning; Hu, Xiaowei
2018-01-01
Compressive sensing has been successfully applied to inverse synthetic aperture radar (ISAR) imaging of moving targets. By exploiting the block sparse structure of the target image, sparse solution for multiple measurement vectors (MMV) can be applied in ISAR imaging and a substantial performance improvement can be achieved. As an effective sparse recovery method, sparse Bayesian learning (SBL) for MMV involves a matrix inverse at each iteration. Its associated computational complexity grows significantly with the problem size. To address this problem, we develop a fast inverse-free (IF) SBL method for MMV. A relaxed evidence lower bound (ELBO), which is computationally more amenable than the traditional ELBO used by SBL, is obtained by invoking a fundamental property of smooth functions. A variational expectation-maximization scheme is then employed to maximize the relaxed ELBO, and a computationally efficient IF-MSBL algorithm is proposed. Numerical results based on simulated and real data show that the proposed method can reconstruct row-sparse signals accurately and obtain clear superresolution ISAR images. Moreover, the running time and computational complexity are reduced to a great extent compared with traditional SBL methods.
Algorithms for solving large sparse systems of simultaneous linear equations on vector processors
NASA Technical Reports Server (NTRS)
David, R. E.
1984-01-01
Very efficient algorithms for solving large sparse systems of simultaneous linear equations have been developed for serial processing computers. These involve a reordering of matrix rows and columns in order to obtain a near triangular pattern of nonzero elements. Then an LU factorization is developed to represent the matrix inverse in terms of a sequence of elementary Gaussian eliminations, or pivots. In this paper it is shown how these algorithms are adapted for efficient implementation on vector processors. Results obtained on the CYBER 200 Model 205 are presented for a series of large test problems which show the comparative advantages of the triangularization and vector processing algorithms.
Negre, Christian F A; Mniszewski, Susan M; Cawkwell, Marc J; Bock, Nicolas; Wall, Michael E; Niklasson, Anders M N
2016-07-12
We present a reduced complexity algorithm to compute the inverse overlap factors required to solve the generalized eigenvalue problem in a quantum-based molecular dynamics (MD) simulation. Our method is based on the recursive, iterative refinement of an initial guess of Z (inverse square root of the overlap matrix S). The initial guess of Z is obtained beforehand by using either an approximate divide-and-conquer technique or dynamical methods, propagated within an extended Lagrangian dynamics from previous MD time steps. With this formulation, we achieve long-term stability and energy conservation even under the incomplete, approximate, iterative refinement of Z. Linear-scaling performance is obtained using numerically thresholded sparse matrix algebra based on the ELLPACK-R sparse matrix data format, which also enables efficient shared-memory parallelization. As we show in this article using self-consistent density-functional-based tight-binding MD, our approach is faster than conventional methods based on the diagonalization of overlap matrix S for systems as small as a few hundred atoms, substantially accelerating quantum-based simulations even for molecular structures of intermediate size. For a 4158-atom water-solvated polyalanine system, we find an average speedup factor of 122 for the computation of Z in each MD step.
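A dense-algebra sketch of the kind of iterative refinement described above (a Newton-Schulz-type update toward Z = S^-1/2); the production code instead uses numerically thresholded sparse algebra in the ELLPACK-R format, and the fixed iteration count here is illustrative:

```python
import numpy as np

def refine_inverse_sqrt(S, Z, n_iter=4):
    """Refine an initial guess Z of S^(-1/2)."""
    I = np.eye(S.shape[0])
    for _ in range(n_iter):
        # Converges quadratically when ||I - Z^T S Z|| < 1.
        Z = 0.5 * Z @ (3.0 * I - Z.T @ S @ Z)
    return Z
```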
Low-Rank Correction Methods for Algebraic Domain Decomposition Preconditioners
Li, Ruipeng; Saad, Yousef
2017-08-01
This study presents a parallel preconditioning method for distributed sparse linear systems, based on an approximate inverse of the original matrix, that adopts a general framework of distributed sparse matrices and exploits domain decomposition (DD) and low-rank corrections. The DD approach decouples the matrix and, once inverted, a low-rank approximation is applied by exploiting the Sherman--Morrison--Woodbury formula, which yields two variants of the preconditioning methods. The low-rank expansion is computed by the Lanczos procedure with reorthogonalizations. Numerical experiments indicate that, when combined with Krylov subspace accelerators, this preconditioner can be efficient and robust for solving symmetric sparse linear systems. Comparisons with pARMS, a DD-based parallel incomplete LU (ILU) preconditioning method, are presented for solving Poisson's equation and linear elasticity problems.
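A generic sketch of the Sherman-Morrison-Woodbury step this preconditioner exploits: applying (B + U V^T)^-1 to a vector, given a solver for B and one small dense solve; solve_B is assumed to accept both vectors and tall, thin matrices:

```python
import numpy as np

def woodbury_apply(solve_B, U, V, r):
    # (B + U V^T)^-1 r = B^-1 r - B^-1 U (I + V^T B^-1 U)^-1 V^T B^-1 r
    y = solve_B(r)
    S = np.eye(U.shape[1]) + V.T @ solve_B(U)  # small k x k capacitance matrix
    return y - solve_B(U @ np.linalg.solve(S, V.T @ y))
```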
Fast Component Pursuit for Large-Scale Inverse Covariance Estimation.
Han, Lei; Zhang, Yu; Zhang, Tong
2016-08-01
The maximum likelihood estimation (MLE) for the Gaussian graphical model, which is also known as the inverse covariance estimation problem, has gained increasing interest recently. Most existing works assume that the inverse covariance estimator contains sparse structure and then construct models with ℓ1 regularization. In this paper, different from existing works, we study the inverse covariance estimation problem from another perspective by efficiently modeling the low-rank structure in the inverse covariance, which is assumed to be a combination of a low-rank part and a diagonal matrix. One motivation for this assumption is that low-rank structure is common in many applications, including climate and financial analysis; another is that such an assumption can reduce the computational complexity of computing the inverse. Specifically, we propose an efficient COmponent Pursuit (COP) method to obtain the low-rank part, where each component can be sparse. For optimization, the COP method greedily learns a rank-one component in each iteration by maximizing the log-likelihood. Moreover, the COP algorithm enjoys several appealing properties, including the existence of an efficient solution in each iteration and a theoretical guarantee on the convergence of this greedy approach. Experiments on large-scale synthetic and real-world datasets including thousands of millions of variables show that the COP method is faster than the state-of-the-art techniques for the inverse covariance estimation problem when achieving comparable log-likelihood on test data.
NASA Astrophysics Data System (ADS)
Fiandrotti, Attilio; Fosson, Sophie M.; Ravazzi, Chiara; Magli, Enrico
2018-04-01
Compressive sensing promises to enable bandwidth-efficient on-board compression of astronomical data by lifting the encoding complexity from the source to the receiver. The signal is recovered off-line, exploiting GPUs' parallel computation capabilities to speed up the reconstruction process. However, inherent GPU hardware constraints limit the size of the recoverable signal and the speedup practically achievable. In this work, we design parallel algorithms that exploit the properties of circulant matrices for efficient GPU-accelerated sparse signal recovery. Our approach reduces the memory requirements, allowing us to recover very large signals with limited memory. In addition, it achieves a tenfold signal recovery speedup thanks to ad-hoc parallelization of matrix-vector multiplications and matrix inversions. Finally, we practically demonstrate our algorithms in a typical application of circulant matrices: deblurring a sparse astronomical image in the compressed domain.
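The circulant property being exploited is that the FFT diagonalizes any circulant matrix, so matrix-vector products and inversions reduce to elementwise operations on FFT coefficients; a minimal sketch (eps guards against division by tiny eigenvalues and is an illustrative choice):

```python
import numpy as np

def circulant_matvec(c, x):
    # C x, where C is circulant with first column c
    return np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))

def circulant_solve(c, b, eps=1e-12):
    # C^-1 b via elementwise division by the eigenvalues fft(c)
    return np.real(np.fft.ifft(np.fft.fft(b) / (np.fft.fft(c) + eps)))
```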
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schiffmann, Florian; VandeVondele, Joost, E-mail: Joost.VandeVondele@mat.ethz.ch
2015-06-28
We present an improved preconditioning scheme for electronic structure calculations based on the orbital transformation method. First, a preconditioner is developed which includes information from the full Kohn-Sham matrix but avoids computationally demanding diagonalisation steps in its construction. This reduces the computational cost of its construction, eliminating a bottleneck in large scale simulations, while maintaining rapid convergence. In addition, a modified form of Hotelling's iterative inversion is introduced to replace the exact inversion of the preconditioner matrix. This method is highly effective during molecular dynamics (MD), as the solution obtained in earlier MD steps is a suitable initial guess. Filtering small elements during sparse matrix multiplication leads to linear scaling inversion, while retaining robustness, already for relatively small systems. For system sizes ranging from a few hundred to a few thousand atoms, which are typical for many practical applications, the improvements to the algorithm lead to a 2-5 fold speedup per MD step.
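A hedged sketch of Hotelling's iterative inversion with the small-element filtering mentioned above; the iteration count and drop tolerance are illustrative, and in MD the previous step's inverse would serve as X0:

```python
import scipy.sparse as sp

def hotelling_inverse(A, X0, n_iter=3, drop_tol=1e-7):
    X = X0.copy()
    I2 = 2.0 * sp.identity(A.shape[0], format="csr")
    for _ in range(n_iter):
        X = X @ (I2 - A @ X)                # X_{k+1} = X_k (2I - A X_k)
        X.data[abs(X.data) < drop_tol] = 0  # filter small elements
        X.eliminate_zeros()                 # keep the iterate sparse
    return X
```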
Efficient Storage Scheme of Covariance Matrix during Inverse Modeling
NASA Astrophysics Data System (ADS)
Mao, D.; Yeh, T. J.
2013-12-01
During stochastic inverse modeling, the covariance matrix of geostatistics-based methods carries the information about the geologic structure. Its update during iterations reflects the decrease of uncertainty with the incorporation of observed data. For large-scale problems, its storage and update cost too much memory and computational resources. In this study, we propose a new efficient scheme for storage and update. The Compressed Sparse Column (CSC) format is utilized to store the covariance matrix, and users can choose how much data to store based on correlation scales, since data beyond several correlation scales are usually not very informative for inverse modeling. After every iteration, only the diagonal terms of the covariance matrix are updated. The off-diagonal terms are calculated and updated based on shortened correlation scales with a pre-assigned exponential model. The correlation scales are shortened by a coefficient, e.g., 0.95, every iteration to reflect the decrease of uncertainty. There is no universal coefficient for all problems, and users are encouraged to experiment. This new scheme is first tested with 1D examples. The estimated results and uncertainty are compared with those of the traditional full storage method. In the end, a large-scale numerical model is used to validate the new scheme.
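An illustrative 1D sketch of the proposed storage scheme, assuming an exponential correlation model: entries beyond a cutoff of a few correlation scales are simply not stored, and the scale is shrunk by the chosen coefficient each iteration; all names and the cutoff of three scales are assumptions for illustration:

```python
import numpy as np
from scipy import sparse

def sparse_exp_covariance(coords, variance, corr_scale, cutoff=3.0):
    rows, cols, vals = [], [], []
    n = len(coords)
    for i in range(n):
        for j in range(n):
            h = abs(coords[i] - coords[j])
            if h <= cutoff * corr_scale:  # beyond ~3 scales: not stored
                rows.append(i)
                cols.append(j)
                vals.append(variance * np.exp(-h / corr_scale))
    return sparse.csc_matrix((vals, (rows, cols)), shape=(n, n))

# After each inverse-modeling iteration the off-diagonals are rebuilt
# with a shortened scale, e.g. corr_scale *= 0.95.
```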
Sparse Matrix Motivated Reconstruction of Far-Field Radiation Patterns
2015-03-01
The report describes a reconstruction algorithm based on sparse representations of far-field radiation patterns using the inverse Discrete Fourier Transform (DFT) and the inverse Discrete Cosine Transform (DCT), as well as a Model-Based Parameter Estimation (MBPE) technique that reduces the computational time required to model radiation patterns.
A physiologically motivated sparse, compact, and smooth (SCS) approach to EEG source localization.
Cao, Cheng; Akalin Acar, Zeynep; Kreutz-Delgado, Kenneth; Makeig, Scott
2012-01-01
Here, we introduce a novel approach to the EEG inverse problem based on the assumption that the principal cortical sources of multi-channel EEG recordings may be assumed to be spatially sparse, compact, and smooth (SCS). To enforce these characteristics of solutions to the EEG inverse problem, we propose a correlation-variance model which factors a cortical source space covariance matrix into the product of a pre-given correlation coefficient matrix and the square root of a diagonal variance matrix learned from the data under a Bayesian learning framework. We tested the SCS method using simulated EEG data with various SNRs and applied it to a real ECoG data set. We compare the results of SCS to those of an established sparse Bayesian learning (SBL) algorithm.
NASA Astrophysics Data System (ADS)
Galiatsatos, P. G.; Tennyson, J.
2012-11-01
The most time-consuming step within the framework of the UK R-matrix molecular codes is the diagonalization of the inner-region Hamiltonian matrix (IRHM). Here we present the method that we follow to speed up this step. We use shared-memory machines (SMM), distributed-memory machines (DMM), the OpenMP directive-based parallel language, the MPI function-based parallel language, the sparse matrix diagonalizers ARPACK and PARPACK, a variation for real symmetric matrices of the official coordinate sparse matrix format, and finally a parallel sparse matrix-vector product (PSMV). The efficient application of these techniques relies on two important facts: the sparsity of the matrix is large enough (more than 98%), and only a small part of the matrix spectrum is needed to obtain converged results.
Laplace-domain waveform modeling and inversion for the 3D acoustic-elastic coupled media
NASA Astrophysics Data System (ADS)
Shin, Jungkyun; Shin, Changsoo; Calandra, Henri
2016-06-01
Laplace-domain waveform inversion reconstructs long-wavelength subsurface models by using the zero-frequency component of damped seismic signals. Despite the computational advantages of Laplace-domain waveform inversion over conventional frequency-domain waveform inversion, an acoustic assumption and an iterative matrix solver have been used to invert 3D marine datasets to mitigate the intensive computing cost. In this study, we develop a Laplace-domain waveform modeling and inversion algorithm for 3D acoustic-elastic coupled media by using a parallel sparse direct solver library (MUltifrontal Massively Parallel Solver, MUMPS). We precisely simulate a real marine environment by coupling the 3D acoustic and elastic wave equations with the proper boundary condition at the fluid-solid interface. In addition, we can extract the elastic properties of the Earth below the sea bottom from the recorded acoustic pressure datasets. As a matrix solver, the parallel sparse direct solver is used to factorize the non-symmetric impedance matrix in a distributed memory architecture and rapidly solve the wave field for a number of shots by using the lower and upper matrix factors. Using both synthetic datasets and real datasets obtained by a 3D wide-azimuth survey, the long-wavelength component of the P-wave and S-wave velocity models is reconstructed, and the proposed modeling and inversion algorithms are verified. A cluster of 80 CPU cores is used for this study.
NASA Astrophysics Data System (ADS)
Stoykov, S.; Atanassov, E.; Margenov, S.
2016-10-01
Many scientific applications involve sparse or dense matrix operations, such as solving linear systems, matrix-matrix products, eigensolvers, etc. In what concerns structural nonlinear dynamics, the computation of periodic responses and the determination of the stability of the solution are of primary interest. The shooting method is widely used for obtaining periodic responses of nonlinear systems. The method involves simultaneous operations with sparse and dense matrices. One of the computationally expensive operations in the method is the multiplication of sparse by dense matrices. In the current work, a new algorithm for sparse-by-dense matrix products is presented. The algorithm takes into account the structure of the sparse matrix, which is obtained by space discretization of the nonlinear Mindlin plate equation of motion by the finite element method. The algorithm is developed to use the vector engine of Intel Xeon Phi coprocessors. It is compared with the standard sparse-by-dense matrix multiplication algorithm and the one developed by Intel MKL, and it is shown that by considering the properties of the sparse matrix, better algorithms can be developed.
Overcoming Challenges in Kinetic Modeling of Magnetized Plasmas and Vacuum Electronic Devices
NASA Astrophysics Data System (ADS)
Omelchenko, Yuri; Na, Dong-Yeop; Teixeira, Fernando
2017-10-01
We transform the state of the art of plasma modeling by taking advantage of novel computational techniques for fast and robust integration of multiscale hybrid (full particle ions, fluid electrons, no displacement current) and full-PIC models. These models are implemented in the 3D HYPERS and axisymmetric full-PIC CONPIC codes. HYPERS is a massively parallel, asynchronous code. The HYPERS solver does not step fields and particles synchronously in time but instead executes local variable updates (events) at their self-adaptive rates while preserving fundamental conservation laws. The charge-conserving CONPIC code has a matrix-free explicit finite-element (FE) solver based on a sparse approximate inverse (SPAI) algorithm. This explicit solver approximates the inverse FE system matrix ("mass" matrix) using successive sparsity pattern orders of the original matrix. It does not reduce the set of Maxwell's equations to a vector-wave (curl-curl) equation of second order but instead utilizes the standard coupled first-order Maxwell's system. We discuss the ability of our codes to accurately and efficiently account for multiscale physical phenomena in 3D magnetized space and laboratory plasmas and axisymmetric vacuum electronic devices.
Multi-threaded Sparse Matrix-Matrix Multiplication for Many-Core and GPU Architectures.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Deveci, Mehmet; Trott, Christian Robert; Rajamanickam, Sivasankaran
Sparse Matrix-Matrix multiplication is a key kernel that has applications in several domains such as scientific computing and graph analysis. Several algorithms have been studied in the past for this foundational kernel. In this paper, we develop parallel algorithms for sparse matrix-matrix multiplication with a focus on performance portability across different high performance computing architectures. The performance of these algorithms depends on the data structures used in them. We compare different types of accumulators in these algorithms and demonstrate the performance difference between these data structures. Furthermore, we develop a meta-algorithm, kkSpGEMM, to choose the right algorithm and data structure based on the characteristics of the problem. We show performance comparisons on three architectures and demonstrate the need for the community to develop two-phase sparse matrix-matrix multiplication implementations for efficient reuse of the data structures involved.
Sparsistency and Rates of Convergence in Large Covariance Matrix Estimation.
Lam, Clifford; Fan, Jianqing
2009-01-01
This paper studies the sparsistency and rates of convergence for estimating sparse covariance and precision matrices based on penalized likelihood with nonconvex penalty functions. Here, sparsistency refers to the property that all parameters that are zero are actually estimated as zero with probability tending to one. Depending on the application, sparsity may occur a priori in the covariance matrix, its inverse, or its Cholesky decomposition. We study these three sparsity exploration problems under a unified framework with a general penalty function. We show that the rates of convergence for these problems under the Frobenius norm are of order (s_n log p_n / n)^(1/2), where s_n is the number of nonzero elements, p_n is the size of the covariance matrix, and n is the sample size. This explicitly spells out that the contribution of high dimensionality is merely a logarithmic factor. The conditions on the rate at which the tuning parameter λ_n goes to 0 have been made explicit and compared under different penalties. As a result, for the L1 penalty, to guarantee sparsistency and the optimal rate of convergence, the number of nonzero elements should be small: s'_n = O(p_n) at most, among O(p_n^2) parameters, for estimating a sparse covariance or correlation matrix, a sparse precision or inverse correlation matrix, or a sparse Cholesky factor, where s'_n is the number of nonzero off-diagonal elements. On the other hand, using the SCAD or hard-thresholding penalty functions, there is no such restriction.
Recursive inverse factorization.
Rubensson, Emanuel H; Bock, Nicolas; Holmström, Erik; Niklasson, Anders M N
2008-03-14
A recursive algorithm for the inverse factorization S^-1 = ZZ^* of Hermitian positive definite matrices S is proposed. The inverse factorization is based on iterative refinement [A.M.N. Niklasson, Phys. Rev. B 70, 193102 (2004)] combined with a recursive decomposition of S. As the computational kernel is matrix-matrix multiplication, the algorithm can be parallelized and the computational effort increases linearly with system size for systems with sufficiently sparse matrices. Recent advances in network theory are used to find appropriate recursive decompositions. We show that optimization of the so-called network modularity results in an improved partitioning compared to other approaches, in particular when the recursive inverse factorization is applied to overlap matrices of irregularly structured three-dimensional molecules.
Fast and Adaptive Sparse Precision Matrix Estimation in High Dimensions
Liu, Weidong; Luo, Xi
2014-01-01
This paper proposes a new method for estimating sparse precision matrices in the high dimensional setting. It has been popular to study fast computation and adaptive procedures for this problem. We propose a novel approach, called Sparse Column-wise Inverse Operator, to address these two issues. We analyze an adaptive procedure based on cross validation, and establish its convergence rate under the Frobenius norm. The convergence rates under other matrix norms are also established. This method also enjoys the advantage of fast computation for large-scale problems, via a coordinate descent algorithm. Numerical merits are illustrated using both simulated and real datasets. In particular, it performs favorably on an HIV brain tissue dataset and an ADHD resting-state fMRI dataset.
BCYCLIC: A parallel block tridiagonal matrix cyclic solver
NASA Astrophysics Data System (ADS)
Hirshman, S. P.; Perumalla, K. S.; Lynch, V. E.; Sanchez, R.
2010-09-01
A block tridiagonal matrix is factored with minimal fill-in using a cyclic reduction algorithm that is easily parallelized. Storage of the factored blocks allows the application of the inverse to multiple right-hand sides which may not be known at factorization time. Scalability with the number of block rows is achieved with cyclic reduction, while scalability with the block size is achieved using multithreaded routines (OpenMP, GotoBLAS) for block matrix manipulation. This dual scalability is a noteworthy feature of this new solver, as is its ability to efficiently handle arbitrary (non-power-of-2) block row and processor counts. Comparison with a state-of-the-art parallel sparse solver is presented. It is expected that this new solver will allow many physical applications to optimally use the parallel resources on current supercomputers. Example usage of the solver in magnetohydrodynamic (MHD), three-dimensional equilibrium solvers for high-temperature fusion plasmas is cited.
High-SNR spectrum measurement based on Hadamard encoding and sparse reconstruction
NASA Astrophysics Data System (ADS)
Wang, Zhaoxin; Yue, Jiang; Han, Jing; Li, Long; Jin, Yong; Gao, Yuan; Li, Baoming
2017-12-01
The denoising capabilities of the H-matrix and the cyclic S-matrix based on sparse reconstruction, employed in the Pixel of Focal Plane Coded Visible Spectrometer for spectrum measurement, are investigated, where the spectrum is sparse in a known basis. In the measurement process, the digital micromirror device, which implements the Hadamard coding, plays an important role. In contrast with Hadamard transform spectrometry, which relies on shift invariance, this spectrometer may have the advantage of high efficiency. Simulations and experiments show that the nonlinear solution with sparse reconstruction has a better signal-to-noise ratio than the linear solution, and that the H-matrix outperforms the cyclic S-matrix whether the reconstruction method is nonlinear or linear.
A fast time-difference inverse solver for 3D EIT with application to lung imaging.
Javaherian, Ashkan; Soleimani, Manuchehr; Moeller, Knut
2016-08-01
A class of sparse optimization techniques that require only matrix-vector products, rather than explicit access to the forward matrix and its transpose, has received much attention in the recent decade for dealing with large-scale inverse problems. This study tailors the application of the so-called Gradient Projection for Sparse Reconstruction (GPSR) to large-scale time-difference three-dimensional electrical impedance tomography (3D EIT). 3D EIT typically suffers from the need for a large number of voxels to cover the whole domain, so its application to real-time imaging, for example monitoring of lung function, remains scarce, since the large number of degrees of freedom of the problem greatly increases storage space and reconstruction time. This study shows the great potential of GPSR for large-size time-difference 3D EIT. Further studies are needed to improve its accuracy for imaging small-size anomalies.
Wavelet-like bases for thin-wire integral equations in electromagnetics
NASA Astrophysics Data System (ADS)
Francomano, E.; Tortorici, A.; Toscano, E.; Ala, G.; Viola, F.
2005-03-01
In this paper, wavelets are used in solving, by the method of moments, a modified version of the thin-wire electric field integral equation in the frequency domain. The time-domain electromagnetic quantities are obtained by using the inverse discrete fast Fourier transform. The retarded scalar electric and vector magnetic potentials are employed in order to obtain the integral formulation. The discretized model generated by applying the direct method of moments via a point-matching procedure results in a linear system with a dense matrix, which has to be solved for each frequency of the Fourier spectrum of the time-domain impressed source. Therefore, an orthogonal wavelet-like basis transform is used to sparsify the moment matrix. In particular, dyadic and M-band wavelet transforms have been adopted, generating different sparse matrix structures. This leads to an efficient solution of the resulting sparse matrix equation. Moreover, a wavelet preconditioner is used to accelerate the convergence rate of the iterative solver employed. These numerical features are used in analyzing the transient behavior of a lightning protection system. In particular, the focus is on the transient performance of the earth termination system of a lightning protection system, or of the earth electrode of an electric power substation, during its operation. The numerical results, obtained for a complex structure, are discussed and the features of the method are highlighted.
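A compact sketch of the sparsification step: the dense moment matrix A is transformed with an orthonormal wavelet-like matrix W (dyadic or M-band, assumed precomputed) and thresholded relative to its largest entry, after which the transformed system (W A W^T)(W x) = W b can be solved with a sparse iterative solver; the relative threshold is illustrative:

```python
import numpy as np
from scipy import sparse

def sparsify_moment_matrix(A, W, rel_threshold=1e-4):
    At = W @ A @ W.T                                    # wavelet-domain matrix
    At[np.abs(At) < rel_threshold * np.abs(At).max()] = 0.0
    return sparse.csr_matrix(At)                        # now stored sparsely
```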
DNA melting profiles from a matrix method.
Poland, Douglas
2004-02-05
In this article we give a new method for the calculation of DNA melting profiles. Based on the matrix formulation of the DNA partition function, the method relies for its efficiency on the fact that the required matrices are very sparse, essentially reducing matrix multiplication to vector multiplication and thus making the computer time required to treat a DNA molecule containing N base pairs proportional to N^2. A key ingredient in the method is the result that multiplication by the inverse matrix can also be reduced to vector multiplication. The task of calculating the melting profile for the entire genome is further reduced by treating regions of the molecule between helix plateaus, thus breaking the molecule up into independent parts that can each be treated individually. The method is easily modified to incorporate changes in the assignment of statistical weights to the different structural features of DNA. We illustrate the method using the genome of Haemophilus influenzae.
Bilinear Inverse Problems: Theory, Algorithms, and Applications
NASA Astrophysics Data System (ADS)
Ling, Shuyang
We will discuss how several important real-world signal processing problems, such as self-calibration and blind deconvolution, can be modeled as bilinear inverse problems and solved by convex and nonconvex optimization approaches. In Chapter 2, we bring together three seemingly unrelated concepts: self-calibration, compressive sensing, and biconvex optimization. We show how several self-calibration problems can be treated efficiently within the framework of biconvex compressive sensing via a new method called SparseLift. More specifically, we consider a linear system of equations y = DAx, where the diagonal matrix D (which models the calibration error) is unknown and x is an unknown sparse signal. By "lifting" this biconvex inverse problem and exploiting sparsity in this model, we derive explicit theoretical guarantees under which both x and D can be recovered exactly, robustly, and numerically efficiently. In Chapter 3, we study the question of joint blind deconvolution and blind demixing, i.e., extracting a sequence of functions [special characters omitted] from observing only the sum of their convolutions [special characters omitted]. In particular, for the special case s = 1, it becomes the well-known blind deconvolution problem. We present a non-convex algorithm which guarantees exact recovery under conditions that are competitive with convex optimization methods, with the additional advantage of being computationally much more efficient. We discuss several applications of the proposed framework in image processing and wireless communications in connection with the Internet-of-Things. In Chapter 4, we consider three different self-calibration models of practical relevance. We show how their corresponding bilinear inverse problems can be solved by both the simple linear least squares approach and the SVD-based approach. As a consequence, the proposed algorithms are numerically extremely efficient, thus allowing for real-time deployment. Explicit theoretical guarantees and stability theory are derived, and the sampling complexity is nearly optimal (up to a poly-log factor). Applications in imaging sciences and signal processing are discussed, and numerical simulations are presented to demonstrate the effectiveness and efficiency of our approach.
Three-Dimensional Inverse Transport Solver Based on Compressive Sensing Technique
NASA Astrophysics Data System (ADS)
Cheng, Yuxiong; Wu, Hongchun; Cao, Liangzhi; Zheng, Youqi
2013-09-01
Based on direct exposure measurements from a flash radiographic image, a compressive sensing-based method for the three-dimensional inverse transport problem is presented. The linear absorption coefficients and interface locations of objects are reconstructed directly at the same time. It is always very expensive to obtain enough measurements. With limited measurements, the compressive sensing sparse reconstruction technique orthogonal matching pursuit is applied to obtain the sparse coefficients by solving an optimization problem. A three-dimensional inverse transport solver is developed based on this compressive sensing technique. There are three features in this solver: (1) AutoCAD is employed as a geometry preprocessor due to its powerful graphics capabilities. (2) The forward projection matrix, rather than a Gauss matrix, is constructed by the visualization tool generator. (3) Fourier and Daubechies wavelet transforms are adopted to convert an underdetermined system to a well-posed system in the algorithm. Simulations are performed, and numerical results for the pseudo-sine absorption problem, the two-cube problem, and the two-cylinder problem using the compressive sensing-based solver agree well with the reference values.
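A minimal sketch of orthogonal matching pursuit, the sparse reconstruction routine named above, in its usual formulation (A is the forward projection matrix, y the measurements); the greedy loop below is illustrative rather than the solver's actual code:

```python
import numpy as np

def omp(A, y, n_nonzero):
    r, support = y.copy(), []
    for _ in range(n_nonzero):
        support.append(int(np.argmax(np.abs(A.T @ r))))   # best-matching column
        x_s, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        r = y - A[:, support] @ x_s                       # update residual
    x = np.zeros(A.shape[1])
    x[support] = x_s
    return x
```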
FPGA implementation of sparse matrix algorithm for information retrieval
NASA Astrophysics Data System (ADS)
Bojanic, Slobodan; Jevtic, Ruzica; Nieto-Taladriz, Octavio
2005-06-01
Text information retrieval requires a tremendous amount of processing time because of the size of the data and the complexity of information retrieval algorithms. In this paper a solution to this problem is proposed via hardware-supported information retrieval algorithms. Reconfigurable computing accommodates frequent hardware modifications through its tailorable hardware and exploits parallelism for a given application through reconfigurable and flexible hardware units. The degree of parallelism can be tuned for the data. In this work we implemented the standard BLAS (basic linear algebra subprogram) sparse matrix algorithm named Compressed Sparse Row (CSR), which is shown to be more efficient in terms of storage space requirements and query-processing time than other sparse matrix algorithms for information retrieval applications. Although the inverted index algorithm has been treated as the de facto standard for information retrieval for years, an alternative approach that stores the index of a text collection in a sparse matrix structure is gaining attention. This approach performs query processing using sparse matrix-vector multiplication and, due to parallelization, achieves a substantial efficiency gain over the sequential inverted index. The parallel implementations of the information retrieval kernel presented in this work target a Xilinx Virtex II Field Programmable Gate Array (FPGA) board. A recent development in scientific applications is the use of FPGAs to achieve high-performance results. Computational results are compared to implementations on other platforms. The design achieves a high level of parallelism for the overall function while retaining highly optimised hardware within the processing unit.
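The idea of query processing as a sparse matrix-vector product, in a toy SciPy sketch (the counts and weights are made up for illustration): rows index documents, columns index terms, and a single SpMV scores every document against the query:

```python
import numpy as np
from scipy import sparse

term_doc = sparse.csr_matrix(np.array([
    [1, 0, 2],   # doc 0 term frequencies (illustrative)
    [0, 3, 1],   # doc 1
    [1, 1, 0],   # doc 2
]))
query = np.array([0.0, 1.0, 0.5])  # weights for terms 1 and 2
scores = term_doc @ query          # one SpMV ranks all documents
```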
Sparse deconvolution for the large-scale ill-posed inverse problem of impact force reconstruction
NASA Astrophysics Data System (ADS)
Qiao, Baijie; Zhang, Xingwu; Gao, Jiawei; Liu, Ruonan; Chen, Xuefeng
2017-01-01
Most previous regularization methods for solving the inverse problem of force reconstruction minimize the l2-norm of the desired force. However, traditional regularization methods such as Tikhonov regularization and truncated singular value decomposition commonly fail to solve large-scale ill-posed inverse problems at moderate computational cost. In this paper, taking into account the sparse characteristic of impact force, the idea of sparse deconvolution is first introduced to the field of impact force reconstruction and a general sparse deconvolution model of impact force is constructed. Second, a novel impact force reconstruction method based on the primal-dual interior point method (PDIPM) is proposed to solve such a large-scale sparse deconvolution model, where minimizing the l2-norm is replaced by minimizing the l1-norm. Meanwhile, the preconditioned conjugate gradient algorithm is used to compute the search direction of PDIPM with high computational efficiency. Finally, two experiments, including small- to medium-scale single impact force reconstruction and relatively large-scale consecutive impact force reconstruction, are conducted on a composite wind turbine blade and a shell structure to illustrate the advantage of PDIPM. Compared with Tikhonov regularization, PDIPM is more efficient, accurate, and robust in both the single and the consecutive impact force reconstruction.
Task-driven dictionary learning.
Mairal, Julien; Bach, Francis; Ponce, Jean
2012-04-01
Modeling data with linear combinations of a few elements from a learned dictionary has been the focus of much recent research in machine learning, neuroscience, and signal processing. For signals such as natural images that admit such sparse representations, it is now well established that these models are well suited to restoration tasks. In this context, learning the dictionary amounts to solving a large-scale matrix factorization problem, which can be done efficiently with classical optimization tools. The same approach has also been used for learning features from data for other purposes, e.g., image classification, but tuning the dictionary in a supervised way for these tasks has proven to be more difficult. In this paper, we present a general formulation for supervised dictionary learning adapted to a wide variety of tasks, and present an efficient algorithm for solving the corresponding optimization problem. Experiments on handwritten digit classification, digital art identification, nonlinear inverse image problems, and compressed sensing demonstrate that our approach is effective in large-scale settings, and is well suited to supervised and semi-supervised classification, as well as regression tasks for data that admit sparse representations.
An Efficient Scheme for Updating Sparse Cholesky Factors
NASA Technical Reports Server (NTRS)
Raghavan, Padma
2002-01-01
Raghavan had earlier developed the software package DSCPACK, which can be used for solving sparse linear systems where the coefficient matrix is symmetric and positive definite (this project was not funded by NASA but by agencies such as NSF). DSCPACK-S is the serial code and DSCPACK-P is a parallel implementation suitable for multiprocessors or networks of workstations with message passing using MPI. The main algorithm used is the Cholesky factorization of a sparse symmetric positive definite matrix A = LL^T. The code can also compute the factorization A = LDL^T. The complexity of the software arises from several factors relating to the sparsity of the matrix A. A sparse N x N matrix A typically has fewer than cN nonzeroes, where c is a small constant. If the matrix were dense, it would have O(N^2) nonzeroes. The most complicated part of such sparse Cholesky factorization relates to fill-in, i.e., zeroes in the original matrix that become nonzeroes in the factor L. An efficient implementation depends to a large extent on complex data structures and on techniques from graph theory to reduce, identify, and manage fill. DSCPACK is based on an efficient multifrontal implementation with fill-managing algorithms and implementation arising from earlier research by Raghavan and others. Sparse Cholesky factorization is typically a four-step process: (1) ordering to compute a fill-reducing numbering, (2) symbolic factorization to determine the nonzero structure of L, (3) numeric factorization to compute L, and (4) triangular solution to solve Ly = b and L^T x = y. The first two steps are symbolic and are performed using the graph of the matrix. The numeric factorization step is of dominant cost, and there are several schemes for improving performance by exploiting the nested and dense structure of groups of columns in the factor. The latter are aimed at better utilization of the cache-memory hierarchy on modern processors to prevent cache misses and provide execution rates (operations/second) that are close to the peak rates for dense matrix computations. Currently, DSCPACK is being used in an application at NASA directed by J. Newman and M. James. We propose the implementation of efficient schemes for updating the LL^T or LDL^T factors computed in DSCPACK-S to meet the computational requirements of their project. A brief description is provided in the next section.
Saravanan, Chandra; Shao, Yihan; Baer, Roi; Ross, Philip N; Head-Gordon, Martin
2003-04-15
A sparse matrix multiplication scheme with multiatom blocks is reported, a tool that can be very useful for developing linear-scaling methods with atom-centered basis functions. Compared to conventional element-by-element sparse matrix multiplication schemes, efficiency is gained by the use of the highly optimized basic linear algebra subroutines (BLAS). However, some sparsity is lost in the multiatom blocking scheme because these matrix blocks will in general contain negligible elements. As a result, there is an optimal block size that minimizes the CPU time by balancing these two effects. In calculations on linear alkanes, polyglycines, estane polymers, and water clusters, the optimal block size is found to be between 40 and 100 basis functions, where about 55-75% of machine peak performance was achieved on an IBM RS6000 workstation. In these calculations, the blocked sparse matrix multiplications can be 10 times faster than a standard element-by-element sparse matrix package.
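A toy sketch of the multiatom blocking idea: nonzero blocks are stored by block coordinates and combined with dense BLAS-backed multiplications (numpy's @), trading some stored zeros inside blocks for much faster arithmetic; the dict-of-blocks layout is an illustrative assumption, not the paper's data structure:

```python
def block_sparse_matmul(A_blocks, B_blocks):
    """A_blocks, B_blocks: {(block_row, block_col): dense ndarray}."""
    C = {}
    for (i, k), Ablk in A_blocks.items():
        for (kk, j), Bblk in B_blocks.items():
            if k == kk:
                # Each product is a dense BLAS gemm on a multiatom block.
                C[(i, j)] = C.get((i, j), 0.0) + Ablk @ Bblk
    return C
```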
Computing row and column counts for sparse QR and LU factorization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gilbert, John R.; Li, Xiaoye S.; Ng, Esmond G.
2001-01-01
We present algorithms to determine the number of nonzeros in each row and column of the factors of a sparse matrix, for both the QR factorization and the LU factorization with partial pivoting. The algorithms use only the nonzero structure of the input matrix, and run in time nearly linear in the number of nonzeros in that matrix. They may be used to set up data structures or schedule parallel operations in advance of the numerical factorization. The row and column counts we compute are upper bounds on the actual counts. If the input matrix is strong Hall and there is no coincidental numerical cancellation, the counts are exact for QR factorization and are the tightest bounds possible for LU factorization. These algorithms are based on our earlier work on computing row and column counts for sparse Cholesky factorization, plus an efficient method to compute the column elimination tree of a sparse matrix without explicitly forming the product of the matrix and its transpose.
Benzi, Michele; Evans, Thomas M.; Hamilton, Steven P.; ...
2017-03-05
Here, we consider hybrid deterministic-stochastic iterative algorithms for the solution of large, sparse linear systems. Starting from a convergent splitting of the coefficient matrix, we analyze various types of Monte Carlo acceleration schemes applied to the original preconditioned Richardson (stationary) iteration. We expect that these methods will have considerable potential for resiliency to faults when implemented on massively parallel machines. We also establish sufficient conditions for the convergence of the hybrid schemes, and we investigate different types of preconditioners including sparse approximate inverses. Numerical experiments on linear systems arising from the discretization of partial differential equations are presented.
Incomplete Sparse Approximate Inverses for Parallel Preconditioning
Anzt, Hartwig; Huckle, Thomas K.; Bräckle, Jürgen; ...
2017-10-28
In this study, we propose a new preconditioning method that can be seen as a generalization of block-Jacobi methods, or as a simplification of the sparse approximate inverse (SAI) preconditioners. The "Incomplete Sparse Approximate Inverses" (ISAI) is in particular efficient in the solution of sparse triangular linear systems of equations. Those arise, for example, in the context of incomplete factorization preconditioning. ISAI preconditioners can be generated via an algorithm providing fine-grained parallelism, which makes them attractive for hardware with a high concurrency level. Finally, in a study covering a large number of matrices, we identify the ISAI preconditioner as an attractive alternative to exact triangular solves in the context of incomplete factorization preconditioning.
Cheng, Yih-Chun; Tsai, Pei-Yun; Huang, Ming-Hao
2016-05-19
Low-complexity compressed sensing (CS) techniques for monitoring electrocardiogram (ECG) signals in wireless body sensor network (WBSN) are presented. The prior probability of ECG sparsity in the wavelet domain is first exploited. Then, variable orthogonal multi-matching pursuit (vOMMP) algorithm that consists of two phases is proposed. In the first phase, orthogonal matching pursuit (OMP) algorithm is adopted to effectively augment the support set with reliable indices and in the second phase, the orthogonal multi-matching pursuit (OMMP) is employed to rescue the missing indices. The reconstruction performance is thus enhanced with the prior information and the vOMMP algorithm. Furthermore, the computation-intensive pseudo-inverse operation is simplified by the matrix-inversion-free (MIF) technique based on QR decomposition. The vOMMP-MIF CS decoder is then implemented in 90 nm CMOS technology. The QR decomposition is accomplished by two systolic arrays working in parallel. The implementation supports three settings for obtaining 40, 44, and 48 coefficients in the sparse vector. From the measurement result, the power consumption is 11.7 mW at 0.9 V and 12 MHz. Compared to prior chip implementations, our design shows good hardware efficiency and is suitable for low-energy applications.
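The OMP phase at the heart of the decoder is compact enough to sketch. The Python below is a generic textbook OMP with a least-squares re-fit at each step; the paper's contribution replaces exactly this pseudo-inverse with a matrix-inversion-free QR formulation, which is not shown here.

```python
import numpy as np

# Generic orthogonal matching pursuit (OMP): greedily grow the support and
# re-fit all active coefficients by least squares at each step.

def omp(Phi, y, k):
    """Recover an (approximately) k-sparse x from y ~= Phi @ x."""
    residual = y.copy()
    support = []
    for _ in range(k):
        j = int(np.argmax(np.abs(Phi.T @ residual)))   # best-matching column
        if j not in support:
            support.append(j)
        coef, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        residual = y - Phi[:, support] @ coef          # orthogonal re-fit
    x = np.zeros(Phi.shape[1])
    x[support] = coef
    return x

rng = np.random.default_rng(0)
Phi = rng.standard_normal((64, 256))
x0 = np.zeros(256); x0[[10, 50, 200]] = [1.0, -0.5, 2.0]
print(np.flatnonzero(omp(Phi, Phi @ x0, 3)))           # recovers [10, 50, 200]
```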
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pinski, Peter; Riplinger, Christoph; Valeev, Edward F.; Neese, Frank
2015-07-21
In this work, a systematic infrastructure is described that formalizes concepts implicit in previous work and greatly simplifies computer implementation of reduced-scaling electronic structure methods. The key concept is sparse representation of tensors using chains of sparse maps between two index sets. Sparse map representation can be viewed as a generalization of compressed sparse row, a common representation of a sparse matrix, to tensor data. By combining few elementary operations on sparse maps (inversion, chaining, intersection, etc.), complex algorithms can be developed, illustrated here by a linear-scaling transformation of three-center Coulomb integrals based on our compact code library that implements sparse maps and operations on them. The sparsity of the three-center integrals arises from spatial locality of the basis functions and domain density fitting approximation. A novel feature of our approach is the use of differential overlap integrals computed in linear-scaling fashion for screening products of basis functions. Finally, a robust linear scaling domain based local pair natural orbital second-order Møller-Plesset (DLPNO-MP2) method is described based on the sparse map infrastructure that only depends on a minimal number of cutoff parameters that can be systematically tightened to approach 100% of the canonical MP2 correlation energy. With default truncation thresholds, DLPNO-MP2 recovers more than 99.9% of the canonical resolution of the identity MP2 (RI-MP2) energy while still showing a very early crossover with respect to the computational effort. Based on extensive benchmark calculations, relative energies are reproduced with an error of typically <0.2 kcal/mol. The efficiency of the local MP2 (LMP2) method can be drastically improved by carrying out the LMP2 iterations in a basis of pair natural orbitals. While the present work focuses on local electron correlation, it is of much broader applicability to computation with sparse tensors in quantum chemistry and beyond.
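The sparse-map abstraction can be illustrated with ordinary dictionaries. In the Python sketch below, a sparse map is a dict from source indices to sorted target-index lists (the same information CSR keeps in indptr/indices), and chaining and intersection reduce to set operations; all names are illustrative, not the authors' library API.

```python
# A sparse map from index set A to index set B records, for each a in A,
# the subset of B it connects to. Chaining composes maps; intersection
# combines two maps over the same index sets elementwise.

def chain(f, g):
    """Compose sparse maps: (f: A->B) chained with (g: B->C) gives A->C."""
    return {a: sorted({c for b in bs for c in g.get(b, ())})
            for a, bs in f.items()}

def intersect(f, g):
    """Elementwise intersection of two sparse maps over the same index sets."""
    return {a: sorted(set(f[a]) & set(g.get(a, ()))) for a in f}

# Hypothetical example: shells -> basis functions chained with
# basis functions -> atoms gives shells -> atoms.
shell_to_bf = {0: [0, 1], 1: [2]}
bf_to_atom = {0: [0], 1: [0], 2: [1]}
print(chain(shell_to_bf, bf_to_atom))   # {0: [0], 1: [1]}
```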
Systematic sparse matrix error control for linear scaling electronic structure calculations.
Rubensson, Emanuel H; Sałek, Paweł
2005-11-30
Efficient truncation criteria used in multiatom blocked sparse matrix operations for ab initio calculations are proposed. As system size increases, so does the need to control errors while still achieving high performance. A variant of a blocked sparse matrix algebra that achieves strict error control with good performance is proposed. The idea presented is that the condition for dropping a certain submatrix should depend not only on the magnitude of that particular submatrix, but also on which other submatrices are dropped. The decision to remove a certain submatrix is based on the contribution the removal would make to the error in the chosen norm. We study the effect of an accumulated truncation error in iterative algorithms like trace-correcting density matrix purification. One way to reduce the initial exponential growth of this error is presented. The presented error control for a sparse blocked matrix toolbox allows optimal performance to be achieved by performing only the operations needed to maintain the requested level of accuracy. Copyright 2005 Wiley Periodicals, Inc.
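A minimal version of norm-aware truncation is easy to state in code. The Python sketch below drops the smallest blocks first and stops as soon as the accumulated Frobenius-norm error would exceed the requested threshold; the dict-of-blocks layout and names are assumptions for illustration.

```python
import numpy as np

# Norm-aware truncation of a blocked sparse matrix: drop the smallest blocks
# first, and stop once the accumulated error in the Frobenius norm would
# exceed the requested threshold. (Frobenius contributions of disjoint
# blocks add in squares, so the accumulated error is cheap to track.)

def truncate_blocks(blocks, threshold):
    order = sorted(blocks, key=lambda key: np.linalg.norm(blocks[key]))
    err2 = 0.0
    kept = dict(blocks)
    for key in order:                        # smallest blocks first
        nrm2 = np.linalg.norm(blocks[key]) ** 2
        if err2 + nrm2 > threshold ** 2:
            break
        err2 += nrm2
        del kept[key]                        # safe to drop this block
    return kept
```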
NASA Astrophysics Data System (ADS)
Babbush, Ryan; Berry, Dominic W.; Sanders, Yuval R.; Kivlichan, Ian D.; Scherer, Artur; Wei, Annie Y.; Love, Peter J.; Aspuru-Guzik, Alán
2018-01-01
We present a quantum algorithm for the simulation of molecular systems that is asymptotically more efficient than all previous algorithms in the literature in terms of the main problem parameters. As in Babbush et al (2016 New Journal of Physics 18, 033032), we employ a recently developed technique for simulating Hamiltonian evolution using a truncated Taylor series to obtain logarithmic scaling with the inverse of the desired precision. The algorithm of this paper involves simulation under an oracle for the sparse, first-quantized representation of the molecular Hamiltonian known as the configuration interaction (CI) matrix. We construct and query the CI matrix oracle to allow for on-the-fly computation of molecular integrals in a way that is exponentially more efficient than classical numerical methods. Whereas second-quantized representations of the wavefunction require $\widetilde{O}(N)$ qubits, where $N$ is the number of single-particle spin-orbitals, the CI matrix representation requires $\widetilde{O}(\eta)$ qubits, where $\eta \ll N$ is the number of electrons in the molecule of interest. We show that the gate count of our algorithm scales at most as $\widetilde{O}(\eta^2 N^3 t)$.
Preconditioner-free Wiener filtering with a dense noise matrix
NASA Astrophysics Data System (ADS)
Huffenberger, Kevin M.
2018-05-01
This work extends the Elsner & Wandelt (2013) iterative method for efficient, preconditioner-free Wiener filtering to cases in which the noise covariance matrix is dense, but can be decomposed into a sum whose parts are sparse in convenient bases. The new method, which uses multiple messenger fields, reproduces Wiener-filter solutions for test problems, and we apply it to a case beyond the reach of the Elsner & Wandelt (2013) method. We compute the Wiener-filter solution for a simulated Cosmic Microwave Background (CMB) map that contains spatially varying, uncorrelated noise, isotropic 1/f noise, and large-scale horizontal stripes (like those caused by atmospheric noise). We discuss simple extensions that can filter contaminated modes or inverse-noise-filter the data. These techniques help to address complications in the noise properties of maps from current and future generations of ground-based Microwave Background experiments, like Advanced ACTPol, Simons Observatory, and CMB-S4.
An efficient sparse matrix multiplication scheme for the CYBER 205 computer
NASA Technical Reports Server (NTRS)
Lambiotte, Jules J., Jr.
1988-01-01
This paper describes the development of an efficient algorithm for computing the product of a matrix and vector on a CYBER 205 vector computer. The desire to provide software which allows the user to choose between the often conflicting goals of minimizing central processing unit (CPU) time or storage requirements has led to a diagonal-based algorithm in which one of four types of storage is selected for each diagonal. The candidate storage types employed were chosen to be efficient on the CYBER 205 for diagonals which have nonzero structure which is dense, moderately sparse, very sparse and short, or very sparse and long; however, for many densities, no diagonal type is most efficient with respect to both resource requirements, and a trade-off must be made. For each diagonal, an initialization subroutine estimates the CPU time and storage required for each storage type based on results from previously performed numerical experimentation. These requirements are adjusted by weights provided by the user which reflect the relative importance the user places on the two resources. The adjusted resource requirements are then compared to select the most efficient storage and computational scheme.
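The diagonal-based product is simple to sketch in modern terms. The Python below stores the matrix as (offset, values) pairs and performs one shifted elementwise multiply per diagonal, the long vector operations such machines favored; the per-diagonal choice among four storage types, which is the paper's main point, is not reproduced.

```python
import numpy as np

# Diagonal-based sparse matrix-vector product: the matrix is a list of
# (offset, values) diagonals, and each diagonal contributes one shifted
# elementwise product to the result.

def dia_matvec(n, diagonals, x):
    """diagonals: list of (offset k, ndarray of that diagonal's entries)."""
    y = np.zeros(n)
    for k, vals in diagonals:
        if k >= 0:   # superdiagonal: y[i] += d[i] * x[i + k]
            m = n - k
            y[:m] += vals[:m] * x[k:k + m]
        else:        # subdiagonal: entries start at row -k
            m = n + k
            y[-m:] += vals[:m] * x[:m]
    return y

# Tridiagonal example: 2 on the main diagonal, -1 on the off-diagonals.
n = 5
diags = [(0, 2.0 * np.ones(n)), (1, -np.ones(n - 1)), (-1, -np.ones(n - 1))]
print(dia_matvec(n, diags, np.arange(1.0, n + 1)))   # [0. 0. 0. 0. 6.]
```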
A space efficient flexible pivot selection approach to evaluate determinant and inverse of a matrix.
Jafree, Hafsa Athar; Imtiaz, Muhammad; Inayatullah, Syed; Khan, Fozia Hanif; Nizami, Tajuddin
2014-01-01
This paper presents new simple approaches for evaluating the determinant and inverse of a matrix. The choice of pivot has been kept arbitrary, so the approaches reduce the error while solving an ill-conditioned system. Computation of the determinant of a matrix has been made more efficient by avoiding unnecessary data storage and by reducing the order of the matrix at each iteration, while dictionary notation [1] has been incorporated for computing the matrix inverse, thereby saving unnecessary calculations. These algorithms are highly classroom oriented and easy for students to use and implement. By taking advantage of the flexibility in pivot selection, one may easily avoid the development of fractions in most cases. Unlike the matrix inversion methods of [2] and [3], the presented algorithms obviate the use of permutations and inverse permutations.
Visual Tracking Based on Extreme Learning Machine and Sparse Representation
Wang, Baoxian; Tang, Linbo; Yang, Jinglin; Zhao, Baojun; Wang, Shuigen
2015-01-01
The existing sparse representation-based visual trackers mostly suffer from being time consuming and from poor robustness. To address these issues, a novel tracking method is presented that combines sparse representation with an emerging learning technique, namely the extreme learning machine (ELM). Specifically, visual tracking can be divided into two consecutive processes. Firstly, ELM is utilized to find the optimal separating hyperplane between the target observations and background ones. Thus, the trained ELM classification function is able to remove most of the candidate samples related to background contents efficiently, thereby reducing the total computational cost of the following sparse representation. Secondly, to further combine ELM and sparse representation, the resultant confidence values (i.e., probabilities of being a target) of samples on the ELM classification function are used to construct a new manifold learning constraint term of the sparse representation framework, which tends to achieve more robust results. Moreover, the accelerated proximal gradient method is used for deriving the optimal solution (in matrix form) of the constrained sparse tracking model. Additionally, the matrix form solution allows the candidate samples to be calculated in parallel, thereby leading to a higher efficiency. Experiments demonstrate the effectiveness of the proposed tracker.
Comparing implementations of penalized weighted least-squares sinogram restoration.
Forthmann, Peter; Koehler, Thomas; Defrise, Michel; La Riviere, Patrick
2010-11-01
A CT scanner measures the energy that is deposited in each channel of a detector array by x rays that have been partially absorbed on their way through the object. The measurement process is complex and quantitative measurements are always and inevitably associated with errors, so CT data must be preprocessed prior to reconstruction. In recent years, the authors have formulated CT sinogram preprocessing as a statistical restoration problem in which the goal is to obtain the best estimate of the line integrals needed for reconstruction from the set of noisy, degraded measurements. The authors have explored both penalized Poisson likelihood (PL) and penalized weighted least-squares (PWLS) objective functions. At low doses, the authors found that the PL approach outperforms PWLS in terms of resolution-noise tradeoffs, but at standard doses they perform similarly. The PWLS objective function, being quadratic, is more amenable to computational acceleration than the PL objective. In this work, the authors develop and compare two different methods for implementing PWLS sinogram restoration with the hope of improving computational performance relative to PL in the standard-dose regime. Sinogram restoration is still significant in the standard-dose regime since it can still outperform standard approaches and it allows for correction of effects that are not usually modeled in standard CT preprocessing. The authors have explored and compared two implementation strategies for PWLS sinogram restoration: (1) A direct matrix-inversion strategy based on the closed-form solution to the PWLS optimization problem and (2) an iterative approach based on the conjugate-gradient algorithm. Obtaining optimal performance from each strategy required modifying the naive off-the-shelf implementations of the algorithms to exploit the particular symmetry and sparseness of the sinogram-restoration problem. For the closed-form approach, the authors subdivided the large matrix inversion into smaller coupled problems and exploited sparseness to minimize matrix operations. For the conjugate-gradient approach, the authors exploited sparseness and preconditioned the problem to speed up convergence. All methods produced qualitatively and quantitatively similar images as measured by resolution-variance tradeoffs and difference images. Despite the acceleration strategies, the direct matrix-inversion approach was found to be uncompetitive with iterative approaches, with a computational burden higher by an order of magnitude or more. The iterative conjugate-gradient approach, however, does appear promising, with computation times half that of the authors' previous penalized-likelihood implementation. Iterative conjugate-gradient based PWLS sinogram restoration with careful matrix optimizations has computational advantages over direct matrix PWLS inversion and over penalized-likelihood sinogram restoration and can be considered a good alternative in standard-dose regimes.
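The iterative strategy can be summarized compactly: the PWLS objective (y - Ax)'W(y - Ax) + beta x'Rx leads to the linear system (A'WA + beta R)x = A'Wy, which conjugate gradients solves using only sparse products. The Python sketch below uses small stand-in operators; the paper's exploitation of the sinogram problem's symmetry and its preconditioning are omitted.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# PWLS via conjugate gradients on the optimality system
# (A'WA + beta*R) x = A'W y, with sparse stand-ins for A, W, R.

n = 200
A = sp.identity(n, format="csr")                    # stand-in degradation model
W = sp.diags(np.random.uniform(0.5, 1.5, n))        # inverse-variance weights
R = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n))  # roughness penalty
beta, y = 0.1, np.random.randn(n)

H = (A.T @ W @ A + beta * R).tocsr()                # sparse SPD system matrix
x, info = spla.cg(H, A.T @ (W @ y))                 # info == 0 on convergence
```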
NASA Astrophysics Data System (ADS)
Quy Muoi, Pham; Nho Hào, Dinh; Sahoo, Sujit Kumar; Tang, Dongliang; Cong, Nguyen Huu; Dang, Cuong
2018-05-01
In this paper, we study a gradient-type method and a semismooth Newton method for minimization problems in regularizing inverse problems with nonnegative and sparse solutions. We propose a special penalty functional forcing the minimizers of the regularized minimization problems to be nonnegative and sparse, and then we apply the proposed algorithms to a practical problem. The strong convergence of the gradient-type method and the local superlinear convergence of the semismooth Newton method are proven. We then use these algorithms for the phase retrieval problem and illustrate their efficiency in numerical examples, particularly in the practical problem of optical imaging through scattering media, where all the noise from the experiment is present.
Multi scales based sparse matrix spectral clustering image segmentation
NASA Astrophysics Data System (ADS)
Liu, Zhongmin; Chen, Zhicai; Li, Zhanming; Hu, Wenjin
2018-04-01
In image segmentation, spectral clustering algorithms have to adopt an appropriate scaling parameter to calculate the similarity matrix between the pixels, which may have a great impact on the clustering result. Moreover, when the number of data instances is large, the computational complexity and memory use of the algorithm greatly increase. To solve these two problems, we propose a new spectral clustering image segmentation algorithm based on multiple scales and sparse matrices. We first devise a new feature extraction method and extract the features of the image on different scales; we then use the feature information to construct a sparse similarity matrix, which improves the operational efficiency. Compared with the traditional spectral clustering algorithm, image segmentation experiments show that our algorithm achieves better accuracy and robustness.
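The sparse-similarity idea can be sketched directly: keep only each point's nearest neighbours in the affinity matrix so the Laplacian eigensolve stays sparse. The Python below is a generic sparse spectral clustering on raw coordinates; the paper's multi-scale feature extraction is not reproduced, and sigma and the neighbour count are illustrative parameters.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla
from scipy.cluster.vq import kmeans2

# Generic spectral clustering with a sparse k-nearest-neighbour similarity.

def sparse_spectral_clustering(X, n_clusters=2, n_neighbors=10, sigma=1.0):
    n = X.shape[0]
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # squared distances
    W = sp.lil_matrix((n, n))
    for i in range(n):
        nn = np.argsort(D2[i])[1:n_neighbors + 1]         # skip self
        W[i, nn] = np.exp(-D2[i, nn] / (2.0 * sigma ** 2))
    W = W.tocsr()
    W = W.maximum(W.T)                                    # symmetrize
    L = sp.diags(np.asarray(W.sum(axis=1)).ravel()) - W   # graph Laplacian
    _, vecs = spla.eigsh(L, k=n_clusters, which="SM")     # smallest eigenpairs
    _, labels = kmeans2(vecs, n_clusters, minit="++")
    return labels
```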
Distorted Born iterative T-matrix method for inversion of CSEM data in anisotropic media
NASA Astrophysics Data System (ADS)
Jakobsen, Morten; Tveit, Svenn
2018-05-01
We present a direct iterative solution to the nonlinear controlled-source electromagnetic (CSEM) inversion problem in the frequency domain, which is based on a volume integral equation formulation of the forward modelling problem in anisotropic conductive media. Our vectorial nonlinear inverse scattering approach effectively replaces an ill-posed nonlinear inverse problem with a series of linear ill-posed inverse problems, for which efficient (regularized) solution methods already exist. The solution updates the dyadic Green's functions from the source to the scattering volume and from the scattering volume to the receivers after each iteration. The T-matrix approach of multiple scattering theory is used for efficient updating of all dyadic Green's functions after each linearized inversion step. This means that we have developed a T-matrix variant of the Distorted Born Iterative (DBI) method, which is often used in the acoustic and electromagnetic (medical) imaging communities as an alternative to contrast-source inversion. The main advantage of using the T-matrix approach in this context is that it eliminates the need to perform a full forward simulation at each iteration of the DBI method, which is known to be consistent with the Gauss-Newton method. The T-matrix allows for a natural domain decomposition, in the sense that a large model can be decomposed into an arbitrary number of domains that can be treated independently and in parallel. The T-matrix we use for efficient model updating is also independent of the source-receiver configuration, which could be an advantage when performing fast-repeat modelling and time-lapse inversion. The T-matrix is also compatible with the use of modern renormalization methods that can potentially help to reduce the sensitivity of the CSEM inversion results to the starting model. To illustrate the performance and potential of our T-matrix variant of the DBI method for CSEM inversion, we performed numerical experiments based on synthetic CSEM data associated with 2D VTI and 3D orthorhombic model inversions. The results of our numerical experiments suggest that the DBIT method for inversion of CSEM data in anisotropic media is both accurate and efficient.
On-Chip Neural Data Compression Based On Compressed Sensing With Sparse Sensing Matrices.
Zhao, Wenfeng; Sun, Biao; Wu, Tong; Yang, Zhi
2018-02-01
On-chip neural data compression is an enabling technique for wireless neural interfaces that suffer from insufficient bandwidth and power budgets to transmit the raw data. The data compression algorithm and its implementation should be power and area efficient and functionally reliable over different datasets. Compressed sensing is an emerging technique that has been applied to compress various neurophysiological data. However, the state-of-the-art compressed sensing (CS) encoders leverage random but dense binary measurement matrices, which incur substantial implementation costs on both power and area that could offset the benefits from the reduced wireless data rate. In this paper, we propose two CS encoder designs based on sparse measurement matrices that could lead to efficient hardware implementation. Specifically, two different approaches for the construction of sparse measurement matrices are exploited: the deterministic quasi-cyclic array code (QCAC) matrix and the sparse random binary matrix (SRBM). We demonstrate that the proposed CS encoders lead to comparable recovery performance, and efficient VLSI architecture designs are proposed for the QCAC-CS and SRBM encoders with reduced area and total power consumption.
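The hardware argument is visible in the construction itself: if every column of the measurement matrix carries a small fixed number d of ones, each input sample touches only d accumulators. The Python below builds such a generic d-sparse binary matrix; it illustrates the class, not the paper's QCAC or SRBM constructions.

```python
import numpy as np
import scipy.sparse as sp

# Generic d-sparse binary measurement matrix: each column has exactly d
# ones, so encoding y = Phi @ x costs only d additions per input sample.

def sparse_binary_matrix(m, n, d, seed=0):
    rng = np.random.default_rng(seed)
    rows = np.concatenate([rng.choice(m, size=d, replace=False)
                           for _ in range(n)])
    cols = np.repeat(np.arange(n), d)
    return sp.csr_matrix((np.ones(n * d), (rows, cols)), shape=(m, n))

Phi = sparse_binary_matrix(m=64, n=256, d=4)
x = np.zeros(256); x[[5, 80, 201]] = [1.0, -2.0, 0.5]   # sparse signal
y = Phi @ x                                             # cheap measurement
```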
Seismic data restoration with a fast L1 norm trust region method
NASA Astrophysics Data System (ADS)
Cao, Jingjie; Wang, Yanfei
2014-08-01
Seismic data restoration is a major strategy for providing a reliable wavefield when field data do not satisfy the Shannon sampling theorem. Recovery by sparsity-promoting inversion often yields sparse solutions of seismic data in a transformed domain; however, most methods for sparsity-promoting inversion are line-search methods, which are efficient but inclined to obtain local solutions. Using a trust region method, which can provide globally convergent solutions, is a good choice to overcome this shortcoming. A trust region method for sparse inversion has been proposed previously; however, its efficiency must be improved to be suitable for large-scale computation. In this paper, a new L1 norm trust region model is proposed for seismic data restoration, and a robust gradient projection method for solving the sub-problem is utilized. Numerical results on synthetic and field data demonstrate that the proposed trust region method achieves excellent computational speed and is a viable alternative for large-scale computation.
Efficient ICCG on a shared memory multiprocessor
NASA Technical Reports Server (NTRS)
Hammond, Steven W.; Schreiber, Robert
1989-01-01
Different approaches are discussed for exploiting parallelism in the ICCG (Incomplete Cholesky Conjugate Gradient) method for solving large sparse symmetric positive definite systems of equations on a shared memory parallel computer. Techniques for efficiently solving triangular systems and computing sparse matrix-vector products are explored. Three methods for scheduling the tasks in solving triangular systems are implemented on the Sequent Balance 21000. Sample problems that are representative of a large class of problems solved using iterative methods are used. We show that a static analysis to determine data dependences in the triangular solve can greatly improve its parallel efficiency. We also show that ignoring symmetry and storing the whole matrix can reduce solution time substantially.
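One such static analysis is level scheduling of the triangular solve: unknown i depends only on the unknowns appearing in row i's off-diagonal pattern, so rows can be grouped into levels that are mutually independent. A minimal Python sketch of the level computation, assuming a CSR lower-triangular factor, follows.

```python
import numpy as np

# Level scheduling for a sparse lower-triangular solve: rows sharing a
# level have no dependencies on one another and can be solved in parallel.

def levels_lower_triangular(L_csr):
    n = L_csr.shape[0]
    level = np.zeros(n, dtype=int)
    for i in range(n):
        deps = L_csr.indices[L_csr.indptr[i]:L_csr.indptr[i + 1]]
        deps = deps[deps < i]                 # off-diagonal dependencies
        if deps.size:
            level[i] = level[deps].max() + 1
    return level    # rows with the same level are mutually independent
```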
Parallel Preconditioning for CFD Problems on the CM-5
NASA Technical Reports Server (NTRS)
Simon, Horst D.; Kremenetsky, Mark D.; Richardson, John; Lasinski, T. A. (Technical Monitor)
1994-01-01
To date, preconditioning methods on massively parallel systems have faced a major difficulty. The preconditioning methods most successful at accelerating the convergence of the iterative solver, such as incomplete LU factorizations, are notoriously difficult to implement on parallel machines for two reasons: (1) the actual computation of the preconditioner is not very floating-point intensive, but requires a large amount of unstructured communication, and (2) the application of the preconditioning matrix in the iteration phase (i.e., triangular solves) is difficult to parallelize because of the recursive nature of the computation. Here we present a new approach to preconditioning for very large, sparse, unsymmetric, linear systems, which avoids both difficulties. We explicitly compute an approximate inverse of our original matrix. This new preconditioning matrix can be applied most efficiently for iterative methods on massively parallel machines, since the preconditioning phase involves only a matrix-vector multiplication, possibly with a dense matrix. Furthermore, the actual computation of the preconditioning matrix has natural parallelism. For a problem of size n, the preconditioning matrix can be computed by solving n independent small least squares problems. The algorithm and its implementation on the Connection Machine CM-5 are discussed in detail and supported by extensive timings obtained from real problem data.
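The construction is simple to sketch: fix a sparsity pattern for each column m_j of the approximate inverse and solve the small least-squares problem min ||A m_j - e_j|| over that pattern, independently for every column. The Python below reuses the pattern of A's own columns and assumes a nonzero diagonal; both are illustrative choices, not the paper's exact scheme.

```python
import numpy as np
import scipy.sparse as sp

# Sparse approximate inverse by n independent small least-squares problems.

def approximate_inverse(A_csc):
    n = A_csc.shape[0]
    M = sp.lil_matrix((n, n))
    for j in range(n):
        J = A_csc.indices[A_csc.indptr[j]:A_csc.indptr[j + 1]]  # pattern of m_j
        sub = A_csc[:, J]
        rows = np.unique(sub.indices)           # rows touched by the pattern
        Asub = sub[rows, :].toarray()
        e = (rows == j).astype(float)           # e_j restricted to those rows
        m, *_ = np.linalg.lstsq(Asub, e, rcond=None)
        for r, v in zip(J, m):
            M[r, j] = v
    return M.tocsr()
```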
Lim, Jun-Seok; Pang, Hee-Suk
2016-01-01
In this paper an l1-regularized recursive total least squares (RTLS) algorithm is considered for sparse system identification. Although recursive least squares (RLS) has been successfully applied in sparse system identification, the estimation performance of RLS-based algorithms becomes worse when both input and output are contaminated by noise (the error-in-variables problem). We propose an algorithm to handle the error-in-variables problem. The proposed l1-RTLS algorithm is an RLS-like iteration using l1 regularization. The proposed algorithm not only gives excellent performance but also reduces the required complexity through effective handling of the inversion matrix. Simulations demonstrate the superiority of the proposed l1-regularized RTLS for the sparse system identification setting.
Solving large sparse eigenvalue problems on supercomputers
NASA Technical Reports Server (NTRS)
Philippe, Bernard; Saad, Youcef
1988-01-01
An important problem in scientific computing consists in finding a few eigenvalues and corresponding eigenvectors of a very large and sparse matrix. The most popular methods to solve these problems are based on projection techniques on appropriate subspaces. The main attraction of these methods is that they only require the use of the matrix in the form of matrix by vector multiplications. The implementations on supercomputers of two such methods for symmetric matrices, namely Lanczos' method and Davidson's method, are compared. Since one of the most important operations in these two methods is the multiplication of vectors by the sparse matrix, methods of performing this operation efficiently are discussed. The advantages and the disadvantages of each method are compared and implementation aspects are discussed. Numerical experiments on a one processor CRAY 2 and CRAY X-MP are reported. Possible parallel implementations are also discussed.
Partitioning Rectangular and Structurally Nonsymmetric Sparse Matrices for Parallel Processing
DOE Office of Scientific and Technical Information (OSTI.GOV)
B. Hendrickson; T.G. Kolda
1998-09-01
A common operation in scientific computing is the multiplication of a sparse, rectangular or structurally nonsymmetric matrix and a vector. In many applications the matrix-transpose-vector product is also required. This paper addresses the efficient parallelization of these operations. We show that the problem can be expressed in terms of partitioning bipartite graphs. We then introduce several algorithms for this partitioning problem and compare their performance on a set of test matrices.
A performance study of sparse Cholesky factorization on INTEL iPSC/860
NASA Technical Reports Server (NTRS)
Zubair, M.; Ghose, M.
1992-01-01
The problem of Cholesky factorization of a sparse matrix has been very well investigated on sequential machines. A number of efficient codes exist for factorizing large unstructured sparse matrices. However, there is a lack of such efficient codes on parallel machines in general, and distributed machines in particular. Some of the issues that are critical to the implementation of sparse Cholesky factorization on a distributed memory parallel machine are ordering, partitioning and mapping, load balancing, and ordering of various tasks within a processor. Here, we focus on the effect of various partitioning schemes on the performance of sparse Cholesky factorization on the Intel iPSC/860. Also, a new partitioning heuristic for structured as well as unstructured sparse matrices is proposed, and its performance is compared with other schemes.
Highly parallel sparse Cholesky factorization
NASA Technical Reports Server (NTRS)
Gilbert, John R.; Schreiber, Robert
1990-01-01
Several fine grained parallel algorithms were developed and compared to compute the Cholesky factorization of a sparse matrix. The experimental implementations are on the Connection Machine, a distributed memory SIMD machine whose programming model conceptually supplies one processor per data element. In contrast to special purpose algorithms in which the matrix structure conforms to the connection structure of the machine, the focus is on matrices with arbitrary sparsity structure. The most promising algorithm is one whose inner loop performs several dense factorizations simultaneously on a 2-D grid of processors. Virtually any massively parallel dense factorization algorithm can be used as the key subroutine. The sparse code attains execution rates comparable to those of the dense subroutine. Although at present architectural limitations prevent the dense factorization from realizing its potential efficiency, it is concluded that a regular data parallel architecture can be used efficiently to solve arbitrarily structured sparse problems. A performance model is also presented and it is used to analyze the algorithms.
GPU-accelerated element-free reverse-time migration with Gauss points partition
NASA Astrophysics Data System (ADS)
Zhou, Zhen; Jia, Xiaofeng; Qiang, Xiaodong
2018-06-01
An element-free method (EFM) has been demonstrated successfully in elasticity, heat conduction and fatigue crack growth problems. We present the theory of EFM and its numerical applications in seismic modelling and reverse time migration (RTM). Compared with the finite difference method and the finite element method, the EFM has unique advantages: (1) independence of grids in computation and (2) lower expense and more flexibility (because only the information of the nodes and the boundary of the concerned area is required). However, in EFM, due to improper computation and storage of some large sparse matrices, such as the mass matrix and the stiffness matrix, the method is difficult to apply to seismic modelling and RTM for a large velocity model. To solve the problem of storage and computation efficiency, we propose a concept of Gauss points partition and utilise the graphics processing unit to improve the computational efficiency. We employ the compressed sparse row format to compress the intermediate large sparse matrices and attempt to simplify the operations by solving the linear equations with CULA solver. To improve the computation efficiency further, we introduce the concept of the lumped mass matrix. Numerical experiments indicate that the proposed method is accurate and more efficient than the regular EFM.
Efficient Implementations of the Quadrature-Free Discontinuous Galerkin Method
NASA Technical Reports Server (NTRS)
Lockard, David P.; Atkins, Harold L.
1999-01-01
The efficiency of the quadrature-free form of the discontinuous Galerkin method in two dimensions, and briefly in three dimensions, is examined. Most of the work for constant-coefficient, linear problems involves the volume and edge integrations, and the transformation of information from the volume to the edges. These operations can be viewed as matrix-vector multiplications. Many of the matrices are sparse as a result of symmetry, and blocking and specialized multiplication routines are used to account for the sparsity. By optimizing these operations, a 35% reduction in total CPU time is achieved. For nonlinear problems, the calculation of the flux becomes dominant because of the cost associated with polynomial products and inversion. This component of the work can be reduced by up to 75% when the products are approximated by truncating terms. Because the cost is high for nonlinear problems on general elements, it is suggested that simplified physics and the most efficient element types be used over most of the domain.
Solving large tomographic linear systems: size reduction and error estimation
NASA Astrophysics Data System (ADS)
Voronin, Sergey; Mikesell, Dylan; Slezak, Inna; Nolet, Guust
2014-10-01
We present a new approach to reduce a sparse, linear system of equations associated with tomographic inverse problems. We begin by making a modification to the commonly used compressed sparse-row format, whereby our format is tailored to the sparse structure of finite-frequency (volume) sensitivity kernels in seismic tomography. Next, we cluster the sparse matrix rows to divide a large matrix into smaller subsets representing ray paths that are geographically close. Singular value decomposition of each subset allows us to project the data onto a subspace associated with the largest eigenvalues of the subset. After projection we reject those data that have a signal-to-noise ratio (SNR) below a chosen threshold. Clustering in this way assures that the sparse nature of the system is minimally affected by the projection. Moreover, our approach allows for a precise estimation of the noise affecting the data while also giving us the ability to identify outliers. We illustrate the method by reducing large matrices computed for global tomographic systems with cross-correlation body wave delays, as well as with surface wave phase velocity anomalies. For a massive matrix computed for 3.7 million Rayleigh wave phase velocity measurements, imposing a threshold of 1 for the SNR, we condensed the matrix size from 1103 to 63 Gbyte. For a global data set of multiple-frequency P wave delays from 60 well-distributed deep earthquakes we obtain a reduction to 5.9 per cent. This type of reduction allows one to avoid loss of information due to underparametrizing models. Alternatively, if data have to be rejected to fit the system into computer memory, it assures that the most important data are preserved.
Xuan, Junyu; Lu, Jie; Zhang, Guangquan; Xu, Richard Yi Da; Luo, Xiangfeng
2018-05-01
Sparse nonnegative matrix factorization (SNMF) aims to factorize a data matrix into two optimized nonnegative sparse factor matrices, which could benefit many tasks, such as document-word co-clustering. However, the traditional SNMF typically assumes the number of latent factors (i.e., the dimensionality of the factor matrices) to be fixed. This assumption makes it inflexible in practice. In this paper, we propose a doubly sparse nonparametric NMF framework to mitigate this issue by using dependent Indian buffet processes (dIBP). We apply a correlation function for the generation of two stick weights associated with each column pair of the factor matrices while still maintaining their respective marginal distributions specified by IBP. As a consequence, the generation of the two factor matrices will be columnwise correlated. Under this framework, two classes of correlation function are proposed: 1) using the bivariate Beta distribution and 2) using the Copula function. Compared with single-IBP-based NMF, this work jointly makes the two factor matrices nonparametric and sparse, so it can be applied to broader scenarios, such as co-clustering. The proposed framework is much more flexible than Gaussian process-based and hierarchical Beta process-based dIBPs in terms of allowing the two corresponding binary matrix columns to have greater variations in their nonzero entries. Our experiments on synthetic data show the merits of this framework compared with state-of-the-art models in terms of factorization efficiency, sparsity, and flexibility. Experiments on real-world data sets demonstrate its efficiency in document-word co-clustering tasks.
An exact formulation of the time-ordered exponential using path-sums
DOE Office of Scientific and Technical Information (OSTI.GOV)
Giscard, P.-L., E-mail: p.giscard1@physics.ox.ac.uk; Lui, K.; Thwaite, S. J.
2015-05-15
We present the path-sum formulation for the time-ordered exponential of a time-dependent matrix. The path-sum formulation gives the time-ordered exponential as a branched continued fraction of finite depth and breadth. The terms of the path-sum have an elementary interpretation as self-avoiding walks and self-avoiding polygons on a graph. Our result is based on a representation of the time-ordered exponential as the inverse of an operator, the mapping of this inverse to sums of walks on a graph, and the algebraic structure of sets of walks. We give examples demonstrating our approach. We establish a super-exponential decay bound for the magnitude of the entries of the time-ordered exponential of sparse matrices. We give explicit results for matrices with commonly encountered sparse structures.
NASA Astrophysics Data System (ADS)
Hu, Guiqiang; Xiao, Di; Wang, Yong; Xiang, Tao; Zhou, Qing
2017-11-01
Recently, a new kind of image encryption approach using compressive sensing (CS) and double random phase encoding has received much attention due to advantages such as compressibility and robustness. However, this approach is found to be vulnerable to chosen plaintext attack (CPA) if the CS measurement matrix is re-used. Therefore, designing an efficient measurement matrix updating mechanism that ensures resistance to CPA is of practical significance. In this paper, we provide a novel solution to update the CS measurement matrix by altering the secret sparse basis with the help of counter mode operation. Particularly, the secret sparse basis is implemented by a reality-preserving fractional cosine transform matrix. Compared with the conventional CS-based cryptosystem that generates all the random entries of the measurement matrix anew, our scheme offers superior efficiency while guaranteeing resistance to CPA. Experimental and analysis results show that the proposed scheme has good security performance and robustness against noise and occlusion.
A Fast Gradient Method for Nonnegative Sparse Regression With Self-Dictionary
NASA Astrophysics Data System (ADS)
Gillis, Nicolas; Luce, Robert
2018-01-01
A nonnegative matrix factorization (NMF) can be computed efficiently under the separability assumption, which asserts that all the columns of the given input data matrix belong to the cone generated by a (small) subset of them. The provably most robust methods to identify these conic basis columns are based on nonnegative sparse regression and self dictionaries, and require the solution of large-scale convex optimization problems. In this paper we study a particular nonnegative sparse regression model with self dictionary. As opposed to previously proposed models, this model yields a smooth optimization problem where the sparsity is enforced through linear constraints. We show that the Euclidean projection on the polyhedron defined by these constraints can be computed efficiently, and propose a fast gradient method to solve our model. We compare our algorithm with several state-of-the-art methods on synthetic data sets and real-world hyperspectral images.
Strategies for vectorizing the sparse matrix vector product on the CRAY XMP, CRAY 2, and CYBER 205
NASA Technical Reports Server (NTRS)
Bauschlicher, Charles W., Jr.; Partridge, Harry
1987-01-01
Large, randomly sparse matrix vector products are important in a number of applications in computational chemistry, such as matrix diagonalization and the solution of simultaneous equations. Vectorization of this process is considered for the CRAY XMP, CRAY 2, and CYBER 205, using a matrix of dimension of 20,000 with from 1 percent to 6 percent nonzeros. Efficient scatter/gather capabilities add coding flexibility and yield significant improvements in performance. For the CYBER 205, it is shown that minor changes in the I/O can reduce the CPU time by a factor of 50. Similar changes in the CRAY codes make a far smaller improvement.
Ravishankar, Saiprasad; Nadakuditi, Raj Rao; Fessler, Jeffrey A
2017-12-01
The sparsity of signals in a transform domain or dictionary has been exploited in applications such as compression, denoising and inverse problems. More recently, data-driven adaptation of synthesis dictionaries has shown promise compared to analytical dictionary models. However, dictionary learning problems are typically non-convex and NP-hard, and the usual alternating minimization approaches for these problems are often computationally expensive, with the computations dominated by the NP-hard synthesis sparse coding step. This paper exploits the ideas that drive algorithms such as K-SVD, and investigates in detail efficient methods for aggregate sparsity penalized dictionary learning by first approximating the data with a sum of sparse rank-one matrices (outer products) and then using a block coordinate descent approach to estimate the unknowns. The resulting block coordinate descent algorithms involve efficient closed-form solutions. Furthermore, we consider the problem of dictionary-blind image reconstruction, and propose novel and efficient algorithms for adaptive image reconstruction using block coordinate descent and sum of outer products methodologies. We provide a convergence study of the algorithms for dictionary learning and dictionary-blind image reconstruction. Our numerical experiments show the promising performance and speedups provided by the proposed methods over previous schemes in sparse data representation and compressed sensing-based image reconstruction.
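One pass of the sum-of-outer-products block coordinate descent is short enough to sketch: with the other atoms held fixed, the sparse coefficient row has a closed-form hard-thresholding update and the unit-norm atom a closed-form scaling update. The Python below is a generic rendering of this scheme; variable names and the threshold rule are illustrative.

```python
import numpy as np

# One pass of a sum-of-outer-products (SOUP) block coordinate descent for
# dictionary learning: Y is approximated by sum_j d_j c_j', and each atom
# d_j and its coefficient column c_j are updated in closed form while the
# other atoms stay fixed. The hard threshold lam enforces sparsity.

def soup_pass(Y, D, C, lam):
    """Y: (n, N) data; D: (n, J) unit-norm atoms; C: (N, J) coefficients."""
    for j in range(D.shape[1]):
        R = Y - D @ C.T + np.outer(D[:, j], C[:, j])   # residual without atom j
        c = R.T @ D[:, j]
        c[np.abs(c) < lam] = 0.0                       # sparse coefficient update
        d = R @ c
        nrm = np.linalg.norm(d)
        if nrm > 0:
            D[:, j] = d / nrm                          # unit-norm atom update
        C[:, j] = c
    return D, C
```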
NASA Astrophysics Data System (ADS)
Bustamam, A.; Ulul, E. D.; Hura, H. F. A.; Siswantining, T.
2017-07-01
Hierarchical clustering is one of the effective methods for creating a phylogenetic tree based on the distance matrix between DNA (deoxyribonucleic acid) sequences. One of the well-known methods to calculate the distance matrix is the k-mer method. Generally, k-mer is more efficient than some other distance matrix calculation techniques. The steps of the k-mer method start from creating the k-mer sparse matrix, followed by creating the k-mer singular value vectors. The last step is computing the distance amongst the vectors. In this paper, we analyze the sequences of MERS-CoV (Middle East Respiratory Syndrome - Coronavirus) DNA by implementing hierarchical clustering using the k-mer sparse matrix in order to perform the phylogenetic analysis. Our results show that the ancestor of our MERS-CoV samples comes from Egypt. Moreover, we found that a MERS-CoV infection that occurs in one country may not necessarily come from the same country of origin. This suggests that the process of MERS-CoV mutation might not only be influenced by geographical factors.
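The pipeline reduces to three steps that fit in a short script: count k-mers per sequence (a matrix that is sparse for realistic k; dense here for clarity), compute pairwise distances between the count vectors, and cluster hierarchically. The Python sketch below uses toy sequences in place of MERS-CoV genomes.

```python
import numpy as np
from itertools import product
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import linkage

# The k-mer pipeline in miniature: counts -> distances -> tree.

def kmer_counts(seq, k=3, alphabet="ACGT"):
    index = {"".join(p): i for i, p in enumerate(product(alphabet, repeat=k))}
    v = np.zeros(len(index))
    for i in range(len(seq) - k + 1):
        v[index[seq[i:i + k]]] += 1
    return v

seqs = ["ACGTACGTAA", "ACGTACGTTA", "TTTTGGGGCC"]    # hypothetical sequences
X = np.vstack([kmer_counts(s) for s in seqs])
Z = linkage(pdist(X), method="average")
# Z encodes the tree; scipy.cluster.hierarchy.dendrogram(Z) would draw it.
```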
NASA Astrophysics Data System (ADS)
Ghale, Purnima; Johnson, Harley T.
2018-06-01
We present an efficient sparse matrix-vector (SpMV) based method to compute the density matrix P from a given Hamiltonian in electronic structure computations. Our method is a hybrid approach based on Chebyshev-Jackson approximation theory and matrix purification methods like the second order spectral projection purification (SP2). Recent methods to compute the density matrix scale as O(N) in the number of floating point operations but are accompanied by large memory and communication overhead, and they are based on iterative use of the sparse matrix-matrix multiplication kernel (SpGEMM), which is known to be computationally irregular. In addition to irregularity in the sparse Hamiltonian H, the nonzero structure of intermediate estimates of P depends on products of H and evolves over the course of computation. On the other hand, an expansion of the density matrix P in terms of Chebyshev polynomials is straightforward and SpMV based; however, the resulting density matrix may not satisfy the required constraints exactly. In this paper, we analyze the strengths and weaknesses of the Chebyshev-Jackson polynomials and the second order spectral projection purification (SP2) method, and propose to combine them so that the accurate density matrix can be computed using the SpMV computational kernel only, and without having to store the density matrix P. Our method accomplishes these objectives by using the Chebyshev polynomial estimate as the initial guess for SP2, which is followed by using sparse matrix-vector multiplications (SpMVs) to replicate the behavior of the SP2 algorithm for purification. We demonstrate the method on a tight-binding model system of an oxide material containing more than 3 million atoms. In addition, we also present the predicted behavior of our method when applied to near-metallic Hamiltonians with a wide energy spectrum.
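For reference, the SP2 recursion itself is only a few lines once the spectrum of H has been mapped into [0, 1]. The Python below shows the plain matrix-matrix form on a small dense Hamiltonian, with the spectral bounds assumed known; the paper's contribution is precisely to reproduce this behaviour with SpMV operations instead, which is not shown here.

```python
import numpy as np

# Plain matrix-matrix SP2 purification: map the spectrum of H into [0, 1],
# then repeatedly apply X <- X^2 or X <- 2X - X^2, chosen so that trace(X)
# is steered toward the electron count n_occ.

def sp2(H, n_occ, eps_min, eps_max, iters=50):
    n = H.shape[0]
    X = (eps_max * np.eye(n) - H) / (eps_max - eps_min)  # spectrum into [0, 1]
    for _ in range(iters):
        Xsq = X @ X
        if np.trace(X) > n_occ:
            X = Xsq              # pushes eigenvalues toward 0, lowers trace
        else:
            X = 2 * X - Xsq      # pushes eigenvalues toward 1, raises trace
    return X                     # converges to the density matrix P
```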
Aguilar, I; Misztal, I; Legarra, A; Tsuruta, S
2011-12-01
Genomic evaluations can be calculated using a unified procedure that combines phenotypic, pedigree and genomic information. Implementation of such a procedure requires the inverse of the relationship matrix based on pedigree and genomic relationships. The objective of this study was to investigate efficient computing options to create relationship matrices based on genomic markers and pedigree information as well as their inverses. SNP marker information was simulated for a panel of 40K SNPs, with the number of genotyped animals up to 30 000. Matrix multiplication in the computation of the genomic relationship matrix was implemented by a simple 'do' loop, by two optimized versions of the loop, and by a specific matrix multiplication subroutine. Inversion was by a generalized inverse algorithm and by a LAPACK subroutine. With the most efficient choices and parallel processing, creation of matrices for 30 000 animals would take a few hours. Matrices required to implement a unified approach can be computed efficiently. Optimizations can be either by modifications of existing code or by the use of efficient automatic optimizations provided by open source or third-party libraries. © 2011 Blackwell Verlag GmbH.
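The dominant computation is a dense matrix product, so the efficient options all amount to routing it through optimized BLAS. A minimal Python sketch of a VanRaden-style genomic relationship matrix on simulated genotypes follows; the panel size, the blending constant for invertibility, and all names are illustrative assumptions.

```python
import numpy as np

# Genomic relationship matrix G = ZZ' / (2 * sum p(1-p)) on simulated
# genotypes; numpy's matmul dispatches the dominant product ZZ' to BLAS,
# the kind of optimization the study quantifies.

rng = np.random.default_rng(0)
n_animals, n_snps = 1000, 4000
M = rng.integers(0, 3, size=(n_animals, n_snps)).astype(float)  # 0/1/2 genotypes
p = M.mean(axis=0) / 2.0                                        # allele frequencies
Z = M - 2.0 * p                                                 # centred genotypes
G = (Z @ Z.T) / (2.0 * np.sum(p * (1.0 - p)))                   # genomic relationships
G_inv = np.linalg.inv(G + 0.01 * np.eye(n_animals))             # blended for stability
```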
NASA Astrophysics Data System (ADS)
Yang, Yongchao; Nagarajaiah, Satish
2016-06-01
Randomly missing data in structural vibration response time histories often occur in structural dynamics and health monitoring. For example, structural vibration responses are often corrupted by outliers or erroneous measurements due to sensor malfunction; in wireless sensing platforms, data loss during wireless communication is a common issue. Besides, to alleviate the wireless data sampling or communication burden, certain amounts of data are often discarded during sampling or before transmission. In these and other applications, recovery of the randomly missing structural vibration responses from the available, incomplete data is essential for system identification and structural health monitoring; it is, however, an ill-posed inverse problem. This paper explicitly harnesses the structure of the data itself, that of the structural vibration responses, to address this inverse problem. What is relevant is an empirical, but often practically true, observation: typically only a few modes are active in the structural vibration responses; hence there is a sparse representation (in the frequency domain) of the single-channel data vector, or a low-rank structure (by singular value decomposition) of the multi-channel data matrix. Exploiting such prior knowledge of the data structure (intra-channel sparse or inter-channel low-rank), the new theories of ℓ1-minimization sparse recovery and nuclear-norm-minimization low-rank matrix completion enable recovery of the randomly missing or corrupted structural vibration response data. The performance of these two alternatives, in terms of recovery accuracy and computational time under different data missing rates, is investigated on a few structural vibration response data sets: the seismic responses of the super high-rise Canton Tower and the structural health monitoring accelerations of a real large-scale cable-stayed bridge. Encouraging results are obtained, and the applicability and limitations of the presented methods are discussed.
Inversion Of Jacobian Matrix For Robot Manipulators
NASA Technical Reports Server (NTRS)
Fijany, Amir; Bejczy, Antal K.
1989-01-01
Report discusses inversion of the Jacobian matrix for a class of six-degree-of-freedom arms with a spherical wrist, i.e., with the last three joints intersecting. Shows that by taking advantage of the simple geometry of such arms, the closed-form solution of Q = J^(-1)X, which represents the linear transformation from task space to joint space, is obtained efficiently. Presents solutions for the PUMA arm, the JPL/Stanford arm, and a six-revolute-joint coplanar arm, along with all singular points. The main contribution of the paper is to show that the simple geometry of this type of arm can be exploited to perform the inverse transformation without any need to compute the Jacobian or its inverse explicitly. An implication of this computational efficiency is that advanced task-space control schemes for spherical-wrist arms can be implemented more efficiently.
NASA Astrophysics Data System (ADS)
Chang, Yong; Zi, Yanyang; Zhao, Jiyuan; Yang, Zhe; He, Wangpeng; Sun, Hailiang
2017-03-01
In guided wave pipeline inspection, echoes reflected from closely spaced reflectors generally overlap, meaning useful information is lost. To solve the overlapping problem, sparse deconvolution methods have been developed in the past decade. However, conventional sparse deconvolution methods have limitations in handling guided wave signals, because the input signal is directly used as the prototype of the convolution matrix, without considering the waveform change caused by the dispersion properties of the guided wave. In this paper, an adaptive sparse deconvolution (ASD) method is proposed to overcome these limitations. First, the Gaussian echo model is employed to adaptively estimate the column prototype of the convolution matrix instead of directly using the input signal as the prototype. Then, the convolution matrix is constructed upon the estimated results. Third, the split augmented Lagrangian shrinkage (SALSA) algorithm is introduced to solve the deconvolution problem with high computational efficiency. To verify the effectiveness of the proposed method, guided wave signals obtained from pipeline inspection are investigated numerically and experimentally. Compared to conventional sparse deconvolution methods, e.g. the l1-norm deconvolution method, the proposed method shows better performance in handling the echo overlap problem in the guided wave signal.
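The setup can be sketched end to end: a parametric Gaussian echo provides the prototype column of the convolution matrix, and the l1-regularized deconvolution is then a standard sparse recovery problem. In the Python below the echo parameters are fixed rather than estimated (the "adaptive" step is omitted), ISTA serves as a simpler stand-in for the SALSA solver, and all parameter values are made up.

```python
import numpy as np

# Sparse deconvolution of overlapping echoes: build A from shifted copies of
# a Gaussian echo prototype, then solve min ||y - Ax||^2 + lam*||x||_1.

def gaussian_echo(t, tau, fc=5e4, bw=2e4):
    return np.exp(-((bw * (t - tau)) ** 2)) * np.cos(2 * np.pi * fc * (t - tau))

n, fs = 400, 1e6
t = np.arange(n) / fs
proto = gaussian_echo(t, t[n // 2])
A = np.column_stack([np.roll(proto, k - n // 2) for k in range(n)])  # wraps at edges

x_true = np.zeros(n); x_true[[120, 150]] = [1.0, 0.8]   # two overlapping echoes
y = A @ x_true + 0.01 * np.random.randn(n)

lam = 0.1
step = 1.0 / np.linalg.norm(A, 2) ** 2                  # 1 / Lipschitz constant
x = np.zeros(n)
for _ in range(300):                                    # ISTA iterations
    g = x + step * (A.T @ (y - A @ x))                  # gradient step
    x = np.sign(g) * np.maximum(np.abs(g) - step * lam, 0.0)  # soft threshold
```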
Wei, Jianing; Bouman, Charles A; Allebach, Jan P
2014-05-01
Many imaging applications require the implementation of space-varying convolution for accurate restoration and reconstruction of images. Here, we use the term space-varying convolution to refer to linear operators whose impulse response has slow spatial variation. In addition, these space-varying convolution operators are often dense, so direct implementation of the convolution operator is typically computationally impractical. One such example is the problem of stray light reduction in digital cameras, which requires the implementation of a dense space-varying deconvolution operator. However, other inverse problems, such as iterative tomographic reconstruction, can also depend on the implementation of dense space-varying convolution. While space-invariant convolution can be efficiently implemented with the fast Fourier transform, this approach does not work for space-varying operators. So direct convolution is often the only option for implementing space-varying convolution. In this paper, we develop a general approach to the efficient implementation of space-varying convolution, and demonstrate its use in the application of stray light reduction. Our approach, which we call matrix source coding, is based on lossy source coding of the dense space-varying convolution matrix. Importantly, by coding the transformation matrix, we not only reduce the memory required to store it; we also dramatically reduce the computation required to implement matrix-vector products. Our algorithm is able to reduce computation by approximately factoring the dense space-varying convolution operator into a product of sparse transforms. Experimental results show that our method can dramatically reduce the computation required for stray light reduction while maintaining high accuracy.
Newmark-Beta-FDTD method for super-resolution analysis of time reversal waves
NASA Astrophysics Data System (ADS)
Shi, Sheng-Bing; Shao, Wei; Ma, Jing; Jin, Congjun; Wang, Xiao-Hua
2017-09-01
In this work, a new unconditionally stable finite-difference time-domain (FDTD) method with the split-field perfectly matched layer (PML) is proposed for the analysis of time reversal (TR) waves. The proposed method is very suitable for multiscale problems involving microstructures. The spatial and temporal derivatives in this method are discretized by the central difference technique and the Newmark-Beta algorithm, respectively, and the derivation results in the calculation of a banded-sparse matrix equation. Since the coefficient matrix remains unchanged during the whole simulation process, the lower-upper (LU) decomposition of the matrix needs to be performed only once at the beginning of the calculation. Moreover, the reverse Cuthill-McKee (RCM) technique, an effective preprocessing technique for bandwidth compression of sparse matrices, is used to improve computational efficiency. The super-resolution focusing of TR wave propagation in two- and three-dimensional spaces is included to validate the accuracy and efficiency of the proposed method.
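The factor-once/solve-many pattern plus bandwidth-reducing reordering described above is easy to reproduce with standard sparse tools. The sketch below is not the authors' code; the matrix and right-hand sides are placeholders. It applies reverse Cuthill-McKee and a single LU factorization that is reused at every time step.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.csgraph import reverse_cuthill_mckee
from scipy.sparse.linalg import splu

n = 2000
A = sp.diags([1.0, -4.0, 1.0], [-1, 0, 1], shape=(n, n)).tocsr()  # placeholder banded matrix

perm = reverse_cuthill_mckee(A, symmetric_mode=True)   # RCM bandwidth compression
A_p = A[perm, :][:, perm].tocsc()
lu = splu(A_p)                                         # LU factorization, done once

for step in range(100):                                # time marching
    rhs = np.random.rand(n)                            # stands in for the FDTD update RHS
    x_p = lu.solve(rhs[perm])                          # only triangular solves per step
    x = np.empty_like(x_p)
    x[perm] = x_p                                      # undo the RCM permutation
```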
NASA Astrophysics Data System (ADS)
Jahandari, H.; Farquharson, C. G.
2017-11-01
Unstructured grids enable representing arbitrary structures more accurately and with fewer cells compared to regular structured grids. These grids also allow more efficient refinements compared to rectilinear meshes. In this study, tetrahedral grids are used for the inversion of magnetotelluric (MT) data, which allows for the direct inclusion of topography in the model, for constraining an inversion using a wireframe-based geological model and for local refinement at the observation stations. A minimum-structure method with an iterative model-space Gauss-Newton algorithm for optimization is used. An iterative solver is employed for solving the normal system of equations at each Gauss-Newton step and the sensitivity matrix-vector products that are required by this solver are calculated using pseudo-forward problems. This method alleviates the need to explicitly form the Hessian or Jacobian matrices, which significantly reduces the memory required for the computation. Forward problems are formulated using an edge-based finite-element approach and a sparse direct solver is used for the solutions. This solver allows saving and re-using the factorization of matrices for similar pseudo-forward problems within a Gauss-Newton iteration, which greatly reduces the computation time. Two examples are presented to show the capability of the algorithm: the first example uses a benchmark model while the second example represents a realistic geological setting with topography and a sulphide deposit. The data that are inverted are the full-tensor impedance and the magnetic transfer function vector. The inversions sufficiently recovered the models and reproduced the data, which shows the effectiveness of unstructured grids for complex and realistic MT inversion scenarios. The first example is also used to demonstrate the computational efficiency of the presented model-space method by comparison with its data-space counterpart.
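The matrix-free Gauss-Newton idea in this abstract, never forming J explicitly and only applying it and its transpose through (pseudo-)forward and adjoint solves, can be sketched as follows. The callables Jv and JTv stand in for the solver-based products; the regularization operator Wm and all names are illustrative, not from the paper's code.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

def gauss_newton_step(Jv, JTv, residual, Wm, beta, n_params):
    """One model-space Gauss-Newton step without forming the Jacobian.
    Jv(v)  : J @ v, computed with a pseudo-forward solve.
    JTv(w) : J.T @ w, computed with an adjoint solve.
    Solves (J^T J + beta * Wm^T Wm) dm = -J^T r by conjugate gradients."""
    def normal_matvec(v):
        return JTv(Jv(v)) + beta * (Wm.T @ (Wm @ v))
    N = LinearOperator((n_params, n_params), matvec=normal_matvec)
    dm, info = cg(N, -JTv(residual), maxiter=200)
    return dm
```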
NASA Astrophysics Data System (ADS)
Ma, Sangback
In this paper we compare various parallel preconditioners, such as Point-SSOR (Symmetric Successive OverRelaxation), ILU(0) (Incomplete LU) in the Wavefront ordering, ILU(0) in the Multi-Color ordering, Multi-Color Block SOR (Successive OverRelaxation), SPAI (SParse Approximate Inverse) and pARMS (Parallel Algebraic Recursive Multilevel Solver), for solving large sparse linear systems arising from two-dimensional PDEs (Partial Differential Equations) on structured grids. Point-SSOR is well known, and ILU(0) is one of the most popular preconditioners, but it is inherently serial. ILU(0) in the Wavefront ordering maximizes the parallelism in the natural order, but the lengths of the wavefronts are often nonuniform. ILU(0) in the Multi-Color ordering is a simple way of achieving parallelism of order N, where N is the order of the matrix, but its convergence rate often deteriorates compared to that of the natural ordering. We have chosen the Multi-Color Block SOR preconditioner combined with a direct sparse matrix solver, since for the Laplacian matrix the SOR method is known to have a nondeteriorating rate of convergence when used with the Multi-Color ordering. By using a block version we expect to minimize interprocessor communications. SPAI computes the sparse approximate inverse directly by a least squares method. Finally, ARMS is a preconditioner that recursively exploits the concept of independent sets, and pARMS is the parallel version of ARMS. Experiments were conducted for Finite Difference and Finite Element discretizations of five two-dimensional PDEs with large mesh sizes up to a million on an IBM p595 machine with distributed memory. Our matrices are real positive, i.e., the real parts of their eigenvalues are positive. We have used GMRES(m) as our outer iterative method, so that the convergence of GMRES(m) for our test matrices is mathematically guaranteed. Interprocessor communications were done using MPI (Message Passing Interface) primitives. The results show that in general ILU(0) in the Multi-Color ordering and ILU(0) in the Wavefront ordering outperform the other methods, but for symmetric and nearly symmetric 5-point matrices Multi-Color Block SOR gives the best performance, except for a few cases with a small number of processors.
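A minimal serial analogue of the outer solver setup in this study, restarted GMRES(m) preconditioned with an incomplete LU, can be written with standard sparse tools. Note that scipy's spilu is an ILUTP-style factorization rather than a strict ILU(0), so treat this only as an illustration; the test matrix is a 2D Laplacian placeholder.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import gmres, spilu, LinearOperator

# 2D Laplacian on a k x k grid, standing in for the PDE matrices of the study
k = 100
T = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(k, k))
I = sp.identity(k)
A = (sp.kron(I, T) + sp.kron(T, I)).tocsc()
b = np.ones(A.shape[0])

ilu = spilu(A, drop_tol=0.0, fill_factor=1.0)        # incomplete LU in the spirit of ILU(0)
M = LinearOperator(A.shape, matvec=ilu.solve)        # preconditioner as an operator
x, info = gmres(A, b, M=M, restart=30, maxiter=500)  # GMRES(m) with m = 30
print("converged" if info == 0 else f"info = {info}")
```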
Algorithms and Application of Sparse Matrix Assembly and Equation Solvers for Aeroacoustics
NASA Technical Reports Server (NTRS)
Watson, W. R.; Nguyen, D. T.; Reddy, C. J.; Vatsa, V. N.; Tang, W. H.
2001-01-01
An algorithm for symmetric sparse equation solutions on an unstructured grid is described. Efficient, sequential sparse algorithms for degree-of-freedom reordering, supernodes, symbolic/numerical factorization, and forward backward solution phases are reviewed. Three sparse algorithms for the generation and assembly of symmetric systems of matrix equations are presented. The accuracy and numerical performance of the sequential version of the sparse algorithms are evaluated over the frequency range of interest in a three-dimensional aeroacoustics application. Results show that the solver solutions are accurate using a discretization of 12 points per wavelength. Results also show that the first assembly algorithm is impractical for high-frequency noise calculations. The second and third assembly algorithms have nearly equal performance at low values of source frequencies, but at higher values of source frequencies the third algorithm saves CPU time and RAM. The CPU time and the RAM required by the second and third assembly algorithms are two orders of magnitude smaller than those required by the sparse equation solver. A sequential version of these sparse algorithms can, therefore, be conveniently incorporated into a substructuring formulation for domain decomposition to achieve parallel computation, where different substructures are handled by different parallel processors.
Beyond Low Rank + Sparse: Multi-scale Low Rank Matrix Decomposition
Ong, Frank; Lustig, Michael
2016-01-01
We present a natural generalization of the recent low rank + sparse matrix decomposition and consider the decomposition of matrices into components of multiple scales. Such a decomposition is well motivated in practice as data matrices often exhibit local correlations at multiple scales. Concretely, we propose a multi-scale low rank modeling that represents a data matrix as a sum of block-wise low rank matrices with increasing scales of block sizes. We then consider the inverse problem of decomposing the data matrix into its multi-scale low rank components and approach the problem via a convex formulation. Theoretically, we show that under various incoherence conditions, the convex program recovers the multi-scale low rank components either exactly or approximately. Practically, we provide guidance on selecting the regularization parameters and incorporate cycle spinning to reduce blocking artifacts. Experimentally, we show that the multi-scale low rank decomposition provides a more intuitive decomposition than conventional low rank methods and demonstrate its effectiveness in four applications, including illumination normalization for face images, motion separation for surveillance videos, multi-scale modeling of dynamic contrast-enhanced magnetic resonance imaging, and collaborative filtering exploiting age information. PMID:28450978
Fast iterative image reconstruction using sparse matrix factorization with GPU acceleration
NASA Astrophysics Data System (ADS)
Zhou, Jian; Qi, Jinyi
2011-03-01
Statistically based iterative approaches for image reconstruction have gained much attention in medical imaging. An accurate system matrix that defines the mapping from the image space to the data space is the key to high-resolution image reconstruction. However, an accurate system matrix is often associated with high computational cost and huge storage requirements. Here we present a method to address this problem by using sparse matrix factorization and parallel computing on a graphics processing unit (GPU). We factor the accurate system matrix into three sparse matrices: a sinogram blurring matrix, a geometric projection matrix, and an image blurring matrix. The sinogram blurring matrix models the detector response. The geometric projection matrix is based on a simple line integral model. The image blurring matrix is to compensate for the line-of-response (LOR) degradation due to the simplified geometric projection matrix. The geometric projection matrix is precomputed, while the sinogram and image blurring matrices are estimated by minimizing the difference between the factored system matrix and the original system matrix. The resulting factored system matrix has far fewer nonzero elements than the original system matrix and thus substantially reduces the storage and computation cost. The smaller size also allows an efficient implementation of the forward and back projectors on GPUs, which have a limited amount of memory. Our simulation studies show that the proposed method can dramatically reduce the computation cost of high-resolution iterative image reconstruction. The proposed technique is applicable to image reconstruction for different imaging modalities, including x-ray CT, PET, and SPECT.
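The storage argument is easy to see in code: a forward projection through the factored system matrix is just three sparse matrix-vector products applied in sequence. The sketch below uses random sparse placeholders for the three factors (the real factors come from detector modeling and a fitting step, as the abstract describes).

```python
import numpy as np
import scipy.sparse as sp

# Factored system matrix P ~ B_sino @ G @ B_img, all sparse (shapes are placeholders)
n_det, n_vox = 20000, 16000
B_sino = sp.random(n_det, n_det, density=1e-4, format="csr")  # sinogram blurring matrix
G      = sp.random(n_det, n_vox, density=1e-4, format="csr")  # geometric projection matrix
B_img  = sp.random(n_vox, n_vox, density=1e-4, format="csr")  # image blurring matrix

x = np.random.rand(n_vox)
y  = B_sino @ (G @ (B_img @ x))          # forward projection: three sparse matvecs
bp = B_img.T @ (G.T @ (B_sino.T @ y))    # back projection: transposes in reverse order
```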
On the development of efficient algorithms for three dimensional fluid flow
NASA Technical Reports Server (NTRS)
Maccormack, R. W.
1988-01-01
The difficulties of constructing efficient algorithms for three-dimensional flow are discussed. Reasonable candidates are analyzed and tested, and most are found to have obvious shortcomings. Yet, there is promise that an efficient class of algorithms exists between the severely time-step-size-limited explicit or approximately factored algorithms and the computationally intensive direct inversion of large sparse matrices by Gaussian elimination.
Wang, Ya-Xuan; Gao, Ying-Lian; Liu, Jin-Xing; Kong, Xiang-Zhen; Li, Hai-Jun
2017-09-01
Identifying differentially expressed genes from the thousands of genes is a challenging task. Robust principal component analysis (RPCA) is an efficient method in the identification of differentially expressed genes. RPCA method uses nuclear norm to approximate the rank function. However, theoretical studies showed that the nuclear norm minimizes all singular values, so it may not be the best solution to approximate the rank function. The truncated nuclear norm is defined as the sum of some smaller singular values, which may achieve a better approximation of the rank function than nuclear norm. In this paper, a novel method is proposed by replacing nuclear norm of RPCA with the truncated nuclear norm, which is named robust principal component analysis regularized by truncated nuclear norm (TRPCA). The method decomposes the observation matrix of genomic data into a low-rank matrix and a sparse matrix. Because the significant genes can be considered as sparse signals, the differentially expressed genes are viewed as the sparse perturbation signals. Thus, the differentially expressed genes can be identified according to the sparse matrix. The experimental results on The Cancer Genome Atlas data illustrate that the TRPCA method outperforms other state-of-the-art methods in the identification of differentially expressed genes.
User's Manual for PCSMS (Parallel Complex Sparse Matrix Solver). Version 1.
NASA Technical Reports Server (NTRS)
Reddy, C. J.
2000-01-01
PCSMS (Parallel Complex Sparse Matrix Solver) is a computer code written to make use of existing real sparse direct solvers to solve complex, sparse matrix linear equations. PCSMS converts complex matrices into real matrices and uses real, sparse direct matrix solvers to factor and solve the real matrices. The solution vector is reconverted to complex numbers. Though this utility is written for Silicon Graphics (SGI) real sparse matrix solution routines, it is general in nature and can be easily modified to work with any real sparse matrix solver. The User's Manual is written to acquaint the user with the installation and operation of the code. Driver routines are given to aid users in integrating PCSMS routines into their own codes.
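The conversion PCSMS performs can be sketched in a few lines: a complex system A z = b becomes an equivalent real system of twice the dimension, which any real sparse direct solver can factor. This is a generic illustration of the equivalence, not the PCSMS code itself.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu

def solve_complex_via_real(A, b):
    """Solve complex sparse A z = b with a real sparse direct solver by
    rewriting z = x + i y and A = Ar + i Ai as the real 2n x 2n system
        [Ar -Ai] [x]   [Re b]
        [Ai  Ar] [y] = [Im b]."""
    Ar = sp.csc_matrix(A.real)
    Ai = sp.csc_matrix(A.imag)
    K = sp.bmat([[Ar, -Ai], [Ai, Ar]], format="csc")
    sol = splu(K).solve(np.concatenate([b.real, b.imag]))
    n = A.shape[0]
    return sol[:n] + 1j * sol[n:]

# Small random example
n = 200
A = (sp.random(n, n, density=0.05) + 1j * sp.random(n, n, density=0.05)
     + sp.identity(n)).tocsc()
b = np.random.rand(n) + 1j * np.random.rand(n)
z = solve_complex_via_real(A, b)
assert np.allclose(A @ z, b)
```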
Mohr, Stephan; Dawson, William; Wagner, Michael; Caliste, Damien; Nakajima, Takahito; Genovese, Luigi
2017-10-10
We present CheSS, the "Chebyshev Sparse Solvers" library, which has been designed to solve typical problems arising in large-scale electronic structure calculations using localized basis sets. The library is based on a flexible and efficient expansion in terms of Chebyshev polynomials and presently features the calculation of the density matrix, the calculation of matrix powers for arbitrary exponents, and the extraction of eigenvalues in a selected interval. CheSS is able to exploit the sparsity of the matrices and scales linearly with respect to the number of nonzero entries, making it well-suited for large-scale calculations. The approach is particularly adapted for setups leading to small spectral widths of the involved matrices and outperforms alternative methods in this regime. By coupling CheSS to the DFT code BigDFT, we show that such a favorable setup is indeed possible in practice. In addition, the approach based on Chebyshev polynomials can be massively parallelized, and CheSS exhibits excellent scaling up to thousands of cores even for relatively small matrix sizes.
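The core primitive behind such Chebyshev solvers, applying a function of a sparse matrix to a vector using only matrix-vector products, looks roughly like the sketch below. The coefficient rule and the Fermi-function example are standard textbook material, not CheSS's internals; emin and emax are assumed bounds on the spectrum of H.

```python
import numpy as np
import scipy.sparse as sp

def cheb_matfunc_apply(H, v, f, emin, emax, order=50):
    """Approximate f(H) @ v with a Chebyshev expansion, using only sparse matvecs.
    H must have its spectrum inside [emin, emax]."""
    a, c = (emax - emin) / 2.0, (emax + emin) / 2.0
    n = H.shape[0]
    Hs = (H - c * sp.identity(n)) / a               # spectrum mapped into [-1, 1]
    # Chebyshev coefficients of f via the discrete cosine rule
    k = np.arange(order + 1)
    theta = np.pi * (k + 0.5) / (order + 1)
    fx = f(np.cos(theta) * a + c)                   # f sampled at mapped Chebyshev nodes
    coef = 2.0 / (order + 1) * np.cos(np.outer(k, theta)) @ fx
    coef[0] /= 2.0
    # Three-term recurrence: T0 v, T1 v, T2 v, ...
    t_prev, t_curr = v, Hs @ v
    acc = coef[0] * t_prev + coef[1] * t_curr
    for j in range(2, order + 1):
        t_prev, t_curr = t_curr, 2.0 * (Hs @ t_curr) - t_prev
        acc += coef[j] * t_curr
    return acc

# e.g. density-matrix action with a Fermi function: f = lambda e: 1/(1 + np.exp((e - mu)/kT))
```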
Spatial operator factorization and inversion of the manipulator mass matrix
NASA Technical Reports Server (NTRS)
Rodriguez, Guillermo; Kreutz-Delgado, Kenneth
1992-01-01
This paper advances two linear operator factorizations of the manipulator mass matrix. Embedded in the factorizations are many of the techniques that are regarded as very efficient computational solutions to inverse and forward dynamics problems. The operator factorizations provide a high-level architectural understanding of the mass matrix and its inverse, which is not visible in the detailed algorithms. They also lead to a new approach to the development of computer programs to organize complexity in robot dynamics.
Mniszewski, S M; Cawkwell, M J; Wall, M E; Mohd-Yusof, J; Bock, N; Germann, T C; Niklasson, A M N
2015-10-13
We present an algorithm for the calculation of the density matrix that for insulators scales linearly with system size and parallelizes efficiently on multicore, shared memory platforms with small and controllable numerical errors. The algorithm is based on an implementation of the second-order spectral projection (SP2) algorithm [ Niklasson, A. M. N. Phys. Rev. B 2002 , 66 , 155115 ] in sparse matrix algebra with the ELLPACK-R data format. We illustrate the performance of the algorithm within self-consistent tight binding theory by total energy calculations of gas phase poly(ethylene) molecules and periodic liquid water systems containing up to 15,000 atoms on up to 16 CPU cores. We consider algorithm-specific performance aspects, such as local vs nonlocal memory access and the degree of matrix sparsity. Comparisons to sparse matrix algebra implementations using off-the-shelf libraries on multicore CPUs, graphics processing units (GPUs), and the Intel many integrated core (MIC) architecture are also presented. The accuracy and stability of the algorithm are illustrated with long duration Born-Oppenheimer molecular dynamics simulations of 1000 water molecules and a 303 atom Trp cage protein solvated by 2682 water molecules.
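A dense-algebra toy version of the SP2 recursion cited above [Niklasson, Phys. Rev. B 2002, 66, 155115] is short enough to state directly; the production implementation replaces the dense products with thresholded sparse matrix algebra (ELLPACK-R in this paper). Here emin and emax are assumed bounds on the spectrum of H.

```python
import numpy as np

def sp2_density_matrix(H, n_occ, emin, emax, n_iter=40):
    """Second-order spectral projection (SP2) sketch in dense algebra.
    Purifies a scaled Hamiltonian into the density matrix P with trace(P) = n_occ."""
    n = H.shape[0]
    # Initial guess: linear map of H with eigenvalues folded into [0, 1]
    X = (emax * np.eye(n) - H) / (emax - emin)
    for _ in range(n_iter):
        X2 = X @ X
        # Pick the branch that drives trace(X) toward the occupation number
        if abs(np.trace(X2) - n_occ) < abs(2 * np.trace(X) - np.trace(X2) - n_occ):
            X = X2            # X <- X^2 pushes occupations down
        else:
            X = 2 * X - X2    # X <- 2X - X^2 pushes occupations up
    return X
```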
Approximate method of variational Bayesian matrix factorization/completion with sparse prior
NASA Astrophysics Data System (ADS)
Kawasumi, Ryota; Takeda, Koujin
2018-05-01
We derive the analytical expression of a matrix factorization/completion solution by the variational Bayes method, under the assumption that the observed matrix is originally the product of low-rank, dense, and sparse matrices with additive noise. Taking matrix sparsity into consideration, we assume a Laplace distribution as the prior of the sparse matrix. We then use several approximations to derive the matrix factorization/completion solution. Using our solution, we also numerically evaluate the performance of sparse matrix reconstruction in matrix factorization, and of missing-element completion in matrix completion.
Brief announcement: Hypergraph partitioning for parallel sparse matrix-matrix multiplication
Ballard, Grey; Druinsky, Alex; Knight, Nicholas; ...
2015-01-01
The performance of parallel algorithms for sparse matrix-matrix multiplication is typically determined by the amount of interprocessor communication performed, which in turn depends on the nonzero structure of the input matrices. In this paper, we characterize the communication cost of a sparse matrix-matrix multiplication algorithm in terms of the size of a cut of an associated hypergraph that encodes the computation for a given input nonzero structure. Obtaining an optimal algorithm corresponds to solving a hypergraph partitioning problem. Furthermore, our hypergraph model generalizes several existing models for sparse matrix-vector multiplication, and we can leverage hypergraph partitioners developed for that computation to improve application-specific algorithms for multiplying sparse matrices.
New shape models of asteroids reconstructed from sparse-in-time photometry
NASA Astrophysics Data System (ADS)
Durech, Josef; Hanus, Josef; Vanco, Radim; Oszkiewicz, Dagmara Anna
2015-08-01
Asteroid physical parameters - the shape, the sidereal rotation period, and the spin axis orientation - can be reconstructed from disk-integrated photometry, either dense (classical lightcurves) or sparse in time, by the lightcurve inversion method. We will review our recent progress in asteroid shape reconstruction from sparse photometry. The problem of finding a unique solution of the inverse problem is time consuming because the sidereal rotation period has to be found by scanning a wide interval of possible periods. This can be efficiently solved by splitting the period parameter space into small parts that are sent to the computers of volunteers and processed in parallel. We will show how this approach of distributed computing works with currently available sparse photometry processed in the framework of the project Asteroids@home. In particular, we will show the results based on the Lowell Photometric Database. The method produces reliable asteroid models with a very low rate of false solutions, and the pipelines and codes can be directly applied to other sources of sparse photometry - Gaia data, for example. We will present the distribution of the spin axes of hundreds of asteroids, discuss the dependence of the spin obliquity on the size of an asteroid, and show examples of spin-axis distributions in asteroid families that confirm the Yarkovsky/YORP evolution scenario.
Logarithmic Laplacian Prior Based Bayesian Inverse Synthetic Aperture Radar Imaging.
Zhang, Shuanghui; Liu, Yongxiang; Li, Xiang; Bi, Guoan
2016-04-28
This paper presents a novel inverse synthetic aperture radar (ISAR) imaging algorithm based on a new sparse prior, known as the logarithmic Laplacian prior. The newly proposed logarithmic Laplacian prior has a narrower main lobe with higher tail values than the Laplacian prior, which helps to achieve performance improvement on sparse representation. The logarithmic Laplacian prior is used for ISAR imaging within the Bayesian framework to achieve a better focused radar image. In the proposed method of ISAR imaging, the phase errors are jointly estimated based on the minimum entropy criterion to accomplish autofocusing. The maximum a posteriori (MAP) estimation and the maximum likelihood estimation (MLE) are utilized to estimate the model parameters to avoid a manual tuning process. Additionally, the fast Fourier transform (FFT) and Hadamard product are used to reduce the required computational cost. Experimental results based on both simulated and measured data validate that the proposed algorithm outperforms the traditional sparse ISAR imaging algorithms in terms of resolution improvement and noise suppression.
NASA Astrophysics Data System (ADS)
Miorelli, Roberto; Reboud, Christophe
2018-04-01
Pulsed Eddy Current Testing (PECT) is a popular NonDestructive Testing (NDT) technique for applications like corrosion monitoring in the oil and gas industry, or rivet inspection in the aeronautic area. Its particularity is to use a transient excitation, which allows more information to be retrieved from the piece than conventional harmonic ECT, in a simpler and cheaper way than multi-frequency ECT setups. Efficient modeling tools prove, as usual, very useful for optimizing experimental sensors and devices or evaluating their performance, for instance. This paper proposes an efficient simulation of PECT signals based on standard time-harmonic solvers and use of an Adaptive Sparse Grid (ASG) algorithm. An adaptive sampling of the ECT signal spectrum is performed with this algorithm, then the complete spectrum is interpolated from this sparse representation, and PECT signals are finally synthesized by means of an inverse Fourier transform. Simulation results corresponding to existing industrial configurations are presented and the performance of the strategy is discussed by comparison to reference results.
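The final synthesis step, turning a set of time-harmonic solutions into a pulsed signal, is a plain inverse Fourier transform, as in the sketch below. The transfer function here is a made-up one-pole stand-in for the harmonic solver output (which the paper obtains at adaptively chosen frequencies and interpolates).

```python
import numpy as np

def harmonic_response(f):
    """Hypothetical probe transfer function standing in for the harmonic ECT solver."""
    return 1.0 / (1.0 + 1j * f / 5e3)

n, dt = 4096, 1e-6
freqs = np.fft.rfftfreq(n, dt)                 # frequencies carried by the excitation
pulse = np.zeros(n)
pulse[: n // 8] = 1.0                          # rectangular excitation pulse
spectrum = np.fft.rfft(pulse) * harmonic_response(freqs)  # frequency-domain product
pect_signal = np.fft.irfft(spectrum, n)        # synthesized time-domain PECT signal
```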
NASA Astrophysics Data System (ADS)
Zheng, Maoteng; Zhang, Yongjun; Zhou, Shunping; Zhu, Junfeng; Xiong, Xiaodong
2016-07-01
In recent years, new platforms and sensors in photogrammetry, remote sensing and computer vision areas have become available, such as Unmanned Aircraft Vehicles (UAV), oblique camera systems, common digital cameras and even mobile phone cameras. Images collected by all these kinds of sensors can be used as remote sensing data sources. These sensors can obtain large-scale remote sensing data consisting of a great number of images. Bundle block adjustment of large-scale data with the conventional algorithm is very time and space (memory) consuming due to the extremely large normal matrix arising from such data. In this paper, an efficient Block-based Sparse Matrix Compression (BSMC) method combined with the Preconditioned Conjugate Gradient (PCG) algorithm is chosen to develop a stable and efficient bundle block adjustment system in order to deal with large-scale remote sensing data. The main contribution of this work is the BSMC-based PCG algorithm, which is more efficient in time and memory than the traditional algorithm without compromising accuracy. In total, eight real datasets are used to test our proposed method. Preliminary results have shown that the BSMC method can efficiently decrease the time and memory requirements of large-scale data.
Bayesian statistical ionospheric tomography improved by incorporating ionosonde measurements
NASA Astrophysics Data System (ADS)
Norberg, Johannes; Virtanen, Ilkka I.; Roininen, Lassi; Vierinen, Juha; Orispää, Mikko; Kauristie, Kirsti; Lehtinen, Markku S.
2016-04-01
We validate two-dimensional ionospheric tomography reconstructions against EISCAT incoherent scatter radar measurements. Our tomography method is based on Bayesian statistical inversion with prior distribution given by its mean and covariance. We employ ionosonde measurements for the choice of the prior mean and covariance parameters and use the Gaussian Markov random fields as a sparse matrix approximation for the numerical computations. This results in a computationally efficient tomographic inversion algorithm with clear probabilistic interpretation. We demonstrate how this method works with simultaneous beacon satellite and ionosonde measurements obtained in northern Scandinavia. The performance is compared with results obtained with a zero-mean prior and with the prior mean taken from the International Reference Ionosphere 2007 model. In validating the results, we use EISCAT ultra-high-frequency incoherent scatter radar measurements as the ground truth for the ionization profile shape. We find that in comparison to the alternative prior information sources, ionosonde measurements improve the reconstruction by adding accurate information about the absolute value and the altitude distribution of electron density. With an ionosonde at continuous disposal, the presented method enhances stand-alone near-real-time ionospheric tomography for the given conditions significantly.
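For a linear forward model with Gaussian noise and a Gaussian prior, the Bayesian machinery described above reduces to one sparse solve for the posterior mean; the sparse GMRF precision matrix is what keeps it tractable. The sketch below shows that generic computation (the forward operator, noise level, and prior are placeholders, not the paper's setup).

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

def gmrf_posterior_mean(A, y, noise_var, Q_prior, m_prior):
    """Gaussian linear inversion: y = A m + e, e ~ N(0, noise_var * I),
    prior m ~ N(m_prior, Q_prior^{-1}) with sparse GMRF precision Q_prior.
    The posterior precision stays sparse, so the mean costs one sparse solve."""
    Q_post = (Q_prior + (A.T @ A) / noise_var).tocsc()
    rhs = Q_prior @ m_prior + (A.T @ y) / noise_var
    return spsolve(Q_post, rhs)

# Toy 1D example: smoothness prior (second-difference precision), sparse ray matrix
n = 400
D = sp.diags([1.0, -2.0, 1.0], [0, 1, 2], shape=(n - 2, n))
Q_prior = 1e2 * (D.T @ D) + 1e-6 * sp.identity(n)   # sparse prior precision
A = sp.random(60, n, density=0.05, format="csr")    # placeholder observation operator
m_true = np.sin(np.linspace(0, 3 * np.pi, n))
y = A @ m_true + 0.01 * np.random.randn(60)
m_hat = gmrf_posterior_mean(A, y, 1e-4, Q_prior, np.zeros(n))
```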
3D CSEM inversion based on goal-oriented adaptive finite element method
NASA Astrophysics Data System (ADS)
Zhang, Y.; Key, K.
2016-12-01
We present a parallel 3D frequency domain controlled-source electromagnetic inversion code named MARE3DEM. Non-linear inversion of observed data is performed with the Occam variant of regularized Gauss-Newton optimization. The forward operator is based on the goal-oriented finite element method that efficiently calculates the responses and sensitivity kernels in parallel using a data decomposition scheme where independent modeling tasks contain different frequencies and subsets of the transmitters and receivers. To accommodate complex 3D conductivity variation with high flexibility and precision, we adopt the dual-grid approach where the forward mesh conforms to the inversion parameter grid and is adaptively refined until the forward solution converges to the desired accuracy. This dual-grid approach is memory efficient, since the inverse parameter grid remains independent from the fine meshing generated around the transmitters and receivers by the adaptive finite element method. In addition, the unstructured inverse mesh efficiently handles multiple-scale structures and allows for fine-scale model parameters within the region of interest. Our mesh generation engine keeps track of the refinement hierarchy so that the mapping of conductivity and sensitivity kernels between the forward and inverse meshes is retained. We employ the adjoint-reciprocity method to calculate the sensitivity kernels, which establish a linear relationship between changes in the conductivity model and changes in the modeled responses. Our code uses a direct solver for the linear systems, so the adjoint problem is efficiently computed by re-using the factorization from the primary problem. Further computational efficiency and scalability are obtained in the regularized Gauss-Newton portion of the inversion using parallel dense matrix-matrix multiplication and matrix factorization routines implemented with the ScaLAPACK library. We show the scalability, reliability and the potential of the algorithm to deal with complex geological scenarios by applying it to the inversion of synthetic marine controlled-source EM data generated for a complex 3D offshore model with significant seafloor topography.
Fast polar decomposition of an arbitrary matrix
NASA Technical Reports Server (NTRS)
Higham, Nicholas J.; Schreiber, Robert S.
1988-01-01
The polar decomposition of an m x n matrix A of full rank, where m is greater than or equal to n, can be computed using a quadratically convergent algorithm. The algorithm is based on a Newton iteration involving a matrix inverse. With the use of a preliminary complete orthogonal decomposition the algorithm can be extended to arbitrary A. How to use the algorithm to compute the positive semi-definite square root of a Hermitian positive semi-definite matrix is described. A hybrid algorithm is formulated which adaptively switches from the matrix-inversion-based iteration to a matrix-multiplication-based iteration due to Kovarik, and to Bjorck and Bowie. The decision when to switch is made using a condition estimator. This matrix-multiplication-rich algorithm is shown to be more efficient on machines for which matrix multiplication can be executed 1.5 times faster than matrix inversion.
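For the real, square, nonsingular case, the Newton iteration at the heart of this algorithm is only a few lines; the sketch below omits the scaling and the complete-orthogonal-decomposition preprocessing that the paper adds for rank-deficient and rectangular A.

```python
import numpy as np

def polar_newton(A, tol=1e-12, max_iter=100):
    """Newton iteration for the polar decomposition A = U H of a real square
    nonsingular A: X <- (X + X^{-T}) / 2 converges quadratically to the
    orthogonal factor U."""
    X = A.copy()
    for _ in range(max_iter):
        X_next = 0.5 * (X + np.linalg.inv(X).T)
        if np.linalg.norm(X_next - X, "fro") <= tol * np.linalg.norm(X_next, "fro"):
            X = X_next
            break
        X = X_next
    U = X
    H = U.T @ A               # symmetric positive (semi-)definite factor
    H = 0.5 * (H + H.T)       # symmetrize against rounding
    return U, H

A = np.random.rand(5, 5) + 5 * np.eye(5)   # well-conditioned test matrix
U, H = polar_newton(A)
assert np.allclose(U @ H, A) and np.allclose(U.T @ U, np.eye(5))
```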
An efficient optical architecture for sparsely connected neural networks
NASA Technical Reports Server (NTRS)
Hine, Butler P., III; Downie, John D.; Reid, Max B.
1990-01-01
An architecture for a general-purpose optical neural network processor is presented in which the interconnections and weights are formed by directing coherent beams holographically, thereby making use of the space-bandwidth products of the recording medium for sparsely interconnected networks more efficiently than the commonly used vector-matrix multiplier, since all of the hologram area is in use. An investigation is made of the use of computer-generated holograms recorded on updatable media, such as thermoplastic materials, in order to define the interconnections and weights of a neural network processor; attention is given to the limits on interconnection densities, diffraction efficiencies, and weighting accuracies possible with such an updatable thin-film holographic device.
Spectral Calculation of ICRF Wave Propagation and Heating in 2-D Using Massively Parallel Computers
NASA Astrophysics Data System (ADS)
Jaeger, E. F.; D'Azevedo, E.; Berry, L. A.; Carter, M. D.; Batchelor, D. B.
2000-10-01
Spectral calculations of ICRF wave propagation in plasmas have the natural advantage that they require no assumption regarding the smallness of the ion Larmor radius ρ relative to the wavelength λ. Results are therefore applicable to all orders in k⊥ρ, where k⊥ = 2π/λ. But because all modes in the spectral representation are coupled, the solution requires inversion of a large dense matrix. In contrast, finite difference algorithms involve only matrices that are sparse and banded. Thus, spectral calculations of wave propagation and heating in tokamak plasmas have so far been limited to 1-D. In this paper, we extend the spectral method to 2-D by taking advantage of new matrix inversion techniques that utilize massively parallel computers. By spreading the dense matrix over 576 processors on the ORNL IBM RS/6000 SP supercomputer, we are able to solve up to 120,000 coupled complex equations requiring 230 GBytes of memory and achieving over 500 Gflops/sec. Initial results for ASDEX and NSTX will be presented using up to 200 modes in both the radial and vertical dimensions.
A fast object-oriented Matlab implementation of the Reproducing Kernel Particle Method
NASA Astrophysics Data System (ADS)
Barbieri, Ettore; Meo, Michele
2012-05-01
Novel numerical methods, known as Meshless Methods or Meshfree Methods and, in a wider perspective, Partition of Unity Methods, promise to overcome most of the disadvantages of traditional finite element techniques. The absence of a mesh makes meshfree methods very attractive for problems involving large deformations, moving boundaries and crack propagation. However, meshfree methods still have significant limitations that prevent their acceptance among researchers and engineers, namely the computational costs. This paper presents an in-depth analysis of computational techniques to speed up the computation of the shape functions in the Reproducing Kernel Particle Method and Moving Least Squares, with particular focus on their bottlenecks, like the neighbour search, the inversion of the moment matrix and the assembly of the stiffness matrix. The paper presents numerous computational solutions aimed at a considerable reduction of the computational times: the use of kd-trees for the neighbour search, sparse indexing of the nodes-points connectivity and, most importantly, the explicit and vectorized inversion of the moment matrix without using loops and numerical routines.
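Of the bottlenecks listed, the neighbour search is the easiest to illustrate: with a kd-tree it drops from an all-pairs distance test to fast range queries. A generic sketch (coordinates and support radius are arbitrary placeholders, not from the paper):

```python
import numpy as np
from scipy.spatial import cKDTree

# Meshfree neighbour search: for each evaluation point, find the nodes whose
# kernel support (radius r) covers it, via kd-tree range queries instead of
# an all-pairs distance test.
nodes = np.random.rand(10000, 2)       # particle/node coordinates
points = np.random.rand(500, 2)        # quadrature/evaluation points
r = 0.05                               # support radius of the kernel

tree = cKDTree(nodes)
neighbors = tree.query_ball_point(points, r)   # list of node-index lists
# neighbors[i] indexes the nodes contributing to the shape functions at points[i]
```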
The application of nonlinear programming and collocation to optimal aeroassisted orbital transfers
NASA Astrophysics Data System (ADS)
Shi, Y. Y.; Nelson, R. L.; Young, D. H.; Gill, P. E.; Murray, W.; Saunders, M. A.
1992-01-01
Sequential quadratic programming (SQP) and collocation of the differential equations of motion were applied to optimal aeroassisted orbital transfers. The Optimal Trajectory by Implicit Simulation (OTIS) computer program codes with updated nonlinear programming code (NZSOL) were used as a testbed for the SQP nonlinear programming (NLP) algorithms. The state-of-the-art sparse SQP method is considered to be effective for solving large problems with a sparse matrix. Sparse optimizers are characterized in terms of memory requirements and computational efficiency. For the OTIS problems, less than 10 percent of the Jacobian matrix elements are nonzero. The SQP method encompasses two phases: finding an initial feasible point by minimizing the sum of infeasibilities and minimizing the quadratic objective function within the feasible region. The orbital transfer problem under consideration involves the transfer from a high energy orbit to a low energy orbit.
Removing flicker based on sparse color correspondences in old film restoration
NASA Astrophysics Data System (ADS)
Huang, Xi; Ding, Youdong; Yu, Bing; Xia, Tianran
2018-04-01
Archived film is an indispensable part of the record of human civilization, and digitally repairing damaged film is now a mainstream practice. In this paper, we propose a technique based on sparse color correspondences to remove fading flicker from old films. Our approach combines multiple frames to establish a simple correction model and includes three key steps. First, we recover sparse color correspondences in the input frames to build a matrix with many missing entries. Second, we present a low-rank matrix factorization approach to estimate the unknown parameters of this model. Finally, we adopt a two-step strategy that divides the estimated parameters into reference-frame parameters for color recovery correction and other-frame parameters for color consistency correction to remove flicker. By combining multiple frames, our method takes the continuity of the input sequence into account, and the experimental results show that the method can remove fading flicker efficiently.
Sparse Covariance Matrix Estimation by DCA-Based Algorithms.
Phan, Duy Nhat; Le Thi, Hoai An; Dinh, Tao Pham
2017-11-01
This letter proposes a novel approach using l0-norm regularization for the sparse covariance matrix estimation (SCME) problem. The objective function of the SCME problem is composed of a nonconvex part and the l0 term, which is discontinuous and difficult to tackle. Appropriate DC (difference of convex functions) approximations of the l0-norm are used that result in approximate SCME problems that are still nonconvex. DC programming and DCA (DC algorithm), powerful tools in the nonconvex programming framework, are investigated. Two DC formulations are proposed and corresponding DCA schemes developed. Two applications of the SCME problem are considered: classification via sparse quadratic discriminant analysis and portfolio optimization. A careful empirical experiment is performed on simulated and real data sets to study the performance of the proposed algorithms. Numerical results showed their efficiency and their superiority compared with seven state-of-the-art methods.
Tensor-GMRES method for large sparse systems of nonlinear equations
NASA Technical Reports Server (NTRS)
Feng, Dan; Pulliam, Thomas H.
1994-01-01
This paper introduces a tensor-Krylov method, the tensor-GMRES method, for large sparse systems of nonlinear equations. This method is a coupling of tensor model formation and solution techniques for nonlinear equations with Krylov subspace projection techniques for unsymmetric systems of linear equations. Traditional tensor methods for nonlinear equations are based on a quadratic model of the nonlinear function, a standard linear model augmented by a simple second order term. These methods are shown to be significantly more efficient than standard methods both on nonsingular problems and on problems where the Jacobian matrix at the solution is singular. A major disadvantage of the traditional tensor methods is that the solution of the tensor model requires the factorization of the Jacobian matrix, which may not be suitable for problems where the Jacobian matrix is large and has a 'bad' sparsity structure for an efficient factorization. We overcome this difficulty by forming and solving the tensor model using an extension of a Newton-GMRES scheme. Like traditional tensor methods, we show that the new tensor method has significant computational advantages over the analogous Newton counterpart. Consistent with Krylov subspace based methods, the new tensor method does not depend on the factorization of the Jacobian matrix. As a matter of fact, the Jacobian matrix is never needed explicitly.
Total variation-based method for radar coincidence imaging with model mismatch for extended target
NASA Astrophysics Data System (ADS)
Cao, Kaicheng; Zhou, Xiaoli; Cheng, Yongqiang; Fan, Bo; Qin, Yuliang
2017-11-01
Originating from traditional optical coincidence imaging, radar coincidence imaging (RCI) is a staring/forward-looking imaging technique. In RCI, the reference matrix must be computed precisely to reconstruct the image as preferred; unfortunately, such precision is almost impossible due to the existence of model mismatch in practical applications. Although some conventional sparse recovery algorithms have been proposed to solve the model-mismatch problem, they are inapplicable to nonsparse targets. We therefore derive the signal model of RCI with model mismatch by replacing the sparsity constraint term with total variation (TV) regularization in the sparse total least squares optimization problem; in this manner, we obtain the objective function of RCI with model mismatch for an extended target. A more robust and efficient algorithm called TV-TLS is proposed, in which the objective function is divided into two parts and the perturbation matrix and scattering coefficients are updated alternately. Moreover, due to the ability of TV regularization to recover sparse signals or images with sparse gradients, the TV-TLS method is also applicable to sparse recovery. Results of numerical experiments demonstrate that, for uniform extended targets, sparse targets, and real extended targets, the algorithm can achieve preferred imaging performance both in suppressing noise and in adapting to model mismatch.
Oryspayev, Dossay; Aktulga, Hasan Metin; Sosonkina, Masha; ...
2015-07-14
Sparse matrix-vector multiplication (SpMVM) is an important kernel that frequently arises in high performance computing applications. Due to its low arithmetic intensity, several approaches have been proposed in the literature to improve its scalability and efficiency in large scale computations. In this paper, our target systems are high-end multi-core architectures and we use a message passing interface + open multiprocessing (MPI+OpenMP) hybrid programming model for parallelism. We analyze the performance of a recently proposed implementation of the distributed symmetric SpMVM, originally developed for large sparse symmetric matrices arising in ab initio nuclear structure calculations. We also study important features of this implementation and compare with previously reported implementations that do not exploit the underlying symmetry. Our SpMVM implementations leverage the hybrid paradigm to efficiently overlap expensive communications with computations. Our main comparison criterion is the "CPU core hours" metric, which is the main measure of resource usage on supercomputers. We analyze the effects of a topology-aware mapping heuristic using a simplified network load model. Furthermore, we have tested the different SpMVM implementations on two large clusters with 3D Torus and Dragonfly topology. Our results show that the distributed SpMVM implementation that exploits matrix symmetry and hides communication yields the best value for the "CPU core hours" metric and significantly reduces data movement overheads.
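The memory half of the symmetry argument is easy to demonstrate in serial code: store only the upper triangle and recover the full product with one extra transpose multiply. (The distributed version in the paper additionally has to overlap the communication this induces.) A sketch:

```python
import numpy as np
import scipy.sparse as sp

def sym_spmv(U, x):
    """SpMVM for a symmetric matrix stored as its upper triangle only
    (diagonal included): y = A x = U x + U^T x - diag(U) * x.
    Halving the stored nonzeros is the memory saving that symmetric
    implementations exploit."""
    d = U.diagonal()
    return U @ x + U.T @ x - d * x

# Consistency check against the full matrix
A = sp.random(1000, 1000, density=0.01)
A = A + A.T                      # make it symmetric
U = sp.triu(A).tocsr()           # keep upper triangle + diagonal
x = np.random.rand(1000)
assert np.allclose(sym_spmv(U, x), A @ x)
```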
Solving Boltzmann and Fokker-Planck Equations Using Sparse Representation
2011-05-31
material science. We have computed the electronic structure of a 2D quantum dot system, and compared the efficiency with the benchmark software OCTOPUS. For one self-consistent iteration step with 512 electrons, OCTOPUS costs 1091 sec, and selected inversion costs 9.76 sec. The algorithm exhibits
Inverse modeling of the terrestrial carbon flux in China with flux covariance among inverted regions
NASA Astrophysics Data System (ADS)
Wang, H.; Jiang, F.; Chen, J. M.; Ju, W.; Wang, H.
2011-12-01
Quantitative understanding of the role of the ocean and terrestrial biosphere in the global carbon cycle, and of their response and feedback to climate change, is required for future projections of the global climate. China has the largest amount of anthropogenic CO2 emissions, diverse terrestrial ecosystems and an unprecedented rate of urbanization. Thus information on the spatial and temporal distributions of the terrestrial carbon flux in China is of great importance in understanding the global carbon cycle. We developed a nested inversion with a focus on China. Based on the Transcom 22 regions for the globe, we divide China and its neighboring countries into 17 regions, making 39 regions in total for the globe. A Bayesian synthesis inversion is made to estimate the terrestrial carbon flux based on GlobalView CO2 data. In the inversion, GEOS-Chem is used as the transport model to develop the transport matrix. A terrestrial ecosystem model named BEPS is used to produce the prior surface flux to constrain the inversion. However, the sparseness of available observation stations in Asia poses a challenge to the inversion for the 17 small regions. To obtain additional constraints on the inversion, a prior flux covariance matrix is constructed using the BEPS model by analyzing the correlation in the net carbon flux among regions under variable climate conditions. The use of the covariance among different regions in the inversion effectively extends the information content of CO2 observations to more regions. The carbon flux over the 39 land and ocean regions is inverted for the period from 2004 to 2009. In order to investigate the impact of introducing a covariance matrix with non-zero off-diagonal values into the inversion, the inverted terrestrial carbon flux over China is evaluated against ChinaFlux eddy-covariance observations after applying an upscaling methodology.
Sparse Matrices in MATLAB: Design and Implementation
NASA Technical Reports Server (NTRS)
Gilbert, John R.; Moler, Cleve; Schreiber, Robert
1992-01-01
The matrix computation language and environment MATLAB is extended to include sparse matrix storage and operations. The only change to the outward appearance of the MATLAB language is a pair of commands to create full or sparse matrices. Nearly all the operations of MATLAB now apply equally to full or sparse matrices, without any explicit action by the user. The sparse data structure represents a matrix in space proportional to the number of nonzero entries, and most of the operations compute sparse results in time proportional to the number of arithmetic operations on nonzeros.
Exact recovery of sparse multiple measurement vectors by l2,p-minimization.
Wang, Changlong; Peng, Jigen
2018-01-01
The joint sparse recovery problem is a generalization of the single measurement vector problem widely studied in compressed sensing. It aims to recover a set of jointly sparse vectors, i.e., those that have nonzero entries concentrated at a common location. A large number of algorithms designed for this problem rely on l2,p-minimization subject to matrix constraints, i.e., minimizing the l2,p quasi-norm ||X||2,p subject to AX = B with 0 < p ≤ 1. The main contribution of this paper is two theoretical results about this technique. The first proves that for every multiple system of linear equations there exists a constant p0 > 0 such that the original unique sparse solution can also be recovered by l2,p quasi-norm minimization whenever 0 < p < p0. The second gives an analytic expression for such a p0. Finally, we display the results of one example to confirm the validity of our conclusions, and we use some numerical experiments to show that our results increase the efficiency of the algorithms designed for l2,p-minimization.
Smoothed low rank and sparse matrix recovery by iteratively reweighted least squares minimization.
Lu, Canyi; Lin, Zhouchen; Yan, Shuicheng
2015-02-01
This paper presents a general framework for solving low-rank and/or sparse matrix minimization problems, which may involve multiple nonsmooth terms. The iteratively reweighted least squares (IRLS) method is a fast solver, which smooths the objective function and minimizes it by alternately updating the variables and their weights. However, traditional IRLS can only solve sparse-only or low-rank-only minimization problems with a squared loss or an affine constraint. This paper generalizes IRLS to solve joint/mixed low-rank and sparse minimization problems, which are essential formulations for many tasks. As a concrete example, we solve the Schatten-p norm and l2,q-norm regularized low-rank representation problem by IRLS, and theoretically prove that the derived solution is a stationary point (globally optimal if p, q ≥ 1). Our convergence proof of IRLS is more general than previous ones, which depend on the special properties of the Schatten-p norm and l2,q-norm. Extensive experiments on both synthetic and real data sets demonstrate that our IRLS is much more efficient.
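As a concrete instance of the reweighting idea (for the simplest sparse-only, equality-constrained case rather than the joint problems this paper addresses), each IRLS step below is a closed-form weighted least-squares solve; the eps smoothing and its annealing schedule are standard choices, not the paper's.

```python
import numpy as np

def irls_sparse(A, b, p=0.5, n_iter=50, eps=1.0):
    """IRLS sketch for min ||x||_p^p subject to A x = b (A underdetermined).
    Each iteration solves a weighted least-squares problem in closed form:
    x = W A^T (A W A^T)^{-1} b with W = diag((x_i^2 + eps)^{1 - p/2})."""
    x = np.linalg.lstsq(A, b, rcond=None)[0]       # minimum-norm starting point
    for _ in range(n_iter):
        w = (x ** 2 + eps) ** (1.0 - p / 2.0)      # smoothed weights ~ |x_i|^{2-p}
        AW = A * w                                  # A @ diag(w), via broadcasting
        lam = np.linalg.solve(AW @ A.T, b)
        x = w * (A.T @ lam)
        eps = max(eps * 0.5, 1e-10)                # anneal the smoothing
    return x

# Recover a 5-sparse vector from 40 random measurements in dimension 200
rng = np.random.default_rng(1)
A = rng.standard_normal((40, 200))
x_true = np.zeros(200)
x_true[rng.choice(200, 5, replace=False)] = rng.standard_normal(5)
x_hat = irls_sparse(A, A @ x_true)
```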
NASA Astrophysics Data System (ADS)
Mohamad Noor, Faris; Adipta, Agra
2018-03-01
Coal Bed Methane (CBM), as a newly developed resource in Indonesia, is one of the alternatives for relieving Indonesia's dependence on conventional energy. The coal resource of the Muara Enim Formation is known as one of the prolific reservoirs in the South Sumatra Basin. Seismic inversion and well analysis are performed to determine the coal seam characteristics of the Muara Enim Formation. This research uses three inversion methods: model-based hard-constraint, band-limited, and sparse-spike inversion. Each type of seismic inversion has its own advantages in displaying the coal seam and its characteristics. Interpretation of the analyzed data shows that the Muara Enim coal seam has a gamma-ray value of about 20 API, a density of 1.0-1.4 g/cc, and a low acoustic impedance (AI) cutoff in the range 5000-6400 (m/s)(g/cc). The coal seam thins laterally from northwest to southeast. The coal seam appears biased in the model-based hard-constraint inversion and discontinuous in the band-limited inversion, neither of which agrees with the geological model. The most appropriate AI inversion is sparse-spike inversion, which yields the best cross-plot correlation value (0.884757) among the chosen inversion methods. Sparse-spike inversion itself has high amplitude resolution, making it a proper tool for identifying the continuity of coal seams, which commonly appear as thin layers. The cross sections from sparse-spike inversion indicate possible new boreholes at CDP 3662-3722, CDP 3586-3622, and CDP 4004-4148, where the seismic data show a thick coal seam.
NASA Astrophysics Data System (ADS)
Bouchet, L.; Amestoy, P.; Buttari, A.; Rouet, F.-H.; Chauvin, M.
2013-02-01
Nowadays, analyzing and reducing the ever larger astronomical datasets is becoming a crucial challenge, especially for long cumulated observation times. The INTEGRAL/SPI X/γ-ray spectrometer is an instrument for which it is essential to process many exposures at the same time in order to increase the low signal-to-noise ratio of the weakest sources. In this context, the conventional methods for data reduction are inefficient and sometimes not feasible at all. Processing several years of data simultaneously requires computing not only the solution of a large system of equations, but also the associated uncertainties. We aim at reducing the computation time and the memory usage. Since the SPI transfer function is sparse, we have used some popular methods for the solution of large sparse linear systems; we briefly review these methods. We use the Multifrontal Massively Parallel Solver (MUMPS) to compute the solution of the system of equations. We also need to compute the variance of the solution, which amounts to computing selected entries of the inverse of the sparse matrix corresponding to our linear system. This can be achieved through one of the latest features of the MUMPS software that has been partly motivated by this work. In this paper we provide a brief presentation of this feature and evaluate its effectiveness on astrophysical problems requiring the processing of large datasets simultaneously, such as the study of the entire emission of the Galaxy. We used these algorithms to solve the large sparse systems arising from SPI data processing and to obtain both their solutions and the associated variances. In conclusion, thanks to these newly developed tools, processing large datasets arising from SPI is now feasible with both a reasonable execution time and a low memory usage.
AZTEC. Parallel Iterative method Software for Solving Linear Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hutchinson, S.; Shadid, J.; Tuminaro, R.
1995-07-01
AZTEC is an iterative-solver library that greatly simplifies the parallelization process when solving linear systems of equations Ax = b, where A is a user-supplied n x n sparse matrix, b is a user-supplied vector of length n, and x is a vector of length n to be computed. AZTEC is intended as a software tool for users who want to avoid cumbersome parallel programming details but who have large sparse linear systems that require an efficiently utilized parallel processing system. A collection of data transformation tools is provided that allows easy creation of distributed sparse unstructured matrices for parallel solution.
Dynamic data integration and stochastic inversion of a confined aquifer
NASA Astrophysics Data System (ADS)
Wang, D.; Zhang, Y.; Irsa, J.; Huang, H.; Wang, L.
2013-12-01
Much work has been done in developing and applying inverse methods to aquifer modeling. The scope of this paper is to investigate the applicability of a new direct method for large inversion problems and to incorporate uncertainty measures in the inversion outcomes (Wang et al., 2013). The problem considered is a two-dimensional inverse model (50×50 grid) of steady-state flow for a heterogeneous ground truth model (500×500 grid) with two hydrofacies. From the ground truth model, a decreasing number of wells (12, 6, 3) was sampled for facies types, based on which experimental indicator histograms and directional variograms were computed. These parameters and models were used by Sequential Indicator Simulation to generate 100 realizations of hydrofacies patterns on a 100×100 (geostatistical) grid, which were conditioned to the facies measurements at the wells. These realizations were smoothed with Simulated Annealing and coarsened to the 50×50 inverse grid, before they were conditioned with the direct method to the dynamic data, i.e., observed heads and groundwater fluxes at the same sampled wells. A set of realizations of estimated hydraulic conductivities (Ks), flow fields, and boundary conditions was created, which centered on the 'true' solutions from solving the ground truth model. Both hydrofacies conductivities were computed with an estimation accuracy within ±10% (12 wells), ±20% (6 wells), and ±35% (3 wells) of the true values. For boundary condition estimation, the accuracy was within ±15% (12 wells), ±30% (6 wells), and ±50% (3 wells) of the true values. The inversion system of equations was solved with LSQR (Paige and Saunders, 1982), for which a coordinate transform and a matrix scaling preprocessor were used to improve the condition number (CN) of the coefficient matrix. However, when the inverse grid was refined to 100×100, Gaussian Noise Perturbation was used to limit the growth of the CN before the matrix solve. To scale the inverse problem up (i.e., without smoothing and coarsening and therefore reducing the associated estimation uncertainty), a parallel LSQR solver was written and verified. For the 50×50 grid, the parallel solver sped up the serial solution time by 14X using 4 CPUs (research on parallel performance and scaling is ongoing). A sensitivity analysis was conducted to examine the relation between the observed data and the inversion outcomes, where measurement errors of increasing magnitudes (i.e., ±1, 2, 5, and 10% of the total head variation and up to ±2% of the total flux variation) were imposed on the observed data. Inversion results were stable but the accuracy of the Ks and boundary estimations degraded with increasing errors, as expected. In particular, the quality of the observed heads is critical to hydraulic head recovery, while the quality of the observed fluxes plays a dominant role in K estimation. References: Wang, D., Y. Zhang, J. Irsa, H. Huang, and L. Wang (2013), Data integration and stochastic inversion of a confined aquifer with high performance computing, Advances in Water Resources, in preparation. Paige, C. C., and M. A. Saunders (1982), LSQR: an algorithm for sparse linear equations and sparse least squares, ACM Transactions on Mathematical Software, 8(1), 43-71.
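The LSQR call with a simple column-scaling preprocessor, as described above, looks like the following with standard sparse tools; the matrix, damping value, and scaling choice are placeholders rather than the study's setup.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import lsqr

# Sparse least-squares solve of an inversion system G m = d, as in the cited
# LSQR reference (Paige and Saunders, 1982). Shapes are placeholders.
G = sp.random(5000, 2500, density=0.002, format="csr")  # coefficient matrix
d = np.random.rand(5000)                                # observed data

# Column scaling as a simple preprocessor to improve the condition number
col_norms = np.sqrt(G.multiply(G).sum(axis=0)).A1
col_norms[col_norms == 0] = 1.0
S = sp.diags(1.0 / col_norms)

result = lsqr(G @ S, d, damp=1e-4, iter_lim=2000)  # damp adds Tikhonov regularization
m = S @ result[0]                                  # undo the scaling
print("stop reason:", result[1], "iterations:", result[2])
```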
Fast Solution in Sparse LDA for Binary Classification
NASA Technical Reports Server (NTRS)
Moghaddam, Baback
2010-01-01
An algorithm that performs sparse linear discriminant analysis (Sparse-LDA) finds near-optimal solutions in far less time than the prior art when specialized to binary classification (of 2 classes). Sparse-LDA is a type of feature- or variable-selection problem with numerous applications in statistics, machine learning, computer vision, computational finance, operations research, and bio-informatics. Because of their combinatorial nature, feature- or variable-selection problems are NP-hard or computationally intractable in cases involving more than 30 variables or features. Therefore, one typically seeks approximate solutions by means of greedy search algorithms. The prior Sparse-LDA algorithm was a greedy algorithm that considered the best variable or feature to add to, or delete from, its subsets in order to maximally discriminate between multiple classes of data. The present algorithm is designed for the special but prevalent case of 2-class or binary classification (e.g., 1 vs. 0, functioning vs. malfunctioning, or change versus no change). The present algorithm provides near-optimal solutions on large real-world datasets having hundreds or even thousands of variables or features (e.g., selecting the fewest wavelength bands in a hyperspectral sensor to do terrain classification) and does so in typical computation times of minutes as compared to days or weeks as taken by the prior art. Sparse-LDA requires solving generalized eigenvalue problems for a large number of variable subsets (represented by the submatrices of the input within-class and between-class covariance matrices). In the general (full-rank) case, the amount of computation scales at least cubically with the number of variables and thus the size of the problems that can be solved is limited accordingly. However, in binary classification, the principal eigenvalues can be found using a special analytic formula, without resorting to costly iterative techniques. The present algorithm exploits this analytic form along with the inherent sequential nature of greedy search itself. Together this enables the use of highly efficient partitioned-matrix-inverse techniques that result in large speedups of computation in both the forward-selection and backward-elimination stages of greedy algorithms in general.
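The partitioned-matrix-inverse trick credited above for the speedup is the classic bordered-inverse update: once A⁻¹ is known, the inverse of the matrix grown by one variable costs O(k²) instead of O(k³). A standalone sketch:

```python
import numpy as np

def grow_inverse(A_inv, b, c):
    """Given A^{-1}, return the inverse of the bordered matrix [[A, b], [b^T, c]]
    via the Schur complement s = c - b^T A^{-1} b. This O(k^2) update is what
    makes greedy forward selection cheap."""
    u = A_inv @ b
    s = c - b @ u                      # Schur complement (a scalar here)
    k = A_inv.shape[0]
    out = np.empty((k + 1, k + 1))
    out[:k, :k] = A_inv + np.outer(u, u) / s
    out[:k, k] = -u / s
    out[k, :k] = -u / s
    out[k, k] = 1.0 / s
    return out

# Check against a direct inverse
k = 5
M = np.random.rand(k + 1, k + 1)
M = M @ M.T + np.eye(k + 1)            # symmetric positive definite test matrix
A_inv = np.linalg.inv(M[:k, :k])
assert np.allclose(grow_inverse(A_inv, M[:k, k], M[k, k]), np.linalg.inv(M))
```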
Meng, Yuguang; Lei, Hao
2010-06-01
An efficient iterative gridding reconstruction method with correction of off-resonance artifacts was developed, which is especially tailored for multiple-shot non-Cartesian imaging. The novelty of the method lies in that the transformation matrix for gridding (T) is constructed as the convolution of two sparse matrices, in which the former is determined by the sampling interval and the spatial distribution of the off-resonance frequencies, and the latter by the sampling trajectory and the target grid in Cartesian space. The resulting T matrix is also sparse, and the system can be solved efficiently with the iterative conjugate gradient algorithm. It was shown that, with the proposed method, the reconstruction speed in multiple-shot non-Cartesian imaging can be improved significantly while retaining high reconstruction fidelity. More importantly, the proposed method allows a tradeoff between reconstruction accuracy and computation time, making it possible to customize its use for different applications. The performance of the proposed method was demonstrated by numerical simulation and by multiple-shot spiral imaging of rat brain at 4.7 T.
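A minimal sketch of the solve stage, under simplifying assumptions (real-valued stand-ins in place of complex k-space data): two random sparse factors play the roles of the off-resonance and trajectory matrices, and conjugate gradient is run on the normal equations without ever forming T^T T explicitly.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import cg, LinearOperator

rng = np.random.default_rng(1)
n = 64 * 64
F1 = sp.random(n, n, density=0.001, format="csr", random_state=1)  # off-resonance factor (stand-in)
F2 = sp.random(n, n, density=0.001, format="csr", random_state=2)  # trajectory/grid factor (stand-in)
T = F1 @ F2 + sp.eye(n, format="csr")    # product of two sparse matrices stays sparse
y = rng.standard_normal(n)               # stand-in acquired data

# CG solves the normal equations T^T T x = T^T y; only matvecs with T are needed.
normal_op = LinearOperator((n, n), matvec=lambda v: T.T @ (T @ v))
x, info = cg(normal_op, T.T @ y)
```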
Kim, Hyunsoo; Park, Haesun
2007-06-15
Many practical pattern recognition problems require non-negativity constraints. For example, pixels in digital images and chemical concentrations in bioinformatics are non-negative. Sparse non-negative matrix factorizations (NMFs) are useful when the degree of sparseness in the non-negative basis matrix or the non-negative coefficient matrix in an NMF needs to be controlled in approximating high-dimensional data in a lower dimensional space. In this article, we introduce a novel formulation of sparse NMF and show how the new formulation leads to a convergent sparse NMF algorithm via alternating non-negativity-constrained least squares. We apply our sparse NMF algorithm to cancer-class discovery and gene expression data analysis and offer biological analysis of the results obtained. Our experimental results illustrate that the proposed sparse NMF algorithm often achieves better clustering performance with shorter computing time compared to other existing NMF algorithms. The software is available as supplementary material.
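The alternating non-negativity-constrained least squares idea can be sketched as below. The sparsity penalty on H is imposed by augmenting W with a row of sqrt(beta) ones, a standard device; column-wise SciPy nnls stands in for the authors' block solver, and beta and the iteration count are illustrative.

```python
import numpy as np
from scipy.optimize import nnls

def sparse_nmf(A, k, beta=0.1, iters=30, seed=0):
    """Sparse NMF sketch via alternating NNLS; sparsity is imposed on H."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    W = rng.random((m, k))
    for _ in range(iters):
        # Solve for H with the sparseness-inducing augmented system.
        W_aug = np.vstack([W, np.sqrt(beta) * np.ones((1, k))])
        A_aug = np.vstack([A, np.zeros((1, n))])
        H = np.column_stack([nnls(W_aug, A_aug[:, j])[0] for j in range(n)])
        # Solve for W from the transposed problem.
        W = np.column_stack([nnls(H.T, A.T[:, i])[0] for i in range(m)]).T
    return W, H

rng = np.random.default_rng(0)
A = np.abs(rng.random((60, 40)))   # synthetic non-negative data
W, H = sparse_nmf(A, k=5)
```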
SPARSKIT: A basic tool kit for sparse matrix computations
NASA Technical Reports Server (NTRS)
Saad, Youcef
1990-01-01
Presented here are the main features of a tool package for manipulating and working with sparse matrices. One of the goals of the package is to provide basic tools to facilitate the exchange of software and data between researchers in sparse matrix computations. The starting point is the Harwell/Boeing collection of matrices for which the authors provide a number of tools. Among other things, the package provides programs for converting data structures, printing simple statistics on a matrix, plotting a matrix profile, and performing linear algebra operations with sparse matrices.
Improved Estimation and Interpretation of Correlations in Neural Circuits
Yatsenko, Dimitri; Josić, Krešimir; Ecker, Alexander S.; Froudarakis, Emmanouil; Cotton, R. James; Tolias, Andreas S.
2015-01-01
Ambitious projects aim to record the activity of ever larger and denser neuronal populations in vivo. Correlations in neural activity measured in such recordings can reveal important aspects of neural circuit organization. However, estimating and interpreting large correlation matrices is statistically challenging. Estimation can be improved by regularization, i.e. by imposing a structure on the estimate. The amount of improvement depends on how closely the assumed structure represents dependencies in the data. Therefore, the selection of the most efficient correlation matrix estimator for a given neural circuit must be determined empirically. Importantly, the identity and structure of the most efficient estimator informs about the types of dominant dependencies governing the system. We sought statistically efficient estimators of neural correlation matrices in recordings from large, dense groups of cortical neurons. Using fast 3D random-access laser scanning microscopy of calcium signals, we recorded the activity of nearly every neuron in volumes 200 μm wide and 100 μm deep (150–350 cells) in mouse visual cortex. We hypothesized that in these densely sampled recordings, the correlation matrix should be best modeled as the combination of a sparse graph of pairwise partial correlations representing local interactions and a low-rank component representing common fluctuations and external inputs. Indeed, in cross-validation tests, the covariance matrix estimator with this structure consistently outperformed other regularized estimators. The sparse component of the estimate defined a graph of interactions. These interactions reflected the physical distances and orientation tuning properties of cells: The density of positive ‘excitatory’ interactions decreased rapidly with geometric distances and with differences in orientation preference whereas negative ‘inhibitory’ interactions were less selective. Because of its superior performance, this ‘sparse+latent’ estimator likely provides a more physiologically relevant representation of the functional connectivity in densely sampled recordings than the sample correlation matrix. PMID:25826696
Dynamic Textures Modeling via Joint Video Dictionary Learning.
Wei, Xian; Li, Yuanxiang; Shen, Hao; Chen, Fang; Kleinsteuber, Martin; Wang, Zhongfeng
2017-04-06
Video representation is an important and challenging task in the computer vision community. In this paper, we consider the problem of modeling and classifying video sequences of dynamic scenes that can be modeled in a dynamic textures (DT) framework. First, we assume that image frames of a moving scene can be modeled as a Markov random process. We propose a sparse coding framework, named joint video dictionary learning (JVDL), to model a video adaptively. By treating the sparse coefficients of image frames over a learned dictionary as the underlying "states", we learn an efficient and robust linear transition matrix between two adjacent frames of sparse events in the time series. Hence, a dynamic scene sequence is represented by an appropriate transition matrix associated with a dictionary. In order to ensure the stability of JVDL, we impose several constraints on the transition matrix and the dictionary. The developed framework is able to capture the dynamics of a moving scene by exploring both the sparse properties and the temporal correlations of consecutive video frames. Moreover, the learned JVDL parameters can be used for various DT applications, such as DT synthesis and recognition. Experimental results demonstrate the strong competitiveness of the proposed JVDL approach in comparison with state-of-the-art video representation methods. In particular, it performs significantly better in DT synthesis and recognition on heavily corrupted data.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kitanidis, Peter
As large-scale, commercial storage projects become operational, the problem of utilizing information from diverse sources becomes more critically important. In this project, we developed, tested, and applied an advanced joint data inversion system for CO2 storage modeling with large data sets for use in site characterization and real-time monitoring. Emphasis was on the development of advanced and efficient computational algorithms for joint inversion of hydro-geophysical data, coupled with state-of-the-art forward process simulations. The developed system consists of (1) inversion tools using characterization data, such as 3D seismic survey (amplitude images), borehole log and core data, as well as hydraulic, tracer and thermal tests before CO2 injection, (2) joint inversion tools for updating the geologic model with the distribution of rock properties, thus reducing uncertainty, using hydro-geophysical monitoring data, and (3) highly efficient algorithms for directly solving the dense or sparse linear algebra systems derived from the joint inversion. The system combines methods from stochastic analysis, fast linear algebra, and high performance computing. The developed joint inversion tools have been tested through synthetic CO2 storage examples.
An efficient classification method based on principal component and sparse representation.
Zhai, Lin; Fu, Shujun; Zhang, Caiming; Liu, Yunxian; Wang, Lu; Liu, Guohua; Yang, Mingqiang
2016-01-01
As an important application in optical imaging, palmprint recognition is affected by many unfavorable factors. An effective fusion of blockwise bi-directional two-dimensional principal component analysis and grouping sparse classification is presented. Dimension reduction and normalization are implemented by the blockwise bi-directional two-dimensional principal component analysis of palmprint images to extract feature matrixes, which are assembled into an overcomplete dictionary for sparse classification. A subspace orthogonal matching pursuit algorithm is designed to solve the grouping sparse representation. Finally, the classification result is obtained by comparing the residuals between testing and reconstructed images. Experiments are carried out on a palmprint database, and the results show that this method is more robust against position and illumination changes in palmprint images and achieves a higher palmprint recognition rate.
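For reference, plain orthogonal matching pursuit (not the paper's subspace/grouping variant) looks like this; the dictionary and signal are synthetic.

```python
import numpy as np

def omp(D, y, k):
    """Orthogonal matching pursuit: greedily grow the support, re-fitting
    the coefficients on the selected atoms at every step."""
    residual, support = y.astype(float), []
    for _ in range(k):
        j = int(np.argmax(np.abs(D.T @ residual)))     # most correlated atom
        support.append(j)
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x = np.zeros(D.shape[1])
    x[support] = coef
    return x

rng = np.random.default_rng(0)
D = rng.standard_normal((64, 256))
D /= np.linalg.norm(D, axis=0)                 # unit-norm atoms
y = D[:, [3, 70]] @ np.array([2.0, -1.0])      # 2-sparse test signal
print(np.nonzero(omp(D, y, 2))[0])             # recovers atoms 3 and 70
```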
Sparse Image Reconstruction on the Sphere: Analysis and Synthesis.
Wallis, Christopher G R; Wiaux, Yves; McEwen, Jason D
2017-11-01
We develop techniques to solve ill-posed inverse problems on the sphere by sparse regularization, exploiting sparsity in both axisymmetric and directional scale-discretized wavelet space. Denoising, inpainting, and deconvolution problems, and combinations thereof, are considered as examples. Inverse problems are solved in both the analysis and synthesis settings, with a number of different sampling schemes. The most effective approach is that with the most restricted solution-space, which depends on the interplay between the adopted sampling scheme, the selection of the analysis/synthesis problem, and any weighting of the l1 norm appearing in the regularization problem. More efficient sampling schemes on the sphere improve reconstruction fidelity by restricting the solution-space and also by improving sparsity in wavelet space. We apply the technique to denoise Planck 353-GHz observations, improving the ability to extract the structure of Galactic dust emission, which is important for studying Galactic magnetism.
A method of vehicle license plate recognition based on PCANet and compressive sensing
NASA Astrophysics Data System (ADS)
Ye, Xianyi; Min, Feng
2018-03-01
The manual feature extraction of traditional methods for vehicle license plates is not robust to diverse changes. Meanwhile, the high feature dimension extracted with the Principal Component Analysis Network (PCANet) leads to low classification efficiency. To solve these problems, a method of vehicle license plate recognition based on PCANet and compressive sensing is proposed. First, PCANet is used to extract features from the images of characters. Then, a very sparse measurement matrix satisfying the Restricted Isometry Property (RIP) condition of compressed sensing is used to reduce the dimension of the extracted features. Finally, a Support Vector Machine (SVM) is used to train and recognize the dimension-reduced features. Experimental results demonstrate that the proposed method outperforms a Convolutional Neural Network (CNN) in both recognition and time. Compared with omitting compressed sensing, the proposed method has a lower feature dimension, which increases efficiency.
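A rough sketch of the reduction-plus-SVM stage, using scikit-learn's very sparse random projection (an RIP-friendly construction) on synthetic stand-ins for the PCANet features; the dimensions and labels are illustrative.

```python
import numpy as np
from sklearn.random_projection import SparseRandomProjection
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 4096))       # stand-in PCANet character features
y = rng.integers(0, 10, size=200)          # stand-in character labels

proj = SparseRandomProjection(n_components=256, random_state=0)
X_low = proj.fit_transform(X)              # sparse random measurement matrix
clf = LinearSVC(max_iter=5000).fit(X_low, y)
print(clf.score(X_low, y))
```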
A Space-Time-Frequency Dictionary for Sparse Cortical Source Localization.
Korats, Gundars; Le Cam, Steven; Ranta, Radu; Louis-Dorr, Valerie
2016-09-01
Cortical source imaging aims at identifying activated cortical areas on the surface of the cortex from raw electroencephalogram (EEG) data. This problem is ill-posed, the number of channels being very low compared to the number of possible source positions. In some realistic physiological situations, the active areas are sparse in space and of short duration, and the amount of spatio-temporal data to carry out the inversion is then limited. In this study, we propose an original data-driven space-time-frequency (STF) dictionary which takes into account both spatial and time-frequency sparseness while preserving smoothness in time-frequency (i.e., nonstationary smooth time courses in sparse locations). Based on these assumptions, we take advantage of the matching pursuit (MP) framework for selecting the most relevant atoms in this highly redundant dictionary. We apply two recent MP algorithms, single best replacement (SBR) and source deflated matching pursuit, and we compare the results using a spatial dictionary and the proposed STF dictionary to demonstrate the improvements of our multidimensional approach. We also provide comparisons with well-established inversion methods, FOCUSS and RAP-MUSIC, analyzing performance under different degrees of nonstationarity and signal-to-noise ratio. Our STF dictionary combined with the SBR approach provides robust performance on realistic simulations. From a computational point of view, the algorithm is embedded in the wavelet domain, ensuring high efficiency in terms of computation time. The proposed approach ensures fast and accurate sparse cortical localizations on highly nonstationary and noisy data.
1982-10-27
... are buried within a much larger, special-purpose package. We regret such omissions, but to have reached the practitioners in each of the diverse ... sparse matrix (form PAQ). 4. Method of solution: Distribution count sort. 5. Programming language: FORTRAN. 6. Precision: Single and double precision. 7. ...
Elastic-Waveform Inversion with Compressive Sensing for Sparse Seismic Data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lin, Youzuo; Huang, Lianjie
2015-01-28
Accurate velocity models of compressional- and shear-waves are essential for geothermal reservoir characterization and microseismic imaging. Elastic-waveform inversion of multi-component seismic data can provide high-resolution inversion results of subsurface geophysical properties. However, the method requires seismic data acquired using dense source and receiver arrays. In practice, seismic sources and/or geophones are often sparsely distributed on the surface and/or in a borehole, such as in 3D vertical seismic profiling (VSP) surveys. We develop a novel elastic-waveform inversion method with compressive sensing for inversion of sparse seismic data. We employ an alternating-minimization algorithm to solve the optimization problem of our new waveform inversion method. We validate our new method using synthetic VSP data for a geophysical model built using geologic features found at the Raft River enhanced-geothermal-system (EGS) field. We apply our method to synthetic VSP data with a sparse source array and compare the results with those obtained with a dense source array. Our numerical results demonstrate that the velocity models produced with our new method using a sparse source array are almost as accurate as those obtained using a dense source array.
Solution of matrix equations using sparse techniques
NASA Technical Reports Server (NTRS)
Baddourah, Majdi
1994-01-01
The solution of large systems of matrix equations is key to the solution of a large number of scientific and engineering problems. This talk describes the sparse matrix solver developed at Langley which can routinely solve in excess of 263,000 equations in 40 seconds on one Cray C-90 processor. It appears that for large scale structural analysis applications, sparse matrix methods have a significant performance advantage over other methods.
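To convey the scale argument, the sketch below direct-solves 263,000 sparse equations (a 1-D Laplacian stand-in, far simpler and better conditioned than a real stiffness matrix) in seconds on commodity hardware, whereas a dense 263,000×263,000 inverse would need roughly 550 GB just to store.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

n = 263_000
K = sp.diags([-np.ones(n - 1), 2 * np.ones(n), -np.ones(n - 1)],
             offsets=[-1, 0, 1], format="csc")   # tridiagonal stand-in system
f = np.ones(n)
u = spsolve(K, f)          # sparse direct solve; the factor stays sparse
```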
Convex Banding of the Covariance Matrix
Bien, Jacob; Bunea, Florentina; Xiao, Luo
2016-01-01
We introduce a new sparse estimator of the covariance matrix for high-dimensional models in which the variables have a known ordering. Our estimator, which is the solution to a convex optimization problem, is equivalently expressed as an estimator which tapers the sample covariance matrix by a Toeplitz, sparsely-banded, data-adaptive matrix. As a result of this adaptivity, the convex banding estimator enjoys theoretical optimality properties not attained by previous banding or tapered estimators. In particular, our convex banding estimator is minimax rate adaptive in Frobenius and operator norms, up to log factors, over commonly-studied classes of covariance matrices, and over more general classes. Furthermore, it correctly recovers the bandwidth when the true covariance is exactly banded. Our convex formulation admits a simple and efficient algorithm. Empirical studies demonstrate its practical effectiveness and illustrate that our exactly-banded estimator works well even when the true covariance matrix is only close to a banded matrix, confirming our theoretical results. Our method compares favorably with all existing methods, in terms of accuracy and speed. We illustrate the practical merits of the convex banding estimator by showing that it can be used to improve the performance of discriminant analysis for classifying sound recordings. PMID:28042189
He, Xiaowei; Liang, Jimin; Wang, Xiaorui; Yu, Jingjing; Qu, Xiaochao; Wang, Xiaodong; Hou, Yanbin; Chen, Duofang; Liu, Fang; Tian, Jie
2010-11-22
In this paper, we present an incomplete variables truncated conjugate gradient (IVTCG) method for bioluminescence tomography (BLT). Considering the sparse characteristic of the light source and the insufficient surface measurements in BLT scenarios, we combine a sparseness-inducing (ℓ1 norm) regularization term with a quadratic error term in the IVTCG-based framework for solving the inverse problem. By limiting the number of variables updated at each iteration and combining a variable splitting strategy to find the search direction more efficiently, the method obtains fast and stable source reconstruction, even without a priori information on the permissible source region and multispectral measurements. Numerical experiments on a mouse atlas validate the effectiveness of the method. In vivo mouse experimental results further indicate its potential for a practical BLT system.
Low-rank matrix decomposition and spatio-temporal sparse recovery for STAP radar
Sen, Satyabrata
2015-08-04
We develop space-time adaptive processing (STAP) methods by leveraging the advantages of sparse signal processing techniques in order to detect a slowly-moving target. We observe that the inherent sparse characteristics of a STAP problem can be formulated as the low-rankness of the clutter covariance matrix when compared to the total adaptive degrees-of-freedom, and also as the sparse interference spectrum on the spatio-temporal domain. By exploiting these sparse properties, we propose two approaches for estimating the interference covariance matrix. In the first approach, we consider a constrained matrix rank minimization problem (RMP) to decompose the sample covariance matrix into a low-rank positive semidefinite and a diagonal matrix. The solution of the RMP is obtained by applying the trace minimization technique and the singular value decomposition with a matrix shrinkage operator. Our second approach deals with the atomic norm minimization problem to recover the clutter response-vector that has a sparse support on the spatio-temporal plane. We use convex relaxation based standard sparse-recovery techniques to find the solutions. With extensive numerical examples, we demonstrate the performance of the proposed STAP approaches with respect to both ideal and practical scenarios, involving Doppler-ambiguous clutter ridges and spatial and temporal decorrelation effects. As a result, the low-rank matrix decomposition based solution requires as many secondary measurements as twice the clutter rank to attain near-ideal STAP performance, whereas the spatio-temporal sparsity based approach needs a considerably smaller number of secondary data.
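The shrinkage step of the first approach can be sketched as an eigenvalue soft-threshold that splits the sample covariance into a low-rank part plus a scaled identity; the threshold tau and the snapshot data below are illustrative, and this is not the authors' exact RMP solver.

```python
import numpy as np

def lowrank_plus_diag(R, tau):
    """Shrink eigenvalues of the Hermitian sample covariance R by tau;
    what survives is the low-rank 'clutter' part, the rest sets the
    identity-scaled noise floor."""
    w, V = np.linalg.eigh(R)
    w_shrunk = np.where(w > tau, w - tau, 0.0)
    low_rank = (V * w_shrunk) @ V.conj().T
    sigma2 = np.mean(w[w <= tau]) if np.any(w <= tau) else 0.0
    return low_rank + sigma2 * np.eye(R.shape[0])

rng = np.random.default_rng(0)
snapshots = rng.standard_normal((128, 40))     # 128 DoF, 40 secondary snapshots
R = snapshots @ snapshots.T / 40               # sample covariance
R_hat = lowrank_plus_diag(R, tau=1.0)
```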
NASA Astrophysics Data System (ADS)
Zhang, H.; Fang, H.; Yao, H.; Maceira, M.; van der Hilst, R. D.
2014-12-01
Recently, Zhang et al. (2014, Pure and Applied Geophysics) have developed a joint inversion code incorporating body-wave arrival times and surface-wave dispersion data. The joint inversion code was based on the regional-scale version of the double-difference tomography algorithm tomoDD. The surface-wave inversion part uses the propagator matrix solver in the algorithm DISPER80 (Saito, 1988) for forward calculation of dispersion curves from layered velocity models and the related sensitivities. The application of the joint inversion code to the SAFOD site in central California shows that the fault structure is better imaged in the new model, which is able to fit both the body-wave and surface-wave observations adequately. Here we present a new joint inversion method that solves for the model in the wavelet domain constrained by sparsity regularization. Compared to the previous method, it has the following advantages: (1) The method is both data- and model-adaptive. The velocity model can be represented by different wavelet coefficients at different scales, which are generally sparse. By constraining the model wavelet coefficients to be sparse, the inversion in the wavelet domain can inherently adapt to the data distribution so that the model has higher spatial resolution in zones of good data coverage. Fang and Zhang (2014, Geophysical Journal International) have shown the superior performance of the wavelet-based double-difference seismic tomography method compared to the conventional method. (2) For the surface-wave inversion, the joint inversion code takes advantage of the recent development of direct inversion of surface-wave dispersion data for 3-D variations of shear-wave velocity without the intermediate step of phase or group velocity maps (Fang et al., 2014, Geophysical Journal International). A fast marching method is used to compute, at each period, surface-wave traveltimes and ray paths between sources and receivers. We will test the new joint inversion code at the SAFOD site to compare its performance with the previous code. We will also select another fault zone, such as the San Jacinto Fault Zone, to better image its structure.
Computationally efficient modeling and simulation of large scale systems
NASA Technical Reports Server (NTRS)
Jain, Jitesh (Inventor); Cauley, Stephen F. (Inventor); Li, Hong (Inventor); Koh, Cheng-Kok (Inventor); Balakrishnan, Venkataramanan (Inventor)
2010-01-01
A method of simulating operation of a VLSI interconnect structure having capacitive and inductive coupling between nodes thereof. A matrix X and a matrix Y containing different combinations of passive circuit element values for the interconnect structure are obtained, where the element values for each matrix include inductance L and inverse capacitance P. An adjacency matrix A associated with the interconnect structure is obtained. Numerical integration is used to solve first and second equations, each including as a factor the product of the inverse matrix X^(-1) and at least one other matrix, with the first equation including X^(-1)Y, X^(-1)A, and X^(-1)P, and the second equation including X^(-1)A and X^(-1)P.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rouet, François-Henry; Li, Xiaoye S.; Ghysels, Pieter
2016-06-30
In this paper, we present a distributed-memory library for computations with dense structured matrices. A matrix is considered structured if its off-diagonal blocks can be approximated by a rank-deficient matrix with low numerical rank. Here, we use Hierarchically Semi-Separable (HSS) representations. Such matrices appear in many applications, for example, finite-element methods, boundary element methods, and so on. Exploiting this structure allows for fast solution of linear systems and/or fast computation of matrix-vector products, which are the two main building blocks of matrix computations. The compression algorithm that we use, which computes the HSS form of an input dense matrix, relies on randomized sampling with a novel adaptive sampling mechanism. We discuss the parallelization of this algorithm, and we also present the parallelization of structured matrix-vector product, structured factorization, and solution routines. The efficiency of the approach is demonstrated on large problems from different academic and industrial applications, on up to 8,000 cores. Finally, this work is part of a more global effort, the STRUctured Matrices PACKage (STRUMPACK) software package for computations with sparse and dense structured matrices. Hence, although useful in their own right, the routines also represent a step in the direction of a distributed-memory sparse solver.
Efficient sparse matrix multiplication scheme for the CYBER 203
NASA Technical Reports Server (NTRS)
Lambiotte, J. J., Jr.
1984-01-01
This work has been directed toward the development of an efficient algorithm for performing sparse matrix multiplication on the CYBER 203. The desire to provide software which gives the user the choice between the often conflicting goals of minimizing central processing (CPU) time or storage requirements has led to a diagonal-based algorithm in which one of three types of storage is selected for each diagonal. For each storage type, an initialization subroutine estimates the CPU and storage requirements based upon results from previously performed numerical experiments. These requirements are adjusted by weights provided by the user which reflect the relative importance the user places on the two resources. The three storage types employed were chosen to be efficient on the CYBER 203 for diagonals which are sparse, moderately sparse, or dense; however, for many densities, no single storage type is most efficient with respect to both resource requirements, and the user-supplied weights dictate the choice.
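The storage-selection idea translates naturally to modern sparse formats; the sketch below picks DIA when nonzeros concentrate on few diagonals and CSR otherwise, with an arbitrary cutoff standing in for the weighted CPU/storage estimates described above.

```python
import numpy as np
import scipy.sparse as sp

def pick_format(A_coo):
    """Choose DIA storage when the occupied diagonals are densely filled,
    CSR when nonzeros are scattered; the 0.5 cutoff is illustrative."""
    offsets = np.unique(A_coo.col - A_coo.row)
    fill_per_diag = A_coo.nnz / (len(offsets) * min(A_coo.shape))
    return A_coo.todia() if fill_per_diag > 0.5 else A_coo.tocsr()

A = sp.random(1000, 1000, density=0.01, format="coo", random_state=0)
M = pick_format(A)          # scattered pattern -> CSR here
y = M @ np.ones(1000)       # matrix-vector product in the chosen format
```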
Sparse Matrix for ECG Identification with Two-Lead Features.
Tseng, Kuo-Kun; Luo, Jiao; Hegarty, Robert; Wang, Wenmin; Haiting, Dong
2015-01-01
Electrocardiogram (ECG) human identification has the potential to improve biometric security. However, improvements in ECG identification and feature extraction are required. Previous work has focused on single-lead ECG signals. Our work proposes a new algorithm for human identification by mapping two-lead ECG signals onto a two-dimensional matrix and then employing a sparse matrix method to process the matrix; this is the first application of sparse matrix techniques to ECG identification. The results of our experiments demonstrate the benefits of our approach over existing methods.
Approximate equiangular tight frames for compressed sensing and CDMA applications
NASA Astrophysics Data System (ADS)
Tsiligianni, Evaggelia; Kondi, Lisimachos P.; Katsaggelos, Aggelos K.
2017-12-01
Performance guarantees for recovery algorithms employed in sparse representations and compressed sensing highlight the importance of incoherence. Optimal bounds of incoherence are attained by equiangular unit norm tight frames (ETFs). Although ETFs are important in many applications, they do not exist for all dimensions, and their construction has proven extremely difficult. In this paper, we construct frames that are close to ETFs. According to results from frame and graph theory, the existence of an ETF depends on the existence of its signature matrix, that is, a symmetric matrix with a certain structure and a spectrum consisting of two distinct eigenvalues. We view the construction of a signature matrix as an inverse eigenvalue problem and propose a method that produces frames of any dimensions that are close to ETFs. Due to the achieved equiangularity property, the frames so obtained can be employed as spreading sequences in synchronous code-division multiple access (s-CDMA) systems, besides compressed sensing.
Lun, Aaron T L; Pagès, Hervé; Smith, Mike L
2018-05-01
Biological experiments involving genomics or other high-throughput assays typically yield a data matrix that can be explored and analyzed using the R programming language with packages from the Bioconductor project. Improvements in the throughput of these assays have resulted in an explosion of data even from routine experiments, which poses a challenge to the existing computational infrastructure for statistical data analysis. For example, single-cell RNA sequencing (scRNA-seq) experiments frequently generate large matrices containing expression values for each gene in each cell, requiring sparse or file-backed representations for memory-efficient manipulation in R. These alternative representations are not easily compatible with high-performance C++ code used for computationally intensive tasks in existing R/Bioconductor packages. Here, we describe a C++ interface named beachmat, which enables agnostic data access from various matrix representations. This allows package developers to write efficient C++ code that is interoperable with dense, sparse and file-backed matrices, amongst others. We evaluated the performance of beachmat for accessing data from each matrix representation using both simulated and real scRNA-seq data, and defined a clear memory/speed trade-off to motivate the choice of an appropriate representation. We also demonstrate how beachmat can be incorporated into the code of other packages to drive analyses of a very large scRNA-seq data set.
Das, Anup; Sampson, Aaron L.; Lainscsek, Claudia; Muller, Lyle; Lin, Wutu; Doyle, John C.; Cash, Sydney S.; Halgren, Eric; Sejnowski, Terrence J.
2017-01-01
The correlation method from brain imaging has been used to estimate functional connectivity in the human brain. However, brain regions might show very high correlation even when the two regions are not directly connected due to the strong interaction of the two regions with common input from a third region. One previously proposed solution to this problem is to use a sparse regularized inverse covariance matrix or precision matrix (SRPM) assuming that the connectivity structure is sparse. This method yields partial correlations to measure strong direct interactions between pairs of regions while simultaneously removing the influence of the rest of the regions, thus identifying regions that are conditionally independent. To test our methods, we first demonstrated conditions under which the SRPM method could indeed find the true physical connection between a pair of nodes for a spring-mass example and an RC circuit example. The recovery of the connectivity structure using the SRPM method can be explained by energy models using the Boltzmann distribution. We then demonstrated the application of the SRPM method for estimating brain connectivity during stage 2 sleep spindles from human electrocorticography (ECoG) recordings using an 8 × 8 electrode array. The ECoG recordings that we analyzed were from a 32-year-old male patient with long-standing pharmaco-resistant left temporal lobe complex partial epilepsy. Sleep spindles were automatically detected using delay differential analysis and then analyzed with SRPM and the Louvain method for community detection. We found spatially localized brain networks within and between neighboring cortical areas during spindles, in contrast to the case when sleep spindles were not present. PMID:28095202
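A minimal stand-in for the SRPM estimate uses the graphical lasso, which penalizes the ℓ1 norm of the precision matrix; partial correlations are then read off the resulting sparse inverse covariance. The data dimensions mimic an 8×8 array, but the samples below are synthetic and the penalty alpha is illustrative.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 64))        # stand-in for 8x8-channel recordings

model = GraphicalLasso(alpha=0.1).fit(X)  # L1-penalized precision estimate
P = model.precision_

# Partial correlation between channels i and j: -P_ij / sqrt(P_ii P_jj).
d = np.sqrt(np.diag(P))
partial_corr = -P / np.outer(d, d)
np.fill_diagonal(partial_corr, 1.0)
```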
Addressing the computational cost of large EIT solutions.
Boyle, Alistair; Borsic, Andrea; Adler, Andy
2012-05-01
Electrical impedance tomography (EIT) is a soft field tomography modality based on the application of electric current to a body and measurement of voltages through electrodes at the boundary. The interior conductivity is reconstructed on a discrete representation of the domain using a finite-element method (FEM) mesh and a parametrization of that domain. The reconstruction requires a sequence of numerically intensive calculations. There is strong interest in reducing the cost of these calculations. An improvement in the compute time for current problems would encourage further exploration of computationally challenging problems such as the incorporation of time series data, wide-spread adoption of three-dimensional simulations and correlation of other modalities such as CT and ultrasound. Multicore processors offer an opportunity to reduce EIT computation times but may require some restructuring of the underlying algorithms to maximize the use of available resources. This work profiles two EIT software packages (EIDORS and NDRM) to experimentally determine where the computational costs arise in EIT as problems scale. Sparse matrix solvers, a key component for the FEM forward problem and sensitivity estimates in the inverse problem, are shown to take a considerable portion of the total compute time in these packages. A sparse matrix solver performance measurement tool, Meagre-Crowd, is developed to interface with a variety of solvers and compare their performance over a range of two- and three-dimensional problems of increasing node density. Results show that distributed sparse matrix solvers that operate on multiple cores are advantageous up to a limit that increases as the node density increases. We recommend a selection procedure to find a solver and hardware arrangement matched to the problem and provide guidance and tools to perform that selection.
LSRN: A PARALLEL ITERATIVE SOLVER FOR STRONGLY OVER- OR UNDERDETERMINED SYSTEMS*
Meng, Xiangrui; Saunders, Michael A.; Mahoney, Michael W.
2014-01-01
We describe a parallel iterative least squares solver named LSRN that is based on random normal projection. LSRN computes the min-length solution to min_{x ∈ ℝ^n} ‖Ax − b‖_2, where A ∈ ℝ^{m×n} with m ≫ n or m ≪ n, and where A may be rank-deficient. Tikhonov regularization may also be included. Since A is involved only in matrix-matrix and matrix-vector multiplications, it can be a dense or sparse matrix or a linear operator, and LSRN automatically speeds up when A is sparse or a fast linear operator. The preconditioning phase consists of a random normal projection, which is embarrassingly parallel, and a singular value decomposition of size ⌈γ min(m, n)⌉ × min(m, n), where γ is moderately larger than 1, e.g., γ = 2. We prove that the preconditioned system is well-conditioned, with a strong concentration result on the extreme singular values, and hence that the number of iterations is fully predictable when we apply LSQR or the Chebyshev semi-iterative method. As we demonstrate, the Chebyshev method is particularly efficient for solving large problems on clusters with high communication cost. Numerical results show that on a shared-memory machine, LSRN is very competitive with LAPACK's DGELSD and a fast randomized least squares solver called Blendenpik on large dense problems, and it outperforms the least squares solver from SuiteSparseQR on sparse problems without sparsity patterns that can be exploited to reduce fill-in. Further experiments show that LSRN scales well on an Amazon Elastic Compute Cloud cluster. PMID:25419094
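The preconditioning recipe is compact enough to sketch directly (dimensions, density, and γ below are illustrative): sketch A with a Gaussian projection, take the SVD of the small sketch, and use N = V Σ^{-1} as a right preconditioner for LSQR.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import lsqr

rng = np.random.default_rng(0)
m, n, gamma = 20000, 200, 2.0
A = sp.random(m, n, density=0.01, format="csr", random_state=0)
b = rng.standard_normal(m)

G = rng.standard_normal((int(gamma * n), m))   # random normal projection
GA = (A.T @ G.T).T                             # = G @ A, via sparse-times-dense
_, s, Vt = np.linalg.svd(GA, full_matrices=False)
N = Vt.T / s                                   # right preconditioner N = V diag(1/s)

y = lsqr(A @ N, b)[0]                          # well-conditioned system in y
x = N @ y                                      # estimate of the min-length solution
```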
Adaptive Inverse Control for Rotorcraft Vibration Reduction
NASA Technical Reports Server (NTRS)
Jacklin, Stephen A.
1985-01-01
This thesis extends the Least Mean Square (LMS) algorithm to solve the multiple-input, multiple-output problem of alleviating N/Rev (rotor revolution frequency multiplied by the number of blades) helicopter fuselage vibration by means of adaptive inverse control. A frequency-domain locally linear model is used to represent the transfer matrix relating the higher harmonic pitch control inputs to the harmonic vibration outputs to be controlled. By using the inverse matrix as the controller gain matrix, an adaptive inverse regulator is formed to alleviate the N/Rev vibration. The stability and rate of convergence properties of the extended LMS algorithm are discussed. It is shown that the stability ranges for the elements of the stability gain matrix are directly related to the eigenvalues of the vibration signal information matrix for the learning phase, but not for the control phase. The overall conclusion is that the LMS adaptive inverse control method can form a robust vibration control system, but will require some tuning of the input sensor gains, the stability gain matrix, and the amount of control relaxation to be used. The learning curve of the controller during the learning phase is shown to be quantitatively close to that predicted by averaging the learning curves of the normal modes. For higher order transfer matrices, a rough estimate of the inverse is needed to start the algorithm efficiently. The simulation results indicate that the factor which most influences LMS adaptive inverse control is the product of the control relaxation and the stability gain matrix. A small stability gain matrix makes the controller less sensitive to relaxation selection, and permits faster and more stable vibration reduction, than choosing the stability gain matrix large and the control relaxation term small. It is shown that the best selection of the stability gain matrix elements and the amount of control relaxation is basically a compromise between slow, stable convergence and fast convergence with an increased possibility of unstable identification. In the simulation studies, the LMS adaptive inverse control algorithm is shown to be capable of adapting the inverse (controller) matrix to track changes in the flight conditions. The algorithm converges quickly for moderate disturbances, while taking longer for larger disturbances. Perfect knowledge of the inverse matrix is not required for good control of the N/Rev vibration. However, it is shown that measurement noise will prevent the LMS adaptive inverse control technique from controlling the vibration unless the signal averaging method presented is incorporated into the algorithm.
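A toy version of the control phase, with an arbitrary 2×2 transfer matrix and a crude inverse estimate; the adaptive identification of the inverse is omitted, and the gain and relaxation values below are illustrative rather than tuned as in the thesis.

```python
import numpy as np

T = np.array([[1.0, 0.3],
              [0.2, 0.8]])           # stand-in transfer matrix: controls -> vibration
C = np.eye(2)                        # rough initial estimate of the inverse of T
G = np.diag([0.4, 0.4])              # stability gain matrix (illustrative values)
relax = 0.5                          # control relaxation factor
z0 = np.array([1.0, -0.5])           # uncontrolled N/Rev vibration

u = np.zeros(2)
for _ in range(100):
    z = z0 + T @ u                   # measured harmonic vibration
    u = u - relax * (G @ (C @ z))    # inverse-control correction step
print(np.round(z, 6))                # residual vibration driven near zero
```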
Massively parallel sparse matrix function calculations with NTPoly
NASA Astrophysics Data System (ADS)
Dawson, William; Nakajima, Takahito
2018-04-01
We present NTPoly, a massively parallel library for computing the functions of sparse, symmetric matrices. The theory of matrix functions is a well-developed framework with a wide range of applications including differential equations, graph theory, and electronic structure calculations. One particularly important application area is diagonalization-free methods in quantum chemistry. When the input and output of the matrix function are sparse, methods based on polynomial expansions can be used to compute matrix functions in linear time. We present a library based on these methods that can compute a variety of matrix functions. Distributed memory parallelization is based on a communication-avoiding sparse matrix multiplication algorithm. OpenMP task parallelization is utilized to implement hybrid parallelization. We describe NTPoly's interface and show how it can be integrated with programs written in many different programming languages. We demonstrate the merits of NTPoly by performing large scale calculations on the K computer.
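The linear-scaling idea rests on polynomial expansion with sparsity-preserving pruning. Below is a naive truncated-Taylor sketch for exp(A); NTPoly itself uses more sophisticated expansions and a distributed multiply, and the pruning threshold here is arbitrary.

```python
import scipy.sparse as sp

def sparse_expm_taylor(A, terms=12, prune=1e-10):
    """Truncated Taylor series for exp(A) with entry pruning after each
    sparse multiply, so the iterates (and the result) stay sparse."""
    n = A.shape[0]
    result = sp.eye(n, format="csr")
    term = sp.eye(n, format="csr")
    for k in range(1, terms + 1):
        term = (term @ A) / k
        term.data[abs(term.data) < prune] = 0.0   # drop negligible entries
        term.eliminate_zeros()
        result = result + term
    return result

A = sp.random(2000, 2000, density=0.001, format="csr", random_state=0) * 0.1
E = sparse_expm_taylor(A)
print(E.nnz / (2000 * 2000))     # the result remains sparse
```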
Ortiz, Andrés; Munilla, Jorge; Álvarez-Illán, Ignacio; Górriz, Juan M; Ramírez, Javier
2015-01-01
Alzheimer's Disease (AD) is the most common neurodegenerative disease in elderly people. Its development has been shown to be closely related to changes in the brain connectivity network and in the brain activation patterns, along with structural changes caused by the neurodegenerative process. Methods to infer dependence between brain regions are usually derived from the analysis of covariance between activation levels in the different areas. However, these covariance-based methods are not able to estimate conditional independence between variables to factor out the influence of other regions. Conversely, models based on the inverse covariance, or precision matrix, such as Sparse Gaussian Graphical Models, allow revealing conditional independence between regions by estimating the covariance between two variables given the rest as constant. This paper uses Sparse Inverse Covariance Estimation (SICE) methods to learn undirected graphs in order to derive functional and structural connectivity patterns from Fludeoxyglucose (18F-FDG) Positron Emission Tomography (PET) data and segmented Magnetic Resonance images (MRI), drawn from the ADNI database, for Control, MCI (Mild Cognitive Impairment), and AD subjects. Sparse computation fits perfectly here, as brain regions usually interact with only a few other areas. The models clearly show different metabolic covariation patterns between subject groups, revealing the loss of strong connections in AD and MCI subjects when compared to Controls. Similarly, the variance between GM (Gray Matter) densities of different regions reveals different structural covariation patterns between the different groups. Thus, the different connectivity patterns for Controls and AD are used in this paper to select regions of interest in PET and GM images with discriminative power for early AD diagnosis. Finally, functional and structural models are combined to leverage the classification accuracy. The results obtained in this work show the usefulness of Sparse Gaussian Graphical models to reveal functional and structural connectivity patterns. The information provided by the sparse inverse covariance matrices is not only used in an exploratory way; we also propose a method to use it in a discriminative way. Regression coefficients are used to compute reconstruction errors for the different classes, which are then introduced into an SVM for classification. Classification experiments performed using 68 Control, 70 AD, and 111 MCI images and assessed by cross-validation show the effectiveness of the proposed method.
Supercomputing on massively parallel bit-serial architectures
NASA Technical Reports Server (NTRS)
Iobst, Ken
1985-01-01
Research on the Goodyear Massively Parallel Processor (MPP) suggests that high-level parallel languages are practical and can be designed with powerful new semantics that allow algorithms to be efficiently mapped to the real machines. For the MPP these semantics include parallel/associative array selection for both dense and sparse matrices, variable precision arithmetic to trade accuracy for speed, micro-pipelined train broadcast, and conditional branching at the processing element (PE) control unit level. The preliminary design of a FORTRAN-like parallel language for the MPP has been completed and is being used to write programs to perform sparse matrix array selection, min/max search, matrix multiplication, Gaussian elimination on single bit arrays and other generic algorithms. A description is given of the MPP design. Features of the system and its operation are illustrated in the form of charts and diagrams.
Label consistent K-SVD: learning a discriminative dictionary for recognition.
Jiang, Zhuolin; Lin, Zhe; Davis, Larry S
2013-11-01
A label consistent K-SVD (LC-KSVD) algorithm to learn a discriminative dictionary for sparse coding is presented. In addition to using class labels of training data, we also associate label information with each dictionary item (columns of the dictionary matrix) to enforce discriminability in sparse codes during the dictionary learning process. More specifically, we introduce a new label consistency constraint called "discriminative sparse-code error" and combine it with the reconstruction error and the classification error to form a unified objective function. The optimal solution is efficiently obtained using the K-SVD algorithm. Our algorithm learns a single overcomplete dictionary and an optimal linear classifier jointly. The incremental dictionary learning algorithm is presented for the situation of limited memory resources. It yields dictionaries so that feature points with the same class labels have similar sparse codes. Experimental results demonstrate that our algorithm outperforms many recently proposed sparse-coding techniques for face, action, scene, and object category recognition under the same learning conditions.
Disentangling giant component and finite cluster contributions in sparse random matrix spectra.
Kühn, Reimer
2016-04-01
We describe a method for disentangling giant component and finite cluster contributions to sparse random matrix spectra, using sparse symmetric random matrices defined on Erdős-Rényi graphs as an example and test bed. Our methods apply to sparse matrices defined in terms of arbitrary graphs in the configuration model class, as long as they have finite mean degree.
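A small numerical companion to this idea: build a sparse symmetric adjacency matrix on an Erdős-Rényi graph, isolate the giant component with a connected-components pass, and compute the two eigenvalue sets separately. The size and density below are illustrative.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.csgraph import connected_components

rng = np.random.default_rng(0)
n, c = 2000, 2.0
A = sp.random(n, n, density=c / n, format="coo", random_state=0)
A = ((A + A.T) > 0).astype(float)     # symmetric 0/1 adjacency, mean degree ~2c

ncomp, labels = connected_components(A, directed=False)
giant = labels == np.bincount(labels).argmax()   # nodes in the giant component

Ad = A.toarray()
eig_giant = np.linalg.eigvalsh(Ad[np.ix_(giant, giant)])     # giant-component spectrum
eig_finite = np.linalg.eigvalsh(Ad[np.ix_(~giant, ~giant)])  # finite-cluster spectrum
```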
NASA Astrophysics Data System (ADS)
Sourbier, Florent; Operto, Stéphane; Virieux, Jean; Amestoy, Patrick; L'Excellent, Jean-Yves
2009-03-01
This is the first paper in a two-part series that describes a massively parallel code that performs 2D frequency-domain full-waveform inversion of wide-aperture seismic data for imaging complex structures. Full-waveform inversion methods, namely quantitative seismic imaging methods based on solving the full wave equation, are computationally expensive. Therefore, designing efficient algorithms which take advantage of parallel computing facilities is critical for the appraisal of these approaches when applied to representative case studies and for further improvements. Full-waveform modelling requires solving a large sparse system of linear equations, which is performed with the massively parallel direct solver MUMPS for efficient multiple-shot simulations. Efficiency of the multiple-shot solution phase (forward/backward substitutions) is improved by using the BLAS3 library. The inverse problem relies on a classic local optimization approach implemented with a gradient method. The direct solver returns the multiple-shot wavefield solutions distributed over the processors according to a domain decomposition driven by the distribution of the LU factors. The domain decomposition of the wavefield solutions is used to compute in parallel the gradient of the objective function and the diagonal Hessian, the latter providing a suitable scaling of the gradient. The algorithm allows one to test different strategies for multiscale frequency inversion, ranging from successive mono-frequency inversions to simultaneous multifrequency inversion. These different inversion strategies will be illustrated in the following companion paper. The parallel efficiency and the scalability of the code will also be quantified.
Selected inversion as key to a stable Langevin evolution across the QCD phase boundary
NASA Astrophysics Data System (ADS)
Bloch, Jacques; Schenk, Olaf
2018-03-01
We present new results of full QCD at nonzero chemical potential. In PRD 92, 094516 (2015) the complex Langevin method was shown to break down when the inverse coupling decreases and enters the transition region from the deconfined to the confined phase. We found that the stochastic technique used to estimate the drift term can be very unstable for indefinite matrices. This may be avoided by using the full inverse of the Dirac operator, which is, however, too costly for four-dimensional lattices. The major breakthrough in this work was achieved by realizing that the inverse elements necessary for the drift term can be computed efficiently using the selected inversion technique provided by the parallel sparse direct solver package PARDISO. In our new study we show that no breakdown of the complex Langevin method is encountered and that simulations can be performed across the phase boundary.
A fast new algorithm for a robot neurocontroller using inverse QR decomposition
DOE Office of Scientific and Technical Information (OSTI.GOV)
Morris, A.S.; Khemaissia, S.
2000-01-01
A new adaptive neural network controller for robots is presented. The controller is based on direct adaptive techniques. Unlike many neural network controllers in the literature, inverse dynamical model evaluation is not required. A numerically robust, computationally efficient processing scheme for neural network weight estimation is described, namely, the inverse QR decomposition (INVQR). The inverse QR decomposition and a weighted recursive least-squares (WRLS) method for neural network weight estimation are derived using Cholesky factorization of the data matrix. The algorithm that performs the efficient INVQR of the underlying space-time data matrix may be implemented in parallel on a triangular array. Furthermore, its systolic architecture is well suited for VLSI implementation. Another important benefit of the INVQR decomposition is that it solves directly for the time-recursive least-squares filter vector, while avoiding the sequential back-substitution step required by QR decomposition approaches.
Ordering Unstructured Meshes for Sparse Matrix Computations on Leading Parallel Systems
NASA Technical Reports Server (NTRS)
Oliker, Leonid; Li, Xiaoye; Heber, Gerd; Biswas, Rupak
2000-01-01
The ability of computers to solve hitherto intractable problems and simulate complex processes using mathematical models makes them an indispensable part of modern science and engineering. Computer simulations of large-scale realistic applications usually require solving a set of non-linear partial differential equations (PDEs) over a finite region. For example, one thrust area in the DOE Grand Challenge projects is to design future accelerators such as the Spallation Neutron Source (SNS). Our colleagues at SLAC need to model complex RFQ cavities with large aspect ratios. Unstructured grids are currently used to resolve the small features in a large computational domain; dynamic mesh adaptation will be added in the future for additional efficiency. The PDEs for electromagnetics are discretized by the FEM method, which leads to a generalized eigenvalue problem Kx = λMx, where K and M are the stiffness and mass matrices, and are very sparse. In a typical cavity model, the number of degrees of freedom is about one million. For such large eigenproblems, direct solution techniques quickly reach the memory limits. Instead, the most widely-used methods are Krylov subspace methods, such as Lanczos or Jacobi-Davidson. In all the Krylov-based algorithms, sparse matrix-vector multiplication (SPMV) must be performed repeatedly. Therefore, the efficiency of SPMV usually determines the eigensolver speed. SPMV is also one of the most heavily used kernels in large-scale numerical simulations.
Sequential time interleaved random equivalent sampling for repetitive signal.
Zhao, Yijiu; Liu, Jingjing
2016-12-01
Compressed sensing (CS) based sampling techniques exhibit many advantages over other existing approaches for sparse signal spectrum sensing; they have also been incorporated into non-uniform sampling signal reconstruction to improve efficiency, as in random equivalent sampling (RES). However, in CS-based RES, only one sample of each acquisition is considered in the signal reconstruction stage, which results in more acquisition runs and longer sampling times. In this paper, a sampling sequence is taken in each RES acquisition run, and the corresponding block measurement matrix is constructed using a Whittaker-Shannon interpolation formula. All the block matrices are combined into an equivalent measurement matrix with respect to all sampling sequences. We implemented the proposed approach with a multi-core analog-to-digital converter (ADC), whose ADC cores are time-interleaved. A prototype realization of this proposed CS-based sequential random equivalent sampling method has been developed. It is able to capture an analog waveform at an equivalent sampling rate of 40 GHz while physically sampling at 1 GHz. Experiments indicate that, for a sparse signal, the proposed CS-based sequential random equivalent sampling exhibits high efficiency.
Direct Iterative Nonlinear Inversion by Multi-frequency T-matrix Completion
NASA Astrophysics Data System (ADS)
Jakobsen, M.; Wu, R. S.
2016-12-01
Researchers in the mathematical physics community have recently proposed a conceptually new method for solving nonlinear inverse scattering problems (like FWI) which is inspired by the theory of nonlocality of physical interactions. The conceptually new method, which may be referred to as the T-matrix completion method, is very interesting since it is not based on linearization at any stage. Also, there are no gradient vectors or (inverse) Hessian matrices to calculate. However, the convergence radius of this promising T-matrix completion method is seriously restricted by its use of single-frequency scattering data only. In this study, we have developed a modified version of the T-matrix completion method which we believe is more suitable for applications to nonlinear inverse scattering problems in (exploration) seismology, because it makes use of multi-frequency data. Essentially, we have simplified the single-frequency T-matrix completion method of Levinson and Markel and combined it with the standard sequential frequency inversion (multi-scale regularization) method. For each frequency, we first estimate the experimental T-matrix by using the Moore-Penrose pseudo-inverse concept. Then this experimental T-matrix is used to initiate an iterative procedure for successive estimation of the scattering potential and the T-matrix, using the Lippmann-Schwinger equation for the nonlinear relation between these two quantities. The main physical requirements in the basic iterative cycle are that the T-matrix should be data-compatible and the scattering potential operator should be dominantly local, although a non-local scattering potential operator is allowed in the intermediate iterations. In our simplified T-matrix completion strategy, we ensure that the T-matrix updates are always data-compatible simply by adding a suitable correction term in the real-space coordinate representation. Singular-value decomposition representations are not required in our formulation, since we have developed an efficient domain decomposition method. The results of several numerical experiments on the SEG/EAGE salt model illustrate the importance of using multi-frequency data when performing frequency-domain full waveform inversion in strongly scattering media via the new concept of T-matrix completion.
Hybrid Weighted Minimum Norm Method: a new LORETA-based method for solving the EEG inverse problem.
Song, C; Zhuang, T; Wu, Q
2005-01-01
This paper presents a new method for solving the EEG inverse problem. It is based on the following physiological characteristics of neural electrical activity sources: first, neighboring neurons tend to activate synchronously; second, the distribution of sources in source space is sparse; third, the activity of the sources is highly centralized. We take this prior knowledge as the only precondition for developing the EEG inverse solution, assuming no other characteristics of the solution, to realize the most common 3-D EEG reconstruction map. The proposed algorithm combines the advantages of LORETA, a low-resolution method that emphasizes localization, and FOCUSS, a high-resolution method that emphasizes separability, and remains within the framework of the weighted minimum norm method. The key step is to construct a weighting matrix that draws on the existing smoothness operator, a competition mechanism, and a learning algorithm. The basic procedure is to obtain an initial estimate of the solution, construct a new estimate using the information in the initial solution, and repeat this process until the solutions of the last two estimation steps remain unchanged.
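The iterate-and-reweight loop described above can be sketched as a FOCUSS-style weighted minimum norm iteration; the weighting rule below (reweighting by the magnitude of the previous solution) is one common choice and only approximates the paper's hybrid weighting matrix.
```python
import numpy as np

def reweighted_minimum_norm(L, y, lam=1e-6, n_iter=20):
    """Weighted minimum-norm solution x = W^2 L^T (L W^2 L^T + lam I)^{-1} y,
    with the weights rebuilt from the previous solution at every pass."""
    w = np.ones(L.shape[1])                      # first pass: plain minimum norm
    for _ in range(n_iter):
        W2 = np.diag(w ** 2)
        G = L @ W2 @ L.T + lam * np.eye(L.shape[0])
        x = W2 @ L.T @ np.linalg.solve(G, y)
        w = np.abs(x)                            # sharpen: favor strong sources
    return x

# toy usage: 16 sensors, 100 candidate sources, 3 of them active
rng = np.random.default_rng(1)
L = rng.standard_normal((16, 100))
x_true = np.zeros(100); x_true[[10, 40, 70]] = [2.0, -1.5, 1.0]
x_hat = reweighted_minimum_norm(L, L @ x_true)
print(np.argsort(-np.abs(x_hat))[:3])            # ideally indices 10, 40, 70
```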
NASA Technical Reports Server (NTRS)
Cannone, Jaime J.; Barnes, Cindy L.; Achari, Aniruddha; Kundrot, Craig E.; Whitaker, Ann F. (Technical Monitor)
2001-01-01
The Sparse Matrix approach for obtaining lead crystallization conditions has proven to be very fruitful for the crystallization of proteins and nucleic acids. Here we report a Sparse Matrix developed specifically for the crystallization of protein-DNA complexes. This method is rapid and economical, typically requiring 2.5 mg of complex to test 48 conditions. The method was originally developed to crystallize basic fibroblast growth factor (bFGF) complexed with DNA sequences identified through in vitro selection, or SELEX, methods. Two DNA aptamers that bind with approximately nanomolar affinity and inhibit the angiogenic properties of bFGF were selected for co-crystallization. The Sparse Matrix produced lead crystallization conditions for both bFGF-DNA complexes.
Thermodynamic characterization of synchronization-optimized oscillator networks
NASA Astrophysics Data System (ADS)
Yanagita, Tatsuo; Ichinomiya, Takashi
2014-12-01
We consider a canonical ensemble of synchronization-optimized networks of identical oscillators under external noise. By performing a Markov chain Monte Carlo simulation using the Kirchhoff index, i.e., the sum of the inverse eigenvalues of the Laplacian matrix, as a graph Hamiltonian of the network, we construct more than 1,000 different synchronization-optimized networks. We then show that the transition from star to core-periphery structure depends on the connectivity of the network, and is characterized by the node degree variance of the synchronization-optimized ensemble. We find that thermodynamic properties such as heat capacity show anomalies for sparse networks.
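As a rough illustration of the sampling scheme, the sketch below computes the Kirchhoff index from the Laplacian spectrum and performs Metropolis edge-rewiring moves at inverse temperature beta; the proposal and parameters are illustrative assumptions, not the authors' exact protocol.
```python
import numpy as np
import networkx as nx

def kirchhoff_index(G):
    """Sum of inverse nonzero Laplacian eigenvalues (the graph Hamiltonian)."""
    lam = np.linalg.eigvalsh(nx.laplacian_matrix(G).toarray().astype(float))
    return float(np.sum(1.0 / lam[1:]))          # skip the zero mode (connected G)

def metropolis_step(G, beta, rng):
    """Swap one edge for one non-edge; accept with Metropolis probability."""
    H_old = kirchhoff_index(G)
    e_old = list(G.edges())[rng.integers(G.number_of_edges())]
    non_edges = list(nx.non_edges(G))
    e_new = non_edges[rng.integers(len(non_edges))]
    G.remove_edge(*e_old); G.add_edge(*e_new)
    dH = kirchhoff_index(G) - H_old if nx.is_connected(G) else np.inf
    if rng.random() >= np.exp(min(0.0, -beta * dH)):
        G.remove_edge(*e_new); G.add_edge(*e_old)    # reject: restore the network

rng = np.random.default_rng(2)
G = nx.connected_watts_strogatz_graph(20, 4, 0.3, seed=2)
for _ in range(200):
    metropolis_step(G, beta=5.0, rng=rng)
print(kirchhoff_index(G))
```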
DOE Office of Scientific and Technical Information (OSTI.GOV)
Carpentier, J.L.; Di Bono, P.J.; Tournebise, P.J.
The efficient bounding method for DC contingency analysis is improved using reciprocity properties. Knowing the consequences of the outage of a branch, these properties provide the consequences on that branch of various kinds of outages. This is used to reduce computation times and to remove some difficulties, such as those occurring when a branch flow is close to its limit before outage. Compensation, sparse vector, sparse inverse and bounding techniques are also used. A program has been implemented for single branch outages and tested on an actual French EHV 650-bus network. Computation times are 60% of those of the efficient bounding method. The relevant algorithm is described in detail in the first part of this paper. In the second part, reciprocity properties and bounding formulas are extended to multiple branch outages and to multiple generator or load outages. An algorithm is proposed to handle all these cases simultaneously.
High-dimensional statistical inference: From vector to matrix
NASA Astrophysics Data System (ADS)
Zhang, Anru
Statistical inference for sparse signals or low-rank matrices in high-dimensional settings is of significant interest in a range of contemporary applications and has attracted recent attention in many fields, including statistics, applied mathematics and electrical engineering. In this thesis, we consider several problems, including sparse signal recovery (compressed sensing under restricted isometry) and low-rank matrix recovery (matrix recovery via rank-one projections and structured matrix completion). The first part of the thesis discusses compressed sensing and affine rank minimization in both noiseless and noisy cases and establishes sharp restricted isometry conditions for sparse signal and low-rank matrix recovery. The analysis relies on a key technical tool which represents points in a polytope by convex combinations of sparse vectors. The technique is elementary but leads to sharp results. It is shown that, in compressed sensing, $\delta_k^A < 1/3$, $\delta_k^A + \theta_{k,k}^A < 1$, or $\delta_{tk}^A < \sqrt{(t-1)/t}$ for any given constant $t \ge 4/3$ guarantees the exact recovery of all $k$-sparse signals in the noiseless case through constrained $\ell_1$ minimization; similarly, in affine rank minimization, $\delta_r^M < 1/3$, $\delta_r^M + \theta_{r,r}^M < 1$, or $\delta_{tr}^M < \sqrt{(t-1)/t}$ ensures the exact reconstruction of all matrices with rank at most $r$ in the noiseless case via constrained nuclear norm minimization. Moreover, for any $\epsilon > 0$, $\delta_k^A < 1/3 + \epsilon$, $\delta_k^A + \theta_{k,k}^A < 1 + \epsilon$, or $\delta_{tk}^A < \sqrt{(t-1)/t} + \epsilon$ is not sufficient to guarantee the exact recovery of all $k$-sparse signals for large $k$; a similar result also holds for matrix recovery. In addition, the conditions $\delta_k^A < 1/3$, $\delta_k^A + \theta_{k,k}^A < 1$, $\delta_{tk}^A < \sqrt{(t-1)/t}$ and $\delta_r^M < 1/3$, $\delta_r^M + \theta_{r,r}^M < 1$, $\delta_{tr}^M < \sqrt{(t-1)/t}$ are also shown to be sufficient, respectively, for stable recovery of approximately sparse signals and low-rank matrices in the noisy case. For the second part of the thesis, we introduce a rank-one projection model for low-rank matrix recovery and propose a constrained nuclear norm minimization method for stable recovery of low-rank matrices in the noisy case. The procedure is adaptive to the rank and robust against small perturbations. Both upper and lower bounds for the estimation accuracy under the Frobenius norm loss are obtained. The proposed estimator is shown to be rate-optimal under certain conditions. The estimator is easy to implement via convex programming and performs well numerically. The techniques and main results developed in this part also have implications for other related statistical problems. An application to the estimation of spiked covariance matrices from one-dimensional random projections is considered. The results demonstrate that it is still possible to accurately estimate the covariance matrix of a high-dimensional distribution based only on one-dimensional projections. For the third part of the thesis, we consider another setting of low-rank matrix completion. The current literature on matrix completion focuses primarily on independent sampling models under which the individual observed entries are sampled independently. Motivated by applications in genomic data integration, we propose a new framework of structured matrix completion (SMC) to treat structured missingness by design. Specifically, our proposed method aims at efficient matrix recovery when a subset of the rows and columns of an approximately low-rank matrix are observed.
We provide theoretical justification for the proposed SMC method and derive lower bounds for the estimation errors, which together establish the optimal rate of recovery over certain classes of approximately low-rank matrices. Simulation studies show that the method performs well in finite samples under a variety of configurations. The method is applied to integrate several ovarian cancer genomic studies with different extents of genomic measurement, which enables us to construct more accurate prediction rules for ovarian cancer survival.
Sparse nonnegative matrix factorization with ℓ0-constraints
Peharz, Robert; Pernkopf, Franz
2012-01-01
Although nonnegative matrix factorization (NMF) favors a sparse and part-based representation of nonnegative data, there is no guarantee of this behavior. Several authors have proposed NMF methods which enforce sparseness by constraining or penalizing the ℓ1-norm of the factor matrices. On the other hand, little work has been done using a more natural sparseness measure, the ℓ0-pseudo-norm. In this paper, we propose a framework for approximate NMF which constrains the ℓ0-norm of the basis matrix, or of the coefficient matrix, respectively. For this purpose, techniques for unconstrained NMF can be easily incorporated, such as multiplicative update rules or the alternating nonnegative least-squares scheme. In experiments we demonstrate the benefits of our methods, which compare favorably to, or outperform, existing approaches. PMID:22505792
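A minimal sketch of one such ℓ0-constrained variant, assuming Lee-Seung multiplicative updates for the Euclidean cost followed by a hard projection that keeps the k largest entries of each basis column; the paper's actual update schemes may differ in detail.
```python
import numpy as np

def nmf_l0_basis(V, r, k, n_iter=200, eps=1e-9, seed=0):
    """NMF V ~ W H with at most k nonzeros per column of W: Lee-Seung
    multiplicative updates followed by a hard l0 projection of the basis."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W, H = rng.random((m, r)), rng.random((r, n))
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        idx = np.argsort(W, axis=0)[:-k, :]          # the m-k smallest per column
        np.put_along_axis(W, idx, 0.0, axis=0)       # hard projection onto the l0 ball
    return W, H

V = np.abs(np.random.default_rng(3).standard_normal((50, 40)))
W, H = nmf_l0_basis(V, r=5, k=10)
print(np.count_nonzero(W, axis=0))                   # at most 10 nonzeros each
```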
Network Inference via the Time-Varying Graphical Lasso
Hallac, David; Park, Youngsuk; Boyd, Stephen; Leskovec, Jure
2018-01-01
Many important problems can be modeled as a system of interconnected entities, where each entity is recording time-dependent observations or measurements. In order to spot trends, detect anomalies, and interpret the temporal dynamics of such data, it is essential to understand the relationships between the different entities and how these relationships evolve over time. In this paper, we introduce the time-varying graphical lasso (TVGL), a method of inferring time-varying networks from raw time series data. We cast the problem in terms of estimating a sparse time-varying inverse covariance matrix, which reveals a dynamic network of interdependencies between the entities. Since dynamic network inference is a computationally expensive task, we derive a scalable message-passing algorithm based on the Alternating Direction Method of Multipliers (ADMM) to solve this problem in an efficient way. We also discuss several extensions, including a streaming algorithm to update the model and incorporate new observations in real time. Finally, we evaluate our TVGL algorithm on both real and synthetic datasets, obtaining interpretable results and outperforming state-of-the-art baselines in terms of both accuracy and scalability. PMID:29770256
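The estimation problem can be summarized compactly; the following is a sketch of the TVGL objective, with notation assumed: S_i is the empirical covariance in time window i, the od,1 norm is the off-diagonal ℓ1 norm enforcing sparsity, and ψ penalizes changes between consecutive precision matrices (e.g., an ℓ1 penalty for few edge changes).
```latex
\min_{\Theta_1,\dots,\Theta_T \succ 0}\;
  \sum_{i=1}^{T}\Bigl(\operatorname{tr}(S_i\Theta_i)-\log\det\Theta_i
  +\lambda\,\lVert\Theta_i\rVert_{\mathrm{od},1}\Bigr)
  +\beta\sum_{i=2}^{T}\psi\bigl(\Theta_i-\Theta_{i-1}\bigr)
```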
Sample-Starved Large Scale Network Analysis
2016-05-05
As reported in our journal publication: G. Marjanovic and A. O. Hero, "l0 Sparse Inverse Covariance Estimation," IEEE Transactions on Signal Processing, vol. 63, no. 12, pp. 3218-3231, May 2015.
Multivariable frequency domain identification via 2-norm minimization
NASA Technical Reports Server (NTRS)
Bayard, David S.
1992-01-01
The author develops a computational approach to multivariable frequency domain identification, based on 2-norm minimization. In particular, a Gauss-Newton (GN) iteration is developed to minimize the 2-norm of the error between frequency domain data and a matrix fraction transfer function estimate. To improve the global performance of the optimization algorithm, the GN iteration is initialized using the solution to a particular sequentially reweighted least squares problem, denoted as the SK iteration. The least squares problems which arise from both the SK and GN iterations are shown to involve sparse matrices with identical block structure. A sparse matrix QR factorization method is developed to exploit the special block structure, and to efficiently compute the least squares solution. A numerical example involving the identification of a multiple-input multiple-output (MIMO) plant having 286 unknown parameters is given to illustrate the effectiveness of the algorithm.
A Sparsity-Promoted Method Based on Majorization-Minimization for Weak Fault Feature Enhancement
Ren, Bangyue; Hao, Yansong; Wang, Huaqing; Song, Liuyang; Tang, Gang; Yuan, Hongfang
2018-01-01
Fault transient impulses induced by faulty components in rotating machinery usually contain substantial interference. Fault features are comparatively weak in the initial fault stage, which renders fault diagnosis more difficult. In this case, a sparse representation method based on the Majorization-Minimization (MM) algorithm is proposed to enhance weak fault features and extract them from strong background noise. However, the traditional MM algorithm suffers from two issues: the choice of the sparse basis and complicated calculations. To address these challenges, a modified MM algorithm is proposed in which a sparse optimization objective function is designed first. Inspired by the Basis Pursuit (BP) model, the optimization function integrates an impulsive feature-preserving factor and a penalty function factor. Second, a modified majorization iterative method is applied to address the convex optimization problem of the designed function. A series of sparse coefficients containing only the transient components is obtained through iteration. Notably, there is no need to select the sparse basis in the proposed iterative method because it is fixed as a unit matrix; the reconstruction step is then omitted, which significantly increases detection efficiency. Eventually, envelope analysis of the sparse coefficients is performed to extract the weak fault features. Simulated and experimental signals from bearings and gearboxes are employed to validate the effectiveness of the proposed method. In addition, comparisons are made to show that the proposed method outperforms the traditional MM algorithm in terms of detection results and efficiency. PMID:29597280
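Because the sparse basis is fixed to a unit matrix, the MM surrogate minimization reduces to a proximal (soft-thresholding) step. The sketch below shows the generic MM/ISTA iteration for the BP-style objective 0.5||y - Ax||^2 + lam||x||_1 and its collapse to a single threshold when A = I; thresholds and signal parameters are illustrative assumptions, not the paper's exact algorithm.
```python
import numpy as np

def soft(x, t):
    """Soft-thresholding, the proximal map of the l1 penalty."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def mm_ista(A, y, lam, n_iter=200):
    """Generic MM for 0.5||y - Ax||^2 + lam||x||_1: majorizing the quadratic
    term around the current iterate yields the classic ISTA update."""
    L = np.linalg.norm(A, 2) ** 2            # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = soft(x - A.T @ (A @ x - y) / L, lam / L)
    return x

# With the sparse basis fixed to a unit matrix (A = I), as in the paper, the
# MM step collapses to a single soft-threshold of the raw signal:
rng = np.random.default_rng(4)
y = 0.3 * rng.standard_normal(1000)
y[::100] += 3.0                              # periodic fault impulses in noise
x_sparse = soft(y, 0.9)                      # interference suppressed, impulses kept
```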
On the efficiency of a randomized mirror descent algorithm in online optimization problems
NASA Astrophysics Data System (ADS)
Gasnikov, A. V.; Nesterov, Yu. E.; Spokoiny, V. G.
2015-04-01
A randomized online version of the mirror descent method is proposed. It differs from existing versions in the randomization method: randomization is performed at the stage of projecting a subgradient of the function being optimized onto the unit simplex, rather than at the stage of computing a subgradient, as is common practice. As a result, a componentwise subgradient descent with a randomly chosen component is obtained, which admits an online interpretation. This observation, for example, has made it possible to uniformly interpret results on weighting expert decisions and to propose a highly efficient method for finding an equilibrium in a zero-sum two-person matrix game with a sparse matrix.
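A hedged sketch of the idea in the matrix-game setting: both players run entropic mirror descent (multiplicative weights) on the simplex, but the subgradient is replaced by a single randomly sampled row or column, so an iteration on a sparse game touches only a few matrix entries. This illustrates the randomization principle, not the authors' exact algorithm.
```python
import numpy as np

def randomized_mw_game(A, n_iter=20000, rng=None):
    """Stochastic multiplicative-weights for the game min_x max_y x^T A y:
    each player samples one pure strategy of the opponent per step."""
    rng = rng or np.random.default_rng(0)
    m, n = A.shape
    x, y = np.ones(m) / m, np.ones(n) / n
    x_avg, y_avg = np.zeros(m), np.zeros(n)
    eta = np.sqrt(np.log(max(m, n)) / n_iter)
    for _ in range(n_iter):
        j = rng.choice(n, p=y)                       # sampled column of A
        i = rng.choice(m, p=x)                       # sampled row of A
        x *= np.exp(-eta * A[:, j]); x /= x.sum()    # minimizer's entropy step
        y *= np.exp(eta * A[i, :]);  y /= y.sum()    # maximizer's entropy step
        x_avg += x / n_iter; y_avg += y / n_iter
    return x_avg, y_avg

rng = np.random.default_rng(10)
A = rng.standard_normal((30, 30))
x, y = randomized_mw_game(A)
print(float(np.max(A.T @ x) - np.min(A @ y)))        # duality gap, small near equilibrium
```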
A manual for PARTI runtime primitives
NASA Technical Reports Server (NTRS)
Berryman, Harry; Saltz, Joel
1990-01-01
Primitives are presented that are designed to help users efficiently program irregular problems (e.g., unstructured mesh sweeps, sparse matrix codes, adaptive mesh partial differential equations solvers) on distributed memory machines. These primitives are also designed for use in compilers for distributed memory multiprocessors. Communications patterns are captured at runtime, and the appropriate send and receive messages are automatically generated.
NASA Astrophysics Data System (ADS)
Jiang, Jinghui; Zhou, Han; Ding, Jian; Zhang, Fan; Fan, Tongxiang; Zhang, Di
2015-08-01
A bio-template approach was employed to construct an inverse-V-type TiO2-based photocatalyst with well-distributed AgBr in the TiO2 matrix, using dead Troides helena wings with inverse-V-type scales as the template. A cross-linked titanium precursor with a homogeneous hydrolysis rate, good fluidity, and low viscosity was employed to facilitate a perfect duplication of the template and the dispersion of AgBr, after appropriate pretreatment of the template with alkali and acid. The as-synthesized inverse-V-type TiO2/AgBr can be turned into inverse-V-type TiO2/Ag0 through AgBr photolysis during photocatalysis, achieving in situ deposition of Ag0 in the TiO2 matrix and thereby avoiding deformation of the surface microstructure inherited from the template. The results showed that the cooperation of the perfect inverse-V-type structure and the well-distributed TiO2/Ag0 microstructures can efficiently boost photosynthetic water oxidation compared to non-inverse-V-type TiO2/Ag0 and to TiO2/Ag0 prepared without a template. The anti-reflection function of the inverse-V-type structure and the plasmonic effect of Ag0 may account for the enhanced photon capture and efficient photoelectric conversion.
Luo, Xin; You, Zhuhong; Zhou, Mengchu; Li, Shuai; Leung, Hareton; Xia, Yunni; Zhu, Qingsheng
2015-01-09
The comprehensive mapping of protein-protein interactions (PPIs) is highly desired for gaining deep insights into both fundamental cell biology processes and the pathology of diseases. Finely tuned small-scale experiments are not only very expensive but also inefficient for identifying numerous interactomes, despite their high accuracy. High-throughput screening techniques enable efficient identification of PPIs; yet the desire to further extract useful knowledge from these data leads to the problem of binary interactome mapping. Network topology-based approaches prove to be highly efficient in addressing this problem; however, their performance deteriorates significantly on sparse putative PPI networks. Motivated by the success of collaborative filtering (CF)-based approaches to the problem of personalized recommendation on large, sparse rating matrices, this work implements a highly efficient CF-based approach to binary interactome mapping. To achieve this, we first propose a CF framework for the task. Under this framework, we model the given data as an interactome weight matrix, from which the feature vectors of the involved proteins are extracted. With them, we design the rescaled cosine coefficient to model the inter-neighborhood similarity among the involved proteins for carrying out the mapping process. Experimental results on three large, sparse datasets demonstrate that the proposed approach significantly outperforms several sophisticated topology-based approaches.
Background recovery via motion-based robust principal component analysis with matrix factorization
NASA Astrophysics Data System (ADS)
Pan, Peng; Wang, Yongli; Zhou, Mingyuan; Sun, Zhipeng; He, Guoping
2018-03-01
Background recovery is a key technique in video analysis, but it still suffers from many challenges, such as camouflage, lighting changes, and diverse types of image noise. Robust principal component analysis (RPCA), which aims to recover a low-rank matrix and a sparse matrix, is a general framework for background recovery. The nuclear norm is widely used as a convex surrogate for the rank function in RPCA, which requires computing the singular value decomposition (SVD), a task that is increasingly costly as matrix sizes and ranks increase. However, matrix factorization greatly reduces the dimension of the matrix for which the SVD must be computed. Motion information has been shown to improve low-rank matrix recovery in RPCA, but this method still finds it difficult to handle original video data sets because of its batch-mode formulation and implementation. Hence, in this paper, we propose a motion-assisted RPCA model with matrix factorization (FM-RPCA) for background recovery. Moreover, an efficient linear alternating direction method of multipliers with a matrix factorization (FL-ADM) algorithm is designed for solving the proposed FM-RPCA model. Experimental results illustrate that the method provides stable results and is more efficient than the current state-of-the-art algorithms.
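For orientation, a naive factorization-flavored RPCA loop is sketched below: alternate a rank-r fit of D - S with soft-thresholding of the residual. For clarity the rank-r fit uses a truncated SVD; the paper's FL-ADM instead maintains an explicit low-rank factorization to avoid SVDs of the full matrix, and adds motion weighting.
```python
import numpy as np

def soft(x, t):
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def rpca_sketch(D, rank, lam=None, n_iter=50):
    """Alternate a rank-r fit of D - S with soft-thresholding of D - L.
    L collects the (static) background, S the sparse foreground."""
    m, n = D.shape
    lam = lam or 1.0 / np.sqrt(max(m, n))    # standard RPCA weight
    S = np.zeros_like(D)
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(D - S, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        S = soft(D - L, lam)
    return L, S

# toy usage: rank-1 "background" plus a sparse "foreground" patch
rng = np.random.default_rng(12)
bg = np.outer(rng.random(60), rng.random(40))
fg = np.zeros((60, 40)); fg[10:14, 5:8] = 2.0
L, S = rpca_sketch(bg + fg, rank=1)
```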
Turbo-SMT: Parallel Coupled Sparse Matrix-Tensor Factorizations and Applications
Papalexakis, Evangelos E.; Faloutsos, Christos; Mitchell, Tom M.; Talukdar, Partha Pratim; Sidiropoulos, Nicholas D.; Murphy, Brian
2016-01-01
How can we correlate the neural activity in the human brain as it responds to typed words with properties of these terms (like 'edible', 'fits in hand')? In short, we want to find latent variables that jointly explain both the brain activity and the behavioral responses. This is one of many settings of the Coupled Matrix-Tensor Factorization (CMTF) problem. Can we enhance any CMTF solver so that it can operate on potentially very large datasets that may not fit in main memory? We introduce Turbo-SMT, a meta-method capable of doing exactly that: it boosts the performance of any CMTF algorithm, parallelizing it and producing sparse and interpretable solutions, with speedups of up to 65-fold. Additionally, we improve upon ALS, the work-horse algorithm for CMTF, with respect to efficiency and robustness to missing values. We apply Turbo-SMT to BrainQ, a dataset consisting of a (nouns, brain voxels, human subjects) tensor and a (nouns, properties) matrix, with coupling along the nouns dimension. Turbo-SMT is able to find meaningful latent variables, as well as to predict brain activity with competitive accuracy. Finally, we demonstrate the generality of Turbo-SMT by applying it to a Facebook dataset (users, 'friends', wall-postings); there, Turbo-SMT spots spammer-like anomalies. PMID:27672406
Weighted low-rank sparse model via nuclear norm minimization for bearing fault detection
NASA Astrophysics Data System (ADS)
Du, Zhaohui; Chen, Xuefeng; Zhang, Han; Yang, Boyuan; Zhai, Zhi; Yan, Ruqiang
2017-07-01
It is a fundamental task in the machine fault diagnosis community to detect impulsive signatures generated by localized faults in bearings. The main goal of this paper is to exploit the low-rank physical structure of periodic impulsive features and establish a weighted low-rank sparse model for bearing fault detection. The proposed model consists of three basic components: an adaptive partition window, a nuclear norm regularization and a weighted sequence. Firstly, owing to the periodic repetition of the impulsive feature, an adaptive partition window can be designed to transform the impulsive feature into a data matrix; the purpose of the partition window is to accumulate all local feature information and align it. All columns of the data matrix then share similar waveforms and a core physical phenomenon arises: the singular values of the data matrix exhibit a sparse distribution pattern. Therefore, a nuclear norm regularization is enforced to capture that sparse prior. However, the nuclear norm regularization treats all singular values equally and thus ignores the basic fact that larger singular values carry more information about the impulsive features and should be preserved as much as possible. Therefore, a weighted sequence with adaptively tuned weights, inversely proportional to the singular amplitudes, is adopted to guarantee the distribution consistency of the large singular values. On the other hand, the proposed model is difficult to solve due to its non-convexity, and thus a new algorithm is developed to find a satisfactory stationary solution by alternately applying a proximal operator step and least-squares fitting. Moreover, the sensitivity and selection principles of the algorithmic parameters are comprehensively investigated through a set of numerical experiments, which show that the proposed method is robust and has only a few adjustable parameters. Lastly, the proposed model is applied to wind turbine (WT) bearing fault detection and its effectiveness is thoroughly verified. Compared with currently popular bearing fault diagnosis techniques, wavelet analysis and spectral kurtosis, our model achieves a higher diagnostic accuracy.
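The two central components admit a compact sketch: a partition window that stacks one impulse period per column, and a weighted singular-value thresholding with weights inversely proportional to singular amplitude. The shrinkage rule below is a plausible reading of the abstract, not the paper's exact operator.
```python
import numpy as np

def partition_window(signal, period):
    """Adaptive partition window: stack one impulse period per column so the
    columns share similar waveforms and the matrix is approximately low-rank."""
    n = len(signal) // period
    return signal[: n * period].reshape(n, period).T

def weighted_svt(Y, lam, eps=1e-6):
    """Weighted singular-value thresholding: weights inversely proportional
    to singular amplitude, so large, feature-bearing singular values are
    shrunk less and thus preserved."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    w = 1.0 / (s + eps)                              # adaptive weights
    return (U * np.maximum(s - lam * w, 0.0)) @ Vt
```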
Decomposed direct matrix inversion for fast non-cartesian SENSE reconstructions.
Qian, Yongxian; Zhang, Zhenghui; Wang, Yi; Boada, Fernando E
2006-08-01
A new k-space direct matrix inversion (DMI) method is proposed here to accelerate non-Cartesian SENSE reconstructions. In this method a global k-space matrix equation is established on basic MRI principles, and the inverse of the global encoding matrix is found from a set of local matrix equations by taking advantage of the small extension of k-space coil maps. The DMI algorithm's efficiency is achieved by reloading the precalculated global inverse when the coil maps and trajectories remain unchanged, such as in dynamic studies. Phantom and human subject experiments were performed on a 1.5T scanner with a standard four-channel phased-array cardiac coil. Interleaved spiral trajectories were used to collect fully sampled and undersampled 3D raw data. The equivalence of the global k-space matrix equation to its image-space version was verified via conjugate gradient (CG) iterative algorithms on 2x undersampled phantom and numerical-model data sets. When applied to the 2x undersampled phantom and human-subject raw data, the decomposed DMI method produced images with small errors (< or = 3.9%) relative to the reference images obtained from the fully sampled data, at a rate of 2 s per slice (excluding 4 min for precalculating the global inverse at an image size of 256 x 256). The DMI method may be useful for noise evaluations in parallel coil designs, dynamic MRI, and 3D sodium MRI with fixed coils and trajectories. Copyright 2006 Wiley-Liss, Inc.
Kim, Hye-Na; Yoo, Haemin; Moon, Jun Hyuk
2013-05-21
We demonstrated the preparation of graphene-embedded 3D inverse opal electrodes for use in DSSCs. The graphene was incorporated locally into the top layers of the inverse opal structures and was embedded into the TiO2 matrix via post-treatment of the TiO2 precursors. DSSCs comprising the bare and 1-5 wt% graphene-incorporated TiO2 inverse opal electrodes were compared. We observed that the local arrangement of graphene sheets effectively enhanced electron transport without significantly reducing light harvesting by the dye molecules. A high efficiency of 7.5% was achieved in DSSCs prepared with the 3 wt% graphene-incorporated TiO2 inverse opal electrodes, constituting a 50% increase over the efficiencies of DSSCs prepared without graphene. The increase in efficiency was mainly attributed to an increase in J_SC, as determined by the photovoltaic parameters and the electrochemical impedance spectroscopy analysis.
NASA Astrophysics Data System (ADS)
Yihaa Roodhiyah, Lisa’; Tjong, Tiffany; Nurhasan; Sutarno, D.
2018-04-01
In previous research, the linear systems of the vector finite element method for two-dimensional (2-D) magnetotelluric (MT) response modeling were solved with a non-sparse direct solver in TE mode. However, that approach has weaknesses that need to be addressed, notably insufficient accuracy at low frequencies (10^-3 Hz-10^-5 Hz) and high computational cost on dense meshes. In this work, a sparse direct solver is used instead of a non-sparse direct solver to overcome these weaknesses. A sparse direct solver is advantageous for the linear systems of the vector finite element method because the matrices are symmetric and sparse. The sparse direct solver was validated on a homogeneous half-space model and a vertical contact model against analytical solutions. The validation results show that the sparse direct solver is more stable than the non-sparse direct solver for the linear problems of the vector finite element method, especially at low frequencies. In the end, accurate 2-D MT response modeling at low frequencies (10^-3 Hz-10^-5 Hz) was achieved with efficient array memory allocation and less computation time.
Factor Analysis by Generalized Least Squares.
ERIC Educational Resources Information Center
Joreskog, Karl G.; Goldberger, Arthur S.
Aitken's generalized least squares (GLS) principle, with the inverse of the observed variance-covariance matrix as a weight matrix, is applied to estimate the factor analysis model in the exploratory (unrestricted) case. It is shown that the GLS estimates are scale free and asymptotically efficient. The estimates are computed by a rapidly…
Efficient 3D inversions using the Richards equation
NASA Astrophysics Data System (ADS)
Cockett, Rowan; Heagy, Lindsey J.; Haber, Eldad
2018-07-01
Fluid flow in the vadose zone is governed by the Richards equation; it is parameterized by hydraulic conductivity, which is a nonlinear function of pressure head. Investigations in the vadose zone typically require characterizing distributed hydraulic properties. Water content or pressure head data may include direct measurements made from boreholes. Increasingly, proxy measurements from hydrogeophysics are being used to supply more spatially and temporally dense data sets. Inferring hydraulic parameters from such datasets requires the ability to efficiently solve and optimize the nonlinear time-domain Richards equation. This is particularly important as the number of parameters to be estimated in a vadose zone inversion continues to grow. In this paper, we describe an efficient technique to invert for distributed hydraulic properties in 1D, 2D, and 3D. Our technique does not store the Jacobian matrix, but rather computes its product with a vector. The existing literature on Richards equation inversion explicitly calculates the sensitivity matrix using finite differences or automatic differentiation; however, for large-scale problems these methods are constrained by computation and/or memory. Using an implicit sensitivity algorithm enables large-scale inversion problems for any distributed hydraulic parameters in the Richards equation to become tractable on modest computational resources. We provide an open source implementation of our technique based on the SimPEG framework, and show it in practice for a 3D inversion of saturated hydraulic conductivity using water content data through time.
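The matrix-free idea can be illustrated with a toy forward model: only Jacobian-vector and adjoint products are coded, and the Gauss-Newton system is solved by conjugate gradients through a LinearOperator, so the Jacobian is never formed. The model f(m) = A m^2 below is a stand-in for the Richards solver, not SimPEG's API.
```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

rng = np.random.default_rng(5)
A = rng.standard_normal((30, 80))
m0 = rng.random(80)
f = lambda m: A @ m**2                 # toy nonlinear forward model

def Jvec(v):                           # J v, analytic here: J = A @ diag(2 m0)
    return A @ (2.0 * m0 * v)

def JTvec(w):                          # adjoint product J^T w
    return 2.0 * m0 * (A.T @ w)

# Gauss-Newton step solves (J^T J + beta I) dm = J^T r without storing J
d_obs = f(m0) + 0.01 * rng.standard_normal(30)
r = d_obs - f(m0)
beta = 1e-2
Hop = LinearOperator((80, 80), matvec=lambda v: JTvec(Jvec(v)) + beta * v)
dm, info = cg(Hop, JTvec(r))
print(info)                            # 0 indicates CG converged
```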
Sparse Covariance Matrix Estimation With Eigenvalue Constraints
LIU, Han; WANG, Lie; ZHAO, Tuo
2014-01-01
We propose a new approach for estimating high-dimensional, positive-definite covariance matrices. Our method extends the generalized thresholding operator by adding an explicit eigenvalue constraint. The estimated covariance matrix simultaneously achieves sparsity and positive definiteness. The estimator is rate optimal in the minimax sense and we develop an efficient iterative soft-thresholding and projection algorithm based on the alternating direction method of multipliers. Empirically, we conduct thorough numerical experiments on simulated datasets as well as real data examples to illustrate the usefulness of our method. Supplementary materials for the article are available online. PMID:25620866
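A naive alternating sketch of the two ingredients, thresholding for sparsity and eigenvalue clipping for positive definiteness, is given below; the paper's actual ADMM algorithm couples these steps with dual updates and carries rate-optimality guarantees that this toy loop does not.
```python
import numpy as np

def constrained_sparse_cov(S, lam, eps=1e-3, n_iter=50):
    """Alternate (i) soft-thresholding of off-diagonal entries with
    (ii) clipping of eigenvalues at eps, giving a sparse-and-positive-definite
    covariance estimate."""
    Sigma = S.copy()
    for _ in range(n_iter):
        T = np.sign(Sigma) * np.maximum(np.abs(Sigma) - lam, 0.0)
        np.fill_diagonal(T, np.diag(Sigma))          # leave the diagonal alone
        lamb, V = np.linalg.eigh(T)
        Sigma = (V * np.maximum(lamb, eps)) @ V.T    # project onto the PD cone
    return Sigma

rng = np.random.default_rng(11)
X = rng.standard_normal((40, 60))                    # n = 40 samples, p = 60
Sigma_hat = constrained_sparse_cov(np.cov(X, rowvar=False), lam=0.2)
print(np.linalg.eigvalsh(Sigma_hat).min())           # >= eps by construction
```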
Efficient quantum circuits for dense circulant and circulant like operators
Zhou, S. S.
2017-01-01
Circulant matrices are an important family of operators, which have a wide range of applications in science and engineering-related fields. They are, in general, non-sparse and non-unitary. In this paper, we present efficient quantum circuits to implement circulant operators using fewer resources and with lower complexity than existing methods. Moreover, our quantum circuits can be readily extended to the implementation of Toeplitz, Hankel and block circulant matrices. Efficient quantum algorithms to implement the inverses and products of circulant operators are also provided, and an example application in solving the equation of motion for cyclic systems is discussed. PMID:28572988
Convergence Speed of a Dynamical System for Sparse Recovery
NASA Astrophysics Data System (ADS)
Balavoine, Aurele; Rozell, Christopher J.; Romberg, Justin
2013-09-01
This paper studies the convergence rate of a continuous-time dynamical system for L1-minimization, known as the Locally Competitive Algorithm (LCA). Solving L1-minimization problems efficiently and rapidly is of great interest to the signal processing community, as these programs have been shown to recover sparse solutions to underdetermined systems of linear equations and come with strong performance guarantees. The LCA under study differs from the typical L1 solver in that it operates in continuous time: instead of being specified by discrete iterations, it evolves according to a system of nonlinear ordinary differential equations. The LCA is constructed from simple components, giving it the potential to be implemented as a large-scale analog circuit. The goal of this paper is to give guarantees on the convergence time of the LCA system. To do so, we analyze how the LCA evolves as it is recovering a sparse signal from underdetermined measurements. We show that under appropriate conditions on the measurement matrix and the problem parameters, the path the LCA follows can be described as a sequence of linear differential equations, each with a small number of active variables. This allows us to relate the convergence time of the system to the restricted isometry constant of the matrix. Interesting parallels to sparse-recovery digital solvers emerge from this study. Our analysis covers both the noisy and noiseless settings and is supported by simulation results.
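A forward-Euler simulation of the LCA dynamics makes the setup concrete; the dynamics below follow the standard LCA form with a soft-threshold activation, while the step size, threshold, and dictionary are illustrative assumptions.
```python
import numpy as np

def soft(u, lam):
    return np.sign(u) * np.maximum(np.abs(u) - lam, 0.0)

def lca(Phi, y, lam=0.1, tau=1.0, dt=0.01, n_steps=2000):
    """Euler simulation of the LCA ODE:
    tau u' = Phi^T y - u - (Phi^T Phi - I) a,   a = soft(u, lam).
    The steady-state output a approximates the l1 (BPDN) solution."""
    n = Phi.shape[1]
    u = np.zeros(n)
    G = Phi.T @ Phi - np.eye(n)          # lateral inhibition weights
    b = Phi.T @ y                        # feed-forward drive
    for _ in range(n_steps):
        a = soft(u, lam)
        u += (dt / tau) * (b - u - G @ a)
    return soft(u, lam)

rng = np.random.default_rng(6)
Phi = rng.standard_normal((50, 200)) / np.sqrt(50)
x = np.zeros(200); x[[5, 50, 150]] = [1.0, -1.2, 0.8]
a = lca(Phi, Phi @ x)
print(np.nonzero(np.abs(a) > 0.05)[0])   # active set, ideally the true support
```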
Semi-automatic sparse preconditioners for high-order finite element methods on non-uniform meshes
NASA Astrophysics Data System (ADS)
Austin, Travis M.; Brezina, Marian; Jamroz, Ben; Jhurani, Chetan; Manteuffel, Thomas A.; Ruge, John
2012-05-01
High-order finite elements often have a higher accuracy per degree of freedom than the classical low-order finite elements. However, in the context of implicit time-stepping methods, high-order finite elements present challenges to the construction of efficient simulations due to the high cost of inverting the denser finite element matrix. There are many cases where simulations are limited by the memory required to store the matrix and/or the algorithmic components of the linear solver. We are particularly interested in preconditioned Krylov methods for linear systems generated by discretization of elliptic partial differential equations with high-order finite elements. Using a preconditioner like Algebraic Multigrid can be costly in terms of memory due to the need to store matrix information at the various levels. We present a novel method for defining a preconditioner for systems generated by high-order finite elements that is based on a much sparser system than the original high-order finite element system. We investigate the performance for non-uniform meshes on a cube and a cubed sphere mesh, showing that the sparser preconditioner is more efficient and uses significantly less memory. Finally, we explore new methods to construct the sparse preconditioner and examine their effectiveness for non-uniform meshes. We compare results to a direct use of Algebraic Multigrid as a preconditioner and to a two-level additive Schwarz method.
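The flavor of the approach, preconditioning a denser system with a factorization of a much sparser surrogate, can be mimicked in a few lines; here the sparse surrogate is obtained by simply dropping small entries, which is only a placeholder for the paper's construction from the finite element discretization.
```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spilu, LinearOperator, cg

n = 400
rng = np.random.default_rng(7)
B = sp.random(n, n, density=0.10, random_state=7)
K = (B @ B.T + 10 * sp.eye(n)).tocsc()       # SPD stand-in for the high-order system
# sparser surrogate: drop small couplings (placeholder for a rediscretization)
K_sparse = (K.multiply(abs(K) > 0.05) + 1e-8 * sp.eye(n)).tocsc()
M = LinearOperator((n, n), matvec=spilu(K_sparse).solve)  # factor only the surrogate
b = rng.standard_normal(n)
x, info = cg(K, b, M=M)
print(info)                                   # 0 indicates convergence
```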
NASA Astrophysics Data System (ADS)
Klein, Ole; Cirpka, Olaf A.; Bastian, Peter; Ippisch, Olaf
2017-04-01
In the geostatistical inverse problem of subsurface hydrology, continuous hydraulic parameter fields, in most cases hydraulic conductivity, are estimated from measurements of dependent variables, such as hydraulic heads, under the assumption that the parameter fields are autocorrelated random space functions. Upon discretization, the continuous fields become large parameter vectors with O(10^4-10^7) elements. While cokriging-like inversion methods have been shown to be efficient for highly resolved parameter fields when the number of measurements is small, they require the calculation of the sensitivity of each measurement with respect to all parameters, which may become prohibitive with large sets of measured data such as those arising from transient groundwater flow. We present a Preconditioned Conjugate Gradient method for the geostatistical inverse problem, in which a single adjoint equation needs to be solved to obtain the gradient of the objective function. Using the autocovariance matrix of the parameters as preconditioning matrix, expensive multiplications with its inverse can be avoided, and the number of iterations is significantly reduced. We use a randomized spectral decomposition of the posterior covariance matrix of the parameters to perform a linearized uncertainty quantification of the parameter estimate. The feasibility of the method is tested by virtual examples of head observations in steady-state and transient groundwater flow. These synthetic tests demonstrate that transient data can reduce both parameter uncertainty and time spent conducting experiments, while the presented methods are able to handle the resulting large number of measurements.
Das, Kiranmoy; Daniels, Michael J.
2014-01-01
Estimation of the covariance structure for irregular sparse longitudinal data has been studied by many authors in recent years but typically using fully parametric specifications. In addition, when data are collected from several groups over time, it is known that assuming the same or completely different covariance matrices over groups can lead to loss of efficiency and/or bias. Nonparametric approaches have been proposed for estimating the covariance matrix for regular univariate longitudinal data by sharing information across the groups under study. For the irregular case, with longitudinal measurements that are bivariate or multivariate, modeling becomes more difficult. In this article, to model bivariate sparse longitudinal data from several groups, we propose a flexible covariance structure via a novel matrix stick-breaking process for the residual covariance structure and a Dirichlet process mixture of normals for the random effects. Simulation studies are performed to investigate the effectiveness of the proposed approach over more traditional approaches. We also analyze a subset of Framingham Heart Study data to examine how the blood pressure trajectories and covariance structures differ for the patients from different BMI groups (high, medium and low) at baseline. PMID:24400941
Zhang, Zhilin; Jung, Tzyy-Ping; Makeig, Scott; Rao, Bhaskar D
2013-02-01
Fetal ECG (FECG) telemonitoring is an important branch in telemedicine. The design of a telemonitoring system via a wireless body area network with low energy consumption for ambulatory use is highly desirable. As an emerging technique, compressed sensing (CS) shows great promise in compressing/reconstructing data with low energy consumption. However, due to some specific characteristics of raw FECG recordings such as nonsparsity and strong noise contamination, current CS algorithms generally fail in this application. This paper proposes to use the block sparse Bayesian learning framework to compress/reconstruct nonsparse raw FECG recordings. Experimental results show that the framework can reconstruct the raw recordings with high quality. Especially, the reconstruction does not destroy the interdependence relation among the multichannel recordings. This ensures that the independent component analysis decomposition of the reconstructed recordings has high fidelity. Furthermore, the framework allows the use of a sparse binary sensing matrix with much fewer nonzero entries to compress recordings. Particularly, each column of the matrix can contain only two nonzero entries. This shows that the framework, compared to other algorithms such as current CS algorithms and wavelet algorithms, can greatly reduce code execution in CPU in the data compression stage.
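Such a sensing matrix is easy to construct: place exactly two unit entries at random rows of each column. The sketch below is a generic construction consistent with the description; the paper's exact matrix design may differ.
```python
import numpy as np

def sparse_binary_sensing_matrix(m, n, d=2, rng=None):
    """Each column gets exactly d unit entries at random rows, so computing
    y = Phi @ x costs only d additions per input sample."""
    rng = rng or np.random.default_rng(0)
    Phi = np.zeros((m, n))
    for j in range(n):
        Phi[rng.choice(m, size=d, replace=False), j] = 1.0
    return Phi

Phi = sparse_binary_sensing_matrix(m=128, n=512)     # roughly 4x compression
print(int((Phi != 0).sum(axis=0).max()))             # 2 nonzeros per column
```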
Technical note: an R package for fitting sparse neural networks with application in animal breeding.
Wang, Yangfan; Mi, Xue; Rosa, Guilherme J M; Chen, Zhihui; Lin, Ping; Wang, Shi; Bao, Zhenmin
2018-05-04
Neural networks (NNs) have emerged as a new tool for genomic selection (GS) in animal breeding. However, the properties of NNs used in GS for the prediction of phenotypic outcomes are not well characterized, due to the over-parameterization of NNs and the difficulty of using whole-genome marker sets as high-dimensional NN input. In this note, we have developed an R package called snnR that finds an optimal sparse structure of an NN by minimizing the squared error subject to a penalty on the L1-norm of the parameters (weights and biases), thereby solving the problem of over-parameterization in NNs. We have also tested some models fitted with the snnR package to demonstrate their feasibility and effectiveness in several example cases. In comparison with the R package brnn (Bayesian regularized single-layer NNs), with both using the entries of a genotype matrix or a genomic relationship matrix as inputs, snnR greatly improves the computational efficiency and the prediction ability for GS in animal breeding, because snnR implements a sparse NN with many hidden layers.
Color normalization of histology slides using graph regularized sparse NMF
NASA Astrophysics Data System (ADS)
Sha, Lingdao; Schonfeld, Dan; Sethi, Amit
2017-03-01
Computer-based automatic medical image processing and quantification are becoming popular in digital pathology. However, the preparation of histology slides can vary widely due to differences in staining equipment, procedures and reagents, which can reduce the accuracy of algorithms that analyze their color and texture information. To reduce the unwanted color variations, various supervised and unsupervised color normalization methods have been proposed. Compared with supervised color normalization methods, unsupervised methods have the advantages of time and cost efficiency and universal applicability. Most of the unsupervised color normalization methods for histology are based on stain separation. Based on the facts that stain concentration cannot be negative and that different parts of the tissue absorb different stains, nonnegative matrix factorization (NMF), and in particular its sparse version (SNMF), are good candidates for stain separation. However, most of the existing unsupervised color normalization methods, such as PCA, ICA, NMF and SNMF, fail to consider important information about the sparse manifolds that the pixels occupy, which can result in a loss of texture information during color normalization. Manifold learning methods like the graph Laplacian have proven to be very effective in interpreting high-dimensional data. In this paper, we propose a novel unsupervised stain separation method called graph regularized sparse nonnegative matrix factorization (GSNMF). By considering the sparse prior of stain concentration together with manifold information from the high-dimensional image data, our method shows better performance in stain color deconvolution than existing unsupervised color deconvolution methods, especially in preserving connected texture information. To utilize the texture information, we construct a nearest-neighbor graph between pixels within a spatial area of an image based on their distances, using a heat kernel in lαβ space. The representation of a pixel in the stain density space is constrained to follow the feature distances of the pixel to the pixels in the neighborhood graph. Using the color matrix transfer method with the stain concentrations found by our GSNMF method, the color normalization performance was also better than that of existing methods.
Method and apparatus for optimized processing of sparse matrices
Taylor, Valerie E.
1993-01-01
A computer architecture for processing a sparse matrix is disclosed. The apparatus stores a value-row vector corresponding to nonzero values of a sparse matrix. Each of the nonzero values is located at a defined row and column position in the matrix. The value-row vector includes a first vector including nonzero values and delimiting characters indicating a transition from one column to another. The value-row vector also includes a second vector which defines row position values in the matrix corresponding to the nonzero values in the first vector and column position values in the matrix corresponding to the column position of the nonzero values in the first vector. The architecture also includes a circuit for detecting a special character within the value-row vector. Matrix-vector multiplication is executed on the value-row vector. This multiplication is performed by multiplying an index value of the first vector value by a column value from a second matrix to form a matrix-vector product which is added to a previous matrix-vector product.
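A software emulation of the disclosed format clarifies the matvec: values are stored column by column with a delimiter at each column transition, alongside their row indices. The sketch below uses a Python sentinel in place of the hardware's special character.
```python
import numpy as np

def build_value_row_vector(A):
    """Walk the matrix column by column, storing nonzero values plus a
    sentinel (None) at each column transition, with parallel row indices."""
    values, rows = [], []
    for j in range(A.shape[1]):
        for i in np.nonzero(A[:, j])[0]:
            values.append(A[i, j]); rows.append(i)
        values.append(None); rows.append(-1)          # column delimiter
    return values, rows

def matvec(values, rows, x, m):
    """y = A @ x from the value-row vectors: each stored value is multiplied
    by the x-entry of its column and accumulated into its row of y."""
    y = np.zeros(m)
    j = 0
    for v, i in zip(values, rows):
        if v is None:
            j += 1                      # delimiter detected: next column
        else:
            y[i] += v * x[j]
    return y

A = np.array([[1.0, 0, 2], [0, 3.0, 0], [0, 0, 4.0]])
vals, rows = build_value_row_vector(A)
print(matvec(vals, rows, np.array([1.0, 2.0, 3.0]), 3))   # [7. 6. 12.]
```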
A manual for PARTI runtime primitives, revision 1
NASA Technical Reports Server (NTRS)
Das, Raja; Saltz, Joel; Berryman, Harry
1991-01-01
Primitives are presented that are designed to help users efficiently program irregular problems (e.g., unstructured mesh sweeps, sparse matrix codes, adaptive mesh partial differential equations solvers) on distributed memory machines. These primitives are also designed for use in compilers for distributed memory multiprocessors. Communications patterns are captured at runtime, and the appropriate send and receive messages are automatically generated.
Adaptive OFDM Waveform Design for Spatio-Temporal-Sparsity Exploited STAP Radar
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sen, Satyabrata
In this chapter, we describe a sparsity-based space-time adaptive processing (STAP) algorithm to detect a slowly moving target using an orthogonal frequency division multiplexing (OFDM) radar. The motivation for employing an OFDM signal is that it improves target detectability in the presence of interfering signals by increasing the frequency diversity of the system. However, due to the addition of one extra dimension in terms of frequency, the adaptive degrees of freedom in an OFDM-STAP also increase. Therefore, to avoid constructing a fully adaptive OFDM-STAP, we develop a sparsity-based STAP algorithm. We observe that the interference spectrum is inherently sparse in the spatio-temporal domain, as the clutter responses occupy only a diagonal ridge on the spatio-temporal plane and the jammer signals interfere only from a few spatial directions. Hence, we exploit that sparsity to develop an efficient STAP technique that utilizes a considerably smaller number of secondary data samples than other existing STAP techniques, and produces nearly optimum STAP performance. In addition to designing the STAP filter, we optimally design the transmit OFDM signals by maximizing the output signal-to-interference-plus-noise ratio (SINR) in order to improve the STAP performance. The computation of the output SINR depends on the estimated value of the interference covariance matrix, which we obtain by applying the sparse recovery algorithm. Therefore, we analytically assess the effects of the synthesized OFDM coefficients on the sparse recovery of the interference covariance matrix by computing the coherence measure of the sparse measurement matrix. Our numerical examples demonstrate the STAP performance achieved by the sparsity-based technique and adaptive waveform design.
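The coherence measure referred to here is commonly computed as the largest absolute inner product between distinct normalized columns of the measurement matrix; a minimal sketch, assuming this standard mutual-coherence definition, follows.
```python
import numpy as np

def mutual_coherence(Phi):
    """Largest absolute inner product between distinct normalized columns;
    smaller coherence gives stronger sparse-recovery guarantees."""
    Phin = Phi / np.linalg.norm(Phi, axis=0, keepdims=True)
    G = np.abs(Phin.T @ Phin)
    np.fill_diagonal(G, 0.0)
    return float(G.max())

rng = np.random.default_rng(8)
print(mutual_coherence(rng.standard_normal((64, 256))))
```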
Tang, Xin; Feng, Guo-Can; Li, Xiao-Xin; Cai, Jia-Xin
2015-01-01
Face recognition is challenging, especially when the images from different persons are similar to each other due to variations in illumination, expression, and occlusion. If we have sufficient training images of each person, spanning the facial variations of that person under testing conditions, sparse representation based classification (SRC) achieves very promising results. However, in many applications, face recognition often encounters the small-sample-size problem arising from the small number of available training images per person. In this paper, we present a novel face recognition framework utilizing low-rank and sparse error matrix decomposition together with sparse coding techniques (LRSE+SC). Firstly, the low-rank matrix recovery technique is applied to decompose the face images of each class into a low-rank matrix and a sparse error matrix. The low-rank matrix of each individual is a class-specific dictionary that captures the discriminative features of this individual. The sparse error matrix represents the intra-class variations, such as illumination and expression changes. Secondly, we combine the low-rank part (representative basis) of each person into a supervised dictionary and integrate the sparse error matrices of all individuals into a within-individual variant dictionary, which can represent the possible variations between the testing and training images. These two dictionaries are then used to code the query image. The within-individual variant dictionary can be shared by all the subjects and contributes only to explaining the lighting conditions, expressions, and occlusions of the query image, rather than to discrimination. Finally, a reconstruction-based scheme is adopted for face recognition. Since the within-individual dictionary is introduced, LRSE+SC can handle corrupted training data and the situation in which not all subjects have enough samples for training. Experimental results show that our method achieves state-of-the-art results on the AR, FERET, FRGC and LFW databases. PMID:26571112
Computing Generalized Matrix Inverse on Spiking Neural Substrate.
Shukla, Rohit; Khoram, Soroosh; Jorgensen, Erik; Li, Jing; Lipasti, Mikko; Wright, Stephen
2018-01-01
Emerging neural hardware substrates, such as IBM's TrueNorth Neurosynaptic System, can provide an appealing platform for deploying numerical algorithms. For example, a recurrent Hopfield neural network can be used to find the Moore-Penrose generalized inverse of a matrix, thus enabling a broad class of linear optimizations to be solved efficiently, at low energy cost. However, deploying numerical algorithms on hardware platforms that severely limit the range and precision of representation for numeric quantities can be quite challenging. This paper discusses these challenges and proposes a rigorous mathematical framework for reasoning about range and precision on such substrates. The paper derives techniques for normalizing inputs and properly quantizing synaptic weights originating from arbitrary systems of linear equations, so that solvers for those systems can be implemented in a provably correct manner on hardware-constrained neural substrates. The analytical model is empirically validated on the IBM TrueNorth platform, and results show that the guarantees provided by the framework for range and precision hold under experimental conditions. Experiments with optical flow demonstrate the energy benefits of deploying a reduced-precision and energy-efficient generalized matrix inverse engine on the IBM TrueNorth platform, reflecting 10× to 100× improvement over FPGA and ARM core baselines.
Two-stage sparse coding of region covariance via Log-Euclidean kernels to detect saliency.
Zhang, Ying-Ying; Yang, Cai; Zhang, Ping
2017-05-01
In this paper, we present a novel bottom-up saliency detection algorithm from the perspective of covariance matrices on a Riemannian manifold. Each superpixel is described by a region covariance matrix on the Riemannian manifold. We carry out a two-stage sparse coding scheme via Log-Euclidean kernels to extract salient objects efficiently. In the first stage, given a background dictionary drawn from the image borders, sparse coding of each region covariance via Log-Euclidean kernels is performed. The reconstruction error on the background dictionary is regarded as the initial saliency of each superpixel. In the second stage, the initial result is improved by calculating reconstruction errors of the superpixels on a foreground dictionary, which is extracted from the first-stage saliency map. The sparse coding in the second stage is similar to that in the first stage, but is able to effectively highlight the salient objects uniformly against the background. Finally, three post-processing methods (a highlight-inhibition function, context-based saliency weighting, and the graph cut) are adopted to further refine the saliency map. Experiments on four public benchmark datasets show that the proposed algorithm outperforms the state-of-the-art methods in terms of precision, recall and mean absolute error, and demonstrate the robustness and efficiency of the proposed method. Copyright © 2017 Elsevier Ltd. All rights reserved.
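As an illustration of the Log-Euclidean machinery the abstract relies on, the following minimal Python sketch computes a Gaussian kernel on the log-Euclidean distance between two region covariance matrices; the feature dimension, ridge constant, and bandwidth are illustrative choices, not the paper's settings.

    import numpy as np
    from scipy.linalg import logm

    def region_covariance(features):
        # features: (n_pixels, d) feature vectors of one superpixel;
        # a small ridge keeps the matrix safely positive definite
        return np.cov(features, rowvar=False) + 1e-6 * np.eye(features.shape[1])

    def log_euclidean_kernel(C1, C2, sigma=1.0):
        # Gaussian kernel on the log-Euclidean distance between SPD matrices
        d = np.linalg.norm(np.real(logm(C1)) - np.real(logm(C2)), 'fro')
        return np.exp(-d ** 2 / (2.0 * sigma ** 2))

    rng = np.random.default_rng(0)
    C1 = region_covariance(rng.normal(size=(200, 5)))
    C2 = region_covariance(rng.normal(size=(200, 5)))
    print(log_euclidean_kernel(C1, C2))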
On the inversion of geodetic integrals defined over the sphere using 1-D FFT
NASA Astrophysics Data System (ADS)
García, R. V.; Alejo, C. A.
2005-08-01
An iterative method is presented which performs inversion of integrals defined over the sphere. The method is based on one-dimensional fast Fourier transform (1-D FFT) inversion and is implemented with the projected Landweber technique, which is used to solve constrained least-squares problems reducing the associated 1-D cyclic-convolution error. The results obtained are as precise as the direct matrix inversion approach, but with better computational efficiency. A case study uses the inversion of Hotine’s integral to obtain gravity disturbances from geoid undulations. Numerical convergence is also analyzed and comparisons with respect to the direct matrix inversion method using conjugate gradient (CG) iteration are presented. Like the CG method, the number of iterations needed to get the optimum (i.e., small) error decreases as the measurement noise increases. Nevertheless, for discrete data given over a whole parallel band, the method can be applied directly without implementing the projected Landweber method, since no cyclic convolution error exists.
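A minimal sketch of the projected Landweber idea described above, reduced to a 1-D cyclic deconvolution solved with the FFT; the Gaussian kernel, step size, and non-negativity constraint below are illustrative stand-ins for the geodetic setting.

    import numpy as np

    def projected_landweber(y, h, tau, n_iter=200):
        # solve y = h (cyclically convolved with) x, for x >= 0
        H = np.fft.fft(h)
        x = np.zeros_like(y)
        for _ in range(n_iter):
            Ax = np.real(np.fft.ifft(H * np.fft.fft(x)))
            grad = np.real(np.fft.ifft(np.conj(H) * np.fft.fft(y - Ax)))
            x = np.maximum(x + tau * grad, 0.0)   # Landweber step, then projection
        return x

    n = 256
    x_true = np.zeros(n); x_true[40:60] = 1.0
    h = np.exp(-0.5 * ((np.arange(n) - n // 2) / 4.0) ** 2); h /= h.sum()
    y = np.real(np.fft.ifft(np.fft.fft(h) * np.fft.fft(x_true)))
    tau = 1.0 / np.max(np.abs(np.fft.fft(h))) ** 2   # step size below 2 / ||A||^2
    x_rec = projected_landweber(y, h, tau)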
An Efficient Spectral Method for Ordinary Differential Equations with Rational Function Coefficients
NASA Technical Reports Server (NTRS)
Coutsias, Evangelos A.; Torres, David; Hagstrom, Thomas
1994-01-01
We present some relations that allow the efficient approximate inversion of linear differential operators with rational function coefficients. We employ expansions in terms of a large class of orthogonal polynomial families, including all the classical orthogonal polynomials. These families obey a simple three-term recurrence relation for differentiation, which implies that on an appropriately restricted domain the differentiation operator has a unique banded inverse. The inverse is an integration operator for the family, and it is simply the tridiagonal coefficient matrix for the recurrence. Since in these families convolution operators (i.e. matrix representations of multiplication by a function) are banded for polynomials, we are able to obtain a banded representation for linear differential operators with rational coefficients. This leads to a method of solution of initial or boundary value problems that, besides having an operation count that scales linearly with the order of truncation N, is computationally well conditioned. Among the applications considered is the use of rational maps for the resolution of sharp interior layers.
NASA Astrophysics Data System (ADS)
Weiss, Chester J.
2013-08-01
An essential element for computational hypothesis testing, data inversion and experiment design for electromagnetic geophysics is a robust forward solver, capable of easily and quickly evaluating the electromagnetic response of arbitrary geologic structure. The usefulness of such a solver hinges on the balance among competing desires like ease of use, speed of forward calculation, scalability to large problems or compute clusters, parsimonious use of memory access, accuracy and by necessity, the ability to faithfully accommodate a broad range of geologic scenarios over extremes in length scale and frequency content. This is indeed a tall order. The present study addresses recent progress toward the development of a forward solver with these properties. Based on the Lorenz-gauged Helmholtz decomposition, a new finite volume solution over Cartesian model domains endowed with complex-valued electrical properties is shown to be stable over the frequency range 10^-2 to 10^10 Hz and over length scales from 10^-3 to 10^5 m. Benchmark examples are drawn from magnetotellurics, exploration geophysics, geotechnical mapping and laboratory-scale analysis, showing excellent agreement with reference analytic solutions. Computational efficiency is achieved through use of a matrix-free implementation of the quasi-minimum-residual (QMR) iterative solver, which eliminates explicit storage of finite volume matrix elements in favor of "on the fly" computation as needed by the iterative Krylov sequence. Further efficiency is achieved through sparse coupling matrices between the vector and scalar potentials whose non-zero elements arise only in those parts of the model domain where the conductivity gradient is non-zero. Multi-thread parallelization in the QMR solver through OpenMP pragmas is used to reduce the computational cost of its most expensive step: the single matrix-vector product at each iteration. High-level MPI communicators farm independent processes to available compute nodes for simultaneous computation of multi-frequency or multi-transmitter responses.
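The matrix-free use of QMR can be sketched with SciPy's LinearOperator, where the matrix-vector product is computed "on the fly" instead of being stored; the 1-D Laplacian below is only a stand-in for the finite-volume operator.

    import numpy as np
    from scipy.sparse.linalg import LinearOperator, qmr

    n = 200

    def apply_laplacian(v):
        # the matrix-vector product is computed on the fly; no matrix is stored
        out = 2.0 * v
        out[:-1] -= v[1:]
        out[1:] -= v[:-1]
        return out

    # the operator is symmetric, so matvec and rmatvec coincide here
    A = LinearOperator((n, n), matvec=apply_laplacian, rmatvec=apply_laplacian)
    b = np.ones(n)
    x, info = qmr(A, b)
    print(info, np.linalg.norm(apply_laplacian(x) - b))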
Ray, J.; Lee, J.; Yadav, V.; ...
2015-04-29
Atmospheric inversions are frequently used to estimate fluxes of atmospheric greenhouse gases (e.g., biospheric CO2 flux fields) at Earth's surface. These inversions typically assume that flux departures from a prior model are spatially smoothly varying, which are then modeled using a multi-variate Gaussian. When the field being estimated is spatially rough, multi-variate Gaussian models are difficult to construct and a wavelet-based field model may be more suitable. Unfortunately, such models are very high dimensional and are most conveniently used when the estimation method can simultaneously perform data-driven model simplification (removal of model parameters that cannot be reliably estimated) and fitting. Such sparse reconstruction methods are typically not used in atmospheric inversions. In this work, we devise a sparse reconstruction method, and illustrate it in an idealized atmospheric inversion problem for the estimation of fossil fuel CO2 (ffCO2) emissions in the lower 48 states of the USA. Our new method is based on stagewise orthogonal matching pursuit (StOMP), a method used to reconstruct compressively sensed images. Our adaptations bestow three properties to the sparse reconstruction procedure which are useful in atmospheric inversions. We have modified StOMP to incorporate prior information on the emission field being estimated and to enforce non-negativity on the estimated field. Finally, though based on wavelets, our method allows for the estimation of fields in non-rectangular geometries, e.g., emission fields inside geographical and political boundaries. Our idealized inversions use a recently developed multi-resolution (i.e., wavelet-based) random field model developed for ffCO2 emissions and synthetic observations of ffCO2 concentrations from a limited set of measurement sites. We find that our method for limiting the estimated field within an irregularly shaped region is about a factor of 10 faster than conventional approaches. It also reduces the overall computational cost by a factor of 2. Further, the sparse reconstruction scheme imposes non-negativity without introducing strong nonlinearities, such as those introduced by employing log-transformed fields, and thus reaps the benefits of simplicity and computational speed that are characteristic of linear inverse problems.
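A toy version of the StOMP-style stagewise selection with the non-negativity adaptation mentioned above; the threshold rule and stage count are simplified, and the refit uses SciPy's nnls rather than the authors' exact procedure.

    import numpy as np
    from scipy.optimize import nnls

    def stomp_nonneg(A, y, n_stages=10, t=2.0):
        # stagewise selection by thresholded correlations, nonnegative refit
        m, n = A.shape
        support = np.zeros(n, dtype=bool)
        x = np.zeros(n)
        for _ in range(n_stages):
            r = y - A @ x
            c = A.T @ r
            new = np.abs(c) > t * np.linalg.norm(r) / np.sqrt(m)  # StOMP-style threshold
            if not np.any(new & ~support):
                break
            support |= new
            x = np.zeros(n)
            x[support], _ = nnls(A[:, support], y)   # non-negativity enforced here
        return x

    rng = np.random.default_rng(1)
    A = rng.normal(size=(80, 200))
    x0 = np.zeros(200); x0[[5, 50, 120]] = [2.0, 1.5, 3.0]
    print(np.flatnonzero(stomp_nonneg(A, A @ x0) > 0.1))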
Recursive flexible multibody system dynamics using spatial operators
NASA Technical Reports Server (NTRS)
Jain, A.; Rodriguez, G.
1992-01-01
This paper uses spatial operators to develop new spatially recursive dynamics algorithms for flexible multibody systems. The operator description of the dynamics is identical to that for rigid multibody systems. Assumed-mode models are used for the deformation of each individual body. The algorithms are based on two spatial operator factorizations of the system mass matrix. The first (Newton-Euler) factorization of the mass matrix leads to recursive algorithms for the inverse dynamics, mass matrix evaluation, and composite-body forward dynamics for the systems. The second (innovations) factorization of the mass matrix leads to an operator expression for the mass matrix inverse and to a recursive articulated-body forward dynamics algorithm. The primary focus is on serial chains, but extensions to general topologies are also described. A comparison of computational costs shows that the articulated-body forward dynamics algorithm is much more efficient than the composite-body algorithm for most flexible multibody systems.
Iris recognition based on robust principal component analysis
NASA Astrophysics Data System (ADS)
Karn, Pradeep; He, Xiao Hai; Yang, Shuai; Wu, Xiao Hong
2014-11-01
Iris images acquired under different conditions often suffer from blur, occlusion due to eyelids and eyelashes, specular reflection, and other artifacts. Existing iris recognition systems do not perform well on these types of images. To overcome these problems, we propose an iris recognition method based on robust principal component analysis. The proposed method decomposes all training images into a low-rank matrix and a sparse error matrix, where the low-rank matrix is used for feature extraction. The sparsity concentration index approach is then applied to validate the recognition result. Experimental results using the CASIA V4 and IIT Delhi V1 iris image databases showed that the proposed method achieved competitive performance in both recognition accuracy and computational efficiency.
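Robust PCA of the training matrix can be sketched with the standard inexact augmented Lagrange multiplier iteration; the parameter defaults below are common illustrative choices, not those of the paper.

    import numpy as np

    def rpca_ialm(D, lam=None, mu=None, n_iter=100):
        # decompose D into low-rank L plus sparse S by inexact ALM sweeps
        m, n = D.shape
        lam = lam or 1.0 / np.sqrt(max(m, n))
        mu = mu or 0.25 * m * n / np.abs(D).sum()   # illustrative default
        Y = np.zeros_like(D); S = np.zeros_like(D)
        for _ in range(n_iter):
            # singular value thresholding updates the low-rank part
            U, s, Vt = np.linalg.svd(D - S + Y / mu, full_matrices=False)
            L = (U * np.maximum(s - 1.0 / mu, 0.0)) @ Vt
            # soft thresholding updates the sparse error part
            T = D - L + Y / mu
            S = np.sign(T) * np.maximum(np.abs(T) - lam / mu, 0.0)
            Y += mu * (D - L - S)        # dual (multiplier) update
        return L, S

    rng = np.random.default_rng(0)
    D = np.outer(rng.normal(size=100), rng.normal(size=50))   # rank-1 data
    D[rng.random(D.shape) < 0.05] = 5.0                        # sparse corruption
    L, S = rpca_ialm(D)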
A compressed sensing based 3D resistivity inversion algorithm for hydrogeological applications
NASA Astrophysics Data System (ADS)
Ranjan, Shashi; Kambhammettu, B. V. N. P.; Peddinti, Srinivasa Rao; Adinarayana, J.
2018-04-01
Image reconstruction from discrete electrical responses poses a number of computational and mathematical challenges. Application of smoothness-constrained regularized inversion from limited measurements may fail to detect resistivity anomalies and sharp interfaces separating hydrostratigraphic units. Under favourable conditions, compressed sensing (CS) can be thought of as an alternative for reconstructing image features by finding sparse solutions to highly underdetermined linear systems. This paper deals with the development of a CS-assisted, 3-D resistivity inversion algorithm for use by hydrogeologists and groundwater scientists. A CS-based l1-regularized least-squares algorithm was applied to solve the resistivity inversion problem. Sparseness in the model update vector is introduced through a block-oriented discrete cosine transformation, with recovery of the signal achieved through convex optimization. The equivalent quadratic program was solved using a primal-dual interior point method. Applicability of the proposed algorithm was demonstrated using synthetic and field examples drawn from hydrogeology. The proposed algorithm outperformed the conventional (smoothness-constrained) least-squares method in recovering the model parameters with much fewer data, yet preserving the sharp resistivity fronts separated by geologic layers. Resistivity anomalies represented by discrete homogeneous blocks embedded in contrasting geologic layers were better imaged using the proposed algorithm. In comparison to the conventional algorithm, CS resulted in an efficient (an increase in R2 from 0.62 to 0.78; a decrease in RMSE from 125.14 Ω-m to 72.46 Ω-m), reliable, and fast-converging (run time decreased by about 25%) solution.
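The paper solves the l1-regularized least-squares problem with a primal-dual interior point method; as a simpler stand-in, the sketch below applies iterative soft thresholding (ISTA) with a 1-D DCT sparsifying transform rather than the paper's block-wise transform. All sizes are illustrative.

    import numpy as np
    from scipy.fft import dct, idct

    def ista_dct(J, d, lam=0.1, n_iter=300):
        # minimize ||J m - d||^2 + lam * ||DCT(m)||_1 by soft thresholding
        step = 1.0 / np.linalg.norm(J, 2) ** 2
        m = np.zeros(J.shape[1])
        for _ in range(n_iter):
            g = m + step * (J.T @ (d - J @ m))   # gradient step on the data term
            c = dct(g, norm='ortho')             # move to the sparsifying domain
            c = np.sign(c) * np.maximum(np.abs(c) - step * lam, 0.0)
            m = idct(c, norm='ortho')            # back to model space
        return m

    J = np.random.default_rng(0).normal(size=(100, 400)) / 10.0
    m_true = np.zeros(400); m_true[180:220] = 1.0   # blocky model, sparse under DCT
    m_rec = ista_dct(J, J @ m_true)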
Sparse subspace clustering for data with missing entries and high-rank matrix completion.
Fan, Jicong; Chow, Tommy W S
2017-09-01
Many methods have recently been proposed for subspace clustering, but they are often unable to handle incomplete data because of missing entries. Using matrix completion methods to recover missing entries is a common way to solve the problem. Conventional matrix completion methods require that the matrix should be of low-rank intrinsically, but most matrices are of high-rank or even full-rank in practice, especially when the number of subspaces is large. In this paper, a new method called Sparse Representation with Missing Entries and Matrix Completion is proposed to solve the problems of incomplete-data subspace clustering and high-rank matrix completion. The proposed algorithm alternately computes the matrix of sparse representation coefficients and recovers the missing entries of a data matrix. The proposed algorithm recovers missing entries through minimizing the representation coefficients, representation errors, and matrix rank. Thorough experimental study and comparative analysis based on synthetic data and natural images were conducted. The presented results demonstrate that the proposed algorithm is more effective in subspace clustering and matrix completion compared with other existing methods. Copyright © 2017 Elsevier Ltd. All rights reserved.
Zhang, Kaihua; Zhang, Lei; Yang, Ming-Hsuan
2014-10-01
It is a challenging task to develop effective and efficient appearance models for robust object tracking due to factors such as pose variation, illumination change, occlusion, and motion blur. Existing online tracking algorithms often update models with samples from observations in recent frames. Although much success has been demonstrated, numerous issues remain to be addressed. First, while these adaptive appearance models are data-dependent, there does not exist a sufficient amount of data for online algorithms to learn at the outset. Second, online tracking algorithms often encounter the drift problem: as a result of self-taught learning, misaligned samples are likely to be added and degrade the appearance models. In this paper, we propose a simple yet effective and efficient tracking algorithm with an appearance model based on features extracted from a multiscale image feature space with data-independent basis. The proposed appearance model employs non-adaptive random projections that preserve the structure of the image feature space of objects. A very sparse measurement matrix is constructed to efficiently extract the features for the appearance model. We compress sample images of the foreground target and the background using the same sparse measurement matrix. The tracking task is formulated as a binary classification via a naive Bayes classifier with online update in the compressed domain. A coarse-to-fine search strategy is adopted to further reduce the computational complexity in the detection procedure. The proposed compressive tracking algorithm runs in real time and performs favorably against state-of-the-art methods on challenging sequences in terms of efficiency, accuracy and robustness.
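The data-independent compression step can be sketched with a very sparse random projection of the kind the abstract describes; the sparsity parameter s and the sizes below are illustrative.

    import numpy as np
    from scipy import sparse

    def sparse_measurement_matrix(n_out, n_in, s=None, seed=0):
        # very sparse random projection: entries are sqrt(s) * {+1, -1}
        # with probability 1/(2s) each, and 0 otherwise
        rng = np.random.default_rng(seed)
        s = s or n_in // 4
        R = sparse.random(n_out, n_in, density=1.0 / s, random_state=rng,
                          data_rvs=lambda k: np.sqrt(s) * rng.choice([-1.0, 1.0], size=k))
        return R.tocsr()

    R = sparse_measurement_matrix(50, 10000)
    v = np.random.default_rng(1).normal(size=10000)   # stand-in image feature vector
    features = R @ v                                  # compressed features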
Uniform Recovery Bounds for Structured Random Matrices in Corrupted Compressed Sensing
NASA Astrophysics Data System (ADS)
Zhang, Peng; Gan, Lu; Ling, Cong; Sun, Sumei
2018-04-01
We study the problem of recovering an $s$-sparse signal $\mathbf{x}^{\star}\in\mathbb{C}^n$ from corrupted measurements $\mathbf{y} = \mathbf{A}\mathbf{x}^{\star}+\mathbf{z}^{\star}+\mathbf{w}$, where $\mathbf{z}^{\star}\in\mathbb{C}^m$ is a $k$-sparse corruption vector whose nonzero entries may be arbitrarily large and $\mathbf{w}\in\mathbb{C}^m$ is a dense noise with bounded energy. The aim is to exactly and stably recover the sparse signal with tractable optimization programs. In this paper, we prove the uniform recovery guarantee of this problem for two classes of structured sensing matrices. The first class can be expressed as the product of a unit-norm tight frame (UTF), a random diagonal matrix and a bounded columnwise orthonormal matrix (e.g., a partial random circulant matrix). When the UTF is bounded (i.e., $\mu(\mathbf{U})\sim 1/\sqrt{m}$), we prove that with high probability, one can recover an $s$-sparse signal exactly and stably by $l_1$ minimization programs even if the measurements are corrupted by a sparse vector, provided $m = \mathcal{O}(s \log^2 s \log^2 n)$ and the sparsity level $k$ of the corruption is a constant fraction of the total number of measurements. The second class considers a randomly sub-sampled orthogonal matrix (e.g., a random Fourier matrix). We prove the uniform recovery guarantee provided that the corruption is sparse on a certain sparsifying domain. Numerous simulation results are also presented to verify and complement the theoretical results.
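The recovery program for the corrupted model can be written as min ||x||_1 + ||z||_1 subject to Ax + z = y, which becomes a linear program after splitting each signed variable into positive and negative parts; the real-valued sketch below (the paper works over the complex field) uses SciPy's linprog.

    import numpy as np
    from scipy.optimize import linprog

    def l1_recover_corrupted(A, y):
        # min ||x||_1 + ||z||_1  s.t.  A x + z = y, as a linear program;
        # variables are [x+, x-, z+, z-], all nonnegative by default bounds
        m, n = A.shape
        A_eq = np.hstack([A, -A, np.eye(m), -np.eye(m)])
        c = np.ones(2 * n + 2 * m)
        res = linprog(c, A_eq=A_eq, b_eq=y, method='highs')
        v = res.x
        return v[:n] - v[n:2 * n], v[2 * n:2 * n + m] - v[2 * n + m:]

    rng = np.random.default_rng(0)
    A = rng.normal(size=(60, 100)) / np.sqrt(60)
    x0 = np.zeros(100); x0[[7, 42]] = [1.0, -2.0]
    z0 = np.zeros(60); z0[5] = 10.0            # one grossly corrupted measurement
    x_hat, z_hat = l1_recover_corrupted(A, A @ x0 + z0)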
Acceleration of GPU-based Krylov solvers via data transfer reduction
Anzt, Hartwig; Tomov, Stanimire; Luszczek, Piotr; ...
2015-04-08
Krylov subspace iterative solvers are often the method of choice when solving large sparse linear systems. At the same time, hardware accelerators such as graphics processing units continue to offer significant floating point performance gains for matrix and vector computations through easy-to-use libraries of computational kernels. However, as these libraries are usually composed of a well optimized but limited set of linear algebra operations, applications that use them often fail to reduce certain data communications, and hence fail to leverage the full potential of the accelerator. In this study, we target the acceleration of Krylov subspace iterative methods for graphics processing units, and in particular the Biconjugate Gradient Stabilized solver. We show that significant improvement can be achieved by reformulating the method to reduce data communications through application-specific kernels instead of using the generic BLAS kernels, e.g., as provided by NVIDIA's cuBLAS library, and by designing a graphics processing unit specific sparse matrix-vector product kernel that is able to more efficiently use the graphics processing unit's computing power. Furthermore, we derive a model estimating the performance improvement, and use experimental data to validate the expected runtime savings. Finally, considering that the derived implementation achieves significantly higher performance, we assert that similar optimizations addressing algorithm structure, as well as the sparse matrix-vector product, are crucial for the subsequent development of high-performance graphics processing unit accelerated Krylov subspace iterative methods.
Improvements in sparse matrix operations of NASTRAN
NASA Technical Reports Server (NTRS)
Harano, S.
1980-01-01
A "nontransmit" packing routine was added to NASTRAN to allow matrix data to be refered to directly from the input/output buffer. Use of the packing routine permits various routines for matrix handling to perform a direct reference to the input/output buffer if data addresses have once been received. The packing routine offers a buffer by buffer backspace feature for efficient backspacing in sequential access. Unlike a conventional backspacing that needs twice back record for a single read of one record (one column), this feature omits overlapping of READ operation and back record. It eliminates the necessity of writing, in decomposition of a symmetric matrix, of a portion of the matrix to its upper triangular matrix from the last to the first columns of the symmetric matrix, thus saving time for generating the upper triangular matrix. Only a lower triangular matrix must be written onto the secondary storage device, bringing 10 to 30% reduction in use of the disk space of the storage device.
Matched field localization based on CS-MUSIC algorithm
NASA Astrophysics Data System (ADS)
Guo, Shuangle; Tang, Ruichun; Peng, Linhui; Ji, Xiaopeng
2016-04-01
The problems caused by too few or too many snapshots and by coherent sources in underwater acoustic positioning are considered. A matched field localization algorithm based on CS-MUSIC (Compressive Sensing Multiple Signal Classification) is proposed, built on the sparse mathematical model of underwater positioning. The signal matrix is calculated through the SVD (Singular Value Decomposition) of the observation matrix. The observation matrix in the sparse mathematical model is replaced by the signal matrix, and a new concise sparse mathematical model is obtained, which reduces not only the scale of the localization problem but also the noise level; the new sparse mathematical model is then solved by the CS-MUSIC algorithm, a combination of the CS (Compressive Sensing) and MUSIC (Multiple Signal Classification) methods. The algorithm proposed in this paper can effectively overcome the difficulties caused by correlated sources and shortness of snapshots, and it can also reduce the time complexity and noise level of the localization problem by using the SVD of the observation matrix when the number of snapshots is large, as is proved in this paper.
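The SVD-based replacement of the observation matrix by a signal matrix can be sketched in a few lines; the number of sources is assumed known here, and the data are a synthetic stand-in.

    import numpy as np

    def signal_matrix(Y, n_sources):
        # rank-k part of the snapshot matrix: smaller and less noisy than Y
        U, s, _ = np.linalg.svd(Y, full_matrices=False)
        return U[:, :n_sources] * s[:n_sources]

    rng = np.random.default_rng(0)
    Y = rng.normal(size=(32, 500))         # sensors x snapshots (stand-in data)
    Ys = signal_matrix(Y, n_sources=3)     # 32 x 3 signal matrix replaces Y
    print(Ys.shape)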
A Shifted Block Lanczos Algorithm 1: The Block Recurrence
NASA Technical Reports Server (NTRS)
Grimes, Roger G.; Lewis, John G.; Simon, Horst D.
1990-01-01
In this paper we describe a block Lanczos algorithm that is used as the key building block of a software package for the extraction of eigenvalues and eigenvectors of large sparse symmetric generalized eigenproblems. The software package comprises: a version of the block Lanczos algorithm specialized for spectrally transformed eigenproblems; an adaptive strategy for choosing shifts, and efficient codes for factoring large sparse symmetric indefinite matrices. This paper describes the algorithmic details of our block Lanczos recurrence. This uses a novel combination of block generalizations of several features that have only been investigated independently in the past. In particular new forms of partial reorthogonalization, selective reorthogonalization and local reorthogonalization are used, as is a new algorithm for obtaining the M-orthogonal factorization of a matrix. The heuristic shifting strategy, the integration with sparse linear equation solvers and numerical experience with the code are described in a companion paper.
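The basic block recurrence underlying the algorithm can be sketched as follows; the paper's partial/selective reorthogonalization, shifting strategy, and M-orthogonal factorization are omitted, so this plain version loses orthogonality in finite precision.

    import numpy as np

    def block_lanczos(A, n_blocks, b=4, seed=0):
        # plain block Lanczos recurrence for symmetric A (no reorthogonalization)
        n = A.shape[0]
        rng = np.random.default_rng(seed)
        Q, _ = np.linalg.qr(rng.normal(size=(n, b)))
        Q_prev = np.zeros((n, b)); B_prev = np.zeros((b, b))
        Qs, As, Bs = [Q], [], []
        for _ in range(n_blocks):
            W = A @ Q - Q_prev @ B_prev.T
            Aj = Q.T @ W                   # diagonal block of the block tridiagonal T
            W = W - Q @ Aj
            Q_next, Bj = np.linalg.qr(W)   # subdiagonal block from the QR step
            As.append(Aj); Bs.append(Bj)
            Q_prev, B_prev, Q = Q, Bj, Q_next
            Qs.append(Q)
        return Qs, As, Bs

    A = np.diag(np.arange(1.0, 201.0)); A[0, 1] = A[1, 0] = 0.5
    Qs, As, Bs = block_lanczos(A, n_blocks=10, b=4)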
Color Sparse Representations for Image Processing: Review, Models, and Prospects.
Barthélemy, Quentin; Larue, Anthony; Mars, Jérôme I
2015-11-01
Sparse representations have been extended to deal with color images composed of three channels. A review of dictionary-learning-based sparse representations for color images is made here, detailing the differences between the models and comparing their results on real and simulated data. These models are considered in a unifying framework that is based on the degrees of freedom of the linear filtering/transformation of the color channels. Moreover, this allows it to be shown that the scalar quaternionic linear model is equivalent to constrained matrix-based color filtering, which highlights the filtering implicitly applied through this model. Based on this reformulation, a new color filtering model is introduced, using unconstrained filters. In this model, spatial morphologies of color images are encoded by atoms, and colors are encoded by color filters. Color variability is no longer captured by increasing the dictionary size, but by the color filters, which gives an efficient color representation.
NASA Astrophysics Data System (ADS)
Bogiatzis, P.; Ishii, M.; Davis, T. A.
2016-12-01
Seismic tomography inverse problems are among the largest high-dimensional parameter estimation tasks in Earth science. We show how combinatorics and graph theory can be used to analyze the structure of such problems, and to effectively decompose them into smaller ones that can be solved efficiently by means of the least squares method. In combination with recent high performance direct sparse algorithms, this reduction in dimensionality allows for an efficient computation of the model resolution and covariance matrices using limited resources. Furthermore, we show that a new sparse singular value decomposition method can be used to obtain the complete spectrum of the singular values. This procedure provides the means for more objective regularization and further dimensionality reduction of the problem. We apply this methodology to a moderate size, non-linear seismic tomography problem to image the structure of the crust and the upper mantle beneath Japan using local deep earthquakes recorded by the High Sensitivity Seismograph Network stations.
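The partial sparse SVD step can be illustrated with SciPy's svds; obtaining the complete singular spectrum, as the abstract describes, requires the authors' specialized method, so the truncated computation below is only indicative.

    import numpy as np
    from scipy import sparse
    from scipy.sparse.linalg import svds

    G = sparse.random(5000, 2000, density=1e-3, random_state=0, format='csr')
    U, s, Vt = svds(G, k=50)                # 50 extremal singular triplets
    order = np.argsort(s)[::-1]             # sort descending for readability
    U, s, Vt = U[:, order], s[order], Vt[order]
    keep = s > 1e-2 * s[0]                  # truncation as objective regularization
    print(s[:5], int(keep.sum()))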
A path-oriented matrix-based knowledge representation system
NASA Technical Reports Server (NTRS)
Feyock, Stefan; Karamouzis, Stamos T.
1993-01-01
Experience has shown that designing a good representation is often the key to turning hard problems into simple ones. Most AI (Artificial Intelligence) search/representation techniques are oriented toward an infinite domain of objects and arbitrary relations among them. In reality much of what needs to be represented in AI can be expressed using a finite domain and unary or binary predicates. Well-known vector- and matrix-based representations can efficiently represent finite domains and unary/binary predicates, and allow effective extraction of path information by generalized transitive closure/path matrix computations. In order to avoid space limitations a set of abstract sparse matrix data types was developed along with a set of operations on them. This representation forms the basis of an intelligent information system for representing and manipulating relational data.
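The generalized transitive-closure (path matrix) computation over a binary relation can be sketched as repeated boolean squaring of a sparse adjacency matrix:

    import numpy as np
    from scipy import sparse

    def path_matrix(adj):
        # reachability (transitive closure) by repeated squaring; integers are
        # used internally so that products count paths without overflow
        n = adj.shape[0]
        R = ((adj + sparse.eye(n, format='csr')) > 0).astype(np.int64)
        while True:
            R2 = ((R @ R) > 0).astype(np.int64)
            if (R2 != R).nnz == 0:
                return R
            R = R2

    adj = sparse.csr_matrix(np.array([[0, 1, 0], [0, 0, 1], [0, 0, 0]]))
    print(path_matrix(adj).toarray())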
1-norm support vector novelty detection and its sparseness.
Zhang, Li; Zhou, WeiDa
2013-12-01
This paper proposes a 1-norm support vector novelty detection (SVND) method and discusses its sparseness. 1-norm SVND is formulated as a linear programming problem and uses two techniques for inducing sparseness, namely the 1-norm regularization and the hinge loss function. We also find two upper bounds on the sparseness of 1-norm SVND, namely the exact support vector (ESV) bound and the kernel Gram matrix rank bound. The ESV bound indicates that 1-norm SVND has a sparser representation model than SVND. The kernel Gram matrix rank bound can loosely estimate the sparseness of 1-norm SVND. Experimental results show that 1-norm SVND is feasible and effective. Copyright © 2013 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
O'Malley, D.; Le, E. B.; Vesselinov, V. V.
2015-12-01
We present a fast, scalable, and highly implementable stochastic inverse method for characterization of aquifer heterogeneity. The method utilizes recent advances in randomized matrix algebra and exploits the structure of the Quasi-Linear Geostatistical Approach (QLGA), without requiring a structured grid like Fast Fourier Transform (FFT) methods. The QLGA framework is a more stable version of Gauss-Newton iterates for a large number of unknown model parameters, and it provides unbiased estimates. The methods are matrix-free and do not require derivatives or adjoints, and are thus ideal for complex models and black-box implementation. We also incorporate randomized least-squares solvers and data-reduction methods, which speed up computation and simulate missing data points. The new inverse methodology is coded in Julia and implemented in the MADS computational framework (http://mads.lanl.gov). Julia is an advanced high-level scientific programming language that allows for efficient memory management and utilization of high-performance computational resources. Inversion results based on a series of synthetic problems with steady-state and transient calibration data are presented.
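The randomized least-squares ingredient can be sketched as a Gaussian sketch-and-solve; the sketch size of 4n is an illustrative choice, and the Python version below stands in for the Julia implementation mentioned above.

    import numpy as np

    def sketched_lstsq(A, b, sketch_rows=None, seed=0):
        # randomized least squares: solve the Gaussian-sketched system SA x = Sb
        m, n = A.shape
        k = sketch_rows or 4 * n
        rng = np.random.default_rng(seed)
        S = rng.normal(size=(k, m)) / np.sqrt(k)   # Gaussian sketching matrix
        x, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
        return x

    rng = np.random.default_rng(1)
    A = rng.normal(size=(20000, 50))
    b = A @ rng.normal(size=50) + 0.01 * rng.normal(size=20000)
    x = sketched_lstsq(A, b)    # approximate solution from a much smaller system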
A Strassen-Newton algorithm for high-speed parallelizable matrix inversion
NASA Technical Reports Server (NTRS)
Bailey, David H.; Ferguson, Helaman R. P.
1988-01-01
Techniques are described for computing matrix inverses by algorithms that are highly suited to massively parallel computation. The techniques are based on an algorithm suggested by Strassen (1969). Variations of this scheme use matrix Newton iterations and other methods to improve the numerical stability while at the same time preserving a very high level of parallelism. One-processor Cray-2 implementations of these schemes range from one that is up to 55 percent faster than a conventional library routine to one that is slower than a library routine but achieves excellent numerical stability. The problem of computing the solution to a single set of linear equations is discussed, and it is shown that this problem can also be solved efficiently using these techniques.
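The matrix Newton iteration at the heart of such schemes is easy to state: X_{k+1} = X_k (2I - A X_k), which uses only matrix products (each of which could in turn be done with Strassen's algorithm). A minimal dense sketch, with the classical convergent starting guess:

    import numpy as np

    def newton_inverse(A, n_iter=30):
        # Newton (Schulz) iteration; only matrix-matrix products are needed
        n = A.shape[0]
        X = A.T / (np.linalg.norm(A, 1) * np.linalg.norm(A, np.inf))  # safe start
        I2 = 2.0 * np.eye(n)
        for _ in range(n_iter):
            X = X @ (I2 - A @ X)
        return X

    A = np.random.default_rng(0).normal(size=(50, 50)) + 50 * np.eye(50)
    X = newton_inverse(A)
    print(np.linalg.norm(X @ A - np.eye(50)))   # residual of the computed inverse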
Multitasking the Davidson algorithm for the large, sparse eigenvalue problem
DOE Office of Scientific and Technical Information (OSTI.GOV)
Umar, V.M.; Fischer, C.F.
1989-01-01
The authors report how the Davidson algorithm, developed for handling the eigenvalue problem for large and sparse matrices arising in quantum chemistry, was modified for use in atomic structure calculations. To date these calculations have used traditional eigenvalue methods, which limit the range of feasible calculations because of their excessive memory requirements and unsatisfactory performance attributed to time-consuming and costly processing of zero-valued elements. The replacement of a traditional matrix eigenvalue method by the Davidson algorithm reduced these limitations. Significant speedup was found, which varied with the size of the underlying problem and its sparsity. Furthermore, the range of matrix sizes that can be manipulated efficiently was expanded by more than one order of magnitude. On the CRAY X-MP the code was vectorized and the importance of gather/scatter was analyzed. A parallelized version of the algorithm obtained an additional 35% reduction in execution time. Speedup due to vectorization and concurrency was also measured on the Alliant FX/8.
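A compact dense sketch of the Davidson iteration for the lowest eigenpair, with the classic diagonal preconditioner; vectorization, restarts, and the multitasking discussed above are omitted.

    import numpy as np

    def davidson_lowest(A, max_iter=40, tol=1e-8):
        # Davidson iteration for the lowest eigenpair of a symmetric matrix
        n = A.shape[0]
        d = np.asarray(A.diagonal()).ravel()
        V = np.zeros((n, 1)); V[np.argmin(d), 0] = 1.0  # guess at smallest diagonal
        for _ in range(max_iter):
            H = V.T @ (A @ V)                 # small projected matrix
            w, S = np.linalg.eigh(H)
            theta, x = w[0], V @ S[:, 0]      # lowest Ritz pair
            r = A @ x - theta * x
            if np.linalg.norm(r) < tol:
                break
            denom = theta - d                 # diagonal preconditioner
            denom[np.abs(denom) < 1e-8] = 1e-8
            t = r / denom
            t -= V @ (V.T @ t)                # orthogonalize against search space
            t_norm = np.linalg.norm(t)
            if t_norm < 1e-12:
                break
            V = np.hstack([V, (t / t_norm)[:, None]])
        return theta, x

    rng = np.random.default_rng(0)
    B = rng.normal(size=(500, 500))
    A = (B + B.T) / 2 + np.diag(np.arange(500.0))   # symmetric test matrix
    theta, x = davidson_lowest(A)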
Holographic implementation of a binary associative memory for improved recognition
NASA Astrophysics Data System (ADS)
Bandyopadhyay, Somnath; Ghosh, Ajay; Datta, Asit K.
1998-03-01
Neural network associative memory has found wide application in pattern recognition techniques. We propose an associative memory model for binary character recognition. The interconnection strengths of the memory are binary valued. The concept of sparse coding is used to enhance the storage efficiency of the model. The imposed preconditioning of pattern vectors, which is inherent in a sparsely coded conventional memory, is eliminated by using a multistep correlation technique, and the ability of correct association is enhanced in real-time application. A potential optoelectronic implementation of the proposed associative memory is also described. Learning and recall are possible using digital optical matrix-vector multiplication, where full use of the parallelism and connectivity of optics is made. A hologram is used in the experiment as a long-term memory (LTM) for storing all input information. The short-term memory, or the interconnection weight matrix required during the recall process, is configured by retrieving the necessary information from the holographic LTM.
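The clipped-Hebbian (Willshaw-style) storage and threshold recall that such a binary associative memory performs can be sketched as follows; pattern density and sizes are illustrative, and the matrix-vector recall step is exactly what an optical matrix-vector multiplier would implement.

    import numpy as np

    def store(patterns):
        # clipped (binary) Hebbian learning: OR of outer products
        n = patterns.shape[1]
        W = np.zeros((n, n), dtype=np.int32)
        for p in patterns:
            W = np.maximum(W, np.outer(p, p))
        return W

    def recall(W, cue):
        s = W @ cue.astype(np.int32)             # one matrix-vector product
        return (s >= cue.sum()).astype(np.uint8)  # threshold at the cue activity

    rng = np.random.default_rng(0)
    pats = (rng.random((5, 256)) < 0.05).astype(np.uint8)   # sparse binary codes
    W = store(pats)
    print(int((recall(W, pats[0]) != pats[0]).sum()))       # recall errors (ideally 0)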
NASA Technical Reports Server (NTRS)
Alfano, Robert R. (Inventor); Cai, Wei (Inventor)
2007-01-01
A reconstruction technique for reducing the computational burden of 3D image processing, wherein the reconstruction procedure comprises an inverse and a forward model. The inverse model uses a hybrid dual Fourier algorithm that combines a 2D Fourier inversion with a 1D matrix inversion to provide high-speed inverse computations. The inverse algorithm uses a hybrid transfer to provide fast Fourier inversion for data of multiple sources and multiple detectors. The forward model is based on an analytical cumulant solution of a radiative transfer equation. The accurate analytical form of the solution to the radiative transfer equation provides an efficient formalism for fast computation of the forward model.
Teaching Linear Algebra: Proceeding More Efficiently by Staying Comfortably within Z
ERIC Educational Resources Information Center
Beaver, Scott
2015-01-01
For efficiency in a linear algebra course the instructor may wish to avoid the undue arithmetical distractions of rational arithmetic. In this paper we explore how to write fraction-free problems of various types including elimination, matrix inverses, orthogonality, and the (non-normalizing) Gram-Schmidt process.
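One standard way to stay "comfortably within Z" during elimination is Bareiss's fraction-free algorithm, in which every intermediate quantity is an integer; the determinant sketch below omits pivoting for clarity, so nonzero leading principal minors are assumed.

    def bareiss_determinant(M):
        # fraction-free (Bareiss) elimination: every intermediate value is an
        # integer, and the integer division below is always exact
        A = [row[:] for row in M]
        n = len(A)
        prev = 1
        for k in range(n - 1):
            for i in range(k + 1, n):
                for j in range(k + 1, n):
                    A[i][j] = (A[i][j] * A[k][k] - A[i][k] * A[k][j]) // prev
            prev = A[k][k]
        return A[-1][-1]

    print(bareiss_determinant([[2, 3, 1], [4, 7, 5], [6, 18, 22]]))   # -16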
Exploring Deep Learning and Sparse Matrix Format Selection
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhao, Y.; Liao, C.; Shen, X.
We proposed to explore the use of Deep Neural Networks (DNN) for addressing the longstanding barriers. The recent rapid progress of DNN technology has created a large impact in many fields, significantly improving prediction accuracy over traditional machine learning techniques in image classification, speech recognition, machine translation, and so on. To some degree, these tasks resemble the decision making in many HPC tasks, including the aforementioned format selection for SpMV and linear solver selection. For instance, sparse matrix format selection is akin to image classification, such as telling whether an image contains a dog or a cat; in both problems, the right decisions are primarily determined by the spatial patterns of the elements in an input. For image classification, the patterns are of pixels, and for sparse matrix format selection, they are of non-zero elements. DNN could be naturally applied if we regard a sparse matrix as an image and the format selection or solver selection as classification problems.
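The matrix-as-image viewpoint can be sketched by downsampling the nonzero pattern into a fixed-size array, the kind of input a CNN-based format selector could consume; the image size is an arbitrary choice.

    import numpy as np
    from scipy import sparse

    def pattern_image(A, size=64):
        # bin the nonzero pattern of a sparse matrix into a size x size "image"
        A = A.tocoo()
        img = np.zeros((size, size), dtype=np.float32)
        r = (A.row * size // A.shape[0]).astype(int)
        c = (A.col * size // A.shape[1]).astype(int)
        np.add.at(img, (r, c), 1.0)          # count nonzeros per cell
        return img / max(img.max(), 1.0)     # normalize to [0, 1]

    A = sparse.random(10000, 10000, density=1e-4, random_state=0)
    x = pattern_image(A)   # feed x[None, ..., None] to a small CNN classifier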
Reconstruction of Complex Network based on the Noise via QR Decomposition and Compressed Sensing.
Li, Lixiang; Xu, Dafei; Peng, Haipeng; Kurths, Jürgen; Yang, Yixian
2017-11-08
It is generally known that the states of network nodes are stable and have strong correlations in a linear network system. We find that without a control input, the method of compressed sensing cannot succeed in reconstructing complex networks in which the states of nodes are generated through the linear network system. However, noise can drive the dynamics between nodes to break the stability of the system state. Therefore, a new method integrating QR decomposition and compressed sensing is proposed to solve the reconstruction problem of complex networks with the assistance of input noise. The state matrix of the system is decomposed by QR decomposition. We construct the measurement matrix with the aid of Gaussian noise so that the sparse input matrix can be reconstructed by compressed sensing. We also discover that noise can build a bridge between the dynamics and the topological structure. Experiments are presented to show that the proposed method is more accurate and more efficient than compressed sensing alone in reconstructing four model networks and six real networks. In addition, the proposed method can reconstruct not only sparse complex networks but also dense complex networks.
Signal Sampling for Efficient Sparse Representation of Resting State FMRI Data
Ge, Bao; Makkie, Milad; Wang, Jin; Zhao, Shijie; Jiang, Xi; Li, Xiang; Lv, Jinglei; Zhang, Shu; Zhang, Wei; Han, Junwei; Guo, Lei; Liu, Tianming
2015-01-01
As the size of brain imaging data such as fMRI grows explosively, it provides us with unprecedented and abundant information about the brain. How to reduce the size of fMRI data without losing much information is becoming a more and more pressing issue. Recent literature has tried to deal with it by dictionary learning and sparse representation methods, but their computational complexity is still high, which hampers the wider application of sparse representation methods to large-scale fMRI datasets. To effectively address this problem, this work proposes to represent resting state fMRI (rs-fMRI) signals of a whole brain via a statistical sampling based sparse representation. First we sampled the whole brain's signals via different sampling methods, then the sampled signals were aggregated into an input data matrix to learn a dictionary, and finally this dictionary was used to sparsely represent the whole brain's signals and identify the resting state networks. Comparative experiments demonstrate that the proposed signal sampling framework can achieve a roughly tenfold speed-up in reconstructing concurrent brain networks without losing much information. The experiments on the 1000 Functional Connectomes Project further demonstrate its effectiveness and superiority. PMID:26646924
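The sample-then-learn idea can be sketched with scikit-learn's mini-batch dictionary learning as a stand-in for the authors' dictionary learning pipeline; the sizes and sampling fraction below are illustrative.

    import numpy as np
    from sklearn.decomposition import MiniBatchDictionaryLearning

    rng = np.random.default_rng(0)
    X = rng.normal(size=(50000, 200))        # stand-in for whole-brain rs-fMRI signals
    idx = rng.choice(X.shape[0], size=5000, replace=False)   # random signal sampling
    dico = MiniBatchDictionaryLearning(n_components=50, alpha=1.0, random_state=0)
    dico.fit(X[idx])                         # learn the dictionary on the sample only
    codes = dico.transform(X)                # sparsely represent all signals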
Representation-Independent Iteration of Sparse Data Arrays
NASA Technical Reports Server (NTRS)
James, Mark
2007-01-01
An approach is defined that describes a method of iterating over massively large arrays containing sparse data in a way that is independent of how the contents of the sparse arrays are laid out in memory. What is unique and important here is the decoupling of the iteration over the sparse set of array elements from how they are internally represented in memory; this makes the approach backward compatible with existing schemes for representing sparse arrays as well as with new ones. A functional interface is defined for implementing sparse arrays in any modern programming language, with a particular focus on the Chapel programming language. Examples are provided that show the translation of a loop that computes a matrix-vector product into this representation for both the distributed and non-distributed cases. This work is directly applicable to NASA and its High Productivity Computing Systems (HPCS) program, in which JPL and our current program are engaged. The goal of this program is to create powerful, scalable, and economically viable high-powered computer systems suitable for use in national security and industry by 2010. This is important to NASA for its computationally intensive requirements for analyzing and understanding the volumes of science data from our returned missions.
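The paper targets Chapel; a Python sketch of the same decoupling defines one iteration interface and writes kernels only against it, so the underlying storage format can change freely.

    from typing import Iterator, Tuple
    import numpy as np
    from scipy import sparse

    def iter_nonzeros(A) -> Iterator[Tuple[int, int, float]]:
        # yield (row, col, value) triples regardless of the storage layout;
        # any format (CSR, CSC, DIA, ...) converts to the same triple stream
        B = A.tocoo()
        yield from zip(B.row.tolist(), B.col.tolist(), B.data.tolist())

    def matvec(A, x):
        # matrix-vector product written only against the iteration interface
        y = np.zeros(A.shape[0])
        for i, j, v in iter_nonzeros(A):
            y[i] += v * x[j]
        return y

    A = sparse.random(5, 5, density=0.4, random_state=0, format='csr')
    print(np.allclose(matvec(A, np.ones(5)), A @ np.ones(5)))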
Low-rank and Adaptive Sparse Signal (LASSI) Models for Highly Accelerated Dynamic Imaging
Ravishankar, Saiprasad; Moore, Brian E.; Nadakuditi, Raj Rao; Fessler, Jeffrey A.
2017-01-01
Sparsity-based approaches have been popular in many applications in image processing and imaging. Compressed sensing exploits the sparsity of images in a transform domain or dictionary to improve image recovery from undersampled measurements. In the context of inverse problems in dynamic imaging, recent research has demonstrated the promise of sparsity and low-rank techniques. For example, the patches of the underlying data are modeled as sparse in an adaptive dictionary domain, and the resulting image and dictionary estimation from undersampled measurements is called dictionary-blind compressed sensing, or the dynamic image sequence is modeled as a sum of low-rank and sparse (in some transform domain) components (L+S model) that are estimated from limited measurements. In this work, we investigate a data-adaptive extension of the L+S model, dubbed LASSI, where the temporal image sequence is decomposed into a low-rank component and a component whose spatiotemporal (3D) patches are sparse in some adaptive dictionary domain. We investigate various formulations and efficient methods for jointly estimating the underlying dynamic signal components and the spatiotemporal dictionary from limited measurements. We also obtain efficient sparsity penalized dictionary-blind compressed sensing methods as special cases of our LASSI approaches. Our numerical experiments demonstrate the promising performance of LASSI schemes for dynamic magnetic resonance image reconstruction from limited k-t space data compared to recent methods such as k-t SLR and L+S, and compared to the proposed dictionary-blind compressed sensing method. PMID:28092528
Elongation cutoff technique armed with quantum fast multipole method for linear scaling.
Korchowiec, Jacek; Lewandowski, Jakub; Makowski, Marcin; Gu, Feng Long; Aoki, Yuriko
2009-11-30
A linear-scaling implementation of the elongation cutoff technique (ELG/C) that speeds up Hartree-Fock (HF) self-consistent field calculations is presented. The cutoff method avoids the known bottleneck of the conventional HF scheme, that is, diagonalization, because it operates within the low dimension subspace of the whole atomic orbital space. The efficiency of ELG/C is illustrated for two model systems. The obtained results indicate that the ELG/C is a very efficient sparse matrix algebra scheme. Copyright 2009 Wiley Periodicals, Inc.
NASA Astrophysics Data System (ADS)
Joulidehsar, Farshad; Moradzadeh, Ali; Doulati Ardejani, Faramarz
2018-06-01
The joint interpretation of two sets of geophysical data related to the same source is an appropriate method for decreasing the non-uniqueness of the resulting models during the inversion process. Among the available methods, one based on the cross-gradient constraint for combining two datasets is an efficient approach. This method, however, is time-consuming for 3D inversion and cannot provide an exact assessment of the situation and extension of the anomaly of interest. In this paper, the first step is to speed up the required calculations by substituting singular value decomposition with the least-squares QR (LSQR) method to solve the large-scale kernel matrix of the 3D inversion more rapidly. Furthermore, to improve the accuracy of the resulting models, a combination of a depth-weighting matrix and the compactness constraint, with automatic selection of the covariance of the initial parameters, is used in the proposed inversion algorithm. This algorithm was developed in the Matlab environment and first implemented on synthetic data. The 3D joint inversion of synthetic gravity and magnetic data shows a noticeable improvement in the results and increases the efficiency of the algorithm for large-scale problems. Additionally, a real gravity and magnetic dataset from the Jalalabad mine in southeastern Iran was tested. The results obtained by the improved joint 3D inversion with the cross-gradient and compactness constraints showed a mineralized zone in the depth interval of about 110-300 m, which is in good agreement with the available drilling data. This is a further confirmation of the accuracy and improvement of the proposed inversion algorithm.
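The LSQR substitution mentioned above is available directly in SciPy; a minimal sketch with an illustrative depth-weighting transform (the synthetic kernel and weights are stand-ins for the gravity/magnetic sensitivities):

    import numpy as np
    from scipy import sparse
    from scipy.sparse.linalg import lsqr

    # kernel (sensitivity) matrix of the linearized problem; synthetic stand-in
    G = sparse.random(2000, 5000, density=1e-3, random_state=0, format='csr')
    d = G @ np.random.default_rng(1).normal(size=5000)

    w = 1.0 / (1.0 + np.arange(5000) / 500.0) ** 1.5   # illustrative depth weights
    Gw = G @ sparse.diags(1.0 / w)                     # depth-weighted kernel
    m_w = lsqr(Gw, d, damp=1e-2)[0]                    # damped LSQR solve
    m = m_w / w                                        # undo the weighting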
Atmospheric inverse modeling via sparse reconstruction
NASA Astrophysics Data System (ADS)
Hase, Nils; Miller, Scot M.; Maaß, Peter; Notholt, Justus; Palm, Mathias; Warneke, Thorsten
2017-10-01
Many applications in atmospheric science involve ill-posed inverse problems. A crucial component of many inverse problems is the proper formulation of a priori knowledge about the unknown parameters. In most cases, this knowledge is expressed as a Gaussian prior. This formulation often performs well at capturing smoothed, large-scale processes but is often ill equipped to capture localized structures like large point sources or localized hot spots. Over the last decade, scientists from a diverse array of applied mathematics and engineering fields have developed sparse reconstruction techniques to identify localized structures. In this study, we present a new regularization approach for ill-posed inverse problems in atmospheric science. It is based on Tikhonov regularization with a sparsity constraint and allows bounds on the parameters. We enforce sparsity using a dictionary representation system. We analyze its performance in an atmospheric inverse modeling scenario by estimating anthropogenic US methane (CH4) emissions from simulated atmospheric measurements. Different measures indicate that our sparse reconstruction approach is better able to capture large point sources or localized hot spots than other methods commonly used in atmospheric inversions. It captures the overall signal equally well but adds details on the grid scale. This feature can be of value for any inverse problem with point or spatially discrete sources. We show an example of source estimation for synthetic methane emissions from the Barnett shale formation.
Redundant interferometric calibration as a complex optimization problem
NASA Astrophysics Data System (ADS)
Grobler, T. L.; Bernardi, G.; Kenyon, J. S.; Parsons, A. R.; Smirnov, O. M.
2018-05-01
Observations of the redshifted 21 cm line from the epoch of reionization have recently motivated the construction of low-frequency radio arrays with highly redundant configurations. These configurations provide an alternative calibration strategy - `redundant calibration' - and boost sensitivity on specific spatial scales. In this paper, we formulate calibration of redundant interferometric arrays as a complex optimization problem. We solve this optimization problem via the Levenberg-Marquardt algorithm. This calibration approach is more robust to initial conditions than current algorithms and, by leveraging an approximate matrix inversion, allows for further optimization and an efficient implementation (`redundant STEFCAL'). We also investigated using the preconditioned conjugate gradient method as an alternative to the approximate matrix inverse, but found that its computational performance is not competitive with respect to `redundant STEFCAL'. The efficient implementation of this new algorithm is made publicly available.
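The complex-optimization view of gain calibration can be sketched by stacking real and imaginary residuals and handing them to a Levenberg-Marquardt solver; this toy version assumes known model visibilities and ignores the redundant-group bookkeeping and the gauge (overall phase) freedom.

    import numpy as np
    from scipy.optimize import least_squares

    def calibrate_gains(V_obs, M, n_ant):
        # fit per-antenna complex gains g in V_pq ~ g_p * conj(g_q) * M_pq
        iu = np.triu_indices(n_ant, 1)
        def resid(x):
            g = x[:n_ant] + 1j * x[n_ant:]
            r = V_obs[iu] - (g[iu[0]] * np.conj(g[iu[1]])) * M[iu]
            return np.concatenate([r.real, r.imag])   # stacked real residuals
        x0 = np.concatenate([np.ones(n_ant), np.zeros(n_ant)])
        sol = least_squares(resid, x0, method='lm')   # Levenberg-Marquardt
        return sol.x[:n_ant] + 1j * sol.x[n_ant:]

    rng = np.random.default_rng(0)
    n = 8
    g_true = 1 + 0.1 * (rng.normal(size=n) + 1j * rng.normal(size=n))
    M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)); M = M + M.conj().T
    V = np.outer(g_true, np.conj(g_true)) * M
    g = calibrate_gains(V, M, n)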
Communication Optimal Parallel Multiplication of Sparse Random Matrices
2013-02-21
Definition 2.1), and (2) the algorithm is sparsity-independent, where the computation is statically partitioned to processors independent of the sparsity... structure of the input matrices (see Definition 2.5). The second assumption applies to nearly all existing algorithms for general sparse matrix-matrix... where A and B are n × n ER(d) matrices: Definition 2.1 An ER(d) matrix is an adjacency matrix of an Erdős-Rényi graph with parameters n and d/n. That
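ER(d) inputs of the kind defined above are easy to generate for experiments; note that scipy.sparse.random places a fixed number of nonzeros rather than i.i.d. Bernoulli entries, which is close enough for testing.

    import numpy as np
    from scipy import sparse

    def er_matrix(n, d, seed=0):
        # adjacency of an Erdos-Renyi graph: density d/n, about d nonzeros per row
        return sparse.random(n, n, density=d / n, random_state=seed,
                             data_rvs=lambda k: np.ones(k)).tocsr()

    A, B = er_matrix(10000, 8, seed=0), er_matrix(10000, 8, seed=1)
    C = A @ B                 # sparse-sparse product; nnz grows roughly like n*d^2
    print(A.nnz, B.nnz, C.nnz)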
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chow, Edmond
Solving sparse problems is at the core of many DOE computational science applications. We focus on the challenge of developing sparse algorithms that can fully exploit the parallelism in extreme-scale computing systems, in particular systems with massive numbers of cores per node. Our approach is to express a sparse matrix factorization as a large number of bilinear constraint equations, and then solving these equations via an asynchronous iterative method. The unknowns in these equations are the matrix entries of the factorization that is desired.
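A dense toy version of the idea: treat the factorization equations a_ij = sum_k l_ik u_kj (with unit-diagonal L) as a fixed point and sweep over them; in the actual method the sweeps run asynchronously in parallel over a sparse pattern.

    import numpy as np

    def fixed_point_lu(A, sweeps=50):
        # solve the bilinear constraints a_ij = sum_k l_ik u_kj by repeated sweeps;
        # an ordered sweep reproduces LU, unordered/parallel sweeps converge to it
        n = A.shape[0]
        L, U = np.eye(n), np.triu(A).astype(float)
        for _ in range(sweeps):
            for i in range(n):
                for j in range(n):
                    s = L[i, :min(i, j)] @ U[:min(i, j), j]
                    if i > j:
                        L[i, j] = (A[i, j] - s) / U[j, j]   # lower-triangular unknown
                    else:
                        U[i, j] = A[i, j] - s               # upper-triangular unknown
        return L, U

    A = np.random.default_rng(0).normal(size=(30, 30)) + 30 * np.eye(30)
    L, U = fixed_point_lu(A)
    print(np.linalg.norm(L @ U - A))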
NASA Astrophysics Data System (ADS)
Mao, Deqing; Zhang, Yin; Zhang, Yongchao; Huang, Yulin; Yang, Jianyu
2018-01-01
Doppler beam sharpening (DBS) is a critical technology for airborne radar ground mapping in the forward-squint region. In conventional DBS technology, the narrow-band Doppler filter groups formed by the fast Fourier transform (FFT) method suffer from low spectral resolution and high side lobe levels. The iterative adaptive approach (IAA), based on weighted least squares (WLS), is applied to DBS imaging applications, forming narrower Doppler filter groups than the FFT with lower side lobe levels. Regrettably, the IAA is iterative and requires matrix multiplications and inversions when forming the covariance matrix and its inverse and when traversing the WLS estimate for each sampling point, resulting in a notably high, cubic-time computational complexity. We propose a fast IAA (FIAA)-based super-resolution DBS imaging method that takes advantage of the rich matrix structures of classical narrow-band filtering. First, we formulate the covariance matrix via the FFT instead of the conventional matrix multiplication, based on the typical Fourier structure of the steering matrix. Then, by exploiting the Gohberg-Semencul representation, the inverse of the Toeplitz covariance matrix is computed by the celebrated Levinson-Durbin (LD) and Toeplitz-vector algorithms. Finally, the FFT and the fast Toeplitz-vector algorithm are further used to traverse the WLS estimates based on data-dependent trigonometric polynomials. The method uses the Hermitian property of the echo autocorrelation matrix R to achieve a fast solution and uses the Toeplitz structure of R for fast inversion. The proposed method enjoys a lower computational complexity without performance loss compared with the conventional IAA-based super-resolution DBS imaging method. Results based on simulations and measured data verify the imaging performance and the operational efficiency.
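The structure exploitation can be sketched in NumPy/SciPy: the Toeplitz covariance is assembled from the current spectrum with one inverse FFT, and solve_toeplitz plays the role of the Levinson-type recursion; the grid sizes are illustrative and the Gohberg-Semencul and trigonometric-polynomial accelerations are omitted.

    import numpy as np
    from scipy.linalg import solve_toeplitz

    def iaa_step(y, p, K):
        # one IAA update exploiting the Toeplitz structure of R = A diag(p) A^H
        # for a uniformly spaced frequency grid of size K
        n = y.size
        r = (K * np.fft.ifft(p))[:n]             # first column of Toeplitz R via FFT
        x = solve_toeplitz((r, np.conj(r)), y)   # R^{-1} y by Levinson-type solve
        A = np.exp(2j * np.pi * np.outer(np.arange(n), np.arange(K)) / K)
        Z = solve_toeplitz((r, np.conj(r)), A)   # R^{-1} a_k for every grid point
        num = A.conj().T @ x                     # a_k^H R^{-1} y
        den = np.real(np.sum(A.conj() * Z, axis=0))   # a_k^H R^{-1} a_k
        return np.abs(num / den) ** 2            # updated power spectrum

    n, K = 32, 256
    rng = np.random.default_rng(0)
    y = np.exp(2j * np.pi * 0.21 * np.arange(n)) + 0.1 * rng.normal(size=n)
    p = np.abs(np.fft.fft(y, K)) ** 2 / n ** 2   # FFT periodogram initialization
    for _ in range(10):
        p = iaa_step(y, p, K)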
Filtered gradient reconstruction algorithm for compressive spectral imaging
NASA Astrophysics Data System (ADS)
Mejia, Yuri; Arguello, Henry
2017-04-01
Compressive sensing matrices are traditionally based on random Gaussian and Bernoulli entries. Nevertheless, they are subject to physical constraints, and their structure rarely follows a dense matrix distribution, as in the case of the matrix related to compressive spectral imaging (CSI). The CSI matrix represents the integration of coded and shifted versions of the spectral bands. A spectral image can be recovered from CSI measurements by using iterative algorithms for linear inverse problems that minimize an objective function including a quadratic error term combined with a sparsity regularization term. However, current algorithms are slow because they do not exploit the structure and sparse characteristics of the CSI matrices. A gradient-based CSI reconstruction algorithm is proposed, which introduces a filtering step into each iteration of a conventional CSI reconstruction algorithm and yields improved image quality. Motivated by the structure of the CSI matrix Φ, this algorithm modifies the iterative solution such that it is forced to converge to a filtered version of Φᵀy, where y is the compressive measurement vector. We show that the filtered algorithm converges to better-quality results than the unfiltered version. Simulation results highlight the relative performance gain over existing iterative algorithms.
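A minimal sketch of a filtered gradient iteration in this spirit (the Gaussian filter, step size, and iteration count are illustrative assumptions, not the authors' exact choices):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def filtered_gradient(Phi, y, shape, n_iter=200, step=None, sigma=1.0):
    """Landweber/gradient step for y = Phi x, followed by a filtering step."""
    if step is None:
        step = 1.0 / np.linalg.norm(Phi, 2) ** 2    # <= 1/L for stability
    x = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        x = x + step * Phi.T @ (y - Phi @ x)        # gradient step
        x = gaussian_filter(x.reshape(shape), sigma).ravel()  # filtering step
    return x.reshape(shape)

Phi = np.random.randn(300, 32 * 32)                 # toy sensing matrix
x_true = gaussian_filter(np.random.rand(32, 32), 2.0)
y = Phi @ x_true.ravel()
x_rec = filtered_gradient(Phi, y, (32, 32))
```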
Zhang, Ying-Ying; Yang, Cai; Zhang, Ping
2017-08-01
In this paper, we present a novel bottom-up saliency detection algorithm from the perspective of covariance matrices on a Riemannian manifold. Each superpixel is described by a region covariance matrix on the Riemannian manifold. We carry out a two-stage sparse coding scheme via Log-Euclidean kernels to extract salient objects efficiently. In the first stage, given a background dictionary on the image borders, sparse coding of each region covariance via Log-Euclidean kernels is performed. The reconstruction error on the background dictionary is regarded as the initial saliency of each superpixel. In the second stage, the initial result is improved by calculating reconstruction errors of the superpixels on a foreground dictionary extracted from the first-stage saliency map. The sparse coding in the second stage is similar to the first stage, but is able to highlight the salient objects uniformly against the background. Finally, three post-processing methods (a highlight-inhibition function, context-based saliency weighting, and graph cut) are adopted to further refine the saliency map. Experiments on four public benchmark datasets show that the proposed algorithm outperforms state-of-the-art methods in terms of precision, recall and mean absolute error, and demonstrate the robustness and efficiency of the proposed method. Copyright © 2017 Elsevier Ltd. All rights reserved.
Computational efficiency improvements for image colorization
NASA Astrophysics Data System (ADS)
Yu, Chao; Sharma, Gaurav; Aly, Hussein
2013-03-01
We propose an efficient algorithm for colorization of greyscale images. As in prior work, colorization is posed as an optimization problem: a user specifies the color for a few scribbles drawn on the greyscale image, and the color image is obtained by propagating color information from the scribbles to surrounding regions while maximizing the local smoothness of colors. In this formulation, colorization is obtained by solving a large sparse linear system, which normally requires substantial computation and memory resources. Our algorithm improves the computational performance through three innovations over prior colorization implementations. First, the linear system is solved iteratively without explicitly constructing the sparse matrix, which significantly reduces the required memory. Second, we formulate each iteration in terms of integral images obtained by dynamic programming, reducing repetitive computation. Third, we use a coarse-to-fine framework, where a lower-resolution subsampled image is first colorized and this low-resolution color image is upsampled to initialize the colorization process at the fine level. The improvements we develop provide significant speedup and memory savings compared to the conventional approach of solving the linear system directly using off-the-shelf sparse solvers, and allow us to colorize images with typical sizes encountered in realistic applications on typical commodity computing platforms.
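A hedged sketch of the first innovation, solving A x = b without explicitly constructing the sparse matrix: a Krylov solver only needs a routine that applies A, here a generic smoothness (Laplacian-style) operator standing in for the colorization system:

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

h, w = 64, 64
n = h * w

def apply_A(x):
    # A = I + 0.5 * (discrete Laplacian); the matrix itself is never stored.
    img = x.reshape(h, w)
    lap = 4 * img
    lap[:-1, :] -= img[1:, :]; lap[1:, :] -= img[:-1, :]
    lap[:, :-1] -= img[:, 1:]; lap[:, 1:] -= img[:, :-1]
    return x + 0.5 * lap.ravel()

A = LinearOperator((n, n), matvec=apply_A)   # matrix-free operator
b = np.random.rand(n)
x, info = cg(A, b)                           # conjugate gradient solve
assert info == 0
```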
Decentralized state estimation for a large-scale spatially interconnected system.
Liu, Huabo; Yu, Haisheng
2018-03-01
A decentralized state estimator is derived for spatially interconnected systems composed of many subsystems with arbitrary connection relations. An optimization problem based on linear matrix inequalities (LMIs) is constructed for the computation of improved subsystem parameter matrices. Several computationally effective approaches are derived which efficiently utilize the block-diagonal characteristic of the system parameter matrices and the sparseness of the subsystem connection matrix. Moreover, this decentralized state estimator is proved to converge to a stable system and to attain a bounded covariance matrix of estimation errors under certain conditions. Numerical simulations show that the obtained decentralized state estimator is attractive in the synthesis of a large-scale networked system. Copyright © 2018 ISA. Published by Elsevier Ltd. All rights reserved.
Computationally Efficient Adaptive Beamformer for Ultrasound Imaging Based on QR Decomposition.
Park, Jongin; Wi, Seok-Min; Lee, Jin S
2016-02-01
Adaptive beamforming methods for ultrasound imaging have been studied to improve image resolution and contrast. The most common approach is the minimum variance (MV) beamformer, which minimizes the power of the beamformed output while keeping the response from the direction of interest constant. The method achieves higher resolution and better contrast than the delay-and-sum (DAS) beamformer, but it suffers from high computational cost, mainly due to the computation of the spatial covariance matrix and its inverse, which requires O(L³) operations, where L denotes the subarray size. In this study, we propose a computationally efficient MV beamformer based on QR decomposition. The idea behind our approach is to transform the spatial covariance matrix into a scalar matrix σI, after which the apodization weights and the beamformed output are obtained without computing the matrix inverse. The QR decomposition can be executed at low cost, so the computational complexity is reduced to O(L²). In addition, our approach is mathematically equivalent to the conventional MV beamformer and therefore shows equivalent performance. Simulation and experimental results support the validity of our approach.
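A hedged sketch of inverse-free MV weight computation via QR (a generic formulation for illustration, not the paper's specific σI transformation): with snapshot matrix X, R = XXᴴ/N = R1ᴴR1/N where Xᴴ = QR1, so R⁻¹a reduces to two triangular solves:

```python
import numpy as np
from scipy.linalg import solve_triangular

L_, N = 16, 100                        # subarray size, number of snapshots
X = (np.random.randn(L_, N) + 1j * np.random.randn(L_, N)) / np.sqrt(2)
a = np.ones(L_, dtype=complex)         # steering vector of the look direction

_, R1 = np.linalg.qr(X.conj().T)       # X^H = Q R1, so X X^H = R1^H R1
y = solve_triangular(R1.conj().T, a, lower=True)   # R1^H y = a
z = solve_triangular(R1, y, lower=False)           # R1 z = y => z ∝ R^{-1} a
w = z / (a.conj() @ z)                 # MV apodization weights, no inverse
```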
NASA Astrophysics Data System (ADS)
Bigdeli, Abbas; Biglari-Abhari, Morteza; Salcic, Zoran; Tin Lai, Yat
2006-12-01
A new pipelined systolic array-based (PSA) architecture for matrix inversion is proposed. The PSA architecture is suitable for FPGA implementations as it efficiently uses the available resources of an FPGA. It is scalable to different matrix sizes, and its parameterisation makes it suitable for customisation to application-specific needs. The new architecture achieves lower processing-element complexity than other systolic array structures for the same input matrix size. The use of the PSA architecture for a Kalman filter, which requires different structures for different numbers of states, is illustrated as an implementation example. The resulting precision error is analysed and shown to be negligible.
Liao, Ke; Zhu, Min; Ding, Lei
2013-08-01
The present study investigated the use of transform sparseness of cortical current density on the human brain surface to improve electroencephalography/magnetoencephalography (EEG/MEG) inverse solutions. Transform sparseness was assessed by evaluating the compressibility of cortical current densities in transform domains. To do that, a structure compression method from computer graphics was first adopted to compress cortical surface structure, either regular or irregular, into hierarchical multi-resolution meshes. Then, a new face-based wavelet method based on the generated multi-resolution meshes was proposed to compress current density functions defined on cortical surfaces. Twelve cortical surface models were built with three EEG/MEG software packages, and their structural compressibility was evaluated and compared using the proposed method. Monte Carlo simulations were implemented to evaluate the performance of the proposed wavelet method in compressing various cortical current density distributions, compared to two other available vertex-based wavelet methods. The present results indicate that the face-based wavelet method achieves higher transform sparseness than vertex-based wavelet methods. Furthermore, basis functions from the face-based wavelet method have lower coherence against typical EEG and MEG measurement systems than those from vertex-based methods. Both high transform sparseness and low-coherence measurements suggest that the proposed face-based wavelet method can improve the performance of L1-norm regularized EEG/MEG inverse solutions, which was further demonstrated in simulations and experimental setups using MEG data. Thus, this new transform on complicated cortical structures is promising for significantly advancing EEG/MEG inverse source imaging technologies. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Xu, Xiankun; Li, Peiwen
2017-11-01
Fixman's work in 1974 and the follow-up studies developed a method that factorizes the inverse of the mass matrix into an arithmetic combination of three sparse matrices, one of which is positive definite and needs to be further factorized using the Cholesky decomposition or similar methods. When the molecule under study has a serial chain structure, this method achieves O(n) time complexity. However, for molecules with long branches, Cholesky decomposition of the corresponding positive definite matrix introduces massive fill-in due to its nonzero structure. Although several methods can be used to reduce fill-in, none of them strictly guarantees zero fill-in for all molecules according to our tests, and thus O(n) time complexity cannot be obtained with these traditional methods. In this paper we present a new method that guarantees no fill-in in the Cholesky decomposition, developed from the correlations between the mass matrix and the geometrical structure of molecules. As a result, inverting the mass matrix retains O(n) time complexity whether or not the molecule structure has long branches.
Viscoelastic material inversion using Sierra-SD and ROL
DOE Office of Scientific and Technical Information (OSTI.GOV)
Walsh, Timothy; Aquino, Wilkins; Ridzal, Denis
2014-11-01
In this report we derive frequency-domain methods for inverse characterization of the constitutive parameters of viscoelastic materials. The inverse problem is cast in a PDE-constrained optimization framework with efficient computation of gradients and Hessian-vector products through matrix-free operations. The abstract optimization operators for first and second derivatives are derived from first principles. Various methods from the Rapid Optimization Library (ROL) are tested on the viscoelastic inversion problem. The methods described herein are applied to compute the viscoelastic bulk and shear moduli of a foam block model, which was recently used in experimental testing for viscoelastic property characterization.
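Where Hessian-vector products are computed matrix-free, one standard device is differencing the gradient along the direction v; a generic sketch (not Sierra-SD/ROL code; the quadratic test function is illustrative):

```python
import numpy as np

def hessvec(grad_f, x, v, eps=1e-6):
    """Central-difference Hessian-vector product: H v ~ dgrad/dv,
    never assembling the Hessian itself."""
    return (grad_f(x + eps * v) - grad_f(x - eps * v)) / (2 * eps)

# Check on f(x) = 0.5 x^T Q x, whose Hessian is exactly Q:
Q = np.array([[3.0, 1.0], [1.0, 2.0]])
grad_f = lambda x: Q @ x
x0, v = np.zeros(2), np.array([1.0, -1.0])
assert np.allclose(hessvec(grad_f, x0, v), Q @ v, atol=1e-4)
```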
NASA Astrophysics Data System (ADS)
Jiao, J.; Trautz, A.; Zhang, Y.; Illangasekera, T.
2017-12-01
Subsurface flow and transport characterization under data-sparse conditions is addressed by a new and computationally efficient inverse theory that simultaneously estimates parameters, state variables, and boundary conditions. Uncertainty in static data can be accounted for, while the parameter structure can be complex due to process uncertainty. The approach has been successfully extended to inverting transient and unsaturated flows as well as to contaminant source identification under unknown initial and boundary conditions. In one example, by sampling numerical experiments simulating two-dimensional steady-state flow in which a tracer migrates, a sequential inversion scheme first estimates the flow field and permeability structure before the evolution of the tracer plume and the dispersivities are jointly estimated. Compared to traditional inversion techniques, the theory does not use forward simulations to assess model-data misfits, so knowledge of the difficult-to-determine site boundary condition is not required. To test the general applicability of the theory, data generated during high-precision intermediate-scale experiments (i.e., at a scale intermediate between the field and column scales) in large synthetic aquifers can be used. The design of such experiments is not trivial, as laboratory conditions have to be selected to mimic natural systems in order to provide useful data, requiring a variety of sensors and data collection strategies. This paper presents the design of such an experiment in a synthetic, multi-layered aquifer with dimensions of 242.7 × 119.3 × 7.7 cm.
NASA Astrophysics Data System (ADS)
Liu, Y.; Pau, G. S. H.; Finsterle, S.
2015-12-01
Parameter inversion involves inferring model parameter values from sparse observations of some observables. To infer the posterior probability distributions of the parameters, Markov chain Monte Carlo (MCMC) methods are typically used. However, the large number of forward simulations needed and limited computational resources restrict the complexity of the hydrological models that can be used in these methods. In view of this, we studied the implicit sampling (IS) method, an efficient importance sampling technique that generates samples in the high-probability region of the posterior distribution and thus reduces the number of forward simulations that need to be run. For a pilot-point inversion of a heterogeneous permeability field based on a synthetic ponded infiltration experiment simulated with TOUGH2 (a subsurface modeling code), we showed that IS with a linear map provides an accurate Bayesian description of the parameterized permeability field at the pilot points with only approximately 500 forward simulations. We further studied the use of surrogate models to improve the computational efficiency of parameter inversion. We implemented two reduced-order models (ROMs) for the TOUGH2 forward model. One is based on polynomial chaos expansion (PCE), whose coefficients are obtained using the sparse Bayesian learning technique to mitigate the "curse of dimensionality" of the PCE terms. The other is Gaussian process regression (GPR), for which different covariance, likelihood and inference models are considered. Preliminary results indicate that ROMs constructed over the prior parameter space perform poorly, so it is impractical to replace the hydrological model by a ROM directly in an MCMC method. However, the IS method can work with a ROM constructed for parameters in the close vicinity of the maximum a posteriori probability (MAP) estimate. We will discuss the accuracy and computational efficiency of using ROMs in the implicit sampling procedure for the hydrological problem considered. This work was supported, in part, by the U.S. Dept. of Energy under Contract No. DE-AC02-05CH11231.
Computationally Efficient Modeling and Simulation of Large Scale Systems
NASA Technical Reports Server (NTRS)
Jain, Jitesh (Inventor); Koh, Cheng-Kok (Inventor); Balakrishnan, Vankataramanan (Inventor); Cauley, Stephen F (Inventor); Li, Hong (Inventor)
2014-01-01
A system for simulating operation of a VLSI interconnect structure having capacitive and inductive coupling between its nodes, including a processor and a memory. The processor is configured to obtain a matrix X and a matrix Y containing different combinations of passive circuit element values for the interconnect structure, the element values for each matrix including inductance L and inverse capacitance P; to obtain an adjacency matrix A associated with the interconnect structure; to store the matrices X, Y, and A in the memory; and to perform numerical integration to solve first and second equations.
Inverse opal carbons for counter electrode of dye-sensitized solar cells.
Kang, Da-Young; Lee, Youngshin; Cho, Chang-Yeol; Moon, Jun Hyuk
2012-05-01
We investigated the fabrication of inverse opal carbon counter electrodes for DSSCs using a colloidal templating method. Specifically, bare inverse opal carbon (IOC), mesopore-incorporated inverse opal carbon (mIOC), and graphitized inverse opal carbon (gIOC) were synthesized and stably dispersed in ethanol solution for spray coating on an FTO substrate. The thickness of the electrode was controlled by the number of coatings, and the average relative thickness was evaluated by measuring the transmittance spectrum. The effect of the counter electrode thickness on the photovoltaic performance of the DSSCs was investigated and analyzed via the interfacial charge transfer resistance (R(CT)) from EIS measurements. The effects of the surface area and conductivity of the inverse opal were also investigated by considering the increase in surface area due to the mesopores in the inverse opal carbon and the conductivity gained by graphitization of the carbon matrix. The results showed that the fill factor (FF), and thereby the efficiency, of the DSSCs increased with electrode thickness. A larger FF, and thereby greater efficiency, was achieved for mIOC and gIOC compared to IOC, which was attributed to their lower R(CT). Finally, the inverse opal-based carbons showed efficiency comparable to a conventional Pt counter electrode when applied to DSSCs.
NASA Technical Reports Server (NTRS)
Nguyen, Duc T.; Mohammed, Ahmed Ali; Kadiam, Subhash
2010-01-01
Solving large (and sparse) systems of simultaneous linear equations has been (and continues to be) a major challenge for many real-world engineering/science applications [1-2]. For many practical/large-scale problems, the sparse, Symmetrical and Positive Definite (SPD) system of linear equations can be conveniently represented in matrix notation as [A] {x} = {b}, where the square coefficient matrix [A] and the Right-Hand-Side (RHS) vector {b} are known. The unknown solution vector {x} can be efficiently computed by the following step-by-step procedure [1-2]: reordering phase, matrix factorization phase, forward solution phase, and backward solution phase. In this research work, a Game-Based Learning (GBL) approach has been developed to help engineering students understand crucial details of the matrix reordering and factorization phases. A "chess-like" game has been developed that can be played by either a single player or two players. Through this "chess-like" open-ended game, the players/learners will not only understand the key concepts involved in reordering algorithms (based on existing algorithms) but also have the opportunity to "discover new algorithms" that are better than existing ones. Implementing the proposed "chess-like" game for the matrix reordering and factorization phases can be enhanced by FLASH [3] computer environments, where computer simulation with animated human voice, sound effects, visual/graphical/colorful displays of matrix tables, score (or monetary) awards for the best game players, etc., can all be exploited. Preliminary demonstrations of the developed GBL approach can be viewed by anyone with access to the Internet website [4].
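For readers who want to experiment with the four phases themselves, a hedged SciPy sketch (reverse Cuthill-McKee standing in for the reordering phase, splu for factorization plus the forward/backward solves; the random SPD test matrix is illustrative):

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.csgraph import reverse_cuthill_mckee
from scipy.sparse.linalg import splu

A = sp.random(200, 200, density=0.02, format='csr')
A = (A + A.T + 200 * sp.eye(200)).tocsr()   # symmetric, diagonally dominant
b = np.random.rand(200)

perm = reverse_cuthill_mckee(A, symmetric_mode=True)  # 1) reordering phase
Ap = A[perm][:, perm].tocsc()                         #    permuted system
lu = splu(Ap)                                         # 2) factorization phase
xp = lu.solve(b[perm])                                # 3)+4) fwd/bwd solves
x = np.empty_like(xp); x[perm] = xp                   # undo the permutation
assert np.allclose(A @ x, b)
```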
Stability and stabilisation of a class of networked dynamic systems
NASA Astrophysics Data System (ADS)
Liu, H. B.; Wang, D. Q.
2018-04-01
We investigate the stability and stabilisation of a linear time-invariant networked heterogeneous system with arbitrarily connected subsystems. A new necessary and sufficient condition for stability, based on linear matrix inequalities, is derived, and a stabilisation procedure is built on it. The obtained conditions efficiently utilise the block-diagonal characteristic of the system parameter matrices and the sparseness of the subsystem connection matrix. Moreover, a sufficient condition dependent only on each individual subsystem is also presented for the stabilisation of large-scale networked systems. Numerical simulations show that these conditions are computationally effective in the analysis and synthesis of a large-scale networked system.
Sparse matrix methods based on orthogonality and conjugacy
NASA Technical Reports Server (NTRS)
Lawson, C. L.
1973-01-01
A matrix having a high percentage of zero elements is called sparse. In the solution of systems of linear equations or linear least squares problems involving large sparse matrices, significant savings in computer cost can be achieved by taking advantage of the sparsity. The conjugate gradient algorithm and a set of related algorithms are described.
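A compact NumPy rendering of the conjugate gradient algorithm the abstract refers to; the matrix is touched only through matrix-vector products, which is what makes CG attractive for large sparse problems (the tridiagonal test matrix is illustrative):

```python
import numpy as np
import scipy.sparse as sp

def conjugate_gradient(A, b, tol=1e-10, maxiter=1000):
    x = np.zeros_like(b)
    r = b - A @ x                      # initial residual
    p = r.copy()                       # initial search direction
    rs = r @ r
    for _ in range(maxiter):
        Ap = A @ p
        alpha = rs / (p @ Ap)          # exact line search step
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p      # conjugate direction update
        rs = rs_new
    return x

A = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(100, 100))  # 1-D Laplacian, SPD
b = np.ones(100)
x = conjugate_gradient(A, b)
assert np.allclose(A @ x, b, atol=1e-6)
```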
Thakur, Anil S.; Robin, Gautier; Guncar, Gregor; Saunders, Neil F. W.; Newman, Janet; Martin, Jennifer L.; Kobe, Bostjan
2007-01-01
Background Crystallization is a major bottleneck in the process of macromolecular structure determination by X-ray crystallography. Successful crystallization requires the formation of nuclei and their subsequent growth to crystals of suitable size. Crystal growth generally occurs spontaneously in a supersaturated solution as a result of homogeneous nucleation. However, in a typical sparse matrix screening experiment, precipitant and protein concentration are not sampled extensively, and supersaturation conditions suitable for nucleation are often missed. Methodology/Principal Findings We tested the effect of nine potential heterogeneous nucleating agents on the crystallization of ten test proteins in a sparse matrix screen. Several nucleating agents induced crystal formation under conditions where no crystallization occurred in their absence. Four nucleating agents (dried seaweed, horse hair, cellulose and hydroxyapatite) had a considerable overall positive effect on crystallization success. This effect was further enhanced when these nucleating agents were used in combination with each other. Conclusions/Significance Our results suggest that the addition of heterogeneous nucleating agents increases the chances of crystal formation when using sparse matrix screens. PMID:17971854
The fastclime Package for Linear Programming and Large-Scale Precision Matrix Estimation in R.
Pang, Haotian; Liu, Han; Vanderbei, Robert
2014-02-01
We develop an R package fastclime for solving a family of regularized linear programming (LP) problems. Our package efficiently implements the parametric simplex algorithm, which provides a scalable and sophisticated tool for solving large-scale linear programs. As an illustrative example, one use of our LP solver is to implement an important sparse precision matrix estimation method called CLIME (Constrained L1 Minimization Estimator). Compared with existing packages for this problem such as clime and flare, our package has three advantages: (1) it efficiently calculates the full piecewise-linear regularization path; (2) it provides an accurate dual certificate as a stopping criterion; (3) it is completely coded in C and is highly portable. This package is designed to be useful to statisticians and machine learning researchers for solving a wide range of problems.
2011-09-01
strain data provided by in-situ strain sensors. The application focus is on the strain data obtained from FBG (Fiber Bragg Grating) sensor arrays...sparsely distributed lines to simulate strain data from FBG (Fiber Bragg Grating) arrays that provide either single-core (axial) or rosette (tri...when the measured strain data are sparse, as is often the case when FBG sensors are used. For an inverse element without strain-sensor data, the
NASA Astrophysics Data System (ADS)
Szabó, Norbert Péter
2018-03-01
An evolutionary inversion approach is suggested for the interpretation of nuclear and resistivity logs measured by direct-push tools in shallow unsaturated sediments. The efficiency of formation evaluation is improved by estimating simultaneously (1) the petrophysical properties that vary rapidly along a drill hole with depth and (2) the zone parameters that can be treated as constant, in one inversion procedure. In the workflow, the fractional volumes of water, air, matrix and clay are estimated in adjacent depths by linearized inversion, whereas the clay and matrix properties are updated using a float-encoded genetic meta-algorithm. The proposed inversion method provides an objective estimate of the zone parameters that appear in the tool response equations applied to solve the forward problem, which can significantly increase the reliability of the petrophysical model as opposed to setting these parameters arbitrarily. The global optimization meta-algorithm not only assures the best fit between the measured and calculated data but also gives a reliable solution, practically independent of the initial model, as laboratory data are unnecessary in the inversion procedure. The feasibility test uses engineering geophysical sounding logs observed in an unsaturated loessy-sandy formation in Hungary. The multi-borehole extension of the inversion technique is developed to determine the petrophysical properties and their estimation errors along a profile of drill holes. The genetic meta-algorithmic inversion method is recommended for hydrogeophysical logging applications of various kinds to automatically extract the volumetric ratios of rock and fluid constituents as well as the most important zone parameters in a reliable inversion procedure.
A general parallel sparse-blocked matrix multiply for linear scaling SCF theory
NASA Astrophysics Data System (ADS)
Challacombe, Matt
2000-06-01
A general approach to the parallel sparse-blocked matrix-matrix multiply is developed in the context of linear scaling self-consistent-field (SCF) theory. The data-parallel message passing method uses non-blocking communication to overlap computation and communication. The space-filling curve heuristic is used to achieve data locality for sparse matrix elements that decay with “separation”. Load balance is achieved by solving the bin packing problem for blocks with variable size. With this new method as the kernel, the parallel performance of simplified density matrix minimization (SDMM) for solution of the SCF equations is investigated for RHF/6-31G** water clusters and RHF/3-21G estane globules. Sustained rates above 5.7 GFLOPS for the SDMM have been achieved for (H2O)200 with 95 Origin 2000 processors. Scalability is found to be limited by load imbalance, which increases with decreasing granularity, due primarily to the inhomogeneous distribution of variable block sizes.
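A hedged illustration of the space-filling-curve heuristic mentioned above: ordering 2-D block indices by a Morton (Z-order) key keeps blocks that are close in the matrix close in the 1-D ordering used to distribute work, improving data locality (the bit-interleaving function is a generic sketch, not the paper's code):

```python
def morton_key(i, j, bits=16):
    """Interleave the bits of (i, j) into a single Z-order key."""
    key = 0
    for b in range(bits):
        key |= ((i >> b) & 1) << (2 * b + 1)
        key |= ((j >> b) & 1) << (2 * b)
    return key

blocks = [(i, j) for i in range(4) for j in range(4)]
blocks.sort(key=lambda ij: morton_key(*ij))
print(blocks)   # Z-order traversal of the 4x4 block grid
```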
A new fast direct solver for the boundary element method
NASA Astrophysics Data System (ADS)
Huang, S.; Liu, Y. J.
2017-09-01
A new fast direct linear equation solver for the boundary element method (BEM) is presented in this paper. The idea of the new fast direct solver stems from the concept of the hierarchical off-diagonal low-rank (HODLR) matrix. A HODLR matrix can be decomposed into the product of several block-diagonal matrices, and its inverse can be calculated efficiently with the Sherman-Morrison-Woodbury formula. In this paper, a more general and efficient approach to approximating the coefficient matrix of the BEM with a HODLR matrix is proposed. Compared to current fast direct solvers based on HODLR matrices, the proposed method is suitable for solving general 3-D boundary element models. Several numerical examples of 3-D potential problems with more than 200,000 unknowns are presented. The results show that the new fast direct solver can solve large 3-D BEM models accurately and more efficiently than the conventional BEM.
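A small numerical sketch of the Sherman-Morrison-Woodbury identity that such solvers rely on: (A + UCV)⁻¹b is computed from an easy solve with A plus a small k × k solve (the diagonal A and sizes are illustrative assumptions):

```python
import numpy as np

n, k = 500, 5
A = np.diag(np.random.rand(n) + 1.0)       # easy-to-invert part (diagonal)
U, V = np.random.randn(n, k), np.random.randn(k, n)
C = np.eye(k)                              # low-rank correction U C V
b = np.random.randn(n)

Ainv_b = b / np.diag(A)                    # A^{-1} b
Ainv_U = U / np.diag(A)[:, None]           # A^{-1} U
small = np.linalg.inv(C) + V @ Ainv_U      # k x k "capacitance" matrix
x = Ainv_b - Ainv_U @ np.linalg.solve(small, V @ Ainv_b)

assert np.allclose((A + U @ C @ V) @ x, b) # solves the full system
```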
Data traffic reduction schemes for sparse Cholesky factorizations
NASA Technical Reports Server (NTRS)
Naik, Vijay K.; Patrick, Merrell L.
1988-01-01
Load distribution schemes are presented which minimize the total data traffic in the Cholesky factorization of dense and sparse, symmetric, positive definite matrices on multiprocessor systems with local and shared memory. The total data traffic in factoring an n x n sparse, symmetric, positive definite matrix representing an n-vertex regular 2-D grid graph using n^α (α ≤ 1) processors is shown to be O(n^(1+α/2)). It is O(n^(3/2)) when n^α (α ≥ 1) processors are used. Under the conditions of uniform load distribution, these results are shown to be asymptotically optimal. The schemes allow efficient use of up to O(n) processors before the total data traffic reaches its maximum value of O(n^(3/2)). The partitioning employed within the schemes allows better utilization of the data accessed from shared memory than previously published methods.
Sparse matrix-vector multiplication on network-on-chip
NASA Astrophysics Data System (ADS)
Sun, C.-C.; Götze, J.; Jheng, H.-Y.; Ruan, S.-J.
2010-12-01
In this paper, we present an idea for performing matrix-vector multiplication using a Network-on-Chip (NoC) architecture. In traditional IC design, on-chip communications have been designed with dedicated point-to-point interconnections, so regular local data transfer is the major concept of many parallel implementations. However, in the parallel implementation of sparse matrix-vector multiplication (SMVM), which is the main step of all iterative algorithms for solving systems of linear equations, the required data transfers depend on the sparsity structure of the matrix and can be extremely irregular. Using the NoC architecture makes it possible to deal with arbitrary structure of the data transfers, i.e. with the irregular structure of sparse matrices. So far, we have implemented the proposed SMVM-NoC architecture in sizes 4×4 and 5×5 with IEEE 754 single-precision floating point using an FPGA.
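For reference, the SMVM kernel itself in compressed sparse row (CSR) form; the indirect gather from x is exactly what makes the communication pattern depend on the sparsity structure (the 3 × 3 example is illustrative):

```python
import numpy as np

def csr_spmv(indptr, indices, data, x):
    """y = A x for A stored in CSR (row pointers, column indices, values)."""
    y = np.zeros(len(indptr) - 1)
    for row in range(len(y)):
        for k in range(indptr[row], indptr[row + 1]):
            y[row] += data[k] * x[indices[k]]   # irregular gather from x
    return y

# CSR encoding of [[2,0,1],[0,3,0],[4,0,5]]:
indptr = np.array([0, 2, 3, 5])
indices = np.array([0, 2, 1, 0, 2])
data = np.array([2.0, 1.0, 3.0, 4.0, 5.0])
print(csr_spmv(indptr, indices, data, np.ones(3)))   # [3. 3. 9.]
```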
Structure-preserving and rank-revealing QR-factorizations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bischof, C.H.; Hansen, P.C.
1991-11-01
The rank-revealing QR-factorization (RRQR-factorization) is a special QR-factorization that is guaranteed to reveal the numerical rank of the matrix under consideration. This makes the RRQR-factorization a useful tool in the numerical treatment of many rank-deficient problems in numerical linear algebra. In this paper, a framework is presented for the efficient implementation of RRQR algorithms, in particular for sparse matrices. A sparse RRQR algorithm should seek to preserve the structure and sparsity of the matrix as much as possible while retaining the ability to capture the numerical rank safely. To this end, the paper proposes to compute an initial QR-factorization using a restricted pivoting strategy guarded by incremental condition estimation (ICE), and then to apply the algorithm suggested by Chan and Foster to this QR-factorization. The column exchange strategy used in the initial QR-factorization exploits the fact that certain column exchanges do not change the sparsity structure, and computes a sparse QR-factorization that is a good approximation of the sought-after RRQR-factorization. Due to quantities produced by ICE, the Chan/Foster RRQR algorithm can be implemented very cheaply, thus verifying that the sought-after RRQR-factorization has indeed been computed. Experimental results on a model problem show that the initial QR-factorization is indeed very likely to produce an RRQR-factorization.
Underdetermined blind separation of three-way fluorescence spectra of PAHs in water
NASA Astrophysics Data System (ADS)
Yang, Ruifang; Zhao, Nanjing; Xiao, Xue; Zhu, Wei; Chen, Yunan; Yin, Gaofang; Liu, Jianguo; Liu, Wenqing
2018-06-01
In this work, an underdetermined blind decomposition method is developed to recognize individual components from the three-way fluorescence spectra of their mixtures using sparse component analysis (SCA). The mixing matrix is estimated from the mixtures using a fuzzy data clustering algorithm together with the scatters corresponding to local energy maxima in the time-frequency domain, and the spectra of the object components are recovered by a pseudoinverse technique. As an example, the spectra of three and four pure components were blindly extracted from two mixture samples using this method, with similarities between resolved and reference spectra all above 0.80. This work opens a new and effective path to monitoring PAHs in water by the three-way fluorescence spectroscopy technique.
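A hedged sketch of the pseudoinverse recovery step (with a known mixing matrix standing in for the clustering-based estimate; in the underdetermined case the pseudoinverse returns the minimum-norm solution, and the actual method additionally exploits sparsity in the time-frequency domain):

```python
import numpy as np

m, n, p = 2, 3, 400                # 2 mixtures, 3 sources, 400 spectral points
A = np.random.rand(m, n)           # mixing matrix (here assumed already estimated)
S = np.random.rand(n, p)           # true source spectra (for the demo only)
X = A @ S                          # observed mixture spectra

S_hat = np.linalg.pinv(A) @ X      # minimum-norm recovery of the sources
print(S_hat.shape)                 # (3, 400)
```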
NASA Astrophysics Data System (ADS)
Monnier, S.; Lumley, D. E.; Kamei, R.; Goncharov, A.; Shragge, J. C.
2016-12-01
Ocean-bottom seismic (OBS) datasets have become increasingly used in recent years to develop high-resolution, wavelength-scale P-wave velocity models of the lithosphere from waveform inversion, due to their recording of long-offset transmitted phases. New OBS surveys are evolving towards novel acquisition geometries involving longer offsets (several hundreds of km) and broader frequency content (1-100 Hz), while receiver sampling often remains sparse (several km). Therefore, it is critical to assess the effects of such geometries on the eventual success and resolution of waveform inversion velocity models. In this study, we investigate the feasibility of waveform inversion on the Bart 2D OBS profile acquired offshore Western Australia, to investigate regional crustal and Moho structures. The dataset features 14 broadband seismometers (0.01-100 Hz) from AuScope's national OBS fleet, offsets in excess of 280 km, and sparse receiver sampling (18 km). We perform our analysis in four stages: (1) field data analysis, (2) 2D P-wave velocity model building, (3) synthetic data modelling, and (4) waveform inversion. Data exploration shows high-quality active-source signal down to 2 Hz, and usable first arrivals to offsets greater than 100 km. The background velocity model is constructed by combining crustal and Moho information from continental reference models (e.g., AuSREM, AusMoho). These low-resolution studies suggest a crustal thickness of 20-25 km along our seismic line and constitute a starting point for synthetic modelling and inversion. We perform synthetic 2D time-domain modelling to: (1) evaluate the misfit between synthetic and field data within the usable frequency band (2-10 Hz); (2) validate our velocity model; and (3) observe the effects of the sparse OBS interval on data quality. Finally, we apply 2D acoustic frequency-domain waveform inversion to the synthetic data to generate velocity model updates. The inverted model is compared to the reference model to investigate the improved crustal resolution and Moho boundary delineation that could be realized using waveform inversion, and to evaluate the effects of the acquisition parameters. The inversion strategies developed through the synthetic tests will help the subsequent inversion of sparse, long-offset OBS field data.
Magnetotelluric inversion via reverse time migration algorithm of seismic data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ha, Taeyoung; Shin, Changsoo
2007-07-01
We propose a new algorithm for two-dimensional magnetotelluric (MT) inversion. Our algorithm is an MT inversion based on the steepest descent method, borrowed from the backpropagation technique of seismic inversion, or reverse time migration, introduced in the mid-1980s by Lailly and Tarantola. The steepest descent direction can be calculated efficiently by using the symmetry of the numerical Green's function derived from a mixed finite element method proposed by Nedelec for Maxwell's equations, without calculating the Jacobian matrix explicitly. We construct three different objective functions by taking the logarithm of the complex apparent resistivity, as introduced in the recent waveform inversion algorithm by Shin and Min. These objective functions can be naturally separated into amplitude inversion, phase inversion and simultaneous inversion. We demonstrate our algorithm by showing three inversion results for synthetic data.
Computationally efficient multibody simulations
NASA Technical Reports Server (NTRS)
Ramakrishnan, Jayant; Kumar, Manoj
1994-01-01
Computationally efficient approaches to the solution of the dynamics of multibody systems are presented in this work. The computational efficiency is derived from both the algorithmic and the implementational standpoints. Order(n) approaches provide a new formulation of the equations of motion, eliminating the assembly and numerical inversion of a system mass matrix required by conventional algorithms. Computational efficiency is also gained in the implementation phase by the symbolic processing and parallel implementation of these equations. Comparison of this algorithm with existing multibody simulation programs illustrates the increased computational efficiency.
Zhou, Xiaolong; Wang, Xina; Feng, Xi; Zhang, Kun; Peng, Xiaoniu; Wang, Hanbin; Liu, Chunlei; Han, Yibo; Wang, Hao; Li, Quan
2017-07-12
Carbon dots (C dots, size < 10 nm) have conventionally been decorated onto semiconductor matrices for photocatalytic H2 evolution, but the efficiency is largely limited by the low loading ratio of the C dots on the photocatalyst. Here, we propose an inverse structure of Cd0.5Zn0.5S quantum dots (QDs) loaded onto an onion-like carbon (OLC) matrix for noble-metal-free photocatalytic H2 evolution. Cd0.5Zn0.5S QDs (6.9 nm) were uniformly distributed on the OLC (30 nm) matrix, which exhibits both upconverted and downconverted photoluminescence. Such an inverse structure allows full optimization of the QD/OLC interfaces for effective energy transfer and charge separation, both of which contribute to efficient H2 generation. Optimized H2 generation rates of 2018 μmol/h/g (under visible-light irradiation) and 58.6 μmol/h/g (under 550-900 nm irradiation) were achieved in the Cd0.5Zn0.5S/OLC composite samples. The present work shows that using the OLC matrix in such an inverse construction is a promising strategy for noble-metal-free solar hydrogen production.
Agarwal, Sapan; Quach, Tu-Thach; Parekh, Ojas; ...
2016-01-06
The exponential increase in data over the last decade presents a significant challenge to analytics efforts that seek to process and interpret such data for various applications. Neural-inspired computing approaches are being developed in order to leverage the computational properties of the analog, low-power data processing observed in biological systems. Analog resistive memory crossbars can perform a parallel read or a vector-matrix multiplication as well as a parallel write or a rank-1 update with high computational efficiency. For an N × N crossbar, these two kernels can be O(N) more energy efficient than a conventional digital memory-based architecture. If the read operation is noise limited, the energy to read a column can be independent of the crossbar size (O(1)). These two kernels form the basis of many neuromorphic algorithms such as image, text, and speech recognition. For instance, these kernels can be applied to a neural sparse coding algorithm to give an O(N) reduction in energy for the entire algorithm when run with finite precision. Sparse coding is a rich problem with a host of applications including computer vision, object tracking, and, more generally, unsupervised learning.
Teaching Tip: When a Matrix and Its Inverse Are Stochastic
ERIC Educational Resources Information Center
Ding, J.; Rhee, N. H.
2013-01-01
A stochastic matrix is a square matrix with nonnegative entries and row sums 1. The simplest example is a permutation matrix, whose rows permute the rows of an identity matrix. A permutation matrix and its inverse are both stochastic. We prove the converse, that is, if a matrix and its inverse are both stochastic, then it is a permutation matrix.
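A quick numerical illustration of the teaching tip (NumPy; the matrices are illustrative):

```python
import numpy as np

P = np.eye(4)[[2, 0, 3, 1]]                       # a permutation matrix
assert np.allclose(np.linalg.inv(P), P.T)         # its inverse is its transpose
assert (P >= 0).all() and np.allclose(P.sum(axis=1), 1.0)      # P is stochastic
assert np.allclose(np.linalg.inv(P).sum(axis=1), 1.0)          # so is P^{-1}

Q = np.array([[0.9, 0.1],                         # stochastic and invertible,
              [0.2, 0.8]])                        # but not a permutation
print(np.linalg.inv(Q))                           # negative entries appear:
                                                  # the inverse is not stochastic
```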
NASA Astrophysics Data System (ADS)
Liu, Yang; Li, Feng; Xin, Lei; Fu, Jie; Huang, Puming
2017-10-01
Large amounts of data are one of the most obvious features of satellite-based remote sensing systems, and a burden for data processing and transmission. The theory of compressive sensing (CS) has been proposed for almost a decade, and extensive experiments show that CS performs well in data compression and recovery, so we apply CS theory to remote sensing image acquisition. In CS, the construction of a classical sensing matrix for all sparse signals has to satisfy the Restricted Isometry Property (RIP) strictly, which limits the application of CS to practical image compression. For remote sensing images, however, we know some inherent characteristics such as non-negativity, smoothness, etc. Therefore, the goal of this paper is to present a novel measurement matrix that is not required to satisfy the RIP. The new sensing matrix consists of two parts: a standard Nyquist sampling matrix for thumbnails and a conventional CS sampling matrix. Since most sun-synchronous satellites orbit the earth every 90 minutes and the revisit cycle is short, many previously captured remote sensing images of the same place are available in advance. This drives us to reconstruct remote sensing images through a deep learning approach using measurements from the new framework. We therefore propose a novel deep convolutional neural network (CNN) architecture which takes undersampling measurements as input and outputs an intermediate reconstruction image. It is well known that the training procedure for the network takes a long time; luckily, the training step needs to be done only once, which makes the approach attractive for a host of sparse recovery problems.
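A hedged sketch of the two-part sensing matrix described above (image size, downsampling factor, and number of random CS rows are illustrative assumptions):

```python
import numpy as np

side, f, m_cs = 64, 8, 400          # image side, block size, random CS rows
n = side * side

# Nyquist part: each row averages one f x f block, producing a thumbnail.
thumb = np.zeros((n // f**2, n))
for bi in range(side // f):
    for bj in range(side // f):
        row = bi * (side // f) + bj
        for di in range(f):
            for dj in range(f):
                thumb[row, (bi * f + di) * side + (bj * f + dj)] = 1.0 / f**2

# Stack with a conventional random CS sampling matrix.
Phi = np.vstack([thumb, np.random.randn(m_cs, n) / np.sqrt(m_cs)])
print(Phi.shape)                    # (464, 4096): thumbnail + CS measurements
```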
Joint sparse learning for 3-D facial expression generation.
Song, Mingli; Tao, Dacheng; Sun, Shengpeng; Chen, Chun; Bu, Jiajun
2013-08-01
3-D facial expression generation, including synthesis and retargeting, has received intensive attention in recent years because it is important to produce realistic 3-D faces with specific expressions in modern film production and computer games. In this paper, we present joint sparse learning (JSL) to learn mapping functions, and their respective inverses, that model the relationship between high-dimensional 3-D faces (of different expressions and identities) and their corresponding low-dimensional representations. Based on JSL, we can effectively and efficiently generate various expressions of a 3-D face by either synthesis or retargeting. Furthermore, JSL is able to restore 3-D faces with holes by learning a mapping function between incomplete and intact data. Experimental results on a wide range of 3-D faces demonstrate the effectiveness of the proposed approach in comparison with representative methods in terms of quality, time cost, and robustness.
Optimal Tikhonov Regularization in Finite-Frequency Tomography
NASA Astrophysics Data System (ADS)
Fang, Y.; Yao, Z.; Zhou, Y.
2017-12-01
The last decade has witnessed a progressive transition in seismic tomography from ray theory to finite-frequency theory, which overcomes the resolution limit of the high-frequency approximation in ray theory. In addition to approximations in wave propagation physics, a main difference between ray-theoretical tomography and finite-frequency tomography is the sparseness of the associated sensitivity matrix. It is well known that seismic tomographic problems are ill-posed, and regularizations such as damping and smoothing are often applied to analyze the trade-off between data misfit and model uncertainty. The regularizations depend on the structure of the matrix as well as the noise level of the data. Cross-validation has been used to constrain data uncertainties in body-wave finite-frequency inversions when measurements at multiple frequencies are available to invert for a common structure. In this study, we explore an optimal Tikhonov regularization in surface-wave phase-velocity tomography based on minimization of an empirical Bayes risk function using theoretical training datasets. We exploit the structure of the sensitivity matrix in the framework of the singular value decomposition (SVD), which also allows for the calculation of the complete resolution matrix. We compare the optimal Tikhonov regularization in finite-frequency tomography with traditional trade-off analysis using surface-wave dispersion measurements from global as well as regional studies.
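A minimal sketch of Tikhonov regularization computed through the SVD, the mechanism underlying the trade-off analysis described above (the random kernel and damping values are illustrative assumptions):

```python
import numpy as np

G = np.random.randn(80, 120)              # sensitivity (kernel) matrix
d = np.random.randn(80)                   # data vector
U, s, Vt = np.linalg.svd(G, full_matrices=False)

def tikhonov(lam):
    """m_lam = sum_i s_i/(s_i^2 + lam^2) (u_i . d) v_i: damped inverse."""
    f = s / (s**2 + lam**2)               # filtered inverse singular values
    return Vt.T @ (f * (U.T @ d))

for lam in [1e-3, 1e-1, 1e1]:             # sweep the damping parameter
    m = tikhonov(lam)
    print(lam, np.linalg.norm(G @ m - d), np.linalg.norm(m))  # misfit vs norm
```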
NASA Astrophysics Data System (ADS)
Lin, Chuang; Wang, Binghui; Jiang, Ning; Farina, Dario
2018-04-01
Objective. This paper proposes a novel simultaneous and proportional multiple degree of freedom (DOF) myoelectric control method for active prostheses. Approach. The approach is based on non-negative matrix factorization (NMF) of surface EMG signals with the inclusion of sparseness constraints. By applying a sparseness constraint to the control signal matrix, it is possible to extract the basis information from arbitrary movements (quasi-unsupervised approach) for multiple DOFs concurrently. Main Results. In online testing based on target hitting, able-bodied subjects reached a greater throughput (TP) when using sparse NMF (SNMF) than with classic NMF or with linear regression (LR). Accordingly, the completion time (CT) was shorter for SNMF than NMF or LR. The same observations were made in two patients with unilateral limb deficiencies. Significance. The addition of sparseness constraints to NMF allows for a quasi-unsupervised approach to myoelectric control with superior results with respect to previous methods for the simultaneous and proportional control of multi-DOF. The proposed factorization algorithm allows robust simultaneous and proportional control, is superior to previous supervised algorithms, and, because of minimal supervision, paves the way to online adaptation in myoelectric control.
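A minimal sketch of NMF with an L1 sparseness penalty on the activation matrix via multiplicative updates (a generic formulation for illustration, not the authors' algorithm; the penalty weight, data sizes, and update rule are assumptions):

```python
import numpy as np

def sparse_nmf(V, k, lam=0.1, n_iter=200, eps=1e-9):
    """Minimize ||V - W H||_F^2 + lam * sum(H), with W, H >= 0;
    lam enters the denominator of the H update and promotes sparsity."""
    rng = np.random.default_rng(0)
    W = rng.random((V.shape[0], k))
    H = rng.random((k, V.shape[1]))
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + lam + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

V = np.random.rand(8, 1000)     # e.g., 8 EMG channels x 1000 time samples
W, H = sparse_nmf(V, k=2)       # 2 components, one per controlled DOF
```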
Sparse PCA with Oracle Property
Gu, Quanquan; Wang, Zhaoran; Liu, Han
2014-01-01
In this paper, we study the estimation of the k-dimensional sparse principal subspace of the covariance matrix Σ in the high-dimensional setting. We aim to recover the oracle principal subspace solution, i.e., the principal subspace estimator obtained assuming the true support is known a priori. To this end, we propose a family of estimators based on the semidefinite relaxation of sparse PCA with novel regularizations. In particular, under a weak assumption on the magnitude of the population projection matrix, one estimator within this family exactly recovers the true support with high probability, has exact rank-k, and attains a s/n statistical rate of convergence, with s being the subspace sparsity level and n the sample size. Compared to existing support recovery results for sparse PCA, our approach does not hinge on the spiked covariance model or the limited correlation condition. As a complement to the first estimator, which enjoys the oracle property, we prove that another estimator within the family achieves a sharper statistical rate of convergence than the standard semidefinite relaxation of sparse PCA, even when the previous assumption on the magnitude of the projection matrix is violated. We validate the theoretical results by numerical experiments on synthetic datasets. PMID:25684971
Efficient Implementation of an Optimal Interpolator for Large Spatial Data Sets
NASA Technical Reports Server (NTRS)
Memarsadeghi, Nargess; Mount, David M.
2007-01-01
Scattered data interpolation is a problem of interest in numerous areas such as electronic imaging, smooth surface modeling, and computational geometry. Our motivation arises from applications in geology and mining, which often involve large scattered data sets and a demand for high accuracy. The method of choice is ordinary kriging, because it is a best unbiased estimator. Unfortunately, this interpolant is computationally very expensive to compute exactly: for n scattered data points, computing the value of a single interpolant involves solving a dense linear system of size roughly n x n, which is infeasible for large n. In practice, kriging is solved approximately by local approaches based on considering only a relatively small number of points that lie close to the query point. There are many problems with this local approach, however. The first is that determining the proper neighborhood size is tricky; it is usually handled by ad hoc methods such as selecting a fixed number of nearest neighbors or all the points lying within a fixed radius. Such fixed neighborhood sizes may not work well for all query points, depending on the local density of the point distribution. Local methods also suffer from the problem that the resulting interpolant is not continuous. Meyer showed that while kriging produces smooth continuous surfaces, it has zero-order continuity along its borders; thus, at interface boundaries where the neighborhood changes, the interpolant behaves discontinuously. It is therefore important to consider and solve the global system for each interpolant. However, solving such large dense systems for each query point is impractical. Recently a more principled approach to approximating kriging has been proposed, based on a technique called covariance tapering, which addresses the fact that the covariance functions used in kriging have global support. Our implementations combine, utilize, and enhance a number of different approaches that have been introduced in the literature for solving large linear systems in the interpolation of scattered data points. For very large systems, exact methods such as Gaussian elimination are impractical since they require O(n³) time and O(n²) storage. As Billings et al. suggested, we use an iterative approach; in particular, we use the SYMMLQ method for solving the large but sparse ordinary kriging systems that result from tapering. The main technical issue that needs to be overcome in our algorithmic solution is that the points' covariance matrix for kriging should be symmetric positive definite. The goal of tapering is to obtain a sparse approximate representation of the covariance matrix while maintaining its positive definiteness. Furrer et al. used tapering to obtain a sparse linear system of the form Ax = b, where A is the tapered symmetric positive definite covariance matrix, so Cholesky factorization could be used to solve their linear systems; they implemented an efficient sparse Cholesky decomposition method. They also showed that if these tapers are used for a limited class of covariance models, the solution of the tapered system converges to the solution of the original system. Matrix A in the ordinary kriging system, while symmetric, is not positive definite, so their approach is not applicable to the ordinary kriging system. Therefore, we use tapering only to obtain a sparse linear system, and then use SYMMLQ to solve the ordinary kriging system.
We show that solving large kriging systems becomes practical via tapering and iterative methods, and results in lower estimation errors compared to traditional local approaches, and significant memory savings compared to the original global system. We also developed a more efficient variant of the sparse SYMMLQ method for large ordinary kriging systems. This approach adaptively finds the correct local neighborhood for each query point in the interpolation process.
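A hedged sketch of the tapered-system pipeline: build a compactly supported (tapered) covariance, assemble the symmetric indefinite ordinary kriging system, and solve it with a Krylov method for symmetric indefinite systems. SciPy ships MINRES, a close relative of the SYMMLQ method used in the text; the covariance model, taper, and point set are illustrative assumptions:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import minres
from scipy.spatial.distance import cdist

pts = np.random.rand(300, 2)                      # scattered sample locations
h = cdist(pts, pts)                               # pairwise distances
taper = np.clip(1.0 - h / 0.3, 0.0, None) ** 2    # compactly supported taper
C = sp.csr_matrix(np.exp(-h / 0.1) * taper)       # sparse tapered covariance

n = C.shape[0]                                    # ordinary kriging system
K = sp.bmat([[C, np.ones((n, 1))],                # [[C, 1], [1^T, 0]]:
             [np.ones((1, n)), None]], format='csr')  # symmetric, indefinite
c0 = np.exp(-cdist(pts, [[0.5, 0.5]]).ravel() / 0.1)  # covariances to query
rhs = np.append(c0, 1.0)

sol, info = minres(K, rhs)                        # Krylov solve, no inverse
weights = sol[:n]                                 # kriging weights
print(info, weights.sum())                        # info == 0; weights sum ~ 1
```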
NASA Astrophysics Data System (ADS)
Elkurdi, Yousef; Fernández, David; Souleimanov, Evgueni; Giannacopoulos, Dennis; Gross, Warren J.
2008-04-01
The Finite Element Method (FEM) is a computationally intensive scientific and engineering analysis tool with diverse applications ranging from structural engineering to electromagnetic simulation. The trends in floating-point performance are moving in favor of Field-Programmable Gate Arrays (FPGAs), hence interest has grown in the scientific community to exploit this technology. We present an architecture and implementation of an FPGA-based sparse matrix-vector multiplier (SMVM) for use in the iterative solution of large, sparse systems of equations arising from FEM applications. FEM matrices display specific sparsity patterns that can be exploited to improve the efficiency of hardware designs. Our architecture exploits the FEM matrix sparsity structure to achieve a balance between performance and hardware resource requirements by relying on external SDRAM for data storage while utilizing the FPGA's computational resources in a stream-through systolic approach. The architecture is based on a pipelined linear array of processing elements (PEs) coupled with a hardware-oriented matrix striping algorithm and a partitioning scheme which enables it to process arbitrarily large matrices without changing the number of PEs in the architecture. Therefore, this architecture is limited only by the amount of external RAM available to the FPGA. The implemented SMVM-pipeline prototype contains 8 PEs and is clocked at 110 MHz, obtaining a peak performance of 1.76 GFLOPS. For the 8 GB/s of memory bandwidth typical of recent FPGA systems, this architecture can achieve 1.5 GFLOPS sustained performance. Using multiple instances of the pipeline, linear scaling of the peak and sustained performance can be achieved. Our stream-through architecture provides the added advantage of enabling an iterative implementation of the SMVM computation required by iterative solution techniques such as the conjugate gradient method, avoiding initialization time due to data loading and setup inside the FPGA internal memory.
Effects of partitioning and scheduling sparse matrix factorization on communication and load balance
NASA Technical Reports Server (NTRS)
Venugopal, Sesh; Naik, Vijay K.
1991-01-01
A block based, automatic partitioning and scheduling methodology is presented for sparse matrix factorization on distributed memory systems. Using experimental results, this technique is analyzed for communication and load imbalance overhead. To study the performance effects, these overheads were compared with those obtained from a straightforward 'wrap mapped' column assignment scheme. All experimental results were obtained using test sparse matrices from the Harwell-Boeing data set. The results show that there is a communication and load balance tradeoff. The block based method results in lower communication cost whereas the wrap mapped scheme gives better load balance.
Delayed Slater determinant update algorithms for high efficiency quantum Monte Carlo
McDaniel, Tyler; D’Azevedo, Ed F.; Li, Ying Wai; ...
2017-11-07
Within ab initio Quantum Monte Carlo simulations, the leading numerical cost for large systems is the computation of the values of the Slater determinants in the trial wavefunction. Each Monte Carlo step requires finding the determinant of a dense matrix. This is most commonly iteratively evaluated using a rank-1 Sherman-Morrison updating scheme to avoid repeated explicit calculation of the inverse. The overall computational cost is therefore formally cubic in the number of electrons or matrix size. To improve the numerical efficiency of this procedure, we propose a novel multiple rank delayed update scheme. This strategy enables probability evaluation with application of accepted moves to the matrices delayed until after a predetermined number of moves, K. The accepted events are then applied to the matrices en bloc with enhanced arithmetic intensity and computational efficiency via matrix-matrix operations instead of matrix-vector operations. This procedure does not change the underlying Monte Carlo sampling or its statistical efficiency. For calculations on large systems and algorithms such as diffusion Monte Carlo where the acceptance ratio is high, order of magnitude improvements in the update time can be obtained on both multi-core CPUs and GPUs.
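A numpy toy illustrating the en-bloc idea: K accepted row replacements are accumulated and then applied to the inverse in one Woodbury step (matrix-matrix work), with the joint determinant ratio available from a K x K capacitance matrix. This is a sketch of the linear algebra only, not the authors' QMC code; for simplicity it replaces K distinct rows and accepts every move.

```python
# Sketch: delayed rank-K update of a matrix inverse via the Woodbury identity.
import numpy as np

rng = np.random.default_rng(1)
N, K = 64, 8
A0 = rng.standard_normal((N, N))
B0 = np.linalg.inv(A0)

rows = rng.choice(N, size=K, replace=False)   # K distinct rows to replace
new_rows = rng.standard_normal((K, N))        # the accepted replacement rows
W = new_rows - A0[rows]                       # row deltas, K x N

# Joint determinant ratio of all K delayed moves (matrix determinant lemma):
#   det(A0 + E @ W) / det(A0) = det(I_K + W @ B0 @ E),
# where E holds the identity columns picked out by `rows`.
BE = B0[:, rows]                              # B0 @ E, an N x K slab
S = np.eye(K) + W @ BE                        # K x K capacitance matrix
print("joint determinant ratio:", np.linalg.det(S))

# Apply the K moves to the inverse en bloc (Woodbury; BLAS3 matrix-matrix work):
B_new = B0 - BE @ np.linalg.solve(S, W @ B0)

A_new = A0.copy()
A_new[rows] = new_rows
print("max deviation from direct inverse:",
      np.abs(B_new - np.linalg.inv(A_new)).max())
```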
Generating probabilistic Boolean networks from a prescribed transition probability matrix.
Ching, W-K; Chen, X; Tsing, N-K
2009-11-01
Probabilistic Boolean networks (PBNs) have received much attention in modeling genetic regulatory networks. A PBN can be regarded as a Markov chain process and is characterised by a transition probability matrix. In this study, the authors propose efficient algorithms for constructing a PBN when its transition probability matrix is given. The complexities of the algorithms are also analysed. This is an interesting inverse problem in network inference using steady-state data. The problem is important as most microarray data sets are assumed to be obtained from sampling the steady-state.
Quantum Support Vector Machine for Big Data Classification
NASA Astrophysics Data System (ADS)
Rebentrost, Patrick; Mohseni, Masoud; Lloyd, Seth
2014-09-01
Supervised machine learning is the classification of new data based on already classified training examples. In this work, we show that the support vector machine, an optimized binary classifier, can be implemented on a quantum computer, with complexity logarithmic in the size of the vectors and the number of training examples. In cases where classical sampling algorithms require polynomial time, an exponential speedup is obtained. At the core of this quantum big data algorithm is a nonsparse matrix exponentiation technique for efficiently performing a matrix inversion of the training data inner-product (kernel) matrix.
Solving large-scale dynamic systems using band Lanczos method in Rockwell NASTRAN on CRAY X-MP
NASA Technical Reports Server (NTRS)
Gupta, V. K.; Zillmer, S. D.; Allison, R. E.
1986-01-01
Better models, more accurate and faster algorithms, and large-scale computing offer more cost-effective and representative dynamic analyses. The band Lanczos eigen-solution method was implemented in Rockwell's version of the 1984 COSMIC-released NASTRAN finite element structural analysis computer program to effectively solve for structural vibration modes, including those of large complex systems exceeding 10,000 degrees of freedom. The Lanczos vectors were re-orthogonalized locally using the Lanczos method and globally using the modified Gram-Schmidt method for sweeping out rigid-body modes and previously generated modes and Lanczos vectors. The truncated band matrix was solved for vibration frequencies and mode shapes using Givens rotations. Numerical examples are included to demonstrate the cost effectiveness and accuracy of the method as implemented in Rockwell NASTRAN. The CRAY version is based on RPK's COSMIC/NASTRAN. The band Lanczos method was more reliable and accurate and converged faster than the single-vector Lanczos method, and was comparable to the subspace iteration method, which is a block version of the inverse power method. However, the subspace matrix tended to be fully populated in the case of subspace iteration, and not as sparse as a band matrix.
Archer, A.W.; Maples, C.G.
1989-01-01
Numerous departures from ideal relationships are revealed by Monte Carlo simulations of widely accepted binomial coefficients. For example, simulations incorporating varying levels of matrix sparseness (presence of zeros indicating lack of data) and computation of expected values reveal that not only are all common coefficients influenced by zero data, but also that some coefficients do not discriminate between sparse or dense matrices (few zero data). Such coefficients computationally merge mutually shared and mutually absent information and do not exploit all the information incorporated within the standard 2 × 2 contingency table; therefore, the commonly used formulae for such coefficients are more complicated than the actual range of values produced. Other coefficients do differentiate between mutual presences and absences; however, a number of these coefficients do not demonstrate a linear relationship to matrix sparseness. Finally, simulations using nonrandom matrices with known degrees of row-by-row similarities signify that several coefficients either do not display a reasonable range of values or are nonlinear with respect to known relationships within the data. Analyses with nonrandom matrices yield clues as to the utility of certain coefficients for specific applications. For example, coefficients such as Jaccard, Dice, and Baroni-Urbani and Buser are useful if correction of sparseness is desired, whereas the Russell-Rao coefficient is useful when sparseness correction is not desired. © 1989 International Association for Mathematical Geology.
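A small Monte Carlo sketch in the spirit of the study above: random binary rows at several densities, the 2 × 2 contingency counts, and three of the coefficients discussed. The coefficient formulas are the standard definitions; the simulation design itself is an illustrative assumption, not the authors' exact protocol.

```python
# Sketch: how matrix sparseness influences binary similarity coefficients.
import numpy as np

def counts(x, y):
    """2 x 2 contingency: a = mutual presences, d = mutual absences."""
    a = int(np.sum((x == 1) & (y == 1)))
    b = int(np.sum((x == 1) & (y == 0)))
    c = int(np.sum((x == 0) & (y == 1)))
    d = int(np.sum((x == 0) & (y == 0)))
    return a, b, c, d

def jaccard(a, b, c, d):     return a / (a + b + c) if a + b + c else np.nan
def dice(a, b, c, d):        return 2 * a / (2 * a + b + c) if a + b + c else np.nan
def russell_rao(a, b, c, d): return a / (a + b + c + d)   # uses mutual absences too

rng = np.random.default_rng(0)
for fill in (0.05, 0.25, 0.50):            # matrix density; sparseness = 1 - fill
    sims = []
    for _ in range(1000):                  # Monte Carlo over random row pairs
        x, y = (rng.random((2, 200)) < fill).astype(int)
        a, b, c, d = counts(x, y)
        sims.append([jaccard(a, b, c, d), dice(a, b, c, d),
                     russell_rao(a, b, c, d)])
    print(f"fill={fill}: mean Jaccard/Dice/Russell-Rao =",
          np.round(np.nanmean(sims, axis=0), 3))
```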
Sparse representation of whole-brain fMRI signals for identification of functional networks.
Lv, Jinglei; Jiang, Xi; Li, Xiang; Zhu, Dajiang; Chen, Hanbo; Zhang, Tuo; Zhang, Shu; Hu, Xintao; Han, Junwei; Huang, Heng; Zhang, Jing; Guo, Lei; Liu, Tianming
2015-02-01
There have been several recent studies that used sparse representation for fMRI signal analysis and activation detection based on the assumption that each voxel's fMRI signal is linearly composed of sparse components. Previous studies have employed sparse coding to model functional networks in various modalities and scales. These prior contributions inspired the exploration of whether/how sparse representation can be used to identify functional networks in a voxel-wise way and on the whole brain scale. This paper presents a novel, alternative methodology of identifying multiple functional networks via sparse representation of whole-brain task-based fMRI signals. Our basic idea is that all fMRI signals within the whole brain of one subject are aggregated into a big data matrix, which is then factorized into an over-complete dictionary basis matrix and a reference weight matrix via an effective online dictionary learning algorithm. Our extensive experimental results have shown that this novel methodology can uncover multiple functional networks that can be well characterized and interpreted in spatial, temporal and frequency domains based on current brain science knowledge. Importantly, these well-characterized functional network components are quite reproducible in different brains. In general, our methods offer a novel, effective and unified solution to multiple fMRI data analysis tasks including activation detection, de-activation detection, and functional network identification. Copyright © 2014 Elsevier B.V. All rights reserved.
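A hedged sketch of the aggregate-then-factorize step using scikit-learn's online dictionary learner as a stand-in for the paper's method; the signal matrix, component count, and penalty below are toy assumptions. Rows play the role of voxel time series, and the learned components matrix is the over-complete temporal dictionary.

```python
# Sketch: factorize an aggregated signal matrix into dictionary x sparse weights.
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

rng = np.random.default_rng(0)
X = rng.standard_normal((5000, 120))   # toy stand-in: rows = voxel time series

dl = MiniBatchDictionaryLearning(n_components=200,   # over-complete: 200 > 120
                                 alpha=1.0, batch_size=256, random_state=0)
W = dl.fit_transform(X)                # weight matrix, one sparse row per voxel
D = dl.components_                     # dictionary of temporal atoms
print(W.shape, D.shape,
      "mean nonzero weights per voxel:", float((W != 0).sum(axis=1).mean()))
```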
A gradient based algorithm to solve inverse plane bimodular problems of identification
NASA Astrophysics Data System (ADS)
Ran, Chunjiang; Yang, Haitian; Zhang, Guoqing
2018-02-01
This paper presents a gradient-based algorithm to solve inverse plane bimodular problems of identifying constitutive parameters, including tensile/compressive moduli and tensile/compressive Poisson's ratios. For the forward bimodular problem, an FE tangent stiffness matrix is derived, facilitating the implementation of gradient-based algorithms; for the inverse bimodular problem of identification, a two-level sensitivity-analysis-based strategy is proposed. Numerical verification in terms of accuracy and efficiency is provided, and the impacts of the initial guess, the number of measurement points, regional inhomogeneity, and noisy data on the identification are taken into account.
Efficient electromagnetic source imaging with adaptive standardized LORETA/FOCUSS.
Schimpf, Paul H; Liu, Hesheng; Ramon, Ceon; Haueisen, Jens
2005-05-01
Functional brain imaging and source localization based on the scalp's potential field require a solution to an ill-posed inverse problem with many solutions. This makes it necessary to incorporate a priori knowledge in order to select a particular solution. A computational challenge for some subject-specific head models is that many inverse algorithms require a comprehensive sampling of the candidate source space at the desired resolution. In this study, we present an algorithm that can accurately reconstruct details of localized source activity from a sparse sampling of the candidate source space. Forward computations are minimized through an adaptive procedure that increases source resolution as the spatial extent is reduced. With this algorithm, we were able to compute inverses using only 6% to 11% of the full resolution lead-field, with a localization accuracy that was not significantly different than an exhaustive search through a fully-sampled source space. The technique is, therefore, applicable for use with anatomically-realistic, subject-specific forward models for applications with spatially concentrated source activity.
NASA Astrophysics Data System (ADS)
Scheunert, M.; Ullmann, A.; Afanasjew, M.; Börner, R.-U.; Siemon, B.; Spitzer, K.
2016-06-01
We present an inversion concept for helicopter-borne frequency-domain electromagnetic (HEM) data capable of reconstructing 3-D conductivity structures in the subsurface. Standard interpretation procedures often involve laterally constrained stitched 1-D inversion techniques to create pseudo-3-D models that are largely representative for smoothly varying conductivity distributions in the subsurface. Pronounced lateral conductivity changes may, however, produce significant artifacts that can lead to serious misinterpretation. Still, 3-D inversions of entire survey data sets are numerically very expensive. Our approach is therefore based on a cut-&-paste strategy whereupon the full 3-D inversion needs to be applied only to those parts of the survey where the 1-D inversion actually fails. The introduced 3-D Gauss-Newton inversion scheme exploits information given by a state-of-the-art (laterally constrained) 1-D inversion. For a typical HEM measurement, an explicit representation of the Jacobian matrix is inevitable, owing to the unique transmitter-receiver relation. We introduce tensor quantities which facilitate the matrix assembly of the forward operator as well as the efficient calculation of the Jacobian. The finite difference forward operator incorporates the displacement currents because they may seriously affect the electromagnetic response at frequencies above 100. Finally, we deliver the proof of concept for the inversion using a synthetic data set with a noise level of up to 5%.
Hwang, Geelsu; Koltisko, Bernard; Jin, Xiaoming; Koo, Hyun
2017-11-08
Surface-grown bacteria and production of an extracellular polymeric matrix modulate the assembly of highly cohesive and firmly attached biofilms, making them difficult to remove from solid surfaces. Inhibition of cell growth and inactivation of matrix-producing bacteria can impair biofilm formation and facilitate removal. Here, we developed a novel nonleachable antibacterial composite with potent antibiofilm activity by directly incorporating polymerizable imidazolium-containing resin (antibacterial resin with carbonate linkage; ABR-C) into a methacrylate-based scaffold (ABR-modified composite; ABR-MC) using an efficient yet simplified chemistry. Low-dose inclusion of imidazolium moiety (∼2 wt %) resulted in bioactivity with minimal cytotoxicity without compromising mechanical integrity of the restorative material. The antibiofilm properties of ABR-MC were assessed using an exopolysaccharide-matrix-producing (EPS-matrix-producing) oral pathogen (Streptococcus mutans) in an experimental biofilm model. Using high-resolution confocal fluorescence imaging and biophysical methods, we observed remarkable disruption of bacterial accumulation and defective 3D matrix structure on the surface of ABR-MC. Specifically, the antibacterial composite impaired the ability of S. mutans to form organized bacterial clusters on the surface, resulting in altered biofilm architecture with sparse cell accumulation and reduced amounts of EPS matrix (versus control composite). Biofilm topology analyses on the control composite revealed a highly organized and weblike EPS structure that tethers the bacterial clusters to each other and to the surface, forming a highly cohesive unit. In contrast, such a structured matrix was absent on the surface of ABR-MC with mostly sparse and amorphous EPS, indicating disruption in the biofilm physical stability. Consistent with lack of structural organization, the defective biofilm on the surface of ABR-MC was readily detached when subjected to low shear stress, while most of the biofilm biomass remained on the control surface. Altogether, we demonstrate a new nonleachable antibacterial composite with excellent antibiofilm activity without affecting its mechanical properties, which may serve as a platform for development of alternative antifouling biomaterials.
Improving IMRT delivery efficiency with reweighted L1-minimization for inverse planning
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kim, Hojin; Becker, Stephen; Lee, Rena
2013-07-15
Purpose: This study presents an improved technique to further simplify the fluence map in intensity modulated radiation therapy (IMRT) inverse planning, thereby reducing plan complexity and improving delivery efficiency while maintaining the plan quality. Methods: First-order total-variation (TV) minimization (min.) based on the L1 norm has been proposed to reduce the complexity of the fluence map in IMRT by generating sparse fluence-map variations. However, with stronger dose sparing to the critical structures, the inevitable increase in fluence-map complexity can lead to inefficient dose delivery. Theoretically, L0-min. is the ideal solution for the sparse signal recovery problem, yet it is practically intractable due to the nonconvexity of the objective function. As an alternative, the authors use the iteratively reweighted L1-min. technique to incorporate the benefits of the L0 norm into the tractability of L1-min. The weight multiplied to each element is inversely related to the magnitude of the corresponding element, and is iteratively updated by the reweighting process. The proposed penalizing process combined with TV min. further improves sparsity in the fluence-map variations, hence ultimately enhancing the delivery efficiency. To validate the proposed method, this work compares three treatment plans obtained from quadratic min. (generally used in clinical IMRT), conventional TV min., and the proposed reweighted TV min. techniques, implemented with a large-scale L1 solver (templates for first-order conic solvers), for five sets of patient clinical data. Criteria such as conformation number (CN), modulation index (MI), and estimated treatment time are employed to assess the relationship between plan quality and delivery efficiency. Results: The proposed method yields simpler fluence maps than the quadratic and conventional TV based techniques. To attain a given CN and dose sparing to the critical organs for the 5 clinical cases, the proposed method reduces the number of segments by 10-15 and 30-35, relative to TV min. and quadratic min. based plans, while MIs decrease by about 20%-30% and 40%-60% over the plans by the two existing techniques, respectively. Under such conditions, the total treatment time of the plans obtained from the proposed method can be reduced by 12-30 s and 30-80 s, mainly due to greatly shorter multileaf collimator (MLC) traveling time in IMRT step-and-shoot delivery. Conclusions: The reweighted L1-minimization technique provides a promising solution to simplify the fluence-map variations in IMRT inverse planning. It improves the delivery efficiency by reducing the number of segments and the treatment time, while maintaining the plan quality in terms of target conformity and critical structure sparing.
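A toy numpy sketch of the reweighting idea above, applied to plain sparse recovery rather than fluence-map TV, and solved with simple iterative soft-thresholding (ISTA) instead of the large-scale conic solver; the problem sizes and the reweighting constant are illustrative assumptions.

```python
# Sketch: iteratively reweighted L1 minimization via weighted soft-thresholding.
import numpy as np

rng = np.random.default_rng(0)
m, n, k = 80, 200, 8
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = 3 * rng.standard_normal(k)
b = A @ x_true

lam = 0.02
L = np.linalg.norm(A, 2) ** 2            # Lipschitz constant of the gradient
w = np.ones(n)                           # all-ones weights = plain L1
for outer in range(4):
    x = np.zeros(n)
    for _ in range(500):                 # ISTA on min 0.5||Ax-b||^2 + lam ||w*x||_1
        g = x - A.T @ (A @ x - b) / L
        x = np.sign(g) * np.maximum(np.abs(g) - lam * w / L, 0.0)
    w = 1.0 / (np.abs(x) + 1e-3)         # reweight: small entries penalized harder
    print(f"pass {outer}: nonzeros={(np.abs(x) > 1e-4).sum()}, "
          f"error={np.linalg.norm(x - x_true):.2e}")
```

Each reweighting pass pushes small coefficients toward zero harder, mimicking the L0 penalty while every subproblem remains a tractable weighted L1 problem.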
Salient Object Detection via Structured Matrix Decomposition.
Peng, Houwen; Li, Bing; Ling, Haibin; Hu, Weiming; Xiong, Weihua; Maybank, Stephen J
2016-05-04
Low-rank recovery models have shown potential for salient object detection, where a matrix is decomposed into a low-rank matrix representing image background and a sparse matrix identifying salient objects. Two deficiencies, however, still exist. First, previous work typically assumes the elements in the sparse matrix are mutually independent, ignoring the spatial and pattern relations of image regions. Second, when the low-rank and sparse matrices are relatively coherent, e.g., when there are similarities between the salient objects and background or when the background is complicated, it is difficult for previous models to disentangle them. To address these problems, we propose a novel structured matrix decomposition model with two structural regularizations: (1) a tree-structured sparsity-inducing regularization that captures the image structure and enforces patches from the same object to have similar saliency values, and (2) a Laplacian regularization that enlarges the gaps between salient objects and the background in feature space. Furthermore, high-level priors are integrated to guide the matrix decomposition and boost the detection. We evaluate our model for salient object detection on five challenging datasets including single object, multiple objects and complex scene images, and show competitive results as compared with 24 state-of-the-art methods in terms of seven performance metrics.
NASA Astrophysics Data System (ADS)
Bui-Thanh, T.; Girolami, M.
2014-11-01
We consider the Riemann manifold Hamiltonian Monte Carlo (RMHMC) method for solving statistical inverse problems governed by partial differential equations (PDEs). The Bayesian framework is employed to cast the inverse problem into the task of statistical inference whose solution is the posterior distribution in infinite dimensional parameter space conditional upon observation data and Gaussian prior measure. We discretize both the likelihood and the prior using the H1-conforming finite element method together with a matrix transfer technique. The power of the RMHMC method is that it exploits the geometric structure induced by the PDE constraints of the underlying inverse problem. Consequently, each RMHMC posterior sample is almost uncorrelated/independent from the others providing statistically efficient Markov chain simulation. However this statistical efficiency comes at a computational cost. This motivates us to consider computationally more efficient strategies for RMHMC. At the heart of our construction is the fact that for Gaussian error structures the Fisher information matrix coincides with the Gauss-Newton Hessian. We exploit this fact in considering a computationally simplified RMHMC method combining state-of-the-art adjoint techniques and the superiority of the RMHMC method. Specifically, we first form the Gauss-Newton Hessian at the maximum a posteriori point and then use it as a fixed constant metric tensor throughout RMHMC simulation. This eliminates the need for the computationally costly differential geometric Christoffel symbols, which in turn greatly reduces computational effort at a corresponding loss of sampling efficiency. We further reduce the cost of forming the Fisher information matrix by using a low rank approximation via a randomized singular value decomposition technique. This is efficient since a small number of Hessian-vector products are required. The Hessian-vector product in turn requires only two extra PDE solves using the adjoint technique. Various numerical results up to 1025 parameters are presented to demonstrate the ability of the RMHMC method in exploring the geometric structure of the problem to propose (almost) uncorrelated/independent samples that are far away from each other, and yet the acceptance rate is almost unity. The results also suggest that for the PDE models considered the proposed fixed metric RMHMC can attain almost as high a quality performance as the original RMHMC, i.e. generating (almost) uncorrelated/independent samples, while being two orders of magnitude less computationally expensive.
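A short sketch of the randomized low-rank step described above: the operator is touched only through matrix-vector products (each product standing in for the pair of extra adjoint PDE solves), and a small dense eigenproblem delivers the approximate spectral decomposition. The stand-in Hessian and the rank are assumptions for illustration.

```python
# Sketch: randomized low-rank eigendecomposition of a matrix-free PSD Hessian.
import numpy as np

rng = np.random.default_rng(0)
n, r = 500, 20
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
H = (Q * 2.0 ** -np.arange(n)) @ Q.T     # stand-in PSD Hessian, decaying spectrum

def hessvec(v):
    # In the paper's setting this would cost two extra adjoint PDE solves.
    return H @ v

Omega = rng.standard_normal((n, r + 10))             # oversampled probe vectors
Y = np.column_stack([hessvec(w) for w in Omega.T])
Qy, _ = np.linalg.qr(Y)                              # approximate range basis
T = Qy.T @ np.column_stack([hessvec(q) for q in Qy.T])
lam, U = np.linalg.eigh(T)                           # small dense eigenproblem
V = Qy @ U
H_r = (V * lam) @ V.T                                # low-rank approximation of H
print("relative error:", np.linalg.norm(H - H_r) / np.linalg.norm(H))
```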
An efficient numerical technique for calculating thermal spreading resistance
NASA Technical Reports Server (NTRS)
Gale, E. H., Jr.
1977-01-01
An efficient numerical technique for solving the equations resulting from finite difference analyses of fields governed by Poisson's equation is presented. The method is direct (noniterative) and the computer work required varies with the square of the order of the coefficient matrix. The computational work required varies with the cube of this order for standard inversion techniques, e.g., Gaussian elimination, Jordan, Doolittle, etc.
A fast multigrid-based electromagnetic eigensolver for curved metal boundaries on the Yee mesh
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bauer, Carl A., E-mail: carl.bauer@colorado.edu; Werner, Gregory R.; Cary, John R.
For embedded boundary electromagnetics using the Dey–Mittra (Dey and Mittra, 1997) [1] algorithm, a special grad–div matrix constructed in this work allows use of multigrid methods for efficient inversion of Maxwell's curl–curl matrix. Efficient curl–curl inversions are demonstrated within a shift-and-invert Krylov-subspace eigensolver (open-sourced at https://github.com/bauerca/maxwell) on the spherical cavity and the 9-cell TESLA superconducting accelerator cavity. The accuracy of the Dey–Mittra algorithm is also examined: frequencies converge with second-order error, and surface fields are found to converge with nearly second-order error. In agreement with previous work (Nieter et al., 2009) [2], neglecting some boundary-cut cell faces (as is required in the time domain for numerical stability) reduces frequency convergence to first-order and surface-field convergence to zeroth-order (i.e. surface fields do not converge). Additionally and importantly, neglecting faces can reduce accuracy by an order of magnitude at low resolutions.
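The shift-and-invert strategy named above is easy to exercise with SciPy's sparse eigensolver, which factorizes (A - sigma*I) internally when a shift is supplied; the 1-D Laplacian below is a generic symmetric stand-in for the curl-curl matrix, not the paper's operator.

```python
# Sketch: shift-and-invert Krylov eigensolving for interior/smallest eigenvalues.
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import eigsh

n = 2000
A = diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csc")

# sigma=0 asks for eigenvalues nearest the shift; ARPACK iterates with the
# factorized (A - sigma*I)^{-1}, so the smallest modes converge quickly.
vals, vecs = eigsh(A, k=5, sigma=0.0, which="LM")
print(vals)   # the five smallest eigenvalues of the Laplacian stand-in
```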
A Digital Compressed Sensing-Based Energy-Efficient Single-Spot Bluetooth ECG Node.
Luo, Kan; Cai, Zhipeng; Du, Keqin; Zou, Fumin; Zhang, Xiangyu; Li, Jianqing
2018-01-01
Energy efficiency remains the obstacle to long-term, real-time wireless ECG monitoring. In this paper, a digital compressed sensing- (CS-) based single-spot Bluetooth ECG node is proposed to deal with this challenge in wireless ECG application. A periodic sleep/wake-up scheme and a CS-based compression algorithm are implemented in a node, which consists of an ultra-low-power analog front-end, a microcontroller, a Bluetooth 4.0 communication module, and so forth. The efficiency improvement and the node's specifics are evidenced by experiments using ECG signals sampled by the proposed node during the daily activities of lying, sitting, standing, walking, and running. Using a sparse binary matrix (SBM), the block sparse Bayesian learning (BSBL) method, and a discrete cosine transform (DCT) basis, all ECG signals were recovered essentially undistorted, with percentage root-mean-square differences (PRDs) of less than 6%. The proposed sleep/wake-up scheme and data compression reduce the airtime over energy-hungry wireless links; the energy consumption of the proposed node is 6.53 mJ, and the energy consumption of the radio decreases by 77.37%. Moreover, the energy consumption increase caused by CS code execution is negligible, at 1.3% of the total energy consumption.
Vecharynski, Eugene; Yang, Chao; Pask, John E.
2015-02-25
Here, we present an iterative algorithm for computing an invariant subspace associated with the algebraically smallest eigenvalues of a large sparse or structured Hermitian matrix A. We are interested in the case in which the dimension of the invariant subspace is large (e.g., over several hundreds or thousands) even though it may still be small relative to the dimension of A. These problems arise from, for example, density functional theory (DFT) based electronic structure calculations for complex materials. The key feature of our algorithm is that it performs fewer Rayleigh–Ritz calculations compared to existing algorithms such as the locally optimal block preconditioned conjugate gradient or the Davidson algorithm. It is a block algorithm, and hence can take advantage of efficient BLAS3 operations and be implemented with multiple levels of concurrency. We discuss a number of practical issues that must be addressed in order to implement the algorithm efficiently on a high performance computer.
Ray, J.; Lee, J.; Yadav, V.; ...
2014-08-20
We present a sparse reconstruction scheme that can also be used to ensure non-negativity when fitting wavelet-based random field models to limited observations in non-rectangular geometries. The method is relevant when multiresolution fields are estimated using linear inverse problems. Examples include the estimation of emission fields for many anthropogenic pollutants using atmospheric inversion or hydraulic conductivity in aquifers from flow measurements. The scheme is based on three new developments. Firstly, we extend an existing sparse reconstruction method, Stagewise Orthogonal Matching Pursuit (StOMP), to incorporate prior information on the target field. Secondly, we develop an iterative method that uses StOMP to impose non-negativity on the estimated field. Finally, we devise a method, based on compressive sensing, to limit the estimated field within an irregularly shaped domain. We demonstrate the method on the estimation of fossil-fuel CO2 (ffCO2) emissions in the lower 48 states of the US. The application uses a recently developed multiresolution random field model and synthetic observations of ffCO2 concentrations from a limited set of measurement sites. We find that our method for limiting the estimated field within an irregularly shaped region is about a factor of 10 faster than conventional approaches. It also reduces the overall computational cost by a factor of two. Further, the sparse reconstruction scheme imposes non-negativity without introducing strong nonlinearities, such as those introduced by employing log-transformed fields, and thus reaps the benefits of simplicity and computational speed that are characteristic of linear inverse problems.
Frequency-domain elastic full waveform inversion using encoded simultaneous sources
NASA Astrophysics Data System (ADS)
Jeong, W.; Son, W.; Pyun, S.; Min, D.
2011-12-01
Numerous studies have endeavored to develop robust full waveform inversion and migration algorithms. These processes entail enormous computational costs because of the number of sources in a survey. To avoid this problem, the phase encoding technique for prestack migration was proposed by Romero (2000), and Krebs et al. (2009) proposed the encoded simultaneous-source inversion technique in the time domain. On the other hand, Ben-Hadj-Ali et al. (2011) demonstrated the robustness of frequency-domain full waveform inversion with simultaneous sources for noisy data, changing the source assembling. Although several studies on simultaneous-source inversion estimated P-wave velocity based on the acoustic wave equation, seismic migration and waveform inversion based on the elastic wave equations are required to obtain more reliable subsurface information. In this study, we propose a 2-D frequency-domain elastic full waveform inversion technique using phase encoding methods. In our algorithm, the random phase encoding method is employed to calculate the gradients of the elastic parameters, the source signature estimate, and the diagonal entries of the approximate Hessian matrix. As with the gradients, crosstalk in the estimated source signature and in the diagonal entries of the approximate Hessian matrix is suppressed with iteration. Our 2-D frequency-domain elastic waveform inversion algorithm is built using the back-propagation technique and the conjugate-gradient method. The source signature is estimated using the full Newton method. We compare the simultaneous-source inversion with conventional waveform inversion for synthetic data sets of the Marmousi-2 model. The inverted results obtained with simultaneous sources are comparable to those obtained with individual sources, and the source signature is successfully estimated with the simultaneous-source technique. Comparing the inverted results using the pseudo-Hessian matrix with previous inversion results provided by the approximate Hessian matrix, it is noted that the latter are better than the former for deeper parts of the model. This work was financially supported by the Brain Korea 21 project of Energy System Engineering, by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (2010-0006155), and by the Energy Efficiency & Resources program of the Korea Institute of Energy Technology Evaluation and Planning (KETEP) grant funded by the Korea government Ministry of Knowledge Economy (No. 2010T100200133).
Using a multifrontal sparse solver in a high performance, finite element code
NASA Technical Reports Server (NTRS)
King, Scott D.; Lucas, Robert; Raefsky, Arthur
1990-01-01
We consider the performance of the finite element method on a vector supercomputer. The computationally intensive parts of the finite element method are typically the individual element forms and the solution of the global stiffness matrix, both of which are vectorized in high performance codes. To further increase throughput, new algorithms are needed. We compare a multifrontal sparse solver to a traditional skyline solver in a finite element code on a vector supercomputer. The multifrontal solver uses the Multiple-Minimum Degree reordering heuristic to reduce the number of operations required to factor a sparse matrix and full matrix computational kernels (e.g., BLAS3) to enhance vector performance. The net result is an order-of-magnitude reduction in run time for a finite element application on one processor of a Cray X-MP.
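The effect of a fill-reducing reordering is easy to see in SciPy's SuperLU interface, which exposes column orderings; these serve here as a stand-in for the Multiple-Minimum Degree heuristic named above, and the matrix and density are toy assumptions.

```python
# Sketch: factor fill under different fill-reducing orderings in sparse LU.
import numpy as np
from scipy.sparse import random as sprand, eye
from scipy.sparse.linalg import splu

A = sprand(1500, 1500, density=0.002, random_state=0)
A = (A + A.T + 10 * eye(1500)).tocsc()    # symmetric pattern, well conditioned

for order in ("NATURAL", "MMD_AT_PLUS_A", "COLAMD"):
    lu = splu(A, permc_spec=order)        # same matrix, different ordering
    print(f"{order:14s} nonzeros in L+U: {lu.L.nnz + lu.U.nnz}")
```

Fewer nonzeros in the factors translates directly into fewer operations and less storage, which is the lever the multifrontal solver pulls.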
Characterizing the inverses of block tridiagonal, block Toeplitz matrices
DOE Office of Scientific and Technical Information (OSTI.GOV)
Boffi, Nicholas M.; Hill, Judith C.; Reuter, Matthew G.
2014-12-04
We consider the inversion of block tridiagonal, block Toeplitz matrices and comment on the behaviour of these inverses as one moves away from the diagonal. Using matrix Möbius transformations, we first present an O(1) representation (with respect to the number of block rows and block columns) for the inverse matrix and subsequently use this representation to characterize the inverse matrix. There are four symmetry-distinct cases where the blocks of the inverse matrix (i) decay to zero on both sides of the diagonal, (ii) oscillate on both sides, (iii) decay on one side and oscillate on the other and (iv) decay on one side and grow on the other. This characterization exposes the necessary conditions for the inverse matrix to be numerically banded and may also aid in the design of preconditioners and fast algorithms. Finally, we present numerical examples of these matrix types.
Large-scale inverse model analyses employing fast randomized data reduction
NASA Astrophysics Data System (ADS)
Lin, Youzuo; Le, Ellen B.; O'Malley, Daniel; Vesselinov, Velimir V.; Bui-Thanh, Tan
2017-08-01
When the number of observations is large, it is computationally challenging to apply classical inverse modeling techniques. We have developed a new computationally efficient technique for solving inverse problems with a large number of observations (e.g., on the order of 10^7 or greater). Our method, which we call the randomized geostatistical approach (RGA), is built upon the principal component geostatistical approach (PCGA). We employ a data reduction technique combined with the PCGA to improve the computational efficiency and reduce the memory usage. Specifically, we employ a randomized numerical linear algebra technique based on a so-called "sketching" matrix to effectively reduce the dimension of the observations without losing the information content needed for the inverse analysis. In this way, the computational and memory costs for RGA scale with the information content rather than the size of the calibration data. Our algorithm is coded in Julia and implemented in the MADS open-source high-performance computational framework (http://mads.lanl.gov). We apply our new inverse modeling method to invert for a synthetic transmissivity field. Compared to a standard geostatistical approach (GA), our method is more efficient when the number of observations is large. Most importantly, our method is capable of solving larger inverse problems than the standard GA and PCGA approaches. Therefore, our new model inversion method is a powerful tool for solving large-scale inverse problems. The method can be applied in any field and is not limited to hydrogeological applications such as the characterization of aquifer heterogeneity.
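A toy demonstration of the sketching step above on plain linear least squares (RGA embeds it in the PCGA machinery); the Gaussian sketch, problem sizes, and noise level here are assumptions for illustration.

```python
# Sketch: compress many observation equations with a random sketching matrix.
import numpy as np

rng = np.random.default_rng(0)
m, n, s = 20_000, 100, 500               # many observations, few parameters
J = rng.standard_normal((m, n))          # stand-in sensitivity matrix
x_true = rng.standard_normal(n)
d = J @ x_true + 0.01 * rng.standard_normal(m)

S = rng.standard_normal((s, m)) / np.sqrt(s)   # Gaussian sketching matrix
x_sketch = np.linalg.lstsq(S @ J, S @ d, rcond=None)[0]  # solve the s-row system
x_full = np.linalg.lstsq(J, d, rcond=None)[0]            # solve the m-row system
print("relative difference, sketched vs full solve:",
      np.linalg.norm(x_sketch - x_full) / np.linalg.norm(x_full))
```

All downstream linear algebra now works on s rows instead of m, so cost tracks the information content of the data rather than its raw size.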
NASA Astrophysics Data System (ADS)
Qin, Xulei; Cong, Zhibin; Halig, Luma V.; Fei, Baowei
2013-03-01
An automatic framework is proposed to segment the right ventricle on ultrasound images. This method can automatically segment both epicardial and endocardial boundaries from a continuous echocardiography series by combining sparse matrix transform (SMT), a training model, and a localized region based level set. First, the sparse matrix transform extracts main motion regions of the myocardium as eigenimages by analyzing the statistical information of these images. Second, a training model of the right ventricle is registered to the extracted eigenimages in order to automatically detect the main location of the right ventricle and the corresponding transform relationship between the training model and the SMT-extracted results in the series. Third, the training model is then adjusted as an adapted initialization for the segmentation of each image in the series. Finally, based on the adapted initializations, a localized region based level set algorithm is applied to segment both epicardial and endocardial boundaries of the right ventricle from the whole series. Experimental results from real subject data validated the performance of the proposed framework in segmenting the right ventricle from echocardiography. The mean Dice scores for the epicardial and endocardial boundaries are 89.1% ± 2.3% and 83.6% ± 7.3%, respectively. The automatic segmentation method based on sparse matrix transform and level set can provide a useful tool for quantitative cardiac imaging.
A direct method for unfolding the resolution function from measurements of neutron induced reactions
NASA Astrophysics Data System (ADS)
Žugec, P.; Colonna, N.; Sabate-Gilarte, M.; Vlachoudis, V.; Massimi, C.; Lerendegui-Marco, J.; Stamatopoulos, A.; Bacak, M.; Warren, S. G.; n TOF Collaboration
2017-12-01
The paper explores the numerical stability and the computational efficiency of a direct method for unfolding the resolution function from measurements of neutron induced reactions. A detailed resolution function formalism is laid out, followed by an overview of challenges present in a practical implementation of the method. A special matrix storage scheme is developed in order to facilitate both the memory management of the resolution function matrix, and to increase the computational efficiency of the matrix multiplication and decomposition procedures. Due to its admirable computational properties, a Cholesky decomposition is at the heart of the unfolding procedure. With the smallest but necessary modification of the matrix to be decomposed, the method is successfully applied to a system of order 10^5 × 10^5. However, the amplification of uncertainties during the direct inversion procedures limits the applicability of the method to high-precision measurements of neutron induced reactions.
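A common reading of the "smallest but necessary modification" is a tiny diagonal shift applied when the matrix to be decomposed is numerically semidefinite; the following generic Cholesky-with-jitter sketch illustrates that idea under this assumption (toy matrix sizes, not the n_TOF implementation).

```python
# Sketch: Cholesky solve with the smallest diagonal shift that makes it work.
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def shifted_cholesky_solve(R, y, tries=20):
    jitter, scale = 0.0, np.trace(R) / R.shape[0]
    for _ in range(tries):
        try:
            c = cho_factor(R + jitter * np.eye(R.shape[0]), lower=True)
            return cho_solve(c, y), jitter
        except np.linalg.LinAlgError:          # not positive definite: shift more
            jitter = 10 * jitter if jitter else 1e-14 * scale
    raise np.linalg.LinAlgError("could not stabilize the matrix")

rng = np.random.default_rng(0)
B = rng.standard_normal((400, 200))
R = B @ B.T                       # symmetric PSD but rank deficient
y = rng.standard_normal(400)
x, used = shifted_cholesky_solve(R, y)
print("jitter needed:", used)
print("residual of the shifted system:",
      np.linalg.norm((R + used * np.eye(400)) @ x - y))
```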
Underdetermined blind separation of three-way fluorescence spectra of PAHs in water.
Yang, Ruifang; Zhao, Nanjing; Xiao, Xue; Zhu, Wei; Chen, Yunan; Yin, Gaofang; Liu, Jianguo; Liu, Wenqing
2018-06-15
In this work, an underdetermined blind decomposition method is developed to recognize individual components from the three-way fluorescence spectra of their mixtures by using sparse component analysis (SCA). The mixing matrix is estimated from the mixtures using a fuzzy data clustering algorithm together with the scatters corresponding to local energy maximum values in the time-frequency domain, and the spectra of the object components are recovered by a pseudo-inverse technique. As an example, using this method the spectra of three and four pure components can be blindly extracted from two samples of their mixtures, with similarities between resolved and reference spectra all above 0.80. This work opens a new and effective path to monitoring PAHs in water by the three-way fluorescence spectroscopy technique. Copyright © 2018 Elsevier B.V. All rights reserved.
NASA Technical Reports Server (NTRS)
Oliker, Leonid; Heber, Gerd; Biswas, Rupak
2000-01-01
The Conjugate Gradient (CG) algorithm is perhaps the best-known iterative technique to solve sparse linear systems that are symmetric and positive definite. A sparse matrix-vector multiply (SPMV) usually accounts for most of the floating-point operations within a CG iteration. In this paper, we investigate the effects of various ordering and partitioning strategies on the performance of parallel CG and SPMV using different programming paradigms and architectures. Results show that for this class of applications, ordering significantly improves overall performance, that cache reuse may be more important than reducing communication, and that it is possible to achieve message passing performance using shared memory constructs through careful data ordering and distribution. However, a multi-threaded implementation of CG on the Tera MTA does not require special ordering or partitioning to obtain high efficiency and scalability.
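For reference, the per-iteration structure the experiments above revolve around: a textbook CG sketch in which the single SPMV line is the cost that ordering and partitioning strategies accelerate (the diagonally dominant toy matrix is an assumption for quick convergence).

```python
# Sketch: conjugate gradient with the SPMV as the dominant per-iteration cost.
import numpy as np
from scipy.sparse import diags

def cg(A, b, tol=1e-10, maxit=500):
    x = np.zeros_like(b)
    r = b - A @ x
    p = r.copy()
    rs = r @ r
    for it in range(maxit):
        Ap = A @ p               # the SPMV: most floating-point work happens here
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            return x, it + 1
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x, maxit

n = 10_000
A = diags([-1.0, 4.0, -1.0], [-1, 0, 1], shape=(n, n), format="csr")
b = np.ones(n)
x, iters = cg(A, b)
print("iterations:", iters, "residual:", np.linalg.norm(b - A @ x))
```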
HYPOTHESIS TESTING FOR HIGH-DIMENSIONAL SPARSE BINARY REGRESSION
Mukherjee, Rajarshi; Pillai, Natesh S.; Lin, Xihong
2015-01-01
In this paper, we study the detection boundary for minimax hypothesis testing in the context of high-dimensional, sparse binary regression models. Motivated by genetic sequencing association studies for rare variant effects, we investigate the complexity of the hypothesis testing problem when the design matrix is sparse. We observe a new phenomenon in the behavior of detection boundary which does not occur in the case of Gaussian linear regression. We derive the detection boundary as a function of two components: a design matrix sparsity index and signal strength, each of which is a function of the sparsity of the alternative. For any alternative, if the design matrix sparsity index is too high, any test is asymptotically powerless irrespective of the magnitude of signal strength. For binary design matrices with the sparsity index that is not too high, our results are parallel to those in the Gaussian case. In this context, we derive detection boundaries for both dense and sparse regimes. For the dense regime, we show that the generalized likelihood ratio is rate optimal; for the sparse regime, we propose an extended Higher Criticism Test and show it is rate optimal and sharp. We illustrate the finite sample properties of the theoretical results using simulation studies.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hutchinson, S.A.; Shadid, J.N.; Tuminaro, R.S.
1995-10-01
Aztec is an iterative library that greatly simplifies the parallelization process when solving the linear systems of equations Ax = b where A is a user supplied n x n sparse matrix, b is a user supplied vector of length n and x is a vector of length n to be computed. Aztec is intended as a software tool for users who want to avoid cumbersome parallel programming details but who have large sparse linear systems which require an efficiently utilized parallel processing system. A collection of data transformation tools are provided that allow for easy creation of distributed sparse unstructured matrices for parallel solution. Once the distributed matrix is created, computation can be performed on any of the parallel machines running Aztec: nCUBE 2, IBM SP2 and Intel Paragon, MPI platforms as well as standard serial and vector platforms. Aztec includes a number of Krylov iterative methods such as conjugate gradient (CG), generalized minimum residual (GMRES) and stabilized biconjugate gradient (BICGSTAB) to solve systems of equations. These Krylov methods are used in conjunction with various preconditioners such as polynomial or domain decomposition methods using LU or incomplete LU factorizations within subdomains. Although the matrix A can be general, the package has been designed for matrices arising from the approximation of partial differential equations (PDEs). In particular, the Aztec package is oriented toward systems arising from PDE applications.
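The same Krylov-plus-preconditioner workflow can be sketched in serial with SciPy stand-ins: GMRES with an incomplete-LU preconditioner. The convection-diffusion-like matrix and the ILU settings are illustrative assumptions, not Aztec's API.

```python
# Sketch: ILU-preconditioned GMRES, the workflow Aztec parallelizes.
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import gmres, spilu, LinearOperator

n = 5000
# Nonsymmetric, diagonally dominant stand-in matrix
A = diags([-1.2, 3.0, -0.8], [-1, 0, 1], shape=(n, n), format="csc")
b = np.ones(n)

ilu = spilu(A, drop_tol=1e-4, fill_factor=10)     # incomplete LU factorization
M = LinearOperator((n, n), matvec=ilu.solve)      # preconditioner M ~ A^{-1}

x, info = gmres(A, b, M=M, restart=30)
print("converged" if info == 0 else f"flag={info}",
      "residual:", np.linalg.norm(b - A @ x))
```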
NASA Astrophysics Data System (ADS)
Prodhan, Suryoday; Ramasesha, S.
2018-05-01
The symmetry adapted density matrix renormalization group (SDMRG) technique has been an efficient method for studying low-lying eigenstates in one- and quasi-one-dimensional electronic systems. However, the SDMRG method had bottlenecks involving the construction of linearly independent symmetry adapted basis states as the symmetry matrices in the DMRG basis were not sparse. We have developed a modified algorithm to overcome this bottleneck. The new method incorporates end-to-end interchange symmetry (C2), electron-hole symmetry (J), and parity or spin-flip symmetry (P) in these calculations. The one-to-one correspondence between direct-product basis states in the DMRG Hilbert space for these symmetry operations renders the symmetry matrices in the new basis with maximum sparseness, just one nonzero matrix element per row. Using methods similar to those employed in the exact diagonalization technique for Pariser-Parr-Pople (PPP) models, developed in the 1980s, it is possible to construct orthogonal SDMRG basis states while bypassing the slow step of the Gram-Schmidt orthonormalization procedure. The method together with the PPP model which incorporates long-range electronic correlations is employed to study the correlated excited-state spectra of 1,12-benzoperylene and a narrow mixed graphene nanoribbon with a chrysene molecule as the building unit, comprising both zigzag and cove-edge structures.
NASA Astrophysics Data System (ADS)
Grayver, Alexander V.
2015-07-01
This paper presents a distributed magnetotelluric inversion scheme based on the adaptive finite-element method (FEM). The key novel aspect of the introduced algorithm is the use of automatic mesh refinement techniques for both forward and inverse modelling. These techniques alleviate the tedious and subjective procedure of choosing a suitable model parametrization. To avoid overparametrization, meshes for the forward and inverse problems were decoupled. For calculation of accurate electromagnetic (EM) responses, an automatic mesh refinement algorithm based on a goal-oriented error estimator has been adopted. For further efficiency gain, EM fields for each frequency were calculated using independent meshes in order to account for the substantially different spatial behaviour of the fields over a wide range of frequencies. An automatic approach for efficient initial mesh design in inverse problems based on the linearized model resolution matrix was developed. To make this algorithm suitable for large-scale problems, it was proposed to use a low-rank approximation of the linearized model resolution matrix. In order to fill the gap between initial and true model complexities and to better resolve emerging 3-D structures, an algorithm for adaptive inverse mesh refinement was derived. Within this algorithm, spatial variations of the imaged parameter are calculated and the mesh is refined in the neighborhoods of points with the largest variations. A series of numerical tests were performed to demonstrate the utility of the presented algorithms. Adaptive mesh refinement based on the model resolution estimates provides an efficient tool to derive initial meshes which account for arbitrary survey layouts, data types, frequency content and measurement uncertainties. Furthermore, the algorithm is capable of delivering meshes suitable to resolve features on multiple scales while keeping the number of unknowns low. However, such meshes exhibit dependency on an initial model guess. Additionally, it is demonstrated that the adaptive mesh refinement can be particularly efficient in resolving complex shapes. The implemented inversion scheme was able to resolve a hemisphere object with sufficient resolution starting from a coarse discretization and refining the mesh adaptively in a fully automatic process. The code is able to harness the computational power of modern distributed platforms and is shown to work with models consisting of millions of degrees of freedom. Significant computational savings were achieved by using locally refined decoupled meshes.
Determining biosonar images using sparse representations.
Fontaine, Bertrand; Peremans, Herbert
2009-05-01
Echolocating bats are thought to be able to create an image of their environment by emitting pulses and analyzing the reflected echoes. In this paper, the theory of sparse representations and its more recent further development into compressed sensing are applied to this biosonar image formation task. Considering the target image representation as sparse allows formulation of this inverse problem as a convex optimization problem for which well defined and efficient solution methods have been established. The resulting technique, referred to as L1-minimization, is applied to simulated data to analyze its performance relative to delay accuracy and delay resolution experiments. This method performs comparably to the coherent receiver for the delay accuracy experiments, is quite robust to noise, and can reconstruct complex target impulse responses as generated by many closely spaced reflectors with different reflection strengths. This same technique, in addition to reconstructing biosonar target images, can be used to simultaneously localize these complex targets by interpreting location cues induced by the bat's head related transfer function. Finally, a tentative explanation is proposed for specific bat behavioral experiments in terms of the properties of target images as reconstructed by the L1-minimization method.
Approximation of reliability of direct genomic breeding values
USDA-ARS?s Scientific Manuscript database
Two methods to efficiently approximate theoretical genomic reliabilities are presented. The first method is based on the direct inverse of the left hand side (LHS) of mixed model equations. It uses the genomic relationship matrix for a small subset of individuals with the highest genomic relationshi...
Chang, Zheng; Xiang, Qing-San; Shen, Hao; Yin, Fang-Fang
2010-03-01
To accelerate non-contrast-enhanced MR angiography (MRA) with inflow inversion recovery (IFIR) using a fast imaging method, Skipped Phase Encoding and Edge Deghosting (SPEED). IFIR imaging uses a preparatory inversion pulse to reduce signals from static tissue while leaving inflowing arterial blood unaffected, resulting in sparse arterial vasculature on a modest tissue background. By taking advantage of vascular sparsity, SPEED can be simplified with a single-layer model to achieve higher efficiency in both scan time reduction and image reconstruction. SPEED can also make use of information available in multiple coils for further acceleration. The techniques are demonstrated with a three-dimensional renal non-contrast-enhanced IFIR MRA study. Images are reconstructed by SPEED based on a single-layer model to achieve an undersampling factor of up to 2.5 using one skipped phase encoding direction. By making use of information available in multiple coils, SPEED can achieve an undersampling factor of up to 8.3 with four receiver coils. The reconstructed images generally have comparable quality to that of the reference images reconstructed from full k-space data. As demonstrated with a three-dimensional renal IFIR scan, SPEED based on a single-layer model is able to further reduce scan time and achieve higher computational efficiency than the original SPEED.
Fast super-resolution estimation of DOA and DOD in bistatic MIMO Radar with off-grid targets
NASA Astrophysics Data System (ADS)
Zhang, Dong; Zhang, Yongshun; Zheng, Guimei; Feng, Cunqian; Tang, Jun
2018-05-01
In this paper, we focus on the problem of joint DOA and DOD estimation in bistatic MIMO radar using sparse reconstruction methods. Traditionally, the 2D parameter estimation problem is converted into a 1D problem via the Kronecker product, which enlarges the scale of the estimation problem and increases the computational burden; it also requires that the targets fall on the predefined grid. In this paper, a 2D off-grid model is built which solves the grid-mismatch problem of 2D parameter estimation. Then, in order to solve the joint 2D sparse reconstruction problem directly and efficiently, three fast joint sparse matrix reconstruction methods are proposed: the Joint-2D-OMP, Joint-2D-SL0 and Joint-2D-SOONE algorithms. Simulation results demonstrate that our methods can not only improve the 2D parameter estimation accuracy but also reduce the computational complexity compared with the traditional Kronecker compressed sensing method.
Hine, N D M; Haynes, P D; Mostofi, A A; Payne, M C
2010-09-21
We present calculations of formation energies of defects in an ionic solid (Al2O3) extrapolated to the dilute limit, corresponding to a simulation cell of infinite size. The large-scale calculations required for this extrapolation are enabled by developments in the approach to parallel sparse matrix algebra operations, which are central to linear-scaling density-functional theory calculations. The computational cost of manipulating sparse matrices, whose sizes are determined by the large number of basis functions present, is greatly improved with this new approach. We present details of the sparse algebra scheme implemented in the ONETEP code using hierarchical sparsity patterns, and demonstrate its use in calculations on a wide range of systems, involving thousands of atoms on hundreds to thousands of parallel processes.
Zhang, Shu; Li, Xiang; Lv, Jinglei; Jiang, Xi; Guo, Lei; Liu, Tianming
2016-03-01
A relatively underexplored question in fMRI is whether there are intrinsic differences in terms of signal composition patterns that can effectively characterize and differentiate task-based or resting state fMRI (tfMRI or rsfMRI) signals. In this paper, we propose a novel two-stage sparse representation framework to examine the fundamental difference between tfMRI and rsfMRI signals. Specifically, in the first stage, the whole-brain tfMRI or rsfMRI signals of each subject were composed into a big data matrix, which was then factorized into a subject-specific dictionary matrix and a weight coefficient matrix for sparse representation. In the second stage, all of the dictionary matrices from both tfMRI/rsfMRI data across multiple subjects were composed into another big data-matrix, which was further sparsely represented by a cross-subjects common dictionary and a weight matrix. This framework has been applied on the recently publicly released Human Connectome Project (HCP) fMRI data and experimental results revealed that there are distinctive and descriptive atoms in the cross-subjects common dictionary that can effectively characterize and differentiate tfMRI and rsfMRI signals, achieving 100% classification accuracy. Moreover, our methods and results can be meaningfully interpreted, e.g., the well-known default mode network (DMN) activities can be recovered from the very noisy and heterogeneous aggregated big-data of tfMRI and rsfMRI signals across all subjects in HCP Q1 release.
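A hedged sketch of the kind of first-stage factorization described above, using scikit-learn's DictionaryLearning as a stand-in for the paper's own procedure; the data shapes and parameters are illustrative, and the second stage would repeat the factorization on the stacked per-subject dictionaries.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))     # rows: signals (e.g., whole-brain fMRI)

# Factorize X ~ W @ D: components_ plays the dictionary-matrix role and the
# sparse codes W the weight-coefficient-matrix role in the abstract's notation.
dl = DictionaryLearning(n_components=20, alpha=1.0, max_iter=100,
                        transform_algorithm="lasso_lars", random_state=0)
W = dl.fit_transform(X)                # sparse weight/coefficient matrix
D = dl.components_                     # learned dictionary atoms
```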
Sparsity-based acoustic inversion in cross-sectional multiscale optoacoustic imaging.
Han, Yiyong; Tzoumas, Stratis; Nunes, Antonio; Ntziachristos, Vasilis; Rosenthal, Amir
2015-09-01
With recent advancement in hardware of optoacoustic imaging systems, highly detailed cross-sectional images may be acquired at a single laser shot, thus eliminating motion artifacts. Nonetheless, other sources of artifacts remain due to signal distortion or out-of-plane signals. The purpose of image reconstruction algorithms is to obtain the most accurate images from noisy, distorted projection data. In this paper, the authors use the model-based approach for acoustic inversion, combined with a sparsity-based inversion procedure. Specifically, a cost function is used that includes the L1 norm of the image in sparse representation and a total variation (TV) term. The optimization problem is solved by a numerically efficient implementation of a nonlinear gradient descent algorithm. TV-L1 model-based inversion is tested in the cross section geometry for numerically generated data as well as for in vivo experimental data from an adult mouse. In all cases, model-based TV-L1 inversion showed a better performance over the conventional Tikhonov regularization, TV inversion, and L1 inversion. In the numerical examples, the images reconstructed with TV-L1 inversion were quantitatively more similar to the originating images. In the experimental examples, TV-L1 inversion yielded sharper images and weaker streak artifact. The results herein show that TV-L1 inversion is capable of improving the quality of highly detailed, multiscale optoacoustic images obtained in vivo using cross-sectional imaging systems. As a result of its high fidelity, model-based TV-L1 inversion may be considered as the new standard for image reconstruction in cross-sectional imaging.
NASA Astrophysics Data System (ADS)
Sides, Scott; Jamroz, Ben; Crockett, Robert; Pletzer, Alexander
2012-02-01
Self-consistent field theory (SCFT) for dense polymer melts has been highly successful in describing complex morphologies in block copolymers. Field-theoretic simulations such as these are able to access large length and time scales that are difficult or impossible for particle-based simulations such as molecular dynamics. The modified diffusion equations that arise as a consequence of the coarse-graining procedure in the SCF theory can be efficiently solved with a pseudo-spectral (PS) method that uses fast Fourier transforms on uniform Cartesian grids. However, PS methods can be difficult to apply in many block copolymer SCFT simulations (e.g., confinement, interface adsorption) in which small spatial regions might require finer resolution than most of the simulation grid. Progress on using new solver algorithms to address these problems will be presented. The Tech-X Chompst project aims at marrying the best of adaptive mesh refinement with linear matrix solver algorithms. The Tech-X code PolySwift++ is an SCFT simulation platform that leverages ongoing development in coupling Chombo, a package for solving PDEs via block-structured AMR calculations and embedded boundaries, with PETSc, a toolkit that includes a large assortment of sparse linear solvers.
Computing sparse derivatives and consecutive zeros problem
NASA Astrophysics Data System (ADS)
Chandra, B. V. Ravi; Hossain, Shahadat
2013-02-01
We describe a substitution-based sparse Jacobian matrix determination method using algorithmic differentiation. Utilizing the a priori known sparsity pattern, a compression scheme is determined using graph coloring. The "compressed pattern" of the Jacobian matrix is then reordered into a form suitable for computation by substitution. We show that the column reordering of the compressed pattern matrix (so as to align the zero entries into consecutive locations in each row) can be viewed as a variant of the traveling salesman problem. Preliminary computational results show that on the test problems the performance of nearest-neighbor-type heuristic algorithms is highly encouraging.
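A minimal sketch of the compression idea behind such methods: structurally orthogonal columns (sharing no nonzero row) may receive the same color, so one seed vector per color suffices to recover the corresponding Jacobian entries. This greedy grouping is a generic stand-in, not the paper's substitution scheme or TSP-style reordering; S is an assumed boolean sparsity pattern.

```python
import numpy as np

def greedy_column_coloring(S):
    """Greedily group structurally orthogonal columns of a boolean sparsity
    pattern S; columns in one color share no nonzero row, so one seed vector
    per color recovers their Jacobian entries."""
    n_cols = S.shape[1]
    colors = -np.ones(n_cols, dtype=int)
    occupied = []                            # union of row patterns per color
    for j in range(n_cols):
        for c, rows in enumerate(occupied):
            if not np.any(rows & S[:, j]):   # no row clash with group c
                colors[j] = c
                occupied[c] = rows | S[:, j]
                break
        else:
            colors[j] = len(occupied)        # open a new color group
            occupied.append(S[:, j].copy())
    return colors

# Example: an arrowhead pattern needs only two colors despite eight columns.
S = np.eye(8, dtype=bool)
S[:, 0] = True
print(greedy_column_coloring(S))   # column 0 alone; the others can share
```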
From the Rendering Equation to Stratified Light Transport Inversion
2010-12-09
…iteratively. These approaches relate closely to the radiosity method for diffuse global illumination in forward rendering (Hanrahan et al., 1991; Gortler et al., …). … currently simply use sparse matrices to represent T; we are also interested in exploring connections with hierarchical and wavelet radiosity as in … Gauss-Seidel iterative methods used in radiosity. 2.4 Inverse Light Transport: Previous work on inverse rendering has considered inversion of the direct …
Sparse matrix methods research using the CSM testbed software system
NASA Technical Reports Server (NTRS)
Chu, Eleanor; George, J. Alan
1989-01-01
Research is described on sparse matrix techniques for the Computational Structural Mechanics (CSM) Testbed. The primary objective was to compare the performance of state-of-the-art techniques for solving sparse systems with those that are currently available in the CSM Testbed. Thus, one of the first tasks was to become familiar with the structure of the testbed, and to install some or all of the SPARSPAK package in the testbed. A suite of subroutines to extract from the data base the relevant structural and numerical information about the matrix equations was written, and all the demonstration problems distributed with the testbed were successfully solved. These codes were documented, and performance studies comparing the SPARSPAK technology to the methods currently in the testbed were completed. In addition, some preliminary studies were done comparing some recently developed out-of-core techniques with the performance of the testbed processor INV.
A tight and explicit representation of Q in sparse QR factorization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ng, E.G.; Peyton, B.W.
1992-05-01
In QR factorization of a sparse m×n matrix A (m ≥ n) the orthogonal factor Q is often stored implicitly as a lower trapezoidal matrix H known as the Householder matrix. This paper presents a simple characterization of the row structure of Q, which could be used as the basis for a sparse data structure that can store Q explicitly. The new characterization is a simple extension of a well-known row-oriented characterization of the structure of H. Hare, Johnson, Olesky, and van den Driessche have recently provided a complete sparsity analysis of the QR factorization. Let U be the matrix consisting of the first n columns of Q. Using results from that analysis, we show that the data structures for H and U resulting from our characterizations are tight when A is a strong Hall matrix. We also show that H and the lower trapezoidal part of U have the same sparsity characterization when A is strong Hall. We then show that this characterization can be extended to any weak Hall matrix that has been permuted into block upper triangular form. Finally, we show that permuting to block triangular form never increases the fill incurred during the factorization.
Large Covariance Estimation by Thresholding Principal Orthogonal Complements
Fan, Jianqing; Liao, Yuan; Mincheva, Martina
2012-01-01
This paper deals with the estimation of a high-dimensional covariance with a conditional sparsity structure and fast-diverging eigenvalues. By assuming sparse error covariance matrix in an approximate factor model, we allow for the presence of some cross-sectional correlation even after taking out common but unobservable factors. We introduce the Principal Orthogonal complEment Thresholding (POET) method to explore such an approximate factor structure with sparsity. The POET estimator includes the sample covariance matrix, the factor-based covariance matrix (Fan, Fan, and Lv, 2008), the thresholding estimator (Bickel and Levina, 2008) and the adaptive thresholding estimator (Cai and Liu, 2011) as specific examples. We provide mathematical insights when the factor analysis is approximately the same as the principal component analysis for high-dimensional data. The rates of convergence of the sparse residual covariance matrix and the conditional sparse covariance matrix are studied under various norms. It is shown that the impact of estimating the unknown factors vanishes as the dimensionality increases. The uniform rates of convergence for the unobserved factors and their factor loadings are derived. The asymptotic results are also verified by extensive simulation studies. Finally, a real data application on portfolio allocation is presented. PMID:24348088
NASA Technical Reports Server (NTRS)
Bloxham, Jeremy
1987-01-01
The method of stochastic inversion is extended to the simultaneous inversion of both main field and secular variation. In the present method, the time dependency is represented by an expansion in Legendre polynomials, resulting in a simple diagonal form for the a priori covariance matrix. The efficient preconditioned Broyden-Fletcher-Goldfarb-Shanno algorithm is used to solve the large system of equations resulting from expansion of the field spatially to spherical harmonic degree 14 and temporally to degree 8. Application of the method to observatory data spanning the 1900-1980 period results in a data fit of better than 30 nT, while providing temporally and spatially smoothly varying models of the magnetic field at the core-mantle boundary.
Preconditioned conjugate gradient wave-front reconstructors for multiconjugate adaptive optics
NASA Astrophysics Data System (ADS)
Gilles, Luc; Ellerbroek, Brent L.; Vogel, Curtis R.
2003-09-01
Multiconjugate adaptive optics (MCAO) systems with 10^4-10^5 degrees of freedom have been proposed for future giant telescopes. Using standard matrix methods to compute, optimize, and implement wave-front control algorithms for these systems is impractical, since the number of calculations required to compute and apply the reconstruction matrix scales respectively with the cube and the square of the number of adaptive optics degrees of freedom. We develop scalable open-loop iterative sparse matrix implementations of minimum variance wave-front reconstruction for telescope diameters up to 32 m with more than 10^4 actuators. The basic approach is the preconditioned conjugate gradient method with an efficient preconditioner, whose block structure is defined by the atmospheric turbulent layers very much like the layer-oriented MCAO algorithms of current interest. Two cost-effective preconditioners are investigated: a multigrid solver and a simpler block symmetric Gauss-Seidel (BSGS) sweep. Both options require off-line sparse Cholesky factorizations of the diagonal blocks of the matrix system. The cost to precompute these factors scales approximately as the three-halves power of the number of estimated phase grid points per atmospheric layer, and their average update rate is typically of the order of 10^-2 Hz, i.e., 4-5 orders of magnitude lower than the typical 10^3 Hz temporal sampling rate. All other computations scale almost linearly with the total number of estimated phase grid points. We present numerical simulation results to illustrate algorithm convergence. Convergence rates of both preconditioners are similar, regardless of measurement noise level, indicating that the layer-oriented BSGS sweep is as effective as the more elaborate multiresolution preconditioner.
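A minimal sketch of the preconditioned-CG pattern described above, with a block-diagonal preconditioner whose blocks are factorized off-line. SciPy's generic solvers and a random SPD test system stand in for the actual turbulence-statistics matrices, and splu replaces sparse Cholesky for brevity.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import LinearOperator, cg, splu

# A sparse SPD system standing in for the wave-front reconstruction equations.
n = 1000
A = sp.random(n, n, density=1e-3, format="csc", random_state=0)
A = (A @ A.T + sp.identity(n)).tocsc()
b = np.ones(n)

# Two-block preconditioner: factorize the diagonal blocks off-line, mirroring
# the off-line factorizations described above (splu in place of Cholesky).
k = n // 2
lu1, lu2 = splu(A[:k, :k].tocsc()), splu(A[k:, k:].tocsc())
M = LinearOperator((n, n), matvec=lambda r: np.concatenate(
    [lu1.solve(r[:k]), lu2.solve(r[k:])]))

x, info = cg(A, b, M=M)      # info == 0 signals convergence
```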
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pieper, Andreas; Kreutzer, Moritz; Alvermann, Andreas, E-mail: alvermann@physik.uni-greifswald.de
2016-11-15
We study Chebyshev filter diagonalization as a tool for the computation of many interior eigenvalues of very large sparse symmetric matrices. In this technique the subspace projection onto the target space of wanted eigenvectors is approximated with filter polynomials obtained from Chebyshev expansions of window functions. After the discussion of the conceptual foundations of Chebyshev filter diagonalization we analyze the impact of the choice of the damping kernel, search space size, and filter polynomial degree on the computational accuracy and effort, before we describe the necessary steps towards a parallel high-performance implementation. Because Chebyshev filter diagonalization avoids the need for matrix inversion it can deal with matrices and problem sizes that are presently not accessible with rational function methods based on direct or iterative linear solvers. To demonstrate the potential of Chebyshev filter diagonalization for large-scale problems of this kind we include as an example the computation of the 10^2 innermost eigenpairs of a topological insulator matrix with dimension 10^9 derived from quantum physics applications.
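A sketch of the core operation, applying a polynomial filter p(A)v for a spectral window via the Chebyshev three-term recurrence. The coefficients are the standard expansion of a window indicator function; damping kernels, subspace orthogonalization, and parallelization are omitted, and the bounds lo/hi are assumed to enclose the spectrum.

```python
import numpy as np

def cheb_filter_apply(A, v, lo, hi, window, degree=80):
    """Apply p(A) @ v, where p approximates the indicator of `window` = (a, b)
    inside the spectral interval (lo, hi) of the symmetric matrix/operator A."""
    center, half = (hi + lo) / 2.0, (hi - lo) / 2.0
    a, b = [(t - center) / half for t in window]   # window mapped into [-1, 1]
    # Chebyshev expansion coefficients of the indicator function of [a, b]:
    k = np.arange(1, degree + 1)
    mu = np.empty(degree + 1)
    mu[0] = (np.arccos(a) - np.arccos(b)) / np.pi
    mu[1:] = 2.0 * (np.sin(k * np.arccos(a)) - np.sin(k * np.arccos(b))) / (k * np.pi)
    # Three-term recurrence T_{j+1}(x) = 2x T_j(x) - T_{j-1}(x) on the scaled A.
    op = lambda w: (A @ w - center * w) / half
    t_prev, t_curr = v, op(v)
    y = mu[0] * t_prev + mu[1] * t_curr
    for j in range(2, degree + 1):
        t_prev, t_curr = t_curr, 2.0 * op(t_curr) - t_prev
        y += mu[j] * t_curr
    return y
```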
Sparseness- and continuity-constrained seismic imaging
NASA Astrophysics Data System (ADS)
Herrmann, Felix J.
2005-04-01
Non-linear solution strategies to the least-squares seismic inverse-scattering problem with sparseness and continuity constraints are proposed. Our approach is designed to (i) deal with substantial amounts of additive noise (SNR < 0 dB); (ii) use the sparseness and locality (both in position and angle) of directional basis functions (such as curvelets and contourlets) on the model: the reflectivity; and (iii) exploit the near invariance of these basis functions under the normal operator, i.e., the scattering-followed-by-imaging operator. Signal-to-noise ratio and the continuity along the imaged reflectors are significantly enhanced by formulating the solution of the seismic inverse problem in terms of an optimization problem. During the optimization, sparseness on the basis and continuity along the reflectors are imposed by jointly minimizing the l1- and anisotropic diffusion/total-variation norms on the coefficients and reflectivity, respectively. [Joint work with Peyman P. Moghaddam was carried out as part of the SINBAD project, with financial support secured through ITF (the Industry Technology Facilitator) from the following organizations: BG Group, BP, ExxonMobil, and SHELL. Additional funding came from the NSERC Discovery Grants 22R81254.]
3D Airborne Electromagnetic Inversion: A case study from the Musgrave Region, South Australia
NASA Astrophysics Data System (ADS)
Cox, L. H.; Wilson, G. A.; Zhdanov, M. S.; Sunwall, D. A.
2012-12-01
Geophysicists know and accept that geology is inherently 3D, and results from complex, overlapping processes related to genesis, metamorphism, deformation, alteration, weathering, and/or hydrogeology. Yet, the geophysics community has long relied on qualitative analysis, conductivity depth images (CDIs), 1D inversion, and/or plate modeling. There are many reasons for this deficiency, not the least of which has been the lack of capacity for historic 3D AEM inversion algorithms to invert entire surveys so as to practically affect exploration decisions. Our recent introduction of a moving sensitivity domain (footprint) methodology has been a paradigm shift in AEM interpretation. The basis of this method is that one needs only to calculate the responses and sensitivities for that part of the 3D earth model that is within the AEM system's sensitivity domain (footprint), and then superimpose all sensitivity domains into a single, sparse sensitivity matrix for the entire 3D earth model which is then updated in a regularized inversion scheme. This has made it practical to rigorously invert entire surveys with thousands of line kilometers of AEM data to mega-cell 3D models in hours using multi-processor workstations. Since 2010, over eighty individual projects have been completed for Aerodat, AEROTEM, DIGHEM, GEOTEM, HELITEM, HoisTEM, MEGATEM, RepTEM, RESOLVE, SkyTEM, SPECTREM, TEMPEST, and VTEM data from Australia, Brazil, Canada, Finland, Ghana, Peru, Tanzania, the US, and Zambia. Examples of 3D AEM inversion have been published for a variety of applications, including mineral exploration, oil sands exploration, salinity, permafrost, and bathymetry mapping. In this paper, we present a comparison of 3D inversions for SkyTEM, SPECTREM, TEMPEST and VTEM data acquired over the same area in the Musgrave region of South Australia for exploration under cover.
GPU-accelerated Modeling and Element-free Reverse-time Migration with Gauss Points Partition
NASA Astrophysics Data System (ADS)
Zhen, Z.; Jia, X.
2014-12-01
Element-free method (EFM) has been applied to seismic modeling and migration. Compared with the finite element method (FEM) and finite difference method (FDM), it is much cheaper and more flexible because only information about the nodes and the boundary of the study area is required in computation. In the EFM, the number of Gauss points should be consistent with the number of model nodes; otherwise the accuracy of the intermediate coefficient matrices would be harmed. Thus, when we increase the nodes of the velocity model in order to obtain higher resolution, the size of the computer's memory becomes a bottleneck. The original EFM can deal with at most 81×81 nodes in the case of 2 GB memory, as tested by Jia and Hu (2006). In order to solve the problem of storage and computation efficiency, we propose a concept of Gauss points partition (GPP) and utilize GPUs to improve the computation efficiency. Considering the characteristics of the Gauss points, the GPP method does not influence the propagation of seismic waves in the velocity model. To overcome the time-consuming computation of the stiffness matrix (K) and the mass matrix (M), we also use GPUs in our computation program. We employ the compressed sparse row (CSR) format to compress the intermediate sparse matrices and simplify the operations by solving the linear equations with the CULA Sparse Conjugate Gradient (CG) solver instead of the direct sparse solver PARDISO. We observe that our strategy can significantly reduce the computational time of K and M compared with the CPU-based algorithm. The model tested is the Marmousi model: its length is 7425 m and its depth is 2990 m. We discretize the model with 595×298 nodes, 300×300 Gauss cells, and 3×3 Gauss points in each cell. Compared with the conventional EFM, the GPU-GPP approach substantially improves efficiency: the speedup ratio for computing K and M is 120, and that for reverse-time migration (RTM) is 11.5. At the same time, the accuracy of imaging is not harmed. Another advantage of the GPU-GPP method is its easy application in other numerical methods such as the FEM. Finally, in the GPU-GPP method, the arrays require quite limited memory storage, which makes the method promising for large-scale 3D problems.
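Setting the GPU and CULA specifics aside, the assemble-in-triplets, convert-to-CSR, then solve-with-CG workflow the abstract describes looks as follows in a small CPU-side sketch; the 3×3 system is illustrative, not a stiffness matrix from the EFM.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import cg

# Assemble a small symmetric system in COO triplets, then convert to CSR so
# that only nonzeros (data, indices, indptr) are stored.
rows = np.array([0, 0, 1, 1, 1, 2, 2])
cols = np.array([0, 1, 0, 1, 2, 1, 2])
vals = np.array([4.0, -1.0, -1.0, 4.0, -1.0, -1.0, 4.0])
K = sp.coo_matrix((vals, (rows, cols)), shape=(3, 3)).tocsr()

f = np.array([1.0, 0.0, 1.0])
u, info = cg(K, f)     # iterative CG solve in place of a direct solver
```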
Deflation as a method of variance reduction for estimating the trace of a matrix inverse
Gambhir, Arjun Singh; Stathopoulos, Andreas; Orginos, Kostas
2017-04-06
Many fields require computing the trace of the inverse of a large, sparse matrix. The typical method used for such computations is the Hutchinson method, which is a Monte Carlo (MC) averaging over matrix quadratures. To improve its convergence, several variance reduction techniques have been proposed. In this paper, we study the effects of deflating the near-null singular value space. We make two main contributions. First, we analyze the variance of the Hutchinson method as a function of the deflated singular values and vectors. Although this provides good intuition in general, by assuming additionally that the singular vectors are random unitary matrices, we arrive at concise formulas for the deflated variance that include only the variance and mean of the singular values. We make the remarkable observation that deflation may increase variance for Hermitian matrices but not for non-Hermitian ones. This is a rare, if not unique, property where non-Hermitian matrices outperform Hermitian ones. The theory can be used as a model for predicting the benefits of deflation. Second, we use deflation in the context of a large-scale application of "disconnected diagrams" in Lattice QCD. On lattices, Hierarchical Probing (HP) has previously provided an order of magnitude of variance reduction over MC by removing "error" from neighboring nodes of increasing distance in the lattice. Although deflation used directly on MC yields a limited improvement of 30% in our problem, when combined with HP it reduces variance by a factor of over 150 compared to MC. For this, we precomputed the 1000 smallest singular values of an ill-conditioned matrix of size 25 million. Furthermore, using PRIMME and a domain-specific Algebraic Multigrid preconditioner, we perform one of the largest eigenvalue computations in Lattice QCD at a fraction of the cost of our trace computation.
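A minimal sketch of the baseline Hutchinson estimator for trace(A^{-1}) that the deflation and probing techniques above accelerate. The dense test matrix and sample count are illustrative; in practice each probe requires an iterative sparse solve rather than np.linalg.solve, and deflation would additionally project out precomputed near-null singular directions before probing.

```python
import numpy as np

def hutchinson_trace_inv(A, n_samples=200, rng=None):
    """Monte Carlo estimate of trace(A^{-1}) using Rademacher probes z,
    exploiting E[z^T A^{-1} z] = trace(A^{-1}) since E[z z^T] = I."""
    rng = rng or np.random.default_rng(0)
    n = A.shape[0]
    total = 0.0
    for _ in range(n_samples):
        z = rng.choice([-1.0, 1.0], size=n)
        total += z @ np.linalg.solve(A, z)     # one linear solve per probe
    return total / n_samples

A = np.diag(np.linspace(1.0, 10.0, 50)) + 0.01 * np.ones((50, 50))
print(hutchinson_trace_inv(A), np.trace(np.linalg.inv(A)))
```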
NASA Technical Reports Server (NTRS)
Park, K. C.; Alvin, K. F.; Belvin, W. Keith
1991-01-01
A second-order form of discrete Kalman filtering equations is proposed as a candidate state estimator for efficient simulations of control-structure interactions in coupled physical coordinate configurations as opposed to decoupled modal coordinates. The resulting matrix equation of the present state estimator consists of the same symmetric, sparse N x N coupled matrices of the governing structural dynamics equations as opposed to unsymmetric 2N x 2N state space-based estimators. Thus, in addition to substantial computational efficiency improvement, the present estimator can be applied to control-structure design optimization for which the physical coordinates associated with the mass, damping and stiffness matrices of the structure are needed instead of modal coordinates.
Collaborative sparse priors for multi-view ATR
NASA Astrophysics Data System (ADS)
Li, Xuelu; Monga, Vishal
2018-04-01
Recent work has seen a surge of sparse representation based classification (SRC) methods applied to automatic target recognition problems. While traditional SRC approaches used the l0 or l1 norm to quantify sparsity, spike and slab priors have established themselves as the gold standard for providing general tunable sparse structures on vectors. In this work, we employ collaborative spike and slab priors that can be applied to matrices to encourage sparsity for the problem of multi-view ATR. That is, target images captured from multiple views are expanded in terms of a training dictionary multiplied with a coefficient matrix. Ideally, for a test image set comprising multiple views of a target, coefficients corresponding to its identifying class are expected to be active, while others should be zero, i.e., the coefficient matrix is naturally sparse. We develop a new approach to solve the optimization problem that estimates the sparse coefficient matrix jointly with the sparsity-inducing parameters in the collaborative prior. ATR problems are investigated on the mid-wave infrared (MWIR) database made available by the US Army Night Vision and Electronic Sensors Directorate, which has a rich collection of views. Experimental results show that the proposed joint prior and coefficient estimation method (JPCEM) can: (1) enable improved accuracy when multiple views rather than a single one are invoked, and (2) outperform state-of-the-art alternatives, particularly when training imagery is limited.
Sparse Bayesian learning for DOA estimation with mutual coupling.
Dai, Jisheng; Hu, Nan; Xu, Weichao; Chang, Chunqi
2015-10-16
Sparse Bayesian learning (SBL) has given renewed interest to the problem of direction-of-arrival (DOA) estimation. It is generally assumed that the measurement matrix in SBL is precisely known. Unfortunately, this assumption may be invalid in practice due to the imperfect manifold caused by unknown or misspecified mutual coupling. This paper describes a modified SBL method for joint estimation of DOAs and mutual coupling coefficients with uniform linear arrays (ULAs). Unlike the existing method that only uses stationary priors, our new approach utilizes a hierarchical form of the Student t prior to enforce the sparsity of the unknown signal more heavily. We also provide a distinct Bayesian inference for the expectation-maximization (EM) algorithm, which can update the mutual coupling coefficients more efficiently. Another difference is that our method uses an additional singular value decomposition (SVD) to reduce the computational complexity of the signal reconstruction process and the sensitivity to the measurement noise.
Improved collaborative filtering recommendation algorithm of similarity measure
NASA Astrophysics Data System (ADS)
Zhang, Baofu; Yuan, Baoping
2017-05-01
The collaborative filtering recommendation algorithm is one of the most widely used algorithms in personalized recommender systems. The key is to find the nearest-neighbor set of the active user by using a similarity measure. However, traditional similarity measures mainly focus on the similarity of the items that users have rated in common, but ignore the relationship between those common items and all items a user rates. Moreover, because the rating matrix is very sparse, the traditional collaborative filtering recommendation algorithm is not very efficient. In order to obtain better accuracy, based on the consideration of common preference between users, the difference in rating scale, and the scores of common items, this paper presents an improved similarity measure method, and based on this method, a collaborative filtering recommendation algorithm based on similarity improvement is proposed. Experimental results show that the algorithm can effectively improve the quality of recommendation, thus alleviating the impact of data sparseness.
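For orientation, a baseline user-based collaborative filtering sketch with plain cosine similarity; the paper's improved measure additionally weights the overlap between commonly rated items and each user's full rating set, which this sketch does not implement. R is an assumed users-by-items matrix with 0 for unrated entries.

```python
import numpy as np

def cosine_user_similarity(R):
    """Cosine similarity between users of a (users x items) rating matrix R
    with 0 for unrated items; the zeros implicitly restrict the inner
    products to commonly rated items."""
    norms = np.linalg.norm(R, axis=1, keepdims=True)
    norms[norms == 0.0] = 1.0
    U = R / norms
    return U @ U.T

def predict(R, sim, user, item, k=3):
    """Predict a rating from the k most similar users who rated the item."""
    raters = np.flatnonzero(R[:, item])
    raters = raters[raters != user]
    if raters.size == 0:
        return 0.0
    nn = raters[np.argsort(sim[user, raters])[-k:]]
    w = sim[user, nn]
    return float(w @ R[nn, item] / (np.abs(w).sum() + 1e-12))
```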
NASA Technical Reports Server (NTRS)
Ehrhart, E. J.; Gillette, E. L.; Barcellos-Hoff, M. H.; Chaterjee, A. (Principal Investigator)
1996-01-01
High-LET radiation has unique physical and biological properties compared to sparsely ionizing radiation. Recent studies demonstrate that sparsely ionizing radiation rapidly alters the pattern of extracellular matrix expression in several tissues, but little is known about the effect of heavy-ion radiation. This study investigates densely ionizing radiation-induced changes in extracellular matrix localization in the mammary glands of adult female BALB/c mice after whole-body irradiation with 0.8 Gy of 600 MeV iron particles. The basement membrane and interstitial extracellular matrix proteins of the mammary gland stroma were mapped with respect to time postirradiation using immunofluorescence. Collagen III was induced in the adipose stroma within 1 day, continued to increase through day 9 and was resolved by day 14. Immunoreactive tenascin was induced in the epithelium by day 1, was evident at the epithelial-stromal interface by day 5-9 and persisted as a condensed layer beneath the basement membrane through day 14. These findings parallel similar changes induced by gamma irradiation but demonstrate different onset and chronicity. In contrast, the integrity of the epithelial basement membrane, which was unaffected by sparsely ionizing radiation, was disrupted by iron-particle irradiation. Laminin immunoreactivity was mildly irregular at 1 h postirradiation and showed discontinuities and thickening from days 1 to 9. Continuity was restored by day 14. Thus high-LET radiation, like sparsely ionizing radiation, induces rapid remodeling of the stromal extracellular matrix but also appears to alter the integrity of the epithelial basement membrane, which is an important regulator of epithelial cell proliferation and differentiation.
Photon-efficient super-resolution laser radar
NASA Astrophysics Data System (ADS)
Shin, Dongeek; Shapiro, Jeffrey H.; Goyal, Vivek K.
2017-08-01
The resolution achieved in photon-efficient active optical range imaging systems can be low due to non-idealities such as propagation through a diffuse scattering medium. We propose a constrained optimization-based framework to address extremes in scarcity of photons and blurring by a forward imaging kernel. We provide two algorithms for the resulting inverse problem: a greedy algorithm, inspired by sparse pursuit algorithms; and a convex optimization heuristic that incorporates image total variation regularization. We demonstrate that our framework outperforms existing deconvolution imaging techniques in terms of peak signal-to-noise ratio. Since our proposed method is able to super-resolve depth features using small numbers of photon counts, it can be useful for observing fine-scale phenomena in remote sensing through a scattering medium and through-the-skin biomedical imaging applications.
Xu, Jason; Minin, Vladimir N
2015-07-01
Branching processes are a class of continuous-time Markov chains (CTMCs) with ubiquitous applications. A general difficulty in statistical inference under partially observed CTMC models arises in computing transition probabilities when the discrete state space is large or uncountable. Classical methods such as matrix exponentiation are infeasible for large or countably infinite state spaces, and sampling-based alternatives are computationally intensive, requiring integration over all possible hidden events. Recent work has successfully applied generating function techniques to computing transition probabilities for linear multi-type branching processes. While these techniques often require significantly fewer computations than matrix exponentiation, they also become prohibitive in applications with large populations. We propose a compressed sensing framework that significantly accelerates the generating function method, decreasing computational cost up to a logarithmic factor by only assuming the probability mass of transitions is sparse. We demonstrate accurate and efficient transition probability computations in branching process models for blood cell formation and evolution of self-replicating transposable elements in bacterial genomes.
Solving Large-Scale Inverse Magnetostatic Problems using the Adjoint Method
Bruckner, Florian; Abert, Claas; Wautischer, Gregor; Huber, Christian; Vogler, Christoph; Hinze, Michael; Suess, Dieter
2017-01-01
An efficient algorithm for the reconstruction of the magnetization state within magnetic components is presented. The occurring inverse magnetostatic problem is solved by means of an adjoint approach, based on the Fredkin-Koehler method for the solution of the forward problem. Due to the use of hybrid FEM-BEM coupling combined with matrix compression techniques the resulting algorithm is well suited for large-scale problems. Furthermore the reconstruction of the magnetization state within a permanent magnet as well as an optimal design application are demonstrated. PMID:28098851
Discriminative Transfer Subspace Learning via Low-Rank and Sparse Representation.
Xu, Yong; Fang, Xiaozhao; Wu, Jian; Li, Xuelong; Zhang, David
2016-02-01
In this paper, we address the problem of unsupervised domain transfer learning in which no labels are available in the target domain. We use a transformation matrix to transfer both the source and target data to a common subspace, where each target sample can be represented by a combination of source samples such that the samples from different domains can be well interlaced. In this way, the discrepancy of the source and target domains is reduced. By imposing joint low-rank and sparse constraints on the reconstruction coefficient matrix, the global and local structures of data can be preserved. To enlarge the margins between different classes as much as possible and provide more freedom to diminish the discrepancy, a flexible linear classifier (projection) is obtained by learning a non-negative label relaxation matrix that allows the strict binary label matrix to relax into a slack variable matrix. Our method can avoid a potentially negative transfer by using a sparse matrix to model the noise and, thus, is more robust to different types of noise. We formulate our problem as a constrained low-rankness and sparsity minimization problem and solve it by the inexact augmented Lagrange multiplier method. Extensive experiments on various visual domain adaptation tasks show the superiority of the proposed method over the state-of-the-art methods. The MATLAB code of our method will be publicly available at http://www.yongxu.org/lunwen.html.
Improved analysis of SP and CoSaMP under total perturbations
NASA Astrophysics Data System (ADS)
Li, Haifeng
2016-12-01
Practically, in the underdetermined model y = Ax, where x is a K-sparse vector (i.e., it has no more than K nonzero entries), both y and A could be totally perturbed. A more relaxed condition means that, from a theoretical standpoint, fewer measurements are needed to ensure sparse recovery. In this paper, based on the restricted isometry property (RIP), two relaxed sufficient conditions are presented for subspace pursuit (SP) and compressive sampling matching pursuit (CoSaMP) to guarantee that the sparse vector x is recovered under total perturbations. Taking a random matrix as the measurement matrix, we also discuss the advantage of our condition. Numerical experiments validate that SP and CoSaMP can provide oracle-order recovery performance.
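A compact reference sketch of the CoSaMP iteration analyzed above (SP differs mainly in selecting k rather than 2k candidates per step); A, y, and k are placeholders, and the perturbation analysis itself is not reproduced here.

```python
import numpy as np

def cosamp(A, y, k, n_iter=30):
    """CoSaMP: merge the 2k strongest proxy entries with the current support,
    least-squares fit on the merged support, then prune back to k terms."""
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        proxy = A.T @ (y - A @ x)
        omega = np.argsort(np.abs(proxy))[-2 * k:]       # candidate atoms
        support = np.union1d(omega, np.flatnonzero(x)).astype(int)
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        trial = np.zeros_like(x)
        trial[support] = coef
        keep = np.argsort(np.abs(trial))[-k:]            # prune to k largest
        x = np.zeros_like(x)
        x[keep] = trial[keep]
    return x
```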
Multi-energy CT based on a prior rank, intensity and sparsity model (PRISM).
Gao, Hao; Yu, Hengyong; Osher, Stanley; Wang, Ge
2011-11-01
We propose a compressive sensing approach for multi-energy computed tomography (CT), namely the prior rank, intensity and sparsity model (PRISM). To further compress the multi-energy image for allowing the reconstruction with fewer CT data and less radiation dose, the PRISM models a multi-energy image as the superposition of a low-rank matrix and a sparse matrix (with row dimension in space and column dimension in energy), where the low-rank matrix corresponds to the stationary background over energy that has a low matrix rank, and the sparse matrix represents the rest of distinct spectral features that are often sparse. Distinct from previous methods, the PRISM utilizes the generalized rank, e.g., the matrix rank of tight-frame transform of a multi-energy image, which offers a way to characterize the multi-level and multi-filtered image coherence across the energy spectrum. Besides, the energy-dependent intensity information can be incorporated into the PRISM in terms of the spectral curves for base materials, with which the restoration of the multi-energy image becomes the reconstruction of the energy-independent material composition matrix. In other words, the PRISM utilizes prior knowledge on the generalized rank and sparsity of a multi-energy image, and intensity/spectral characteristics of base materials. Furthermore, we develop an accurate and fast split Bregman method for the PRISM and demonstrate the superior performance of the PRISM relative to several competing methods in simulations.
SAMBA: Sparse Approximation of Moment-Based Arbitrary Polynomial Chaos
NASA Astrophysics Data System (ADS)
Ahlfeld, R.; Belkouchi, B.; Montomoli, F.
2016-09-01
A new arbitrary Polynomial Chaos (aPC) method is presented for moderately high-dimensional problems characterised by limited input data availability. The proposed methodology improves the algorithm of aPC and extends the method, that was previously only introduced as tensor product expansion, to moderately high-dimensional stochastic problems. The fundamental idea of aPC is to use the statistical moments of the input random variables to develop the polynomial chaos expansion. This approach provides the possibility to propagate continuous or discrete probability density functions and also histograms (data sets) as long as their moments exist, are finite and the determinant of the moment matrix is strictly positive. For cases with limited data availability, this approach avoids bias and fitting errors caused by wrong assumptions. In this work, an alternative way to calculate the aPC is suggested, which provides the optimal polynomials, Gaussian quadrature collocation points and weights from the moments using only a handful of matrix operations on the Hankel matrix of moments. It can therefore be implemented without requiring prior knowledge about statistical data analysis or a detailed understanding of the mathematics of polynomial chaos expansions. The extension to more input variables suggested in this work, is an anisotropic and adaptive version of Smolyak's algorithm that is solely based on the moments of the input probability distributions. It is referred to as SAMBA (PC), which is short for Sparse Approximation of Moment-Based Arbitrary Polynomial Chaos. It is illustrated that for moderately high-dimensional problems (up to 20 different input variables or histograms) SAMBA can significantly simplify the calculation of sparse Gaussian quadrature rules. SAMBA's efficiency for multivariate functions with regard to data availability is further demonstrated by analysing higher order convergence and accuracy for a set of nonlinear test functions with 2, 5 and 10 different input distributions or histograms.
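The "handful of matrix operations on the Hankel matrix of moments" can be read as the classical Golub-Welsch-style construction sketched below: Cholesky-factor the Hankel moment matrix, extract the three-term recurrence coefficients, and eigendecompose the resulting Jacobi matrix. This is a one-dimensional sketch under the abstract's assumption that the moment matrix is positive definite; the anisotropic Smolyak extension is omitted.

```python
import numpy as np

def gauss_from_moments(m):
    """Gaussian quadrature nodes/weights from raw moments m_0..m_{2n}:
    Cholesky-factor the Hankel moment matrix, read off the three-term
    recurrence coefficients, and eigendecompose the Jacobi matrix."""
    n = (len(m) - 1) // 2
    H = np.array([[m[i + j] for j in range(n + 1)] for i in range(n + 1)])
    R = np.linalg.cholesky(H).T                   # upper factor, H = R^T R
    alpha = np.array([R[k, k + 1] / R[k, k]
                      - (R[k - 1, k] / R[k - 1, k - 1] if k else 0.0)
                      for k in range(n)])
    beta = np.array([R[k + 1, k + 1] / R[k, k] for k in range(n - 1)])
    J = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)  # Jacobi matrix
    nodes, V = np.linalg.eigh(J)
    weights = m[0] * V[0, :] ** 2
    return nodes, weights

# Demo: moments of the uniform weight on [-1, 1] reproduce Gauss-Legendre.
m = np.array([2.0, 0.0, 2 / 3, 0.0, 2 / 5, 0.0, 2 / 7])  # m_0..m_6 -> 3 nodes
print(gauss_from_moments(m))
```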
Cao, Buwen; Deng, Shuguang; Qin, Hua; Ding, Pingjian; Chen, Shaopeng; Li, Guanghui
2018-06-15
High-throughput technology has generated large-scale protein interaction data, which is crucial in our understanding of biological organisms. Many complex identification algorithms have been developed to determine protein complexes. However, these methods are only suitable for dense protein interaction networks, because their capabilities decrease rapidly when applied to sparse protein-protein interaction (PPI) networks. In this study, based on penalized matrix decomposition (PMD), a novel method for the identification of protein complexes (i.e., PMDpc) was developed to detect protein complexes in the human protein interaction network. This method mainly consists of three steps. First, the adjacency matrix of the protein interaction network is normalized. Second, the normalized matrix is decomposed into three factor matrices. The PMDpc method can detect protein complexes in sparse PPI networks by imposing appropriate constraints on the factor matrices. Finally, the results of our method are compared with those of other methods on the human PPI network. Experimental results show that our method can not only outperform classical algorithms, such as CFinder, ClusterONE, RRW, HC-PIN, and PCE-FR, but also achieve an ideal overall performance in terms of a composite score consisting of F-measure, accuracy (ACC), and the maximum matching ratio (MMR).
Target detection in GPR data using joint low-rank and sparsity constraints
NASA Astrophysics Data System (ADS)
Bouzerdoum, Abdesselam; Tivive, Fok Hing Chi; Abeynayake, Canicious
2016-05-01
In ground penetrating radars, background clutter, which comprises the signals backscattered from the rough, uneven ground surface and the background noise, impairs the visualization of buried objects and subsurface inspections. In this paper, a clutter mitigation method is proposed for target detection. The removal of background clutter is formulated as a constrained optimization problem to obtain a low-rank matrix and a sparse matrix. The low-rank matrix captures the ground surface reflections and the background noise, whereas the sparse matrix contains the target reflections. An optimization method based on split-Bregman algorithm is developed to estimate these two matrices from the input GPR data. Evaluated on real radar data, the proposed method achieves promising results in removing the background clutter and enhancing the target signature.
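A hedged sketch of the low-rank-plus-sparse decomposition such methods compute, here via the widely used inexact augmented Lagrange multiplier iteration for principal component pursuit rather than the paper's split-Bregman solver; D stands for a B-scan-like data matrix, and the default lam = 1/sqrt(max(m, n)) is the usual heuristic.

```python
import numpy as np

def shrink(X, tau):
    """Elementwise soft-thresholding."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def rpca(D, lam=None, mu=None, tol=1e-7, max_iter=500):
    """Principal component pursuit via inexact ALM: D ~ L (low rank, clutter)
    plus S (sparse, target reflections)."""
    m, n = D.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))
    mu = mu if mu is not None else 0.25 * m * n / np.abs(D).sum()
    Y = np.zeros_like(D)      # scaled dual variable
    S = np.zeros_like(D)
    for _ in range(max_iter):
        U, s, Vt = np.linalg.svd(D - S + Y / mu, full_matrices=False)
        L = (U * shrink(s, 1.0 / mu)) @ Vt        # singular value thresholding
        S = shrink(D - L + Y / mu, lam / mu)      # sparse component update
        Z = D - L - S
        Y += mu * Z
        if np.linalg.norm(Z) <= tol * np.linalg.norm(D):
            break
    return L, S
```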
NASA Astrophysics Data System (ADS)
Kuo, Chih-Hao
Efficient and accurate modeling of electromagnetic scattering from layered rough surfaces with buried objects finds applications ranging from detection of landmines to remote sensing of subsurface soil moisture. The formulation of a hybrid numerical/analytical solution to electromagnetic scattering from layered rough surfaces is first presented in this dissertation. The solution to scattering from each rough interface is sought independently based on the extended boundary condition method (EBCM), where the scattered fields of each rough interface are expressed as a summation of plane waves and then cast into reflection/transmission matrices. To account for interactions between multiple rough boundaries, the scattering matrix method (SMM) is applied to recursively cascade reflection and transmission matrices of each rough interface and obtain the composite reflection matrix from the overall scattering medium. The validation of this method against the Method of Moments (MoM) and Small Perturbation Method (SPM) is addressed and the numerical results which investigate the potential of low frequency radar systems in estimating deep soil moisture are presented. Computational efficiency of the proposed method is also discussed. In order to demonstrate the capability of this method in modeling coherent multiple scattering phenomena, the proposed method has been employed to analyze backscattering enhancement and satellite peaks due to surface plasmon waves from layered rough surfaces. Numerical results which show the appearance of enhanced backscattered peaks and satellite peaks are presented. Following the development of the EBCM/SMM technique, a technique that incorporates a buried object in layered rough surfaces by employing the T-matrix method and the cylindrical-to-spatial harmonics transformation is proposed. Validation and numerical results are provided. Finally, a multi-frequency polarimetric inversion algorithm for the retrieval of subsurface soil properties using VHF/UHF band radar measurements is devised. The top soil dielectric constant is first determined using an L-band inversion algorithm. For the retrieval of subsurface properties, a time-domain inversion technique is employed together with a parameter optimization for the pulse shape of time delay echoes from VHF/UHF band radar observations. Numerical studies to investigate the accuracy of the proposed inversion technique in the presence of errors are addressed.
Refractive index inversion based on Mueller matrix method
NASA Astrophysics Data System (ADS)
Fan, Huaxi; Wu, Wenyuan; Huang, Yanhua; Li, Zhaozhao
2016-03-01
Based on the Stokes and Jones vectors, the correlation between Mueller matrix elements and refractive index was studied and simplified, and an expression for refractive index inversion was deduced via the Mueller matrix formalism. The Mueller matrix elements under different incidence angles were simulated with the specular-reflection expression in order to analyze the influence of incidence angle and refractive index on them, and the results were verified by measuring the Mueller matrix elements of a polished metal surface. The research shows that, under specular reflection, the Mueller matrix inversion result is consistent with experiment and can serve as a refractive index inversion method, providing a new approach for target detection and recognition.
Sparse image reconstruction for molecular imaging.
Ting, Michael; Raich, Raviv; Hero, Alfred O
2009-06-01
The application that motivates this paper is molecular imaging at the atomic level. When discretized at subatomic distances, the volume is inherently sparse. Noiseless measurements from an imaging technology can be modeled by convolution of the image with the system point spread function (psf). Such is the case with magnetic resonance force microscopy (MRFM), an emerging technology where imaging of an individual tobacco mosaic virus was recently demonstrated with nanometer resolution. We also consider additive white Gaussian noise (AWGN) in the measurements. Many prior works on sparse estimators have focused on the case when H has low coherence; however, the system matrix H in our application is the convolution matrix for the system psf. A typical convolution matrix has high coherence. This paper, therefore, does not assume a low coherence H. A discrete-continuous form of the Laplacian and atom at zero (LAZE) p.d.f. used by Johnstone and Silverman is formulated, and two sparse estimators are derived by maximizing the joint p.d.f. of the observation and image conditioned on the hyperparameters. A thresholding rule that generalizes the hard and soft thresholding rules appears in the course of the derivation. This so-called hybrid thresholding rule, when used in the iterative thresholding framework, gives rise to the hybrid estimator, a generalization of the lasso. Estimates of the hyperparameters for the lasso and hybrid estimator are obtained via Stein's unbiased risk estimate (SURE). A numerical study with a Gaussian psf and two sparse images shows that the hybrid estimator outperforms the lasso.
Utilizing Visual Effects Software for Efficient and Flexible Isostatic Adjustment Modelling
NASA Astrophysics Data System (ADS)
Meldgaard, A.; Nielsen, L.; Iaffaldano, G.
2017-12-01
The isostatic adjustment signal generated by transient ice sheet loading is an important indicator of past ice sheet extent and the rheological constitution of the interior of the Earth. Finite element modelling has proved to be a very useful tool in these studies. We present a simple numerical model for 3D visco elastic Earth deformation and a new approach to the design of such models utilizing visual effects software designed for the film and game industry. The software package Houdini offers an assortment of optimized tools and libraries which greatly facilitate the creation of efficient numerical algorithms. In particular, we make use of Houdini's procedural work flow, the SIMD programming language VEX, Houdini's sparse matrix creation and inversion libraries, an inbuilt tetrahedralizer for grid creation, and the user interface, which facilitates effortless manipulation of 3D geometry. We mitigate many of the time consuming steps associated with the authoring of efficient algorithms from scratch while still keeping the flexibility that may be lost with the use of commercial dedicated finite element programs. We test the efficiency of the algorithm by comparing simulation times with off-the-shelf solutions from the Abaqus software package. The algorithm is tailored for the study of local isostatic adjustment patterns, in close vicinity to present ice sheet margins. In particular, we wish to examine possible causes for the considerable spatial differences in the uplift magnitude which are apparent from field observations in these areas. Such features, with spatial scales of tens of kilometres, are not resolvable with current global isostatic adjustment models, and may require the inclusion of local topographic features. We use the presented algorithm to study a near field area where field observations are abundant, namely, Disko Bay in West Greenland with the intention of constraining Earth parameters and ice thickness. In addition, we assess how local topographic features may influence the differential isostatic uplift in the area.
Sparse spikes super-resolution on thin grids II: the continuous basis pursuit
NASA Astrophysics Data System (ADS)
Duval, Vincent; Peyré, Gabriel
2017-09-01
This article analyzes the performance of the continuous basis pursuit (C-BP) method for sparse super-resolution. C-BP was recently proposed by Ekanadham, Tranchina and Simoncelli as a refined discretization scheme for the recovery of spikes in inverse problem regularization. One of the most well-known discretization schemes is the basis pursuit (BP).
NASA Astrophysics Data System (ADS)
Zhao, Jin; Han-Ming, Zhang; Bin, Yan; Lei, Li; Lin-Yuan, Wang; Ai-Long, Cai
2016-03-01
Sparse-view x-ray computed tomography (CT) imaging is an interesting topic in the CT field and can efficiently decrease the radiation dose. Compared with spatial reconstruction, a Fourier-based algorithm has advantages in reconstruction speed and memory usage. A novel Fourier-based iterative reconstruction technique that utilizes the non-uniform fast Fourier transform (NUFFT) is presented in this work, along with advanced total variation (TV) regularization, for fan-beam sparse-view CT. The introduction of a selective matrix improves reconstruction quality. The new method employs the NUFFT and its adjoint to iterate back and forth between the Fourier and image spaces. The performance of the proposed algorithm is demonstrated through a series of digital simulations and experimental phantom studies. Results of the proposed algorithm are compared with those of existing TV-regularized techniques based on the compressed sensing method, as well as the basic algebraic reconstruction technique. Compared with the existing TV-regularized techniques, the proposed Fourier-based technique significantly improves the convergence rate and reduces memory allocation. Project supported by the National High Technology Research and Development Program of China (Grant No. 2012AA011603) and the National Natural Science Foundation of China (Grant No. 61372172).
Randomized subspace-based robust principal component analysis for hyperspectral anomaly detection
NASA Astrophysics Data System (ADS)
Sun, Weiwei; Yang, Gang; Li, Jialin; Zhang, Dianfa
2018-01-01
A randomized subspace-based robust principal component analysis (RSRPCA) method for anomaly detection in hyperspectral imagery (HSI) is proposed. The RSRPCA combines advantages of randomized column subspaces and robust principal component analysis (RPCA). It assumes that the background has low-rank properties and that the anomalies are sparse and do not lie in the column subspace of the background. First, RSRPCA implements random sampling to sketch the original HSI dataset from columns and to construct a randomized column subspace of the background. Structured random projections are also adopted to sketch the HSI dataset from rows. Sketching from columns and rows greatly reduces the computational requirements of RSRPCA. Second, RSRPCA adopts columnwise RPCA (CWRPCA) to eliminate the negative effects of sampled anomaly pixels and purifies the previous randomized column subspace by removing sampled anomaly columns. The CWRPCA decomposes the submatrix of the HSI data into a low-rank matrix (the background component), a noisy matrix (the noise component), and a sparse anomaly matrix (the anomaly component) with only a small proportion of nonzero columns. The inexact augmented Lagrange multiplier algorithm is utilized to optimize the CWRPCA problem and estimate the sparse matrix. Nonzero columns of the sparse anomaly matrix point to sampled anomaly columns in the submatrix. Third, all pixels are projected onto the complementary subspace of the purified randomized column subspace of the background, and the anomaly pixels in the original HSI data are finally located exactly. Several experiments on three real hyperspectral images are carefully designed to investigate the detection performance of RSRPCA, and the results are compared with four state-of-the-art methods. Experimental results show that the proposed RSRPCA outperforms the four comparison methods in both detection performance and computational time.
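A minimal sketch of the low-rank-plus-sparse split at the heart of the method may help. The inexact augmented Lagrange multiplier (IALM) iteration below follows the commonly published recipe; the lambda and mu defaults are assumptions, and the randomized sketching and columnwise variant of the paper are omitted.

```python
# Sketch: inexact augmented Lagrange multiplier (IALM) iteration for RPCA,
# decomposing M into a low-rank part L and a sparse part S (M ~ L + S).
import numpy as np

def shrink(X, tau):
    """Soft-thresholding (proximal operator of the l1 norm)."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def rpca_ialm(M, lam=None, tol=1e-7, max_iter=500):
    m, n = M.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))  # common default
    Y = M / max(np.linalg.norm(M, 2), np.abs(M).max() / lam)    # dual init
    mu = 1.25 / np.linalg.norm(M, 2)                            # penalty parameter
    S = np.zeros_like(M)
    for _ in range(max_iter):
        # Low-rank update: singular value thresholding of M - S + Y/mu
        U, sig, Vt = np.linalg.svd(M - S + Y / mu, full_matrices=False)
        L = (U * shrink(sig, 1.0 / mu)) @ Vt
        # Sparse update: entrywise soft-thresholding
        S = shrink(M - L + Y / mu, lam / mu)
        Y += mu * (M - L - S)
        mu *= 1.5
        if np.linalg.norm(M - L - S) <= tol * np.linalg.norm(M):
            break
    return L, S
```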
Tuning Fractures With Dynamic Data
NASA Astrophysics Data System (ADS)
Yao, Mengbi; Chang, Haibin; Li, Xiang; Zhang, Dongxiao
2018-02-01
Flow in fractured porous media is crucial for the production of oil/gas reservoirs and the exploitation of geothermal energy. Flow behaviors in such media are mainly dictated by the distribution of fractures. Measuring and inferring the distribution of fractures is subject to large uncertainty, which, in turn, leads to great uncertainty in the prediction of flow behaviors. Inverse modeling with dynamic data may help constrain fracture distributions, thus reducing the uncertainty of flow prediction. However, inverse modeling for flow in fractured reservoirs is challenging, owing to the discrete and non-Gaussian distribution of fractures, as well as the strong nonlinearity in the relationship between flow responses and model parameters. In this work, building upon a series of recent advances, an inverse modeling approach is proposed to efficiently update the flow model to match the dynamic data while retaining geological realism in the distribution of fractures. In the approach, the Hough-transform method is employed to parameterize non-Gaussian fracture fields with continuous parameter fields, thus providing desirable properties required by many inverse modeling methods. In addition, a recently developed forward simulation method, the embedded discrete fracture method (EDFM), is utilized to model the fractures. The EDFM maintains computational efficiency while preserving the ability to capture the geometrical details of fractures, because the matrix is discretized on a structured grid while the fractures, handled as planes, are inserted into the matrix grid cells. The combination of the Hough representation of fractures with the EDFM makes it possible to tune the fractures (through updating their existence, location, orientation, length, and other properties) without requiring either unstructured grids or regridding during updating. Such a treatment is amenable to numerous inverse modeling approaches, such as the iterative inverse modeling method employed in this study, which is capable of dealing with strongly nonlinear problems. A series of numerical case studies with increasing complexity is set up to examine the performance of the proposed approach.
Inverse lithography using sparse mask representations
NASA Astrophysics Data System (ADS)
Ionescu, Radu C.; Hurley, Paul; Apostol, Stefan
2015-03-01
We present a novel optimisation algorithm for inverse lithography, based on optimisation of the mask derivative, a representation that is inherently sparse and, for rectilinear polygons, invertible. The method is first developed assuming a point light source and then extended to general incoherent sources. The result is a fast and flexible algorithm that produces manufacturable masks (the search space is constrained to rectilinear polygons) and admits specific constraints such as minimal line widths. One key trick is to treat polygons as continuous entities, which makes aerial image calculation extremely fast and accurate. Requirements for mask manufacturability can be integrated into the optimisation without much added complexity. We also explain how to extend the scheme to phase-changing mask optimisation.
Quantum algorithms for Gibbs sampling and hitting-time estimation
Chowdhury, Anirban Narayan; Somma, Rolando D.
2017-02-01
In this paper, we present quantum algorithms for solving two problems regarding stochastic processes. The first algorithm prepares the thermal Gibbs state of a quantum system and runs in time almost linear in √(Nβ/Z) and polynomial in log(1/ε), where N is the Hilbert space dimension, β is the inverse temperature, Z is the partition function, and ε is the desired precision of the output state. Our quantum algorithm exponentially improves the dependence on 1/ε and quadratically improves the dependence on β of known quantum algorithms for this problem. The second algorithm estimates the hitting time of a Markov chain. For a sparse stochastic matrix P, it runs in time almost linear in 1/(εΔ^{3/2}), where ε is the absolute precision in the estimation and Δ is a parameter determined by P, whose inverse is an upper bound of the hitting time. Our quantum algorithm quadratically improves the dependence on 1/ε and 1/Δ of the analogous classical algorithm for hitting-time estimation. Finally, both algorithms use tools recently developed in the context of Hamiltonian simulation, spectral gap amplification, and solving linear systems of equations.
Efficient Jacobian inversion for the control of simple robot manipulators
NASA Technical Reports Server (NTRS)
Fijany, Amir; Bejczy, Antal K.
1988-01-01
Symbolic inversion of the Jacobian matrix for spherical-wrist arms is investigated. It is shown that, taking advantage of the simple geometry of these arms, the closed-form solution of the system Q = J^{-1}X, representing a transformation from task space to joint space, can be obtained very efficiently. The solutions for the PUMA, Stanford, and a six-revolute-joint coplanar arm, along with all singular points, are presented. The solution for each joint variable is found as an explicit function of the singular points, which provides better insight into the effect of different singular points on the motion and force exertion of each individual joint. For the above arms, the computational cost of the solution is of the same order as that of the forward kinematic solution, and it is significantly reduced if the forward kinematic solution has already been obtained. A comparison with previous methods shows that this method is the most efficient to date.
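For intuition, the velocity-level transformation from task space to joint space can be sketched numerically on a hypothetical two-link planar arm (the link lengths and joint angles are made up); the paper's contribution is the symbolic, closed-form version of this computation for spherical-wrist arms.

```python
# Sketch: mapping task-space velocity X to joint rates Q = J^{-1} X
# for a hypothetical two-link planar arm (link lengths l1, l2 are made up).
import numpy as np

l1, l2 = 1.0, 0.7
q1, q2 = 0.4, 0.9                       # joint angles (rad), away from singularity

# Planar Jacobian of the end-effector position (standard textbook form).
J = np.array([
    [-l1*np.sin(q1) - l2*np.sin(q1+q2), -l2*np.sin(q1+q2)],
    [ l1*np.cos(q1) + l2*np.cos(q1+q2),  l2*np.cos(q1+q2)],
])

xdot = np.array([0.1, -0.05])           # desired end-effector velocity
qdot = np.linalg.solve(J, xdot)         # Q = J^{-1} X without forming J^{-1}

# Singularity check: J loses rank when sin(q2) = 0 (arm fully stretched/folded).
print(qdot, np.linalg.det(J), l1*l2*np.sin(q2))
```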
Bessel smoothing filter for spectral-element mesh
NASA Astrophysics Data System (ADS)
Trinh, P. T.; Brossier, R.; Métivier, L.; Virieux, J.; Wellington, P.
2017-06-01
Smoothing filters are extremely important tools in seismic imaging and inversion, for example in traveltime tomography, migration and waveform inversion. Because they may be applied many times during an inversion, it is important that these filters be efficient and able to easily incorporate prior information on the geological structure of the investigated medium, through variable coherent lengths and orientations. In this study, we promote the use of the Bessel filter to achieve these purposes. Instead of considering the direct application of the filter, we demonstrate that we can rely on the equation associated with its inverse filter, which amounts to the solution of an elliptic partial differential equation. This enhances the efficiency of the filter application as well as its flexibility. We apply this strategy within a spectral-element-based elastic full waveform inversion framework. Taking advantage of this formulation, we apply the Bessel filter by solving the associated partial differential equation directly on the spectral-element mesh through the standard weak formulation. This avoids cumbersome projection operators between the spectral-element mesh and a regular Cartesian grid, or expensive explicit windowed convolution on the finite-element mesh, which is often used for applying smoothing operators. The associated linear system is solved efficiently through a parallel conjugate gradient algorithm, in which the matrix-vector product is factorized and highly optimized with vectorized computation. Favourable scaling behaviour is obtained when comparing this strategy with the explicit convolution method. The theoretical numerical complexity of this approach increases linearly with the coherent length, whereas a sublinear relationship is observed in practice. Numerical illustrations are provided for schematic examples and for a more realistic elastic full waveform inversion gradient smoothing on the SEAM II benchmark model. These examples illustrate the efficiency and flexibility of the proposed approach.
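A minimal sketch of the inverse-filter idea, assuming a regular Cartesian grid in place of the spectral-element mesh: smoothing is obtained by solving the elliptic system (I - c*Laplacian) u = f with conjugate gradients, where c sets the squared coherent length.

```python
# Sketch: smoothing by solving the inverse-filter PDE (I - c * Laplacian) u = f
# with conjugate gradients; a regular 2D grid stands in for the SEM mesh.
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n, h, coherent_len = 64, 1.0, 3.0
c = coherent_len**2

# 1D second-difference operator; Kronecker sums give the 2D Laplacian.
D = sp.diags([1.0, -2.0, 1.0], [-1, 0, 1], shape=(n, n)) / h**2
I = sp.eye(n)
lap = sp.kron(I, D) + sp.kron(D, I)

A = (sp.eye(n*n) - c * lap).tocsr()      # inverse-filter operator (SPD)

f = np.random.default_rng(0).normal(size=n*n)   # rough field to smooth
u, info = spla.cg(A, f)                  # smoothed field
assert info == 0
```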
Framework to trade optimality for local processing in large-scale wavefront reconstruction problems.
Haber, Aleksandar; Verhaegen, Michel
2016-11-15
We show that the minimum variance wavefront estimation problems permit localized approximate solutions, in the sense that the wavefront value at a point (excluding unobservable modes, such as the piston mode) can be approximated by a linear combination of the wavefront slope measurements in the point's neighborhood. This enables us to efficiently compute a wavefront estimate by performing a single sparse matrix-vector multiplication. Moreover, our results open the possibility for the development of wavefront estimators that can be easily implemented in a decentralized/distributed manner, and in which the estimate optimality can be easily traded for computational efficiency. We numerically validate our approach on Hudgin wavefront sensor geometries, and the results can be easily generalized to Fried geometries.
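The practical consequence, one sparse matrix-vector product per wavefront estimate, can be sketched as follows; the banded random reconstructor here is purely illustrative and not a solution of the actual Hudgin-geometry estimation problem.

```python
# Sketch: once a localized reconstructor R is precomputed and stored sparse,
# each wavefront estimate is a single sparse matrix-vector multiplication.
# R is random and illustrative only, not a real minimum-variance solution.
import numpy as np
import scipy.sparse as sp

n_phase, n_slopes, bandwidth = 1024, 2048, 25
rng = np.random.default_rng(1)

# Banded sparsity pattern mimics "slopes in the point's neighborhood only".
rows, cols, vals = [], [], []
for i in range(n_phase):
    nbr = rng.integers(max(0, 2*i - bandwidth), min(n_slopes, 2*i + bandwidth), 8)
    rows += [i]*len(nbr); cols += list(nbr); vals += list(rng.normal(size=len(nbr)))
R = sp.csr_matrix((vals, (rows, cols)), shape=(n_phase, n_slopes))

slopes = rng.normal(size=n_slopes)
phase_estimate = R @ slopes              # the whole per-frame cost: one SpMV
```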
a Novel Discrete Optimal Transport Method for Bayesian Inverse Problems
NASA Astrophysics Data System (ADS)
Bui-Thanh, T.; Myers, A.; Wang, K.; Thiery, A.
2017-12-01
We present the Augmented Ensemble Transform (AET) method for generating approximate samples from a high-dimensional posterior distribution as a solution to Bayesian inverse problems. Solving large-scale inverse problems is critical for some of the most relevant and impactful scientific endeavors of our time. Therefore, constructing novel methods for solving the Bayesian inverse problem in more computationally efficient ways can have a profound impact on the science community. This research derives the novel AET method for exploring a posterior by solving a sequence of linear programming problems, resulting in a series of transport maps which map prior samples to posterior samples, allowing for the computation of moments of the posterior. We show both theoretical and numerical results, indicating this method can offer superior computational efficiency when compared to other SMC methods. Most of this efficiency is derived from matrix scaling methods to solve the linear programming problem and derivative-free optimization for particle movement. We use this method to determine inter-well connectivity in a reservoir and the associated uncertainty related to certain parameters. The attached file shows the difference between the true parameter and the AET parameter in an example 3D reservoir problem. The error is within the Morozov discrepancy allowance with lower computational cost than other particle methods.
Hessian Schatten-norm regularization for linear inverse problems.
Lefkimmiatis, Stamatios; Ward, John Paul; Unser, Michael
2013-05-01
We introduce a novel family of invariant, convex, and non-quadratic functionals that we employ to derive regularized solutions of ill-posed linear inverse imaging problems. The proposed regularizers involve the Schatten norms of the Hessian matrix, which are computed at every pixel of the image. They can be viewed as second-order extensions of the popular total-variation (TV) semi-norm since they satisfy the same invariance properties. Meanwhile, by taking advantage of second-order derivatives, they avoid the staircase effect, a common artifact of TV-based reconstructions, and perform well for a wide range of applications. To solve the corresponding optimization problems, we propose an algorithm that is based on a primal-dual formulation. A fundamental ingredient of this algorithm is the projection of matrices onto Schatten norm balls of arbitrary radius. This operation is performed efficiently based on a direct link we provide between vector projections onto ℓq norm balls and matrix projections onto Schatten norm balls. Finally, we demonstrate the effectiveness of the proposed methods through experimental results on several inverse imaging problems with real and simulated data.
Kinetics of diffusion-controlled annihilation with sparse initial conditions
Ben-Naim, Eli; Krapivsky, Paul
2016-12-16
Here, we study diffusion-controlled single-species annihilation with sparse initial conditions. In this random process, particles undergo Brownian motion, and when two particles meet, both disappear. We focus on sparse initial conditions where particles occupy a subspace of dimension δ that is embedded in a larger space of dimension d. We find that the co-dimension Δ = d - δ governs the behavior. All particles disappear when the co-dimension is sufficiently small, Δ ≤ 2; otherwise, a finite fraction of particles survive indefinitely. We establish the asymptotic behavior of the probability S(t) that a test particle survives until time t. When the subspace is a line, δ = 1, we find inverse logarithmic decay, $S \sim (\ln t)^{-1}$, in three dimensions, and a modified power-law decay, $S \sim (\ln t)\, t^{-1/2}$, in two dimensions. In general, the survival probability decays algebraically when Δ < 2, and there is an inverse logarithmic decay at the critical co-dimension Δ = 2.
NASA Astrophysics Data System (ADS)
Li, Jun; Song, Minghui; Peng, Yuanxi
2018-03-01
Current infrared and visible image fusion methods do not achieve adequate information extraction: they cannot extract the target information from infrared images while retaining the background information from visible images. Moreover, most of them have high complexity and are time-consuming. This paper proposes an efficient image fusion framework for infrared and visible images on the basis of robust principal component analysis (RPCA) and compressed sensing (CS). The novel framework consists of three phases. First, RPCA decomposition is applied to the infrared and visible images to obtain their sparse and low-rank components, which represent the salient features and background information of the images, respectively. Second, the sparse and low-rank coefficients are fused by different strategies. On the one hand, the measurements of the sparse coefficients are obtained by the random Gaussian matrix and fused by the standard deviation (SD) based fusion rule; the fused sparse component is then obtained by reconstructing the fused measurement using the fast continuous linearized augmented Lagrangian algorithm (FCLALM). On the other hand, the low-rank coefficients are fused using the max-absolute rule. The fused image is then obtained by superposing the fused sparse and low-rank components. For comparison, several popular fusion algorithms are tested experimentally. Comparing the fused results subjectively and objectively, we find that the proposed framework can extract the infrared targets while retaining the background information of the visible images, and that it exhibits state-of-the-art performance in terms of both fusion quality and speed.
HIGH DIMENSIONAL COVARIANCE MATRIX ESTIMATION IN APPROXIMATE FACTOR MODELS.
Fan, Jianqing; Liao, Yuan; Mincheva, Martina
2011-01-01
The variance-covariance matrix plays a central role in the inferential theories of high-dimensional factor models in finance and economics. Popular regularization methods that directly exploit sparsity are not applicable to many financial problems. Classical methods of estimating the covariance matrices are based on strict factor models, which assume independent idiosyncratic components. This assumption, however, is restrictive in practical applications. By assuming a sparse error covariance matrix, we allow for the presence of cross-sectional correlation even after taking out common factors, which enables us to combine the merits of both methods. We estimate the sparse covariance using the adaptive thresholding technique of Cai and Liu (2011), taking into account the fact that direct observations of the idiosyncratic components are unavailable. The impact of high dimensionality on the covariance matrix estimation based on the factor structure is then studied.
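A simplified sketch of the two-step estimator described above, assuming a single global threshold in place of the fully adaptive entry-wise rule of Cai and Liu (2011): extract a few principal factors, then threshold the off-diagonal entries of the residual (idiosyncratic) covariance.

```python
# Sketch: factor-based covariance estimation with thresholding of the
# idiosyncratic covariance (simplified: one global threshold, not the
# fully adaptive entry-wise rule).
import numpy as np

def factor_threshold_cov(X, n_factors=3, thr=0.1):
    """X: T x p matrix of observations; returns a regularized p x p covariance."""
    T, p = X.shape
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / T                          # sample covariance
    w, V = np.linalg.eigh(S)                   # eigenvalues in ascending order
    lead = V[:, -n_factors:] * np.sqrt(w[-n_factors:])
    common = lead @ lead.T                     # low-rank common-factor part
    resid = S - common                         # idiosyncratic part
    off = resid - np.diag(np.diag(resid))
    off[np.abs(off) < thr] = 0.0               # sparse error covariance
    return common + np.diag(np.diag(resid)) + off
```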
Convergence of Chahine's nonlinear relaxation inversion method used for limb viewing remote sensing
NASA Technical Reports Server (NTRS)
Chu, W. P.
1985-01-01
The application of Chahine's (1970) inversion technique to remote sensing problems utilizing the limb viewing geometry is discussed. The problem considered here involves occultation-type measurements and limb radiance-type measurements from either spacecraft or balloon platforms. The kernel matrix of the inversion problem is either an upper or lower triangular matrix. It is demonstrated that the Chahine inversion technique always converges, provided the diagonal elements of the kernel matrix are nonzero.
Total recall in distributive associative memories
NASA Technical Reports Server (NTRS)
Danforth, Douglas G.
1991-01-01
Iterative error correction of asymptotically large associative memories is equivalent to a one-step learning rule. This rule is the inverse of the activation function of the memory. Spectral representations of nonlinear activation functions are used to obtain the inverse in closed form for Sparse Distributed Memory, Selected-Coordinate Design, and Radial Basis Functions.
Sparse electrocardiogram signals recovery based on solving a row echelon-like form of system.
Cai, Pingmei; Wang, Guinan; Yu, Shiwei; Zhang, Hongjuan; Ding, Shuxue; Wu, Zikai
2016-02-01
The study of biology and medicine in noisy environments is an evolving direction in biological data analysis. Among these studies, the analysis of electrocardiogram (ECG) signals in a noisy environment is a challenging direction in personalized medicine. Due to their periodic character, ECG signals can be roughly regarded as sparse biomedical signals. This study proposes a two-stage recovery algorithm for sparse biomedical signals in the time domain. In the first stage, the concentration subspaces are found in advance. By exploiting these subspaces, the mixing matrix is estimated accurately. In the second stage, based on the number of active sources at each time point, the time points are divided into different layers. Next, by constructing transformation matrices, these time points form a row echelon-like system. The sources at each layer can then be solved for explicitly by corresponding matrix operations. It is worth noting that all these operations are conducted under a weak sparsity condition, namely that the number of active sources is less than the number of observations. Experimental results show that the proposed method performs better on the sparse ECG signal recovery problem.
Quantum algorithm for support matrix machines
NASA Astrophysics Data System (ADS)
Duan, Bojia; Yuan, Jiabin; Liu, Ying; Li, Dan
2017-09-01
We propose a quantum algorithm for support matrix machines (SMMs) that efficiently addresses an image classification problem by introducing a least-squares reformulation. This algorithm consists of two core subroutines: a quantum matrix inversion (Harrow-Hassidim-Lloyd, HHL) algorithm and a quantum singular value thresholding (QSVT) algorithm. The two algorithms can be implemented on a universal quantum computer with complexity O[log(npq)] and O[log(pq)], respectively, where n is the number of training data and p × q is the size of the feature space. By iterating the algorithms, we can find the parameters for the SMM classification model. Our analysis shows that both the HHL and QSVT algorithms achieve an exponential speedup over their classical counterparts.
Deconvolution using a neural network
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lehman, S.K.
1990-11-15
Viewing one-dimensional deconvolution as a matrix inversion problem, we compare a neural-network backpropagation matrix inverse with LMS and pseudo-inverse methods. This is largely an exercise in understanding how our neural network code works. 1 ref.
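A toy version of the underlying setup, with a banded blur matrix and two reference solvers (explicit pseudo-inverse and an LMS/Landweber-style iteration); the neural-network comparison of the report is not reproduced here.

```python
# Sketch: 1D deconvolution viewed as matrix inversion. The blur is a banded
# (Toeplitz) convolution matrix H; we recover x from y = H x + noise.
import numpy as np

n = 100
H = 0.5*np.eye(n) + 0.25*np.eye(n, k=1) + 0.25*np.eye(n, k=-1)  # blur matrix

x_true = np.zeros(n); x_true[20] = 1.0; x_true[60] = -0.5       # sparse input
y = H @ x_true + 1e-3*np.random.default_rng(0).normal(size=n)   # blurred + noise

x_pinv = np.linalg.pinv(H) @ y          # explicit pseudo-inverse deconvolution

# Iterative (LMS/Landweber-style) alternative to the explicit inverse.
x = np.zeros(n)
for _ in range(2000):
    x += 0.5 * H.T @ (y - H @ x)        # step size < 2/||H||^2 ensures convergence

print(np.abs(x_pinv - x_true).max(), np.abs(x - x_true).max())
```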
Saa, Pedro A.; Nielsen, Lars K.
2016-01-01
Motivation: Computation of steady-state flux solutions in large metabolic models is routinely performed using flux balance analysis based on a simple LP (linear programming) formulation. A minimal requirement for thermodynamic feasibility of the flux solution is the absence of internal loops, which are enforced using 'loopless constraints'. The resulting loopless flux problem is a substantially harder MILP (mixed integer linear programming) problem, which is computationally expensive for large metabolic models. Results: We developed a pre-processing algorithm that significantly reduces the size of the original loopless problem into an easier and equivalent MILP problem. The pre-processing step employs a fast matrix sparsification algorithm, Fast sparse null-space pursuit (Fast-SNP), inspired by recent results on sparse null-space pursuit (SNP). By finding a reduced feasible 'loop-law' matrix subject to known directionalities, Fast-SNP considerably improves the computational efficiency in several metabolic models running different loopless optimization problems. Furthermore, analysis of the topology encoded in the reduced loop matrix enabled identification of key directional constraints for the potential permanent elimination of infeasible loops in the underlying model. Overall, Fast-SNP is an effective and simple algorithm for efficient formulation of loop-law constraints, making loopless flux optimization feasible and numerically tractable at large scale. Availability and Implementation: Source code for MATLAB including examples is freely available for download at http://www.aibn.uq.edu.au/cssb-resources under Software. Optimization uses Gurobi, CPLEX or GLPK (the latter is included with the algorithm). Contact: lars.nielsen@uq.edu.au Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27559155
An approach to solving large reliability models
NASA Technical Reports Server (NTRS)
Boyd, Mark A.; Veeraraghavan, Malathi; Dugan, Joanne Bechta; Trivedi, Kishor S.
1988-01-01
This paper describes a unified approach to the problem of solving large realistic reliability models. The methodology integrates behavioral decomposition, state truncation, and efficient sparse matrix-based numerical methods. The use of fault trees, together with ancillary information regarding dependencies, to automatically generate the underlying Markov model state space is proposed. The effectiveness of this approach is illustrated by modeling a state-of-the-art flight control system and a multiprocessor system. Nonexponential distributions for times to failure of components are assumed in the latter example. The modeling tool used for most of this analysis is HARP (the Hybrid Automated Reliability Predictor).
A pressure flux-split technique for computation of inlet flow behavior
NASA Technical Reports Server (NTRS)
Pordal, H. S.; Khosla, P. K.; Rubin, S. G.
1991-01-01
A method for calculating the flow field in aircraft engine inlets is presented. The phenomena of inlet unstart and restart are investigated. Solutions of the reduced Navier-Stokes (RNS) equations are obtained with a time-consistent direct sparse matrix solver that computes the transient flow field both internal and external to the inlet. Time-varying shocks and time-varying recirculation regions can be efficiently analyzed. The code is quite general and is suitable for the computation of flow for a wide variety of geometries and over a wide range of Mach and Reynolds numbers.
Fabrication of cell-benign inverse opal hydrogels for three-dimensional cell culture.
Im, Pilseon; Ji, Dong Hwan; Kim, Min Kyung; Kim, Jaeyun
2017-05-15
Inverse opal hydrogels (IOHs) for cell culture were fabricated and optimized using calcium-crosslinked alginate microbeads as a sacrificial template and gelatin as a matrix. In contrast to traditional three-dimensional (3D) scaffolds, the gelatin IOHs allowed the utilization of both the macropore surface and the inner matrix for cell co-culture. In order to remove the template efficiently for the construction of 3D interconnected macropores, and to maintain high cell viability during template removal with EDTA solution, various fabrication factors, including alginate viscosity, alginate concentration, alginate microbead size, crosslinking calcium concentration, and gelatin network density, were investigated. Low-viscosity alginate, lower crosslinking calcium ion concentration, and lower concentrations of alginate and gelatin were found to yield high viability of cells encapsulated in the gelatin matrix after removal of the alginate template by EDTA treatment, by allowing rapid dissociation and diffusion of the alginate polymers. Based on the optimized fabrication conditions, gelatin IOHs showed good potential as a cell co-culture system applicable to tissue engineering and cancer research.
Parallel halftoning technique using dot diffusion optimization
NASA Astrophysics Data System (ADS)
Molina-Garcia, Javier; Ponomaryov, Volodymyr I.; Reyes-Reyes, Rogelio; Cruz-Ramos, Clara
2017-05-01
In this paper, a novel approach for halftone images is proposed and implemented for images obtained by the dot diffusion (DD) method. The designed technique is based on an optimization of the so-called class matrix used in the DD algorithm and consists of generating new versions of the class matrix with no baron and near-baron points, in order to minimize inconsistencies during the distribution of the error. The proposed class matrices have different properties, each designed for one of two different applications: applications where inverse halftoning is necessary, and applications where it is not required. The proposed method has been implemented on a GPU (NVIDIA GeForce GTX 750 Ti) and on multicore processors (AMD FX(tm)-6300 Six-Core Processor and Intel Core i5-4200U), using CUDA and OpenCV on a PC running Linux. Experimental results have shown that the novel framework generates halftone images, and inverse-halftone images, of good quality. The simulation results using parallel architectures have demonstrated the efficiency of the novel technique when implemented in real-time processing.
Numerical solution of quadratic matrix equations for free vibration analysis of structures
NASA Technical Reports Server (NTRS)
Gupta, K. K.
1975-01-01
This paper is concerned with the efficient and accurate solution of the eigenvalue problem represented by quadratic matrix equations. Such matrix forms are obtained in connection with the free vibration analysis of structures, discretized by finite 'dynamic' elements, resulting in frequency-dependent stiffness and inertia matrices. The paper presents a new numerical solution procedure of the quadratic matrix equations, based on a combined Sturm sequence and inverse iteration technique enabling economical and accurate determination of a few required eigenvalues and associated vectors. An alternative procedure based on a simultaneous iteration procedure is also described when only the first few modes are the usual requirement. The employment of finite dynamic elements in conjunction with the presently developed eigenvalue routines results in a most significant economy in the dynamic analysis of structures.
Optimization of computations for adjoint field and Jacobian needed in 3D CSEM inversion
NASA Astrophysics Data System (ADS)
Dehiya, Rahul; Singh, Arun; Gupta, Pravin K.; Israil, M.
2017-01-01
We present the features and results of a newly developed code, based on the Gauss-Newton optimization technique, for solving the three-dimensional controlled-source electromagnetic inverse problem. In this code a special emphasis has been put on representing the operations by block matrices for the conjugate gradient iteration. We show how, in the computation of the Jacobian, the matrix formed by differentiation of the system matrix can be made independent of frequency to optimize the operations at the conjugate gradient step. Coarse-level parallel computing, using the OpenMP framework, is used primarily due to its simplicity of implementation and the accessibility of shared-memory multi-core computing machines to almost anyone. We demonstrate how the coarseness of the modeling grid in comparison to the source (computational receiver) spacing can be exploited for efficient computing, without compromising the quality of the inverted model, by reducing the number of adjoint calls. It is also demonstrated that the adjoint field can even be computed on a grid coarser than the modeling grid without affecting the inversion outcome. These observations were reconfirmed using an experiment design in which the deviation of the source from a straight tow line is considered. Finally, a real field data inversion experiment is presented to demonstrate the robustness of the code.
Mathematical modeling of photovoltaic thermal PV/T system with v-groove collector
NASA Astrophysics Data System (ADS)
Zohri, M.; Fudholi, A.; Ruslan, M. H.; Sopian, K.
2017-07-01
The use of a v-groove collector in solar thermal systems yields higher thermal efficiency, according to the literature. Lowering the operating temperature of a photovoltaic panel raises its electrical efficiency. A photovoltaic thermal (PV/T) system produces electrical and thermal output concurrently. Mathematical modeling based on steady-state thermal analysis of a PV/T system with a v-groove collector was conducted. With the matrix inversion method, the energy balance equations are solved analytically. The comparison results show that the PV/T system with the v-groove collector achieves higher temperature and higher thermal and electrical efficiencies than systems with other collectors.
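The matrix inversion method amounts to assembling the linearized energy balance as A·T = b and solving for the component temperatures. The sketch below uses an invented three-node system (glass cover, PV plate, channel air) with made-up heat-transfer coefficients; it is not the paper's model.

```python
# Sketch: steady-state energy balance A @ T = b solved by matrix inversion.
# The 3x3 system (glass cover, PV plate, channel air) and all coefficients
# are invented placeholders, not the paper's actual model.
import numpy as np

h_ga, h_pg, h_pa = 10.0, 8.0, 25.0       # made-up heat-transfer coefficients
G, T_amb = 800.0, 300.0                  # irradiance (W/m^2), ambient (K)

A = np.array([
    [h_ga + h_pg, -h_pg,         0.0       ],   # glass cover balance
    [-h_pg,        h_pg + h_pa, -h_pa      ],   # PV plate balance
    [0.0,         -h_pa,         h_pa + 5.0],   # channel air balance
])
b = np.array([h_ga*T_amb, 0.8*G, 5.0*T_amb])

T = np.linalg.solve(A, b)                # glass, plate, air temperatures
print(T)
```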
Sparse Gaussian elimination with controlled fill-in on a shared memory multiprocessor
NASA Technical Reports Server (NTRS)
Alaghband, Gita; Jordan, Harry F.
1989-01-01
It is shown that in sparse matrices arising from electronic circuits, it is possible to do computations on many diagonal elements simultaneously. A technique for obtaining an ordered compatible set directly from the ordered incompatible table is given. The ordering is based on the Markowitz number of the pivot candidates. This technique generates a set of compatible pivots with the property of generating few fills. A novel heuristic algorithm is presented that combines the idea of an ordered compatible set with a limited binary tree search to generate several sets of compatible pivots in linear time. An elimination set for reducing the matrix is generated and selected on the basis of a minimum Markowitz sum number. The parallel pivoting technique presented is a stepwise algorithm and can be applied to any submatrix of the original matrix. Thus, it is not a preordering of the sparse matrix and is applied dynamically as the decomposition proceeds. Parameters are suggested to obtain a balance between parallelism and fill-ins. Results of applying the proposed algorithms on several large application matrices using the HEP multiprocessor (Kowalik, 1985) are presented and analyzed.
Non-convex Statistical Optimization for Sparse Tensor Graphical Model
Sun, Wei; Wang, Zhaoran; Liu, Han; Cheng, Guang
2016-01-01
We consider the estimation of sparse graphical models that characterize the dependency structure of high-dimensional tensor-valued data. To facilitate the estimation of the precision matrix corresponding to each way of the tensor, we assume the data follow a tensor normal distribution whose covariance has a Kronecker product structure. The penalized maximum likelihood estimation of this model involves minimizing a non-convex objective function. In spite of the non-convexity of this estimation problem, we prove that an alternating minimization algorithm, which iteratively estimates each sparse precision matrix while fixing the others, attains an estimator with the optimal statistical rate of convergence as well as consistent graph recovery. Notably, such an estimator achieves estimation consistency with only one tensor sample, which was not observed in previous work. Our theoretical results are backed by thorough numerical studies. PMID:28316459
NASA Astrophysics Data System (ADS)
Yu, Jian; Yin, Qian; Guo, Ping; Luo, A.-li
2014-09-01
This paper presents an efficient method for the extraction of astronomical spectra from two-dimensional (2D) multifibre spectrographs based on the regularized least-squares QR-factorization (LSQR) algorithm. We address two issues: we propose a modified Gaussian point spread function (PSF) for modelling the 2D PSF from multi-emission-line gas-discharge lamp images (arc images), and we develop an efficient deconvolution method to extract spectra in real circumstances. The proposed modified 2D Gaussian PSF model can fit various types of 2D PSFs, including different radial distortion angles and ellipticities. We adopt the regularized LSQR algorithm to solve the sparse linear equations constructed from the sparse convolution matrix, which we designate the deconvolution spectrum extraction method. Furthermore, we implement a parallelized LSQR algorithm based on graphics processing unit programming in the Compute Unified Device Architecture to accelerate the computational processing. Experimental results illustrate that the proposed extraction method can greatly reduce the computational cost and memory use of the deconvolution method and, consequently, increase its efficiency and practicability. In addition, the proposed extraction method has a stronger noise tolerance than other methods, such as the boxcar (aperture) extraction and profile extraction methods. Finally, we present an analysis of the sensitivity of the extraction results to the radius and full width at half-maximum of the 2D PSF.
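The core numerical step, a damped least-squares solve against a sparse convolution matrix, can be sketched with SciPy's LSQR; the banded Gaussian "PSF" matrix and the damping value below are toy assumptions, not the modified 2D PSF model of the paper.

```python
# Sketch: regularized LSQR solve of a sparse convolution system C @ s = d,
# the core step of the deconvolution extraction method (toy PSF matrix here).
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import lsqr

n = 500
# Toy sparse "PSF" convolution matrix: narrow banded Gaussian profile.
offsets = list(range(-3, 4))
vals = [float(np.exp(-0.5*(k/1.2)**2)) for k in offsets]
C = sp.diags(vals, offsets, shape=(n, n), format='csr')

s_true = np.zeros(n); s_true[::37] = 1.0          # sparse emission lines
d = C @ s_true + 0.01*np.random.default_rng(0).normal(size=n)

sol = lsqr(C, d, damp=0.05)                       # Tikhonov-damped LSQR
s_hat = sol[0]                                    # extracted spectrum
```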
Li, Yongbao; Tian, Zhen; Song, Ting; Wu, Zhaoxia; Liu, Yaqiang; Jiang, Steve; Jia, Xun
2017-01-07
Monte Carlo (MC)-based spot dose calculation is highly desired for inverse treatment planning in proton therapy because of its accuracy. Recent studies on biological optimization have also indicated the use of MC methods to compute relevant quantities of interest, e.g. linear energy transfer. Although GPU-based MC engines have been developed to address inverse optimization problems, their efficiency still needs to be improved. Also, the use of a large number of GPUs in MC calculation is not favorable for clinical applications. The previously proposed adaptive particle sampling (APS) method can improve the efficiency of MC-based inverse optimization by using the computationally expensive MC simulation more effectively. This method is more efficient than the conventional approach that performs spot dose calculation and optimization in two sequential steps. In this paper, we propose a computational library to perform MC-based spot dose calculation on GPU with the APS scheme. The implemented APS method performs a non-uniform sampling of the particles from pencil beam spots during the optimization process, favoring those from the high intensity spots. The library also conducts two computationally intensive matrix-vector operations frequently used when solving an optimization problem. This library design allows a streamlined integration of the MC-based spot dose calculation into an existing proton therapy inverse planning process. We tested the developed library in a typical inverse optimization system with four patient cases. The library achieved the targeted functions by supporting inverse planning in various proton therapy schemes, e.g. single field uniform dose, 3D intensity modulated proton therapy, and distal edge tracking. The efficiency was 41.6 ± 15.3% higher than the use of a GPU-based MC package in a conventional calculation scheme. The total computation time ranged between 2 and 50 min on a single GPU card depending on the problem size.
Exploiting Multiple Levels of Parallelism in Sparse Matrix-Matrix Multiplication
Azad, Ariful; Ballard, Grey; Buluc, Aydin; ...
2016-11-08
Sparse matrix-matrix multiplication (or SpGEMM) is a key primitive for many high-performance graph algorithms as well as for some linear solvers, such as algebraic multigrid. The scaling of existing parallel implementations of SpGEMM is heavily bound by communication. Even though 3D (or 2.5D) algorithms have been proposed and theoretically analyzed in the flat MPI model on Erdős-Rényi matrices, those algorithms had not been implemented in practice and their complexities had not been analyzed for the general case. In this work, we present the first implementation of the 3D SpGEMM formulation that exploits multiple (intranode and internode) levels of parallelism, achieving significant speedups over the state-of-the-art publicly available codes at all levels of concurrency. We extensively evaluate our implementation and identify bottlenecks that should be subject to further research.
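For reference, the SpGEMM primitive itself (on a single node, without the paper's communication-avoiding 3D distribution) is a one-liner on CSR matrices:

```python
# Sketch: the SpGEMM primitive on one node with SciPy's CSR kernels. The
# paper's contribution (the distributed 3D formulation) is not reproduced
# here; this only shows the primitive being scaled out.
import scipy.sparse as sp

n, density = 10_000, 1e-3
A = sp.random(n, n, density=density, format='csr', random_state=0)
B = sp.random(n, n, density=density, format='csr', random_state=1)

C = A @ B                    # sparse matrix-matrix multiply (SpGEMM)
print(C.nnz, "nonzeros in the product")
```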
Point-source inversion techniques
NASA Astrophysics Data System (ADS)
Langston, Charles A.; Barker, Jeffrey S.; Pavlin, Gregory B.
1982-11-01
A variety of approaches for obtaining source parameters from waveform data using moment-tensor or dislocation point source models have been investigated and applied to long-period body and surface waves from several earthquakes. Generalized inversion techniques have been applied to data for long-period teleseismic body waves to obtain the orientation, time function and depth of the 1978 Thessaloniki, Greece, event, of the 1971 San Fernando event, and of several events associated with the 1963 induced seismicity sequence at Kariba, Africa. The generalized inversion technique and a systematic grid testing technique have also been used to place meaningful constraints on mechanisms determined from very sparse data sets; a single station with high-quality three-component waveform data is often sufficient to discriminate faulting type (e.g., strike-slip, etc.). Sparse data sets for several recent California earthquakes, for a small regional event associated with the Koyna, India, reservoir, and for several events at the Kariba reservoir have been investigated in this way. Although linearized inversion techniques using the moment-tensor model are often robust, even for sparse data sets, there are instances where the simplifying assumption of a single point source is inadequate to model the data successfully. Numerical experiments utilizing synthetic data and actual data for the 1971 San Fernando earthquake graphically demonstrate that severe problems may be encountered if source finiteness effects are ignored. These techniques are generally applicable to on-line processing of high-quality digital data, but source complexity and inadequacy of the assumed Green's functions are major problems which are yet to be fully addressed.
A Higher Order Iterative Method for Computing the Drazin Inverse
Soleymani, F.; Stanimirović, Predrag S.
2013-01-01
A method with a high convergence rate for finding approximate inverses of nonsingular matrices is suggested and established analytically. An extension of the introduced computational scheme to general square matrices is defined. The extended method can be used for finding the Drazin inverse. The application of the scheme to large sparse test matrices, along with its use in preconditioning linear systems of equations, is presented to clarify the contribution of the paper. PMID:24222747
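As context, the classical second-order Newton-Schulz iteration, the baseline that higher-order schemes of this kind refine, can be sketched as follows; the test matrix and iteration count are arbitrary.

```python
# Sketch: the classical Newton-Schulz iteration for an approximate inverse,
# the second-order baseline that higher-order schemes refine.
import numpy as np

def newton_schulz_inverse(A, iters=30):
    n = A.shape[0]
    # Standard safe initial guess: X0 = A^T / (||A||_1 * ||A||_inf)
    X = A.T / (np.linalg.norm(A, 1) * np.linalg.norm(A, np.inf))
    I = np.eye(n)
    for _ in range(iters):
        X = X @ (2.0*I - A @ X)      # quadratic convergence toward A^{-1}
    return X

A = np.random.default_rng(0).normal(size=(50, 50)) + 10*np.eye(50)
X = newton_schulz_inverse(A)
print(np.linalg.norm(A @ X - np.eye(50)))   # residual near machine precision
```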
Colleau, Jean-Jacques; Palhière, Isabelle; Rodríguez-Ramilo, Silvia T; Legarra, Andres
2017-12-01
Pedigree-based management of genetic diversity in populations, e.g., using optimal contributions, involves computation of matrix-vector products of the form Ax, yielding elements (relationships) or functions (usually averages) of relationship matrices. For pedigree-based relationships A, a very efficient method exists. When all the individuals of interest are genotyped, genomic management can be addressed using the genomic relationship matrix G; however, to date, the computational problem of efficiently computing Gx has not been well studied. When some individuals of interest are not genotyped, genomic management should consider the relationship matrix H that combines genotyped and ungenotyped individuals; however, direct computation of H is computationally very demanding, because construction of a possibly huge matrix is required. Our work presents efficient ways of computing Gx and Hx, with applications on real data from dairy sheep and dairy goat breeding schemes. For genomic relationships, an efficient indirect computation with quadratic instead of cubic cost obtains Gx through successive products with Z' and Z, where Z is a matrix relating animals to genotypes. For the relationship matrix H, we propose an indirect method based on the difference between the vectors Hx and Ax, which involves computation of Ax and of products with submatrices of A and a working vector derived from x. The latter computation is the most demanding but can be done using sparse Cholesky decompositions, which allows handling very large genomic and pedigree data files. Studies based on simulations reported in the literature show that the trends of average relationships in G and A differ as genomic selection proceeds. When selection is based on genomic relationships but management is based on pedigree data, the true genetic diversity is overestimated. However, our tests on real data from sheep and goats, obtained before genomic selection started, do not show this. We present efficient methods to compute elements and statistics of the genomic relationship matrix G and of the matrix H that combines ungenotyped and genotyped individuals. These methods should be useful to monitor and handle genomic diversity.
NASA Technical Reports Server (NTRS)
Choo, Yung K.; Soh, Woo-Yung; Yoon, Seokkwan
1989-01-01
A finite-volume lower-upper (LU) implicit scheme is used to simulate an inviscid flow in a turbine cascade. This approximate factorization scheme requires only the inversion of sparse lower and upper triangular matrices, which can be done efficiently without extensive storage. As an implicit scheme it allows a large time step to reach the steady state. An interactive grid generation program (TURBO), which is being developed, is used to generate grids. This program uses the control point form of algebraic grid generation, which uses a sparse collection of control points from which the shape and position of coordinate curves can be adjusted. A distinct advantage of TURBO compared with other grid generation programs is that it allows easy change of the local mesh structure without affecting the grid outside the domain of independence. Sample grids are generated by TURBO for a compressor rotor blade and a turbine cascade. The turbine cascade flow is simulated by using the LU implicit scheme on the grid generated by TURBO.
Group sparse multiview patch alignment framework with view consistency for image classification.
Gui, Jie; Tao, Dacheng; Sun, Zhenan; Luo, Yong; You, Xinge; Tang, Yuan Yan
2014-07-01
No single feature can satisfactorily characterize the semantic concepts of an image. Multiview learning aims to unify different kinds of features to produce a consensual and efficient representation. This paper redefines part optimization in the patch alignment framework (PAF) and develops a group sparse multiview patch alignment framework (GSM-PAF). The new part optimization considers not only the complementary properties of different views, but also view consistency. In particular, view consistency models the correlations between all possible combinations of any two kinds of views. In contrast to conventional dimensionality reduction algorithms that perform feature extraction and feature selection independently, GSM-PAF enjoys joint feature extraction and feature selection by imposing the l2,1-norm on the projection matrix to achieve row sparsity, which leads to the simultaneous selection of relevant features and learning of the transformation, and thus makes the algorithm more discriminative. Experiments on two real-world image data sets demonstrate the effectiveness of GSM-PAF for image classification.
Competition in high dimensional spaces using a sparse approximation of neural fields.
Quinton, Jean-Charles; Girau, Bernard; Lefort, Mathieu
2011-01-01
The Continuum Neural Field Theory implements competition within topologically organized neural networks with lateral inhibitory connections. However, due to the polynomial complexity of matrix-based implementations, updating dense representations of the activity becomes computationally intractable when an adaptive resolution or an arbitrary number of input dimensions is required. This paper proposes an alternative to self-organizing maps with a sparse implementation based on Gaussian mixture models, trading redundancy for higher computational efficiency and alleviating constraints on the underlying substrate. This version reproduces the emergent attentional properties of the original equations by directly applying them within a continuous approximation of a high-dimensional neural field. The model is compatible with preprocessed sensory flows but can also be interfaced with artificial systems. This is particularly important for sensorimotor systems, where decisions and motor actions must be taken and updated in real time. Preliminary tests are performed on a reactive color tracking application, using spatially distributed color features.
NASA Astrophysics Data System (ADS)
Wang, Jinting; Lu, Liqiao; Zhu, Fei
2018-01-01
The finite element (FE) method is a powerful tool that investigators have applied to real-time hybrid simulations (RTHSs). This study focuses on the computational efficiency, including the computational time and accuracy, of numerical integrations in solving the FE numerical substructure in RTHSs. First, sparse matrix storage schemes are adopted to decrease the computational time of the FE numerical substructure. In this way, the task execution time (TET) decreases, so that the scale of the numerical substructure model can increase. Subsequently, several commonly used explicit numerical integration algorithms, including the central difference method (CDM), the Newmark explicit method, the Chang method and the Gui-λ method, are comprehensively compared to evaluate their computational time in solving the FE numerical substructure. The CDM is better than the other explicit integration algorithms when the damping matrix is diagonal, while the Gui-λ (λ = 4) method is advantageous when the damping matrix is non-diagonal. Finally, the effect of time delay on the computational accuracy of RTHSs is investigated by simulating structure-foundation systems. Simulation results show that the influence of time delay on the displacement response becomes obvious as the mass ratio increases, and delay compensation methods may reduce the relative error of the displacement peak value to less than 5% even with a large time step and large time delay.
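A minimal sketch of the CDM stepping loop with sparse system matrices, on an invented one-dimensional toy model; with diagonal mass and damping the per-step work reduces to sparse matrix-vector products plus a pre-factored solve.

```python
# Sketch: central difference method (CDM) stepping with sparse (CSR/CSC)
# system matrices. The 1D shear model and its parameters are toy values.
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu

ndof, dt, nsteps = 200, 1e-3, 1000
K = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(ndof, ndof)) * 4e4
M = sp.eye(ndof) * 1.0                      # diagonal (lumped) mass
C = sp.eye(ndof) * 0.05                     # diagonal damping (CDM-friendly)

lhs = splu((M/dt**2 + C/(2*dt)).tocsc())    # factor once, reuse every step
u_prev = np.zeros(ndof); u = np.zeros(ndof)
f = np.zeros(ndof); f[-1] = 1.0             # constant tip load

for _ in range(nsteps):
    # M (u+ - 2u + u-)/dt^2 + C (u+ - u-)/(2dt) + K u = f, solved for u+
    rhs = f - K @ u + (2*M/dt**2) @ u - (M/dt**2 - C/(2*dt)) @ u_prev
    u_prev, u = u, lhs.solve(rhs)
```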
A matrix-inversion method for gamma-source mapping from gamma-count data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Adsley, Ian; Burgess, Claire; Bull, Richard K
In a previous paper it was proposed that a simple matrix inversion method could be used to extract source distributions from gamma-count maps, using simple models to calculate the response matrix. The method was tested using numerically generated count maps. In the present work a 100 kBq Co-60 source has been placed on a gridded surface and the count rate measured using a NaI scintillation detector. The resulting map of gamma counts was used as input to the matrix inversion procedure and the source position recovered. A multi-source array was simulated by superposition of several single-source count maps and the source distribution was again recovered using matrix inversion. The measurements were performed for several detector heights. The effects of uncertainties in source-detector distances on the matrix inversion method are also examined. The results from this work give confidence in the application of the method to practical applications, such as the segregation of highly active objects amongst fuel-element debris. (authors)
Hu, Shaoxing; Xu, Shike; Wang, Duhu; Zhang, Aiwu
2015-11-11
To address the high computational cost of the traditional Kalman filter in SINS/GPS integration, a practical optimization algorithm with offline-derivation and parallel-processing methods, based on the numerical characteristics of the system, is presented in this paper. The algorithm exploits the sparseness and/or symmetry of matrices to simplify the computational procedure, so plenty of invalid operations can be avoided by offline derivation using a block matrix technique. For enhanced efficiency, a new parallel computational mechanism is established by subdividing and restructuring the calculation processes after analyzing the extracted "useful" data. As a result, the algorithm saves about 90% of the CPU processing time and 66% of the memory usage needed by a classical Kalman filter. Meanwhile, as a numerical approach, the method needs no precision-losing transformation/approximation of system modules, and accuracy suffers little in comparison with the filter before computational optimization. Furthermore, since no complicated matrix theories are needed, the algorithm can easily be transplanted into other modified filters as a secondary optimization method to achieve further efficiency.
Xi, Jianing; Wang, Minghui; Li, Ao
2018-06-05
Discovery of mutated driver genes is one of the primary objectives in studying tumorigenesis. To discover relatively infrequently mutated driver genes from somatic mutation data, many existing methods incorporate an interaction network as prior information. However, these network-based methods do not exploit prior information from mRNA expression patterns, which have also proven highly informative about cancer progression. To incorporate prior information from both the interaction network and mRNA expression, we propose a robust and sparse co-regularized nonnegative matrix factorization to discover driver genes from mutation data. Our framework also applies Frobenius-norm regularization to avoid overfitting. A sparsity-inducing penalty is employed to obtain sparse scores in the gene representations, of which the top-scored genes are selected as driver candidates. Evaluation experiments with known benchmark genes indicate that the performance of our method benefits from the two types of prior information. Our method also outperforms the existing network-based methods and detects some driver genes that are not predicted by the competing methods. In summary, the proposed method improves driver gene discovery by effectively incorporating prior information from the interaction network and mRNA expression patterns into a robust and sparse co-regularized matrix factorization framework.
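For orientation only, a generic multiplicative-update NMF with Frobenius regularization and an L1 sparsity penalty is sketched below; the paper's co-regularized formulation with network and expression priors is more involved, and the update constants here are heuristic.

```python
import numpy as np

# Illustrative multiplicative updates for
#   min ||X - W H||_F^2 + alpha*(||W||_F^2 + ||H||_F^2) + beta*sum(H)
# (L1 sparsity on H). A generic sketch, not the paper's co-regularized model.
rng = np.random.default_rng(1)
X = rng.random((100, 80))          # e.g. genes x samples mutation scores
k, alpha, beta, eps = 5, 0.1, 0.5, 1e-9
W, H = rng.random((100, k)), rng.random((k, 80))

for _ in range(200):
    H *= (W.T @ X) / (W.T @ W @ H + alpha * H + beta + eps)
    W *= (X @ H.T) / (W @ H @ H.T + alpha * W + eps)

driver_scores = W.max(axis=1)      # top-scored rows -> candidate genes
```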
Large-region acoustic source mapping using a movable array and sparse covariance fitting.
Zhao, Shengkui; Tuna, Cagdas; Nguyen, Thi Ngoc Tho; Jones, Douglas L
2017-01-01
Large-region acoustic source mapping is important for city-scale noise monitoring. Approaches using a single-position measurement scheme to scan large regions using small arrays cannot provide clean acoustic source maps, while deploying large arrays spanning the entire region of interest is prohibitively expensive. A multiple-position measurement scheme is applied to scan large regions at multiple spatial positions using a movable array of small size. Based on the multiple-position measurement scheme, a sparse-constrained multiple-position vectorized covariance matrix fitting approach is presented. In the proposed approach, the overall sample covariance matrix of the incoherent virtual array is first estimated using the multiple-position array data and then vectorized using the Khatri-Rao (KR) product. A linear model is then constructed for fitting the vectorized covariance matrix and a sparse-constrained reconstruction algorithm is proposed for recovering source powers from the model. The user parameter settings are discussed. The proposed approach is tested on a 30 m × 40 m region and a 60 m × 40 m region using simulated and measured data. Much cleaner acoustic source maps and lower sound pressure level errors are obtained compared to the beamforming approaches and the previous sparse approach [Zhao, Tuna, Nguyen, and Jones, Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Processing (ICASSP) (2016)].
An indirect approach to the extensive calculation of relationship coefficients
Colleau, Jean-Jacques
2002-01-01
A method was described for calculating population statistics on relationship coefficients without using corresponding individual data. It relied on the structure of the inverse of the numerator relationship matrix between individuals under investigation and ancestors. Computation times were observed on simulated populations and were compared to those incurred with a conventional direct approach. The indirect approach turned out to be very efficient for multiplying the relationship matrix corresponding to planned matings (full design) by any vector. Efficiency was generally still good or very good for calculating statistics on these simulated populations. An extreme implementation of the method is the calculation of inbreeding coefficients themselves. Relative performances of the indirect method were good except when many full-sibs during many generations existed in the population. PMID:12270102
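A minimal sketch of the indirect trick, assuming the sparse inverse is already available: since A v is the solution x of A⁻¹ x = v, one sparse solve replaces any product with the dense relationship matrix. The tridiagonal stand-in below is not a pedigree-derived matrix.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# If Ainv (the inverse numerator relationship matrix) is sparse, then
# A @ v solves Ainv @ x = v, so dense A never needs to be formed.
n = 2000
Ainv = sp.diags([-0.5, 2.0, -0.5], [-1, 0, 1], shape=(n, n), format="csc")
v = np.random.default_rng(2).random(n)

x = spla.spsolve(Ainv, v)   # equals A @ v without ever building dense A
```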
A Sparse Matrix Approach for Simultaneous Quantification of Nystagmus and Saccade
NASA Technical Reports Server (NTRS)
Kukreja, Sunil L.; Stone, Lee; Boyle, Richard D.
2012-01-01
The vestibulo-ocular reflex (VOR) consists of two intermingled non-linear subsystems; namely, nystagmus and saccade. Typically, nystagmus is analysed using a single sufficiently long signal or a concatenation of them. Saccade information is not analysed and discarded due to insufficient data length to provide consistent and minimum variance estimates. This paper presents a novel sparse matrix approach to system identification of the VOR. It allows for the simultaneous estimation of both nystagmus and saccade signals. We show via simulation of the VOR that our technique provides consistent and unbiased estimates in the presence of output additive noise.
Suhr, Anna Catharina; Vogeser, Michael; Grimm, Stefanie H
2016-05-30
For reliable quantitative analysis of endogenous analytes in complex biological samples by isotope dilution LC-MS/MS, the creation of appropriate calibrators is a challenge, since analyte-free authentic material is generally not available. Thus, surrogate matrices are often used to prepare calibrators and controls. However, currently employed validation protocols do not include specific experiments to verify the suitability of a surrogate matrix calibration for quantification of authentic matrix samples. The aim of this study was the development of a novel validation experiment to test whether surrogate matrix based calibrators enable correct quantification of authentic matrix samples. The key element of the novel validation experiment is the inversion of nonlabelled analytes and their stable isotope labelled (SIL) counterparts with respect to their functions, i.e., the SIL compound is the analyte and the nonlabelled substance is employed as the internal standard. As a consequence, both surrogate and authentic matrix are analyte-free with regard to the SIL analytes, which allows a comparison of the two matrices. We call this approach the Isotope Inversion Experiment. As the figure of merit we defined the accuracy of inverse quality controls in authentic matrix quantified by means of a surrogate matrix calibration curve. As a proof-of-concept application, an LC-MS/MS assay addressing six corticosteroids (cortisol, cortisone, corticosterone, 11-deoxycortisol, 11-deoxycorticosterone, and 17-OH-progesterone) was chosen. The integration of the Isotope Inversion Experiment into the validation protocol for the steroid assay was successfully realized, and the accuracy results of the inverse quality controls were altogether satisfactory. As a consequence, the suitability of a surrogate matrix calibration for quantification of the targeted steroids in human serum as the authentic matrix was successfully demonstrated. The Isotope Inversion Experiment fills a gap in the validation process for LC-MS/MS assays quantifying endogenous analytes. We consider it a valuable and convenient tool for evaluating the correct quantification of authentic matrix samples based on a calibration curve in a surrogate matrix.
Polymer sol-gel composite inverse opal structures.
Zhang, Xiaoran; Blanchard, G J
2015-03-25
We report on the formation of composite inverse opal structures where the matrix used to form the inverse opal contains both silica, formed using sol-gel chemistry, and poly(ethylene glycol), PEG. We find that the morphology of the inverse opal structure depends on both the amount of PEG incorporated into the matrix and its molecular weight. The extent of organization in the inverse opal structure, which is characterized by scanning electron microscopy and optical reflectance data, is mediated by the chemical bonding interactions between the silica and PEG constituents in the hybrid matrix. Both polymer chain terminus Si-O-C bonding and hydrogen bonding between the polymer backbone oxygens and silanol functionalities can contribute, with the polymer mediating the extent to which Si-O-Si bonds can form within the silica regions of the matrix due to hydrogen-bonding interactions.
Deconvolution of mixing time series on a graph
Blocker, Alexander W.; Airoldi, Edoardo M.
2013-01-01
In many applications we are interested in making inference on latent time series from indirect measurements, which are often low-dimensional projections resulting from mixing or aggregation. Positron emission tomography, super-resolution, and network traffic monitoring are some examples. Inference in such settings requires solving a sequence of ill-posed inverse problems, y_t = A x_t, where the projection mechanism provides information on A. We consider problems in which A specifies mixing on a graph of time series that are bursty and sparse. We develop a multilevel state-space model for mixing time series and an efficient approach to inference. A simple model is used to calibrate regularization parameters that lead to efficient inference in the multilevel state-space model. We apply this method to the problem of estimating point-to-point traffic flows on a network from aggregate measurements. Our solution outperforms existing methods for this problem, and our two-stage approach suggests an efficient inference strategy for multilevel models of multivariate time series. PMID:25309135
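As a bare-bones illustration of one such ill-posed solve (not the paper's multilevel state-space inference), Tikhonov regularization stabilizes y_t = A x_t; all sizes below are arbitrary.

```python
import numpy as np

# Minimal sketch: each ill-posed solve y_t = A x_t can be stabilized with
# Tikhonov regularization, x_t = argmin ||y_t - A x||^2 + lam*||x||^2.
rng = np.random.default_rng(3)
A = rng.random((30, 100))            # wide mixing/aggregation matrix
x_true = np.zeros(100)
x_true[[5, 40, 77]] = [2.0, 1.0, 3.0]           # sparse, bursty flows
y = A @ x_true + 0.01 * rng.normal(size=30)

lam = 0.1
x_hat = np.linalg.solve(A.T @ A + lam * np.eye(100), A.T @ y)
```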
NASA Astrophysics Data System (ADS)
Guglielmino, F.; Nunnari, G.; Puglisi, G.; Spata, A.
2009-04-01
We propose a new technique, based on elastic theory, to efficiently estimate three-dimensional surface displacement maps by integrating sparse Global Positioning System (GPS) measurements of deformation and Differential Interferometric Synthetic Aperture Radar (DInSAR) maps of movements of the Earth's surface. Previous methodologies in the literature for combining data from GPS and DInSAR surveys require two steps: first, the sparse GPS measurements are interpolated to provide GPS displacements on the DInSAR grid; second, the three-dimensional surface displacement maps are estimated using a suitable optimization technique. One advantage of the proposed approach is that these two steps are unified. We propose a linear matrix equation that accounts for both GPS and DInSAR data, whose solution simultaneously provides the strain tensor, the displacement field and the rigid-body rotation tensor throughout the entire investigated area. This linear matrix equation is solved using Weighted Least Squares (WLS), assuring both numerical robustness and high computational efficiency. The proposed methodology was tested on both synthetic and experimental data, the latter from GPS and DInSAR measurements carried out on Mt. Etna. The goodness of the results was evaluated using standard errors. These tests also allow the choice of specific parameters of the algorithm to be optimized. The "open" structure of the method will allow other available data sets, such as additional interferograms or other geodetic data (e.g., levelling, tilt, etc.), to be taken into account in the near future, in order to achieve even higher accuracy.
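A generic weighted least-squares solve of the normal equations, as a hedged stand-in for the paper's joint GPS/DInSAR matrix equation; the design matrix, weights, and dimensions are illustrative only.

```python
import numpy as np

# Generic WLS: solve (G^T W G) m = G^T W d with diagonal weights W.
rng = np.random.default_rng(4)
G = rng.random((200, 9))                  # design matrix (e.g. strain + rotation terms)
d = rng.random(200)                       # stacked GPS and DInSAR observations
w = np.concatenate([np.full(100, 1.0),    # higher weight: precise GPS rows
                    np.full(100, 0.2)])   # lower weight: noisier DInSAR rows

GW = G * w[:, None]                       # row-scaled G, i.e. W @ G
m = np.linalg.solve(GW.T @ G, GW.T @ d)   # weighted normal equations
```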
NASA Technical Reports Server (NTRS)
Memarsadeghi, Nargess
2011-01-01
More efficient versions of an interpolation method called kriging have been introduced in order to reduce its traditionally high computational cost. Written in C++, these approaches were tested on both synthetic and real data. Kriging is a best linear unbiased estimator and is suitable for interpolation of scattered data points. Kriging has long been used in the geostatistics and mining communities, but is now being researched for use in the image fusion of remotely sensed data; this allows data from various locations to be combined to fill in missing data from any single location. To arrive at the faster algorithms, a sparse SYMMLQ iterative solver, covariance tapering, fast multipole methods (FMM), and nearest-neighbor searching techniques were used. These implementations apply when the coefficient matrix of the linear system is symmetric, but not necessarily positive-definite.
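A toy version of the tapered-kriging solve, with SciPy's MINRES standing in for SYMMLQ (both handle symmetric, possibly indefinite systems); the covariance model and taper radius are assumptions, and at scale the tapered matrix would be stored in a sparse format.

```python
import numpy as np
from scipy.sparse.linalg import minres

# Covariance tapering zeroes entries beyond a cutoff; the symmetric system
# K w = k0 for the kriging weights is then solved iteratively.
rng = np.random.default_rng(5)
pts = rng.random((500, 2))                              # scattered sample locations
d = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
K = np.exp(-d / 0.2) * np.maximum(1 - d / 0.3, 0.0)     # tapered covariance matrix

x0 = np.array([0.5, 0.5])                               # interpolation target
d0 = np.linalg.norm(pts - x0, axis=1)
k0 = np.exp(-d0 / 0.2) * np.maximum(1 - d0 / 0.3, 0.0)

w, info = minres(K, k0)                                 # kriging weights
```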
3D frequency-domain finite-difference modeling of acoustic wave propagation
NASA Astrophysics Data System (ADS)
Operto, S.; Virieux, J.
2006-12-01
We present a 3D frequency-domain finite-difference method for acoustic wave propagation modeling. This method is developed as a tool to perform 3D frequency-domain full-waveform inversion of wide-angle seismic data. For wide-angle data, frequency-domain full-waveform inversion can be applied to only a few discrete frequencies to develop a reliable velocity model. Frequency-domain finite-difference (FD) modeling of wave propagation requires the resolution of a huge sparse system of linear equations. If this system can be solved with a direct method, solutions for multiple sources can be computed efficiently once the underlying matrix has been factorized. The drawback of the direct method is the memory requirement resulting from the fill-in of the matrix during factorization. We assess in this study whether representative problems can be addressed in 3D geometry with such an approach. We start from the velocity-stress formulation of the 3D acoustic wave equation. The spatial derivatives are discretized with a second-order accurate staggered-grid stencil on different coordinate systems such that the axes span as many directions as possible. Once the discrete equations are developed on each coordinate system, the particle velocity fields are eliminated from the first-order hyperbolic system (following the so-called parsimonious staggered-grid method), leading to second-order elliptic wave equations in pressure. The second-order wave equations discretized on each coordinate system are combined linearly to mitigate numerical anisotropy. Grid dispersion is then minimized by replacing the mass term at the collocation point with its weighted average over all the grid points of the stencil. Use of the second-order accurate staggered-grid stencil reduces the bandwidth of the matrix to be factorized; the final stencil incorporates 27 points. Absorbing conditions are PML. The system is solved using the parallel direct solver MUMPS, developed for distributed-memory computers and based on a multifrontal method for LU factorization. We used the METIS algorithm to reorder the matrix coefficients before factorization. Four grid points per minimum wavelength are used for discretization. We applied our algorithm to the 3D SEG/EAGE synthetic onshore OVERTHRUST model, of dimensions 20 x 20 x 4.65 km with velocities ranging between 2 and 6 km/s. We performed the simulations using 192 processors with 2 Gbytes of RAM per processor, for the 5 Hz, 7 Hz and 10 Hz frequencies in fractions of the OVERTHRUST model. The grid intervals were 100 m, 75 m and 50 m, respectively, and the grid dimensions were 207 x 207 x 53, 275 x 218 x 71 and 409 x 109 x 102, respectively, corresponding to 100, 80 and 25 percent of the model. The times for factorization were 20 min, 108 min and 163 min, respectively; the times for resolution were 3.8, 9.3 and 10.3 s per source. The total memory used during factorization was 143, 384 and 449 Gbytes, respectively. One can note the huge memory requirement of factorization and the efficiency of the direct method in computing solutions for a large number of sources, which highlights the respective drawback and merit of the frequency-domain approach with respect to the time-domain counterpart. These results show that 3D acoustic frequency-domain wave propagation modeling can be performed at low frequencies using a direct solver on large clusters of PCs.
This forward modeling algorithm may be used in the future as a tool to image the first kilometers of the crust by frequency-domain full-waveform inversion. For larger problems, we will use the out-of-core memory during factorization that has been implemented by the authors of MUMPS.
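The factor-once/solve-many economics described above can be sketched with a toy 2D Helmholtz-like matrix and SciPy's sparse LU, standing in for MUMPS; grid size and wavenumber are arbitrary.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# Toy 2D Helmholtz analogue: factorize once, then reuse for every source.
n = 200
h, k = 1.0 / n, 40.0                            # grid spacing, wavenumber
L1d = sp.diags([1, -2, 1], [-1, 0, 1], shape=(n, n)) / h**2
I = sp.identity(n)
A = (sp.kron(L1d, I) + sp.kron(I, L1d) + k**2 * sp.identity(n * n)).tocsc()

lu = spla.splu(A)                               # expensive LU factorization, done once
rng = np.random.default_rng(6)
for _ in range(5):                              # cheap solves, one per source
    b = np.zeros(n * n)
    b[rng.integers(n * n)] = 1.0                # point source
    p = lu.solve(b)                             # pressure field for this source
```

The factorization dominates the cost; each additional source then costs only a pair of triangular solves, mirroring the minutes-to-factorize versus seconds-per-source figures reported above.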
Shen, Hong-Bin
2011-01-01
Modern network science has brought significant advances to our understanding of complex systems biology. As a representative model in systems biology, protein interaction networks (PINs) are characterized by a remarkable modular structure, reflecting functional associations between their components. Many methods have been proposed to capture cohesive modules, in which the density of edges within modules is higher than across them. Recent studies reveal that cohesively interacting modules of proteins are not a universal organizing principle in PINs, which has opened up new avenues for revisiting functional modules in PINs. In this paper, functional clusters in PINs are found to form unorthodox structures, defined here as bi-sparse modules. In contrast to the traditional cohesive module, the nodes in a bi-sparse module are sparsely connected internally and densely connected with other bi-sparse or cohesive modules. We present a novel protocol called BinTree Seeking (BTS) for mining both bi-sparse and cohesive modules in PINs based on the Edge Density of Module (EDM) and matrix theory. BTS detects modules by depicting links and nodes rather than nodes alone, and its derivation procedure is performed entirely on the adjacency matrix of the network. The number of modules in a PIN can be determined automatically in the proposed BTS approach. BTS is tested on three real PINs, and the results demonstrate that functional modules in PINs are not dominantly cohesive but can be sparse. BTS software and supporting information are available at: www.csbio.sjtu.edu.cn/bioinf/BTS/. PMID:22140454
A Spectral Algorithm for Envelope Reduction of Sparse Matrices
NASA Technical Reports Server (NTRS)
Barnard, Stephen T.; Pothen, Alex; Simon, Horst D.
1993-01-01
The problem of reordering a sparse symmetric matrix to reduce its envelope size is considered. A new spectral algorithm for computing an envelope-reducing reordering is obtained by associating a Laplacian matrix with the given matrix and then sorting the components of a specified eigenvector of the Laplacian. This Laplacian eigenvector solves a continuous relaxation of a discrete problem related to envelope minimization called the minimum 2-sum problem. The permutation vector computed by the spectral algorithm is a closest permutation vector to the specified Laplacian eigenvector. Numerical results show that the new reordering algorithm usually computes smaller envelope sizes than those obtained from the current standard algorithms such as Gibbs-Poole-Stockmeyer (GPS) or SPARSPAK reverse Cuthill-McKee (RCM), in some cases reducing the envelope by more than a factor of two.
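A minimal sketch of the spectral step: sort vertices by the Fiedler vector (the eigenvector of the second-smallest Laplacian eigenvalue). A dense eigensolver is used here because the example is tiny; a Lanczos solver would be used at scale, and the random test matrix is purely illustrative.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.csgraph import laplacian

# Spectral envelope-reducing reordering of a sparse symmetric pattern.
A = sp.random(300, 300, density=0.02, random_state=7, format="csr")
A = A + A.T
A.data[:] = 1.0                                     # symmetric 0/1 structure
Lap = laplacian(A)

vals, vecs = np.linalg.eigh(Lap.toarray())          # dense eigensolve: toy size only
fiedler = vecs[:, 1]                                # second-smallest eigenvalue
perm = np.argsort(fiedler)                          # envelope-reducing permutation
A_perm = A[perm][:, perm]                           # reordered matrix
```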
Zhang, Guoqing; Sun, Huaijiang; Xia, Guiyu; Sun, Quansen
2016-07-07
Sparse representation based classification (SRC) has been developed and shown great potential for real-world applications. Based on SRC, Yang et al. [10] devised an SRC-steered discriminative projection (SRC-DP) method. However, as a linear algorithm, SRC-DP cannot handle data with a highly nonlinear distribution. The kernel sparse representation-based classifier (KSRC) is a nonlinear extension of SRC and can remedy this drawback. KSRC requires a predetermined kernel function, and selecting the kernel function and its parameters is difficult. Recently, multiple kernel learning for SRC (MKL-SRC) [22] has been proposed to learn a kernel from a set of base kernels. However, MKL-SRC considers only the within-class reconstruction residual while ignoring the between-class relationship when learning the kernel weights. In this paper, we propose a novel multiple kernel sparse representation-based classifier (MKSRC), and we then use it as a criterion to design a multiple kernel sparse representation based orthogonal discriminative projection method (MK-SR-ODP). The proposed algorithm aims to learn a projection matrix and a corresponding kernel from the given base kernels such that, in the low-dimensional subspace, the between-class reconstruction residual is maximized and the within-class reconstruction residual is minimized. Furthermore, to achieve a minimum overall loss when performing recognition in the learned low-dimensional subspace, we introduce cost information into the dimensionality reduction method. Solutions for the proposed method can be found efficiently using the trace ratio optimization method [33]. Extensive experimental results demonstrate the superiority of the proposed algorithm over state-of-the-art methods.
Stable Estimation of a Covariance Matrix Guided by Nuclear Norm Penalties
Chi, Eric C.; Lange, Kenneth
2014-01-01
Estimation of a covariance matrix or its inverse plays a central role in many statistical methods. For these methods to work reliably, estimated matrices must not only be invertible but also well-conditioned. The current paper introduces a novel prior to ensure a well-conditioned maximum a posteriori (MAP) covariance estimate. The prior shrinks the sample covariance estimator towards a stable target and leads to a MAP estimator that is consistent and asymptotically efficient. Thus, the MAP estimator gracefully transitions towards the sample covariance matrix as the number of samples grows relative to the number of covariates. The utility of the MAP estimator is demonstrated in two standard applications – discriminant analysis and EM clustering – in this sampling regime. PMID:25143662
NASA Astrophysics Data System (ADS)
Uieda, Leonardo; Barbosa, Valéria C. F.
2017-01-01
Estimating the relief of the Moho from gravity data is a computationally intensive nonlinear inverse problem. What is more, the modelling must take the Earth's curvature into account when the study area is of regional scale or greater. We present a regularized nonlinear gravity inversion method that has a low computational footprint and employs a spherical Earth approximation. To achieve this, we combine the highly efficient Bott's method with smoothness regularization and a discretization of the anomalous Moho into tesseroids (spherical prisms). The computational efficiency of our method is attained by harnessing the fact that all matrices involved are sparse. The inversion results are controlled by three hyperparameters: the regularization parameter, the anomalous Moho density contrast, and the reference Moho depth. We estimate the regularization parameter using the method of hold-out cross-validation. Additionally, we estimate the density contrast and the reference depth using knowledge of the Moho depth at certain points. We apply the proposed method to estimate the Moho depth for the South American continent using satellite gravity data and seismological data. The final Moho model is in accordance with previous gravity-derived models and seismological data. The misfit to the gravity and seismological data is worst in the Andes and best in oceanic areas, central Brazil and Patagonia, and along the Atlantic coast. Similarly to previous results, the model suggests a thinner crust of 30-35 km under the Andean foreland basins. Discrepancies with the seismological data are greatest in the Guyana Shield, the central Solimões and Amazonas Basins, the Paraná Basin, and the Borborema province. These differences suggest the existence of crustal or mantle density anomalies that were unaccounted for during gravity data processing.
NASA Astrophysics Data System (ADS)
Bousserez, Nicolas; Henze, Daven; Bowman, Kevin; Liu, Junjie; Jones, Dylan; Keller, Martin; Deng, Feng
2013-04-01
This work presents improved analysis-error estimates for 4D-Var systems. From operational NWP models to top-down constraints on trace gas emissions, many of today's data assimilation and inversion systems in atmospheric science rely on variational approaches. This success is due to both the mathematical clarity of these formulations and the availability of computationally efficient minimization algorithms. However, unlike Kalman filter-based algorithms, these methods do not provide an estimate of the analysis or forecast error covariance matrices; these error statistics are propagated only implicitly by the system. From both a practical (cycling assimilation) and scientific perspective, assessing uncertainties in the solution of the variational problem is critical. For large-scale linear systems, deterministic or randomization approaches can be considered, based on the equivalence between the inverse Hessian of the cost function and the covariance matrix of analysis error. For perfectly quadratic systems, like incremental 4D-Var, Lanczos/conjugate-gradient algorithms have proven most efficient at generating low-rank approximations of the Hessian matrix during the minimization. For weakly nonlinear systems, though, the limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) method, a quasi-Newton descent algorithm, is usually considered the best choice for the minimization. Suitable for large-scale optimization, this method generates an approximation to the inverse Hessian using the latest m vector/gradient pairs produced during the minimization, with m depending upon the available core memory. At each iteration, an initial low-rank approximation to the inverse Hessian has to be provided, which is called preconditioning. The ability of the preconditioner to retain useful information from previous iterations largely determines the efficiency of the algorithm. Here we assess the performance of different preconditioners for estimating the inverse Hessian of a large-scale 4D-Var system. The impact of using the diagonal preconditioners proposed by Gilbert and Lemaréchal (1989) instead of the usual Oren-Spedicato scalar will be presented first. We will also introduce new hybrid methods that combine randomization estimates of the analysis-error variance with L-BFGS diagonal updates to improve the inverse Hessian approximation. Results from these new algorithms will be evaluated against standard large-ensemble Monte Carlo simulations. The methods explored here are applied to the problem of inferring global atmospheric CO2 fluxes using remote sensing observations, and are intended to be integrated with the future NASA Carbon Monitoring System.
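For concreteness, the standard L-BFGS two-loop recursion is sketched below; the initial diagonal (here a scalar gamma, i.e., an Oren-Spedicato-style scaling) is exactly the preconditioning slot discussed above, and a diagonal estimate could be substituted for it. This is a textbook sketch, not the system described in the abstract.

```python
import numpy as np

# Two-loop recursion: apply the L-BFGS inverse-Hessian approximation to a
# vector g using the last m (s, y) pairs, ordered oldest -> newest.
def lbfgs_apply(g, s_list, y_list, gamma=1.0):
    q = g.copy()
    alphas = []
    for s, y in zip(reversed(s_list), reversed(y_list)):   # newest -> oldest
        rho = 1.0 / (y @ s)
        a = rho * (s @ q)
        q -= a * y
        alphas.append((rho, a))
    r = gamma * q                     # r = H0 @ q; a diagonal H0 would slot in here
    for (rho, a), (s, y) in zip(reversed(alphas), zip(s_list, y_list)):  # oldest -> newest
        b = rho * (y @ r)
        r += (a - b) * s
    return r                          # approximately Hinv @ g
```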
Bouridane, Ahmed; Ling, Bingo Wing-Kuen
2018-01-01
This paper presents an unsupervised learning algorithm for sparse nonnegative matrix factor time-frequency deconvolution with optimized fractional β-divergence. The β-divergence is a family of cost functions parametrized by a single parameter β. The Itakura-Saito divergence, Kullback-Leibler divergence, and least-squares distance are special cases corresponding to β = 0, 1, 2, respectively. This paper presents a generalized algorithm that uses a flexible range of β, including fractional values. It describes a majorization-minimization (MM) algorithm leading to the development of a fast-convergence multiplicative update algorithm with guaranteed convergence. The proposed model operates in the time-frequency domain and decomposes an information-bearing matrix into a two-dimensional deconvolution of factor matrices that represent the spectral dictionary and temporal codes. The deconvolution process has been optimized to yield sparse temporal codes by maximizing the likelihood of the observations. The paper also presents a method to estimate the fractional β value. The method is demonstrated on separating audio mixtures recorded from a single channel. The paper shows that the extraction of the spectral dictionary and temporal codes is significantly more efficient with the proposed algorithm, which subsequently leads to better source separation performance. Experimental tests and comparisons with other factorization methods have been conducted to verify its efficacy. PMID:29702629
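A generic β-divergence NMF multiplicative update (without the paper's 2-D convolution or sparsity terms) shows where a fractional β such as 0.5 enters; matrix sizes and iteration counts are placeholders.

```python
import numpy as np

# Standard MM multiplicative updates for beta-divergence NMF; valid for
# fractional beta, e.g. beta = 0.5 between Itakura-Saito and Kullback-Leibler.
def nmf_beta(V, k=8, beta=0.5, iters=200, seed=0):
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], k)) + 1e-9
    H = rng.random((k, V.shape[1])) + 1e-9
    for _ in range(iters):
        WH = W @ H
        H *= (W.T @ (WH**(beta - 2) * V)) / (W.T @ WH**(beta - 1))
        WH = W @ H
        W *= ((WH**(beta - 2) * V) @ H.T) / (WH**(beta - 1) @ H.T)
    return W, H

V = np.random.default_rng(1).random((128, 64)) + 1e-6   # stand-in magnitude spectrogram
W, H = nmf_beta(V, k=8, beta=0.5)
```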
NASA Astrophysics Data System (ADS)
Qin, Xulei; Cong, Zhibin; Fei, Baowei
2013-11-01
An automatic segmentation framework is proposed to segment the right ventricle (RV) in echocardiographic images. The method can automatically segment both epicardial and endocardial boundaries from a continuous echocardiography series by combining sparse matrix transform, a training model, and a localized region-based level set. First, the sparse matrix transform extracts main motion regions of the myocardium as eigen-images by analyzing the statistical information of the images. Second, an RV training model is registered to the eigen-images in order to locate the position of the RV. Third, the training model is adjusted and then serves as an optimized initialization for the segmentation of each image. Finally, based on the initializations, a localized, region-based level set algorithm is applied to segment both epicardial and endocardial boundaries in each echocardiograph. Three evaluation methods were used to validate the performance of the segmentation framework. The Dice coefficient measures the overall agreement between the manual and automatic segmentation. The absolute distance and the Hausdorff distance between the boundaries from manual and automatic segmentation were used to measure the accuracy of the segmentation. Ultrasound images of human subjects were used for validation. For the epicardial and endocardial boundaries, the Dice coefficients were 90.8 ± 1.7% and 87.3 ± 1.9%, the absolute distances were 2.0 ± 0.42 mm and 1.79 ± 0.45 mm, and the Hausdorff distances were 6.86 ± 1.71 mm and 7.02 ± 1.17 mm, respectively. The automatic segmentation method based on a sparse matrix transform and level set can provide a useful tool for quantitative cardiac imaging.
Moody, Daniela; Wohlberg, Brendt
2018-01-02
An approach for land cover classification, seasonal and yearly change detection and monitoring, and identification of changes in man-made features may use a clustering of sparse approximations (CoSA) on sparse representations in learned dictionaries. The learned dictionaries may be derived using efficient convolutional sparse coding to build multispectral or hyperspectral, multiresolution dictionaries that are adapted to regional satellite image data. Sparse image representations of images over the learned dictionaries may be used to perform unsupervised k-means clustering into land cover categories. The clustering process behaves as a classifier in detecting real variability. This approach may combine spectral and spatial textural characteristics to detect geologic, vegetative, hydrologic, and man-made features, as well as changes in these features over time.
Compressive sensing using optimized sensing matrix for face verification
NASA Astrophysics Data System (ADS)
Oey, Endra; Jeffry; Wongso, Kelvin; Tommy
2017-12-01
Biometrics offers a solution to problems that arise with password-based data access, such as forgotten passwords and the difficulty of recalling many different ones. With biometrics, the physical characteristics of a person can be captured and used in the identification process. In this research, facial biometrics is used in the verification process to determine whether a user has the authority to access the data. Facial biometrics was chosen for its low implementation cost and reasonably accurate user identification. The face verification system adopted in this research uses the Compressive Sensing (CS) technique, which aims to reduce dimensionality as well as encrypt the facial test image, representing the image as sparse signals. The encrypted data can be reconstructed using a sparse coding algorithm. Two types of sparse coding, namely Orthogonal Matching Pursuit (OMP) and Iteratively Reweighted Least Squares-ℓp (IRLS-ℓp), are compared in this face verification research. The reconstructed sparse signals are then compared, via the Euclidean norm, with the sparse signal of the user previously saved in the system to determine the validity of the facial test image. With a non-optimized sensing matrix, the system accuracy obtained in this research is 99% for IRLS with a face verification response time of 4.917 s, and 96.33% for OMP with a response time of 0.4046 s; with an optimized sensing matrix, the accuracy is 99% for IRLS with a response time of 13.4791 s, and 98.33% for OMP with a response time of 3.1571 s.
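For reference, plain OMP, one of the two sparse-coding routines compared, can be written in a few lines; the sensing matrix A, measurement y, and sparsity level k are whatever the sensing setup provides, and this textbook form is not necessarily the authors' implementation.

```python
import numpy as np

# Orthogonal matching pursuit: greedily grow a support of size k,
# re-fitting the coefficients by least squares at each step.
def omp(A, y, k):
    residual, support = y.copy(), []
    for _ in range(k):
        j = int(np.argmax(np.abs(A.T @ residual)))   # most correlated atom
        support.append(j)
        x_s, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ x_s           # re-fit on current support
    x = np.zeros(A.shape[1])
    x[support] = x_s
    return x
```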
Bordehore, Cesar; Fuentes, Verónica L; Segarra, Jose G; Acevedo, Melisa; Canepa, Antonio; Raventós, Josep
2015-01-01
Frequently, population ecology of marine organisms uses a descriptive approach in which sizes and densities are plotted over time. This approach has limited usefulness for designing management strategies or modelling different scenarios. Population projection matrix models are among the most widely used tools in ecology. Unfortunately, for the majority of pelagic marine organisms, it is difficult to mark individuals and follow them over time to determine their vital rates and build a population projection matrix model. Nevertheless, it is possible to obtain time-series data on the size structure and the density of each size class, in order to determine the matrix parameters. This approach is known as a "demographic inverse problem"; it is based on quadratic programming methods but has rarely been used on aquatic organisms. We used unpublished field data from a population of the cubomedusa Carybdea marsupialis to construct a population projection matrix model and compare two different management strategies for lowering the population to its pre-2008 level, when there was no significant interaction with bathers: direct removal of medusae and reduction of prey. Our results showed that removal of jellyfish from all size classes was more effective than removing only juveniles or adults. When reducing prey, the C. marsupialis population was lowered most efficiently when prey depletion affected the prey of all medusa sizes. Our model fit the field data well and may serve to design an efficient management strategy or to build hypothetical scenarios such as removal of individuals or reduction of prey. This method is applicable to other marine or terrestrial species for which density and population structure over time are available.
A robust method of computing finite difference coefficients based on Vandermonde matrix
NASA Astrophysics Data System (ADS)
Zhang, Yijie; Gao, Jinghuai; Peng, Jigen; Han, Weimin
2018-05-01
When the finite difference (FD) method is employed to simulate wave propagation, a high-order FD scheme is preferred in order to achieve better accuracy. However, if the order of the FD scheme is high enough, the coefficient matrix of the formula for calculating the finite difference coefficients is close to singular. In this case, when the FD coefficients are computed with MATLAB's matrix inverse operator, inaccurate results can be produced. To overcome this problem, we propose an algorithm based on the Vandermonde matrix. After a specified mathematical transformation, the coefficient matrix is transformed into a Vandermonde matrix. The FD coefficients of the high-order FD method can then be computed by a Vandermonde-matrix algorithm, which avoids inverting the near-singular matrix. The dispersion analysis and numerical results for a homogeneous elastic model and a geophysical model of an oil and gas reservoir demonstrate that the Vandermonde-based algorithm has better accuracy than MATLAB's matrix inverse operator.
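The underlying linear system can be sketched as follows; np.linalg.solve stands in for a dedicated Vandermonde algorithm (e.g., Björck-Pereyra), which is the kind of specialized solver that stays accurate at high order.

```python
import numpy as np
from math import factorial

# FD weights w solve the (transposed) Vandermonde system
#   sum_i w_i * x_i^k = k! * delta(k, m)   for k = 0..n-1,
# where m is the derivative order and x_i are the stencil nodes.
def fd_weights(nodes, m):
    n = len(nodes)
    V = np.vander(nodes, n, increasing=True).T   # V[k, i] = x_i**k
    b = np.zeros(n)
    b[m] = factorial(m)
    return np.linalg.solve(V, b)

w = fd_weights(np.array([-2.0, -1.0, 0.0, 1.0, 2.0]), m=2)
# -> [-1/12, 4/3, -5/2, 4/3, -1/12], the classic 5-point second-derivative stencil
```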
Disconnected Diagrams in Lattice QCD
NASA Astrophysics Data System (ADS)
Gambhir, Arjun Singh
In this work, we present state-of-the-art numerical methods and their applications for computing a particular class of observables using lattice quantum chromodynamics (Lattice QCD), a discretized version of the fundamental theory of quarks and gluons. These observables require calculating so called "disconnected diagrams" and are important for understanding many aspects of hadron structure, such as the strange content of the proton. We begin by introducing the reader to the key concepts of Lattice QCD and rigorously define the meaning of disconnected diagrams through an example of the Wick contractions of the nucleon. Subsequently, the calculation of observables requiring disconnected diagrams is posed as the computationally challenging problem of finding the trace of the inverse of an incredibly large, sparse matrix. This is followed by a brief primer of numerical sparse matrix techniques that overviews broadly used methods in Lattice QCD and builds the background for the novel algorithm presented in this work. We then introduce singular value deflation as a method to improve convergence of trace estimation and analyze its effects on matrices from a variety of fields, including chemical transport modeling, magnetohydrodynamics, and QCD. Finally, we apply this method to compute observables such as the strange axial charge of the proton and strange sigma terms in light nuclei. The work in this thesis is innovative for four reasons. First, we analyze the effects of deflation with a model that makes qualitative predictions about its effectiveness, taking only the singular value spectrum as input, and compare deflated variance with different types of trace estimator noise. Second, the synergy between probing methods and deflation is investigated both experimentally and theoretically. Third, we use the synergistic combination of deflation and a graph coloring algorithm known as hierarchical probing to conduct a lattice calculation of light disconnected matrix elements of the nucleon at two different values of the lattice spacing. Finally, we employ these algorithms to do a high-precision study of strange sigma terms in light nuclei; to our knowledge this is the first calculation of its kind from Lattice QCD.
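The core numerical task, estimating tr(A⁻¹) stochastically, can be sketched with a Hutchinson-style estimator; the tridiagonal stand-in below replaces the lattice Dirac matrix, and the deflation and hierarchical-probing refinements that are the thesis's contributions are omitted.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# Hutchinson estimator: E[z^T A^{-1} z] = tr(A^{-1}) for Rademacher z.
n = 3000
A = sp.diags([-1.0, 4.0, -1.0], [-1, 0, 1], shape=(n, n), format="csc")
solve = spla.factorized(A)                  # stand-in for the lattice solver

rng = np.random.default_rng(8)
samples = [z @ solve(z) for z in rng.choice([-1.0, 1.0], size=(100, n))]
trace_est = np.mean(samples)
stderr = np.std(samples) / np.sqrt(len(samples))
```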
Computing the Moore-Penrose Inverse of a Matrix with a Computer Algebra System
ERIC Educational Resources Information Center
Schmidt, Karsten
2008-01-01
In this paper "Derive" functions are provided for the computation of the Moore-Penrose inverse of a matrix, as well as for solving systems of linear equations by means of the Moore-Penrose inverse. Making it possible to compute the Moore-Penrose inverse easily with one of the most commonly used Computer Algebra Systems--and to have the blueprint…
A Subspace Pursuit–based Iterative Greedy Hierarchical Solution to the Neuromagnetic Inverse Problem
Babadi, Behtash; Obregon-Henao, Gabriel; Lamus, Camilo; Hämäläinen, Matti S.; Brown, Emery N.; Purdon, Patrick L.
2013-01-01
Magnetoencephalography (MEG) is an important non-invasive method for studying activity within the human brain. Source localization methods can be used to estimate spatiotemporal activity from MEG measurements with high temporal resolution, but the spatial resolution of these estimates is poor due to the ill-posed nature of the MEG inverse problem. Recent developments in source localization methodology have emphasized temporal as well as spatial constraints to improve source localization accuracy, but these methods can be computationally intense. Solutions emphasizing spatial sparsity hold tremendous promise, since the underlying neurophysiological processes generating MEG signals are often sparse in nature, whether in the form of focal sources, or distributed sources representing large-scale functional networks. Recent developments in the theory of compressed sensing (CS) provide a rigorous framework to estimate signals with sparse structure. In particular, a class of CS algorithms referred to as greedy pursuit algorithms can provide both high recovery accuracy and low computational complexity. Greedy pursuit algorithms are difficult to apply directly to the MEG inverse problem because of the high-dimensional structure of the MEG source space and the high spatial correlation in MEG measurements. In this paper, we develop a novel greedy pursuit algorithm for sparse MEG source localization that overcomes these fundamental problems. This algorithm, which we refer to as the Subspace Pursuit-based Iterative Greedy Hierarchical (SPIGH) inverse solution, exhibits very low computational complexity while achieving very high localization accuracy. We evaluate the performance of the proposed algorithm using comprehensive simulations, as well as the analysis of human MEG data during spontaneous brain activity and somatosensory stimuli. These studies reveal substantial performance gains provided by the SPIGH algorithm in terms of computational complexity, localization accuracy, and robustness. PMID:24055554
Efficient convolutional sparse coding
Wohlberg, Brendt
2017-06-20
Computationally efficient algorithms may be applied for fast dictionary learning solving the convolutional sparse coding problem in the Fourier domain. More specifically, efficient convolutional sparse coding may be derived within an alternating direction method of multipliers (ADMM) framework that utilizes fast Fourier transforms (FFT) to solve the main linear system in the frequency domain. Such algorithms may enable a significant reduction in computational cost over conventional approaches by implementing a linear solver for the most critical and computationally expensive component of the conventional iterative algorithm. The theoretical computational cost of the algorithm may be reduced from O(M³N) to O(MN log N), where N is the dimensionality of the data and M is the number of elements in the dictionary. This significant improvement in efficiency may greatly increase the range of problems that can practically be addressed via convolutional sparse representations.
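A single-filter toy of the frequency-domain trick: the ADMM x-update becomes a per-frequency scalar division. Filter length, penalty parameters, and signal are placeholders, and the full multi-filter algorithm needs an additional step (e.g., Sherman-Morrison) per frequency.

```python
import numpy as np

# ADMM for min 0.5*||d (*) x - s||^2 + lam*||x||_1 with circular convolution:
# the x-update (conj(Dhat)*Dhat + rho) xhat = conj(Dhat)*shat + rho*(zhat - uhat)
# is diagonal per frequency, so it costs FFTs instead of a large linear solve.
rng = np.random.default_rng(9)
N = 256
d = np.zeros(N)
d[:8] = rng.normal(size=8)                           # one short dictionary filter
s = np.convolve(rng.normal(size=N), d, mode="same")  # stand-in signal to be coded
rho, lam = 1.0, 0.1

Dhat, shat = np.fft.fft(d), np.fft.fft(s)
x, z, u = np.zeros(N), np.zeros(N), np.zeros(N)
for _ in range(50):
    rhs = np.conj(Dhat) * shat + rho * np.fft.fft(z - u)
    x = np.real(np.fft.ifft(rhs / (np.abs(Dhat)**2 + rho)))        # frequency-domain solve
    z = np.sign(x + u) * np.maximum(np.abs(x + u) - lam / rho, 0)  # soft threshold
    u += x - z                                                     # dual update
```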
Direction of Arrival Estimation for MIMO Radar via Unitary Nuclear Norm Minimization
Wang, Xianpeng; Huang, Mengxing; Wu, Xiaoqin; Bi, Guoan
2017-01-01
In this paper, we consider the direction of arrival (DOA) estimation issue of noncircular (NC) source in multiple-input multiple-output (MIMO) radar and propose a novel unitary nuclear norm minimization (UNNM) algorithm. In the proposed method, the noncircular properties of signals are used to double the virtual array aperture, and the real-valued data are obtained by utilizing unitary transformation. Then a real-valued block sparse model is established based on a novel over-complete dictionary, and a UNNM algorithm is formulated for recovering the block-sparse matrix. In addition, the real-valued NC-MUSIC spectrum is used to design a weight matrix for reweighting the nuclear norm minimization to achieve the enhanced sparsity of solutions. Finally, the DOA is estimated by searching the non-zero blocks of the recovered matrix. Because of using the noncircular properties of signals to extend the virtual array aperture and an additional real structure to suppress the noise, the proposed method provides better performance compared with the conventional sparse recovery based algorithms. Furthermore, the proposed method can handle the case of underdetermined DOA estimation. Simulation results show the effectiveness and advantages of the proposed method. PMID:28441770
Track monitoring from the dynamic response of a passing train: A sparse approach
NASA Astrophysics Data System (ADS)
Lederman, George; Chen, Siheng; Garrett, James H.; Kovačević, Jelena; Noh, Hae Young; Bielak, Jacobo
2017-06-01
Collecting vibration data from revenue service trains could be a low-cost way to more frequently monitor railroad tracks, yet operational variability makes robust analysis a challenge. We propose a novel analysis technique for track monitoring that exploits the sparsity inherent in train-vibration data. This sparsity is based on the observation that large vertical train vibrations typically involve the excitation of the train's fundamental mode due to track joints, switchgear, or other discrete hardware. Rather than try to model the entire rail profile, in this study we examine a sparse approach to solving an inverse problem where (1) the roughness is constrained to a discrete and limited set of "bumps"; and (2) the train system is idealized as a simple damped oscillator that models the train's vibration in the fundamental mode. We use an expectation maximization (EM) approach to iteratively solve for the track profile and the train system properties, using orthogonal matching pursuit (OMP) to find the sparse approximation within each step. By enforcing sparsity, the inverse problem is well posed and the train's position can be found relative to the sparse bumps, thus reducing the uncertainty in the GPS data. We validate the sparse approach on two sections of track monitored from an operational train over a 16 month period of time, one where track changes did not occur during this period and another where changes did occur. We show that this approach can not only detect when track changes occur, but also offers insight into the type of such changes.
Improved Quasi-Newton method via PSB update for solving systems of nonlinear equations
NASA Astrophysics Data System (ADS)
Mamat, Mustafa; Dauda, M. K.; Waziri, M. Y.; Ahmad, Fadhilah; Mohamad, Fatma Susilawati
2016-10-01
The Newton method has some shortcomings, including the computation of the Jacobian matrix, which may be difficult or even impossible, and the need to solve the Newton system at every iteration. A common setback of some quasi-Newton methods is that they must compute and store an n × n matrix at each iteration, which is computationally costly for large-scale problems. To overcome such drawbacks, an improved method for solving systems of nonlinear equations via the PSB (Powell-Symmetric-Broyden) update is proposed. In the proposed method, the approximate inverse Jacobian Hk of the PSB scheme is updated; its efficiency is thereby improved and it requires low memory storage, which is the main aim of this paper. Preliminary numerical results show that the proposed method is practically efficient when applied to some benchmark problems.
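For reference, one direct-form PSB update is a two-line formula; the paper works with the inverse approximation Hk instead, for which the analogous update applies, so this is illustration rather than the proposed method.

```python
import numpy as np

# Powell-symmetric-Broyden update of a Jacobian approximation B:
#   B+ = B + (r s^T + s r^T)/(s^T s) - ((r^T s)/(s^T s)^2) * s s^T,
# where r = y - B s, s = x_new - x_old, y = F(x_new) - F(x_old).
def psb_update(B, s, y):
    r = y - B @ s
    ss = s @ s
    return (B + (np.outer(r, s) + np.outer(s, r)) / ss
              - (r @ s) / ss**2 * np.outer(s, s))
```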
Acceleration of Linear Finite-Difference Poisson-Boltzmann Methods on Graphics Processing Units.
Qi, Ruxi; Botello-Smith, Wesley M; Luo, Ray
2017-07-11
Electrostatic interactions play crucial roles in biophysical processes such as protein folding and molecular recognition. Poisson-Boltzmann equation (PBE)-based models have become widely used in modeling these important processes. Though great effort has been put into developing efficient PBE numerical models, challenges remain due to the high dimensionality of typical biomolecular systems. In this study, we implemented and analyzed commonly used linear PBE solvers on the ever-improving graphics processing units (GPUs) for biomolecular simulations, including both standard and preconditioned conjugate gradient (CG) solvers with several alternative preconditioners. Our implementation utilizes the standard Nvidia CUDA libraries cuSPARSE, cuBLAS, and CUSP. Extensive tests show that good numerical accuracy can be achieved even though single precision is often used for numerical applications on GPU platforms. The best GPU performance was observed with the Jacobi-preconditioned CG solver, with a significant speedup over the standard CG solver on CPU in our diversified test cases. Our analysis further shows that different matrix storage formats considerably affect the efficiency of different linear PBE solvers on GPU, with the diagonal format best suited to our standard finite-difference linear systems. Further efficiency may be possible with matrix-free operations and an integrated grid-stencil setup specifically tailored to the banded matrices in PBE-specific linear systems.
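A CPU-side sketch of the winning combination, Jacobi-preconditioned CG on a banded finite-difference system; SciPy stands in for the cuSPARSE/cuBLAS implementation, and the 5-band matrix is a generic stand-in for the PBE discretization.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import cg, LinearOperator

# Banded, diagonally dominant SPD system mimicking a finite-difference grid.
n = 10000
A = sp.diags([-1.0, -1.0, 4.0, -1.0, -1.0], [-100, -1, 0, 1, 100],
             shape=(n, n), format="csr")
b = np.ones(n)

d_inv = 1.0 / A.diagonal()
M = LinearOperator((n, n), matvec=lambda x: d_inv * x)  # Jacobi preconditioner
x, info = cg(A, b, M=M)                                 # info == 0 on convergence
```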
Xia, J.; Miller, R.D.; Xu, Y.
2008-01-01
Inversion of multimode surface-wave data is of increasing interest in the near-surface geophysics community. For a given near-surface geophysical problem, it is essential to understand how well the data, calculated according to a layered-earth model, might match the observed data. A data-resolution matrix is a function of the data kernel (determined by a geophysical model and a priori information applied to the problem), not the data. A data-resolution matrix of high-frequency (>2 Hz) Rayleigh-wave phase velocities therefore offers a quantitative tool for designing field surveys and predicting the match between calculated and observed data. We employed a data-resolution matrix to select data that would be well predicted, and we found advantages to incorporating higher modes in inversion. The resulting discussion using the data-resolution matrix provides insight into the process of inverting Rayleigh-wave phase velocities with higher-mode data to estimate S-wave velocity structure. The discussion also suggests that each near-surface geophysical target can only be resolved using Rayleigh-wave phase velocities within specific frequency ranges, and that higher-mode data are normally more accurately predicted than fundamental-mode data because of restrictions on the data kernel of the inversion system. We used synthetic and real-world examples to demonstrate that data selected with the data-resolution matrix can provide better inversion results, and to explain, via the data-resolution matrix, why incorporating higher-mode data in inversion yields better results. We also calculated model-resolution matrices in these examples to show the potential for increasing model resolution with selected surface-wave data. © Birkhäuser 2008.
Zhao, De; He, Zhongyuan; Wang, Gang; Wang, Hongzhi; Zhang, Qinghong; Li, Yaogang
2016-09-15
Microfluidic technology plays a significant role in separating biomolecules because of its miniaturization, integration, and automation. Introducing micro/nanostructured functional materials can improve the properties of microfluidic devices and extend their applications. Inverse opal has a three-dimensional ordered net-like structure; it possesses a large surface area and exhibits good mass transport, making it a good candidate for bio-separation. This study exploits inverse opal titanium dioxide-zirconium dioxide films for on-chip phosphopeptide enrichment. Titanium dioxide-zirconium dioxide inverse opal film-based microfluidic devices were constructed from templates of 270-, 340-, and 370-nm-diameter poly(methyl methacrylate) spheres. The phosphopeptide enrichment of these devices was determined by matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry. The device constructed from the 270-nm-diameter sphere template exhibited the best overall phosphopeptide enrichment of the three: because its opal template was the smallest, its inverse opal film had the smallest pore sizes and the largest surface area. Enrichment by this device was also better than that of similar devices based on nanoparticle films and single-component films. The titanium dioxide-zirconium dioxide inverse opal film-based device provides a promising approach for the efficient separation of various biomolecules.
An ambiguity of information content and error in an ill-posed satellite inversion
NASA Astrophysics Data System (ADS)
Koner, Prabhat
According to Rodgers (2000, stochastic approach), the averaging kernel (AK) is the representational matrix for understanding the information content in a stochastic inversion. In the deterministic approach, on the other hand, this matrix is referred to as the model resolution matrix (MRM; Menke 1989). Analysis of the AK/MRM can only give some understanding of how much regularization is imposed on the inverse problem. The trace of the AK/MRM matrix is the so-called degrees of freedom for signal (DFS; stochastic) or degrees of freedom in retrieval (DFR; deterministic). There is no physical/mathematical explanation in the literature of why the trace of this matrix is a valid way to calculate this quantity. We will present an ambiguity between information and error using a real-life problem of SST retrieval from GOES-13. The stochastic information content calculation is based on a linear assumption; the validity of such mathematics in satellite inversion will be questioned, because the problem rests on nonlinear radiative transfer and ill-conditioned inverse problems. References: Menke, W., 1989: Geophysical Data Analysis: Discrete Inverse Theory. San Diego: Academic Press. Rodgers, C.D., 2000: Inverse Methods for Atmospheric Sounding: Theory and Practice. Singapore: World Scientific.
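The quantity under discussion is easy to write down for a linearized retrieval; the sketch below computes the AK/MRM and its trace for a toy Jacobian (all matrices are illustrative, not a GOES-13 configuration).

```python
import numpy as np

# Linearized retrieval: AK = (K^T Se^{-1} K + R)^{-1} K^T Se^{-1} K,
# with R the regularization (the a priori inverse covariance in the
# stochastic view); DFS/DFR is then trace(AK).
rng = np.random.default_rng(10)
K = rng.normal(size=(12, 4))          # Jacobian: 12 channels, 4 state elements
Se_inv = np.eye(12) / 0.25**2         # observation-error precision
R = np.eye(4) * 2.0                   # regularization / prior precision

G = np.linalg.solve(K.T @ Se_inv @ K + R, K.T @ Se_inv)   # gain matrix
AK = G @ K                                                 # averaging kernel / MRM
dfs = np.trace(AK)                     # the contested "degrees of freedom"
```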
Arikan and Alamouti matrices based on fast block-wise inverse Jacket transform
NASA Astrophysics Data System (ADS)
Lee, Moon Ho; Khan, Md Hashem Ali; Kim, Kyeong Jin
2013-12-01
Recently, Lee and Hou (IEEE Signal Process Lett 13:461-464, 2006) proposed one-dimensional and two-dimensional fast algorithms for block-wise inverse Jacket transforms (BIJTs). Their BIJTs are not true inverse Jacket transforms from a mathematical point of view, because their inverses do not satisfy the usual condition, i.e., the multiplication of a matrix with its inverse matrix is not equal to the identity matrix. Therefore, we mathematically propose a fast block-wise inverse Jacket transform of orders N = 2^k, 3^k, 5^k, and 6^k, where k is a positive integer. Based on the Kronecker product of the successive lower-order Jacket matrices and the basis matrix, fast algorithms for realizing these transforms are obtained. Due to the simple inverses and fast algorithms of the Arikan polar binary and Alamouti multiple-input multiple-output (MIMO) non-binary matrices obtained from BIJTs, they can be applied in areas such as the 3GPP physical layer for ultra mobile broadband permutation matrix design, first-order q-ary Reed-Muller code design, diagonal channel design, diagonal subchannel decomposition for interference alignment, and 4G MIMO long-term evolution Alamouti precoding design.
NASA Astrophysics Data System (ADS)
Aghamaleki, Javad Abbasi; Behrad, Alireza
2018-01-01
Double compression detection is a crucial stage in digital image and video forensics. However, detecting double compressed videos is challenging when the video forger uses the same quantization matrix and a synchronized group of pictures (GOP) structure during recompression to conceal tampering effects. A passive approach is proposed for detecting double compressed MPEG videos with the same quantization matrix and synchronized GOP structure. To devise the proposed algorithm, the effects of recompression on P frames are mathematically studied. Then, based on the obtained guidelines, a feature vector is proposed to detect double compressed frames at the GOP level. Subsequently, sparse representations of the feature vectors are used for dimensionality reduction and to enrich the traces of recompression. Finally, a support vector machine classifier is employed to detect and localize double compression in the temporal domain. The experimental results show that the proposed algorithm achieves an accuracy of more than 95%. In addition, comparisons of the results of the proposed method with those of other methods reveal the efficiency of the proposed algorithm.
Kussmann, Jörg; Ochsenfeld, Christian
2007-11-28
A density matrix-based time-dependent self-consistent field (D-TDSCF) method for the calculation of dynamic polarizabilities and first hyperpolarizabilities using the Hartree-Fock and Kohn-Sham density functional theory approaches is presented. The D-TDSCF method allows us to reduce the asymptotic scaling behavior of the computational effort from cubic to linear for systems with a nonvanishing band gap. The linear scaling is achieved by combining a density matrix-based reformulation of the TDSCF equations with linear-scaling schemes for the formation of Fock- or Kohn-Sham-type matrices. In our reformulation only potentially linear-scaling matrices enter the formulation and efficient sparse algebra routines can be employed. Furthermore, the corresponding formulas for the first hyperpolarizabilities are given in terms of zeroth- and first-order one-particle reduced density matrices according to Wigner's (2n+1) rule. The scaling behavior of our method is illustrated for first exemplary calculations with systems of up to 1011 atoms and 8899 basis functions.
MARE2DEM: a 2-D inversion code for controlled-source electromagnetic and magnetotelluric data
NASA Astrophysics Data System (ADS)
Key, Kerry
2016-10-01
This work presents MARE2DEM, a freely available code for 2-D anisotropic inversion of magnetotelluric (MT) data and frequency-domain controlled-source electromagnetic (CSEM) data from onshore and offshore surveys. MARE2DEM parametrizes the inverse model using a grid of arbitrarily shaped polygons, where unstructured triangular or quadrilateral grids are typically used due to their ease of construction. Unstructured grids provide significantly more geometric flexibility and parameter efficiency than the structured rectangular grids commonly used by most other inversion codes. Transmitter and receiver components located on topographic slopes can be tilted parallel to the boundary so that the simulated electromagnetic fields accurately reproduce the real survey geometry. The forward solution is implemented with a goal-oriented adaptive finite-element method that automatically generates and refines unstructured triangular element grids that conform to the inversion parameter grid, ensuring accurate responses as the model conductivity changes. This dual-grid approach is significantly more efficient than the conventional use of a single grid for both the forward and inverse meshes since the more detailed finite-element meshes required for accurate responses do not increase the memory requirements of the inverse problem. Forward solutions are computed in parallel with a highly efficient scaling by partitioning the data into smaller independent modeling tasks consisting of subsets of the input frequencies, transmitters and receivers. Non-linear inversion is carried out with a new Occam inversion approach that requires fewer forward calls. Dense matrix operations are optimized for memory and parallel scalability using the ScaLAPACK parallel library. Free parameters can be bounded using a new non-linear transformation that leaves the transformed parameters nearly the same as the original parameters within the bounds, thereby reducing non-linear smoothing effects. Data balancing normalization weights for the joint inversion of two or more data sets encourages the inversion to fit each data type equally well. A synthetic joint inversion of marine CSEM and MT data illustrates the algorithm's performance and parallel scaling on up to 480 processing cores. CSEM inversion of data from the Middle America Trench offshore Nicaragua demonstrates a real world application. The source code and MATLAB interface tools are freely available at http://mare2dem.ucsd.edu.
Sparse-View Ultrasound Diffraction Tomography Using Compressed Sensing with Nonuniform FFT
2014-01-01
Accurate reconstruction of the object from sparse-view sampling data is a challenging problem for ultrasound diffraction tomography (UDT). In this paper, we present a reconstruction method based on the compressed sensing framework for sparse-view UDT. Due to the piecewise-uniform characteristics of anatomical structures, the total variation is introduced into the cost function to find a more faithful sparse representation of the object. The inverse problem of UDT is iteratively solved by conjugate gradient with the nonuniform fast Fourier transform. Simulation results show the effectiveness of the proposed method: the main characteristics of the object can be properly represented with only 16 views. Compared to interpolation and the multiband method, the proposed method provides higher resolution and lower artifacts with the same number of views. The robustness to noise and the computational complexity are also discussed. PMID:24868241
Task Parallel Incomplete Cholesky Factorization using 2D Partitioned-Block Layout
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kim, Kyungjoo; Rajamanickam, Sivasankaran; Stelle, George Widgery
We introduce a task-parallel algorithm for sparse incomplete Cholesky factorization that utilizes a 2D sparse partitioned-block layout of a matrix. Our factorization algorithm follows the idea of algorithms-by-blocks by using the block layout. The algorithm-by-blocks approach induces a task graph for the factorization. These tasks are inter-related to each other through their data dependences in the factorization algorithm. To process the tasks on various manycore architectures in a portable manner, we also present a portable tasking API that incorporates different tasking backends and device-specific features using an open-source framework for manycore platforms, i.e., Kokkos. A performance evaluation is presented on both Intel Sandybridge and Xeon Phi platforms for matrices from the University of Florida sparse matrix collection to illustrate the merits of the proposed task-based factorization. Experimental results demonstrate that our task-parallel implementation delivers about 26.6x speedup (geometric mean) over single-threaded incomplete Cholesky-by-blocks and 19.2x speedup over serial Cholesky performance, which does not carry tasking overhead, using 56 threads on the Intel Xeon Phi processor for sparse matrices arising from various application problems.
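A minimal sketch of the algorithms-by-blocks idea, using a dense right-looking blocked Cholesky so that each block operation reads as one task with explicit dependences; the paper's sparse, incomplete, Kokkos-scheduled version is not reproduced here.

```python
import numpy as np

# Dense right-looking blocked Cholesky written so each block update is one
# "task" (POTRF/TRSM/SYRK-style). The block layout induces the task graph:
# each solve depends on the diagonal factorization, each trailing update on
# two solves. Block size bs is an illustrative assumption.
def cholesky_by_blocks(A, bs):
    n = A.shape[0]
    L = np.tril(A.copy())
    for k in range(0, n, bs):
        kk = slice(k, k + bs)
        L[kk, kk] = np.linalg.cholesky(L[kk, kk])           # task: factor diagonal block
        for i in range(k + bs, n, bs):
            ii = slice(i, i + bs)                           # task: triangular solve,
            L[ii, kk] = L[ii, kk] @ np.linalg.inv(L[kk, kk]).T  # depends on diag task
        for i in range(k + bs, n, bs):
            ii = slice(i, i + bs)
            for j in range(k + bs, i + bs, bs):
                jj = slice(j, j + bs)                       # task: trailing update,
                L[ii, jj] -= L[ii, kk] @ L[jj, kk].T        # depends on two solves
    return np.tril(L)

rng = np.random.default_rng(1)
M = rng.standard_normal((8, 8))
A = M @ M.T + 8 * np.eye(8)
L = cholesky_by_blocks(A, bs=2)
assert np.allclose(L @ L.T, A)
```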
Wen, Zaidao; Hou, Zaidao; Jiao, Licheng
2017-11-01
The discriminative dictionary learning (DDL) framework has been widely used in image classification; it aims to learn some class-specific feature vectors as well as a representative dictionary according to a set of labeled training samples. However, interclass similarities and intraclass variances among input samples and learned features will generally weaken the representability of the dictionary and the discrimination of the feature vectors, and so degrade the classification performance. Therefore, how to explicitly represent them becomes an important issue. In this paper, we present a novel DDL framework with a two-level low-rank and group-sparse decomposition model. In the first level, we learn a class-shared and several class-specific dictionaries, where a low-rank and a group-sparse regularization are, respectively, imposed on the corresponding feature matrices. In the second level, the class-specific feature matrix is further decomposed into a low-rank and a sparse matrix so that intraclass variances can be separated to concentrate the corresponding feature vectors. Extensive experimental results demonstrate the effectiveness of our model. Compared with other state-of-the-art methods on several popular image databases, our model achieves a competitive or better performance in terms of classification accuracy.
Nonlocal low-rank and sparse matrix decomposition for spectral CT reconstruction
NASA Astrophysics Data System (ADS)
Niu, Shanzhou; Yu, Gaohang; Ma, Jianhua; Wang, Jing
2018-02-01
Spectral computed tomography (CT) has been a promising technique in research and clinics because of its ability to produce improved energy resolution images with narrow energy bins. However, the narrow energy bin image is often affected by serious quantum noise because of the limited number of photons used in the corresponding energy bin. To address this problem, we present an iterative reconstruction method for spectral CT using nonlocal low-rank and sparse matrix decomposition (NLSMD), which exploits the self-similarity of patches that are collected in multi-energy images. Specifically, each set of patches can be decomposed into a low-rank component and a sparse component, and the low-rank component represents the stationary background over different energy bins, while the sparse component represents the rest of the different spectral features in individual energy bins. Subsequently, an effective alternating optimization algorithm was developed to minimize the associated objective function. To validate and evaluate the NLSMD method, qualitative and quantitative studies were conducted by using simulated and real spectral CT data. Experimental results show that the NLSMD method improves spectral CT images in terms of noise reduction, artifact suppression and resolution preservation.
Parallel heterogeneous architectures for efficient OMP compressive sensing reconstruction
NASA Astrophysics Data System (ADS)
Kulkarni, Amey; Stanislaus, Jerome L.; Mohsenin, Tinoosh
2014-05-01
Compressive Sensing (CS) is a novel scheme in which a signal that is sparse in a known transform domain can be reconstructed using fewer samples. The signal reconstruction techniques are computationally intensive and have sluggish performance, which makes them impractical for real-time processing applications. This paper presents novel architectures for the Orthogonal Matching Pursuit (OMP) algorithm, one of the popular CS reconstruction algorithms. We show implementation results of the proposed architectures on FPGA, ASIC, and a custom many-core platform. For the FPGA and ASIC implementations, a novel thresholding method is used to reduce the processing time for the optimization problem by at least 25%. For the custom many-core platform, efficient parallelization techniques are applied to reconstruct signals with varying signal lengths N and sparsity m. The algorithm is divided into three kernels. Each kernel is parallelized to reduce execution time, whereas efficient reuse of the matrix operators allows us to reduce area. Matrix operations are efficiently parallelized by taking advantage of blocked algorithms. For demonstration purposes, all architectures reconstruct a 256-length signal with maximum sparsity of 8 using 64 measurements. The implementation on a Xilinx Virtex-5 FPGA requires 27.14 μs to reconstruct the signal using basic OMP, and 18 μs with the thresholding method. The ASIC implementation reconstructs the signal in 13 μs. Our custom many-core, operating at 1.18 GHz, takes 18.28 μs to complete. Our results show that, compared to previously published work on the same algorithm and matrix size, the proposed FPGA and ASIC architectures perform 1.3x and 1.8x faster, respectively. Also, the proposed many-core implementation performs 3000x faster than the CPU and 2000x faster than the GPU.
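For orientation, a plain NumPy sketch of the basic OMP iteration these architectures implement, with the paper's demonstration sizes (N = 256, sparsity 8, 64 measurements); the hardware thresholding and parallelization are not modeled.

```python
import numpy as np

# Basic Orthogonal Matching Pursuit: greedily grow the support by picking the
# atom most correlated with the residual, then re-fit by least squares.
def omp(Phi, y, sparsity):
    residual, support = y.copy(), []
    for _ in range(sparsity):
        # correlate residual with all atoms; pick the best-matching column
        idx = int(np.argmax(np.abs(Phi.T @ residual)))
        support.append(idx)
        # least-squares fit on the current support, then update the residual
        coef, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        residual = y - Phi[:, support] @ coef
    x = np.zeros(Phi.shape[1])
    x[support] = coef
    return x

rng = np.random.default_rng(2)
Phi = rng.standard_normal((64, 256)) / 8.0      # 64 measurements, N = 256
x_true = np.zeros(256)
x_true[rng.choice(256, 8, replace=False)] = rng.standard_normal(8)
x_hat = omp(Phi, Phi @ x_true, sparsity=8)
print("max recovery error:", np.max(np.abs(x_hat - x_true)))
```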
NASA Astrophysics Data System (ADS)
Xue, Zhaohui; Du, Peijun; Li, Jun; Su, Hongjun
2017-02-01
The generally limited availability of training data relative to the usually high data dimension poses a great challenge to accurate classification of hyperspectral imagery, especially for identifying crops characterized by highly correlated spectra. Traditional parametric classification models are problematic due to the need for non-singular class-specific covariance matrices. In this research, a novel sparse graph regularization (SGR) method is presented, aiming at robust crop mapping using hyperspectral imagery with very few in situ data. The core of SGR lies in propagating labels from known data to unknown, which is driven by: (1) the fraction matrix generated for the large set of unknown data by using an effective sparse representation algorithm with respect to the few training data serving as the dictionary; and (2) the prediction function estimated for the few training data by formulating a regularization model based on a sparse graph. The labels of the large set of unknown data can then be obtained by maximizing the posterior probability distribution based on these two ingredients. SGR is discriminative, data-adaptive, robust to noise, and efficient, which makes it unique with regard to previously proposed approaches, and it has high potential for discriminating crops, especially when facing insufficient training data and a high-dimensional spectral space. The study area is located at the Zhangye basin in the middle reaches of the Heihe watershed, Gansu, China, where eight crop types were mapped with Compact Airborne Spectrographic Imager (CASI) and Shortwave Infrared Airborne Spectrographic Imager (SASI) hyperspectral data. Experimental results demonstrate that the proposed method significantly outperforms other traditional and state-of-the-art methods.
Ziyatdinov, Andrey; Vázquez-Santiago, Miquel; Brunel, Helena; Martinez-Perez, Angel; Aschard, Hugues; Soria, Jose Manuel
2018-02-27
Quantitative trait locus (QTL) mapping in genetic data often involves analysis of correlated observations, which need to be accounted for to avoid false association signals. This is commonly performed by modeling such correlations as random effects in linear mixed models (LMMs). The R package lme4 is a well-established tool that implements major LMM features using sparse matrix methods; however, it is not fully adapted for QTL mapping association and linkage studies. In particular, two LMM features are lacking in the base version of lme4: the definition of random effects by custom covariance matrices, and parameter constraints, which are essential in advanced QTL models. Apart from applications in linkage studies of related individuals, such functionalities are of high interest for association studies in situations where multiple covariance matrices need to be modeled, a scenario not covered by many genome-wide association study (GWAS) software packages. To address the aforementioned limitations, we developed a new R package, lme4qtl, as an extension of lme4. First, lme4qtl contributes new models for genetic studies within a single tool integrated with lme4 and its companion packages. Second, lme4qtl offers a flexible framework for scenarios with multiple levels of relatedness and becomes efficient when covariance matrices are sparse. We showed the value of our package using real family-based data in the Genetic Analysis of Idiopathic Thrombophilia 2 (GAIT2) project. Our software lme4qtl enables QTL mapping models with a versatile structure of random effects and efficient computation for sparse covariances. lme4qtl is available at https://github.com/variani/lme4qtl.
Optimal parallel solution of sparse triangular systems
NASA Technical Reports Server (NTRS)
Alvarado, Fernando L.; Schreiber, Robert
1990-01-01
A method for the parallel solution of triangular sets of equations is described that is appropriate when there are many right-hand sides. By preprocessing, the method can reduce the number of parallel steps required to solve Lx = b compared with parallel forward- or back-substitution. Applications are to iterative solvers with triangular preconditioners, to structural analysis, and to power systems applications, where there may be many right-hand sides (not all available a priori). The inverse of L is represented as a product of sparse triangular factors. The problem is to find a factored representation of this inverse of L with the smallest number of factors (or partitions), subject to the requirement that no new nonzero elements be created in the formation of these inverse factors. A method from an earlier reference is shown to solve this problem. This method is improved upon by constructing a permutation of the rows and columns of L that preserves triangularity and allows for the best possible such partition. A number of practical examples and algorithmic details are presented. The parallelism attainable is illustrated by means of elimination trees and clique trees.
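A small sketch of the factored-inverse idea under the stated no-fill requirement, for a unit lower-triangular L; the greedy column grouping is an illustrative assumption rather than the paper's optimal partitioning algorithm.

```python
import numpy as np

# Column-wise, a unit lower-triangular L factors as
# L = (I + C_1)(I + C_2)...(I + C_p), where C_i holds the strictly-lower
# columns of partition i. If each partition satisfies C_i @ C_i = 0 (no
# fill-in), then (I + C_i)^{-1} = I - C_i is exactly as sparse as C_i, and
# solving Lx = b becomes a short chain of sparse matrix-vector products.
def no_fill_partitions(L):
    parts, current = [], []
    for j in range(L.shape[0]):
        # column j may join the group only if it has no nonzero in a row
        # indexed by a column already in the group (the no-fill condition)
        if any(L[j, k] != 0 for k in current):
            parts.append(current)
            current = []
        current.append(j)
    parts.append(current)
    return parts

def inverse_factors(L, parts):
    n = L.shape[0]
    strict = np.tril(L, -1)
    factors = []
    for p in parts:
        C = np.zeros_like(L)
        C[:, p] = strict[:, p]           # strictly-lower columns of this group
        factors.append(np.eye(n) - C)    # exact inverse factor, since C @ C = 0
    return factors

rng = np.random.default_rng(3)
mask = rng.random((6, 6)) < 0.4
L = np.tril(rng.standard_normal((6, 6)) * mask, -1) + np.eye(6)  # unit diagonal
factors = inverse_factors(L, no_fill_partitions(L))
x_true = rng.standard_normal(6)
b = L @ x_true
for F in factors:                        # x = (I - C_p) ... (I - C_1) b
    b = F @ b
assert np.allclose(b, x_true)
print("solved with", len(factors), "inverse factors")
```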
Estimation of near-surface shear-wave velocity by inversion of Rayleigh waves
Xia, J.; Miller, R.D.; Park, C.B.
1999-01-01
The shear-wave (S-wave) velocity of near-surface materials (soil, rocks, pavement) and its effect on seismic-wave propagation are of fundamental interest in many groundwater, engineering, and environmental studies. The Rayleigh-wave phase velocity of a layered-earth model is a function of frequency and four groups of earth properties: P-wave velocity, S-wave velocity, density, and thickness of layers. Analysis of the Jacobian matrix provides a measure of dispersion-curve sensitivity to earth properties. S-wave velocities are the dominant influence on a dispersion curve in the high-frequency range (>5 Hz), followed by layer thickness. Iterative solutions to the weighted equation by the Levenberg-Marquardt and singular-value decomposition techniques are derived to estimate near-surface shear-wave velocity, and proved very effective in the high-frequency range. Convergence of the weighted solution is guaranteed through selection of the damping factor using the Levenberg-Marquardt method. Synthetic and real examples demonstrate the calculation efficiency and stability of the inverse procedure. The inverse results of the real example are verified by borehole S-wave velocity measurements.
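A toy sketch of the damped, weighted iteration described above: one Levenberg-Marquardt style update delta_m = (J^T W J + lam I)^(-1) J^T W r, applied here to a linear stand-in for the dispersion-curve forward model (all names are illustrative).

```python
import numpy as np

# One damped, weighted Gauss-Newton (Levenberg-Marquardt) update. J is the
# Jacobian, W a data-weighting matrix, r the data residual, lam the damping
# factor. The Rayleigh-wave forward model itself is not reproduced here.
def lm_step(J, W, r, lam):
    A = J.T @ W @ J + lam * np.eye(J.shape[1])
    return np.linalg.solve(A, J.T @ W @ r)

# usage with a toy *linear* forward model G m = d standing in for the
# dispersion-curve modelling
rng = np.random.default_rng(4)
G = rng.standard_normal((12, 3))
m_true = np.array([1.0, -0.5, 2.0])
d = G @ m_true
m = np.zeros(3)
for _ in range(20):
    m += lm_step(G, np.eye(12), d - G @ m, lam=0.1)
print("recovered model:", m.round(3))
```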
A Fine-Grained Pipelined Implementation for Large-Scale Matrix Inversion on FPGA
NASA Astrophysics Data System (ADS)
Zhou, Jie; Dou, Yong; Zhao, Jianxun; Xia, Fei; Lei, Yuanwu; Tang, Yuxing
Large-scale matrix inversion plays an important role in many applications; however, to the best of our knowledge, there has been no FPGA-based implementation. In this paper, we explore the possibility of accelerating large-scale matrix inversion on FPGA. To exploit the computational potential of FPGA, we introduce a fine-grained parallel algorithm for matrix inversion. A scalable linear array of processing elements (PEs), which is the core component of the FPGA accelerator, is proposed to implement this algorithm. A total of 12 PEs can be integrated into an Altera StratixII EP2S130F1020C5 FPGA on our self-designed board. Experimental results show that a speedup factor of 2.6 and a maximum power-performance ratio of 41 can be achieved compared to a Pentium Dual CPU with double SSE threads.
Easy way to determine quantitative spatial resolution distribution for a general inverse problem
NASA Astrophysics Data System (ADS)
An, M.; Feng, M.
2013-12-01
Computing the spatial resolution of a solution is nontrivial and can be more difficult than solving the inverse problem itself. Most geophysical studies, except for tomographic studies, almost uniformly neglect the calculation of a practical spatial resolution. In seismic tomography studies, a qualitative resolution length can be indicated via visual inspection of the restoration of a synthetic structure (e.g., checkerboard tests). An effective strategy for obtaining a quantitative resolution length is to calculate Backus-Gilbert resolution kernels (also referred to as a resolution matrix) by matrix operations. However, not all resolution matrices can provide resolution length information, and the computation of a resolution matrix is often difficult for very large inverse problems. A new class of resolution matrices, called statistical resolution matrices (An, 2012, GJI), can be directly determined via a simple one-parameter nonlinear inversion performed on limited pairs of random synthetic models and their inverse solutions. The whole procedure is restricted to the forward/inversion processes used in the real inverse problem and is independent of the degree of inversion skill used in obtaining the solution. Spatial resolution lengths can be given directly during the inversion. Tests on 1D/2D/3D model inversions demonstrate that this simple method is valid at least for general linear inverse problems.
Eigensolver for a Sparse, Large Hermitian Matrix
NASA Technical Reports Server (NTRS)
Tisdale, E. Robert; Oyafuso, Fabiano; Klimeck, Gerhard; Brown, R. Chris
2003-01-01
A parallel-processing computer program finds a few eigenvalues in a sparse Hermitian matrix that contains as many as 100 million diagonal elements. This program finds the eigenvalues faster, using less memory, than do other, comparable eigensolver programs. This program implements a Lanczos algorithm in the American National Standards Institute/ International Organization for Standardization (ANSI/ISO) C computing language, using the Message Passing Interface (MPI) standard to complement an eigensolver in PARPACK. [PARPACK (Parallel Arnoldi Package) is an extension, to parallel-processing computer architectures, of ARPACK (Arnoldi Package), which is a collection of Fortran 77 subroutines that solve large-scale eigenvalue problems.] The eigensolver runs on Beowulf clusters of computers at the Jet Propulsion Laboratory (JPL).
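A textbook sketch of the Lanczos iteration at the heart of such eigensolvers: a few matrix-vector products build a small tridiagonal matrix whose extreme eigenvalues approximate those of the large Hermitian operator. Restarting and reorthogonalization, as in ARPACK/PARPACK, are omitted.

```python
import numpy as np

# Plain Lanczos: k matrix-vector products build a k x k tridiagonal T whose
# extreme eigenvalues (Ritz values) approximate those of the Hermitian operator.
def lanczos_extreme(matvec, n, k, rng):
    V = np.zeros((n, k))
    alpha, beta = np.zeros(k), np.zeros(k - 1)
    V[:, 0] = rng.standard_normal(n)
    V[:, 0] /= np.linalg.norm(V[:, 0])
    w = matvec(V[:, 0])
    for j in range(k):
        alpha[j] = V[:, j] @ w
        w = w - alpha[j] * V[:, j] - (beta[j - 1] * V[:, j - 1] if j > 0 else 0.0)
        if j < k - 1:
            beta[j] = np.linalg.norm(w)       # no deflation handling in this sketch
            V[:, j + 1] = w / beta[j]
            w = matvec(V[:, j + 1])
    T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    return np.linalg.eigvalsh(T)              # Ritz values of the Krylov space

rng = np.random.default_rng(5)
n = 500
diag = np.linspace(1.0, 100.0, n)             # stand-in for a sparse Hermitian matrix
ritz = lanczos_extreme(lambda x: diag * x, n, k=40, rng=rng)
print("extreme Ritz values:", ritz[0], ritz[-1])   # approximate 1 and 100
```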
Linear precoding based on polynomial expansion: reducing complexity in massive MIMO.
Mueller, Axel; Kammoun, Abla; Björnson, Emil; Debbah, Mérouane
Massive multiple-input multiple-output (MIMO) techniques have the potential to bring tremendous improvements in spectral efficiency to future communication systems. Counterintuitively, the practical issues of having uncertain channel knowledge, high propagation losses, and implementing optimal non-linear precoding are solved more or less automatically by enlarging system dimensions. However, the computational precoding complexity grows with the system dimensions. For example, the close-to-optimal and relatively "antenna-efficient" regularized zero-forcing (RZF) precoding is very complicated to implement in practice, since it requires fast inversions of large matrices in every coherence period. Motivated by the high performance of RZF, we propose to replace the matrix inversion and multiplication by a truncated polynomial expansion (TPE), thereby obtaining the new TPE precoding scheme which is more suitable for real-time hardware implementation and significantly reduces the delay to the first transmitted symbol. The degree of the matrix polynomial can be adapted to the available hardware resources and enables smooth transition between simple maximum ratio transmission and more advanced RZF. By deriving new random matrix results, we obtain a deterministic expression for the asymptotic signal-to-interference-and-noise ratio (SINR) achieved by TPE precoding in massive MIMO systems. Furthermore, we provide a closed-form expression for the polynomial coefficients that maximizes this SINR. To maintain a fixed per-user rate loss as compared to RZF, the polynomial degree does not need to scale with the system, but it should be increased with the quality of the channel knowledge and the signal-to-noise ratio.
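To make the TPE idea concrete, the sketch below replaces the RZF matrix inverse with a truncated scaled Neumann expansion; the paper instead optimizes the polynomial coefficients for the asymptotic SINR, so this is only a structural illustration.

```python
import numpy as np

# Approximate the matrix inverse in regularized zero-forcing,
# W = H (H^H H + xi I)^{-1}, by a truncated matrix polynomial of degree J - 1.
def tpe_precoder(H, xi, J):
    K = H.shape[1]
    A = H.conj().T @ H + xi * np.eye(K)
    nu = 2.0 * np.trace(A).real / K              # crude spectral bound (assumption)
    term = np.eye(K, dtype=complex)
    approx_inv = np.zeros((K, K), dtype=complex)
    for _ in range(J):                           # A^{-1} ~ (1/nu) sum_k (I - A/nu)^k
        approx_inv += term / nu
        term = term @ (np.eye(K) - A / nu)
    return H @ approx_inv

rng = np.random.default_rng(6)
M_ant, K_users = 64, 8                           # illustrative system dimensions
H = (rng.standard_normal((M_ant, K_users))
     + 1j * rng.standard_normal((M_ant, K_users))) / np.sqrt(2)
W_rzf = H @ np.linalg.inv(H.conj().T @ H + 0.5 * np.eye(K_users))
for J in (2, 4, 8, 16):
    err = np.linalg.norm(tpe_precoder(H, 0.5, J) - W_rzf) / np.linalg.norm(W_rzf)
    print(f"degree J = {J:2d}: relative error {err:.3f}")
```

Increasing the degree J trades hardware-friendly matrix-vector work for a smoother transition toward the exact RZF solution, which mirrors the adaptivity claim in the abstract.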
A study of the parallel algorithm for large-scale DC simulation of nonlinear systems
NASA Astrophysics Data System (ADS)
Cortés Udave, Diego Ernesto; Ogrodzki, Jan; Gutiérrez de Anda, Miguel Angel
Newton-Raphson DC analysis of large-scale nonlinear circuits may be an extremely time-consuming process, even if sparse matrix techniques and bypassing of nonlinear model calculations are used. A slight decrease in the time required for this task may be enabled on multi-core, multithreaded computers if the calculation of the mathematical models for the nonlinear elements, as well as the stamp management of the sparse matrix entries, is handled by concurrent processes. The numerical complexity can be further reduced via circuit decomposition and parallel solution of blocks, taking the bordered block diagonal (BBD) matrix structure as a departure point. This block-parallel approach may give a considerable profit, though it is strongly dependent on the system topology and, of course, on the processor type. This contribution presents an easily parallelizable decomposition-based algorithm for DC simulation and provides a detailed study of its effectiveness.
Sparse distributed memory and related models
NASA Technical Reports Server (NTRS)
Kanerva, Pentti
1992-01-01
Described here is sparse distributed memory (SDM) as a neural-net associative memory. It is characterized by two weight matrices and by a large internal dimension - the number of hidden units is much larger than the number of input or output units. The first matrix, A, is fixed and possibly random, and the second matrix, C, is modifiable. The SDM is compared and contrasted to (1) computer memory, (2) correlation-matrix memory, (3) feed-forward artificial neural network, (4) cortex of the cerebellum, (5) Marr and Albus models of the cerebellum, and (6) Albus' cerebellar model arithmetic computer (CMAC). Several variations of the basic SDM design are discussed: the selected-coordinate and hyperplane designs of Jaeckel, the pseudorandom associative neural memory of Hassoun, and SDM with real-valued input variables by Prager and Fallside. SDM research conducted mainly at the Research Institute for Advanced Computer Science (RIACS) in 1986-1991 is highlighted.
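A compact sketch of the basic SDM design described above, with a fixed random address matrix A and a modifiable counter matrix C; the geometry (2000 locations, 100-bit words, activation radius 42) is an illustrative assumption.

```python
import numpy as np

# Basic SDM: writing adds the data word into the counters of all hard
# locations within Hamming radius r of the address; reading sums those
# counters and thresholds at zero.
class SDM:
    def __init__(self, n_locations, dim, radius, rng):
        self.A = rng.integers(0, 2, (n_locations, dim))   # fixed, random addresses
        self.C = np.zeros((n_locations, dim), int)        # modifiable counters
        self.r = radius

    def _active(self, addr):
        return np.sum(self.A != addr, axis=1) <= self.r   # Hamming-ball selection

    def write(self, addr, data):
        self.C[self._active(addr)] += 2 * data - 1        # +/-1 counter update

    def read(self, addr):
        return (self.C[self._active(addr)].sum(axis=0) > 0).astype(int)

rng = np.random.default_rng(7)
mem = SDM(n_locations=2000, dim=100, radius=42, rng=rng)
pattern = rng.integers(0, 2, 100)
mem.write(pattern, pattern)                               # autoassociative store
noisy = pattern.copy()
noisy[:10] ^= 1                                           # flip 10 address bits
print("bits recovered:", int(np.sum(mem.read(noisy) == pattern)), "/ 100")
```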
Wang, Li; Shi, Feng; Li, Gang; Lin, Weili; Gilmore, John H.; Shen, Dinggang
2014-01-01
Segmentation of infant brain MR images is challenging due to insufficient image quality, severe partial volume effect, and ongoing maturation and myelination process. During the first year of life, the signal contrast between white matter (WM) and gray matter (GM) in MR images undergoes inverse changes. In particular, the inversion of WM/GM signal contrast appears around 6–8 months of age, where brain tissues appear isointense and hence exhibit extremely low tissue contrast, posing significant challenges for automated segmentation. In this paper, we propose a novel segmentation method to address the above-mentioned challenge based on the sparse representation of the complementary tissue distribution information from T1, T2 and diffusion-weighted images. Specifically, we first derive an initial segmentation from a library of aligned multi-modality images with ground-truth segmentations by using sparse representation in a patch-based fashion. The segmentation is further refined by the integration of the geometrical constraint information. The proposed method was evaluated on 22 6-month-old training subjects using leave-one-out cross-validation, as well as 10 additional infant testing subjects, showing superior results in comparison to other state-of-the-art methods. PMID:24505729
Colclough, Giles L; Woolrich, Mark W; Harrison, Samuel J; Rojas López, Pedro A; Valdes-Sosa, Pedro A; Smith, Stephen M
2018-05-07
A Bayesian model for sparse, hierarchical, inverse-covariance estimation is presented, and applied to multi-subject functional connectivity estimation in the human brain. It enables simultaneous inference of the strength of connectivity between brain regions at both subject and population level, and is applicable to fMRI, MEG and EEG data. Two versions of the model can encourage sparse connectivity, either using continuous priors to suppress irrelevant connections, or using an explicit description of the network structure to estimate the connection probability between each pair of regions. A large evaluation of this model, and thirteen methods that represent the state of the art of inverse covariance modelling, is conducted using both simulated and resting-state functional imaging datasets. Our novel Bayesian approach has similar performance to the best extant alternative, Ng et al.'s Sparse Group Gaussian Graphical Model algorithm, which is also based on a hierarchical structure. Using data from the Human Connectome Project, we show that these hierarchical models are able to reduce the measurement error in MEG beta-band functional networks by 10%, producing concomitant increases in estimates of the genetic influence on functional connectivity. Copyright © 2018. Published by Elsevier Inc.
Kernel Recursive Least-Squares Temporal Difference Algorithms with Sparsification and Regularization
Zhang, Chunyuan; Zhu, Qingxin; Niu, Xinzheng
2016-01-01
By combining with sparse kernel methods, least-squares temporal difference (LSTD) algorithms can construct the feature dictionary automatically and obtain a better generalization ability. However, the previous kernel-based LSTD algorithms do not consider regularization, and their sparsification processes are batch or offline, which hinders their widespread application in online learning problems. In this paper, we combine the following five techniques and propose two novel kernel recursive LSTD algorithms: (i) online sparsification, which can cope with unknown state regions and be used for online learning; (ii) L2 and L1 regularization, which can avoid overfitting and eliminate the influence of noise; (iii) recursive least squares, which can eliminate matrix-inversion operations and reduce computational complexity; (iv) a sliding-window approach, which can avoid caching all history samples and reduce the computational cost; and (v) fixed-point subiteration and online pruning, which make L1 regularization easy to implement. Finally, simulation results on two 50-state chain problems demonstrate the effectiveness of our algorithms. PMID:27436996
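As an illustration of technique (iii), the sketch below shows the recursive least squares update that maintains the inverse Gram matrix with the Sherman-Morrison identity, so no explicit matrix inversion appears in the learning loop.

```python
import numpy as np

# Recursive least squares: keep the inverse Gram matrix P and update it with
# the Sherman-Morrison identity, avoiding any matrix inversion per step.
def rls_update(P, w, phi, target, lam=1.0):
    """One RLS step for features phi and scalar target; lam is a forgetting factor."""
    Pphi = P @ phi
    k = Pphi / (lam + phi @ Pphi)          # gain vector
    w = w + k * (target - w @ phi)         # prediction-error correction
    P = (P - np.outer(k, Pphi)) / lam      # Sherman-Morrison downdate of P
    return P, w

rng = np.random.default_rng(8)
w_true = rng.standard_normal(5)
P, w = 1e3 * np.eye(5), np.zeros(5)        # large initial P ~ weak prior
for _ in range(200):
    phi = rng.standard_normal(5)
    P, w = rls_update(P, w, phi, w_true @ phi)
print("weight error:", np.linalg.norm(w - w_true))
```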
Blind compressed sensing image reconstruction based on alternating direction method
NASA Astrophysics Data System (ADS)
Liu, Qinan; Guo, Shuxu
2018-04-01
To solve the problem of reconstructing the original image when the sparse basis is unknown, this paper proposes an image reconstruction method based on a blind compressed sensing model. In this model, the image signal is regarded as the product of a sparse coefficient matrix and a dictionary matrix. Based on the existing blind compressed sensing theory, the optimal solution is obtained by an alternating minimization method. The proposed method addresses the difficulty of representing the sparse basis in compressed sensing; it suppresses noise and improves the quality of the reconstructed image. The method ensures that the blind compressed sensing problem has a unique solution and can recover the original image signal from a complex environment with stronger self-adaptability. The experimental results show that the proposed image reconstruction algorithm based on blind compressed sensing can recover high-quality image signals under under-sampling conditions.
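A toy sketch of the alternating-minimization flavour described above: given measurements Y = M D S with the dictionary D unknown, alternate soft-thresholded (ISTA) updates of the sparse codes S with an exact least-squares update of D. Sizes and the regularization weight are illustrative assumptions, not the paper's exact model.

```python
import numpy as np

# Blind-CS style alternating minimization on Y = M (D S):
#  (1) S-step: a few ISTA iterations on the codes;
#  (2) D-step: exact least-squares update of the dictionary.
def soft(x, t):
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

rng = np.random.default_rng(9)
m, n, p, sigs = 32, 64, 48, 200
M = rng.standard_normal((m, n)) / np.sqrt(m)       # known measurement matrix
D_true = rng.standard_normal((n, p)) / np.sqrt(n)  # unknown sparse basis
S_true = soft(rng.standard_normal((p, sigs)), 1.5) # sparse ground-truth codes
Y = M @ D_true @ S_true

D = rng.standard_normal((n, p)) / np.sqrt(n)       # blind: random initial dictionary
S = np.zeros((p, sigs))
lam = 0.01
for it in range(50):
    A = M @ D
    eta = 1.0 / np.linalg.norm(A, 2) ** 2          # ISTA step from the Lipschitz bound
    for _ in range(5):
        S = soft(S - eta * A.T @ (A @ S - Y), eta * lam)
    D = np.linalg.pinv(M) @ Y @ np.linalg.pinv(S)  # exact block update of D
print("relative fit:", np.linalg.norm(M @ D @ S - Y) / np.linalg.norm(Y))
```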
Effects of Ordering Strategies and Programming Paradigms on Sparse Matrix Computations
NASA Technical Reports Server (NTRS)
Oliker, Leonid; Li, Xiaoye; Husbands, Parry; Biswas, Rupak; Biegel, Bryan (Technical Monitor)
2002-01-01
The Conjugate Gradient (CG) algorithm is perhaps the best-known iterative technique for solving sparse linear systems that are symmetric and positive definite. For systems that are ill-conditioned, it is often necessary to use a preconditioning technique. In this paper, we investigate the effects of various ordering and partitioning strategies on the performance of parallel CG and ILU(0)-preconditioned CG (PCG) using different programming paradigms and architectures. Results show that, for this class of applications: ordering significantly improves overall performance on both distributed and distributed shared-memory systems; cache reuse may be more important than reducing communication; it is possible to achieve message-passing performance using shared-memory constructs through careful data ordering and distribution; and a hybrid MPI+OpenMP paradigm increases programming complexity with little performance gain. An implementation of CG on the Cray MTA does not require special ordering or partitioning to obtain high efficiency and scalability, giving it a distinct advantage for adaptive applications; however, it shows limited scalability for PCG due to a lack of thread-level parallelism.
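For reference, a sequential sketch of the preconditioned CG iteration being benchmarked, with a simple Jacobi (diagonal) preconditioner standing in for ILU(0); the ordering and partitioning strategies under study are orthogonal to this kernel.

```python
import numpy as np

# Preconditioned conjugate gradient for symmetric positive definite A.
# M_inv is the elementwise inverse of the diagonal (Jacobi preconditioner).
def pcg(A, b, M_inv, tol=1e-10, maxit=500):
    x = np.zeros_like(b)
    r = b - A @ x
    z = M_inv * r                        # apply the diagonal preconditioner
    p = z.copy()
    rz = r @ z
    for it in range(maxit):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol * np.linalg.norm(b):
            return x, it
        z = M_inv * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, maxit

rng = np.random.default_rng(10)
Q = rng.standard_normal((100, 100))
A = Q @ Q.T + 100 * np.eye(100)          # symmetric positive definite test matrix
b = rng.standard_normal(100)
x, iters = pcg(A, b, 1.0 / np.diag(A))
print("iterations:", iters, "residual:", np.linalg.norm(A @ x - b))
```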
Ghysels, Pieter; Li, Xiaoye S.; Rouet, Francois-Henry; ...
2016-10-27
Here, we present a sparse linear system solver that is based on a multifrontal variant of Gaussian elimination and exploits low-rank approximation of the resulting dense frontal matrices. We use hierarchically semiseparable (HSS) matrices, which have low-rank off-diagonal blocks, to approximate the frontal matrices. For HSS matrix construction, a randomized sampling algorithm is used together with interpolative decompositions. The combination of the randomized compression with a fast ULV HSS factorization leads to a solver with lower computational complexity than the standard multifrontal method for many applications, resulting in speedups up to 7-fold for problems in our test suite. The implementation targets many-core systems by using task parallelism with dynamic runtime scheduling. Numerical experiments show performance improvements over state-of-the-art sparse direct solvers. The implementation achieves high performance and good scalability on a range of modern shared-memory parallel systems, including the Intel Xeon Phi (MIC). The code is part of a software package called STRUMPACK - STRUctured Matrices PACKage, which also has a distributed-memory component for dense rank-structured matrices.
The incomplete inverse and its applications to the linear least squares problem
NASA Technical Reports Server (NTRS)
Morduch, G. E.
1977-01-01
A modified matrix product is explained, and it is shown that this product defines a group whose inverse is called the incomplete inverse. It is proven that the incomplete inverse of an augmented normal matrix includes all the quantities associated with the least squares solution. An answer is provided to the problem that occurs when the data residuals are too large and there are insufficient data to justify augmenting the model.
Tang, Shiming; Zhang, Yimeng; Li, Zhihao; Li, Ming; Liu, Fang; Jiang, Hongfei; Lee, Tai Sing
2018-04-26
One general principle of sensory information processing is that the brain must optimize efficiency by reducing the number of neurons that process the same information. The sparseness of the sensory representations in a population of neurons reflects the efficiency of the neural code. Here, we employ large-scale two-photon calcium imaging to examine the responses of a large population of neurons within the superficial layers of area V1 with single-cell resolution, while simultaneously presenting a large set of natural visual stimuli, to provide the first direct measure of the population sparseness in awake primates. The results show that only 0.5% of neurons respond strongly to any given natural image - indicating a ten-fold increase in the inferred sparseness over previous measurements. These population activities are nevertheless necessary and sufficient to discriminate visual stimuli with high accuracy, suggesting that the neural code in the primary visual cortex is both super-sparse and highly efficient. © 2018, Tang et al.
Visual recognition and inference using dynamic overcomplete sparse learning.
Murray, Joseph F; Kreutz-Delgado, Kenneth
2007-09-01
We present a hierarchical architecture and learning algorithm for visual recognition and other visual inference tasks such as imagination, reconstruction of occluded images, and expectation-driven segmentation. Using properties of biological vision for guidance, we posit a stochastic generative world model and from it develop a simplified world model (SWM) based on a tractable variational approximation that is designed to enforce sparse coding. Recent developments in computational methods for learning overcomplete representations (Lewicki & Sejnowski, 2000; Teh, Welling, Osindero, & Hinton, 2003) suggest that overcompleteness can be useful for visual tasks, and we use an overcomplete dictionary learning algorithm (Kreutz-Delgado, et al., 2003) as a preprocessing stage to produce accurate, sparse codings of images. Inference is performed by constructing a dynamic multilayer network with feedforward, feedback, and lateral connections, which is trained to approximate the SWM. Learning is done with a variant of the back-propagation-through-time algorithm, which encourages convergence to desired states within a fixed number of iterations. Vision tasks require large networks, and to make learning efficient, we take advantage of the sparsity of each layer to update only a small subset of elements in a large weight matrix at each iteration. Experiments on a set of rotated objects demonstrate various types of visual inference and show that increasing the degree of overcompleteness improves recognition performance in difficult scenes with occluded objects in clutter.
Fast Sparse Coding for Range Data Denoising with Sparse Ridges Constraint.
Gao, Zhi; Lao, Mingjie; Sang, Yongsheng; Wen, Fei; Ramesh, Bharath; Zhai, Ruifang
2018-05-06
Light detection and ranging (LiDAR) sensors have been widely deployed on intelligent systems such as unmanned ground vehicles (UGVs) and unmanned aerial vehicles (UAVs) to perform localization, obstacle detection, and navigation tasks. Thus, research into range data processing with competitive performance in terms of both accuracy and efficiency has attracted increasing attention. Sparse coding has revolutionized signal processing and led to state-of-the-art performance in a variety of applications. However, dictionary learning, which plays the central role in sparse coding techniques, is computationally demanding, resulting in its limited applicability in real-time systems. In this study, we propose sparse coding algorithms with a fixed pre-learned ridge dictionary to realize range data denoising via leveraging the regularity of laser range measurements in man-made environments. Experiments on both synthesized data and real data demonstrate that our method obtains accuracy comparable to that of sophisticated sparse coding methods, but with much higher computational efficiency.
A High Performance Block Eigensolver for Nuclear Configuration Interaction Calculations
Aktulga, Hasan Metin; Afibuzzaman, Md.; Williams, Samuel; ...
2017-06-01
As on-node parallelism increases and the performance gap between the processor and the memory system widens, achieving high performance in large-scale scientific applications requires an architecture-aware design of algorithms and solvers. We focus on the eigenvalue problem arising in nuclear Configuration Interaction (CI) calculations, where a few extreme eigenpairs of a sparse symmetric matrix are needed. Here, we consider a block iterative eigensolver whose main computational kernels are the multiplication of a sparse matrix with multiple vectors (SpMM), and tall-skinny matrix operations. We then present techniques to significantly improve the SpMM and the transpose operation SpMM^T by using the compressed sparse blocks (CSB) format. We achieve 3-4× speedup on the requisite operations over good implementations with the commonly used compressed sparse row (CSR) format. We develop a performance model that allows us to correctly estimate the performance of our SpMM kernel implementations, and we identify cache bandwidth as a potential performance bottleneck beyond DRAM. We also analyze and optimize the performance of LOBPCG kernels (inner product and linear combinations on multiple vectors) and show up to 15× speedup over using high performance BLAS libraries for these operations. The resulting high performance LOBPCG solver achieves 1.4× to 1.8× speedup over the existing Lanczos solver on a series of CI computations on high-end multicore architectures (Intel Xeons). We also analyze the performance of our techniques on an Intel Xeon Phi Knights Corner (KNC) processor.
Transurethral Ultrasound Diffraction Tomography
2007-03-01
The covariance matrix of the inverse scattering solution is derived. It reduces to that of X-ray CT under the assumptions of a linear operator and real data, so the covariance matrix in linear X-ray computed tomography is a special case of the inverse scattering covariance matrix derived in this paper. The matrix is derived in Sec. IV, and its relation to that of linear X-ray computed tomography appears in Sec. V.
M-matrices with prescribed elementary divisors
NASA Astrophysics Data System (ADS)
Soto, Ricardo L.; Díaz, Roberto C.; Salas, Mario; Rojo, Oscar
2017-09-01
A real matrix A is said to be an M-matrix if it is of the form A = αI − B, where B is a nonnegative matrix with Perron eigenvalue ρ(B), and α ≥ ρ(B). This paper provides sufficient conditions for the existence and construction of an M-matrix A with prescribed elementary divisors, which are the characteristic polynomials of the Jordan blocks of the Jordan canonical form of A. This inverse problem on M-matrices has not been treated until now. We solve the inverse elementary divisors problem for diagonalizable M-matrices and the symmetric generalized doubly stochastic inverse M-matrix problem for lists of real numbers and for lists of complex numbers of the form Λ = {λ1, a ± bi, …, a ± bi}. The constructive nature of our results allows for the computation of a solution matrix. The paper also discusses an application of M-matrices to a capacity problem in wireless communications.
A systematic linear space approach to solving partially described inverse eigenvalue problems
NASA Astrophysics Data System (ADS)
Hu, Sau-Lon James; Li, Haujun
2008-06-01
Most applications of the inverse eigenvalue problem (IEP), which concerns the reconstruction of a matrix from prescribed spectral data, are associated with special classes of structured matrices. Solving the IEP requires one to satisfy both the spectral constraint and the structural constraint. If the spectral constraint consists of only one or few prescribed eigenpairs, this kind of inverse problem has been referred to as the partially described inverse eigenvalue problem (PDIEP). This paper develops an efficient, general and systematic approach to solve the PDIEP. Basically, the approach, applicable to various structured matrices, converts the PDIEP into an ordinary inverse problem that is formulated as a set of simultaneous linear equations. While solving simultaneous linear equations for model parameters, the singular value decomposition method is applied. Because of the conversion to an ordinary inverse problem, other constraints associated with the model parameters can be easily incorporated into the solution procedure. The detailed derivation and numerical examples to implement the newly developed approach to symmetric Toeplitz and quadratic pencil (including mass, damping and stiffness matrices of a linear dynamic system) PDIEPs are presented. Excellent numerical results for both kinds of problem are achieved under the situations that have either unique or infinitely many solutions.
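A small sketch of the linear-space idea for the symmetric Toeplitz case: because T(c)x = λx is linear in the first-column parameters c, one prescribed eigenpair yields a set of simultaneous linear equations, solved below with an SVD-based least-squares routine.

```python
import numpy as np
from scipy.linalg import toeplitz

# For a symmetric Toeplitz T(c) with first column c, (T(c) x)_i is linear in c:
# (T(c) x)_i = sum_d c_d * (sum over j with |i-j| = d of x_j). A prescribed
# eigenpair (lam, x) therefore gives the ordinary linear system B c = lam * x.
def toeplitz_from_eigenpair(x, lam):
    n = len(x)
    B = np.zeros((n, n))
    for i in range(n):
        for d in range(n):
            # coefficient of c_d in row i of T(c) @ x
            B[i, d] = sum(x[j] for j in range(n) if abs(i - j) == d)
    c, *_ = np.linalg.lstsq(B, lam * x, rcond=None)   # SVD under the hood
    return toeplitz(c)

rng = np.random.default_rng(11)
T_ref = toeplitz(rng.standard_normal(5))
lam, X = np.linalg.eigh(T_ref)
T = toeplitz_from_eigenpair(X[:, 0], lam[0])
# the recovered matrix reproduces the prescribed eigenpair
assert np.allclose(T @ X[:, 0], lam[0] * X[:, 0], atol=1e-8)
```

Since the spectral constraint is only partially described, the recovered matrix need not equal T_ref; it is one structured solution satisfying the prescribed eigenpair, which is exactly the PDIEP setting.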
Automated dynamic analytical model improvement for damped structures
NASA Technical Reports Server (NTRS)
Fuh, J. S.; Berman, A.
1985-01-01
A method is described for improving a linear, nonproportionally damped analytical model of a structure. The procedure finds the smallest changes in the analytical model such that the improved model matches the measured modal parameters. Features of the method are: (1) the ability to properly treat the complex-valued modal parameters of a damped system; (2) applicability to realistically large structural models; and (3) computational efficiency, involving neither eigensolutions nor the inversion of a large matrix.
Communication requirements of sparse Cholesky factorization with nested dissection ordering
NASA Technical Reports Server (NTRS)
Naik, Vijay K.; Patrick, Merrell L.
1989-01-01
Load distribution schemes for minimizing the communication requirements of the Cholesky factorization of dense and sparse, symmetric, positive definite matrices on multiprocessor systems are presented. The total data traffic in factoring an n x n sparse symmetric positive definite matrix representing an n-vertex regular two-dimensional grid graph using n^α (α ≤ 1) processors is shown to be O(n^(1+α/2)). It is O(n) when n^α (α ≥ 1) processors are used. Under the conditions of uniform load distribution, these results are shown to be asymptotically optimal.
Sparse representation-based image restoration via nonlocal supervised coding
NASA Astrophysics Data System (ADS)
Li, Ao; Chen, Deyun; Sun, Guanglu; Lin, Kezheng
2016-10-01
Sparse representation (SR) and the nonlocal technique (NLT) have shown great potential in low-level image processing. However, due to the degradation of the observed image, SR and NLT may not be accurate enough to obtain a faithful restoration when they are used independently. To improve the performance, in this paper, a nonlocal supervised coding strategy-based NLT for image restoration is proposed. The novel method has three main contributions. First, to exploit the useful nonlocal patches, a nonnegative sparse representation is introduced, whose coefficients can be utilized as the supervised weights among patches. Second, a novel objective function is proposed, which integrates the supervised weight learning and the nonlocal sparse coding to guarantee a more promising solution. Finally, to make the minimization tractable and convergent, a numerical scheme based on iterative shrinkage thresholding is developed to solve the above underdetermined inverse problem. Extensive experiments validate the effectiveness of the proposed method.
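A generic sketch of the iterative shrinkage-thresholding scheme invoked above, minimizing (1/2)||y - Ax||^2 + lam*||x||_1; the paper's objective additionally carries the nonlocal supervised weights, which are omitted here.

```python
import numpy as np

# ISTA: gradient step on the quadratic data term, then soft-thresholding,
# the proximal operator of the l1 penalty.
def ista(A, y, lam, iters=500):
    x = np.zeros(A.shape[1])
    eta = 1.0 / np.linalg.norm(A, 2) ** 2        # step size from the Lipschitz bound
    for _ in range(iters):
        g = A.T @ (A @ x - y)                    # gradient of (1/2)||y - Ax||^2
        x = x - eta * g
        x = np.sign(x) * np.maximum(np.abs(x) - eta * lam, 0.0)  # shrinkage
    return x

rng = np.random.default_rng(12)
A = rng.standard_normal((50, 120)) / np.sqrt(50)
x_true = np.zeros(120)
x_true[rng.choice(120, 6, replace=False)] = 3.0
x_hat = ista(A, A @ x_true, lam=0.05)
print("max coefficient error:", np.max(np.abs(x_hat - x_true)))
```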
SAMBA: Sparse Approximation of Moment-Based Arbitrary Polynomial Chaos
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ahlfeld, R., E-mail: r.ahlfeld14@imperial.ac.uk; Belkouchi, B.; Montomoli, F.
2016-09-01
A new arbitrary Polynomial Chaos (aPC) method is presented for moderately high-dimensional problems characterised by limited input data availability. The proposed methodology improves the algorithm of aPC and extends the method, which was previously only introduced as a tensor product expansion, to moderately high-dimensional stochastic problems. The fundamental idea of aPC is to use the statistical moments of the input random variables to develop the polynomial chaos expansion. This approach provides the possibility to propagate continuous or discrete probability density functions and also histograms (data sets) as long as their moments exist, are finite and the determinant of the moment matrix is strictly positive. For cases with limited data availability, this approach avoids bias and fitting errors caused by wrong assumptions. In this work, an alternative way to calculate the aPC is suggested, which provides the optimal polynomials, Gaussian quadrature collocation points and weights from the moments using only a handful of matrix operations on the Hankel matrix of moments. It can therefore be implemented without requiring prior knowledge about statistical data analysis or a detailed understanding of the mathematics of polynomial chaos expansions. The extension to more input variables suggested in this work is an anisotropic and adaptive version of Smolyak's algorithm that is solely based on the moments of the input probability distributions. It is referred to as SAMBA (PC), which is short for Sparse Approximation of Moment-Based Arbitrary Polynomial Chaos. It is illustrated that for moderately high-dimensional problems (up to 20 different input variables or histograms) SAMBA can significantly simplify the calculation of sparse Gaussian quadrature rules. SAMBA's efficiency for multivariate functions with regard to data availability is further demonstrated by analysing higher order convergence and accuracy for a set of nonlinear test functions with 2, 5 and 10 different input distributions or histograms.
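A minimal sketch of the moment-based construction: a Hankel matrix of raw moments, one Cholesky factorization and a symmetric eigensolve (the classical Golub-Welsch route) give the recurrence coefficients and the Gaussian quadrature nodes and weights. The uniform sample data is an illustrative stand-in for a histogram.

```python
import numpy as np

# From raw moments m_0..m_{2n} of an arbitrary distribution, build the Hankel
# moment matrix, Cholesky-factor it, read off the three-term recurrence
# coefficients, and diagonalize the Jacobi matrix (Golub-Welsch).
def quadrature_from_moments(mom, n):
    H = np.array([[mom[i + j] for j in range(n + 1)] for i in range(n + 1)])
    R = np.linalg.cholesky(H).T                      # moment matrix must be SPD
    a = np.zeros(n)
    b = np.zeros(n - 1)
    for k in range(n):
        a[k] = R[k, k + 1] / R[k, k] - (R[k - 1, k] / R[k - 1, k - 1] if k > 0 else 0.0)
    for k in range(n - 1):
        b[k] = R[k + 1, k + 1] / R[k, k]
    J = np.diag(a) + np.diag(b, 1) + np.diag(b, -1)  # Jacobi matrix
    nodes, V = np.linalg.eigh(J)
    weights = mom[0] * V[0, :] ** 2                  # first eigenvector components
    return nodes, weights

data = np.random.default_rng(13).uniform(-1, 1, 100_000)   # histogram stand-in
mom = np.array([np.mean(data ** k) for k in range(7)])     # raw sample moments
nodes, weights = quadrature_from_moments(mom, 3)
# the 3-point rule integrates x^4 against the data distribution almost exactly
print(sum(w * x ** 4 for x, w in zip(nodes, weights)), np.mean(data ** 4))
```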
An Energy-Efficient Compressive Image Coding for Green Internet of Things (IoT).
Li, Ran; Duan, Xiaomeng; Li, Xu; He, Wei; Li, Yanling
2018-04-17
Aimed at the low energy consumption of the Green Internet of Things (IoT), this paper presents an energy-efficient compressive image coding scheme, which provides a compressive encoder and a real-time decoder according to Compressive Sensing (CS) theory. The compressive encoder adaptively measures each image block based on the block-based gradient field, which models the distribution of block sparse degree, and the real-time decoder linearly reconstructs each image block through a projection matrix, which is learned under the Minimum Mean Square Error (MMSE) criterion. Both the encoder and decoder have low computational complexity, so that they consume only a small amount of energy. Experimental results show that the proposed scheme not only has low encoding and decoding complexity compared with traditional methods, but also provides good objective and subjective reconstruction quality. In particular, it presents better time-distortion performance than JPEG. Therefore, the proposed compressive image coding is a potential energy-efficient scheme for the Green IoT.
NASA Astrophysics Data System (ADS)
Strodel, Paul; Tavan, Paul
2002-09-01
We present a revised multi-reference configuration interaction (MRCI) algorithm for balanced and efficient calculation of electronic excitations in molecules. The revision takes up an earlier method, which had been designed for flexible, state-specific, and individual selection (IS) of MRCI expansions, included perturbational corrections (PERT), and used the spin-coupled hole-particle formalism of Tavan and Schulten (1980) for matrix-element evaluation. It removes the deficiencies of this method by introducing tree structures, which code the CI bases and allow us to efficiently exploit the sparseness of the Hamiltonian matrices. The algorithmic complexity is shown to be optimal for IS/MRCI applications. The revised IS/MRCI/PERT module is combined with the effective valence shell Hamiltonian OM2 suggested by Weber and Thiel (2000). This coupling serves the purpose of making excited state surfaces of organic dye molecules accessible to relatively cheap and sufficiently precise descriptions.
SIMD Optimization of Linear Expressions for Programmable Graphics Hardware
Bajaj, Chandrajit; Ihm, Insung; Min, Jungki; Oh, Jinsang
2009-01-01
The increased programmability of graphics hardware allows efficient graphics processing unit (GPU) implementations of a wide range of general computations on commodity PCs. An important factor in such implementations is how to fully exploit the SIMD computing capacities offered by modern graphics processors. Linear expressions in the form of ȳ = Ax̄ + b̄, where A is a matrix and x̄, ȳ and b̄ are vectors, constitute one of the most basic operations in many scientific computations. In this paper, we propose a SIMD code optimization technique that enables efficient shader code to be generated for evaluating linear expressions. It is shown that performance can be improved considerably by efficiently packing arithmetic operations into four-wide SIMD instructions through reordering of the operations in linear expressions. We demonstrate that the presented technique can be used effectively for programming both vertex and pixel shaders for a variety of mathematical applications, including integrating differential equations and solving a sparse linear system of equations using iterative methods. PMID:19946569
Regenerator filled with a matrix of polycrystalline iron whiskers
NASA Astrophysics Data System (ADS)
Eder, F. X.; Appel, H.
1982-08-01
In thermal regenerators, the following parameters were optimized: convection coefficient, surface of the heat-accumulating matrix, matrix density and heat capacity, and frequency of cycle inversions. The variation of heat capacity with working temperature was also computed. Polycrystalline iron whiskers prove a good compromise as a matrix for heat regenerators at working temperatures ranging from 300 to 80 K. They were compared with wire mesh screens and microspheres of bronze and stainless steel. For these structures and materials, thermal conductivity, pressure drop, heat transfer, and yield were calculated and related to the experimental values. As heat transport gases, helium, argon, and dry nitrogen were applied at pressures up to 20 bar. The experimental and theoretical studies result in a set of formulas for calculating pressure drop, heat capacity, and heat transfer rate for a given thermal regenerator as a function of mass flow. It is shown that the efficiency of a whisker matrix depends strongly on gas pressure and composition. Iron whiskers make a good matrix with heat capacities of kW/cu cm per K, but their relatively high pressure drop may, at low pressures, be a limitation. A regenerator expansion machine is also described.
An overview of NSPCG: A nonsymmetric preconditioned conjugate gradient package
NASA Astrophysics Data System (ADS)
Oppe, Thomas C.; Joubert, Wayne D.; Kincaid, David R.
1989-05-01
The most recent research-oriented software package developed as part of the ITPACK Project is called "NSPCG" since it contains many nonsymmetric preconditioned conjugate gradient procedures. It is designed to solve large sparse systems of linear algebraic equations by a variety of different iterative methods. One of the main purposes for the development of the package is to provide a common modular structure for research on iterative methods for nonsymmetric matrices. Another purpose for the development of the package is to investigate the suitability of several iterative methods for vector computers. Since the vectorizability of an iterative method depends greatly on the matrix structure, NSPCG allows great flexibility in the operator representation. The coefficient matrix can be passed in one of several different matrix data storage schemes. These sparse data formats allow matrices with a wide range of structures from highly structured ones such as those with all nonzeros along a relatively small number of diagonals to completely unstructured sparse matrices. Alternatively, the package allows the user to call the accelerators directly with user-supplied routines for performing certain matrix operations. In this case, one can use the data format from an application program and not be required to copy the matrix into one of the package formats. This is particularly advantageous when memory space is limited. Some of the basic preconditioners that are available are point methods such as Jacobi, Incomplete LU Decomposition and Symmetric Successive Overrelaxation as well as block and multicolor preconditioners. The user can select from a large collection of accelerators such as Conjugate Gradient (CG), Chebyshev (SI, for semi-iterative), Generalized Minimal Residual (GMRES), Biconjugate Gradient Squared (BCGS) and many others. The package is modular so that almost any accelerator can be used with almost any preconditioner.
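For the symmetric positive definite case, the accelerator-plus-preconditioner pattern such packages expose fits in a few lines. A minimal Python/SciPy sketch with a point-Jacobi preconditioner passed as a user-supplied routine (an illustration of the interface style, not NSPCG's implementation):

    import numpy as np
    from scipy.sparse import diags, rand

    def pcg(A, b, M_inv, tol=1e-8, maxit=1000):
        """Conjugate gradient accelerator; M_inv applies the preconditioner."""
        x = np.zeros_like(b)
        r = b - A @ x
        z = M_inv(r)
        p = z.copy()
        rz = r @ z
        for _ in range(maxit):
            Ap = A @ p
            alpha = rz / (p @ Ap)
            x += alpha * p
            r -= alpha * Ap
            if np.linalg.norm(r) < tol * np.linalg.norm(b):
                break
            z = M_inv(r)
            rz_new = r @ z
            p = z + (rz_new / rz) * p
            rz = rz_new
        return x

    n = 500
    S = rand(n, n, density=0.01, format='csr')     # unstructured sparse part
    A = S @ S.T + diags(np.full(n, n * 0.01))      # make the system SPD
    d = A.diagonal()
    x = pcg(A, np.ones(n), lambda r: r / d)        # point-Jacobi preconditioner

Nonsymmetric accelerators such as GMRES or BCGS follow the same pattern, swapping out the recurrence inside the loop.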
2D Seismic Imaging of Elastic Parameters by Frequency Domain Full Waveform Inversion
NASA Astrophysics Data System (ADS)
Brossier, R.; Virieux, J.; Operto, S.
2008-12-01
Thanks to recent advances in parallel computing, full waveform inversion is today a tractable seismic imaging method to reconstruct physical parameters of the earth's interior at different scales, ranging from the near-surface to the deep crust. We present a massively parallel 2D frequency-domain full-waveform algorithm for imaging visco-elastic media from multi-component seismic data. The forward problem (i.e., the resolution of the frequency-domain 2D P-SV elastodynamic equations) is based on a low-order Discontinuous Galerkin (DG) method (P0 and/or P1 interpolations). Thanks to triangular unstructured meshes, the DG method allows accurate modeling of both body waves and surface waves in the presence of complex topography for a discretization of 10 to 15 cells per shear wavelength. The frequency-domain DG system is solved efficiently for multiple sources with the parallel direct solver MUMPS. The local inversion procedure (i.e., minimization of residuals between observed and computed data) is based on the adjoint-state method, which allows efficient computation of the gradient of the objective function. Applying the inversion hierarchically from the low frequencies to the higher ones defines a multiresolution imaging strategy that helps convergence towards the global minimum. In place of the expensive Newton algorithm, the combined use of the diagonal terms of the approximate Hessian matrix and quasi-Newton optimization algorithms (Conjugate Gradient, L-BFGS, ...) improves the convergence of the iterative inversion. The distribution of forward-problem solutions over processors, driven by a mesh partitioning performed by METIS, allows most of the inversion to run in parallel. We shall present the main features of the parallel modeling/inversion algorithm, assess its scalability, and illustrate its performance with realistic synthetic case studies.
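The multiresolution strategy is a loop over frequencies, each warm-started from the previous result. A schematic Python sketch under heavy assumptions: a toy linear operator per frequency stands in for the visco-elastic DG solve, and SciPy's L-BFGS-B stands in for the quasi-Newton step:

    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(0)
    n = 50
    freqs = (3.0, 5.0, 8.0)
    G = {f: rng.standard_normal((80, n)) for f in freqs}   # toy forward ops
    m_true = np.zeros(n); m_true[10:20] = 1.0
    d_obs = {f: G[f] @ m_true for f in freqs}

    def misfit(m, f):
        """Least-squares misfit and gradient (adjoint-state stand-in)."""
        r = G[f] @ m - d_obs[f]
        return 0.5 * r @ r, G[f].T @ r

    def multiscale_fwi(m0):
        m = m0
        for f in sorted(freqs):            # invert low frequencies first
            m = minimize(misfit, m, args=(f,), jac=True,
                         method='L-BFGS-B').x
        return m

    m_rec = multiscale_fwi(np.zeros(n))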
DOE Office of Scientific and Technical Information (OSTI.GOV)
2014-01-17
This library is an implementation of the Sparse Approximate Matrix Multiplication (SpAMM) algorithm. It provides a matrix data type and an approximate matrix product that exhibits linear-scaling computational complexity for matrices with decay. The product error and the performance of the multiply can be tuned by choosing an appropriate tolerance. The library can be compiled for serial execution, or for parallel execution on shared-memory systems with an OpenMP-capable compiler.
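The tolerance acts by pruning: quadrant pairs whose norm product is negligible contribute nothing and are skipped, which is where the linear scaling for matrices with decay comes from. A minimal serial sketch in Python/NumPy (square power-of-two sizes and recomputed Frobenius norms are simplifying assumptions; the library caches norms in its matrix type and parallelizes with OpenMP):

    import numpy as np

    def spamm(A, B, tau, leaf=64):
        """Approximate product: skip blocks with ||A||_F * ||B||_F < tau."""
        n = A.shape[0]
        if np.linalg.norm(A) * np.linalg.norm(B) < tau:
            return np.zeros((n, B.shape[1]))       # negligible sub-product
        if n <= leaf:
            return A @ B                           # dense kernel at leaves
        h = n // 2
        C = np.empty((n, n))
        for i in (0, 1):
            for k in (0, 1):
                C[i*h:(i+1)*h, k*h:(k+1)*h] = (
                    spamm(A[i*h:(i+1)*h, :h], B[:h, k*h:(k+1)*h], tau, leaf)
                  + spamm(A[i*h:(i+1)*h, h:], B[h:, k*h:(k+1)*h], tau, leaf))
        return C

    # Exponential decay away from the diagonal: many negligible blocks.
    n = 256
    i = np.arange(n)
    A = np.exp(-np.abs(i[:, None] - i[None, :]))
    C = spamm(A, A, tau=1e-6)
    err = np.linalg.norm(C - A @ A) / np.linalg.norm(A @ A)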
Bit error rate tester using fast parallel generation of linear recurring sequences
Pierson, Lyndon G.; Witzke, Edward L.; Maestas, Joseph H.
2003-05-06
A fast method for generating linear recurring sequences by parallel linear recurring sequence generators (LRSGs) with a feedback circuit optimized to balance minimum propagation delay against maximal sequence period. Parallel generation of linear recurring sequences requires decimating the sequence (creating small contiguous sections of the sequence in each LRSG). A companion matrix form is selected depending on whether the LFSR is right-shifting or left-shifting. The companion matrix is completed by selecting a primitive irreducible polynomial with 1's most closely grouped in a corner of the companion matrix. A decimation matrix is created by raising the companion matrix to the (n*k)th power, where k is the number of parallel LRSGs and n is the number of bits to be generated at a time by each LRSG. Companion matrices with 1's closely grouped in a corner will yield sparse decimation matrices. A feedback circuit comprised of XOR logic gates implements the decimation matrix in hardware. Sparse decimation matrices can be implemented with a minimum number of XOR gates, and therefore a minimum propagation delay through the feedback circuit. The LRSG of the invention is particularly well suited to use as a bit error rate tester on high-speed communication lines because it permits the receiver to synchronize to the transmitted pattern within 2n bits.
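The decimation-matrix construction is ordinary matrix exponentiation over GF(2). A Python sketch (the degree-8 primitive polynomial x^8 + x^4 + x^3 + x^2 + 1 and the values of k and n are illustrative assumptions; the nonzero count of the result is a proxy for the XOR-gate cost the patent minimizes):

    import numpy as np

    def companion_gf2(taps, L):
        """Companion matrix over GF(2): top row holds the feedback taps,
        the subdiagonal shifts the state."""
        C = np.zeros((L, L), dtype=np.uint8)
        C[0, :] = taps
        C[1:, :-1] = np.eye(L - 1, dtype=np.uint8)
        return C

    def gf2_matpow(M, e):
        """Square-and-multiply matrix power, reducing mod 2 at each step."""
        R = np.eye(M.shape[0], dtype=np.uint8)
        while e:
            if e & 1:
                R = (R @ M) & 1
            M = (M @ M) & 1
            e >>= 1
        return R

    # Coefficients of x^7..x^0 for x^8 + x^4 + x^3 + x^2 + 1 (primitive).
    taps = np.array([0, 0, 0, 1, 1, 1, 0, 1], dtype=np.uint8)
    C = companion_gf2(taps, 8)
    k, nbits = 4, 2                        # 4 parallel LRSGs, 2 bits each
    D = gf2_matpow(C, k * nbits)           # decimation matrix C^(n*k)
    print("nonzeros in D:", int(D.sum()))  # sparser D -> fewer XOR gates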
Tensor Sparse Coding for Positive Definite Matrices.
Sivalingam, Ravishankar; Boley, Daniel; Morellas, Vassilios; Papanikolopoulos, Nikos
2013-08-02
In recent years, there has been extensive research on sparse representation of vector-valued signals. In the matrix case, the data points are merely vectorized and treated as vectors thereafter (e.g., image patches). However, this approach cannot be used for all matrices, as it may destroy the inherent structure of the data. Symmetric positive definite (SPD) matrices constitute one such class of signals, where their implicit structure of positive eigenvalues is lost upon vectorization. This paper proposes a novel sparse coding technique for positive definite matrices, which respects the structure of the Riemannian manifold and preserves the positivity of their eigenvalues, without resorting to vectorization. Synthetic and real-world computer vision experiments with region covariance descriptors demonstrate the need for and the applicability of the new sparse coding model. This work serves to bridge the gap between the sparse modeling paradigm and the space of positive definite matrices.
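The positivity-preservation argument has a simple core: a nonnegative combination of SPD atoms is automatically positive semidefinite, so constraining the coefficients rather than vectorizing freely keeps the reconstruction on the right side of the cone. The Python sketch below illustrates that idea with a simplified surrogate objective, an l1-regularized nonnegative least-squares fit solved by projected ISTA; the paper's actual formulation measures reconstruction error with a Riemannian (LogDet-type) divergence rather than the Frobenius error used here:

    import numpy as np

    rng = np.random.default_rng(1)

    def random_spd(n):
        Q = rng.standard_normal((n, n))
        return Q @ Q.T + n * np.eye(n)          # well-conditioned SPD atom

    n, K = 5, 20
    atoms = [random_spd(n) for _ in range(K)]
    D = np.stack([A.ravel() for A in atoms], axis=1)   # vectorized atoms

    S = 0.7 * atoms[3] + 0.2 * atoms[11]        # SPD signal to encode
    s = S.ravel()

    lam = 1e-3
    step = 1.0 / np.linalg.norm(D, 2) ** 2
    c = np.zeros(K)
    for _ in range(500):
        c = c - step * (D.T @ (D @ c - s))      # gradient step on the fit
        c = np.maximum(c - step * lam, 0.0)     # shrinkage + nonnegativity

    S_hat = sum(ci * Ai for ci, Ai in zip(c, atoms))
    print(np.linalg.eigvalsh(S_hat).min())      # stays >= 0 up to round-off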
A note on the blind deconvolution of multiple sparse signals from unknown subspaces
NASA Astrophysics Data System (ADS)
Cosse, Augustin
2017-08-01
This note studies the recovery of multiple sparse signals, xn ∈ ℝL, n = 1, . . . , N, from the knowledge of their convolution with an unknown point spread function h ∈ ℝL. When the point spread function is known to be nonzero, |h[k]| > 0, this blind deconvolution problem can be relaxed into a linear, ill-posed inverse problem in the vector concatenating the unknown inputs xn together with the inverse of the filter, d ∈ ℝL, where d[k] := 1/h[k]. When prior information is given on the input subspaces, the resulting overdetermined linear system can be solved efficiently via least squares (see Ling et al. 2016). When no information is given on those subspaces, and the inputs are only known to be sparse, it still remains possible to recover these inputs along with the filter by considering an additional l1 penalty. This note certifies exact recovery of both the unknown PSF and the unknown sparse inputs, from the knowledge of their convolutions, as soon as the number of inputs N and the dimension of each input, L, satisfy L ≳ N and N ≳ Tmax², up to log factors. Here Tmax = maxn{Tn}, and Tn, n = 1, . . . , N, denote the supports of the inputs xn. Our proof system combines the recent results on blind deconvolution via least squares, which certify invertibility of the linear map encoding the convolutions, with the construction of a dual certificate following the structure first suggested in Candès et al. 2007. Unlike in these papers, however, it is not possible to rely on the norm ||(A_T* A_T)^-1|| to certify recovery. We instead use a combination of the Schur complement and Neumann series to compute an expression for the inverse (A_T* A_T)^-1. Given this expression, it is possible to show that the poorly scaled blocks in (A_T* A_T)^-1 are multiplied by the better scaled ones or vanish in the construction of the certificate. Recovery is certified with high probability on the choice of the supports and the distribution of the signs of each input xn on its support. The paper follows the line of previous work by Wang et al. 2016, where the authors guarantee recovery for subgaussian × Bernoulli inputs satisfying 𝔼|xn[k]| ∈ [1/10, 1] as soon as N ≳ L. Examples of applications include seismic imaging with unknown source or marine seismic data deghosting, magnetic resonance autocalibration, and multiple channel estimation in communication. Numerical experiments are provided along with a discussion on the sample complexity tightness.
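One standard way to see the linearization, assuming circular convolution so that the DFT matrix F diagonalizes it (the note's own indexing conventions may differ):

    \hat{y}_n = \operatorname{diag}(\hat{h})\, F x_n
    \quad\Longrightarrow\quad
    \operatorname{diag}(\hat{y}_n)\, d - F x_n = 0,
    \qquad d[k] := 1/\hat{h}[k],

which is jointly linear in the concatenated unknown (d, x_1, ..., x_N). Ill-posedness is then visible directly: the system is homogeneous, so the zero solution and the scaling ambiguity (alpha*d, alpha*x_n) must be excluded by a normalization constraint.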
Comparison of Compressed Sensing Algorithms for Inversion of 3-D Electrical Resistivity Tomography.
NASA Astrophysics Data System (ADS)
Peddinti, S. R.; Ranjan, S.; Kbvn, D. P.
2016-12-01
Image reconstruction in electrical resistivity tomography (ERT) is a highly non-linear, sparse, and ill-posed problem. The inverse problem is much more severe when dealing with 3-D datasets, which result in large matrices. Conventional gradient-based techniques using L2-norm minimization with some form of regularization can impose a smoothness constraint on the solution. Compressed sensing (CS) is a relatively new technique that takes advantage of the inherent sparsity of the parameter space in one form or another. If favorable conditions are met, CS has been shown to be an efficient image reconstruction technique that uses limited observations without losing edge sharpness. This paper deals with the development of an open-source 3-D resistivity inversion tool using the CS framework. The forward model was adopted from RESINVM3D (Pidlisecky et al., 2007) with CS as the inverse code. A discrete cosine transform (DCT) was used to induce model sparsity in orthogonal form. Two CS-based algorithms, viz. the interior point method and two-step IST, were evaluated on a synthetic layered model with surface electrode observations. The algorithms were tested (in terms of quality and convergence) under varying degrees of parameter heterogeneity, model refinement, and reduced observation data space. In comparison to conventional gradient algorithms, CS effectively reconstructed the sub-surface image at less computational cost. This was observed as a general increase in NRMSE from 0.5 in 10 iterations using the gradient algorithm to 0.8 in 5 iterations using the CS algorithms.
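An IST-type CS inversion alternates a gradient step on the data misfit with soft-thresholding of the DCT coefficients. A minimal Python sketch under stated assumptions: the matrix J is a random stand-in for the linearized ERT sensitivity operator, and the plain IST iteration is shown (the two-step IST variant adds a momentum term):

    import numpy as np
    from scipy.fft import idct

    rng = np.random.default_rng(2)
    n, m = 256, 100
    J = rng.standard_normal((m, n)) / np.sqrt(m)   # stand-in sensitivity
    z_true = np.zeros(n)
    z_true[[3, 17, 40]] = [2.0, -1.5, 1.0]         # DCT-sparse model
    m_true = idct(z_true, norm='ortho')
    d = J @ m_true

    Psi = idct(np.eye(n), norm='ortho', axis=0)    # inverse-DCT basis matrix
    A = J @ Psi                                    # effective sensing matrix
    lam = 0.01
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    z = np.zeros(n)
    for _ in range(300):
        z = z - step * (A.T @ (A @ z - d))                       # gradient
        z = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0)   # shrinkage

    m_rec = Psi @ z                                # back to model space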
Scalability improvements to NRLMOL for DFT calculations of large molecules
NASA Astrophysics Data System (ADS)
Diaz, Carlos Manuel
Advances in high performance computing (HPC) have provided a way to treat large, computationally demanding tasks using thousands of processors. With the development of more powerful HPC architectures, the need to create efficient and scalable code has grown more important. Electronic structure calculations are valuable in understanding experimental observations and are routinely used for predicting new materials. For these calculations, the memory and computation time grow with the number of atoms; memory requirements scale as N², where N is the number of atoms. While recent advances in HPC offer platforms with large numbers of cores, the limited amount of memory available on a given node and the poor scalability of the electronic structure code hinder efficient usage of these platforms. This thesis presents developments to overcome these bottlenecks in order to study large systems. These developments, implemented in the NRLMOL electronic structure code, involve the use of sparse matrix storage formats and of linear algebra on sparse and distributed matrices. Together with other related developments, they now allow ground-state density functional calculations using up to 25,000 basis functions and excited-state calculations using up to 17,000 basis functions while utilizing all cores on a node. An example on a light-harvesting triad molecule is described. Finally, future plans to further improve the scalability are presented.
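The memory argument is easy to make concrete. A small Python/SciPy illustration (the size and fill fraction are invented for illustration; NRLMOL itself is a Fortran code): a dense 17,000 x 17,000 double-precision matrix needs roughly 2.3 GB, while compressed sparse row (CSR) storage keeps only the nonzeros plus two index arrays:

    import numpy as np
    from scipy.sparse import random as sprandom

    n = 17_000
    A = sprandom(n, n, density=0.01, format='csr', dtype=np.float64)
    dense_bytes = n * n * 8                        # full N^2 storage
    csr_bytes = A.data.nbytes + A.indices.nbytes + A.indptr.nbytes
    print(f"dense: {dense_bytes/1e9:.1f} GB, CSR: {csr_bytes/1e9:.2f} GB")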
Efficient Sparse Signal Transmission over a Lossy Link Using Compressive Sensing
Wu, Liantao; Yu, Kai; Cao, Dongyu; Hu, Yuhen; Wang, Zhi
2015-01-01
Reliable data transmission over a lossy communication link is expensive due to the overhead of error protection. For signals that have inherent sparse structures, compressive sensing (CS) is applied to facilitate efficient sparse signal transmission over lossy communication links without data compression or error protection. The natural packet loss in the lossy link is modeled as a random sampling process of the transmitted data, and the original signal is reconstructed from the lossy transmission results using a CS-based reconstruction method at the receiving end. The impact of packet length on transmission efficiency under different channel conditions is discussed, and interleaving is incorporated to mitigate the impact of burst data loss. Extensive simulations and experiments have been conducted and compared to the traditional automatic repeat request (ARQ) interpolation technique, and very favorable results have been observed in terms of both the accuracy of the reconstructed signals and the transmission energy consumption. Furthermore, the packet length effect provides useful insights for using compressed sensing for efficient sparse signal transmission via lossy links. PMID:26287195
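Under the loss-as-sampling view, nothing is retransmitted: whichever samples arrive act as a random measurement of a signal that is sparse in some basis, and the receiver solves a sparse recovery problem. A minimal Python sketch (DCT sparsity, an i.i.d. 30% sample loss, and greedy orthogonal matching pursuit are illustrative assumptions; the paper works with packet-level losses and adds interleaving against bursts):

    import numpy as np
    from scipy.fft import idct
    from sklearn.linear_model import OrthogonalMatchingPursuit

    rng = np.random.default_rng(3)
    n = 512
    z = np.zeros(n)
    z[rng.choice(n, 12, replace=False)] = rng.standard_normal(12)
    x = idct(z, norm='ortho')                  # DCT-sparse signal to send

    kept = np.sort(rng.choice(n, int(0.7 * n), replace=False))  # 30% lost
    Psi = idct(np.eye(n), norm='ortho', axis=0)                 # basis
    omp = OrthogonalMatchingPursuit(n_nonzero_coefs=12,
                                    fit_intercept=False)
    omp.fit(Psi[kept], x[kept])                # recover from what arrived
    x_rec = Psi @ omp.coef_
    print(np.linalg.norm(x_rec - x) / np.linalg.norm(x))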
Zang, Yuguo; Kammerer, Bernd; Eisenkolb, Maike; Lohr, Katrin; Kiefer, Hans
2011-01-01
Crystallization conditions of an intact monoclonal IgG4 (immunoglobulin G, subclass 4) antibody were established in vapor diffusion mode by sparse matrix screening and subsequent optimization. The procedure was transferred to microbatch conditions and a phase diagram was built showing surprisingly low solubility of the antibody at equilibrium. With up-scaling to process scale in mind, purification efficiency of the crystallization step was investigated. Added model protein contaminants were excluded from the crystals to more than 95%. No measurable loss of Fc-binding activity was observed in the crystallized and redissolved antibody. Conditions could be adapted to crystallize the antibody directly from concentrated and diafiltrated cell culture supernatant, showing purification efficiency similar to that of Protein A chromatography. We conclude that crystallization has the potential to be included in downstream processing as a low-cost purification or formulation step. PMID:21966480
NASA Technical Reports Server (NTRS)
Whetstone, W. D.
1976-01-01
The functions and operating rules of the SPAR system, a group of computer programs used primarily to perform stress, buckling, and vibrational analyses of linear finite element systems, are given. The following subject areas are discussed: basic information, structure definition, the format system, matrix processors, utility programs, static solutions, stresses, the sparse matrix eigensolver, dynamic response, graphics, and substructure processors.
Matrix Recipes for Hard Thresholding Methods
2012-11-07
Multiple crack detection in 3D using a stable XFEM and global optimization
NASA Astrophysics Data System (ADS)
Agathos, Konstantinos; Chatzi, Eleni; Bordas, Stéphane P. A.
2018-02-01
A numerical scheme is proposed for the detection of multiple cracks in three-dimensional (3D) structures. The scheme is based on a variant of the extended finite element method (XFEM) and a hybrid optimizer solution. The proposed XFEM variant is particularly well suited to the simulation of 3D fracture problems, and as such serves as an efficient solution to the so-called forward problem. A set of heuristic optimization algorithms is recombined into a multiscale optimization scheme. The introduced approach proves effective in tackling the complex inverse problem involved, where identification of multiple flaws is sought on the basis of sparse measurements collected near the structural boundary. The potential of the scheme is demonstrated through a set of numerical case studies of varying complexity.
Near real-time estimation of the seismic source parameters in a compressed domain
NASA Astrophysics Data System (ADS)
Rodriguez, Ismael A. Vera
A seismic event can be characterized by its origin time, location, and moment tensor. Fast estimation of these source parameters is important in areas of geophysics such as earthquake seismology and the monitoring of seismic activity produced by volcanoes, mining operations, and hydraulic injections in geothermal and oil and gas reservoirs. Most available monitoring systems estimate the source parameters in a sequential procedure: first determining origin time and location (e.g., epicentre, hypocentre, or centroid of the stress glut density), and then using this information to initialize the evaluation of the moment tensor. A more efficient estimation of the source parameters requires a concurrent evaluation of the three variables. The main objective of the present thesis is to address the simultaneous estimation of origin time, location, and moment tensor of seismic events. The proposed method has the benefits of being 1) automatic, 2) continuous, and 3) capable, depending on the scale of application, of providing results in real time or near real time. The inversion algorithm is based on theoretical results from sparse representation theory and compressive sensing. The feasibility of implementation is determined through the analysis of synthetic and real data examples. The numerical experiments focus on the microseismic monitoring of hydraulic fractures in oil and gas wells; however, an example using real earthquake data is also presented for validation. The thesis is complemented with a resolvability analysis of the moment tensor, targeting common monitoring geometries employed in hydraulic fracturing in oil wells. Additionally, an application of sparse representation theory to the denoising of one-component and three-component microseismicity records is presented, together with an algorithm for improved automatic time-picking using non-linear inversion constraints.
NASA Astrophysics Data System (ADS)
Gerber, Florian; Mösinger, Kaspar; Furrer, Reinhard
2017-07-01
Software packages for spatial data often implement a hybrid approach of interpreted and compiled programming languages. The compiled parts are usually written in C, C++, or Fortran and are efficient in terms of computational speed and memory usage. Conversely, the interpreted part serves as a convenient user interface and calls the compiled code for computationally demanding operations. The price paid for the user friendliness of the interpreted component is, besides performance, the limited access to low-level and optimized code. An example of such a restriction is the 64-bit vector support of the widely used statistical language R. On the R side, users do not need to change existing code and may not even notice the extension. On the other hand, interfacing 64-bit compiled code efficiently is challenging. Since many R packages for spatial data could benefit from 64-bit vectors, we investigate strategies to efficiently pass 64-bit vectors to compiled languages. More precisely, we show how to simply extend existing R packages using the foreign function interface to seamlessly support 64-bit vectors. This extension is shown with the sparse matrix algebra R package spam. The new capabilities are illustrated with an example of GIMMS NDVI3g data featuring a parametric modeling approach for a non-stationary covariance matrix.