sparse matrix algorithms: Topics by Science.gov

Sample records for sparse matrix algorithms

Efficient sparse matrix-matrix multiplication for computing periodic responses by shooting method on Intel Xeon Phi

NASA Astrophysics Data System (ADS)

Stoykov, S.; Atanassov, E.; Margenov, S.

2016-10-01

Many of the scientific applications involve sparse or dense matrix operations, such as solving linear systems, matrix-matrix products, eigensolvers, etc. In what concerns structural nonlinear dynamics, the computations of periodic responses and the determination of stability of the solution are of primary interest. Shooting method iswidely used for obtaining periodic responses of nonlinear systems. The method involves simultaneously operations with sparse and dense matrices. One of the computationally expensive operations in the method is multiplication of sparse by dense matrices. In the current work, a new algorithm for sparse matrix by dense matrix products is presented. The algorithm takes into account the structure of the sparse matrix, which is obtained by space discretization of the nonlinear Mindlin's plate equation of motion by the finite element method. The algorithm is developed to use the vector engine of Intel Xeon Phi coprocessors. It is compared with the standard sparse matrix by dense matrix algorithm and the one developed by Intel MKL and it is shown that by considering the properties of the sparse matrix better algorithms can be developed.
Brief announcement: Hypergraph parititioning for parallel sparse matrix-matrix multiplication

DOE PAGES

Ballard, Grey; Druinsky, Alex; Knight, Nicholas; ...

2015-01-01

The performance of parallel algorithms for sparse matrix-matrix multiplication is typically determined by the amount of interprocessor communication performed, which in turn depends on the nonzero structure of the input matrices. In this paper, we characterize the communication cost of a sparse matrix-matrix multiplication algorithm in terms of the size of a cut of an associated hypergraph that encodes the computation for a given input nonzero structure. Obtaining an optimal algorithm corresponds to solving a hypergraph partitioning problem. Furthermore, our hypergraph model generalizes several existing models for sparse matrix-vector multiplication, and we can leverage hypergraph partitioners developed for that computationmore » to improve application-specific algorithms for multiplying sparse matrices.« less
Multi-threaded Sparse Matrix Sparse Matrix Multiplication for Many-Core and GPU Architectures.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Deveci, Mehmet; Trott, Christian Robert; Rajamanickam, Sivasankaran

Sparse Matrix-Matrix multiplication is a key kernel that has applications in several domains such as scientific computing and graph analysis. Several algorithms have been studied in the past for this foundational kernel. In this paper, we develop parallel algorithms for sparse matrix- matrix multiplication with a focus on performance portability across different high performance computing architectures. The performance of these algorithms depend on the data structures used in them. We compare different types of accumulators in these algorithms and demonstrate the performance difference between these data structures. Furthermore, we develop a meta-algorithm, kkSpGEMM, to choose the right algorithm and datamore » structure based on the characteristics of the problem. We show performance comparisons on three architectures and demonstrate the need for the community to develop two phase sparse matrix-matrix multiplication implementations for efficient reuse of the data structures involved.« less
Multi-threaded Sparse Matrix-Matrix Multiplication for Many-Core and GPU Architectures.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Deveci, Mehmet; Rajamanickam, Sivasankaran; Trott, Christian Robert

Sparse Matrix-Matrix multiplication is a key kernel that has applications in several domains such as scienti c computing and graph analysis. Several algorithms have been studied in the past for this foundational kernel. In this paper, we develop parallel algorithms for sparse matrix-matrix multiplication with a focus on performance portability across different high performance computing architectures. The performance of these algorithms depend on the data structures used in them. We compare different types of accumulators in these algorithms and demonstrate the performance difference between these data structures. Furthermore, we develop a meta-algorithm, kkSpGEMM, to choose the right algorithm and datamore » structure based on the characteristics of the problem. We show performance comparisons on three architectures and demonstrate the need for the community to develop two phase sparse matrix-matrix multiplication implementations for efficient reuse of the data structures involved.« less
Sparse non-negative matrix factorizations via alternating non-negativity-constrained least squares for microarray data analysis.

PubMed

Kim, Hyunsoo; Park, Haesun

2007-06-15

Many practical pattern recognition problems require non-negativity constraints. For example, pixels in digital images and chemical concentrations in bioinformatics are non-negative. Sparse non-negative matrix factorizations (NMFs) are useful when the degree of sparseness in the non-negative basis matrix or the non-negative coefficient matrix in an NMF needs to be controlled in approximating high-dimensional data in a lower dimensional space. In this article, we introduce a novel formulation of sparse NMF and show how the new formulation leads to a convergent sparse NMF algorithm via alternating non-negativity-constrained least squares. We apply our sparse NMF algorithm to cancer-class discovery and gene expression data analysis and offer biological analysis of the results obtained. Our experimental results illustrate that the proposed sparse NMF algorithm often achieves better clustering performance with shorter computing time compared to other existing NMF algorithms. The software is available as supplementary material.
Multi scales based sparse matrix spectral clustering image segmentation

NASA Astrophysics Data System (ADS)

Liu, Zhongmin; Chen, Zhicai; Li, Zhanming; Hu, Wenjin

2018-04-01

In image segmentation, spectral clustering algorithms have to adopt the appropriate scaling parameter to calculate the similarity matrix between the pixels, which may have a great impact on the clustering result. Moreover, when the number of data instance is large, computational complexity and memory use of the algorithm will greatly increase. To solve these two problems, we proposed a new spectral clustering image segmentation algorithm based on multi scales and sparse matrix. We devised a new feature extraction method at first, then extracted the features of image on different scales, at last, using the feature information to construct sparse similarity matrix which can improve the operation efficiency. Compared with traditional spectral clustering algorithm, image segmentation experimental results show our algorithm have better degree of accuracy and robustness.
FPGA implementation of sparse matrix algorithm for information retrieval

NASA Astrophysics Data System (ADS)

Bojanic, Slobodan; Jevtic, Ruzica; Nieto-Taladriz, Octavio

2005-06-01

Information text data retrieval requires a tremendous amount of processing time because of the size of the data and the complexity of information retrieval algorithms. In this paper the solution to this problem is proposed via hardware supported information retrieval algorithms. Reconfigurable computing may adopt frequent hardware modifications through its tailorable hardware and exploits parallelism for a given application through reconfigurable and flexible hardware units. The degree of the parallelism can be tuned for data. In this work we implemented standard BLAS (basic linear algebra subprogram) sparse matrix algorithm named Compressed Sparse Row (CSR) that is showed to be more efficient in terms of storage space requirement and query-processing timing over the other sparse matrix algorithms for information retrieval application. Although inverted index algorithm is treated as the de facto standard for information retrieval for years, an alternative approach to store the index of text collection in a sparse matrix structure gains more attention. This approach performs query processing using sparse matrix-vector multiplication and due to parallelization achieves a substantial efficiency over the sequential inverted index. The parallel implementations of information retrieval kernel are presented in this work targeting the Virtex II Field Programmable Gate Arrays (FPGAs) board from Xilinx. A recent development in scientific applications is the use of FPGA to achieve high performance results. Computational results are compared to implementations on other platforms. The design achieves a high level of parallelism for the overall function while retaining highly optimised hardware within processing unit.
Algorithms and Application of Sparse Matrix Assembly and Equation Solvers for Aeroacoustics

NASA Technical Reports Server (NTRS)

Watson, W. R.; Nguyen, D. T.; Reddy, C. J.; Vatsa, V. N.; Tang, W. H.

2001-01-01

An algorithm for symmetric sparse equation solutions on an unstructured grid is described. Efficient, sequential sparse algorithms for degree-of-freedom reordering, supernodes, symbolic/numerical factorization, and forward backward solution phases are reviewed. Three sparse algorithms for the generation and assembly of symmetric systems of matrix equations are presented. The accuracy and numerical performance of the sequential version of the sparse algorithms are evaluated over the frequency range of interest in a three-dimensional aeroacoustics application. Results show that the solver solutions are accurate using a discretization of 12 points per wavelength. Results also show that the first assembly algorithm is impractical for high-frequency noise calculations. The second and third assembly algorithms have nearly equal performance at low values of source frequencies, but at higher values of source frequencies the third algorithm saves CPU time and RAM. The CPU time and the RAM required by the second and third assembly algorithms are two orders of magnitude smaller than that required by the sparse equation solver. A sequential version of these sparse algorithms can, therefore, be conveniently incorporated into a substructuring for domain decomposition formulation to achieve parallel computation, where different substructures are handles by different parallel processors.
Computing row and column counts for sparse QR and LU factorization

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gilbert, John R.; Li, Xiaoye S.; Ng, Esmond G.

2001-01-01

We present algorithms to determine the number of nonzeros in each row and column of the factors of a sparse matrix, for both the QR factorization and the LU factorization with partial pivoting. The algorithms use only the nonzero structure of the input matrix, and run in time nearly linear in the number of nonzeros in that matrix. They may be used to set up data structures or schedule parallel operations in advance of the numerical factorization. The row and column counts we compute are upper bounds on the actual counts. If the input matrix is strong Hall and theremore » is no coincidental numerical cancellation, the counts are exact for QR factorization and are the tightest bounds possible for LU factorization. These algorithms are based on our earlier work on computing row and column counts for sparse Cholesky factorization, plus an efficient method to compute the column elimination tree of a sparse matrix without explicitly forming the product of the matrix and its transpose.« less
Matched field localization based on CS-MUSIC algorithm

NASA Astrophysics Data System (ADS)

Guo, Shuangle; Tang, Ruichun; Peng, Linhui; Ji, Xiaopeng

2016-04-01

The problem caused by shortness or excessiveness of snapshots and by coherent sources in underwater acoustic positioning is considered. A matched field localization algorithm based on CS-MUSIC (Compressive Sensing Multiple Signal Classification) is proposed based on the sparse mathematical model of the underwater positioning. The signal matrix is calculated through the SVD (Singular Value Decomposition) of the observation matrix. The observation matrix in the sparse mathematical model is replaced by the signal matrix, and a new concise sparse mathematical model is obtained, which means not only the scale of the localization problem but also the noise level is reduced; then the new sparse mathematical model is solved by the CS-MUSIC algorithm which is a combination of CS (Compressive Sensing) method and MUSIC (Multiple Signal Classification) method. The algorithm proposed in this paper can overcome effectively the difficulties caused by correlated sources and shortness of snapshots, and it can also reduce the time complexity and noise level of the localization problem by using the SVD of the observation matrix when the number of snapshots is large, which will be proved in this paper.
Sparse matrix methods based on orthogonality and conjugacy

NASA Technical Reports Server (NTRS)

Lawson, C. L.

1973-01-01

A matrix having a high percentage of zero elements is called spares. In the solution of systems of linear equations or linear least squares problems involving large sparse matrices, significant saving of computer cost can be achieved by taking advantage of the sparsity. The conjugate gradient algorithm and a set of related algorithms are described.
Sparse subspace clustering for data with missing entries and high-rank matrix completion.

PubMed

Fan, Jicong; Chow, Tommy W S

2017-09-01

Many methods have recently been proposed for subspace clustering, but they are often unable to handle incomplete data because of missing entries. Using matrix completion methods to recover missing entries is a common way to solve the problem. Conventional matrix completion methods require that the matrix should be of low-rank intrinsically, but most matrices are of high-rank or even full-rank in practice, especially when the number of subspaces is large. In this paper, a new method called Sparse Representation with Missing Entries and Matrix Completion is proposed to solve the problems of incomplete-data subspace clustering and high-rank matrix completion. The proposed algorithm alternately computes the matrix of sparse representation coefficients and recovers the missing entries of a data matrix. The proposed algorithm recovers missing entries through minimizing the representation coefficients, representation errors, and matrix rank. Thorough experimental study and comparative analysis based on synthetic data and natural images were conducted. The presented results demonstrate that the proposed algorithm is more effective in subspace clustering and matrix completion compared with other existing methods. Copyright © 2017 Elsevier Ltd. All rights reserved.
Sparse Matrix for ECG Identification with Two-Lead Features.

PubMed

Tseng, Kuo-Kun; Luo, Jiao; Hegarty, Robert; Wang, Wenmin; Haiting, Dong

2015-01-01

Electrocardiograph (ECG) human identification has the potential to improve biometric security. However, improvements in ECG identification and feature extraction are required. Previous work has focused on single lead ECG signals. Our work proposes a new algorithm for human identification by mapping two-lead ECG signals onto a two-dimensional matrix then employing a sparse matrix method to process the matrix. And that is the first application of sparse matrix techniques for ECG identification. Moreover, the results of our experiments demonstrate the benefits of our approach over existing methods.
Highly parallel sparse Cholesky factorization

NASA Technical Reports Server (NTRS)

Gilbert, John R.; Schreiber, Robert

1990-01-01

Several fine grained parallel algorithms were developed and compared to compute the Cholesky factorization of a sparse matrix. The experimental implementations are on the Connection Machine, a distributed memory SIMD machine whose programming model conceptually supplies one processor per data element. In contrast to special purpose algorithms in which the matrix structure conforms to the connection structure of the machine, the focus is on matrices with arbitrary sparsity structure. The most promising algorithm is one whose inner loop performs several dense factorizations simultaneously on a 2-D grid of processors. Virtually any massively parallel dense factorization algorithm can be used as the key subroutine. The sparse code attains execution rates comparable to those of the dense subroutine. Although at present architectural limitations prevent the dense factorization from realizing its potential efficiency, it is concluded that a regular data parallel architecture can be used efficiently to solve arbitrarily structured sparse problems. A performance model is also presented and it is used to analyze the algorithms.
A Spectral Algorithm for Envelope Reduction of Sparse Matrices

NASA Technical Reports Server (NTRS)

Barnard, Stephen T.; Pothen, Alex; Simon, Horst D.

1993-01-01

The problem of reordering a sparse symmetric matrix to reduce its envelope size is considered. A new spectral algorithm for computing an envelope-reducing reordering is obtained by associating a Laplacian matrix with the given matrix and then sorting the components of a specified eigenvector of the Laplacian. This Laplacian eigenvector solves a continuous relaxation of a discrete problem related to envelope minimization called the minimum 2-sum problem. The permutation vector computed by the spectral algorithm is a closest permutation vector to the specified Laplacian eigenvector. Numerical results show that the new reordering algorithm usually computes smaller envelope sizes than those obtained from the current standard algorithms such as Gibbs-Poole-Stockmeyer (GPS) or SPARSPAK reverse Cuthill-McKee (RCM), in some cases reducing the envelope by more than a factor of two.
Sparse Regression as a Sparse Eigenvalue Problem

NASA Technical Reports Server (NTRS)

Moghaddam, Baback; Gruber, Amit; Weiss, Yair; Avidan, Shai

2008-01-01

We extend the l0-norm "subspectral" algorithms for sparse-LDA [5] and sparse-PCA [6] to general quadratic costs such as MSE in linear (kernel) regression. The resulting "Sparse Least Squares" (SLS) problem is also NP-hard, by way of its equivalence to a rank-1 sparse eigenvalue problem (e.g., binary sparse-LDA [7]). Specifically, for a general quadratic cost we use a highly-efficient technique for direct eigenvalue computation using partitioned matrix inverses which leads to dramatic x103 speed-ups over standard eigenvalue decomposition. This increased efficiency mitigates the O(n4) scaling behaviour that up to now has limited the previous algorithms' utility for high-dimensional learning problems. Moreover, the new computation prioritizes the role of the less-myopic backward elimination stage which becomes more efficient than forward selection. Similarly, branch-and-bound search for Exact Sparse Least Squares (ESLS) also benefits from partitioned matrix inverse techniques. Our Greedy Sparse Least Squares (GSLS) generalizes Natarajan's algorithm [9] also known as Order-Recursive Matching Pursuit (ORMP). Specifically, the forward half of GSLS is exactly equivalent to ORMP but more efficient. By including the backward pass, which only doubles the computation, we can achieve lower MSE than ORMP. Experimental comparisons to the state-of-the-art LARS algorithm [3] show forward-GSLS is faster, more accurate and more flexible in terms of choice of regularization
Computing sparse derivatives and consecutive zeros problem

NASA Astrophysics Data System (ADS)

Chandra, B. V. Ravi; Hossain, Shahadat

2013-02-01

We describe a substitution based sparse Jacobian matrix determination method using algorithmic differentiation. Utilizing the a priori known sparsity pattern, a compression scheme is determined using graph coloring. The "compressed pattern" of the Jacobian matrix is then reordered into a form suitable for computation by substitution. We show that the column reordering of the compressed pattern matrix (so as to align the zero entries into consecutive locations in each row) can be viewed as a variant of traveling salesman problem. Preliminary computational results show that on the test problems the performance of nearest-neighbor type heuristic algorithms is highly encouraging.
Algorithms for solving large sparse systems of simultaneous linear equations on vector processors

NASA Technical Reports Server (NTRS)

David, R. E.

1984-01-01

Very efficient algorithms for solving large sparse systems of simultaneous linear equations have been developed for serial processing computers. These involve a reordering of matrix rows and columns in order to obtain a near triangular pattern of nonzero elements. Then an LU factorization is developed to represent the matrix inverse in terms of a sequence of elementary Gaussian eliminations, or pivots. In this paper it is shown how these algorithms are adapted for efficient implementation on vector processors. Results obtained on the CYBER 200 Model 205 are presented for a series of large test problems which show the comparative advantages of the triangularization and vector processing algorithms.
Efficient parallel linear scaling construction of the density matrix for Born-Oppenheimer molecular dynamics.

PubMed

Mniszewski, S M; Cawkwell, M J; Wall, M E; Mohd-Yusof, J; Bock, N; Germann, T C; Niklasson, A M N

2015-10-13

We present an algorithm for the calculation of the density matrix that for insulators scales linearly with system size and parallelizes efficiently on multicore, shared memory platforms with small and controllable numerical errors. The algorithm is based on an implementation of the second-order spectral projection (SP2) algorithm [ Niklasson, A. M. N. Phys. Rev. B 2002 , 66 , 155115 ] in sparse matrix algebra with the ELLPACK-R data format. We illustrate the performance of the algorithm within self-consistent tight binding theory by total energy calculations of gas phase poly(ethylene) molecules and periodic liquid water systems containing up to 15,000 atoms on up to 16 CPU cores. We consider algorithm-specific performance aspects, such as local vs nonlocal memory access and the degree of matrix sparsity. Comparisons to sparse matrix algebra implementations using off-the-shelf libraries on multicore CPUs, graphics processing units (GPUs), and the Intel many integrated core (MIC) architecture are also presented. The accuracy and stability of the algorithm are illustrated with long duration Born-Oppenheimer molecular dynamics simulations of 1000 water molecules and a 303 atom Trp cage protein solvated by 2682 water molecules.
[Formula: see text]-regularized recursive total least squares based sparse system identification for the error-in-variables.

PubMed

Lim, Jun-Seok; Pang, Hee-Suk

2016-01-01

In this paper an [Formula: see text]-regularized recursive total least squares (RTLS) algorithm is considered for the sparse system identification. Although recursive least squares (RLS) has been successfully applied in sparse system identification, the estimation performance in RLS based algorithms becomes worse, when both input and output are contaminated by noise (the error-in-variables problem). We proposed an algorithm to handle the error-in-variables problem. The proposed [Formula: see text]-RTLS algorithm is an RLS like iteration using the [Formula: see text] regularization. The proposed algorithm not only gives excellent performance but also reduces the required complexity through the effective inversion matrix handling. Simulations demonstrate the superiority of the proposed [Formula: see text]-regularized RTLS for the sparse system identification setting.

Massively parallel sparse matrix function calculations with NTPoly

NASA Astrophysics Data System (ADS)

Dawson, William; Nakajima, Takahito

2018-04-01

We present NTPoly, a massively parallel library for computing the functions of sparse, symmetric matrices. The theory of matrix functions is a well developed framework with a wide range of applications including differential equations, graph theory, and electronic structure calculations. One particularly important application area is diagonalization free methods in quantum chemistry. When the input and output of the matrix function are sparse, methods based on polynomial expansions can be used to compute matrix functions in linear time. We present a library based on these methods that can compute a variety of matrix functions. Distributed memory parallelization is based on a communication avoiding sparse matrix multiplication algorithm. OpenMP task parallellization is utilized to implement hybrid parallelization. We describe NTPoly's interface and show how it can be integrated with programs written in many different programming languages. We demonstrate the merits of NTPoly by performing large scale calculations on the K computer.
Task Parallel Incomplete Cholesky Factorization using 2D Partitioned-Block Layout

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kim, Kyungjoo; Rajamanickam, Sivasankaran; Stelle, George Widgery

We introduce a task-parallel algorithm for sparse incomplete Cholesky factorization that utilizes a 2D sparse partitioned-block layout of a matrix. Our factorization algorithm follows the idea of algorithms-by-blocks by using the block layout. The algorithm-byblocks approach induces a task graph for the factorization. These tasks are inter-related to each other through their data dependences in the factorization algorithm. To process the tasks on various manycore architectures in a portable manner, we also present a portable tasking API that incorporates different tasking backends and device-specific features using an open-source framework for manycore platforms i.e., Kokkos. A performance evaluation is presented onmore » both Intel Sandybridge and Xeon Phi platforms for matrices from the University of Florida sparse matrix collection to illustrate merits of the proposed task-based factorization. Experimental results demonstrate that our task-parallel implementation delivers about 26.6x speedup (geometric mean) over single-threaded incomplete Choleskyby- blocks and 19.2x speedup over serial Cholesky performance which does not carry tasking overhead using 56 threads on the Intel Xeon Phi processor for sparse matrices arising from various application problems.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)

Chow, Edmond

Solving sparse problems is at the core of many DOE computational science applications. We focus on the challenge of developing sparse algorithms that can fully exploit the parallelism in extreme-scale computing systems, in particular systems with massive numbers of cores per node. Our approach is to express a sparse matrix factorization as a large number of bilinear constraint equations, and then solving these equations via an asynchronous iterative method. The unknowns in these equations are the matrix entries of the factorization that is desired.
Communication Optimal Parallel Multiplication of Sparse Random Matrices

DTIC Science & Technology

2013-02-21

Definition 2.1), and (2) the algorithm is sparsity- independent, where the computation is statically partitioned to processors independent of the sparsity...struc- ture of the input matrices (see Definition 2.5). The second assumption applies to nearly all existing al- gorithms for general sparse matrix-matrix...where A and B are n× n ER(d) matrices: Definition 2.1 An ER(d) matrix is an adjacency matrix of an Erdős-Rényi graph with parameters n and d/n. That
Exploiting Multiple Levels of Parallelism in Sparse Matrix-Matrix Multiplication

DOE PAGES

Azad, Ariful; Ballard, Grey; Buluc, Aydin; ...

2016-11-08

Sparse matrix-matrix multiplication (or SpGEMM) is a key primitive for many high-performance graph algorithms as well as for some linear solvers, such as algebraic multigrid. The scaling of existing parallel implementations of SpGEMM is heavily bound by communication. Even though 3D (or 2.5D) algorithms have been proposed and theoretically analyzed in the flat MPI model on Erdös-Rényi matrices, those algorithms had not been implemented in practice and their complexities had not been analyzed for the general case. In this work, we present the first implementation of the 3D SpGEMM formulation that exploits multiple (intranode and internode) levels of parallelism, achievingmore » significant speedups over the state-of-the-art publicly available codes at all levels of concurrencies. We extensively evaluate our implementation and identify bottlenecks that should be subject to further research.« less
A sparse matrix-vector multiplication based algorithm for accurate density matrix computations on systems of millions of atoms

NASA Astrophysics Data System (ADS)

Ghale, Purnima; Johnson, Harley T.

2018-06-01

We present an efficient sparse matrix-vector (SpMV) based method to compute the density matrix P from a given Hamiltonian in electronic structure computations. Our method is a hybrid approach based on Chebyshev-Jackson approximation theory and matrix purification methods like the second order spectral projection purification (SP2). Recent methods to compute the density matrix scale as O(N) in the number of floating point operations but are accompanied by large memory and communication overhead, and they are based on iterative use of the sparse matrix-matrix multiplication kernel (SpGEMM), which is known to be computationally irregular. In addition to irregularity in the sparse Hamiltonian H, the nonzero structure of intermediate estimates of P depends on products of H and evolves over the course of computation. On the other hand, an expansion of the density matrix P in terms of Chebyshev polynomials is straightforward and SpMV based; however, the resulting density matrix may not satisfy the required constraints exactly. In this paper, we analyze the strengths and weaknesses of the Chebyshev-Jackson polynomials and the second order spectral projection purification (SP2) method, and propose to combine them so that the accurate density matrix can be computed using the SpMV computational kernel only, and without having to store the density matrix P. Our method accomplishes these objectives by using the Chebyshev polynomial estimate as the initial guess for SP2, which is followed by using sparse matrix-vector multiplications (SpMVs) to replicate the behavior of the SP2 algorithm for purification. We demonstrate the method on a tight-binding model system of an oxide material containing more than 3 million atoms. In addition, we also present the predicted behavior of our method when applied to near-metallic Hamiltonians with a wide energy spectrum.
Sparse Covariance Matrix Estimation by DCA-Based Algorithms.

PubMed

Phan, Duy Nhat; Le Thi, Hoai An; Dinh, Tao Pham

2017-11-01

This letter proposes a novel approach using the [Formula: see text]-norm regularization for the sparse covariance matrix estimation (SCME) problem. The objective function of SCME problem is composed of a nonconvex part and the [Formula: see text] term, which is discontinuous and difficult to tackle. Appropriate DC (difference of convex functions) approximations of [Formula: see text]-norm are used that result in approximation SCME problems that are still nonconvex. DC programming and DCA (DC algorithm), powerful tools in nonconvex programming framework, are investigated. Two DC formulations are proposed and corresponding DCA schemes developed. Two applications of the SCME problem that are considered are classification via sparse quadratic discriminant analysis and portfolio optimization. A careful empirical experiment is performed through simulated and real data sets to study the performance of the proposed algorithms. Numerical results showed their efficiency and their superiority compared with seven state-of-the-art methods.
Research on sparse feature matching of improved RANSAC algorithm

NASA Astrophysics Data System (ADS)

Kong, Xiangsi; Zhao, Xian

2018-04-01

In this paper, a sparse feature matching method based on modified RANSAC algorithm is proposed to improve the precision and speed. Firstly, the feature points of the images are extracted using the SIFT algorithm. Then, the image pair is matched roughly by generating SIFT feature descriptor. At last, the precision of image matching is optimized by the modified RANSAC algorithm,. The RANSAC algorithm is improved from three aspects: instead of the homography matrix, this paper uses the fundamental matrix generated by the 8 point algorithm as the model; the sample is selected by a random block selecting method, which ensures the uniform distribution and the accuracy; adds sequential probability ratio test(SPRT) on the basis of standard RANSAC, which cut down the overall running time of the algorithm. The experimental results show that this method can not only get higher matching accuracy, but also greatly reduce the computation and improve the matching speed.
Robust extraction of basis functions for simultaneous and proportional myoelectric control via sparse non-negative matrix factorization

NASA Astrophysics Data System (ADS)

Lin, Chuang; Wang, Binghui; Jiang, Ning; Farina, Dario

2018-04-01

Objective. This paper proposes a novel simultaneous and proportional multiple degree of freedom (DOF) myoelectric control method for active prostheses. Approach. The approach is based on non-negative matrix factorization (NMF) of surface EMG signals with the inclusion of sparseness constraints. By applying a sparseness constraint to the control signal matrix, it is possible to extract the basis information from arbitrary movements (quasi-unsupervised approach) for multiple DOFs concurrently. Main Results. In online testing based on target hitting, able-bodied subjects reached a greater throughput (TP) when using sparse NMF (SNMF) than with classic NMF or with linear regression (LR). Accordingly, the completion time (CT) was shorter for SNMF than NMF or LR. The same observations were made in two patients with unilateral limb deficiencies. Significance. The addition of sparseness constraints to NMF allows for a quasi-unsupervised approach to myoelectric control with superior results with respect to previous methods for the simultaneous and proportional control of multi-DOF. The proposed factorization algorithm allows robust simultaneous and proportional control, is superior to previous supervised algorithms, and, because of minimal supervision, paves the way to online adaptation in myoelectric control.
Noniterative MAP reconstruction using sparse matrix representations.

PubMed

Cao, Guangzhi; Bouman, Charles A; Webb, Kevin J

2009-09-01

We present a method for noniterative maximum a posteriori (MAP) tomographic reconstruction which is based on the use of sparse matrix representations. Our approach is to precompute and store the inverse matrix required for MAP reconstruction. This approach has generally not been used in the past because the inverse matrix is typically large and fully populated (i.e., not sparse). In order to overcome this problem, we introduce two new ideas. The first idea is a novel theory for the lossy source coding of matrix transformations which we refer to as matrix source coding. This theory is based on a distortion metric that reflects the distortions produced in the final matrix-vector product, rather than the distortions in the coded matrix itself. The resulting algorithms are shown to require orthonormal transformations of both the measurement data and the matrix rows and columns before quantization and coding. The second idea is a method for efficiently storing and computing the required orthonormal transformations, which we call a sparse-matrix transform (SMT). The SMT is a generalization of the classical FFT in that it uses butterflies to compute an orthonormal transform; but unlike an FFT, the SMT uses the butterflies in an irregular pattern, and is numerically designed to best approximate the desired transforms. We demonstrate the potential of the noniterative MAP reconstruction with examples from optical tomography. The method requires offline computation to encode the inverse transform. However, once these offline computations are completed, the noniterative MAP algorithm is shown to reduce both storage and computation by well over two orders of magnitude, as compared to a linear iterative reconstruction methods.
Total variation-based method for radar coincidence imaging with model mismatch for extended target

NASA Astrophysics Data System (ADS)

Cao, Kaicheng; Zhou, Xiaoli; Cheng, Yongqiang; Fan, Bo; Qin, Yuliang

2017-11-01

Originating from traditional optical coincidence imaging, radar coincidence imaging (RCI) is a staring/forward-looking imaging technique. In RCI, the reference matrix must be computed precisely to reconstruct the image as preferred; unfortunately, such precision is almost impossible due to the existence of model mismatch in practical applications. Although some conventional sparse recovery algorithms are proposed to solve the model-mismatch problem, they are inapplicable to nonsparse targets. We therefore sought to derive the signal model of RCI with model mismatch by replacing the sparsity constraint item with total variation (TV) regularization in the sparse total least squares optimization problem; in this manner, we obtain the objective function of RCI with model mismatch for an extended target. A more robust and efficient algorithm called TV-TLS is proposed, in which the objective function is divided into two parts and the perturbation matrix and scattering coefficients are updated alternately. Moreover, due to the ability of TV regularization to recover sparse signal or image with sparse gradient, TV-TLS method is also applicable to sparse recovering. Results of numerical experiments demonstrate that, for uniform extended targets, sparse targets, and real extended targets, the algorithm can achieve preferred imaging performance both in suppressing noise and in adapting to model mismatch.
Sparse Gaussian elimination with controlled fill-in on a shared memory multiprocessor

NASA Technical Reports Server (NTRS)

Alaghband, Gita; Jordan, Harry F.

1989-01-01

It is shown that in sparse matrices arising from electronic circuits, it is possible to do computations on many diagonal elements simultaneously. A technique for obtaining an ordered compatible set directly from the ordered incompatible table is given. The ordering is based on the Markowitz number of the pivot candidates. This technique generates a set of compatible pivots with the property of generating few fills. A novel heuristic algorithm is presented that combines the idea of an order-compatible set with a limited binary tree search to generate several sets of compatible pivots in linear time. An elimination set for reducing the matrix is generated and selected on the basis of a minimum Markowitz sum number. The parallel pivoting technique presented is a stepwise algorithm and can be applied to any submatrix of the original matrix. Thus, it is not a preordering of the sparse matrix and is applied dynamically as the decomposition proceeds. Parameters are suggested to obtain a balance between parallelism and fill-ins. Results of applying the proposed algorithms on several large application matrices using the HEP multiprocessor (Kowalik, 1985) are presented and analyzed.
Label consistent K-SVD: learning a discriminative dictionary for recognition.

PubMed

Jiang, Zhuolin; Lin, Zhe; Davis, Larry S

2013-11-01

A label consistent K-SVD (LC-KSVD) algorithm to learn a discriminative dictionary for sparse coding is presented. In addition to using class labels of training data, we also associate label information with each dictionary item (columns of the dictionary matrix) to enforce discriminability in sparse codes during the dictionary learning process. More specifically, we introduce a new label consistency constraint called "discriminative sparse-code error" and combine it with the reconstruction error and the classification error to form a unified objective function. The optimal solution is efficiently obtained using the K-SVD algorithm. Our algorithm learns a single overcomplete dictionary and an optimal linear classifier jointly. The incremental dictionary learning algorithm is presented for the situation of limited memory resources. It yields dictionaries so that feature points with the same class labels have similar sparse codes. Experimental results demonstrate that our algorithm outperforms many recently proposed sparse-coding techniques for face, action, scene, and object category recognition under the same learning conditions.
Amesos2 and Belos: Direct and Iterative Solvers for Large Sparse Linear Systems

DOE PAGES

Bavier, Eric; Hoemmen, Mark; Rajamanickam, Sivasankaran; ...

2012-01-01

Solvers for large sparse linear systems come in two categories: direct and iterative. Amesos2, a package in the Trilinos software project, provides direct methods, and Belos, another Trilinos package, provides iterative methods. Amesos2 offers a common interface to many different sparse matrix factorization codes, and can handle any implementation of sparse matrices and vectors, via an easy-to-extend C++ traits interface. It can also factor matrices whose entries have arbitrary “Scalar” type, enabling extended-precision and mixed-precision algorithms. Belos includes many different iterative methods for solving large sparse linear systems and least-squares problems. Unlike competing iterative solver libraries, Belos completely decouples themore » algorithms from the implementations of the underlying linear algebra objects. This lets Belos exploit the latest hardware without changes to the code. Belos favors algorithms that solve higher-level problems, such as multiple simultaneous linear systems and sequences of related linear systems, faster than standard algorithms. The package also supports extended-precision and mixed-precision algorithms. Together, Amesos2 and Belos form a complete suite of sparse linear solvers.« less
High-performance sparse matrix-matrix products on Intel KNL and multicore architectures

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nagasaka, Y; Matsuoka, S; Azad, A

Sparse matrix-matrix multiplication (SpGEMM) is a computational primitive that is widely used in areas ranging from traditional numerical applications to recent big data analysis and machine learning. Although many SpGEMM algorithms have been proposed, hardware specific optimizations for multi- and many-core processors are lacking and a detailed analysis of their performance under various use cases and matrices is not available. We firstly identify and mitigate multiple bottlenecks with memory management and thread scheduling on Intel Xeon Phi (Knights Landing or KNL). Specifically targeting multi- and many-core processors, we develop a hash-table-based algorithm and optimize a heap-based shared-memory SpGEMM algorithm. Wemore » examine their performance together with other publicly available codes. Different from the literature, our evaluation also includes use cases that are representative of real graph algorithms, such as multi-source breadth-first search or triangle counting. Our hash-table and heap-based algorithms are showing significant speedups from libraries in the majority of the cases while different algorithms dominate the other scenarios with different matrix size, sparsity, compression factor and operation type. We wrap up in-depth evaluation results and make a recipe to give the best SpGEMM algorithm for target scenario. A critical finding is that hash-table-based SpGEMM gets a significant performance boost if the nonzeros are not required to be sorted within each row of the output matrix.« less
Biclustering sparse binary genomic data.

PubMed

van Uitert, Miranda; Meuleman, Wouter; Wessels, Lodewyk

2008-12-01

Genomic datasets often consist of large, binary, sparse data matrices. In such a dataset, one is often interested in finding contiguous blocks that (mostly) contain ones. This is a biclustering problem, and while many algorithms have been proposed to deal with gene expression data, only two algorithms have been proposed that specifically deal with binary matrices. None of the gene expression biclustering algorithms can handle the large number of zeros in sparse binary matrices. The two proposed binary algorithms failed to produce meaningful results. In this article, we present a new algorithm that is able to extract biclusters from sparse, binary datasets. A powerful feature is that biclusters with different numbers of rows and columns can be detected, varying from many rows to few columns and few rows to many columns. It allows the user to guide the search towards biclusters of specific dimensions. When applying our algorithm to an input matrix derived from TRANSFAC, we find transcription factors with distinctly dissimilar binding motifs, but a clear set of common targets that are significantly enriched for GO categories.
Systematic sparse matrix error control for linear scaling electronic structure calculations.

PubMed

Rubensson, Emanuel H; Sałek, Paweł

2005-11-30

Efficient truncation criteria used in multiatom blocked sparse matrix operations for ab initio calculations are proposed. As system size increases, so does the need to stay on top of errors and still achieve high performance. A variant of a blocked sparse matrix algebra to achieve strict error control with good performance is proposed. The presented idea is that the condition to drop a certain submatrix should depend not only on the magnitude of that particular submatrix, but also on which other submatrices that are dropped. The decision to remove a certain submatrix is based on the contribution the removal would cause to the error in the chosen norm. We study the effect of an accumulated truncation error in iterative algorithms like trace correcting density matrix purification. One way to reduce the initial exponential growth of this error is presented. The presented error control for a sparse blocked matrix toolbox allows for achieving optimal performance by performing only necessary operations needed to maintain the requested level of accuracy. Copyright 2005 Wiley Periodicals, Inc.
Direction of Arrival Estimation for MIMO Radar via Unitary Nuclear Norm Minimization

PubMed Central

Wang, Xianpeng; Huang, Mengxing; Wu, Xiaoqin; Bi, Guoan

2017-01-01

In this paper, we consider the direction of arrival (DOA) estimation issue of noncircular (NC) source in multiple-input multiple-output (MIMO) radar and propose a novel unitary nuclear norm minimization (UNNM) algorithm. In the proposed method, the noncircular properties of signals are used to double the virtual array aperture, and the real-valued data are obtained by utilizing unitary transformation. Then a real-valued block sparse model is established based on a novel over-complete dictionary, and a UNNM algorithm is formulated for recovering the block-sparse matrix. In addition, the real-valued NC-MUSIC spectrum is used to design a weight matrix for reweighting the nuclear norm minimization to achieve the enhanced sparsity of solutions. Finally, the DOA is estimated by searching the non-zero blocks of the recovered matrix. Because of using the noncircular properties of signals to extend the virtual array aperture and an additional real structure to suppress the noise, the proposed method provides better performance compared with the conventional sparse recovery based algorithms. Furthermore, the proposed method can handle the case of underdetermined DOA estimation. Simulation results show the effectiveness and advantages of the proposed method. PMID:28441770
Partitioning Rectangular and Structurally Nonsymmetric Sparse Matrices for Parallel Processing

DOE Office of Scientific and Technical Information (OSTI.GOV)

B. Hendrickson; T.G. Kolda

1998-09-01

A common operation in scientific computing is the multiplication of a sparse, rectangular or structurally nonsymmetric matrix and a vector. In many applications the matrix- transpose-vector product is also required. This paper addresses the efficient parallelization of these operations. We show that the problem can be expressed in terms of partitioning bipartite graphs. We then introduce several algorithms for this partitioning problem and compare their performance on a set of test matrices.
Joint Smoothed l₀-Norm DOA Estimation Algorithm for Multiple Measurement Vectors in MIMO Radar.

PubMed

Liu, Jing; Zhou, Weidong; Juwono, Filbert H

2017-05-08

Direction-of-arrival (DOA) estimation is usually confronted with a multiple measurement vector (MMV) case. In this paper, a novel fast sparse DOA estimation algorithm, named the joint smoothed l 0 -norm algorithm, is proposed for multiple measurement vectors in multiple-input multiple-output (MIMO) radar. To eliminate the white or colored Gaussian noises, the new method first obtains a low-complexity high-order cumulants based data matrix. Then, the proposed algorithm designs a joint smoothed function tailored for the MMV case, based on which joint smoothed l 0 -norm sparse representation framework is constructed. Finally, for the MMV-based joint smoothed function, the corresponding gradient-based sparse signal reconstruction is designed, thus the DOA estimation can be achieved. The proposed method is a fast sparse representation algorithm, which can solve the MMV problem and perform well for both white and colored Gaussian noises. The proposed joint algorithm is about two orders of magnitude faster than the l 1 -norm minimization based methods, such as l 1 -SVD (singular value decomposition), RV (real-valued) l 1 -SVD and RV l 1 -SRACV (sparse representation array covariance vectors), and achieves better DOA estimation performance.

Reducing computational costs in large scale 3D EIT by using a sparse Jacobian matrix with block-wise CGLS reconstruction.

PubMed

Yang, C L; Wei, H Y; Adler, A; Soleimani, M

2013-06-01

Electrical impedance tomography (EIT) is a fast and cost-effective technique to provide a tomographic conductivity image of a subject from boundary current-voltage data. This paper proposes a time and memory efficient method for solving a large scale 3D EIT inverse problem using a parallel conjugate gradient (CG) algorithm. The 3D EIT system with a large number of measurement data can produce a large size of Jacobian matrix; this could cause difficulties in computer storage and the inversion process. One of challenges in 3D EIT is to decrease the reconstruction time and memory usage, at the same time retaining the image quality. Firstly, a sparse matrix reduction technique is proposed using thresholding to set very small values of the Jacobian matrix to zero. By adjusting the Jacobian matrix into a sparse format, the element with zeros would be eliminated, which results in a saving of memory requirement. Secondly, a block-wise CG method for parallel reconstruction has been developed. The proposed method has been tested using simulated data as well as experimental test samples. Sparse Jacobian with a block-wise CG enables the large scale EIT problem to be solved efficiently. Image quality measures are presented to quantify the effect of sparse matrix reduction in reconstruction results.
Decoding the encoding of functional brain networks: An fMRI classification comparison of non-negative matrix factorization (NMF), independent component analysis (ICA), and sparse coding algorithms.

PubMed

Xie, Jianwen; Douglas, Pamela K; Wu, Ying Nian; Brody, Arthur L; Anderson, Ariana E

2017-04-15

Brain networks in fMRI are typically identified using spatial independent component analysis (ICA), yet other mathematical constraints provide alternate biologically-plausible frameworks for generating brain networks. Non-negative matrix factorization (NMF) would suppress negative BOLD signal by enforcing positivity. Spatial sparse coding algorithms (L1 Regularized Learning and K-SVD) would impose local specialization and a discouragement of multitasking, where the total observed activity in a single voxel originates from a restricted number of possible brain networks. The assumptions of independence, positivity, and sparsity to encode task-related brain networks are compared; the resulting brain networks within scan for different constraints are used as basis functions to encode observed functional activity. These encodings are then decoded using machine learning, by using the time series weights to predict within scan whether a subject is viewing a video, listening to an audio cue, or at rest, in 304 fMRI scans from 51 subjects. The sparse coding algorithm of L1 Regularized Learning outperformed 4 variations of ICA (p<0.001) for predicting the task being performed within each scan using artifact-cleaned components. The NMF algorithms, which suppressed negative BOLD signal, had the poorest accuracy compared to the ICA and sparse coding algorithms. Holding constant the effect of the extraction algorithm, encodings using sparser spatial networks (containing more zero-valued voxels) had higher classification accuracy (p<0.001). Lower classification accuracy occurred when the extracted spatial maps contained more CSF regions (p<0.001). The success of sparse coding algorithms suggests that algorithms which enforce sparsity, discourage multitasking, and promote local specialization may capture better the underlying source processes than those which allow inexhaustible local processes such as ICA. Negative BOLD signal may capture task-related activations. Copyright © 2017 Elsevier B.V. All rights reserved.
Improving M-SBL for Joint Sparse Recovery Using a Subspace Penalty

NASA Astrophysics Data System (ADS)

Ye, Jong Chul; Kim, Jong Min; Bresler, Yoram

2015-12-01

The multiple measurement vector problem (MMV) is a generalization of the compressed sensing problem that addresses the recovery of a set of jointly sparse signal vectors. One of the important contributions of this paper is to reveal that the seemingly least related state-of-art MMV joint sparse recovery algorithms - M-SBL (multiple sparse Bayesian learning) and subspace-based hybrid greedy algorithms - have a very important link. More specifically, we show that replacing the $\\log\\det(\\cdot)$ term in M-SBL by a rank proxy that exploits the spark reduction property discovered in subspace-based joint sparse recovery algorithms, provides significant improvements. In particular, if we use the Schatten-$p$ quasi-norm as the corresponding rank proxy, the global minimiser of the proposed algorithm becomes identical to the true solution as $p \\rightarrow 0$. Furthermore, under the same regularity conditions, we show that the convergence to a local minimiser is guaranteed using an alternating minimization algorithm that has closed form expressions for each of the minimization steps, which are convex. Numerical simulations under a variety of scenarios in terms of SNR, and condition number of the signal amplitude matrix demonstrate that the proposed algorithm consistently outperforms M-SBL and other state-of-the art algorithms.
Detection of Protein Complexes Based on Penalized Matrix Decomposition in a Sparse Protein⁻Protein Interaction Network.

PubMed

Cao, Buwen; Deng, Shuguang; Qin, Hua; Ding, Pingjian; Chen, Shaopeng; Li, Guanghui

2018-06-15

High-throughput technology has generated large-scale protein interaction data, which is crucial in our understanding of biological organisms. Many complex identification algorithms have been developed to determine protein complexes. However, these methods are only suitable for dense protein interaction networks, because their capabilities decrease rapidly when applied to sparse protein⁻protein interaction (PPI) networks. In this study, based on penalized matrix decomposition ( PMD ), a novel method of penalized matrix decomposition for the identification of protein complexes (i.e., PMD pc ) was developed to detect protein complexes in the human protein interaction network. This method mainly consists of three steps. First, the adjacent matrix of the protein interaction network is normalized. Second, the normalized matrix is decomposed into three factor matrices. The PMD pc method can detect protein complexes in sparse PPI networks by imposing appropriate constraints on factor matrices. Finally, the results of our method are compared with those of other methods in human PPI network. Experimental results show that our method can not only outperform classical algorithms, such as CFinder, ClusterONE, RRW, HC-PIN, and PCE-FR, but can also achieve an ideal overall performance in terms of a composite score consisting of F-measure, accuracy (ACC), and the maximum matching ratio (MMR).
Progress on a generalized coordinates tensor product finite element 3DPNS algorithm for subsonic

NASA Technical Reports Server (NTRS)

Baker, A. J.; Orzechowski, J. A.

1983-01-01

A generalized coordinates form of the penalty finite element algorithm for the 3-dimensional parabolic Navier-Stokes equations for turbulent subsonic flows was derived. This algorithm formulation requires only three distinct hypermatrices and is applicable using any boundary fitted coordinate transformation procedure. The tensor matrix product approximation to the Jacobian of the Newton linear algebra matrix statement was also derived. Tne Newton algorithm was restructured to replace large sparse matrix solution procedures with grid sweeping using alpha-block tridiagonal matrices, where alpha equals the number of dependent variables. Numerical experiments were conducted and the resultant data gives guidance on potentially preferred tensor product constructions for the penalty finite element 3DPNS algorithm.
Compressed sensing for energy-efficient wireless telemonitoring of noninvasive fetal ECG via block sparse Bayesian learning.

PubMed

Zhang, Zhilin; Jung, Tzyy-Ping; Makeig, Scott; Rao, Bhaskar D

2013-02-01

Fetal ECG (FECG) telemonitoring is an important branch in telemedicine. The design of a telemonitoring system via a wireless body area network with low energy consumption for ambulatory use is highly desirable. As an emerging technique, compressed sensing (CS) shows great promise in compressing/reconstructing data with low energy consumption. However, due to some specific characteristics of raw FECG recordings such as nonsparsity and strong noise contamination, current CS algorithms generally fail in this application. This paper proposes to use the block sparse Bayesian learning framework to compress/reconstruct nonsparse raw FECG recordings. Experimental results show that the framework can reconstruct the raw recordings with high quality. Especially, the reconstruction does not destroy the interdependence relation among the multichannel recordings. This ensures that the independent component analysis decomposition of the reconstructed recordings has high fidelity. Furthermore, the framework allows the use of a sparse binary sensing matrix with much fewer nonzero entries to compress recordings. Particularly, each column of the matrix can contain only two nonzero entries. This shows that the framework, compared to other algorithms such as current CS algorithms and wavelet algorithms, can greatly reduce code execution in CPU in the data compression stage.
A Shifted Block Lanczos Algorithm 1: The Block Recurrence

NASA Technical Reports Server (NTRS)

Grimes, Roger G.; Lewis, John G.; Simon, Horst D.

1990-01-01

In this paper we describe a block Lanczos algorithm that is used as the key building block of a software package for the extraction of eigenvalues and eigenvectors of large sparse symmetric generalized eigenproblems. The software package comprises: a version of the block Lanczos algorithm specialized for spectrally transformed eigenproblems; an adaptive strategy for choosing shifts, and efficient codes for factoring large sparse symmetric indefinite matrices. This paper describes the algorithmic details of our block Lanczos recurrence. This uses a novel combination of block generalizations of several features that have only been investigated independently in the past. In particular new forms of partial reorthogonalization, selective reorthogonalization and local reorthogonalization are used, as is a new algorithm for obtaining the M-orthogonal factorization of a matrix. The heuristic shifting strategy, the integration with sparse linear equation solvers and numerical experience with the code are described in a companion paper.
GPU-accelerated algorithms for compressed signals recovery with application to astronomical imagery deblurring

NASA Astrophysics Data System (ADS)

Fiandrotti, Attilio; Fosson, Sophie M.; Ravazzi, Chiara; Magli, Enrico

2018-04-01

Compressive sensing promises to enable bandwidth-efficient on-board compression of astronomical data by lifting the encoding complexity from the source to the receiver. The signal is recovered off-line, exploiting GPUs parallel computation capabilities to speedup the reconstruction process. However, inherent GPU hardware constraints limit the size of the recoverable signal and the speedup practically achievable. In this work, we design parallel algorithms that exploit the properties of circulant matrices for efficient GPU-accelerated sparse signals recovery. Our approach reduces the memory requirements, allowing us to recover very large signals with limited memory. In addition, it achieves a tenfold signal recovery speedup thanks to ad-hoc parallelization of matrix-vector multiplications and matrix inversions. Finally, we practically demonstrate our algorithms in a typical application of circulant matrices: deblurring a sparse astronomical image in the compressed domain.
Improved parallel data partitioning by nested dissection with applications to information retrieval.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wolf, Michael M.; Chevalier, Cedric; Boman, Erik Gunnar

The computational work in many information retrieval and analysis algorithms is based on sparse linear algebra. Sparse matrix-vector multiplication is a common kernel in many of these computations. Thus, an important related combinatorial problem in parallel computing is how to distribute the matrix and the vectors among processors so as to minimize the communication cost. We focus on minimizing the total communication volume while keeping the computation balanced across processes. In [1], the first two authors presented a new 2D partitioning method, the nested dissection partitioning algorithm. In this paper, we improve on that algorithm and show that it ismore » a good option for data partitioning in information retrieval. We also show partitioning time can be substantially reduced by using the SCOTCH software, and quality improves in some cases, too.« less
Experiments with conjugate gradient algorithms for homotopy curve tracking

NASA Technical Reports Server (NTRS)

Irani, Kashmira M.; Ribbens, Calvin J.; Watson, Layne T.; Kamat, Manohar P.; Walker, Homer F.

1991-01-01

There are algorithms for finding zeros or fixed points of nonlinear systems of equations that are globally convergent for almost all starting points, i.e., with probability one. The essence of all such algorithms is the construction of an appropriate homotopy map and then tracking some smooth curve in the zero set of this homotopy map. HOMPACK is a mathematical software package implementing globally convergent homotopy algorithms with three different techniques for tracking a homotopy zero curve, and has separate routines for dense and sparse Jacobian matrices. The HOMPACK algorithms for sparse Jacobian matrices use a preconditioned conjugate gradient algorithm for the computation of the kernel of the homotopy Jacobian matrix, a required linear algebra step for homotopy curve tracking. Here, variants of the conjugate gradient algorithm are implemented in the context of homotopy curve tracking and compared with Craig's preconditioned conjugate gradient method used in HOMPACK. The test problems used include actual large scale, sparse structural mechanics problems.
STRONG ORACLE OPTIMALITY OF FOLDED CONCAVE PENALIZED ESTIMATION.

PubMed

Fan, Jianqing; Xue, Lingzhou; Zou, Hui

2014-06-01

Folded concave penalization methods have been shown to enjoy the strong oracle property for high-dimensional sparse estimation. However, a folded concave penalization problem usually has multiple local solutions and the oracle property is established only for one of the unknown local solutions. A challenging fundamental issue still remains that it is not clear whether the local optimum computed by a given optimization algorithm possesses those nice theoretical properties. To close this important theoretical gap in over a decade, we provide a unified theory to show explicitly how to obtain the oracle solution via the local linear approximation algorithm. For a folded concave penalized estimation problem, we show that as long as the problem is localizable and the oracle estimator is well behaved, we can obtain the oracle estimator by using the one-step local linear approximation. In addition, once the oracle estimator is obtained, the local linear approximation algorithm converges, namely it produces the same estimator in the next iteration. The general theory is demonstrated by using four classical sparse estimation problems, i.e., sparse linear regression, sparse logistic regression, sparse precision matrix estimation and sparse quantile regression.
STRONG ORACLE OPTIMALITY OF FOLDED CONCAVE PENALIZED ESTIMATION

PubMed Central

Fan, Jianqing; Xue, Lingzhou; Zou, Hui

2014-01-01

Folded concave penalization methods have been shown to enjoy the strong oracle property for high-dimensional sparse estimation. However, a folded concave penalization problem usually has multiple local solutions and the oracle property is established only for one of the unknown local solutions. A challenging fundamental issue still remains that it is not clear whether the local optimum computed by a given optimization algorithm possesses those nice theoretical properties. To close this important theoretical gap in over a decade, we provide a unified theory to show explicitly how to obtain the oracle solution via the local linear approximation algorithm. For a folded concave penalized estimation problem, we show that as long as the problem is localizable and the oracle estimator is well behaved, we can obtain the oracle estimator by using the one-step local linear approximation. In addition, once the oracle estimator is obtained, the local linear approximation algorithm converges, namely it produces the same estimator in the next iteration. The general theory is demonstrated by using four classical sparse estimation problems, i.e., sparse linear regression, sparse logistic regression, sparse precision matrix estimation and sparse quantile regression. PMID:25598560
Sparse matrix-vector multiplication on network-on-chip

NASA Astrophysics Data System (ADS)

Sun, C.-C.; Götze, J.; Jheng, H.-Y.; Ruan, S.-J.

2010-12-01

In this paper, we present an idea for performing matrix-vector multiplication by using Network-on-Chip (NoC) architecture. In traditional IC design on-chip communications have been designed with dedicated point-to-point interconnections. Therefore, regular local data transfer is the major concept of many parallel implementations. However, when dealing with the parallel implementation of sparse matrix-vector multiplication (SMVM), which is the main step of all iterative algorithms for solving systems of linear equation, the required data transfers depend on the sparsity structure of the matrix and can be extremely irregular. Using the NoC architecture makes it possible to deal with arbitrary structure of the data transfers; i.e. with the irregular structure of the sparse matrices. So far, we have already implemented the proposed SMVM-NoC architecture with the size 4×4 and 5×5 in IEEE 754 single float point precision using FPGA.
Exact recovery of sparse multiple measurement vectors by [Formula: see text]-minimization.

PubMed

Wang, Changlong; Peng, Jigen

2018-01-01

The joint sparse recovery problem is a generalization of the single measurement vector problem widely studied in compressed sensing. It aims to recover a set of jointly sparse vectors, i.e., those that have nonzero entries concentrated at a common location. Meanwhile [Formula: see text]-minimization subject to matrixes is widely used in a large number of algorithms designed for this problem, i.e., [Formula: see text]-minimization [Formula: see text] Therefore the main contribution in this paper is two theoretical results about this technique. The first one is proving that in every multiple system of linear equations there exists a constant [Formula: see text] such that the original unique sparse solution also can be recovered from a minimization in [Formula: see text] quasi-norm subject to matrixes whenever [Formula: see text]. The other one is showing an analytic expression of such [Formula: see text]. Finally, we display the results of one example to confirm the validity of our conclusions, and we use some numerical experiments to show that we increase the efficiency of these algorithms designed for [Formula: see text]-minimization by using our results.
A Partitioning Algorithm for Block-Diagonal Matrices With Overlap

DOE Office of Scientific and Technical Information (OSTI.GOV)

Guy Antoine Atenekeng Kahou; Laura Grigori; Masha Sosonkina

2008-02-02

We present a graph partitioning algorithm that aims at partitioning a sparse matrix into a block-diagonal form, such that any two consecutive blocks overlap. We denote this form of the matrix as the overlapped block-diagonal matrix. The partitioned matrix is suitable for applying the explicit formulation of Multiplicative Schwarz preconditioner (EFMS) described in [3]. The graph partitioning algorithm partitions the graph of the input matrix into K partitions, such that every partition {Omega}{sub i} has at most two neighbors {Omega}{sub i-1} and {Omega}{sub i+1}. First, an ordering algorithm, such as the reverse Cuthill-McKee algorithm, that reduces the matrix profile ismore » performed. An initial overlapped block-diagonal partition is obtained from the profile of the matrix. An iterative strategy is then used to further refine the partitioning by allowing nodes to be transferred between neighboring partitions. Experiments are performed on matrices arising from real-world applications to show the feasibility and usefulness of this approach.« less
An Efficient Scheme for Updating Sparse Cholesky Factors

NASA Technical Reports Server (NTRS)

Raghavan, Padma

2002-01-01

Raghavan had earlier developed the software package DCSPACK which can be used for solving sparse linear systems where the coefficient matrix is symmetric and positive definite (this project was not funded by NASA but by agencies such as NSF). DSCPACK-S is the serial code and DSCPACK-P is a parallel implementation suitable for multiprocessors or networks-of-workstations with message passing using MCI. The main algorithm used is the Cholesky factorization of a sparse symmetric positive positive definite matrix A = LL(T). The code can also compute the factorization A = LDL(T). The complexity of the software arises from several factors relating to the sparsity of the matrix A. A sparse N x N matrix A has typically less that cN nonzeroes where c is a small constant. If the matrix were dense, it would have O(N2) nonzeroes. The most complicated part of such sparse Cholesky factorization relates to fill-in, i.e., zeroes in the original matrix that become nonzeroes in the factor L. An efficient implementation depends to a large extent on complex data structures and on techniques from graph theory to reduce, identify, and manage fill. DSCPACK is based on an efficient multifrontal implementation with fill-managing algorithms and implementation arising from earlier research by Raghavan and others. Sparse Cholesky factorization is typically a four step process: (1) ordering to compute a fill-reducing numbering, (2) symbolic factorization to determine the nonzero structure of L, (3) numeric factorization to compute L, and, (4) triangular solution to solve L(T)x = y and Ly = b. The first two steps are symbolic and are performed using the graph of the matrix. The numeric factorization step is of dominant cost and there are several schemes for improving performance by exploiting the nested and dense structure of groups of columns in the factor. The latter are aimed at better utilization of the cache-memory hierarchy on modem processors to prevent cache-misses and provide execution rates (operations/second) that are close to the peak rates for dense matrix computations. Currently, EPISCOPACY is being used in an application at NASA directed by J. Newman and M. James. We propose the implementation of efficient schemes for updating the LL(T) or LDL(T) factors computed in DSCPACK-S to meet the computational requirements of their project. A brief description is provided in the next section.
A study of the parallel algorithm for large-scale DC simulation of nonlinear systems

NASA Astrophysics Data System (ADS)

Cortés Udave, Diego Ernesto; Ogrodzki, Jan; Gutiérrez de Anda, Miguel Angel

Newton-Raphson DC analysis of large-scale nonlinear circuits may be an extremely time consuming process even if sparse matrix techniques and bypassing of nonlinear models calculation are used. A slight decrease in the time required for this task may be enabled on multi-core, multithread computers if the calculation of the mathematical models for the nonlinear elements as well as the stamp management of the sparse matrix entries are managed through concurrent processes. This numerical complexity can be further reduced via the circuit decomposition and parallel solution of blocks taking as a departure point the BBD matrix structure. This block-parallel approach may give a considerable profit though it is strongly dependent on the system topology and, of course, on the processor type. This contribution presents the easy-parallelizable decomposition-based algorithm for DC simulation and provides a detailed study of its effectiveness.
Multitasking the Davidson algorithm for the large, sparse eigenvalue problem

DOE Office of Scientific and Technical Information (OSTI.GOV)

Umar, V.M.; Fischer, C.F.

1989-01-01

The authors report how the Davidson algorithm, developed for handling the eigenvalue problem for large and sparse matrices arising in quantum chemistry, was modified for use in atomic structure calculations. To date these calculations have used traditional eigenvalue methods, which limit the range of feasible calculations because of their excessive memory requirements and unsatisfactory performance attributed to time-consuming and costly processing of zero valued elements. The replacement of a traditional matrix eigenvalue method by the Davidson algorithm reduced these limitations. Significant speedup was found, which varied with the size of the underlying problem and its sparsity. Furthermore, the range ofmore » matrix sizes that can be manipulated efficiently was expended by more than one order or magnitude. On the CRAY X-MP the code was vectorized and the importance of gather/scatter analyzed. A parallelized version of the algorithm obtained an additional 35% reduction in execution time. Speedup due to vectorization and concurrency was also measured on the Alliant FX/8.« less
Subspace aware recovery of low rank and jointly sparse signals

PubMed Central

Biswas, Sampurna; Dasgupta, Soura; Mudumbai, Raghuraman; Jacob, Mathews

2017-01-01

We consider the recovery of a matrix X, which is simultaneously low rank and joint sparse, from few measurements of its columns using a two-step algorithm. Each column of X is measured using a combination of two measurement matrices; one which is the same for every column, while the the second measurement matrix varies from column to column. The recovery proceeds by first estimating the row subspace vectors from the measurements corresponding to the common matrix. The estimated row subspace vectors are then used to recover X from all the measurements using a convex program of joint sparsity minimization. Our main contribution is to provide sufficient conditions on the measurement matrices that guarantee the recovery of such a matrix using the above two-step algorithm. The results demonstrate quite significant savings in number of measurements when compared to the standard multiple measurement vector (MMV) scheme, which assumes same time invariant measurement pattern for all the time frames. We illustrate the impact of the sampling pattern on reconstruction quality using breath held cardiac cine MRI and cardiac perfusion MRI data, while the utility of the algorithm to accelerate the acquisition is demonstrated on MR parameter mapping. PMID:28630889
Using a multifrontal sparse solver in a high performance, finite element code

NASA Technical Reports Server (NTRS)

King, Scott D.; Lucas, Robert; Raefsky, Arthur

1990-01-01

We consider the performance of the finite element method on a vector supercomputer. The computationally intensive parts of the finite element method are typically the individual element forms and the solution of the global stiffness matrix both of which are vectorized in high performance codes. To further increase throughput, new algorithms are needed. We compare a multifrontal sparse solver to a traditional skyline solver in a finite element code on a vector supercomputer. The multifrontal solver uses the Multiple-Minimum Degree reordering heuristic to reduce the number of operations required to factor a sparse matrix and full matrix computational kernels (e.g., BLAS3) to enhance vector performance. The net result in an order-of-magnitude reduction in run time for a finite element application on one processor of a Cray X-MP.

Complexity of Kronecker Operations on Sparse Matrices with Applications to the Solution of Markov Models

NASA Technical Reports Server (NTRS)

Buchholz, Peter; Ciardo, Gianfranco; Donatelli, Susanna; Kemper, Peter

1997-01-01

We present a systematic discussion of algorithms to multiply a vector by a matrix expressed as the Kronecker product of sparse matrices, extending previous work in a unified notational framework. Then, we use our results to define new algorithms for the solution of large structured Markov models. In addition to a comprehensive overview of existing approaches, we give new results with respect to: (1) managing certain types of state-dependent behavior without incurring extra cost; (2) supporting both Jacobi-style and Gauss-Seidel-style methods by appropriate multiplication algorithms; (3) speeding up algorithms that consider probability vectors of size equal to the "actual" state space instead of the "potential" state space.
An implementation of the look-ahead Lanczos algorithm for non-Hermitian matrices

NASA Technical Reports Server (NTRS)

Freund, Roland W.; Gutknecht, Martin H.; Nachtigal, Noel M.

1991-01-01

The nonsymmetric Lanczos method can be used to compute eigenvalues of large sparse non-Hermitian matrices or to solve large sparse non-Hermitian linear systems. However, the original Lanczos algorithm is susceptible to possible breakdowns and potential instabilities. An implementation is presented of a look-ahead version of the Lanczos algorithm that, except for the very special situation of an incurable breakdown, overcomes these problems by skipping over those steps in which a breakdown or near-breakdown would occur in the standard process. The proposed algorithm can handle look-ahead steps of any length and requires the same number of matrix-vector products and inner products as the standard Lanczos process without look-ahead.
Turbo-SMT: Parallel Coupled Sparse Matrix-Tensor Factorizations and Applications

PubMed Central

Papalexakis, Evangelos E.; Faloutsos, Christos; Mitchell, Tom M.; Talukdar, Partha Pratim; Sidiropoulos, Nicholas D.; Murphy, Brian

2016-01-01

How can we correlate the neural activity in the human brain as it responds to typed words, with properties of these terms (like ’edible’, ’fits in hand’)? In short, we want to find latent variables, that jointly explain both the brain activity, as well as the behavioral responses. This is one of many settings of the Coupled Matrix-Tensor Factorization (CMTF) problem. Can we enhance any CMTF solver, so that it can operate on potentially very large datasets that may not fit in main memory? We introduce Turbo-SMT, a meta-method capable of doing exactly that: it boosts the performance of any CMTF algorithm, produces sparse and interpretable solutions, and parallelizes any CMTF algorithm, producing sparse and interpretable solutions (up to 65 fold). Additionally, we improve upon ALS, the work-horse algorithm for CMTF, with respect to efficiency and robustness to missing values. We apply Turbo-SMT to BrainQ, a dataset consisting of a (nouns, brain voxels, human subjects) tensor and a (nouns, properties) matrix, with coupling along the nouns dimension. Turbo-SMT is able to find meaningful latent variables, as well as to predict brain activity with competitive accuracy. Finally, we demonstrate the generality of Turbo-SMT, by applying it on a Facebook dataset (users, ’friends’, wall-postings); there, Turbo-SMT spots spammer-like anomalies. PMID:27672406
A Generalized Sampling and Preconditioning Scheme for Sparse Approximation of Polynomial Chaos Expansions

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jakeman, John D.; Narayan, Akil; Zhou, Tao

We propose an algorithm for recovering sparse orthogonal polynomial expansions via collocation. A standard sampling approach for recovering sparse polynomials uses Monte Carlo sampling, from the density of orthogonality, which results in poor function recovery when the polynomial degree is high. Our proposed approach aims to mitigate this limitation by sampling with respect to the weighted equilibrium measure of the parametric domain and subsequently solves a preconditionedmore » $$\\ell^1$$-minimization problem, where the weights of the diagonal preconditioning matrix are given by evaluations of the Christoffel function. Our algorithm can be applied to a wide class of orthogonal polynomial families on bounded and unbounded domains, including all classical families. We present theoretical analysis to motivate the algorithm and numerical results that show our method is superior to standard Monte Carlo methods in many situations of interest. In conclusion, numerical examples are also provided to demonstrate that our proposed algorithm leads to comparable or improved accuracy even when compared with Legendre- and Hermite-specific algorithms.« less
A Generalized Sampling and Preconditioning Scheme for Sparse Approximation of Polynomial Chaos Expansions

DOE PAGES

Jakeman, John D.; Narayan, Akil; Zhou, Tao

2017-06-22

We propose an algorithm for recovering sparse orthogonal polynomial expansions via collocation. A standard sampling approach for recovering sparse polynomials uses Monte Carlo sampling, from the density of orthogonality, which results in poor function recovery when the polynomial degree is high. Our proposed approach aims to mitigate this limitation by sampling with respect to the weighted equilibrium measure of the parametric domain and subsequently solves a preconditionedmore » $$\\ell^1$$-minimization problem, where the weights of the diagonal preconditioning matrix are given by evaluations of the Christoffel function. Our algorithm can be applied to a wide class of orthogonal polynomial families on bounded and unbounded domains, including all classical families. We present theoretical analysis to motivate the algorithm and numerical results that show our method is superior to standard Monte Carlo methods in many situations of interest. In conclusion, numerical examples are also provided to demonstrate that our proposed algorithm leads to comparable or improved accuracy even when compared with Legendre- and Hermite-specific algorithms.« less
Sparse Matrix Motivated Reconstruction of Far-Field Radiation Patterns

DTIC Science & Technology

2015-03-01

method for base - station antenna radiation patterns. IEEE Antennas Propagation Magazine. 2001;43(2):132. 4. Vasiliadis TG, Dimitriou D, Sergiadis JD...algorithm based on sparse representations of radiation patterns using the inverse Discrete Fourier Transform (DFT) and the inverse Discrete Cosine...patterns using a Model- Based Parameter Estimation (MBPE) technique that reduces the computational time required to model radiation patterns. Another
An efficient sparse matrix multiplication scheme for the CYBER 205 computer

NASA Technical Reports Server (NTRS)

Lambiotte, Jules J., Jr.

1988-01-01

This paper describes the development of an efficient algorithm for computing the product of a matrix and vector on a CYBER 205 vector computer. The desire to provide software which allows the user to choose between the often conflicting goals of minimizing central processing unit (CPU) time or storage requirements has led to a diagonal-based algorithm in which one of four types of storage is selected for each diagonal. The candidate storage types employed were chosen to be efficient on the CYBER 205 for diagonals which have nonzero structure which is dense, moderately sparse, very sparse and short, or very sparse and long; however, for many densities, no diagonal type is most efficient with respect to both resource requirements, and a trade-off must be made. For each diagonal, an initialization subroutine estimates the CPU time and storage required for each storage type based on results from previously performed numerical experimentation. These requirements are adjusted by weights provided by the user which reflect the relative importance the user places on the two resources. The adjusted resource requirements are then compared to select the most efficient storage and computational scheme.
A Chess-Like Game for Teaching Engineering Students to Solve Large System of Simultaneous Linear Equations

NASA Technical Reports Server (NTRS)

Nguyen, Duc T.; Mohammed, Ahmed Ali; Kadiam, Subhash

2010-01-01

Solving large (and sparse) system of simultaneous linear equations has been (and continues to be) a major challenging problem for many real-world engineering/science applications [1-2]. For many practical/large-scale problems, the sparse, Symmetrical and Positive Definite (SPD) system of linear equations can be conveniently represented in matrix notation as [A] {x} = {b} , where the square coefficient matrix [A] and the Right-Hand-Side (RHS) vector {b} are known. The unknown solution vector {x} can be efficiently solved by the following step-by-step procedures [1-2]: Reordering phase, Matrix Factorization phase, Forward solution phase, and Backward solution phase. In this research work, a Game-Based Learning (GBL) approach has been developed to help engineering students to understand crucial details about matrix reordering and factorization phases. A "chess-like" game has been developed and can be played by either a single player, or two players. Through this "chess-like" open-ended game, the players/learners will not only understand the key concepts involved in reordering algorithms (based on existing algorithms), but also have the opportunities to "discover new algorithms" which are better than existing algorithms. Implementing the proposed "chess-like" game for matrix reordering and factorization phases can be enhanced by FLASH [3] computer environments, where computer simulation with animated human voice, sound effects, visual/graphical/colorful displays of matrix tables, score (or monetary) awards for the best game players, etc. can all be exploited. Preliminary demonstrations of the developed GBL approach can be viewed by anyone who has access to the internet web-site [4]!
Fast and Adaptive Sparse Precision Matrix Estimation in High Dimensions

PubMed Central

Liu, Weidong; Luo, Xi

2014-01-01

This paper proposes a new method for estimating sparse precision matrices in the high dimensional setting. It has been popular to study fast computation and adaptive procedures for this problem. We propose a novel approach, called Sparse Column-wise Inverse Operator, to address these two issues. We analyze an adaptive procedure based on cross validation, and establish its convergence rate under the Frobenius norm. The convergence rates under other matrix norms are also established. This method also enjoys the advantage of fast computation for large-scale problems, via a coordinate descent algorithm. Numerical merits are illustrated using both simulated and real datasets. In particular, it performs favorably on an HIV brain tissue dataset and an ADHD resting-state fMRI dataset. PMID:25750463
Adaptive OFDM Waveform Design for Spatio-Temporal-Sparsity Exploited STAP Radar

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sen, Satyabrata

In this chapter, we describe a sparsity-based space-time adaptive processing (STAP) algorithm to detect a slowly moving target using an orthogonal frequency division multiplexing (OFDM) radar. The motivation of employing an OFDM signal is that it improves the target-detectability from the interfering signals by increasing the frequency diversity of the system. However, due to the addition of one extra dimension in terms of frequency, the adaptive degrees-of-freedom in an OFDM-STAP also increases. Therefore, to avoid the construction a fully adaptive OFDM-STAP, we develop a sparsity-based STAP algorithm. We observe that the interference spectrum is inherently sparse in the spatio-temporal domain,more » as the clutter responses occupy only a diagonal ridge on the spatio-temporal plane and the jammer signals interfere only from a few spatial directions. Hence, we exploit that sparsity to develop an efficient STAP technique that utilizes considerably lesser number of secondary data compared to the other existing STAP techniques, and produces nearly optimum STAP performance. In addition to designing the STAP filter, we optimally design the transmit OFDM signals by maximizing the output signal-to-interference-plus-noise ratio (SINR) in order to improve the STAP performance. The computation of output SINR depends on the estimated value of the interference covariance matrix, which we obtain by applying the sparse recovery algorithm. Therefore, we analytically assess the effects of the synthesized OFDM coefficients on the sparse recovery of the interference covariance matrix by computing the coherence measure of the sparse measurement matrix. Our numerical examples demonstrate the achieved STAP-performance due to sparsity-based technique and adaptive waveform design.« less
A Distributed-Memory Package for Dense Hierarchically Semi-Separable Matrix Computations Using Randomization

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rouet, François-Henry; Li, Xiaoye S.; Ghysels, Pieter

In this paper, we present a distributed-memory library for computations with dense structured matrices. A matrix is considered structured if its off-diagonal blocks can be approximated by a rank-deficient matrix with low numerical rank. Here, we use Hierarchically Semi-Separable (HSS) representations. Such matrices appear in many applications, for example, finite-element methods, boundary element methods, and so on. Exploiting this structure allows for fast solution of linear systems and/or fast computation of matrix-vector products, which are the two main building blocks of matrix computations. The compression algorithm that we use, that computes the HSS form of an input dense matrix, reliesmore » on randomized sampling with a novel adaptive sampling mechanism. We discuss the parallelization of this algorithm and also present the parallelization of structured matrix-vector product, structured factorization, and solution routines. The efficiency of the approach is demonstrated on large problems from different academic and industrial applications, on up to 8,000 cores. Finally, this work is part of a more global effort, the STRUctured Matrices PACKage (STRUMPACK) software package for computations with sparse and dense structured matrices. Hence, although useful on their own right, the routines also represent a step in the direction of a distributed-memory sparse solver.« less
A Distributed-Memory Package for Dense Hierarchically Semi-Separable Matrix Computations Using Randomization

DOE PAGES

Rouet, François-Henry; Li, Xiaoye S.; Ghysels, Pieter; ...

2016-06-30

In this paper, we present a distributed-memory library for computations with dense structured matrices. A matrix is considered structured if its off-diagonal blocks can be approximated by a rank-deficient matrix with low numerical rank. Here, we use Hierarchically Semi-Separable (HSS) representations. Such matrices appear in many applications, for example, finite-element methods, boundary element methods, and so on. Exploiting this structure allows for fast solution of linear systems and/or fast computation of matrix-vector products, which are the two main building blocks of matrix computations. The compression algorithm that we use, that computes the HSS form of an input dense matrix, reliesmore » on randomized sampling with a novel adaptive sampling mechanism. We discuss the parallelization of this algorithm and also present the parallelization of structured matrix-vector product, structured factorization, and solution routines. The efficiency of the approach is demonstrated on large problems from different academic and industrial applications, on up to 8,000 cores. Finally, this work is part of a more global effort, the STRUctured Matrices PACKage (STRUMPACK) software package for computations with sparse and dense structured matrices. Hence, although useful on their own right, the routines also represent a step in the direction of a distributed-memory sparse solver.« less
An adaptive sparse deconvolution method for distinguishing the overlapping echoes of ultrasonic guided waves for pipeline crack inspection

NASA Astrophysics Data System (ADS)

Chang, Yong; Zi, Yanyang; Zhao, Jiyuan; Yang, Zhe; He, Wangpeng; Sun, Hailiang

2017-03-01

In guided wave pipeline inspection, echoes reflected from closely spaced reflectors generally overlap, meaning useful information is lost. To solve the overlapping problem, sparse deconvolution methods have been developed in the past decade. However, conventional sparse deconvolution methods have limitations in handling guided wave signals, because the input signal is directly used as the prototype of the convolution matrix, without considering the waveform change caused by the dispersion properties of the guided wave. In this paper, an adaptive sparse deconvolution (ASD) method is proposed to overcome these limitations. First, the Gaussian echo model is employed to adaptively estimate the column prototype of the convolution matrix instead of directly using the input signal as the prototype. Then, the convolution matrix is constructed upon the estimated results. Third, the split augmented Lagrangian shrinkage (SALSA) algorithm is introduced to solve the deconvolution problem with high computational efficiency. To verify the effectiveness of the proposed method, guided wave signals obtained from pipeline inspection are investigated numerically and experimentally. Compared to conventional sparse deconvolution methods, e.g. the {{l}1} -norm deconvolution method, the proposed method shows better performance in handling the echo overlap problem in the guided wave signal.
Comparison of two matrix data structures for advanced CSM testbed applications

NASA Technical Reports Server (NTRS)

Regelbrugge, M. E.; Brogan, F. A.; Nour-Omid, B.; Rankin, C. C.; Wright, M. A.

1989-01-01

The first section describes data storage schemes presently used by the Computational Structural Mechanics (CSM) testbed sparse matrix facilities and similar skyline (profile) matrix facilities. The second section contains a discussion of certain features required for the implementation of particular advanced CSM algorithms, and how these features might be incorporated into the data storage schemes described previously. The third section presents recommendations, based on the discussions of the prior sections, for directing future CSM testbed development to provide necessary matrix facilities for advanced algorithm implementation and use. The objective is to lend insight into the matrix structures discussed and to help explain the process of evaluating alternative matrix data structures and utilities for subsequent use in the CSM testbed.
Target detection in GPR data using joint low-rank and sparsity constraints

NASA Astrophysics Data System (ADS)

Bouzerdoum, Abdesselam; Tivive, Fok Hing Chi; Abeynayake, Canicious

2016-05-01

In ground penetrating radars, background clutter, which comprises the signals backscattered from the rough, uneven ground surface and the background noise, impairs the visualization of buried objects and subsurface inspections. In this paper, a clutter mitigation method is proposed for target detection. The removal of background clutter is formulated as a constrained optimization problem to obtain a low-rank matrix and a sparse matrix. The low-rank matrix captures the ground surface reflections and the background noise, whereas the sparse matrix contains the target reflections. An optimization method based on split-Bregman algorithm is developed to estimate these two matrices from the input GPR data. Evaluated on real radar data, the proposed method achieves promising results in removing the background clutter and enhancing the target signature.
Bundle block adjustment of large-scale remote sensing data with Block-based Sparse Matrix Compression combined with Preconditioned Conjugate Gradient

NASA Astrophysics Data System (ADS)

Zheng, Maoteng; Zhang, Yongjun; Zhou, Shunping; Zhu, Junfeng; Xiong, Xiaodong

2016-07-01

In recent years, new platforms and sensors in photogrammetry, remote sensing and computer vision areas have become available, such as Unmanned Aircraft Vehicles (UAV), oblique camera systems, common digital cameras and even mobile phone cameras. Images collected by all these kinds of sensors could be used as remote sensing data sources. These sensors can obtain large-scale remote sensing data which consist of a great number of images. Bundle block adjustment of large-scale data with conventional algorithm is very time and space (memory) consuming due to the super large normal matrix arising from large-scale data. In this paper, an efficient Block-based Sparse Matrix Compression (BSMC) method combined with the Preconditioned Conjugate Gradient (PCG) algorithm is chosen to develop a stable and efficient bundle block adjustment system in order to deal with the large-scale remote sensing data. The main contribution of this work is the BSMC-based PCG algorithm which is more efficient in time and memory than the traditional algorithm without compromising the accuracy. Totally 8 datasets of real data are used to test our proposed method. Preliminary results have shown that the BSMC method can efficiently decrease the time and memory requirement of large-scale data.
Structure-preserving and rank-revealing QR-factorizations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bischof, C.H.; Hansen, P.C.

1991-11-01

The rank-revealing QR-factorization (RRQR-factorization) is a special QR-factorization that is guaranteed to reveal the numerical rank of the matrix under consideration. This makes the RRQR-factorization a useful tool in the numerical treatment of many rank-deficient problems in numerical linear algebra. In this paper, a framework is presented for the efficient implementation of RRQR algorithms, in particular, for sparse matrices. A sparse RRQR-algorithm should seek to preserve the structure and sparsity of the matrix as much as possible while retaining the ability to capture safely the numerical rank. To this end, the paper proposes to compute an initial QR-factorization using amore » restricted pivoting strategy guarded by incremental condition estimation (ICE), and then applies the algorithm suggested by Chan and Foster to this QR-factorization. The column exchange strategy used in the initial QR factorization will exploit the fact that certain column exchanges do not change the sparsity structure, and compute a sparse QR-factorization that is a good approximation of the sought-after RRQR-factorization. Due to quantities produced by ICE, the Chan/Foster RRQR algorithm can be implemented very cheaply, thus verifying that the sought-after RRQR-factorization has indeed been computed. Experimental results on a model problem show that the initial QR-factorization is indeed very likely to produce RRQR-factorization.« less
Supercomputing on massively parallel bit-serial architectures

NASA Technical Reports Server (NTRS)

Iobst, Ken

1985-01-01

Research on the Goodyear Massively Parallel Processor (MPP) suggests that high-level parallel languages are practical and can be designed with powerful new semantics that allow algorithms to be efficiently mapped to the real machines. For the MPP these semantics include parallel/associative array selection for both dense and sparse matrices, variable precision arithmetic to trade accuracy for speed, micro-pipelined train broadcast, and conditional branching at the processing element (PE) control unit level. The preliminary design of a FORTRAN-like parallel language for the MPP has been completed and is being used to write programs to perform sparse matrix array selection, min/max search, matrix multiplication, Gaussian elimination on single bit arrays and other generic algorithms. A description is given of the MPP design. Features of the system and its operation are illustrated in the form of charts and diagrams.
Multiprocessor sparse L/U decomposition with controlled fill-in

NASA Technical Reports Server (NTRS)

Alaghband, G.; Jordan, H. F.

1985-01-01

Generation of the maximal compatibles of pivot elements for a class of small sparse matrices is studied. The algorithm involves a binary tree search and has a complexity exponential in the order of the matrix. Different strategies for selection of a set of compatible pivots based on the Markowitz criterion are investigated. The competing issues of parallelism and fill-in generation are studied and results are provided. A technque for obtaining an ordered compatible set directly from the ordered incompatible table is given. This technique generates a set of compatible pivots with the property of generating few fills. A new hueristic algorithm is then proposed that combines the idea of an ordered compatible set with a limited binary tree search to generate several sets of compatible pivots in linear time. Finally, an elimination set to reduce the matrix is selected. Parameters are suggested to obtain a balance between parallelism and fill-ins. Results of applying the proposed algorithms on several large application matrices are presented and analyzed.
Layout Study and Application of Mobile App Recommendation Approach Based On Spark Streaming Framework

NASA Astrophysics Data System (ADS)

Wang, H. T.; Chen, T. T.; Yan, C.; Pan, H.

2018-05-01

For App recommended areas of mobile phone software, made while using conduct App application recommended combined weighted Slope One algorithm collaborative filtering algorithm items based on further improvement of the traditional collaborative filtering algorithm in cold start, data matrix sparseness and other issues, will recommend Spark stasis parallel algorithm platform, the introduction of real-time streaming streaming real-time computing framework to improve real-time software applications recommended.

spammpack, Version 2013-06-18

DOE Office of Scientific and Technical Information (OSTI.GOV)

2014-01-17

This library is an implementation of the Sparse Approximate Matrix Multiplication (SpAMM) algorithm introduced. It provides a matrix data type, and an approximate matrix product, which exhibits linear scaling computational complexity for matrices with decay. The product error and the performance of the multiply can be tuned by choosing an appropriate tolerance. The library can be compiled for serial execution or parallel execution on shared memory systems with an OpenMP capable compiler
A physiologically motivated sparse, compact, and smooth (SCS) approach to EEG source localization.

PubMed

Cao, Cheng; Akalin Acar, Zeynep; Kreutz-Delgado, Kenneth; Makeig, Scott

2012-01-01

Here, we introduce a novel approach to the EEG inverse problem based on the assumption that principal cortical sources of multi-channel EEG recordings may be assumed to be spatially sparse, compact, and smooth (SCS). To enforce these characteristics of solutions to the EEG inverse problem, we propose a correlation-variance model which factors a cortical source space covariance matrix into the multiplication of a pre-given correlation coefficient matrix and the square root of the diagonal variance matrix learned from the data under a Bayesian learning framework. We tested the SCS method using simulated EEG data with various SNR and applied it to a real ECOG data set. We compare the results of SCS to those of an established SBL algorithm.
Improved collaborative filtering recommendation algorithm of similarity measure

NASA Astrophysics Data System (ADS)

Zhang, Baofu; Yuan, Baoping

2017-05-01

The Collaborative filtering recommendation algorithm is one of the most widely used recommendation algorithm in personalized recommender systems. The key is to find the nearest neighbor set of the active user by using similarity measure. However, the methods of traditional similarity measure mainly focus on the similarity of user common rating items, but ignore the relationship between the user common rating items and all items the user rates. And because rating matrix is very sparse, traditional collaborative filtering recommendation algorithm is not high efficiency. In order to obtain better accuracy, based on the consideration of common preference between users, the difference of rating scale and score of common items, this paper presents an improved similarity measure method, and based on this method, a collaborative filtering recommendation algorithm based on similarity improvement is proposed. Experimental results show that the algorithm can effectively improve the quality of recommendation, thus alleviate the impact of data sparseness.
A Sparsity-Promoted Method Based on Majorization-Minimization for Weak Fault Feature Enhancement

PubMed Central

Hao, Yansong; Song, Liuyang; Tang, Gang; Yuan, Hongfang

2018-01-01

Fault transient impulses induced by faulty components in rotating machinery usually contain substantial interference. Fault features are comparatively weak in the initial fault stage, which renders fault diagnosis more difficult. In this case, a sparse representation method based on the Majorzation-Minimization (MM) algorithm is proposed to enhance weak fault features and extract the features from strong background noise. However, the traditional MM algorithm suffers from two issues, which are the choice of sparse basis and complicated calculations. To address these challenges, a modified MM algorithm is proposed in which a sparse optimization objective function is designed firstly. Inspired by the Basis Pursuit (BP) model, the optimization function integrates an impulsive feature-preserving factor and a penalty function factor. Second, a modified Majorization iterative method is applied to address the convex optimization problem of the designed function. A series of sparse coefficients can be achieved through iterating, which only contain transient components. It is noteworthy that there is no need to select the sparse basis in the proposed iterative method because it is fixed as a unit matrix. Then the reconstruction step is omitted, which can significantly increase detection efficiency. Eventually, envelope analysis of the sparse coefficients is performed to extract weak fault features. Simulated and experimental signals including bearings and gearboxes are employed to validate the effectiveness of the proposed method. In addition, comparisons are made to prove that the proposed method outperforms the traditional MM algorithm in terms of detection results and efficiency. PMID:29597280
A Sparsity-Promoted Method Based on Majorization-Minimization for Weak Fault Feature Enhancement.

PubMed

Ren, Bangyue; Hao, Yansong; Wang, Huaqing; Song, Liuyang; Tang, Gang; Yuan, Hongfang

2018-03-28

Fault transient impulses induced by faulty components in rotating machinery usually contain substantial interference. Fault features are comparatively weak in the initial fault stage, which renders fault diagnosis more difficult. In this case, a sparse representation method based on the Majorzation-Minimization (MM) algorithm is proposed to enhance weak fault features and extract the features from strong background noise. However, the traditional MM algorithm suffers from two issues, which are the choice of sparse basis and complicated calculations. To address these challenges, a modified MM algorithm is proposed in which a sparse optimization objective function is designed firstly. Inspired by the Basis Pursuit (BP) model, the optimization function integrates an impulsive feature-preserving factor and a penalty function factor. Second, a modified Majorization iterative method is applied to address the convex optimization problem of the designed function. A series of sparse coefficients can be achieved through iterating, which only contain transient components. It is noteworthy that there is no need to select the sparse basis in the proposed iterative method because it is fixed as a unit matrix. Then the reconstruction step is omitted, which can significantly increase detection efficiency. Eventually, envelope analysis of the sparse coefficients is performed to extract weak fault features. Simulated and experimental signals including bearings and gearboxes are employed to validate the effectiveness of the proposed method. In addition, comparisons are made to prove that the proposed method outperforms the traditional MM algorithm in terms of detection results and efficiency.
A sparse matrix algorithm on the Boolean vector machine

NASA Technical Reports Server (NTRS)

Wagner, Robert A.; Patrick, Merrell L.

1988-01-01

VLSI technology is being used to implement a prototype Boolean Vector Machine (BVM), which is a large network of very small processors with equally small memories that operate in SIMD mode; these use bit-serial arithmetic, and communicate via cube-connected cycles network. The BVM's bit-serial arithmetic and the small memories of individual processors are noted to compromise the system's effectiveness in large numerical problem applications. Attention is presently given to the implementation of a basic matrix-vector iteration algorithm for space matrices of the BVM, in order to generate over 1 billion useful floating-point operations/sec for this iteration algorithm. The algorithm is expressed in a novel language designated 'BVM'.
Sparse representation of whole-brain fMRI signals for identification of functional networks.

PubMed

Lv, Jinglei; Jiang, Xi; Li, Xiang; Zhu, Dajiang; Chen, Hanbo; Zhang, Tuo; Zhang, Shu; Hu, Xintao; Han, Junwei; Huang, Heng; Zhang, Jing; Guo, Lei; Liu, Tianming

2015-02-01

There have been several recent studies that used sparse representation for fMRI signal analysis and activation detection based on the assumption that each voxel's fMRI signal is linearly composed of sparse components. Previous studies have employed sparse coding to model functional networks in various modalities and scales. These prior contributions inspired the exploration of whether/how sparse representation can be used to identify functional networks in a voxel-wise way and on the whole brain scale. This paper presents a novel, alternative methodology of identifying multiple functional networks via sparse representation of whole-brain task-based fMRI signals. Our basic idea is that all fMRI signals within the whole brain of one subject are aggregated into a big data matrix, which is then factorized into an over-complete dictionary basis matrix and a reference weight matrix via an effective online dictionary learning algorithm. Our extensive experimental results have shown that this novel methodology can uncover multiple functional networks that can be well characterized and interpreted in spatial, temporal and frequency domains based on current brain science knowledge. Importantly, these well-characterized functional network components are quite reproducible in different brains. In general, our methods offer a novel, effective and unified solution to multiple fMRI data analysis tasks including activation detection, de-activation detection, and functional network identification. Copyright © 2014 Elsevier B.V. All rights reserved.
Multivariable frequency domain identification via 2-norm minimization

NASA Technical Reports Server (NTRS)

Bayard, David S.

1992-01-01

The author develops a computational approach to multivariable frequency domain identification, based on 2-norm minimization. In particular, a Gauss-Newton (GN) iteration is developed to minimize the 2-norm of the error between frequency domain data and a matrix fraction transfer function estimate. To improve the global performance of the optimization algorithm, the GN iteration is initialized using the solution to a particular sequentially reweighted least squares problem, denoted as the SK iteration. The least squares problems which arise from both the SK and GN iterations are shown to involve sparse matrices with identical block structure. A sparse matrix QR factorization method is developed to exploit the special block structure, and to efficiently compute the least squares solution. A numerical example involving the identification of a multiple-input multiple-output (MIMO) plant having 286 unknown parameters is given to illustrate the effectiveness of the algorithm.
Automatic segmentation of right ventricle on ultrasound images using sparse matrix transform and level set

NASA Astrophysics Data System (ADS)

Qin, Xulei; Cong, Zhibin; Halig, Luma V.; Fei, Baowei

2013-03-01

An automatic framework is proposed to segment right ventricle on ultrasound images. This method can automatically segment both epicardial and endocardial boundaries from a continuous echocardiography series by combining sparse matrix transform (SMT), a training model, and a localized region based level set. First, the sparse matrix transform extracts main motion regions of myocardium as eigenimages by analyzing statistical information of these images. Second, a training model of right ventricle is registered to the extracted eigenimages in order to automatically detect the main location of the right ventricle and the corresponding transform relationship between the training model and the SMT-extracted results in the series. Third, the training model is then adjusted as an adapted initialization for the segmentation of each image in the series. Finally, based on the adapted initializations, a localized region based level set algorithm is applied to segment both epicardial and endocardial boundaries of the right ventricle from the whole series. Experimental results from real subject data validated the performance of the proposed framework in segmenting right ventricle from echocardiography. The mean Dice scores for both epicardial and endocardial boundaries are 89.1%+/-2.3% and 83.6+/-7.3%, respectively. The automatic segmentation method based on sparse matrix transform and level set can provide a useful tool for quantitative cardiac imaging.
Joint analysis of multiple high-dimensional data types using sparse matrix approximations of rank-1 with applications to ovarian and liver cancer.

PubMed

Okimoto, Gordon; Zeinalzadeh, Ashkan; Wenska, Tom; Loomis, Michael; Nation, James B; Fabre, Tiphaine; Tiirikainen, Maarit; Hernandez, Brenda; Chan, Owen; Wong, Linda; Kwee, Sandi

2016-01-01

Technological advances enable the cost-effective acquisition of Multi-Modal Data Sets (MMDS) composed of measurements for multiple, high-dimensional data types obtained from a common set of bio-samples. The joint analysis of the data matrices associated with the different data types of a MMDS should provide a more focused view of the biology underlying complex diseases such as cancer that would not be apparent from the analysis of a single data type alone. As multi-modal data rapidly accumulate in research laboratories and public databases such as The Cancer Genome Atlas (TCGA), the translation of such data into clinically actionable knowledge has been slowed by the lack of computational tools capable of analyzing MMDSs. Here, we describe the Joint Analysis of Many Matrices by ITeration (JAMMIT) algorithm that jointly analyzes the data matrices of a MMDS using sparse matrix approximations of rank-1. The JAMMIT algorithm jointly approximates an arbitrary number of data matrices by rank-1 outer-products composed of "sparse" left-singular vectors (eigen-arrays) that are unique to each matrix and a right-singular vector (eigen-signal) that is common to all the matrices. The non-zero coefficients of the eigen-arrays identify small subsets of variables for each data type (i.e., signatures) that in aggregate, or individually, best explain a dominant eigen-signal defined on the columns of the data matrices. The approximation is specified by a single "sparsity" parameter that is selected based on false discovery rate estimated by permutation testing. Multiple signals of interest in a given MDDS are sequentially detected and modeled by iterating JAMMIT on "residual" data matrices that result from a given sparse approximation. We show that JAMMIT outperforms other joint analysis algorithms in the detection of multiple signatures embedded in simulated MDDS. On real multimodal data for ovarian and liver cancer we show that JAMMIT identified multi-modal signatures that were clinically informative and enriched for cancer-related biology. Sparse matrix approximations of rank-1 provide a simple yet effective means of jointly reducing multiple, big data types to a small subset of variables that characterize important clinical and/or biological attributes of the bio-samples from which the data were acquired.
Non-convex Statistical Optimization for Sparse Tensor Graphical Model

PubMed Central

Sun, Wei; Wang, Zhaoran; Liu, Han; Cheng, Guang

2016-01-01

We consider the estimation of sparse graphical models that characterize the dependency structure of high-dimensional tensor-valued data. To facilitate the estimation of the precision matrix corresponding to each way of the tensor, we assume the data follow a tensor normal distribution whose covariance has a Kronecker product structure. The penalized maximum likelihood estimation of this model involves minimizing a non-convex objective function. In spite of the non-convexity of this estimation problem, we prove that an alternating minimization algorithm, which iteratively estimates each sparse precision matrix while fixing the others, attains an estimator with the optimal statistical rate of convergence as well as consistent graph recovery. Notably, such an estimator achieves estimation consistency with only one tensor sample, which is unobserved in previous work. Our theoretical results are backed by thorough numerical studies. PMID:28316459
Filtered gradient reconstruction algorithm for compressive spectral imaging

NASA Astrophysics Data System (ADS)

Mejia, Yuri; Arguello, Henry

2017-04-01

Compressive sensing matrices are traditionally based on random Gaussian and Bernoulli entries. Nevertheless, they are subject to physical constraints, and their structure unusually follows a dense matrix distribution, such as the case of the matrix related to compressive spectral imaging (CSI). The CSI matrix represents the integration of coded and shifted versions of the spectral bands. A spectral image can be recovered from CSI measurements by using iterative algorithms for linear inverse problems that minimize an objective function including a quadratic error term combined with a sparsity regularization term. However, current algorithms are slow because they do not exploit the structure and sparse characteristics of the CSI matrices. A gradient-based CSI reconstruction algorithm, which introduces a filtering step in each iteration of a conventional CSI reconstruction algorithm that yields improved image quality, is proposed. Motivated by the structure of the CSI matrix, Φ, this algorithm modifies the iterative solution such that it is forced to converge to a filtered version of the residual ΦTy, where y is the compressive measurement vector. We show that the filtered-based algorithm converges to better quality performance results than the unfiltered version. Simulation results highlight the relative performance gain over the existing iterative algorithms.
Layer-oriented multigrid wavefront reconstruction algorithms for multi-conjugate adaptive optics

NASA Astrophysics Data System (ADS)

Gilles, Luc; Ellerbroek, Brent L.; Vogel, Curtis R.

2003-02-01

Multi-conjugate adaptive optics (MCAO) systems with 104-105 degrees of freedom have been proposed for future giant telescopes. Using standard matrix methods to compute, optimize, and implement wavefront control algorithms for these systems is impractical, since the number of calculations required to compute and apply the reconstruction matrix scales respectively with the cube and the square of the number of AO degrees of freedom. In this paper, we develop an iterative sparse matrix implementation of minimum variance wavefront reconstruction for telescope diameters up to 32m with more than 104 actuators. The basic approach is the preconditioned conjugate gradient method, using a multigrid preconditioner incorporating a layer-oriented (block) symmetric Gauss-Seidel iterative smoothing operator. We present open-loop numerical simulation results to illustrate algorithm convergence.
Analysis of Monte Carlo accelerated iterative methods for sparse linear systems: Analysis of Monte Carlo accelerated iterative methods for sparse linear systems

DOE PAGES

Benzi, Michele; Evans, Thomas M.; Hamilton, Steven P.; ...

2017-03-05

Here, we consider hybrid deterministic-stochastic iterative algorithms for the solution of large, sparse linear systems. Starting from a convergent splitting of the coefficient matrix, we analyze various types of Monte Carlo acceleration schemes applied to the original preconditioned Richardson (stationary) iteration. We expect that these methods will have considerable potential for resiliency to faults when implemented on massively parallel machines. We also establish sufficient conditions for the convergence of the hybrid schemes, and we investigate different types of preconditioners including sparse approximate inverses. Numerical experiments on linear systems arising from the discretization of partial differential equations are presented.
Matrix computations in MACSYMA

NASA Technical Reports Server (NTRS)

Wang, P. S.

1977-01-01

Facilities built into MACSYMA for manipulating matrices with numeric or symbolic entries are described. Computations will be done exactly, keeping symbols as symbols. Topics discussed include how to form a matrix and create other matrices by transforming existing matrices within MACSYMA; arithmetic and other computation with matrices; and user control of computational processes through the use of optional variables. Two algorithms designed for sparse matrices are given. The computing times of several different ways to compute the determinant of a matrix are compared.
A projected preconditioned conjugate gradient algorithm for computing many extreme eigenpairs of a Hermitian matrix [A projected preconditioned conjugate gradient algorithm for computing a large eigenspace of a Hermitian matrix

DOE PAGES

Vecharynski, Eugene; Yang, Chao; Pask, John E.

2015-02-25

Here, we present an iterative algorithm for computing an invariant subspace associated with the algebraically smallest eigenvalues of a large sparse or structured Hermitian matrix A. We are interested in the case in which the dimension of the invariant subspace is large (e.g., over several hundreds or thousands) even though it may still be small relative to the dimension of A. These problems arise from, for example, density functional theory (DFT) based electronic structure calculations for complex materials. The key feature of our algorithm is that it performs fewer Rayleigh–Ritz calculations compared to existing algorithms such as the locally optimalmore » block preconditioned conjugate gradient or the Davidson algorithm. It is a block algorithm, and hence can take advantage of efficient BLAS3 operations and be implemented with multiple levels of concurrency. We discuss a number of practical issues that must be addressed in order to implement the algorithm efficiently on a high performance computer.« less
Eigensolver for a Sparse, Large Hermitian Matrix

NASA Technical Reports Server (NTRS)

Tisdale, E. Robert; Oyafuso, Fabiano; Klimeck, Gerhard; Brown, R. Chris

2003-01-01

A parallel-processing computer program finds a few eigenvalues in a sparse Hermitian matrix that contains as many as 100 million diagonal elements. This program finds the eigenvalues faster, using less memory, than do other, comparable eigensolver programs. This program implements a Lanczos algorithm in the American National Standards Institute/ International Organization for Standardization (ANSI/ISO) C computing language, using the Message Passing Interface (MPI) standard to complement an eigensolver in PARPACK. [PARPACK (Parallel Arnoldi Package) is an extension, to parallel-processing computer architectures, of ARPACK (Arnoldi Package), which is a collection of Fortran 77 subroutines that solve large-scale eigenvalue problems.] The eigensolver runs on Beowulf clusters of computers at the Jet Propulsion Laboratory (JPL).
The application of nonlinear programming and collocation to optimal aeroassisted orbital transfers

NASA Astrophysics Data System (ADS)

Shi, Y. Y.; Nelson, R. L.; Young, D. H.; Gill, P. E.; Murray, W.; Saunders, M. A.

1992-01-01

Sequential quadratic programming (SQP) and collocation of the differential equations of motion were applied to optimal aeroassisted orbital transfers. The Optimal Trajectory by Implicit Simulation (OTIS) computer program codes with updated nonlinear programming code (NZSOL) were used as a testbed for the SQP nonlinear programming (NLP) algorithms. The state-of-the-art sparse SQP method is considered to be effective for solving large problems with a sparse matrix. Sparse optimizers are characterized in terms of memory requirements and computational efficiency. For the OTIS problems, less than 10 percent of the Jacobian matrix elements are nonzero. The SQP method encompasses two phases: finding an initial feasible point by minimizing the sum of infeasibilities and minimizing the quadratic objective function within the feasible region. The orbital transfer problem under consideration involves the transfer from a high energy orbit to a low energy orbit.
A Fast Gradient Method for Nonnegative Sparse Regression With Self-Dictionary

NASA Astrophysics Data System (ADS)

Gillis, Nicolas; Luce, Robert

2018-01-01

A nonnegative matrix factorization (NMF) can be computed efficiently under the separability assumption, which asserts that all the columns of the given input data matrix belong to the cone generated by a (small) subset of them. The provably most robust methods to identify these conic basis columns are based on nonnegative sparse regression and self dictionaries, and require the solution of large-scale convex optimization problems. In this paper we study a particular nonnegative sparse regression model with self dictionary. As opposed to previously proposed models, this model yields a smooth optimization problem where the sparsity is enforced through linear constraints. We show that the Euclidean projection on the polyhedron defined by these constraints can be computed efficiently, and propose a fast gradient method to solve our model. We compare our algorithm with several state-of-the-art methods on synthetic data sets and real-world hyperspectral images.
Unsupervised Learning for Monaural Source Separation Using Maximization–Minimization Algorithm with Time–Frequency Deconvolution †

PubMed Central

Bouridane, Ahmed; Ling, Bingo Wing-Kuen

2018-01-01

This paper presents an unsupervised learning algorithm for sparse nonnegative matrix factor time–frequency deconvolution with optimized fractional β-divergence. The β-divergence is a group of cost functions parametrized by a single parameter β. The Itakura–Saito divergence, Kullback–Leibler divergence and Least Square distance are special cases that correspond to β=0, 1, 2, respectively. This paper presents a generalized algorithm that uses a flexible range of β that includes fractional values. It describes a maximization–minimization (MM) algorithm leading to the development of a fast convergence multiplicative update algorithm with guaranteed convergence. The proposed model operates in the time–frequency domain and decomposes an information-bearing matrix into two-dimensional deconvolution of factor matrices that represent the spectral dictionary and temporal codes. The deconvolution process has been optimized to yield sparse temporal codes through maximizing the likelihood of the observations. The paper also presents a method to estimate the fractional β value. The method is demonstrated on separating audio mixtures recorded from a single channel. The paper shows that the extraction of the spectral dictionary and temporal codes is significantly more efficient by using the proposed algorithm and subsequently leads to better source separation performance. Experimental tests and comparisons with other factorization methods have been conducted to verify its efficacy. PMID:29702629

CPDES3: A preconditioned conjugate gradient solver for linear asymmetric matrix equations arising from coupled partial differential equations in three dimensions

NASA Astrophysics Data System (ADS)

Anderson, D. V.; Koniges, A. E.; Shumaker, D. E.

1988-11-01

Many physical problems require the solution of coupled partial differential equations on three-dimensional domains. When the time scales of interest dictate an implicit discretization of the equations a rather complicated global matrix system needs solution. The exact form of the matrix depends on the choice of spatial grids and on the finite element or finite difference approximations employed. CPDES3 allows each spatial operator to have 7, 15, 19, or 27 point stencils and allows for general couplings between all of the component PDE's and it automatically generates the matrix structures needed to perform the algorithm. The resulting sparse matrix equation is solved by either the preconditioned conjugate gradient (CG) method or by the preconditioned biconjugate gradient (BCG) algorithm. An arbitrary number of component equations are permitted only limited by available memory. In the sub-band representation used, we generate an algorithm that is written compactly in terms of indirect induces which is vectorizable on some of the newer scientific computers.
CPDES2: A preconditioned conjugate gradient solver for linear asymmetric matrix equations arising from coupled partial differential equations in two dimensions

NASA Astrophysics Data System (ADS)

Anderson, D. V.; Koniges, A. E.; Shumaker, D. E.

1988-11-01

Many physical problems require the solution of coupled partial differential equations on two-dimensional domains. When the time scales of interest dictate an implicit discretization of the equations a rather complicated global matrix system needs solution. The exact form of the matrix depends on the choice of spatial grids and on the finite element or finite difference approximations employed. CPDES2 allows each spatial operator to have 5 or 9 point stencils and allows for general couplings between all of the component PDE's and it automatically generates the matrix structures needed to perform the algorithm. The resulting sparse matrix equation is solved by either the preconditioned conjugate gradient (CG) method or by the preconditioned biconjugate gradient (BCG) algorithm. An arbitrary number of component equations are permitted only limited by available memory. In the sub-band representation used, we generate an algorithm that is written compactly in terms of indirect indices which is vectorizable on some of the newer scientific computers.
Fast Solution in Sparse LDA for Binary Classification

NASA Technical Reports Server (NTRS)

Moghaddam, Baback

2010-01-01

An algorithm that performs sparse linear discriminant analysis (Sparse-LDA) finds near-optimal solutions in far less time than the prior art when specialized to binary classification (of 2 classes). Sparse-LDA is a type of feature- or variable- selection problem with numerous applications in statistics, machine learning, computer vision, computational finance, operations research, and bio-informatics. Because of its combinatorial nature, feature- or variable-selection problems are NP-hard or computationally intractable in cases involving more than 30 variables or features. Therefore, one typically seeks approximate solutions by means of greedy search algorithms. The prior Sparse-LDA algorithm was a greedy algorithm that considered the best variable or feature to add/ delete to/ from its subsets in order to maximally discriminate between multiple classes of data. The present algorithm is designed for the special but prevalent case of 2-class or binary classification (e.g. 1 vs. 0, functioning vs. malfunctioning, or change versus no change). The present algorithm provides near-optimal solutions on large real-world datasets having hundreds or even thousands of variables or features (e.g. selecting the fewest wavelength bands in a hyperspectral sensor to do terrain classification) and does so in typical computation times of minutes as compared to days or weeks as taken by the prior art. Sparse LDA requires solving generalized eigenvalue problems for a large number of variable subsets (represented by the submatrices of the input within-class and between-class covariance matrices). In the general (fullrank) case, the amount of computation scales at least cubically with the number of variables and thus the size of the problems that can be solved is limited accordingly. However, in binary classification, the principal eigenvalues can be found using a special analytic formula, without resorting to costly iterative techniques. The present algorithm exploits this analytic form along with the inherent sequential nature of greedy search itself. Together this enables the use of highly-efficient partitioned-matrix-inverse techniques that result in large speedups of computation in both the forward-selection and backward-elimination stages of greedy algorithms in general.
Sparse electrocardiogram signals recovery based on solving a row echelon-like form of system.

PubMed

Cai, Pingmei; Wang, Guinan; Yu, Shiwei; Zhang, Hongjuan; Ding, Shuxue; Wu, Zikai

2016-02-01

The study of biology and medicine in a noise environment is an evolving direction in biological data analysis. Among these studies, analysis of electrocardiogram (ECG) signals in a noise environment is a challenging direction in personalized medicine. Due to its periodic characteristic, ECG signal can be roughly regarded as sparse biomedical signals. This study proposes a two-stage recovery algorithm for sparse biomedical signals in time domain. In the first stage, the concentration subspaces are found in advance. Then by exploiting these subspaces, the mixing matrix is estimated accurately. In the second stage, based on the number of active sources at each time point, the time points are divided into different layers. Next, by constructing some transformation matrices, these time points form a row echelon-like system. After that, the sources at each layer can be solved out explicitly by corresponding matrix operations. It is noting that all these operations are conducted under a weak sparse condition that the number of active sources is less than the number of observations. Experimental results show that the proposed method has a better performance for sparse ECG signal recovery problem.
Three dimensional iterative beam propagation method for optical waveguide devices

NASA Astrophysics Data System (ADS)

Ma, Changbao; Van Keuren, Edward

2006-10-01

The finite difference beam propagation method (FD-BPM) is an effective model for simulating a wide range of optical waveguide structures. The classical FD-BPMs are based on the Crank-Nicholson scheme, and in tridiagonal form can be solved using the Thomas method. We present a different type of algorithm for 3-D structures. In this algorithm, the wave equation is formulated into a large sparse matrix equation which can be solved using iterative methods. The simulation window shifting scheme and threshold technique introduced in our earlier work are utilized to overcome the convergence problem of iterative methods for large sparse matrix equation and wide-angle simulations. This method enables us to develop higher-order 3-D wide-angle (WA-) BPMs based on Pade approximant operators and the multistep method, which are commonly used in WA-BPMs for 2-D structures. Simulations using the new methods will be compared to the analytical results to assure its effectiveness and applicability.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Jakeman, John D.; Narayan, Akil; Zhou, Tao

We propose an algorithm for recovering sparse orthogonal polynomial expansions via collocation. A standard sampling approach for recovering sparse polynomials uses Monte Carlo sampling, from the density of orthogonality, which results in poor function recovery when the polynomial degree is high. Our proposed approach aims to mitigate this limitation by sampling with respect to the weighted equilibrium measure of the parametric domain and subsequently solves a preconditionedmore » $$\\ell^1$$-minimization problem, where the weights of the diagonal preconditioning matrix are given by evaluations of the Christoffel function. Our algorithm can be applied to a wide class of orthogonal polynomial families on bounded and unbounded domains, including all classical families. We present theoretical analysis to motivate the algorithm and numerical results that show our method is superior to standard Monte Carlo methods in many situations of interest. In conclusion, numerical examples are also provided to demonstrate that our proposed algorithm leads to comparable or improved accuracy even when compared with Legendre- and Hermite-specific algorithms.« less
A Stabilized Sparse-Matrix U-D Square-Root Implementation of a Large-State Extended Kalman Filter

NASA Technical Reports Server (NTRS)

Boggs, D.; Ghil, M.; Keppenne, C.

1995-01-01

The full nonlinear Kalman filter sequential algorithm is, in theory, well-suited to the four-dimensional data assimilation problem in large-scale atmospheric and oceanic problems. However, it was later discovered that this algorithm can be very sensitive to computer roundoff, and that results may cease to be meaningful as time advances. Implementations of a modified Kalman filter are given.
Partitioning sparse matrices with eigenvectors of graphs

NASA Technical Reports Server (NTRS)

Pothen, Alex; Simon, Horst D.; Liou, Kang-Pu

1990-01-01

The problem of computing a small vertex separator in a graph arises in the context of computing a good ordering for the parallel factorization of sparse, symmetric matrices. An algebraic approach for computing vertex separators is considered in this paper. It is shown that lower bounds on separator sizes can be obtained in terms of the eigenvalues of the Laplacian matrix associated with a graph. The Laplacian eigenvectors of grid graphs can be computed from Kronecker products involving the eigenvectors of path graphs, and these eigenvectors can be used to compute good separators in grid graphs. A heuristic algorithm is designed to compute a vertex separator in a general graph by first computing an edge separator in the graph from an eigenvector of the Laplacian matrix, and then using a maximum matching in a subgraph to compute the vertex separator. Results on the quality of the separators computed by the spectral algorithm are presented, and these are compared with separators obtained from other algorithms for computing separators. Finally, the time required to compute the Laplacian eigenvector is reported, and the accuracy with which the eigenvector must be computed to obtain good separators is considered. The spectral algorithm has the advantage that it can be implemented on a medium-size multiprocessor in a straightforward manner.
On-Chip Neural Data Compression Based On Compressed Sensing With Sparse Sensing Matrices.

PubMed

Zhao, Wenfeng; Sun, Biao; Wu, Tong; Yang, Zhi

2018-02-01

On-chip neural data compression is an enabling technique for wireless neural interfaces that suffer from insufficient bandwidth and power budgets to transmit the raw data. The data compression algorithm and its implementation should be power and area efficient and functionally reliable over different datasets. Compressed sensing is an emerging technique that has been applied to compress various neurophysiological data. However, the state-of-the-art compressed sensing (CS) encoders leverage random but dense binary measurement matrices, which incur substantial implementation costs on both power and area that could offset the benefits from the reduced wireless data rate. In this paper, we propose two CS encoder designs based on sparse measurement matrices that could lead to efficient hardware implementation. Specifically, two different approaches for the construction of sparse measurement matrices, i.e., the deterministic quasi-cyclic array code (QCAC) matrix and -sparse random binary matrix [-SRBM] are exploited. We demonstrate that the proposed CS encoders lead to comparable recovery performance. And efficient VLSI architecture designs are proposed for QCAC-CS and -SRBM encoders with reduced area and total power consumption.
Fast super-resolution estimation of DOA and DOD in bistatic MIMO Radar with off-grid targets

NASA Astrophysics Data System (ADS)

Zhang, Dong; Zhang, Yongshun; Zheng, Guimei; Feng, Cunqian; Tang, Jun

2018-05-01

In this paper, we focus on the problem of joint DOA and DOD estimation in Bistatic MIMO Radar using sparse reconstruction method. In traditional ways, we usually convert the 2D parameter estimation problem into 1D parameter estimation problem by Kronecker product which will enlarge the scale of the parameter estimation problem and bring more computational burden. Furthermore, it requires that the targets must fall on the predefined grids. In this paper, a 2D-off-grid model is built which can solve the grid mismatch problem of 2D parameters estimation. Then in order to solve the joint 2D sparse reconstruction problem directly and efficiently, three kinds of fast joint sparse matrix reconstruction methods are proposed which are Joint-2D-OMP algorithm, Joint-2D-SL0 algorithm and Joint-2D-SOONE algorithm. Simulation results demonstrate that our methods not only can improve the 2D parameter estimation accuracy but also reduce the computational complexity compared with the traditional Kronecker Compressed Sensing method.
New algorithms for field-theoretic block copolymer simulations: Progress on using adaptive-mesh refinement and sparse matrix solvers in SCFT calculations

NASA Astrophysics Data System (ADS)

Sides, Scott; Jamroz, Ben; Crockett, Robert; Pletzer, Alexander

2012-02-01

Self-consistent field theory (SCFT) for dense polymer melts has been highly successful in describing complex morphologies in block copolymers. Field-theoretic simulations such as these are able to access large length and time scales that are difficult or impossible for particle-based simulations such as molecular dynamics. The modified diffusion equations that arise as a consequence of the coarse-graining procedure in the SCF theory can be efficiently solved with a pseudo-spectral (PS) method that uses fast-Fourier transforms on uniform Cartesian grids. However, PS methods can be difficult to apply in many block copolymer SCFT simulations (eg. confinement, interface adsorption) in which small spatial regions might require finer resolution than most of the simulation grid. Progress on using new solver algorithms to address these problems will be presented. The Tech-X Chompst project aims at marrying the best of adaptive mesh refinement with linear matrix solver algorithms. The Tech-X code PolySwift++ is an SCFT simulation platform that leverages ongoing development in coupling Chombo, a package for solving PDEs via block-structured AMR calculations and embedded boundaries, with PETSc, a toolkit that includes a large assortment of sparse linear solvers.
Newmark-Beta-FDTD method for super-resolution analysis of time reversal waves

NASA Astrophysics Data System (ADS)

Shi, Sheng-Bing; Shao, Wei; Ma, Jing; Jin, Congjun; Wang, Xiao-Hua

2017-09-01

In this work, a new unconditionally stable finite-difference time-domain (FDTD) method with the split-field perfectly matched layer (PML) is proposed for the analysis of time reversal (TR) waves. The proposed method is very suitable for multiscale problems involving microstructures. The spatial and temporal derivatives in this method are discretized by the central difference technique and Newmark-Beta algorithm, respectively, and the derivation results in the calculation of a banded-sparse matrix equation. Since the coefficient matrix keeps unchanged during the whole simulation process, the lower-upper (LU) decomposition of the matrix needs to be performed only once at the beginning of the calculation. Moreover, the reverse Cuthill-Mckee (RCM) technique, an effective preprocessing technique in bandwidth compression of sparse matrices, is used to improve computational efficiency. The super-resolution focusing of TR wave propagation in two- and three-dimensional spaces is included to validate the accuracy and efficiency of the proposed method.
A modified dual-level algorithm for large-scale three-dimensional Laplace and Helmholtz equation

NASA Astrophysics Data System (ADS)

Li, Junpu; Chen, Wen; Fu, Zhuojia

2018-01-01

A modified dual-level algorithm is proposed in the article. By the help of the dual level structure, the fully-populated interpolation matrix on the fine level is transformed to a local supported sparse matrix to solve the highly ill-conditioning and excessive storage requirement resulting from fully-populated interpolation matrix. The kernel-independent fast multipole method is adopted to expediting the solving process of the linear equations on the coarse level. Numerical experiments up to 2-million fine-level nodes have successfully been achieved. It is noted that the proposed algorithm merely needs to place 2-3 coarse-level nodes in each wavelength per direction to obtain the reasonable solution, which almost down to the minimum requirement allowed by the Shannon's sampling theorem. In the real human head model example, it is observed that the proposed algorithm can simulate well computationally very challenging exterior high-frequency harmonic acoustic wave propagation up to 20,000 Hz.
An efficient classification method based on principal component and sparse representation.

PubMed

Zhai, Lin; Fu, Shujun; Zhang, Caiming; Liu, Yunxian; Wang, Lu; Liu, Guohua; Yang, Mingqiang

2016-01-01

As an important application in optical imaging, palmprint recognition is interfered by many unfavorable factors. An effective fusion of blockwise bi-directional two-dimensional principal component analysis and grouping sparse classification is presented. The dimension reduction and normalizing are implemented by the blockwise bi-directional two-dimensional principal component analysis for palmprint images to extract feature matrixes, which are assembled into an overcomplete dictionary in sparse classification. A subspace orthogonal matching pursuit algorithm is designed to solve the grouping sparse representation. Finally, the classification result is gained by comparing the residual between testing and reconstructed images. Experiments are carried out on a palmprint database, and the results show that this method has better robustness against position and illumination changes of palmprint images, and can get higher rate of palmprint recognition.
Simultaneous analysis of large INTEGRAL/SPI1 datasets: Optimizing the computation of the solution and its variance using sparse matrix algorithms

NASA Astrophysics Data System (ADS)

Bouchet, L.; Amestoy, P.; Buttari, A.; Rouet, F.-H.; Chauvin, M.

2013-02-01

Nowadays, analyzing and reducing the ever larger astronomical datasets is becoming a crucial challenge, especially for long cumulated observation times. The INTEGRAL/SPI X/γ-ray spectrometer is an instrument for which it is essential to process many exposures at the same time in order to increase the low signal-to-noise ratio of the weakest sources. In this context, the conventional methods for data reduction are inefficient and sometimes not feasible at all. Processing several years of data simultaneously requires computing not only the solution of a large system of equations, but also the associated uncertainties. We aim at reducing the computation time and the memory usage. Since the SPI transfer function is sparse, we have used some popular methods for the solution of large sparse linear systems; we briefly review these methods. We use the Multifrontal Massively Parallel Solver (MUMPS) to compute the solution of the system of equations. We also need to compute the variance of the solution, which amounts to computing selected entries of the inverse of the sparse matrix corresponding to our linear system. This can be achieved through one of the latest features of the MUMPS software that has been partly motivated by this work. In this paper we provide a brief presentation of this feature and evaluate its effectiveness on astrophysical problems requiring the processing of large datasets simultaneously, such as the study of the entire emission of the Galaxy. We used these algorithms to solve the large sparse systems arising from SPI data processing and to obtain both their solutions and the associated variances. In conclusion, thanks to these newly developed tools, processing large datasets arising from SPI is now feasible with both a reasonable execution time and a low memory usage.
Preconditioned conjugate gradient wave-front reconstructors for multiconjugate adaptive optics

NASA Astrophysics Data System (ADS)

Gilles, Luc; Ellerbroek, Brent L.; Vogel, Curtis R.

2003-09-01

Multiconjugate adaptive optics (MCAO) systems with 104-105 degrees of freedom have been proposed for future giant telescopes. Using standard matrix methods to compute, optimize, and implement wave-front control algorithms for these systems is impractical, since the number of calculations required to compute and apply the reconstruction matrix scales respectively with the cube and the square of the number of adaptive optics degrees of freedom. We develop scalable open-loop iterative sparse matrix implementations of minimum variance wave-front reconstruction for telescope diameters up to 32 m with more than 104 actuators. The basic approach is the preconditioned conjugate gradient method with an efficient preconditioner, whose block structure is defined by the atmospheric turbulent layers very much like the layer-oriented MCAO algorithms of current interest. Two cost-effective preconditioners are investigated: a multigrid solver and a simpler block symmetric Gauss-Seidel (BSGS) sweep. Both options require off-line sparse Cholesky factorizations of the diagonal blocks of the matrix system. The cost to precompute these factors scales approximately as the three-halves power of the number of estimated phase grid points per atmospheric layer, and their average update rate is typically of the order of 10-2 Hz, i.e., 4-5 orders of magnitude lower than the typical 103 Hz temporal sampling rate. All other computations scale almost linearly with the total number of estimated phase grid points. We present numerical simulation results to illustrate algorithm convergence. Convergence rates of both preconditioners are similar, regardless of measurement noise level, indicating that the layer-oriented BSGS sweep is as effective as the more elaborated multiresolution preconditioner.
An analysis of spectral envelope-reduction via quadratic assignment problems

NASA Technical Reports Server (NTRS)

George, Alan; Pothen, Alex

1994-01-01

A new spectral algorithm for reordering a sparse symmetric matrix to reduce its envelope size was described. The ordering is computed by associating a Laplacian matrix with the given matrix and then sorting the components of a specified eigenvector of the Laplacian. In this paper, we provide an analysis of the spectral envelope reduction algorithm. We described related 1- and 2-sum problems; the former is related to the envelope size, while the latter is related to an upper bound on the work involved in an envelope Cholesky factorization scheme. We formulate the latter two problems as quadratic assignment problems, and then study the 2-sum problem in more detail. We obtain lower bounds on the 2-sum by considering a projected quadratic assignment problem, and then show that finding a permutation matrix closest to an orthogonal matrix attaining one of the lower bounds justifies the spectral envelope reduction algorithm. The lower bound on the 2-sum is seen to be tight for reasonably 'uniform' finite element meshes. We also obtain asymptotically tight lower bounds for the envelope size for certain classes of meshes.
An Online Dictionary Learning-Based Compressive Data Gathering Algorithm in Wireless Sensor Networks

PubMed Central

Wang, Donghao; Wan, Jiangwen; Chen, Junying; Zhang, Qiang

2016-01-01

To adapt to sense signals of enormous diversities and dynamics, and to decrease the reconstruction errors caused by ambient noise, a novel online dictionary learning method-based compressive data gathering (ODL-CDG) algorithm is proposed. The proposed dictionary is learned from a two-stage iterative procedure, alternately changing between a sparse coding step and a dictionary update step. The self-coherence of the learned dictionary is introduced as a penalty term during the dictionary update procedure. The dictionary is also constrained with sparse structure. It’s theoretically demonstrated that the sensing matrix satisfies the restricted isometry property (RIP) with high probability. In addition, the lower bound of necessary number of measurements for compressive sensing (CS) reconstruction is given. Simulation results show that the proposed ODL-CDG algorithm can enhance the recovery accuracy in the presence of noise, and reduce the energy consumption in comparison with other dictionary based data gathering methods. PMID:27669250
An Online Dictionary Learning-Based Compressive Data Gathering Algorithm in Wireless Sensor Networks.

PubMed

Wang, Donghao; Wan, Jiangwen; Chen, Junying; Zhang, Qiang

2016-09-22

To adapt to sense signals of enormous diversities and dynamics, and to decrease the reconstruction errors caused by ambient noise, a novel online dictionary learning method-based compressive data gathering (ODL-CDG) algorithm is proposed. The proposed dictionary is learned from a two-stage iterative procedure, alternately changing between a sparse coding step and a dictionary update step. The self-coherence of the learned dictionary is introduced as a penalty term during the dictionary update procedure. The dictionary is also constrained with sparse structure. It's theoretically demonstrated that the sensing matrix satisfies the restricted isometry property (RIP) with high probability. In addition, the lower bound of necessary number of measurements for compressive sensing (CS) reconstruction is given. Simulation results show that the proposed ODL-CDG algorithm can enhance the recovery accuracy in the presence of noise, and reduce the energy consumption in comparison with other dictionary based data gathering methods.
Parallel pivoting combined with parallel reduction

NASA Technical Reports Server (NTRS)

Alaghband, Gita

1987-01-01

Parallel algorithms for triangularization of large, sparse, and unsymmetric matrices are presented. The method combines the parallel reduction with a new parallel pivoting technique, control over generations of fill-ins and a check for numerical stability, all done in parallel with the work being distributed over the active processes. The parallel technique uses the compatibility relation between pivots to identify parallel pivot candidates and uses the Markowitz number of pivots to minimize fill-in. This technique is not a preordering of the sparse matrix and is applied dynamically as the decomposition proceeds.

Blind compressed sensing image reconstruction based on alternating direction method

NASA Astrophysics Data System (ADS)

Liu, Qinan; Guo, Shuxu

2018-04-01

In order to solve the problem of how to reconstruct the original image under the condition of unknown sparse basis, this paper proposes an image reconstruction method based on blind compressed sensing model. In this model, the image signal is regarded as the product of a sparse coefficient matrix and a dictionary matrix. Based on the existing blind compressed sensing theory, the optimal solution is solved by the alternative minimization method. The proposed method solves the problem that the sparse basis in compressed sensing is difficult to represent, which restrains the noise and improves the quality of reconstructed image. This method ensures that the blind compressed sensing theory has a unique solution and can recover the reconstructed original image signal from a complex environment with a stronger self-adaptability. The experimental results show that the image reconstruction algorithm based on blind compressed sensing proposed in this paper can recover high quality image signals under the condition of under-sampling.
Accelerating scientific computations with mixed precision algorithms

NASA Astrophysics Data System (ADS)

Baboulin, Marc; Buttari, Alfredo; Dongarra, Jack; Kurzak, Jakub; Langou, Julie; Langou, Julien; Luszczek, Piotr; Tomov, Stanimire

2009-12-01

On modern architectures, the performance of 32-bit operations is often at least twice as fast as the performance of 64-bit operations. By using a combination of 32-bit and 64-bit floating point arithmetic, the performance of many dense and sparse linear algebra algorithms can be significantly enhanced while maintaining the 64-bit accuracy of the resulting solution. The approach presented here can apply not only to conventional processors but also to other technologies such as Field Programmable Gate Arrays (FPGA), Graphical Processing Units (GPU), and the STI Cell BE processor. Results on modern processor architectures and the STI Cell BE are presented. Program summaryProgram title: ITER-REF Catalogue identifier: AECO_v1_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/AECO_v1_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html No. of lines in distributed program, including test data, etc.: 7211 No. of bytes in distributed program, including test data, etc.: 41 862 Distribution format: tar.gz Programming language: FORTRAN 77 Computer: desktop, server Operating system: Unix/Linux RAM: 512 Mbytes Classification: 4.8 External routines: BLAS (optional) Nature of problem: On modern architectures, the performance of 32-bit operations is often at least twice as fast as the performance of 64-bit operations. By using a combination of 32-bit and 64-bit floating point arithmetic, the performance of many dense and sparse linear algebra algorithms can be significantly enhanced while maintaining the 64-bit accuracy of the resulting solution. Solution method: Mixed precision algorithms stem from the observation that, in many cases, a single precision solution of a problem can be refined to the point where double precision accuracy is achieved. A common approach to the solution of linear systems, either dense or sparse, is to perform the LU factorization of the coefficient matrix using Gaussian elimination. First, the coefficient matrix A is factored into the product of a lower triangular matrix L and an upper triangular matrix U. Partial row pivoting is in general used to improve numerical stability resulting in a factorization PA=LU, where P is a permutation matrix. The solution for the system is achieved by first solving Ly=Pb (forward substitution) and then solving Ux=y (backward substitution). Due to round-off errors, the computed solution, x, carries a numerical error magnified by the condition number of the coefficient matrix A. In order to improve the computed solution, an iterative process can be applied, which produces a correction to the computed solution at each iteration, which then yields the method that is commonly known as the iterative refinement algorithm. Provided that the system is not too ill-conditioned, the algorithm produces a solution correct to the working precision. Running time: seconds/minutes
Laplace-domain waveform modeling and inversion for the 3D acoustic-elastic coupled media

NASA Astrophysics Data System (ADS)

Shin, Jungkyun; Shin, Changsoo; Calandra, Henri

2016-06-01

Laplace-domain waveform inversion reconstructs long-wavelength subsurface models by using the zero-frequency component of damped seismic signals. Despite the computational advantages of Laplace-domain waveform inversion over conventional frequency-domain waveform inversion, an acoustic assumption and an iterative matrix solver have been used to invert 3D marine datasets to mitigate the intensive computing cost. In this study, we develop a Laplace-domain waveform modeling and inversion algorithm for 3D acoustic-elastic coupled media by using a parallel sparse direct solver library (MUltifrontal Massively Parallel Solver, MUMPS). We precisely simulate a real marine environment by coupling the 3D acoustic and elastic wave equations with the proper boundary condition at the fluid-solid interface. In addition, we can extract the elastic properties of the Earth below the sea bottom from the recorded acoustic pressure datasets. As a matrix solver, the parallel sparse direct solver is used to factorize the non-symmetric impedance matrix in a distributed memory architecture and rapidly solve the wave field for a number of shots by using the lower and upper matrix factors. Using both synthetic datasets and real datasets obtained by a 3D wide azimuth survey, the long-wavelength component of the P-wave and S-wave velocity models is reconstructed and the proposed modeling and inversion algorithm are verified. A cluster of 80 CPU cores is used for this study.
Large-region acoustic source mapping using a movable array and sparse covariance fitting.

PubMed

Zhao, Shengkui; Tuna, Cagdas; Nguyen, Thi Ngoc Tho; Jones, Douglas L

2017-01-01

Large-region acoustic source mapping is important for city-scale noise monitoring. Approaches using a single-position measurement scheme to scan large regions using small arrays cannot provide clean acoustic source maps, while deploying large arrays spanning the entire region of interest is prohibitively expensive. A multiple-position measurement scheme is applied to scan large regions at multiple spatial positions using a movable array of small size. Based on the multiple-position measurement scheme, a sparse-constrained multiple-position vectorized covariance matrix fitting approach is presented. In the proposed approach, the overall sample covariance matrix of the incoherent virtual array is first estimated using the multiple-position array data and then vectorized using the Khatri-Rao (KR) product. A linear model is then constructed for fitting the vectorized covariance matrix and a sparse-constrained reconstruction algorithm is proposed for recovering source powers from the model. The user parameter settings are discussed. The proposed approach is tested on a 30 m × 40 m region and a 60 m × 40 m region using simulated and measured data. Much cleaner acoustic source maps and lower sound pressure level errors are obtained compared to the beamforming approaches and the previous sparse approach [Zhao, Tuna, Nguyen, and Jones, Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Processing (ICASSP) (2016)].
User's Manual for PCSMS (Parallel Complex Sparse Matrix Solver). Version 1.

NASA Technical Reports Server (NTRS)

Reddy, C. J.

2000-01-01

PCSMS (Parallel Complex Sparse Matrix Solver) is a computer code written to make use of the existing real sparse direct solvers to solve complex, sparse matrix linear equations. PCSMS converts complex matrices into real matrices and use real, sparse direct matrix solvers to factor and solve the real matrices. The solution vector is reconverted to complex numbers. Though, this utility is written for Silicon Graphics (SGI) real sparse matrix solution routines, it is general in nature and can be easily modified to work with any real sparse matrix solver. The User's Manual is written to make the user acquainted with the installation and operation of the code. Driver routines are given to aid the users to integrate PCSMS routines in their own codes.
Randomized subspace-based robust principal component analysis for hyperspectral anomaly detection

NASA Astrophysics Data System (ADS)

Sun, Weiwei; Yang, Gang; Li, Jialin; Zhang, Dianfa

2018-01-01

A randomized subspace-based robust principal component analysis (RSRPCA) method for anomaly detection in hyperspectral imagery (HSI) is proposed. The RSRPCA combines advantages of randomized column subspace and robust principal component analysis (RPCA). It assumes that the background has low-rank properties, and the anomalies are sparse and do not lie in the column subspace of the background. First, RSRPCA implements random sampling to sketch the original HSI dataset from columns and to construct a randomized column subspace of the background. Structured random projections are also adopted to sketch the HSI dataset from rows. Sketching from columns and rows could greatly reduce the computational requirements of RSRPCA. Second, the RSRPCA adopts the columnwise RPCA (CWRPCA) to eliminate negative effects of sampled anomaly pixels and that purifies the previous randomized column subspace by removing sampled anomaly columns. The CWRPCA decomposes the submatrix of the HSI data into a low-rank matrix (i.e., background component), a noisy matrix (i.e., noise component), and a sparse anomaly matrix (i.e., anomaly component) with only a small proportion of nonzero columns. The algorithm of inexact augmented Lagrange multiplier is utilized to optimize the CWRPCA problem and estimate the sparse matrix. Nonzero columns of the sparse anomaly matrix point to sampled anomaly columns in the submatrix. Third, all the pixels are projected onto the complemental subspace of the purified randomized column subspace of the background and the anomaly pixels in the original HSI data are finally exactly located. Several experiments on three real hyperspectral images are carefully designed to investigate the detection performance of RSRPCA, and the results are compared with four state-of-the-art methods. Experimental results show that the proposed RSRPCA outperforms four comparison methods both in detection performance and in computational time.
Convex Banding of the Covariance Matrix

PubMed Central

Bien, Jacob; Bunea, Florentina; Xiao, Luo

2016-01-01

We introduce a new sparse estimator of the covariance matrix for high-dimensional models in which the variables have a known ordering. Our estimator, which is the solution to a convex optimization problem, is equivalently expressed as an estimator which tapers the sample covariance matrix by a Toeplitz, sparsely-banded, data-adaptive matrix. As a result of this adaptivity, the convex banding estimator enjoys theoretical optimality properties not attained by previous banding or tapered estimators. In particular, our convex banding estimator is minimax rate adaptive in Frobenius and operator norms, up to log factors, over commonly-studied classes of covariance matrices, and over more general classes. Furthermore, it correctly recovers the bandwidth when the true covariance is exactly banded. Our convex formulation admits a simple and efficient algorithm. Empirical studies demonstrate its practical effectiveness and illustrate that our exactly-banded estimator works well even when the true covariance matrix is only close to a banded matrix, confirming our theoretical results. Our method compares favorably with all existing methods, in terms of accuracy and speed. We illustrate the practical merits of the convex banding estimator by showing that it can be used to improve the performance of discriminant analysis for classifying sound recordings. PMID:28042189
Convex Banding of the Covariance Matrix.

PubMed

Bien, Jacob; Bunea, Florentina; Xiao, Luo

2016-01-01

We introduce a new sparse estimator of the covariance matrix for high-dimensional models in which the variables have a known ordering. Our estimator, which is the solution to a convex optimization problem, is equivalently expressed as an estimator which tapers the sample covariance matrix by a Toeplitz, sparsely-banded, data-adaptive matrix. As a result of this adaptivity, the convex banding estimator enjoys theoretical optimality properties not attained by previous banding or tapered estimators. In particular, our convex banding estimator is minimax rate adaptive in Frobenius and operator norms, up to log factors, over commonly-studied classes of covariance matrices, and over more general classes. Furthermore, it correctly recovers the bandwidth when the true covariance is exactly banded. Our convex formulation admits a simple and efficient algorithm. Empirical studies demonstrate its practical effectiveness and illustrate that our exactly-banded estimator works well even when the true covariance matrix is only close to a banded matrix, confirming our theoretical results. Our method compares favorably with all existing methods, in terms of accuracy and speed. We illustrate the practical merits of the convex banding estimator by showing that it can be used to improve the performance of discriminant analysis for classifying sound recordings.
Fast Compressive Tracking.

PubMed

Zhang, Kaihua; Zhang, Lei; Yang, Ming-Hsuan

2014-10-01

It is a challenging task to develop effective and efficient appearance models for robust object tracking due to factors such as pose variation, illumination change, occlusion, and motion blur. Existing online tracking algorithms often update models with samples from observations in recent frames. Despite much success has been demonstrated, numerous issues remain to be addressed. First, while these adaptive appearance models are data-dependent, there does not exist sufficient amount of data for online algorithms to learn at the outset. Second, online tracking algorithms often encounter the drift problems. As a result of self-taught learning, misaligned samples are likely to be added and degrade the appearance models. In this paper, we propose a simple yet effective and efficient tracking algorithm with an appearance model based on features extracted from a multiscale image feature space with data-independent basis. The proposed appearance model employs non-adaptive random projections that preserve the structure of the image feature space of objects. A very sparse measurement matrix is constructed to efficiently extract the features for the appearance model. We compress sample images of the foreground target and the background using the same sparse measurement matrix. The tracking task is formulated as a binary classification via a naive Bayes classifier with online update in the compressed domain. A coarse-to-fine search strategy is adopted to further reduce the computational complexity in the detection procedure. The proposed compressive tracking algorithm runs in real-time and performs favorably against state-of-the-art methods on challenging sequences in terms of efficiency, accuracy and robustness.
Nonlocal low-rank and sparse matrix decomposition for spectral CT reconstruction

NASA Astrophysics Data System (ADS)

Niu, Shanzhou; Yu, Gaohang; Ma, Jianhua; Wang, Jing

2018-02-01

Spectral computed tomography (CT) has been a promising technique in research and clinics because of its ability to produce improved energy resolution images with narrow energy bins. However, the narrow energy bin image is often affected by serious quantum noise because of the limited number of photons used in the corresponding energy bin. To address this problem, we present an iterative reconstruction method for spectral CT using nonlocal low-rank and sparse matrix decomposition (NLSMD), which exploits the self-similarity of patches that are collected in multi-energy images. Specifically, each set of patches can be decomposed into a low-rank component and a sparse component, and the low-rank component represents the stationary background over different energy bins, while the sparse component represents the rest of the different spectral features in individual energy bins. Subsequently, an effective alternating optimization algorithm was developed to minimize the associated objective function. To validate and evaluate the NLSMD method, qualitative and quantitative studies were conducted by using simulated and real spectral CT data. Experimental results show that the NLSMD method improves spectral CT images in terms of noise reduction, artifact suppression and resolution preservation.
Time integration algorithms for the two-dimensional Euler equations on unstructured meshes

NASA Technical Reports Server (NTRS)

Slack, David C.; Whitaker, D. L.; Walters, Robert W.

1994-01-01

Explicit and implicit time integration algorithms for the two-dimensional Euler equations on unstructured grids are presented. Both cell-centered and cell-vertex finite volume upwind schemes utilizing Roe's approximate Riemann solver are developed. For the cell-vertex scheme, a four-stage Runge-Kutta time integration, a fourstage Runge-Kutta time integration with implicit residual averaging, a point Jacobi method, a symmetric point Gauss-Seidel method and two methods utilizing preconditioned sparse matrix solvers are presented. For the cell-centered scheme, a Runge-Kutta scheme, an implicit tridiagonal relaxation scheme modeled after line Gauss-Seidel, a fully implicit lower-upper (LU) decomposition, and a hybrid scheme utilizing both Runge-Kutta and LU methods are presented. A reverse Cuthill-McKee renumbering scheme is employed for the direct solver to decrease CPU time by reducing the fill of the Jacobian matrix. A comparison of the various time integration schemes is made for both first-order and higher order accurate solutions using several mesh sizes, higher order accuracy is achieved by using multidimensional monotone linear reconstruction procedures. The results obtained for a transonic flow over a circular arc suggest that the preconditioned sparse matrix solvers perform better than the other methods as the number of elements in the mesh increases.
Recursive inverse factorization.

PubMed

Rubensson, Emanuel H; Bock, Nicolas; Holmström, Erik; Niklasson, Anders M N

2008-03-14

A recursive algorithm for the inverse factorization S(-1)=ZZ(*) of Hermitian positive definite matrices S is proposed. The inverse factorization is based on iterative refinement [A.M.N. Niklasson, Phys. Rev. B 70, 193102 (2004)] combined with a recursive decomposition of S. As the computational kernel is matrix-matrix multiplication, the algorithm can be parallelized and the computational effort increases linearly with system size for systems with sufficiently sparse matrices. Recent advances in network theory are used to find appropriate recursive decompositions. We show that optimization of the so-called network modularity results in an improved partitioning compared to other approaches. In particular, when the recursive inverse factorization is applied to overlap matrices of irregularly structured three-dimensional molecules.
Novel Fourier-based iterative reconstruction for sparse fan projection using alternating direction total variation minimization

NASA Astrophysics Data System (ADS)

Zhao, Jin; Han-Ming, Zhang; Bin, Yan; Lei, Li; Lin-Yuan, Wang; Ai-Long, Cai

2016-03-01

Sparse-view x-ray computed tomography (CT) imaging is an interesting topic in CT field and can efficiently decrease radiation dose. Compared with spatial reconstruction, a Fourier-based algorithm has advantages in reconstruction speed and memory usage. A novel Fourier-based iterative reconstruction technique that utilizes non-uniform fast Fourier transform (NUFFT) is presented in this work along with advanced total variation (TV) regularization for a fan sparse-view CT. The proposition of a selective matrix contributes to improve reconstruction quality. The new method employs the NUFFT and its adjoin to iterate back and forth between the Fourier and image space. The performance of the proposed algorithm is demonstrated through a series of digital simulations and experimental phantom studies. Results of the proposed algorithm are compared with those of existing TV-regularized techniques based on compressed sensing method, as well as basic algebraic reconstruction technique. Compared with the existing TV-regularized techniques, the proposed Fourier-based technique significantly improves convergence rate and reduces memory allocation, respectively. Projected supported by the National High Technology Research and Development Program of China (Grant No. 2012AA011603) and the National Natural Science Foundation of China (Grant No. 61372172).
Approximate method of variational Bayesian matrix factorization/completion with sparse prior

NASA Astrophysics Data System (ADS)

Kawasumi, Ryota; Takeda, Koujin

2018-05-01

We derive the analytical expression of a matrix factorization/completion solution by the variational Bayes method, under the assumption that the observed matrix is originally the product of low-rank, dense and sparse matrices with additive noise. We assume the prior of a sparse matrix is a Laplace distribution by taking matrix sparsity into consideration. Then we use several approximations for the derivation of a matrix factorization/completion solution. By our solution, we also numerically evaluate the performance of a sparse matrix reconstruction in matrix factorization, and completion of a missing matrix element in matrix completion.
The CSM testbed matrix processors internal logic and dataflow descriptions

NASA Technical Reports Server (NTRS)

Regelbrugge, Marc E.; Wright, Mary A.

1988-01-01

This report constitutes the final report for subtask 1 of Task 5 of NASA Contract NAS1-18444, Computational Structural Mechanics (CSM) Research. This report contains a detailed description of the coded workings of selected CSM Testbed matrix processors (i.e., TOPO, K, INV, SSOL) and of the arithmetic utility processor AUS. These processors and the current sparse matrix data structures are studied and documented. Items examined include: details of the data structures, interdependence of data structures, data-blocking logic in the data structures, processor data flow and architecture, and processor algorithmic logic flow.
Preconditioned conjugate gradient wave-front reconstructors for multiconjugate adaptive optics.

PubMed

Gilles, Luc; Ellerbroek, Brent L; Vogel, Curtis R

2003-09-10

Multiconjugate adaptive optics (MCAO) systems with 10(4)-10(5) degrees of freedom have been proposed for future giant telescopes. Using standard matrix methods to compute, optimize, and implement wavefront control algorithms for these systems is impractical, since the number of calculations required to compute and apply the reconstruction matrix scales respectively with the cube and the square of the number of adaptive optics degrees of freedom. We develop scalable open-loop iterative sparse matrix implementations of minimum variance wave-front reconstruction for telescope diameters up to 32 m with more than 10(4) actuators. The basic approach is the preconditioned conjugate gradient method with an efficient preconditioner, whose block structure is defined by the atmospheric turbulent layers very much like the layer-oriented MCAO algorithms of current interest. Two cost-effective preconditioners are investigated: a multigrid solver and a simpler block symmetric Gauss-Seidel (BSGS) sweep. Both options require off-line sparse Cholesky factorizations of the diagonal blocks of the matrix system. The cost to precompute these factors scales approximately as the three-halves power of the number of estimated phase grid points per atmospheric layer, and their average update rate is typically of the order of 10(-2) Hz, i.e., 4-5 orders of magnitude lower than the typical 10(3) Hz temporal sampling rate. All other computations scale almost linearly with the total number of estimated phase grid points. We present numerical simulation results to illustrate algorithm convergence. Convergence rates of both preconditioners are similar, regardless of measurement noise level, indicating that the layer-oriented BSGS sweep is as effective as the more elaborated multiresolution preconditioner.
Algorithms and software for solving finite element equations on serial and parallel architectures

NASA Technical Reports Server (NTRS)

Chu, Eleanor; George, Alan

1988-01-01

The primary objective was to compare the performance of state-of-the-art techniques for solving sparse systems with those that are currently available in the Computational Structural Mechanics (MSC) testbed. One of the first tasks was to become familiar with the structure of the testbed, and to install some or all of the SPARSPAK package in the testbed. A brief overview of the CSM Testbed software and its usage is presented. An overview of the sparse matrix research for the Testbed currently employed in the CSM Testbed is given. An interface which was designed and implemented as a research tool for installing and appraising new matrix processors in the CSM Testbed is described. The results of numerical experiments performed in solving a set of testbed demonstration problems using the processor SPK and other experimental processors are contained.
Quantum algorithm for linear systems of equations.

PubMed

Harrow, Aram W; Hassidim, Avinatan; Lloyd, Seth

2009-10-09

Solving linear systems of equations is a common problem that arises both on its own and as a subroutine in more complex problems: given a matrix A and a vector b(-->), find a vector x(-->) such that Ax(-->) = b(-->). We consider the case where one does not need to know the solution x(-->) itself, but rather an approximation of the expectation value of some operator associated with x(-->), e.g., x(-->)(dagger) Mx(-->) for some matrix M. In this case, when A is sparse, N x N and has condition number kappa, the fastest known classical algorithms can find x(-->) and estimate x(-->)(dagger) Mx(-->) in time scaling roughly as N square root(kappa). Here, we exhibit a quantum algorithm for estimating x(-->)(dagger) Mx(-->) whose runtime is a polynomial of log(N) and kappa. Indeed, for small values of kappa [i.e., poly log(N)], we prove (using some common complexity-theoretic assumptions) that any classical algorithm for this problem generically requires exponentially more time than our quantum algorithm.
Background recovery via motion-based robust principal component analysis with matrix factorization

NASA Astrophysics Data System (ADS)

Pan, Peng; Wang, Yongli; Zhou, Mingyuan; Sun, Zhipeng; He, Guoping

2018-03-01

Background recovery is a key technique in video analysis, but it still suffers from many challenges, such as camouflage, lighting changes, and diverse types of image noise. Robust principal component analysis (RPCA), which aims to recover a low-rank matrix and a sparse matrix, is a general framework for background recovery. The nuclear norm is widely used as a convex surrogate for the rank function in RPCA, which requires computing the singular value decomposition (SVD), a task that is increasingly costly as matrix sizes and ranks increase. However, matrix factorization greatly reduces the dimension of the matrix for which the SVD must be computed. Motion information has been shown to improve low-rank matrix recovery in RPCA, but this method still finds it difficult to handle original video data sets because of its batch-mode formulation and implementation. Hence, in this paper, we propose a motion-assisted RPCA model with matrix factorization (FM-RPCA) for background recovery. Moreover, an efficient linear alternating direction method of multipliers with a matrix factorization (FL-ADM) algorithm is designed for solving the proposed FM-RPCA model. Experimental results illustrate that the method provides stable results and is more efficient than the current state-of-the-art algorithms.
Sparse matrix beamforming and image reconstruction for 2-D HIFU monitoring using harmonic motion imaging for focused ultrasound (HMIFU) with in vitro validation.

PubMed

Hou, Gary Y; Provost, Jean; Grondin, Julien; Wang, Shutao; Marquet, Fabrice; Bunting, Ethan; Konofagou, Elisa E

2014-11-01

Harmonic motion imaging for focused ultrasound (HMIFU) utilizes an amplitude-modulated HIFU beam to induce a localized focal oscillatory motion simultaneously estimated. The objective of this study is to develop and show the feasibility of a novel fast beamforming algorithm for image reconstruction using GPU-based sparse-matrix operation with real-time feedback. In this study, the algorithm was implemented onto a fully integrated, clinically relevant HMIFU system. A single divergent transmit beam was used while fast beamforming was implemented using a GPU-based delay-and-sum method and a sparse-matrix operation. Axial HMI displacements were then estimated from the RF signals using a 1-D normalized cross-correlation method and streamed to a graphic user interface with frame rates up to 15 Hz, a 100-fold increase compared to conventional CPU-based processing. The real-time feedback rate does not require interrupting the HIFU treatment. Results in phantom experiments showed reproducible HMI images and monitoring of 22 in vitro HIFU treatments using the new 2-D system demonstrated reproducible displacement imaging, and monitoring of 22 in vitro HIFU treatments using the new 2-D system showed a consistent average focal displacement decrease of 46.7 ±14.6% during lesion formation. Complementary focal temperature monitoring also indicated an average rate of displacement increase and decrease with focal temperature at 0.84±1.15%/(°)C, and 2.03±0.93%/(°)C , respectively. These results reinforce the HMIFU capability of estimating and monitoring stiffness related changes in real time. Current ongoing studies include clinical translation of the presented system for monitoring of HIFU treatment for breast and pancreatic tumor applications.

Ionospheric-thermospheric UV tomography: 1. Image space reconstruction algorithms

NASA Astrophysics Data System (ADS)

Dymond, K. F.; Budzien, S. A.; Hei, M. A.

2017-03-01

We present and discuss two algorithms of the class known as Image Space Reconstruction Algorithms (ISRAs) that we are applying to the solution of large-scale ionospheric tomography problems. ISRAs have several desirable features that make them useful for ionospheric tomography. In addition to producing nonnegative solutions, ISRAs are amenable to sparse-matrix formulations and are fast, stable, and robust. We present the results of our studies of two types of ISRA: the Least Squares Positive Definite and the Richardson-Lucy algorithms. We compare their performance to the Multiplicative Algebraic Reconstruction and Conjugate Gradient Least Squares algorithms. We then discuss the use of regularization in these algorithms and present our new approach based on regularization to a partial differential equation.
Indexing amyloid peptide diffraction from serial femtosecond crystallography: New algorithms for sparse patterns

DOE PAGES

Brewster, Aaron S.; Sawaya, Michael R.; Rodriguez, Jose; ...

2015-01-23

Still diffraction patterns from peptide nanocrystals with small unit cells are challenging to index using conventional methods owing to the limited number of spots and the lack of crystal orientation information for individual images. New indexing algorithms have been developed as part of the Computational Crystallography Toolbox( cctbx) to overcome these challenges. Accurate unit-cell information derived from an aggregate data set from thousands of diffraction patterns can be used to determine a crystal orientation matrix for individual images with as few as five reflections. These algorithms are potentially applicable not only to amyloid peptides but also to any set ofmore » diffraction patterns with sparse properties, such as low-resolution virus structures or high-throughput screening of still images captured by raster-scanning at synchrotron sources. As a proof of concept for this technique, successful integration of X-ray free-electron laser (XFEL) data to 2.5 Å resolution for the amyloid segment GNNQQNY from the Sup35 yeast prion is presented.« less
Superresolution radar imaging based on fast inverse-free sparse Bayesian learning for multiple measurement vectors

NASA Astrophysics Data System (ADS)

He, Xingyu; Tong, Ningning; Hu, Xiaowei

2018-01-01

Compressive sensing has been successfully applied to inverse synthetic aperture radar (ISAR) imaging of moving targets. By exploiting the block sparse structure of the target image, sparse solution for multiple measurement vectors (MMV) can be applied in ISAR imaging and a substantial performance improvement can be achieved. As an effective sparse recovery method, sparse Bayesian learning (SBL) for MMV involves a matrix inverse at each iteration. Its associated computational complexity grows significantly with the problem size. To address this problem, we develop a fast inverse-free (IF) SBL method for MMV. A relaxed evidence lower bound (ELBO), which is computationally more amiable than the traditional ELBO used by SBL, is obtained by invoking fundamental property for smooth functions. A variational expectation-maximization scheme is then employed to maximize the relaxed ELBO, and a computationally efficient IF-MSBL algorithm is proposed. Numerical results based on simulated and real data show that the proposed method can reconstruct row sparse signal accurately and obtain clear superresolution ISAR images. Moreover, the running time and computational complexity are reduced to a great extent compared with traditional SBL methods.
Automatic segmentation of right ventricular ultrasound images using sparse matrix transform and a level set

NASA Astrophysics Data System (ADS)

Qin, Xulei; Cong, Zhibin; Fei, Baowei

2013-11-01

An automatic segmentation framework is proposed to segment the right ventricle (RV) in echocardiographic images. The method can automatically segment both epicardial and endocardial boundaries from a continuous echocardiography series by combining sparse matrix transform, a training model, and a localized region-based level set. First, the sparse matrix transform extracts main motion regions of the myocardium as eigen-images by analyzing the statistical information of the images. Second, an RV training model is registered to the eigen-images in order to locate the position of the RV. Third, the training model is adjusted and then serves as an optimized initialization for the segmentation of each image. Finally, based on the initializations, a localized, region-based level set algorithm is applied to segment both epicardial and endocardial boundaries in each echocardiograph. Three evaluation methods were used to validate the performance of the segmentation framework. The Dice coefficient measures the overall agreement between the manual and automatic segmentation. The absolute distance and the Hausdorff distance between the boundaries from manual and automatic segmentation were used to measure the accuracy of the segmentation. Ultrasound images of human subjects were used for validation. For the epicardial and endocardial boundaries, the Dice coefficients were 90.8 ± 1.7% and 87.3 ± 1.9%, the absolute distances were 2.0 ± 0.42 mm and 1.79 ± 0.45 mm, and the Hausdorff distances were 6.86 ± 1.71 mm and 7.02 ± 1.17 mm, respectively. The automatic segmentation method based on a sparse matrix transform and level set can provide a useful tool for quantitative cardiac imaging.
Technical note: Avoiding the direct inversion of the numerator relationship matrix for genotyped animals in single-step genomic best linear unbiased prediction solved with the preconditioned conjugate gradient.

PubMed

Masuda, Y; Misztal, I; Legarra, A; Tsuruta, S; Lourenco, D A L; Fragomeni, B O; Aguilar, I

2017-01-01

This paper evaluates an efficient implementation to multiply the inverse of a numerator relationship matrix for genotyped animals () by a vector (). The computation is required for solving mixed model equations in single-step genomic BLUP (ssGBLUP) with the preconditioned conjugate gradient (PCG). The inverse can be decomposed into sparse matrices that are blocks of the sparse inverse of a numerator relationship matrix () including genotyped animals and their ancestors. The elements of were rapidly calculated with the Henderson's rule and stored as sparse matrices in memory. Implementation of was by a series of sparse matrix-vector multiplications. Diagonal elements of , which were required as preconditioners in PCG, were approximated with a Monte Carlo method using 1,000 samples. The efficient implementation of was compared with explicit inversion of with 3 data sets including about 15,000, 81,000, and 570,000 genotyped animals selected from populations with 213,000, 8.2 million, and 10.7 million pedigree animals, respectively. The explicit inversion required 1.8 GB, 49 GB, and 2,415 GB (estimated) of memory, respectively, and 42 s, 56 min, and 13.5 d (estimated), respectively, for the computations. The efficient implementation required <1 MB, 2.9 GB, and 2.3 GB of memory, respectively, and <1 sec, 3 min, and 5 min, respectively, for setting up. Only <1 sec was required for the multiplication in each PCG iteration for any data sets. When the equations in ssGBLUP are solved with the PCG algorithm, is no longer a limiting factor in the computations.
Compressive sensing using optimized sensing matrix for face verification

NASA Astrophysics Data System (ADS)

Oey, Endra; Jeffry; Wongso, Kelvin; Tommy

2017-12-01

Biometric appears as one of the solutions which is capable in solving problems that occurred in the usage of password in terms of data access, for example there is possibility in forgetting password and hard to recall various different passwords. With biometrics, physical characteristics of a person can be captured and used in the identification process. In this research, facial biometric is used in the verification process to determine whether the user has the authority to access the data or not. Facial biometric is chosen as its low cost implementation and generate quite accurate result for user identification. Face verification system which is adopted in this research is Compressive Sensing (CS) technique, in which aims to reduce dimension size as well as encrypt data in form of facial test image where the image is represented in sparse signals. Encrypted data can be reconstructed using Sparse Coding algorithm. Two types of Sparse Coding namely Orthogonal Matching Pursuit (OMP) and Iteratively Reweighted Least Squares -ℓp (IRLS-ℓp) will be used for comparison face verification system research. Reconstruction results of sparse signals are then used to find Euclidean norm with the sparse signal of user that has been previously saved in system to determine the validity of the facial test image. Results of system accuracy obtained in this research are 99% in IRLS with time response of face verification for 4.917 seconds and 96.33% in OMP with time response of face verification for 0.4046 seconds with non-optimized sensing matrix, while 99% in IRLS with time response of face verification for 13.4791 seconds and 98.33% for OMP with time response of face verification for 3.1571 seconds with optimized sensing matrix.
Symmetrized density matrix renormalization group algorithm for low-lying excited states of conjugated carbon systems: Application to 1,12-benzoperylene and polychrysene

NASA Astrophysics Data System (ADS)

Prodhan, Suryoday; Ramasesha, S.

2018-05-01

The symmetry adapted density matrix renormalization group (SDMRG) technique has been an efficient method for studying low-lying eigenstates in one- and quasi-one-dimensional electronic systems. However, the SDMRG method had bottlenecks involving the construction of linearly independent symmetry adapted basis states as the symmetry matrices in the DMRG basis were not sparse. We have developed a modified algorithm to overcome this bottleneck. The new method incorporates end-to-end interchange symmetry (C2) , electron-hole symmetry (J ) , and parity or spin-flip symmetry (P ) in these calculations. The one-to-one correspondence between direct-product basis states in the DMRG Hilbert space for these symmetry operations renders the symmetry matrices in the new basis with maximum sparseness, just one nonzero matrix element per row. Using methods similar to those employed in the exact diagonalization technique for Pariser-Parr-Pople (PPP) models, developed in the 1980s, it is possible to construct orthogonal SDMRG basis states while bypassing the slow step of the Gram-Schmidt orthonormalization procedure. The method together with the PPP model which incorporates long-range electronic correlations is employed to study the correlated excited-state spectra of 1,12-benzoperylene and a narrow mixed graphene nanoribbon with a chrysene molecule as the building unit, comprising both zigzag and cove-edge structures.
Sparse Covariance Matrix Estimation With Eigenvalue Constraints

PubMed Central

LIU, Han; WANG, Lie; ZHAO, Tuo

2014-01-01

We propose a new approach for estimating high-dimensional, positive-definite covariance matrices. Our method extends the generalized thresholding operator by adding an explicit eigenvalue constraint. The estimated covariance matrix simultaneously achieves sparsity and positive definiteness. The estimator is rate optimal in the minimax sense and we develop an efficient iterative soft-thresholding and projection algorithm based on the alternating direction method of multipliers. Empirically, we conduct thorough numerical experiments on simulated datasets as well as real data examples to illustrate the usefulness of our method. Supplementary materials for the article are available online. PMID:25620866
Dense and Sparse Matrix Operations on the Cell Processor

DOE Office of Scientific and Technical Information (OSTI.GOV)

Williams, Samuel W.; Shalf, John; Oliker, Leonid

2005-05-01

The slowing pace of commodity microprocessor performance improvements combined with ever-increasing chip power demands has become of utmost concern to computational scientists. Therefore, the high performance computing community is examining alternative architectures that address the limitations of modern superscalar designs. In this work, we examine STI's forthcoming Cell processor: a novel, low-power architecture that combines a PowerPC core with eight independent SIMD processing units coupled with a software-controlled memory to offer high FLOP/s/Watt. Since neither Cell hardware nor cycle-accurate simulators are currently publicly available, we develop an analytic framework to predict Cell performance on dense and sparse matrix operations, usingmore » a variety of algorithmic approaches. Results demonstrate Cell's potential to deliver more than an order of magnitude better GFLOP/s per watt performance, when compared with the Intel Itanium2 and Cray X1 processors.« less
Indexing amyloid peptide diffraction from serial femtosecond crystallography: new algorithms for sparse patterns

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brewster, Aaron S.; Sawaya, Michael R.; University of California, Los Angeles, CA 90095-1570

2015-02-01

Special methods are required to interpret sparse diffraction patterns collected from peptide crystals at X-ray free-electron lasers. Bragg spots can be indexed from composite-image powder rings, with crystal orientations then deduced from a very limited number of spot positions. Still diffraction patterns from peptide nanocrystals with small unit cells are challenging to index using conventional methods owing to the limited number of spots and the lack of crystal orientation information for individual images. New indexing algorithms have been developed as part of the Computational Crystallography Toolbox (cctbx) to overcome these challenges. Accurate unit-cell information derived from an aggregate data setmore » from thousands of diffraction patterns can be used to determine a crystal orientation matrix for individual images with as few as five reflections. These algorithms are potentially applicable not only to amyloid peptides but also to any set of diffraction patterns with sparse properties, such as low-resolution virus structures or high-throughput screening of still images captured by raster-scanning at synchrotron sources. As a proof of concept for this technique, successful integration of X-ray free-electron laser (XFEL) data to 2.5 Å resolution for the amyloid segment GNNQQNY from the Sup35 yeast prion is presented.« less
Energy scaling advantages of resistive memory crossbar based computation and its application to sparse coding

DOE Office of Scientific and Technical Information (OSTI.GOV)

Agarwal, Sapan; Quach, Tu -Thach; Parekh, Ojas

In this study, the exponential increase in data over the last decade presents a significant challenge to analytics efforts that seek to process and interpret such data for various applications. Neural-inspired computing approaches are being developed in order to leverage the computational properties of the analog, low-power data processing observed in biological systems. Analog resistive memory crossbars can perform a parallel read or a vector-matrix multiplication as well as a parallel write or a rank-1 update with high computational efficiency. For an N × N crossbar, these two kernels can be O(N) more energy efficient than a conventional digital memory-basedmore » architecture. If the read operation is noise limited, the energy to read a column can be independent of the crossbar size (O(1)). These two kernels form the basis of many neuromorphic algorithms such as image, text, and speech recognition. For instance, these kernels can be applied to a neural sparse coding algorithm to give an O(N) reduction in energy for the entire algorithm when run with finite precision. Sparse coding is a rich problem with a host of applications including computer vision, object tracking, and more generally unsupervised learning.« less
Energy scaling advantages of resistive memory crossbar based computation and its application to sparse coding

DOE PAGES

Agarwal, Sapan; Quach, Tu -Thach; Parekh, Ojas; ...

2016-01-06

In this study, the exponential increase in data over the last decade presents a significant challenge to analytics efforts that seek to process and interpret such data for various applications. Neural-inspired computing approaches are being developed in order to leverage the computational properties of the analog, low-power data processing observed in biological systems. Analog resistive memory crossbars can perform a parallel read or a vector-matrix multiplication as well as a parallel write or a rank-1 update with high computational efficiency. For an N × N crossbar, these two kernels can be O(N) more energy efficient than a conventional digital memory-basedmore » architecture. If the read operation is noise limited, the energy to read a column can be independent of the crossbar size (O(1)). These two kernels form the basis of many neuromorphic algorithms such as image, text, and speech recognition. For instance, these kernels can be applied to a neural sparse coding algorithm to give an O(N) reduction in energy for the entire algorithm when run with finite precision. Sparse coding is a rich problem with a host of applications including computer vision, object tracking, and more generally unsupervised learning.« less
Two-stage sparse coding of region covariance via Log-Euclidean kernels to detect saliency.

PubMed

Zhang, Ying-Ying; Yang, Cai; Zhang, Ping

2017-05-01

In this paper, we present a novel bottom-up saliency detection algorithm from the perspective of covariance matrices on a Riemannian manifold. Each superpixel is described by a region covariance matrix on Riemannian Manifolds. We carry out a two-stage sparse coding scheme via Log-Euclidean kernels to extract salient objects efficiently. In the first stage, given background dictionary on image borders, sparse coding of each region covariance via Log-Euclidean kernels is performed. The reconstruction error on the background dictionary is regarded as the initial saliency of each superpixel. In the second stage, an improvement of the initial result is achieved by calculating reconstruction errors of the superpixels on foreground dictionary, which is extracted from the first stage saliency map. The sparse coding in the second stage is similar to the first stage, but is able to effectively highlight the salient objects uniformly from the background. Finally, three post-processing methods-highlight-inhibition function, context-based saliency weighting, and the graph cut-are adopted to further refine the saliency map. Experiments on four public benchmark datasets show that the proposed algorithm outperforms the state-of-the-art methods in terms of precision, recall and mean absolute error, and demonstrate the robustness and efficiency of the proposed method. Copyright © 2017 Elsevier Ltd. All rights reserved.
Multiple Kernel Sparse Representation based Orthogonal Discriminative Projection and Its Cost-Sensitive Extension.

PubMed

Zhang, Guoqing; Sun, Huaijiang; Xia, Guiyu; Sun, Quansen

2016-07-07

Sparse representation based classification (SRC) has been developed and shown great potential for real-world application. Based on SRC, Yang et al. [10] devised a SRC steered discriminative projection (SRC-DP) method. However, as a linear algorithm, SRC-DP cannot handle the data with highly nonlinear distribution. Kernel sparse representation-based classifier (KSRC) is a non-linear extension of SRC and can remedy the drawback of SRC. KSRC requires the use of a predetermined kernel function and selection of the kernel function and its parameters is difficult. Recently, multiple kernel learning for SRC (MKL-SRC) [22] has been proposed to learn a kernel from a set of base kernels. However, MKL-SRC only considers the within-class reconstruction residual while ignoring the between-class relationship, when learning the kernel weights. In this paper, we propose a novel multiple kernel sparse representation-based classifier (MKSRC), and then we use it as a criterion to design a multiple kernel sparse representation based orthogonal discriminative projection method (MK-SR-ODP). The proposed algorithm aims at learning a projection matrix and a corresponding kernel from the given base kernels such that in the low dimension subspace the between-class reconstruction residual is maximized and the within-class reconstruction residual is minimized. Furthermore, to achieve a minimum overall loss by performing recognition in the learned low-dimensional subspace, we introduce cost information into the dimensionality reduction method. The solutions for the proposed method can be efficiently found based on trace ratio optimization method [33]. Extensive experimental results demonstrate the superiority of the proposed algorithm when compared with the state-of-the-art methods.
HO2 rovibrational eigenvalue studies for nonzero angular momentum

NASA Astrophysics Data System (ADS)

Wu, Xudong T.; Hayes, Edward F.

1997-08-01

An efficient parallel algorithm is reported for determining all bound rovibrational energy levels for the HO2 molecule for nonzero angular momentum values, J=1, 2, and 3. Performance tests on the CRAY T3D indicate that the algorithm scales almost linearly when up to 128 processors are used. Sustained performance levels of up to 3.8 Gflops have been achieved using 128 processors for J=3. The algorithm uses a direct product discrete variable representation (DVR) basis and the implicitly restarted Lanczos method (IRLM) of Sorensen to compute the eigenvalues of the polyatomic Hamiltonian. Since the IRLM is an iterative method, it does not require storage of the full Hamiltonian matrix—it only requires the multiplication of the Hamiltonian matrix by a vector. When the IRLM is combined with a formulation such as DVR, which produces a very sparse matrix, both memory and computation times can be reduced dramatically. This algorithm has the potential to achieve even higher performance levels for larger values of the total angular momentum.
ILUBCG2-11: Solution of 11-banded nonsymmetric linear equation systems by a preconditioned biconjugate gradient routine

NASA Astrophysics Data System (ADS)

Chen, Y.-M.; Koniges, A. E.; Anderson, D. V.

1989-10-01

The biconjugate gradient method (BCG) provides an attractive alternative to the usual conjugate gradient algorithms for the solution of sparse systems of linear equations with nonsymmetric and indefinite matrix operators. A preconditioned algorithm is given, whose form resembles the incomplete L-U conjugate gradient scheme (ILUCG2) previously presented. Although the BCG scheme requires the storage of two additional vectors, it converges in a significantly lesser number of iterations (often half), while the number of calculations per iteration remains essentially the same.
Sparse Matrices in MATLAB: Design and Implementation

NASA Technical Reports Server (NTRS)

Gilbert, John R.; Moler, Cleve; Schreiber, Robert

1992-01-01

The matrix computation language and environment MATLAB is extended to include sparse matrix storage and operations. The only change to the outward appearance of the MATLAB language is a pair of commands to create full or sparse matrices. Nearly all the operations of MATLAB now apply equally to full or sparse matrices, without any explicit action by the user. The sparse data structure represents a matrix in space proportional to the number of nonzero entries, and most of the operations compute sparse results in time proportional to the number of arithmetic operations on nonzeros.
Efficient spares matrix multiplication scheme for the CYBER 203

NASA Technical Reports Server (NTRS)

Lambiotte, J. J., Jr.

1984-01-01

This work has been directed toward the development of an efficient algorithm for performing this computation on the CYBER-203. The desire to provide software which gives the user the choice between the often conflicting goals of minimizing central processing (CPU) time or storage requirements has led to a diagonal-based algorithm in which one of three types of storage is selected for each diagonal. For each storage type, an initialization sub-routine estimates the CPU and storage requirements based upon results from previously performed numerical experimentation. These requirements are adjusted by weights provided by the user which reflect the relative importance the user places on the resources. The three storage types employed were chosen to be efficient on the CYBER-203 for diagonals which are sparse, moderately sparse, or dense; however, for many densities, no diagonal type is most efficient with respect to both resource requirements. The user-supplied weights dictate the choice.
Variable is better than invariable: sparse VSS-NLMS algorithms with application to adaptive MIMO channel estimation.

PubMed

Gui, Guan; Chen, Zhang-xin; Xu, Li; Wan, Qun; Huang, Jiyan; Adachi, Fumiyuki

2014-01-01

Channel estimation problem is one of the key technical issues in sparse frequency-selective fading multiple-input multiple-output (MIMO) communication systems using orthogonal frequency division multiplexing (OFDM) scheme. To estimate sparse MIMO channels, sparse invariable step-size normalized least mean square (ISS-NLMS) algorithms were applied to adaptive sparse channel estimation (ACSE). It is well known that step-size is a critical parameter which controls three aspects: algorithm stability, estimation performance, and computational cost. However, traditional methods are vulnerable to cause estimation performance loss because ISS cannot balance the three aspects simultaneously. In this paper, we propose two stable sparse variable step-size NLMS (VSS-NLMS) algorithms to improve the accuracy of MIMO channel estimators. First, ASCE is formulated in MIMO-OFDM systems. Second, different sparse penalties are introduced to VSS-NLMS algorithm for ASCE. In addition, difference between sparse ISS-NLMS algorithms and sparse VSS-NLMS ones is explained and their lower bounds are also derived. At last, to verify the effectiveness of the proposed algorithms for ASCE, several selected simulation results are shown to prove that the proposed sparse VSS-NLMS algorithms can achieve better estimation performance than the conventional methods via mean square error (MSE) and bit error rate (BER) metrics.
Variable Is Better Than Invariable: Sparse VSS-NLMS Algorithms with Application to Adaptive MIMO Channel Estimation

PubMed Central

Gui, Guan; Chen, Zhang-xin; Xu, Li; Wan, Qun; Huang, Jiyan; Adachi, Fumiyuki

2014-01-01

Channel estimation problem is one of the key technical issues in sparse frequency-selective fading multiple-input multiple-output (MIMO) communication systems using orthogonal frequency division multiplexing (OFDM) scheme. To estimate sparse MIMO channels, sparse invariable step-size normalized least mean square (ISS-NLMS) algorithms were applied to adaptive sparse channel estimation (ACSE). It is well known that step-size is a critical parameter which controls three aspects: algorithm stability, estimation performance, and computational cost. However, traditional methods are vulnerable to cause estimation performance loss because ISS cannot balance the three aspects simultaneously. In this paper, we propose two stable sparse variable step-size NLMS (VSS-NLMS) algorithms to improve the accuracy of MIMO channel estimators. First, ASCE is formulated in MIMO-OFDM systems. Second, different sparse penalties are introduced to VSS-NLMS algorithm for ASCE. In addition, difference between sparse ISS-NLMS algorithms and sparse VSS-NLMS ones is explained and their lower bounds are also derived. At last, to verify the effectiveness of the proposed algorithms for ASCE, several selected simulation results are shown to prove that the proposed sparse VSS-NLMS algorithms can achieve better estimation performance than the conventional methods via mean square error (MSE) and bit error rate (BER) metrics. PMID:25089286

Exponentially more precise quantum simulation of fermions in the configuration interaction representation

NASA Astrophysics Data System (ADS)

Babbush, Ryan; Berry, Dominic W.; Sanders, Yuval R.; Kivlichan, Ian D.; Scherer, Artur; Wei, Annie Y.; Love, Peter J.; Aspuru-Guzik, Alán

2018-01-01

We present a quantum algorithm for the simulation of molecular systems that is asymptotically more efficient than all previous algorithms in the literature in terms of the main problem parameters. As in Babbush et al (2016 New Journal of Physics 18, 033032), we employ a recently developed technique for simulating Hamiltonian evolution using a truncated Taylor series to obtain logarithmic scaling with the inverse of the desired precision. The algorithm of this paper involves simulation under an oracle for the sparse, first-quantized representation of the molecular Hamiltonian known as the configuration interaction (CI) matrix. We construct and query the CI matrix oracle to allow for on-the-fly computation of molecular integrals in a way that is exponentially more efficient than classical numerical methods. Whereas second-quantized representations of the wavefunction require \\widetilde{{ O }}(N) qubits, where N is the number of single-particle spin-orbitals, the CI matrix representation requires \\widetilde{{ O }}(η ) qubits, where η \\ll N is the number of electrons in the molecule of interest. We show that the gate count of our algorithm scales at most as \\widetilde{{ O }}({η }2{N}3t).
Achieving Passive Localization with Traffic Light Schedules in Urban Road Sensor Networks

PubMed Central

Niu, Qiang; Yang, Xu; Gao, Shouwan; Chen, Pengpeng; Chan, Shibing

2016-01-01

Localization is crucial for the monitoring applications of cities, such as road monitoring, environment surveillance, vehicle tracking, etc. In urban road sensor networks, sensors are often sparely deployed due to the hardware cost. Under this sparse deployment, sensors cannot communicate with each other via ranging hardware or one-hop connectivity, rendering the existing localization solutions ineffective. To address this issue, this paper proposes a novel Traffic Lights Schedule-based localization algorithm (TLS), which is built on the fact that vehicles move through the intersection with a known traffic light schedule. We can first obtain the law by binary vehicle detection time stamps and describe the law as a matrix, called a detection matrix. At the same time, we can also use the known traffic light information to construct the matrices, which can be formed as a collection called a known matrix collection. The detection matrix is then matched in the known matrix collection for identifying where sensors are located on urban roads. We evaluate our algorithm by extensive simulation. The results show that the localization accuracy of intersection sensors can reach more than 90%. In addition, we compare it with a state-of-the-art algorithm and prove that it has a wider operational region. PMID:27735871
A High Performance Block Eigensolver for Nuclear Configuration Interaction Calculations

DOE PAGES

Aktulga, Hasan Metin; Afibuzzaman, Md.; Williams, Samuel; ...

2017-06-01

As on-node parallelism increases and the performance gap between the processor and the memory system widens, achieving high performance in large-scale scientific applications requires an architecture-aware design of algorithms and solvers. We focus on the eigenvalue problem arising in nuclear Configuration Interaction (CI) calculations, where a few extreme eigenpairs of a sparse symmetric matrix are needed. Here, we consider a block iterative eigensolver whose main computational kernels are the multiplication of a sparse matrix with multiple vectors (SpMM), and tall-skinny matrix operations. We then present techniques to significantly improve the SpMM and the transpose operation SpMM T by using themore » compressed sparse blocks (CSB) format. We achieve 3-4× speedup on the requisite operations over good implementations with the commonly used compressed sparse row (CSR) format. We develop a performance model that allows us to correctly estimate the performance of our SpMM kernel implementations, and we identify cache bandwidth as a potential performance bottleneck beyond DRAM. We also analyze and optimize the performance of LOBPCG kernels (inner product and linear combinations on multiple vectors) and show up to 15× speedup over using high performance BLAS libraries for these operations. The resulting high performance LOBPCG solver achieves 1.4× to 1.8× speedup over the existing Lanczos solver on a series of CI computations on high-end multicore architectures (Intel Xeons). We also analyze the performance of our techniques on an Intel Xeon Phi Knights Corner (KNC) processor.« less
A High Performance Block Eigensolver for Nuclear Configuration Interaction Calculations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Aktulga, Hasan Metin; Afibuzzaman, Md.; Williams, Samuel

As on-node parallelism increases and the performance gap between the processor and the memory system widens, achieving high performance in large-scale scientific applications requires an architecture-aware design of algorithms and solvers. We focus on the eigenvalue problem arising in nuclear Configuration Interaction (CI) calculations, where a few extreme eigenpairs of a sparse symmetric matrix are needed. Here, we consider a block iterative eigensolver whose main computational kernels are the multiplication of a sparse matrix with multiple vectors (SpMM), and tall-skinny matrix operations. We then present techniques to significantly improve the SpMM and the transpose operation SpMM T by using themore » compressed sparse blocks (CSB) format. We achieve 3-4× speedup on the requisite operations over good implementations with the commonly used compressed sparse row (CSR) format. We develop a performance model that allows us to correctly estimate the performance of our SpMM kernel implementations, and we identify cache bandwidth as a potential performance bottleneck beyond DRAM. We also analyze and optimize the performance of LOBPCG kernels (inner product and linear combinations on multiple vectors) and show up to 15× speedup over using high performance BLAS libraries for these operations. The resulting high performance LOBPCG solver achieves 1.4× to 1.8× speedup over the existing Lanczos solver on a series of CI computations on high-end multicore architectures (Intel Xeons). We also analyze the performance of our techniques on an Intel Xeon Phi Knights Corner (KNC) processor.« less
Recursive Factorization of the Inverse Overlap Matrix in Linear-Scaling Quantum Molecular Dynamics Simulations.

PubMed

Negre, Christian F A; Mniszewski, Susan M; Cawkwell, Marc J; Bock, Nicolas; Wall, Michael E; Niklasson, Anders M N

2016-07-12

We present a reduced complexity algorithm to compute the inverse overlap factors required to solve the generalized eigenvalue problem in a quantum-based molecular dynamics (MD) simulation. Our method is based on the recursive, iterative refinement of an initial guess of Z (inverse square root of the overlap matrix S). The initial guess of Z is obtained beforehand by using either an approximate divide-and-conquer technique or dynamical methods, propagated within an extended Lagrangian dynamics from previous MD time steps. With this formulation, we achieve long-term stability and energy conservation even under the incomplete, approximate, iterative refinement of Z. Linear-scaling performance is obtained using numerically thresholded sparse matrix algebra based on the ELLPACK-R sparse matrix data format, which also enables efficient shared-memory parallelization. As we show in this article using self-consistent density-functional-based tight-binding MD, our approach is faster than conventional methods based on the diagonalization of overlap matrix S for systems as small as a few hundred atoms, substantially accelerating quantum-based simulations even for molecular structures of intermediate size. For a 4158-atom water-solvated polyalanine system, we find an average speedup factor of 122 for the computation of Z in each MD step.
Recursive Factorization of the Inverse Overlap Matrix in Linear Scaling Quantum Molecular Dynamics Simulations

DOE PAGES

Negre, Christian F. A; Mniszewski, Susan M.; Cawkwell, Marc Jon; ...

2016-06-06

We present a reduced complexity algorithm to compute the inverse overlap factors required to solve the generalized eigenvalue problem in a quantum-based molecular dynamics (MD) simulation. Our method is based on the recursive iterative re nement of an initial guess Z of the inverse overlap matrix S. The initial guess of Z is obtained beforehand either by using an approximate divide and conquer technique or dynamically, propagated within an extended Lagrangian dynamics from previous MD time steps. With this formulation, we achieve long-term stability and energy conservation even under incomplete approximate iterative re nement of Z. Linear scaling performance ismore » obtained using numerically thresholded sparse matrix algebra based on the ELLPACK-R sparse matrix data format, which also enables e cient shared memory parallelization. As we show in this article using selfconsistent density functional based tight-binding MD, our approach is faster than conventional methods based on the direct diagonalization of the overlap matrix S for systems as small as a few hundred atoms, substantially accelerating quantum-based simulations even for molecular structures of intermediate size. For a 4,158 atom water-solvated polyalanine system we nd an average speedup factor of 122 for the computation of Z in each MD step.« less
Detecting double compressed MPEG videos with the same quantization matrix and synchronized group of pictures structure

NASA Astrophysics Data System (ADS)

Aghamaleki, Javad Abbasi; Behrad, Alireza

2018-01-01

Double compression detection is a crucial stage in digital image and video forensics. However, the detection of double compressed videos is challenging when the video forger uses the same quantization matrix and synchronized group of pictures (GOP) structure during the recompression history to conceal tampering effects. A passive approach is proposed for detecting double compressed MPEG videos with the same quantization matrix and synchronized GOP structure. To devise the proposed algorithm, the effects of recompression on P frames are mathematically studied. Then, based on the obtained guidelines, a feature vector is proposed to detect double compressed frames on the GOP level. Subsequently, sparse representations of the feature vectors are used for dimensionality reduction and enrich the traces of recompression. Finally, a support vector machine classifier is employed to detect and localize double compression in temporal domain. The experimental results show that the proposed algorithm achieves the accuracy of more than 95%. In addition, the comparisons of the results of the proposed method with those of other methods reveal the efficiency of the proposed algorithm.
Enforced Sparse Non-Negative Matrix Factorization

DTIC Science & Technology

2016-01-23

documents to find interesting pieces of information. With limited resources, analysts often employ automated text - mining tools that highlight common...represented as an undirected bipartite graph. It has become a common method for generating topic models of text data because it is known to produce good results...model and the convergence rate of the underlying algorithm. I. Introduction A common analyst challenge is searching through large quantities of text
Weighted low-rank sparse model via nuclear norm minimization for bearing fault detection

NASA Astrophysics Data System (ADS)

Du, Zhaohui; Chen, Xuefeng; Zhang, Han; Yang, Boyuan; Zhai, Zhi; Yan, Ruqiang

2017-07-01

It is a fundamental task in the machine fault diagnosis community to detect impulsive signatures generated by the localized faults of bearings. The main goal of this paper is to exploit the low-rank physical structure of periodic impulsive features and further establish a weighted low-rank sparse model for bearing fault detection. The proposed model mainly consists of three basic components: an adaptive partition window, a nuclear norm regularization and a weighted sequence. Firstly, due to the periodic repetition mechanism of impulsive feature, an adaptive partition window could be designed to transform the impulsive feature into a data matrix. The highlight of partition window is to accumulate all local feature information and align them. Then, all columns of the data matrix share similar waveforms and a core physical phenomenon arises, i.e., these singular values of the data matrix demonstrates a sparse distribution pattern. Therefore, a nuclear norm regularization is enforced to capture that sparse prior. However, the nuclear norm regularization treats all singular values equally and thus ignores one basic fact that larger singular values have more information volume of impulsive features and should be preserved as much as possible. Therefore, a weighted sequence with adaptively tuning weights inversely proportional to singular amplitude is adopted to guarantee the distribution consistence of large singular values. On the other hand, the proposed model is difficult to solve due to its non-convexity and thus a new algorithm is developed to search one satisfying stationary solution through alternatively implementing one proximal operator operation and least-square fitting. Moreover, the sensitivity analysis and selection principles of algorithmic parameters are comprehensively investigated through a set of numerical experiments, which shows that the proposed method is robust and only has a few adjustable parameters. Lastly, the proposed model is applied to the wind turbine (WT) bearing fault detection and its effectiveness is sufficiently verified. Compared with the current popular bearing fault diagnosis techniques, wavelet analysis and spectral kurtosis, our model achieves a higher diagnostic accuracy.
Optimization Algorithm for Kalman Filter Exploiting the Numerical Characteristics of SINS/GPS Integrated Navigation Systems.

PubMed

Hu, Shaoxing; Xu, Shike; Wang, Duhu; Zhang, Aiwu

2015-11-11

Aiming at addressing the problem of high computational cost of the traditional Kalman filter in SINS/GPS, a practical optimization algorithm with offline-derivation and parallel processing methods based on the numerical characteristics of the system is presented in this paper. The algorithm exploits the sparseness and/or symmetry of matrices to simplify the computational procedure. Thus plenty of invalid operations can be avoided by offline derivation using a block matrix technique. For enhanced efficiency, a new parallel computational mechanism is established by subdividing and restructuring calculation processes after analyzing the extracted "useful" data. As a result, the algorithm saves about 90% of the CPU processing time and 66% of the memory usage needed in a classical Kalman filter. Meanwhile, the method as a numerical approach needs no precise-loss transformation/approximation of system modules and the accuracy suffers little in comparison with the filter before computational optimization. Furthermore, since no complicated matrix theories are needed, the algorithm can be easily transplanted into other modified filters as a secondary optimization method to achieve further efficiency.
Extending the eigCG algorithm to nonsymmetric Lanczos for linear systems with multiple right-hand sides

DOE Office of Scientific and Technical Information (OSTI.GOV)

Abdel-Rehim, A M; Stathopoulos, Andreas; Orginos, Kostas

2014-08-01

The technique that was used to build the EigCG algorithm for sparse symmetric linear systems is extended to the nonsymmetric case using the BiCG algorithm. We show that, similarly to the symmetric case, we can build an algorithm that is capable of computing a few smallest magnitude eigenvalues and their corresponding left and right eigenvectors of a nonsymmetric matrix using only a small window of the BiCG residuals while simultaneously solving a linear system with that matrix. For a system with multiple right-hand sides, we give an algorithm that computes incrementally more eigenvalues while solving the first few systems andmore » then uses the computed eigenvectors to deflate BiCGStab for the remaining systems. Our experiments on various test problems, including Lattice QCD, show the remarkable ability of EigBiCG to compute spectral approximations with accuracy comparable to that of the unrestarted, nonsymmetric Lanczos. Furthermore, our incremental EigBiCG followed by appropriately restarted and deflated BiCGStab provides a competitive method for systems with multiple right-hand sides.« less
Performance analysis of distributed symmetric sparse matrix vector multiplication algorithm for multi-core architectures

DOE PAGES

Oryspayev, Dossay; Aktulga, Hasan Metin; Sosonkina, Masha; ...

2015-07-14

In this article, sparse matrix vector multiply (SpMVM) is an important kernel that frequently arises in high performance computing applications. Due to its low arithmetic intensity, several approaches have been proposed in literature to improve its scalability and efficiency in large scale computations. In this paper, our target systems are high end multi-core architectures and we use messaging passing interface + open multiprocessing hybrid programming model for parallelism. We analyze the performance of recently proposed implementation of the distributed symmetric SpMVM, originally developed for large sparse symmetric matrices arising in ab initio nuclear structure calculations. We also study important featuresmore » of this implementation and compare with previously reported implementations that do not exploit underlying symmetry. Our SpMVM implementations leverage the hybrid paradigm to efficiently overlap expensive communications with computations. Our main comparison criterion is the "CPU core hours" metric, which is the main measure of resource usage on supercomputers. We analyze the effects of topology-aware mapping heuristic using simplified network load model. Furthermore, we have tested the different SpMVM implementations on two large clusters with 3D Torus and Dragonfly topology. Our results show that the distributed SpMVM implementation that exploits matrix symmetry and hides communication yields the best value for the "CPU core hours" metric and significantly reduces data movement overheads.« less
Fabric defect detection based on visual saliency using deep feature and low-rank recovery

NASA Astrophysics Data System (ADS)

Liu, Zhoufeng; Wang, Baorui; Li, Chunlei; Li, Bicao; Dong, Yan

2018-04-01

Fabric defect detection plays an important role in improving the quality of fabric product. In this paper, a novel fabric defect detection method based on visual saliency using deep feature and low-rank recovery was proposed. First, unsupervised training is carried out by the initial network parameters based on MNIST large datasets. The supervised fine-tuning of fabric image library based on Convolutional Neural Networks (CNNs) is implemented, and then more accurate deep neural network model is generated. Second, the fabric images are uniformly divided into the image block with the same size, then we extract their multi-layer deep features using the trained deep network. Thereafter, all the extracted features are concentrated into a feature matrix. Third, low-rank matrix recovery is adopted to divide the feature matrix into the low-rank matrix which indicates the background and the sparse matrix which indicates the salient defect. In the end, the iterative optimal threshold segmentation algorithm is utilized to segment the saliency maps generated by the sparse matrix to locate the fabric defect area. Experimental results demonstrate that the feature extracted by CNN is more suitable for characterizing the fabric texture than the traditional LBP, HOG and other hand-crafted features extraction method, and the proposed method can accurately detect the defect regions of various fabric defects, even for the image with complex texture.
SPARSKIT: A basic tool kit for sparse matrix computations

NASA Technical Reports Server (NTRS)

Saad, Youcef

1990-01-01

Presented here are the main features of a tool package for manipulating and working with sparse matrices. One of the goals of the package is to provide basic tools to facilitate the exchange of software and data between researchers in sparse matrix computations. The starting point is the Harwell/Boeing collection of matrices for which the authors provide a number of tools. Among other things, the package provides programs for converting data structures, printing simple statistics on a matrix, plotting a matrix profile, and performing linear algebra operations with sparse matrices.
Convergence Speed of a Dynamical System for Sparse Recovery

NASA Astrophysics Data System (ADS)

Balavoine, Aurele; Rozell, Christopher J.; Romberg, Justin

2013-09-01

This paper studies the convergence rate of a continuous-time dynamical system for L1-minimization, known as the Locally Competitive Algorithm (LCA). Solving L1-minimization} problems efficiently and rapidly is of great interest to the signal processing community, as these programs have been shown to recover sparse solutions to underdetermined systems of linear equations and come with strong performance guarantees. The LCA under study differs from the typical L1 solver in that it operates in continuous time: instead of being specified by discrete iterations, it evolves according to a system of nonlinear ordinary differential equations. The LCA is constructed from simple components, giving it the potential to be implemented as a large-scale analog circuit. The goal of this paper is to give guarantees on the convergence time of the LCA system. To do so, we analyze how the LCA evolves as it is recovering a sparse signal from underdetermined measurements. We show that under appropriate conditions on the measurement matrix and the problem parameters, the path the LCA follows can be described as a sequence of linear differential equations, each with a small number of active variables. This allows us to relate the convergence time of the system to the restricted isometry constant of the matrix. Interesting parallels to sparse-recovery digital solvers emerge from this study. Our analysis covers both the noisy and noiseless settings and is supported by simulation results.
Pure endmember extraction using robust kernel archetypoid analysis for hyperspectral imagery

NASA Astrophysics Data System (ADS)

Sun, Weiwei; Yang, Gang; Wu, Ke; Li, Weiyue; Zhang, Dianfa

2017-09-01

A robust kernel archetypoid analysis (RKADA) method is proposed to extract pure endmembers from hyperspectral imagery (HSI). The RKADA assumes that each pixel is a sparse linear mixture of all endmembers and each endmember corresponds to a real pixel in the image scene. First, it improves the re8gular archetypal analysis with a new binary sparse constraint, and the adoption of the kernel function constructs the principal convex hull in an infinite Hilbert space and enlarges the divergences between pairwise pixels. Second, the RKADA transfers the pure endmember extraction problem into an optimization problem by minimizing residual errors with the Huber loss function. The Huber loss function reduces the effects from big noises and outliers in the convergence procedure of RKADA and enhances the robustness of the optimization function. Third, the random kernel sinks for fast kernel matrix approximation and the two-stage algorithm for optimizing initial pure endmembers are utilized to improve its computational efficiency in realistic implementations of RKADA, respectively. The optimization equation of RKADA is solved by using the block coordinate descend scheme and the desired pure endmembers are finally obtained. Six state-of-the-art pure endmember extraction methods are employed to make comparisons with the RKADA on both synthetic and real Cuprite HSI datasets, including three geometrical algorithms vertex component analysis (VCA), alternative volume maximization (AVMAX) and orthogonal subspace projection (OSP), and three matrix factorization algorithms the preconditioning for successive projection algorithm (PreSPA), hierarchical clustering based on rank-two nonnegative matrix factorization (H2NMF) and self-dictionary multiple measurement vector (SDMMV). Experimental results show that the RKADA outperforms all the six methods in terms of spectral angle distance (SAD) and root-mean-square-error (RMSE). Moreover, the RKADA has short computational times in offline operations and shows significant improvement in identifying pure endmembers for ground objects with smaller spectrum differences. Therefore, the RKADA could be an alternative for pure endmember extraction from hyperspectral images.
Efficient diagonalization of the sparse matrices produced within the framework of the UK R-matrix molecular codes

NASA Astrophysics Data System (ADS)

Galiatsatos, P. G.; Tennyson, J.

2012-11-01

The most time consuming step within the framework of the UK R-matrix molecular codes is that of the diagonalization of the inner region Hamiltonian matrix (IRHM). Here we present the method that we follow to speed up this step. We use shared memory machines (SMM), distributed memory machines (DMM), the OpenMP directive based parallel language, the MPI function based parallel language, the sparse matrix diagonalizers ARPACK and PARPACK, a variation for real symmetric matrices of the official coordinate sparse matrix format and finally a parallel sparse matrix-vector product (PSMV). The efficient application of the previous techniques rely on two important facts: the sparsity of the matrix is large enough (more than 98%) and in order to get back converged results we need a small only part of the matrix spectrum.
Blockwise conjugate gradient methods for image reconstruction in volumetric CT.

PubMed

Qiu, W; Titley-Peloquin, D; Soleimani, M

2012-11-01

Cone beam computed tomography (CBCT) enables volumetric image reconstruction from 2D projection data and plays an important role in image guided radiation therapy (IGRT). Filtered back projection is still the most frequently used algorithm in applications. The algorithm discretizes the scanning process (forward projection) into a system of linear equations, which must then be solved to recover images from measured projection data. The conjugate gradients (CG) algorithm and its variants can be used to solve (possibly regularized) linear systems of equations Ax=b and linear least squares problems minx∥b-Ax∥2, especially when the matrix A is very large and sparse. Their applications can be found in a general CT context, but in tomography problems (e.g. CBCT reconstruction) they have not widely been used. Hence, CBCT reconstruction using the CG-type algorithm LSQR was implemented and studied in this paper. In CBCT reconstruction, the main computational challenge is that the matrix A usually is very large, and storing it in full requires an amount of memory well beyond the reach of commodity computers. Because of these memory capacity constraints, only a small fraction of the weighting matrix A is typically used, leading to a poor reconstruction. In this paper, to overcome this difficulty, the matrix A is partitioned and stored blockwise, and blockwise matrix-vector multiplications are implemented within LSQR. This implementation allows us to use the full weighting matrix A for CBCT reconstruction without further enhancing computer standards. Tikhonov regularization can also be implemented in this fashion, and can produce significant improvement in the reconstructed images. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
Improving recovery of ECG signal with deterministic guarantees using split signal for multiple supports of matching pursuit (SS-MSMP) algorithm.

PubMed

Tawfic, Israa Shaker; Kayhan, Sema Koc

2017-02-01

Compressed sensing (CS) is a new field used for signal acquisition and design of sensor that made a large drooping in the cost of acquiring sparse signals. In this paper, new algorithms are developed to improve the performance of the greedy algorithms. In this paper, a new greedy pursuit algorithm, SS-MSMP (Split Signal for Multiple Support of Matching Pursuit), is introduced and theoretical analyses are given. The SS-MSMP is suggested for sparse data acquisition, in order to reconstruct analog and efficient signals via a small set of general measurements. This paper proposes a new fast method which depends on a study of the behavior of the support indices through picking the best estimation of the corrosion between residual and measurement matrix. The term multiple supports originates from an algorithm; in each iteration, the best support indices are picked based on maximum quality created by discovering correlation for a particular length of support. We depend on this new algorithm upon our previous derivative of halting condition that we produce for Least Support Orthogonal Matching Pursuit (LS-OMP) for clear and noisy signal. For better reconstructed results, SS-MSMP algorithm provides the recovery of support set for long signals such as signals used in WBAN. Numerical experiments demonstrate that the new suggested algorithm performs well compared to existing algorithms in terms of many factors used for reconstruction performance. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Face recognition based on two-dimensional discriminant sparse preserving projection

NASA Astrophysics Data System (ADS)

Zhang, Dawei; Zhu, Shanan

2018-04-01

In this paper, a supervised dimensionality reduction algorithm named two-dimensional discriminant sparse preserving projection (2DDSPP) is proposed for face recognition. In order to accurately model manifold structure of data, 2DDSPP constructs within-class affinity graph and between-class affinity graph by the constrained least squares (LS) and l1 norm minimization problem, respectively. Based on directly operating on image matrix, 2DDSPP integrates graph embedding (GE) with Fisher criterion. The obtained projection subspace preserves within-class neighborhood geometry structure of samples, while keeping away samples from different classes. The experimental results on the PIE and AR face databases show that 2DDSPP can achieve better recognition performance.

An embedded system for face classification in infrared video using sparse representation

NASA Astrophysics Data System (ADS)

Saavedra M., Antonio; Pezoa, Jorge E.; Zarkesh-Ha, Payman; Figueroa, Miguel

2017-09-01

We propose a platform for robust face recognition in Infrared (IR) images using Compressive Sensing (CS). In line with CS theory, the classification problem is solved using a sparse representation framework, where test images are modeled by means of a linear combination of the training set. Because the training set constitutes an over-complete dictionary, we identify new images by finding their sparsest representation based on the training set, using standard l1-minimization algorithms. Unlike conventional face-recognition algorithms, we feature extraction is performed using random projections with a precomputed binary matrix, as proposed in the CS literature. This random sampling reduces the effects of noise and occlusions such as facial hair, eyeglasses, and disguises, which are notoriously challenging in IR images. Thus, the performance of our framework is robust to these noise and occlusion factors, achieving an average accuracy of approximately 90% when the UCHThermalFace database is used for training and testing purposes. We implemented our framework on a high-performance embedded digital system, where the computation of the sparse representation of IR images was performed by a dedicated hardware using a deeply pipelined architecture on an Field-Programmable Gate Array (FPGA).
Computational efficiency improvements for image colorization

NASA Astrophysics Data System (ADS)

Yu, Chao; Sharma, Gaurav; Aly, Hussein

2013-03-01

We propose an efficient algorithm for colorization of greyscale images. As in prior work, colorization is posed as an optimization problem: a user specifies the color for a few scribbles drawn on the greyscale image and the color image is obtained by propagating color information from the scribbles to surrounding regions, while maximizing the local smoothness of colors. In this formulation, colorization is obtained by solving a large sparse linear system, which normally requires substantial computation and memory resources. Our algorithm improves the computational performance through three innovations over prior colorization implementations. First, the linear system is solved iteratively without explicitly constructing the sparse matrix, which significantly reduces the required memory. Second, we formulate each iteration in terms of integral images obtained by dynamic programming, reducing repetitive computation. Third, we use a coarseto- fine framework, where a lower resolution subsampled image is first colorized and this low resolution color image is upsampled to initialize the colorization process for the fine level. The improvements we develop provide significant speedup and memory savings compared to the conventional approach of solving the linear system directly using off-the-shelf sparse solvers, and allow us to colorize images with typical sizes encountered in realistic applications on typical commodity computing platforms.
Visual recognition and inference using dynamic overcomplete sparse learning.

PubMed

Murray, Joseph F; Kreutz-Delgado, Kenneth

2007-09-01

We present a hierarchical architecture and learning algorithm for visual recognition and other visual inference tasks such as imagination, reconstruction of occluded images, and expectation-driven segmentation. Using properties of biological vision for guidance, we posit a stochastic generative world model and from it develop a simplified world model (SWM) based on a tractable variational approximation that is designed to enforce sparse coding. Recent developments in computational methods for learning overcomplete representations (Lewicki & Sejnowski, 2000; Teh, Welling, Osindero, & Hinton, 2003) suggest that overcompleteness can be useful for visual tasks, and we use an overcomplete dictionary learning algorithm (Kreutz-Delgado, et al., 2003) as a preprocessing stage to produce accurate, sparse codings of images. Inference is performed by constructing a dynamic multilayer network with feedforward, feedback, and lateral connections, which is trained to approximate the SWM. Learning is done with a variant of the back-propagation-through-time algorithm, which encourages convergence to desired states within a fixed number of iterations. Vision tasks require large networks, and to make learning efficient, we take advantage of the sparsity of each layer to update only a small subset of elements in a large weight matrix at each iteration. Experiments on a set of rotated objects demonstrate various types of visual inference and show that increasing the degree of overcompleteness improves recognition performance in difficult scenes with occluded objects in clutter.
Sparse Matrix Software Catalog, Sparse Matrix Symposium 1982, Fairfield Glade, Tennessee, October 24-27, 1982,

DTIC Science & Technology

1982-10-27

are buried within * a much larger, special purpose package. We regret such omissions, but to have reached the practi- tioners in each of the diverse...sparse matrix (form PAQ ) 4. Method of solution: Distribution count sort 5. Programming language: FORTRAN g Precision: Single and double precision 7
Performance issues for iterative solvers in device simulation

NASA Technical Reports Server (NTRS)

Fan, Qing; Forsyth, P. A.; Mcmacken, J. R. F.; Tang, Wei-Pai

1994-01-01

Due to memory limitations, iterative methods have become the method of choice for large scale semiconductor device simulation. However, it is well known that these methods still suffer from reliability problems. The linear systems which appear in numerical simulation of semiconductor devices are notoriously ill-conditioned. In order to produce robust algorithms for practical problems, careful attention must be given to many implementation issues. This paper concentrates on strategies for developing robust preconditioners. In addition, effective data structures and convergence check issues are also discussed. These algorithms are compared with a standard direct sparse matrix solver on a variety of problems.
Reprint of "Two-stage sparse coding of region covariance via Log-Euclidean kernels to detect saliency".

PubMed

Zhang, Ying-Ying; Yang, Cai; Zhang, Ping

2017-08-01

In this paper, we present a novel bottom-up saliency detection algorithm from the perspective of covariance matrices on a Riemannian manifold. Each superpixel is described by a region covariance matrix on Riemannian Manifolds. We carry out a two-stage sparse coding scheme via Log-Euclidean kernels to extract salient objects efficiently. In the first stage, given background dictionary on image borders, sparse coding of each region covariance via Log-Euclidean kernels is performed. The reconstruction error on the background dictionary is regarded as the initial saliency of each superpixel. In the second stage, an improvement of the initial result is achieved by calculating reconstruction errors of the superpixels on foreground dictionary, which is extracted from the first stage saliency map. The sparse coding in the second stage is similar to the first stage, but is able to effectively highlight the salient objects uniformly from the background. Finally, three post-processing methods-highlight-inhibition function, context-based saliency weighting, and the graph cut-are adopted to further refine the saliency map. Experiments on four public benchmark datasets show that the proposed algorithm outperforms the state-of-the-art methods in terms of precision, recall and mean absolute error, and demonstrate the robustness and efficiency of the proposed method. Copyright © 2017 Elsevier Ltd. All rights reserved.
Solution of matrix equations using sparse techniques

NASA Technical Reports Server (NTRS)

Baddourah, Majdi

1994-01-01

The solution of large systems of matrix equations is key to the solution of a large number of scientific and engineering problems. This talk describes the sparse matrix solver developed at Langley which can routinely solve in excess of 263,000 equations in 40 seconds on one Cray C-90 processor. It appears that for large scale structural analysis applications, sparse matrix methods have a significant performance advantage over other methods.
SAMBA: Sparse Approximation of Moment-Based Arbitrary Polynomial Chaos

NASA Astrophysics Data System (ADS)

Ahlfeld, R.; Belkouchi, B.; Montomoli, F.

2016-09-01

A new arbitrary Polynomial Chaos (aPC) method is presented for moderately high-dimensional problems characterised by limited input data availability. The proposed methodology improves the algorithm of aPC and extends the method, that was previously only introduced as tensor product expansion, to moderately high-dimensional stochastic problems. The fundamental idea of aPC is to use the statistical moments of the input random variables to develop the polynomial chaos expansion. This approach provides the possibility to propagate continuous or discrete probability density functions and also histograms (data sets) as long as their moments exist, are finite and the determinant of the moment matrix is strictly positive. For cases with limited data availability, this approach avoids bias and fitting errors caused by wrong assumptions. In this work, an alternative way to calculate the aPC is suggested, which provides the optimal polynomials, Gaussian quadrature collocation points and weights from the moments using only a handful of matrix operations on the Hankel matrix of moments. It can therefore be implemented without requiring prior knowledge about statistical data analysis or a detailed understanding of the mathematics of polynomial chaos expansions. The extension to more input variables suggested in this work, is an anisotropic and adaptive version of Smolyak's algorithm that is solely based on the moments of the input probability distributions. It is referred to as SAMBA (PC), which is short for Sparse Approximation of Moment-Based Arbitrary Polynomial Chaos. It is illustrated that for moderately high-dimensional problems (up to 20 different input variables or histograms) SAMBA can significantly simplify the calculation of sparse Gaussian quadrature rules. SAMBA's efficiency for multivariate functions with regard to data availability is further demonstrated by analysing higher order convergence and accuracy for a set of nonlinear test functions with 2, 5 and 10 different input distributions or histograms.
Low-rank matrix decomposition and spatio-temporal sparse recovery for STAP radar

DOE PAGES

Sen, Satyabrata

2015-08-04

We develop space-time adaptive processing (STAP) methods by leveraging the advantages of sparse signal processing techniques in order to detect a slowly-moving target. We observe that the inherent sparse characteristics of a STAP problem can be formulated as the low-rankness of clutter covariance matrix when compared to the total adaptive degrees-of-freedom, and also as the sparse interference spectrum on the spatio-temporal domain. By exploiting these sparse properties, we propose two approaches for estimating the interference covariance matrix. In the first approach, we consider a constrained matrix rank minimization problem (RMP) to decompose the sample covariance matrix into a low-rank positivemore » semidefinite and a diagonal matrix. The solution of RMP is obtained by applying the trace minimization technique and the singular value decomposition with matrix shrinkage operator. Our second approach deals with the atomic norm minimization problem to recover the clutter response-vector that has a sparse support on the spatio-temporal plane. We use convex relaxation based standard sparse-recovery techniques to find the solutions. With extensive numerical examples, we demonstrate the performances of proposed STAP approaches with respect to both the ideal and practical scenarios, involving Doppler-ambiguous clutter ridges, spatial and temporal decorrelation effects. As a result, the low-rank matrix decomposition based solution requires secondary measurements as many as twice the clutter rank to attain a near-ideal STAP performance; whereas the spatio-temporal sparsity based approach needs a considerably small number of secondary data.« less
Color normalization of histology slides using graph regularized sparse NMF

NASA Astrophysics Data System (ADS)

Sha, Lingdao; Schonfeld, Dan; Sethi, Amit

2017-03-01

Computer based automatic medical image processing and quantification are becoming popular in digital pathology. However, preparation of histology slides can vary widely due to differences in staining equipment, procedures and reagents, which can reduce the accuracy of algorithms that analyze their color and texture information. To re- duce the unwanted color variations, various supervised and unsupervised color normalization methods have been proposed. Compared with supervised color normalization methods, unsupervised color normalization methods have advantages of time and cost efficient and universal applicability. Most of the unsupervised color normaliza- tion methods for histology are based on stain separation. Based on the fact that stain concentration cannot be negative and different parts of the tissue absorb different stains, nonnegative matrix factorization (NMF), and particular its sparse version (SNMF), are good candidates for stain separation. However, most of the existing unsupervised color normalization method like PCA, ICA, NMF and SNMF fail to consider important information about sparse manifolds that its pixels occupy, which could potentially result in loss of texture information during color normalization. Manifold learning methods like Graph Laplacian have proven to be very effective in interpreting high-dimensional data. In this paper, we propose a novel unsupervised stain separation method called graph regularized sparse nonnegative matrix factorization (GSNMF). By considering the sparse prior of stain concentration together with manifold information from high-dimensional image data, our method shows better performance in stain color deconvolution than existing unsupervised color deconvolution methods, especially in keeping connected texture information. To utilized the texture information, we construct a nearest neighbor graph between pixels within a spatial area of an image based on their distances using heat kernal in lαβ space. The representation of a pixel in the stain density space is constrained to follow the feature distance of the pixel to pixels in the neighborhood graph. Utilizing color matrix transfer method with the stain concentrations found using our GSNMF method, the color normalization performance was also better than existing methods.
Optimization of sparse matrix-vector multiplication on emerging multicore platforms

DOE Office of Scientific and Technical Information (OSTI.GOV)

Williams, Samuel; Oliker, Leonid; Vuduc, Richard

2007-01-01

We are witnessing a dramatic change in computer architecture due to the multicore paradigm shift, as every electronic device from cell phones to supercomputers confronts parallelism of unprecedented scale. To fully unleash the potential of these systems, the HPC community must develop multicore specific optimization methodologies for important scientific computations. In this work, we examine sparse matrix-vector multiply (SpMV) - one of the most heavily used kernels in scientific computing - across a broad spectrum of multicore designs. Our experimental platform includes the homogeneous AMD dual-core and Intel quad-core designs, the heterogeneous STI Cell, as well as the first scientificmore » study of the highly multithreaded Sun Niagara2. We present several optimization strategies especially effective for the multicore environment, and demonstrate significant performance improvements compared to existing state-of-the-art serial and parallel SpMV implementations. Additionally, we present key insights into the architectural tradeoffs of leading multicore design strategies, in the context of demanding memory-bound numerical algorithms.« less
Lanczos eigensolution method for high-performance computers

NASA Technical Reports Server (NTRS)

Bostic, Susan W.

1991-01-01

The theory, computational analysis, and applications are presented of a Lanczos algorithm on high performance computers. The computationally intensive steps of the algorithm are identified as: the matrix factorization, the forward/backward equation solution, and the matrix vector multiples. These computational steps are optimized to exploit the vector and parallel capabilities of high performance computers. The savings in computational time from applying optimization techniques such as: variable band and sparse data storage and access, loop unrolling, use of local memory, and compiler directives are presented. Two large scale structural analysis applications are described: the buckling of a composite blade stiffened panel with a cutout, and the vibration analysis of a high speed civil transport. The sequential computational time for the panel problem executed on a CONVEX computer of 181.6 seconds was decreased to 14.1 seconds with the optimized vector algorithm. The best computational time of 23 seconds for the transport problem with 17,000 degs of freedom was on the the Cray-YMP using an average of 3.63 processors.
Kanerva's sparse distributed memory: An associative memory algorithm well-suited to the Connection Machine

NASA Technical Reports Server (NTRS)

Rogers, David

1988-01-01

The advent of the Connection Machine profoundly changes the world of supercomputers. The highly nontraditional architecture makes possible the exploration of algorithms that were impractical for standard Von Neumann architectures. Sparse distributed memory (SDM) is an example of such an algorithm. Sparse distributed memory is a particularly simple and elegant formulation for an associative memory. The foundations for sparse distributed memory are described, and some simple examples of using the memory are presented. The relationship of sparse distributed memory to three important computational systems is shown: random-access memory, neural networks, and the cerebellum of the brain. Finally, the implementation of the algorithm for sparse distributed memory on the Connection Machine is discussed.
MPI-FAUN: An MPI-Based Framework for Alternating-Updating Nonnegative Matrix Factorization

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kannan, Ramakrishnan; Ballard, Grey; Park, Haesun

Non-negative matrix factorization (NMF) is the problem of determining two non-negative low rank factors W and H, for the given input matrix A, such that A≈WH. NMF is a useful tool for many applications in different domains such as topic modeling in text mining, background separation in video analysis, and community detection in social networks. Despite its popularity in the data mining community, there is a lack of efficient parallel algorithms to solve the problem for big data sets. The main contribution of this work is a new, high-performance parallel computational framework for a broad class of NMF algorithms thatmore » iteratively solves alternating non-negative least squares (NLS) subproblems for W and H. It maintains the data and factor matrices in memory (distributed across processors), uses MPI for interprocessor communication, and, in the dense case, provably minimizes communication costs (under mild assumptions). The framework is flexible and able to leverage a variety of NMF and NLS algorithms, including Multiplicative Update, Hierarchical Alternating Least Squares, and Block Principal Pivoting. Our implementation allows us to benchmark and compare different algorithms on massive dense and sparse data matrices of size that spans from few hundreds of millions to billions. We demonstrate the scalability of our algorithm and compare it with baseline implementations, showing significant performance improvements. The code and the datasets used for conducting the experiments are available online.« less
MPI-FAUN: An MPI-Based Framework for Alternating-Updating Nonnegative Matrix Factorization

DOE PAGES

Kannan, Ramakrishnan; Ballard, Grey; Park, Haesun

2017-10-30

Non-negative matrix factorization (NMF) is the problem of determining two non-negative low rank factors W and H, for the given input matrix A, such that A≈WH. NMF is a useful tool for many applications in different domains such as topic modeling in text mining, background separation in video analysis, and community detection in social networks. Despite its popularity in the data mining community, there is a lack of efficient parallel algorithms to solve the problem for big data sets. The main contribution of this work is a new, high-performance parallel computational framework for a broad class of NMF algorithms thatmore » iteratively solves alternating non-negative least squares (NLS) subproblems for W and H. It maintains the data and factor matrices in memory (distributed across processors), uses MPI for interprocessor communication, and, in the dense case, provably minimizes communication costs (under mild assumptions). The framework is flexible and able to leverage a variety of NMF and NLS algorithms, including Multiplicative Update, Hierarchical Alternating Least Squares, and Block Principal Pivoting. Our implementation allows us to benchmark and compare different algorithms on massive dense and sparse data matrices of size that spans from few hundreds of millions to billions. We demonstrate the scalability of our algorithm and compare it with baseline implementations, showing significant performance improvements. The code and the datasets used for conducting the experiments are available online.« less
Infrared and visible image fusion based on robust principal component analysis and compressed sensing

NASA Astrophysics Data System (ADS)

Li, Jun; Song, Minghui; Peng, Yuanxi

2018-03-01

Current infrared and visible image fusion methods do not achieve adequate information extraction, i.e., they cannot extract the target information from infrared images while retaining the background information from visible images. Moreover, most of them have high complexity and are time-consuming. This paper proposes an efficient image fusion framework for infrared and visible images on the basis of robust principal component analysis (RPCA) and compressed sensing (CS). The novel framework consists of three phases. First, RPCA decomposition is applied to the infrared and visible images to obtain their sparse and low-rank components, which represent the salient features and background information of the images, respectively. Second, the sparse and low-rank coefficients are fused by different strategies. On the one hand, the measurements of the sparse coefficients are obtained by the random Gaussian matrix, and they are then fused by the standard deviation (SD) based fusion rule. Next, the fused sparse component is obtained by reconstructing the result of the fused measurement using the fast continuous linearized augmented Lagrangian algorithm (FCLALM). On the other hand, the low-rank coefficients are fused using the max-absolute rule. Subsequently, the fused image is superposed by the fused sparse and low-rank components. For comparison, several popular fusion algorithms are tested experimentally. By comparing the fused results subjectively and objectively, we find that the proposed framework can extract the infrared targets while retaining the background information in the visible images. Thus, it exhibits state-of-the-art performance in terms of both fusion effects and timeliness.
Acceleration of GPU-based Krylov solvers via data transfer reduction

DOE PAGES

Anzt, Hartwig; Tomov, Stanimire; Luszczek, Piotr; ...

2015-04-08

Krylov subspace iterative solvers are often the method of choice when solving large sparse linear systems. At the same time, hardware accelerators such as graphics processing units continue to offer significant floating point performance gains for matrix and vector computations through easy-to-use libraries of computational kernels. However, as these libraries are usually composed of a well optimized but limited set of linear algebra operations, applications that use them often fail to reduce certain data communications, and hence fail to leverage the full potential of the accelerator. In this study, we target the acceleration of Krylov subspace iterative methods for graphicsmore » processing units, and in particular the Biconjugate Gradient Stabilized solver that significant improvement can be achieved by reformulating the method to reduce data-communications through application-specific kernels instead of using the generic BLAS kernels, e.g. as provided by NVIDIA’s cuBLAS library, and by designing a graphics processing unit specific sparse matrix-vector product kernel that is able to more efficiently use the graphics processing unit’s computing power. Furthermore, we derive a model estimating the performance improvement, and use experimental data to validate the expected runtime savings. Finally, considering that the derived implementation achieves significantly higher performance, we assert that similar optimizations addressing algorithm structure, as well as sparse matrix-vector, are crucial for the subsequent development of high-performance graphics processing units accelerated Krylov subspace iterative methods.« less
Group sparse multiview patch alignment framework with view consistency for image classification.

PubMed

Gui, Jie; Tao, Dacheng; Sun, Zhenan; Luo, Yong; You, Xinge; Tang, Yuan Yan

2014-07-01

No single feature can satisfactorily characterize the semantic concepts of an image. Multiview learning aims to unify different kinds of features to produce a consensual and efficient representation. This paper redefines part optimization in the patch alignment framework (PAF) and develops a group sparse multiview patch alignment framework (GSM-PAF). The new part optimization considers not only the complementary properties of different views, but also view consistency. In particular, view consistency models the correlations between all possible combinations of any two kinds of view. In contrast to conventional dimensionality reduction algorithms that perform feature extraction and feature selection independently, GSM-PAF enjoys joint feature extraction and feature selection by exploiting l(2,1)-norm on the projection matrix to achieve row sparsity, which leads to the simultaneous selection of relevant features and learning transformation, and thus makes the algorithm more discriminative. Experiments on two real-world image data sets demonstrate the effectiveness of GSM-PAF for image classification.
Orthogonal sparse linear discriminant analysis

NASA Astrophysics Data System (ADS)

Liu, Zhonghua; Liu, Gang; Pu, Jiexin; Wang, Xiaohong; Wang, Haijun

2018-03-01

Linear discriminant analysis (LDA) is a linear feature extraction approach, and it has received much attention. On the basis of LDA, researchers have done a lot of research work on it, and many variant versions of LDA were proposed. However, the inherent problem of LDA cannot be solved very well by the variant methods. The major disadvantages of the classical LDA are as follows. First, it is sensitive to outliers and noises. Second, only the global discriminant structure is preserved, while the local discriminant information is ignored. In this paper, we present a new orthogonal sparse linear discriminant analysis (OSLDA) algorithm. The k nearest neighbour graph is first constructed to preserve the locality discriminant information of sample points. Then, L2,1-norm constraint on the projection matrix is used to act as loss function, which can make the proposed method robust to outliers in data points. Extensive experiments have been performed on several standard public image databases, and the experiment results demonstrate the performance of the proposed OSLDA algorithm.
Sparse matrix beamforming and image reconstruction for real-time 2D HIFU monitoring using Harmonic Motion Imaging for Focused Ultrasound (HMIFU) with in vitro validation

PubMed Central

Hou, Gary Y.; Provost, Jean; Grondin, Julien; Wang, Shutao; Marquet, Fabrice; Bunting, Ethan; Konofagou, Elisa E.

2015-01-01

Harmonic Motion Imaging for Focused Ultrasound (HMIFU) is a recently developed High-Intensity Focused Ultrasound (HIFU) treatment monitoring method. HMIFU utilizes an Amplitude-Modulated (fAM = 25 Hz) HIFU beam to induce a localized focal oscillatory motion, which is simultaneously estimated and imaged by confocally-aligned imaging transducer. HMIFU feasibilities have been previously shown in silico, in vitro, and in vivo in 1-D or 2-D monitoring of HIFU treatment. The objective of this study is to develop and show the feasibility of a novel fast beamforming algorithm for image reconstruction using GPU-based sparse-matrix operation with real-time feedback. In this study, the algorithm was implemented onto a fully integrated, clinically relevant HMIFU system composed of a 93-element HIFU transducer (fcenter = 4.5MHz) and coaxially-aligned 64-element phased array (fcenter = 2.5MHz) for displacement excitation and motion estimation, respectively. A single transmit beam with divergent beam transmit was used while fast beamforming was implemented using a GPU-based delay-and-sum method and a sparse-matrix operation. Axial HMI displacements were then estimated from the RF signals using a 1-D normalized cross-correlation method and streamed to a graphic user interface. The present work developed and implemented a sparse matrix beamforming onto a fully-integrated, clinically relevant system, which can stream displacement images up to 15 Hz using a GPU-based processing, an increase of 100 fold in rate of streaming displacement images compared to conventional CPU-based conventional beamforming and reconstruction processing. The achieved feedback rate is also currently the fastest and only approach that does not require interrupting the HIFU treatment amongst the acoustic radiation force based HIFU imaging techniques. Results in phantom experiments showed reproducible displacement imaging, and monitoring of twenty two in vitro HIFU treatments using the new 2D system showed a consistent average focal displacement decrease of 46.7±14.6% during lesion formation. Complementary focal temperature monitoring also indicated an average rate of displacement increase and decrease with focal temperature at 0.84±1.15 %/ °C, and 2.03± 0.93%/ °C, respectively. These results reinforce the HMIFU capability of estimating and monitoring stiffness related changes in real time. Current ongoing studies include clinical translation of the presented system for monitoring of HIFU treatment for breast and pancreatic tumor applications. PMID:24960528

DOE Office of Scientific and Technical Information (OSTI.GOV)

Spotz, William F.

PyTrilinos is a set of Python interfaces to compiled Trilinos packages. This collection supports serial and parallel dense linear algebra, serial and parallel sparse linear algebra, direct and iterative linear solution techniques, algebraic and multilevel preconditioners, nonlinear solvers and continuation algorithms, eigensolvers and partitioning algorithms. Also included are a variety of related utility functions and classes, including distributed I/O, coloring algorithms and matrix generation. PyTrilinos vector objects are compatible with the popular NumPy Python package. As a Python front end to compiled libraries, PyTrilinos takes advantage of the flexibility and ease of use of Python, and the efficiency of themore » underlying C++, C and Fortran numerical kernels. This paper covers recent, previously unpublished advances in the PyTrilinos package.« less
Comparing implementations of penalized weighted least-squares sinogram restoration.

PubMed

Forthmann, Peter; Koehler, Thomas; Defrise, Michel; La Riviere, Patrick

2010-11-01

A CT scanner measures the energy that is deposited in each channel of a detector array by x rays that have been partially absorbed on their way through the object. The measurement process is complex and quantitative measurements are always and inevitably associated with errors, so CT data must be preprocessed prior to reconstruction. In recent years, the authors have formulated CT sinogram preprocessing as a statistical restoration problem in which the goal is to obtain the best estimate of the line integrals needed for reconstruction from the set of noisy, degraded measurements. The authors have explored both penalized Poisson likelihood (PL) and penalized weighted least-squares (PWLS) objective functions. At low doses, the authors found that the PL approach outperforms PWLS in terms of resolution-noise tradeoffs, but at standard doses they perform similarly. The PWLS objective function, being quadratic, is more amenable to computational acceleration than the PL objective. In this work, the authors develop and compare two different methods for implementing PWLS sinogram restoration with the hope of improving computational performance relative to PL in the standard-dose regime. Sinogram restoration is still significant in the standard-dose regime since it can still outperform standard approaches and it allows for correction of effects that are not usually modeled in standard CT preprocessing. The authors have explored and compared two implementation strategies for PWLS sinogram restoration: (1) A direct matrix-inversion strategy based on the closed-form solution to the PWLS optimization problem and (2) an iterative approach based on the conjugate-gradient algorithm. Obtaining optimal performance from each strategy required modifying the naive off-the-shelf implementations of the algorithms to exploit the particular symmetry and sparseness of the sinogram-restoration problem. For the closed-form approach, the authors subdivided the large matrix inversion into smaller coupled problems and exploited sparseness to minimize matrix operations. For the conjugate-gradient approach, the authors exploited sparseness and preconditioned the problem to speed up convergence. All methods produced qualitatively and quantitatively similar images as measured by resolution-variance tradeoffs and difference images. Despite the acceleration strategies, the direct matrix-inversion approach was found to be uncompetitive with iterative approaches, with a computational burden higher by an order of magnitude or more. The iterative conjugate-gradient approach, however, does appear promising, with computation times half that of the authors' previous penalized-likelihood implementation. Iterative conjugate-gradient based PWLS sinogram restoration with careful matrix optimizations has computational advantages over direct matrix PWLS inversion and over penalized-likelihood sinogram restoration and can be considered a good alternative in standard-dose regimes.
Sparse matrix multiplications for linear scaling electronic structure calculations in an atom-centered basis set using multiatom blocks.

PubMed

Saravanan, Chandra; Shao, Yihan; Baer, Roi; Ross, Philip N; Head-Gordon, Martin

2003-04-15

A sparse matrix multiplication scheme with multiatom blocks is reported, a tool that can be very useful for developing linear-scaling methods with atom-centered basis functions. Compared to conventional element-by-element sparse matrix multiplication schemes, efficiency is gained by the use of the highly optimized basic linear algebra subroutines (BLAS). However, some sparsity is lost in the multiatom blocking scheme because these matrix blocks will in general contain negligible elements. As a result, an optimal block size that minimizes the CPU time by balancing these two effects is recovered. In calculations on linear alkanes, polyglycines, estane polymers, and water clusters the optimal block size is found to be between 40 and 100 basis functions, where about 55-75% of the machine peak performance was achieved on an IBM RS6000 workstation. In these calculations, the blocked sparse matrix multiplications can be 10 times faster than a standard element-by-element sparse matrix package. Copyright 2003 Wiley Periodicals, Inc. J Comput Chem 24: 618-622, 2003
Disentangling giant component and finite cluster contributions in sparse random matrix spectra.

PubMed

Kühn, Reimer

2016-04-01

We describe a method for disentangling giant component and finite cluster contributions to sparse random matrix spectra, using sparse symmetric random matrices defined on Erdős-Rényi graphs as an example and test bed. Our methods apply to sparse matrices defined in terms of arbitrary graphs in the configuration model class, as long as they have finite mean degree.
Exploiting Data Sparsity in Parallel Matrix Powers Computations

DTIC Science & Technology

2013-05-03

2013 Report Documentation Page Form ApprovedOMB No. 0704-0188 Public reporting burden for the collection of information is estimated to average 1 hour...matrices of the form A = D+USV H, where D is sparse and USV H has low rank but may be dense. Matrices of this form arise in many practical applications...methods numerical partial di erential equation solvers, and preconditioned iterative methods. If A has this form , our algorithm enables a communication
Sparse Adaptive Iteratively-Weighted Thresholding Algorithm (SAITA) for Lp-Regularization Using the Multiple Sub-Dictionary Representation

PubMed Central

Zhang, Jie; Fan, Shangang; Xiong, Jian; Cheng, Xiefeng; Sari, Hikmet; Adachi, Fumiyuki

2017-01-01

Both L1/2 and L2/3 are two typical non-convex regularizations of Lp (0
Sparse Adaptive Iteratively-Weighted Thresholding Algorithm (SAITA) for Lp-Regularization Using the Multiple Sub-Dictionary Representation.

PubMed

Li, Yunyi; Zhang, Jie; Fan, Shangang; Yang, Jie; Xiong, Jian; Cheng, Xiefeng; Sari, Hikmet; Adachi, Fumiyuki; Gui, Guan

2017-12-15

Both L 1/2 and L 2/3 are two typical non-convex regularizations of L p (0
Crystallization of bFGF-DNA Aptamer Complexes Using a Sparse Matrix Designed for Protein-Nucleic Acid Complexes

NASA Technical Reports Server (NTRS)

Cannone, Jaime J.; Barnes, Cindy L.; Achari, Aniruddha; Kundrot, Craig E.; Whitaker, Ann F. (Technical Monitor)

2001-01-01

The Sparse Matrix approach for obtaining lead crystallization conditions has proven to be very fruitful for the crystallization of proteins and nucleic acids. Here we report a Sparse Matrix developed specifically for the crystallization of protein-DNA complexes. This method is rapid and economical, typically requiring 2.5 mg of complex to test 48 conditions. The method was originally developed to crystallize basic fibroblast growth factor (bFGF) complexed with DNA sequences identified through in vitro selection, or SELEX, methods. Two DNA aptamers that bind with approximately nanomolar affinity and inhibit the angiogenic properties of bFGF were selected for co-crystallization. The Sparse Matrix produced lead crystallization conditions for both bFGF-DNA complexes.
High-SNR spectrum measurement based on Hadamard encoding and sparse reconstruction

NASA Astrophysics Data System (ADS)

Wang, Zhaoxin; Yue, Jiang; Han, Jing; Li, Long; Jin, Yong; Gao, Yuan; Li, Baoming

2017-12-01

The denoising capabilities of the H-matrix and cyclic S-matrix based on the sparse reconstruction, employed in the Pixel of Focal Plane Coded Visible Spectrometer for spectrum measurement are investigated, where the spectrum is sparse in a known basis. In the measurement process, the digital micromirror device plays an important role, which implements the Hadamard coding. In contrast with Hadamard transform spectrometry, based on the shift invariability, this spectrometer may have the advantage of a high efficiency. Simulations and experiments show that the nonlinear solution with a sparse reconstruction has a better signal-to-noise ratio than the linear solution and the H-matrix outperforms the cyclic S-matrix whether the reconstruction method is nonlinear or linear.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhu, Xiaojun; Lei, Guangtsai; Pan, Guangwen

In this paper, the continuous operator is discretized into matrix forms by Galerkin`s procedure, using periodic Battle-Lemarie wavelets as basis/testing functions. The polynomial decomposition of wavelets is applied to the evaluation of matrix elements, which makes the computational effort of the matrix elements no more expensive than that of method of moments (MoM) with conventional piecewise basis/testing functions. A new algorithm is developed employing the fast wavelet transform (FWT). Owing to localization, cancellation, and orthogonal properties of wavelets, very sparse matrices have been obtained, which are then solved by the LSQR iterative method. This algorithm is also adaptive in thatmore » one can add at will finer wavelet bases in the regions where fields vary rapidly, without any damage to the system orthogonality of the wavelet basis functions. To demonstrate the effectiveness of the new algorithm, we applied it to the evaluation of frequency-dependent resistance and inductance matrices of multiple lossy transmission lines. Numerical results agree with previously published data and laboratory measurements. The valid frequency range of the boundary integral equation results has been extended two to three decades in comparison with the traditional MoM approach. The new algorithm has been integrated into the computer aided design tool, MagiCAD, which is used for the design and simulation of high-speed digital systems and multichip modules Pan et al. 29 refs., 7 figs., 6 tabs.« less
Dual energy CT with one full scan and a second sparse-view scan using structure preserving iterative reconstruction (SPIR)

NASA Astrophysics Data System (ADS)

Wang, Tonghe; Zhu, Lei

2016-09-01

Conventional dual-energy CT (DECT) reconstruction requires two full-size projection datasets with two different energy spectra. In this study, we propose an iterative algorithm to enable a new data acquisition scheme which requires one full scan and a second sparse-view scan for potential reduction in imaging dose and engineering cost of DECT. A bilateral filter is calculated as a similarity matrix from the first full-scan CT image to quantify the similarity between any two pixels, which is assumed unchanged on a second CT image since DECT scans are performed on the same object. The second CT image from reduced projections is reconstructed by an iterative algorithm which updates the image by minimizing the total variation of the difference between the image and its filtered image by the similarity matrix under data fidelity constraint. As the redundant structural information of the two CT images is contained in the similarity matrix for CT reconstruction, we refer to the algorithm as structure preserving iterative reconstruction (SPIR). The proposed method is evaluated on both digital and physical phantoms, and is compared with the filtered-backprojection (FBP) method, the conventional total-variation-regularization-based algorithm (TVR) and prior-image-constrained-compressed-sensing (PICCS). SPIR with a second 10-view scan reduces the image noise STD by a factor of one order of magnitude with same spatial resolution as full-view FBP image. SPIR substantially improves over TVR on the reconstruction accuracy of a 10-view scan by decreasing the reconstruction error from 6.18% to 1.33%, and outperforms TVR at 50 and 20-view scans on spatial resolution with a higher frequency at the modulation transfer function value of 10% by an average factor of 4. Compared with the 20-view scan PICCS result, the SPIR image has 7 times lower noise STD with similar spatial resolution. The electron density map obtained from the SPIR-based DECT images with a second 10-view scan has an average error of less than 1%.
A General-applications Direct Global Matrix Algorithm for Rapid Seismo-acoustic Wavefield Computations

NASA Technical Reports Server (NTRS)

Schmidt, H.; Tango, G. J.; Werby, M. F.

1985-01-01

A new matrix method for rapid wave propagation modeling in generalized stratified media, which has recently been applied to numerical simulations in diverse areas of underwater acoustics, solid earth seismology, and nondestructive ultrasonic scattering is explained and illustrated. A portion of recent efforts jointly undertaken at NATOSACLANT and NORDA Numerical Modeling groups in developing, implementing, and testing a new fast general-applications wave propagation algorithm, SAFARI, formulated at SACLANT is summarized. The present general-applications SAFARI program uses a Direct Global Matrix Approach to multilayer Green's function calculation. A rapid and unconditionally stable solution is readily obtained via simple Gaussian ellimination on the resulting sparsely banded block system, precisely analogous to that arising in the Finite Element Method. The resulting gains in accuracy and computational speed allow consideration of much larger multilayered air/ocean/Earth/engineering material media models, for many more source-receiver configurations than previously possible. The validity and versatility of the SAFARI-DGM method is demonstrated by reviewing three practical examples of engineering interest, drawn from ocean acoustics, engineering seismology and ultrasonic scattering.
Factorization in large-scale many-body calculations

DOE PAGES

Johnson, Calvin W.; Ormand, W. Erich; Krastev, Plamen G.

2013-08-07

One approach for solving interacting many-fermion systems is the configuration-interaction method, also sometimes called the interacting shell model, where one finds eigenvalues of the Hamiltonian in a many-body basis of Slater determinants (antisymmetrized products of single-particle wavefunctions). The resulting Hamiltonian matrix is typically very sparse, but for large systems the nonzero matrix elements can nonetheless require terabytes or more of storage. An alternate algorithm, applicable to a broad class of systems with symmetry, in our case rotational invariance, is to exactly factorize both the basis and the interaction using additive/multiplicative quantum numbers; such an algorithm recreates the many-body matrix elementsmore » on the fly and can reduce the storage requirements by an order of magnitude or more. Here, we discuss factorization in general and introduce a novel, generalized factorization method, essentially a ‘double-factorization’ which speeds up basis generation and set-up of required arrays. Although we emphasize techniques, we also place factorization in the context of a specific (unpublished) configuration-interaction code, BIGSTICK, which runs both on serial and parallel machines, and discuss the savings in memory due to factorization.« less
Social Collaborative Filtering by Trust.

PubMed

Yang, Bo; Lei, Yu; Liu, Jiming; Li, Wenjie

2017-08-01

Recommender systems are used to accurately and actively provide users with potentially interesting information or services. Collaborative filtering is a widely adopted approach to recommendation, but sparse data and cold-start users are often barriers to providing high quality recommendations. To address such issues, we propose a novel method that works to improve the performance of collaborative filtering recommendations by integrating sparse rating data given by users and sparse social trust network among these same users. This is a model-based method that adopts matrix factorization technique that maps users into low-dimensional latent feature spaces in terms of their trust relationship, and aims to more accurately reflect the users reciprocal influence on the formation of their own opinions and to learn better preferential patterns of users for high-quality recommendations. We use four large-scale datasets to show that the proposed method performs much better, especially for cold start users, than state-of-the-art recommendation algorithms for social collaborative filtering based on trust.
An Optimization Code for Nonlinear Transient Problems of a Large Scale Multidisciplinary Mathematical Model

NASA Astrophysics Data System (ADS)

Takasaki, Koichi

This paper presents a program for the multidisciplinary optimization and identification problem of the nonlinear model of large aerospace vehicle structures. The program constructs the global matrix of the dynamic system in the time direction by the p-version finite element method (pFEM), and the basic matrix for each pFEM node in the time direction is described by a sparse matrix similarly to the static finite element problem. The algorithm used by the program does not require the Hessian matrix of the objective function and so has low memory requirements. It also has a relatively low computational cost, and is suited to parallel computation. The program was integrated as a solver module of the multidisciplinary analysis system CUMuLOUS (Computational Utility for Multidisciplinary Large scale Optimization of Undense System) which is under development by the Aerospace Research and Development Directorate (ARD) of the Japan Aerospace Exploration Agency (JAXA).
pySAPC, a python package for sparse affinity propagation clustering: Application to odontogenesis whole genome time series gene-expression data.

PubMed

Cao, Huojun; Amendt, Brad A

2016-11-01

Developmental dental anomalies are common forms of congenital defects. The molecular mechanisms of dental anomalies are poorly understood. Systematic approaches such as clustering genes based on similar expression patterns could identify novel genes involved in dental anomalies and provide a framework for understanding molecular regulatory mechanisms of these genes during tooth development (odontogenesis). A python package (pySAPC) of sparse affinity propagation clustering algorithm for large datasets was developed. Whole genome pair-wise similarity was calculated based on expression pattern similarity based on 45 microarrays of several stages during odontogenesis. pySAPC identified 743 gene clusters based on expression pattern similarity during mouse tooth development. Three clusters are significantly enriched for genes associated with dental anomalies (with FDR <0.1). The three clusters of genes have distinct expression patterns during odontogenesis. Clustering genes based on similar expression profiles recovered several known regulatory relationships for genes involved in odontogenesis, as well as many novel genes that may be involved with the same genetic pathways as genes that have already been shown to contribute to dental defects. By using sparse similarity matrix, pySAPC use much less memory and CPU time compared with the original affinity propagation program that uses a full similarity matrix. This python package will be useful for many applications where dataset(s) are too large to use full similarity matrix. This article is part of a Special Issue entitled "System Genetics" Guest Editor: Dr. Yudong Cai and Dr. Tao Huang. Copyright © 2016. Published by Elsevier B.V.
Compressed sensing for high-resolution nonlipid suppressed 1 H FID MRSI of the human brain at 9.4T.

PubMed

Nassirpour, Sahar; Chang, Paul; Avdievitch, Nikolai; Henning, Anke

2018-04-29

The aim of this study was to apply compressed sensing to accelerate the acquisition of high resolution metabolite maps of the human brain using a nonlipid suppressed ultra-short TR and TE 1 H FID MRSI sequence at 9.4T. X-t sparse compressed sensing reconstruction was optimized for nonlipid suppressed 1 H FID MRSI data. Coil-by-coil x-t sparse reconstruction was compared with SENSE x-t sparse and low rank reconstruction. The effect of matrix size and spatial resolution on the achievable acceleration factor was studied. Finally, in vivo metabolite maps with different acceleration factors of 2, 4, 5, and 10 were acquired and compared. Coil-by-coil x-t sparse compressed sensing reconstruction was not able to reliably recover the nonlipid suppressed data, rather a combination of parallel and sparse reconstruction was necessary (SENSE x-t sparse). For acceleration factors of up to 5, both the low-rank and the compressed sensing methods were able to reconstruct the data comparably well (root mean squared errors [RMSEs] ≤ 10.5% for Cre). However, the reconstruction time of the low rank algorithm was drastically longer than compressed sensing. Using the optimized compressed sensing reconstruction, acceleration factors of 4 or 5 could be reached for the MRSI data with a matrix size of 64 × 64. For lower spatial resolutions, an acceleration factor of up to R∼4 was successfully achieved. By tailoring the reconstruction scheme to the nonlipid suppressed data through parameter optimization and performance evaluation, we present high resolution (97 µL voxel size) accelerated in vivo metabolite maps of the human brain acquired at 9.4T within scan times of 3 to 3.75 min. © 2018 International Society for Magnetic Resonance in Medicine.
On the efficiency of a randomized mirror descent algorithm in online optimization problems

NASA Astrophysics Data System (ADS)

Gasnikov, A. V.; Nesterov, Yu. E.; Spokoiny, V. G.

2015-04-01

A randomized online version of the mirror descent method is proposed. It differs from the existing versions by the randomization method. Randomization is performed at the stage of the projection of a subgradient of the function being optimized onto the unit simplex rather than at the stage of the computation of a subgradient, which is common practice. As a result, a componentwise subgradient descent with a randomly chosen component is obtained, which admits an online interpretation. This observation, for example, has made it possible to uniformly interpret results on weighting expert decisions and propose the most efficient method for searching for an equilibrium in a zero-sum two-person matrix game with sparse matrix.
Three-Dimensional Inverse Transport Solver Based on Compressive Sensing Technique

NASA Astrophysics Data System (ADS)

Cheng, Yuxiong; Wu, Hongchun; Cao, Liangzhi; Zheng, Youqi

2013-09-01

According to the direct exposure measurements from flash radiographic image, a compressive sensing-based method for three-dimensional inverse transport problem is presented. The linear absorption coefficients and interface locations of objects are reconstructed directly at the same time. It is always very expensive to obtain enough measurements. With limited measurements, compressive sensing sparse reconstruction technique orthogonal matching pursuit is applied to obtain the sparse coefficients by solving an optimization problem. A three-dimensional inverse transport solver is developed based on a compressive sensing-based technique. There are three features in this solver: (1) AutoCAD is employed as a geometry preprocessor due to its powerful capacity in graphic. (2) The forward projection matrix rather than Gauss matrix is constructed by the visualization tool generator. (3) Fourier transform and Daubechies wavelet transform are adopted to convert an underdetermined system to a well-posed system in the algorithm. Simulations are performed and numerical results in pseudo-sine absorption problem, two-cube problem and two-cylinder problem when using compressive sensing-based solver agree well with the reference value.
Optimization of Sparse Matrix-Vector Multiplication on Emerging Multicore Platforms

DOE Office of Scientific and Technical Information (OSTI.GOV)

Williams, Samuel; Oliker, Leonid; Vuduc, Richard

2008-10-16

We are witnessing a dramatic change in computer architecture due to the multicore paradigm shift, as every electronic device from cell phones to supercomputers confronts parallelism of unprecedented scale. To fully unleash the potential of these systems, the HPC community must develop multicore specific-optimization methodologies for important scientific computations. In this work, we examine sparse matrix-vector multiply (SpMV) - one of the most heavily used kernels in scientific computing - across a broad spectrum of multicore designs. Our experimental platform includes the homogeneous AMD quad-core, AMD dual-core, and Intel quad-core designs, the heterogeneous STI Cell, as well as one ofmore » the first scientific studies of the highly multithreaded Sun Victoria Falls (a Niagara2 SMP). We present several optimization strategies especially effective for the multicore environment, and demonstrate significant performance improvements compared to existing state-of-the-art serial and parallel SpMV implementations. Additionally, we present key insights into the architectural trade-offs of leading multicore design strategies, in the context of demanding memory-bound numerical algorithms.« less

An efficient gridding reconstruction method for multishot non-Cartesian imaging with correction of off-resonance artifacts.

PubMed

Meng, Yuguang; Lei, Hao

2010-06-01

An efficient iterative gridding reconstruction method with correction of off-resonance artifacts was developed, which is especially tailored for multiple-shot non-Cartesian imaging. The novelty of the method lies in that the transformation matrix for gridding (T) was constructed as the convolution of two sparse matrices, among which the former is determined by the sampling interval and the spatial distribution of the off-resonance frequencies and the latter by the sampling trajectory and the target grid in the Cartesian space. The resulting T matrix is also sparse and can be solved efficiently with the iterative conjugate gradient algorithm. It was shown that, with the proposed method, the reconstruction speed in multiple-shot non-Cartesian imaging can be improved significantly while retaining high reconstruction fidelity. More important, the method proposed allows tradeoff between the accuracy and the computation time of reconstruction, making customization of the use of such a method in different applications possible. The performance of the proposed method was demonstrated by numerical simulation and multiple-shot spiral imaging on rat brain at 4.7 T. (c) 2010 Wiley-Liss, Inc.
Abnormal global and local event detection in compressive sensing domain

NASA Astrophysics Data System (ADS)

Wang, Tian; Qiao, Meina; Chen, Jie; Wang, Chuanyun; Zhang, Wenjia; Snoussi, Hichem

2018-05-01

Abnormal event detection, also known as anomaly detection, is one challenging task in security video surveillance. It is important to develop effective and robust movement representation models for global and local abnormal event detection to fight against factors such as occlusion and illumination change. In this paper, a new algorithm is proposed. It can locate the abnormal events on one frame, and detect the global abnormal frame. The proposed algorithm employs a sparse measurement matrix designed to represent the movement feature based on optical flow efficiently. Then, the abnormal detection mission is constructed as a one-class classification task via merely learning from the training normal samples. Experiments demonstrate that our algorithm performs well on the benchmark abnormal detection datasets against state-of-the-art methods.
Sparse nonnegative matrix factorization with ℓ0-constraints

PubMed Central

Peharz, Robert; Pernkopf, Franz

2012-01-01

Although nonnegative matrix factorization (NMF) favors a sparse and part-based representation of nonnegative data, there is no guarantee for this behavior. Several authors proposed NMF methods which enforce sparseness by constraining or penalizing the ℓ1-norm of the factor matrices. On the other hand, little work has been done using a more natural sparseness measure, the ℓ0-pseudo-norm. In this paper, we propose a framework for approximate NMF which constrains the ℓ0-norm of the basis matrix, or the coefficient matrix, respectively. For this purpose, techniques for unconstrained NMF can be easily incorporated, such as multiplicative update rules, or the alternating nonnegative least-squares scheme. In experiments we demonstrate the benefits of our methods, which compare to, or outperform existing approaches. PMID:22505792
Fast-SNP: a fast matrix pre-processing algorithm for efficient loopless flux optimization of metabolic models

PubMed Central

Saa, Pedro A.; Nielsen, Lars K.

2016-01-01

Motivation: Computation of steady-state flux solutions in large metabolic models is routinely performed using flux balance analysis based on a simple LP (Linear Programming) formulation. A minimal requirement for thermodynamic feasibility of the flux solution is the absence of internal loops, which are enforced using ‘loopless constraints’. The resulting loopless flux problem is a substantially harder MILP (Mixed Integer Linear Programming) problem, which is computationally expensive for large metabolic models. Results: We developed a pre-processing algorithm that significantly reduces the size of the original loopless problem into an easier and equivalent MILP problem. The pre-processing step employs a fast matrix sparsification algorithm—Fast- sparse null-space pursuit (SNP)—inspired by recent results on SNP. By finding a reduced feasible ‘loop-law’ matrix subject to known directionalities, Fast-SNP considerably improves the computational efficiency in several metabolic models running different loopless optimization problems. Furthermore, analysis of the topology encoded in the reduced loop matrix enabled identification of key directional constraints for the potential permanent elimination of infeasible loops in the underlying model. Overall, Fast-SNP is an effective and simple algorithm for efficient formulation of loop-law constraints, making loopless flux optimization feasible and numerically tractable at large scale. Availability and Implementation: Source code for MATLAB including examples is freely available for download at http://www.aibn.uq.edu.au/cssb-resources under Software. Optimization uses Gurobi, CPLEX or GLPK (the latter is included with the algorithm). Contact: lars.nielsen@uq.edu.au Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27559155
Nonnegative matrix factorization and sparse representation for the automated detection of periodic limb movements in sleep.

PubMed

Shokrollahi, Mehrnaz; Krishnan, Sridhar; Dopsa, Dustin D; Muir, Ryan T; Black, Sandra E; Swartz, Richard H; Murray, Brian J; Boulos, Mark I

2016-11-01

Stroke is a leading cause of death and disability in adults, and incurs a significant economic burden to society. Periodic limb movements (PLMs) in sleep are repetitive movements involving the great toe, ankle, and hip. Evolving evidence suggests that PLMs may be associated with high blood pressure and stroke, but this relationship remains underexplored. Several issues limit the study of PLMs including the need to manually score them, which is time-consuming and costly. For this reason, we developed a novel automated method for nocturnal PLM detection, which was shown to be correlated with (a) the manually scored PLM index on polysomnography, and (b) white matter hyperintensities on brain imaging, which have been demonstrated to be associated with PLMs. Our proposed algorithm consists of three main stages: (1) representing the signal in the time-frequency plane using time-frequency matrices (TFM), (2) applying K-nonnegative matrix factorization technique to decompose the TFM matrix into its significant components, and (3) applying kernel sparse representation for classification (KSRC) to the decomposed signal. Our approach was applied to a dataset that consisted of 65 subjects who underwent polysomnography. An overall classification of 97 % was achieved for discrimination of the aforementioned signals, demonstrating the potential of the presented method.
Deep Marginalized Sparse Denoising Auto-Encoder for Image Denoising

NASA Astrophysics Data System (ADS)

Ma, Hongqiang; Ma, Shiping; Xu, Yuelei; Zhu, Mingming

2018-01-01

Stacked Sparse Denoising Auto-Encoder (SSDA) has been successfully applied to image denoising. As a deep network, the SSDA network with powerful data feature learning ability is superior to the traditional image denoising algorithms. However, the algorithm has high computational complexity and slow convergence rate in the training. To address this limitation, we present a method of image denoising based on Deep Marginalized Sparse Denoising Auto-Encoder (DMSDA). The loss function of Sparse Denoising Auto-Encoder is marginalized so that it satisfies both sparseness and marginality. The experimental results show that the proposed algorithm can not only outperform SSDA in the convergence speed and training time, but also has better denoising performance than the current excellent denoising algorithms, including both the subjective and objective evaluation of image denoising.
Multiuser TOA Estimation Algorithm in DS-CDMA Sparse Channel for Radiolocation

NASA Astrophysics Data System (ADS)

Kim, Sunwoo

This letter considers multiuser time delay estimation in a sparse channel environment for radiolocation. The generalized successive interference cancellation (GSIC) algorithm is used to eliminate the multiple access interference (MAI). To adapt GSIC to sparse channels the alternating maximization (AM) algorithm is considered, and the continuous time delay of each path is estimated without requiring a priori known data sequences.
Ordering Unstructured Meshes for Sparse Matrix Computations on Leading Parallel Systems

NASA Technical Reports Server (NTRS)

Oliker, Leonid; Li, Xiaoye; Heber, Gerd; Biswas, Rupak

2000-01-01

The ability of computers to solve hitherto intractable problems and simulate complex processes using mathematical models makes them an indispensable part of modern science and engineering. Computer simulations of large-scale realistic applications usually require solving a set of non-linear partial differential equations (PDES) over a finite region. For example, one thrust area in the DOE Grand Challenge projects is to design future accelerators such as the SpaHation Neutron Source (SNS). Our colleagues at SLAC need to model complex RFQ cavities with large aspect ratios. Unstructured grids are currently used to resolve the small features in a large computational domain; dynamic mesh adaptation will be added in the future for additional efficiency. The PDEs for electromagnetics are discretized by the FEM method, which leads to a generalized eigenvalue problem Kx = AMx, where K and M are the stiffness and mass matrices, and are very sparse. In a typical cavity model, the number of degrees of freedom is about one million. For such large eigenproblems, direct solution techniques quickly reach the memory limits. Instead, the most widely-used methods are Krylov subspace methods, such as Lanczos or Jacobi-Davidson. In all the Krylov-based algorithms, sparse matrix-vector multiplication (SPMV) must be performed repeatedly. Therefore, the efficiency of SPMV usually determines the eigensolver speed. SPMV is also one of the most heavily used kernels in large-scale numerical simulations.
A deconvolution extraction method for 2D multi-object fibre spectroscopy based on the regularized least-squares QR-factorization algorithm

NASA Astrophysics Data System (ADS)

Yu, Jian; Yin, Qian; Guo, Ping; Luo, A.-li

2014-09-01

This paper presents an efficient method for the extraction of astronomical spectra from two-dimensional (2D) multifibre spectrographs based on the regularized least-squares QR-factorization (LSQR) algorithm. We address two issues: we propose a modified Gaussian point spread function (PSF) for modelling the 2D PSF from multi-emission-line gas-discharge lamp images (arc images), and we develop an efficient deconvolution method to extract spectra in real circumstances. The proposed modified 2D Gaussian PSF model can fit various types of 2D PSFs, including different radial distortion angles and ellipticities. We adopt the regularized LSQR algorithm to solve the sparse linear equations constructed from the sparse convolution matrix, which we designate the deconvolution spectrum extraction method. Furthermore, we implement a parallelized LSQR algorithm based on graphics processing unit programming in the Compute Unified Device Architecture to accelerate the computational processing. Experimental results illustrate that the proposed extraction method can greatly reduce the computational cost and memory use of the deconvolution method and, consequently, increase its efficiency and practicability. In addition, the proposed extraction method has a stronger noise tolerance than other methods, such as the boxcar (aperture) extraction and profile extraction methods. Finally, we present an analysis of the sensitivity of the extraction results to the radius and full width at half-maximum of the 2D PSF.
Paradeisos: A perfect hashing algorithm for many-body eigenvalue problems

NASA Astrophysics Data System (ADS)

Jia, C. J.; Wang, Y.; Mendl, C. B.; Moritz, B.; Devereaux, T. P.

2018-03-01

We describe an essentially perfect hashing algorithm for calculating the position of an element in an ordered list, appropriate for the construction and manipulation of many-body Hamiltonian, sparse matrices. Each element of the list corresponds to an integer value whose binary representation reflects the occupation of single-particle basis states for each element in the many-body Hilbert space. The algorithm replaces conventional methods, such as binary search, for locating the elements of the ordered list, eliminating the need to store the integer representation for each element, without increasing the computational complexity. Combined with the "checkerboard" decomposition of the Hamiltonian matrix for distribution over parallel computing environments, this leads to a substantial savings in aggregate memory. While the algorithm can be applied broadly to many-body, correlated problems, we demonstrate its utility in reducing total memory consumption for a series of fermionic single-band Hubbard model calculations on small clusters with progressively larger Hilbert space dimension.
Remote sensing image stitch using modified structure deformation

NASA Astrophysics Data System (ADS)

Pan, Ke-cheng; Chen, Jin-wei; Chen, Yueting; Feng, Huajun

2012-10-01

To stitch remote sensing images seamlessly without producing visual artifact which is caused by severe intensity discrepancy and structure misalignment, we modify the original structure deformation based stitching algorithm which have two main problems: Firstly, using Poisson equation to propagate deformation vectors leads to the change of the topological relationship between the key points and their surrounding pixels, which may bring in wrong image characteristics. Secondly, the diffusion area of the sparse matrix is too limited to rectify the global intensity discrepancy. To solve the first problem, we adopt Spring-Mass model and bring in external force to keep the topological relationship between key points and their surrounding pixels. We also apply tensor voting algorithm to achieve the global intensity corresponding curve of the two images to solve the second problem. Both simulated and experimental results show that our algorithm is faster and can reach better result than the original algorithm.
On Parallel Push-Relabel based Algorithms for Bipartite Maximum Matching

DOE Office of Scientific and Technical Information (OSTI.GOV)

Langguth, Johannes; Azad, Md Ariful; Halappanavar, Mahantesh

2014-07-01

We study multithreaded push-relabel based algorithms for computing maximum cardinality matching in bipartite graphs. Matching is a fundamental combinatorial (graph) problem with applications in a wide variety of problems in science and engineering. We are motivated by its use in the context of sparse linear solvers for computing maximum transversal of a matrix. We implement and test our algorithms on several multi-socket multicore systems and compare their performance to state-of-the-art augmenting path-based serial and parallel algorithms using a testset comprised of a wide range of real-world instances. Building on several heuristics for enhancing performance, we demonstrate good scaling for themore » parallel push-relabel algorithm. We show that it is comparable to the best augmenting path-based algorithms for bipartite matching. To the best of our knowledge, this is the first extensive study of multithreaded push-relabel based algorithms. In addition to a direct impact on the applications using matching, the proposed algorithmic techniques can be extended to preflow-push based algorithms for computing maximum flow in graphs.« less
A Note on Substructuring Preconditioning for Nonconforming Finite Element Approximations of Second Order Elliptic Problems

NASA Technical Reports Server (NTRS)

Maliassov, Serguei

1996-01-01

In this paper an algebraic substructuring preconditioner is considered for nonconforming finite element approximations of second order elliptic problems in 3D domains with a piecewise constant diffusion coefficient. Using a substructuring idea and a block Gauss elimination, part of the unknowns is eliminated and the Schur complement obtained is preconditioned by a spectrally equivalent very sparse matrix. In the case of quasiuniform tetrahedral mesh an appropriate algebraic multigrid solver can be used to solve the problem with this matrix. Explicit estimates of condition numbers and implementation algorithms are established for the constructed preconditioner. It is shown that the condition number of the preconditioned matrix does not depend on either the mesh step size or the jump of the coefficient. Finally, numerical experiments are presented to illustrate the theory being developed.
Comparing implementations of penalized weighted least-squares sinogram restoration

PubMed Central

Forthmann, Peter; Koehler, Thomas; Defrise, Michel; La Riviere, Patrick

2010-01-01

Purpose: A CT scanner measures the energy that is deposited in each channel of a detector array by x rays that have been partially absorbed on their way through the object. The measurement process is complex and quantitative measurements are always and inevitably associated with errors, so CT data must be preprocessed prior to reconstruction. In recent years, the authors have formulated CT sinogram preprocessing as a statistical restoration problem in which the goal is to obtain the best estimate of the line integrals needed for reconstruction from the set of noisy, degraded measurements. The authors have explored both penalized Poisson likelihood (PL) and penalized weighted least-squares (PWLS) objective functions. At low doses, the authors found that the PL approach outperforms PWLS in terms of resolution-noise tradeoffs, but at standard doses they perform similarly. The PWLS objective function, being quadratic, is more amenable to computational acceleration than the PL objective. In this work, the authors develop and compare two different methods for implementing PWLS sinogram restoration with the hope of improving computational performance relative to PL in the standard-dose regime. Sinogram restoration is still significant in the standard-dose regime since it can still outperform standard approaches and it allows for correction of effects that are not usually modeled in standard CT preprocessing. Methods: The authors have explored and compared two implementation strategies for PWLS sinogram restoration: (1) A direct matrix-inversion strategy based on the closed-form solution to the PWLS optimization problem and (2) an iterative approach based on the conjugate-gradient algorithm. Obtaining optimal performance from each strategy required modifying the naive off-the-shelf implementations of the algorithms to exploit the particular symmetry and sparseness of the sinogram-restoration problem. For the closed-form approach, the authors subdivided the large matrix inversion into smaller coupled problems and exploited sparseness to minimize matrix operations. For the conjugate-gradient approach, the authors exploited sparseness and preconditioned the problem to speed up convergence. Results: All methods produced qualitatively and quantitatively similar images as measured by resolution-variance tradeoffs and difference images. Despite the acceleration strategies, the direct matrix-inversion approach was found to be uncompetitive with iterative approaches, with a computational burden higher by an order of magnitude or more. The iterative conjugate-gradient approach, however, does appear promising, with computation times half that of the authors’ previous penalized-likelihood implementation. Conclusions: Iterative conjugate-gradient based PWLS sinogram restoration with careful matrix optimizations has computational advantages over direct matrix PWLS inversion and over penalized-likelihood sinogram restoration and can be considered a good alternative in standard-dose regimes. PMID:21158306
Single and Multiple Object Tracking Using a Multi-Feature Joint Sparse Representation.

PubMed

Hu, Weiming; Li, Wei; Zhang, Xiaoqin; Maybank, Stephen

2015-04-01

In this paper, we propose a tracking algorithm based on a multi-feature joint sparse representation. The templates for the sparse representation can include pixel values, textures, and edges. In the multi-feature joint optimization, noise or occlusion is dealt with using a set of trivial templates. A sparse weight constraint is introduced to dynamically select the relevant templates from the full set of templates. A variance ratio measure is adopted to adaptively adjust the weights of different features. The multi-feature template set is updated adaptively. We further propose an algorithm for tracking multi-objects with occlusion handling based on the multi-feature joint sparse reconstruction. The observation model based on sparse reconstruction automatically focuses on the visible parts of an occluded object by using the information in the trivial templates. The multi-object tracking is simplified into a joint Bayesian inference. The experimental results show the superiority of our algorithm over several state-of-the-art tracking algorithms.
An analysis dictionary learning algorithm under a noisy data model with orthogonality constraint.

PubMed

Zhang, Ye; Yu, Tenglong; Wang, Wenwu

2014-01-01

Two common problems are often encountered in analysis dictionary learning (ADL) algorithms. The first one is that the original clean signals for learning the dictionary are assumed to be known, which otherwise need to be estimated from noisy measurements. This, however, renders a computationally slow optimization process and potentially unreliable estimation (if the noise level is high), as represented by the Analysis K-SVD (AK-SVD) algorithm. The other problem is the trivial solution to the dictionary, for example, the null dictionary matrix that may be given by a dictionary learning algorithm, as discussed in the learning overcomplete sparsifying transform (LOST) algorithm. Here we propose a novel optimization model and an iterative algorithm to learn the analysis dictionary, where we directly employ the observed data to compute the approximate analysis sparse representation of the original signals (leading to a fast optimization procedure) and enforce an orthogonality constraint on the optimization criterion to avoid the trivial solutions. Experiments demonstrate the competitive performance of the proposed algorithm as compared with three baselines, namely, the AK-SVD, LOST, and NAAOLA algorithms.
Smooth Approximation l 0-Norm Constrained Affine Projection Algorithm and Its Applications in Sparse Channel Estimation

PubMed Central

2014-01-01

We propose a smooth approximation l 0-norm constrained affine projection algorithm (SL0-APA) to improve the convergence speed and the steady-state error of affine projection algorithm (APA) for sparse channel estimation. The proposed algorithm ensures improved performance in terms of the convergence speed and the steady-state error via the combination of a smooth approximation l 0-norm (SL0) penalty on the coefficients into the standard APA cost function, which gives rise to a zero attractor that promotes the sparsity of the channel taps in the channel estimation and hence accelerates the convergence speed and reduces the steady-state error when the channel is sparse. The simulation results demonstrate that our proposed SL0-APA is superior to the standard APA and its sparsity-aware algorithms in terms of both the convergence speed and the steady-state behavior in a designated sparse channel. Furthermore, SL0-APA is shown to have smaller steady-state error than the previously proposed sparsity-aware algorithms when the number of nonzero taps in the sparse channel increases. PMID:24790588
High efficient optical remote sensing images acquisition for nano-satellite: reconstruction algorithms

NASA Astrophysics Data System (ADS)

Liu, Yang; Li, Feng; Xin, Lei; Fu, Jie; Huang, Puming

2017-10-01

Large amount of data is one of the most obvious features in satellite based remote sensing systems, which is also a burden for data processing and transmission. The theory of compressive sensing(CS) has been proposed for almost a decade, and massive experiments show that CS has favorable performance in data compression and recovery, so we apply CS theory to remote sensing images acquisition. In CS, the construction of classical sensing matrix for all sparse signals has to satisfy the Restricted Isometry Property (RIP) strictly, which limits applying CS in practical in image compression. While for remote sensing images, we know some inherent characteristics such as non-negative, smoothness and etc.. Therefore, the goal of this paper is to present a novel measurement matrix that breaks RIP. The new sensing matrix consists of two parts: the standard Nyquist sampling matrix for thumbnails and the conventional CS sampling matrix. Since most of sun-synchronous based satellites fly around the earth 90 minutes and the revisit cycle is also short, lots of previously captured remote sensing images of the same place are available in advance. This drives us to reconstruct remote sensing images through a deep learning approach with those measurements from the new framework. Therefore, we propose a novel deep convolutional neural network (CNN) architecture which takes in undersampsing measurements as input and outputs an intermediate reconstruction image. It is well known that the training procedure to the network costs long time, luckily, the training step can be done only once, which makes the approach attractive for a host of sparse recovery problems.
Multi-stage classification method oriented to aerial image based on low-rank recovery and multi-feature fusion sparse representation.

PubMed

Ma, Xu; Cheng, Yongmei; Hao, Shuai

2016-12-10

Automatic classification of terrain surfaces from an aerial image is essential for an autonomous unmanned aerial vehicle (UAV) landing at an unprepared site by using vision. Diverse terrain surfaces may show similar spectral properties due to the illumination and noise that easily cause poor classification performance. To address this issue, a multi-stage classification algorithm based on low-rank recovery and multi-feature fusion sparse representation is proposed. First, color moments and Gabor texture feature are extracted from training data and stacked as column vectors of a dictionary. Then we perform low-rank matrix recovery for the dictionary by using augmented Lagrange multipliers and construct a multi-stage terrain classifier. Experimental results on an aerial map database that we prepared verify the classification accuracy and robustness of the proposed method.
Parallel Conjugate Gradient: Effects of Ordering Strategies, Programming Paradigms, and Architectural Platforms

NASA Technical Reports Server (NTRS)

Oliker, Leonid; Heber, Gerd; Biswas, Rupak

2000-01-01

The Conjugate Gradient (CG) algorithm is perhaps the best-known iterative technique to solve sparse linear systems that are symmetric and positive definite. A sparse matrix-vector multiply (SPMV) usually accounts for most of the floating-point operations within a CG iteration. In this paper, we investigate the effects of various ordering and partitioning strategies on the performance of parallel CG and SPMV using different programming paradigms and architectures. Results show that for this class of applications, ordering significantly improves overall performance, that cache reuse may be more important than reducing communication, and that it is possible to achieve message passing performance using shared memory constructs through careful data ordering and distribution. However, a multi-threaded implementation of CG on the Tera MTA does not require special ordering or partitioning to obtain high efficiency and scalability.

PyVCI: A flexible open-source code for calculating accurate molecular infrared spectra

NASA Astrophysics Data System (ADS)

Sibaev, Marat; Crittenden, Deborah L.

2016-06-01

The PyVCI program package is a general purpose open-source code for simulating accurate molecular spectra, based upon force field expansions of the potential energy surface in normal mode coordinates. It includes harmonic normal coordinate analysis and vibrational configuration interaction (VCI) algorithms, implemented primarily in Python for accessibility but with time-consuming routines written in C. Coriolis coupling terms may be optionally included in the vibrational Hamiltonian. Non-negligible VCI matrix elements are stored in sparse matrix format to alleviate the diagonalization problem. CPU and memory requirements may be further controlled by algorithmic choices and/or numerical screening procedures, and recommended values are established by benchmarking using a test set of 44 molecules for which accurate analytical potential energy surfaces are available. Force fields in normal mode coordinates are obtained from the PyPES library of high quality analytical potential energy surfaces (to 6th order) or by numerical differentiation of analytic second derivatives generated using the GAMESS quantum chemical program package (to 4th order).
Method and apparatus for optimized processing of sparse matrices

DOEpatents

Taylor, Valerie E.

1993-01-01

A computer architecture for processing a sparse matrix is disclosed. The apparatus stores a value-row vector corresponding to nonzero values of a sparse matrix. Each of the nonzero values is located at a defined row and column position in the matrix. The value-row vector includes a first vector including nonzero values and delimiting characters indicating a transition from one column to another. The value-row vector also includes a second vector which defines row position values in the matrix corresponding to the nonzero values in the first vector and column position values in the matrix corresponding to the column position of the nonzero values in the first vector. The architecture also includes a circuit for detecting a special character within the value-row vector. Matrix-vector multiplication is executed on the value-row vector. This multiplication is performed by multiplying an index value of the first vector value by a column value from a second matrix to form a matrix-vector product which is added to a previous matrix-vector product.
Learning Low-Rank Class-Specific Dictionary and Sparse Intra-Class Variant Dictionary for Face Recognition.

PubMed

Tang, Xin; Feng, Guo-Can; Li, Xiao-Xin; Cai, Jia-Xin

2015-01-01

Face recognition is challenging especially when the images from different persons are similar to each other due to variations in illumination, expression, and occlusion. If we have sufficient training images of each person which can span the facial variations of that person under testing conditions, sparse representation based classification (SRC) achieves very promising results. However, in many applications, face recognition often encounters the small sample size problem arising from the small number of available training images for each person. In this paper, we present a novel face recognition framework by utilizing low-rank and sparse error matrix decomposition, and sparse coding techniques (LRSE+SC). Firstly, the low-rank matrix recovery technique is applied to decompose the face images per class into a low-rank matrix and a sparse error matrix. The low-rank matrix of each individual is a class-specific dictionary and it captures the discriminative feature of this individual. The sparse error matrix represents the intra-class variations, such as illumination, expression changes. Secondly, we combine the low-rank part (representative basis) of each person into a supervised dictionary and integrate all the sparse error matrix of each individual into a within-individual variant dictionary which can be applied to represent the possible variations between the testing and training images. Then these two dictionaries are used to code the query image. The within-individual variant dictionary can be shared by all the subjects and only contribute to explain the lighting conditions, expressions, and occlusions of the query image rather than discrimination. At last, a reconstruction-based scheme is adopted for face recognition. Since the within-individual dictionary is introduced, LRSE+SC can handle the problem of the corrupted training data and the situation that not all subjects have enough samples for training. Experimental results show that our method achieves the state-of-the-art results on AR, FERET, FRGC and LFW databases.
Learning Low-Rank Class-Specific Dictionary and Sparse Intra-Class Variant Dictionary for Face Recognition

PubMed Central

Tang, Xin; Feng, Guo-can; Li, Xiao-xin; Cai, Jia-xin

2015-01-01

Face recognition is challenging especially when the images from different persons are similar to each other due to variations in illumination, expression, and occlusion. If we have sufficient training images of each person which can span the facial variations of that person under testing conditions, sparse representation based classification (SRC) achieves very promising results. However, in many applications, face recognition often encounters the small sample size problem arising from the small number of available training images for each person. In this paper, we present a novel face recognition framework by utilizing low-rank and sparse error matrix decomposition, and sparse coding techniques (LRSE+SC). Firstly, the low-rank matrix recovery technique is applied to decompose the face images per class into a low-rank matrix and a sparse error matrix. The low-rank matrix of each individual is a class-specific dictionary and it captures the discriminative feature of this individual. The sparse error matrix represents the intra-class variations, such as illumination, expression changes. Secondly, we combine the low-rank part (representative basis) of each person into a supervised dictionary and integrate all the sparse error matrix of each individual into a within-individual variant dictionary which can be applied to represent the possible variations between the testing and training images. Then these two dictionaries are used to code the query image. The within-individual variant dictionary can be shared by all the subjects and only contribute to explain the lighting conditions, expressions, and occlusions of the query image rather than discrimination. At last, a reconstruction-based scheme is adopted for face recognition. Since the within-individual dictionary is introduced, LRSE+SC can handle the problem of the corrupted training data and the situation that not all subjects have enough samples for training. Experimental results show that our method achieves the state-of-the-art results on AR, FERET, FRGC and LFW databases. PMID:26571112
Dwell time algorithm based on the optimization theory for magnetorheological finishing

NASA Astrophysics Data System (ADS)

Zhang, Yunfei; Wang, Yang; Wang, Yajun; He, Jianguo; Ji, Fang; Huang, Wen

2010-10-01

Magnetorheological finishing (MRF) is an advanced polishing technique capable of rapidly converging to the required surface figure. This process can deterministically control the amount of the material removed by varying a time to dwell at each particular position on the workpiece surface. The dwell time algorithm is one of the most important key techniques of the MRF. A dwell time algorithm based on the1 matrix equation and optimization theory was presented in this paper. The conventional mathematical model of the dwell time was transferred to a matrix equation containing initial surface error, removal function and dwell time function. The dwell time to be calculated was just the solution to the large, sparse matrix equation. A new mathematical model of the dwell time based on the optimization theory was established, which aims to minimize the 2-norm or ∞-norm of the residual surface error. The solution meets almost all the requirements of precise computer numerical control (CNC) without any need for extra data processing, because this optimization model has taken some polishing condition as the constraints. Practical approaches to finding a minimal least-squares solution and a minimal maximum solution are also discussed in this paper. Simulations have shown that the proposed algorithm is numerically robust and reliable. With this algorithm an experiment has been performed on the MRF machine developed by ourselves. After 4.7 minutes' polishing, the figure error of a flat workpiece with a 50 mm diameter is improved by PV from 0.191λ(λ = 632.8 nm) to 0.087λ and RMS 0.041λ to 0.010λ. This algorithm can be constructed to polish workpieces of all shapes including flats, spheres, aspheres, and prisms, and it is capable of improving the polishing figures dramatically.
Viewing Angle Classification of Cryo-Electron Microscopy Images Using Eigenvectors

PubMed Central

Singer, A.; Zhao, Z.; Shkolnisky, Y.; Hadani, R.

2012-01-01

The cryo-electron microscopy (cryo-EM) reconstruction problem is to find the three-dimensional structure of a macromolecule given noisy versions of its two-dimensional projection images at unknown random directions. We introduce a new algorithm for identifying noisy cryo-EM images of nearby viewing angles. This identification is an important first step in three-dimensional structure determination of macromolecules from cryo-EM, because once identified, these images can be rotationally aligned and averaged to produce “class averages” of better quality. The main advantage of our algorithm is its extreme robustness to noise. The algorithm is also very efficient in terms of running time and memory requirements, because it is based on the computation of the top few eigenvectors of a specially designed sparse Hermitian matrix. These advantages are demonstrated in numerous numerical experiments. PMID:22506089
Addressing the computational cost of large EIT solutions.

PubMed

Boyle, Alistair; Borsic, Andrea; Adler, Andy

2012-05-01

Electrical impedance tomography (EIT) is a soft field tomography modality based on the application of electric current to a body and measurement of voltages through electrodes at the boundary. The interior conductivity is reconstructed on a discrete representation of the domain using a finite-element method (FEM) mesh and a parametrization of that domain. The reconstruction requires a sequence of numerically intensive calculations. There is strong interest in reducing the cost of these calculations. An improvement in the compute time for current problems would encourage further exploration of computationally challenging problems such as the incorporation of time series data, wide-spread adoption of three-dimensional simulations and correlation of other modalities such as CT and ultrasound. Multicore processors offer an opportunity to reduce EIT computation times but may require some restructuring of the underlying algorithms to maximize the use of available resources. This work profiles two EIT software packages (EIDORS and NDRM) to experimentally determine where the computational costs arise in EIT as problems scale. Sparse matrix solvers, a key component for the FEM forward problem and sensitivity estimates in the inverse problem, are shown to take a considerable portion of the total compute time in these packages. A sparse matrix solver performance measurement tool, Meagre-Crowd, is developed to interface with a variety of solvers and compare their performance over a range of two- and three-dimensional problems of increasing node density. Results show that distributed sparse matrix solvers that operate on multiple cores are advantageous up to a limit that increases as the node density increases. We recommend a selection procedure to find a solver and hardware arrangement matched to the problem and provide guidance and tools to perform that selection.
Implementation of the Iterative Proportion Fitting Algorithm for Geostatistical Facies Modeling

DOE Office of Scientific and Technical Information (OSTI.GOV)

Li Yupeng, E-mail: yupeng@ualberta.ca; Deutsch, Clayton V.

2012-06-15

In geostatistics, most stochastic algorithm for simulation of categorical variables such as facies or rock types require a conditional probability distribution. The multivariate probability distribution of all the grouped locations including the unsampled location permits calculation of the conditional probability directly based on its definition. In this article, the iterative proportion fitting (IPF) algorithm is implemented to infer this multivariate probability. Using the IPF algorithm, the multivariate probability is obtained by iterative modification to an initial estimated multivariate probability using lower order bivariate probabilities as constraints. The imposed bivariate marginal probabilities are inferred from profiles along drill holes or wells.more » In the IPF process, a sparse matrix is used to calculate the marginal probabilities from the multivariate probability, which makes the iterative fitting more tractable and practical. This algorithm can be extended to higher order marginal probability constraints as used in multiple point statistics. The theoretical framework is developed and illustrated with estimation and simulation example.« less
HPC-NMF: A High-Performance Parallel Algorithm for Nonnegative Matrix Factorization

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kannan, Ramakrishnan; Sukumar, Sreenivas R.; Ballard, Grey M.

NMF is a useful tool for many applications in different domains such as topic modeling in text mining, background separation in video analysis, and community detection in social networks. Despite its popularity in the data mining community, there is a lack of efficient distributed algorithms to solve the problem for big data sets. We propose a high-performance distributed-memory parallel algorithm that computes the factorization by iteratively solving alternating non-negative least squares (NLS) subproblems formore » $$\\WW$$ and $$\\HH$$. It maintains the data and factor matrices in memory (distributed across processors), uses MPI for interprocessor communication, and, in the dense case, provably minimizes communication costs (under mild assumptions). As opposed to previous implementation, our algorithm is also flexible: It performs well for both dense and sparse matrices, and allows the user to choose any one of the multiple algorithms for solving the updates to low rank factors $$\\WW$$ and $$\\HH$$ within the alternating iterations.« less
Kernel Recursive Least-Squares Temporal Difference Algorithms with Sparsification and Regularization

PubMed Central

Zhu, Qingxin; Niu, Xinzheng

2016-01-01

By combining with sparse kernel methods, least-squares temporal difference (LSTD) algorithms can construct the feature dictionary automatically and obtain a better generalization ability. However, the previous kernel-based LSTD algorithms do not consider regularization and their sparsification processes are batch or offline, which hinder their widespread applications in online learning problems. In this paper, we combine the following five techniques and propose two novel kernel recursive LSTD algorithms: (i) online sparsification, which can cope with unknown state regions and be used for online learning, (ii) L 2 and L 1 regularization, which can avoid overfitting and eliminate the influence of noise, (iii) recursive least squares, which can eliminate matrix-inversion operations and reduce computational complexity, (iv) a sliding-window approach, which can avoid caching all history samples and reduce the computational cost, and (v) the fixed-point subiteration and online pruning, which can make L 1 regularization easy to implement. Finally, simulation results on two 50-state chain problems demonstrate the effectiveness of our algorithms. PMID:27436996
Kernel Recursive Least-Squares Temporal Difference Algorithms with Sparsification and Regularization.

PubMed

Zhang, Chunyuan; Zhu, Qingxin; Niu, Xinzheng

2016-01-01

By combining with sparse kernel methods, least-squares temporal difference (LSTD) algorithms can construct the feature dictionary automatically and obtain a better generalization ability. However, the previous kernel-based LSTD algorithms do not consider regularization and their sparsification processes are batch or offline, which hinder their widespread applications in online learning problems. In this paper, we combine the following five techniques and propose two novel kernel recursive LSTD algorithms: (i) online sparsification, which can cope with unknown state regions and be used for online learning, (ii) L 2 and L 1 regularization, which can avoid overfitting and eliminate the influence of noise, (iii) recursive least squares, which can eliminate matrix-inversion operations and reduce computational complexity, (iv) a sliding-window approach, which can avoid caching all history samples and reduce the computational cost, and (v) the fixed-point subiteration and online pruning, which can make L 1 regularization easy to implement. Finally, simulation results on two 50-state chain problems demonstrate the effectiveness of our algorithms.
Ontology Sparse Vector Learning Algorithm for Ontology Similarity Measuring and Ontology Mapping via ADAL Technology

NASA Astrophysics Data System (ADS)

Gao, Wei; Zhu, Linli; Wang, Kaiyun

2015-12-01

Ontology, a model of knowledge representation and storage, has had extensive applications in pharmaceutics, social science, chemistry and biology. In the age of “big data”, the constructed concepts are often represented as higher-dimensional data by scholars, and thus the sparse learning techniques are introduced into ontology algorithms. In this paper, based on the alternating direction augmented Lagrangian method, we present an ontology optimization algorithm for ontological sparse vector learning, and a fast version of such ontology technologies. The optimal sparse vector is obtained by an iterative procedure, and the ontology function is then obtained from the sparse vector. Four simulation experiments show that our ontological sparse vector learning model has a higher precision ratio on plant ontology, humanoid robotics ontology, biology ontology and physics education ontology data for similarity measuring and ontology mapping applications.
Reconstructing cortical current density by exploring sparseness in the transform domain

NASA Astrophysics Data System (ADS)

Ding, Lei

2009-05-01

In the present study, we have developed a novel electromagnetic source imaging approach to reconstruct extended cortical sources by means of cortical current density (CCD) modeling and a novel EEG imaging algorithm which explores sparseness in cortical source representations through the use of L1-norm in objective functions. The new sparse cortical current density (SCCD) imaging algorithm is unique since it reconstructs cortical sources by attaining sparseness in a transform domain (the variation map of cortical source distributions). While large variations are expected to occur along boundaries (sparseness) between active and inactive cortical regions, cortical sources can be reconstructed and their spatial extents can be estimated by locating these boundaries. We studied the SCCD algorithm using numerous simulations to investigate its capability in reconstructing cortical sources with different extents and in reconstructing multiple cortical sources with different extent contrasts. The SCCD algorithm was compared with two L2-norm solutions, i.e. weighted minimum norm estimate (wMNE) and cortical LORETA. Our simulation data from the comparison study show that the proposed sparse source imaging algorithm is able to accurately and efficiently recover extended cortical sources and is promising to provide high-accuracy estimation of cortical source extents.
FPGA architecture and implementation of sparse matrix vector multiplication for the finite element method

NASA Astrophysics Data System (ADS)

Elkurdi, Yousef; Fernández, David; Souleimanov, Evgueni; Giannacopoulos, Dennis; Gross, Warren J.

2008-04-01

The Finite Element Method (FEM) is a computationally intensive scientific and engineering analysis tool that has diverse applications ranging from structural engineering to electromagnetic simulation. The trends in floating-point performance are moving in favor of Field-Programmable Gate Arrays (FPGAs), hence increasing interest has grown in the scientific community to exploit this technology. We present an architecture and implementation of an FPGA-based sparse matrix-vector multiplier (SMVM) for use in the iterative solution of large, sparse systems of equations arising from FEM applications. FEM matrices display specific sparsity patterns that can be exploited to improve the efficiency of hardware designs. Our architecture exploits FEM matrix sparsity structure to achieve a balance between performance and hardware resource requirements by relying on external SDRAM for data storage while utilizing the FPGAs computational resources in a stream-through systolic approach. The architecture is based on a pipelined linear array of processing elements (PEs) coupled with a hardware-oriented matrix striping algorithm and a partitioning scheme which enables it to process arbitrarily big matrices without changing the number of PEs in the architecture. Therefore, this architecture is only limited by the amount of external RAM available to the FPGA. The implemented SMVM-pipeline prototype contains 8 PEs and is clocked at 110 MHz obtaining a peak performance of 1.76 GFLOPS. For 8 GB/s of memory bandwidth typical of recent FPGA systems, this architecture can achieve 1.5 GFLOPS sustained performance. Using multiple instances of the pipeline, linear scaling of the peak and sustained performance can be achieved. Our stream-through architecture provides the added advantage of enabling an iterative implementation of the SMVM computation required by iterative solution techniques such as the conjugate gradient method, avoiding initialization time due to data loading and setup inside the FPGA internal memory.
Practical Sub-Nyquist Sampling via Array-Based Compressed Sensing Receiver Architecture

DTIC Science & Technology

2016-07-10

different array ele- ments at different sub-Nyquist sampling rates. Signal processing inspired by the sparse fast Fourier transform allows for signal...reconstruction algorithms can be computationally demanding (REF). The related sparse Fourier transform algorithms aim to reduce the processing time nec- essary to...compute the DFT of frequency-sparse signals [7]. In particular, the sparse fast Fourier transform (sFFT) achieves processing time better than the
FINAL REPORT (MILESTONE DATE 9/30/11) FOR SUBCONTRACT NO. B594099 NUMERICAL METHODS FOR LARGE-SCALE DATA FACTORIZATION

DOE Office of Scientific and Technical Information (OSTI.GOV)

De Sterck, H

2011-10-18

The following work has been performed by PI Hans De Sterck and graduate student Manda Winlaw for the required tasks 1-5 (as listed in the Statement of Work). Graduate student Manda Winlaw has visited LLNL January 31-March 11, 2011 and May 23-August 19, 2010, working with Van Henson and Mike O'Hara on non-negative matrix factorizations (NMF). She has investigated the dense subgraph clustering algorithm from 'Finding Dense Subgraphs for Sparse Undirected, Directed, and Bipartite Graphs' by Chen and Saad, testing this method on several term-document matrices and adapting it to cluster based on the rank of the subgraphs instead ofmore » the density. Manda Winlaw was awarded a first prize in the annual LLNL summer student poster competition for a poster on her NMF research. PI Hans De Sterck has developed a new adaptive algebraic multigrid algorithm for computing a few dominant or minimal singular triplets of sparse rectangular matrices. This work builds on adaptive algebraic multigrid methods that were further developed by the PI and collaborators (including Sanders and Henson) for Markov chains. The method also combines and extends existing multigrid algorithms for the symmetric eigenproblem. The PI has visited LLNL February 22-25, 2011, and has given a CASC seminar 'Algebraic Multigrid for the Singular Value Problem' on this work on February 23, 2011. During his visit, he has discussed this work and related topics with Van Henson, Geoffrey Sanders, Panayot Vassilevski, and others. He has tested the algorithm on PDE matrices and on a term-document matrix, with promising initial results. Manda Winlaw has also started to work, with O'Hara, on estimating probability distributions over undirected graph edges. The goal is to estimate probabilistic models from sets of undirected graph edges for the purpose of prediction, anomaly detection and support to supervised learning. Graduate student Manda Winlaw is writing a paper on the results obtained with O'Hara which will be submitted some time later in 2011 to a data mining conference. PI Hans De Sterck has developed a new optimization algorithm for canonical tensor approximation, formulating an extension of the nonlinear GMRES method to optimization problems. Numerical results for tensors with up to 8 modes show that this new method is efficient for sparse and dense tensors. He has written a paper on this which has been submitted to the SIAM Journal on Scientific Computing. PI Hans De Sterck has further developed his new optimization algorithm for canonical tensor approximation, formulating an extension in terms of steepest-descent preconditioning, which makes the approach generally applicable for nonlinear optimization. He has written a paper on this extension which has been submitted to Numerical Linear Algebra with Applications.« less
On efficient randomized algorithms for finding the PageRank vector

NASA Astrophysics Data System (ADS)

Gasnikov, A. V.; Dmitriev, D. Yu.

2015-03-01

Two randomized methods are considered for finding the PageRank vector; in other words, the solution of the system p T = p T P with a stochastic n × n matrix P, where n ˜ 107-109, is sought (in the class of probability distributions) with accuracy ɛ: ɛ ≫ n -1. Thus, the possibility of brute-force multiplication of P by the column is ruled out in the case of dense objects. The first method is based on the idea of Markov chain Monte Carlo algorithms. This approach is efficient when the iterative process p {/t+1 T} = p {/t T} P quickly reaches a steady state. Additionally, it takes into account another specific feature of P, namely, the nonzero off-diagonal elements of P are equal in rows (this property is used to organize a random walk over the graph with the matrix P). Based on modern concentration-of-measure inequalities, new bounds for the running time of this method are presented that take into account the specific features of P. In the second method, the search for a ranking vector is reduced to finding the equilibrium in the antagonistic matrix game where S n (1) is a unit simplex in ℝ n and I is the identity matrix. The arising problem is solved by applying a slightly modified Grigoriadis-Khachiyan algorithm (1995). This technique, like the Nazin-Polyak method (2009), is a randomized version of Nemirovski's mirror descent method. The difference is that randomization in the Grigoriadis-Khachiyan algorithm is used when the gradient is projected onto the simplex rather than when the stochastic gradient is computed. For sparse matrices P, the method proposed yields noticeably better results.
SAMBA: Sparse Approximation of Moment-Based Arbitrary Polynomial Chaos

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ahlfeld, R., E-mail: r.ahlfeld14@imperial.ac.uk; Belkouchi, B.; Montomoli, F.

2016-09-01

A new arbitrary Polynomial Chaos (aPC) method is presented for moderately high-dimensional problems characterised by limited input data availability. The proposed methodology improves the algorithm of aPC and extends the method, that was previously only introduced as tensor product expansion, to moderately high-dimensional stochastic problems. The fundamental idea of aPC is to use the statistical moments of the input random variables to develop the polynomial chaos expansion. This approach provides the possibility to propagate continuous or discrete probability density functions and also histograms (data sets) as long as their moments exist, are finite and the determinant of the moment matrixmore » is strictly positive. For cases with limited data availability, this approach avoids bias and fitting errors caused by wrong assumptions. In this work, an alternative way to calculate the aPC is suggested, which provides the optimal polynomials, Gaussian quadrature collocation points and weights from the moments using only a handful of matrix operations on the Hankel matrix of moments. It can therefore be implemented without requiring prior knowledge about statistical data analysis or a detailed understanding of the mathematics of polynomial chaos expansions. The extension to more input variables suggested in this work, is an anisotropic and adaptive version of Smolyak's algorithm that is solely based on the moments of the input probability distributions. It is referred to as SAMBA (PC), which is short for Sparse Approximation of Moment-Based Arbitrary Polynomial Chaos. It is illustrated that for moderately high-dimensional problems (up to 20 different input variables or histograms) SAMBA can significantly simplify the calculation of sparse Gaussian quadrature rules. SAMBA's efficiency for multivariate functions with regard to data availability is further demonstrated by analysing higher order convergence and accuracy for a set of nonlinear test functions with 2, 5 and 10 different input distributions or histograms.« less
Greedy Algorithms for Nonnegativity-Constrained Simultaneous Sparse Recovery

PubMed Central

Kim, Daeun; Haldar, Justin P.

2016-01-01

This work proposes a family of greedy algorithms to jointly reconstruct a set of vectors that are (i) nonnegative and (ii) simultaneously sparse with a shared support set. The proposed algorithms generalize previous approaches that were designed to impose these constraints individually. Similar to previous greedy algorithms for sparse recovery, the proposed algorithms iteratively identify promising support indices. In contrast to previous approaches, the support index selection procedure has been adapted to prioritize indices that are consistent with both the nonnegativity and shared support constraints. Empirical results demonstrate for the first time that the combined use of simultaneous sparsity and nonnegativity constraints can substantially improve recovery performance relative to existing greedy algorithms that impose less signal structure. PMID:26973368
Numerical Modeling of Nanoelectronic Devices

NASA Technical Reports Server (NTRS)

Klimeck, Gerhard; Oyafuso, Fabiano; Bowen, R. Chris; Boykin, Timothy

2003-01-01

Nanoelectronic Modeling 3-D (NEMO 3-D) is a computer program for numerical modeling of the electronic structure properties of a semiconductor device that is embodied in a crystal containing as many as 16 million atoms in an arbitrary configuration and that has overall dimensions of the order of tens of nanometers. The underlying mathematical model represents the quantummechanical behavior of the device resolved to the atomistic level of granularity. The system of electrons in the device is represented by a sparse Hamiltonian matrix that contains hundreds of millions of terms. NEMO 3-D solves the matrix equation on a Beowulf-class cluster computer, by use of a parallel-processing matrix vector multiplication algorithm coupled to a Lanczos and/or Rayleigh-Ritz algorithm that solves for eigenvalues. In a recent update of NEMO 3-D, a new strain treatment, parameterized for bulk material properties of GaAs and InAs, was developed for two tight-binding submodels. The utility of the NEMO 3-D was demonstrated in an atomistic analysis of the effects of disorder in alloys and, in particular, in bulk In(x)Ga(l-x)As and in In0.6Ga0.4As quantum dots.

BCYCLIC: A parallel block tridiagonal matrix cyclic solver

NASA Astrophysics Data System (ADS)

Hirshman, S. P.; Perumalla, K. S.; Lynch, V. E.; Sanchez, R.

2010-09-01

A block tridiagonal matrix is factored with minimal fill-in using a cyclic reduction algorithm that is easily parallelized. Storage of the factored blocks allows the application of the inverse to multiple right-hand sides which may not be known at factorization time. Scalability with the number of block rows is achieved with cyclic reduction, while scalability with the block size is achieved using multithreaded routines (OpenMP, GotoBLAS) for block matrix manipulation. This dual scalability is a noteworthy feature of this new solver, as well as its ability to efficiently handle arbitrary (non-powers-of-2) block row and processor numbers. Comparison with a state-of-the art parallel sparse solver is presented. It is expected that this new solver will allow many physical applications to optimally use the parallel resources on current supercomputers. Example usage of the solver in magneto-hydrodynamic (MHD), three-dimensional equilibrium solvers for high-temperature fusion plasmas is cited.
Approximate equiangular tight frames for compressed sensing and CDMA applications

NASA Astrophysics Data System (ADS)

Tsiligianni, Evaggelia; Kondi, Lisimachos P.; Katsaggelos, Aggelos K.

2017-12-01

Performance guarantees for recovery algorithms employed in sparse representations, and compressed sensing highlights the importance of incoherence. Optimal bounds of incoherence are attained by equiangular unit norm tight frames (ETFs). Although ETFs are important in many applications, they do not exist for all dimensions, while their construction has been proven extremely difficult. In this paper, we construct frames that are close to ETFs. According to results from frame and graph theory, the existence of an ETF depends on the existence of its signature matrix, that is, a symmetric matrix with certain structure and spectrum consisting of two distinct eigenvalues. We view the construction of a signature matrix as an inverse eigenvalue problem and propose a method that produces frames of any dimensions that are close to ETFs. Due to the achieved equiangularity property, the so obtained frames can be employed as spreading sequences in synchronous code-division multiple access (s-CDMA) systems, besides compressed sensing.
A modified sparse reconstruction method for three-dimensional synthetic aperture radar image

NASA Astrophysics Data System (ADS)

Zhang, Ziqiang; Ji, Kefeng; Song, Haibo; Zou, Huanxin

2018-03-01

There is an increasing interest in three-dimensional Synthetic Aperture Radar (3-D SAR) imaging from observed sparse scattering data. However, the existing 3-D sparse imaging method requires large computing times and storage capacity. In this paper, we propose a modified method for the sparse 3-D SAR imaging. The method processes the collection of noisy SAR measurements, usually collected over nonlinear flight paths, and outputs 3-D SAR imagery. Firstly, the 3-D sparse reconstruction problem is transformed into a series of 2-D slices reconstruction problem by range compression. Then the slices are reconstructed by the modified SL0 (smoothed l0 norm) reconstruction algorithm. The improved algorithm uses hyperbolic tangent function instead of the Gaussian function to approximate the l0 norm and uses the Newton direction instead of the steepest descent direction, which can speed up the convergence rate of the SL0 algorithm. Finally, numerical simulation results are given to demonstrate the effectiveness of the proposed algorithm. It is shown that our method, compared with existing 3-D sparse imaging method, performs better in reconstruction quality and the reconstruction time.
Uniform Recovery Bounds for Structured Random Matrices in Corrupted Compressed Sensing

NASA Astrophysics Data System (ADS)

Zhang, Peng; Gan, Lu; Ling, Cong; Sun, Sumei

2018-04-01

We study the problem of recovering an $s$-sparse signal $\\mathbf{x}^{\\star}\\in\\mathbb{C}^n$ from corrupted measurements $\\mathbf{y} = \\mathbf{A}\\mathbf{x}^{\\star}+\\mathbf{z}^{\\star}+\\mathbf{w}$, where $\\mathbf{z}^{\\star}\\in\\mathbb{C}^m$ is a $k$-sparse corruption vector whose nonzero entries may be arbitrarily large and $\\mathbf{w}\\in\\mathbb{C}^m$ is a dense noise with bounded energy. The aim is to exactly and stably recover the sparse signal with tractable optimization programs. In this paper, we prove the uniform recovery guarantee of this problem for two classes of structured sensing matrices. The first class can be expressed as the product of a unit-norm tight frame (UTF), a random diagonal matrix and a bounded columnwise orthonormal matrix (e.g., partial random circulant matrix). When the UTF is bounded (i.e. $\\mu(\\mathbf{U})\\sim1/\\sqrt{m}$), we prove that with high probability, one can recover an $s$-sparse signal exactly and stably by $l_1$ minimization programs even if the measurements are corrupted by a sparse vector, provided $m = \\mathcal{O}(s \\log^2 s \\log^2 n)$ and the sparsity level $k$ of the corruption is a constant fraction of the total number of measurements. The second class considers randomly sub-sampled orthogonal matrix (e.g., random Fourier matrix). We prove the uniform recovery guarantee provided that the corruption is sparse on certain sparsifying domain. Numerous simulation results are also presented to verify and complement the theoretical results.
Semi-automatic sparse preconditioners for high-order finite element methods on non-uniform meshes

NASA Astrophysics Data System (ADS)

Austin, Travis M.; Brezina, Marian; Jamroz, Ben; Jhurani, Chetan; Manteuffel, Thomas A.; Ruge, John

2012-05-01

High-order finite elements often have a higher accuracy per degree of freedom than the classical low-order finite elements. However, in the context of implicit time-stepping methods, high-order finite elements present challenges to the construction of efficient simulations due to the high cost of inverting the denser finite element matrix. There are many cases where simulations are limited by the memory required to store the matrix and/or the algorithmic components of the linear solver. We are particularly interested in preconditioned Krylov methods for linear systems generated by discretization of elliptic partial differential equations with high-order finite elements. Using a preconditioner like Algebraic Multigrid can be costly in terms of memory due to the need to store matrix information at the various levels. We present a novel method for defining a preconditioner for systems generated by high-order finite elements that is based on a much sparser system than the original high-order finite element system. We investigate the performance for non-uniform meshes on a cube and a cubed sphere mesh, showing that the sparser preconditioner is more efficient and uses significantly less memory. Finally, we explore new methods to construct the sparse preconditioner and examine their effectiveness for non-uniform meshes. We compare results to a direct use of Algebraic Multigrid as a preconditioner and to a two-level additive Schwarz method.
Improved FastICA algorithm in fMRI data analysis using the sparsity property of the sources.

PubMed

Ge, Ruiyang; Wang, Yubao; Zhang, Jipeng; Yao, Li; Zhang, Hang; Long, Zhiying

2016-04-01

As a blind source separation technique, independent component analysis (ICA) has many applications in functional magnetic resonance imaging (fMRI). Although either temporal or spatial prior information has been introduced into the constrained ICA and semi-blind ICA methods to improve the performance of ICA in fMRI data analysis, certain types of additional prior information, such as the sparsity, has seldom been added to the ICA algorithms as constraints. In this study, we proposed a SparseFastICA method by adding the source sparsity as a constraint to the FastICA algorithm to improve the performance of the widely used FastICA. The source sparsity is estimated through a smoothed ℓ0 norm method. We performed experimental tests on both simulated data and real fMRI data to investigate the feasibility and robustness of SparseFastICA and made a performance comparison between SparseFastICA, FastICA and Infomax ICA. Results of the simulated and real fMRI data demonstrated the feasibility and robustness of SparseFastICA for the source separation in fMRI data. Both the simulated and real fMRI experimental results showed that SparseFastICA has better robustness to noise and better spatial detection power than FastICA. Although the spatial detection power of SparseFastICA and Infomax did not show significant difference, SparseFastICA had faster computation speed than Infomax. SparseFastICA was comparable to the Infomax algorithm with a faster computation speed. More importantly, SparseFastICA outperformed FastICA in robustness and spatial detection power and can be used to identify more accurate brain networks than FastICA algorithm. Copyright © 2016 Elsevier B.V. All rights reserved.
Dimension-Factorized Range Migration Algorithm for Regularly Distributed Array Imaging

PubMed Central

Guo, Qijia; Wang, Jie; Chang, Tianying

2017-01-01

The two-dimensional planar MIMO array is a popular approach for millimeter wave imaging applications. As a promising practical alternative, sparse MIMO arrays have been devised to reduce the number of antenna elements and transmitting/receiving channels with predictable and acceptable loss in image quality. In this paper, a high precision three-dimensional imaging algorithm is proposed for MIMO arrays of the regularly distributed type, especially the sparse varieties. Termed the Dimension-Factorized Range Migration Algorithm, the new imaging approach factorizes the conventional MIMO Range Migration Algorithm into multiple operations across the sparse dimensions. The thinner the sparse dimensions of the array, the more efficient the new algorithm will be. Advantages of the proposed approach are demonstrated by comparison with the conventional MIMO Range Migration Algorithm and its non-uniform fast Fourier transform based variant in terms of all the important characteristics of the approaches, especially the anti-noise capability. The computation cost is analyzed as well to evaluate the efficiency quantitatively. PMID:29113083
Multi-sparse dictionary colorization algorithm based on the feature classification and detail enhancement

NASA Astrophysics Data System (ADS)

Yan, Dan; Bai, Lianfa; Zhang, Yi; Han, Jing

2018-02-01

For the problems of missing details and performance of the colorization based on sparse representation, we propose a conceptual model framework for colorizing gray-scale images, and then a multi-sparse dictionary colorization algorithm based on the feature classification and detail enhancement (CEMDC) is proposed based on this framework. The algorithm can achieve a natural colorized effect for a gray-scale image, and it is consistent with the human vision. First, the algorithm establishes a multi-sparse dictionary classification colorization model. Then, to improve the accuracy rate of the classification, the corresponding local constraint algorithm is proposed. Finally, we propose a detail enhancement based on Laplacian Pyramid, which is effective in solving the problem of missing details and improving the speed of image colorization. In addition, the algorithm not only realizes the colorization of the visual gray-scale image, but also can be applied to the other areas, such as color transfer between color images, colorizing gray fusion images, and infrared images.
1-norm support vector novelty detection and its sparseness.

PubMed

Zhang, Li; Zhou, WeiDa

2013-12-01

This paper proposes a 1-norm support vector novelty detection (SVND) method and discusses its sparseness. 1-norm SVND is formulated as a linear programming problem and uses two techniques for inducing sparseness, or the 1-norm regularization and the hinge loss function. We also find two upper bounds on the sparseness of 1-norm SVND, or exact support vector (ESV) and kernel Gram matrix rank bounds. The ESV bound indicates that 1-norm SVND has a sparser representation model than SVND. The kernel Gram matrix rank bound can loosely estimate the sparseness of 1-norm SVND. Experimental results show that 1-norm SVND is feasible and effective. Copyright © 2013 Elsevier Ltd. All rights reserved.
Disconnected Diagrams in Lattice QCD

NASA Astrophysics Data System (ADS)

Gambhir, Arjun Singh

In this work, we present state-of-the-art numerical methods and their applications for computing a particular class of observables using lattice quantum chromodynamics (Lattice QCD), a discretized version of the fundamental theory of quarks and gluons. These observables require calculating so called "disconnected diagrams" and are important for understanding many aspects of hadron structure, such as the strange content of the proton. We begin by introducing the reader to the key concepts of Lattice QCD and rigorously define the meaning of disconnected diagrams through an example of the Wick contractions of the nucleon. Subsequently, the calculation of observables requiring disconnected diagrams is posed as the computationally challenging problem of finding the trace of the inverse of an incredibly large, sparse matrix. This is followed by a brief primer of numerical sparse matrix techniques that overviews broadly used methods in Lattice QCD and builds the background for the novel algorithm presented in this work. We then introduce singular value deflation as a method to improve convergence of trace estimation and analyze its effects on matrices from a variety of fields, including chemical transport modeling, magnetohydrodynamics, and QCD. Finally, we apply this method to compute observables such as the strange axial charge of the proton and strange sigma terms in light nuclei. The work in this thesis is innovative for four reasons. First, we analyze the effects of deflation with a model that makes qualitative predictions about its effectiveness, taking only the singular value spectrum as input, and compare deflated variance with different types of trace estimator noise. Second, the synergy between probing methods and deflation is investigated both experimentally and theoretically. Third, we use the synergistic combination of deflation and a graph coloring algorithm known as hierarchical probing to conduct a lattice calculation of light disconnected matrix elements of the nucleon at two different values of the lattice spacing. Finally, we employ these algorithms to do a high-precision study of strange sigma terms in light nuclei; to our knowledge this is the first calculation of its kind from Lattice QCD.
Disconnected Diagrams in Lattice QCD

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gambhir, Arjun

In this work, we present state-of-the-art numerical methods and their applications for computing a particular class of observables using lattice quantum chromodynamics (Lattice QCD), a discretized version of the fundamental theory of quarks and gluons. These observables require calculating so called \\disconnected diagrams" and are important for understanding many aspects of hadron structure, such as the strange content of the proton. We begin by introducing the reader to the key concepts of Lattice QCD and rigorously define the meaning of disconnected diagrams through an example of the Wick contractions of the nucleon. Subsequently, the calculation of observables requiring disconnected diagramsmore » is posed as the computationally challenging problem of finding the trace of the inverse of an incredibly large, sparse matrix. This is followed by a brief primer of numerical sparse matrix techniques that overviews broadly used methods in Lattice QCD and builds the background for the novel algorithm presented in this work. We then introduce singular value deflation as a method to improve convergence of trace estimation and analyze its effects on matrices from a variety of fields, including chemical transport modeling, magnetohydrodynamics, and QCD. Finally, we apply this method to compute observables such as the strange axial charge of the proton and strange sigma terms in light nuclei. The work in this thesis is innovative for four reasons. First, we analyze the effects of deflation with a model that makes qualitative predictions about its effectiveness, taking only the singular value spectrum as input, and compare deflated variance with different types of trace estimator noise. Second, the synergy between probing methods and deflation is investigated both experimentally and theoretically. Third, we use the synergistic combination of deflation and a graph coloring algorithm known as hierarchical probing to conduct a lattice calculation of light disconnected matrix elements of the nucleon at two different values of the lattice spacing. Finally, we employ these algorithms to do a high-precision study of strange sigma terms in light nuclei; to our knowledge this is the first calculation of its kind from Lattice QCD.« less
Quantum algorithms for Gibbs sampling and hitting-time estimation

DOE PAGES

Chowdhury, Anirban Narayan; Somma, Rolando D.

2017-02-01

In this paper, we present quantum algorithms for solving two problems regarding stochastic processes. The first algorithm prepares the thermal Gibbs state of a quantum system and runs in time almost linear in √Nβ/Ζ and polynomial in log(1/ϵ), where N is the Hilbert space dimension, β is the inverse temperature, Ζ is the partition function, and ϵ is the desired precision of the output state. Our quantum algorithm exponentially improves the dependence on 1/ϵ and quadratically improves the dependence on β of known quantum algorithms for this problem. The second algorithm estimates the hitting time of a Markov chain. Formore » a sparse stochastic matrix Ρ, it runs in time almost linear in 1/(ϵΔ 3/2), where ϵ is the absolute precision in the estimation and Δ is a parameter determined by Ρ, and whose inverse is an upper bound of the hitting time. Our quantum algorithm quadratically improves the dependence on 1/ϵ and 1/Δ of the analog classical algorithm for hitting-time estimation. Finally, both algorithms use tools recently developed in the context of Hamiltonian simulation, spectral gap amplification, and solving linear systems of equations.« less
Sparse Bayesian learning for DOA estimation with mutual coupling.

PubMed

Dai, Jisheng; Hu, Nan; Xu, Weichao; Chang, Chunqi

2015-10-16

Sparse Bayesian learning (SBL) has given renewed interest to the problem of direction-of-arrival (DOA) estimation. It is generally assumed that the measurement matrix in SBL is precisely known. Unfortunately, this assumption may be invalid in practice due to the imperfect manifold caused by unknown or misspecified mutual coupling. This paper describes a modified SBL method for joint estimation of DOAs and mutual coupling coefficients with uniform linear arrays (ULAs). Unlike the existing method that only uses stationary priors, our new approach utilizes a hierarchical form of the Student t prior to enforce the sparsity of the unknown signal more heavily. We also provide a distinct Bayesian inference for the expectation-maximization (EM) algorithm, which can update the mutual coupling coefficients more efficiently. Another difference is that our method uses an additional singular value decomposition (SVD) to reduce the computational complexity of the signal reconstruction process and the sensitivity to the measurement noise.
A Direction of Arrival Estimation Algorithm Based on Orthogonal Matching Pursuit

NASA Astrophysics Data System (ADS)

Tang, Junyao; Cao, Fei; Liu, Lipeng

2018-02-01

The results show that the modified DSM is able to predict local buckling capacity of hot-rolled RHS and SHS accurately. In order to solve the problem of the weak ability of anti-radiation missile against active decoy in modern electronic warfare, a direction of arrival estimation algorithm based on orthogonal matching pursuit is proposed in this paper. The algorithm adopts the compression sensing technology. This paper uses array antennas to receive signals, gets the sparse representation of signals, and then designs the corresponding perception matrix. The signal is reconstructed by orthogonal matching pursuit algorithm to estimate the optimal solution. At the same time, the error of the whole measurement system is analyzed and simulated, and the validity of this algorithm is verified. The algorithm greatly reduces the measurement time, the quantity of equipment and the total amount of the calculation, and accurately estimates the angle and strength of the incoming signal. This technology can effectively improve the angle resolution of the missile, which is of reference significance to the research of anti-active decoy.
Unsymmetric ordering using a constrained Markowitz scheme

DOE Office of Scientific and Technical Information (OSTI.GOV)

Amestoy, Patrick R.; Xiaoye S.; Pralet, Stephane

2005-01-18

We present a family of ordering algorithms that can be used as a preprocessing step prior to performing sparse LU factorization. The ordering algorithms simultaneously achieve the objectives of selecting numerically good pivots and preserving the sparsity. We describe the algorithmic properties and challenges in their implementation. By mixing the two objectives we show that we can reduce the amount of fill-in in the factors and reduce the number of numerical problems during factorization. On a set of large unsymmetric real problems, we obtained the median reductions of 12% in the factorization time, of 13% in the size of themore » LU factors, of 20% in the number of operations performed during the factorization phase, and of 11% in the memory needed by the multifrontal solver MA41-UNS. A byproduct of this ordering strategy is an incomplete LU-factored matrix that can be used as a preconditioner in an iterative solver.« less
GPU Accelerated Chemical Similarity Calculation for Compound Library Comparison

PubMed Central

Ma, Chao; Wang, Lirong; Xie, Xiang-Qun

2012-01-01

Chemical similarity calculation plays an important role in compound library design, virtual screening, and “lead” optimization. In this manuscript, we present a novel GPU-accelerated algorithm for all-vs-all Tanimoto matrix calculation and nearest neighbor search. By taking advantage of multi-core GPU architecture and CUDA parallel programming technology, the algorithm is up to 39 times superior to the existing commercial software that runs on CPUs. Because of the utilization of intrinsic GPU instructions, this approach is nearly 10 times faster than existing GPU-accelerated sparse vector algorithm, when Unity fingerprints are used for Tanimoto calculation. The GPU program that implements this new method takes about 20 minutes to complete the calculation of Tanimoto coefficients between 32M PubChem compounds and 10K Active Probes compounds, i.e., 324G Tanimoto coefficients, on a 128-CUDA-core GPU. PMID:21692447
Exploring Deep Learning and Sparse Matrix Format Selection

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhao, Y.; Liao, C.; Shen, X.

We proposed to explore the use of Deep Neural Networks (DNN) for addressing the longstanding barriers. The recent rapid progress of DNN technology has created a large impact in many fields, which has significantly improved the prediction accuracy over traditional machine learning techniques in image classifications, speech recognitions, machine translations, and so on. To some degree, these tasks resemble the decision makings in many HPC tasks, including the aforementioned format selection for SpMV and linear solver selection. For instance, sparse matrix format selection is akin to image classification—such as, to tell whether an image contains a dog or a cat;more » in both problems, the right decisions are primarily determined by the spatial patterns of the elements in an input. For image classification, the patterns are of pixels, and for sparse matrix format selection, they are of non-zero elements. DNN could be naturally applied if we regard a sparse matrix as an image and the format selection or solver selection as classification problems.« less
Parallel heterogeneous architectures for efficient OMP compressive sensing reconstruction

NASA Astrophysics Data System (ADS)

Kulkarni, Amey; Stanislaus, Jerome L.; Mohsenin, Tinoosh

2014-05-01

Compressive Sensing (CS) is a novel scheme, in which a signal that is sparse in a known transform domain can be reconstructed using fewer samples. The signal reconstruction techniques are computationally intensive and have sluggish performance, which make them impractical for real-time processing applications . The paper presents novel architectures for Orthogonal Matching Pursuit algorithm, one of the popular CS reconstruction algorithms. We show the implementation results of proposed architectures on FPGA, ASIC and on a custom many-core platform. For FPGA and ASIC implementation, a novel thresholding method is used to reduce the processing time for the optimization problem by at least 25%. Whereas, for the custom many-core platform, efficient parallelization techniques are applied, to reconstruct signals with variant signal lengths of N and sparsity of m. The algorithm is divided into three kernels. Each kernel is parallelized to reduce execution time, whereas efficient reuse of the matrix operators allows us to reduce area. Matrix operations are efficiently paralellized by taking advantage of blocked algorithms. For demonstration purpose, all architectures reconstruct a 256-length signal with maximum sparsity of 8 using 64 measurements. Implementation on Xilinx Virtex-5 FPGA, requires 27.14 μs to reconstruct the signal using basic OMP. Whereas, with thresholding method it requires 18 μs. ASIC implementation reconstructs the signal in 13 μs. However, our custom many-core, operating at 1.18 GHz, takes 18.28 μs to complete. Our results show that compared to the previous published work of the same algorithm and matrix size, proposed architectures for FPGA and ASIC implementations perform 1.3x and 1.8x respectively faster. Also, the proposed many-core implementation performs 3000x faster than the CPU and 2000x faster than the GPU.
Margin based ontology sparse vector learning algorithm and applied in biology science.

PubMed

Gao, Wei; Qudair Baig, Abdul; Ali, Haidar; Sajjad, Wasim; Reza Farahani, Mohammad

2017-01-01

In biology field, the ontology application relates to a large amount of genetic information and chemical information of molecular structure, which makes knowledge of ontology concepts convey much information. Therefore, in mathematical notation, the dimension of vector which corresponds to the ontology concept is often very large, and thus improves the higher requirements of ontology algorithm. Under this background, we consider the designing of ontology sparse vector algorithm and application in biology. In this paper, using knowledge of marginal likelihood and marginal distribution, the optimized strategy of marginal based ontology sparse vector learning algorithm is presented. Finally, the new algorithm is applied to gene ontology and plant ontology to verify its efficiency.
Decentralized modal identification using sparse blind source separation

NASA Astrophysics Data System (ADS)

Sadhu, A.; Hazra, B.; Narasimhan, S.; Pandey, M. D.

2011-12-01

Popular ambient vibration-based system identification methods process information collected from a dense array of sensors centrally to yield the modal properties. In such methods, the need for a centralized processing unit capable of satisfying large memory and processing demands is unavoidable. With the advent of wireless smart sensor networks, it is now possible to process information locally at the sensor level, instead. The information at the individual sensor level can then be concatenated to obtain the global structure characteristics. A novel decentralized algorithm based on wavelet transforms to infer global structure mode information using measurements obtained using a small group of sensors at a time is proposed in this paper. The focus of the paper is on algorithmic development, while the actual hardware and software implementation is not pursued here. The problem of identification is cast within the framework of under-determined blind source separation invoking transformations of measurements to the time-frequency domain resulting in a sparse representation. The partial mode shape coefficients so identified are then combined to yield complete modal information. The transformations are undertaken using stationary wavelet packet transform (SWPT), yielding a sparse representation in the wavelet domain. Principal component analysis (PCA) is then performed on the resulting wavelet coefficients, yielding the partial mixing matrix coefficients from a few measurement channels at a time. This process is repeated using measurements obtained from multiple sensor groups, and the results so obtained from each group are concatenated to obtain the global modal characteristics of the structure.

Image fusion using sparse overcomplete feature dictionaries

DOEpatents

Brumby, Steven P.; Bettencourt, Luis; Kenyon, Garrett T.; Chartrand, Rick; Wohlberg, Brendt

2015-10-06

Approaches for deciding what individuals in a population of visual system "neurons" are looking for using sparse overcomplete feature dictionaries are provided. A sparse overcomplete feature dictionary may be learned for an image dataset and a local sparse representation of the image dataset may be built using the learned feature dictionary. A local maximum pooling operation may be applied on the local sparse representation to produce a translation-tolerant representation of the image dataset. An object may then be classified and/or clustered within the translation-tolerant representation of the image dataset using a supervised classification algorithm and/or an unsupervised clustering algorithm.
A Multilevel Algorithm for the Solution of Second Order Elliptic Differential Equations on Sparse Grids

NASA Technical Reports Server (NTRS)

Pflaum, Christoph

1996-01-01

A multilevel algorithm is presented that solves general second order elliptic partial differential equations on adaptive sparse grids. The multilevel algorithm consists of several V-cycles. Suitable discretizations provide that the discrete equation system can be solved in an efficient way. Numerical experiments show a convergence rate of order Omicron(1) for the multilevel algorithm.
Visual Tracking via Sparse and Local Linear Coding.

PubMed

Wang, Guofeng; Qin, Xueying; Zhong, Fan; Liu, Yue; Li, Hongbo; Peng, Qunsheng; Yang, Ming-Hsuan

2015-11-01

The state search is an important component of any object tracking algorithm. Numerous algorithms have been proposed, but stochastic sampling methods (e.g., particle filters) are arguably one of the most effective approaches. However, the discretization of the state space complicates the search for the precise object location. In this paper, we propose a novel tracking algorithm that extends the state space of particle observations from discrete to continuous. The solution is determined accurately via iterative linear coding between two convex hulls. The algorithm is modeled by an optimal function, which can be efficiently solved by either convex sparse coding or locality constrained linear coding. The algorithm is also very flexible and can be combined with many generic object representations. Thus, we first use sparse representation to achieve an efficient searching mechanism of the algorithm and demonstrate its accuracy. Next, two other object representation models, i.e., least soft-threshold squares and adaptive structural local sparse appearance, are implemented with improved accuracy to demonstrate the flexibility of our algorithm. Qualitative and quantitative experimental results demonstrate that the proposed tracking algorithm performs favorably against the state-of-the-art methods in dynamic scenes.
Parallel Computing Strategies for Irregular Algorithms

NASA Technical Reports Server (NTRS)

Biswas, Rupak; Oliker, Leonid; Shan, Hongzhang; Biegel, Bryan (Technical Monitor)

2002-01-01

Parallel computing promises several orders of magnitude increase in our ability to solve realistic computationally-intensive problems, but relies on their efficient mapping and execution on large-scale multiprocessor architectures. Unfortunately, many important applications are irregular and dynamic in nature, making their effective parallel implementation a daunting task. Moreover, with the proliferation of parallel architectures and programming paradigms, the typical scientist is faced with a plethora of questions that must be answered in order to obtain an acceptable parallel implementation of the solution algorithm. In this paper, we consider three representative irregular applications: unstructured remeshing, sparse matrix computations, and N-body problems, and parallelize them using various popular programming paradigms on a wide spectrum of computer platforms ranging from state-of-the-art supercomputers to PC clusters. We present the underlying problems, the solution algorithms, and the parallel implementation strategies. Smart load-balancing, partitioning, and ordering techniques are used to enhance parallel performance. Overall results demonstrate the complexity of efficiently parallelizing irregular algorithms.
Efficient convolutional sparse coding

DOEpatents

Wohlberg, Brendt

2017-06-20

Computationally efficient algorithms may be applied for fast dictionary learning solving the convolutional sparse coding problem in the Fourier domain. More specifically, efficient convolutional sparse coding may be derived within an alternating direction method of multipliers (ADMM) framework that utilizes fast Fourier transforms (FFT) to solve the main linear system in the frequency domain. Such algorithms may enable a significant reduction in computational cost over conventional approaches by implementing a linear solver for the most critical and computationally expensive component of the conventional iterative algorithm. The theoretical computational cost of the algorithm may be reduced from O(M.sup.3N) to O(MN log N), where N is the dimensionality of the data and M is the number of elements in the dictionary. This significant improvement in efficiency may greatly increase the range of problems that can practically be addressed via convolutional sparse representations.
Iterative algorithms for large sparse linear systems on parallel computers

NASA Technical Reports Server (NTRS)

Adams, L. M.

1982-01-01

Algorithms for assembling in parallel the sparse system of linear equations that result from finite difference or finite element discretizations of elliptic partial differential equations, such as those that arise in structural engineering are developed. Parallel linear stationary iterative algorithms and parallel preconditioned conjugate gradient algorithms are developed for solving these systems. In addition, a model for comparing parallel algorithms on array architectures is developed and results of this model for the algorithms are given.
Efficient Kriging Algorithms

NASA Technical Reports Server (NTRS)

Memarsadeghi, Nargess

2011-01-01

More efficient versions of an interpolation method, called kriging, have been introduced in order to reduce its traditionally high computational cost. Written in C++, these approaches were tested on both synthetic and real data. Kriging is a best unbiased linear estimator and suitable for interpolation of scattered data points. Kriging has long been used in the geostatistic and mining communities, but is now being researched for use in the image fusion of remotely sensed data. This allows a combination of data from various locations to be used to fill in any missing data from any single location. To arrive at the faster algorithms, sparse SYMMLQ iterative solver, covariance tapering, Fast Multipole Methods (FMM), and nearest neighbor searching techniques were used. These implementations were used when the coefficient matrix in the linear system is symmetric, but not necessarily positive-definite.
Multiagent Reinforcement Learning With Sparse Interactions by Negotiation and Knowledge Transfer.

PubMed

Zhou, Luowei; Yang, Pei; Chen, Chunlin; Gao, Yang

2017-05-01

Reinforcement learning has significant applications for multiagent systems, especially in unknown dynamic environments. However, most multiagent reinforcement learning (MARL) algorithms suffer from such problems as exponential computation complexity in the joint state-action space, which makes it difficult to scale up to realistic multiagent problems. In this paper, a novel algorithm named negotiation-based MARL with sparse interactions (NegoSIs) is presented. In contrast to traditional sparse-interaction-based MARL algorithms, NegoSI adopts the equilibrium concept and makes it possible for agents to select the nonstrict equilibrium-dominating strategy profile (nonstrict EDSP) or meta equilibrium for their joint actions. The presented NegoSI algorithm consists of four parts: 1) the equilibrium-based framework for sparse interactions; 2) the negotiation for the equilibrium set; 3) the minimum variance method for selecting one joint action; and 4) the knowledge transfer of local Q -values. In this integrated algorithm, three techniques, i.e., unshared value functions, equilibrium solutions, and sparse interactions are adopted to achieve privacy protection, better coordination and lower computational complexity, respectively. To evaluate the performance of the presented NegoSI algorithm, two groups of experiments are carried out regarding three criteria: 1) steps of each episode; 2) rewards of each episode; and 3) average runtime. The first group of experiments is conducted using six grid world games and shows fast convergence and high scalability of the presented algorithm. Then in the second group of experiments NegoSI is applied to an intelligent warehouse problem and simulated results demonstrate the effectiveness of the presented NegoSI algorithm compared with other state-of-the-art MARL algorithms.
A Spectral Reconstruction Algorithm of Miniature Spectrometer Based on Sparse Optimization and Dictionary Learning.

PubMed

Zhang, Shang; Dong, Yuhan; Fu, Hongyan; Huang, Shao-Lun; Zhang, Lin

2018-02-22

The miniaturization of spectrometer can broaden the application area of spectrometry, which has huge academic and industrial value. Among various miniaturization approaches, filter-based miniaturization is a promising implementation by utilizing broadband filters with distinct transmission functions. Mathematically, filter-based spectral reconstruction can be modeled as solving a system of linear equations. In this paper, we propose an algorithm of spectral reconstruction based on sparse optimization and dictionary learning. To verify the feasibility of the reconstruction algorithm, we design and implement a simple prototype of a filter-based miniature spectrometer. The experimental results demonstrate that sparse optimization is well applicable to spectral reconstruction whether the spectra are directly sparse or not. As for the non-directly sparse spectra, their sparsity can be enhanced by dictionary learning. In conclusion, the proposed approach has a bright application prospect in fabricating a practical miniature spectrometer.
A Spectral Reconstruction Algorithm of Miniature Spectrometer Based on Sparse Optimization and Dictionary Learning

PubMed Central

Zhang, Shang; Fu, Hongyan; Huang, Shao-Lun; Zhang, Lin

2018-01-01

The miniaturization of spectrometer can broaden the application area of spectrometry, which has huge academic and industrial value. Among various miniaturization approaches, filter-based miniaturization is a promising implementation by utilizing broadband filters with distinct transmission functions. Mathematically, filter-based spectral reconstruction can be modeled as solving a system of linear equations. In this paper, we propose an algorithm of spectral reconstruction based on sparse optimization and dictionary learning. To verify the feasibility of the reconstruction algorithm, we design and implement a simple prototype of a filter-based miniature spectrometer. The experimental results demonstrate that sparse optimization is well applicable to spectral reconstruction whether the spectra are directly sparse or not. As for the non-directly sparse spectra, their sparsity can be enhanced by dictionary learning. In conclusion, the proposed approach has a bright application prospect in fabricating a practical miniature spectrometer. PMID:29470406
Nonlinear hyperspectral unmixing based on sparse non-negative matrix factorization

NASA Astrophysics Data System (ADS)

Li, Jing; Li, Xiaorun; Zhao, Liaoying

2016-01-01

Hyperspectral unmixing aims at extracting pure material spectra, accompanied by their corresponding proportions, from a mixed pixel. Owing to modeling more accurate distribution of real material, nonlinear mixing models (non-LMM) are usually considered to hold better performance than LMMs in complicated scenarios. In the past years, numerous nonlinear models have been successfully applied to hyperspectral unmixing. However, most non-LMMs only think of sum-to-one constraint or positivity constraint while the widespread sparsity among real materials mixing is the very factor that cannot be ignored. That is, for non-LMMs, a pixel is usually composed of a few spectral signatures of different materials from all the pure pixel set. Thus, in this paper, a smooth sparsity constraint is incorporated into the state-of-the-art Fan nonlinear model to exploit the sparsity feature in nonlinear model and use it to enhance the unmixing performance. This sparsity-constrained Fan model is solved with the non-negative matrix factorization. The algorithm was implemented on synthetic and real hyperspectral data and presented its advantage over those competing algorithms in the experiments.
A Sparse Self-Consistent Field Algorithm and Its Parallel Implementation: Application to Density-Functional-Based Tight Binding.

PubMed

Scemama, Anthony; Renon, Nicolas; Rapacioli, Mathias

2014-06-10

We present an algorithm and its parallel implementation for solving a self-consistent problem as encountered in Hartree-Fock or density functional theory. The algorithm takes advantage of the sparsity of matrices through the use of local molecular orbitals. The implementation allows one to exploit efficiently modern symmetric multiprocessing (SMP) computer architectures. As a first application, the algorithm is used within the density-functional-based tight binding method, for which most of the computational time is spent in the linear algebra routines (diagonalization of the Fock/Kohn-Sham matrix). We show that with this algorithm (i) single point calculations on very large systems (millions of atoms) can be performed on large SMP machines, (ii) calculations involving intermediate size systems (1000-100 000 atoms) are also strongly accelerated and can run efficiently on standard servers, and (iii) the error on the total energy due to the use of a cutoff in the molecular orbital coefficients can be controlled such that it remains smaller than the SCF convergence criterion.
Sequential Dictionary Learning From Correlated Data: Application to fMRI Data Analysis.

PubMed

Seghouane, Abd-Krim; Iqbal, Asif

2017-03-22

Sequential dictionary learning via the K-SVD algorithm has been revealed as a successful alternative to conventional data driven methods such as independent component analysis (ICA) for functional magnetic resonance imaging (fMRI) data analysis. fMRI datasets are however structured data matrices with notions of spatio-temporal correlation and temporal smoothness. This prior information has not been included in the K-SVD algorithm when applied to fMRI data analysis. In this paper we propose three variants of the K-SVD algorithm dedicated to fMRI data analysis by accounting for this prior information. The proposed algorithms differ from the K-SVD in their sparse coding and dictionary update stages. The first two algorithms account for the known correlation structure in the fMRI data by using the squared Q, R-norm instead of the Frobenius norm for matrix approximation. The third and last algorithm account for both the known correlation structure in the fMRI data and the temporal smoothness. The temporal smoothness is incorporated in the dictionary update stage via regularization of the dictionary atoms obtained with penalization. The performance of the proposed dictionary learning algorithms are illustrated through simulations and applications on real fMRI data.
Basis Expansion Approaches for Regularized Sequential Dictionary Learning Algorithms With Enforced Sparsity for fMRI Data Analysis.

PubMed

Seghouane, Abd-Krim; Iqbal, Asif

2017-09-01

Sequential dictionary learning algorithms have been successfully applied to functional magnetic resonance imaging (fMRI) data analysis. fMRI data sets are, however, structured data matrices with the notions of temporal smoothness in the column direction. This prior information, which can be converted into a constraint of smoothness on the learned dictionary atoms, has seldomly been included in classical dictionary learning algorithms when applied to fMRI data analysis. In this paper, we tackle this problem by proposing two new sequential dictionary learning algorithms dedicated to fMRI data analysis by accounting for this prior information. These algorithms differ from the existing ones in their dictionary update stage. The steps of this stage are derived as a variant of the power method for computing the SVD. The proposed algorithms generate regularized dictionary atoms via the solution of a left regularized rank-one matrix approximation problem where temporal smoothness is enforced via regularization through basis expansion and sparse basis expansion in the dictionary update stage. Applications on synthetic data experiments and real fMRI data sets illustrating the performance of the proposed algorithms are provided.
Parallel computation safety analysis irradiation targets fission product molybdenum in neutronic aspect using the successive over-relaxation algorithm

NASA Astrophysics Data System (ADS)

Susmikanti, Mike; Dewayatna, Winter; Sulistyo, Yos

2014-09-01

One of the research activities in support of commercial radioisotope production program is a safety research on target FPM (Fission Product Molybdenum) irradiation. FPM targets form a tube made of stainless steel which contains nuclear-grade high-enrichment uranium. The FPM irradiation tube is intended to obtain fission products. Fission materials such as Mo99 used widely the form of kits in the medical world. The neutronics problem is solved using first-order perturbation theory derived from the diffusion equation for four groups. In contrast, Mo isotopes have longer half-lives, about 3 days (66 hours), so the delivery of radioisotopes to consumer centers and storage is possible though still limited. The production of this isotope potentially gives significant economic value. The criticality and flux in multigroup diffusion model was calculated for various irradiation positions and uranium contents. This model involves complex computation, with large and sparse matrix system. Several parallel algorithms have been developed for the sparse and large matrix solution. In this paper, a successive over-relaxation (SOR) algorithm was implemented for the calculation of reactivity coefficients which can be done in parallel. Previous works performed reactivity calculations serially with Gauss-Seidel iteratives. The parallel method can be used to solve multigroup diffusion equation system and calculate the criticality and reactivity coefficients. In this research a computer code was developed to exploit parallel processing to perform reactivity calculations which were to be used in safety analysis. The parallel processing in the multicore computer system allows the calculation to be performed more quickly. This code was applied for the safety limits calculation of irradiated FPM targets containing highly enriched uranium. The results of calculations neutron show that for uranium contents of 1.7676 g and 6.1866 g (× 106 cm-1) in a tube, their delta reactivities are the still within safety limits; however, for 7.9542 g and 8.838 g (× 106 cm-1) the limits were exceeded.
Solving large sparse eigenvalue problems on supercomputers

NASA Technical Reports Server (NTRS)

Philippe, Bernard; Saad, Youcef

1988-01-01

An important problem in scientific computing consists in finding a few eigenvalues and corresponding eigenvectors of a very large and sparse matrix. The most popular methods to solve these problems are based on projection techniques on appropriate subspaces. The main attraction of these methods is that they only require the use of the matrix in the form of matrix by vector multiplications. The implementations on supercomputers of two such methods for symmetric matrices, namely Lanczos' method and Davidson's method are compared. Since one of the most important operations in these two methods is the multiplication of vectors by the sparse matrix, methods of performing this operation efficiently are discussed. The advantages and the disadvantages of each method are compared and implementation aspects are discussed. Numerical experiments on a one processor CRAY 2 and CRAY X-MP are reported. Possible parallel implementations are also discussed.
Efficiency analysis of numerical integrations for finite element substructure in real-time hybrid simulation

NASA Astrophysics Data System (ADS)

Wang, Jinting; Lu, Liqiao; Zhu, Fei

2018-01-01

Finite element (FE) is a powerful tool and has been applied by investigators to real-time hybrid simulations (RTHSs). This study focuses on the computational efficiency, including the computational time and accuracy, of numerical integrations in solving FE numerical substructure in RTHSs. First, sparse matrix storage schemes are adopted to decrease the computational time of FE numerical substructure. In this way, the task execution time (TET) decreases such that the scale of the numerical substructure model increases. Subsequently, several commonly used explicit numerical integration algorithms, including the central difference method (CDM), the Newmark explicit method, the Chang method and the Gui-λ method, are comprehensively compared to evaluate their computational time in solving FE numerical substructure. CDM is better than the other explicit integration algorithms when the damping matrix is diagonal, while the Gui-λ (λ = 4) method is advantageous when the damping matrix is non-diagonal. Finally, the effect of time delay on the computational accuracy of RTHSs is investigated by simulating structure-foundation systems. Simulation results show that the influences of time delay on the displacement response become obvious with the mass ratio increasing, and delay compensation methods may reduce the relative error of the displacement peak value to less than 5% even under the large time-step and large time delay.
Improved Success of Sparse Matrix Protein Crystallization Screening with Heterogeneous Nucleating Agents

PubMed Central

Thakur, Anil S.; Robin, Gautier; Guncar, Gregor; Saunders, Neil F. W.; Newman, Janet; Martin, Jennifer L.; Kobe, Bostjan

2007-01-01

Background Crystallization is a major bottleneck in the process of macromolecular structure determination by X-ray crystallography. Successful crystallization requires the formation of nuclei and their subsequent growth to crystals of suitable size. Crystal growth generally occurs spontaneously in a supersaturated solution as a result of homogenous nucleation. However, in a typical sparse matrix screening experiment, precipitant and protein concentration are not sampled extensively, and supersaturation conditions suitable for nucleation are often missed. Methodology/Principal Findings We tested the effect of nine potential heterogenous nucleating agents on crystallization of ten test proteins in a sparse matrix screen. Several nucleating agents induced crystal formation under conditions where no crystallization occurred in the absence of the nucleating agent. Four nucleating agents: dried seaweed; horse hair; cellulose and hydroxyapatite, had a considerable overall positive effect on crystallization success. This effect was further enhanced when these nucleating agents were used in combination with each other. Conclusions/Significance Our results suggest that the addition of heterogeneous nucleating agents increases the chances of crystal formation when using sparse matrix screens. PMID:17971854
A critical analysis of computational protein design with sparse residue interaction graphs

PubMed Central

Georgiev, Ivelin S.

2017-01-01

Protein design algorithms enumerate a combinatorial number of candidate structures to compute the Global Minimum Energy Conformation (GMEC). To efficiently find the GMEC, protein design algorithms must methodically reduce the conformational search space. By applying distance and energy cutoffs, the protein system to be designed can thus be represented using a sparse residue interaction graph, where the number of interacting residue pairs is less than all pairs of mutable residues, and the corresponding GMEC is called the sparse GMEC. However, ignoring some pairwise residue interactions can lead to a change in the energy, conformation, or sequence of the sparse GMEC vs. the original or the full GMEC. Despite the widespread use of sparse residue interaction graphs in protein design, the above mentioned effects of their use have not been previously analyzed. To analyze the costs and benefits of designing with sparse residue interaction graphs, we computed the GMECs for 136 different protein design problems both with and without distance and energy cutoffs, and compared their energies, conformations, and sequences. Our analysis shows that the differences between the GMECs depend critically on whether or not the design includes core, boundary, or surface residues. Moreover, neglecting long-range interactions can alter local interactions and introduce large sequence differences, both of which can result in significant structural and functional changes. Designs on proteins with experimentally measured thermostability show it is beneficial to compute both the full and the sparse GMEC accurately and efficiently. To this end, we show that a provable, ensemble-based algorithm can efficiently compute both GMECs by enumerating a small number of conformations, usually fewer than 1000. This provides a novel way to combine sparse residue interaction graphs with provable, ensemble-based algorithms to reap the benefits of sparse residue interaction graphs while avoiding their potential inaccuracies. PMID:28358804
Semi-blind sparse image reconstruction with application to MRFM.

PubMed

Park, Se Un; Dobigeon, Nicolas; Hero, Alfred O

2012-09-01

We propose a solution to the image deconvolution problem where the convolution kernel or point spread function (PSF) is assumed to be only partially known. Small perturbations generated from the model are exploited to produce a few principal components explaining the PSF uncertainty in a high-dimensional space. Unlike recent developments on blind deconvolution of natural images, we assume the image is sparse in the pixel basis, a natural sparsity arising in magnetic resonance force microscopy (MRFM). Our approach adopts a Bayesian Metropolis-within-Gibbs sampling framework. The performance of our Bayesian semi-blind algorithm for sparse images is superior to previously proposed semi-blind algorithms such as the alternating minimization algorithm and blind algorithms developed for natural images. We illustrate our myopic algorithm on real MRFM tobacco virus data.

Matrix-Inversion-Free Compressed Sensing With Variable Orthogonal Multi-Matching Pursuit Based on Prior Information for ECG Signals.

PubMed

Cheng, Yih-Chun; Tsai, Pei-Yun; Huang, Ming-Hao

2016-05-19

Low-complexity compressed sensing (CS) techniques for monitoring electrocardiogram (ECG) signals in wireless body sensor network (WBSN) are presented. The prior probability of ECG sparsity in the wavelet domain is first exploited. Then, variable orthogonal multi-matching pursuit (vOMMP) algorithm that consists of two phases is proposed. In the first phase, orthogonal matching pursuit (OMP) algorithm is adopted to effectively augment the support set with reliable indices and in the second phase, the orthogonal multi-matching pursuit (OMMP) is employed to rescue the missing indices. The reconstruction performance is thus enhanced with the prior information and the vOMMP algorithm. Furthermore, the computation-intensive pseudo-inverse operation is simplified by the matrix-inversion-free (MIF) technique based on QR decomposition. The vOMMP-MIF CS decoder is then implemented in 90 nm CMOS technology. The QR decomposition is accomplished by two systolic arrays working in parallel. The implementation supports three settings for obtaining 40, 44, and 48 coefficients in the sparse vector. From the measurement result, the power consumption is 11.7 mW at 0.9 V and 12 MHz. Compared to prior chip implementations, our design shows good hardware efficiency and is suitable for low-energy applications.
[Object Separation from Medical X-Ray Images Based on ICA].

PubMed

Li, Yan; Yu, Chun-yu; Miao, Ya-jian; Fei, Bin; Zhuang, Feng-yun

2015-03-01

X-ray medical image can examine diseased tissue of patients and has important reference value for medical diagnosis. With the problems that traditional X-ray images have noise, poor level sense and blocked aliasing organs, this paper proposes a method for the introduction of multi-spectrum X-ray imaging and independent component analysis (ICA) algorithm to separate the target object. Firstly image de-noising preprocessing ensures the accuracy of target extraction based on independent component analysis and sparse code shrinkage. Then according to the main proportion of organ in the images, aliasing thickness matrix of each pixel was isolated. Finally independent component analysis obtains convergence matrix to reconstruct the target object with blind separation theory. In the ICA algorithm, it found that when the number is more than 40, the target objects separate successfully with the aid of subjective evaluation standard. And when the amplitudes of the scale are in the [25, 45] interval, the target images have high contrast and less distortion. The three-dimensional figure of Peak signal to noise ratio (PSNR) shows that the different convergence times and amplitudes have a greater influence on image quality. The contrast and edge information of experimental images achieve better effects with the convergence times 85 and amplitudes 35 in the ICA algorithm.
Application of a sparse representation method using K-SVD to data compression of experimental ambient vibration data for SHM

NASA Astrophysics Data System (ADS)

Noh, Hae Young; Kiremidjian, Anne S.

2011-04-01

This paper introduces a data compression method using the K-SVD algorithm and its application to experimental ambient vibration data for structural health monitoring purposes. Because many damage diagnosis algorithms that use system identification require vibration measurements of multiple locations, it is necessary to transmit long threads of data. In wireless sensor networks for structural health monitoring, however, data transmission is often a major source of battery consumption. Therefore, reducing the amount of data to transmit can significantly lengthen the battery life and reduce maintenance cost. The K-SVD algorithm was originally developed in information theory for sparse signal representation. This algorithm creates an optimal over-complete set of bases, referred to as a dictionary, using singular value decomposition (SVD) and represents the data as sparse linear combinations of these bases using the orthogonal matching pursuit (OMP) algorithm. Since ambient vibration data are stationary, we can segment them and represent each segment sparsely. Then only the dictionary and the sparse vectors of the coefficients need to be transmitted wirelessly for restoration of the original data. We applied this method to ambient vibration data measured from a four-story steel moment resisting frame. The results show that the method can compress the data efficiently and restore the data with very little error.
Locality constrained joint dynamic sparse representation for local matching based face recognition.

PubMed

Wang, Jianzhong; Yi, Yugen; Zhou, Wei; Shi, Yanjiao; Qi, Miao; Zhang, Ming; Zhang, Baoxue; Kong, Jun

2014-01-01

Recently, Sparse Representation-based Classification (SRC) has attracted a lot of attention for its applications to various tasks, especially in biometric techniques such as face recognition. However, factors such as lighting, expression, pose and disguise variations in face images will decrease the performances of SRC and most other face recognition techniques. In order to overcome these limitations, we propose a robust face recognition method named Locality Constrained Joint Dynamic Sparse Representation-based Classification (LCJDSRC) in this paper. In our method, a face image is first partitioned into several smaller sub-images. Then, these sub-images are sparsely represented using the proposed locality constrained joint dynamic sparse representation algorithm. Finally, the representation results for all sub-images are aggregated to obtain the final recognition result. Compared with other algorithms which process each sub-image of a face image independently, the proposed algorithm regards the local matching-based face recognition as a multi-task learning problem. Thus, the latent relationships among the sub-images from the same face image are taken into account. Meanwhile, the locality information of the data is also considered in our algorithm. We evaluate our algorithm by comparing it with other state-of-the-art approaches. Extensive experiments on four benchmark face databases (ORL, Extended YaleB, AR and LFW) demonstrate the effectiveness of LCJDSRC.
The fastclime Package for Linear Programming and Large-Scale Precision Matrix Estimation in R.

PubMed

Pang, Haotian; Liu, Han; Vanderbei, Robert

2014-02-01

We develop an R package fastclime for solving a family of regularized linear programming (LP) problems. Our package efficiently implements the parametric simplex algorithm, which provides a scalable and sophisticated tool for solving large-scale linear programs. As an illustrative example, one use of our LP solver is to implement an important sparse precision matrix estimation method called CLIME (Constrained L 1 Minimization Estimator). Compared with existing packages for this problem such as clime and flare, our package has three advantages: (1) it efficiently calculates the full piecewise-linear regularization path; (2) it provides an accurate dual certificate as stopping criterion; (3) it is completely coded in C and is highly portable. This package is designed to be useful to statisticians and machine learning researchers for solving a wide range of problems.
Spectrum recovery method based on sparse representation for segmented multi-Gaussian model

NASA Astrophysics Data System (ADS)

Teng, Yidan; Zhang, Ye; Ti, Chunli; Su, Nan

2016-09-01

Hyperspectral images can realize crackajack features discriminability for supplying diagnostic characteristics with high spectral resolution. However, various degradations may generate negative influence on the spectral information, including water absorption, bands-continuous noise. On the other hand, the huge data volume and strong redundancy among spectrums produced intense demand on compressing HSIs in spectral dimension, which also leads to the loss of spectral information. The reconstruction of spectral diagnostic characteristics has irreplaceable significance for the subsequent application of HSIs. This paper introduces a spectrum restoration method for HSIs making use of segmented multi-Gaussian model (SMGM) and sparse representation. A SMGM is established to indicating the unsymmetrical spectral absorption and reflection characteristics, meanwhile, its rationality and sparse property are discussed. With the application of compressed sensing (CS) theory, we implement sparse representation to the SMGM. Then, the degraded and compressed HSIs can be reconstructed utilizing the uninjured or key bands. Finally, we take low rank matrix recovery (LRMR) algorithm for post processing to restore the spatial details. The proposed method was tested on the spectral data captured on the ground with artificial water absorption condition and an AVIRIS-HSI data set. The experimental results in terms of qualitative and quantitative assessments demonstrate that the effectiveness on recovering the spectral information from both degradations and loss compression. The spectral diagnostic characteristics and the spatial geometry feature are well preserved.
Overcoming Challenges in Kinetic Modeling of Magnetized Plasmas and Vacuum Electronic Devices

NASA Astrophysics Data System (ADS)

Omelchenko, Yuri; Na, Dong-Yeop; Teixeira, Fernando

2017-10-01

We transform the state-of-the art of plasma modeling by taking advantage of novel computational techniques for fast and robust integration of multiscale hybrid (full particle ions, fluid electrons, no displacement current) and full-PIC models. These models are implemented in 3D HYPERS and axisymmetric full-PIC CONPIC codes. HYPERS is a massively parallel, asynchronous code. The HYPERS solver does not step fields and particles synchronously in time but instead executes local variable updates (events) at their self-adaptive rates while preserving fundamental conservation laws. The charge-conserving CONPIC code has a matrix-free explicit finite-element (FE) solver based on a sparse-approximate inverse (SPAI) algorithm. This explicit solver approximates the inverse FE system matrix (``mass'' matrix) using successive sparsity pattern orders of the original matrix. It does not reduce the set of Maxwell's equations to a vector-wave (curl-curl) equation of second order but instead utilizes the standard coupled first-order Maxwell's system. We discuss the ability of our codes to accurately and efficiently account for multiscale physical phenomena in 3D magnetized space and laboratory plasmas and axisymmetric vacuum electronic devices.
The Research on Denoising of SAR Image Based on Improved K-SVD Algorithm

NASA Astrophysics Data System (ADS)

Tan, Linglong; Li, Changkai; Wang, Yueqin

2018-04-01

SAR images often receive noise interference in the process of acquisition and transmission, which can greatly reduce the quality of images and cause great difficulties for image processing. The existing complete DCT dictionary algorithm is fast in processing speed, but its denoising effect is poor. In this paper, the problem of poor denoising, proposed K-SVD (K-means and singular value decomposition) algorithm is applied to the image noise suppression. Firstly, the sparse dictionary structure is introduced in detail. The dictionary has a compact representation and can effectively train the image signal. Then, the sparse dictionary is trained by K-SVD algorithm according to the sparse representation of the dictionary. The algorithm has more advantages in high dimensional data processing. Experimental results show that the proposed algorithm can remove the speckle noise more effectively than the complete DCT dictionary and retain the edge details better.
A general parallel sparse-blocked matrix multiply for linear scaling SCF theory

NASA Astrophysics Data System (ADS)

Challacombe, Matt

2000-06-01

A general approach to the parallel sparse-blocked matrix-matrix multiply is developed in the context of linear scaling self-consistent-field (SCF) theory. The data-parallel message passing method uses non-blocking communication to overlap computation and communication. The space filling curve heuristic is used to achieve data locality for sparse matrix elements that decay with “separation”. Load balance is achieved by solving the bin packing problem for blocks with variable size.With this new method as the kernel, parallel performance of the simplified density matrix minimization (SDMM) for solution of the SCF equations is investigated for RHF/6-31G ∗∗ water clusters and RHF/3-21G estane globules. Sustained rates above 5.7 GFLOPS for the SDMM have been achieved for (H 2 O) 200 with 95 Origin 2000 processors. Scalability is found to be limited by load imbalance, which increases with decreasing granularity, due primarily to the inhomogeneous distribution of variable block sizes.
Matrix decomposition graphics processing unit solver for Poisson image editing

NASA Astrophysics Data System (ADS)

Lei, Zhao; Wei, Li

2012-10-01

In recent years, gradient-domain methods have been widely discussed in the image processing field, including seamless cloning and image stitching. These algorithms are commonly carried out by solving a large sparse linear system: the Poisson equation. However, solving the Poisson equation is a computational and memory intensive task which makes it not suitable for real-time image editing. A new matrix decomposition graphics processing unit (GPU) solver (MDGS) is proposed to settle the problem. A matrix decomposition method is used to distribute the work among GPU threads, so that MDGS will take full advantage of the computing power of current GPUs. Additionally, MDGS is a hybrid solver (combines both the direct and iterative techniques) and has two-level architecture. These enable MDGS to generate identical solutions with those of the common Poisson methods and achieve high convergence rate in most cases. This approach is advantageous in terms of parallelizability, enabling real-time image processing, low memory-taken and extensive applications.
Efficient preconditioning of the electronic structure problem in large scale ab initio molecular dynamics simulations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Schiffmann, Florian; VandeVondele, Joost, E-mail: Joost.VandeVondele@mat.ethz.ch

2015-06-28

We present an improved preconditioning scheme for electronic structure calculations based on the orbital transformation method. First, a preconditioner is developed which includes information from the full Kohn-Sham matrix but avoids computationally demanding diagonalisation steps in its construction. This reduces the computational cost of its construction, eliminating a bottleneck in large scale simulations, while maintaining rapid convergence. In addition, a modified form of Hotelling’s iterative inversion is introduced to replace the exact inversion of the preconditioner matrix. This method is highly effective during molecular dynamics (MD), as the solution obtained in earlier MD steps is a suitable initial guess. Filteringmore » small elements during sparse matrix multiplication leads to linear scaling inversion, while retaining robustness, already for relatively small systems. For system sizes ranging from a few hundred to a few thousand atoms, which are typical for many practical applications, the improvements to the algorithm lead to a 2-5 fold speedup per MD step.« less
Energy Efficient GNSS Signal Acquisition Using Singular Value Decomposition (SVD).

PubMed

Bermúdez Ordoñez, Juan Carlos; Arnaldo Valdés, Rosa María; Gómez Comendador, Fernando

2018-05-16

A significant challenge in global navigation satellite system (GNSS) signal processing is a requirement for a very high sampling rate. The recently-emerging compressed sensing (CS) theory makes processing GNSS signals at a low sampling rate possible if the signal has a sparse representation in a certain space. Based on CS and SVD theories, an algorithm for sampling GNSS signals at a rate much lower than the Nyquist rate and reconstructing the compressed signal is proposed in this research, which is validated after the output from that process still performs signal detection using the standard fast Fourier transform (FFT) parallel frequency space search acquisition. The sparse representation of the GNSS signal is the most important precondition for CS, by constructing a rectangular Toeplitz matrix (TZ) of the transmitted signal, calculating the left singular vectors using SVD from the TZ, to achieve sparse signal representation. Next, obtaining the M-dimensional observation vectors based on the left singular vectors of the SVD, which are equivalent to the sampler operator in standard compressive sensing theory, the signal can be sampled below the Nyquist rate, and can still be reconstructed via ℓ 1 minimization with accuracy using convex optimization. As an added value, there is a GNSS signal acquisition enhancement effect by retaining the useful signal and filtering out noise by projecting the signal into the most significant proper orthogonal modes (PODs) which are the optimal distributions of signal power. The algorithm is validated with real recorded signals, and the results show that the proposed method is effective for sampling, reconstructing intermediate frequency (IF) GNSS signals in the time discrete domain.
Energy Efficient GNSS Signal Acquisition Using Singular Value Decomposition (SVD)

PubMed Central

Arnaldo Valdés, Rosa María; Gómez Comendador, Fernando

2018-01-01

A significant challenge in global navigation satellite system (GNSS) signal processing is a requirement for a very high sampling rate. The recently-emerging compressed sensing (CS) theory makes processing GNSS signals at a low sampling rate possible if the signal has a sparse representation in a certain space. Based on CS and SVD theories, an algorithm for sampling GNSS signals at a rate much lower than the Nyquist rate and reconstructing the compressed signal is proposed in this research, which is validated after the output from that process still performs signal detection using the standard fast Fourier transform (FFT) parallel frequency space search acquisition. The sparse representation of the GNSS signal is the most important precondition for CS, by constructing a rectangular Toeplitz matrix (TZ) of the transmitted signal, calculating the left singular vectors using SVD from the TZ, to achieve sparse signal representation. Next, obtaining the M-dimensional observation vectors based on the left singular vectors of the SVD, which are equivalent to the sampler operator in standard compressive sensing theory, the signal can be sampled below the Nyquist rate, and can still be reconstructed via ℓ1 minimization with accuracy using convex optimization. As an added value, there is a GNSS signal acquisition enhancement effect by retaining the useful signal and filtering out noise by projecting the signal into the most significant proper orthogonal modes (PODs) which are the optimal distributions of signal power. The algorithm is validated with real recorded signals, and the results show that the proposed method is effective for sampling, reconstructing intermediate frequency (IF) GNSS signals in the time discrete domain. PMID:29772731
Parallel Preconditioning for CFD Problems on the CM-5

NASA Technical Reports Server (NTRS)

Simon, Horst D.; Kremenetsky, Mark D.; Richardson, John; Lasinski, T. A. (Technical Monitor)

1994-01-01

Up to today, preconditioning methods on massively parallel systems have faced a major difficulty. The most successful preconditioning methods in terms of accelerating the convergence of the iterative solver such as incomplete LU factorizations are notoriously difficult to implement on parallel machines for two reasons: (1) the actual computation of the preconditioner is not very floating-point intensive, but requires a large amount of unstructured communication, and (2) the application of the preconditioning matrix in the iteration phase (i.e. triangular solves) are difficult to parallelize because of the recursive nature of the computation. Here we present a new approach to preconditioning for very large, sparse, unsymmetric, linear systems, which avoids both difficulties. We explicitly compute an approximate inverse to our original matrix. This new preconditioning matrix can be applied most efficiently for iterative methods on massively parallel machines, since the preconditioning phase involves only a matrix-vector multiplication, with possibly a dense matrix. Furthermore the actual computation of the preconditioning matrix has natural parallelism. For a problem of size n, the preconditioning matrix can be computed by solving n independent small least squares problems. The algorithm and its implementation on the Connection Machine CM-5 are discussed in detail and supported by extensive timings obtained from real problem data.
Implementation of hierarchical clustering using k-mer sparse matrix to analyze MERS-CoV genetic relationship

NASA Astrophysics Data System (ADS)

Bustamam, A.; Ulul, E. D.; Hura, H. F. A.; Siswantining, T.

2017-07-01

Hierarchical clustering is one of effective methods in creating a phylogenetic tree based on the distance matrix between DNA (deoxyribonucleic acid) sequences. One of the well-known methods to calculate the distance matrix is k-mer method. Generally, k-mer is more efficient than some distance matrix calculation techniques. The steps of k-mer method are started from creating k-mer sparse matrix, and followed by creating k-mer singular value vectors. The last step is computing the distance amongst vectors. In this paper, we analyze the sequences of MERS-CoV (Middle East Respiratory Syndrome - Coronavirus) DNA by implementing hierarchical clustering using k-mer sparse matrix in order to perform the phylogenetic analysis. Our results show that the ancestor of our MERS-CoV is coming from Egypt. Moreover, we found that the MERS-CoV infection that occurs in one country may not necessarily come from the same country of origin. This suggests that the process of MERS-CoV mutation might not only be influenced by geographical factor.
Signal processing using sparse derivatives with applications to chromatograms and ECG

NASA Astrophysics Data System (ADS)

Ning, Xiaoran

In this thesis, we investigate the sparsity exist in the derivative domain. Particularly, we focus on the type of signals which posses up to Mth (M > 0) order sparse derivatives. Efforts are put on formulating proper penalty functions and optimization problems to capture properties related to sparse derivatives, searching for fast, computationally efficient solvers. Also the effectiveness of these algorithms are applied to two real world applications. In the first application, we provide an algorithm which jointly addresses the problems of chromatogram baseline correction and noise reduction. The series of chromatogram peaks are modeled as sparse with sparse derivatives, and the baseline is modeled as a low-pass signal. A convex optimization problem is formulated so as to encapsulate these non-parametric models. To account for the positivity of chromatogram peaks, an asymmetric penalty function is also utilized with symmetric penalty functions. A robust, computationally efficient, iterative algorithm is developed that is guaranteed to converge to the unique optimal solution. The approach, termed Baseline Estimation And Denoising with Sparsity (BEADS), is evaluated and compared with two state-of-the-art methods using both simulated and real chromatogram data. Promising result is obtained. In the second application, a novel Electrocardiography (ECG) enhancement algorithm is designed also based on sparse derivatives. In the real medical environment, ECG signals are often contaminated by various kinds of noise or artifacts, for example, morphological changes due to motion artifact, non-stationary noise due to muscular contraction (EMG), etc. Some of these contaminations severely affect the usefulness of ECG signals, especially when computer aided algorithms are utilized. By solving the proposed convex l1 optimization problem, artifacts are reduced by modeling the clean ECG signal as a sum of two signals whose second and third-order derivatives (differences) are sparse respectively. At the end, the algorithm is applied to a QRS detection system and validated using the MIT-BIH Arrhythmia database (109452 anotations), resulting a sensitivity of Se = 99.87%$ and a positive prediction of +P = 99.88%.
Sparse maps—A systematic infrastructure for reduced-scaling electronic structure methods. I. An efficient and simple linear scaling local MP2 method that uses an intermediate basis of pair natural orbitals

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pinski, Peter; Riplinger, Christoph; Neese, Frank, E-mail: evaleev@vt.edu, E-mail: frank.neese@cec.mpg.de

2015-07-21

In this work, a systematic infrastructure is described that formalizes concepts implicit in previous work and greatly simplifies computer implementation of reduced-scaling electronic structure methods. The key concept is sparse representation of tensors using chains of sparse maps between two index sets. Sparse map representation can be viewed as a generalization of compressed sparse row, a common representation of a sparse matrix, to tensor data. By combining few elementary operations on sparse maps (inversion, chaining, intersection, etc.), complex algorithms can be developed, illustrated here by a linear-scaling transformation of three-center Coulomb integrals based on our compact code library that implementsmore » sparse maps and operations on them. The sparsity of the three-center integrals arises from spatial locality of the basis functions and domain density fitting approximation. A novel feature of our approach is the use of differential overlap integrals computed in linear-scaling fashion for screening products of basis functions. Finally, a robust linear scaling domain based local pair natural orbital second-order Möller-Plesset (DLPNO-MP2) method is described based on the sparse map infrastructure that only depends on a minimal number of cutoff parameters that can be systematically tightened to approach 100% of the canonical MP2 correlation energy. With default truncation thresholds, DLPNO-MP2 recovers more than 99.9% of the canonical resolution of the identity MP2 (RI-MP2) energy while still showing a very early crossover with respect to the computational effort. Based on extensive benchmark calculations, relative energies are reproduced with an error of typically <0.2 kcal/mol. The efficiency of the local MP2 (LMP2) method can be drastically improved by carrying out the LMP2 iterations in a basis of pair natural orbitals. While the present work focuses on local electron correlation, it is of much broader applicability to computation with sparse tensors in quantum chemistry and beyond.« less
Sparse maps—A systematic infrastructure for reduced-scaling electronic structure methods. I. An efficient and simple linear scaling local MP2 method that uses an intermediate basis of pair natural orbitals.

PubMed

Pinski, Peter; Riplinger, Christoph; Valeev, Edward F; Neese, Frank

2015-07-21

In this work, a systematic infrastructure is described that formalizes concepts implicit in previous work and greatly simplifies computer implementation of reduced-scaling electronic structure methods. The key concept is sparse representation of tensors using chains of sparse maps between two index sets. Sparse map representation can be viewed as a generalization of compressed sparse row, a common representation of a sparse matrix, to tensor data. By combining few elementary operations on sparse maps (inversion, chaining, intersection, etc.), complex algorithms can be developed, illustrated here by a linear-scaling transformation of three-center Coulomb integrals based on our compact code library that implements sparse maps and operations on them. The sparsity of the three-center integrals arises from spatial locality of the basis functions and domain density fitting approximation. A novel feature of our approach is the use of differential overlap integrals computed in linear-scaling fashion for screening products of basis functions. Finally, a robust linear scaling domain based local pair natural orbital second-order Möller-Plesset (DLPNO-MP2) method is described based on the sparse map infrastructure that only depends on a minimal number of cutoff parameters that can be systematically tightened to approach 100% of the canonical MP2 correlation energy. With default truncation thresholds, DLPNO-MP2 recovers more than 99.9% of the canonical resolution of the identity MP2 (RI-MP2) energy while still showing a very early crossover with respect to the computational effort. Based on extensive benchmark calculations, relative energies are reproduced with an error of typically <0.2 kcal/mol. The efficiency of the local MP2 (LMP2) method can be drastically improved by carrying out the LMP2 iterations in a basis of pair natural orbitals. While the present work focuses on local electron correlation, it is of much broader applicability to computation with sparse tensors in quantum chemistry and beyond.
Fast iterative image reconstruction using sparse matrix factorization with GPU acceleration

NASA Astrophysics Data System (ADS)

Zhou, Jian; Qi, Jinyi

2011-03-01

Statistically based iterative approaches for image reconstruction have gained much attention in medical imaging. An accurate system matrix that defines the mapping from the image space to the data space is the key to high-resolution image reconstruction. However, an accurate system matrix is often associated with high computational cost and huge storage requirement. Here we present a method to address this problem by using sparse matrix factorization and parallel computing on a graphic processing unit (GPU).We factor the accurate system matrix into three sparse matrices: a sinogram blurring matrix, a geometric projection matrix, and an image blurring matrix. The sinogram blurring matrix models the detector response. The geometric projection matrix is based on a simple line integral model. The image blurring matrix is to compensate for the line-of-response (LOR) degradation due to the simplified geometric projection matrix. The geometric projection matrix is precomputed, while the sinogram and image blurring matrices are estimated by minimizing the difference between the factored system matrix and the original system matrix. The resulting factored system matrix has much less number of nonzero elements than the original system matrix and thus substantially reduces the storage and computation cost. The smaller size also allows an efficient implement of the forward and back projectors on GPUs, which have limited amount of memory. Our simulation studies show that the proposed method can dramatically reduce the computation cost of high-resolution iterative image reconstruction. The proposed technique is applicable to image reconstruction for different imaging modalities, including x-ray CT, PET, and SPECT.
Multi-GPU implementation of a VMAT treatment plan optimization algorithm.

PubMed

Tian, Zhen; Peng, Fei; Folkerts, Michael; Tan, Jun; Jia, Xun; Jiang, Steve B

2015-06-01

Volumetric modulated arc therapy (VMAT) optimization is a computationally challenging problem due to its large data size, high degrees of freedom, and many hardware constraints. High-performance graphics processing units (GPUs) have been used to speed up the computations. However, GPU's relatively small memory size cannot handle cases with a large dose-deposition coefficient (DDC) matrix in cases of, e.g., those with a large target size, multiple targets, multiple arcs, and/or small beamlet size. The main purpose of this paper is to report an implementation of a column-generation-based VMAT algorithm, previously developed in the authors' group, on a multi-GPU platform to solve the memory limitation problem. While the column-generation-based VMAT algorithm has been previously developed, the GPU implementation details have not been reported. Hence, another purpose is to present detailed techniques employed for GPU implementation. The authors also would like to utilize this particular problem as an example problem to study the feasibility of using a multi-GPU platform to solve large-scale problems in medical physics. The column-generation approach generates VMAT apertures sequentially by solving a pricing problem (PP) and a master problem (MP) iteratively. In the authors' method, the sparse DDC matrix is first stored on a CPU in coordinate list format (COO). On the GPU side, this matrix is split into four submatrices according to beam angles, which are stored on four GPUs in compressed sparse row format. Computation of beamlet price, the first step in PP, is accomplished using multi-GPUs. A fast inter-GPU data transfer scheme is accomplished using peer-to-peer access. The remaining steps of PP and MP problems are implemented on CPU or a single GPU due to their modest problem scale and computational loads. Barzilai and Borwein algorithm with a subspace step scheme is adopted here to solve the MP problem. A head and neck (H&N) cancer case is then used to validate the authors' method. The authors also compare their multi-GPU implementation with three different single GPU implementation strategies, i.e., truncating DDC matrix (S1), repeatedly transferring DDC matrix between CPU and GPU (S2), and porting computations involving DDC matrix to CPU (S3), in terms of both plan quality and computational efficiency. Two more H&N patient cases and three prostate cases are used to demonstrate the advantages of the authors' method. The authors' multi-GPU implementation can finish the optimization process within ∼ 1 min for the H&N patient case. S1 leads to an inferior plan quality although its total time was 10 s shorter than the multi-GPU implementation due to the reduced matrix size. S2 and S3 yield the same plan quality as the multi-GPU implementation but take ∼4 and ∼6 min, respectively. High computational efficiency was consistently achieved for the other five patient cases tested, with VMAT plans of clinically acceptable quality obtained within 23-46 s. Conversely, to obtain clinically comparable or acceptable plans for all six of these VMAT cases that the authors have tested in this paper, the optimization time needed in a commercial TPS system on CPU was found to be in an order of several minutes. The results demonstrate that the multi-GPU implementation of the authors' column-generation-based VMAT optimization can handle the large-scale VMAT optimization problem efficiently without sacrificing plan quality. The authors' study may serve as an example to shed some light on other large-scale medical physics problems that require multi-GPU techniques.

A linear recurrent kernel online learning algorithm with sparse updates.

PubMed

Fan, Haijin; Song, Qing

2014-02-01

In this paper, we propose a recurrent kernel algorithm with selectively sparse updates for online learning. The algorithm introduces a linear recurrent term in the estimation of the current output. This makes the past information reusable for updating of the algorithm in the form of a recurrent gradient term. To ensure that the reuse of this recurrent gradient indeed accelerates the convergence speed, a novel hybrid recurrent training is proposed to switch on or off learning the recurrent information according to the magnitude of the current training error. Furthermore, the algorithm includes a data-dependent adaptive learning rate which can provide guaranteed system weight convergence at each training iteration. The learning rate is set as zero when the training violates the derived convergence conditions, which makes the algorithm updating process sparse. Theoretical analyses of the weight convergence are presented and experimental results show the good performance of the proposed algorithm in terms of convergence speed and estimation accuracy. Copyright © 2013 Elsevier Ltd. All rights reserved.
Using Chebyshev polynomials and approximate inverse triangular factorizations for preconditioning the conjugate gradient method

NASA Astrophysics Data System (ADS)

Kaporin, I. E.

2012-02-01

In order to precondition a sparse symmetric positive definite matrix, its approximate inverse is examined, which is represented as the product of two sparse mutually adjoint triangular matrices. In this way, the solution of the corresponding system of linear algebraic equations (SLAE) by applying the preconditioned conjugate gradient method (CGM) is reduced to performing only elementary vector operations and calculating sparse matrix-vector products. A method for constructing the above preconditioner is described and analyzed. The triangular factor has a fixed sparsity pattern and is optimal in the sense that the preconditioned matrix has a minimum K-condition number. The use of polynomial preconditioning based on Chebyshev polynomials makes it possible to considerably reduce the amount of scalar product operations (at the cost of an insignificant increase in the total number of arithmetic operations). The possibility of an efficient massively parallel implementation of the resulting method for solving SLAEs is discussed. For a sequential version of this method, the results obtained by solving 56 test problems from the Florida sparse matrix collection (which are large-scale and ill-conditioned) are presented. These results show that the method is highly reliable and has low computational costs.
A new scheduling algorithm for parallel sparse LU factorization with static pivoting

DOE Office of Scientific and Technical Information (OSTI.GOV)

Grigori, Laura; Li, Xiaoye S.

2002-08-20

In this paper we present a static scheduling algorithm for parallel sparse LU factorization with static pivoting. The algorithm is divided into mapping and scheduling phases, using the symmetric pruned graphs of L' and U to represent dependencies. The scheduling algorithm is designed for driving the parallel execution of the factorization on a distributed-memory architecture. Experimental results and comparisons with SuperLU{_}DIST are reported after applying this algorithm on real world application matrices on an IBM SP RS/6000 distributed memory machine.
Joint Feature Extraction and Classifier Design for ECG-Based Biometric Recognition.

PubMed

Gutta, Sandeep; Cheng, Qi

2016-03-01

Traditional biometric recognition systems often utilize physiological traits such as fingerprint, face, iris, etc. Recent years have seen a growing interest in electrocardiogram (ECG)-based biometric recognition techniques, especially in the field of clinical medicine. In existing ECG-based biometric recognition methods, feature extraction and classifier design are usually performed separately. In this paper, a multitask learning approach is proposed, in which feature extraction and classifier design are carried out simultaneously. Weights are assigned to the features within the kernel of each task. We decompose the matrix consisting of all the feature weights into sparse and low-rank components. The sparse component determines the features that are relevant to identify each individual, and the low-rank component determines the common feature subspace that is relevant to identify all the subjects. A fast optimization algorithm is developed, which requires only the first-order information. The performance of the proposed approach is demonstrated through experiments using the MIT-BIH Normal Sinus Rhythm database.
A revised MRCI-algorithm. I. Efficient combination of spin adaptation with individual configuration selection coupled to an effective valence-shell Hamiltonian

NASA Astrophysics Data System (ADS)

Strodel, Paul; Tavan, Paul

2002-09-01

We present a revised multi-reference configuration interaction (MRCI) algorithm for balanced and efficient calculation of electronic excitations in molecules. The revision takes up an earlier method, which had been designed for flexible, state-specific, and individual selection (IS) of MRCI expansions, included perturbational corrections (PERT), and used the spin-coupled hole-particle formalism of Tavan and Schulten (1980) for matrix-element evaluation. It removes the deficiencies of this method by introducing tree structures, which code the CI bases and allow us to efficiently exploit the sparseness of the Hamiltonian matrices. The algorithmic complexity is shown to be optimal for IS/MRCI applications. The revised IS/MRCI/PERT module is combined with the effective valence shell Hamiltonian OM2 suggested by Weber and Thiel (2000). This coupling serves the purpose of making excited state surfaces of organic dye molecules accessible to relatively cheap and sufficiently precise descriptions.
Sparse PCA with Oracle Property.

PubMed

Gu, Quanquan; Wang, Zhaoran; Liu, Han

In this paper, we study the estimation of the k -dimensional sparse principal subspace of covariance matrix Σ in the high-dimensional setting. We aim to recover the oracle principal subspace solution, i.e., the principal subspace estimator obtained assuming the true support is known a priori. To this end, we propose a family of estimators based on the semidefinite relaxation of sparse PCA with novel regularizations. In particular, under a weak assumption on the magnitude of the population projection matrix, one estimator within this family exactly recovers the true support with high probability, has exact rank- k , and attains a [Formula: see text] statistical rate of convergence with s being the subspace sparsity level and n the sample size. Compared to existing support recovery results for sparse PCA, our approach does not hinge on the spiked covariance model or the limited correlation condition. As a complement to the first estimator that enjoys the oracle property, we prove that, another estimator within the family achieves a sharper statistical rate of convergence than the standard semidefinite relaxation of sparse PCA, even when the previous assumption on the magnitude of the projection matrix is violated. We validate the theoretical results by numerical experiments on synthetic datasets.
Sparse PCA with Oracle Property

PubMed Central

Gu, Quanquan; Wang, Zhaoran; Liu, Han

2014-01-01

In this paper, we study the estimation of the k-dimensional sparse principal subspace of covariance matrix Σ in the high-dimensional setting. We aim to recover the oracle principal subspace solution, i.e., the principal subspace estimator obtained assuming the true support is known a priori. To this end, we propose a family of estimators based on the semidefinite relaxation of sparse PCA with novel regularizations. In particular, under a weak assumption on the magnitude of the population projection matrix, one estimator within this family exactly recovers the true support with high probability, has exact rank-k, and attains a s/n statistical rate of convergence with s being the subspace sparsity level and n the sample size. Compared to existing support recovery results for sparse PCA, our approach does not hinge on the spiked covariance model or the limited correlation condition. As a complement to the first estimator that enjoys the oracle property, we prove that, another estimator within the family achieves a sharper statistical rate of convergence than the standard semidefinite relaxation of sparse PCA, even when the previous assumption on the magnitude of the projection matrix is violated. We validate the theoretical results by numerical experiments on synthetic datasets. PMID:25684971
LiDAR point classification based on sparse representation

NASA Astrophysics Data System (ADS)

Li, Nan; Pfeifer, Norbert; Liu, Chun

2017-04-01

In order to combine the initial spatial structure and features of LiDAR data for accurate classification. The LiDAR data is represented as a 4-order tensor. Sparse representation for classification(SRC) method is used for LiDAR tensor classification. It turns out SRC need only a few of training samples from each class, meanwhile can achieve good classification result. Multiple features are extracted from raw LiDAR points to generate a high-dimensional vector at each point. Then the LiDAR tensor is built by the spatial distribution and feature vectors of the point neighborhood. The entries of LiDAR tensor are accessed via four indexes. Each index is called mode: three spatial modes in direction X ,Y ,Z and one feature mode. Sparse representation for classification(SRC) method is proposed in this paper. The sparsity algorithm is to find the best represent the test sample by sparse linear combination of training samples from a dictionary. To explore the sparsity of LiDAR tensor, the tucker decomposition is used. It decomposes a tensor into a core tensor multiplied by a matrix along each mode. Those matrices could be considered as the principal components in each mode. The entries of core tensor show the level of interaction between the different components. Therefore, the LiDAR tensor can be approximately represented by a sparse tensor multiplied by a matrix selected from a dictionary along each mode. The matrices decomposed from training samples are arranged as initial elements in the dictionary. By dictionary learning, a reconstructive and discriminative structure dictionary along each mode is built. The overall structure dictionary composes of class-specified sub-dictionaries. Then the sparse core tensor is calculated by tensor OMP(Orthogonal Matching Pursuit) method based on dictionaries along each mode. It is expected that original tensor should be well recovered by sub-dictionary associated with relevant class, while entries in the sparse tensor associated with other classed should be nearly zero. Therefore, SRC use the reconstruction error associated with each class to do data classification. A section of airborne LiDAR points of Vienna city is used and classified into 6classes: ground, roofs, vegetation, covered ground, walls and other points. Only 6 training samples from each class are taken. For the final classification result, ground and covered ground are merged into one same class(ground). The classification accuracy for ground is 94.60%, roof is 95.47%, vegetation is 85.55%, wall is 76.17%, other object is 20.39%.
Effects of partitioning and scheduling sparse matrix factorization on communication and load balance

NASA Technical Reports Server (NTRS)

Venugopal, Sesh; Naik, Vijay K.

1991-01-01

A block based, automatic partitioning and scheduling methodology is presented for sparse matrix factorization on distributed memory systems. Using experimental results, this technique is analyzed for communication and load imbalance overhead. To study the performance effects, these overheads were compared with those obtained from a straightforward 'wrap mapped' column assignment scheme. All experimental results were obtained using test sparse matrices from the Harwell-Boeing data set. The results show that there is a communication and load balance tradeoff. The block based method results in lower communication cost whereas the wrap mapped scheme gives better load balance.
Sparse approximation problem: how rapid simulated annealing succeeds and fails

NASA Astrophysics Data System (ADS)

Obuchi, Tomoyuki; Kabashima, Yoshiyuki

2016-03-01

Information processing techniques based on sparseness have been actively studied in several disciplines. Among them, a mathematical framework to approximately express a given dataset by a combination of a small number of basis vectors of an overcomplete basis is termed the sparse approximation. In this paper, we apply simulated annealing, a metaheuristic algorithm for general optimization problems, to sparse approximation in the situation where the given data have a planted sparse representation and noise is present. The result in the noiseless case shows that our simulated annealing works well in a reasonable parameter region: the planted solution is found fairly rapidly. This is true even in the case where a common relaxation of the sparse approximation problem, the G-relaxation, is ineffective. On the other hand, when the dimensionality of the data is close to the number of non-zero components, another metastable state emerges, and our algorithm fails to find the planted solution. This phenomenon is associated with a first-order phase transition. In the case of very strong noise, it is no longer meaningful to search for the planted solution. In this situation, our algorithm determines a solution with close-to-minimum distortion fairly quickly.
Development of a Real Time Sparse Non-Negative Matrix Factorization Module for Cochlear Implants by Using xPC Target

PubMed Central

Hu, Hongmei; Krasoulis, Agamemnon; Lutman, Mark; Bleeck, Stefan

2013-01-01

Cochlear implants (CIS) require efficient speech processing to maximize information transmission to the brain, especially in noise. A novel CI processing strategy was proposed in our previous studies, in which sparsity-constrained non-negative matrix factorization (NMF) was applied to the envelope matrix in order to improve the CI performance in noisy environments. It showed that the algorithm needs to be adaptive, rather than fixed, in order to adjust to acoustical conditions and individual characteristics. Here, we explore the benefit of a system that allows the user to adjust the signal processing in real time according to their individual listening needs and their individual hearing capabilities. In this system, which is based on MATLAB®, SIMULINK® and the xPC Target™ environment, the input/outupt (I/O) boards are interfaced between the SIMULINK blocks and the CI stimulation system, such that the output can be controlled successfully in the manner of a hardware-in-the-loop (HIL) simulation, hence offering a convenient way to implement a real time signal processing module that does not require any low level language. The sparsity constrained parameter of the algorithm was adapted online subjectively during an experiment with normal-hearing subjects and noise vocoded speech simulation. Results show that subjects chose different parameter values according to their own intelligibility preferences, indicating that adaptive real time algorithms are beneficial to fully explore subjective preferences. We conclude that the adaptive real time systems are beneficial for the experimental design, and such systems allow one to conduct psychophysical experiments with high ecological validity. PMID:24129021
Development of a real time sparse non-negative matrix factorization module for cochlear implants by using xPC target.

PubMed

Hu, Hongmei; Krasoulis, Agamemnon; Lutman, Mark; Bleeck, Stefan

2013-10-14

Cochlear implants (CIs) require efficient speech processing to maximize information transmission to the brain, especially in noise. A novel CI processing strategy was proposed in our previous studies, in which sparsity-constrained non-negative matrix factorization (NMF) was applied to the envelope matrix in order to improve the CI performance in noisy environments. It showed that the algorithm needs to be adaptive, rather than fixed, in order to adjust to acoustical conditions and individual characteristics. Here, we explore the benefit of a system that allows the user to adjust the signal processing in real time according to their individual listening needs and their individual hearing capabilities. In this system, which is based on MATLAB®, SIMULINK® and the xPC Target™ environment, the input/outupt (I/O) boards are interfaced between the SIMULINK blocks and the CI stimulation system, such that the output can be controlled successfully in the manner of a hardware-in-the-loop (HIL) simulation, hence offering a convenient way to implement a real time signal processing module that does not require any low level language. The sparsity constrained parameter of the algorithm was adapted online subjectively during an experiment with normal-hearing subjects and noise vocoded speech simulation. Results show that subjects chose different parameter values according to their own intelligibility preferences, indicating that adaptive real time algorithms are beneficial to fully explore subjective preferences. We conclude that the adaptive real time systems are beneficial for the experimental design, and such systems allow one to conduct psychophysical experiments with high ecological validity.
Response of selected binomial coefficients to varying degrees of matrix sparseness and to matrices with known data interrelationships

USGS Publications Warehouse

Archer, A.W.; Maples, C.G.

1989-01-01

Numerous departures from ideal relationships are revealed by Monte Carlo simulations of widely accepted binomial coefficients. For example, simulations incorporating varying levels of matrix sparseness (presence of zeros indicating lack of data) and computation of expected values reveal that not only are all common coefficients influenced by zero data, but also that some coefficients do not discriminate between sparse or dense matrices (few zero data). Such coefficients computationally merge mutually shared and mutually absent information and do not exploit all the information incorporated within the standard 2 ?? 2 contingency table; therefore, the commonly used formulae for such coefficients are more complicated than the actual range of values produced. Other coefficients do differentiate between mutual presences and absences; however, a number of these coefficients do not demonstrate a linear relationship to matrix sparseness. Finally, simulations using nonrandom matrices with known degrees of row-by-row similarities signify that several coefficients either do not display a reasonable range of values or are nonlinear with respect to known relationships within the data. Analyses with nonrandom matrices yield clues as to the utility of certain coefficients for specific applications. For example, coefficients such as Jaccard, Dice, and Baroni-Urbani and Buser are useful if correction of sparseness is desired, whereas the Russell-Rao coefficient is useful when sparseness correction is not desired. ?? 1989 International Association for Mathematical Geology.
An efficient dictionary learning algorithm and its application to 3-D medical image denoising.

PubMed

Li, Shutao; Fang, Leyuan; Yin, Haitao

2012-02-01

In this paper, we propose an efficient dictionary learning algorithm for sparse representation of given data and suggest a way to apply this algorithm to 3-D medical image denoising. Our learning approach is composed of two main parts: sparse coding and dictionary updating. On the sparse coding stage, an efficient algorithm named multiple clusters pursuit (MCP) is proposed. The MCP first applies a dictionary structuring strategy to cluster the atoms with high coherence together, and then employs a multiple-selection strategy to select several competitive atoms at each iteration. These two strategies can greatly reduce the computation complexity of the MCP and assist it to obtain better sparse solution. On the dictionary updating stage, the alternating optimization that efficiently approximates the singular value decomposition is introduced. Furthermore, in the 3-D medical image denoising application, a joint 3-D operation is proposed for taking the learning capabilities of the presented algorithm to simultaneously capture the correlations within each slice and correlations across the nearby slices, thereby obtaining better denoising results. The experiments on both synthetically generated data and real 3-D medical images demonstrate that the proposed approach has superior performance compared to some well-known methods. © 2011 IEEE
DOE Office of Scientific and Technical Information (OSTI.GOV)

Carpenter, J.A.

This report is a sequel to ORNL/CSD-106 in the ongoing supplements to Professor A.S. Householder's KWIC Index for Numerical Algebra. Beginning with the previous supplement, the subject has been restricted to Numerical Linear Algebra, roughly characterized by the American Mathematical Society's classification sections 15 and 65F but with little coverage of infinite matrices, matrices over fields of characteristics other than zero, operator theory, optimization and those parts of matrix theory primarily combinatorial in nature. Some consideration is given to the uses of graph theory in Numerical Linear Algebra, particularly with respect to algorithms for sparse matrix computations. The period coveredmore » by this report is roughly the calendar year 1982 as measured by the appearance of the articles in the American Mathematical Society's Contents of Mathematical Publications lagging actual appearance dates by up to nearly half a year. The review citations are limited to the Mathematical Reviews (MR).« less
Robust and sparse correlation matrix estimation for the analysis of high-dimensional genomics data.

PubMed

Serra, Angela; Coretto, Pietro; Fratello, Michele; Tagliaferri, Roberto; Stegle, Oliver

2018-02-15

Microarray technology can be used to study the expression of thousands of genes across a number of different experimental conditions, usually hundreds. The underlying principle is that genes sharing similar expression patterns, across different samples, can be part of the same co-expression system, or they may share the same biological functions. Groups of genes are usually identified based on cluster analysis. Clustering methods rely on the similarity matrix between genes. A common choice to measure similarity is to compute the sample correlation matrix. Dimensionality reduction is another popular data analysis task which is also based on covariance/correlation matrix estimates. Unfortunately, covariance/correlation matrix estimation suffers from the intrinsic noise present in high-dimensional data. Sources of noise are: sampling variations, presents of outlying sample units, and the fact that in most cases the number of units is much larger than the number of genes. In this paper, we propose a robust correlation matrix estimator that is regularized based on adaptive thresholding. The resulting method jointly tames the effects of the high-dimensionality, and data contamination. Computations are easy to implement and do not require hand tunings. Both simulated and real data are analyzed. A Monte Carlo experiment shows that the proposed method is capable of remarkable performances. Our correlation metric is more robust to outliers compared with the existing alternatives in two gene expression datasets. It is also shown how the regularization allows to automatically detect and filter spurious correlations. The same regularization is also extended to other less robust correlation measures. Finally, we apply the ARACNE algorithm on the SyNTreN gene expression data. Sensitivity and specificity of the reconstructed network is compared with the gold standard. We show that ARACNE performs better when it takes the proposed correlation matrix estimator as input. The R software is available at https://github.com/angy89/RobustSparseCorrelation. aserra@unisa.it or robtag@unisa.it. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
ITrace: An implicit trust inference method for trust-aware collaborative filtering

NASA Astrophysics Data System (ADS)

He, Xu; Liu, Bin; Chen, Kejia

2018-04-01

The growth of Internet commerce has stimulated the use of collaborative filtering (CF) algorithms as recommender systems. A CF algorithm recommends items of interest to the target user by leveraging the votes given by other similar users. In a standard CF framework, it is assumed that the credibility of every voting user is exactly the same with respect to the target user. This assumption is not satisfied and thus may lead to misleading recommendations in many practical applications. A natural countermeasure is to design a trust-aware CF (TaCF) algorithm, which can take account of the difference in the credibilities of the voting users when performing CF. To this end, this paper presents a trust inference approach, which can predict the implicit trust of the target user on every voting user from a sparse explicit trust matrix. Then an improved CF algorithm termed iTrace is proposed, which takes advantage of both the explicit and the predicted implicit trust to provide recommendations with the CF framework. An empirical evaluation on a public dataset demonstrates that the proposed algorithm provides a significant improvement in recommendation quality in terms of mean absolute error.
THz spectral data analysis and components unmixing based on non-negative matrix factorization methods

NASA Astrophysics Data System (ADS)

Ma, Yehao; Li, Xian; Huang, Pingjie; Hou, Dibo; Wang, Qiang; Zhang, Guangxin

2017-04-01

In many situations the THz spectroscopic data observed from complex samples represent the integrated result of several interrelated variables or feature components acting together. The actual information contained in the original data might be overlapping and there is a necessity to investigate various approaches for model reduction and data unmixing. The development and use of low-rank approximate nonnegative matrix factorization (NMF) and smooth constraint NMF (CNMF) algorithms for feature components extraction and identification in the fields of terahertz time domain spectroscopy (THz-TDS) data analysis are presented. The evolution and convergence properties of NMF and CNMF methods based on sparseness, independence and smoothness constraints for the resulting nonnegative matrix factors are discussed. For general NMF, its cost function is nonconvex and the result is usually susceptible to initialization and noise corruption, and may fall into local minima and lead to unstable decomposition. To reduce these drawbacks, smoothness constraint is introduced to enhance the performance of NMF. The proposed algorithms are evaluated by several THz-TDS data decomposition experiments including a binary system and a ternary system simulating some applications such as medicine tablet inspection. Results show that CNMF is more capable of finding optimal solutions and more robust for random initialization in contrast to NMF. The investigated method is promising for THz data resolution contributing to unknown mixture identification.
THz spectral data analysis and components unmixing based on non-negative matrix factorization methods.

PubMed

Ma, Yehao; Li, Xian; Huang, Pingjie; Hou, Dibo; Wang, Qiang; Zhang, Guangxin

2017-04-15

In many situations the THz spectroscopic data observed from complex samples represent the integrated result of several interrelated variables or feature components acting together. The actual information contained in the original data might be overlapping and there is a necessity to investigate various approaches for model reduction and data unmixing. The development and use of low-rank approximate nonnegative matrix factorization (NMF) and smooth constraint NMF (CNMF) algorithms for feature components extraction and identification in the fields of terahertz time domain spectroscopy (THz-TDS) data analysis are presented. The evolution and convergence properties of NMF and CNMF methods based on sparseness, independence and smoothness constraints for the resulting nonnegative matrix factors are discussed. For general NMF, its cost function is nonconvex and the result is usually susceptible to initialization and noise corruption, and may fall into local minima and lead to unstable decomposition. To reduce these drawbacks, smoothness constraint is introduced to enhance the performance of NMF. The proposed algorithms are evaluated by several THz-TDS data decomposition experiments including a binary system and a ternary system simulating some applications such as medicine tablet inspection. Results show that CNMF is more capable of finding optimal solutions and more robust for random initialization in contrast to NMF. The investigated method is promising for THz data resolution contributing to unknown mixture identification. Copyright © 2017 Elsevier B.V. All rights reserved.
A new method for computation of eigenvector derivatives with distinct and repeated eigenvalues in structural dynamic analysis

NASA Astrophysics Data System (ADS)

Li, Zhengguang; Lai, Siu-Kai; Wu, Baisheng

2018-07-01

Determining eigenvector derivatives is a challenging task due to the singularity of the coefficient matrices of the governing equations, especially for those structural dynamic systems with repeated eigenvalues. An effective strategy is proposed to construct a non-singular coefficient matrix, which can be directly used to obtain the eigenvector derivatives with distinct and repeated eigenvalues. This approach also has an advantage that only requires eigenvalues and eigenvectors of interest, without solving the particular solutions of eigenvector derivatives. The Symmetric Quasi-Minimal Residual (SQMR) method is then adopted to solve the governing equations, only the existing factored (shifted) stiffness matrix from an iterative eigensolution such as the subspace iteration method or the Lanczos algorithm is utilized. The present method can deal with both cases of simple and repeated eigenvalues in a unified manner. Three numerical examples are given to illustrate the accuracy and validity of the proposed algorithm. Highly accurate approximations to the eigenvector derivatives are obtained within a few iteration steps, making a significant reduction of the computational effort. This method can be incorporated into a coupled eigensolver/derivative software module. In particular, it is applicable for finite element models with large sparse matrices.

Inverse problems with nonnegative and sparse solutions: algorithms and application to the phase retrieval problem

NASA Astrophysics Data System (ADS)

Quy Muoi, Pham; Nho Hào, Dinh; Sahoo, Sujit Kumar; Tang, Dongliang; Cong, Nguyen Huu; Dang, Cuong

2018-05-01

In this paper, we study a gradient-type method and a semismooth Newton method for minimization problems in regularizing inverse problems with nonnegative and sparse solutions. We propose a special penalty functional forcing the minimizers of regularized minimization problems to be nonnegative and sparse, and then we apply the proposed algorithms in a practical the problem. The strong convergence of the gradient-type method and the local superlinear convergence of the semismooth Newton method are proven. Then, we use these algorithms for the phase retrieval problem and illustrate their efficiency in numerical examples, particularly in the practical problem of optical imaging through scattering media where all the noises from experiment are presented.
Sparse principal component analysis in medical shape modeling

NASA Astrophysics Data System (ADS)

Sjöstrand, Karl; Stegmann, Mikkel B.; Larsen, Rasmus

2006-03-01

Principal component analysis (PCA) is a widely used tool in medical image analysis for data reduction, model building, and data understanding and exploration. While PCA is a holistic approach where each new variable is a linear combination of all original variables, sparse PCA (SPCA) aims at producing easily interpreted models through sparse loadings, i.e. each new variable is a linear combination of a subset of the original variables. One of the aims of using SPCA is the possible separation of the results into isolated and easily identifiable effects. This article introduces SPCA for shape analysis in medicine. Results for three different data sets are given in relation to standard PCA and sparse PCA by simple thresholding of small loadings. Focus is on a recent algorithm for computing sparse principal components, but a review of other approaches is supplied as well. The SPCA algorithm has been implemented using Matlab and is available for download. The general behavior of the algorithm is investigated, and strengths and weaknesses are discussed. The original report on the SPCA algorithm argues that the ordering of modes is not an issue. We disagree on this point and propose several approaches to establish sensible orderings. A method that orders modes by decreasing variance and maximizes the sum of variances for all modes is presented and investigated in detail.
Salient Object Detection via Structured Matrix Decomposition.

PubMed

Peng, Houwen; Li, Bing; Ling, Haibin; Hu, Weiming; Xiong, Weihua; Maybank, Stephen J

2016-05-04

Low-rank recovery models have shown potential for salient object detection, where a matrix is decomposed into a low-rank matrix representing image background and a sparse matrix identifying salient objects. Two deficiencies, however, still exist. First, previous work typically assumes the elements in the sparse matrix are mutually independent, ignoring the spatial and pattern relations of image regions. Second, when the low-rank and sparse matrices are relatively coherent, e.g., when there are similarities between the salient objects and background or when the background is complicated, it is difficult for previous models to disentangle them. To address these problems, we propose a novel structured matrix decomposition model with two structural regularizations: (1) a tree-structured sparsity-inducing regularization that captures the image structure and enforces patches from the same object to have similar saliency values, and (2) a Laplacian regularization that enlarges the gaps between salient objects and the background in feature space. Furthermore, high-level priors are integrated to guide the matrix decomposition and boost the detection. We evaluate our model for salient object detection on five challenging datasets including single object, multiple objects and complex scene images, and show competitive results as compared with 24 state-of-the-art methods in terms of seven performance metrics.
Hybrid sparse blind deconvolution: an implementation of SOOT algorithm to real data

NASA Astrophysics Data System (ADS)

Pakmanesh, Parvaneh; Goudarzi, Alireza; Kourki, Meisam

2018-06-01

Getting information of seismic data depends on deconvolution as an important processing step; it provides the reflectivity series by signal compression. This compression can be obtained by removing the wavelet effects on the traces. The recently blind deconvolution has provided reliable performance for sparse signal recovery. In this study, two deconvolution methods have been implemented to the seismic data; the convolution of these methods provides a robust spiking deconvolution approach. This hybrid deconvolution is applied using the sparse deconvolution (MM algorithm) and the Smoothed-One-Over-Two algorithm (SOOT) in a chain. The MM algorithm is based on the minimization of the cost function defined by standards l1 and l2. After applying the two algorithms to the seismic data, the SOOT algorithm provided well-compressed data with a higher resolution than the MM algorithm. The SOOT algorithm requires initial values to be applied for real data, such as the wavelet coefficients and reflectivity series that can be achieved through the MM algorithm. The computational cost of the hybrid method is high, and it is necessary to be implemented on post-stack or pre-stack seismic data of complex structure regions.
Discrete Sparse Coding.

PubMed

Exarchakis, Georgios; Lücke, Jörg

2017-11-01

Sparse coding algorithms with continuous latent variables have been the subject of a large number of studies. However, discrete latent spaces for sparse coding have been largely ignored. In this work, we study sparse coding with latents described by discrete instead of continuous prior distributions. We consider the general case in which the latents (while being sparse) can take on any value of a finite set of possible values and in which we learn the prior probability of any value from data. This approach can be applied to any data generated by discrete causes, and it can be applied as an approximation of continuous causes. As the prior probabilities are learned, the approach then allows for estimating the prior shape without assuming specific functional forms. To efficiently train the parameters of our probabilistic generative model, we apply a truncated expectation-maximization approach (expectation truncation) that we modify to work with a general discrete prior. We evaluate the performance of the algorithm by applying it to a variety of tasks: (1) we use artificial data to verify that the algorithm can recover the generating parameters from a random initialization, (2) use image patches of natural images and discuss the role of the prior for the extraction of image components, (3) use extracellular recordings of neurons to present a novel method of analysis for spiking neurons that includes an intuitive discretization strategy, and (4) apply the algorithm on the task of encoding audio waveforms of human speech. The diverse set of numerical experiments presented in this letter suggests that discrete sparse coding algorithms can scale efficiently to work with realistic data sets and provide novel statistical quantities to describe the structure of the data.
Compressive sensing for sparse time-frequency representation of nonstationary signals in the presence of impulsive noise

NASA Astrophysics Data System (ADS)

Orović, Irena; Stanković, Srdjan; Amin, Moeness

2013-05-01

A modified robust two-dimensional compressive sensing algorithm for reconstruction of sparse time-frequency representation (TFR) is proposed. The ambiguity function domain is assumed to be the domain of observations. The two-dimensional Fourier bases are used to linearly relate the observations to the sparse TFR, in lieu of the Wigner distribution. We assume that a set of available samples in the ambiguity domain is heavily corrupted by an impulsive type of noise. Consequently, the problem of sparse TFR reconstruction cannot be tackled using standard compressive sensing optimization algorithms. We introduce a two-dimensional L-statistics based modification into the transform domain representation. It provides suitable initial conditions that will produce efficient convergence of the reconstruction algorithm. This approach applies sorting and weighting operations to discard an expected amount of samples corrupted by noise. The remaining samples serve as observations used in sparse reconstruction of the time-frequency signal representation. The efficiency of the proposed approach is demonstrated on numerical examples that comprise both cases of monocomponent and multicomponent signals.
Layout optimization with algebraic multigrid methods

NASA Technical Reports Server (NTRS)

Regler, Hans; Ruede, Ulrich

1993-01-01

Finding the optimal position for the individual cells (also called functional modules) on the chip surface is an important and difficult step in the design of integrated circuits. This paper deals with the problem of relative placement, that is the minimization of a quadratic functional with a large, sparse, positive definite system matrix. The basic optimization problem must be augmented by constraints to inhibit solutions where cells overlap. Besides classical iterative methods, based on conjugate gradients (CG), we show that algebraic multigrid methods (AMG) provide an interesting alternative. For moderately sized examples with about 10000 cells, AMG is already competitive with CG and is expected to be superior for larger problems. Besides the classical 'multiplicative' AMG algorithm where the levels are visited sequentially, we propose an 'additive' variant of AMG where levels may be treated in parallel and that is suitable as a preconditioner in the CG algorithm.
Hierarchical Feature Extraction With Local Neural Response for Image Recognition.

PubMed

Li, Hong; Wei, Yantao; Li, Luoqing; Chen, C L P

2013-04-01

In this paper, a hierarchical feature extraction method is proposed for image recognition. The key idea of the proposed method is to extract an effective feature, called local neural response (LNR), of the input image with nontrivial discrimination and invariance properties by alternating between local coding and maximum pooling operation. The local coding, which is carried out on the locally linear manifold, can extract the salient feature of image patches and leads to a sparse measure matrix on which maximum pooling is carried out. The maximum pooling operation builds the translation invariance into the model. We also show that other invariant properties, such as rotation and scaling, can be induced by the proposed model. In addition, a template selection algorithm is presented to reduce computational complexity and to improve the discrimination ability of the LNR. Experimental results show that our method is robust to local distortion and clutter compared with state-of-the-art algorithms.
Site-percolation threshold of carbon nanotube fibers-Fast inspection of percolation with Markov stochastic theory

NASA Astrophysics Data System (ADS)

Xu, Fangbo; Xu, Zhiping; Yakobson, Boris I.

2014-08-01

We present a site-percolation model based on a modified FCC lattice, as well as an efficient algorithm of inspecting percolation which takes advantage of the Markov stochastic theory, in order to study the percolation threshold of carbon nanotube (CNT) fibers. Our Markov-chain based algorithm carries out the inspection of percolation by performing repeated sparse matrix-vector multiplications, which allows parallelized computation to accelerate the inspection for a given configuration. With this approach, we determine that the site-percolation transition of CNT fibers occurs at pc=0.1533±0.0013, and analyze the dependence of the effective percolation threshold (corresponding to 0.5 percolation probability) on the length and the aspect ratio of a CNT fiber on a finite-size-scaling basis. We also discuss the aspect ratio dependence of percolation probability with various values of p (not restricted to pc).
Visual Tracking Based on Extreme Learning Machine and Sparse Representation

PubMed Central

Wang, Baoxian; Tang, Linbo; Yang, Jinglin; Zhao, Baojun; Wang, Shuigen

2015-01-01

The existing sparse representation-based visual trackers mostly suffer from both being time consuming and having poor robustness problems. To address these issues, a novel tracking method is presented via combining sparse representation and an emerging learning technique, namely extreme learning machine (ELM). Specifically, visual tracking can be divided into two consecutive processes. Firstly, ELM is utilized to find the optimal separate hyperplane between the target observations and background ones. Thus, the trained ELM classification function is able to remove most of the candidate samples related to background contents efficiently, thereby reducing the total computational cost of the following sparse representation. Secondly, to further combine ELM and sparse representation, the resultant confidence values (i.e., probabilities to be a target) of samples on the ELM classification function are used to construct a new manifold learning constraint term of the sparse representation framework, which tends to achieve robuster results. Moreover, the accelerated proximal gradient method is used for deriving the optimal solution (in matrix form) of the constrained sparse tracking model. Additionally, the matrix form solution allows the candidate samples to be calculated in parallel, thereby leading to a higher efficiency. Experiments demonstrate the effectiveness of the proposed tracker. PMID:26506359
A sparse differential clustering algorithm for tracing cell type changes via single-cell RNA-sequencing data

PubMed Central

Barron, Martin; Zhang, Siyuan

2018-01-01

Abstract Cell types in cell populations change as the condition changes: some cell types die out, new cell types may emerge and surviving cell types evolve to adapt to the new condition. Using single-cell RNA-sequencing data that measure the gene expression of cells before and after the condition change, we propose an algorithm, SparseDC, which identifies cell types, traces their changes across conditions and identifies genes which are marker genes for these changes. By solving a unified optimization problem, SparseDC completes all three tasks simultaneously. SparseDC is highly computationally efficient and demonstrates its accuracy on both simulated and real data. PMID:29140455
Robust Principal Component Analysis Regularized by Truncated Nuclear Norm for Identifying Differentially Expressed Genes.

PubMed

Wang, Ya-Xuan; Gao, Ying-Lian; Liu, Jin-Xing; Kong, Xiang-Zhen; Li, Hai-Jun

2017-09-01

Identifying differentially expressed genes from the thousands of genes is a challenging task. Robust principal component analysis (RPCA) is an efficient method in the identification of differentially expressed genes. RPCA method uses nuclear norm to approximate the rank function. However, theoretical studies showed that the nuclear norm minimizes all singular values, so it may not be the best solution to approximate the rank function. The truncated nuclear norm is defined as the sum of some smaller singular values, which may achieve a better approximation of the rank function than nuclear norm. In this paper, a novel method is proposed by replacing nuclear norm of RPCA with the truncated nuclear norm, which is named robust principal component analysis regularized by truncated nuclear norm (TRPCA). The method decomposes the observation matrix of genomic data into a low-rank matrix and a sparse matrix. Because the significant genes can be considered as sparse signals, the differentially expressed genes are viewed as the sparse perturbation signals. Thus, the differentially expressed genes can be identified according to the sparse matrix. The experimental results on The Cancer Genome Atlas data illustrate that the TRPCA method outperforms other state-of-the-art methods in the identification of differentially expressed genes.
Estimation of white matter fiber parameters from compressed multiresolution diffusion MRI using sparse Bayesian learning.

PubMed

Pisharady, Pramod Kumar; Sotiropoulos, Stamatios N; Duarte-Carvajalino, Julio M; Sapiro, Guillermo; Lenglet, Christophe

2018-02-15

We present a sparse Bayesian unmixing algorithm BusineX: Bayesian Unmixing for Sparse Inference-based Estimation of Fiber Crossings (X), for estimation of white matter fiber parameters from compressed (under-sampled) diffusion MRI (dMRI) data. BusineX combines compressive sensing with linear unmixing and introduces sparsity to the previously proposed multiresolution data fusion algorithm RubiX, resulting in a method for improved reconstruction, especially from data with lower number of diffusion gradients. We formulate the estimation of fiber parameters as a sparse signal recovery problem and propose a linear unmixing framework with sparse Bayesian learning for the recovery of sparse signals, the fiber orientations and volume fractions. The data is modeled using a parametric spherical deconvolution approach and represented using a dictionary created with the exponential decay components along different possible diffusion directions. Volume fractions of fibers along these directions define the dictionary weights. The proposed sparse inference, which is based on the dictionary representation, considers the sparsity of fiber populations and exploits the spatial redundancy in data representation, thereby facilitating inference from under-sampled q-space. The algorithm improves parameter estimation from dMRI through data-dependent local learning of hyperparameters, at each voxel and for each possible fiber orientation, that moderate the strength of priors governing the parameter variances. Experimental results on synthetic and in-vivo data show improved accuracy with a lower uncertainty in fiber parameter estimates. BusineX resolves a higher number of second and third fiber crossings. For under-sampled data, the algorithm is also shown to produce more reliable estimates. Copyright © 2017 Elsevier Inc. All rights reserved.
HYPOTHESIS TESTING FOR HIGH-DIMENSIONAL SPARSE BINARY REGRESSION

PubMed Central

Mukherjee, Rajarshi; Pillai, Natesh S.; Lin, Xihong

2015-01-01

In this paper, we study the detection boundary for minimax hypothesis testing in the context of high-dimensional, sparse binary regression models. Motivated by genetic sequencing association studies for rare variant effects, we investigate the complexity of the hypothesis testing problem when the design matrix is sparse. We observe a new phenomenon in the behavior of detection boundary which does not occur in the case of Gaussian linear regression. We derive the detection boundary as a function of two components: a design matrix sparsity index and signal strength, each of which is a function of the sparsity of the alternative. For any alternative, if the design matrix sparsity index is too high, any test is asymptotically powerless irrespective of the magnitude of signal strength. For binary design matrices with the sparsity index that is not too high, our results are parallel to those in the Gaussian case. In this context, we derive detection boundaries for both dense and sparse regimes. For the dense regime, we show that the generalized likelihood ratio is rate optimal; for the sparse regime, we propose an extended Higher Criticism Test and show it is rate optimal and sharp. We illustrate the finite sample properties of the theoretical results using simulation studies. PMID:26246645
Sparse Learning with Stochastic Composite Optimization.

PubMed

Zhang, Weizhong; Zhang, Lijun; Jin, Zhongming; Jin, Rong; Cai, Deng; Li, Xuelong; Liang, Ronghua; He, Xiaofei

2017-06-01

In this paper, we study Stochastic Composite Optimization (SCO) for sparse learning that aims to learn a sparse solution from a composite function. Most of the recent SCO algorithms have already reached the optimal expected convergence rate O(1/λT), but they often fail to deliver sparse solutions at the end either due to the limited sparsity regularization during stochastic optimization (SO) or due to the limitation in online-to-batch conversion. Even when the objective function is strongly convex, their high probability bounds can only attain O(√{log(1/δ)/T}) with δ is the failure probability, which is much worse than the expected convergence rate. To address these limitations, we propose a simple yet effective two-phase Stochastic Composite Optimization scheme by adding a novel powerful sparse online-to-batch conversion to the general Stochastic Optimization algorithms. We further develop three concrete algorithms, OptimalSL, LastSL and AverageSL, directly under our scheme to prove the effectiveness of the proposed scheme. Both the theoretical analysis and the experiment results show that our methods can really outperform the existing methods at the ability of sparse learning and at the meantime we can improve the high probability bound to approximately O(log(log(T)/δ)/λT).
Strategies for vectorizing the sparse matrix vector product on the CRAY XMP, CRAY 2, and CYBER 205

NASA Technical Reports Server (NTRS)

Bauschlicher, Charles W., Jr.; Partridge, Harry

1987-01-01

Large, randomly sparse matrix vector products are important in a number of applications in computational chemistry, such as matrix diagonalization and the solution of simultaneous equations. Vectorization of this process is considered for the CRAY XMP, CRAY 2, and CYBER 205, using a matrix of dimension of 20,000 with from 1 percent to 6 percent nonzeros. Efficient scatter/gather capabilities add coding flexibility and yield significant improvements in performance. For the CYBER 205, it is shown that minor changes in the IO can reduce the CPU time by a factor of 50. Similar changes in the CRAY codes make a far smaller improvement.
Linear-scaling density-functional simulations of charged point defects in Al2O3 using hierarchical sparse matrix algebra.

PubMed

Hine, N D M; Haynes, P D; Mostofi, A A; Payne, M C

2010-09-21

We present calculations of formation energies of defects in an ionic solid (Al(2)O(3)) extrapolated to the dilute limit, corresponding to a simulation cell of infinite size. The large-scale calculations required for this extrapolation are enabled by developments in the approach to parallel sparse matrix algebra operations, which are central to linear-scaling density-functional theory calculations. The computational cost of manipulating sparse matrices, whose sizes are determined by the large number of basis functions present, is greatly improved with this new approach. We present details of the sparse algebra scheme implemented in the ONETEP code using hierarchical sparsity patterns, and demonstrate its use in calculations on a wide range of systems, involving thousands of atoms on hundreds to thousands of parallel processes.
Characterizing and differentiating task-based and resting state fMRI signals via two-stage sparse representations.

PubMed

Zhang, Shu; Li, Xiang; Lv, Jinglei; Jiang, Xi; Guo, Lei; Liu, Tianming

2016-03-01

A relatively underexplored question in fMRI is whether there are intrinsic differences in terms of signal composition patterns that can effectively characterize and differentiate task-based or resting state fMRI (tfMRI or rsfMRI) signals. In this paper, we propose a novel two-stage sparse representation framework to examine the fundamental difference between tfMRI and rsfMRI signals. Specifically, in the first stage, the whole-brain tfMRI or rsfMRI signals of each subject were composed into a big data matrix, which was then factorized into a subject-specific dictionary matrix and a weight coefficient matrix for sparse representation. In the second stage, all of the dictionary matrices from both tfMRI/rsfMRI data across multiple subjects were composed into another big data-matrix, which was further sparsely represented by a cross-subjects common dictionary and a weight matrix. This framework has been applied on the recently publicly released Human Connectome Project (HCP) fMRI data and experimental results revealed that there are distinctive and descriptive atoms in the cross-subjects common dictionary that can effectively characterize and differentiate tfMRI and rsfMRI signals, achieving 100% classification accuracy. Moreover, our methods and results can be meaningfully interpreted, e.g., the well-known default mode network (DMN) activities can be recovered from the very noisy and heterogeneous aggregated big-data of tfMRI and rsfMRI signals across all subjects in HCP Q1 release.
RZA-NLMF algorithm-based adaptive sparse sensing for realizing compressive sensing

NASA Astrophysics Data System (ADS)

Gui, Guan; Xu, Li; Adachi, Fumiyuki

2014-12-01

Nonlinear sparse sensing (NSS) techniques have been adopted for realizing compressive sensing in many applications such as radar imaging. Unlike the NSS, in this paper, we propose an adaptive sparse sensing (ASS) approach using the reweighted zero-attracting normalized least mean fourth (RZA-NLMF) algorithm which depends on several given parameters, i.e., reweighted factor, regularization parameter, and initial step size. First, based on the independent assumption, Cramer-Rao lower bound (CRLB) is derived as for the performance comparisons. In addition, reweighted factor selection method is proposed for achieving robust estimation performance. Finally, to verify the algorithm, Monte Carlo-based computer simulations are given to show that the ASS achieves much better mean square error (MSE) performance than the NSS.
A new Mumford-Shah total variation minimization based model for sparse-view x-ray computed tomography image reconstruction.

PubMed

Chen, Bo; Bian, Zhaoying; Zhou, Xiaohui; Chen, Wensheng; Ma, Jianhua; Liang, Zhengrong

2018-04-12

Total variation (TV) minimization for the sparse-view x-ray computer tomography (CT) reconstruction has been widely explored to reduce radiation dose. However, due to the piecewise constant assumption for the TV model, the reconstructed images often suffer from over-smoothness on the image edges. To mitigate this drawback of TV minimization, we present a Mumford-Shah total variation (MSTV) minimization algorithm in this paper. The presented MSTV model is derived by integrating TV minimization and Mumford-Shah segmentation. Subsequently, a penalized weighted least-squares (PWLS) scheme with MSTV is developed for the sparse-view CT reconstruction. For simplicity, the proposed algorithm is named as 'PWLS-MSTV.' To evaluate the performance of the present PWLS-MSTV algorithm, both qualitative and quantitative studies were conducted by using a digital XCAT phantom and a physical phantom. Experimental results show that the present PWLS-MSTV algorithm has noticeable gains over the existing algorithms in terms of noise reduction, contrast-to-ratio measure and edge-preservation.

Distance descending ordering method: An O(n) algorithm for inverting the mass matrix in simulation of macromolecules with long branches

NASA Astrophysics Data System (ADS)

Xu, Xiankun; Li, Peiwen

2017-11-01

Fixman's work in 1974 and the follow-up studies have developed a method that can factorize the inverse of mass matrix into an arithmetic combination of three sparse matrices-one of them is positive definite and needs to be further factorized by using the Cholesky decomposition or similar methods. When the molecule subjected to study is of serial chain structure, this method can achieve O (n) time complexity. However, for molecules with long branches, Cholesky decomposition about the corresponding positive definite matrix will introduce massive fill-in due to its nonzero structure. Although there are several methods can be used to reduce the number of fill-in, none of them could strictly guarantee for zero fill-in for all molecules according to our test, and thus cannot obtain O (n) time complexity by using these traditional methods. In this paper we present a new method that can guarantee for no fill-in in doing the Cholesky decomposition, which was developed based on the correlations between the mass matrix and the geometrical structure of molecules. As a result, the inverting of mass matrix will remain the O (n) time complexity, no matter the molecule structure has long branches or not.
Turbo-SMT: Accelerating Coupled Sparse Matrix-Tensor Factorizations by 200×

PubMed Central

Papalexakis, Evangelos E.; Faloutsos, Christos; Mitchell, Tom M.; Talukdar, Partha Pratim; Sidiropoulos, Nicholas D.; Murphy, Brian

2015-01-01

How can we correlate the neural activity in the human brain as it responds to typed words, with properties of these terms (like ‘edible’, ‘fits in hand’)? In short, we want to find latent variables, that jointly explain both the brain activity, as well as the behavioral responses. This is one of many settings of the Coupled Matrix-Tensor Factorization (CMTF) problem. Can we accelerate any CMTF solver, so that it runs within a few minutes instead of tens of hours to a day, while maintaining good accuracy? We introduce TURBO-SMT, a meta-method capable of doing exactly that: it boosts the performance of any CMTF algorithm, by up to 200×, along with an up to 65 fold increase in sparsity, with comparable accuracy to the baseline. We apply TURBO-SMT to BRAINQ, a dataset consisting of a (nouns, brain voxels, human subjects) tensor and a (nouns, properties) matrix, with coupling along the nouns dimension. TURBO-SMT is able to find meaningful latent variables, as well as to predict brain activity with competitive accuracy. PMID:26473087
Spectral Calculation of ICRF Wave Propagation and Heating in 2-D Using Massively Parallel Computers

NASA Astrophysics Data System (ADS)

Jaeger, E. F.; D'Azevedo, E.; Berry, L. A.; Carter, M. D.; Batchelor, D. B.

2000-10-01

Spectral calculations of ICRF wave propagation in plasmas have the natural advantage that they require no assumption regarding the smallness of the ion Larmor radius ρ relative to wavelength λ. Results are therefore applicable to all orders in k_bot ρ where k_bot = 2π/λ. But because all modes in the spectral representation are coupled, the solution requires inversion of a large dense matrix. In contrast, finite difference algorithms involve only matrices that are sparse and banded. Thus, spectral calculations of wave propagation and heating in tokamak plasmas have so far been limited to 1-D. In this paper, we extend the spectral method to 2-D by taking advantage of new matrix inversion techniques that utilize massively parallel computers. By spreading the dense matrix over 576 processors on the ORNL IBM RS/6000 SP supercomputer, we are able to solve up to 120,000 coupled complex equations requiring 230 GBytes of memory and achieving over 500 Gflops/sec. Initial results for ASDEX and NSTX will be presented using up to 200 modes in both the radial and vertical dimensions.
Dimension Reduction With Extreme Learning Machine.

PubMed

Kasun, Liyanaarachchi Lekamalage Chamara; Yang, Yan; Huang, Guang-Bin; Zhang, Zhengyou

2016-08-01

Data may often contain noise or irrelevant information, which negatively affect the generalization capability of machine learning algorithms. The objective of dimension reduction algorithms, such as principal component analysis (PCA), non-negative matrix factorization (NMF), random projection (RP), and auto-encoder (AE), is to reduce the noise or irrelevant information of the data. The features of PCA (eigenvectors) and linear AE are not able to represent data as parts (e.g. nose in a face image). On the other hand, NMF and non-linear AE are maimed by slow learning speed and RP only represents a subspace of original data. This paper introduces a dimension reduction framework which to some extend represents data as parts, has fast learning speed, and learns the between-class scatter subspace. To this end, this paper investigates a linear and non-linear dimension reduction framework referred to as extreme learning machine AE (ELM-AE) and sparse ELM-AE (SELM-AE). In contrast to tied weight AE, the hidden neurons in ELM-AE and SELM-AE need not be tuned, and their parameters (e.g, input weights in additive neurons) are initialized using orthogonal and sparse random weights, respectively. Experimental results on USPS handwritten digit recognition data set, CIFAR-10 object recognition, and NORB object recognition data set show the efficacy of linear and non-linear ELM-AE and SELM-AE in terms of discriminative capability, sparsity, training time, and normalized mean square error.
Efficient image enhancement using sparse source separation in the Retinex theory

NASA Astrophysics Data System (ADS)

Yoon, Jongsu; Choi, Jangwon; Choe, Yoonsik

2017-11-01

Color constancy is the feature of the human vision system (HVS) that ensures the relative constancy of the perceived color of objects under varying illumination conditions. The Retinex theory of machine vision systems is based on the HVS. Among Retinex algorithms, the physics-based algorithms are efficient; however, they generally do not satisfy the local characteristics of the original Retinex theory because they eliminate global illumination from their optimization. We apply the sparse source separation technique to the Retinex theory to present a physics-based algorithm that satisfies the locality characteristic of the original Retinex theory. Previous Retinex algorithms have limited use in image enhancement because the total variation Retinex results in an overly enhanced image and the sparse source separation Retinex cannot completely restore the original image. In contrast, our proposed method preserves the image edge and can very nearly replicate the original image without any special operation.
Sparse matrix methods research using the CSM testbed software system

NASA Technical Reports Server (NTRS)

Chu, Eleanor; George, J. Alan

1989-01-01

Research is described on sparse matrix techniques for the Computational Structural Mechanics (CSM) Testbed. The primary objective was to compare the performance of state-of-the-art techniques for solving sparse systems with those that are currently available in the CSM Testbed. Thus, one of the first tasks was to become familiar with the structure of the testbed, and to install some or all of the SPARSPAK package in the testbed. A suite of subroutines to extract from the data base the relevant structural and numerical information about the matrix equations was written, and all the demonstration problems distributed with the testbed were successfully solved. These codes were documented, and performance studies comparing the SPARSPAK technology to the methods currently in the testbed were completed. In addition, some preliminary studies were done comparing some recently developed out-of-core techniques with the performance of the testbed processor INV.
A tight and explicit representation of Q in sparse QR factorization

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ng, E.G.; Peyton, B.W.

1992-05-01

In QR factorization of a sparse m{times}n matrix A (m {ge} n) the orthogonal factor Q is often stored implicitly as a lower trapezoidal matrix H known as the Householder matrix. This paper presents a simple characterization of the row structure of Q, which could be used as the basis for a sparse data structure that can store Q explicitly. The new characterization is a simple extension of a well known row-oriented characterization of the structure of H. Hare, Johnson, Olesky, and van den Driessche have recently provided a complete sparsity analysis of the QR factorization. Let U be themore » matrix consisting of the first n columns of Q. Using results from, we show that the data structures for H and U resulting from our characterizations are tight when A is a strong Hall matrix. We also show that H and the lower trapezoidal part of U have the same sparsity characterization when A is strong Hall. We then show that this characterization can be extended to any weak Hall matrix that has been permuted into block upper triangular form. Finally, we show that permuting to block triangular form never increases the fill incurred during the factorization.« less
Doubly Nonparametric Sparse Nonnegative Matrix Factorization Based on Dependent Indian Buffet Processes.

PubMed

Xuan, Junyu; Lu, Jie; Zhang, Guangquan; Xu, Richard Yi Da; Luo, Xiangfeng

2018-05-01

Sparse nonnegative matrix factorization (SNMF) aims to factorize a data matrix into two optimized nonnegative sparse factor matrices, which could benefit many tasks, such as document-word co-clustering. However, the traditional SNMF typically assumes the number of latent factors (i.e., dimensionality of the factor matrices) to be fixed. This assumption makes it inflexible in practice. In this paper, we propose a doubly sparse nonparametric NMF framework to mitigate this issue by using dependent Indian buffet processes (dIBP). We apply a correlation function for the generation of two stick weights associated with each column pair of factor matrices while still maintaining their respective marginal distribution specified by IBP. As a consequence, the generation of two factor matrices will be columnwise correlated. Under this framework, two classes of correlation function are proposed: 1) using bivariate Beta distribution and 2) using Copula function. Compared with the single IBP-based NMF, this paper jointly makes two factor matrices nonparametric and sparse, which could be applied to broader scenarios, such as co-clustering. This paper is seen to be much more flexible than Gaussian process-based and hierarchial Beta process-based dIBPs in terms of allowing the two corresponding binary matrix columns to have greater variations in their nonzero entries. Our experiments on synthetic data show the merits of this paper compared with the state-of-the-art models in respect of factorization efficiency, sparsity, and flexibility. Experiments on real-world data sets demonstrate the efficiency of this paper in document-word co-clustering tasks.
Large Covariance Estimation by Thresholding Principal Orthogonal Complements

PubMed Central

Fan, Jianqing; Liao, Yuan; Mincheva, Martina

2012-01-01

This paper deals with the estimation of a high-dimensional covariance with a conditional sparsity structure and fast-diverging eigenvalues. By assuming sparse error covariance matrix in an approximate factor model, we allow for the presence of some cross-sectional correlation even after taking out common but unobservable factors. We introduce the Principal Orthogonal complEment Thresholding (POET) method to explore such an approximate factor structure with sparsity. The POET estimator includes the sample covariance matrix, the factor-based covariance matrix (Fan, Fan, and Lv, 2008), the thresholding estimator (Bickel and Levina, 2008) and the adaptive thresholding estimator (Cai and Liu, 2011) as specific examples. We provide mathematical insights when the factor analysis is approximately the same as the principal component analysis for high-dimensional data. The rates of convergence of the sparse residual covariance matrix and the conditional sparse covariance matrix are studied under various norms. It is shown that the impact of estimating the unknown factors vanishes as the dimensionality increases. The uniform rates of convergence for the unobserved factors and their factor loadings are derived. The asymptotic results are also verified by extensive simulation studies. Finally, a real data application on portfolio allocation is presented. PMID:24348088
Large Covariance Estimation by Thresholding Principal Orthogonal Complements.

PubMed

Fan, Jianqing; Liao, Yuan; Mincheva, Martina

2013-09-01

This paper deals with the estimation of a high-dimensional covariance with a conditional sparsity structure and fast-diverging eigenvalues. By assuming sparse error covariance matrix in an approximate factor model, we allow for the presence of some cross-sectional correlation even after taking out common but unobservable factors. We introduce the Principal Orthogonal complEment Thresholding (POET) method to explore such an approximate factor structure with sparsity. The POET estimator includes the sample covariance matrix, the factor-based covariance matrix (Fan, Fan, and Lv, 2008), the thresholding estimator (Bickel and Levina, 2008) and the adaptive thresholding estimator (Cai and Liu, 2011) as specific examples. We provide mathematical insights when the factor analysis is approximately the same as the principal component analysis for high-dimensional data. The rates of convergence of the sparse residual covariance matrix and the conditional sparse covariance matrix are studied under various norms. It is shown that the impact of estimating the unknown factors vanishes as the dimensionality increases. The uniform rates of convergence for the unobserved factors and their factor loadings are derived. The asymptotic results are also verified by extensive simulation studies. Finally, a real data application on portfolio allocation is presented.
Alternatively Constrained Dictionary Learning For Image Superresolution.

PubMed

Lu, Xiaoqiang; Yuan, Yuan; Yan, Pingkun

2014-03-01

Dictionaries are crucial in sparse coding-based algorithm for image superresolution. Sparse coding is a typical unsupervised learning method to study the relationship between the patches of high-and low-resolution images. However, most of the sparse coding methods for image superresolution fail to simultaneously consider the geometrical structure of the dictionary and the corresponding coefficients, which may result in noticeable superresolution reconstruction artifacts. In other words, when a low-resolution image and its corresponding high-resolution image are represented in their feature spaces, the two sets of dictionaries and the obtained coefficients have intrinsic links, which has not yet been well studied. Motivated by the development on nonlocal self-similarity and manifold learning, a novel sparse coding method is reported to preserve the geometrical structure of the dictionary and the sparse coefficients of the data. Moreover, the proposed method can preserve the incoherence of dictionary entries and provide the sparse coefficients and learned dictionary from a new perspective, which have both reconstruction and discrimination properties to enhance the learning performance. Furthermore, to utilize the model of the proposed method more effectively for single-image superresolution, this paper also proposes a novel dictionary-pair learning method, which is named as two-stage dictionary training. Extensive experiments are carried out on a large set of images comparing with other popular algorithms for the same purpose, and the results clearly demonstrate the effectiveness of the proposed sparse representation model and the corresponding dictionary learning algorithm.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Boman, Erik G.

This LDRD project was a campus exec fellowship to fund (in part) Donald Nguyen’s PhD research at UT-Austin. His work has focused on parallel programming models, and scheduling irregular algorithms on shared-memory systems using the Galois framework. Galois provides a simple but powerful way for users and applications to automatically obtain good parallel performance using certain supported data containers. The naïve user can write serial code, while advanced users can optimize performance by advanced features, such as specifying the scheduling policy. Galois was used to parallelize two sparse matrix reordering schemes: RCM and Sloan. Such reordering is important in high-performancemore » computing to obtain better data locality and thus reduce run times.« less
Medical image classification based on multi-scale non-negative sparse coding.

PubMed

Zhang, Ruijie; Shen, Jian; Wei, Fushan; Li, Xiong; Sangaiah, Arun Kumar

2017-11-01

With the rapid development of modern medical imaging technology, medical image classification has become more and more important in medical diagnosis and clinical practice. Conventional medical image classification algorithms usually neglect the semantic gap problem between low-level features and high-level image semantic, which will largely degrade the classification performance. To solve this problem, we propose a multi-scale non-negative sparse coding based medical image classification algorithm. Firstly, Medical images are decomposed into multiple scale layers, thus diverse visual details can be extracted from different scale layers. Secondly, for each scale layer, the non-negative sparse coding model with fisher discriminative analysis is constructed to obtain the discriminative sparse representation of medical images. Then, the obtained multi-scale non-negative sparse coding features are combined to form a multi-scale feature histogram as the final representation for a medical image. Finally, SVM classifier is combined to conduct medical image classification. The experimental results demonstrate that our proposed algorithm can effectively utilize multi-scale and contextual spatial information of medical images, reduce the semantic gap in a large degree and improve medical image classification performance. Copyright © 2017 Elsevier B.V. All rights reserved.
Interferometric synthetic aperture radar phase unwrapping based on sparse Markov random fields by graph cuts

NASA Astrophysics Data System (ADS)

Zhou, Lifan; Chai, Dengfeng; Xia, Yu; Ma, Peifeng; Lin, Hui

2018-01-01

Phase unwrapping (PU) is one of the key processes in reconstructing the digital elevation model of a scene from its interferometric synthetic aperture radar (InSAR) data. It is known that two-dimensional (2-D) PU problems can be formulated as maximum a posteriori estimation of Markov random fields (MRFs). However, considering that the traditional MRF algorithm is usually defined on a rectangular grid, it fails easily if large parts of the wrapped data are dominated by noise caused by large low-coherence area or rapid-topography variation. A PU solution based on sparse MRF is presented to extend the traditional MRF algorithm to deal with sparse data, which allows the unwrapping of InSAR data dominated by high phase noise. To speed up the graph cuts algorithm for sparse MRF, we designed dual elementary graphs and merged them to obtain the Delaunay triangle graph, which is used to minimize the energy function efficiently. The experiments on simulated and real data, compared with other existing algorithms, both confirm the effectiveness of the proposed MRF approach, which suffers less from decorrelation effects caused by large low-coherence area or rapid-topography variation.
Adaptive structured dictionary learning for image fusion based on group-sparse-representation

NASA Astrophysics Data System (ADS)

Yang, Jiajie; Sun, Bin; Luo, Chengwei; Wu, Yuzhong; Xu, Limei

2018-04-01

Dictionary learning is the key process of sparse representation which is one of the most widely used image representation theories in image fusion. The existing dictionary learning method does not use the group structure information and the sparse coefficients well. In this paper, we propose a new adaptive structured dictionary learning algorithm and a l1-norm maximum fusion rule that innovatively utilizes grouped sparse coefficients to merge the images. In the dictionary learning algorithm, we do not need prior knowledge about any group structure of the dictionary. By using the characteristics of the dictionary in expressing the signal, our algorithm can automatically find the desired potential structure information that hidden in the dictionary. The fusion rule takes the physical meaning of the group structure dictionary, and makes activity-level judgement on the structure information when the images are being merged. Therefore, the fused image can retain more significant information. Comparisons have been made with several state-of-the-art dictionary learning methods and fusion rules. The experimental results demonstrate that, the dictionary learning algorithm and the fusion rule both outperform others in terms of several objective evaluation metrics.
Sparse gammatone signal model optimized for English speech does not match the human auditory filters.

PubMed

Strahl, Stefan; Mertins, Alfred

2008-07-18

Evidence that neurosensory systems use sparse signal representations as well as improved performance of signal processing algorithms using sparse signal models raised interest in sparse signal coding in the last years. For natural audio signals like speech and environmental sounds, gammatone atoms have been derived as expansion functions that generate a nearly optimal sparse signal model (Smith, E., Lewicki, M., 2006. Efficient auditory coding. Nature 439, 978-982). Furthermore, gammatone functions are established models for the human auditory filters. Thus far, a practical application of a sparse gammatone signal model has been prevented by the fact that deriving the sparsest representation is, in general, computationally intractable. In this paper, we applied an accelerated version of the matching pursuit algorithm for gammatone dictionaries allowing real-time and large data set applications. We show that a sparse signal model in general has advantages in audio coding and that a sparse gammatone signal model encodes speech more efficiently in terms of sparseness than a sparse modified discrete cosine transform (MDCT) signal model. We also show that the optimal gammatone parameters derived for English speech do not match the human auditory filters, suggesting for signal processing applications to derive the parameters individually for each applied signal class instead of using psychometrically derived parameters. For brain research, it means that care should be taken with directly transferring findings of optimality for technical to biological systems.
A Sparse Bayesian Learning Algorithm for White Matter Parameter Estimation from Compressed Multi-shell Diffusion MRI.

PubMed

Pisharady, Pramod Kumar; Sotiropoulos, Stamatios N; Sapiro, Guillermo; Lenglet, Christophe

2017-09-01

We propose a sparse Bayesian learning algorithm for improved estimation of white matter fiber parameters from compressed (under-sampled q-space) multi-shell diffusion MRI data. The multi-shell data is represented in a dictionary form using a non-monoexponential decay model of diffusion, based on continuous gamma distribution of diffusivities. The fiber volume fractions with predefined orientations, which are the unknown parameters, form the dictionary weights. These unknown parameters are estimated with a linear un-mixing framework, using a sparse Bayesian learning algorithm. A localized learning of hyperparameters at each voxel and for each possible fiber orientations improves the parameter estimation. Our experiments using synthetic data from the ISBI 2012 HARDI reconstruction challenge and in-vivo data from the Human Connectome Project demonstrate the improvements.
Collaborative sparse priors for multi-view ATR

NASA Astrophysics Data System (ADS)

Li, Xuelu; Monga, Vishal

2018-04-01

Recent work has seen a surge of sparse representation based classification (SRC) methods applied to automatic target recognition problems. While traditional SRC approaches used l0 or l1 norm to quantify sparsity, spike and slab priors have established themselves as the gold standard for providing general tunable sparse structures on vectors. In this work, we employ collaborative spike and slab priors that can be applied to matrices to encourage sparsity for the problem of multi-view ATR. That is, target images captured from multiple views are expanded in terms of a training dictionary multiplied with a coefficient matrix. Ideally, for a test image set comprising of multiple views of a target, coefficients corresponding to its identifying class are expected to be active, while others should be zero, i.e. the coefficient matrix is naturally sparse. We develop a new approach to solve the optimization problem that estimates the sparse coefficient matrix jointly with the sparsity inducing parameters in the collaborative prior. ATR problems are investigated on the mid-wave infrared (MWIR) database made available by the US Army Night Vision and Electronic Sensors Directorate, which has a rich collection of views. Experimental results show that the proposed joint prior and coefficient estimation method (JPCEM) can: 1.) enable improved accuracy when multiple views vs. a single one are invoked, and 2.) outperform state of the art alternatives particularly when training imagery is limited.
Network Inference via the Time-Varying Graphical Lasso

PubMed Central

Hallac, David; Park, Youngsuk; Boyd, Stephen; Leskovec, Jure

2018-01-01

Many important problems can be modeled as a system of interconnected entities, where each entity is recording time-dependent observations or measurements. In order to spot trends, detect anomalies, and interpret the temporal dynamics of such data, it is essential to understand the relationships between the different entities and how these relationships evolve over time. In this paper, we introduce the time-varying graphical lasso (TVGL), a method of inferring time-varying networks from raw time series data. We cast the problem in terms of estimating a sparse time-varying inverse covariance matrix, which reveals a dynamic network of interdependencies between the entities. Since dynamic network inference is a computationally expensive task, we derive a scalable message-passing algorithm based on the Alternating Direction Method of Multipliers (ADMM) to solve this problem in an efficient way. We also discuss several extensions, including a streaming algorithm to update the model and incorporate new observations in real time. Finally, we evaluate our TVGL algorithm on both real and synthetic datasets, obtaining interpretable results and outperforming state-of-the-art baselines in terms of both accuracy and scalability. PMID:29770256
Effects of Ordering Strategies and Programming Paradigms on Sparse Matrix Computations

NASA Technical Reports Server (NTRS)

Oliker, Leonid; Li, Xiaoye; Husbands, Parry; Biswas, Rupak; Biegel, Bryan (Technical Monitor)

2002-01-01

The Conjugate Gradient (CG) algorithm is perhaps the best-known iterative technique to solve sparse linear systems that are symmetric and positive definite. For systems that are ill-conditioned, it is often necessary to use a preconditioning technique. In this paper, we investigate the effects of various ordering and partitioning strategies on the performance of parallel CG and ILU(O) preconditioned CG (PCG) using different programming paradigms and architectures. Results show that for this class of applications: ordering significantly improves overall performance on both distributed and distributed shared-memory systems, that cache reuse may be more important than reducing communication, that it is possible to achieve message-passing performance using shared-memory constructs through careful data ordering and distribution, and that a hybrid MPI+OpenMP paradigm increases programming complexity with little performance gains. A implementation of CG on the Cray MTA does not require special ordering or partitioning to obtain high efficiency and scalability, giving it a distinct advantage for adaptive applications; however, it shows limited scalability for PCG due to a lack of thread level parallelism.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ghysels, Pieter; Li, Xiaoye S.; Rouet, Francois -Henry

Here, we present a sparse linear system solver that is based on a multifrontal variant of Gaussian elimination and exploits low-rank approximation of the resulting dense frontal matrices. We use hierarchically semiseparable (HSS) matrices, which have low-rank off-diagonal blocks, to approximate the frontal matrices. For HSS matrix construction, a randomized sampling algorithm is used together with interpolative decompositions. The combination of the randomized compression with a fast ULV HSS factoriz ation leads to a solver with lower computational complexity than the standard multifrontal method for many applications, resulting in speedups up to 7 fold for problems in our test suite.more » The implementation targets many-core systems by using task parallelism with dynamic runtime scheduling. Numerical experiments show performance improvements over state-of-the-art sparse direct solvers. The implementation achieves high performance and good scalability on a range of modern shared memory parallel systems, including the Intel Xeon Phi (MIC). The code is part of a software package called STRUMPACK - STRUctured Matrices PACKage, which also has a distributed memory component for dense rank-structured matrices.« less
An Efficient Multicore Implementation of a Novel HSS-Structured Multifrontal Solver Using Randomized Sampling

DOE PAGES

Ghysels, Pieter; Li, Xiaoye S.; Rouet, Francois -Henry; ...

2016-10-27

Here, we present a sparse linear system solver that is based on a multifrontal variant of Gaussian elimination and exploits low-rank approximation of the resulting dense frontal matrices. We use hierarchically semiseparable (HSS) matrices, which have low-rank off-diagonal blocks, to approximate the frontal matrices. For HSS matrix construction, a randomized sampling algorithm is used together with interpolative decompositions. The combination of the randomized compression with a fast ULV HSS factoriz ation leads to a solver with lower computational complexity than the standard multifrontal method for many applications, resulting in speedups up to 7 fold for problems in our test suite.more » The implementation targets many-core systems by using task parallelism with dynamic runtime scheduling. Numerical experiments show performance improvements over state-of-the-art sparse direct solvers. The implementation achieves high performance and good scalability on a range of modern shared memory parallel systems, including the Intel Xeon Phi (MIC). The code is part of a software package called STRUMPACK - STRUctured Matrices PACKage, which also has a distributed memory component for dense rank-structured matrices.« less
Task-driven dictionary learning.

PubMed

Mairal, Julien; Bach, Francis; Ponce, Jean

2012-04-01

Modeling data with linear combinations of a few elements from a learned dictionary has been the focus of much recent research in machine learning, neuroscience, and signal processing. For signals such as natural images that admit such sparse representations, it is now well established that these models are well suited to restoration tasks. In this context, learning the dictionary amounts to solving a large-scale matrix factorization problem, which can be done efficiently with classical optimization tools. The same approach has also been used for learning features from data for other purposes, e.g., image classification, but tuning the dictionary in a supervised way for these tasks has proven to be more difficult. In this paper, we present a general formulation for supervised dictionary learning adapted to a wide variety of tasks, and present an efficient algorithm for solving the corresponding optimization problem. Experiments on handwritten digit classification, digital art identification, nonlinear inverse image problems, and compressed sensing demonstrate that our approach is effective in large-scale settings, and is well suited to supervised and semi-supervised classification, as well as regression tasks for data that admit sparse representations.
Online Least Squares One-Class Support Vector Machines-Based Abnormal Visual Event Detection

PubMed Central

Wang, Tian; Chen, Jie; Zhou, Yi; Snoussi, Hichem

2013-01-01

The abnormal event detection problem is an important subject in real-time video surveillance. In this paper, we propose a novel online one-class classification algorithm, online least squares one-class support vector machine (online LS-OC-SVM), combined with its sparsified version (sparse online LS-OC-SVM). LS-OC-SVM extracts a hyperplane as an optimal description of training objects in a regularized least squares sense. The online LS-OC-SVM learns a training set with a limited number of samples to provide a basic normal model, then updates the model through remaining data. In the sparse online scheme, the model complexity is controlled by the coherence criterion. The online LS-OC-SVM is adopted to handle the abnormal event detection problem. Each frame of the video is characterized by the covariance matrix descriptor encoding the moving information, then is classified into a normal or an abnormal frame. Experiments are conducted, on a two-dimensional synthetic distribution dataset and a benchmark video surveillance dataset, to demonstrate the promising results of the proposed online LS-OC-SVM method. PMID:24351629
Online least squares one-class support vector machines-based abnormal visual event detection.

PubMed

Wang, Tian; Chen, Jie; Zhou, Yi; Snoussi, Hichem

2013-12-12

The abnormal event detection problem is an important subject in real-time video surveillance. In this paper, we propose a novel online one-class classification algorithm, online least squares one-class support vector machine (online LS-OC-SVM), combined with its sparsified version (sparse online LS-OC-SVM). LS-OC-SVM extracts a hyperplane as an optimal description of training objects in a regularized least squares sense. The online LS-OC-SVM learns a training set with a limited number of samples to provide a basic normal model, then updates the model through remaining data. In the sparse online scheme, the model complexity is controlled by the coherence criterion. The online LS-OC-SVM is adopted to handle the abnormal event detection problem. Each frame of the video is characterized by the covariance matrix descriptor encoding the moving information, then is classified into a normal or an abnormal frame. Experiments are conducted, on a two-dimensional synthetic distribution dataset and a benchmark video surveillance dataset, to demonstrate the promising results of the proposed online LS-OC-SVM method.
SPARSE: quadratic time simultaneous alignment and folding of RNAs without sequence-based heuristics

PubMed Central

Will, Sebastian; Otto, Christina; Miladi, Milad; Möhl, Mathias; Backofen, Rolf

2015-01-01

Motivation: RNA-Seq experiments have revealed a multitude of novel ncRNAs. The gold standard for their analysis based on simultaneous alignment and folding suffers from extreme time complexity of O(n6). Subsequently, numerous faster ‘Sankoff-style’ approaches have been suggested. Commonly, the performance of such methods relies on sequence-based heuristics that restrict the search space to optimal or near-optimal sequence alignments; however, the accuracy of sequence-based methods breaks down for RNAs with sequence identities below 60%. Alignment approaches like LocARNA that do not require sequence-based heuristics, have been limited to high complexity (≥ quartic time). Results: Breaking this barrier, we introduce the novel Sankoff-style algorithm ‘sparsified prediction and alignment of RNAs based on their structure ensembles (SPARSE)’, which runs in quadratic time without sequence-based heuristics. To achieve this low complexity, on par with sequence alignment algorithms, SPARSE features strong sparsification based on structural properties of the RNA ensembles. Following PMcomp, SPARSE gains further speed-up from lightweight energy computation. Although all existing lightweight Sankoff-style methods restrict Sankoff’s original model by disallowing loop deletions and insertions, SPARSE transfers the Sankoff algorithm to the lightweight energy model completely for the first time. Compared with LocARNA, SPARSE achieves similar alignment and better folding quality in significantly less time (speedup: 3.7). At similar run-time, it aligns low sequence identity instances substantially more accurate than RAF, which uses sequence-based heuristics. Availability and implementation: SPARSE is freely available at http://www.bioinf.uni-freiburg.de/Software/SPARSE. Contact: backofen@informatik.uni-freiburg.de Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25838465
Sparsistency and Rates of Convergence in Large Covariance Matrix Estimation.

PubMed

Lam, Clifford; Fan, Jianqing

2009-01-01

This paper studies the sparsistency and rates of convergence for estimating sparse covariance and precision matrices based on penalized likelihood with nonconvex penalty functions. Here, sparsistency refers to the property that all parameters that are zero are actually estimated as zero with probability tending to one. Depending on the case of applications, sparsity priori may occur on the covariance matrix, its inverse or its Cholesky decomposition. We study these three sparsity exploration problems under a unified framework with a general penalty function. We show that the rates of convergence for these problems under the Frobenius norm are of order (s(n) log p(n)/n)(1/2), where s(n) is the number of nonzero elements, p(n) is the size of the covariance matrix and n is the sample size. This explicitly spells out the contribution of high-dimensionality is merely of a logarithmic factor. The conditions on the rate with which the tuning parameter λ(n) goes to 0 have been made explicit and compared under different penalties. As a result, for the L(1)-penalty, to guarantee the sparsistency and optimal rate of convergence, the number of nonzero elements should be small: sn'=O(pn) at most, among O(pn2) parameters, for estimating sparse covariance or correlation matrix, sparse precision or inverse correlation matrix or sparse Cholesky factor, where sn' is the number of the nonzero elements on the off-diagonal entries. On the other hand, using the SCAD or hard-thresholding penalty functions, there is no such a restriction.
Immunohistochemical evidence of rapid extracellular matrix remodeling after iron-particle irradiation of mouse mammary gland

NASA Technical Reports Server (NTRS)

Ehrhart, E. J.; Gillette, E. L.; Barcellos-Hoff, M. H.; Chaterjee, A. (Principal Investigator)

1996-01-01

High-LET radiation has unique physical and biological properties compared to sparsely ionizing radiation. Recent studies demonstrate that sparsely ionizing radiation rapidly alters the pattern of extracellular matrix expression in several tissues, but little is known about the effect of heavy-ion radiation. This study investigates densely ionizing radiation-induced changes in extracellular matrix localization in the mammary glands of adult female BALB/c mice after whole-body irradiation with 0.8 Gy 600 MeV iron particles. The basement membrane and interstitial extracellular matrix proteins of the mammary gland stroma were mapped with respect to time postirradiation using immunofluorescence. Collagen III was induced in the adipose stroma within 1 day, continued to increase through day 9 and was resolved by day 14. Immunoreactive tenascin was induced in the epithelium by day 1, was evident at the epithelial-stromal interface by day 5-9 and persisted as a condensed layer beneath the basement membrane through day 14. These findings parallel similar changes induced by gamma irradiation but demonstrate different onset and chronicity. In contrast, the integrity of epithelial basement membrane, which was unaffected by sparsely ionizing radiation, was disrupted by iron-particle irradiation. Laminin immunoreactivity was mildly irregular at 1 h postirradiation and showed discontinuities and thickening from days 1 to 9. Continuity was restored by day 14. Thus high-LET radiation, like sparsely ionizing radiation, induces rapid-remodeling of the stromal extracellular matrix but also appears to alter the integrity of the epithelial basement membrane, which is an important regulator of epithelial cell proliferation and differentiation.
Unified commutation-pruning technique for efficient computation of composite DFTs

NASA Astrophysics Data System (ADS)

Castro-Palazuelos, David E.; Medina-Melendrez, Modesto Gpe.; Torres-Roman, Deni L.; Shkvarko, Yuriy V.

2015-12-01

An efficient computation of a composite length discrete Fourier transform (DFT), as well as a fast Fourier transform (FFT) of both time and space data sequences in uncertain (non-sparse or sparse) computational scenarios, requires specific processing algorithms. Traditional algorithms typically employ some pruning methods without any commutations, which prevents them from attaining the potential computational efficiency. In this paper, we propose an alternative unified approach with automatic commutations between three computational modalities aimed at efficient computations of the pruned DFTs adapted for variable composite lengths of the non-sparse input-output data. The first modality is an implementation of the direct computation of a composite length DFT, the second one employs the second-order recursive filtering method, and the third one performs the new pruned decomposed transform. The pruned decomposed transform algorithm performs the decimation in time or space (DIT) data acquisition domain and, then, decimation in frequency (DIF). The unified combination of these three algorithms is addressed as the DFTCOMM technique. Based on the treatment of the combinational-type hypotheses testing optimization problem of preferable allocations between all feasible commuting-pruning modalities, we have found the global optimal solution to the pruning problem that always requires a fewer or, at most, the same number of arithmetic operations than other feasible modalities. The DFTCOMM method outperforms the existing competing pruning techniques in the sense of attainable savings in the number of required arithmetic operations. It requires fewer or at most the same number of arithmetic operations for its execution than any other of the competing pruning methods reported in the literature. Finally, we provide the comparison of the DFTCOMM with the recently developed sparse fast Fourier transform (SFFT) algorithmic family. We feature that, in the sensing scenarios with sparse/non-sparse data Fourier spectrum, the DFTCOMM technique manifests robustness against such model uncertainties in the sense of insensitivity for sparsity/non-sparsity restrictions and the variability of the operating parameters.
High-dimensional statistical inference: From vector to matrix

NASA Astrophysics Data System (ADS)

Zhang, Anru

Statistical inference for sparse signals or low-rank matrices in high-dimensional settings is of significant interest in a range of contemporary applications. It has attracted significant recent attention in many fields including statistics, applied mathematics and electrical engineering. In this thesis, we consider several problems in including sparse signal recovery (compressed sensing under restricted isometry) and low-rank matrix recovery (matrix recovery via rank-one projections and structured matrix completion). The first part of the thesis discusses compressed sensing and affine rank minimization in both noiseless and noisy cases and establishes sharp restricted isometry conditions for sparse signal and low-rank matrix recovery. The analysis relies on a key technical tool which represents points in a polytope by convex combinations of sparse vectors. The technique is elementary while leads to sharp results. It is shown that, in compressed sensing, delta kA < 1/3, deltak A+ thetak,kA < 1, or deltatkA < √( t - 1)/t for any given constant t ≥ 4/3 guarantee the exact recovery of all k sparse signals in the noiseless case through the constrained ℓ1 minimization, and similarly in affine rank minimization delta rM < 1/3, deltar M + thetar, rM < 1, or deltatrM< √( t - 1)/t ensure the exact reconstruction of all matrices with rank at most r in the noiseless case via the constrained nuclear norm minimization. Moreover, for any epsilon > 0, delta kA < 1/3 + epsilon, deltak A + thetak,kA < 1 + epsilon, or deltatkA< √(t - 1) / t + epsilon are not sufficient to guarantee the exact recovery of all k-sparse signals for large k. Similar result also holds for matrix recovery. In addition, the conditions delta kA<1/3, deltak A+ thetak,kA<1, delta tkA < √(t - 1)/t and deltarM<1/3, delta rM+ thetar,rM<1, delta trM< √(t - 1)/ t are also shown to be sufficient respectively for stable recovery of approximately sparse signals and low-rank matrices in the noisy case. For the second part of the thesis, we introduce a rank-one projection model for low-rank matrix recovery and propose a constrained nuclear norm minimization method for stable recovery of low-rank matrices in the noisy case. The procedure is adaptive to the rank and robust against small perturbations. Both upper and lower bounds for the estimation accuracy under the Frobenius norm loss are obtained. The proposed estimator is shown to be rate-optimal under certain conditions. The estimator is easy to implement via convex programming and performs well numerically. The techniques and main results developed in the chapter also have implications to other related statistical problems. An application to estimation of spiked covariance matrices from one-dimensional random projections is considered. The results demonstrate that it is still possible to accurately estimate the covariance matrix of a high-dimensional distribution based only on one-dimensional projections. For the third part of the thesis, we consider another setting of low-rank matrix completion. Current literature on matrix completion focuses primarily on independent sampling models under which the individual observed entries are sampled independently. Motivated by applications in genomic data integration, we propose a new framework of structured matrix completion (SMC) to treat structured missingness by design. Specifically, our proposed method aims at efficient matrix recovery when a subset of the rows and columns of an approximately low-rank matrix are observed. We provide theoretical justification for the proposed SMC method and derive lower bound for the estimation errors, which together establish the optimal rate of recovery over certain classes of approximately low-rank matrices. Simulation studies show that the method performs well in finite sample under a variety of configurations. The method is applied to integrate several ovarian cancer genomic studies with different extent of genomic measurements, which enables us to construct more accurate prediction rules for ovarian cancer survival.
Spectral Diffusion: An Algorithm for Robust Material Decomposition of Spectral CT Data

PubMed Central

Clark, Darin P.; Badea, Cristian T.

2014-01-01

Clinical successes with dual energy CT, aggressive development of energy discriminating x-ray detectors, and novel, target-specific, nanoparticle contrast agents promise to establish spectral CT as a powerful functional imaging modality. Common to all of these applications is the need for a material decomposition algorithm which is robust in the presence of noise. Here, we develop such an algorithm which uses spectrally joint, piece-wise constant kernel regression and the split Bregman method to iteratively solve for a material decomposition which is gradient sparse, quantitatively accurate, and minimally biased. We call this algorithm spectral diffusion because it integrates structural information from multiple spectral channels and their corresponding material decompositions within the framework of diffusion-like denoising algorithms (e.g. anisotropic diffusion, total variation, bilateral filtration). Using a 3D, digital bar phantom and a material sensitivity matrix calibrated for use with a polychromatic x-ray source, we quantify the limits of detectability (CNR = 5) afforded by spectral diffusion in the triple-energy material decomposition of iodine (3.1 mg/mL), gold (0.9 mg/mL), and gadolinium (2.9 mg/mL) concentrations. We then apply spectral diffusion to the in vivo separation of these three materials in the mouse kidneys, liver, and spleen. PMID:25296173
Spectral diffusion: an algorithm for robust material decomposition of spectral CT data.

PubMed

Clark, Darin P; Badea, Cristian T

2014-11-07

Clinical successes with dual energy CT, aggressive development of energy discriminating x-ray detectors, and novel, target-specific, nanoparticle contrast agents promise to establish spectral CT as a powerful functional imaging modality. Common to all of these applications is the need for a material decomposition algorithm which is robust in the presence of noise. Here, we develop such an algorithm which uses spectrally joint, piecewise constant kernel regression and the split Bregman method to iteratively solve for a material decomposition which is gradient sparse, quantitatively accurate, and minimally biased. We call this algorithm spectral diffusion because it integrates structural information from multiple spectral channels and their corresponding material decompositions within the framework of diffusion-like denoising algorithms (e.g. anisotropic diffusion, total variation, bilateral filtration). Using a 3D, digital bar phantom and a material sensitivity matrix calibrated for use with a polychromatic x-ray source, we quantify the limits of detectability (CNR = 5) afforded by spectral diffusion in the triple-energy material decomposition of iodine (3.1 mg mL(-1)), gold (0.9 mg mL(-1)), and gadolinium (2.9 mg mL(-1)) concentrations. We then apply spectral diffusion to the in vivo separation of these three materials in the mouse kidneys, liver, and spleen.
Sparse graph regularization for robust crop mapping using hyperspectral remotely sensed imagery with very few in situ data

NASA Astrophysics Data System (ADS)

Xue, Zhaohui; Du, Peijun; Li, Jun; Su, Hongjun

2017-02-01

The generally limited availability of training data relative to the usually high data dimension pose a great challenge to accurate classification of hyperspectral imagery, especially for identifying crops characterized with highly correlated spectra. However, traditional parametric classification models are problematic due to the need of non-singular class-specific covariance matrices. In this research, a novel sparse graph regularization (SGR) method is presented, aiming at robust crop mapping using hyperspectral imagery with very few in situ data. The core of SGR lies in propagating labels from known data to unknown, which is triggered by: (1) the fraction matrix generated for the large unknown data by using an effective sparse representation algorithm with respect to the few training data serving as the dictionary; (2) the prediction function estimated for the few training data by formulating a regularization model based on sparse graph. Then, the labels of large unknown data can be obtained by maximizing the posterior probability distribution based on the two ingredients. SGR is more discriminative, data-adaptive, robust to noise, and efficient, which is unique with regard to previously proposed approaches and has high potentials in discriminating crops, especially when facing insufficient training data and high-dimensional spectral space. The study area is located at Zhangye basin in the middle reaches of Heihe watershed, Gansu, China, where eight crop types were mapped with Compact Airborne Spectrographic Imager (CASI) and Shortwave Infrared Airborne Spectrogrpahic Imager (SASI) hyperspectral data. Experimental results demonstrate that the proposed method significantly outperforms other traditional and state-of-the-art methods.
A multiple hold-out framework for Sparse Partial Least Squares.

PubMed

Monteiro, João M; Rao, Anil; Shawe-Taylor, John; Mourão-Miranda, Janaina

2016-09-15

Supervised classification machine learning algorithms may have limitations when studying brain diseases with heterogeneous populations, as the labels might be unreliable. More exploratory approaches, such as Sparse Partial Least Squares (SPLS), may provide insights into the brain's mechanisms by finding relationships between neuroimaging and clinical/demographic data. The identification of these relationships has the potential to improve the current understanding of disease mechanisms, refine clinical assessment tools, and stratify patients. SPLS finds multivariate associative effects in the data by computing pairs of sparse weight vectors, where each pair is used to remove its corresponding associative effect from the data by matrix deflation, before computing additional pairs. We propose a novel SPLS framework which selects the adequate number of voxels and clinical variables to describe each associative effect, and tests their reliability by fitting the model to different splits of the data. As a proof of concept, the approach was applied to find associations between grey matter probability maps and individual items of the Mini-Mental State Examination (MMSE) in a clinical sample with various degrees of dementia. The framework found two statistically significant associative effects between subsets of brain voxels and subsets of the questions/tasks. SPLS was compared with its non-sparse version (PLS). The use of projection deflation versus a classical PLS deflation was also tested in both PLS and SPLS. SPLS outperformed PLS, finding statistically significant effects and providing higher correlation values in hold-out data. Moreover, projection deflation provided better results. Copyright © 2016 The Author(s). Published by Elsevier B.V. All rights reserved.
Sparse Representation for Color Image Restoration (PREPRINT)

DTIC Science & Technology

2006-10-01

as a universal denoiser of images, which learns the posterior from the given image in a way inspired by the Lempel - Ziv universal compression ...such as images, admit a sparse decomposition over a redundant dictionary leads to efficient algorithms for handling such sources of data . In...describe the data source. Such a model becomes paramount when developing algorithms for processing these signals. In this context, Markov-Random-Field
Solving large tomographic linear systems: size reduction and error estimation

NASA Astrophysics Data System (ADS)

Voronin, Sergey; Mikesell, Dylan; Slezak, Inna; Nolet, Guust

2014-10-01

We present a new approach to reduce a sparse, linear system of equations associated with tomographic inverse problems. We begin by making a modification to the commonly used compressed sparse-row format, whereby our format is tailored to the sparse structure of finite-frequency (volume) sensitivity kernels in seismic tomography. Next, we cluster the sparse matrix rows to divide a large matrix into smaller subsets representing ray paths that are geographically close. Singular value decomposition of each subset allows us to project the data onto a subspace associated with the largest eigenvalues of the subset. After projection we reject those data that have a signal-to-noise ratio (SNR) below a chosen threshold. Clustering in this way assures that the sparse nature of the system is minimally affected by the projection. Moreover, our approach allows for a precise estimation of the noise affecting the data while also giving us the ability to identify outliers. We illustrate the method by reducing large matrices computed for global tomographic systems with cross-correlation body wave delays, as well as with surface wave phase velocity anomalies. For a massive matrix computed for 3.7 million Rayleigh wave phase velocity measurements, imposing a threshold of 1 for the SNR, we condensed the matrix size from 1103 to 63 Gbyte. For a global data set of multiple-frequency P wave delays from 60 well-distributed deep earthquakes we obtain a reduction to 5.9 per cent. This type of reduction allows one to avoid loss of information due to underparametrizing models. Alternatively, if data have to be rejected to fit the system into computer memory, it assures that the most important data are preserved.
People counting in classroom based on video surveillance

NASA Astrophysics Data System (ADS)

Zhang, Quanbin; Huang, Xiang; Su, Juan

2014-11-01

Currently, the switches of the lights and other electronic devices in the classroom are mainly relied on manual control, as a result, many lights are on while no one or only few people in the classroom. It is important to change the current situation and control the electronic devices intelligently according to the number and the distribution of the students in the classroom, so as to reduce the considerable waste of electronic resources. This paper studies the problem of people counting in classroom based on video surveillance. As the camera in the classroom can not get the full shape contour information of bodies and the clear features information of faces, most of the classical algorithms such as the pedestrian detection method based on HOG (histograms of oriented gradient) feature and the face detection method based on machine learning are unable to obtain a satisfied result. A new kind of dual background updating model based on sparse and low-rank matrix decomposition is proposed in this paper, according to the fact that most of the students in the classroom are almost in stationary state and there are body movement occasionally. Firstly, combining the frame difference with the sparse and low-rank matrix decomposition to predict the moving areas, and updating the background model with different parameters according to the positional relationship between the pixels of current video frame and the predicted motion regions. Secondly, the regions of moving objects are determined based on the updated background using the background subtraction method. Finally, some operations including binarization, median filtering and morphology processing, connected component detection, etc. are performed on the regions acquired by the background subtraction, in order to induce the effects of the noise and obtain the number of people in the classroom. The experiment results show the validity of the algorithm of people counting.
Discriminative Transfer Subspace Learning via Low-Rank and Sparse Representation.

PubMed

Xu, Yong; Fang, Xiaozhao; Wu, Jian; Li, Xuelong; Zhang, David

2016-02-01

In this paper, we address the problem of unsupervised domain transfer learning in which no labels are available in the target domain. We use a transformation matrix to transfer both the source and target data to a common subspace, where each target sample can be represented by a combination of source samples such that the samples from different domains can be well interlaced. In this way, the discrepancy of the source and target domains is reduced. By imposing joint low-rank and sparse constraints on the reconstruction coefficient matrix, the global and local structures of data can be preserved. To enlarge the margins between different classes as much as possible and provide more freedom to diminish the discrepancy, a flexible linear classifier (projection) is obtained by learning a non-negative label relaxation matrix that allows the strict binary label matrix to relax into a slack variable matrix. Our method can avoid a potentially negative transfer by using a sparse matrix to model the noise and, thus, is more robust to different types of noise. We formulate our problem as a constrained low-rankness and sparsity minimization problem and solve it by the inexact augmented Lagrange multiplier method. Extensive experiments on various visual domain adaptation tasks show the superiority of the proposed method over the state-of-the art methods. The MATLAB code of our method will be publicly available at http://www.yongxu.org/lunwen.html.
Improved analysis of SP and CoSaMP under total perturbations

NASA Astrophysics Data System (ADS)

Li, Haifeng

2016-12-01

Practically, in the underdetermined model y= A x, where x is a K sparse vector (i.e., it has no more than K nonzero entries), both y and A could be totally perturbed. A more relaxed condition means less number of measurements are needed to ensure the sparse recovery from theoretical aspect. In this paper, based on restricted isometry property (RIP), for subspace pursuit (SP) and compressed sampling matching pursuit (CoSaMP), two relaxed sufficient conditions are presented under total perturbations to guarantee that the sparse vector x is recovered. Taking random matrix as measurement matrix, we also discuss the advantage of our condition. Numerical experiments validate that SP and CoSaMP can provide oracle-order recovery performance.
Multi-energy CT based on a prior rank, intensity and sparsity model (PRISM).

PubMed

Gao, Hao; Yu, Hengyong; Osher, Stanley; Wang, Ge

2011-11-01

We propose a compressive sensing approach for multi-energy computed tomography (CT), namely the prior rank, intensity and sparsity model (PRISM). To further compress the multi-energy image for allowing the reconstruction with fewer CT data and less radiation dose, the PRISM models a multi-energy image as the superposition of a low-rank matrix and a sparse matrix (with row dimension in space and column dimension in energy), where the low-rank matrix corresponds to the stationary background over energy that has a low matrix rank, and the sparse matrix represents the rest of distinct spectral features that are often sparse. Distinct from previous methods, the PRISM utilizes the generalized rank, e.g., the matrix rank of tight-frame transform of a multi-energy image, which offers a way to characterize the multi-level and multi-filtered image coherence across the energy spectrum. Besides, the energy-dependent intensity information can be incorporated into the PRISM in terms of the spectral curves for base materials, with which the restoration of the multi-energy image becomes the reconstruction of the energy-independent material composition matrix. In other words, the PRISM utilizes prior knowledge on the generalized rank and sparsity of a multi-energy image, and intensity/spectral characteristics of base materials. Furthermore, we develop an accurate and fast split Bregman method for the PRISM and demonstrate the superior performance of the PRISM relative to several competing methods in simulations.

Sparse array angle estimation using reduced-dimension ESPRIT-MUSIC in MIMO radar.

PubMed

Zhang, Chaozhu; Pang, Yucai

2013-01-01

Sparse linear arrays provide better performance than the filled linear arrays in terms of angle estimation and resolution with reduced size and low cost. However, they are subject to manifold ambiguity. In this paper, both the transmit array and receive array are sparse linear arrays in the bistatic MIMO radar. Firstly, we present an ESPRIT-MUSIC method in which ESPRIT algorithm is used to obtain ambiguous angle estimates. The disambiguation algorithm uses MUSIC-based procedure to identify the true direction cosine estimate from a set of ambiguous candidate estimates. The paired transmit angle and receive angle can be estimated and the manifold ambiguity can be solved. However, the proposed algorithm has high computational complexity due to the requirement of two-dimension search. Further, the Reduced-Dimension ESPRIT-MUSIC (RD-ESPRIT-MUSIC) is proposed to reduce the complexity of the algorithm. And the RD-ESPRIT-MUSIC only demands one-dimension search. Simulation results demonstrate the effectiveness of the method.
Progressive sample processing of band selection for hyperspectral imagery

NASA Astrophysics Data System (ADS)

Liu, Keng-Hao; Chien, Hung-Chang; Chen, Shih-Yu

2017-10-01

Band selection (BS) is one of the most important topics in hyperspectral image (HSI) processing. The objective of BS is to find a set of representative bands that can represent the whole image with lower inter-band redundancy. Many types of BS algorithms were proposed in the past. However, most of them can be carried on in an off-line manner. It means that they can only be implemented on the pre-collected data. Those off-line based methods are sometime useless for those applications that are timeliness, particular in disaster prevention and target detection. To tackle this issue, a new concept, called progressive sample processing (PSP), was proposed recently. The PSP is an "on-line" framework where the specific type of algorithm can process the currently collected data during the data transmission under band-interleavedby-sample/pixel (BIS/BIP) protocol. This paper proposes an online BS method that integrates a sparse-based BS into PSP framework, called PSP-BS. In PSP-BS, the BS can be carried out by updating BS result recursively pixel by pixel in the same way that a Kalman filter does for updating data information in a recursive fashion. The sparse regression is solved by orthogonal matching pursuit (OMP) algorithm, and the recursive equations of PSP-BS are derived by using matrix decomposition. The experiments conducted on a real hyperspectral image show that the PSP-BS can progressively output the BS status with very low computing time. The convergence of BS results during the transmission can be quickly achieved by using a rearranged pixel transmission sequence. This significant advantage allows BS to be implemented in a real time manner when the HSI data is transmitted pixel by pixel.
Realization of preconditioned Lanczos and conjugate gradient algorithms on optical linear algebra processors.

PubMed

Ghosh, A

1988-08-01

Lanczos and conjugate gradient algorithms are important in computational linear algebra. In this paper, a parallel pipelined realization of these algorithms on a ring of optical linear algebra processors is described. The flow of data is designed to minimize the idle times of the optical multiprocessor and the redundancy of computations. The effects of optical round-off errors on the solutions obtained by the optical Lanczos and conjugate gradient algorithms are analyzed, and it is shown that optical preconditioning can improve the accuracy of these algorithms substantially. Algorithms for optical preconditioning and results of numerical experiments on solving linear systems of equations arising from partial differential equations are discussed. Since the Lanczos algorithm is used mostly with sparse matrices, a folded storage scheme to represent sparse matrices on spatial light modulators is also described.
Fast sparse recovery and coherence factor weighting in optoacoustic tomography

NASA Astrophysics Data System (ADS)

He, Hailong; Prakash, Jaya; Buehler, Andreas; Ntziachristos, Vasilis

2017-03-01

Sparse recovery algorithms have shown great potential to reconstruct images with limited view datasets in optoacoustic tomography, with a disadvantage of being computational expensive. In this paper, we improve the fast convergent Split Augmented Lagrangian Shrinkage Algorithm (SALSA) method based on least square QR (LSQR) formulation for performing accelerated reconstructions. Further, coherence factor is calculated to weight the final reconstruction result, which can further reduce artifacts arising in limited-view scenarios and acoustically heterogeneous mediums. Several phantom and biological experiments indicate that the accelerated SALSA method with coherence factor (ASALSA-CF) can provide improved reconstructions and much faster convergence compared to existing sparse recovery methods.
A path-oriented knowledge representation system: Defusing the combinatorial system

NASA Technical Reports Server (NTRS)

Karamouzis, Stamos T.; Barry, John S.; Smith, Steven L.; Feyock, Stefan

1995-01-01

LIMAP is a programming system oriented toward efficient information manipulation over fixed finite domains, and quantification over paths and predicates. A generalization of Warshall's Algorithm to precompute paths in a sparse matrix representation of semantic nets is employed to allow questions involving paths between components to be posed and answered easily. LIMAP's ability to cache all paths between two components in a matrix cell proved to be a computational obstacle, however, when the semantic net grew to realistic size. The present paper describes a means of mitigating this combinatorial explosion to an extent that makes the use of the LIMAP representation feasible for problems of significant size. The technique we describe radically reduces the size of the search space in which LIMAP must operate; semantic nets of more than 500 nodes have been attacked successfully. Furthermore, it appears that the procedure described is applicable not only to LIMAP, but to a number of other combinatorially explosive search space problems found in AI as well.
Summer Proceedings 2016: The Center for Computing Research at Sandia National Laboratories

DOE Office of Scientific and Technical Information (OSTI.GOV)

Carleton, James Brian; Parks, Michael L.

Solving sparse linear systems from the discretization of elliptic partial differential equations (PDEs) is an important building block in many engineering applications. Sparse direct solvers can solve general linear systems, but are usually slower and use much more memory than effective iterative solvers. To overcome these two disadvantages, a hierarchical solver (LoRaSp) based on H2-matrices was introduced in [22]. Here, we have developed a parallel version of the algorithm in LoRaSp to solve large sparse matrices on distributed memory machines. On a single processor, the factorization time of our parallel solver scales almost linearly with the problem size for three-dimensionalmore » problems, as opposed to the quadratic scalability of many existing sparse direct solvers. Moreover, our solver leads to almost constant numbers of iterations, when used as a preconditioner for Poisson problems. On more than one processor, our algorithm has significant speedups compared to sequential runs. With this parallel algorithm, we are able to solve large problems much faster than many existing packages as demonstrated by the numerical experiments.« less
Time-domain analysis of planar microstrip devices using a generalized Yee-algorithm based on unstructured grids

NASA Technical Reports Server (NTRS)

Gedney, Stephen D.; Lansing, Faiza

1993-01-01

The generalized Yee-algorithm is presented for the temporal full-wave analysis of planar microstrip devices. This algorithm has the significant advantage over the traditional Yee-algorithm in that it is based on unstructured and irregular grids. The robustness of the generalized Yee-algorithm is that structures that contain curved conductors or complex three-dimensional geometries can be more accurately, and much more conveniently modeled using standard automatic grid generation techniques. This generalized Yee-algorithm is based on the the time-marching solution of the discrete form of Maxwell's equations in their integral form. To this end, the electric and magnetic fields are discretized over a dual, irregular, and unstructured grid. The primary grid is assumed to be composed of general fitted polyhedra distributed throughout the volume. The secondary grid (or dual grid) is built up of the closed polyhedra whose edges connect the centroid's of adjacent primary cells, penetrating shared faces. Faraday's law and Ampere's law are used to update the fields normal to the primary and secondary grid faces, respectively. Subsequently, a correction scheme is introduced to project the normal fields onto the grid edges. It is shown that this scheme is stable, maintains second-order accuracy, and preserves the divergenceless nature of the flux densities. Finally, for computational efficiency the algorithm is structured as a series of sparse matrix-vector multiplications. Based on this scheme, the generalized Yee-algorithm has been implemented on vector and parallel high performance computers in a highly efficient manner.
Adapting iterative algorithms for solving large sparse linear systems for efficient use on the CDC CYBER 205

NASA Technical Reports Server (NTRS)

Kincaid, D. R.; Young, D. M.

1984-01-01

Adapting and designing mathematical software to achieve optimum performance on the CYBER 205 is discussed. Comments and observations are made in light of recent work done on modifying the ITPACK software package and on writing new software for vector supercomputers. The goal was to develop very efficient vector algorithms and software for solving large sparse linear systems using iterative methods.
Network Data: Statistical Theory and New Models

DTIC Science & Technology

2016-02-17

SECURITY CLASSIFICATION OF: During this period of review, Bin Yu worked on many thrusts of high-dimensional statistical theory and methodologies. Her...research covered a wide range of topics in statistics including analysis and methods for spectral clustering for sparse and structured networks...2,7,8,21], sparse modeling (e.g. Lasso) [4,10,11,17,18,19], statistical guarantees for the EM algorithm [3], statistical analysis of algorithm leveraging
Multi-GPU implementation of a VMAT treatment plan optimization algorithm

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tian, Zhen, E-mail: Zhen.Tian@UTSouthwestern.edu, E-mail: Xun.Jia@UTSouthwestern.edu, E-mail: Steve.Jiang@UTSouthwestern.edu; Folkerts, Michael; Tan, Jun

Purpose: Volumetric modulated arc therapy (VMAT) optimization is a computationally challenging problem due to its large data size, high degrees of freedom, and many hardware constraints. High-performance graphics processing units (GPUs) have been used to speed up the computations. However, GPU’s relatively small memory size cannot handle cases with a large dose-deposition coefficient (DDC) matrix in cases of, e.g., those with a large target size, multiple targets, multiple arcs, and/or small beamlet size. The main purpose of this paper is to report an implementation of a column-generation-based VMAT algorithm, previously developed in the authors’ group, on a multi-GPU platform tomore » solve the memory limitation problem. While the column-generation-based VMAT algorithm has been previously developed, the GPU implementation details have not been reported. Hence, another purpose is to present detailed techniques employed for GPU implementation. The authors also would like to utilize this particular problem as an example problem to study the feasibility of using a multi-GPU platform to solve large-scale problems in medical physics. Methods: The column-generation approach generates VMAT apertures sequentially by solving a pricing problem (PP) and a master problem (MP) iteratively. In the authors’ method, the sparse DDC matrix is first stored on a CPU in coordinate list format (COO). On the GPU side, this matrix is split into four submatrices according to beam angles, which are stored on four GPUs in compressed sparse row format. Computation of beamlet price, the first step in PP, is accomplished using multi-GPUs. A fast inter-GPU data transfer scheme is accomplished using peer-to-peer access. The remaining steps of PP and MP problems are implemented on CPU or a single GPU due to their modest problem scale and computational loads. Barzilai and Borwein algorithm with a subspace step scheme is adopted here to solve the MP problem. A head and neck (H and N) cancer case is then used to validate the authors’ method. The authors also compare their multi-GPU implementation with three different single GPU implementation strategies, i.e., truncating DDC matrix (S1), repeatedly transferring DDC matrix between CPU and GPU (S2), and porting computations involving DDC matrix to CPU (S3), in terms of both plan quality and computational efficiency. Two more H and N patient cases and three prostate cases are used to demonstrate the advantages of the authors’ method. Results: The authors’ multi-GPU implementation can finish the optimization process within ∼1 min for the H and N patient case. S1 leads to an inferior plan quality although its total time was 10 s shorter than the multi-GPU implementation due to the reduced matrix size. S2 and S3 yield the same plan quality as the multi-GPU implementation but take ∼4 and ∼6 min, respectively. High computational efficiency was consistently achieved for the other five patient cases tested, with VMAT plans of clinically acceptable quality obtained within 23–46 s. Conversely, to obtain clinically comparable or acceptable plans for all six of these VMAT cases that the authors have tested in this paper, the optimization time needed in a commercial TPS system on CPU was found to be in an order of several minutes. Conclusions: The results demonstrate that the multi-GPU implementation of the authors’ column-generation-based VMAT optimization can handle the large-scale VMAT optimization problem efficiently without sacrificing plan quality. The authors’ study may serve as an example to shed some light on other large-scale medical physics problems that require multi-GPU techniques.« less
Saliency Detection via Absorbing Markov Chain With Learnt Transition Probability.

PubMed

Lihe Zhang; Jianwu Ai; Bowen Jiang; Huchuan Lu; Xiukui Li

2018-02-01

In this paper, we propose a bottom-up saliency model based on absorbing Markov chain (AMC). First, a sparsely connected graph is constructed to capture the local context information of each node. All image boundary nodes and other nodes are, respectively, treated as the absorbing nodes and transient nodes in the absorbing Markov chain. Then, the expected number of times from each transient node to all other transient nodes can be used to represent the saliency value of this node. The absorbed time depends on the weights on the path and their spatial coordinates, which are completely encoded in the transition probability matrix. Considering the importance of this matrix, we adopt different hierarchies of deep features extracted from fully convolutional networks and learn a transition probability matrix, which is called learnt transition probability matrix. Although the performance is significantly promoted, salient objects are not uniformly highlighted very well. To solve this problem, an angular embedding technique is investigated to refine the saliency results. Based on pairwise local orderings, which are produced by the saliency maps of AMC and boundary maps, we rearrange the global orderings (saliency value) of all nodes. Extensive experiments demonstrate that the proposed algorithm outperforms the state-of-the-art methods on six publicly available benchmark data sets.
Dictionary Pair Learning on Grassmann Manifolds for Image Denoising.

PubMed

Zeng, Xianhua; Bian, Wei; Liu, Wei; Shen, Jialie; Tao, Dacheng

2015-11-01

Image denoising is a fundamental problem in computer vision and image processing that holds considerable practical importance for real-world applications. The traditional patch-based and sparse coding-driven image denoising methods convert 2D image patches into 1D vectors for further processing. Thus, these methods inevitably break down the inherent 2D geometric structure of natural images. To overcome this limitation pertaining to the previous image denoising methods, we propose a 2D image denoising model, namely, the dictionary pair learning (DPL) model, and we design a corresponding algorithm called the DPL on the Grassmann-manifold (DPLG) algorithm. The DPLG algorithm first learns an initial dictionary pair (i.e., the left and right dictionaries) by employing a subspace partition technique on the Grassmann manifold, wherein the refined dictionary pair is obtained through a sub-dictionary pair merging. The DPLG obtains a sparse representation by encoding each image patch only with the selected sub-dictionary pair. The non-zero elements of the sparse representation are further smoothed by the graph Laplacian operator to remove the noise. Consequently, the DPLG algorithm not only preserves the inherent 2D geometric structure of natural images but also performs manifold smoothing in the 2D sparse coding space. We demonstrate that the DPLG algorithm also improves the structural SIMilarity values of the perceptual visual quality for denoised images using the experimental evaluations on the benchmark images and Berkeley segmentation data sets. Moreover, the DPLG also produces the competitive peak signal-to-noise ratio values from popular image denoising algorithms.
SPARSE: quadratic time simultaneous alignment and folding of RNAs without sequence-based heuristics.

PubMed

Will, Sebastian; Otto, Christina; Miladi, Milad; Möhl, Mathias; Backofen, Rolf

2015-08-01

RNA-Seq experiments have revealed a multitude of novel ncRNAs. The gold standard for their analysis based on simultaneous alignment and folding suffers from extreme time complexity of [Formula: see text]. Subsequently, numerous faster 'Sankoff-style' approaches have been suggested. Commonly, the performance of such methods relies on sequence-based heuristics that restrict the search space to optimal or near-optimal sequence alignments; however, the accuracy of sequence-based methods breaks down for RNAs with sequence identities below 60%. Alignment approaches like LocARNA that do not require sequence-based heuristics, have been limited to high complexity ([Formula: see text] quartic time). Breaking this barrier, we introduce the novel Sankoff-style algorithm 'sparsified prediction and alignment of RNAs based on their structure ensembles (SPARSE)', which runs in quadratic time without sequence-based heuristics. To achieve this low complexity, on par with sequence alignment algorithms, SPARSE features strong sparsification based on structural properties of the RNA ensembles. Following PMcomp, SPARSE gains further speed-up from lightweight energy computation. Although all existing lightweight Sankoff-style methods restrict Sankoff's original model by disallowing loop deletions and insertions, SPARSE transfers the Sankoff algorithm to the lightweight energy model completely for the first time. Compared with LocARNA, SPARSE achieves similar alignment and better folding quality in significantly less time (speedup: 3.7). At similar run-time, it aligns low sequence identity instances substantially more accurate than RAF, which uses sequence-based heuristics. © The Author 2015. Published by Oxford University Press.
Sparse image reconstruction for molecular imaging.

PubMed

Ting, Michael; Raich, Raviv; Hero, Alfred O

2009-06-01

The application that motivates this paper is molecular imaging at the atomic level. When discretized at subatomic distances, the volume is inherently sparse. Noiseless measurements from an imaging technology can be modeled by convolution of the image with the system point spread function (psf). Such is the case with magnetic resonance force microscopy (MRFM), an emerging technology where imaging of an individual tobacco mosaic virus was recently demonstrated with nanometer resolution. We also consider additive white Gaussian noise (AWGN) in the measurements. Many prior works of sparse estimators have focused on the case when H has low coherence; however, the system matrix H in our application is the convolution matrix for the system psf. A typical convolution matrix has high coherence. This paper, therefore, does not assume a low coherence H. A discrete-continuous form of the Laplacian and atom at zero (LAZE) p.d.f. used by Johnstone and Silverman is formulated, and two sparse estimators derived by maximizing the joint p.d.f. of the observation and image conditioned on the hyperparameters. A thresholding rule that generalizes the hard and soft thresholding rule appears in the course of the derivation. This so-called hybrid thresholding rule, when used in the iterative thresholding framework, gives rise to the hybrid estimator, a generalization of the lasso. Estimates of the hyperparameters for the lasso and hybrid estimator are obtained via Stein's unbiased risk estimate (SURE). A numerical study with a Gaussian psf and two sparse images shows that the hybrid estimator outperforms the lasso.
Multi-color incomplete Cholesky conjugate gradient methods for vector computers. Ph.D. Thesis

NASA Technical Reports Server (NTRS)

Poole, E. L.

1986-01-01

In this research, we are concerned with the solution on vector computers of linear systems of equations, Ax = b, where A is a larger, sparse symmetric positive definite matrix. We solve the system using an iterative method, the incomplete Cholesky conjugate gradient method (ICCG). We apply a multi-color strategy to obtain p-color matrices for which a block-oriented ICCG method is implemented on the CYBER 205. (A p-colored matrix is a matrix which can be partitioned into a pXp block matrix where the diagonal blocks are diagonal matrices). This algorithm, which is based on a no-fill strategy, achieves O(N/p) length vector operations in both the decomposition of A and in the forward and back solves necessary at each iteration of the method. We discuss the natural ordering of the unknowns as an ordering that minimizes the number of diagonals in the matrix and define multi-color orderings in terms of disjoint sets of the unknowns. We give necessary and sufficient conditions to determine which multi-color orderings of the unknowns correpond to p-color matrices. A performance model is given which is used both to predict execution time for ICCG methods and also to compare an ICCG method to conjugate gradient without preconditioning or another ICCG method. Results are given from runs on the CYBER 205 at NASA's Langley Research Center for four model problems.
SMURC: High-Dimension Small-Sample Multivariate Regression With Covariance Estimation.

PubMed

Bayar, Belhassen; Bouaynaya, Nidhal; Shterenberg, Roman

2017-03-01

We consider a high-dimension low sample-size multivariate regression problem that accounts for correlation of the response variables. The system is underdetermined as there are more parameters than samples. We show that the maximum likelihood approach with covariance estimation is senseless because the likelihood diverges. We subsequently propose a normalization of the likelihood function that guarantees convergence. We call this method small-sample multivariate regression with covariance (SMURC) estimation. We derive an optimization problem and its convex approximation to compute SMURC. Simulation results show that the proposed algorithm outperforms the regularized likelihood estimator with known covariance matrix and the sparse conditional Gaussian graphical model. We also apply SMURC to the inference of the wing-muscle gene network of the Drosophila melanogaster (fruit fly).
Underdetermined blind separation of three-way fluorescence spectra of PAHs in water

NASA Astrophysics Data System (ADS)

Yang, Ruifang; Zhao, Nanjing; Xiao, Xue; Zhu, Wei; Chen, Yunan; Yin, Gaofang; Liu, Jianguo; Liu, Wenqing

2018-06-01

In this work, underdetermined blind decomposition method is developed to recognize individual components from the three-way fluorescent spectra of their mixtures by using sparse component analysis (SCA). The mixing matrix is estimated from the mixtures using fuzzy data clustering algorithm together with the scatters corresponding to local energy maximum value in the time-frequency domain, and the spectra of object components are recovered by pseudo inverse technique. As an example, using this method three and four pure components spectra can be blindly extracted from two samples of their mixture, with similarities between resolved and reference spectra all above 0.80. This work opens a new and effective path to realize monitoring PAHs in water by three-way fluorescence spectroscopy technique.
Implicit Kalman filtering

NASA Technical Reports Server (NTRS)

Skliar, M.; Ramirez, W. F.

1997-01-01

For an implicitly defined discrete system, a new algorithm for Kalman filtering is developed and an efficient numerical implementation scheme is proposed. Unlike the traditional explicit approach, the implicit filter can be readily applied to ill-conditioned systems and allows for generalization to descriptor systems. The implementation of the implicit filter depends on the solution of the congruence matrix equation (A1)(Px)(AT1) = Py. We develop a general iterative method for the solution of this equation, and prove necessary and sufficient conditions for convergence. It is shown that when the system matrices of an implicit system are sparse, the implicit Kalman filter requires significantly less computer time and storage to implement as compared to the traditional explicit Kalman filter. Simulation results are presented to illustrate and substantiate the theoretical developments.
Constraint treatment techniques and parallel algorithms for multibody dynamic analysis. Ph.D. Thesis

NASA Technical Reports Server (NTRS)

Chiou, Jin-Chern

1990-01-01

Computational procedures for kinematic and dynamic analysis of three-dimensional multibody dynamic (MBD) systems are developed from the differential-algebraic equations (DAE's) viewpoint. Constraint violations during the time integration process are minimized and penalty constraint stabilization techniques and partitioning schemes are developed. The governing equations of motion, a two-stage staggered explicit-implicit numerical algorithm, are treated which takes advantage of a partitioned solution procedure. A robust and parallelizable integration algorithm is developed. This algorithm uses a two-stage staggered central difference algorithm to integrate the translational coordinates and the angular velocities. The angular orientations of bodies in MBD systems are then obtained by using an implicit algorithm via the kinematic relationship between Euler parameters and angular velocities. It is shown that the combination of the present solution procedures yields a computationally more accurate solution. To speed up the computational procedures, parallel implementation of the present constraint treatment techniques, the two-stage staggered explicit-implicit numerical algorithm was efficiently carried out. The DAE's and the constraint treatment techniques were transformed into arrowhead matrices to which Schur complement form was derived. By fully exploiting the sparse matrix structural analysis techniques, a parallel preconditioned conjugate gradient numerical algorithm is used to solve the systems equations written in Schur complement form. A software testbed was designed and implemented in both sequential and parallel computers. This testbed was used to demonstrate the robustness and efficiency of the constraint treatment techniques, the accuracy of the two-stage staggered explicit-implicit numerical algorithm, and the speed up of the Schur-complement-based parallel preconditioned conjugate gradient algorithm on a parallel computer.
Visual saliency detection based on in-depth analysis of sparse representation

NASA Astrophysics Data System (ADS)

Wang, Xin; Shen, Siqiu; Ning, Chen

2018-03-01

Visual saliency detection has been receiving great attention in recent years since it can facilitate a wide range of applications in computer vision. A variety of saliency models have been proposed based on different assumptions within which saliency detection via sparse representation is one of the newly arisen approaches. However, most existing sparse representation-based saliency detection methods utilize partial characteristics of sparse representation, lacking of in-depth analysis. Thus, they may have limited detection performance. Motivated by this, this paper proposes an algorithm for detecting visual saliency based on in-depth analysis of sparse representation. A number of discriminative dictionaries are first learned with randomly sampled image patches by means of inner product-based dictionary atom classification. Then, the input image is partitioned into many image patches, and these patches are classified into salient and nonsalient ones based on the in-depth analysis of sparse coding coefficients. Afterward, sparse reconstruction errors are calculated for the salient and nonsalient patch sets. By investigating the sparse reconstruction errors, the most salient atoms, which tend to be from the most salient region, are screened out and taken away from the discriminative dictionaries. Finally, an effective method is exploited for saliency map generation with the reduced dictionaries. Comprehensive evaluations on publicly available datasets and comparisons with some state-of-the-art approaches demonstrate the effectiveness of the proposed algorithm.

Sparsity-Cognizant Algorithms with Applications to Communications, Signal Processing, and the Smart Grid

NASA Astrophysics Data System (ADS)

Zhu, Hao

Sparsity plays an instrumental role in a plethora of scientific fields, including statistical inference for variable selection, parsimonious signal representations, and solving under-determined systems of linear equations - what has led to the ground-breaking result of compressive sampling (CS). This Thesis leverages exciting ideas of sparse signal reconstruction to develop sparsity-cognizant algorithms, and analyze their performance. The vision is to devise tools exploiting the 'right' form of sparsity for the 'right' application domain of multiuser communication systems, array signal processing systems, and the emerging challenges in the smart power grid. Two important power system monitoring tasks are addressed first by capitalizing on the hidden sparsity. To robustify power system state estimation, a sparse outlier model is leveraged to capture the possible corruption in every datum, while the problem nonconvexity due to nonlinear measurements is handled using the semidefinite relaxation technique. Different from existing iterative methods, the proposed algorithm approximates well the global optimum regardless of the initialization. In addition, for enhanced situational awareness, a novel sparse overcomplete representation is introduced to capture (possibly multiple) line outages, and develop real-time algorithms for solving the combinatorially complex identification problem. The proposed algorithms exhibit near-optimal performance while incurring only linear complexity in the number of lines, which makes it possible to quickly bring contingencies to attention. This Thesis also accounts for two basic issues in CS, namely fully-perturbed models and the finite alphabet property. The sparse total least-squares (S-TLS) approach is proposed to furnish CS algorithms for fully-perturbed linear models, leading to statistically optimal and computationally efficient solvers. The S-TLS framework is well motivated for grid-based sensing applications and exhibits higher accuracy than existing sparse algorithms. On the other hand, exploiting the finite alphabet of unknown signals emerges naturally in communication systems, along with sparsity coming from the low activity of each user. Compared to approaches only accounting for either one of the two, joint exploitation of both leads to statistically optimal detectors with improved error performance.
Dictionary Learning Algorithms for Sparse Representation

PubMed Central

Kreutz-Delgado, Kenneth; Murray, Joseph F.; Rao, Bhaskar D.; Engan, Kjersti; Lee, Te-Won; Sejnowski, Terrence J.

2010-01-01

Algorithms for data-driven learning of domain-specific overcomplete dictionaries are developed to obtain maximum likelihood and maximum a posteriori dictionary estimates based on the use of Bayesian models with concave/Schur-concave (CSC) negative log priors. Such priors are appropriate for obtaining sparse representations of environmental signals within an appropriately chosen (environmentally matched) dictionary. The elements of the dictionary can be interpreted as concepts, features, or words capable of succinct expression of events encountered in the environment (the source of the measured signals). This is a generalization of vector quantization in that one is interested in a description involving a few dictionary entries (the proverbial “25 words or less”), but not necessarily as succinct as one entry. To learn an environmentally adapted dictionary capable of concise expression of signals generated by the environment, we develop algorithms that iterate between a representative set of sparse representations found by variants of FOCUSS and an update of the dictionary using these sparse representations. Experiments were performed using synthetic data and natural images. For complete dictionaries, we demonstrate that our algorithms have improved performance over other independent component analysis (ICA) methods, measured in terms of signal-to-noise ratios of separated sources. In the overcomplete case, we show that the true underlying dictionary and sparse sources can be accurately recovered. In tests with natural images, learned overcomplete dictionaries are shown to have higher coding efficiency than complete dictionaries; that is, images encoded with an over-complete dictionary have both higher compression (fewer bits per pixel) and higher accuracy (lower mean square error). PMID:12590811
Low-Rank Correction Methods for Algebraic Domain Decomposition Preconditioners

DOE PAGES

Li, Ruipeng; Saad, Yousef

2017-08-01

This study presents a parallel preconditioning method for distributed sparse linear systems, based on an approximate inverse of the original matrix, that adopts a general framework of distributed sparse matrices and exploits domain decomposition (DD) and low-rank corrections. The DD approach decouples the matrix and, once inverted, a low-rank approximation is applied by exploiting the Sherman--Morrison--Woodbury formula, which yields two variants of the preconditioning methods. The low-rank expansion is computed by the Lanczos procedure with reorthogonalizations. Numerical experiments indicate that, when combined with Krylov subspace accelerators, this preconditioner can be efficient and robust for solving symmetric sparse linear systems. Comparisonsmore » with pARMS, a DD-based parallel incomplete LU (ILU) preconditioning method, are presented for solving Poisson's equation and linear elasticity problems.« less
Low-Rank Correction Methods for Algebraic Domain Decomposition Preconditioners

DOE Office of Scientific and Technical Information (OSTI.GOV)

Li, Ruipeng; Saad, Yousef

This study presents a parallel preconditioning method for distributed sparse linear systems, based on an approximate inverse of the original matrix, that adopts a general framework of distributed sparse matrices and exploits domain decomposition (DD) and low-rank corrections. The DD approach decouples the matrix and, once inverted, a low-rank approximation is applied by exploiting the Sherman--Morrison--Woodbury formula, which yields two variants of the preconditioning methods. The low-rank expansion is computed by the Lanczos procedure with reorthogonalizations. Numerical experiments indicate that, when combined with Krylov subspace accelerators, this preconditioner can be efficient and robust for solving symmetric sparse linear systems. Comparisonsmore » with pARMS, a DD-based parallel incomplete LU (ILU) preconditioning method, are presented for solving Poisson's equation and linear elasticity problems.« less
Incoherent dictionary learning for reducing crosstalk noise in least-squares reverse time migration

NASA Astrophysics Data System (ADS)

Wu, Juan; Bai, Min

2018-05-01

We propose to apply a novel incoherent dictionary learning (IDL) algorithm for regularizing the least-squares inversion in seismic imaging. The IDL is proposed to overcome the drawback of traditional dictionary learning algorithm in losing partial texture information. Firstly, the noisy image is divided into overlapped image patches, and some random patches are extracted for dictionary learning. Then, we apply the IDL technology to minimize the coherency between atoms during dictionary learning. Finally, the sparse representation problem is solved by a sparse coding algorithm, and image is restored by those sparse coefficients. By reducing the correlation among atoms, it is possible to preserve most of the small-scale features in the image while removing much of the long-wavelength noise. The application of the IDL method to regularization of seismic images from least-squares reverse time migration shows successful performance.
DOLPHIn—Dictionary Learning for Phase Retrieval

NASA Astrophysics Data System (ADS)

Tillmann, Andreas M.; Eldar, Yonina C.; Mairal, Julien

2016-12-01

We propose a new algorithm to learn a dictionary for reconstructing and sparsely encoding signals from measurements without phase. Specifically, we consider the task of estimating a two-dimensional image from squared-magnitude measurements of a complex-valued linear transformation of the original image. Several recent phase retrieval algorithms exploit underlying sparsity of the unknown signal in order to improve recovery performance. In this work, we consider such a sparse signal prior in the context of phase retrieval, when the sparsifying dictionary is not known in advance. Our algorithm jointly reconstructs the unknown signal - possibly corrupted by noise - and learns a dictionary such that each patch of the estimated image can be sparsely represented. Numerical experiments demonstrate that our approach can obtain significantly better reconstructions for phase retrieval problems with noise than methods that cannot exploit such "hidden" sparsity. Moreover, on the theoretical side, we provide a convergence result for our method.
Accelerated Path-following Iterative Shrinkage Thresholding Algorithm with Application to Semiparametric Graph Estimation

PubMed Central

Zhao, Tuo; Liu, Han

2016-01-01

We propose an accelerated path-following iterative shrinkage thresholding algorithm (APISTA) for solving high dimensional sparse nonconvex learning problems. The main difference between APISTA and the path-following iterative shrinkage thresholding algorithm (PISTA) is that APISTA exploits an additional coordinate descent subroutine to boost the computational performance. Such a modification, though simple, has profound impact: APISTA not only enjoys the same theoretical guarantee as that of PISTA, i.e., APISTA attains a linear rate of convergence to a unique sparse local optimum with good statistical properties, but also significantly outperforms PISTA in empirical benchmarks. As an application, we apply APISTA to solve a family of nonconvex optimization problems motivated by estimating sparse semiparametric graphical models. APISTA allows us to obtain new statistical recovery results which do not exist in the existing literature. Thorough numerical results are provided to back up our theory. PMID:28133430
Solving large-scale dynamic systems using band Lanczos method in Rockwell NASTRAN on CRAY X-MP

NASA Technical Reports Server (NTRS)

Gupta, V. K.; Zillmer, S. D.; Allison, R. E.

1986-01-01

The improved cost effectiveness using better models, more accurate and faster algorithms and large scale computing offers more representative dynamic analyses. The band Lanczos eigen-solution method was implemented in Rockwell's version of 1984 COSMIC-released NASTRAN finite element structural analysis computer program to effectively solve for structural vibration modes including those of large complex systems exceeding 10,000 degrees of freedom. The Lanczos vectors were re-orthogonalized locally using the Lanczos Method and globally using the modified Gram-Schmidt method for sweeping rigid-body modes and previously generated modes and Lanczos vectors. The truncated band matrix was solved for vibration frequencies and mode shapes using Givens rotations. Numerical examples are included to demonstrate the cost effectiveness and accuracy of the method as implemented in ROCKWELL NASTRAN. The CRAY version is based on RPK's COSMIC/NASTRAN. The band Lanczos method was more reliable and accurate and converged faster than the single vector Lanczos Method. The band Lanczos method was comparable to the subspace iteration method which was a block version of the inverse power method. However, the subspace matrix tended to be fully populated in the case of subspace iteration and not as sparse as a band matrix.
Hybrid reconstruction of quantum density matrix: when low-rank meets sparsity

NASA Astrophysics Data System (ADS)

Li, Kezhi; Zheng, Kai; Yang, Jingbei; Cong, Shuang; Liu, Xiaomei; Li, Zhaokai

2017-12-01

Both the mathematical theory and experiments have verified that the quantum state tomography based on compressive sensing is an efficient framework for the reconstruction of quantum density states. In recent physical experiments, we found that many unknown density matrices in which people are interested in are low-rank as well as sparse. Bearing this information in mind, in this paper we propose a reconstruction algorithm that combines the low-rank and the sparsity property of density matrices and further theoretically prove that the solution of the optimization function can be, and only be, the true density matrix satisfying the model with overwhelming probability, as long as a necessary number of measurements are allowed. The solver leverages the fixed-point equation technique in which a step-by-step strategy is developed by utilizing an extended soft threshold operator that copes with complex values. Numerical experiments of the density matrix estimation for real nuclear magnetic resonance devices reveal that the proposed method achieves a better accuracy compared to some existing methods. We believe that the proposed method could be leveraged as a generalized approach and widely implemented in the quantum state estimation.
New Parallel Algorithms for Structural Analysis and Design of Aerospace Structures

NASA Technical Reports Server (NTRS)

Nguyen, Duc T.

1998-01-01

Subspace and Lanczos iterations have been developed, well documented, and widely accepted as efficient methods for obtaining p-lowest eigen-pair solutions of large-scale, practical engineering problems. The focus of this paper is to incorporate recent developments in vectorized sparse technologies in conjunction with Subspace and Lanczos iterative algorithms for computational enhancements. Numerical performance, in terms of accuracy and efficiency of the proposed sparse strategies for Subspace and Lanczos algorithm, is demonstrated by solving for the lowest frequencies and mode shapes of structural problems on the IBM-R6000/590 and SunSparc 20 workstations.
HIGH DIMENSIONAL COVARIANCE MATRIX ESTIMATION IN APPROXIMATE FACTOR MODELS.

PubMed

Fan, Jianqing; Liao, Yuan; Mincheva, Martina

2011-01-01

The variance covariance matrix plays a central role in the inferential theories of high dimensional factor models in finance and economics. Popular regularization methods of directly exploiting sparsity are not directly applicable to many financial problems. Classical methods of estimating the covariance matrices are based on the strict factor models, assuming independent idiosyncratic components. This assumption, however, is restrictive in practical applications. By assuming sparse error covariance matrix, we allow the presence of the cross-sectional correlation even after taking out common factors, and it enables us to combine the merits of both methods. We estimate the sparse covariance using the adaptive thresholding technique as in Cai and Liu (2011), taking into account the fact that direct observations of the idiosyncratic components are unavailable. The impact of high dimensionality on the covariance matrix estimation based on the factor structure is then studied.
Optimization-based image reconstruction from sparse-view data in offset-detector CBCT

NASA Astrophysics Data System (ADS)

Bian, Junguo; Wang, Jiong; Han, Xiao; Sidky, Emil Y.; Shao, Lingxiong; Pan, Xiaochuan

2013-01-01

The field of view (FOV) of a cone-beam computed tomography (CBCT) unit in a single-photon emission computed tomography (SPECT)/CBCT system can be increased by offsetting the CBCT detector. Analytic-based algorithms have been developed for image reconstruction from data collected at a large number of densely sampled views in offset-detector CBCT. However, the radiation dose involved in a large number of projections can be of a health concern to the imaged subject. CBCT-imaging dose can be reduced by lowering the number of projections. As analytic-based algorithms are unlikely to reconstruct accurate images from sparse-view data, we investigate and characterize in the work optimization-based algorithms, including an adaptive steepest descent-weighted projection onto convex sets (ASD-WPOCS) algorithms, for image reconstruction from sparse-view data collected in offset-detector CBCT. Using simulated data and real data collected from a physical pelvis phantom and patient, we verify and characterize properties of the algorithms under study. Results of our study suggest that optimization-based algorithms such as ASD-WPOCS may be developed for yielding images of potential utility from a number of projections substantially smaller than those used currently in clinical SPECT/CBCT imaging, thus leading to a dose reduction in CBCT imaging.
Online Hierarchical Sparse Representation of Multifeature for Robust Object Tracking

PubMed Central

Qu, Shiru

2016-01-01

Object tracking based on sparse representation has given promising tracking results in recent years. However, the trackers under the framework of sparse representation always overemphasize the sparse representation and ignore the correlation of visual information. In addition, the sparse coding methods only encode the local region independently and ignore the spatial neighborhood information of the image. In this paper, we propose a robust tracking algorithm. Firstly, multiple complementary features are used to describe the object appearance; the appearance model of the tracked target is modeled by instantaneous and stable appearance features simultaneously. A two-stage sparse-coded method which takes the spatial neighborhood information of the image patch and the computation burden into consideration is used to compute the reconstructed object appearance. Then, the reliability of each tracker is measured by the tracking likelihood function of transient and reconstructed appearance models. Finally, the most reliable tracker is obtained by a well established particle filter framework; the training set and the template library are incrementally updated based on the current tracking results. Experiment results on different challenging video sequences show that the proposed algorithm performs well with superior tracking accuracy and robustness. PMID:27630710
Bayesian inference of the number of factors in gene-expression analysis: application to human virus challenge studies.

PubMed

Chen, Bo; Chen, Minhua; Paisley, John; Zaas, Aimee; Woods, Christopher; Ginsburg, Geoffrey S; Hero, Alfred; Lucas, Joseph; Dunson, David; Carin, Lawrence

2010-11-09

Nonparametric Bayesian techniques have been developed recently to extend the sophistication of factor models, allowing one to infer the number of appropriate factors from the observed data. We consider such techniques for sparse factor analysis, with application to gene-expression data from three virus challenge studies. Particular attention is placed on employing the Beta Process (BP), the Indian Buffet Process (IBP), and related sparseness-promoting techniques to infer a proper number of factors. The posterior density function on the model parameters is computed using Gibbs sampling and variational Bayesian (VB) analysis. Time-evolving gene-expression data are considered for respiratory syncytial virus (RSV), Rhino virus, and influenza, using blood samples from healthy human subjects. These data were acquired in three challenge studies, each executed after receiving institutional review board (IRB) approval from Duke University. Comparisons are made between several alternative means of per-forming nonparametric factor analysis on these data, with comparisons as well to sparse-PCA and Penalized Matrix Decomposition (PMD), closely related non-Bayesian approaches. Applying the Beta Process to the factor scores, or to the singular values of a pseudo-SVD construction, the proposed algorithms infer the number of factors in gene-expression data. For real data the "true" number of factors is unknown; in our simulations we consider a range of noise variances, and the proposed Bayesian models inferred the number of factors accurately relative to other methods in the literature, such as sparse-PCA and PMD. We have also identified a "pan-viral" factor of importance for each of the three viruses considered in this study. We have identified a set of genes associated with this pan-viral factor, of interest for early detection of such viruses based upon the host response, as quantified via gene-expression data.
Incorporating structure from motion uncertainty into image-based pose estimation

NASA Astrophysics Data System (ADS)

Ludington, Ben T.; Brown, Andrew P.; Sheffler, Michael J.; Taylor, Clark N.; Berardi, Stephen

2015-05-01

A method for generating and utilizing structure from motion (SfM) uncertainty estimates within image-based pose estimation is presented. The method is applied to a class of problems in which SfM algorithms are utilized to form a geo-registered reference model of a particular ground area using imagery gathered during flight by a small unmanned aircraft. The model is then used to form camera pose estimates in near real-time from imagery gathered later. The resulting pose estimates can be utilized by any of the other onboard systems (e.g. as a replacement for GPS data) or downstream exploitation systems, e.g., image-based object trackers. However, many of the consumers of pose estimates require an assessment of the pose accuracy. The method for generating the accuracy assessment is presented. First, the uncertainty in the reference model is estimated. Bundle Adjustment (BA) is utilized for model generation. While the high-level approach for generating a covariance matrix of the BA parameters is straightforward, typical computing hardware is not able to support the required operations due to the scale of the optimization problem within BA. Therefore, a series of sparse matrix operations is utilized to form an exact covariance matrix for only the parameters that are needed at a particular moment. Once the uncertainty in the model has been determined, it is used to augment Perspective-n-Point pose estimation algorithms to improve the pose accuracy and to estimate the resulting pose uncertainty. The implementation of the described method is presented along with results including results gathered from flight test data.
Multi-linear sparse reconstruction for SAR imaging based on higher-order SVD

NASA Astrophysics Data System (ADS)

Gao, Yu-Fei; Gui, Guan; Cong, Xun-Chao; Yang, Yue; Zou, Yan-Bin; Wan, Qun

2017-12-01

This paper focuses on the spotlight synthetic aperture radar (SAR) imaging for point scattering targets based on tensor modeling. In a real-world scenario, scatterers usually distribute in the block sparse pattern. Such a distribution feature has been scarcely utilized by the previous studies of SAR imaging. Our work takes advantage of this structure property of the target scene, constructing a multi-linear sparse reconstruction algorithm for SAR imaging. The multi-linear block sparsity is introduced into higher-order singular value decomposition (SVD) with a dictionary constructing procedure by this research. The simulation experiments for ideal point targets show the robustness of the proposed algorithm to the noise and sidelobe disturbance which always influence the imaging quality of the conventional methods. The computational resources requirement is further investigated in this paper. As a consequence of the algorithm complexity analysis, the present method possesses the superiority on resource consumption compared with the classic matching pursuit method. The imaging implementations for practical measured data also demonstrate the effectiveness of the algorithm developed in this paper.
Comparison of l₁-Norm SVR and Sparse Coding Algorithms for Linear Regression.

PubMed

Zhang, Qingtian; Hu, Xiaolin; Zhang, Bo

2015-08-01

Support vector regression (SVR) is a popular function estimation technique based on Vapnik's concept of support vector machine. Among many variants, the l1-norm SVR is known to be good at selecting useful features when the features are redundant. Sparse coding (SC) is a technique widely used in many areas and a number of efficient algorithms are available. Both l1-norm SVR and SC can be used for linear regression. In this brief, the close connection between the l1-norm SVR and SC is revealed and some typical algorithms are compared for linear regression. The results show that the SC algorithms outperform the Newton linear programming algorithm, an efficient l1-norm SVR algorithm, in efficiency. The algorithms are then used to design the radial basis function (RBF) neural networks. Experiments on some benchmark data sets demonstrate the high efficiency of the SC algorithms. In particular, one of the SC algorithms, the orthogonal matching pursuit is two orders of magnitude faster than a well-known RBF network designing algorithm, the orthogonal least squares algorithm.
Recent advances on terrain database correlation testing

NASA Astrophysics Data System (ADS)

Sakude, Milton T.; Schiavone, Guy A.; Morelos-Borja, Hector; Martin, Glenn; Cortes, Art

1998-08-01

Terrain database correlation is a major requirement for interoperability in distributed simulation. There are numerous situations in which terrain database correlation problems can occur that, in turn, lead to lack of interoperability in distributed training simulations. Examples are the use of different run-time terrain databases derived from inconsistent on source data, the use of different resolutions, and the use of different data models between databases for both terrain and culture data. IST has been developing a suite of software tools, named ZCAP, to address terrain database interoperability issues. In this paper we discuss recent enhancements made to this suite, including improved algorithms for sampling and calculating line-of-sight, an improved method for measuring terrain roughness, and the application of a sparse matrix method to the terrain remediation solution developed at the Visual Systems Lab of the Institute for Simulation and Training. We review the application of some of these new algorithms to the terrain correlation measurement processes. The application of these new algorithms improves our support for very large terrain databases, and provides the capability for performing test replications to estimate the sampling error of the tests. With this set of tools, a user can quantitatively assess the degree of correlation between large terrain databases.
Object detection approach using generative sparse, hierarchical networks with top-down and lateral connections for combining texture/color detection and shape/contour detection

DOEpatents

Paiton, Dylan M.; Kenyon, Garrett T.; Brumby, Steven P.; Schultz, Peter F.; George, John S.

2015-07-28

An approach to detecting objects in an image dataset may combine texture/color detection, shape/contour detection, and/or motion detection using sparse, generative, hierarchical models with lateral and top-down connections. A first independent representation of objects in an image dataset may be produced using a color/texture detection algorithm. A second independent representation of objects in the image dataset may be produced using a shape/contour detection algorithm. A third independent representation of objects in the image dataset may be produced using a motion detection algorithm. The first, second, and third independent representations may then be combined into a single coherent output using a combinatorial algorithm.
An implementation of the look-ahead Lanczos algorithm for non-Hermitian matrices, part 1

NASA Technical Reports Server (NTRS)

Freund, Roland W.; Gutknecht, Martin H.; Nachtigal, Noel M.

1990-01-01

The nonsymmetric Lanczos method can be used to compute eigenvalues of large sparse non-Hermitian matrices or to solve large sparse non-Hermitian linear systems. However, the original Lanczos algorithm is susceptible to possible breakdowns and potential instabilities. We present an implementation of a look-ahead version of the Lanczos algorithm which overcomes these problems by skipping over those steps in which a breakdown or near-breakdown would occur in the standard process. The proposed algorithm can handle look-ahead steps of any length and is not restricted to steps of length 2, as earlier implementations are. Also, our implementation has the feature that it requires roughly the same number of inner products as the standard Lanczos process without look-ahead.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chen, Chao; Pouransari, Hadi; Rajamanickam, Sivasankaran

We present a parallel hierarchical solver for general sparse linear systems on distributed-memory machines. For large-scale problems, this fully algebraic algorithm is faster and more memory-efficient than sparse direct solvers because it exploits the low-rank structure of fill-in blocks. Depending on the accuracy of low-rank approximations, the hierarchical solver can be used either as a direct solver or as a preconditioner. The parallel algorithm is based on data decomposition and requires only local communication for updating boundary data on every processor. Moreover, the computation-to-communication ratio of the parallel algorithm is approximately the volume-to-surface-area ratio of the subdomain owned by everymore » processor. We also provide various numerical results to demonstrate the versatility and scalability of the parallel algorithm.« less
A performance study of sparse Cholesky factorization on INTEL iPSC/860

NASA Technical Reports Server (NTRS)

Zubair, M.; Ghose, M.

1992-01-01

The problem of Cholesky factorization of a sparse matrix has been very well investigated on sequential machines. A number of efficient codes exist for factorizing large unstructured sparse matrices. However, there is a lack of such efficient codes on parallel machines in general, and distributed machines in particular. Some of the issues that are critical to the implementation of sparse Cholesky factorization on a distributed memory parallel machine are ordering, partitioning and mapping, load balancing, and ordering of various tasks within a processor. Here, we focus on the effect of various partitioning schemes on the performance of sparse Cholesky factorization on the Intel iPSC/860. Also, a new partitioning heuristic for structured as well as unstructured sparse matrices is proposed, and its performance is compared with other schemes.
Speckle noise reduction for optical coherence tomography based on adaptive 2D dictionary

NASA Astrophysics Data System (ADS)

Lv, Hongli; Fu, Shujun; Zhang, Caiming; Zhai, Lin

2018-05-01

As a high-resolution biomedical imaging modality, optical coherence tomography (OCT) is widely used in medical sciences. However, OCT images often suffer from speckle noise, which can mask some important image information, and thus reduce the accuracy of clinical diagnosis. Taking full advantage of nonlocal self-similarity and adaptive 2D-dictionary-based sparse representation, in this work, a speckle noise reduction algorithm is proposed for despeckling OCT images. To reduce speckle noise while preserving local image features, similar nonlocal patches are first extracted from the noisy image and put into groups using a gamma- distribution-based block matching method. An adaptive 2D dictionary is then learned for each patch group. Unlike traditional vector-based sparse coding, we express each image patch by the linear combination of a few matrices. This image-to-matrix method can exploit the local correlation between pixels. Since each image patch might belong to several groups, the despeckled OCT image is finally obtained by aggregating all filtered image patches. The experimental results demonstrate the superior performance of the proposed method over other state-of-the-art despeckling methods, in terms of objective metrics and visual inspection.
MRM-Lasso: A Sparse Multiview Feature Selection Method via Low-Rank Analysis.

PubMed

Yang, Wanqi; Gao, Yang; Shi, Yinghuan; Cao, Longbing

2015-11-01

Learning about multiview data involves many applications, such as video understanding, image classification, and social media. However, when the data dimension increases dramatically, it is important but very challenging to remove redundant features in multiview feature selection. In this paper, we propose a novel feature selection algorithm, multiview rank minimization-based Lasso (MRM-Lasso), which jointly utilizes Lasso for sparse feature selection and rank minimization for learning relevant patterns across views. Instead of simply integrating multiple Lasso from view level, we focus on the performance of sample-level (sample significance) and introduce pattern-specific weights into MRM-Lasso. The weights are utilized to measure the contribution of each sample to the labels in the current view. In addition, the latent correlation across different views is successfully captured by learning a low-rank matrix consisting of pattern-specific weights. The alternating direction method of multipliers is applied to optimize the proposed MRM-Lasso. Experiments on four real-life data sets show that features selected by MRM-Lasso have better multiview classification performance than the baselines. Moreover, pattern-specific weights are demonstrated to be significant for learning about multiview data, compared with view-specific weights.
A Digital Compressed Sensing-Based Energy-Efficient Single-Spot Bluetooth ECG Node

PubMed Central

Cai, Zhipeng; Zou, Fumin; Zhang, Xiangyu

2018-01-01

Energy efficiency is still the obstacle for long-term real-time wireless ECG monitoring. In this paper, a digital compressed sensing- (CS-) based single-spot Bluetooth ECG node is proposed to deal with the challenge in wireless ECG application. A periodic sleep/wake-up scheme and a CS-based compression algorithm are implemented in a node, which consists of ultra-low-power analog front-end, microcontroller, Bluetooth 4.0 communication module, and so forth. The efficiency improvement and the node's specifics are evidenced by the experiments using the ECG signals sampled by the proposed node under daily activities of lay, sit, stand, walk, and run. Under using sparse binary matrix (SBM), block sparse Bayesian learning (BSBL) method, and discrete cosine transform (DCT) basis, all ECG signals were essentially undistorted recovered with root-mean-square differences (PRDs) which are less than 6%. The proposed sleep/wake-up scheme and data compression can reduce the airtime over energy-hungry wireless links, the energy consumption of proposed node is 6.53 mJ, and the energy consumption of radio decreases 77.37%. Moreover, the energy consumption increase caused by CS code execution is negligible, which is 1.3% of the total energy consumption. PMID:29599945
A Digital Compressed Sensing-Based Energy-Efficient Single-Spot Bluetooth ECG Node.

PubMed

Luo, Kan; Cai, Zhipeng; Du, Keqin; Zou, Fumin; Zhang, Xiangyu; Li, Jianqing

2018-01-01

Energy efficiency is still the obstacle for long-term real-time wireless ECG monitoring. In this paper, a digital compressed sensing- (CS-) based single-spot Bluetooth ECG node is proposed to deal with the challenge in wireless ECG application. A periodic sleep/wake-up scheme and a CS-based compression algorithm are implemented in a node, which consists of ultra-low-power analog front-end, microcontroller, Bluetooth 4.0 communication module, and so forth. The efficiency improvement and the node's specifics are evidenced by the experiments using the ECG signals sampled by the proposed node under daily activities of lay, sit, stand, walk, and run. Under using sparse binary matrix (SBM), block sparse Bayesian learning (BSBL) method, and discrete cosine transform (DCT) basis, all ECG signals were essentially undistorted recovered with root-mean-square differences (PRDs) which are less than 6%. The proposed sleep/wake-up scheme and data compression can reduce the airtime over energy-hungry wireless links, the energy consumption of proposed node is 6.53 mJ, and the energy consumption of radio decreases 77.37%. Moreover, the energy consumption increase caused by CS code execution is negligible, which is 1.3% of the total energy consumption.
Securing image information using double random phase encoding and parallel compressive sensing with updated sampling processes

NASA Astrophysics Data System (ADS)

Hu, Guiqiang; Xiao, Di; Wang, Yong; Xiang, Tao; Zhou, Qing

2017-11-01

Recently, a new kind of image encryption approach using compressive sensing (CS) and double random phase encoding has received much attention due to the advantages such as compressibility and robustness. However, this approach is found to be vulnerable to chosen plaintext attack (CPA) if the CS measurement matrix is re-used. Therefore, designing an efficient measurement matrix updating mechanism that ensures resistance to CPA is of practical significance. In this paper, we provide a novel solution to update the CS measurement matrix by altering the secret sparse basis with the help of counter mode operation. Particularly, the secret sparse basis is implemented by a reality-preserving fractional cosine transform matrix. Compared with the conventional CS-based cryptosystem that totally generates all the random entries of measurement matrix, our scheme owns efficiency superiority while guaranteeing resistance to CPA. Experimental and analysis results show that the proposed scheme has a good security performance and has robustness against noise and occlusion.
Superiorized algorithm for reconstruction of CT images from sparse-view and limited-angle polyenergetic data

NASA Astrophysics Data System (ADS)

Humphries, T.; Winn, J.; Faridani, A.

2017-08-01

Recent work in CT image reconstruction has seen increasing interest in the use of total variation (TV) and related penalties to regularize problems involving reconstruction from undersampled or incomplete data. Superiorization is a recently proposed heuristic which provides an automatic procedure to ‘superiorize’ an iterative image reconstruction algorithm with respect to a chosen objective function, such as TV. Under certain conditions, the superiorized algorithm is guaranteed to find a solution that is as satisfactory as any found by the original algorithm with respect to satisfying the constraints of the problem; this solution is also expected to be superior with respect to the chosen objective. Most work on superiorization has used reconstruction algorithms which assume a linear measurement model, which in the case of CT corresponds to data generated from a monoenergetic x-ray beam. Many CT systems generate x-rays from a polyenergetic spectrum, however, in which the measured data represent an integral of object attenuation over all energies in the spectrum. This inconsistency with the linear model produces the well-known beam hardening artifacts, which impair analysis of CT images. In this work we superiorize an iterative algorithm for reconstruction from polyenergetic data, using both TV and an anisotropic TV (ATV) penalty. We apply the superiorized algorithm in numerical phantom experiments modeling both sparse-view and limited-angle scenarios. In our experiments, the superiorized algorithm successfully finds solutions which are as constraints-compatible as those found by the original algorithm, with significantly reduced TV and ATV values. The superiorized algorithm thus produces images with greatly reduced sparse-view and limited angle artifacts, which are also largely free of the beam hardening artifacts that would be present if a superiorized version of a monoenergetic algorithm were used.
HIGH DIMENSIONAL COVARIANCE MATRIX ESTIMATION IN APPROXIMATE FACTOR MODELS

PubMed Central

Fan, Jianqing; Liao, Yuan; Mincheva, Martina

2012-01-01

The variance covariance matrix plays a central role in the inferential theories of high dimensional factor models in finance and economics. Popular regularization methods of directly exploiting sparsity are not directly applicable to many financial problems. Classical methods of estimating the covariance matrices are based on the strict factor models, assuming independent idiosyncratic components. This assumption, however, is restrictive in practical applications. By assuming sparse error covariance matrix, we allow the presence of the cross-sectional correlation even after taking out common factors, and it enables us to combine the merits of both methods. We estimate the sparse covariance using the adaptive thresholding technique as in Cai and Liu (2011), taking into account the fact that direct observations of the idiosyncratic components are unavailable. The impact of high dimensionality on the covariance matrix estimation based on the factor structure is then studied. PMID:22661790
Resolution enhancement of tri-stereo remote sensing images by super resolution methods

NASA Astrophysics Data System (ADS)

Tuna, Caglayan; Akoguz, Alper; Unal, Gozde; Sertel, Elif

2016-10-01

Super resolution (SR) refers to generation of a High Resolution (HR) image from a decimated, blurred, low-resolution (LR) image set, which can be either a single frame or multi-frame that contains a collection of several images acquired from slightly different views of the same observation area. In this study, we propose a novel application of tri-stereo Remote Sensing (RS) satellite images to the super resolution problem. Since the tri-stereo RS images of the same observation area are acquired from three different viewing angles along the flight path of the satellite, these RS images are properly suited to a SR application. We first estimate registration between the chosen reference LR image and other LR images to calculate the sub pixel shifts among the LR images. Then, the warping, blurring and down sampling matrix operators are created as sparse matrices to avoid high memory and computational requirements, which would otherwise make the RS-SR solution impractical. Finally, the overall system matrix, which is constructed based on the obtained operator matrices is used to obtain the estimate HR image in one step in each iteration of the SR algorithm. Both the Laplacian and total variation regularizers are incorporated separately into our algorithm and the results are presented to demonstrate an improved quantitative performance against the standard interpolation method as well as improved qualitative results due expert evaluations.
Robust Gaussian Graphical Modeling via l1 Penalization

PubMed Central

Sun, Hokeun; Li, Hongzhe

2012-01-01

Summary Gaussian graphical models have been widely used as an effective method for studying the conditional independency structure among genes and for constructing genetic networks. However, gene expression data typically have heavier tails or more outlying observations than the standard Gaussian distribution. Such outliers in gene expression data can lead to wrong inference on the dependency structure among the genes. We propose a l1 penalized estimation procedure for the sparse Gaussian graphical models that is robustified against possible outliers. The likelihood function is weighted according to how the observation is deviated, where the deviation of the observation is measured based on its own likelihood. An efficient computational algorithm based on the coordinate gradient descent method is developed to obtain the minimizer of the negative penalized robustified-likelihood, where nonzero elements of the concentration matrix represents the graphical links among the genes. After the graphical structure is obtained, we re-estimate the positive definite concentration matrix using an iterative proportional fitting algorithm. Through simulations, we demonstrate that the proposed robust method performs much better than the graphical Lasso for the Gaussian graphical models in terms of both graph structure selection and estimation when outliers are present. We apply the robust estimation procedure to an analysis of yeast gene expression data and show that the resulting graph has better biological interpretation than that obtained from the graphical Lasso. PMID:23020775
Augmented l1 and Nuclear-Norm Models with a Globally Linearly Convergent Algorithm. Revision 1

DTIC Science & Technology

2012-10-17

nonzero and sampled from the standard Gaussian distribution (for Figure 2) or the Bernoulli distribution (for Figure 3). Both tests had the same sensing...dual variable y(k) Figure 3: Convergence of primal and dual variables of three algorithms on Bernoulli sparse x0 was the slowest. Besides the obvious...slower convergence than the final stage. Comparing the results of two tests, the convergence was faster on the Bernoulli sparse signal than the
Discovering mutated driver genes through a robust and sparse co-regularized matrix factorization framework with prior information from mRNA expression patterns and interaction network.

PubMed

Xi, Jianing; Wang, Minghui; Li, Ao

2018-06-05

Discovery of mutated driver genes is one of the primary objective for studying tumorigenesis. To discover some relatively low frequently mutated driver genes from somatic mutation data, many existing methods incorporate interaction network as prior information. However, the prior information of mRNA expression patterns are not exploited by these existing network-based methods, which is also proven to be highly informative of cancer progressions. To incorporate prior information from both interaction network and mRNA expressions, we propose a robust and sparse co-regularized nonnegative matrix factorization to discover driver genes from mutation data. Furthermore, our framework also conducts Frobenius norm regularization to overcome overfitting issue. Sparsity-inducing penalty is employed to obtain sparse scores in gene representations, of which the top scored genes are selected as driver candidates. Evaluation experiments by known benchmarking genes indicate that the performance of our method benefits from the two type of prior information. Our method also outperforms the existing network-based methods, and detect some driver genes that are not predicted by the competing methods. In summary, our proposed method can improve the performance of driver gene discovery by effectively incorporating prior information from interaction network and mRNA expression patterns into a robust and sparse co-regularized matrix factorization framework.
Dynamic Textures Modeling via Joint Video Dictionary Learning.

PubMed

Wei, Xian; Li, Yuanxiang; Shen, Hao; Chen, Fang; Kleinsteuber, Martin; Wang, Zhongfeng

2017-04-06

Video representation is an important and challenging task in the computer vision community. In this paper, we consider the problem of modeling and classifying video sequences of dynamic scenes which could be modeled in a dynamic textures (DT) framework. At first, we assume that image frames of a moving scene can be modeled as a Markov random process. We propose a sparse coding framework, named joint video dictionary learning (JVDL), to model a video adaptively. By treating the sparse coefficients of image frames over a learned dictionary as the underlying "states", we learn an efficient and robust linear transition matrix between two adjacent frames of sparse events in time series. Hence, a dynamic scene sequence is represented by an appropriate transition matrix associated with a dictionary. In order to ensure the stability of JVDL, we impose several constraints on such transition matrix and dictionary. The developed framework is able to capture the dynamics of a moving scene by exploring both sparse properties and the temporal correlations of consecutive video frames. Moreover, such learned JVDL parameters can be used for various DT applications, such as DT synthesis and recognition. Experimental results demonstrate the strong competitiveness of the proposed JVDL approach in comparison with state-of-the-art video representation methods. Especially, it performs significantly better in dealing with DT synthesis and recognition on heavily corrupted data.
Computing the full spectrum of large sparse palindromic quadratic eigenvalue problems arising from surface Green's function calculations

NASA Astrophysics Data System (ADS)

Huang, Tsung-Ming; Lin, Wen-Wei; Tian, Heng; Chen, Guan-Hua

2018-03-01

Full spectrum of a large sparse ⊤-palindromic quadratic eigenvalue problem (⊤-PQEP) is considered arguably for the first time in this article. Such a problem is posed by calculation of surface Green's functions (SGFs) of mesoscopic transistors with a tremendous non-periodic cross-section. For this problem, general purpose eigensolvers are not efficient, nor is advisable to resort to the decimation method etc. to obtain the Wiener-Hopf factorization. After reviewing some rigorous understanding of SGF calculation from the perspective of ⊤-PQEP and nonlinear matrix equation, we present our new approach to this problem. In a nutshell, the unit disk where the spectrum of interest lies is broken down adaptively into pieces small enough that they each can be locally tackled by the generalized ⊤-skew-Hamiltonian implicitly restarted shift-and-invert Arnoldi (G⊤SHIRA) algorithm with suitable shifts and other parameters, and the eigenvalues missed by this divide-and-conquer strategy can be recovered thanks to the accurate estimation provided by our newly developed scheme. Notably the novel non-equivalence deflation is proposed to avoid as much as possible duplication of nearby known eigenvalues when a new shift of G⊤SHIRA is determined. We demonstrate our new approach by calculating the SGF of a realistic nanowire whose unit cell is described by a matrix of size 4000 × 4000 at the density functional tight binding level, corresponding to a 8 × 8nm2 cross-section. We believe that quantum transport simulation of realistic nano-devices in the mesoscopic regime will greatly benefit from this work.
A Sparse Matrix Approach for Simultaneous Quantification of Nystagmus and Saccade

NASA Technical Reports Server (NTRS)

Kukreja, Sunil L.; Stone, Lee; Boyle, Richard D.

2012-01-01

The vestibulo-ocular reflex (VOR) consists of two intermingled non-linear subsystems; namely, nystagmus and saccade. Typically, nystagmus is analysed using a single sufficiently long signal or a concatenation of them. Saccade information is not analysed and discarded due to insufficient data length to provide consistent and minimum variance estimates. This paper presents a novel sparse matrix approach to system identification of the VOR. It allows for the simultaneous estimation of both nystagmus and saccade signals. We show via simulation of the VOR that our technique provides consistent and unbiased estimates in the presence of output additive noise.
An efficient implementation of a high-order filter for a cubed-sphere spectral element model

NASA Astrophysics Data System (ADS)

Kang, Hyun-Gyu; Cheong, Hyeong-Bin

2017-03-01

A parallel-scalable, isotropic, scale-selective spatial filter was developed for the cubed-sphere spectral element model on the sphere. The filter equation is a high-order elliptic (Helmholtz) equation based on the spherical Laplacian operator, which is transformed into cubed-sphere local coordinates. The Laplacian operator is discretized on the computational domain, i.e., on each cell, by the spectral element method with Gauss-Lobatto Lagrange interpolating polynomials (GLLIPs) as the orthogonal basis functions. On the global domain, the discrete filter equation yielded a linear system represented by a highly sparse matrix. The density of this matrix increases quadratically (linearly) with the order of GLLIP (order of the filter), and the linear system is solved in only O (Ng) operations, where Ng is the total number of grid points. The solution, obtained by a row reduction method, demonstrated the typical accuracy and convergence rate of the cubed-sphere spectral element method. To achieve computational efficiency on parallel computers, the linear system was treated by an inverse matrix method (a sparse matrix-vector multiplication). The density of the inverse matrix was lowered to only a few times of the original sparse matrix without degrading the accuracy of the solution. For better computational efficiency, a local-domain high-order filter was introduced: The filter equation is applied to multiple cells, and then the central cell was only used to reconstruct the filtered field. The parallel efficiency of applying the inverse matrix method to the global- and local-domain filter was evaluated by the scalability on a distributed-memory parallel computer. The scale-selective performance of the filter was demonstrated on Earth topography. The usefulness of the filter as a hyper-viscosity for the vorticity equation was also demonstrated.
Smoothed low rank and sparse matrix recovery by iteratively reweighted least squares minimization.

PubMed

Lu, Canyi; Lin, Zhouchen; Yan, Shuicheng

2015-02-01

This paper presents a general framework for solving the low-rank and/or sparse matrix minimization problems, which may involve multiple nonsmooth terms. The iteratively reweighted least squares (IRLSs) method is a fast solver, which smooths the objective function and minimizes it by alternately updating the variables and their weights. However, the traditional IRLS can only solve a sparse only or low rank only minimization problem with squared loss or an affine constraint. This paper generalizes IRLS to solve joint/mixed low-rank and sparse minimization problems, which are essential formulations for many tasks. As a concrete example, we solve the Schatten-p norm and l2,q-norm regularized low-rank representation problem by IRLS, and theoretically prove that the derived solution is a stationary point (globally optimal if p,q ≥ 1). Our convergence proof of IRLS is more general than previous one that depends on the special properties of the Schatten-p norm and l2,q-norm. Extensive experiments on both synthetic and real data sets demonstrate that our IRLS is much more efficient.
A three-dimensional wide-angle BPM for optical waveguide structures.

PubMed

Ma, Changbao; Van Keuren, Edward

2007-01-22

Algorithms for effective modeling of optical propagation in three- dimensional waveguide structures are critical for the design of photonic devices. We present a three-dimensional (3-D) wide-angle beam propagation method (WA-BPM) using Hoekstra's scheme. A sparse matrix algebraic equation is formed and solved using iterative methods. The applicability, accuracy and effectiveness of our method are demonstrated by applying it to simulations of wide-angle beam propagation, along with a technique for shifting the simulation window to reduce the dimension of the numerical equation and a threshold technique to further ensure its convergence. These techniques can ensure the implementation of iterative methods for waveguide structures by relaxing the convergence problem, which will further enable us to develop higher-order 3-D WA-BPMs based on Padé approximant operators.
Multi-focus image fusion and robust encryption algorithm based on compressive sensing

NASA Astrophysics Data System (ADS)

Xiao, Di; Wang, Lan; Xiang, Tao; Wang, Yong

2017-06-01

Multi-focus image fusion schemes have been studied in recent years. However, little work has been done in multi-focus image transmission security. This paper proposes a scheme that can reduce data transmission volume and resist various attacks. First, multi-focus image fusion based on wavelet decomposition can generate complete scene images and optimize the perception of the human eye. The fused images are sparsely represented with DCT and sampled with structurally random matrix (SRM), which reduces the data volume and realizes the initial encryption. Then the obtained measurements are further encrypted to resist noise and crop attack through combining permutation and diffusion stages. At the receiver, the cipher images can be jointly decrypted and reconstructed. Simulation results demonstrate the security and robustness of the proposed scheme.

Underdetermined blind separation of three-way fluorescence spectra of PAHs in water.

PubMed

Yang, Ruifang; Zhao, Nanjing; Xiao, Xue; Zhu, Wei; Chen, Yunan; Yin, Gaofang; Liu, Jianguo; Liu, Wenqing

2018-06-15

In this work, underdetermined blind decomposition method is developed to recognize individual components from the three-way fluorescent spectra of their mixtures by using sparse component analysis (SCA). The mixing matrix is estimated from the mixtures using fuzzy data clustering algorithm together with the scatters corresponding to local energy maximum value in the time-frequency domain, and the spectra of object components are recovered by pseudo inverse technique. As an example, using this method three and four pure components spectra can be blindly extracted from two samples of their mixture, with similarities between resolved and reference spectra all above 0.80. This work opens a new and effective path to realize monitoring PAHs in water by three-way fluorescence spectroscopy technique. Copyright © 2018 Elsevier B.V. All rights reserved.
A three-dimensional wide-angle BPM for optical waveguide structures

NASA Astrophysics Data System (ADS)

Ma, Changbao; van Keuren, Edward

2007-01-01

Algorithms for effective modeling of optical propagation in three- dimensional waveguide structures are critical for the design of photonic devices. We present a three-dimensional (3-D) wide-angle beam propagation method (WA-BPM) using Hoekstra’s scheme. A sparse matrix algebraic equation is formed and solved using iterative methods. The applicability, accuracy and effectiveness of our method are demonstrated by applying it to simulations of wide-angle beam propagation, along with a technique for shifting the simulation window to reduce the dimension of the numerical equation and a threshold technique to further ensure its convergence. These techniques can ensure the implementation of iterative methods for waveguide structures by relaxing the convergence problem, which will further enable us to develop higher-order 3-D WA-BPMs based on Padé approximant operators.
A range-based predictive localization algorithm for WSID networks

NASA Astrophysics Data System (ADS)

Liu, Yuan; Chen, Junjie; Li, Gang

2017-11-01

Most studies on localization algorithms are conducted on the sensor networks with densely distributed nodes. However, the non-localizable problems are prone to occur in the network with sparsely distributed sensor nodes. To solve this problem, a range-based predictive localization algorithm (RPLA) is proposed in this paper for the wireless sensor networks syncretizing the RFID (WSID) networks. The Gaussian mixture model is established to predict the trajectory of a mobile target. Then, the received signal strength indication is used to reduce the residence area of the target location based on the approximate point-in-triangulation test algorithm. In addition, collaborative localization schemes are introduced to locate the target in the non-localizable situations. Simulation results verify that the RPLA achieves accurate localization for the network with sparsely distributed sensor nodes. The localization accuracy of the RPLA is 48.7% higher than that of the APIT algorithm, 16.8% higher than that of the single Gaussian model-based algorithm and 10.5% higher than that of the Kalman filtering-based algorithm.
A Distributed Learning Method for ℓ1-Regularized Kernel Machine over Wireless Sensor Networks

PubMed Central

Ji, Xinrong; Hou, Cuiqin; Hou, Yibin; Gao, Fang; Wang, Shulong

2016-01-01

In wireless sensor networks, centralized learning methods have very high communication costs and energy consumption. These are caused by the need to transmit scattered training examples from various sensor nodes to the central fusion center where a classifier or a regression machine is trained. To reduce the communication cost, a distributed learning method for a kernel machine that incorporates ℓ1 norm regularization (ℓ1-regularized) is investigated, and a novel distributed learning algorithm for the ℓ1-regularized kernel minimum mean squared error (KMSE) machine is proposed. The proposed algorithm relies on in-network processing and a collaboration that transmits the sparse model only between single-hop neighboring nodes. This paper evaluates the proposed algorithm with respect to the prediction accuracy, the sparse rate of model, the communication cost and the number of iterations on synthetic and real datasets. The simulation results show that the proposed algorithm can obtain approximately the same prediction accuracy as that obtained by the batch learning method. Moreover, it is significantly superior in terms of the sparse rate of model and communication cost, and it can converge with fewer iterations. Finally, an experiment conducted on a wireless sensor network (WSN) test platform further shows the advantages of the proposed algorithm with respect to communication cost. PMID:27376298
Sparse signal representation and its applications in ultrasonic NDE.

PubMed

Zhang, Guang-Ming; Zhang, Cheng-Zhong; Harvey, David M

2012-03-01

Many sparse signal representation (SSR) algorithms have been developed in the past decade. The advantages of SSR such as compact representations and super resolution lead to the state of the art performance of SSR for processing ultrasonic non-destructive evaluation (NDE) signals. Choosing a suitable SSR algorithm and designing an appropriate overcomplete dictionary is a key for success. After a brief review of sparse signal representation methods and the design of overcomplete dictionaries, this paper addresses the recent accomplishments of SSR for processing ultrasonic NDE signals. The advantages and limitations of SSR algorithms and various overcomplete dictionaries widely-used in ultrasonic NDE applications are explored in depth. Their performance improvement compared to conventional signal processing methods in many applications such as ultrasonic flaw detection and noise suppression, echo separation and echo estimation, and ultrasonic imaging is investigated. The challenging issues met in practical ultrasonic NDE applications for example the design of a good dictionary are discussed. Representative experimental results are presented for demonstration. Copyright © 2011 Elsevier B.V. All rights reserved.
Temporal flicker reduction and denoising in video using sparse directional transforms

NASA Astrophysics Data System (ADS)

Kanumuri, Sandeep; Guleryuz, Onur G.; Civanlar, M. Reha; Fujibayashi, Akira; Boon, Choong S.

2008-08-01

The bulk of the video content available today over the Internet and over mobile networks suffers from many imperfections caused during acquisition and transmission. In the case of user-generated content, which is typically produced with inexpensive equipment, these imperfections manifest in various ways through noise, temporal flicker and blurring, just to name a few. Imperfections caused by compression noise and temporal flicker are present in both studio-produced and user-generated video content transmitted at low bit-rates. In this paper, we introduce an algorithm designed to reduce temporal flicker and noise in video sequences. The algorithm takes advantage of the sparse nature of video signals in an appropriate transform domain that is chosen adaptively based on local signal statistics. When the signal corresponds to a sparse representation in this transform domain, flicker and noise, which are spread over the entire domain, can be reduced easily by enforcing sparsity. Our results show that the proposed algorithm reduces flicker and noise significantly and enables better presentation of compressed videos.
A Fast and Accurate Sparse Continuous Signal Reconstruction by Homotopy DCD with Non-Convex Regularization

PubMed Central

Wang, Tianyun; Lu, Xinfei; Yu, Xiaofei; Xi, Zhendong; Chen, Weidong

2014-01-01

In recent years, various applications regarding sparse continuous signal recovery such as source localization, radar imaging, communication channel estimation, etc., have been addressed from the perspective of compressive sensing (CS) theory. However, there are two major defects that need to be tackled when considering any practical utilization. The first issue is off-grid problem caused by the basis mismatch between arbitrary located unknowns and the pre-specified dictionary, which would make conventional CS reconstruction methods degrade considerably. The second important issue is the urgent demand for low-complexity algorithms, especially when faced with the requirement of real-time implementation. In this paper, to deal with these two problems, we have presented three fast and accurate sparse reconstruction algorithms, termed as HR-DCD, Hlog-DCD and Hlp-DCD, which are based on homotopy, dichotomous coordinate descent (DCD) iterations and non-convex regularizations, by combining with the grid refinement technique. Experimental results are provided to demonstrate the effectiveness of the proposed algorithms and related analysis. PMID:24675758
Logarithmic Laplacian Prior Based Bayesian Inverse Synthetic Aperture Radar Imaging.

PubMed

Zhang, Shuanghui; Liu, Yongxiang; Li, Xiang; Bi, Guoan

2016-04-28

This paper presents a novel Inverse Synthetic Aperture Radar Imaging (ISAR) algorithm based on a new sparse prior, known as the logarithmic Laplacian prior. The newly proposed logarithmic Laplacian prior has a narrower main lobe with higher tail values than the Laplacian prior, which helps to achieve performance improvement on sparse representation. The logarithmic Laplacian prior is used for ISAR imaging within the Bayesian framework to achieve better focused radar image. In the proposed method of ISAR imaging, the phase errors are jointly estimated based on the minimum entropy criterion to accomplish autofocusing. The maximum a posterior (MAP) estimation and the maximum likelihood estimation (MLE) are utilized to estimate the model parameters to avoid manually tuning process. Additionally, the fast Fourier Transform (FFT) and Hadamard product are used to minimize the required computational efficiency. Experimental results based on both simulated and measured data validate that the proposed algorithm outperforms the traditional sparse ISAR imaging algorithms in terms of resolution improvement and noise suppression.
BinTree Seeking: A Novel Approach to Mine Both Bi-Sparse and Cohesive Modules in Protein Interaction Networks

PubMed Central

Shen, Hong-Bin

2011-01-01

Modern science of networks has brought significant advances to our understanding of complex systems biology. As a representative model of systems biology, Protein Interaction Networks (PINs) are characterized by a remarkable modular structures, reflecting functional associations between their components. Many methods were proposed to capture cohesive modules so that there is a higher density of edges within modules than those across them. Recent studies reveal that cohesively interacting modules of proteins is not a universal organizing principle in PINs, which has opened up new avenues for revisiting functional modules in PINs. In this paper, functional clusters in PINs are found to be able to form unorthodox structures defined as bi-sparse module. In contrast to the traditional cohesive module, the nodes in the bi-sparse module are sparsely connected internally and densely connected with other bi-sparse or cohesive modules. We present a novel protocol called the BinTree Seeking (BTS) for mining both bi-sparse and cohesive modules in PINs based on Edge Density of Module (EDM) and matrix theory. BTS detects modules by depicting links and nodes rather than nodes alone and its derivation procedure is totally performed on adjacency matrix of networks. The number of modules in a PIN can be automatically determined in the proposed BTS approach. BTS is tested on three real PINs and the results demonstrate that functional modules in PINs are not dominantly cohesive but can be sparse. BTS software and the supporting information are available at: www.csbio.sjtu.edu.cn/bioinf/BTS/. PMID:22140454
Combining DCQGMP-Based Sparse Decomposition and MPDR Beamformer for Multi-Type Interferences Mitigation for GNSS Receivers.

PubMed

Guo, Qiang; Qi, Liangang

2017-04-10

In the coexistence of multiple types of interfering signals, the performance of interference suppression methods based on time and frequency domains is degraded seriously, and the technique using an antenna array requires a large enough size and huge hardware costs. To combat multi-type interferences better for GNSS receivers, this paper proposes a cascaded multi-type interferences mitigation method combining improved double chain quantum genetic matching pursuit (DCQGMP)-based sparse decomposition and an MPDR beamformer. The key idea behind the proposed method is that the multiple types of interfering signals can be excised by taking advantage of their sparse features in different domains. In the first stage, the single-tone (multi-tone) and linear chirp interfering signals are canceled by sparse decomposition according to their sparsity in the over-complete dictionary. In order to improve the timeliness of matching pursuit (MP)-based sparse decomposition, a DCQGMP is introduced by combining an improved double chain quantum genetic algorithm (DCQGA) and the MP algorithm, and the DCQGMP algorithm is extended to handle the multi-channel signals according to the correlation among the signals in different channels. In the second stage, the minimum power distortionless response (MPDR) beamformer is utilized to nullify the residuary interferences (e.g., wideband Gaussian noise interferences). Several simulation results show that the proposed method can not only improve the interference mitigation degree of freedom (DoF) of the array antenna, but also effectively deal with the interference arriving from the same direction with the GNSS signal, which can be sparse represented in the over-complete dictionary. Moreover, it does not bring serious distortions into the navigation signal.
Combining DCQGMP-Based Sparse Decomposition and MPDR Beamformer for Multi-Type Interferences Mitigation for GNSS Receivers

PubMed Central

Guo, Qiang; Qi, Liangang

2017-01-01

In the coexistence of multiple types of interfering signals, the performance of interference suppression methods based on time and frequency domains is degraded seriously, and the technique using an antenna array requires a large enough size and huge hardware costs. To combat multi-type interferences better for GNSS receivers, this paper proposes a cascaded multi-type interferences mitigation method combining improved double chain quantum genetic matching pursuit (DCQGMP)-based sparse decomposition and an MPDR beamformer. The key idea behind the proposed method is that the multiple types of interfering signals can be excised by taking advantage of their sparse features in different domains. In the first stage, the single-tone (multi-tone) and linear chirp interfering signals are canceled by sparse decomposition according to their sparsity in the over-complete dictionary. In order to improve the timeliness of matching pursuit (MP)-based sparse decomposition, a DCQGMP is introduced by combining an improved double chain quantum genetic algorithm (DCQGA) and the MP algorithm, and the DCQGMP algorithm is extended to handle the multi-channel signals according to the correlation among the signals in different channels. In the second stage, the minimum power distortionless response (MPDR) beamformer is utilized to nullify the residuary interferences (e.g., wideband Gaussian noise interferences). Several simulation results show that the proposed method can not only improve the interference mitigation degree of freedom (DoF) of the array antenna, but also effectively deal with the interference arriving from the same direction with the GNSS signal, which can be sparse represented in the over-complete dictionary. Moreover, it does not bring serious distortions into the navigation signal. PMID:28394290
Preserving sparseness in multivariate polynominal factorization

NASA Technical Reports Server (NTRS)

Wang, P. S.

1977-01-01

Attempts were made to factor these ten polynomials on MACSYMA. However it did not get very far with any of the larger polynomials. At that time, MACSYMA used an algorithm created by Wang and Rothschild. This factoring algorithm was also implemented for the symbolic manipulation system, SCRATCHPAD of IBM. A closer look at this old factoring algorithm revealed three problem areas, each of which contribute to losing sparseness and intermediate expression growth. This study led to effective ways of avoiding these problems and actually to a new factoring algorithm. The three problems are known as the extraneous factor problem, the leading coefficient problem, and the bad zero problem. These problems are examined separately. Their causes and effects are set forth in detail; the ways to avoid or lessen these problems are described.
Object detection approach using generative sparse, hierarchical networks with top-down and lateral connections for combining texture/color detection and shape/contour detection

DOE Office of Scientific and Technical Information (OSTI.GOV)

Paiton, Dylan M.; Kenyon, Garrett T.; Brumby, Steven P.

An approach to detecting objects in an image dataset may combine texture/color detection, shape/contour detection, and/or motion detection using sparse, generative, hierarchical models with lateral and top-down connections. A first independent representation of objects in an image dataset may be produced using a color/texture detection algorithm. A second independent representation of objects in the image dataset may be produced using a shape/contour detection algorithm. A third independent representation of objects in the image dataset may be produced using a motion detection algorithm. The first, second, and third independent representations may then be combined into a single coherent output using amore » combinatorial algorithm.« less
Removing flicker based on sparse color correspondences in old film restoration

NASA Astrophysics Data System (ADS)

Huang, Xi; Ding, Youdong; Yu, Bing; Xia, Tianran

2018-04-01

In the long history of human civilization, archived film is an indispensable part of it, and using digital method to repair damaged film is also a mainstream trend nowadays. In this paper, we propose a sparse color correspondences based technique to remove fading flicker for old films. Our model, combined with multi frame images to establish a simple correction model, includes three key steps. Firstly, we recover sparse color correspondences in the input frames to build a matrix with many missing entries. Secondly, we present a low-rank matrix factorization approach to estimate the unknown parameters of this model. Finally, we adopt a two-step strategy that divide the estimated parameters into reference frame parameters for color recovery correction and other frame parameters for color consistency correction to remove flicker. Our method combined multi-frames takes continuity of the input sequence into account, and the experimental results show the method can remove fading flicker efficiently.
Efficient Implementation of an Optimal Interpolator for Large Spatial Data Sets

NASA Technical Reports Server (NTRS)

Memarsadeghi, Nargess; Mount, David M.

2007-01-01

Scattered data interpolation is a problem of interest in numerous areas such as electronic imaging, smooth surface modeling, and computational geometry. Our motivation arises from applications in geology and mining, which often involve large scattered data sets and a demand for high accuracy. The method of choice is ordinary kriging. This is because it is a best unbiased estimator. Unfortunately, this interpolant is computationally very expensive to compute exactly. For n scattered data points, computing the value of a single interpolant involves solving a dense linear system of size roughly n x n. This is infeasible for large n. In practice, kriging is solved approximately by local approaches that are based on considering only a relatively small'number of points that lie close to the query point. There are many problems with this local approach, however. The first is that determining the proper neighborhood size is tricky, and is usually solved by ad hoc methods such as selecting a fixed number of nearest neighbors or all the points lying within a fixed radius. Such fixed neighborhood sizes may not work well for all query points, depending on local density of the point distribution. Local methods also suffer from the problem that the resulting interpolant is not continuous. Meyer showed that while kriging produces smooth continues surfaces, it has zero order continuity along its borders. Thus, at interface boundaries where the neighborhood changes, the interpolant behaves discontinuously. Therefore, it is important to consider and solve the global system for each interpolant. However, solving such large dense systems for each query point is impractical. Recently a more principled approach to approximating kriging has been proposed based on a technique called covariance tapering. The problems arise from the fact that the covariance functions that are used in kriging have global support. Our implementations combine, utilize, and enhance a number of different approaches that have been introduced in literature for solving large linear systems for interpolation of scattered data points. For very large systems, exact methods such as Gaussian elimination are impractical since they require 0(n(exp 3)) time and 0(n(exp 2)) storage. As Billings et al. suggested, we use an iterative approach. In particular, we use the SYMMLQ method, for solving the large but sparse ordinary kriging systems that result from tapering. The main technical issue that need to be overcome in our algorithmic solution is that the points' covariance matrix for kriging should be symmetric positive definite. The goal of tapering is to obtain a sparse approximate representation of the covariance matrix while maintaining its positive definiteness. Furrer et al. used tapering to obtain a sparse linear system of the form Ax = b, where A is the tapered symmetric positive definite covariance matrix. Thus, Cholesky factorization could be used to solve their linear systems. They implemented an efficient sparse Cholesky decomposition method. They also showed if these tapers are used for a limited class of covariance models, the solution of the system converges to the solution of the original system. Matrix A in the ordinary kriging system, while symmetric, is not positive definite. Thus, their approach is not applicable to the ordinary kriging system. Therefore, we use tapering only to obtain a sparse linear system. Then, we use SYMMLQ to solve the ordinary kriging system. We show that solving large kriging systems becomes practical via tapering and iterative methods, and results in lower estimation errors compared to traditional local approaches, and significant memory savings compared to the original global system. We also developed a more efficient variant of the sparse SYMMLQ method for large ordinary kriging systems. This approach adaptively finds the correct local neighborhood for each query point in the interpolation process.
LSRN: A PARALLEL ITERATIVE SOLVER FOR STRONGLY OVER- OR UNDERDETERMINED SYSTEMS*

PubMed Central

Meng, Xiangrui; Saunders, Michael A.; Mahoney, Michael W.

2014-01-01

We describe a parallel iterative least squares solver named LSRN that is based on random normal projection. LSRN computes the min-length solution to minx∈ℝn ‖Ax − b‖2, where A ∈ ℝm × n with m ≫ n or m ≪ n, and where A may be rank-deficient. Tikhonov regularization may also be included. Since A is involved only in matrix-matrix and matrix-vector multiplications, it can be a dense or sparse matrix or a linear operator, and LSRN automatically speeds up when A is sparse or a fast linear operator. The preconditioning phase consists of a random normal projection, which is embarrassingly parallel, and a singular value decomposition of size ⌈γ min(m, n)⌉ × min(m, n), where γ is moderately larger than 1, e.g., γ = 2. We prove that the preconditioned system is well-conditioned, with a strong concentration result on the extreme singular values, and hence that the number of iterations is fully predictable when we apply LSQR or the Chebyshev semi-iterative method. As we demonstrate, the Chebyshev method is particularly efficient for solving large problems on clusters with high communication cost. Numerical results show that on a shared-memory machine, LSRN is very competitive with LAPACK’s DGELSD and a fast randomized least squares solver called Blendenpik on large dense problems, and it outperforms the least squares solver from SuiteSparseQR on sparse problems without sparsity patterns that can be exploited to reduce fill-in. Further experiments show that LSRN scales well on an Amazon Elastic Compute Cloud cluster. PMID:25419094
Thin-film sparse boundary array design for passive acoustic mapping during ultrasound therapy.

PubMed

Coviello, Christian M; Kozick, Richard J; Hurrell, Andrew; Smith, Penny Probert; Coussios, Constantin-C

2012-10-01

A new 2-D hydrophone array for ultrasound therapy monitoring is presented, along with a novel algorithm for passive acoustic mapping using a sparse weighted aperture. The array is constructed using existing polyvinylidene fluoride (PVDF) ultrasound sensor technology, and is utilized for its broadband characteristics and its high receive sensitivity. For most 2-D arrays, high-resolution imagery is desired, which requires a large aperture at the cost of a large number of elements. The proposed array's geometry is sparse, with elements only on the boundary of the rectangular aperture. The missing information from the interior is filled in using linear imaging techniques. After receiving acoustic emissions during ultrasound therapy, this algorithm applies an apodization to the sparse aperture to limit side lobes and then reconstructs acoustic activity with high spatiotemporal resolution. Experiments show verification of the theoretical point spread function, and cavitation maps in agar phantoms correspond closely to predicted areas, showing the validity of the array and methodology.
Non-Convex Sparse and Low-Rank Based Robust Subspace Segmentation for Data Mining.

PubMed

Cheng, Wenlong; Zhao, Mingbo; Xiong, Naixue; Chui, Kwok Tai

2017-07-15

Parsimony, including sparsity and low-rank, has shown great importance for data mining in social networks, particularly in tasks such as segmentation and recognition. Traditionally, such modeling approaches rely on an iterative algorithm that minimizes an objective function with convex l ₁-norm or nuclear norm constraints. However, the obtained results by convex optimization are usually suboptimal to solutions of original sparse or low-rank problems. In this paper, a novel robust subspace segmentation algorithm has been proposed by integrating l p -norm and Schatten p -norm constraints. Our so-obtained affinity graph can better capture local geometrical structure and the global information of the data. As a consequence, our algorithm is more generative, discriminative and robust. An efficient linearized alternating direction method is derived to realize our model. Extensive segmentation experiments are conducted on public datasets. The proposed algorithm is revealed to be more effective and robust compared to five existing algorithms.
Sparse signals recovered by non-convex penalty in quasi-linear systems.

PubMed

Cui, Angang; Li, Haiyang; Wen, Meng; Peng, Jigen

2018-01-01

The goal of compressed sensing is to reconstruct a sparse signal under a few linear measurements far less than the dimension of the ambient space of the signal. However, many real-life applications in physics and biomedical sciences carry some strongly nonlinear structures, and the linear model is no longer suitable. Compared with the compressed sensing under the linear circumstance, this nonlinear compressed sensing is much more difficult, in fact also NP-hard, combinatorial problem, because of the discrete and discontinuous nature of the [Formula: see text]-norm and the nonlinearity. In order to get a convenience for sparse signal recovery, we set the nonlinear models have a smooth quasi-linear nature in this paper, and study a non-convex fraction function [Formula: see text] in this quasi-linear compressed sensing. We propose an iterative fraction thresholding algorithm to solve the regularization problem [Formula: see text] for all [Formula: see text]. With the change of parameter [Formula: see text], our algorithm could get a promising result, which is one of the advantages for our algorithm compared with some state-of-art algorithms. Numerical experiments show that our method performs much better than some state-of-the-art methods.
Anomaly Detection in Moving-Camera Video Sequences Using Principal Subspace Analysis

DOE PAGES

Thomaz, Lucas A.; Jardim, Eric; da Silva, Allan F.; ...

2017-10-16

This study presents a family of algorithms based on sparse decompositions that detect anomalies in video sequences obtained from slow moving cameras. These algorithms start by computing the union of subspaces that best represents all the frames from a reference (anomaly free) video as a low-rank projection plus a sparse residue. Then, they perform a low-rank representation of a target (possibly anomalous) video by taking advantage of both the union of subspaces and the sparse residue computed from the reference video. Such algorithms provide good detection results while at the same time obviating the need for previous video synchronization. However,more » this is obtained at the cost of a large computational complexity, which hinders their applicability. Another contribution of this paper approaches this problem by using intrinsic properties of the obtained data representation in order to restrict the search space to the most relevant subspaces, providing computational complexity gains of up to two orders of magnitude. The developed algorithms are shown to cope well with videos acquired in challenging scenarios, as verified by the analysis of 59 videos from the VDAO database that comprises videos with abandoned objects in a cluttered industrial scenario.« less

Anomaly Detection in Moving-Camera Video Sequences Using Principal Subspace Analysis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Thomaz, Lucas A.; Jardim, Eric; da Silva, Allan F.

This study presents a family of algorithms based on sparse decompositions that detect anomalies in video sequences obtained from slow moving cameras. These algorithms start by computing the union of subspaces that best represents all the frames from a reference (anomaly free) video as a low-rank projection plus a sparse residue. Then, they perform a low-rank representation of a target (possibly anomalous) video by taking advantage of both the union of subspaces and the sparse residue computed from the reference video. Such algorithms provide good detection results while at the same time obviating the need for previous video synchronization. However,more » this is obtained at the cost of a large computational complexity, which hinders their applicability. Another contribution of this paper approaches this problem by using intrinsic properties of the obtained data representation in order to restrict the search space to the most relevant subspaces, providing computational complexity gains of up to two orders of magnitude. The developed algorithms are shown to cope well with videos acquired in challenging scenarios, as verified by the analysis of 59 videos from the VDAO database that comprises videos with abandoned objects in a cluttered industrial scenario.« less
Multi-Source Multi-Target Dictionary Learning for Prediction of Cognitive Decline.

PubMed

Zhang, Jie; Li, Qingyang; Caselli, Richard J; Thompson, Paul M; Ye, Jieping; Wang, Yalin

2017-06-01

Alzheimer's Disease (AD) is the most common type of dementia. Identifying correct biomarkers may determine pre-symptomatic AD subjects and enable early intervention. Recently, Multi-task sparse feature learning has been successfully applied to many computer vision and biomedical informatics researches. It aims to improve the generalization performance by exploiting the shared features among different tasks. However, most of the existing algorithms are formulated as a supervised learning scheme. Its drawback is with either insufficient feature numbers or missing label information. To address these challenges, we formulate an unsupervised framework for multi-task sparse feature learning based on a novel dictionary learning algorithm. To solve the unsupervised learning problem, we propose a two-stage Multi-Source Multi-Target Dictionary Learning (MMDL) algorithm. In stage 1, we propose a multi-source dictionary learning method to utilize the common and individual sparse features in different time slots. In stage 2, supported by a rigorous theoretical analysis, we develop a multi-task learning method to solve the missing label problem. Empirical studies on an N = 3970 longitudinal brain image data set, which involves 2 sources and 5 targets, demonstrate the improved prediction accuracy and speed efficiency of MMDL in comparison with other state-of-the-art algorithms.
Image super-resolution via sparse representation.

PubMed

Yang, Jianchao; Wright, John; Huang, Thomas S; Ma, Yi

2010-11-01

This paper presents a new approach to single-image super-resolution, based on sparse signal representation. Research on image statistics suggests that image patches can be well-represented as a sparse linear combination of elements from an appropriately chosen over-complete dictionary. Inspired by this observation, we seek a sparse representation for each patch of the low-resolution input, and then use the coefficients of this representation to generate the high-resolution output. Theoretical results from compressed sensing suggest that under mild conditions, the sparse representation can be correctly recovered from the downsampled signals. By jointly training two dictionaries for the low- and high-resolution image patches, we can enforce the similarity of sparse representations between the low resolution and high resolution image patch pair with respect to their own dictionaries. Therefore, the sparse representation of a low resolution image patch can be applied with the high resolution image patch dictionary to generate a high resolution image patch. The learned dictionary pair is a more compact representation of the patch pairs, compared to previous approaches, which simply sample a large amount of image patch pairs, reducing the computational cost substantially. The effectiveness of such a sparsity prior is demonstrated for both general image super-resolution and the special case of face hallucination. In both cases, our algorithm generates high-resolution images that are competitive or even superior in quality to images produced by other similar SR methods. In addition, the local sparse modeling of our approach is naturally robust to noise, and therefore the proposed algorithm can handle super-resolution with noisy inputs in a more unified framework.
Sparse Bayesian Learning for Nonstationary Data Sources

NASA Astrophysics Data System (ADS)

Fujimaki, Ryohei; Yairi, Takehisa; Machida, Kazuo

This paper proposes an online Sparse Bayesian Learning (SBL) algorithm for modeling nonstationary data sources. Although most learning algorithms implicitly assume that a data source does not change over time (stationary), one in the real world usually does due to such various factors as dynamically changing environments, device degradation, sudden failures, etc (nonstationary). The proposed algorithm can be made useable for stationary online SBL by setting time decay parameters to zero, and as such it can be interpreted as a single unified framework for online SBL for use with stationary and nonstationary data sources. Tests both on four types of benchmark problems and on actual stock price data have shown it to perform well.
Application of composite dictionary multi-atom matching in gear fault diagnosis.

PubMed

Cui, Lingli; Kang, Chenhui; Wang, Huaqing; Chen, Peng

2011-01-01

The sparse decomposition based on matching pursuit is an adaptive sparse expression method for signals. This paper proposes an idea concerning a composite dictionary multi-atom matching decomposition and reconstruction algorithm, and the introduction of threshold de-noising in the reconstruction algorithm. Based on the structural characteristics of gear fault signals, a composite dictionary combining the impulse time-frequency dictionary and the Fourier dictionary was constituted, and a genetic algorithm was applied to search for the best matching atom. The analysis results of gear fault simulation signals indicated the effectiveness of the hard threshold, and the impulse or harmonic characteristic components could be separately extracted. Meanwhile, the robustness of the composite dictionary multi-atom matching algorithm at different noise levels was investigated. Aiming at the effects of data lengths on the calculation efficiency of the algorithm, an improved segmented decomposition and reconstruction algorithm was proposed, and the calculation efficiency of the decomposition algorithm was significantly enhanced. In addition it is shown that the multi-atom matching algorithm was superior to the single-atom matching algorithm in both calculation efficiency and algorithm robustness. Finally, the above algorithm was applied to gear fault engineering signals, and achieved good results.
Sparse modeling applied to patient identification for safety in medical physics applications

NASA Astrophysics Data System (ADS)

Lewkowitz, Stephanie

Every scheduled treatment at a radiation therapy clinic involves a series of safety protocol to ensure the utmost patient care. Despite safety protocol, on a rare occasion an entirely preventable medical event, an accident, may occur. Delivering a treatment plan to the wrong patient is preventable, yet still is a clinically documented error. This research describes a computational method to identify patients with a novel machine learning technique to combat misadministration. The patient identification program stores face and fingerprint data for each patient. New, unlabeled data from those patients are categorized according to the library. The categorization of data by this face-fingerprint detector is accomplished with new machine learning algorithms based on Sparse Modeling that have already begun transforming the foundation of Computer Vision. Previous patient recognition software required special subroutines for faces and different tailored subroutines for fingerprints. In this research, the same exact model is used for both fingerprints and faces, without any additional subroutines and even without adjusting the two hyperparameters. Sparse modeling is a powerful tool, already shown utility in the areas of super-resolution, denoising, inpainting, demosaicing, and sub-nyquist sampling, i.e. compressed sensing. Sparse Modeling is possible because natural images are inherently sparse in some bases, due to their inherent structure. This research chooses datasets of face and fingerprint images to test the patient identification model. The model stores the images of each dataset as a basis (library). One image at a time is removed from the library, and is classified by a sparse code in terms of the remaining library. The Locally Competitive Algorithm, a truly neural inspired Artificial Neural Network, solves the computationally difficult task of finding the sparse code for the test image. The components of the sparse representation vector are summed by ℓ1 pooling, and correct patient identification is consistently achieved 100% over 1000 trials, when either the face data or fingerprint data are implemented as a classification basis. The algorithm gets 100% classification when faces and fingerprints are concatenated into multimodal datasets. This suggests that 100% patient identification will be achievable in the clinal setting.
Target Transformation Constrained Sparse Unmixing (ttcsu) Algorithm for Retrieving Hydrous Minerals on Mars: Application to Southwest Melas Chasma

NASA Astrophysics Data System (ADS)

Lin, H.; Zhang, X.; Wu, X.; Tarnas, J. D.; Mustard, J. F.

2018-04-01

Quantitative analysis of hydrated minerals from hyperspectral remote sensing data is fundamental for understanding Martian geologic process. Because of the difficulties for selecting endmembers from hyperspectral images, a sparse unmixing algorithm has been proposed to be applied to CRISM data on Mars. However, it's challenge when the endmember library increases dramatically. Here, we proposed a new methodology termed Target Transformation Constrained Sparse Unmixing (TTCSU) to accurately detect hydrous minerals on Mars. A new version of target transformation technique proposed in our recent work was used to obtain the potential detections from CRISM data. Sparse unmixing constrained with these detections as prior information was applied to CRISM single-scattering albedo images, which were calculated using a Hapke radiative transfer model. This methodology increases success rate of the automatic endmember selection of sparse unmixing and could get more accurate abundances. CRISM images with well analyzed in Southwest Melas Chasma was used to validate our methodology in this study. The sulfates jarosite was detected from Southwest Melas Chasma, the distribution is consistent with previous work and the abundance is comparable. More validations will be done in our future work.
Microstructure Images Restoration of Metallic Materials Based upon KSVD and Smoothing Penalty Sparse Representation Approach.

PubMed

Li, Qing; Liang, Steven Y

2018-04-20

Microstructure images of metallic materials play a significant role in industrial applications. To address image degradation problem of metallic materials, a novel image restoration technique based on K-means singular value decomposition (KSVD) and smoothing penalty sparse representation (SPSR) algorithm is proposed in this work, the microstructure images of aluminum alloy 7075 (AA7075) material are used as examples. To begin with, to reflect the detail structure characteristics of the damaged image, the KSVD dictionary is introduced to substitute the traditional sparse transform basis (TSTB) for sparse representation. Then, due to the image restoration, modeling belongs to a highly underdetermined equation, and traditional sparse reconstruction methods may cause instability and obvious artifacts in the reconstructed images, especially reconstructed image with many smooth regions and the noise level is strong, thus the SPSR (here, q = 0.5) algorithm is designed to reconstruct the damaged image. The results of simulation and two practical cases demonstrate that the proposed method has superior performance compared with some state-of-the-art methods in terms of restoration performance factors and visual quality. Meanwhile, the grain size parameters and grain boundaries of microstructure image are discussed before and after they are restored by proposed method.
Discriminative object tracking via sparse representation and online dictionary learning.

PubMed

Xie, Yuan; Zhang, Wensheng; Li, Cuihua; Lin, Shuyang; Qu, Yanyun; Zhang, Yinghua

2014-04-01

We propose a robust tracking algorithm based on local sparse coding with discriminative dictionary learning and new keypoint matching schema. This algorithm consists of two parts: the local sparse coding with online updated discriminative dictionary for tracking (SOD part), and the keypoint matching refinement for enhancing the tracking performance (KP part). In the SOD part, the local image patches of the target object and background are represented by their sparse codes using an over-complete discriminative dictionary. Such discriminative dictionary, which encodes the information of both the foreground and the background, may provide more discriminative power. Furthermore, in order to adapt the dictionary to the variation of the foreground and background during the tracking, an online learning method is employed to update the dictionary. The KP part utilizes refined keypoint matching schema to improve the performance of the SOD. With the help of sparse representation and online updated discriminative dictionary, the KP part are more robust than the traditional method to reject the incorrect matches and eliminate the outliers. The proposed method is embedded into a Bayesian inference framework for visual tracking. Experimental results on several challenging video sequences demonstrate the effectiveness and robustness of our approach.
Object-Oriented Design for Sparse Direct Solvers

NASA Technical Reports Server (NTRS)

Dobrian, Florin; Kumfert, Gary; Pothen, Alex

1999-01-01

We discuss the object-oriented design of a software package for solving sparse, symmetric systems of equations (positive definite and indefinite) by direct methods. At the highest layers, we decouple data structure classes from algorithmic classes for flexibility. We describe the important structural and algorithmic classes in our design, and discuss the trade-offs we made for high performance. The kernels at the lower layers were optimized by hand. Our results show no performance loss from our object-oriented design, while providing flexibility, case of use, and extensibility over solvers using procedural design.
A denoising algorithm for CT image using low-rank sparse coding

NASA Astrophysics Data System (ADS)

Lei, Yang; Xu, Dong; Zhou, Zhengyang; Wang, Tonghe; Dong, Xue; Liu, Tian; Dhabaan, Anees; Curran, Walter J.; Yang, Xiaofeng

2018-03-01

We propose a denoising method of CT image based on low-rank sparse coding. The proposed method constructs an adaptive dictionary of image patches and estimates the sparse coding regularization parameters using the Bayesian interpretation. A low-rank approximation approach is used to simultaneously construct the dictionary and achieve sparse representation through clustering similar image patches. A variable-splitting scheme and a quadratic optimization are used to reconstruct CT image based on achieved sparse coefficients. We tested this denoising technology using phantom, brain and abdominal CT images. The experimental results showed that the proposed method delivers state-of-art denoising performance, both in terms of objective criteria and visual quality.
Array signal recovery algorithm for a single-RF-channel DBF array

NASA Astrophysics Data System (ADS)

Zhang, Duo; Wu, Wen; Fang, Da Gang

2016-12-01

An array signal recovery algorithm based on sparse signal reconstruction theory is proposed for a single-RF-channel digital beamforming (DBF) array. A single-RF-channel antenna array is a low-cost antenna array in which signals are obtained from all antenna elements by only one microwave digital receiver. The spatially parallel array signals are converted into time-sequence signals, which are then sampled by the system. The proposed algorithm uses these time-sequence samples to recover the original parallel array signals by exploiting the second-order sparse structure of the array signals. Additionally, an optimization method based on the artificial bee colony (ABC) algorithm is proposed to improve the reconstruction performance. Using the proposed algorithm, the motion compensation problem for the single-RF-channel DBF array can be solved effectively, and the angle and Doppler information for the target can be simultaneously estimated. The effectiveness of the proposed algorithms is demonstrated by the results of numerical simulations.
SU-G-TeP1-15: Toward a Novel GPU Accelerated Deterministic Solution to the Linear Boltzmann Transport Equation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yang, R; Fallone, B; Cross Cancer Institute, Edmonton, AB

Purpose: To develop a Graphic Processor Unit (GPU) accelerated deterministic solution to the Linear Boltzmann Transport Equation (LBTE) for accurate dose calculations in radiotherapy (RT). A deterministic solution yields the potential for major speed improvements due to the sparse matrix-vector and vector-vector multiplications and would thus be of benefit to RT. Methods: In order to leverage the massively parallel architecture of GPUs, the first order LBTE was reformulated as a second order self-adjoint equation using the Least Squares Finite Element Method (LSFEM). This produces a symmetric positive-definite matrix which is efficiently solved using a parallelized conjugate gradient (CG) solver. Themore » LSFEM formalism is applied in space, discrete ordinates is applied in angle, and the Multigroup method is applied in energy. The final linear system of equations produced is tightly coupled in space and angle. Our code written in CUDA-C was benchmarked on an Nvidia GeForce TITAN-X GPU against an Intel i7-6700K CPU. A spatial mesh of 30,950 tetrahedral elements was used with an S4 angular approximation. Results: To avoid repeating a full computationally intensive finite element matrix assembly at each Multigroup energy, a novel mapping algorithm was developed which minimized the operations required at each energy. Additionally, a parallelized memory mapping for the kronecker product between the sparse spatial and angular matrices, including Dirichlet boundary conditions, was created. Atomicity is preserved by graph-coloring overlapping nodes into separate kernel launches. The one-time mapping calculations for matrix assembly, kronecker product, and boundary condition application took 452±1ms on GPU. Matrix assembly for 16 energy groups took 556±3s on CPU, and 358±2ms on GPU using the mappings developed. The CG solver took 93±1s on CPU, and 468±2ms on GPU. Conclusion: Three computationally intensive subroutines in deterministically solving the LBTE have been formulated on GPU, resulting in two orders of magnitude speedup. Funding support from Natural Sciences and Engineering Research Council and Alberta Innovates Health Solutions. Dr. Fallone is a co-founder and CEO of MagnetTx Oncology Solutions (under discussions to license Alberta bi-planar linac MR for commercialization).« less
Hybrid Weighted Minimum Norm Method A new method based LORETA to solve EEG inverse problem.

PubMed

Song, C; Zhuang, T; Wu, Q

2005-01-01

This Paper brings forward a new method to solve EEG inverse problem. Based on following physiological characteristic of neural electrical activity source: first, the neighboring neurons are prone to active synchronously; second, the distribution of source space is sparse; third, the active intensity of the sources are high centralized, we take these prior knowledge as prerequisite condition to develop the inverse solution of EEG, and not assume other characteristic of inverse solution to realize the most commonly 3D EEG reconstruction map. The proposed algorithm takes advantage of LORETA's low resolution method which emphasizes particularly on 'localization' and FOCUSS's high resolution method which emphasizes particularly on 'separability'. The method is still under the frame of the weighted minimum norm method. The keystone is to construct a weighted matrix which takes reference from the existing smoothness operator, competition mechanism and study algorithm. The basic processing is to obtain an initial solution's estimation firstly, then construct a new estimation using the initial solution's information, repeat this process until the solutions under last two estimate processing is keeping unchanged.
An Improved Compressive Sensing and Received Signal Strength-Based Target Localization Algorithm with Unknown Target Population for Wireless Local Area Networks.

PubMed

Yan, Jun; Yu, Kegen; Chen, Ruizhi; Chen, Liang

2017-05-30

In this paper a two-phase compressive sensing (CS) and received signal strength (RSS)-based target localization approach is proposed to improve position accuracy by dealing with the unknown target population and the effect of grid dimensions on position error. In the coarse localization phase, by formulating target localization as a sparse signal recovery problem, grids with recovery vector components greater than a threshold are chosen as the candidate target grids. In the fine localization phase, by partitioning each candidate grid, the target position in a grid is iteratively refined by using the minimum residual error rule and the least-squares technique. When all the candidate target grids are iteratively partitioned and the measurement matrix is updated, the recovery vector is re-estimated. Threshold-based detection is employed again to determine the target grids and hence the target population. As a consequence, both the target population and the position estimation accuracy can be significantly improved. Simulation results demonstrate that the proposed approach achieves the best accuracy among all the algorithms compared.
Matrix Algebra for GPU and Multicore Architectures (MAGMA) for Large Petascale Systems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dongarra, Jack J.; Tomov, Stanimire

2014-03-24

The goal of the MAGMA project is to create a new generation of linear algebra libraries that achieve the fastest possible time to an accurate solution on hybrid Multicore+GPU-based systems, using all the processing power that future high-end systems can make available within given energy constraints. Our efforts at the University of Tennessee achieved the goals set in all of the five areas identified in the proposal: 1. Communication optimal algorithms; 2. Autotuning for GPU and hybrid processors; 3. Scheduling and memory management techniques for heterogeneity and scale; 4. Fault tolerance and robustness for large scale systems; 5. Building energymore » efficiency into software foundations. The University of Tennessee’s main contributions, as proposed, were the research and software development of new algorithms for hybrid multi/many-core CPUs and GPUs, as related to two-sided factorizations and complete eigenproblem solvers, hybrid BLAS, and energy efficiency for dense, as well as sparse, operations. Furthermore, as proposed, we investigated and experimented with various techniques targeting the five main areas outlined.« less
Overview of Sparse Graph for Multiple Access in Future Mobile Networks

NASA Astrophysics Data System (ADS)

Lei, Jing; Li, Baoguo; Li, Erbao; Gong, Zhenghui

2017-10-01

Multiple access via sparse graph, such as low density signature (LDS) and sparse code multiple access (SCMA), is a promising technique for future wireless communications. This survey presents an overview of the developments in this burgeoning field, including transmitter structures, extrinsic information transform (EXIT) chart analysis and comparisons with existing multiple access techniques. Such technique enables multiple access under overloaded conditions to achieve a satisfactory performance. Message passing algorithm is utilized for multi-user detection in the receiver, and structures of the sparse graph are illustrated in detail. Outlooks and challenges of this technique are also presented.
Sparse RNA folding revisited: space-efficient minimum free energy structure prediction.

PubMed

Will, Sebastian; Jabbari, Hosna

2016-01-01

RNA secondary structure prediction by energy minimization is the central computational tool for the analysis of structural non-coding RNAs and their interactions. Sparsification has been successfully applied to improve the time efficiency of various structure prediction algorithms while guaranteeing the same result; however, for many such folding problems, space efficiency is of even greater concern, particularly for long RNA sequences. So far, space-efficient sparsified RNA folding with fold reconstruction was solved only for simple base-pair-based pseudo-energy models. Here, we revisit the problem of space-efficient free energy minimization. Whereas the space-efficient minimization of the free energy has been sketched before, the reconstruction of the optimum structure has not even been discussed. We show that this reconstruction is not possible in trivial extension of the method for simple energy models. Then, we present the time- and space-efficient sparsified free energy minimization algorithm SparseMFEFold that guarantees MFE structure prediction. In particular, this novel algorithm provides efficient fold reconstruction based on dynamically garbage-collected trace arrows. The complexity of our algorithm depends on two parameters, the number of candidates Z and the number of trace arrows T; both are bounded by [Formula: see text], but are typically much smaller. The time complexity of RNA folding is reduced from [Formula: see text] to [Formula: see text]; the space complexity, from [Formula: see text] to [Formula: see text]. Our empirical results show more than 80 % space savings over RNAfold [Vienna RNA package] on the long RNAs from the RNA STRAND database (≥2500 bases). The presented technique is intentionally generalizable to complex prediction algorithms; due to their high space demands, algorithms like pseudoknot prediction and RNA-RNA-interaction prediction are expected to profit even stronger than "standard" MFE folding. SparseMFEFold is free software, available at http://www.bioinf.uni-leipzig.de/~will/Software/SparseMFEFold.
Improved genome-scale multi-target virtual screening via a novel collaborative filtering approach to cold-start problem

PubMed Central

Lim, Hansaim; Gray, Paul; Xie, Lei; Poleksic, Aleksandar

2016-01-01

Conventional one-drug-one-gene approach has been of limited success in modern drug discovery. Polypharmacology, which focuses on searching for multi-targeted drugs to perturb disease-causing networks instead of designing selective ligands to target individual proteins, has emerged as a new drug discovery paradigm. Although many methods for single-target virtual screening have been developed to improve the efficiency of drug discovery, few of these algorithms are designed for polypharmacology. Here, we present a novel theoretical framework and a corresponding algorithm for genome-scale multi-target virtual screening based on the one-class collaborative filtering technique. Our method overcomes the sparseness of the protein-chemical interaction data by means of interaction matrix weighting and dual regularization from both chemicals and proteins. While the statistical foundation behind our method is general enough to encompass genome-wide drug off-target prediction, the program is specifically tailored to find protein targets for new chemicals with little to no available interaction data. We extensively evaluate our method using a number of the most widely accepted gene-specific and cross-gene family benchmarks and demonstrate that our method outperforms other state-of-the-art algorithms for predicting the interaction of new chemicals with multiple proteins. Thus, the proposed algorithm may provide a powerful tool for multi-target drug design. PMID:27958331
Improved genome-scale multi-target virtual screening via a novel collaborative filtering approach to cold-start problem.

PubMed

Lim, Hansaim; Gray, Paul; Xie, Lei; Poleksic, Aleksandar

2016-12-13

Conventional one-drug-one-gene approach has been of limited success in modern drug discovery. Polypharmacology, which focuses on searching for multi-targeted drugs to perturb disease-causing networks instead of designing selective ligands to target individual proteins, has emerged as a new drug discovery paradigm. Although many methods for single-target virtual screening have been developed to improve the efficiency of drug discovery, few of these algorithms are designed for polypharmacology. Here, we present a novel theoretical framework and a corresponding algorithm for genome-scale multi-target virtual screening based on the one-class collaborative filtering technique. Our method overcomes the sparseness of the protein-chemical interaction data by means of interaction matrix weighting and dual regularization from both chemicals and proteins. While the statistical foundation behind our method is general enough to encompass genome-wide drug off-target prediction, the program is specifically tailored to find protein targets for new chemicals with little to no available interaction data. We extensively evaluate our method using a number of the most widely accepted gene-specific and cross-gene family benchmarks and demonstrate that our method outperforms other state-of-the-art algorithms for predicting the interaction of new chemicals with multiple proteins. Thus, the proposed algorithm may provide a powerful tool for multi-target drug design.

Some links on this page may take you to non-federal websites. Their policies may differ from this site.