NASA Astrophysics Data System (ADS)
Stoykov, S.; Atanassov, E.; Margenov, S.
2016-10-01
Many of the scientific applications involve sparse or dense matrix operations, such as solving linear systems, matrix-matrix products, eigensolvers, etc. In what concerns structural nonlinear dynamics, the computations of periodic responses and the determination of stability of the solution are of primary interest. Shooting method iswidely used for obtaining periodic responses of nonlinear systems. The method involves simultaneously operations with sparse and dense matrices. One of the computationally expensive operations in the method is multiplication of sparse by dense matrices. In the current work, a new algorithm for sparse matrix by dense matrix products is presented. The algorithm takes into account the structure of the sparse matrix, which is obtained by space discretization of the nonlinear Mindlin's plate equation of motion by the finite element method. The algorithm is developed to use the vector engine of Intel Xeon Phi coprocessors. It is compared with the standard sparse matrix by dense matrix algorithm and the one developed by Intel MKL and it is shown that by considering the properties of the sparse matrix better algorithms can be developed.
Sparse Matrices in MATLAB: Design and Implementation
NASA Technical Reports Server (NTRS)
Gilbert, John R.; Moler, Cleve; Schreiber, Robert
1992-01-01
The matrix computation language and environment MATLAB is extended to include sparse matrix storage and operations. The only change to the outward appearance of the MATLAB language is a pair of commands to create full or sparse matrices. Nearly all the operations of MATLAB now apply equally to full or sparse matrices, without any explicit action by the user. The sparse data structure represents a matrix in space proportional to the number of nonzero entries, and most of the operations compute sparse results in time proportional to the number of arithmetic operations on nonzeros.
User's Manual for PCSMS (Parallel Complex Sparse Matrix Solver). Version 1.
NASA Technical Reports Server (NTRS)
Reddy, C. J.
2000-01-01
PCSMS (Parallel Complex Sparse Matrix Solver) is a computer code written to make use of the existing real sparse direct solvers to solve complex, sparse matrix linear equations. PCSMS converts complex matrices into real matrices and use real, sparse direct matrix solvers to factor and solve the real matrices. The solution vector is reconverted to complex numbers. Though, this utility is written for Silicon Graphics (SGI) real sparse matrix solution routines, it is general in nature and can be easily modified to work with any real sparse matrix solver. The User's Manual is written to make the user acquainted with the installation and operation of the code. Driver routines are given to aid the users to integrate PCSMS routines in their own codes.
SPARSKIT: A basic tool kit for sparse matrix computations
NASA Technical Reports Server (NTRS)
Saad, Youcef
1990-01-01
Presented here are the main features of a tool package for manipulating and working with sparse matrices. One of the goals of the package is to provide basic tools to facilitate the exchange of software and data between researchers in sparse matrix computations. The starting point is the Harwell/Boeing collection of matrices for which the authors provide a number of tools. Among other things, the package provides programs for converting data structures, printing simple statistics on a matrix, plotting a matrix profile, and performing linear algebra operations with sparse matrices.
Systematic sparse matrix error control for linear scaling electronic structure calculations.
Rubensson, Emanuel H; Sałek, Paweł
2005-11-30
Efficient truncation criteria used in multiatom blocked sparse matrix operations for ab initio calculations are proposed. As system size increases, so does the need to stay on top of errors and still achieve high performance. A variant of a blocked sparse matrix algebra to achieve strict error control with good performance is proposed. The presented idea is that the condition to drop a certain submatrix should depend not only on the magnitude of that particular submatrix, but also on which other submatrices that are dropped. The decision to remove a certain submatrix is based on the contribution the removal would cause to the error in the chosen norm. We study the effect of an accumulated truncation error in iterative algorithms like trace correcting density matrix purification. One way to reduce the initial exponential growth of this error is presented. The presented error control for a sparse blocked matrix toolbox allows for achieving optimal performance by performing only necessary operations needed to maintain the requested level of accuracy. Copyright 2005 Wiley Periodicals, Inc.
Solving large sparse eigenvalue problems on supercomputers
NASA Technical Reports Server (NTRS)
Philippe, Bernard; Saad, Youcef
1988-01-01
An important problem in scientific computing consists in finding a few eigenvalues and corresponding eigenvectors of a very large and sparse matrix. The most popular methods to solve these problems are based on projection techniques on appropriate subspaces. The main attraction of these methods is that they only require the use of the matrix in the form of matrix by vector multiplications. The implementations on supercomputers of two such methods for symmetric matrices, namely Lanczos' method and Davidson's method are compared. Since one of the most important operations in these two methods is the multiplication of vectors by the sparse matrix, methods of performing this operation efficiently are discussed. The advantages and the disadvantages of each method are compared and implementation aspects are discussed. Numerical experiments on a one processor CRAY 2 and CRAY X-MP are reported. Possible parallel implementations are also discussed.
NASA Astrophysics Data System (ADS)
Kaporin, I. E.
2012-02-01
In order to precondition a sparse symmetric positive definite matrix, its approximate inverse is examined, which is represented as the product of two sparse mutually adjoint triangular matrices. In this way, the solution of the corresponding system of linear algebraic equations (SLAE) by applying the preconditioned conjugate gradient method (CGM) is reduced to performing only elementary vector operations and calculating sparse matrix-vector products. A method for constructing the above preconditioner is described and analyzed. The triangular factor has a fixed sparsity pattern and is optimal in the sense that the preconditioned matrix has a minimum K-condition number. The use of polynomial preconditioning based on Chebyshev polynomials makes it possible to considerably reduce the amount of scalar product operations (at the cost of an insignificant increase in the total number of arithmetic operations). The possibility of an efficient massively parallel implementation of the resulting method for solving SLAEs is discussed. For a sequential version of this method, the results obtained by solving 56 test problems from the Florida sparse matrix collection (which are large-scale and ill-conditioned) are presented. These results show that the method is highly reliable and has low computational costs.
Partitioning Rectangular and Structurally Nonsymmetric Sparse Matrices for Parallel Processing
DOE Office of Scientific and Technical Information (OSTI.GOV)
B. Hendrickson; T.G. Kolda
1998-09-01
A common operation in scientific computing is the multiplication of a sparse, rectangular or structurally nonsymmetric matrix and a vector. In many applications the matrix- transpose-vector product is also required. This paper addresses the efficient parallelization of these operations. We show that the problem can be expressed in terms of partitioning bipartite graphs. We then introduce several algorithms for this partitioning problem and compare their performance on a set of test matrices.
Low-rank matrix decomposition and spatio-temporal sparse recovery for STAP radar
Sen, Satyabrata
2015-08-04
We develop space-time adaptive processing (STAP) methods by leveraging the advantages of sparse signal processing techniques in order to detect a slowly-moving target. We observe that the inherent sparse characteristics of a STAP problem can be formulated as the low-rankness of clutter covariance matrix when compared to the total adaptive degrees-of-freedom, and also as the sparse interference spectrum on the spatio-temporal domain. By exploiting these sparse properties, we propose two approaches for estimating the interference covariance matrix. In the first approach, we consider a constrained matrix rank minimization problem (RMP) to decompose the sample covariance matrix into a low-rank positivemore » semidefinite and a diagonal matrix. The solution of RMP is obtained by applying the trace minimization technique and the singular value decomposition with matrix shrinkage operator. Our second approach deals with the atomic norm minimization problem to recover the clutter response-vector that has a sparse support on the spatio-temporal plane. We use convex relaxation based standard sparse-recovery techniques to find the solutions. With extensive numerical examples, we demonstrate the performances of proposed STAP approaches with respect to both the ideal and practical scenarios, involving Doppler-ambiguous clutter ridges, spatial and temporal decorrelation effects. As a result, the low-rank matrix decomposition based solution requires secondary measurements as many as twice the clutter rank to attain a near-ideal STAP performance; whereas the spatio-temporal sparsity based approach needs a considerably small number of secondary data.« less
Computing row and column counts for sparse QR and LU factorization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gilbert, John R.; Li, Xiaoye S.; Ng, Esmond G.
2001-01-01
We present algorithms to determine the number of nonzeros in each row and column of the factors of a sparse matrix, for both the QR factorization and the LU factorization with partial pivoting. The algorithms use only the nonzero structure of the input matrix, and run in time nearly linear in the number of nonzeros in that matrix. They may be used to set up data structures or schedule parallel operations in advance of the numerical factorization. The row and column counts we compute are upper bounds on the actual counts. If the input matrix is strong Hall and theremore » is no coincidental numerical cancellation, the counts are exact for QR factorization and are the tightest bounds possible for LU factorization. These algorithms are based on our earlier work on computing row and column counts for sparse Cholesky factorization, plus an efficient method to compute the column elimination tree of a sparse matrix without explicitly forming the product of the matrix and its transpose.« less
Multi scales based sparse matrix spectral clustering image segmentation
NASA Astrophysics Data System (ADS)
Liu, Zhongmin; Chen, Zhicai; Li, Zhanming; Hu, Wenjin
2018-04-01
In image segmentation, spectral clustering algorithms have to adopt the appropriate scaling parameter to calculate the similarity matrix between the pixels, which may have a great impact on the clustering result. Moreover, when the number of data instance is large, computational complexity and memory use of the algorithm will greatly increase. To solve these two problems, we proposed a new spectral clustering image segmentation algorithm based on multi scales and sparse matrix. We devised a new feature extraction method at first, then extracted the features of image on different scales, at last, using the feature information to construct sparse similarity matrix which can improve the operation efficiency. Compared with traditional spectral clustering algorithm, image segmentation experimental results show our algorithm have better degree of accuracy and robustness.
NASA Astrophysics Data System (ADS)
Ghale, Purnima; Johnson, Harley T.
2018-06-01
We present an efficient sparse matrix-vector (SpMV) based method to compute the density matrix P from a given Hamiltonian in electronic structure computations. Our method is a hybrid approach based on Chebyshev-Jackson approximation theory and matrix purification methods like the second order spectral projection purification (SP2). Recent methods to compute the density matrix scale as O(N) in the number of floating point operations but are accompanied by large memory and communication overhead, and they are based on iterative use of the sparse matrix-matrix multiplication kernel (SpGEMM), which is known to be computationally irregular. In addition to irregularity in the sparse Hamiltonian H, the nonzero structure of intermediate estimates of P depends on products of H and evolves over the course of computation. On the other hand, an expansion of the density matrix P in terms of Chebyshev polynomials is straightforward and SpMV based; however, the resulting density matrix may not satisfy the required constraints exactly. In this paper, we analyze the strengths and weaknesses of the Chebyshev-Jackson polynomials and the second order spectral projection purification (SP2) method, and propose to combine them so that the accurate density matrix can be computed using the SpMV computational kernel only, and without having to store the density matrix P. Our method accomplishes these objectives by using the Chebyshev polynomial estimate as the initial guess for SP2, which is followed by using sparse matrix-vector multiplications (SpMVs) to replicate the behavior of the SP2 algorithm for purification. We demonstrate the method on a tight-binding model system of an oxide material containing more than 3 million atoms. In addition, we also present the predicted behavior of our method when applied to near-metallic Hamiltonians with a wide energy spectrum.
An efficient implementation of a high-order filter for a cubed-sphere spectral element model
NASA Astrophysics Data System (ADS)
Kang, Hyun-Gyu; Cheong, Hyeong-Bin
2017-03-01
A parallel-scalable, isotropic, scale-selective spatial filter was developed for the cubed-sphere spectral element model on the sphere. The filter equation is a high-order elliptic (Helmholtz) equation based on the spherical Laplacian operator, which is transformed into cubed-sphere local coordinates. The Laplacian operator is discretized on the computational domain, i.e., on each cell, by the spectral element method with Gauss-Lobatto Lagrange interpolating polynomials (GLLIPs) as the orthogonal basis functions. On the global domain, the discrete filter equation yielded a linear system represented by a highly sparse matrix. The density of this matrix increases quadratically (linearly) with the order of GLLIP (order of the filter), and the linear system is solved in only O (Ng) operations, where Ng is the total number of grid points. The solution, obtained by a row reduction method, demonstrated the typical accuracy and convergence rate of the cubed-sphere spectral element method. To achieve computational efficiency on parallel computers, the linear system was treated by an inverse matrix method (a sparse matrix-vector multiplication). The density of the inverse matrix was lowered to only a few times of the original sparse matrix without degrading the accuracy of the solution. For better computational efficiency, a local-domain high-order filter was introduced: The filter equation is applied to multiple cells, and then the central cell was only used to reconstruct the filtered field. The parallel efficiency of applying the inverse matrix method to the global- and local-domain filter was evaluated by the scalability on a distributed-memory parallel computer. The scale-selective performance of the filter was demonstrated on Earth topography. The usefulness of the filter as a hyper-viscosity for the vorticity equation was also demonstrated.
Sparse electrocardiogram signals recovery based on solving a row echelon-like form of system.
Cai, Pingmei; Wang, Guinan; Yu, Shiwei; Zhang, Hongjuan; Ding, Shuxue; Wu, Zikai
2016-02-01
The study of biology and medicine in a noise environment is an evolving direction in biological data analysis. Among these studies, analysis of electrocardiogram (ECG) signals in a noise environment is a challenging direction in personalized medicine. Due to its periodic characteristic, ECG signal can be roughly regarded as sparse biomedical signals. This study proposes a two-stage recovery algorithm for sparse biomedical signals in time domain. In the first stage, the concentration subspaces are found in advance. Then by exploiting these subspaces, the mixing matrix is estimated accurately. In the second stage, based on the number of active sources at each time point, the time points are divided into different layers. Next, by constructing some transformation matrices, these time points form a row echelon-like system. After that, the sources at each layer can be solved out explicitly by corresponding matrix operations. It is noting that all these operations are conducted under a weak sparse condition that the number of active sources is less than the number of observations. Experimental results show that the proposed method has a better performance for sparse ECG signal recovery problem.
Using a multifrontal sparse solver in a high performance, finite element code
NASA Technical Reports Server (NTRS)
King, Scott D.; Lucas, Robert; Raefsky, Arthur
1990-01-01
We consider the performance of the finite element method on a vector supercomputer. The computationally intensive parts of the finite element method are typically the individual element forms and the solution of the global stiffness matrix both of which are vectorized in high performance codes. To further increase throughput, new algorithms are needed. We compare a multifrontal sparse solver to a traditional skyline solver in a finite element code on a vector supercomputer. The multifrontal solver uses the Multiple-Minimum Degree reordering heuristic to reduce the number of operations required to factor a sparse matrix and full matrix computational kernels (e.g., BLAS3) to enhance vector performance. The net result in an order-of-magnitude reduction in run time for a finite element application on one processor of a Cray X-MP.
A High Performance Block Eigensolver for Nuclear Configuration Interaction Calculations
Aktulga, Hasan Metin; Afibuzzaman, Md.; Williams, Samuel; ...
2017-06-01
As on-node parallelism increases and the performance gap between the processor and the memory system widens, achieving high performance in large-scale scientific applications requires an architecture-aware design of algorithms and solvers. We focus on the eigenvalue problem arising in nuclear Configuration Interaction (CI) calculations, where a few extreme eigenpairs of a sparse symmetric matrix are needed. Here, we consider a block iterative eigensolver whose main computational kernels are the multiplication of a sparse matrix with multiple vectors (SpMM), and tall-skinny matrix operations. We then present techniques to significantly improve the SpMM and the transpose operation SpMM T by using themore » compressed sparse blocks (CSB) format. We achieve 3-4× speedup on the requisite operations over good implementations with the commonly used compressed sparse row (CSR) format. We develop a performance model that allows us to correctly estimate the performance of our SpMM kernel implementations, and we identify cache bandwidth as a potential performance bottleneck beyond DRAM. We also analyze and optimize the performance of LOBPCG kernels (inner product and linear combinations on multiple vectors) and show up to 15× speedup over using high performance BLAS libraries for these operations. The resulting high performance LOBPCG solver achieves 1.4× to 1.8× speedup over the existing Lanczos solver on a series of CI computations on high-end multicore architectures (Intel Xeons). We also analyze the performance of our techniques on an Intel Xeon Phi Knights Corner (KNC) processor.« less
A High Performance Block Eigensolver for Nuclear Configuration Interaction Calculations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aktulga, Hasan Metin; Afibuzzaman, Md.; Williams, Samuel
As on-node parallelism increases and the performance gap between the processor and the memory system widens, achieving high performance in large-scale scientific applications requires an architecture-aware design of algorithms and solvers. We focus on the eigenvalue problem arising in nuclear Configuration Interaction (CI) calculations, where a few extreme eigenpairs of a sparse symmetric matrix are needed. Here, we consider a block iterative eigensolver whose main computational kernels are the multiplication of a sparse matrix with multiple vectors (SpMM), and tall-skinny matrix operations. We then present techniques to significantly improve the SpMM and the transpose operation SpMM T by using themore » compressed sparse blocks (CSB) format. We achieve 3-4× speedup on the requisite operations over good implementations with the commonly used compressed sparse row (CSR) format. We develop a performance model that allows us to correctly estimate the performance of our SpMM kernel implementations, and we identify cache bandwidth as a potential performance bottleneck beyond DRAM. We also analyze and optimize the performance of LOBPCG kernels (inner product and linear combinations on multiple vectors) and show up to 15× speedup over using high performance BLAS libraries for these operations. The resulting high performance LOBPCG solver achieves 1.4× to 1.8× speedup over the existing Lanczos solver on a series of CI computations on high-end multicore architectures (Intel Xeons). We also analyze the performance of our techniques on an Intel Xeon Phi Knights Corner (KNC) processor.« less
Hine, N D M; Haynes, P D; Mostofi, A A; Payne, M C
2010-09-21
We present calculations of formation energies of defects in an ionic solid (Al(2)O(3)) extrapolated to the dilute limit, corresponding to a simulation cell of infinite size. The large-scale calculations required for this extrapolation are enabled by developments in the approach to parallel sparse matrix algebra operations, which are central to linear-scaling density-functional theory calculations. The computational cost of manipulating sparse matrices, whose sizes are determined by the large number of basis functions present, is greatly improved with this new approach. We present details of the sparse algebra scheme implemented in the ONETEP code using hierarchical sparsity patterns, and demonstrate its use in calculations on a wide range of systems, involving thousands of atoms on hundreds to thousands of parallel processes.
Fast and Adaptive Sparse Precision Matrix Estimation in High Dimensions
Liu, Weidong; Luo, Xi
2014-01-01
This paper proposes a new method for estimating sparse precision matrices in the high dimensional setting. It has been popular to study fast computation and adaptive procedures for this problem. We propose a novel approach, called Sparse Column-wise Inverse Operator, to address these two issues. We analyze an adaptive procedure based on cross validation, and establish its convergence rate under the Frobenius norm. The convergence rates under other matrix norms are also established. This method also enjoys the advantage of fast computation for large-scale problems, via a coordinate descent algorithm. Numerical merits are illustrated using both simulated and real datasets. In particular, it performs favorably on an HIV brain tissue dataset and an ADHD resting-state fMRI dataset. PMID:25750463
Approximate method of variational Bayesian matrix factorization/completion with sparse prior
NASA Astrophysics Data System (ADS)
Kawasumi, Ryota; Takeda, Koujin
2018-05-01
We derive the analytical expression of a matrix factorization/completion solution by the variational Bayes method, under the assumption that the observed matrix is originally the product of low-rank, dense and sparse matrices with additive noise. We assume the prior of a sparse matrix is a Laplace distribution by taking matrix sparsity into consideration. Then we use several approximations for the derivation of a matrix factorization/completion solution. By our solution, we also numerically evaluate the performance of a sparse matrix reconstruction in matrix factorization, and completion of a missing matrix element in matrix completion.
Brief announcement: Hypergraph parititioning for parallel sparse matrix-matrix multiplication
Ballard, Grey; Druinsky, Alex; Knight, Nicholas; ...
2015-01-01
The performance of parallel algorithms for sparse matrix-matrix multiplication is typically determined by the amount of interprocessor communication performed, which in turn depends on the nonzero structure of the input matrices. In this paper, we characterize the communication cost of a sparse matrix-matrix multiplication algorithm in terms of the size of a cut of an associated hypergraph that encodes the computation for a given input nonzero structure. Obtaining an optimal algorithm corresponds to solving a hypergraph partitioning problem. Furthermore, our hypergraph model generalizes several existing models for sparse matrix-vector multiplication, and we can leverage hypergraph partitioners developed for that computationmore » to improve application-specific algorithms for multiplying sparse matrices.« less
LSRN: A PARALLEL ITERATIVE SOLVER FOR STRONGLY OVER- OR UNDERDETERMINED SYSTEMS*
Meng, Xiangrui; Saunders, Michael A.; Mahoney, Michael W.
2014-01-01
We describe a parallel iterative least squares solver named LSRN that is based on random normal projection. LSRN computes the min-length solution to minx∈ℝn ‖Ax − b‖2, where A ∈ ℝm × n with m ≫ n or m ≪ n, and where A may be rank-deficient. Tikhonov regularization may also be included. Since A is involved only in matrix-matrix and matrix-vector multiplications, it can be a dense or sparse matrix or a linear operator, and LSRN automatically speeds up when A is sparse or a fast linear operator. The preconditioning phase consists of a random normal projection, which is embarrassingly parallel, and a singular value decomposition of size ⌈γ min(m, n)⌉ × min(m, n), where γ is moderately larger than 1, e.g., γ = 2. We prove that the preconditioned system is well-conditioned, with a strong concentration result on the extreme singular values, and hence that the number of iterations is fully predictable when we apply LSQR or the Chebyshev semi-iterative method. As we demonstrate, the Chebyshev method is particularly efficient for solving large problems on clusters with high communication cost. Numerical results show that on a shared-memory machine, LSRN is very competitive with LAPACK’s DGELSD and a fast randomized least squares solver called Blendenpik on large dense problems, and it outperforms the least squares solver from SuiteSparseQR on sparse problems without sparsity patterns that can be exploited to reduce fill-in. Further experiments show that LSRN scales well on an Amazon Elastic Compute Cloud cluster. PMID:25419094
NASA Technical Reports Server (NTRS)
Whetstone, W. D.
1976-01-01
The functions and operating rules of the SPAR system, which is a group of computer programs used primarily to perform stress, buckling, and vibrational analyses of linear finite element systems, were given. The following subject areas were discussed: basic information, structure definition, format system matrix processors, utility programs, static solutions, stresses, sparse matrix eigensolver, dynamic response, graphics, and substructure processors.
Multi-threaded Sparse Matrix Sparse Matrix Multiplication for Many-Core and GPU Architectures.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Deveci, Mehmet; Trott, Christian Robert; Rajamanickam, Sivasankaran
Sparse Matrix-Matrix multiplication is a key kernel that has applications in several domains such as scientific computing and graph analysis. Several algorithms have been studied in the past for this foundational kernel. In this paper, we develop parallel algorithms for sparse matrix- matrix multiplication with a focus on performance portability across different high performance computing architectures. The performance of these algorithms depend on the data structures used in them. We compare different types of accumulators in these algorithms and demonstrate the performance difference between these data structures. Furthermore, we develop a meta-algorithm, kkSpGEMM, to choose the right algorithm and datamore » structure based on the characteristics of the problem. We show performance comparisons on three architectures and demonstrate the need for the community to develop two phase sparse matrix-matrix multiplication implementations for efficient reuse of the data structures involved.« less
An Efficient Scheme for Updating Sparse Cholesky Factors
NASA Technical Reports Server (NTRS)
Raghavan, Padma
2002-01-01
Raghavan had earlier developed the software package DCSPACK which can be used for solving sparse linear systems where the coefficient matrix is symmetric and positive definite (this project was not funded by NASA but by agencies such as NSF). DSCPACK-S is the serial code and DSCPACK-P is a parallel implementation suitable for multiprocessors or networks-of-workstations with message passing using MCI. The main algorithm used is the Cholesky factorization of a sparse symmetric positive positive definite matrix A = LL(T). The code can also compute the factorization A = LDL(T). The complexity of the software arises from several factors relating to the sparsity of the matrix A. A sparse N x N matrix A has typically less that cN nonzeroes where c is a small constant. If the matrix were dense, it would have O(N2) nonzeroes. The most complicated part of such sparse Cholesky factorization relates to fill-in, i.e., zeroes in the original matrix that become nonzeroes in the factor L. An efficient implementation depends to a large extent on complex data structures and on techniques from graph theory to reduce, identify, and manage fill. DSCPACK is based on an efficient multifrontal implementation with fill-managing algorithms and implementation arising from earlier research by Raghavan and others. Sparse Cholesky factorization is typically a four step process: (1) ordering to compute a fill-reducing numbering, (2) symbolic factorization to determine the nonzero structure of L, (3) numeric factorization to compute L, and, (4) triangular solution to solve L(T)x = y and Ly = b. The first two steps are symbolic and are performed using the graph of the matrix. The numeric factorization step is of dominant cost and there are several schemes for improving performance by exploiting the nested and dense structure of groups of columns in the factor. The latter are aimed at better utilization of the cache-memory hierarchy on modem processors to prevent cache-misses and provide execution rates (operations/second) that are close to the peak rates for dense matrix computations. Currently, EPISCOPACY is being used in an application at NASA directed by J. Newman and M. James. We propose the implementation of efficient schemes for updating the LL(T) or LDL(T) factors computed in DSCPACK-S to meet the computational requirements of their project. A brief description is provided in the next section.
SMOKE TOOL FOR MODELS-3 VERSION 4.1 STRUCTURE AND OPERATION DOCUMENTATION
The SMOKE Tool is a part of the Models-3 system, a flexible software system designed to simplify the development and use of air quality models and other environmental decision support tools. The SMOKE Tool is an input processor for SMOKE, (Sparse Matrix Operator Kernel Emissio...
NASA Astrophysics Data System (ADS)
Hu, Guiqiang; Xiao, Di; Wang, Yong; Xiang, Tao; Zhou, Qing
2017-11-01
Recently, a new kind of image encryption approach using compressive sensing (CS) and double random phase encoding has received much attention due to the advantages such as compressibility and robustness. However, this approach is found to be vulnerable to chosen plaintext attack (CPA) if the CS measurement matrix is re-used. Therefore, designing an efficient measurement matrix updating mechanism that ensures resistance to CPA is of practical significance. In this paper, we provide a novel solution to update the CS measurement matrix by altering the secret sparse basis with the help of counter mode operation. Particularly, the secret sparse basis is implemented by a reality-preserving fractional cosine transform matrix. Compared with the conventional CS-based cryptosystem that totally generates all the random entries of measurement matrix, our scheme owns efficiency superiority while guaranteeing resistance to CPA. Experimental and analysis results show that the proposed scheme has a good security performance and has robustness against noise and occlusion.
Kim, Hyunsoo; Park, Haesun
2007-06-15
Many practical pattern recognition problems require non-negativity constraints. For example, pixels in digital images and chemical concentrations in bioinformatics are non-negative. Sparse non-negative matrix factorizations (NMFs) are useful when the degree of sparseness in the non-negative basis matrix or the non-negative coefficient matrix in an NMF needs to be controlled in approximating high-dimensional data in a lower dimensional space. In this article, we introduce a novel formulation of sparse NMF and show how the new formulation leads to a convergent sparse NMF algorithm via alternating non-negativity-constrained least squares. We apply our sparse NMF algorithm to cancer-class discovery and gene expression data analysis and offer biological analysis of the results obtained. Our experimental results illustrate that the proposed sparse NMF algorithm often achieves better clustering performance with shorter computing time compared to other existing NMF algorithms. The software is available as supplementary material.
NASA Astrophysics Data System (ADS)
Galiatsatos, P. G.; Tennyson, J.
2012-11-01
The most time consuming step within the framework of the UK R-matrix molecular codes is that of the diagonalization of the inner region Hamiltonian matrix (IRHM). Here we present the method that we follow to speed up this step. We use shared memory machines (SMM), distributed memory machines (DMM), the OpenMP directive based parallel language, the MPI function based parallel language, the sparse matrix diagonalizers ARPACK and PARPACK, a variation for real symmetric matrices of the official coordinate sparse matrix format and finally a parallel sparse matrix-vector product (PSMV). The efficient application of the previous techniques rely on two important facts: the sparsity of the matrix is large enough (more than 98%) and in order to get back converged results we need a small only part of the matrix spectrum.
An overview of NSPCG: A nonsymmetric preconditioned conjugate gradient package
NASA Astrophysics Data System (ADS)
Oppe, Thomas C.; Joubert, Wayne D.; Kincaid, David R.
1989-05-01
The most recent research-oriented software package developed as part of the ITPACK Project is called "NSPCG" since it contains many nonsymmetric preconditioned conjugate gradient procedures. It is designed to solve large sparse systems of linear algebraic equations by a variety of different iterative methods. One of the main purposes for the development of the package is to provide a common modular structure for research on iterative methods for nonsymmetric matrices. Another purpose for the development of the package is to investigate the suitability of several iterative methods for vector computers. Since the vectorizability of an iterative method depends greatly on the matrix structure, NSPCG allows great flexibility in the operator representation. The coefficient matrix can be passed in one of several different matrix data storage schemes. These sparse data formats allow matrices with a wide range of structures from highly structured ones such as those with all nonzeros along a relatively small number of diagonals to completely unstructured sparse matrices. Alternatively, the package allows the user to call the accelerators directly with user-supplied routines for performing certain matrix operations. In this case, one can use the data format from an application program and not be required to copy the matrix into one of the package formats. This is particularly advantageous when memory space is limited. Some of the basic preconditioners that are available are point methods such as Jacobi, Incomplete LU Decomposition and Symmetric Successive Overrelaxation as well as block and multicolor preconditioners. The user can select from a large collection of accelerators such as Conjugate Gradient (CG), Chebyshev (SI, for semi-iterative), Generalized Minimal Residual (GMRES), Biconjugate Gradient Squared (BCGS) and many others. The package is modular so that almost any accelerator can be used with almost any preconditioner.
Dense and Sparse Matrix Operations on the Cell Processor
DOE Office of Scientific and Technical Information (OSTI.GOV)
Williams, Samuel W.; Shalf, John; Oliker, Leonid
2005-05-01
The slowing pace of commodity microprocessor performance improvements combined with ever-increasing chip power demands has become of utmost concern to computational scientists. Therefore, the high performance computing community is examining alternative architectures that address the limitations of modern superscalar designs. In this work, we examine STI's forthcoming Cell processor: a novel, low-power architecture that combines a PowerPC core with eight independent SIMD processing units coupled with a software-controlled memory to offer high FLOP/s/Watt. Since neither Cell hardware nor cycle-accurate simulators are currently publicly available, we develop an analytic framework to predict Cell performance on dense and sparse matrix operations, usingmore » a variety of algorithmic approaches. Results demonstrate Cell's potential to deliver more than an order of magnitude better GFLOP/s per watt performance, when compared with the Intel Itanium2 and Cray X1 processors.« less
1982-10-27
are buried within * a much larger, special purpose package. We regret such omissions, but to have reached the practi- tioners in each of the diverse...sparse matrix (form PAQ ) 4. Method of solution: Distribution count sort 5. Programming language: FORTRAN g Precision: Single and double precision 7
Solution of matrix equations using sparse techniques
NASA Technical Reports Server (NTRS)
Baddourah, Majdi
1994-01-01
The solution of large systems of matrix equations is key to the solution of a large number of scientific and engineering problems. This talk describes the sparse matrix solver developed at Langley which can routinely solve in excess of 263,000 equations in 40 seconds on one Cray C-90 processor. It appears that for large scale structural analysis applications, sparse matrix methods have a significant performance advantage over other methods.
Multi-threaded Sparse Matrix-Matrix Multiplication for Many-Core and GPU Architectures.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Deveci, Mehmet; Rajamanickam, Sivasankaran; Trott, Christian Robert
Sparse Matrix-Matrix multiplication is a key kernel that has applications in several domains such as scienti c computing and graph analysis. Several algorithms have been studied in the past for this foundational kernel. In this paper, we develop parallel algorithms for sparse matrix-matrix multiplication with a focus on performance portability across different high performance computing architectures. The performance of these algorithms depend on the data structures used in them. We compare different types of accumulators in these algorithms and demonstrate the performance difference between these data structures. Furthermore, we develop a meta-algorithm, kkSpGEMM, to choose the right algorithm and datamore » structure based on the characteristics of the problem. We show performance comparisons on three architectures and demonstrate the need for the community to develop two phase sparse matrix-matrix multiplication implementations for efficient reuse of the data structures involved.« less
Sparse Matrix for ECG Identification with Two-Lead Features.
Tseng, Kuo-Kun; Luo, Jiao; Hegarty, Robert; Wang, Wenmin; Haiting, Dong
2015-01-01
Electrocardiograph (ECG) human identification has the potential to improve biometric security. However, improvements in ECG identification and feature extraction are required. Previous work has focused on single lead ECG signals. Our work proposes a new algorithm for human identification by mapping two-lead ECG signals onto a two-dimensional matrix then employing a sparse matrix method to process the matrix. And that is the first application of sparse matrix techniques for ECG identification. Moreover, the results of our experiments demonstrate the benefits of our approach over existing methods.
(EDMUNDS, WA) WILDLAND FIRE EMISSIONS MODELING: INTEGRATING BLUESKY AND SMOKE
This presentation is a status update of the BlueSky emissions modeling system. BlueSky-EM has been coupled with the Sparse Matrix Operational Kernel Emissions (SMOKE) system, and is now available as a tool for estimating emissions from wildland fires
Convex Banding of the Covariance Matrix
Bien, Jacob; Bunea, Florentina; Xiao, Luo
2016-01-01
We introduce a new sparse estimator of the covariance matrix for high-dimensional models in which the variables have a known ordering. Our estimator, which is the solution to a convex optimization problem, is equivalently expressed as an estimator which tapers the sample covariance matrix by a Toeplitz, sparsely-banded, data-adaptive matrix. As a result of this adaptivity, the convex banding estimator enjoys theoretical optimality properties not attained by previous banding or tapered estimators. In particular, our convex banding estimator is minimax rate adaptive in Frobenius and operator norms, up to log factors, over commonly-studied classes of covariance matrices, and over more general classes. Furthermore, it correctly recovers the bandwidth when the true covariance is exactly banded. Our convex formulation admits a simple and efficient algorithm. Empirical studies demonstrate its practical effectiveness and illustrate that our exactly-banded estimator works well even when the true covariance matrix is only close to a banded matrix, confirming our theoretical results. Our method compares favorably with all existing methods, in terms of accuracy and speed. We illustrate the practical merits of the convex banding estimator by showing that it can be used to improve the performance of discriminant analysis for classifying sound recordings. PMID:28042189
Convex Banding of the Covariance Matrix.
Bien, Jacob; Bunea, Florentina; Xiao, Luo
2016-01-01
We introduce a new sparse estimator of the covariance matrix for high-dimensional models in which the variables have a known ordering. Our estimator, which is the solution to a convex optimization problem, is equivalently expressed as an estimator which tapers the sample covariance matrix by a Toeplitz, sparsely-banded, data-adaptive matrix. As a result of this adaptivity, the convex banding estimator enjoys theoretical optimality properties not attained by previous banding or tapered estimators. In particular, our convex banding estimator is minimax rate adaptive in Frobenius and operator norms, up to log factors, over commonly-studied classes of covariance matrices, and over more general classes. Furthermore, it correctly recovers the bandwidth when the true covariance is exactly banded. Our convex formulation admits a simple and efficient algorithm. Empirical studies demonstrate its practical effectiveness and illustrate that our exactly-banded estimator works well even when the true covariance matrix is only close to a banded matrix, confirming our theoretical results. Our method compares favorably with all existing methods, in terms of accuracy and speed. We illustrate the practical merits of the convex banding estimator by showing that it can be used to improve the performance of discriminant analysis for classifying sound recordings.
Saravanan, Chandra; Shao, Yihan; Baer, Roi; Ross, Philip N; Head-Gordon, Martin
2003-04-15
A sparse matrix multiplication scheme with multiatom blocks is reported, a tool that can be very useful for developing linear-scaling methods with atom-centered basis functions. Compared to conventional element-by-element sparse matrix multiplication schemes, efficiency is gained by the use of the highly optimized basic linear algebra subroutines (BLAS). However, some sparsity is lost in the multiatom blocking scheme because these matrix blocks will in general contain negligible elements. As a result, an optimal block size that minimizes the CPU time by balancing these two effects is recovered. In calculations on linear alkanes, polyglycines, estane polymers, and water clusters the optimal block size is found to be between 40 and 100 basis functions, where about 55-75% of the machine peak performance was achieved on an IBM RS6000 workstation. In these calculations, the blocked sparse matrix multiplications can be 10 times faster than a standard element-by-element sparse matrix package. Copyright 2003 Wiley Periodicals, Inc. J Comput Chem 24: 618-622, 2003
Massively parallel sparse matrix function calculations with NTPoly
NASA Astrophysics Data System (ADS)
Dawson, William; Nakajima, Takahito
2018-04-01
We present NTPoly, a massively parallel library for computing the functions of sparse, symmetric matrices. The theory of matrix functions is a well developed framework with a wide range of applications including differential equations, graph theory, and electronic structure calculations. One particularly important application area is diagonalization free methods in quantum chemistry. When the input and output of the matrix function are sparse, methods based on polynomial expansions can be used to compute matrix functions in linear time. We present a library based on these methods that can compute a variety of matrix functions. Distributed memory parallelization is based on a communication avoiding sparse matrix multiplication algorithm. OpenMP task parallellization is utilized to implement hybrid parallelization. We describe NTPoly's interface and show how it can be integrated with programs written in many different programming languages. We demonstrate the merits of NTPoly by performing large scale calculations on the K computer.
Improvements in sparse matrix operations of NASTRAN
NASA Technical Reports Server (NTRS)
Harano, S.
1980-01-01
A "nontransmit" packing routine was added to NASTRAN to allow matrix data to be refered to directly from the input/output buffer. Use of the packing routine permits various routines for matrix handling to perform a direct reference to the input/output buffer if data addresses have once been received. The packing routine offers a buffer by buffer backspace feature for efficient backspacing in sequential access. Unlike a conventional backspacing that needs twice back record for a single read of one record (one column), this feature omits overlapping of READ operation and back record. It eliminates the necessity of writing, in decomposition of a symmetric matrix, of a portion of the matrix to its upper triangular matrix from the last to the first columns of the symmetric matrix, thus saving time for generating the upper triangular matrix. Only a lower triangular matrix must be written onto the secondary storage device, bringing 10 to 30% reduction in use of the disk space of the storage device.
Disentangling giant component and finite cluster contributions in sparse random matrix spectra.
Kühn, Reimer
2016-04-01
We describe a method for disentangling giant component and finite cluster contributions to sparse random matrix spectra, using sparse symmetric random matrices defined on Erdős-Rényi graphs as an example and test bed. Our methods apply to sparse matrices defined in terms of arbitrary graphs in the configuration model class, as long as they have finite mean degree.
NASA Technical Reports Server (NTRS)
Cannone, Jaime J.; Barnes, Cindy L.; Achari, Aniruddha; Kundrot, Craig E.; Whitaker, Ann F. (Technical Monitor)
2001-01-01
The Sparse Matrix approach for obtaining lead crystallization conditions has proven to be very fruitful for the crystallization of proteins and nucleic acids. Here we report a Sparse Matrix developed specifically for the crystallization of protein-DNA complexes. This method is rapid and economical, typically requiring 2.5 mg of complex to test 48 conditions. The method was originally developed to crystallize basic fibroblast growth factor (bFGF) complexed with DNA sequences identified through in vitro selection, or SELEX, methods. Two DNA aptamers that bind with approximately nanomolar affinity and inhibit the angiogenic properties of bFGF were selected for co-crystallization. The Sparse Matrix produced lead crystallization conditions for both bFGF-DNA complexes.
High-SNR spectrum measurement based on Hadamard encoding and sparse reconstruction
NASA Astrophysics Data System (ADS)
Wang, Zhaoxin; Yue, Jiang; Han, Jing; Li, Long; Jin, Yong; Gao, Yuan; Li, Baoming
2017-12-01
The denoising capabilities of the H-matrix and cyclic S-matrix based on the sparse reconstruction, employed in the Pixel of Focal Plane Coded Visible Spectrometer for spectrum measurement are investigated, where the spectrum is sparse in a known basis. In the measurement process, the digital micromirror device plays an important role, which implements the Hadamard coding. In contrast with Hadamard transform spectrometry, based on the shift invariability, this spectrometer may have the advantage of a high efficiency. Simulations and experiments show that the nonlinear solution with a sparse reconstruction has a better signal-to-noise ratio than the linear solution and the H-matrix outperforms the cyclic S-matrix whether the reconstruction method is nonlinear or linear.
Sparse nonnegative matrix factorization with ℓ0-constraints
Peharz, Robert; Pernkopf, Franz
2012-01-01
Although nonnegative matrix factorization (NMF) favors a sparse and part-based representation of nonnegative data, there is no guarantee for this behavior. Several authors proposed NMF methods which enforce sparseness by constraining or penalizing the ℓ1-norm of the factor matrices. On the other hand, little work has been done using a more natural sparseness measure, the ℓ0-pseudo-norm. In this paper, we propose a framework for approximate NMF which constrains the ℓ0-norm of the basis matrix, or the coefficient matrix, respectively. For this purpose, techniques for unconstrained NMF can be easily incorporated, such as multiplicative update rules, or the alternating nonnegative least-squares scheme. In experiments we demonstrate the benefits of our methods, which compare to, or outperform existing approaches. PMID:22505792
GPU-accelerated element-free reverse-time migration with Gauss points partition
NASA Astrophysics Data System (ADS)
Zhou, Zhen; Jia, Xiaofeng; Qiang, Xiaodong
2018-06-01
An element-free method (EFM) has been demonstrated successfully in elasticity, heat conduction and fatigue crack growth problems. We present the theory of EFM and its numerical applications in seismic modelling and reverse time migration (RTM). Compared with the finite difference method and the finite element method, the EFM has unique advantages: (1) independence of grids in computation and (2) lower expense and more flexibility (because only the information of the nodes and the boundary of the concerned area is required). However, in EFM, due to improper computation and storage of some large sparse matrices, such as the mass matrix and the stiffness matrix, the method is difficult to apply to seismic modelling and RTM for a large velocity model. To solve the problem of storage and computation efficiency, we propose a concept of Gauss points partition and utilise the graphics processing unit to improve the computational efficiency. We employ the compressed sparse row format to compress the intermediate large sparse matrices and attempt to simplify the operations by solving the linear equations with CULA solver. To improve the computation efficiency further, we introduce the concept of the lumped mass matrix. Numerical experiments indicate that the proposed method is accurate and more efficient than the regular EFM.
Supercomputing on massively parallel bit-serial architectures
NASA Technical Reports Server (NTRS)
Iobst, Ken
1985-01-01
Research on the Goodyear Massively Parallel Processor (MPP) suggests that high-level parallel languages are practical and can be designed with powerful new semantics that allow algorithms to be efficiently mapped to the real machines. For the MPP these semantics include parallel/associative array selection for both dense and sparse matrices, variable precision arithmetic to trade accuracy for speed, micro-pipelined train broadcast, and conditional branching at the processing element (PE) control unit level. The preliminary design of a FORTRAN-like parallel language for the MPP has been completed and is being used to write programs to perform sparse matrix array selection, min/max search, matrix multiplication, Gaussian elimination on single bit arrays and other generic algorithms. A description is given of the MPP design. Features of the system and its operation are illustrated in the form of charts and diagrams.
Yang, C L; Wei, H Y; Adler, A; Soleimani, M
2013-06-01
Electrical impedance tomography (EIT) is a fast and cost-effective technique to provide a tomographic conductivity image of a subject from boundary current-voltage data. This paper proposes a time and memory efficient method for solving a large scale 3D EIT inverse problem using a parallel conjugate gradient (CG) algorithm. The 3D EIT system with a large number of measurement data can produce a large size of Jacobian matrix; this could cause difficulties in computer storage and the inversion process. One of challenges in 3D EIT is to decrease the reconstruction time and memory usage, at the same time retaining the image quality. Firstly, a sparse matrix reduction technique is proposed using thresholding to set very small values of the Jacobian matrix to zero. By adjusting the Jacobian matrix into a sparse format, the element with zeros would be eliminated, which results in a saving of memory requirement. Secondly, a block-wise CG method for parallel reconstruction has been developed. The proposed method has been tested using simulated data as well as experimental test samples. Sparse Jacobian with a block-wise CG enables the large scale EIT problem to be solved efficiently. Image quality measures are presented to quantify the effect of sparse matrix reduction in reconstruction results.
Method and apparatus for optimized processing of sparse matrices
Taylor, Valerie E.
1993-01-01
A computer architecture for processing a sparse matrix is disclosed. The apparatus stores a value-row vector corresponding to nonzero values of a sparse matrix. Each of the nonzero values is located at a defined row and column position in the matrix. The value-row vector includes a first vector including nonzero values and delimiting characters indicating a transition from one column to another. The value-row vector also includes a second vector which defines row position values in the matrix corresponding to the nonzero values in the first vector and column position values in the matrix corresponding to the column position of the nonzero values in the first vector. The architecture also includes a circuit for detecting a special character within the value-row vector. Matrix-vector multiplication is executed on the value-row vector. This multiplication is performed by multiplying an index value of the first vector value by a column value from a second matrix to form a matrix-vector product which is added to a previous matrix-vector product.
Tang, Xin; Feng, Guo-Can; Li, Xiao-Xin; Cai, Jia-Xin
2015-01-01
Face recognition is challenging especially when the images from different persons are similar to each other due to variations in illumination, expression, and occlusion. If we have sufficient training images of each person which can span the facial variations of that person under testing conditions, sparse representation based classification (SRC) achieves very promising results. However, in many applications, face recognition often encounters the small sample size problem arising from the small number of available training images for each person. In this paper, we present a novel face recognition framework by utilizing low-rank and sparse error matrix decomposition, and sparse coding techniques (LRSE+SC). Firstly, the low-rank matrix recovery technique is applied to decompose the face images per class into a low-rank matrix and a sparse error matrix. The low-rank matrix of each individual is a class-specific dictionary and it captures the discriminative feature of this individual. The sparse error matrix represents the intra-class variations, such as illumination, expression changes. Secondly, we combine the low-rank part (representative basis) of each person into a supervised dictionary and integrate all the sparse error matrix of each individual into a within-individual variant dictionary which can be applied to represent the possible variations between the testing and training images. Then these two dictionaries are used to code the query image. The within-individual variant dictionary can be shared by all the subjects and only contribute to explain the lighting conditions, expressions, and occlusions of the query image rather than discrimination. At last, a reconstruction-based scheme is adopted for face recognition. Since the within-individual dictionary is introduced, LRSE+SC can handle the problem of the corrupted training data and the situation that not all subjects have enough samples for training. Experimental results show that our method achieves the state-of-the-art results on AR, FERET, FRGC and LFW databases.
Tang, Xin; Feng, Guo-can; Li, Xiao-xin; Cai, Jia-xin
2015-01-01
Face recognition is challenging especially when the images from different persons are similar to each other due to variations in illumination, expression, and occlusion. If we have sufficient training images of each person which can span the facial variations of that person under testing conditions, sparse representation based classification (SRC) achieves very promising results. However, in many applications, face recognition often encounters the small sample size problem arising from the small number of available training images for each person. In this paper, we present a novel face recognition framework by utilizing low-rank and sparse error matrix decomposition, and sparse coding techniques (LRSE+SC). Firstly, the low-rank matrix recovery technique is applied to decompose the face images per class into a low-rank matrix and a sparse error matrix. The low-rank matrix of each individual is a class-specific dictionary and it captures the discriminative feature of this individual. The sparse error matrix represents the intra-class variations, such as illumination, expression changes. Secondly, we combine the low-rank part (representative basis) of each person into a supervised dictionary and integrate all the sparse error matrix of each individual into a within-individual variant dictionary which can be applied to represent the possible variations between the testing and training images. Then these two dictionaries are used to code the query image. The within-individual variant dictionary can be shared by all the subjects and only contribute to explain the lighting conditions, expressions, and occlusions of the query image rather than discrimination. At last, a reconstruction-based scheme is adopted for face recognition. Since the within-individual dictionary is introduced, LRSE+SC can handle the problem of the corrupted training data and the situation that not all subjects have enough samples for training. Experimental results show that our method achieves the state-of-the-art results on AR, FERET, FRGC and LFW databases. PMID:26571112
Hou, Gary Y; Provost, Jean; Grondin, Julien; Wang, Shutao; Marquet, Fabrice; Bunting, Ethan; Konofagou, Elisa E
2014-11-01
Harmonic motion imaging for focused ultrasound (HMIFU) utilizes an amplitude-modulated HIFU beam to induce a localized focal oscillatory motion simultaneously estimated. The objective of this study is to develop and show the feasibility of a novel fast beamforming algorithm for image reconstruction using GPU-based sparse-matrix operation with real-time feedback. In this study, the algorithm was implemented onto a fully integrated, clinically relevant HMIFU system. A single divergent transmit beam was used while fast beamforming was implemented using a GPU-based delay-and-sum method and a sparse-matrix operation. Axial HMI displacements were then estimated from the RF signals using a 1-D normalized cross-correlation method and streamed to a graphic user interface with frame rates up to 15 Hz, a 100-fold increase compared to conventional CPU-based processing. The real-time feedback rate does not require interrupting the HIFU treatment. Results in phantom experiments showed reproducible HMI images and monitoring of 22 in vitro HIFU treatments using the new 2-D system demonstrated reproducible displacement imaging, and monitoring of 22 in vitro HIFU treatments using the new 2-D system showed a consistent average focal displacement decrease of 46.7 ±14.6% during lesion formation. Complementary focal temperature monitoring also indicated an average rate of displacement increase and decrease with focal temperature at 0.84±1.15%/(°)C, and 2.03±0.93%/(°)C , respectively. These results reinforce the HMIFU capability of estimating and monitoring stiffness related changes in real time. Current ongoing studies include clinical translation of the presented system for monitoring of HIFU treatment for breast and pancreatic tumor applications.
Turbo-SMT: Parallel Coupled Sparse Matrix-Tensor Factorizations and Applications
Papalexakis, Evangelos E.; Faloutsos, Christos; Mitchell, Tom M.; Talukdar, Partha Pratim; Sidiropoulos, Nicholas D.; Murphy, Brian
2016-01-01
How can we correlate the neural activity in the human brain as it responds to typed words, with properties of these terms (like ’edible’, ’fits in hand’)? In short, we want to find latent variables, that jointly explain both the brain activity, as well as the behavioral responses. This is one of many settings of the Coupled Matrix-Tensor Factorization (CMTF) problem. Can we enhance any CMTF solver, so that it can operate on potentially very large datasets that may not fit in main memory? We introduce Turbo-SMT, a meta-method capable of doing exactly that: it boosts the performance of any CMTF algorithm, produces sparse and interpretable solutions, and parallelizes any CMTF algorithm, producing sparse and interpretable solutions (up to 65 fold). Additionally, we improve upon ALS, the work-horse algorithm for CMTF, with respect to efficiency and robustness to missing values. We apply Turbo-SMT to BrainQ, a dataset consisting of a (nouns, brain voxels, human subjects) tensor and a (nouns, properties) matrix, with coupling along the nouns dimension. Turbo-SMT is able to find meaningful latent variables, as well as to predict brain activity with competitive accuracy. Finally, we demonstrate the generality of Turbo-SMT, by applying it on a Facebook dataset (users, ’friends’, wall-postings); there, Turbo-SMT spots spammer-like anomalies. PMID:27672406
Sparse subspace clustering for data with missing entries and high-rank matrix completion.
Fan, Jicong; Chow, Tommy W S
2017-09-01
Many methods have recently been proposed for subspace clustering, but they are often unable to handle incomplete data because of missing entries. Using matrix completion methods to recover missing entries is a common way to solve the problem. Conventional matrix completion methods require that the matrix should be of low-rank intrinsically, but most matrices are of high-rank or even full-rank in practice, especially when the number of subspaces is large. In this paper, a new method called Sparse Representation with Missing Entries and Matrix Completion is proposed to solve the problems of incomplete-data subspace clustering and high-rank matrix completion. The proposed algorithm alternately computes the matrix of sparse representation coefficients and recovers the missing entries of a data matrix. The proposed algorithm recovers missing entries through minimizing the representation coefficients, representation errors, and matrix rank. Thorough experimental study and comparative analysis based on synthetic data and natural images were conducted. The presented results demonstrate that the proposed algorithm is more effective in subspace clustering and matrix completion compared with other existing methods. Copyright © 2017 Elsevier Ltd. All rights reserved.
FPGA implementation of sparse matrix algorithm for information retrieval
NASA Astrophysics Data System (ADS)
Bojanic, Slobodan; Jevtic, Ruzica; Nieto-Taladriz, Octavio
2005-06-01
Information text data retrieval requires a tremendous amount of processing time because of the size of the data and the complexity of information retrieval algorithms. In this paper the solution to this problem is proposed via hardware supported information retrieval algorithms. Reconfigurable computing may adopt frequent hardware modifications through its tailorable hardware and exploits parallelism for a given application through reconfigurable and flexible hardware units. The degree of the parallelism can be tuned for data. In this work we implemented standard BLAS (basic linear algebra subprogram) sparse matrix algorithm named Compressed Sparse Row (CSR) that is showed to be more efficient in terms of storage space requirement and query-processing timing over the other sparse matrix algorithms for information retrieval application. Although inverted index algorithm is treated as the de facto standard for information retrieval for years, an alternative approach to store the index of text collection in a sparse matrix structure gains more attention. This approach performs query processing using sparse matrix-vector multiplication and due to parallelization achieves a substantial efficiency over the sequential inverted index. The parallel implementations of information retrieval kernel are presented in this work targeting the Virtex II Field Programmable Gate Arrays (FPGAs) board from Xilinx. A recent development in scientific applications is the use of FPGA to achieve high performance results. Computational results are compared to implementations on other platforms. The design achieves a high level of parallelism for the overall function while retaining highly optimised hardware within processing unit.
Uniform Recovery Bounds for Structured Random Matrices in Corrupted Compressed Sensing
NASA Astrophysics Data System (ADS)
Zhang, Peng; Gan, Lu; Ling, Cong; Sun, Sumei
2018-04-01
We study the problem of recovering an $s$-sparse signal $\\mathbf{x}^{\\star}\\in\\mathbb{C}^n$ from corrupted measurements $\\mathbf{y} = \\mathbf{A}\\mathbf{x}^{\\star}+\\mathbf{z}^{\\star}+\\mathbf{w}$, where $\\mathbf{z}^{\\star}\\in\\mathbb{C}^m$ is a $k$-sparse corruption vector whose nonzero entries may be arbitrarily large and $\\mathbf{w}\\in\\mathbb{C}^m$ is a dense noise with bounded energy. The aim is to exactly and stably recover the sparse signal with tractable optimization programs. In this paper, we prove the uniform recovery guarantee of this problem for two classes of structured sensing matrices. The first class can be expressed as the product of a unit-norm tight frame (UTF), a random diagonal matrix and a bounded columnwise orthonormal matrix (e.g., partial random circulant matrix). When the UTF is bounded (i.e. $\\mu(\\mathbf{U})\\sim1/\\sqrt{m}$), we prove that with high probability, one can recover an $s$-sparse signal exactly and stably by $l_1$ minimization programs even if the measurements are corrupted by a sparse vector, provided $m = \\mathcal{O}(s \\log^2 s \\log^2 n)$ and the sparsity level $k$ of the corruption is a constant fraction of the total number of measurements. The second class considers randomly sub-sampled orthogonal matrix (e.g., random Fourier matrix). We prove the uniform recovery guarantee provided that the corruption is sparse on certain sparsifying domain. Numerous simulation results are also presented to verify and complement the theoretical results.
Convergence Speed of a Dynamical System for Sparse Recovery
NASA Astrophysics Data System (ADS)
Balavoine, Aurele; Rozell, Christopher J.; Romberg, Justin
2013-09-01
This paper studies the convergence rate of a continuous-time dynamical system for L1-minimization, known as the Locally Competitive Algorithm (LCA). Solving L1-minimization} problems efficiently and rapidly is of great interest to the signal processing community, as these programs have been shown to recover sparse solutions to underdetermined systems of linear equations and come with strong performance guarantees. The LCA under study differs from the typical L1 solver in that it operates in continuous time: instead of being specified by discrete iterations, it evolves according to a system of nonlinear ordinary differential equations. The LCA is constructed from simple components, giving it the potential to be implemented as a large-scale analog circuit. The goal of this paper is to give guarantees on the convergence time of the LCA system. To do so, we analyze how the LCA evolves as it is recovering a sparse signal from underdetermined measurements. We show that under appropriate conditions on the measurement matrix and the problem parameters, the path the LCA follows can be described as a sequence of linear differential equations, each with a small number of active variables. This allows us to relate the convergence time of the system to the restricted isometry constant of the matrix. Interesting parallels to sparse-recovery digital solvers emerge from this study. Our analysis covers both the noisy and noiseless settings and is supported by simulation results.
Matched field localization based on CS-MUSIC algorithm
NASA Astrophysics Data System (ADS)
Guo, Shuangle; Tang, Ruichun; Peng, Linhui; Ji, Xiaopeng
2016-04-01
The problem caused by shortness or excessiveness of snapshots and by coherent sources in underwater acoustic positioning is considered. A matched field localization algorithm based on CS-MUSIC (Compressive Sensing Multiple Signal Classification) is proposed based on the sparse mathematical model of the underwater positioning. The signal matrix is calculated through the SVD (Singular Value Decomposition) of the observation matrix. The observation matrix in the sparse mathematical model is replaced by the signal matrix, and a new concise sparse mathematical model is obtained, which means not only the scale of the localization problem but also the noise level is reduced; then the new sparse mathematical model is solved by the CS-MUSIC algorithm which is a combination of CS (Compressive Sensing) method and MUSIC (Multiple Signal Classification) method. The algorithm proposed in this paper can overcome effectively the difficulties caused by correlated sources and shortness of snapshots, and it can also reduce the time complexity and noise level of the localization problem by using the SVD of the observation matrix when the number of snapshots is large, which will be proved in this paper.
1-norm support vector novelty detection and its sparseness.
Zhang, Li; Zhou, WeiDa
2013-12-01
This paper proposes a 1-norm support vector novelty detection (SVND) method and discusses its sparseness. 1-norm SVND is formulated as a linear programming problem and uses two techniques for inducing sparseness, or the 1-norm regularization and the hinge loss function. We also find two upper bounds on the sparseness of 1-norm SVND, or exact support vector (ESV) and kernel Gram matrix rank bounds. The ESV bound indicates that 1-norm SVND has a sparser representation model than SVND. The kernel Gram matrix rank bound can loosely estimate the sparseness of 1-norm SVND. Experimental results show that 1-norm SVND is feasible and effective. Copyright © 2013 Elsevier Ltd. All rights reserved.
A sparse matrix algorithm on the Boolean vector machine
NASA Technical Reports Server (NTRS)
Wagner, Robert A.; Patrick, Merrell L.
1988-01-01
VLSI technology is being used to implement a prototype Boolean Vector Machine (BVM), which is a large network of very small processors with equally small memories that operate in SIMD mode; these use bit-serial arithmetic, and communicate via cube-connected cycles network. The BVM's bit-serial arithmetic and the small memories of individual processors are noted to compromise the system's effectiveness in large numerical problem applications. Attention is presently given to the implementation of a basic matrix-vector iteration algorithm for space matrices of the BVM, in order to generate over 1 billion useful floating-point operations/sec for this iteration algorithm. The algorithm is expressed in a novel language designated 'BVM'.
Sparse Covariance Matrix Estimation With Eigenvalue Constraints
LIU, Han; WANG, Lie; ZHAO, Tuo
2014-01-01
We propose a new approach for estimating high-dimensional, positive-definite covariance matrices. Our method extends the generalized thresholding operator by adding an explicit eigenvalue constraint. The estimated covariance matrix simultaneously achieves sparsity and positive definiteness. The estimator is rate optimal in the minimax sense and we develop an efficient iterative soft-thresholding and projection algorithm based on the alternating direction method of multipliers. Empirically, we conduct thorough numerical experiments on simulated datasets as well as real data examples to illustrate the usefulness of our method. Supplementary materials for the article are available online. PMID:25620866
NASA Technical Reports Server (NTRS)
Szyld, D. B.
1984-01-01
A brief description of the Model of the World Economy implemented at the Institute for Economic Analysis is presented, together with our experience in converting the software to vector code. For each time period, the model is reduced to a linear system of over 2000 variables. The matrix of coefficients has a bordered block diagonal structure, and we show how some of the matrix operations can be carried out on all diagonal blocks at once.
Exploring Deep Learning and Sparse Matrix Format Selection
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhao, Y.; Liao, C.; Shen, X.
We proposed to explore the use of Deep Neural Networks (DNN) for addressing the longstanding barriers. The recent rapid progress of DNN technology has created a large impact in many fields, which has significantly improved the prediction accuracy over traditional machine learning techniques in image classifications, speech recognitions, machine translations, and so on. To some degree, these tasks resemble the decision makings in many HPC tasks, including the aforementioned format selection for SpMV and linear solver selection. For instance, sparse matrix format selection is akin to image classification—such as, to tell whether an image contains a dog or a cat;more » in both problems, the right decisions are primarily determined by the spatial patterns of the elements in an input. For image classification, the patterns are of pixels, and for sparse matrix format selection, they are of non-zero elements. DNN could be naturally applied if we regard a sparse matrix as an image and the format selection or solver selection as classification problems.« less
Accelerating scientific computations with mixed precision algorithms
NASA Astrophysics Data System (ADS)
Baboulin, Marc; Buttari, Alfredo; Dongarra, Jack; Kurzak, Jakub; Langou, Julie; Langou, Julien; Luszczek, Piotr; Tomov, Stanimire
2009-12-01
On modern architectures, the performance of 32-bit operations is often at least twice as fast as the performance of 64-bit operations. By using a combination of 32-bit and 64-bit floating point arithmetic, the performance of many dense and sparse linear algebra algorithms can be significantly enhanced while maintaining the 64-bit accuracy of the resulting solution. The approach presented here can apply not only to conventional processors but also to other technologies such as Field Programmable Gate Arrays (FPGA), Graphical Processing Units (GPU), and the STI Cell BE processor. Results on modern processor architectures and the STI Cell BE are presented. Program summaryProgram title: ITER-REF Catalogue identifier: AECO_v1_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/AECO_v1_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html No. of lines in distributed program, including test data, etc.: 7211 No. of bytes in distributed program, including test data, etc.: 41 862 Distribution format: tar.gz Programming language: FORTRAN 77 Computer: desktop, server Operating system: Unix/Linux RAM: 512 Mbytes Classification: 4.8 External routines: BLAS (optional) Nature of problem: On modern architectures, the performance of 32-bit operations is often at least twice as fast as the performance of 64-bit operations. By using a combination of 32-bit and 64-bit floating point arithmetic, the performance of many dense and sparse linear algebra algorithms can be significantly enhanced while maintaining the 64-bit accuracy of the resulting solution. Solution method: Mixed precision algorithms stem from the observation that, in many cases, a single precision solution of a problem can be refined to the point where double precision accuracy is achieved. A common approach to the solution of linear systems, either dense or sparse, is to perform the LU factorization of the coefficient matrix using Gaussian elimination. First, the coefficient matrix A is factored into the product of a lower triangular matrix L and an upper triangular matrix U. Partial row pivoting is in general used to improve numerical stability resulting in a factorization PA=LU, where P is a permutation matrix. The solution for the system is achieved by first solving Ly=Pb (forward substitution) and then solving Ux=y (backward substitution). Due to round-off errors, the computed solution, x, carries a numerical error magnified by the condition number of the coefficient matrix A. In order to improve the computed solution, an iterative process can be applied, which produces a correction to the computed solution at each iteration, which then yields the method that is commonly known as the iterative refinement algorithm. Provided that the system is not too ill-conditioned, the algorithm produces a solution correct to the working precision. Running time: seconds/minutes
Weighted low-rank sparse model via nuclear norm minimization for bearing fault detection
NASA Astrophysics Data System (ADS)
Du, Zhaohui; Chen, Xuefeng; Zhang, Han; Yang, Boyuan; Zhai, Zhi; Yan, Ruqiang
2017-07-01
It is a fundamental task in the machine fault diagnosis community to detect impulsive signatures generated by the localized faults of bearings. The main goal of this paper is to exploit the low-rank physical structure of periodic impulsive features and further establish a weighted low-rank sparse model for bearing fault detection. The proposed model mainly consists of three basic components: an adaptive partition window, a nuclear norm regularization and a weighted sequence. Firstly, due to the periodic repetition mechanism of impulsive feature, an adaptive partition window could be designed to transform the impulsive feature into a data matrix. The highlight of partition window is to accumulate all local feature information and align them. Then, all columns of the data matrix share similar waveforms and a core physical phenomenon arises, i.e., these singular values of the data matrix demonstrates a sparse distribution pattern. Therefore, a nuclear norm regularization is enforced to capture that sparse prior. However, the nuclear norm regularization treats all singular values equally and thus ignores one basic fact that larger singular values have more information volume of impulsive features and should be preserved as much as possible. Therefore, a weighted sequence with adaptively tuning weights inversely proportional to singular amplitude is adopted to guarantee the distribution consistence of large singular values. On the other hand, the proposed model is difficult to solve due to its non-convexity and thus a new algorithm is developed to search one satisfying stationary solution through alternatively implementing one proximal operator operation and least-square fitting. Moreover, the sensitivity analysis and selection principles of algorithmic parameters are comprehensively investigated through a set of numerical experiments, which shows that the proposed method is robust and only has a few adjustable parameters. Lastly, the proposed model is applied to the wind turbine (WT) bearing fault detection and its effectiveness is sufficiently verified. Compared with the current popular bearing fault diagnosis techniques, wavelet analysis and spectral kurtosis, our model achieves a higher diagnostic accuracy.
Communication Optimal Parallel Multiplication of Sparse Random Matrices
2013-02-21
Definition 2.1), and (2) the algorithm is sparsity- independent, where the computation is statically partitioned to processors independent of the sparsity...struc- ture of the input matrices (see Definition 2.5). The second assumption applies to nearly all existing al- gorithms for general sparse matrix-matrix...where A and B are n× n ER(d) matrices: Definition 2.1 An ER(d) matrix is an adjacency matrix of an Erdős-Rényi graph with parameters n and d/n. That
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chow, Edmond
Solving sparse problems is at the core of many DOE computational science applications. We focus on the challenge of developing sparse algorithms that can fully exploit the parallelism in extreme-scale computing systems, in particular systems with massive numbers of cores per node. Our approach is to express a sparse matrix factorization as a large number of bilinear constraint equations, and then solving these equations via an asynchronous iterative method. The unknowns in these equations are the matrix entries of the factorization that is desired.
Noniterative MAP reconstruction using sparse matrix representations.
Cao, Guangzhi; Bouman, Charles A; Webb, Kevin J
2009-09-01
We present a method for noniterative maximum a posteriori (MAP) tomographic reconstruction which is based on the use of sparse matrix representations. Our approach is to precompute and store the inverse matrix required for MAP reconstruction. This approach has generally not been used in the past because the inverse matrix is typically large and fully populated (i.e., not sparse). In order to overcome this problem, we introduce two new ideas. The first idea is a novel theory for the lossy source coding of matrix transformations which we refer to as matrix source coding. This theory is based on a distortion metric that reflects the distortions produced in the final matrix-vector product, rather than the distortions in the coded matrix itself. The resulting algorithms are shown to require orthonormal transformations of both the measurement data and the matrix rows and columns before quantization and coding. The second idea is a method for efficiently storing and computing the required orthonormal transformations, which we call a sparse-matrix transform (SMT). The SMT is a generalization of the classical FFT in that it uses butterflies to compute an orthonormal transform; but unlike an FFT, the SMT uses the butterflies in an irregular pattern, and is numerically designed to best approximate the desired transforms. We demonstrate the potential of the noniterative MAP reconstruction with examples from optical tomography. The method requires offline computation to encode the inverse transform. However, once these offline computations are completed, the noniterative MAP algorithm is shown to reduce both storage and computation by well over two orders of magnitude, as compared to a linear iterative reconstruction methods.
Acceleration of GPU-based Krylov solvers via data transfer reduction
Anzt, Hartwig; Tomov, Stanimire; Luszczek, Piotr; ...
2015-04-08
Krylov subspace iterative solvers are often the method of choice when solving large sparse linear systems. At the same time, hardware accelerators such as graphics processing units continue to offer significant floating point performance gains for matrix and vector computations through easy-to-use libraries of computational kernels. However, as these libraries are usually composed of a well optimized but limited set of linear algebra operations, applications that use them often fail to reduce certain data communications, and hence fail to leverage the full potential of the accelerator. In this study, we target the acceleration of Krylov subspace iterative methods for graphicsmore » processing units, and in particular the Biconjugate Gradient Stabilized solver that significant improvement can be achieved by reformulating the method to reduce data-communications through application-specific kernels instead of using the generic BLAS kernels, e.g. as provided by NVIDIA’s cuBLAS library, and by designing a graphics processing unit specific sparse matrix-vector product kernel that is able to more efficiently use the graphics processing unit’s computing power. Furthermore, we derive a model estimating the performance improvement, and use experimental data to validate the expected runtime savings. Finally, considering that the derived implementation achieves significantly higher performance, we assert that similar optimizations addressing algorithm structure, as well as sparse matrix-vector, are crucial for the subsequent development of high-performance graphics processing units accelerated Krylov subspace iterative methods.« less
High-performance sparse matrix-matrix products on Intel KNL and multicore architectures
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nagasaka, Y; Matsuoka, S; Azad, A
Sparse matrix-matrix multiplication (SpGEMM) is a computational primitive that is widely used in areas ranging from traditional numerical applications to recent big data analysis and machine learning. Although many SpGEMM algorithms have been proposed, hardware specific optimizations for multi- and many-core processors are lacking and a detailed analysis of their performance under various use cases and matrices is not available. We firstly identify and mitigate multiple bottlenecks with memory management and thread scheduling on Intel Xeon Phi (Knights Landing or KNL). Specifically targeting multi- and many-core processors, we develop a hash-table-based algorithm and optimize a heap-based shared-memory SpGEMM algorithm. Wemore » examine their performance together with other publicly available codes. Different from the literature, our evaluation also includes use cases that are representative of real graph algorithms, such as multi-source breadth-first search or triangle counting. Our hash-table and heap-based algorithms are showing significant speedups from libraries in the majority of the cases while different algorithms dominate the other scenarios with different matrix size, sparsity, compression factor and operation type. We wrap up in-depth evaluation results and make a recipe to give the best SpGEMM algorithm for target scenario. A critical finding is that hash-table-based SpGEMM gets a significant performance boost if the nonzeros are not required to be sorted within each row of the output matrix.« less
Sparse matrix methods based on orthogonality and conjugacy
NASA Technical Reports Server (NTRS)
Lawson, C. L.
1973-01-01
A matrix having a high percentage of zero elements is called spares. In the solution of systems of linear equations or linear least squares problems involving large sparse matrices, significant saving of computer cost can be achieved by taking advantage of the sparsity. The conjugate gradient algorithm and a set of related algorithms are described.
Thakur, Anil S.; Robin, Gautier; Guncar, Gregor; Saunders, Neil F. W.; Newman, Janet; Martin, Jennifer L.; Kobe, Bostjan
2007-01-01
Background Crystallization is a major bottleneck in the process of macromolecular structure determination by X-ray crystallography. Successful crystallization requires the formation of nuclei and their subsequent growth to crystals of suitable size. Crystal growth generally occurs spontaneously in a supersaturated solution as a result of homogenous nucleation. However, in a typical sparse matrix screening experiment, precipitant and protein concentration are not sampled extensively, and supersaturation conditions suitable for nucleation are often missed. Methodology/Principal Findings We tested the effect of nine potential heterogenous nucleating agents on crystallization of ten test proteins in a sparse matrix screen. Several nucleating agents induced crystal formation under conditions where no crystallization occurred in the absence of the nucleating agent. Four nucleating agents: dried seaweed; horse hair; cellulose and hydroxyapatite, had a considerable overall positive effect on crystallization success. This effect was further enhanced when these nucleating agents were used in combination with each other. Conclusions/Significance Our results suggest that the addition of heterogeneous nucleating agents increases the chances of crystal formation when using sparse matrix screens. PMID:17971854
Three dimensional iterative beam propagation method for optical waveguide devices
NASA Astrophysics Data System (ADS)
Ma, Changbao; Van Keuren, Edward
2006-10-01
The finite difference beam propagation method (FD-BPM) is an effective model for simulating a wide range of optical waveguide structures. The classical FD-BPMs are based on the Crank-Nicholson scheme, and in tridiagonal form can be solved using the Thomas method. We present a different type of algorithm for 3-D structures. In this algorithm, the wave equation is formulated into a large sparse matrix equation which can be solved using iterative methods. The simulation window shifting scheme and threshold technique introduced in our earlier work are utilized to overcome the convergence problem of iterative methods for large sparse matrix equation and wide-angle simulations. This method enables us to develop higher-order 3-D wide-angle (WA-) BPMs based on Pade approximant operators and the multistep method, which are commonly used in WA-BPMs for 2-D structures. Simulations using the new methods will be compared to the analytical results to assure its effectiveness and applicability.
A path-oriented matrix-based knowledge representation system
NASA Technical Reports Server (NTRS)
Feyock, Stefan; Karamouzis, Stamos T.
1993-01-01
Experience has shown that designing a good representation is often the key to turning hard problems into simple ones. Most AI (Artificial Intelligence) search/representation techniques are oriented toward an infinite domain of objects and arbitrary relations among them. In reality much of what needs to be represented in AI can be expressed using a finite domain and unary or binary predicates. Well-known vector- and matrix-based representations can efficiently represent finite domains and unary/binary predicates, and allow effective extraction of path information by generalized transitive closure/path matrix computations. In order to avoid space limitations a set of abstract sparse matrix data types was developed along with a set of operations on them. This representation forms the basis of an intelligent information system for representing and manipulating relational data.
NASA Technical Reports Server (NTRS)
Buchholz, Peter; Ciardo, Gianfranco; Donatelli, Susanna; Kemper, Peter
1997-01-01
We present a systematic discussion of algorithms to multiply a vector by a matrix expressed as the Kronecker product of sparse matrices, extending previous work in a unified notational framework. Then, we use our results to define new algorithms for the solution of large structured Markov models. In addition to a comprehensive overview of existing approaches, we give new results with respect to: (1) managing certain types of state-dependent behavior without incurring extra cost; (2) supporting both Jacobi-style and Gauss-Seidel-style methods by appropriate multiplication algorithms; (3) speeding up algorithms that consider probability vectors of size equal to the "actual" state space instead of the "potential" state space.
Face recognition based on two-dimensional discriminant sparse preserving projection
NASA Astrophysics Data System (ADS)
Zhang, Dawei; Zhu, Shanan
2018-04-01
In this paper, a supervised dimensionality reduction algorithm named two-dimensional discriminant sparse preserving projection (2DDSPP) is proposed for face recognition. In order to accurately model manifold structure of data, 2DDSPP constructs within-class affinity graph and between-class affinity graph by the constrained least squares (LS) and l1 norm minimization problem, respectively. Based on directly operating on image matrix, 2DDSPP integrates graph embedding (GE) with Fisher criterion. The obtained projection subspace preserves within-class neighborhood geometry structure of samples, while keeping away samples from different classes. The experimental results on the PIE and AR face databases show that 2DDSPP can achieve better recognition performance.
A general parallel sparse-blocked matrix multiply for linear scaling SCF theory
NASA Astrophysics Data System (ADS)
Challacombe, Matt
2000-06-01
A general approach to the parallel sparse-blocked matrix-matrix multiply is developed in the context of linear scaling self-consistent-field (SCF) theory. The data-parallel message passing method uses non-blocking communication to overlap computation and communication. The space filling curve heuristic is used to achieve data locality for sparse matrix elements that decay with “separation”. Load balance is achieved by solving the bin packing problem for blocks with variable size.With this new method as the kernel, parallel performance of the simplified density matrix minimization (SDMM) for solution of the SCF equations is investigated for RHF/6-31G ∗∗ water clusters and RHF/3-21G estane globules. Sustained rates above 5.7 GFLOPS for the SDMM have been achieved for (H 2 O) 200 with 95 Origin 2000 processors. Scalability is found to be limited by load imbalance, which increases with decreasing granularity, due primarily to the inhomogeneous distribution of variable block sizes.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pinski, Peter; Riplinger, Christoph; Neese, Frank, E-mail: evaleev@vt.edu, E-mail: frank.neese@cec.mpg.de
2015-07-21
In this work, a systematic infrastructure is described that formalizes concepts implicit in previous work and greatly simplifies computer implementation of reduced-scaling electronic structure methods. The key concept is sparse representation of tensors using chains of sparse maps between two index sets. Sparse map representation can be viewed as a generalization of compressed sparse row, a common representation of a sparse matrix, to tensor data. By combining few elementary operations on sparse maps (inversion, chaining, intersection, etc.), complex algorithms can be developed, illustrated here by a linear-scaling transformation of three-center Coulomb integrals based on our compact code library that implementsmore » sparse maps and operations on them. The sparsity of the three-center integrals arises from spatial locality of the basis functions and domain density fitting approximation. A novel feature of our approach is the use of differential overlap integrals computed in linear-scaling fashion for screening products of basis functions. Finally, a robust linear scaling domain based local pair natural orbital second-order Möller-Plesset (DLPNO-MP2) method is described based on the sparse map infrastructure that only depends on a minimal number of cutoff parameters that can be systematically tightened to approach 100% of the canonical MP2 correlation energy. With default truncation thresholds, DLPNO-MP2 recovers more than 99.9% of the canonical resolution of the identity MP2 (RI-MP2) energy while still showing a very early crossover with respect to the computational effort. Based on extensive benchmark calculations, relative energies are reproduced with an error of typically <0.2 kcal/mol. The efficiency of the local MP2 (LMP2) method can be drastically improved by carrying out the LMP2 iterations in a basis of pair natural orbitals. While the present work focuses on local electron correlation, it is of much broader applicability to computation with sparse tensors in quantum chemistry and beyond.« less
Pinski, Peter; Riplinger, Christoph; Valeev, Edward F; Neese, Frank
2015-07-21
In this work, a systematic infrastructure is described that formalizes concepts implicit in previous work and greatly simplifies computer implementation of reduced-scaling electronic structure methods. The key concept is sparse representation of tensors using chains of sparse maps between two index sets. Sparse map representation can be viewed as a generalization of compressed sparse row, a common representation of a sparse matrix, to tensor data. By combining few elementary operations on sparse maps (inversion, chaining, intersection, etc.), complex algorithms can be developed, illustrated here by a linear-scaling transformation of three-center Coulomb integrals based on our compact code library that implements sparse maps and operations on them. The sparsity of the three-center integrals arises from spatial locality of the basis functions and domain density fitting approximation. A novel feature of our approach is the use of differential overlap integrals computed in linear-scaling fashion for screening products of basis functions. Finally, a robust linear scaling domain based local pair natural orbital second-order Möller-Plesset (DLPNO-MP2) method is described based on the sparse map infrastructure that only depends on a minimal number of cutoff parameters that can be systematically tightened to approach 100% of the canonical MP2 correlation energy. With default truncation thresholds, DLPNO-MP2 recovers more than 99.9% of the canonical resolution of the identity MP2 (RI-MP2) energy while still showing a very early crossover with respect to the computational effort. Based on extensive benchmark calculations, relative energies are reproduced with an error of typically <0.2 kcal/mol. The efficiency of the local MP2 (LMP2) method can be drastically improved by carrying out the LMP2 iterations in a basis of pair natural orbitals. While the present work focuses on local electron correlation, it is of much broader applicability to computation with sparse tensors in quantum chemistry and beyond.
NASA Astrophysics Data System (ADS)
Bustamam, A.; Ulul, E. D.; Hura, H. F. A.; Siswantining, T.
2017-07-01
Hierarchical clustering is one of effective methods in creating a phylogenetic tree based on the distance matrix between DNA (deoxyribonucleic acid) sequences. One of the well-known methods to calculate the distance matrix is k-mer method. Generally, k-mer is more efficient than some distance matrix calculation techniques. The steps of k-mer method are started from creating k-mer sparse matrix, and followed by creating k-mer singular value vectors. The last step is computing the distance amongst vectors. In this paper, we analyze the sequences of MERS-CoV (Middle East Respiratory Syndrome - Coronavirus) DNA by implementing hierarchical clustering using k-mer sparse matrix in order to perform the phylogenetic analysis. Our results show that the ancestor of our MERS-CoV is coming from Egypt. Moreover, we found that the MERS-CoV infection that occurs in one country may not necessarily come from the same country of origin. This suggests that the process of MERS-CoV mutation might not only be influenced by geographical factor.
Hou, Gary Y.; Provost, Jean; Grondin, Julien; Wang, Shutao; Marquet, Fabrice; Bunting, Ethan; Konofagou, Elisa E.
2015-01-01
Harmonic Motion Imaging for Focused Ultrasound (HMIFU) is a recently developed High-Intensity Focused Ultrasound (HIFU) treatment monitoring method. HMIFU utilizes an Amplitude-Modulated (fAM = 25 Hz) HIFU beam to induce a localized focal oscillatory motion, which is simultaneously estimated and imaged by confocally-aligned imaging transducer. HMIFU feasibilities have been previously shown in silico, in vitro, and in vivo in 1-D or 2-D monitoring of HIFU treatment. The objective of this study is to develop and show the feasibility of a novel fast beamforming algorithm for image reconstruction using GPU-based sparse-matrix operation with real-time feedback. In this study, the algorithm was implemented onto a fully integrated, clinically relevant HMIFU system composed of a 93-element HIFU transducer (fcenter = 4.5MHz) and coaxially-aligned 64-element phased array (fcenter = 2.5MHz) for displacement excitation and motion estimation, respectively. A single transmit beam with divergent beam transmit was used while fast beamforming was implemented using a GPU-based delay-and-sum method and a sparse-matrix operation. Axial HMI displacements were then estimated from the RF signals using a 1-D normalized cross-correlation method and streamed to a graphic user interface. The present work developed and implemented a sparse matrix beamforming onto a fully-integrated, clinically relevant system, which can stream displacement images up to 15 Hz using a GPU-based processing, an increase of 100 fold in rate of streaming displacement images compared to conventional CPU-based conventional beamforming and reconstruction processing. The achieved feedback rate is also currently the fastest and only approach that does not require interrupting the HIFU treatment amongst the acoustic radiation force based HIFU imaging techniques. Results in phantom experiments showed reproducible displacement imaging, and monitoring of twenty two in vitro HIFU treatments using the new 2D system showed a consistent average focal displacement decrease of 46.7±14.6% during lesion formation. Complementary focal temperature monitoring also indicated an average rate of displacement increase and decrease with focal temperature at 0.84±1.15 %/ °C, and 2.03± 0.93%/ °C, respectively. These results reinforce the HMIFU capability of estimating and monitoring stiffness related changes in real time. Current ongoing studies include clinical translation of the presented system for monitoring of HIFU treatment for breast and pancreatic tumor applications. PMID:24960528
Sparse matrix-vector multiplication on network-on-chip
NASA Astrophysics Data System (ADS)
Sun, C.-C.; Götze, J.; Jheng, H.-Y.; Ruan, S.-J.
2010-12-01
In this paper, we present an idea for performing matrix-vector multiplication by using Network-on-Chip (NoC) architecture. In traditional IC design on-chip communications have been designed with dedicated point-to-point interconnections. Therefore, regular local data transfer is the major concept of many parallel implementations. However, when dealing with the parallel implementation of sparse matrix-vector multiplication (SMVM), which is the main step of all iterative algorithms for solving systems of linear equation, the required data transfers depend on the sparsity structure of the matrix and can be extremely irregular. Using the NoC architecture makes it possible to deal with arbitrary structure of the data transfers; i.e. with the irregular structure of the sparse matrices. So far, we have already implemented the proposed SMVM-NoC architecture with the size 4×4 and 5×5 in IEEE 754 single float point precision using FPGA.
Fast iterative image reconstruction using sparse matrix factorization with GPU acceleration
NASA Astrophysics Data System (ADS)
Zhou, Jian; Qi, Jinyi
2011-03-01
Statistically based iterative approaches for image reconstruction have gained much attention in medical imaging. An accurate system matrix that defines the mapping from the image space to the data space is the key to high-resolution image reconstruction. However, an accurate system matrix is often associated with high computational cost and huge storage requirement. Here we present a method to address this problem by using sparse matrix factorization and parallel computing on a graphic processing unit (GPU).We factor the accurate system matrix into three sparse matrices: a sinogram blurring matrix, a geometric projection matrix, and an image blurring matrix. The sinogram blurring matrix models the detector response. The geometric projection matrix is based on a simple line integral model. The image blurring matrix is to compensate for the line-of-response (LOR) degradation due to the simplified geometric projection matrix. The geometric projection matrix is precomputed, while the sinogram and image blurring matrices are estimated by minimizing the difference between the factored system matrix and the original system matrix. The resulting factored system matrix has much less number of nonzero elements than the original system matrix and thus substantially reduces the storage and computation cost. The smaller size also allows an efficient implement of the forward and back projectors on GPUs, which have limited amount of memory. Our simulation studies show that the proposed method can dramatically reduce the computation cost of high-resolution iterative image reconstruction. The proposed technique is applicable to image reconstruction for different imaging modalities, including x-ray CT, PET, and SPECT.
Preconditioned conjugate residual methods for the solution of spectral equations
NASA Technical Reports Server (NTRS)
Wong, Y. S.; Zang, T. A.; Hussaini, M. Y.
1986-01-01
Conjugate residual methods for the solution of spectral equations are described. An inexact finite-difference operator is introduced as a preconditioner in the iterative procedures. Application of these techniques is limited to problems for which the symmetric part of the coefficient matrix is positive definite. Although the spectral equation is a very ill-conditioned and full matrix problem, the computational effort of the present iterative methods for solving such a system is comparable to that for the sparse matrix equations obtained from the application of either finite-difference or finite-element methods to the same problems. Numerical experiments are shown for a self-adjoint elliptic partial differential equation with Dirichlet boundary conditions, and comparison with other solution procedures for spectral equations is presented.
Wavelet-like bases for thin-wire integral equations in electromagnetics
NASA Astrophysics Data System (ADS)
Francomano, E.; Tortorici, A.; Toscano, E.; Ala, G.; Viola, F.
2005-03-01
In this paper, wavelets are used in solving, by the method of moments, a modified version of the thin-wire electric field integral equation, in frequency domain. The time domain electromagnetic quantities, are obtained by using the inverse discrete fast Fourier transform. The retarded scalar electric and vector magnetic potentials are employed in order to obtain the integral formulation. The discretized model generated by applying the direct method of moments via point-matching procedure, results in a linear system with a dense matrix which have to be solved for each frequency of the Fourier spectrum of the time domain impressed source. Therefore, orthogonal wavelet-like basis transform is used to sparsify the moment matrix. In particular, dyadic and M-band wavelet transforms have been adopted, so generating different sparse matrix structures. This leads to an efficient solution in solving the resulting sparse matrix equation. Moreover, a wavelet preconditioner is used to accelerate the convergence rate of the iterative solver employed. These numerical features are used in analyzing the transient behavior of a lightning protection system. In particular, the transient performance of the earth termination system of a lightning protection system or of the earth electrode of an electric power substation, during its operation is focused. The numerical results, obtained by running a complex structure, are discussed and the features of the used method are underlined.
NASA Astrophysics Data System (ADS)
Chang, Yong; Zi, Yanyang; Zhao, Jiyuan; Yang, Zhe; He, Wangpeng; Sun, Hailiang
2017-03-01
In guided wave pipeline inspection, echoes reflected from closely spaced reflectors generally overlap, meaning useful information is lost. To solve the overlapping problem, sparse deconvolution methods have been developed in the past decade. However, conventional sparse deconvolution methods have limitations in handling guided wave signals, because the input signal is directly used as the prototype of the convolution matrix, without considering the waveform change caused by the dispersion properties of the guided wave. In this paper, an adaptive sparse deconvolution (ASD) method is proposed to overcome these limitations. First, the Gaussian echo model is employed to adaptively estimate the column prototype of the convolution matrix instead of directly using the input signal as the prototype. Then, the convolution matrix is constructed upon the estimated results. Third, the split augmented Lagrangian shrinkage (SALSA) algorithm is introduced to solve the deconvolution problem with high computational efficiency. To verify the effectiveness of the proposed method, guided wave signals obtained from pipeline inspection are investigated numerically and experimentally. Compared to conventional sparse deconvolution methods, e.g. the {{l}1} -norm deconvolution method, the proposed method shows better performance in handling the echo overlap problem in the guided wave signal.
NASA Astrophysics Data System (ADS)
Lin, Chuang; Wang, Binghui; Jiang, Ning; Farina, Dario
2018-04-01
Objective. This paper proposes a novel simultaneous and proportional multiple degree of freedom (DOF) myoelectric control method for active prostheses. Approach. The approach is based on non-negative matrix factorization (NMF) of surface EMG signals with the inclusion of sparseness constraints. By applying a sparseness constraint to the control signal matrix, it is possible to extract the basis information from arbitrary movements (quasi-unsupervised approach) for multiple DOFs concurrently. Main Results. In online testing based on target hitting, able-bodied subjects reached a greater throughput (TP) when using sparse NMF (SNMF) than with classic NMF or with linear regression (LR). Accordingly, the completion time (CT) was shorter for SNMF than NMF or LR. The same observations were made in two patients with unilateral limb deficiencies. Significance. The addition of sparseness constraints to NMF allows for a quasi-unsupervised approach to myoelectric control with superior results with respect to previous methods for the simultaneous and proportional control of multi-DOF. The proposed factorization algorithm allows robust simultaneous and proportional control, is superior to previous supervised algorithms, and, because of minimal supervision, paves the way to online adaptation in myoelectric control.
Sparse PCA with Oracle Property.
Gu, Quanquan; Wang, Zhaoran; Liu, Han
In this paper, we study the estimation of the k -dimensional sparse principal subspace of covariance matrix Σ in the high-dimensional setting. We aim to recover the oracle principal subspace solution, i.e., the principal subspace estimator obtained assuming the true support is known a priori. To this end, we propose a family of estimators based on the semidefinite relaxation of sparse PCA with novel regularizations. In particular, under a weak assumption on the magnitude of the population projection matrix, one estimator within this family exactly recovers the true support with high probability, has exact rank- k , and attains a [Formula: see text] statistical rate of convergence with s being the subspace sparsity level and n the sample size. Compared to existing support recovery results for sparse PCA, our approach does not hinge on the spiked covariance model or the limited correlation condition. As a complement to the first estimator that enjoys the oracle property, we prove that, another estimator within the family achieves a sharper statistical rate of convergence than the standard semidefinite relaxation of sparse PCA, even when the previous assumption on the magnitude of the projection matrix is violated. We validate the theoretical results by numerical experiments on synthetic datasets.
Sparse PCA with Oracle Property
Gu, Quanquan; Wang, Zhaoran; Liu, Han
2014-01-01
In this paper, we study the estimation of the k-dimensional sparse principal subspace of covariance matrix Σ in the high-dimensional setting. We aim to recover the oracle principal subspace solution, i.e., the principal subspace estimator obtained assuming the true support is known a priori. To this end, we propose a family of estimators based on the semidefinite relaxation of sparse PCA with novel regularizations. In particular, under a weak assumption on the magnitude of the population projection matrix, one estimator within this family exactly recovers the true support with high probability, has exact rank-k, and attains a s/n statistical rate of convergence with s being the subspace sparsity level and n the sample size. Compared to existing support recovery results for sparse PCA, our approach does not hinge on the spiked covariance model or the limited correlation condition. As a complement to the first estimator that enjoys the oracle property, we prove that, another estimator within the family achieves a sharper statistical rate of convergence than the standard semidefinite relaxation of sparse PCA, even when the previous assumption on the magnitude of the projection matrix is violated. We validate the theoretical results by numerical experiments on synthetic datasets. PMID:25684971
Effects of partitioning and scheduling sparse matrix factorization on communication and load balance
NASA Technical Reports Server (NTRS)
Venugopal, Sesh; Naik, Vijay K.
1991-01-01
A block based, automatic partitioning and scheduling methodology is presented for sparse matrix factorization on distributed memory systems. Using experimental results, this technique is analyzed for communication and load imbalance overhead. To study the performance effects, these overheads were compared with those obtained from a straightforward 'wrap mapped' column assignment scheme. All experimental results were obtained using test sparse matrices from the Harwell-Boeing data set. The results show that there is a communication and load balance tradeoff. The block based method results in lower communication cost whereas the wrap mapped scheme gives better load balance.
Archer, A.W.; Maples, C.G.
1989-01-01
Numerous departures from ideal relationships are revealed by Monte Carlo simulations of widely accepted binomial coefficients. For example, simulations incorporating varying levels of matrix sparseness (presence of zeros indicating lack of data) and computation of expected values reveal that not only are all common coefficients influenced by zero data, but also that some coefficients do not discriminate between sparse or dense matrices (few zero data). Such coefficients computationally merge mutually shared and mutually absent information and do not exploit all the information incorporated within the standard 2 ?? 2 contingency table; therefore, the commonly used formulae for such coefficients are more complicated than the actual range of values produced. Other coefficients do differentiate between mutual presences and absences; however, a number of these coefficients do not demonstrate a linear relationship to matrix sparseness. Finally, simulations using nonrandom matrices with known degrees of row-by-row similarities signify that several coefficients either do not display a reasonable range of values or are nonlinear with respect to known relationships within the data. Analyses with nonrandom matrices yield clues as to the utility of certain coefficients for specific applications. For example, coefficients such as Jaccard, Dice, and Baroni-Urbani and Buser are useful if correction of sparseness is desired, whereas the Russell-Rao coefficient is useful when sparseness correction is not desired. ?? 1989 International Association for Mathematical Geology.
Sparse representation of whole-brain fMRI signals for identification of functional networks.
Lv, Jinglei; Jiang, Xi; Li, Xiang; Zhu, Dajiang; Chen, Hanbo; Zhang, Tuo; Zhang, Shu; Hu, Xintao; Han, Junwei; Huang, Heng; Zhang, Jing; Guo, Lei; Liu, Tianming
2015-02-01
There have been several recent studies that used sparse representation for fMRI signal analysis and activation detection based on the assumption that each voxel's fMRI signal is linearly composed of sparse components. Previous studies have employed sparse coding to model functional networks in various modalities and scales. These prior contributions inspired the exploration of whether/how sparse representation can be used to identify functional networks in a voxel-wise way and on the whole brain scale. This paper presents a novel, alternative methodology of identifying multiple functional networks via sparse representation of whole-brain task-based fMRI signals. Our basic idea is that all fMRI signals within the whole brain of one subject are aggregated into a big data matrix, which is then factorized into an over-complete dictionary basis matrix and a reference weight matrix via an effective online dictionary learning algorithm. Our extensive experimental results have shown that this novel methodology can uncover multiple functional networks that can be well characterized and interpreted in spatial, temporal and frequency domains based on current brain science knowledge. Importantly, these well-characterized functional network components are quite reproducible in different brains. In general, our methods offer a novel, effective and unified solution to multiple fMRI data analysis tasks including activation detection, de-activation detection, and functional network identification. Copyright © 2014 Elsevier B.V. All rights reserved.
Wei, Jianing; Bouman, Charles A; Allebach, Jan P
2014-05-01
Many imaging applications require the implementation of space-varying convolution for accurate restoration and reconstruction of images. Here, we use the term space-varying convolution to refer to linear operators whose impulse response has slow spatial variation. In addition, these space-varying convolution operators are often dense, so direct implementation of the convolution operator is typically computationally impractical. One such example is the problem of stray light reduction in digital cameras, which requires the implementation of a dense space-varying deconvolution operator. However, other inverse problems, such as iterative tomographic reconstruction, can also depend on the implementation of dense space-varying convolution. While space-invariant convolution can be efficiently implemented with the fast Fourier transform, this approach does not work for space-varying operators. So direct convolution is often the only option for implementing space-varying convolution. In this paper, we develop a general approach to the efficient implementation of space-varying convolution, and demonstrate its use in the application of stray light reduction. Our approach, which we call matrix source coding, is based on lossy source coding of the dense space-varying convolution matrix. Importantly, by coding the transformation matrix, we not only reduce the memory required to store it; we also dramatically reduce the computation required to implement matrix-vector products. Our algorithm is able to reduce computation by approximately factoring the dense space-varying convolution operator into a product of sparse transforms. Experimental results show that our method can dramatically reduce the computation required for stray light reduction while maintaining high accuracy.
Salient Object Detection via Structured Matrix Decomposition.
Peng, Houwen; Li, Bing; Ling, Haibin; Hu, Weiming; Xiong, Weihua; Maybank, Stephen J
2016-05-04
Low-rank recovery models have shown potential for salient object detection, where a matrix is decomposed into a low-rank matrix representing image background and a sparse matrix identifying salient objects. Two deficiencies, however, still exist. First, previous work typically assumes the elements in the sparse matrix are mutually independent, ignoring the spatial and pattern relations of image regions. Second, when the low-rank and sparse matrices are relatively coherent, e.g., when there are similarities between the salient objects and background or when the background is complicated, it is difficult for previous models to disentangle them. To address these problems, we propose a novel structured matrix decomposition model with two structural regularizations: (1) a tree-structured sparsity-inducing regularization that captures the image structure and enforces patches from the same object to have similar saliency values, and (2) a Laplacian regularization that enlarges the gaps between salient objects and the background in feature space. Furthermore, high-level priors are integrated to guide the matrix decomposition and boost the detection. We evaluate our model for salient object detection on five challenging datasets including single object, multiple objects and complex scene images, and show competitive results as compared with 24 state-of-the-art methods in terms of seven performance metrics.
Layer-oriented multigrid wavefront reconstruction algorithms for multi-conjugate adaptive optics
NASA Astrophysics Data System (ADS)
Gilles, Luc; Ellerbroek, Brent L.; Vogel, Curtis R.
2003-02-01
Multi-conjugate adaptive optics (MCAO) systems with 104-105 degrees of freedom have been proposed for future giant telescopes. Using standard matrix methods to compute, optimize, and implement wavefront control algorithms for these systems is impractical, since the number of calculations required to compute and apply the reconstruction matrix scales respectively with the cube and the square of the number of AO degrees of freedom. In this paper, we develop an iterative sparse matrix implementation of minimum variance wavefront reconstruction for telescope diameters up to 32m with more than 104 actuators. The basic approach is the preconditioned conjugate gradient method, using a multigrid preconditioner incorporating a layer-oriented (block) symmetric Gauss-Seidel iterative smoothing operator. We present open-loop numerical simulation results to illustrate algorithm convergence.
Algorithms and Application of Sparse Matrix Assembly and Equation Solvers for Aeroacoustics
NASA Technical Reports Server (NTRS)
Watson, W. R.; Nguyen, D. T.; Reddy, C. J.; Vatsa, V. N.; Tang, W. H.
2001-01-01
An algorithm for symmetric sparse equation solutions on an unstructured grid is described. Efficient, sequential sparse algorithms for degree-of-freedom reordering, supernodes, symbolic/numerical factorization, and forward backward solution phases are reviewed. Three sparse algorithms for the generation and assembly of symmetric systems of matrix equations are presented. The accuracy and numerical performance of the sequential version of the sparse algorithms are evaluated over the frequency range of interest in a three-dimensional aeroacoustics application. Results show that the solver solutions are accurate using a discretization of 12 points per wavelength. Results also show that the first assembly algorithm is impractical for high-frequency noise calculations. The second and third assembly algorithms have nearly equal performance at low values of source frequencies, but at higher values of source frequencies the third algorithm saves CPU time and RAM. The CPU time and the RAM required by the second and third assembly algorithms are two orders of magnitude smaller than that required by the sparse equation solver. A sequential version of these sparse algorithms can, therefore, be conveniently incorporated into a substructuring for domain decomposition formulation to achieve parallel computation, where different substructures are handles by different parallel processors.
Visual Tracking Based on Extreme Learning Machine and Sparse Representation
Wang, Baoxian; Tang, Linbo; Yang, Jinglin; Zhao, Baojun; Wang, Shuigen
2015-01-01
The existing sparse representation-based visual trackers mostly suffer from both being time consuming and having poor robustness problems. To address these issues, a novel tracking method is presented via combining sparse representation and an emerging learning technique, namely extreme learning machine (ELM). Specifically, visual tracking can be divided into two consecutive processes. Firstly, ELM is utilized to find the optimal separate hyperplane between the target observations and background ones. Thus, the trained ELM classification function is able to remove most of the candidate samples related to background contents efficiently, thereby reducing the total computational cost of the following sparse representation. Secondly, to further combine ELM and sparse representation, the resultant confidence values (i.e., probabilities to be a target) of samples on the ELM classification function are used to construct a new manifold learning constraint term of the sparse representation framework, which tends to achieve robuster results. Moreover, the accelerated proximal gradient method is used for deriving the optimal solution (in matrix form) of the constrained sparse tracking model. Additionally, the matrix form solution allows the candidate samples to be calculated in parallel, thereby leading to a higher efficiency. Experiments demonstrate the effectiveness of the proposed tracker. PMID:26506359
A Systematic Approach for Obtaining Performance on Matrix-Like Operations
NASA Astrophysics Data System (ADS)
Veras, Richard Michael
Scientific Computation provides a critical role in the scientific process because it allows us ask complex queries and test predictions that would otherwise be unfeasible to perform experimentally. Because of its power, Scientific Computing has helped drive advances in many fields ranging from Engineering and Physics to Biology and Sociology to Economics and Drug Development and even to Machine Learning and Artificial Intelligence. Common among these domains is the desire for timely computational results, thus a considerable amount of human expert effort is spent towards obtaining performance for these scientific codes. However, this is no easy task because each of these domains present their own unique set of challenges to software developers, such as domain specific operations, structurally complex data and ever-growing datasets. Compounding these problems are the myriads of constantly changing, complex and unique hardware platforms that an expert must target. Unfortunately, an expert is typically forced to reproduce their effort across multiple problem domains and hardware platforms. In this thesis, we demonstrate the automatic generation of expert level high-performance scientific codes for Dense Linear Algebra (DLA), Structured Mesh (Stencil), Sparse Linear Algebra and Graph Analytic. In particular, this thesis seeks to address the issue of obtaining performance on many complex platforms for a certain class of matrix-like operations that span across many scientific, engineering and social fields. We do this by automating a method used for obtaining high performance in DLA and extending it to structured, sparse and scale-free domains. We argue that it is through the use of the underlying structure found in the data from these domains that enables this process. Thus, obtaining performance for most operations does not occur in isolation of the data being operated on, but instead depends significantly on the structure of the data.
NASA Astrophysics Data System (ADS)
Qin, Xulei; Cong, Zhibin; Halig, Luma V.; Fei, Baowei
2013-03-01
An automatic framework is proposed to segment right ventricle on ultrasound images. This method can automatically segment both epicardial and endocardial boundaries from a continuous echocardiography series by combining sparse matrix transform (SMT), a training model, and a localized region based level set. First, the sparse matrix transform extracts main motion regions of myocardium as eigenimages by analyzing statistical information of these images. Second, a training model of right ventricle is registered to the extracted eigenimages in order to automatically detect the main location of the right ventricle and the corresponding transform relationship between the training model and the SMT-extracted results in the series. Third, the training model is then adjusted as an adapted initialization for the segmentation of each image in the series. Finally, based on the adapted initializations, a localized region based level set algorithm is applied to segment both epicardial and endocardial boundaries of the right ventricle from the whole series. Experimental results from real subject data validated the performance of the proposed framework in segmenting right ventricle from echocardiography. The mean Dice scores for both epicardial and endocardial boundaries are 89.1%+/-2.3% and 83.6+/-7.3%, respectively. The automatic segmentation method based on sparse matrix transform and level set can provide a useful tool for quantitative cardiac imaging.
Wang, Ya-Xuan; Gao, Ying-Lian; Liu, Jin-Xing; Kong, Xiang-Zhen; Li, Hai-Jun
2017-09-01
Identifying differentially expressed genes from the thousands of genes is a challenging task. Robust principal component analysis (RPCA) is an efficient method in the identification of differentially expressed genes. RPCA method uses nuclear norm to approximate the rank function. However, theoretical studies showed that the nuclear norm minimizes all singular values, so it may not be the best solution to approximate the rank function. The truncated nuclear norm is defined as the sum of some smaller singular values, which may achieve a better approximation of the rank function than nuclear norm. In this paper, a novel method is proposed by replacing nuclear norm of RPCA with the truncated nuclear norm, which is named robust principal component analysis regularized by truncated nuclear norm (TRPCA). The method decomposes the observation matrix of genomic data into a low-rank matrix and a sparse matrix. Because the significant genes can be considered as sparse signals, the differentially expressed genes are viewed as the sparse perturbation signals. Thus, the differentially expressed genes can be identified according to the sparse matrix. The experimental results on The Cancer Genome Atlas data illustrate that the TRPCA method outperforms other state-of-the-art methods in the identification of differentially expressed genes.
Sparse Regression as a Sparse Eigenvalue Problem
NASA Technical Reports Server (NTRS)
Moghaddam, Baback; Gruber, Amit; Weiss, Yair; Avidan, Shai
2008-01-01
We extend the l0-norm "subspectral" algorithms for sparse-LDA [5] and sparse-PCA [6] to general quadratic costs such as MSE in linear (kernel) regression. The resulting "Sparse Least Squares" (SLS) problem is also NP-hard, by way of its equivalence to a rank-1 sparse eigenvalue problem (e.g., binary sparse-LDA [7]). Specifically, for a general quadratic cost we use a highly-efficient technique for direct eigenvalue computation using partitioned matrix inverses which leads to dramatic x103 speed-ups over standard eigenvalue decomposition. This increased efficiency mitigates the O(n4) scaling behaviour that up to now has limited the previous algorithms' utility for high-dimensional learning problems. Moreover, the new computation prioritizes the role of the less-myopic backward elimination stage which becomes more efficient than forward selection. Similarly, branch-and-bound search for Exact Sparse Least Squares (ESLS) also benefits from partitioned matrix inverse techniques. Our Greedy Sparse Least Squares (GSLS) generalizes Natarajan's algorithm [9] also known as Order-Recursive Matching Pursuit (ORMP). Specifically, the forward half of GSLS is exactly equivalent to ORMP but more efficient. By including the backward pass, which only doubles the computation, we can achieve lower MSE than ORMP. Experimental comparisons to the state-of-the-art LARS algorithm [3] show forward-GSLS is faster, more accurate and more flexible in terms of choice of regularization
HYPOTHESIS TESTING FOR HIGH-DIMENSIONAL SPARSE BINARY REGRESSION
Mukherjee, Rajarshi; Pillai, Natesh S.; Lin, Xihong
2015-01-01
In this paper, we study the detection boundary for minimax hypothesis testing in the context of high-dimensional, sparse binary regression models. Motivated by genetic sequencing association studies for rare variant effects, we investigate the complexity of the hypothesis testing problem when the design matrix is sparse. We observe a new phenomenon in the behavior of detection boundary which does not occur in the case of Gaussian linear regression. We derive the detection boundary as a function of two components: a design matrix sparsity index and signal strength, each of which is a function of the sparsity of the alternative. For any alternative, if the design matrix sparsity index is too high, any test is asymptotically powerless irrespective of the magnitude of signal strength. For binary design matrices with the sparsity index that is not too high, our results are parallel to those in the Gaussian case. In this context, we derive detection boundaries for both dense and sparse regimes. For the dense regime, we show that the generalized likelihood ratio is rate optimal; for the sparse regime, we propose an extended Higher Criticism Test and show it is rate optimal and sharp. We illustrate the finite sample properties of the theoretical results using simulation studies. PMID:26246645
Addressing the computational cost of large EIT solutions.
Boyle, Alistair; Borsic, Andrea; Adler, Andy
2012-05-01
Electrical impedance tomography (EIT) is a soft field tomography modality based on the application of electric current to a body and measurement of voltages through electrodes at the boundary. The interior conductivity is reconstructed on a discrete representation of the domain using a finite-element method (FEM) mesh and a parametrization of that domain. The reconstruction requires a sequence of numerically intensive calculations. There is strong interest in reducing the cost of these calculations. An improvement in the compute time for current problems would encourage further exploration of computationally challenging problems such as the incorporation of time series data, wide-spread adoption of three-dimensional simulations and correlation of other modalities such as CT and ultrasound. Multicore processors offer an opportunity to reduce EIT computation times but may require some restructuring of the underlying algorithms to maximize the use of available resources. This work profiles two EIT software packages (EIDORS and NDRM) to experimentally determine where the computational costs arise in EIT as problems scale. Sparse matrix solvers, a key component for the FEM forward problem and sensitivity estimates in the inverse problem, are shown to take a considerable portion of the total compute time in these packages. A sparse matrix solver performance measurement tool, Meagre-Crowd, is developed to interface with a variety of solvers and compare their performance over a range of two- and three-dimensional problems of increasing node density. Results show that distributed sparse matrix solvers that operate on multiple cores are advantageous up to a limit that increases as the node density increases. We recommend a selection procedure to find a solver and hardware arrangement matched to the problem and provide guidance and tools to perform that selection.
Strategies for vectorizing the sparse matrix vector product on the CRAY XMP, CRAY 2, and CYBER 205
NASA Technical Reports Server (NTRS)
Bauschlicher, Charles W., Jr.; Partridge, Harry
1987-01-01
Large, randomly sparse matrix vector products are important in a number of applications in computational chemistry, such as matrix diagonalization and the solution of simultaneous equations. Vectorization of this process is considered for the CRAY XMP, CRAY 2, and CYBER 205, using a matrix of dimension of 20,000 with from 1 percent to 6 percent nonzeros. Efficient scatter/gather capabilities add coding flexibility and yield significant improvements in performance. For the CYBER 205, it is shown that minor changes in the IO can reduce the CPU time by a factor of 50. Similar changes in the CRAY codes make a far smaller improvement.
Zhang, Shu; Li, Xiang; Lv, Jinglei; Jiang, Xi; Guo, Lei; Liu, Tianming
2016-03-01
A relatively underexplored question in fMRI is whether there are intrinsic differences in terms of signal composition patterns that can effectively characterize and differentiate task-based or resting state fMRI (tfMRI or rsfMRI) signals. In this paper, we propose a novel two-stage sparse representation framework to examine the fundamental difference between tfMRI and rsfMRI signals. Specifically, in the first stage, the whole-brain tfMRI or rsfMRI signals of each subject were composed into a big data matrix, which was then factorized into a subject-specific dictionary matrix and a weight coefficient matrix for sparse representation. In the second stage, all of the dictionary matrices from both tfMRI/rsfMRI data across multiple subjects were composed into another big data-matrix, which was further sparsely represented by a cross-subjects common dictionary and a weight matrix. This framework has been applied on the recently publicly released Human Connectome Project (HCP) fMRI data and experimental results revealed that there are distinctive and descriptive atoms in the cross-subjects common dictionary that can effectively characterize and differentiate tfMRI and rsfMRI signals, achieving 100% classification accuracy. Moreover, our methods and results can be meaningfully interpreted, e.g., the well-known default mode network (DMN) activities can be recovered from the very noisy and heterogeneous aggregated big-data of tfMRI and rsfMRI signals across all subjects in HCP Q1 release.
Computing sparse derivatives and consecutive zeros problem
NASA Astrophysics Data System (ADS)
Chandra, B. V. Ravi; Hossain, Shahadat
2013-02-01
We describe a substitution based sparse Jacobian matrix determination method using algorithmic differentiation. Utilizing the a priori known sparsity pattern, a compression scheme is determined using graph coloring. The "compressed pattern" of the Jacobian matrix is then reordered into a form suitable for computation by substitution. We show that the column reordering of the compressed pattern matrix (so as to align the zero entries into consecutive locations in each row) can be viewed as a variant of traveling salesman problem. Preliminary computational results show that on the test problems the performance of nearest-neighbor type heuristic algorithms is highly encouraging.
Sparse matrix methods research using the CSM testbed software system
NASA Technical Reports Server (NTRS)
Chu, Eleanor; George, J. Alan
1989-01-01
Research is described on sparse matrix techniques for the Computational Structural Mechanics (CSM) Testbed. The primary objective was to compare the performance of state-of-the-art techniques for solving sparse systems with those that are currently available in the CSM Testbed. Thus, one of the first tasks was to become familiar with the structure of the testbed, and to install some or all of the SPARSPAK package in the testbed. A suite of subroutines to extract from the data base the relevant structural and numerical information about the matrix equations was written, and all the demonstration problems distributed with the testbed were successfully solved. These codes were documented, and performance studies comparing the SPARSPAK technology to the methods currently in the testbed were completed. In addition, some preliminary studies were done comparing some recently developed out-of-core techniques with the performance of the testbed processor INV.
A tight and explicit representation of Q in sparse QR factorization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ng, E.G.; Peyton, B.W.
1992-05-01
In QR factorization of a sparse m{times}n matrix A (m {ge} n) the orthogonal factor Q is often stored implicitly as a lower trapezoidal matrix H known as the Householder matrix. This paper presents a simple characterization of the row structure of Q, which could be used as the basis for a sparse data structure that can store Q explicitly. The new characterization is a simple extension of a well known row-oriented characterization of the structure of H. Hare, Johnson, Olesky, and van den Driessche have recently provided a complete sparsity analysis of the QR factorization. Let U be themore » matrix consisting of the first n columns of Q. Using results from, we show that the data structures for H and U resulting from our characterizations are tight when A is a strong Hall matrix. We also show that H and the lower trapezoidal part of U have the same sparsity characterization when A is strong Hall. We then show that this characterization can be extended to any weak Hall matrix that has been permuted into block upper triangular form. Finally, we show that permuting to block triangular form never increases the fill incurred during the factorization.« less
Xuan, Junyu; Lu, Jie; Zhang, Guangquan; Xu, Richard Yi Da; Luo, Xiangfeng
2018-05-01
Sparse nonnegative matrix factorization (SNMF) aims to factorize a data matrix into two optimized nonnegative sparse factor matrices, which could benefit many tasks, such as document-word co-clustering. However, the traditional SNMF typically assumes the number of latent factors (i.e., dimensionality of the factor matrices) to be fixed. This assumption makes it inflexible in practice. In this paper, we propose a doubly sparse nonparametric NMF framework to mitigate this issue by using dependent Indian buffet processes (dIBP). We apply a correlation function for the generation of two stick weights associated with each column pair of factor matrices while still maintaining their respective marginal distribution specified by IBP. As a consequence, the generation of two factor matrices will be columnwise correlated. Under this framework, two classes of correlation function are proposed: 1) using bivariate Beta distribution and 2) using Copula function. Compared with the single IBP-based NMF, this paper jointly makes two factor matrices nonparametric and sparse, which could be applied to broader scenarios, such as co-clustering. This paper is seen to be much more flexible than Gaussian process-based and hierarchial Beta process-based dIBPs in terms of allowing the two corresponding binary matrix columns to have greater variations in their nonzero entries. Our experiments on synthetic data show the merits of this paper compared with the state-of-the-art models in respect of factorization efficiency, sparsity, and flexibility. Experiments on real-world data sets demonstrate the efficiency of this paper in document-word co-clustering tasks.
NASA Astrophysics Data System (ADS)
Anderson, D. V.; Koniges, A. E.; Shumaker, D. E.
1988-11-01
Many physical problems require the solution of coupled partial differential equations on three-dimensional domains. When the time scales of interest dictate an implicit discretization of the equations a rather complicated global matrix system needs solution. The exact form of the matrix depends on the choice of spatial grids and on the finite element or finite difference approximations employed. CPDES3 allows each spatial operator to have 7, 15, 19, or 27 point stencils and allows for general couplings between all of the component PDE's and it automatically generates the matrix structures needed to perform the algorithm. The resulting sparse matrix equation is solved by either the preconditioned conjugate gradient (CG) method or by the preconditioned biconjugate gradient (BCG) algorithm. An arbitrary number of component equations are permitted only limited by available memory. In the sub-band representation used, we generate an algorithm that is written compactly in terms of indirect induces which is vectorizable on some of the newer scientific computers.
NASA Astrophysics Data System (ADS)
Anderson, D. V.; Koniges, A. E.; Shumaker, D. E.
1988-11-01
Many physical problems require the solution of coupled partial differential equations on two-dimensional domains. When the time scales of interest dictate an implicit discretization of the equations a rather complicated global matrix system needs solution. The exact form of the matrix depends on the choice of spatial grids and on the finite element or finite difference approximations employed. CPDES2 allows each spatial operator to have 5 or 9 point stencils and allows for general couplings between all of the component PDE's and it automatically generates the matrix structures needed to perform the algorithm. The resulting sparse matrix equation is solved by either the preconditioned conjugate gradient (CG) method or by the preconditioned biconjugate gradient (BCG) algorithm. An arbitrary number of component equations are permitted only limited by available memory. In the sub-band representation used, we generate an algorithm that is written compactly in terms of indirect indices which is vectorizable on some of the newer scientific computers.
Large Covariance Estimation by Thresholding Principal Orthogonal Complements
Fan, Jianqing; Liao, Yuan; Mincheva, Martina
2012-01-01
This paper deals with the estimation of a high-dimensional covariance with a conditional sparsity structure and fast-diverging eigenvalues. By assuming sparse error covariance matrix in an approximate factor model, we allow for the presence of some cross-sectional correlation even after taking out common but unobservable factors. We introduce the Principal Orthogonal complEment Thresholding (POET) method to explore such an approximate factor structure with sparsity. The POET estimator includes the sample covariance matrix, the factor-based covariance matrix (Fan, Fan, and Lv, 2008), the thresholding estimator (Bickel and Levina, 2008) and the adaptive thresholding estimator (Cai and Liu, 2011) as specific examples. We provide mathematical insights when the factor analysis is approximately the same as the principal component analysis for high-dimensional data. The rates of convergence of the sparse residual covariance matrix and the conditional sparse covariance matrix are studied under various norms. It is shown that the impact of estimating the unknown factors vanishes as the dimensionality increases. The uniform rates of convergence for the unobserved factors and their factor loadings are derived. The asymptotic results are also verified by extensive simulation studies. Finally, a real data application on portfolio allocation is presented. PMID:24348088
Large Covariance Estimation by Thresholding Principal Orthogonal Complements.
Fan, Jianqing; Liao, Yuan; Mincheva, Martina
2013-09-01
This paper deals with the estimation of a high-dimensional covariance with a conditional sparsity structure and fast-diverging eigenvalues. By assuming sparse error covariance matrix in an approximate factor model, we allow for the presence of some cross-sectional correlation even after taking out common but unobservable factors. We introduce the Principal Orthogonal complEment Thresholding (POET) method to explore such an approximate factor structure with sparsity. The POET estimator includes the sample covariance matrix, the factor-based covariance matrix (Fan, Fan, and Lv, 2008), the thresholding estimator (Bickel and Levina, 2008) and the adaptive thresholding estimator (Cai and Liu, 2011) as specific examples. We provide mathematical insights when the factor analysis is approximately the same as the principal component analysis for high-dimensional data. The rates of convergence of the sparse residual covariance matrix and the conditional sparse covariance matrix are studied under various norms. It is shown that the impact of estimating the unknown factors vanishes as the dimensionality increases. The uniform rates of convergence for the unobserved factors and their factor loadings are derived. The asymptotic results are also verified by extensive simulation studies. Finally, a real data application on portfolio allocation is presented.
Collaborative sparse priors for multi-view ATR
NASA Astrophysics Data System (ADS)
Li, Xuelu; Monga, Vishal
2018-04-01
Recent work has seen a surge of sparse representation based classification (SRC) methods applied to automatic target recognition problems. While traditional SRC approaches used l0 or l1 norm to quantify sparsity, spike and slab priors have established themselves as the gold standard for providing general tunable sparse structures on vectors. In this work, we employ collaborative spike and slab priors that can be applied to matrices to encourage sparsity for the problem of multi-view ATR. That is, target images captured from multiple views are expanded in terms of a training dictionary multiplied with a coefficient matrix. Ideally, for a test image set comprising of multiple views of a target, coefficients corresponding to its identifying class are expected to be active, while others should be zero, i.e. the coefficient matrix is naturally sparse. We develop a new approach to solve the optimization problem that estimates the sparse coefficient matrix jointly with the sparsity inducing parameters in the collaborative prior. ATR problems are investigated on the mid-wave infrared (MWIR) database made available by the US Army Night Vision and Electronic Sensors Directorate, which has a rich collection of views. Experimental results show that the proposed joint prior and coefficient estimation method (JPCEM) can: 1.) enable improved accuracy when multiple views vs. a single one are invoked, and 2.) outperform state of the art alternatives particularly when training imagery is limited.
Sparsistency and Rates of Convergence in Large Covariance Matrix Estimation.
Lam, Clifford; Fan, Jianqing
2009-01-01
This paper studies the sparsistency and rates of convergence for estimating sparse covariance and precision matrices based on penalized likelihood with nonconvex penalty functions. Here, sparsistency refers to the property that all parameters that are zero are actually estimated as zero with probability tending to one. Depending on the case of applications, sparsity priori may occur on the covariance matrix, its inverse or its Cholesky decomposition. We study these three sparsity exploration problems under a unified framework with a general penalty function. We show that the rates of convergence for these problems under the Frobenius norm are of order (s(n) log p(n)/n)(1/2), where s(n) is the number of nonzero elements, p(n) is the size of the covariance matrix and n is the sample size. This explicitly spells out the contribution of high-dimensionality is merely of a logarithmic factor. The conditions on the rate with which the tuning parameter λ(n) goes to 0 have been made explicit and compared under different penalties. As a result, for the L(1)-penalty, to guarantee the sparsistency and optimal rate of convergence, the number of nonzero elements should be small: sn'=O(pn) at most, among O(pn2) parameters, for estimating sparse covariance or correlation matrix, sparse precision or inverse correlation matrix or sparse Cholesky factor, where sn' is the number of the nonzero elements on the off-diagonal entries. On the other hand, using the SCAD or hard-thresholding penalty functions, there is no such a restriction.
NASA Technical Reports Server (NTRS)
Ehrhart, E. J.; Gillette, E. L.; Barcellos-Hoff, M. H.; Chaterjee, A. (Principal Investigator)
1996-01-01
High-LET radiation has unique physical and biological properties compared to sparsely ionizing radiation. Recent studies demonstrate that sparsely ionizing radiation rapidly alters the pattern of extracellular matrix expression in several tissues, but little is known about the effect of heavy-ion radiation. This study investigates densely ionizing radiation-induced changes in extracellular matrix localization in the mammary glands of adult female BALB/c mice after whole-body irradiation with 0.8 Gy 600 MeV iron particles. The basement membrane and interstitial extracellular matrix proteins of the mammary gland stroma were mapped with respect to time postirradiation using immunofluorescence. Collagen III was induced in the adipose stroma within 1 day, continued to increase through day 9 and was resolved by day 14. Immunoreactive tenascin was induced in the epithelium by day 1, was evident at the epithelial-stromal interface by day 5-9 and persisted as a condensed layer beneath the basement membrane through day 14. These findings parallel similar changes induced by gamma irradiation but demonstrate different onset and chronicity. In contrast, the integrity of epithelial basement membrane, which was unaffected by sparsely ionizing radiation, was disrupted by iron-particle irradiation. Laminin immunoreactivity was mildly irregular at 1 h postirradiation and showed discontinuities and thickening from days 1 to 9. Continuity was restored by day 14. Thus high-LET radiation, like sparsely ionizing radiation, induces rapid-remodeling of the stromal extracellular matrix but also appears to alter the integrity of the epithelial basement membrane, which is an important regulator of epithelial cell proliferation and differentiation.
High-dimensional statistical inference: From vector to matrix
NASA Astrophysics Data System (ADS)
Zhang, Anru
Statistical inference for sparse signals or low-rank matrices in high-dimensional settings is of significant interest in a range of contemporary applications. It has attracted significant recent attention in many fields including statistics, applied mathematics and electrical engineering. In this thesis, we consider several problems in including sparse signal recovery (compressed sensing under restricted isometry) and low-rank matrix recovery (matrix recovery via rank-one projections and structured matrix completion). The first part of the thesis discusses compressed sensing and affine rank minimization in both noiseless and noisy cases and establishes sharp restricted isometry conditions for sparse signal and low-rank matrix recovery. The analysis relies on a key technical tool which represents points in a polytope by convex combinations of sparse vectors. The technique is elementary while leads to sharp results. It is shown that, in compressed sensing, delta kA < 1/3, deltak A+ thetak,kA < 1, or deltatkA < √( t - 1)/t for any given constant t ≥ 4/3 guarantee the exact recovery of all k sparse signals in the noiseless case through the constrained ℓ1 minimization, and similarly in affine rank minimization delta rM < 1/3, deltar M + thetar, rM < 1, or deltatrM< √( t - 1)/t ensure the exact reconstruction of all matrices with rank at most r in the noiseless case via the constrained nuclear norm minimization. Moreover, for any epsilon > 0, delta kA < 1/3 + epsilon, deltak A + thetak,kA < 1 + epsilon, or deltatkA< √(t - 1) / t + epsilon are not sufficient to guarantee the exact recovery of all k-sparse signals for large k. Similar result also holds for matrix recovery. In addition, the conditions delta kA<1/3, deltak A+ thetak,kA<1, delta tkA < √(t - 1)/t and deltarM<1/3, delta rM+ thetar,rM<1, delta trM< √(t - 1)/ t are also shown to be sufficient respectively for stable recovery of approximately sparse signals and low-rank matrices in the noisy case. For the second part of the thesis, we introduce a rank-one projection model for low-rank matrix recovery and propose a constrained nuclear norm minimization method for stable recovery of low-rank matrices in the noisy case. The procedure is adaptive to the rank and robust against small perturbations. Both upper and lower bounds for the estimation accuracy under the Frobenius norm loss are obtained. The proposed estimator is shown to be rate-optimal under certain conditions. The estimator is easy to implement via convex programming and performs well numerically. The techniques and main results developed in the chapter also have implications to other related statistical problems. An application to estimation of spiked covariance matrices from one-dimensional random projections is considered. The results demonstrate that it is still possible to accurately estimate the covariance matrix of a high-dimensional distribution based only on one-dimensional projections. For the third part of the thesis, we consider another setting of low-rank matrix completion. Current literature on matrix completion focuses primarily on independent sampling models under which the individual observed entries are sampled independently. Motivated by applications in genomic data integration, we propose a new framework of structured matrix completion (SMC) to treat structured missingness by design. Specifically, our proposed method aims at efficient matrix recovery when a subset of the rows and columns of an approximately low-rank matrix are observed. We provide theoretical justification for the proposed SMC method and derive lower bound for the estimation errors, which together establish the optimal rate of recovery over certain classes of approximately low-rank matrices. Simulation studies show that the method performs well in finite sample under a variety of configurations. The method is applied to integrate several ovarian cancer genomic studies with different extent of genomic measurements, which enables us to construct more accurate prediction rules for ovarian cancer survival.
NASA Astrophysics Data System (ADS)
Prodhan, Suryoday; Ramasesha, S.
2018-05-01
The symmetry adapted density matrix renormalization group (SDMRG) technique has been an efficient method for studying low-lying eigenstates in one- and quasi-one-dimensional electronic systems. However, the SDMRG method had bottlenecks involving the construction of linearly independent symmetry adapted basis states as the symmetry matrices in the DMRG basis were not sparse. We have developed a modified algorithm to overcome this bottleneck. The new method incorporates end-to-end interchange symmetry (C2) , electron-hole symmetry (J ) , and parity or spin-flip symmetry (P ) in these calculations. The one-to-one correspondence between direct-product basis states in the DMRG Hilbert space for these symmetry operations renders the symmetry matrices in the new basis with maximum sparseness, just one nonzero matrix element per row. Using methods similar to those employed in the exact diagonalization technique for Pariser-Parr-Pople (PPP) models, developed in the 1980s, it is possible to construct orthogonal SDMRG basis states while bypassing the slow step of the Gram-Schmidt orthonormalization procedure. The method together with the PPP model which incorporates long-range electronic correlations is employed to study the correlated excited-state spectra of 1,12-benzoperylene and a narrow mixed graphene nanoribbon with a chrysene molecule as the building unit, comprising both zigzag and cove-edge structures.
Solving large tomographic linear systems: size reduction and error estimation
NASA Astrophysics Data System (ADS)
Voronin, Sergey; Mikesell, Dylan; Slezak, Inna; Nolet, Guust
2014-10-01
We present a new approach to reduce a sparse, linear system of equations associated with tomographic inverse problems. We begin by making a modification to the commonly used compressed sparse-row format, whereby our format is tailored to the sparse structure of finite-frequency (volume) sensitivity kernels in seismic tomography. Next, we cluster the sparse matrix rows to divide a large matrix into smaller subsets representing ray paths that are geographically close. Singular value decomposition of each subset allows us to project the data onto a subspace associated with the largest eigenvalues of the subset. After projection we reject those data that have a signal-to-noise ratio (SNR) below a chosen threshold. Clustering in this way assures that the sparse nature of the system is minimally affected by the projection. Moreover, our approach allows for a precise estimation of the noise affecting the data while also giving us the ability to identify outliers. We illustrate the method by reducing large matrices computed for global tomographic systems with cross-correlation body wave delays, as well as with surface wave phase velocity anomalies. For a massive matrix computed for 3.7 million Rayleigh wave phase velocity measurements, imposing a threshold of 1 for the SNR, we condensed the matrix size from 1103 to 63 Gbyte. For a global data set of multiple-frequency P wave delays from 60 well-distributed deep earthquakes we obtain a reduction to 5.9 per cent. This type of reduction allows one to avoid loss of information due to underparametrizing models. Alternatively, if data have to be rejected to fit the system into computer memory, it assures that the most important data are preserved.
NASA Technical Reports Server (NTRS)
Oliker, Leonid; Heber, Gerd; Biswas, Rupak
2000-01-01
The Conjugate Gradient (CG) algorithm is perhaps the best-known iterative technique to solve sparse linear systems that are symmetric and positive definite. A sparse matrix-vector multiply (SPMV) usually accounts for most of the floating-point operations within a CG iteration. In this paper, we investigate the effects of various ordering and partitioning strategies on the performance of parallel CG and SPMV using different programming paradigms and architectures. Results show that for this class of applications, ordering significantly improves overall performance, that cache reuse may be more important than reducing communication, and that it is possible to achieve message passing performance using shared memory constructs through careful data ordering and distribution. However, a multi-threaded implementation of CG on the Tera MTA does not require special ordering or partitioning to obtain high efficiency and scalability.
An exact formulation of the time-ordered exponential using path-sums
DOE Office of Scientific and Technical Information (OSTI.GOV)
Giscard, P.-L., E-mail: p.giscard1@physics.ox.ac.uk; Lui, K.; Thwaite, S. J.
2015-05-15
We present the path-sum formulation for the time-ordered exponential of a time-dependent matrix. The path-sum formulation gives the time-ordered exponential as a branched continued fraction of finite depth and breadth. The terms of the path-sum have an elementary interpretation as self-avoiding walks and self-avoiding polygons on a graph. Our result is based on a representation of the time-ordered exponential as the inverse of an operator, the mapping of this inverse to sums of walks on a graphs, and the algebraic structure of sets of walks. We give examples demonstrating our approach. We establish a super-exponential decay bound for the magnitudemore » of the entries of the time-ordered exponential of sparse matrices. We give explicit results for matrices with commonly encountered sparse structures.« less
Discriminative Transfer Subspace Learning via Low-Rank and Sparse Representation.
Xu, Yong; Fang, Xiaozhao; Wu, Jian; Li, Xuelong; Zhang, David
2016-02-01
In this paper, we address the problem of unsupervised domain transfer learning in which no labels are available in the target domain. We use a transformation matrix to transfer both the source and target data to a common subspace, where each target sample can be represented by a combination of source samples such that the samples from different domains can be well interlaced. In this way, the discrepancy of the source and target domains is reduced. By imposing joint low-rank and sparse constraints on the reconstruction coefficient matrix, the global and local structures of data can be preserved. To enlarge the margins between different classes as much as possible and provide more freedom to diminish the discrepancy, a flexible linear classifier (projection) is obtained by learning a non-negative label relaxation matrix that allows the strict binary label matrix to relax into a slack variable matrix. Our method can avoid a potentially negative transfer by using a sparse matrix to model the noise and, thus, is more robust to different types of noise. We formulate our problem as a constrained low-rankness and sparsity minimization problem and solve it by the inexact augmented Lagrange multiplier method. Extensive experiments on various visual domain adaptation tasks show the superiority of the proposed method over the state-of-the art methods. The MATLAB code of our method will be publicly available at http://www.yongxu.org/lunwen.html.
Improved analysis of SP and CoSaMP under total perturbations
NASA Astrophysics Data System (ADS)
Li, Haifeng
2016-12-01
Practically, in the underdetermined model y= A x, where x is a K sparse vector (i.e., it has no more than K nonzero entries), both y and A could be totally perturbed. A more relaxed condition means less number of measurements are needed to ensure the sparse recovery from theoretical aspect. In this paper, based on restricted isometry property (RIP), for subspace pursuit (SP) and compressed sampling matching pursuit (CoSaMP), two relaxed sufficient conditions are presented under total perturbations to guarantee that the sparse vector x is recovered. Taking random matrix as measurement matrix, we also discuss the advantage of our condition. Numerical experiments validate that SP and CoSaMP can provide oracle-order recovery performance.
Multi-energy CT based on a prior rank, intensity and sparsity model (PRISM).
Gao, Hao; Yu, Hengyong; Osher, Stanley; Wang, Ge
2011-11-01
We propose a compressive sensing approach for multi-energy computed tomography (CT), namely the prior rank, intensity and sparsity model (PRISM). To further compress the multi-energy image for allowing the reconstruction with fewer CT data and less radiation dose, the PRISM models a multi-energy image as the superposition of a low-rank matrix and a sparse matrix (with row dimension in space and column dimension in energy), where the low-rank matrix corresponds to the stationary background over energy that has a low matrix rank, and the sparse matrix represents the rest of distinct spectral features that are often sparse. Distinct from previous methods, the PRISM utilizes the generalized rank, e.g., the matrix rank of tight-frame transform of a multi-energy image, which offers a way to characterize the multi-level and multi-filtered image coherence across the energy spectrum. Besides, the energy-dependent intensity information can be incorporated into the PRISM in terms of the spectral curves for base materials, with which the restoration of the multi-energy image becomes the reconstruction of the energy-independent material composition matrix. In other words, the PRISM utilizes prior knowledge on the generalized rank and sparsity of a multi-energy image, and intensity/spectral characteristics of base materials. Furthermore, we develop an accurate and fast split Bregman method for the PRISM and demonstrate the superior performance of the PRISM relative to several competing methods in simulations.
Cao, Buwen; Deng, Shuguang; Qin, Hua; Ding, Pingjian; Chen, Shaopeng; Li, Guanghui
2018-06-15
High-throughput technology has generated large-scale protein interaction data, which is crucial in our understanding of biological organisms. Many complex identification algorithms have been developed to determine protein complexes. However, these methods are only suitable for dense protein interaction networks, because their capabilities decrease rapidly when applied to sparse protein⁻protein interaction (PPI) networks. In this study, based on penalized matrix decomposition ( PMD ), a novel method of penalized matrix decomposition for the identification of protein complexes (i.e., PMD pc ) was developed to detect protein complexes in the human protein interaction network. This method mainly consists of three steps. First, the adjacent matrix of the protein interaction network is normalized. Second, the normalized matrix is decomposed into three factor matrices. The PMD pc method can detect protein complexes in sparse PPI networks by imposing appropriate constraints on factor matrices. Finally, the results of our method are compared with those of other methods in human PPI network. Experimental results show that our method can not only outperform classical algorithms, such as CFinder, ClusterONE, RRW, HC-PIN, and PCE-FR, but can also achieve an ideal overall performance in terms of a composite score consisting of F-measure, accuracy (ACC), and the maximum matching ratio (MMR).
Target detection in GPR data using joint low-rank and sparsity constraints
NASA Astrophysics Data System (ADS)
Bouzerdoum, Abdesselam; Tivive, Fok Hing Chi; Abeynayake, Canicious
2016-05-01
In ground penetrating radars, background clutter, which comprises the signals backscattered from the rough, uneven ground surface and the background noise, impairs the visualization of buried objects and subsurface inspections. In this paper, a clutter mitigation method is proposed for target detection. The removal of background clutter is formulated as a constrained optimization problem to obtain a low-rank matrix and a sparse matrix. The low-rank matrix captures the ground surface reflections and the background noise, whereas the sparse matrix contains the target reflections. An optimization method based on split-Bregman algorithm is developed to estimate these two matrices from the input GPR data. Evaluated on real radar data, the proposed method achieves promising results in removing the background clutter and enhancing the target signature.
Sparse image reconstruction for molecular imaging.
Ting, Michael; Raich, Raviv; Hero, Alfred O
2009-06-01
The application that motivates this paper is molecular imaging at the atomic level. When discretized at subatomic distances, the volume is inherently sparse. Noiseless measurements from an imaging technology can be modeled by convolution of the image with the system point spread function (psf). Such is the case with magnetic resonance force microscopy (MRFM), an emerging technology where imaging of an individual tobacco mosaic virus was recently demonstrated with nanometer resolution. We also consider additive white Gaussian noise (AWGN) in the measurements. Many prior works of sparse estimators have focused on the case when H has low coherence; however, the system matrix H in our application is the convolution matrix for the system psf. A typical convolution matrix has high coherence. This paper, therefore, does not assume a low coherence H. A discrete-continuous form of the Laplacian and atom at zero (LAZE) p.d.f. used by Johnstone and Silverman is formulated, and two sparse estimators derived by maximizing the joint p.d.f. of the observation and image conditioned on the hyperparameters. A thresholding rule that generalizes the hard and soft thresholding rule appears in the course of the derivation. This so-called hybrid thresholding rule, when used in the iterative thresholding framework, gives rise to the hybrid estimator, a generalization of the lasso. Estimates of the hyperparameters for the lasso and hybrid estimator are obtained via Stein's unbiased risk estimate (SURE). A numerical study with a Gaussian psf and two sparse images shows that the hybrid estimator outperforms the lasso.
Total variation-based method for radar coincidence imaging with model mismatch for extended target
NASA Astrophysics Data System (ADS)
Cao, Kaicheng; Zhou, Xiaoli; Cheng, Yongqiang; Fan, Bo; Qin, Yuliang
2017-11-01
Originating from traditional optical coincidence imaging, radar coincidence imaging (RCI) is a staring/forward-looking imaging technique. In RCI, the reference matrix must be computed precisely to reconstruct the image as preferred; unfortunately, such precision is almost impossible due to the existence of model mismatch in practical applications. Although some conventional sparse recovery algorithms are proposed to solve the model-mismatch problem, they are inapplicable to nonsparse targets. We therefore sought to derive the signal model of RCI with model mismatch by replacing the sparsity constraint item with total variation (TV) regularization in the sparse total least squares optimization problem; in this manner, we obtain the objective function of RCI with model mismatch for an extended target. A more robust and efficient algorithm called TV-TLS is proposed, in which the objective function is divided into two parts and the perturbation matrix and scattering coefficients are updated alternately. Moreover, due to the ability of TV regularization to recover sparse signal or image with sparse gradient, TV-TLS method is also applicable to sparse recovering. Results of numerical experiments demonstrate that, for uniform extended targets, sparse targets, and real extended targets, the algorithm can achieve preferred imaging performance both in suppressing noise and in adapting to model mismatch.
Randomized subspace-based robust principal component analysis for hyperspectral anomaly detection
NASA Astrophysics Data System (ADS)
Sun, Weiwei; Yang, Gang; Li, Jialin; Zhang, Dianfa
2018-01-01
A randomized subspace-based robust principal component analysis (RSRPCA) method for anomaly detection in hyperspectral imagery (HSI) is proposed. The RSRPCA combines advantages of randomized column subspace and robust principal component analysis (RPCA). It assumes that the background has low-rank properties, and the anomalies are sparse and do not lie in the column subspace of the background. First, RSRPCA implements random sampling to sketch the original HSI dataset from columns and to construct a randomized column subspace of the background. Structured random projections are also adopted to sketch the HSI dataset from rows. Sketching from columns and rows could greatly reduce the computational requirements of RSRPCA. Second, the RSRPCA adopts the columnwise RPCA (CWRPCA) to eliminate negative effects of sampled anomaly pixels and that purifies the previous randomized column subspace by removing sampled anomaly columns. The CWRPCA decomposes the submatrix of the HSI data into a low-rank matrix (i.e., background component), a noisy matrix (i.e., noise component), and a sparse anomaly matrix (i.e., anomaly component) with only a small proportion of nonzero columns. The algorithm of inexact augmented Lagrange multiplier is utilized to optimize the CWRPCA problem and estimate the sparse matrix. Nonzero columns of the sparse anomaly matrix point to sampled anomaly columns in the submatrix. Third, all the pixels are projected onto the complemental subspace of the purified randomized column subspace of the background and the anomaly pixels in the original HSI data are finally exactly located. Several experiments on three real hyperspectral images are carefully designed to investigate the detection performance of RSRPCA, and the results are compared with four state-of-the-art methods. Experimental results show that the proposed RSRPCA outperforms four comparison methods both in detection performance and in computational time.
Highly parallel sparse Cholesky factorization
NASA Technical Reports Server (NTRS)
Gilbert, John R.; Schreiber, Robert
1990-01-01
Several fine grained parallel algorithms were developed and compared to compute the Cholesky factorization of a sparse matrix. The experimental implementations are on the Connection Machine, a distributed memory SIMD machine whose programming model conceptually supplies one processor per data element. In contrast to special purpose algorithms in which the matrix structure conforms to the connection structure of the machine, the focus is on matrices with arbitrary sparsity structure. The most promising algorithm is one whose inner loop performs several dense factorizations simultaneously on a 2-D grid of processors. Virtually any massively parallel dense factorization algorithm can be used as the key subroutine. The sparse code attains execution rates comparable to those of the dense subroutine. Although at present architectural limitations prevent the dense factorization from realizing its potential efficiency, it is concluded that a regular data parallel architecture can be used efficiently to solve arbitrarily structured sparse problems. A performance model is also presented and it is used to analyze the algorithms.
Low-Rank Correction Methods for Algebraic Domain Decomposition Preconditioners
Li, Ruipeng; Saad, Yousef
2017-08-01
This study presents a parallel preconditioning method for distributed sparse linear systems, based on an approximate inverse of the original matrix, that adopts a general framework of distributed sparse matrices and exploits domain decomposition (DD) and low-rank corrections. The DD approach decouples the matrix and, once inverted, a low-rank approximation is applied by exploiting the Sherman--Morrison--Woodbury formula, which yields two variants of the preconditioning methods. The low-rank expansion is computed by the Lanczos procedure with reorthogonalizations. Numerical experiments indicate that, when combined with Krylov subspace accelerators, this preconditioner can be efficient and robust for solving symmetric sparse linear systems. Comparisonsmore » with pARMS, a DD-based parallel incomplete LU (ILU) preconditioning method, are presented for solving Poisson's equation and linear elasticity problems.« less
Low-Rank Correction Methods for Algebraic Domain Decomposition Preconditioners
DOE Office of Scientific and Technical Information (OSTI.GOV)
Li, Ruipeng; Saad, Yousef
This study presents a parallel preconditioning method for distributed sparse linear systems, based on an approximate inverse of the original matrix, that adopts a general framework of distributed sparse matrices and exploits domain decomposition (DD) and low-rank corrections. The DD approach decouples the matrix and, once inverted, a low-rank approximation is applied by exploiting the Sherman--Morrison--Woodbury formula, which yields two variants of the preconditioning methods. The low-rank expansion is computed by the Lanczos procedure with reorthogonalizations. Numerical experiments indicate that, when combined with Krylov subspace accelerators, this preconditioner can be efficient and robust for solving symmetric sparse linear systems. Comparisonsmore » with pARMS, a DD-based parallel incomplete LU (ILU) preconditioning method, are presented for solving Poisson's equation and linear elasticity problems.« less
HIGH DIMENSIONAL COVARIANCE MATRIX ESTIMATION IN APPROXIMATE FACTOR MODELS.
Fan, Jianqing; Liao, Yuan; Mincheva, Martina
2011-01-01
The variance covariance matrix plays a central role in the inferential theories of high dimensional factor models in finance and economics. Popular regularization methods of directly exploiting sparsity are not directly applicable to many financial problems. Classical methods of estimating the covariance matrices are based on the strict factor models, assuming independent idiosyncratic components. This assumption, however, is restrictive in practical applications. By assuming sparse error covariance matrix, we allow the presence of the cross-sectional correlation even after taking out common factors, and it enables us to combine the merits of both methods. We estimate the sparse covariance using the adaptive thresholding technique as in Cai and Liu (2011), taking into account the fact that direct observations of the idiosyncratic components are unavailable. The impact of high dimensionality on the covariance matrix estimation based on the factor structure is then studied.
High-performance equation solvers and their impact on finite element analysis
NASA Technical Reports Server (NTRS)
Poole, Eugene L.; Knight, Norman F., Jr.; Davis, D. Dale, Jr.
1990-01-01
The role of equation solvers in modern structural analysis software is described. Direct and iterative equation solvers which exploit vectorization on modern high-performance computer systems are described and compared. The direct solvers are two Cholesky factorization methods. The first method utilizes a novel variable-band data storage format to achieve very high computation rates and the second method uses a sparse data storage format designed to reduce the number of operations. The iterative solvers are preconditioned conjugate gradient methods. Two different preconditioners are included; the first uses a diagonal matrix storage scheme to achieve high computation rates and the second requires a sparse data storage scheme and converges to the solution in fewer iterations that the first. The impact of using all of the equation solvers in a common structural analysis software system is demonstrated by solving several representative structural analysis problems.
High-performance equation solvers and their impact on finite element analysis
NASA Technical Reports Server (NTRS)
Poole, Eugene L.; Knight, Norman F., Jr.; Davis, D. D., Jr.
1992-01-01
The role of equation solvers in modern structural analysis software is described. Direct and iterative equation solvers which exploit vectorization on modern high-performance computer systems are described and compared. The direct solvers are two Cholesky factorization methods. The first method utilizes a novel variable-band data storage format to achieve very high computation rates and the second method uses a sparse data storage format designed to reduce the number od operations. The iterative solvers are preconditioned conjugate gradient methods. Two different preconditioners are included; the first uses a diagonal matrix storage scheme to achieve high computation rates and the second requires a sparse data storage scheme and converges to the solution in fewer iterations that the first. The impact of using all of the equation solvers in a common structural analysis software system is demonstrated by solving several representative structural analysis problems.
SAMBA: Sparse Approximation of Moment-Based Arbitrary Polynomial Chaos
NASA Astrophysics Data System (ADS)
Ahlfeld, R.; Belkouchi, B.; Montomoli, F.
2016-09-01
A new arbitrary Polynomial Chaos (aPC) method is presented for moderately high-dimensional problems characterised by limited input data availability. The proposed methodology improves the algorithm of aPC and extends the method, that was previously only introduced as tensor product expansion, to moderately high-dimensional stochastic problems. The fundamental idea of aPC is to use the statistical moments of the input random variables to develop the polynomial chaos expansion. This approach provides the possibility to propagate continuous or discrete probability density functions and also histograms (data sets) as long as their moments exist, are finite and the determinant of the moment matrix is strictly positive. For cases with limited data availability, this approach avoids bias and fitting errors caused by wrong assumptions. In this work, an alternative way to calculate the aPC is suggested, which provides the optimal polynomials, Gaussian quadrature collocation points and weights from the moments using only a handful of matrix operations on the Hankel matrix of moments. It can therefore be implemented without requiring prior knowledge about statistical data analysis or a detailed understanding of the mathematics of polynomial chaos expansions. The extension to more input variables suggested in this work, is an anisotropic and adaptive version of Smolyak's algorithm that is solely based on the moments of the input probability distributions. It is referred to as SAMBA (PC), which is short for Sparse Approximation of Moment-Based Arbitrary Polynomial Chaos. It is illustrated that for moderately high-dimensional problems (up to 20 different input variables or histograms) SAMBA can significantly simplify the calculation of sparse Gaussian quadrature rules. SAMBA's efficiency for multivariate functions with regard to data availability is further demonstrated by analysing higher order convergence and accuracy for a set of nonlinear test functions with 2, 5 and 10 different input distributions or histograms.
A physiologically motivated sparse, compact, and smooth (SCS) approach to EEG source localization.
Cao, Cheng; Akalin Acar, Zeynep; Kreutz-Delgado, Kenneth; Makeig, Scott
2012-01-01
Here, we introduce a novel approach to the EEG inverse problem based on the assumption that principal cortical sources of multi-channel EEG recordings may be assumed to be spatially sparse, compact, and smooth (SCS). To enforce these characteristics of solutions to the EEG inverse problem, we propose a correlation-variance model which factors a cortical source space covariance matrix into the multiplication of a pre-given correlation coefficient matrix and the square root of the diagonal variance matrix learned from the data under a Bayesian learning framework. We tested the SCS method using simulated EEG data with various SNR and applied it to a real ECOG data set. We compare the results of SCS to those of an established SBL algorithm.
NASA Astrophysics Data System (ADS)
Chen, Y.-M.; Koniges, A. E.; Anderson, D. V.
1989-10-01
The biconjugate gradient method (BCG) provides an attractive alternative to the usual conjugate gradient algorithms for the solution of sparse systems of linear equations with nonsymmetric and indefinite matrix operators. A preconditioned algorithm is given, whose form resembles the incomplete L-U conjugate gradient scheme (ILUCG2) previously presented. Although the BCG scheme requires the storage of two additional vectors, it converges in a significantly lesser number of iterations (often half), while the number of calculations per iteration remains essentially the same.
Elongation cutoff technique armed with quantum fast multipole method for linear scaling.
Korchowiec, Jacek; Lewandowski, Jakub; Makowski, Marcin; Gu, Feng Long; Aoki, Yuriko
2009-11-30
A linear-scaling implementation of the elongation cutoff technique (ELG/C) that speeds up Hartree-Fock (HF) self-consistent field calculations is presented. The cutoff method avoids the known bottleneck of the conventional HF scheme, that is, diagonalization, because it operates within the low dimension subspace of the whole atomic orbital space. The efficiency of ELG/C is illustrated for two model systems. The obtained results indicate that the ELG/C is a very efficient sparse matrix algebra scheme. Copyright 2009 Wiley Periodicals, Inc.
Comparing implementations of penalized weighted least-squares sinogram restoration.
Forthmann, Peter; Koehler, Thomas; Defrise, Michel; La Riviere, Patrick
2010-11-01
A CT scanner measures the energy that is deposited in each channel of a detector array by x rays that have been partially absorbed on their way through the object. The measurement process is complex and quantitative measurements are always and inevitably associated with errors, so CT data must be preprocessed prior to reconstruction. In recent years, the authors have formulated CT sinogram preprocessing as a statistical restoration problem in which the goal is to obtain the best estimate of the line integrals needed for reconstruction from the set of noisy, degraded measurements. The authors have explored both penalized Poisson likelihood (PL) and penalized weighted least-squares (PWLS) objective functions. At low doses, the authors found that the PL approach outperforms PWLS in terms of resolution-noise tradeoffs, but at standard doses they perform similarly. The PWLS objective function, being quadratic, is more amenable to computational acceleration than the PL objective. In this work, the authors develop and compare two different methods for implementing PWLS sinogram restoration with the hope of improving computational performance relative to PL in the standard-dose regime. Sinogram restoration is still significant in the standard-dose regime since it can still outperform standard approaches and it allows for correction of effects that are not usually modeled in standard CT preprocessing. The authors have explored and compared two implementation strategies for PWLS sinogram restoration: (1) A direct matrix-inversion strategy based on the closed-form solution to the PWLS optimization problem and (2) an iterative approach based on the conjugate-gradient algorithm. Obtaining optimal performance from each strategy required modifying the naive off-the-shelf implementations of the algorithms to exploit the particular symmetry and sparseness of the sinogram-restoration problem. For the closed-form approach, the authors subdivided the large matrix inversion into smaller coupled problems and exploited sparseness to minimize matrix operations. For the conjugate-gradient approach, the authors exploited sparseness and preconditioned the problem to speed up convergence. All methods produced qualitatively and quantitatively similar images as measured by resolution-variance tradeoffs and difference images. Despite the acceleration strategies, the direct matrix-inversion approach was found to be uncompetitive with iterative approaches, with a computational burden higher by an order of magnitude or more. The iterative conjugate-gradient approach, however, does appear promising, with computation times half that of the authors' previous penalized-likelihood implementation. Iterative conjugate-gradient based PWLS sinogram restoration with careful matrix optimizations has computational advantages over direct matrix PWLS inversion and over penalized-likelihood sinogram restoration and can be considered a good alternative in standard-dose regimes.
A performance study of sparse Cholesky factorization on INTEL iPSC/860
NASA Technical Reports Server (NTRS)
Zubair, M.; Ghose, M.
1992-01-01
The problem of Cholesky factorization of a sparse matrix has been very well investigated on sequential machines. A number of efficient codes exist for factorizing large unstructured sparse matrices. However, there is a lack of such efficient codes on parallel machines in general, and distributed machines in particular. Some of the issues that are critical to the implementation of sparse Cholesky factorization on a distributed memory parallel machine are ordering, partitioning and mapping, load balancing, and ordering of various tasks within a processor. Here, we focus on the effect of various partitioning schemes on the performance of sparse Cholesky factorization on the Intel iPSC/860. Also, a new partitioning heuristic for structured as well as unstructured sparse matrices is proposed, and its performance is compared with other schemes.
Sparse Gaussian elimination with controlled fill-in on a shared memory multiprocessor
NASA Technical Reports Server (NTRS)
Alaghband, Gita; Jordan, Harry F.
1989-01-01
It is shown that in sparse matrices arising from electronic circuits, it is possible to do computations on many diagonal elements simultaneously. A technique for obtaining an ordered compatible set directly from the ordered incompatible table is given. The ordering is based on the Markowitz number of the pivot candidates. This technique generates a set of compatible pivots with the property of generating few fills. A novel heuristic algorithm is presented that combines the idea of an order-compatible set with a limited binary tree search to generate several sets of compatible pivots in linear time. An elimination set for reducing the matrix is generated and selected on the basis of a minimum Markowitz sum number. The parallel pivoting technique presented is a stepwise algorithm and can be applied to any submatrix of the original matrix. Thus, it is not a preordering of the sparse matrix and is applied dynamically as the decomposition proceeds. Parameters are suggested to obtain a balance between parallelism and fill-ins. Results of applying the proposed algorithms on several large application matrices using the HEP multiprocessor (Kowalik, 1985) are presented and analyzed.
Non-convex Statistical Optimization for Sparse Tensor Graphical Model
Sun, Wei; Wang, Zhaoran; Liu, Han; Cheng, Guang
2016-01-01
We consider the estimation of sparse graphical models that characterize the dependency structure of high-dimensional tensor-valued data. To facilitate the estimation of the precision matrix corresponding to each way of the tensor, we assume the data follow a tensor normal distribution whose covariance has a Kronecker product structure. The penalized maximum likelihood estimation of this model involves minimizing a non-convex objective function. In spite of the non-convexity of this estimation problem, we prove that an alternating minimization algorithm, which iteratively estimates each sparse precision matrix while fixing the others, attains an estimator with the optimal statistical rate of convergence as well as consistent graph recovery. Notably, such an estimator achieves estimation consistency with only one tensor sample, which is unobserved in previous work. Our theoretical results are backed by thorough numerical studies. PMID:28316459
Exploiting Multiple Levels of Parallelism in Sparse Matrix-Matrix Multiplication
Azad, Ariful; Ballard, Grey; Buluc, Aydin; ...
2016-11-08
Sparse matrix-matrix multiplication (or SpGEMM) is a key primitive for many high-performance graph algorithms as well as for some linear solvers, such as algebraic multigrid. The scaling of existing parallel implementations of SpGEMM is heavily bound by communication. Even though 3D (or 2.5D) algorithms have been proposed and theoretically analyzed in the flat MPI model on Erdös-Rényi matrices, those algorithms had not been implemented in practice and their complexities had not been analyzed for the general case. In this work, we present the first implementation of the 3D SpGEMM formulation that exploits multiple (intranode and internode) levels of parallelism, achievingmore » significant speedups over the state-of-the-art publicly available codes at all levels of concurrencies. We extensively evaluate our implementation and identify bottlenecks that should be subject to further research.« less
HIGH DIMENSIONAL COVARIANCE MATRIX ESTIMATION IN APPROXIMATE FACTOR MODELS
Fan, Jianqing; Liao, Yuan; Mincheva, Martina
2012-01-01
The variance covariance matrix plays a central role in the inferential theories of high dimensional factor models in finance and economics. Popular regularization methods of directly exploiting sparsity are not directly applicable to many financial problems. Classical methods of estimating the covariance matrices are based on the strict factor models, assuming independent idiosyncratic components. This assumption, however, is restrictive in practical applications. By assuming sparse error covariance matrix, we allow the presence of the cross-sectional correlation even after taking out common factors, and it enables us to combine the merits of both methods. We estimate the sparse covariance using the adaptive thresholding technique as in Cai and Liu (2011), taking into account the fact that direct observations of the idiosyncratic components are unavailable. The impact of high dimensionality on the covariance matrix estimation based on the factor structure is then studied. PMID:22661790
Xi, Jianing; Wang, Minghui; Li, Ao
2018-06-05
Discovery of mutated driver genes is one of the primary objective for studying tumorigenesis. To discover some relatively low frequently mutated driver genes from somatic mutation data, many existing methods incorporate interaction network as prior information. However, the prior information of mRNA expression patterns are not exploited by these existing network-based methods, which is also proven to be highly informative of cancer progressions. To incorporate prior information from both interaction network and mRNA expressions, we propose a robust and sparse co-regularized nonnegative matrix factorization to discover driver genes from mutation data. Furthermore, our framework also conducts Frobenius norm regularization to overcome overfitting issue. Sparsity-inducing penalty is employed to obtain sparse scores in gene representations, of which the top scored genes are selected as driver candidates. Evaluation experiments by known benchmarking genes indicate that the performance of our method benefits from the two type of prior information. Our method also outperforms the existing network-based methods, and detect some driver genes that are not predicted by the competing methods. In summary, our proposed method can improve the performance of driver gene discovery by effectively incorporating prior information from interaction network and mRNA expression patterns into a robust and sparse co-regularized matrix factorization framework.
Large-region acoustic source mapping using a movable array and sparse covariance fitting.
Zhao, Shengkui; Tuna, Cagdas; Nguyen, Thi Ngoc Tho; Jones, Douglas L
2017-01-01
Large-region acoustic source mapping is important for city-scale noise monitoring. Approaches using a single-position measurement scheme to scan large regions using small arrays cannot provide clean acoustic source maps, while deploying large arrays spanning the entire region of interest is prohibitively expensive. A multiple-position measurement scheme is applied to scan large regions at multiple spatial positions using a movable array of small size. Based on the multiple-position measurement scheme, a sparse-constrained multiple-position vectorized covariance matrix fitting approach is presented. In the proposed approach, the overall sample covariance matrix of the incoherent virtual array is first estimated using the multiple-position array data and then vectorized using the Khatri-Rao (KR) product. A linear model is then constructed for fitting the vectorized covariance matrix and a sparse-constrained reconstruction algorithm is proposed for recovering source powers from the model. The user parameter settings are discussed. The proposed approach is tested on a 30 m × 40 m region and a 60 m × 40 m region using simulated and measured data. Much cleaner acoustic source maps and lower sound pressure level errors are obtained compared to the beamforming approaches and the previous sparse approach [Zhao, Tuna, Nguyen, and Jones, Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Processing (ICASSP) (2016)].
Dynamic Textures Modeling via Joint Video Dictionary Learning.
Wei, Xian; Li, Yuanxiang; Shen, Hao; Chen, Fang; Kleinsteuber, Martin; Wang, Zhongfeng
2017-04-06
Video representation is an important and challenging task in the computer vision community. In this paper, we consider the problem of modeling and classifying video sequences of dynamic scenes which could be modeled in a dynamic textures (DT) framework. At first, we assume that image frames of a moving scene can be modeled as a Markov random process. We propose a sparse coding framework, named joint video dictionary learning (JVDL), to model a video adaptively. By treating the sparse coefficients of image frames over a learned dictionary as the underlying "states", we learn an efficient and robust linear transition matrix between two adjacent frames of sparse events in time series. Hence, a dynamic scene sequence is represented by an appropriate transition matrix associated with a dictionary. In order to ensure the stability of JVDL, we impose several constraints on such transition matrix and dictionary. The developed framework is able to capture the dynamics of a moving scene by exploring both sparse properties and the temporal correlations of consecutive video frames. Moreover, such learned JVDL parameters can be used for various DT applications, such as DT synthesis and recognition. Experimental results demonstrate the strong competitiveness of the proposed JVDL approach in comparison with state-of-the-art video representation methods. Especially, it performs significantly better in dealing with DT synthesis and recognition on heavily corrupted data.
A Sparse Matrix Approach for Simultaneous Quantification of Nystagmus and Saccade
NASA Technical Reports Server (NTRS)
Kukreja, Sunil L.; Stone, Lee; Boyle, Richard D.
2012-01-01
The vestibulo-ocular reflex (VOR) consists of two intermingled non-linear subsystems; namely, nystagmus and saccade. Typically, nystagmus is analysed using a single sufficiently long signal or a concatenation of them. Saccade information is not analysed and discarded due to insufficient data length to provide consistent and minimum variance estimates. This paper presents a novel sparse matrix approach to system identification of the VOR. It allows for the simultaneous estimation of both nystagmus and saccade signals. We show via simulation of the VOR that our technique provides consistent and unbiased estimates in the presence of output additive noise.
On-Chip Neural Data Compression Based On Compressed Sensing With Sparse Sensing Matrices.
Zhao, Wenfeng; Sun, Biao; Wu, Tong; Yang, Zhi
2018-02-01
On-chip neural data compression is an enabling technique for wireless neural interfaces that suffer from insufficient bandwidth and power budgets to transmit the raw data. The data compression algorithm and its implementation should be power and area efficient and functionally reliable over different datasets. Compressed sensing is an emerging technique that has been applied to compress various neurophysiological data. However, the state-of-the-art compressed sensing (CS) encoders leverage random but dense binary measurement matrices, which incur substantial implementation costs on both power and area that could offset the benefits from the reduced wireless data rate. In this paper, we propose two CS encoder designs based on sparse measurement matrices that could lead to efficient hardware implementation. Specifically, two different approaches for the construction of sparse measurement matrices, i.e., the deterministic quasi-cyclic array code (QCAC) matrix and -sparse random binary matrix [-SRBM] are exploited. We demonstrate that the proposed CS encoders lead to comparable recovery performance. And efficient VLSI architecture designs are proposed for QCAC-CS and -SRBM encoders with reduced area and total power consumption.
Exact recovery of sparse multiple measurement vectors by [Formula: see text]-minimization.
Wang, Changlong; Peng, Jigen
2018-01-01
The joint sparse recovery problem is a generalization of the single measurement vector problem widely studied in compressed sensing. It aims to recover a set of jointly sparse vectors, i.e., those that have nonzero entries concentrated at a common location. Meanwhile [Formula: see text]-minimization subject to matrixes is widely used in a large number of algorithms designed for this problem, i.e., [Formula: see text]-minimization [Formula: see text] Therefore the main contribution in this paper is two theoretical results about this technique. The first one is proving that in every multiple system of linear equations there exists a constant [Formula: see text] such that the original unique sparse solution also can be recovered from a minimization in [Formula: see text] quasi-norm subject to matrixes whenever [Formula: see text]. The other one is showing an analytic expression of such [Formula: see text]. Finally, we display the results of one example to confirm the validity of our conclusions, and we use some numerical experiments to show that we increase the efficiency of these algorithms designed for [Formula: see text]-minimization by using our results.
Smoothed low rank and sparse matrix recovery by iteratively reweighted least squares minimization.
Lu, Canyi; Lin, Zhouchen; Yan, Shuicheng
2015-02-01
This paper presents a general framework for solving the low-rank and/or sparse matrix minimization problems, which may involve multiple nonsmooth terms. The iteratively reweighted least squares (IRLSs) method is a fast solver, which smooths the objective function and minimizes it by alternately updating the variables and their weights. However, the traditional IRLS can only solve a sparse only or low rank only minimization problem with squared loss or an affine constraint. This paper generalizes IRLS to solve joint/mixed low-rank and sparse minimization problems, which are essential formulations for many tasks. As a concrete example, we solve the Schatten-p norm and l2,q-norm regularized low-rank representation problem by IRLS, and theoretically prove that the derived solution is a stationary point (globally optimal if p,q ≥ 1). Our convergence proof of IRLS is more general than previous one that depends on the special properties of the Schatten-p norm and l2,q-norm. Extensive experiments on both synthetic and real data sets demonstrate that our IRLS is much more efficient.
Shen, Hong-Bin
2011-01-01
Modern science of networks has brought significant advances to our understanding of complex systems biology. As a representative model of systems biology, Protein Interaction Networks (PINs) are characterized by a remarkable modular structures, reflecting functional associations between their components. Many methods were proposed to capture cohesive modules so that there is a higher density of edges within modules than those across them. Recent studies reveal that cohesively interacting modules of proteins is not a universal organizing principle in PINs, which has opened up new avenues for revisiting functional modules in PINs. In this paper, functional clusters in PINs are found to be able to form unorthodox structures defined as bi-sparse module. In contrast to the traditional cohesive module, the nodes in the bi-sparse module are sparsely connected internally and densely connected with other bi-sparse or cohesive modules. We present a novel protocol called the BinTree Seeking (BTS) for mining both bi-sparse and cohesive modules in PINs based on Edge Density of Module (EDM) and matrix theory. BTS detects modules by depicting links and nodes rather than nodes alone and its derivation procedure is totally performed on adjacency matrix of networks. The number of modules in a PIN can be automatically determined in the proposed BTS approach. BTS is tested on three real PINs and the results demonstrate that functional modules in PINs are not dominantly cohesive but can be sparse. BTS software and the supporting information are available at: www.csbio.sjtu.edu.cn/bioinf/BTS/. PMID:22140454
A Spectral Algorithm for Envelope Reduction of Sparse Matrices
NASA Technical Reports Server (NTRS)
Barnard, Stephen T.; Pothen, Alex; Simon, Horst D.
1993-01-01
The problem of reordering a sparse symmetric matrix to reduce its envelope size is considered. A new spectral algorithm for computing an envelope-reducing reordering is obtained by associating a Laplacian matrix with the given matrix and then sorting the components of a specified eigenvector of the Laplacian. This Laplacian eigenvector solves a continuous relaxation of a discrete problem related to envelope minimization called the minimum 2-sum problem. The permutation vector computed by the spectral algorithm is a closest permutation vector to the specified Laplacian eigenvector. Numerical results show that the new reordering algorithm usually computes smaller envelope sizes than those obtained from the current standard algorithms such as Gibbs-Poole-Stockmeyer (GPS) or SPARSPAK reverse Cuthill-McKee (RCM), in some cases reducing the envelope by more than a factor of two.
The application of nonlinear programming and collocation to optimal aeroassisted orbital transfers
NASA Astrophysics Data System (ADS)
Shi, Y. Y.; Nelson, R. L.; Young, D. H.; Gill, P. E.; Murray, W.; Saunders, M. A.
1992-01-01
Sequential quadratic programming (SQP) and collocation of the differential equations of motion were applied to optimal aeroassisted orbital transfers. The Optimal Trajectory by Implicit Simulation (OTIS) computer program codes with updated nonlinear programming code (NZSOL) were used as a testbed for the SQP nonlinear programming (NLP) algorithms. The state-of-the-art sparse SQP method is considered to be effective for solving large problems with a sparse matrix. Sparse optimizers are characterized in terms of memory requirements and computational efficiency. For the OTIS problems, less than 10 percent of the Jacobian matrix elements are nonzero. The SQP method encompasses two phases: finding an initial feasible point by minimizing the sum of infeasibilities and minimizing the quadratic objective function within the feasible region. The orbital transfer problem under consideration involves the transfer from a high energy orbit to a low energy orbit.
Removing flicker based on sparse color correspondences in old film restoration
NASA Astrophysics Data System (ADS)
Huang, Xi; Ding, Youdong; Yu, Bing; Xia, Tianran
2018-04-01
In the long history of human civilization, archived film is an indispensable part of it, and using digital method to repair damaged film is also a mainstream trend nowadays. In this paper, we propose a sparse color correspondences based technique to remove fading flicker for old films. Our model, combined with multi frame images to establish a simple correction model, includes three key steps. Firstly, we recover sparse color correspondences in the input frames to build a matrix with many missing entries. Secondly, we present a low-rank matrix factorization approach to estimate the unknown parameters of this model. Finally, we adopt a two-step strategy that divide the estimated parameters into reference frame parameters for color recovery correction and other frame parameters for color consistency correction to remove flicker. Our method combined multi-frames takes continuity of the input sequence into account, and the experimental results show the method can remove fading flicker efficiently.
A Fast Gradient Method for Nonnegative Sparse Regression With Self-Dictionary
NASA Astrophysics Data System (ADS)
Gillis, Nicolas; Luce, Robert
2018-01-01
A nonnegative matrix factorization (NMF) can be computed efficiently under the separability assumption, which asserts that all the columns of the given input data matrix belong to the cone generated by a (small) subset of them. The provably most robust methods to identify these conic basis columns are based on nonnegative sparse regression and self dictionaries, and require the solution of large-scale convex optimization problems. In this paper we study a particular nonnegative sparse regression model with self dictionary. As opposed to previously proposed models, this model yields a smooth optimization problem where the sparsity is enforced through linear constraints. We show that the Euclidean projection on the polyhedron defined by these constraints can be computed efficiently, and propose a fast gradient method to solve our model. We compare our algorithm with several state-of-the-art methods on synthetic data sets and real-world hyperspectral images.
Sparse Covariance Matrix Estimation by DCA-Based Algorithms.
Phan, Duy Nhat; Le Thi, Hoai An; Dinh, Tao Pham
2017-11-01
This letter proposes a novel approach using the [Formula: see text]-norm regularization for the sparse covariance matrix estimation (SCME) problem. The objective function of SCME problem is composed of a nonconvex part and the [Formula: see text] term, which is discontinuous and difficult to tackle. Appropriate DC (difference of convex functions) approximations of [Formula: see text]-norm are used that result in approximation SCME problems that are still nonconvex. DC programming and DCA (DC algorithm), powerful tools in nonconvex programming framework, are investigated. Two DC formulations are proposed and corresponding DCA schemes developed. Two applications of the SCME problem that are considered are classification via sparse quadratic discriminant analysis and portfolio optimization. A careful empirical experiment is performed through simulated and real data sets to study the performance of the proposed algorithms. Numerical results showed their efficiency and their superiority compared with seven state-of-the-art methods.
NASA Astrophysics Data System (ADS)
Qin, Xulei; Cong, Zhibin; Fei, Baowei
2013-11-01
An automatic segmentation framework is proposed to segment the right ventricle (RV) in echocardiographic images. The method can automatically segment both epicardial and endocardial boundaries from a continuous echocardiography series by combining sparse matrix transform, a training model, and a localized region-based level set. First, the sparse matrix transform extracts main motion regions of the myocardium as eigen-images by analyzing the statistical information of the images. Second, an RV training model is registered to the eigen-images in order to locate the position of the RV. Third, the training model is adjusted and then serves as an optimized initialization for the segmentation of each image. Finally, based on the initializations, a localized, region-based level set algorithm is applied to segment both epicardial and endocardial boundaries in each echocardiograph. Three evaluation methods were used to validate the performance of the segmentation framework. The Dice coefficient measures the overall agreement between the manual and automatic segmentation. The absolute distance and the Hausdorff distance between the boundaries from manual and automatic segmentation were used to measure the accuracy of the segmentation. Ultrasound images of human subjects were used for validation. For the epicardial and endocardial boundaries, the Dice coefficients were 90.8 ± 1.7% and 87.3 ± 1.9%, the absolute distances were 2.0 ± 0.42 mm and 1.79 ± 0.45 mm, and the Hausdorff distances were 6.86 ± 1.71 mm and 7.02 ± 1.17 mm, respectively. The automatic segmentation method based on a sparse matrix transform and level set can provide a useful tool for quantitative cardiac imaging.
Masuda, Y; Misztal, I; Legarra, A; Tsuruta, S; Lourenco, D A L; Fragomeni, B O; Aguilar, I
2017-01-01
This paper evaluates an efficient implementation to multiply the inverse of a numerator relationship matrix for genotyped animals () by a vector (). The computation is required for solving mixed model equations in single-step genomic BLUP (ssGBLUP) with the preconditioned conjugate gradient (PCG). The inverse can be decomposed into sparse matrices that are blocks of the sparse inverse of a numerator relationship matrix () including genotyped animals and their ancestors. The elements of were rapidly calculated with the Henderson's rule and stored as sparse matrices in memory. Implementation of was by a series of sparse matrix-vector multiplications. Diagonal elements of , which were required as preconditioners in PCG, were approximated with a Monte Carlo method using 1,000 samples. The efficient implementation of was compared with explicit inversion of with 3 data sets including about 15,000, 81,000, and 570,000 genotyped animals selected from populations with 213,000, 8.2 million, and 10.7 million pedigree animals, respectively. The explicit inversion required 1.8 GB, 49 GB, and 2,415 GB (estimated) of memory, respectively, and 42 s, 56 min, and 13.5 d (estimated), respectively, for the computations. The efficient implementation required <1 MB, 2.9 GB, and 2.3 GB of memory, respectively, and <1 sec, 3 min, and 5 min, respectively, for setting up. Only <1 sec was required for the multiplication in each PCG iteration for any data sets. When the equations in ssGBLUP are solved with the PCG algorithm, is no longer a limiting factor in the computations.
Compressive sensing using optimized sensing matrix for face verification
NASA Astrophysics Data System (ADS)
Oey, Endra; Jeffry; Wongso, Kelvin; Tommy
2017-12-01
Biometric appears as one of the solutions which is capable in solving problems that occurred in the usage of password in terms of data access, for example there is possibility in forgetting password and hard to recall various different passwords. With biometrics, physical characteristics of a person can be captured and used in the identification process. In this research, facial biometric is used in the verification process to determine whether the user has the authority to access the data or not. Facial biometric is chosen as its low cost implementation and generate quite accurate result for user identification. Face verification system which is adopted in this research is Compressive Sensing (CS) technique, in which aims to reduce dimension size as well as encrypt data in form of facial test image where the image is represented in sparse signals. Encrypted data can be reconstructed using Sparse Coding algorithm. Two types of Sparse Coding namely Orthogonal Matching Pursuit (OMP) and Iteratively Reweighted Least Squares -ℓp (IRLS-ℓp) will be used for comparison face verification system research. Reconstruction results of sparse signals are then used to find Euclidean norm with the sparse signal of user that has been previously saved in system to determine the validity of the facial test image. Results of system accuracy obtained in this research are 99% in IRLS with time response of face verification for 4.917 seconds and 96.33% in OMP with time response of face verification for 0.4046 seconds with non-optimized sensing matrix, while 99% in IRLS with time response of face verification for 13.4791 seconds and 98.33% for OMP with time response of face verification for 3.1571 seconds with optimized sensing matrix.
Direction of Arrival Estimation for MIMO Radar via Unitary Nuclear Norm Minimization
Wang, Xianpeng; Huang, Mengxing; Wu, Xiaoqin; Bi, Guoan
2017-01-01
In this paper, we consider the direction of arrival (DOA) estimation issue of noncircular (NC) source in multiple-input multiple-output (MIMO) radar and propose a novel unitary nuclear norm minimization (UNNM) algorithm. In the proposed method, the noncircular properties of signals are used to double the virtual array aperture, and the real-valued data are obtained by utilizing unitary transformation. Then a real-valued block sparse model is established based on a novel over-complete dictionary, and a UNNM algorithm is formulated for recovering the block-sparse matrix. In addition, the real-valued NC-MUSIC spectrum is used to design a weight matrix for reweighting the nuclear norm minimization to achieve the enhanced sparsity of solutions. Finally, the DOA is estimated by searching the non-zero blocks of the recovered matrix. Because of using the noncircular properties of signals to extend the virtual array aperture and an additional real structure to suppress the noise, the proposed method provides better performance compared with the conventional sparse recovery based algorithms. Furthermore, the proposed method can handle the case of underdetermined DOA estimation. Simulation results show the effectiveness and advantages of the proposed method. PMID:28441770
Benzi, Michele; Evans, Thomas M.; Hamilton, Steven P.; ...
2017-03-05
Here, we consider hybrid deterministic-stochastic iterative algorithms for the solution of large, sparse linear systems. Starting from a convergent splitting of the coefficient matrix, we analyze various types of Monte Carlo acceleration schemes applied to the original preconditioned Richardson (stationary) iteration. We expect that these methods will have considerable potential for resiliency to faults when implemented on massively parallel machines. We also establish sufficient conditions for the convergence of the hybrid schemes, and we investigate different types of preconditioners including sparse approximate inverses. Numerical experiments on linear systems arising from the discretization of partial differential equations are presented.
Sparse Matrix Motivated Reconstruction of Far-Field Radiation Patterns
2015-03-01
method for base - station antenna radiation patterns. IEEE Antennas Propagation Magazine. 2001;43(2):132. 4. Vasiliadis TG, Dimitriou D, Sergiadis JD...algorithm based on sparse representations of radiation patterns using the inverse Discrete Fourier Transform (DFT) and the inverse Discrete Cosine...patterns using a Model- Based Parameter Estimation (MBPE) technique that reduces the computational time required to model radiation patterns. Another
Task Parallel Incomplete Cholesky Factorization using 2D Partitioned-Block Layout
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kim, Kyungjoo; Rajamanickam, Sivasankaran; Stelle, George Widgery
We introduce a task-parallel algorithm for sparse incomplete Cholesky factorization that utilizes a 2D sparse partitioned-block layout of a matrix. Our factorization algorithm follows the idea of algorithms-by-blocks by using the block layout. The algorithm-byblocks approach induces a task graph for the factorization. These tasks are inter-related to each other through their data dependences in the factorization algorithm. To process the tasks on various manycore architectures in a portable manner, we also present a portable tasking API that incorporates different tasking backends and device-specific features using an open-source framework for manycore platforms i.e., Kokkos. A performance evaluation is presented onmore » both Intel Sandybridge and Xeon Phi platforms for matrices from the University of Florida sparse matrix collection to illustrate merits of the proposed task-based factorization. Experimental results demonstrate that our task-parallel implementation delivers about 26.6x speedup (geometric mean) over single-threaded incomplete Choleskyby- blocks and 19.2x speedup over serial Cholesky performance which does not carry tasking overhead using 56 threads on the Intel Xeon Phi processor for sparse matrices arising from various application problems.« less
An efficient sparse matrix multiplication scheme for the CYBER 205 computer
NASA Technical Reports Server (NTRS)
Lambiotte, Jules J., Jr.
1988-01-01
This paper describes the development of an efficient algorithm for computing the product of a matrix and vector on a CYBER 205 vector computer. The desire to provide software which allows the user to choose between the often conflicting goals of minimizing central processing unit (CPU) time or storage requirements has led to a diagonal-based algorithm in which one of four types of storage is selected for each diagonal. The candidate storage types employed were chosen to be efficient on the CYBER 205 for diagonals which have nonzero structure which is dense, moderately sparse, very sparse and short, or very sparse and long; however, for many densities, no diagonal type is most efficient with respect to both resource requirements, and a trade-off must be made. For each diagonal, an initialization subroutine estimates the CPU time and storage required for each storage type based on results from previously performed numerical experimentation. These requirements are adjusted by weights provided by the user which reflect the relative importance the user places on the two resources. The adjusted resource requirements are then compared to select the most efficient storage and computational scheme.
Wen, Zaidao; Hou, Zaidao; Jiao, Licheng
2017-11-01
Discriminative dictionary learning (DDL) framework has been widely used in image classification which aims to learn some class-specific feature vectors as well as a representative dictionary according to a set of labeled training samples. However, interclass similarities and intraclass variances among input samples and learned features will generally weaken the representability of dictionary and the discrimination of feature vectors so as to degrade the classification performance. Therefore, how to explicitly represent them becomes an important issue. In this paper, we present a novel DDL framework with two-level low rank and group sparse decomposition model. In the first level, we learn a class-shared and several class-specific dictionaries, where a low rank and a group sparse regularization are, respectively, imposed on the corresponding feature matrices. In the second level, the class-specific feature matrix will be further decomposed into a low rank and a sparse matrix so that intraclass variances can be separated to concentrate the corresponding feature vectors. Extensive experimental results demonstrate the effectiveness of our model. Compared with the other state-of-the-arts on several popular image databases, our model can achieve a competitive or better performance in terms of the classification accuracy.
Nonlocal low-rank and sparse matrix decomposition for spectral CT reconstruction
NASA Astrophysics Data System (ADS)
Niu, Shanzhou; Yu, Gaohang; Ma, Jianhua; Wang, Jing
2018-02-01
Spectral computed tomography (CT) has been a promising technique in research and clinics because of its ability to produce improved energy resolution images with narrow energy bins. However, the narrow energy bin image is often affected by serious quantum noise because of the limited number of photons used in the corresponding energy bin. To address this problem, we present an iterative reconstruction method for spectral CT using nonlocal low-rank and sparse matrix decomposition (NLSMD), which exploits the self-similarity of patches that are collected in multi-energy images. Specifically, each set of patches can be decomposed into a low-rank component and a sparse component, and the low-rank component represents the stationary background over different energy bins, while the sparse component represents the rest of the different spectral features in individual energy bins. Subsequently, an effective alternating optimization algorithm was developed to minimize the associated objective function. To validate and evaluate the NLSMD method, qualitative and quantitative studies were conducted by using simulated and real spectral CT data. Experimental results show that the NLSMD method improves spectral CT images in terms of noise reduction, artifact suppression and resolution preservation.
Comparing implementations of penalized weighted least-squares sinogram restoration
Forthmann, Peter; Koehler, Thomas; Defrise, Michel; La Riviere, Patrick
2010-01-01
Purpose: A CT scanner measures the energy that is deposited in each channel of a detector array by x rays that have been partially absorbed on their way through the object. The measurement process is complex and quantitative measurements are always and inevitably associated with errors, so CT data must be preprocessed prior to reconstruction. In recent years, the authors have formulated CT sinogram preprocessing as a statistical restoration problem in which the goal is to obtain the best estimate of the line integrals needed for reconstruction from the set of noisy, degraded measurements. The authors have explored both penalized Poisson likelihood (PL) and penalized weighted least-squares (PWLS) objective functions. At low doses, the authors found that the PL approach outperforms PWLS in terms of resolution-noise tradeoffs, but at standard doses they perform similarly. The PWLS objective function, being quadratic, is more amenable to computational acceleration than the PL objective. In this work, the authors develop and compare two different methods for implementing PWLS sinogram restoration with the hope of improving computational performance relative to PL in the standard-dose regime. Sinogram restoration is still significant in the standard-dose regime since it can still outperform standard approaches and it allows for correction of effects that are not usually modeled in standard CT preprocessing. Methods: The authors have explored and compared two implementation strategies for PWLS sinogram restoration: (1) A direct matrix-inversion strategy based on the closed-form solution to the PWLS optimization problem and (2) an iterative approach based on the conjugate-gradient algorithm. Obtaining optimal performance from each strategy required modifying the naive off-the-shelf implementations of the algorithms to exploit the particular symmetry and sparseness of the sinogram-restoration problem. For the closed-form approach, the authors subdivided the large matrix inversion into smaller coupled problems and exploited sparseness to minimize matrix operations. For the conjugate-gradient approach, the authors exploited sparseness and preconditioned the problem to speed up convergence. Results: All methods produced qualitatively and quantitatively similar images as measured by resolution-variance tradeoffs and difference images. Despite the acceleration strategies, the direct matrix-inversion approach was found to be uncompetitive with iterative approaches, with a computational burden higher by an order of magnitude or more. The iterative conjugate-gradient approach, however, does appear promising, with computation times half that of the authors’ previous penalized-likelihood implementation. Conclusions: Iterative conjugate-gradient based PWLS sinogram restoration with careful matrix optimizations has computational advantages over direct matrix PWLS inversion and over penalized-likelihood sinogram restoration and can be considered a good alternative in standard-dose regimes. PMID:21158306
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rouet, François-Henry; Li, Xiaoye S.; Ghysels, Pieter
In this paper, we present a distributed-memory library for computations with dense structured matrices. A matrix is considered structured if its off-diagonal blocks can be approximated by a rank-deficient matrix with low numerical rank. Here, we use Hierarchically Semi-Separable (HSS) representations. Such matrices appear in many applications, for example, finite-element methods, boundary element methods, and so on. Exploiting this structure allows for fast solution of linear systems and/or fast computation of matrix-vector products, which are the two main building blocks of matrix computations. The compression algorithm that we use, that computes the HSS form of an input dense matrix, reliesmore » on randomized sampling with a novel adaptive sampling mechanism. We discuss the parallelization of this algorithm and also present the parallelization of structured matrix-vector product, structured factorization, and solution routines. The efficiency of the approach is demonstrated on large problems from different academic and industrial applications, on up to 8,000 cores. Finally, this work is part of a more global effort, the STRUctured Matrices PACKage (STRUMPACK) software package for computations with sparse and dense structured matrices. Hence, although useful on their own right, the routines also represent a step in the direction of a distributed-memory sparse solver.« less
Rouet, François-Henry; Li, Xiaoye S.; Ghysels, Pieter; ...
2016-06-30
In this paper, we present a distributed-memory library for computations with dense structured matrices. A matrix is considered structured if its off-diagonal blocks can be approximated by a rank-deficient matrix with low numerical rank. Here, we use Hierarchically Semi-Separable (HSS) representations. Such matrices appear in many applications, for example, finite-element methods, boundary element methods, and so on. Exploiting this structure allows for fast solution of linear systems and/or fast computation of matrix-vector products, which are the two main building blocks of matrix computations. The compression algorithm that we use, that computes the HSS form of an input dense matrix, reliesmore » on randomized sampling with a novel adaptive sampling mechanism. We discuss the parallelization of this algorithm and also present the parallelization of structured matrix-vector product, structured factorization, and solution routines. The efficiency of the approach is demonstrated on large problems from different academic and industrial applications, on up to 8,000 cores. Finally, this work is part of a more global effort, the STRUctured Matrices PACKage (STRUMPACK) software package for computations with sparse and dense structured matrices. Hence, although useful on their own right, the routines also represent a step in the direction of a distributed-memory sparse solver.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Agarwal, Sapan; Quach, Tu -Thach; Parekh, Ojas
In this study, the exponential increase in data over the last decade presents a significant challenge to analytics efforts that seek to process and interpret such data for various applications. Neural-inspired computing approaches are being developed in order to leverage the computational properties of the analog, low-power data processing observed in biological systems. Analog resistive memory crossbars can perform a parallel read or a vector-matrix multiplication as well as a parallel write or a rank-1 update with high computational efficiency. For an N × N crossbar, these two kernels can be O(N) more energy efficient than a conventional digital memory-basedmore » architecture. If the read operation is noise limited, the energy to read a column can be independent of the crossbar size (O(1)). These two kernels form the basis of many neuromorphic algorithms such as image, text, and speech recognition. For instance, these kernels can be applied to a neural sparse coding algorithm to give an O(N) reduction in energy for the entire algorithm when run with finite precision. Sparse coding is a rich problem with a host of applications including computer vision, object tracking, and more generally unsupervised learning.« less
Agarwal, Sapan; Quach, Tu -Thach; Parekh, Ojas; ...
2016-01-06
In this study, the exponential increase in data over the last decade presents a significant challenge to analytics efforts that seek to process and interpret such data for various applications. Neural-inspired computing approaches are being developed in order to leverage the computational properties of the analog, low-power data processing observed in biological systems. Analog resistive memory crossbars can perform a parallel read or a vector-matrix multiplication as well as a parallel write or a rank-1 update with high computational efficiency. For an N × N crossbar, these two kernels can be O(N) more energy efficient than a conventional digital memory-basedmore » architecture. If the read operation is noise limited, the energy to read a column can be independent of the crossbar size (O(1)). These two kernels form the basis of many neuromorphic algorithms such as image, text, and speech recognition. For instance, these kernels can be applied to a neural sparse coding algorithm to give an O(N) reduction in energy for the entire algorithm when run with finite precision. Sparse coding is a rich problem with a host of applications including computer vision, object tracking, and more generally unsupervised learning.« less
Bouridane, Ahmed; Ling, Bingo Wing-Kuen
2018-01-01
This paper presents an unsupervised learning algorithm for sparse nonnegative matrix factor time–frequency deconvolution with optimized fractional β-divergence. The β-divergence is a group of cost functions parametrized by a single parameter β. The Itakura–Saito divergence, Kullback–Leibler divergence and Least Square distance are special cases that correspond to β=0, 1, 2, respectively. This paper presents a generalized algorithm that uses a flexible range of β that includes fractional values. It describes a maximization–minimization (MM) algorithm leading to the development of a fast convergence multiplicative update algorithm with guaranteed convergence. The proposed model operates in the time–frequency domain and decomposes an information-bearing matrix into two-dimensional deconvolution of factor matrices that represent the spectral dictionary and temporal codes. The deconvolution process has been optimized to yield sparse temporal codes through maximizing the likelihood of the observations. The paper also presents a method to estimate the fractional β value. The method is demonstrated on separating audio mixtures recorded from a single channel. The paper shows that the extraction of the spectral dictionary and temporal codes is significantly more efficient by using the proposed algorithm and subsequently leads to better source separation performance. Experimental tests and comparisons with other factorization methods have been conducted to verify its efficacy. PMID:29702629
Efficient ICCG on a shared memory multiprocessor
NASA Technical Reports Server (NTRS)
Hammond, Steven W.; Schreiber, Robert
1989-01-01
Different approaches are discussed for exploiting parallelism in the ICCG (Incomplete Cholesky Conjugate Gradient) method for solving large sparse symmetric positive definite systems of equations on a shared memory parallel computer. Techniques for efficiently solving triangular systems and computing sparse matrix-vector products are explored. Three methods for scheduling the tasks in solving triangular systems are implemented on the Sequent Balance 21000. Sample problems that are representative of a large class of problems solved using iterative methods are used. We show that a static analysis to determine data dependences in the triangular solve can greatly improve its parallel efficiency. We also show that ignoring symmetry and storing the whole matrix can reduce solution time substantially.
Algorithms for solving large sparse systems of simultaneous linear equations on vector processors
NASA Technical Reports Server (NTRS)
David, R. E.
1984-01-01
Very efficient algorithms for solving large sparse systems of simultaneous linear equations have been developed for serial processing computers. These involve a reordering of matrix rows and columns in order to obtain a near triangular pattern of nonzero elements. Then an LU factorization is developed to represent the matrix inverse in terms of a sequence of elementary Gaussian eliminations, or pivots. In this paper it is shown how these algorithms are adapted for efficient implementation on vector processors. Results obtained on the CYBER 200 Model 205 are presented for a series of large test problems which show the comparative advantages of the triangularization and vector processing algorithms.
Eigensolver for a Sparse, Large Hermitian Matrix
NASA Technical Reports Server (NTRS)
Tisdale, E. Robert; Oyafuso, Fabiano; Klimeck, Gerhard; Brown, R. Chris
2003-01-01
A parallel-processing computer program finds a few eigenvalues in a sparse Hermitian matrix that contains as many as 100 million diagonal elements. This program finds the eigenvalues faster, using less memory, than do other, comparable eigensolver programs. This program implements a Lanczos algorithm in the American National Standards Institute/ International Organization for Standardization (ANSI/ISO) C computing language, using the Message Passing Interface (MPI) standard to complement an eigensolver in PARPACK. [PARPACK (Parallel Arnoldi Package) is an extension, to parallel-processing computer architectures, of ARPACK (Arnoldi Package), which is a collection of Fortran 77 subroutines that solve large-scale eigenvalue problems.] The eigensolver runs on Beowulf clusters of computers at the Jet Propulsion Laboratory (JPL).
A study of the parallel algorithm for large-scale DC simulation of nonlinear systems
NASA Astrophysics Data System (ADS)
Cortés Udave, Diego Ernesto; Ogrodzki, Jan; Gutiérrez de Anda, Miguel Angel
Newton-Raphson DC analysis of large-scale nonlinear circuits may be an extremely time consuming process even if sparse matrix techniques and bypassing of nonlinear models calculation are used. A slight decrease in the time required for this task may be enabled on multi-core, multithread computers if the calculation of the mathematical models for the nonlinear elements as well as the stamp management of the sparse matrix entries are managed through concurrent processes. This numerical complexity can be further reduced via the circuit decomposition and parallel solution of blocks taking as a departure point the BBD matrix structure. This block-parallel approach may give a considerable profit though it is strongly dependent on the system topology and, of course, on the processor type. This contribution presents the easy-parallelizable decomposition-based algorithm for DC simulation and provides a detailed study of its effectiveness.
Sparse distributed memory and related models
NASA Technical Reports Server (NTRS)
Kanerva, Pentti
1992-01-01
Described here is sparse distributed memory (SDM) as a neural-net associative memory. It is characterized by two weight matrices and by a large internal dimension - the number of hidden units is much larger than the number of input or output units. The first matrix, A, is fixed and possibly random, and the second matrix, C, is modifiable. The SDM is compared and contrasted to (1) computer memory, (2) correlation-matrix memory, (3) feet-forward artificial neural network, (4) cortex of the cerebellum, (5) Marr and Albus models of the cerebellum, and (6) Albus' cerebellar model arithmetic computer (CMAC). Several variations of the basic SDM design are discussed: the selected-coordinate and hyperplane designs of Jaeckel, the pseudorandom associative neural memory of Hassoun, and SDM with real-valued input variables by Prager and Fallside. SDM research conducted mainly at the Research Institute for Advanced Computer Science (RIACS) in 1986-1991 is highlighted.
Newmark-Beta-FDTD method for super-resolution analysis of time reversal waves
NASA Astrophysics Data System (ADS)
Shi, Sheng-Bing; Shao, Wei; Ma, Jing; Jin, Congjun; Wang, Xiao-Hua
2017-09-01
In this work, a new unconditionally stable finite-difference time-domain (FDTD) method with the split-field perfectly matched layer (PML) is proposed for the analysis of time reversal (TR) waves. The proposed method is very suitable for multiscale problems involving microstructures. The spatial and temporal derivatives in this method are discretized by the central difference technique and Newmark-Beta algorithm, respectively, and the derivation results in the calculation of a banded-sparse matrix equation. Since the coefficient matrix keeps unchanged during the whole simulation process, the lower-upper (LU) decomposition of the matrix needs to be performed only once at the beginning of the calculation. Moreover, the reverse Cuthill-Mckee (RCM) technique, an effective preprocessing technique in bandwidth compression of sparse matrices, is used to improve computational efficiency. The super-resolution focusing of TR wave propagation in two- and three-dimensional spaces is included to validate the accuracy and efficiency of the proposed method.
Blind compressed sensing image reconstruction based on alternating direction method
NASA Astrophysics Data System (ADS)
Liu, Qinan; Guo, Shuxu
2018-04-01
In order to solve the problem of how to reconstruct the original image under the condition of unknown sparse basis, this paper proposes an image reconstruction method based on blind compressed sensing model. In this model, the image signal is regarded as the product of a sparse coefficient matrix and a dictionary matrix. Based on the existing blind compressed sensing theory, the optimal solution is solved by the alternative minimization method. The proposed method solves the problem that the sparse basis in compressed sensing is difficult to represent, which restrains the noise and improves the quality of reconstructed image. This method ensures that the blind compressed sensing theory has a unique solution and can recover the reconstructed original image signal from a complex environment with a stronger self-adaptability. The experimental results show that the image reconstruction algorithm based on blind compressed sensing proposed in this paper can recover high quality image signals under the condition of under-sampling.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Carpenter, J.A.
This report is a sequel to ORNL/CSD-106 in the ongoing supplements to Professor A.S. Householder's KWIC Index for Numerical Algebra. Beginning with the previous supplement, the subject has been restricted to Numerical Linear Algebra, roughly characterized by the American Mathematical Society's classification sections 15 and 65F but with little coverage of infinite matrices, matrices over fields of characteristics other than zero, operator theory, optimization and those parts of matrix theory primarily combinatorial in nature. Some consideration is given to the uses of graph theory in Numerical Linear Algebra, particularly with respect to algorithms for sparse matrix computations. The period coveredmore » by this report is roughly the calendar year 1982 as measured by the appearance of the articles in the American Mathematical Society's Contents of Mathematical Publications lagging actual appearance dates by up to nearly half a year. The review citations are limited to the Mathematical Reviews (MR).« less
Communication requirements of sparse Cholesky factorization with nested dissection ordering
NASA Technical Reports Server (NTRS)
Naik, Vijay K.; Patrick, Merrell L.
1989-01-01
Load distribution schemes for minimizing the communication requirements of the Cholesky factorization of dense and sparse, symmetric, positive definite matrices on multiprocessor systems are presented. The total data traffic in factoring an n x n sparse symmetric positive definite matrix representing an n-vertex regular two-dimensional grid graph using n exp alpha, alpha not greater than 1, processors are shown to be O(n exp 1 + alpha/2). It is O(n), when n exp alpha, alpha not smaller than 1, processors are used. Under the conditions of uniform load distribution, these results are shown to be asymptotically optimal.
Achieving Passive Localization with Traffic Light Schedules in Urban Road Sensor Networks
Niu, Qiang; Yang, Xu; Gao, Shouwan; Chen, Pengpeng; Chan, Shibing
2016-01-01
Localization is crucial for the monitoring applications of cities, such as road monitoring, environment surveillance, vehicle tracking, etc. In urban road sensor networks, sensors are often sparely deployed due to the hardware cost. Under this sparse deployment, sensors cannot communicate with each other via ranging hardware or one-hop connectivity, rendering the existing localization solutions ineffective. To address this issue, this paper proposes a novel Traffic Lights Schedule-based localization algorithm (TLS), which is built on the fact that vehicles move through the intersection with a known traffic light schedule. We can first obtain the law by binary vehicle detection time stamps and describe the law as a matrix, called a detection matrix. At the same time, we can also use the known traffic light information to construct the matrices, which can be formed as a collection called a known matrix collection. The detection matrix is then matched in the known matrix collection for identifying where sensors are located on urban roads. We evaluate our algorithm by extensive simulation. The results show that the localization accuracy of intersection sensors can reach more than 90%. In addition, we compare it with a state-of-the-art algorithm and prove that it has a wider operational region. PMID:27735871
Multi-color incomplete Cholesky conjugate gradient methods for vector computers
DOE Office of Scientific and Technical Information (OSTI.GOV)
Poole, E.L.
1986-01-01
This research is concerned with the solution on vector computers of linear systems of equations. Ax = b, where A is a large, sparse symmetric positive definite matrix with non-zero elements lying only along a few diagonals of the matrix. The system is solved using the incomplete Cholesky conjugate gradient method (ICCG). Multi-color orderings are used of the unknowns in the linear system to obtain p-color matrices for which a no-fill block ICCG method is implemented on the CYBER 205 with O(N/p) length vector operations in both the decomposition of A and, more importantly, in the forward and back solvesmore » necessary at each iteration of the method. (N is the number of unknowns and p is a small constant). A p-colored matrix is a matrix that can be partitioned into a p x p block matrix where the diagonal blocks are diagonal matrices. The matrix is stored by diagonals and matrix multiplication by diagonals is used to carry out the decomposition of A and the forward and back solves. Additionally, if the vectors across adjacent blocks line up, then some of the overhead associated with vector startups can be eliminated in the matrix vector multiplication necessary at each conjugate gradient iteration. Necessary and sufficient conditions are given to determine which multi-color orderings of the unknowns correspond to p-color matrices, and a process is indicated for choosing multi-color orderings.« less
Color normalization of histology slides using graph regularized sparse NMF
NASA Astrophysics Data System (ADS)
Sha, Lingdao; Schonfeld, Dan; Sethi, Amit
2017-03-01
Computer based automatic medical image processing and quantification are becoming popular in digital pathology. However, preparation of histology slides can vary widely due to differences in staining equipment, procedures and reagents, which can reduce the accuracy of algorithms that analyze their color and texture information. To re- duce the unwanted color variations, various supervised and unsupervised color normalization methods have been proposed. Compared with supervised color normalization methods, unsupervised color normalization methods have advantages of time and cost efficient and universal applicability. Most of the unsupervised color normaliza- tion methods for histology are based on stain separation. Based on the fact that stain concentration cannot be negative and different parts of the tissue absorb different stains, nonnegative matrix factorization (NMF), and particular its sparse version (SNMF), are good candidates for stain separation. However, most of the existing unsupervised color normalization method like PCA, ICA, NMF and SNMF fail to consider important information about sparse manifolds that its pixels occupy, which could potentially result in loss of texture information during color normalization. Manifold learning methods like Graph Laplacian have proven to be very effective in interpreting high-dimensional data. In this paper, we propose a novel unsupervised stain separation method called graph regularized sparse nonnegative matrix factorization (GSNMF). By considering the sparse prior of stain concentration together with manifold information from high-dimensional image data, our method shows better performance in stain color deconvolution than existing unsupervised color deconvolution methods, especially in keeping connected texture information. To utilized the texture information, we construct a nearest neighbor graph between pixels within a spatial area of an image based on their distances using heat kernal in lαβ space. The representation of a pixel in the stain density space is constrained to follow the feature distance of the pixel to pixels in the neighborhood graph. Utilizing color matrix transfer method with the stain concentrations found using our GSNMF method, the color normalization performance was also better than existing methods.
Mniszewski, S M; Cawkwell, M J; Wall, M E; Mohd-Yusof, J; Bock, N; Germann, T C; Niklasson, A M N
2015-10-13
We present an algorithm for the calculation of the density matrix that for insulators scales linearly with system size and parallelizes efficiently on multicore, shared memory platforms with small and controllable numerical errors. The algorithm is based on an implementation of the second-order spectral projection (SP2) algorithm [ Niklasson, A. M. N. Phys. Rev. B 2002 , 66 , 155115 ] in sparse matrix algebra with the ELLPACK-R data format. We illustrate the performance of the algorithm within self-consistent tight binding theory by total energy calculations of gas phase poly(ethylene) molecules and periodic liquid water systems containing up to 15,000 atoms on up to 16 CPU cores. We consider algorithm-specific performance aspects, such as local vs nonlocal memory access and the degree of matrix sparsity. Comparisons to sparse matrix algebra implementations using off-the-shelf libraries on multicore CPUs, graphics processing units (GPUs), and the Intel many integrated core (MIC) architecture are also presented. The accuracy and stability of the algorithm are illustrated with long duration Born-Oppenheimer molecular dynamics simulations of 1000 water molecules and a 303 atom Trp cage protein solvated by 2682 water molecules.
DOE Office of Scientific and Technical Information (OSTI.GOV)
2014-01-17
This library is an implementation of the Sparse Approximate Matrix Multiplication (SpAMM) algorithm introduced. It provides a matrix data type, and an approximate matrix product, which exhibits linear scaling computational complexity for matrices with decay. The product error and the performance of the multiply can be tuned by choosing an appropriate tolerance. The library can be compiled for serial execution or parallel execution on shared memory systems with an OpenMP capable compiler
Bit error rate tester using fast parallel generation of linear recurring sequences
Pierson, Lyndon G.; Witzke, Edward L.; Maestas, Joseph H.
2003-05-06
A fast method for generating linear recurring sequences by parallel linear recurring sequence generators (LRSGs) with a feedback circuit optimized to balance minimum propagation delay against maximal sequence period. Parallel generation of linear recurring sequences requires decimating the sequence (creating small contiguous sections of the sequence in each LRSG). A companion matrix form is selected depending on whether the LFSR is right-shifting or left-shifting. The companion matrix is completed by selecting a primitive irreducible polynomial with 1's most closely grouped in a corner of the companion matrix. A decimation matrix is created by raising the companion matrix to the (n*k).sup.th power, where k is the number of parallel LRSGs and n is the number of bits to be generated at a time by each LRSG. Companion matrices with 1's closely grouped in a corner will yield sparse decimation matrices. A feedback circuit comprised of XOR logic gates implements the decimation matrix in hardware. Sparse decimation matrices can be implemented with minimum number of XOR gates, and therefore a minimum propagation delay through the feedback circuit. The LRSG of the invention is particularly well suited to use as a bit error rate tester on high speed communication lines because it permits the receiver to synchronize to the transmitted pattern within 2n bits.
Tensor Sparse Coding for Positive Definite Matrices.
Sivalingam, Ravishankar; Boley, Daniel; Morellas, Vassilios; Papanikolopoulos, Nikos
2013-08-02
In recent years, there has been extensive research on sparse representation of vector-valued signals. In the matrix case, the data points are merely vectorized and treated as vectors thereafter (for e.g., image patches). However, this approach cannot be used for all matrices, as it may destroy the inherent structure of the data. Symmetric positive definite (SPD) matrices constitute one such class of signals, where their implicit structure of positive eigenvalues is lost upon vectorization. This paper proposes a novel sparse coding technique for positive definite matrices, which respects the structure of the Riemannian manifold and preserves the positivity of their eigenvalues, without resorting to vectorization. Synthetic and real-world computer vision experiments with region covariance descriptors demonstrate the need for and the applicability of the new sparse coding model. This work serves to bridge the gap between the sparse modeling paradigm and the space of positive definite matrices.
Tensor sparse coding for positive definite matrices.
Sivalingam, Ravishankar; Boley, Daniel; Morellas, Vassilios; Papanikolopoulos, Nikolaos
2014-03-01
In recent years, there has been extensive research on sparse representation of vector-valued signals. In the matrix case, the data points are merely vectorized and treated as vectors thereafter (for example, image patches). However, this approach cannot be used for all matrices, as it may destroy the inherent structure of the data. Symmetric positive definite (SPD) matrices constitute one such class of signals, where their implicit structure of positive eigenvalues is lost upon vectorization. This paper proposes a novel sparse coding technique for positive definite matrices, which respects the structure of the Riemannian manifold and preserves the positivity of their eigenvalues, without resorting to vectorization. Synthetic and real-world computer vision experiments with region covariance descriptors demonstrate the need for and the applicability of the new sparse coding model. This work serves to bridge the gap between the sparse modeling paradigm and the space of positive definite matrices.
Implicit solvers for unstructured meshes
NASA Technical Reports Server (NTRS)
Venkatakrishnan, V.; Mavriplis, Dimitri J.
1991-01-01
Implicit methods for unstructured mesh computations are developed and tested. The approximate system which arises from the Newton-linearization of the nonlinear evolution operator is solved by using the preconditioned generalized minimum residual technique. These different preconditioners are investigated: the incomplete LU factorization (ILU), block diagonal factorization, and the symmetric successive over-relaxation (SSOR). The preconditioners have been optimized to have good vectorization properties. The various methods are compared over a wide range of problems. Ordering of the unknowns, which affects the convergence of these sparse matrix iterative methods, is also investigated. Results are presented for inviscid and turbulent viscous calculations on single and multielement airfoil configurations using globally and adaptively generated meshes.
Matrix Recipes for Hard Thresholding Methods
2012-11-07
have been proposed to approximate the solution. In [11], Donoho et al . demonstrate that, in the sparse approximation problem, under basic incoherence...inducing convex surrogate ‖ · ‖1 with provable guarantees for unique signal recovery. In the ARM problem, Fazel et al . [12] identified the nuclear norm...sparse recovery for all. Technical report, EPFL, 2011 . [25] N. Halko , P. G. Martinsson, and J. A. Tropp. Finding structure with randomness: Probabilistic
Mohr, Stephan; Dawson, William; Wagner, Michael; Caliste, Damien; Nakajima, Takahito; Genovese, Luigi
2017-10-10
We present CheSS, the "Chebyshev Sparse Solvers" library, which has been designed to solve typical problems arising in large-scale electronic structure calculations using localized basis sets. The library is based on a flexible and efficient expansion in terms of Chebyshev polynomials and presently features the calculation of the density matrix, the calculation of matrix powers for arbitrary powers, and the extraction of eigenvalues in a selected interval. CheSS is able to exploit the sparsity of the matrices and scales linearly with respect to the number of nonzero entries, making it well-suited for large-scale calculations. The approach is particularly adapted for setups leading to small spectral widths of the involved matrices and outperforms alternative methods in this regime. By coupling CheSS to the DFT code BigDFT, we show that such a favorable setup is indeed possible in practice. In addition, the approach based on Chebyshev polynomials can be massively parallelized, and CheSS exhibits excellent scaling up to thousands of cores even for relatively small matrix sizes.
Vecharynski, Eugene; Yang, Chao; Pask, John E.
2015-02-25
Here, we present an iterative algorithm for computing an invariant subspace associated with the algebraically smallest eigenvalues of a large sparse or structured Hermitian matrix A. We are interested in the case in which the dimension of the invariant subspace is large (e.g., over several hundreds or thousands) even though it may still be small relative to the dimension of A. These problems arise from, for example, density functional theory (DFT) based electronic structure calculations for complex materials. The key feature of our algorithm is that it performs fewer Rayleigh–Ritz calculations compared to existing algorithms such as the locally optimalmore » block preconditioned conjugate gradient or the Davidson algorithm. It is a block algorithm, and hence can take advantage of efficient BLAS3 operations and be implemented with multiple levels of concurrency. We discuss a number of practical issues that must be addressed in order to implement the algorithm efficiently on a high performance computer.« less
An efficient classification method based on principal component and sparse representation.
Zhai, Lin; Fu, Shujun; Zhang, Caiming; Liu, Yunxian; Wang, Lu; Liu, Guohua; Yang, Mingqiang
2016-01-01
As an important application in optical imaging, palmprint recognition is interfered by many unfavorable factors. An effective fusion of blockwise bi-directional two-dimensional principal component analysis and grouping sparse classification is presented. The dimension reduction and normalizing are implemented by the blockwise bi-directional two-dimensional principal component analysis for palmprint images to extract feature matrixes, which are assembled into an overcomplete dictionary in sparse classification. A subspace orthogonal matching pursuit algorithm is designed to solve the grouping sparse representation. Finally, the classification result is gained by comparing the residual between testing and reconstructed images. Experiments are carried out on a palmprint database, and the results show that this method has better robustness against position and illumination changes of palmprint images, and can get higher rate of palmprint recognition.
Lim, Jun-Seok; Pang, Hee-Suk
2016-01-01
In this paper an [Formula: see text]-regularized recursive total least squares (RTLS) algorithm is considered for the sparse system identification. Although recursive least squares (RLS) has been successfully applied in sparse system identification, the estimation performance in RLS based algorithms becomes worse, when both input and output are contaminated by noise (the error-in-variables problem). We proposed an algorithm to handle the error-in-variables problem. The proposed [Formula: see text]-RTLS algorithm is an RLS like iteration using the [Formula: see text] regularization. The proposed algorithm not only gives excellent performance but also reduces the required complexity through the effective inversion matrix handling. Simulations demonstrate the superiority of the proposed [Formula: see text]-regularized RTLS for the sparse system identification setting.
Okimoto, Gordon; Zeinalzadeh, Ashkan; Wenska, Tom; Loomis, Michael; Nation, James B; Fabre, Tiphaine; Tiirikainen, Maarit; Hernandez, Brenda; Chan, Owen; Wong, Linda; Kwee, Sandi
2016-01-01
Technological advances enable the cost-effective acquisition of Multi-Modal Data Sets (MMDS) composed of measurements for multiple, high-dimensional data types obtained from a common set of bio-samples. The joint analysis of the data matrices associated with the different data types of a MMDS should provide a more focused view of the biology underlying complex diseases such as cancer that would not be apparent from the analysis of a single data type alone. As multi-modal data rapidly accumulate in research laboratories and public databases such as The Cancer Genome Atlas (TCGA), the translation of such data into clinically actionable knowledge has been slowed by the lack of computational tools capable of analyzing MMDSs. Here, we describe the Joint Analysis of Many Matrices by ITeration (JAMMIT) algorithm that jointly analyzes the data matrices of a MMDS using sparse matrix approximations of rank-1. The JAMMIT algorithm jointly approximates an arbitrary number of data matrices by rank-1 outer-products composed of "sparse" left-singular vectors (eigen-arrays) that are unique to each matrix and a right-singular vector (eigen-signal) that is common to all the matrices. The non-zero coefficients of the eigen-arrays identify small subsets of variables for each data type (i.e., signatures) that in aggregate, or individually, best explain a dominant eigen-signal defined on the columns of the data matrices. The approximation is specified by a single "sparsity" parameter that is selected based on false discovery rate estimated by permutation testing. Multiple signals of interest in a given MDDS are sequentially detected and modeled by iterating JAMMIT on "residual" data matrices that result from a given sparse approximation. We show that JAMMIT outperforms other joint analysis algorithms in the detection of multiple signatures embedded in simulated MDDS. On real multimodal data for ovarian and liver cancer we show that JAMMIT identified multi-modal signatures that were clinically informative and enriched for cancer-related biology. Sparse matrix approximations of rank-1 provide a simple yet effective means of jointly reducing multiple, big data types to a small subset of variables that characterize important clinical and/or biological attributes of the bio-samples from which the data were acquired.
Adaptive OFDM Waveform Design for Spatio-Temporal-Sparsity Exploited STAP Radar
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sen, Satyabrata
In this chapter, we describe a sparsity-based space-time adaptive processing (STAP) algorithm to detect a slowly moving target using an orthogonal frequency division multiplexing (OFDM) radar. The motivation of employing an OFDM signal is that it improves the target-detectability from the interfering signals by increasing the frequency diversity of the system. However, due to the addition of one extra dimension in terms of frequency, the adaptive degrees-of-freedom in an OFDM-STAP also increases. Therefore, to avoid the construction a fully adaptive OFDM-STAP, we develop a sparsity-based STAP algorithm. We observe that the interference spectrum is inherently sparse in the spatio-temporal domain,more » as the clutter responses occupy only a diagonal ridge on the spatio-temporal plane and the jammer signals interfere only from a few spatial directions. Hence, we exploit that sparsity to develop an efficient STAP technique that utilizes considerably lesser number of secondary data compared to the other existing STAP techniques, and produces nearly optimum STAP performance. In addition to designing the STAP filter, we optimally design the transmit OFDM signals by maximizing the output signal-to-interference-plus-noise ratio (SINR) in order to improve the STAP performance. The computation of output SINR depends on the estimated value of the interference covariance matrix, which we obtain by applying the sparse recovery algorithm. Therefore, we analytically assess the effects of the synthesized OFDM coefficients on the sparse recovery of the interference covariance matrix by computing the coherence measure of the sparse measurement matrix. Our numerical examples demonstrate the achieved STAP-performance due to sparsity-based technique and adaptive waveform design.« less
NoGOA: predicting noisy GO annotations using evidences and sparse representation.
Yu, Guoxian; Lu, Chang; Wang, Jun
2017-07-21
Gene Ontology (GO) is a community effort to represent functional features of gene products. GO annotations (GOA) provide functional associations between GO terms and gene products. Due to resources limitation, only a small portion of annotations are manually checked by curators, and the others are electronically inferred. Although quality control techniques have been applied to ensure the quality of annotations, the community consistently report that there are still considerable noisy (or incorrect) annotations. Given the wide application of annotations, however, how to identify noisy annotations is an important but yet seldom studied open problem. We introduce a novel approach called NoGOA to predict noisy annotations. NoGOA applies sparse representation on the gene-term association matrix to reduce the impact of noisy annotations, and takes advantage of sparse representation coefficients to measure the semantic similarity between genes. Secondly, it preliminarily predicts noisy annotations of a gene based on aggregated votes from semantic neighborhood genes of that gene. Next, NoGOA estimates the ratio of noisy annotations for each evidence code based on direct annotations in GOA files archived on different periods, and then weights entries of the association matrix via estimated ratios and propagates weights to ancestors of direct annotations using GO hierarchy. Finally, it integrates evidence-weighted association matrix and aggregated votes to predict noisy annotations. Experiments on archived GOA files of six model species (H. sapiens, A. thaliana, S. cerevisiae, G. gallus, B. Taurus and M. musculus) demonstrate that NoGOA achieves significantly better results than other related methods and removing noisy annotations improves the performance of gene function prediction. The comparative study justifies the effectiveness of integrating evidence codes with sparse representation for predicting noisy GO annotations. Codes and datasets are available at http://mlda.swu.edu.cn/codes.php?name=NoGOA .
AZTEC. Parallel Iterative method Software for Solving Linear Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hutchinson, S.; Shadid, J.; Tuminaro, R.
1995-07-01
AZTEC is an interactive library that greatly simplifies the parrallelization process when solving the linear systems of equations Ax=b where A is a user supplied n X n sparse matrix, b is a user supplied vector of length n and x is a vector of length n to be computed. AZTEC is intended as a software tool for users who want to avoid cumbersome parallel programming details but who have large sparse linear systems which require an efficiently utilized parallel processing system. A collection of data transformation tools are provided that allow for easy creation of distributed sparse unstructured matricesmore » for parallel solutions.« less
Portable implementation model for CFD simulations. Application to hybrid CPU/GPU supercomputers
NASA Astrophysics Data System (ADS)
Oyarzun, Guillermo; Borrell, Ricard; Gorobets, Andrey; Oliva, Assensi
2017-10-01
Nowadays, high performance computing (HPC) systems experience a disruptive moment with a variety of novel architectures and frameworks, without any clarity of which one is going to prevail. In this context, the portability of codes across different architectures is of major importance. This paper presents a portable implementation model based on an algebraic operational approach for direct numerical simulation (DNS) and large eddy simulation (LES) of incompressible turbulent flows using unstructured hybrid meshes. The strategy proposed consists in representing the whole time-integration algorithm using only three basic algebraic operations: sparse matrix-vector product, a linear combination of vectors and dot product. The main idea is based on decomposing the nonlinear operators into a concatenation of two SpMV operations. This provides high modularity and portability. An exhaustive analysis of the proposed implementation for hybrid CPU/GPU supercomputers has been conducted with tests using up to 128 GPUs. The main objective consists in understanding the challenges of implementing CFD codes on new architectures.
NASA Astrophysics Data System (ADS)
Yihaa Roodhiyah, Lisa’; Tjong, Tiffany; Nurhasan; Sutarno, D.
2018-04-01
The late research, linear matrices of vector finite element in two dimensional(2-D) magnetotelluric (MT) responses modeling was solved by non-sparse direct solver in TE mode. Nevertheless, there is some weakness which have to be improved especially accuracy in the low frequency (10-3 Hz-10-5 Hz) which is not achieved yet and high cost computation in dense mesh. In this work, the solver which is used is sparse direct solver instead of non-sparse direct solverto overcome the weaknesses of solving linear matrices of vector finite element metod using non-sparse direct solver. Sparse direct solver will be advantageous in solving linear matrices of vector finite element method because of the matrix properties which is symmetrical and sparse. The validation of sparse direct solver in solving linear matrices of vector finite element has been done for a homogen half-space model and vertical contact model by analytical solution. Thevalidation result of sparse direct solver in solving linear matrices of vector finite element shows that sparse direct solver is more stable than non-sparse direct solver in computing linear problem of vector finite element method especially in low frequency. In the end, the accuracy of 2D MT responses modelling in low frequency (10-3 Hz-10-5 Hz) has been reached out under the efficient allocation memory of array and less computational time consuming.
SAMBA: Sparse Approximation of Moment-Based Arbitrary Polynomial Chaos
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ahlfeld, R., E-mail: r.ahlfeld14@imperial.ac.uk; Belkouchi, B.; Montomoli, F.
2016-09-01
A new arbitrary Polynomial Chaos (aPC) method is presented for moderately high-dimensional problems characterised by limited input data availability. The proposed methodology improves the algorithm of aPC and extends the method, that was previously only introduced as tensor product expansion, to moderately high-dimensional stochastic problems. The fundamental idea of aPC is to use the statistical moments of the input random variables to develop the polynomial chaos expansion. This approach provides the possibility to propagate continuous or discrete probability density functions and also histograms (data sets) as long as their moments exist, are finite and the determinant of the moment matrixmore » is strictly positive. For cases with limited data availability, this approach avoids bias and fitting errors caused by wrong assumptions. In this work, an alternative way to calculate the aPC is suggested, which provides the optimal polynomials, Gaussian quadrature collocation points and weights from the moments using only a handful of matrix operations on the Hankel matrix of moments. It can therefore be implemented without requiring prior knowledge about statistical data analysis or a detailed understanding of the mathematics of polynomial chaos expansions. The extension to more input variables suggested in this work, is an anisotropic and adaptive version of Smolyak's algorithm that is solely based on the moments of the input probability distributions. It is referred to as SAMBA (PC), which is short for Sparse Approximation of Moment-Based Arbitrary Polynomial Chaos. It is illustrated that for moderately high-dimensional problems (up to 20 different input variables or histograms) SAMBA can significantly simplify the calculation of sparse Gaussian quadrature rules. SAMBA's efficiency for multivariate functions with regard to data availability is further demonstrated by analysing higher order convergence and accuracy for a set of nonlinear test functions with 2, 5 and 10 different input distributions or histograms.« less
Oryspayev, Dossay; Aktulga, Hasan Metin; Sosonkina, Masha; ...
2015-07-14
In this article, sparse matrix vector multiply (SpMVM) is an important kernel that frequently arises in high performance computing applications. Due to its low arithmetic intensity, several approaches have been proposed in literature to improve its scalability and efficiency in large scale computations. In this paper, our target systems are high end multi-core architectures and we use messaging passing interface + open multiprocessing hybrid programming model for parallelism. We analyze the performance of recently proposed implementation of the distributed symmetric SpMVM, originally developed for large sparse symmetric matrices arising in ab initio nuclear structure calculations. We also study important featuresmore » of this implementation and compare with previously reported implementations that do not exploit underlying symmetry. Our SpMVM implementations leverage the hybrid paradigm to efficiently overlap expensive communications with computations. Our main comparison criterion is the "CPU core hours" metric, which is the main measure of resource usage on supercomputers. We analyze the effects of topology-aware mapping heuristic using simplified network load model. Furthermore, we have tested the different SpMVM implementations on two large clusters with 3D Torus and Dragonfly topology. Our results show that the distributed SpMVM implementation that exploits matrix symmetry and hides communication yields the best value for the "CPU core hours" metric and significantly reduces data movement overheads.« less
Covariance Matrix Estimation for the Cryo-EM Heterogeneity Problem*
Katsevich, E.; Katsevich, A.; Singer, A.
2015-01-01
In cryo-electron microscopy (cryo-EM), a microscope generates a top view of a sample of randomly oriented copies of a molecule. The problem of single particle reconstruction (SPR) from cryo-EM is to use the resulting set of noisy two-dimensional projection images taken at unknown directions to reconstruct the three-dimensional (3D) structure of the molecule. In some situations, the molecule under examination exhibits structural variability, which poses a fundamental challenge in SPR. The heterogeneity problem is the task of mapping the space of conformational states of a molecule. It has been previously suggested that the leading eigenvectors of the covariance matrix of the 3D molecules can be used to solve the heterogeneity problem. Estimating the covariance matrix is challenging, since only projections of the molecules are observed, but not the molecules themselves. In this paper, we formulate a general problem of covariance estimation from noisy projections of samples. This problem has intimate connections with matrix completion problems and high-dimensional principal component analysis. We propose an estimator and prove its consistency. When there are finitely many heterogeneity classes, the spectrum of the estimated covariance matrix reveals the number of classes. The estimator can be found as the solution to a certain linear system. In the cryo-EM case, the linear operator to be inverted, which we term the projection covariance transform, is an important object in covariance estimation for tomographic problems involving structural variation. Inverting it involves applying a filter akin to the ramp filter in tomography. We design a basis in which this linear operator is sparse and thus can be tractably inverted despite its large size. We demonstrate via numerical experiments on synthetic datasets the robustness of our algorithm to high levels of noise. PMID:25699132
STRONG ORACLE OPTIMALITY OF FOLDED CONCAVE PENALIZED ESTIMATION.
Fan, Jianqing; Xue, Lingzhou; Zou, Hui
2014-06-01
Folded concave penalization methods have been shown to enjoy the strong oracle property for high-dimensional sparse estimation. However, a folded concave penalization problem usually has multiple local solutions and the oracle property is established only for one of the unknown local solutions. A challenging fundamental issue still remains that it is not clear whether the local optimum computed by a given optimization algorithm possesses those nice theoretical properties. To close this important theoretical gap in over a decade, we provide a unified theory to show explicitly how to obtain the oracle solution via the local linear approximation algorithm. For a folded concave penalized estimation problem, we show that as long as the problem is localizable and the oracle estimator is well behaved, we can obtain the oracle estimator by using the one-step local linear approximation. In addition, once the oracle estimator is obtained, the local linear approximation algorithm converges, namely it produces the same estimator in the next iteration. The general theory is demonstrated by using four classical sparse estimation problems, i.e., sparse linear regression, sparse logistic regression, sparse precision matrix estimation and sparse quantile regression.
NASA Astrophysics Data System (ADS)
He, Xingyu; Tong, Ningning; Hu, Xiaowei
2018-01-01
Compressive sensing has been successfully applied to inverse synthetic aperture radar (ISAR) imaging of moving targets. By exploiting the block sparse structure of the target image, sparse solution for multiple measurement vectors (MMV) can be applied in ISAR imaging and a substantial performance improvement can be achieved. As an effective sparse recovery method, sparse Bayesian learning (SBL) for MMV involves a matrix inverse at each iteration. Its associated computational complexity grows significantly with the problem size. To address this problem, we develop a fast inverse-free (IF) SBL method for MMV. A relaxed evidence lower bound (ELBO), which is computationally more amiable than the traditional ELBO used by SBL, is obtained by invoking fundamental property for smooth functions. A variational expectation-maximization scheme is then employed to maximize the relaxed ELBO, and a computationally efficient IF-MSBL algorithm is proposed. Numerical results based on simulated and real data show that the proposed method can reconstruct row sparse signal accurately and obtain clear superresolution ISAR images. Moreover, the running time and computational complexity are reduced to a great extent compared with traditional SBL methods.
STRONG ORACLE OPTIMALITY OF FOLDED CONCAVE PENALIZED ESTIMATION
Fan, Jianqing; Xue, Lingzhou; Zou, Hui
2014-01-01
Folded concave penalization methods have been shown to enjoy the strong oracle property for high-dimensional sparse estimation. However, a folded concave penalization problem usually has multiple local solutions and the oracle property is established only for one of the unknown local solutions. A challenging fundamental issue still remains that it is not clear whether the local optimum computed by a given optimization algorithm possesses those nice theoretical properties. To close this important theoretical gap in over a decade, we provide a unified theory to show explicitly how to obtain the oracle solution via the local linear approximation algorithm. For a folded concave penalized estimation problem, we show that as long as the problem is localizable and the oracle estimator is well behaved, we can obtain the oracle estimator by using the one-step local linear approximation. In addition, once the oracle estimator is obtained, the local linear approximation algorithm converges, namely it produces the same estimator in the next iteration. The general theory is demonstrated by using four classical sparse estimation problems, i.e., sparse linear regression, sparse logistic regression, sparse precision matrix estimation and sparse quantile regression. PMID:25598560
Amesos2 and Belos: Direct and Iterative Solvers for Large Sparse Linear Systems
Bavier, Eric; Hoemmen, Mark; Rajamanickam, Sivasankaran; ...
2012-01-01
Solvers for large sparse linear systems come in two categories: direct and iterative. Amesos2, a package in the Trilinos software project, provides direct methods, and Belos, another Trilinos package, provides iterative methods. Amesos2 offers a common interface to many different sparse matrix factorization codes, and can handle any implementation of sparse matrices and vectors, via an easy-to-extend C++ traits interface. It can also factor matrices whose entries have arbitrary “Scalar” type, enabling extended-precision and mixed-precision algorithms. Belos includes many different iterative methods for solving large sparse linear systems and least-squares problems. Unlike competing iterative solver libraries, Belos completely decouples themore » algorithms from the implementations of the underlying linear algebra objects. This lets Belos exploit the latest hardware without changes to the code. Belos favors algorithms that solve higher-level problems, such as multiple simultaneous linear systems and sequences of related linear systems, faster than standard algorithms. The package also supports extended-precision and mixed-precision algorithms. Together, Amesos2 and Belos form a complete suite of sparse linear solvers.« less
Matrix computations in MACSYMA
NASA Technical Reports Server (NTRS)
Wang, P. S.
1977-01-01
Facilities built into MACSYMA for manipulating matrices with numeric or symbolic entries are described. Computations will be done exactly, keeping symbols as symbols. Topics discussed include how to form a matrix and create other matrices by transforming existing matrices within MACSYMA; arithmetic and other computation with matrices; and user control of computational processes through the use of optional variables. Two algorithms designed for sparse matrices are given. The computing times of several different ways to compute the determinant of a matrix are compared.
Fabric defect detection based on visual saliency using deep feature and low-rank recovery
NASA Astrophysics Data System (ADS)
Liu, Zhoufeng; Wang, Baorui; Li, Chunlei; Li, Bicao; Dong, Yan
2018-04-01
Fabric defect detection plays an important role in improving the quality of fabric product. In this paper, a novel fabric defect detection method based on visual saliency using deep feature and low-rank recovery was proposed. First, unsupervised training is carried out by the initial network parameters based on MNIST large datasets. The supervised fine-tuning of fabric image library based on Convolutional Neural Networks (CNNs) is implemented, and then more accurate deep neural network model is generated. Second, the fabric images are uniformly divided into the image block with the same size, then we extract their multi-layer deep features using the trained deep network. Thereafter, all the extracted features are concentrated into a feature matrix. Third, low-rank matrix recovery is adopted to divide the feature matrix into the low-rank matrix which indicates the background and the sparse matrix which indicates the salient defect. In the end, the iterative optimal threshold segmentation algorithm is utilized to segment the saliency maps generated by the sparse matrix to locate the fabric defect area. Experimental results demonstrate that the feature extracted by CNN is more suitable for characterizing the fabric texture than the traditional LBP, HOG and other hand-crafted features extraction method, and the proposed method can accurately detect the defect regions of various fabric defects, even for the image with complex texture.
Tensor-GMRES method for large sparse systems of nonlinear equations
NASA Technical Reports Server (NTRS)
Feng, Dan; Pulliam, Thomas H.
1994-01-01
This paper introduces a tensor-Krylov method, the tensor-GMRES method, for large sparse systems of nonlinear equations. This method is a coupling of tensor model formation and solution techniques for nonlinear equations with Krylov subspace projection techniques for unsymmetric systems of linear equations. Traditional tensor methods for nonlinear equations are based on a quadratic model of the nonlinear function, a standard linear model augmented by a simple second order term. These methods are shown to be significantly more efficient than standard methods both on nonsingular problems and on problems where the Jacobian matrix at the solution is singular. A major disadvantage of the traditional tensor methods is that the solution of the tensor model requires the factorization of the Jacobian matrix, which may not be suitable for problems where the Jacobian matrix is large and has a 'bad' sparsity structure for an efficient factorization. We overcome this difficulty by forming and solving the tensor model using an extension of a Newton-GMRES scheme. Like traditional tensor methods, we show that the new tensor method has significant computational advantages over the analogous Newton counterpart. Consistent with Krylov subspace based methods, the new tensor method does not depend on the factorization of the Jacobian matrix. As a matter of fact, the Jacobian matrix is never needed explicitly.
Negre, Christian F A; Mniszewski, Susan M; Cawkwell, Marc J; Bock, Nicolas; Wall, Michael E; Niklasson, Anders M N
2016-07-12
We present a reduced complexity algorithm to compute the inverse overlap factors required to solve the generalized eigenvalue problem in a quantum-based molecular dynamics (MD) simulation. Our method is based on the recursive, iterative refinement of an initial guess of Z (inverse square root of the overlap matrix S). The initial guess of Z is obtained beforehand by using either an approximate divide-and-conquer technique or dynamical methods, propagated within an extended Lagrangian dynamics from previous MD time steps. With this formulation, we achieve long-term stability and energy conservation even under the incomplete, approximate, iterative refinement of Z. Linear-scaling performance is obtained using numerically thresholded sparse matrix algebra based on the ELLPACK-R sparse matrix data format, which also enables efficient shared-memory parallelization. As we show in this article using self-consistent density-functional-based tight-binding MD, our approach is faster than conventional methods based on the diagonalization of overlap matrix S for systems as small as a few hundred atoms, substantially accelerating quantum-based simulations even for molecular structures of intermediate size. For a 4158-atom water-solvated polyalanine system, we find an average speedup factor of 122 for the computation of Z in each MD step.
Negre, Christian F. A; Mniszewski, Susan M.; Cawkwell, Marc Jon; ...
2016-06-06
We present a reduced complexity algorithm to compute the inverse overlap factors required to solve the generalized eigenvalue problem in a quantum-based molecular dynamics (MD) simulation. Our method is based on the recursive iterative re nement of an initial guess Z of the inverse overlap matrix S. The initial guess of Z is obtained beforehand either by using an approximate divide and conquer technique or dynamically, propagated within an extended Lagrangian dynamics from previous MD time steps. With this formulation, we achieve long-term stability and energy conservation even under incomplete approximate iterative re nement of Z. Linear scaling performance ismore » obtained using numerically thresholded sparse matrix algebra based on the ELLPACK-R sparse matrix data format, which also enables e cient shared memory parallelization. As we show in this article using selfconsistent density functional based tight-binding MD, our approach is faster than conventional methods based on the direct diagonalization of the overlap matrix S for systems as small as a few hundred atoms, substantially accelerating quantum-based simulations even for molecular structures of intermediate size. For a 4,158 atom water-solvated polyalanine system we nd an average speedup factor of 122 for the computation of Z in each MD step.« less
Label consistent K-SVD: learning a discriminative dictionary for recognition.
Jiang, Zhuolin; Lin, Zhe; Davis, Larry S
2013-11-01
A label consistent K-SVD (LC-KSVD) algorithm to learn a discriminative dictionary for sparse coding is presented. In addition to using class labels of training data, we also associate label information with each dictionary item (columns of the dictionary matrix) to enforce discriminability in sparse codes during the dictionary learning process. More specifically, we introduce a new label consistency constraint called "discriminative sparse-code error" and combine it with the reconstruction error and the classification error to form a unified objective function. The optimal solution is efficiently obtained using the K-SVD algorithm. Our algorithm learns a single overcomplete dictionary and an optimal linear classifier jointly. The incremental dictionary learning algorithm is presented for the situation of limited memory resources. It yields dictionaries so that feature points with the same class labels have similar sparse codes. Experimental results demonstrate that our algorithm outperforms many recently proposed sparse-coding techniques for face, action, scene, and object category recognition under the same learning conditions.
Multi-color incomplete Cholesky conjugate gradient methods for vector computers. Ph.D. Thesis
NASA Technical Reports Server (NTRS)
Poole, E. L.
1986-01-01
In this research, we are concerned with the solution on vector computers of linear systems of equations, Ax = b, where A is a larger, sparse symmetric positive definite matrix. We solve the system using an iterative method, the incomplete Cholesky conjugate gradient method (ICCG). We apply a multi-color strategy to obtain p-color matrices for which a block-oriented ICCG method is implemented on the CYBER 205. (A p-colored matrix is a matrix which can be partitioned into a pXp block matrix where the diagonal blocks are diagonal matrices). This algorithm, which is based on a no-fill strategy, achieves O(N/p) length vector operations in both the decomposition of A and in the forward and back solves necessary at each iteration of the method. We discuss the natural ordering of the unknowns as an ordering that minimizes the number of diagonals in the matrix and define multi-color orderings in terms of disjoint sets of the unknowns. We give necessary and sufficient conditions to determine which multi-color orderings of the unknowns correpond to p-color matrices. A performance model is given which is used both to predict execution time for ICCG methods and also to compare an ICCG method to conjugate gradient without preconditioning or another ICCG method. Results are given from runs on the CYBER 205 at NASA's Langley Research Center for four model problems.
Subspace aware recovery of low rank and jointly sparse signals
Biswas, Sampurna; Dasgupta, Soura; Mudumbai, Raghuraman; Jacob, Mathews
2017-01-01
We consider the recovery of a matrix X, which is simultaneously low rank and joint sparse, from few measurements of its columns using a two-step algorithm. Each column of X is measured using a combination of two measurement matrices; one which is the same for every column, while the the second measurement matrix varies from column to column. The recovery proceeds by first estimating the row subspace vectors from the measurements corresponding to the common matrix. The estimated row subspace vectors are then used to recover X from all the measurements using a convex program of joint sparsity minimization. Our main contribution is to provide sufficient conditions on the measurement matrices that guarantee the recovery of such a matrix using the above two-step algorithm. The results demonstrate quite significant savings in number of measurements when compared to the standard multiple measurement vector (MMV) scheme, which assumes same time invariant measurement pattern for all the time frames. We illustrate the impact of the sampling pattern on reconstruction quality using breath held cardiac cine MRI and cardiac perfusion MRI data, while the utility of the algorithm to accelerate the acquisition is demonstrated on MR parameter mapping. PMID:28630889
Beyond Low Rank + Sparse: Multi-scale Low Rank Matrix Decomposition
Ong, Frank; Lustig, Michael
2016-01-01
We present a natural generalization of the recent low rank + sparse matrix decomposition and consider the decomposition of matrices into components of multiple scales. Such decomposition is well motivated in practice as data matrices often exhibit local correlations in multiple scales. Concretely, we propose a multi-scale low rank modeling that represents a data matrix as a sum of block-wise low rank matrices with increasing scales of block sizes. We then consider the inverse problem of decomposing the data matrix into its multi-scale low rank components and approach the problem via a convex formulation. Theoretically, we show that under various incoherence conditions, the convex program recovers the multi-scale low rank components either exactly or approximately. Practically, we provide guidance on selecting the regularization parameters and incorporate cycle spinning to reduce blocking artifacts. Experimentally, we show that the multi-scale low rank decomposition provides a more intuitive decomposition than conventional low rank methods and demonstrate its effectiveness in four applications, including illumination normalization for face images, motion separation for surveillance videos, multi-scale modeling of the dynamic contrast enhanced magnetic resonance imaging and collaborative filtering exploiting age information. PMID:28450978
Fast Solution in Sparse LDA for Binary Classification
NASA Technical Reports Server (NTRS)
Moghaddam, Baback
2010-01-01
An algorithm that performs sparse linear discriminant analysis (Sparse-LDA) finds near-optimal solutions in far less time than the prior art when specialized to binary classification (of 2 classes). Sparse-LDA is a type of feature- or variable- selection problem with numerous applications in statistics, machine learning, computer vision, computational finance, operations research, and bio-informatics. Because of its combinatorial nature, feature- or variable-selection problems are NP-hard or computationally intractable in cases involving more than 30 variables or features. Therefore, one typically seeks approximate solutions by means of greedy search algorithms. The prior Sparse-LDA algorithm was a greedy algorithm that considered the best variable or feature to add/ delete to/ from its subsets in order to maximally discriminate between multiple classes of data. The present algorithm is designed for the special but prevalent case of 2-class or binary classification (e.g. 1 vs. 0, functioning vs. malfunctioning, or change versus no change). The present algorithm provides near-optimal solutions on large real-world datasets having hundreds or even thousands of variables or features (e.g. selecting the fewest wavelength bands in a hyperspectral sensor to do terrain classification) and does so in typical computation times of minutes as compared to days or weeks as taken by the prior art. Sparse LDA requires solving generalized eigenvalue problems for a large number of variable subsets (represented by the submatrices of the input within-class and between-class covariance matrices). In the general (fullrank) case, the amount of computation scales at least cubically with the number of variables and thus the size of the problems that can be solved is limited accordingly. However, in binary classification, the principal eigenvalues can be found using a special analytic formula, without resorting to costly iterative techniques. The present algorithm exploits this analytic form along with the inherent sequential nature of greedy search itself. Together this enables the use of highly-efficient partitioned-matrix-inverse techniques that result in large speedups of computation in both the forward-selection and backward-elimination stages of greedy algorithms in general.
A path-oriented knowledge representation system: Defusing the combinatorial system
NASA Technical Reports Server (NTRS)
Karamouzis, Stamos T.; Barry, John S.; Smith, Steven L.; Feyock, Stefan
1995-01-01
LIMAP is a programming system oriented toward efficient information manipulation over fixed finite domains, and quantification over paths and predicates. A generalization of Warshall's Algorithm to precompute paths in a sparse matrix representation of semantic nets is employed to allow questions involving paths between components to be posed and answered easily. LIMAP's ability to cache all paths between two components in a matrix cell proved to be a computational obstacle, however, when the semantic net grew to realistic size. The present paper describes a means of mitigating this combinatorial explosion to an extent that makes the use of the LIMAP representation feasible for problems of significant size. The technique we describe radically reduces the size of the search space in which LIMAP must operate; semantic nets of more than 500 nodes have been attacked successfully. Furthermore, it appears that the procedure described is applicable not only to LIMAP, but to a number of other combinatorially explosive search space problems found in AI as well.
Quantum algorithm for linear systems of equations.
Harrow, Aram W; Hassidim, Avinatan; Lloyd, Seth
2009-10-09
Solving linear systems of equations is a common problem that arises both on its own and as a subroutine in more complex problems: given a matrix A and a vector b(-->), find a vector x(-->) such that Ax(-->) = b(-->). We consider the case where one does not need to know the solution x(-->) itself, but rather an approximation of the expectation value of some operator associated with x(-->), e.g., x(-->)(dagger) Mx(-->) for some matrix M. In this case, when A is sparse, N x N and has condition number kappa, the fastest known classical algorithms can find x(-->) and estimate x(-->)(dagger) Mx(-->) in time scaling roughly as N square root(kappa). Here, we exhibit a quantum algorithm for estimating x(-->)(dagger) Mx(-->) whose runtime is a polynomial of log(N) and kappa. Indeed, for small values of kappa [i.e., poly log(N)], we prove (using some common complexity-theoretic assumptions) that any classical algorithm for this problem generically requires exponentially more time than our quantum algorithm.
Coupled Modeling of Hydrodynamics and Sound in Coastal Ocean for Renewable Ocean Energy Development
DOE Office of Scientific and Technical Information (OSTI.GOV)
Long, Wen; Jung, Ki Won; Yang, Zhaoqing
An underwater sound model was developed to simulate sound propagation from marine and hydrokinetic energy (MHK) devices or offshore wind (OSW) energy platforms. Finite difference methods were developed to solve the 3D Helmholtz equation for sound propagation in the coastal environment. A 3D sparse matrix solver with complex coefficients was formed for solving the resulting acoustic pressure field. The Complex Shifted Laplacian Preconditioner (CSLP) method was applied to solve the matrix system iteratively with MPI parallelization using a high performance cluster. The sound model was then coupled with the Finite Volume Community Ocean Model (FVCOM) for simulating sound propagation generatedmore » by human activities, such as construction of OSW turbines or tidal stream turbine operations, in a range-dependent setting. As a proof of concept, initial validation of the solver is presented for two coastal wedge problems. This sound model can be useful for evaluating impacts on marine mammals due to deployment of MHK devices and OSW energy platforms.« less
Compressed sensing for high-resolution nonlipid suppressed 1 H FID MRSI of the human brain at 9.4T.
Nassirpour, Sahar; Chang, Paul; Avdievitch, Nikolai; Henning, Anke
2018-04-29
The aim of this study was to apply compressed sensing to accelerate the acquisition of high resolution metabolite maps of the human brain using a nonlipid suppressed ultra-short TR and TE 1 H FID MRSI sequence at 9.4T. X-t sparse compressed sensing reconstruction was optimized for nonlipid suppressed 1 H FID MRSI data. Coil-by-coil x-t sparse reconstruction was compared with SENSE x-t sparse and low rank reconstruction. The effect of matrix size and spatial resolution on the achievable acceleration factor was studied. Finally, in vivo metabolite maps with different acceleration factors of 2, 4, 5, and 10 were acquired and compared. Coil-by-coil x-t sparse compressed sensing reconstruction was not able to reliably recover the nonlipid suppressed data, rather a combination of parallel and sparse reconstruction was necessary (SENSE x-t sparse). For acceleration factors of up to 5, both the low-rank and the compressed sensing methods were able to reconstruct the data comparably well (root mean squared errors [RMSEs] ≤ 10.5% for Cre). However, the reconstruction time of the low rank algorithm was drastically longer than compressed sensing. Using the optimized compressed sensing reconstruction, acceleration factors of 4 or 5 could be reached for the MRSI data with a matrix size of 64 × 64. For lower spatial resolutions, an acceleration factor of up to R∼4 was successfully achieved. By tailoring the reconstruction scheme to the nonlipid suppressed data through parameter optimization and performance evaluation, we present high resolution (97 µL voxel size) accelerated in vivo metabolite maps of the human brain acquired at 9.4T within scan times of 3 to 3.75 min. © 2018 International Society for Magnetic Resonance in Medicine.
SIMD Optimization of Linear Expressions for Programmable Graphics Hardware
Bajaj, Chandrajit; Ihm, Insung; Min, Jungki; Oh, Jinsang
2009-01-01
The increased programmability of graphics hardware allows efficient graphical processing unit (GPU) implementations of a wide range of general computations on commodity PCs. An important factor in such implementations is how to fully exploit the SIMD computing capacities offered by modern graphics processors. Linear expressions in the form of ȳ = Ax̄ + b̄, where A is a matrix, and x̄, ȳ and b̄ are vectors, constitute one of the most basic operations in many scientific computations. In this paper, we propose a SIMD code optimization technique that enables efficient shader codes to be generated for evaluating linear expressions. It is shown that performance can be improved considerably by efficiently packing arithmetic operations into four-wide SIMD instructions through reordering of the operations in linear expressions. We demonstrate that the presented technique can be used effectively for programming both vertex and pixel shaders for a variety of mathematical applications, including integrating differential equations and solving a sparse linear system of equations using iterative methods. PMID:19946569
Hybrid reconstruction of quantum density matrix: when low-rank meets sparsity
NASA Astrophysics Data System (ADS)
Li, Kezhi; Zheng, Kai; Yang, Jingbei; Cong, Shuang; Liu, Xiaomei; Li, Zhaokai
2017-12-01
Both the mathematical theory and experiments have verified that the quantum state tomography based on compressive sensing is an efficient framework for the reconstruction of quantum density states. In recent physical experiments, we found that many unknown density matrices in which people are interested in are low-rank as well as sparse. Bearing this information in mind, in this paper we propose a reconstruction algorithm that combines the low-rank and the sparsity property of density matrices and further theoretically prove that the solution of the optimization function can be, and only be, the true density matrix satisfying the model with overwhelming probability, as long as a necessary number of measurements are allowed. The solver leverages the fixed-point equation technique in which a step-by-step strategy is developed by utilizing an extended soft threshold operator that copes with complex values. Numerical experiments of the density matrix estimation for real nuclear magnetic resonance devices reveal that the proposed method achieves a better accuracy compared to some existing methods. We believe that the proposed method could be leveraged as a generalized approach and widely implemented in the quantum state estimation.
NASA Astrophysics Data System (ADS)
Bouchet, L.; Amestoy, P.; Buttari, A.; Rouet, F.-H.; Chauvin, M.
2013-02-01
Nowadays, analyzing and reducing the ever larger astronomical datasets is becoming a crucial challenge, especially for long cumulated observation times. The INTEGRAL/SPI X/γ-ray spectrometer is an instrument for which it is essential to process many exposures at the same time in order to increase the low signal-to-noise ratio of the weakest sources. In this context, the conventional methods for data reduction are inefficient and sometimes not feasible at all. Processing several years of data simultaneously requires computing not only the solution of a large system of equations, but also the associated uncertainties. We aim at reducing the computation time and the memory usage. Since the SPI transfer function is sparse, we have used some popular methods for the solution of large sparse linear systems; we briefly review these methods. We use the Multifrontal Massively Parallel Solver (MUMPS) to compute the solution of the system of equations. We also need to compute the variance of the solution, which amounts to computing selected entries of the inverse of the sparse matrix corresponding to our linear system. This can be achieved through one of the latest features of the MUMPS software that has been partly motivated by this work. In this paper we provide a brief presentation of this feature and evaluate its effectiveness on astrophysical problems requiring the processing of large datasets simultaneously, such as the study of the entire emission of the Galaxy. We used these algorithms to solve the large sparse systems arising from SPI data processing and to obtain both their solutions and the associated variances. In conclusion, thanks to these newly developed tools, processing large datasets arising from SPI is now feasible with both a reasonable execution time and a low memory usage.
Algorithms and software for solving finite element equations on serial and parallel architectures
NASA Technical Reports Server (NTRS)
Chu, Eleanor; George, Alan
1988-01-01
The primary objective was to compare the performance of state-of-the-art techniques for solving sparse systems with those that are currently available in the Computational Structural Mechanics (MSC) testbed. One of the first tasks was to become familiar with the structure of the testbed, and to install some or all of the SPARSPAK package in the testbed. A brief overview of the CSM Testbed software and its usage is presented. An overview of the sparse matrix research for the Testbed currently employed in the CSM Testbed is given. An interface which was designed and implemented as a research tool for installing and appraising new matrix processors in the CSM Testbed is described. The results of numerical experiments performed in solving a set of testbed demonstration problems using the processor SPK and other experimental processors are contained.
NASA Astrophysics Data System (ADS)
Fiandrotti, Attilio; Fosson, Sophie M.; Ravazzi, Chiara; Magli, Enrico
2018-04-01
Compressive sensing promises to enable bandwidth-efficient on-board compression of astronomical data by lifting the encoding complexity from the source to the receiver. The signal is recovered off-line, exploiting GPUs parallel computation capabilities to speedup the reconstruction process. However, inherent GPU hardware constraints limit the size of the recoverable signal and the speedup practically achievable. In this work, we design parallel algorithms that exploit the properties of circulant matrices for efficient GPU-accelerated sparse signals recovery. Our approach reduces the memory requirements, allowing us to recover very large signals with limited memory. In addition, it achieves a tenfold signal recovery speedup thanks to ad-hoc parallelization of matrix-vector multiplications and matrix inversions. Finally, we practically demonstrate our algorithms in a typical application of circulant matrices: deblurring a sparse astronomical image in the compressed domain.
Improved parallel data partitioning by nested dissection with applications to information retrieval.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wolf, Michael M.; Chevalier, Cedric; Boman, Erik Gunnar
The computational work in many information retrieval and analysis algorithms is based on sparse linear algebra. Sparse matrix-vector multiplication is a common kernel in many of these computations. Thus, an important related combinatorial problem in parallel computing is how to distribute the matrix and the vectors among processors so as to minimize the communication cost. We focus on minimizing the total communication volume while keeping the computation balanced across processes. In [1], the first two authors presented a new 2D partitioning method, the nested dissection partitioning algorithm. In this paper, we improve on that algorithm and show that it ismore » a good option for data partitioning in information retrieval. We also show partitioning time can be substantially reduced by using the SCOTCH software, and quality improves in some cases, too.« less
Hierarchical Feature Extraction With Local Neural Response for Image Recognition.
Li, Hong; Wei, Yantao; Li, Luoqing; Chen, C L P
2013-04-01
In this paper, a hierarchical feature extraction method is proposed for image recognition. The key idea of the proposed method is to extract an effective feature, called local neural response (LNR), of the input image with nontrivial discrimination and invariance properties by alternating between local coding and maximum pooling operation. The local coding, which is carried out on the locally linear manifold, can extract the salient feature of image patches and leads to a sparse measure matrix on which maximum pooling is carried out. The maximum pooling operation builds the translation invariance into the model. We also show that other invariant properties, such as rotation and scaling, can be induced by the proposed model. In addition, a template selection algorithm is presented to reduce computational complexity and to improve the discrimination ability of the LNR. Experimental results show that our method is robust to local distortion and clutter compared with state-of-the-art algorithms.
Das, Kiranmoy; Daniels, Michael J.
2014-01-01
Summary Estimation of the covariance structure for irregular sparse longitudinal data has been studied by many authors in recent years but typically using fully parametric specifications. In addition, when data are collected from several groups over time, it is known that assuming the same or completely different covariance matrices over groups can lead to loss of efficiency and/or bias. Nonparametric approaches have been proposed for estimating the covariance matrix for regular univariate longitudinal data by sharing information across the groups under study. For the irregular case, with longitudinal measurements that are bivariate or multivariate, modeling becomes more difficult. In this article, to model bivariate sparse longitudinal data from several groups, we propose a flexible covariance structure via a novel matrix stick-breaking process for the residual covariance structure and a Dirichlet process mixture of normals for the random effects. Simulation studies are performed to investigate the effectiveness of the proposed approach over more traditional approaches. We also analyze a subset of Framingham Heart Study data to examine how the blood pressure trajectories and covariance structures differ for the patients from different BMI groups (high, medium and low) at baseline. PMID:24400941
Zhang, Zhilin; Jung, Tzyy-Ping; Makeig, Scott; Rao, Bhaskar D
2013-02-01
Fetal ECG (FECG) telemonitoring is an important branch in telemedicine. The design of a telemonitoring system via a wireless body area network with low energy consumption for ambulatory use is highly desirable. As an emerging technique, compressed sensing (CS) shows great promise in compressing/reconstructing data with low energy consumption. However, due to some specific characteristics of raw FECG recordings such as nonsparsity and strong noise contamination, current CS algorithms generally fail in this application. This paper proposes to use the block sparse Bayesian learning framework to compress/reconstruct nonsparse raw FECG recordings. Experimental results show that the framework can reconstruct the raw recordings with high quality. Especially, the reconstruction does not destroy the interdependence relation among the multichannel recordings. This ensures that the independent component analysis decomposition of the reconstructed recordings has high fidelity. Furthermore, the framework allows the use of a sparse binary sensing matrix with much fewer nonzero entries to compress recordings. Particularly, each column of the matrix can contain only two nonzero entries. This shows that the framework, compared to other algorithms such as current CS algorithms and wavelet algorithms, can greatly reduce code execution in CPU in the data compression stage.
Technical note: an R package for fitting sparse neural networks with application in animal breeding.
Wang, Yangfan; Mi, Xue; Rosa, Guilherme J M; Chen, Zhihui; Lin, Ping; Wang, Shi; Bao, Zhenmin
2018-05-04
Neural networks (NNs) have emerged as a new tool for genomic selection (GS) in animal breeding. However, the properties of NN used in GS for the prediction of phenotypic outcomes are not well characterized due to the problem of over-parameterization of NN and difficulties in using whole-genome marker sets as high-dimensional NN input. In this note, we have developed an R package called snnR that finds an optimal sparse structure of a NN by minimizing the square error subject to a penalty on the L1-norm of the parameters (weights and biases), therefore solving the problem of over-parameterization in NN. We have also tested some models fitted in the snnR package to demonstrate their feasibility and effectiveness to be used in several cases as examples. In comparison of snnR to the R package brnn (the Bayesian regularized single layer NNs), with both using the entries of a genotype matrix or a genomic relationship matrix as inputs, snnR has greatly improved the computational efficiency and the prediction ability for the GS in animal breeding because snnR implements a sparse NN with many hidden layers.
Comparison of two matrix data structures for advanced CSM testbed applications
NASA Technical Reports Server (NTRS)
Regelbrugge, M. E.; Brogan, F. A.; Nour-Omid, B.; Rankin, C. C.; Wright, M. A.
1989-01-01
The first section describes data storage schemes presently used by the Computational Structural Mechanics (CSM) testbed sparse matrix facilities and similar skyline (profile) matrix facilities. The second section contains a discussion of certain features required for the implementation of particular advanced CSM algorithms, and how these features might be incorporated into the data storage schemes described previously. The third section presents recommendations, based on the discussions of the prior sections, for directing future CSM testbed development to provide necessary matrix facilities for advanced algorithm implementation and use. The objective is to lend insight into the matrix structures discussed and to help explain the process of evaluating alternative matrix data structures and utilities for subsequent use in the CSM testbed.
Pure endmember extraction using robust kernel archetypoid analysis for hyperspectral imagery
NASA Astrophysics Data System (ADS)
Sun, Weiwei; Yang, Gang; Wu, Ke; Li, Weiyue; Zhang, Dianfa
2017-09-01
A robust kernel archetypoid analysis (RKADA) method is proposed to extract pure endmembers from hyperspectral imagery (HSI). The RKADA assumes that each pixel is a sparse linear mixture of all endmembers and each endmember corresponds to a real pixel in the image scene. First, it improves the re8gular archetypal analysis with a new binary sparse constraint, and the adoption of the kernel function constructs the principal convex hull in an infinite Hilbert space and enlarges the divergences between pairwise pixels. Second, the RKADA transfers the pure endmember extraction problem into an optimization problem by minimizing residual errors with the Huber loss function. The Huber loss function reduces the effects from big noises and outliers in the convergence procedure of RKADA and enhances the robustness of the optimization function. Third, the random kernel sinks for fast kernel matrix approximation and the two-stage algorithm for optimizing initial pure endmembers are utilized to improve its computational efficiency in realistic implementations of RKADA, respectively. The optimization equation of RKADA is solved by using the block coordinate descend scheme and the desired pure endmembers are finally obtained. Six state-of-the-art pure endmember extraction methods are employed to make comparisons with the RKADA on both synthetic and real Cuprite HSI datasets, including three geometrical algorithms vertex component analysis (VCA), alternative volume maximization (AVMAX) and orthogonal subspace projection (OSP), and three matrix factorization algorithms the preconditioning for successive projection algorithm (PreSPA), hierarchical clustering based on rank-two nonnegative matrix factorization (H2NMF) and self-dictionary multiple measurement vector (SDMMV). Experimental results show that the RKADA outperforms all the six methods in terms of spectral angle distance (SAD) and root-mean-square-error (RMSE). Moreover, the RKADA has short computational times in offline operations and shows significant improvement in identifying pure endmembers for ground objects with smaller spectrum differences. Therefore, the RKADA could be an alternative for pure endmember extraction from hyperspectral images.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yang, R; Fallone, B; Cross Cancer Institute, Edmonton, AB
Purpose: To develop a Graphic Processor Unit (GPU) accelerated deterministic solution to the Linear Boltzmann Transport Equation (LBTE) for accurate dose calculations in radiotherapy (RT). A deterministic solution yields the potential for major speed improvements due to the sparse matrix-vector and vector-vector multiplications and would thus be of benefit to RT. Methods: In order to leverage the massively parallel architecture of GPUs, the first order LBTE was reformulated as a second order self-adjoint equation using the Least Squares Finite Element Method (LSFEM). This produces a symmetric positive-definite matrix which is efficiently solved using a parallelized conjugate gradient (CG) solver. Themore » LSFEM formalism is applied in space, discrete ordinates is applied in angle, and the Multigroup method is applied in energy. The final linear system of equations produced is tightly coupled in space and angle. Our code written in CUDA-C was benchmarked on an Nvidia GeForce TITAN-X GPU against an Intel i7-6700K CPU. A spatial mesh of 30,950 tetrahedral elements was used with an S4 angular approximation. Results: To avoid repeating a full computationally intensive finite element matrix assembly at each Multigroup energy, a novel mapping algorithm was developed which minimized the operations required at each energy. Additionally, a parallelized memory mapping for the kronecker product between the sparse spatial and angular matrices, including Dirichlet boundary conditions, was created. Atomicity is preserved by graph-coloring overlapping nodes into separate kernel launches. The one-time mapping calculations for matrix assembly, kronecker product, and boundary condition application took 452±1ms on GPU. Matrix assembly for 16 energy groups took 556±3s on CPU, and 358±2ms on GPU using the mappings developed. The CG solver took 93±1s on CPU, and 468±2ms on GPU. Conclusion: Three computationally intensive subroutines in deterministically solving the LBTE have been formulated on GPU, resulting in two orders of magnitude speedup. Funding support from Natural Sciences and Engineering Research Council and Alberta Innovates Health Solutions. Dr. Fallone is a co-founder and CEO of MagnetTx Oncology Solutions (under discussions to license Alberta bi-planar linac MR for commercialization).« less
Implicit solvers for unstructured meshes
NASA Technical Reports Server (NTRS)
Venkatakrishnan, V.; Mavriplis, Dimitri J.
1991-01-01
Implicit methods were developed and tested for unstructured mesh computations. The approximate system which arises from the Newton linearization of the nonlinear evolution operator is solved by using the preconditioned GMRES (Generalized Minimum Residual) technique. Three different preconditioners were studied, namely, the incomplete LU factorization (ILU), block diagonal factorization, and the symmetric successive over relaxation (SSOR). The preconditioners were optimized to have good vectorization properties. SSOR and ILU were also studied as iterative schemes. The various methods are compared over a wide range of problems. Ordering of the unknowns, which affects the convergence of these sparse matrix iterative methods, is also studied. Results are presented for inviscid and turbulent viscous calculations on single and multielement airfoil configurations using globally and adaptively generated meshes.
Condition number estimation of preconditioned matrices.
Kushida, Noriyuki
2015-01-01
The present paper introduces a condition number estimation method for preconditioned matrices. The newly developed method provides reasonable results, while the conventional method which is based on the Lanczos connection gives meaningless results. The Lanczos connection based method provides the condition numbers of coefficient matrices of systems of linear equations with information obtained through the preconditioned conjugate gradient method. Estimating the condition number of preconditioned matrices is sometimes important when describing the effectiveness of new preconditionerers or selecting adequate preconditioners. Operating a preconditioner on a coefficient matrix is the simplest method of estimation. However, this is not possible for large-scale computing, especially if computation is performed on distributed memory parallel computers. This is because, the preconditioned matrices become dense, even if the original matrices are sparse. Although the Lanczos connection method can be used to calculate the condition number of preconditioned matrices, it is not considered to be applicable to large-scale problems because of its weakness with respect to numerical errors. Therefore, we have developed a robust and parallelizable method based on Hager's method. The feasibility studies are curried out for the diagonal scaling preconditioner and the SSOR preconditioner with a diagonal matrix, a tri-daigonal matrix and Pei's matrix. As a result, the Lanczos connection method contains around 10% error in the results even with a simple problem. On the other hand, the new method contains negligible errors. In addition, the newly developed method returns reasonable solutions when the Lanczos connection method fails with Pei's matrix, and matrices generated with the finite element method.
Parallel heterogeneous architectures for efficient OMP compressive sensing reconstruction
NASA Astrophysics Data System (ADS)
Kulkarni, Amey; Stanislaus, Jerome L.; Mohsenin, Tinoosh
2014-05-01
Compressive Sensing (CS) is a novel scheme, in which a signal that is sparse in a known transform domain can be reconstructed using fewer samples. The signal reconstruction techniques are computationally intensive and have sluggish performance, which make them impractical for real-time processing applications . The paper presents novel architectures for Orthogonal Matching Pursuit algorithm, one of the popular CS reconstruction algorithms. We show the implementation results of proposed architectures on FPGA, ASIC and on a custom many-core platform. For FPGA and ASIC implementation, a novel thresholding method is used to reduce the processing time for the optimization problem by at least 25%. Whereas, for the custom many-core platform, efficient parallelization techniques are applied, to reconstruct signals with variant signal lengths of N and sparsity of m. The algorithm is divided into three kernels. Each kernel is parallelized to reduce execution time, whereas efficient reuse of the matrix operators allows us to reduce area. Matrix operations are efficiently paralellized by taking advantage of blocked algorithms. For demonstration purpose, all architectures reconstruct a 256-length signal with maximum sparsity of 8 using 64 measurements. Implementation on Xilinx Virtex-5 FPGA, requires 27.14 μs to reconstruct the signal using basic OMP. Whereas, with thresholding method it requires 18 μs. ASIC implementation reconstructs the signal in 13 μs. However, our custom many-core, operating at 1.18 GHz, takes 18.28 μs to complete. Our results show that compared to the previous published work of the same algorithm and matrix size, proposed architectures for FPGA and ASIC implementations perform 1.3x and 1.8x respectively faster. Also, the proposed many-core implementation performs 3000x faster than the CPU and 2000x faster than the GPU.
Cucheb: A GPU implementation of the filtered Lanczos procedure
NASA Astrophysics Data System (ADS)
Aurentz, Jared L.; Kalantzis, Vassilis; Saad, Yousef
2017-11-01
This paper describes the software package Cucheb, a GPU implementation of the filtered Lanczos procedure for the solution of large sparse symmetric eigenvalue problems. The filtered Lanczos procedure uses a carefully chosen polynomial spectral transformation to accelerate convergence of the Lanczos method when computing eigenvalues within a desired interval. This method has proven particularly effective for eigenvalue problems that arise in electronic structure calculations and density functional theory. We compare our implementation against an equivalent CPU implementation and show that using the GPU can reduce the computation time by more than a factor of 10. Program Summary Program title: Cucheb Program Files doi:http://dx.doi.org/10.17632/rjr9tzchmh.1 Licensing provisions: MIT Programming language: CUDA C/C++ Nature of problem: Electronic structure calculations require the computation of all eigenvalue-eigenvector pairs of a symmetric matrix that lie inside a user-defined real interval. Solution method: To compute all the eigenvalues within a given interval a polynomial spectral transformation is constructed that maps the desired eigenvalues of the original matrix to the exterior of the spectrum of the transformed matrix. The Lanczos method is then used to compute the desired eigenvectors of the transformed matrix, which are then used to recover the desired eigenvalues of the original matrix. The bulk of the operations are executed in parallel using a graphics processing unit (GPU). Runtime: Variable, depending on the number of eigenvalues sought and the size and sparsity of the matrix. Additional comments: Cucheb is compatible with CUDA Toolkit v7.0 or greater.
Multivariable frequency domain identification via 2-norm minimization
NASA Technical Reports Server (NTRS)
Bayard, David S.
1992-01-01
The author develops a computational approach to multivariable frequency domain identification, based on 2-norm minimization. In particular, a Gauss-Newton (GN) iteration is developed to minimize the 2-norm of the error between frequency domain data and a matrix fraction transfer function estimate. To improve the global performance of the optimization algorithm, the GN iteration is initialized using the solution to a particular sequentially reweighted least squares problem, denoted as the SK iteration. The least squares problems which arise from both the SK and GN iterations are shown to involve sparse matrices with identical block structure. A sparse matrix QR factorization method is developed to exploit the special block structure, and to efficiently compute the least squares solution. A numerical example involving the identification of a multiple-input multiple-output (MIMO) plant having 286 unknown parameters is given to illustrate the effectiveness of the algorithm.
A fast time-difference inverse solver for 3D EIT with application to lung imaging.
Javaherian, Ashkan; Soleimani, Manuchehr; Moeller, Knut
2016-08-01
A class of sparse optimization techniques that require solely matrix-vector products, rather than an explicit access to the forward matrix and its transpose, has been paid much attention in the recent decade for dealing with large-scale inverse problems. This study tailors application of the so-called Gradient Projection for Sparse Reconstruction (GPSR) to large-scale time-difference three-dimensional electrical impedance tomography (3D EIT). 3D EIT typically suffers from the need for a large number of voxels to cover the whole domain, so its application to real-time imaging, for example monitoring of lung function, remains scarce since the large number of degrees of freedom of the problem extremely increases storage space and reconstruction time. This study shows the great potential of the GPSR for large-size time-difference 3D EIT. Further studies are needed to improve its accuracy for imaging small-size anomalies.
Research on sparse feature matching of improved RANSAC algorithm
NASA Astrophysics Data System (ADS)
Kong, Xiangsi; Zhao, Xian
2018-04-01
In this paper, a sparse feature matching method based on modified RANSAC algorithm is proposed to improve the precision and speed. Firstly, the feature points of the images are extracted using the SIFT algorithm. Then, the image pair is matched roughly by generating SIFT feature descriptor. At last, the precision of image matching is optimized by the modified RANSAC algorithm,. The RANSAC algorithm is improved from three aspects: instead of the homography matrix, this paper uses the fundamental matrix generated by the 8 point algorithm as the model; the sample is selected by a random block selecting method, which ensures the uniform distribution and the accuracy; adds sequential probability ratio test(SPRT) on the basis of standard RANSAC, which cut down the overall running time of the algorithm. The experimental results show that this method can not only get higher matching accuracy, but also greatly reduce the computation and improve the matching speed.
3D Reconstruction of human bones based on dictionary learning.
Zhang, Binkai; Wang, Xiang; Liang, Xiao; Zheng, Jinjin
2017-11-01
An effective method for reconstructing a 3D model of human bones from computed tomography (CT) image data based on dictionary learning is proposed. In this study, the dictionary comprises the vertices of triangular meshes, and the sparse coefficient matrix indicates the connectivity information. For better reconstruction performance, we proposed a balance coefficient between the approximation and regularisation terms and a method for optimisation. Moreover, we applied a local updating strategy and a mesh-optimisation method to update the dictionary and the sparse matrix, respectively. The two updating steps are iterated alternately until the objective function converges. Thus, a reconstructed mesh could be obtained with high accuracy and regularisation. The experimental results show that the proposed method has the potential to obtain high precision and high-quality triangular meshes for rapid prototyping, medical diagnosis, and tissue engineering. Copyright © 2017 IPEM. Published by Elsevier Ltd. All rights reserved.
Parallel pivoting combined with parallel reduction
NASA Technical Reports Server (NTRS)
Alaghband, Gita
1987-01-01
Parallel algorithms for triangularization of large, sparse, and unsymmetric matrices are presented. The method combines the parallel reduction with a new parallel pivoting technique, control over generations of fill-ins and a check for numerical stability, all done in parallel with the work being distributed over the active processes. The parallel technique uses the compatibility relation between pivots to identify parallel pivot candidates and uses the Markowitz number of pivots to minimize fill-in. This technique is not a preordering of the sparse matrix and is applied dynamically as the decomposition proceeds.
Representation-Independent Iteration of Sparse Data Arrays
NASA Technical Reports Server (NTRS)
James, Mark
2007-01-01
An approach is defined that describes a method of iterating over massively large arrays containing sparse data using an approach that is implementation independent of how the contents of the sparse arrays are laid out in memory. What is unique and important here is the decoupling of the iteration over the sparse set of array elements from how they are internally represented in memory. This enables this approach to be backward compatible with existing schemes for representing sparse arrays as well as new approaches. What is novel here is a new approach for efficiently iterating over sparse arrays that is independent of the underlying memory layout representation of the array. A functional interface is defined for implementing sparse arrays in any modern programming language with a particular focus for the Chapel programming language. Examples are provided that show the translation of a loop that computes a matrix vector product into this representation for both the distributed and not-distributed cases. This work is directly applicable to NASA and its High Productivity Computing Systems (HPCS) program that JPL and our current program are engaged in. The goal of this program is to create powerful, scalable, and economically viable high-powered computer systems suitable for use in national security and industry by 2010. This is important to NASA for its computationally intensive requirements for analyzing and understanding the volumes of science data from our returned missions.
Cao, Huojun; Amendt, Brad A
2016-11-01
Developmental dental anomalies are common forms of congenital defects. The molecular mechanisms of dental anomalies are poorly understood. Systematic approaches such as clustering genes based on similar expression patterns could identify novel genes involved in dental anomalies and provide a framework for understanding molecular regulatory mechanisms of these genes during tooth development (odontogenesis). A python package (pySAPC) of sparse affinity propagation clustering algorithm for large datasets was developed. Whole genome pair-wise similarity was calculated based on expression pattern similarity based on 45 microarrays of several stages during odontogenesis. pySAPC identified 743 gene clusters based on expression pattern similarity during mouse tooth development. Three clusters are significantly enriched for genes associated with dental anomalies (with FDR <0.1). The three clusters of genes have distinct expression patterns during odontogenesis. Clustering genes based on similar expression profiles recovered several known regulatory relationships for genes involved in odontogenesis, as well as many novel genes that may be involved with the same genetic pathways as genes that have already been shown to contribute to dental defects. By using sparse similarity matrix, pySAPC use much less memory and CPU time compared with the original affinity propagation program that uses a full similarity matrix. This python package will be useful for many applications where dataset(s) are too large to use full similarity matrix. This article is part of a Special Issue entitled "System Genetics" Guest Editor: Dr. Yudong Cai and Dr. Tao Huang. Copyright © 2016. Published by Elsevier B.V.
NASA Technical Reports Server (NTRS)
Nguyen, Duc T.; Mohammed, Ahmed Ali; Kadiam, Subhash
2010-01-01
Solving large (and sparse) system of simultaneous linear equations has been (and continues to be) a major challenging problem for many real-world engineering/science applications [1-2]. For many practical/large-scale problems, the sparse, Symmetrical and Positive Definite (SPD) system of linear equations can be conveniently represented in matrix notation as [A] {x} = {b} , where the square coefficient matrix [A] and the Right-Hand-Side (RHS) vector {b} are known. The unknown solution vector {x} can be efficiently solved by the following step-by-step procedures [1-2]: Reordering phase, Matrix Factorization phase, Forward solution phase, and Backward solution phase. In this research work, a Game-Based Learning (GBL) approach has been developed to help engineering students to understand crucial details about matrix reordering and factorization phases. A "chess-like" game has been developed and can be played by either a single player, or two players. Through this "chess-like" open-ended game, the players/learners will not only understand the key concepts involved in reordering algorithms (based on existing algorithms), but also have the opportunities to "discover new algorithms" which are better than existing algorithms. Implementing the proposed "chess-like" game for matrix reordering and factorization phases can be enhanced by FLASH [3] computer environments, where computer simulation with animated human voice, sound effects, visual/graphical/colorful displays of matrix tables, score (or monetary) awards for the best game players, etc. can all be exploited. Preliminary demonstrations of the developed GBL approach can be viewed by anyone who has access to the internet web-site [4]!
NASA Astrophysics Data System (ADS)
Guda, A. A.; Guda, S. A.; Soldatov, M. A.; Lomachenko, K. A.; Bugaev, A. L.; Lamberti, C.; Gawelda, W.; Bressler, C.; Smolentsev, G.; Soldatov, A. V.; Joly, Y.
2016-05-01
Finite difference method (FDM) implemented in the FDMNES software [Phys. Rev. B, 2001, 63, 125120] was revised. Thorough analysis shows, that the calculated diagonal in the FDM matrix consists of about 96% zero elements. Thus a sparse solver would be more suitable for the problem instead of traditional Gaussian elimination for the diagonal neighbourhood. We have tried several iterative sparse solvers and the direct one MUMPS solver with METIS ordering turned out to be the best. Compared to the Gaussian solver present method is up to 40 times faster and allows XANES simulations for complex systems already on personal computers. We show applicability of the software for metal-organic [Fe(bpy)3]2+ complex both for low spin and high spin states populated after laser excitation.
NASA Astrophysics Data System (ADS)
Ma, Sangback
In this paper we compare various parallel preconditioners such as Point-SSOR (Symmetric Successive OverRelaxation), ILU(0) (Incomplete LU) in the Wavefront ordering, ILU(0) in the Multi-color ordering, Multi-Color Block SOR (Successive OverRelaxation), SPAI (SParse Approximate Inverse) and pARMS (Parallel Algebraic Recursive Multilevel Solver) for solving large sparse linear systems arising from two-dimensional PDE (Partial Differential Equation)s on structured grids. Point-SSOR is well-known, and ILU(0) is one of the most popular preconditioner, but it is inherently serial. ILU(0) in the Wavefront ordering maximizes the parallelism in the natural order, but the lengths of the wave-fronts are often nonuniform. ILU(0) in the Multi-color ordering is a simple way of achieving a parallelism of the order N, where N is the order of the matrix, but its convergence rate often deteriorates as compared to that of natural ordering. We have chosen the Multi-Color Block SOR preconditioner combined with direct sparse matrix solver, since for the Laplacian matrix the SOR method is known to have a nondeteriorating rate of convergence when used with the Multi-Color ordering. By using block version we expect to minimize the interprocessor communications. SPAI computes the sparse approximate inverse directly by least squares method. Finally, ARMS is a preconditioner recursively exploiting the concept of independent sets and pARMS is the parallel version of ARMS. Experiments were conducted for the Finite Difference and Finite Element discretizations of five two-dimensional PDEs with large meshsizes up to a million on an IBM p595 machine with distributed memory. Our matrices are real positive, i. e., their real parts of the eigenvalues are positive. We have used GMRES(m) as our outer iterative method, so that the convergence of GMRES(m) for our test matrices are mathematically guaranteed. Interprocessor communications were done using MPI (Message Passing Interface) primitives. The results show that in general ILU(0) in the Multi-Color ordering ahd ILU(0) in the Wavefront ordering outperform the other methods but for symmetric and nearly symmetric 5-point matrices Multi-Color Block SOR gives the best performance, except for a few cases with a small number of processors.
Biclustering sparse binary genomic data.
van Uitert, Miranda; Meuleman, Wouter; Wessels, Lodewyk
2008-12-01
Genomic datasets often consist of large, binary, sparse data matrices. In such a dataset, one is often interested in finding contiguous blocks that (mostly) contain ones. This is a biclustering problem, and while many algorithms have been proposed to deal with gene expression data, only two algorithms have been proposed that specifically deal with binary matrices. None of the gene expression biclustering algorithms can handle the large number of zeros in sparse binary matrices. The two proposed binary algorithms failed to produce meaningful results. In this article, we present a new algorithm that is able to extract biclusters from sparse, binary datasets. A powerful feature is that biclusters with different numbers of rows and columns can be detected, varying from many rows to few columns and few rows to many columns. It allows the user to guide the search towards biclusters of specific dimensions. When applying our algorithm to an input matrix derived from TRANSFAC, we find transcription factors with distinctly dissimilar binding motifs, but a clear set of common targets that are significantly enriched for GO categories.
A Sparse Bayesian Approach for Forward-Looking Superresolution Radar Imaging
Zhang, Yin; Zhang, Yongchao; Huang, Yulin; Yang, Jianyu
2017-01-01
This paper presents a sparse superresolution approach for high cross-range resolution imaging of forward-looking scanning radar based on the Bayesian criterion. First, a novel forward-looking signal model is established as the product of the measurement matrix and the cross-range target distribution, which is more accurate than the conventional convolution model. Then, based on the Bayesian criterion, the widely-used sparse regularization is considered as the penalty term to recover the target distribution. The derivation of the cost function is described, and finally, an iterative expression for minimizing this function is presented. Alternatively, this paper discusses how to estimate the single parameter of Gaussian noise. With the advantage of a more accurate model, the proposed sparse Bayesian approach enjoys a lower model error. Meanwhile, when compared with the conventional superresolution methods, the proposed approach shows high cross-range resolution and small location error. The superresolution results for the simulated point target, scene data, and real measured data are presented to demonstrate the superior performance of the proposed approach. PMID:28604583
Sparse dictionary learning for resting-state fMRI analysis
NASA Astrophysics Data System (ADS)
Lee, Kangjoo; Han, Paul Kyu; Ye, Jong Chul
2011-09-01
Recently, there has been increased interest in the usage of neuroimaging techniques to investigate what happens in the brain at rest. Functional imaging studies have revealed that the default-mode network activity is disrupted in Alzheimer's disease (AD). However, there is no consensus, as yet, on the choice of analysis method for the application of resting-state analysis for disease classification. This paper proposes a novel compressed sensing based resting-state fMRI analysis tool called Sparse-SPM. As the brain's functional systems has shown to have features of complex networks according to graph theoretical analysis, we apply a graph model to represent a sparse combination of information flows in complex network perspectives. In particular, a new concept of spatially adaptive design matrix has been proposed by implementing sparse dictionary learning based on sparsity. The proposed approach shows better performance compared to other conventional methods, such as independent component analysis (ICA) and seed-based approach, in classifying the AD patients from normal using resting-state analysis.
NASA Astrophysics Data System (ADS)
Elkurdi, Yousef; Fernández, David; Souleimanov, Evgueni; Giannacopoulos, Dennis; Gross, Warren J.
2008-04-01
The Finite Element Method (FEM) is a computationally intensive scientific and engineering analysis tool that has diverse applications ranging from structural engineering to electromagnetic simulation. The trends in floating-point performance are moving in favor of Field-Programmable Gate Arrays (FPGAs), hence increasing interest has grown in the scientific community to exploit this technology. We present an architecture and implementation of an FPGA-based sparse matrix-vector multiplier (SMVM) for use in the iterative solution of large, sparse systems of equations arising from FEM applications. FEM matrices display specific sparsity patterns that can be exploited to improve the efficiency of hardware designs. Our architecture exploits FEM matrix sparsity structure to achieve a balance between performance and hardware resource requirements by relying on external SDRAM for data storage while utilizing the FPGAs computational resources in a stream-through systolic approach. The architecture is based on a pipelined linear array of processing elements (PEs) coupled with a hardware-oriented matrix striping algorithm and a partitioning scheme which enables it to process arbitrarily big matrices without changing the number of PEs in the architecture. Therefore, this architecture is only limited by the amount of external RAM available to the FPGA. The implemented SMVM-pipeline prototype contains 8 PEs and is clocked at 110 MHz obtaining a peak performance of 1.76 GFLOPS. For 8 GB/s of memory bandwidth typical of recent FPGA systems, this architecture can achieve 1.5 GFLOPS sustained performance. Using multiple instances of the pipeline, linear scaling of the peak and sustained performance can be achieved. Our stream-through architecture provides the added advantage of enabling an iterative implementation of the SMVM computation required by iterative solution techniques such as the conjugate gradient method, avoiding initialization time due to data loading and setup inside the FPGA internal memory.
Improved Estimation and Interpretation of Correlations in Neural Circuits
Yatsenko, Dimitri; Josić, Krešimir; Ecker, Alexander S.; Froudarakis, Emmanouil; Cotton, R. James; Tolias, Andreas S.
2015-01-01
Ambitious projects aim to record the activity of ever larger and denser neuronal populations in vivo. Correlations in neural activity measured in such recordings can reveal important aspects of neural circuit organization. However, estimating and interpreting large correlation matrices is statistically challenging. Estimation can be improved by regularization, i.e. by imposing a structure on the estimate. The amount of improvement depends on how closely the assumed structure represents dependencies in the data. Therefore, the selection of the most efficient correlation matrix estimator for a given neural circuit must be determined empirically. Importantly, the identity and structure of the most efficient estimator informs about the types of dominant dependencies governing the system. We sought statistically efficient estimators of neural correlation matrices in recordings from large, dense groups of cortical neurons. Using fast 3D random-access laser scanning microscopy of calcium signals, we recorded the activity of nearly every neuron in volumes 200 μm wide and 100 μm deep (150–350 cells) in mouse visual cortex. We hypothesized that in these densely sampled recordings, the correlation matrix should be best modeled as the combination of a sparse graph of pairwise partial correlations representing local interactions and a low-rank component representing common fluctuations and external inputs. Indeed, in cross-validation tests, the covariance matrix estimator with this structure consistently outperformed other regularized estimators. The sparse component of the estimate defined a graph of interactions. These interactions reflected the physical distances and orientation tuning properties of cells: The density of positive ‘excitatory’ interactions decreased rapidly with geometric distances and with differences in orientation preference whereas negative ‘inhibitory’ interactions were less selective. Because of its superior performance, this ‘sparse+latent’ estimator likely provides a more physiologically relevant representation of the functional connectivity in densely sampled recordings than the sample correlation matrix. PMID:25826696
GPU-accelerated Modeling and Element-free Reverse-time Migration with Gauss Points Partition
NASA Astrophysics Data System (ADS)
Zhen, Z.; Jia, X.
2014-12-01
Element-free method (EFM) has been applied to seismic modeling and migration. Compared with finite element method (FEM) and finite difference method (FDM), it is much cheaper and more flexible because only the information of the nodes and the boundary of the study area are required in computation. In the EFM, the number of Gauss points should be consistent with the number of model nodes; otherwise the accuracy of the intermediate coefficient matrices would be harmed. Thus when we increase the nodes of velocity model in order to obtain higher resolution, we find that the size of the computer's memory will be a bottleneck. The original EFM can deal with at most 81×81 nodes in the case of 2G memory, as tested by Jia and Hu (2006). In order to solve the problem of storage and computation efficiency, we propose a concept of Gauss points partition (GPP), and utilize the GPUs to improve the computation efficiency. Considering the characteristics of the Gaussian points, the GPP method doesn't influence the propagation of seismic wave in the velocity model. To overcome the time-consuming computation of the stiffness matrix (K) and the mass matrix (M), we also use the GPUs in our computation program. We employ the compressed sparse row (CSR) format to compress the intermediate sparse matrices and try to simplify the operations by solving the linear equations with the CULA Sparse's Conjugate Gradient (CG) solver instead of the linear sparse solver 'PARDISO'. It is observed that our strategy can significantly reduce the computational time of K and Mcompared with the algorithm based on CPU. The model tested is Marmousi model. The length of the model is 7425m and the depth is 2990m. We discretize the model with 595x298 nodes, 300x300 Gauss cells and 3x3 Gauss points in each cell. In contrast to the computational time of the conventional EFM, the GPUs-GPP approach can substantially improve the efficiency. The speedup ratio of time consumption of computing K, M is 120 and the speedup ratio time consumption of RTM is 11.5. At the same time, the accuracy of imaging is not harmed. Another advantage of the GPUs-GPP method is its easy applications in other numerical methods such as the FEM. Finally, in the GPUs-GPP method, the arrays require quite limited memory storage, which makes the method promising in dealing with large-scale 3D problems.
The CSM testbed matrix processors internal logic and dataflow descriptions
NASA Technical Reports Server (NTRS)
Regelbrugge, Marc E.; Wright, Mary A.
1988-01-01
This report constitutes the final report for subtask 1 of Task 5 of NASA Contract NAS1-18444, Computational Structural Mechanics (CSM) Research. This report contains a detailed description of the coded workings of selected CSM Testbed matrix processors (i.e., TOPO, K, INV, SSOL) and of the arithmetic utility processor AUS. These processors and the current sparse matrix data structures are studied and documented. Items examined include: details of the data structures, interdependence of data structures, data-blocking logic in the data structures, processor data flow and architecture, and processor algorithmic logic flow.
Energy conserving, linear scaling Born-Oppenheimer molecular dynamics.
Cawkwell, M J; Niklasson, Anders M N
2012-10-07
Born-Oppenheimer molecular dynamics simulations with long-term conservation of the total energy and a computational cost that scales linearly with system size have been obtained simultaneously. Linear scaling with a low pre-factor is achieved using density matrix purification with sparse matrix algebra and a numerical threshold on matrix elements. The extended Lagrangian Born-Oppenheimer molecular dynamics formalism [A. M. N. Niklasson, Phys. Rev. Lett. 100, 123004 (2008)] yields microcanonical trajectories with the approximate forces obtained from the linear scaling method that exhibit no systematic drift over hundreds of picoseconds and which are indistinguishable from trajectories computed using exact forces.
Reconstruction of Complex Network based on the Noise via QR Decomposition and Compressed Sensing.
Li, Lixiang; Xu, Dafei; Peng, Haipeng; Kurths, Jürgen; Yang, Yixian
2017-11-08
It is generally known that the states of network nodes are stable and have strong correlations in a linear network system. We find that without the control input, the method of compressed sensing can not succeed in reconstructing complex networks in which the states of nodes are generated through the linear network system. However, noise can drive the dynamics between nodes to break the stability of the system state. Therefore, a new method integrating QR decomposition and compressed sensing is proposed to solve the reconstruction problem of complex networks under the assistance of the input noise. The state matrix of the system is decomposed by QR decomposition. We construct the measurement matrix with the aid of Gaussian noise so that the sparse input matrix can be reconstructed by compressed sensing. We also discover that noise can build a bridge between the dynamics and the topological structure. Experiments are presented to show that the proposed method is more accurate and more efficient to reconstruct four model networks and six real networks by the comparisons between the proposed method and only compressed sensing. In addition, the proposed method can reconstruct not only the sparse complex networks, but also the dense complex networks.
Signal Sampling for Efficient Sparse Representation of Resting State FMRI Data
Ge, Bao; Makkie, Milad; Wang, Jin; Zhao, Shijie; Jiang, Xi; Li, Xiang; Lv, Jinglei; Zhang, Shu; Zhang, Wei; Han, Junwei; Guo, Lei; Liu, Tianming
2015-01-01
As the size of brain imaging data such as fMRI grows explosively, it provides us with unprecedented and abundant information about the brain. How to reduce the size of fMRI data but not lose much information becomes a more and more pressing issue. Recent literature studies tried to deal with it by dictionary learning and sparse representation methods, however, their computation complexities are still high, which hampers the wider application of sparse representation method to large scale fMRI datasets. To effectively address this problem, this work proposes to represent resting state fMRI (rs-fMRI) signals of a whole brain via a statistical sampling based sparse representation. First we sampled the whole brain’s signals via different sampling methods, then the sampled signals were aggregate into an input data matrix to learn a dictionary, finally this dictionary was used to sparsely represent the whole brain’s signals and identify the resting state networks. Comparative experiments demonstrate that the proposed signal sampling framework can speed-up by ten times in reconstructing concurrent brain networks without losing much information. The experiments on the 1000 Functional Connectomes Project further demonstrate its effectiveness and superiority. PMID:26646924
Joint Smoothed l₀-Norm DOA Estimation Algorithm for Multiple Measurement Vectors in MIMO Radar.
Liu, Jing; Zhou, Weidong; Juwono, Filbert H
2017-05-08
Direction-of-arrival (DOA) estimation is usually confronted with a multiple measurement vector (MMV) case. In this paper, a novel fast sparse DOA estimation algorithm, named the joint smoothed l 0 -norm algorithm, is proposed for multiple measurement vectors in multiple-input multiple-output (MIMO) radar. To eliminate the white or colored Gaussian noises, the new method first obtains a low-complexity high-order cumulants based data matrix. Then, the proposed algorithm designs a joint smoothed function tailored for the MMV case, based on which joint smoothed l 0 -norm sparse representation framework is constructed. Finally, for the MMV-based joint smoothed function, the corresponding gradient-based sparse signal reconstruction is designed, thus the DOA estimation can be achieved. The proposed method is a fast sparse representation algorithm, which can solve the MMV problem and perform well for both white and colored Gaussian noises. The proposed joint algorithm is about two orders of magnitude faster than the l 1 -norm minimization based methods, such as l 1 -SVD (singular value decomposition), RV (real-valued) l 1 -SVD and RV l 1 -SRACV (sparse representation array covariance vectors), and achieves better DOA estimation performance.
Hu, Shaoxing; Xu, Shike; Wang, Duhu; Zhang, Aiwu
2015-11-11
Aiming at addressing the problem of high computational cost of the traditional Kalman filter in SINS/GPS, a practical optimization algorithm with offline-derivation and parallel processing methods based on the numerical characteristics of the system is presented in this paper. The algorithm exploits the sparseness and/or symmetry of matrices to simplify the computational procedure. Thus plenty of invalid operations can be avoided by offline derivation using a block matrix technique. For enhanced efficiency, a new parallel computational mechanism is established by subdividing and restructuring calculation processes after analyzing the extracted "useful" data. As a result, the algorithm saves about 90% of the CPU processing time and 66% of the memory usage needed in a classical Kalman filter. Meanwhile, the method as a numerical approach needs no precise-loss transformation/approximation of system modules and the accuracy suffers little in comparison with the filter before computational optimization. Furthermore, since no complicated matrix theories are needed, the algorithm can be easily transplanted into other modified filters as a secondary optimization method to achieve further efficiency.
Wavelets in electronic structure calculations
NASA Astrophysics Data System (ADS)
Modisette, Jason Perry
1997-09-01
Ab initio calculations of the electronic structure of bulk materials and large clusters are not possible on today's computers using current techniques. The storage and diagonalization of the Hamiltonian matrix are the limiting factors in both memory and execution time. The scaling of both quantities with problem size can be reduced by using approximate diagonalization or direct minimization of the total energy with respect to the density matrix in conjunction with a localized basis. Wavelet basis members are much more localized than conventional bases such as Gaussians or numerical atomic orbitals. This localization leads to sparse matrices of the operators that arise in SCF multi-electron calculations. We have investigated the construction of the one-electron Hamiltonian, and also the effective one- electron Hamiltonians that appear in density-functional and Hartree-Fock theories. We develop efficient methods for the generation of the kinetic energy and potential matrices, the Hartree and exchange potentials, and the local exchange-correlation potential of the LDA. Test calculations are performed on one-electron problems with a variety of potentials in one and three dimensions.
Solution of the three-dimensional Helmholtz equation with nonlocal boundary conditions
NASA Technical Reports Server (NTRS)
Hodge, Steve L.; Zorumski, William E.; Watson, Willie R.
1995-01-01
The Helmholtz equation is solved within a three-dimensional rectangular duct with a nonlocal radiation boundary condition at the duct exit plane. This condition accurately models the acoustic admittance at an arbitrarily-located computational boundary plane. A linear system of equations is constructed with second-order central differences for the Helmholtz operator and second-order backward differences for both local admittance conditions and the gradient term in the nonlocal radiation boundary condition. The resulting matrix equation is large, sparse, and non-Hermitian. The size and structure of the matrix makes direct solution techniques impractical; as a result, a nonstationary iterative technique is used for its solution. The theory behind the nonstationary technique is reviewed, and numerical results are presented for radiation from both a point source and a planar acoustic source. The solutions with the nonlocal boundary conditions are invariant to the location of the computational boundary, and the same nonlocal conditions are valid for all solutions. The nonlocal conditions thus provide a means of minimizing the size of three-dimensional computational domains.
Energy Efficient GNSS Signal Acquisition Using Singular Value Decomposition (SVD).
Bermúdez Ordoñez, Juan Carlos; Arnaldo Valdés, Rosa María; Gómez Comendador, Fernando
2018-05-16
A significant challenge in global navigation satellite system (GNSS) signal processing is a requirement for a very high sampling rate. The recently-emerging compressed sensing (CS) theory makes processing GNSS signals at a low sampling rate possible if the signal has a sparse representation in a certain space. Based on CS and SVD theories, an algorithm for sampling GNSS signals at a rate much lower than the Nyquist rate and reconstructing the compressed signal is proposed in this research, which is validated after the output from that process still performs signal detection using the standard fast Fourier transform (FFT) parallel frequency space search acquisition. The sparse representation of the GNSS signal is the most important precondition for CS, by constructing a rectangular Toeplitz matrix (TZ) of the transmitted signal, calculating the left singular vectors using SVD from the TZ, to achieve sparse signal representation. Next, obtaining the M-dimensional observation vectors based on the left singular vectors of the SVD, which are equivalent to the sampler operator in standard compressive sensing theory, the signal can be sampled below the Nyquist rate, and can still be reconstructed via ℓ 1 minimization with accuracy using convex optimization. As an added value, there is a GNSS signal acquisition enhancement effect by retaining the useful signal and filtering out noise by projecting the signal into the most significant proper orthogonal modes (PODs) which are the optimal distributions of signal power. The algorithm is validated with real recorded signals, and the results show that the proposed method is effective for sampling, reconstructing intermediate frequency (IF) GNSS signals in the time discrete domain.
Energy Efficient GNSS Signal Acquisition Using Singular Value Decomposition (SVD)
Arnaldo Valdés, Rosa María; Gómez Comendador, Fernando
2018-01-01
A significant challenge in global navigation satellite system (GNSS) signal processing is a requirement for a very high sampling rate. The recently-emerging compressed sensing (CS) theory makes processing GNSS signals at a low sampling rate possible if the signal has a sparse representation in a certain space. Based on CS and SVD theories, an algorithm for sampling GNSS signals at a rate much lower than the Nyquist rate and reconstructing the compressed signal is proposed in this research, which is validated after the output from that process still performs signal detection using the standard fast Fourier transform (FFT) parallel frequency space search acquisition. The sparse representation of the GNSS signal is the most important precondition for CS, by constructing a rectangular Toeplitz matrix (TZ) of the transmitted signal, calculating the left singular vectors using SVD from the TZ, to achieve sparse signal representation. Next, obtaining the M-dimensional observation vectors based on the left singular vectors of the SVD, which are equivalent to the sampler operator in standard compressive sensing theory, the signal can be sampled below the Nyquist rate, and can still be reconstructed via ℓ1 minimization with accuracy using convex optimization. As an added value, there is a GNSS signal acquisition enhancement effect by retaining the useful signal and filtering out noise by projecting the signal into the most significant proper orthogonal modes (PODs) which are the optimal distributions of signal power. The algorithm is validated with real recorded signals, and the results show that the proposed method is effective for sampling, reconstructing intermediate frequency (IF) GNSS signals in the time discrete domain. PMID:29772731
Image fusion using sparse overcomplete feature dictionaries
Brumby, Steven P.; Bettencourt, Luis; Kenyon, Garrett T.; Chartrand, Rick; Wohlberg, Brendt
2015-10-06
Approaches for deciding what individuals in a population of visual system "neurons" are looking for using sparse overcomplete feature dictionaries are provided. A sparse overcomplete feature dictionary may be learned for an image dataset and a local sparse representation of the image dataset may be built using the learned feature dictionary. A local maximum pooling operation may be applied on the local sparse representation to produce a translation-tolerant representation of the image dataset. An object may then be classified and/or clustered within the translation-tolerant representation of the image dataset using a supervised classification algorithm and/or an unsupervised clustering algorithm.
Incorporating structure from motion uncertainty into image-based pose estimation
NASA Astrophysics Data System (ADS)
Ludington, Ben T.; Brown, Andrew P.; Sheffler, Michael J.; Taylor, Clark N.; Berardi, Stephen
2015-05-01
A method for generating and utilizing structure from motion (SfM) uncertainty estimates within image-based pose estimation is presented. The method is applied to a class of problems in which SfM algorithms are utilized to form a geo-registered reference model of a particular ground area using imagery gathered during flight by a small unmanned aircraft. The model is then used to form camera pose estimates in near real-time from imagery gathered later. The resulting pose estimates can be utilized by any of the other onboard systems (e.g. as a replacement for GPS data) or downstream exploitation systems, e.g., image-based object trackers. However, many of the consumers of pose estimates require an assessment of the pose accuracy. The method for generating the accuracy assessment is presented. First, the uncertainty in the reference model is estimated. Bundle Adjustment (BA) is utilized for model generation. While the high-level approach for generating a covariance matrix of the BA parameters is straightforward, typical computing hardware is not able to support the required operations due to the scale of the optimization problem within BA. Therefore, a series of sparse matrix operations is utilized to form an exact covariance matrix for only the parameters that are needed at a particular moment. Once the uncertainty in the model has been determined, it is used to augment Perspective-n-Point pose estimation algorithms to improve the pose accuracy and to estimate the resulting pose uncertainty. The implementation of the described method is presented along with results including results gathered from flight test data.
Xie, Jianwen; Douglas, Pamela K; Wu, Ying Nian; Brody, Arthur L; Anderson, Ariana E
2017-04-15
Brain networks in fMRI are typically identified using spatial independent component analysis (ICA), yet other mathematical constraints provide alternate biologically-plausible frameworks for generating brain networks. Non-negative matrix factorization (NMF) would suppress negative BOLD signal by enforcing positivity. Spatial sparse coding algorithms (L1 Regularized Learning and K-SVD) would impose local specialization and a discouragement of multitasking, where the total observed activity in a single voxel originates from a restricted number of possible brain networks. The assumptions of independence, positivity, and sparsity to encode task-related brain networks are compared; the resulting brain networks within scan for different constraints are used as basis functions to encode observed functional activity. These encodings are then decoded using machine learning, by using the time series weights to predict within scan whether a subject is viewing a video, listening to an audio cue, or at rest, in 304 fMRI scans from 51 subjects. The sparse coding algorithm of L1 Regularized Learning outperformed 4 variations of ICA (p<0.001) for predicting the task being performed within each scan using artifact-cleaned components. The NMF algorithms, which suppressed negative BOLD signal, had the poorest accuracy compared to the ICA and sparse coding algorithms. Holding constant the effect of the extraction algorithm, encodings using sparser spatial networks (containing more zero-valued voxels) had higher classification accuracy (p<0.001). Lower classification accuracy occurred when the extracted spatial maps contained more CSF regions (p<0.001). The success of sparse coding algorithms suggests that algorithms which enforce sparsity, discourage multitasking, and promote local specialization may capture better the underlying source processes than those which allow inexhaustible local processes such as ICA. Negative BOLD signal may capture task-related activations. Copyright © 2017 Elsevier B.V. All rights reserved.
Laplace-domain waveform modeling and inversion for the 3D acoustic-elastic coupled media
NASA Astrophysics Data System (ADS)
Shin, Jungkyun; Shin, Changsoo; Calandra, Henri
2016-06-01
Laplace-domain waveform inversion reconstructs long-wavelength subsurface models by using the zero-frequency component of damped seismic signals. Despite the computational advantages of Laplace-domain waveform inversion over conventional frequency-domain waveform inversion, an acoustic assumption and an iterative matrix solver have been used to invert 3D marine datasets to mitigate the intensive computing cost. In this study, we develop a Laplace-domain waveform modeling and inversion algorithm for 3D acoustic-elastic coupled media by using a parallel sparse direct solver library (MUltifrontal Massively Parallel Solver, MUMPS). We precisely simulate a real marine environment by coupling the 3D acoustic and elastic wave equations with the proper boundary condition at the fluid-solid interface. In addition, we can extract the elastic properties of the Earth below the sea bottom from the recorded acoustic pressure datasets. As a matrix solver, the parallel sparse direct solver is used to factorize the non-symmetric impedance matrix in a distributed memory architecture and rapidly solve the wave field for a number of shots by using the lower and upper matrix factors. Using both synthetic datasets and real datasets obtained by a 3D wide azimuth survey, the long-wavelength component of the P-wave and S-wave velocity models is reconstructed and the proposed modeling and inversion algorithm are verified. A cluster of 80 CPU cores is used for this study.
Pagès, Hervé
2018-01-01
Biological experiments involving genomics or other high-throughput assays typically yield a data matrix that can be explored and analyzed using the R programming language with packages from the Bioconductor project. Improvements in the throughput of these assays have resulted in an explosion of data even from routine experiments, which poses a challenge to the existing computational infrastructure for statistical data analysis. For example, single-cell RNA sequencing (scRNA-seq) experiments frequently generate large matrices containing expression values for each gene in each cell, requiring sparse or file-backed representations for memory-efficient manipulation in R. These alternative representations are not easily compatible with high-performance C++ code used for computationally intensive tasks in existing R/Bioconductor packages. Here, we describe a C++ interface named beachmat, which enables agnostic data access from various matrix representations. This allows package developers to write efficient C++ code that is interoperable with dense, sparse and file-backed matrices, amongst others. We evaluated the performance of beachmat for accessing data from each matrix representation using both simulated and real scRNA-seq data, and defined a clear memory/speed trade-off to motivate the choice of an appropriate representation. We also demonstrate how beachmat can be incorporated into the code of other packages to drive analyses of a very large scRNA-seq data set. PMID:29723188
Lun, Aaron T L; Pagès, Hervé; Smith, Mike L
2018-05-01
Biological experiments involving genomics or other high-throughput assays typically yield a data matrix that can be explored and analyzed using the R programming language with packages from the Bioconductor project. Improvements in the throughput of these assays have resulted in an explosion of data even from routine experiments, which poses a challenge to the existing computational infrastructure for statistical data analysis. For example, single-cell RNA sequencing (scRNA-seq) experiments frequently generate large matrices containing expression values for each gene in each cell, requiring sparse or file-backed representations for memory-efficient manipulation in R. These alternative representations are not easily compatible with high-performance C++ code used for computationally intensive tasks in existing R/Bioconductor packages. Here, we describe a C++ interface named beachmat, which enables agnostic data access from various matrix representations. This allows package developers to write efficient C++ code that is interoperable with dense, sparse and file-backed matrices, amongst others. We evaluated the performance of beachmat for accessing data from each matrix representation using both simulated and real scRNA-seq data, and defined a clear memory/speed trade-off to motivate the choice of an appropriate representation. We also demonstrate how beachmat can be incorporated into the code of other packages to drive analyses of a very large scRNA-seq data set.
Multilevel sparse functional principal component analysis.
Di, Chongzhi; Crainiceanu, Ciprian M; Jank, Wolfgang S
2014-01-29
We consider analysis of sparsely sampled multilevel functional data, where the basic observational unit is a function and data have a natural hierarchy of basic units. An example is when functions are recorded at multiple visits for each subject. Multilevel functional principal component analysis (MFPCA; Di et al. 2009) was proposed for such data when functions are densely recorded. Here we consider the case when functions are sparsely sampled and may contain only a few observations per function. We exploit the multilevel structure of covariance operators and achieve data reduction by principal component decompositions at both between and within subject levels. We address inherent methodological differences in the sparse sampling context to: 1) estimate the covariance operators; 2) estimate the functional principal component scores; 3) predict the underlying curves. Through simulations the proposed method is able to discover dominating modes of variations and reconstruct underlying curves well even in sparse settings. Our approach is illustrated by two applications, the Sleep Heart Health Study and eBay auctions.
Use of the Homeland-Defense Operational Planning System (HOPS) for Emergency Management
DOE Office of Scientific and Technical Information (OSTI.GOV)
Durling, Jr., R L; Price, D E
2005-12-16
The Homeland-Defense Operational Planning System (HOPS), is a new operational planning tool leveraging Lawrence Livermore National Laboratory's expertise in weapons systems and in sparse information analysis to support the defense of the U.S. homeland. HOPS provides planners with a basis to make decisions to protect against acts of terrorism, focusing on the defense of facilities critical to U.S. infrastructure. Criticality of facilities, structures, and systems is evaluated on a composite matrix of specific projected casualty, economic, and sociopolitical impact bins. Based on these criteria, significant unidentified vulnerabilities are identified and secured. To provide insight into potential successes by malevolent actors,more » HOPS analysts strive to base their efforts mainly on unclassified open-source data. However, more cooperation is needed between HOPS analysts and facility representatives to provide an advantage to those whose task is to defend these facilities. Evaluated facilities include: refineries, major ports, nuclear power plants and other nuclear licensees, dams, government installations, convention centers, sports stadiums, tourist venues, and public and freight transportation systems. A generalized summary of analyses of U.S. infrastructure facilities will be presented.« less
Risk Assessment Using The Homeland-Defense Operational Planning System (HOPS)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Price, D E; Durling, R L
2005-10-10
The Homeland-Defense Operational Planning System (HOPS), is a new operational planning tool leveraging Lawrence Livermore National Laboratory's expertise in weapons systems and in sparse information analysis to support the defense of the U.S. homeland. HOPS provides planners with a basis to make decisions to protect against acts of terrorism, focusing on the defense of facilities critical to U.S. infrastructure. Criticality of facilities, structures, and systems is evaluated on a composite matrix of specific projected casualty, economic, and sociopolitical impact bins. Based on these criteria, significant unidentified vulnerabilities are identified and secured. To provide insight into potential successes by malevolent actors,more » HOPS analysts strive to base their efforts mainly on unclassified open-source data. However, more cooperation is needed between HOPS analysts and facility representatives to provide an advantage to those whose task is to defend these facilities. Evaluated facilities include: refineries, major ports, nuclear power plants and other nuclear licensees, dams, government installations, convention centers, sports stadiums, tourist venues, and public and freight transportation systems. A generalized summary of analyses of U.S. infrastructure facilities will be presented.« less
A three-dimensional wide-angle BPM for optical waveguide structures.
Ma, Changbao; Van Keuren, Edward
2007-01-22
Algorithms for effective modeling of optical propagation in three- dimensional waveguide structures are critical for the design of photonic devices. We present a three-dimensional (3-D) wide-angle beam propagation method (WA-BPM) using Hoekstra's scheme. A sparse matrix algebraic equation is formed and solved using iterative methods. The applicability, accuracy and effectiveness of our method are demonstrated by applying it to simulations of wide-angle beam propagation, along with a technique for shifting the simulation window to reduce the dimension of the numerical equation and a threshold technique to further ensure its convergence. These techniques can ensure the implementation of iterative methods for waveguide structures by relaxing the convergence problem, which will further enable us to develop higher-order 3-D WA-BPMs based on Padé approximant operators.
A three-dimensional wide-angle BPM for optical waveguide structures
NASA Astrophysics Data System (ADS)
Ma, Changbao; van Keuren, Edward
2007-01-01
Algorithms for effective modeling of optical propagation in three- dimensional waveguide structures are critical for the design of photonic devices. We present a three-dimensional (3-D) wide-angle beam propagation method (WA-BPM) using Hoekstra’s scheme. A sparse matrix algebraic equation is formed and solved using iterative methods. The applicability, accuracy and effectiveness of our method are demonstrated by applying it to simulations of wide-angle beam propagation, along with a technique for shifting the simulation window to reduce the dimension of the numerical equation and a threshold technique to further ensure its convergence. These techniques can ensure the implementation of iterative methods for waveguide structures by relaxing the convergence problem, which will further enable us to develop higher-order 3-D WA-BPMs based on Padé approximant operators.
Automatic Management of Parallel and Distributed System Resources
NASA Technical Reports Server (NTRS)
Yan, Jerry; Ngai, Tin Fook; Lundstrom, Stephen F.
1990-01-01
Viewgraphs on automatic management of parallel and distributed system resources are presented. Topics covered include: parallel applications; intelligent management of multiprocessing systems; performance evaluation of parallel architecture; dynamic concurrent programs; compiler-directed system approach; lattice gaseous cellular automata; and sparse matrix Cholesky factorization.
Condition Number Estimation of Preconditioned Matrices
Kushida, Noriyuki
2015-01-01
The present paper introduces a condition number estimation method for preconditioned matrices. The newly developed method provides reasonable results, while the conventional method which is based on the Lanczos connection gives meaningless results. The Lanczos connection based method provides the condition numbers of coefficient matrices of systems of linear equations with information obtained through the preconditioned conjugate gradient method. Estimating the condition number of preconditioned matrices is sometimes important when describing the effectiveness of new preconditionerers or selecting adequate preconditioners. Operating a preconditioner on a coefficient matrix is the simplest method of estimation. However, this is not possible for large-scale computing, especially if computation is performed on distributed memory parallel computers. This is because, the preconditioned matrices become dense, even if the original matrices are sparse. Although the Lanczos connection method can be used to calculate the condition number of preconditioned matrices, it is not considered to be applicable to large-scale problems because of its weakness with respect to numerical errors. Therefore, we have developed a robust and parallelizable method based on Hager’s method. The feasibility studies are curried out for the diagonal scaling preconditioner and the SSOR preconditioner with a diagonal matrix, a tri-daigonal matrix and Pei’s matrix. As a result, the Lanczos connection method contains around 10% error in the results even with a simple problem. On the other hand, the new method contains negligible errors. In addition, the newly developed method returns reasonable solutions when the Lanczos connection method fails with Pei’s matrix, and matrices generated with the finite element method. PMID:25816331
Hwang, Geelsu; Koltisko, Bernard; Jin, Xiaoming; Koo, Hyun
2017-11-08
Surface-grown bacteria and production of an extracellular polymeric matrix modulate the assembly of highly cohesive and firmly attached biofilms, making them difficult to remove from solid surfaces. Inhibition of cell growth and inactivation of matrix-producing bacteria can impair biofilm formation and facilitate removal. Here, we developed a novel nonleachable antibacterial composite with potent antibiofilm activity by directly incorporating polymerizable imidazolium-containing resin (antibacterial resin with carbonate linkage; ABR-C) into a methacrylate-based scaffold (ABR-modified composite; ABR-MC) using an efficient yet simplified chemistry. Low-dose inclusion of imidazolium moiety (∼2 wt %) resulted in bioactivity with minimal cytotoxicity without compromising mechanical integrity of the restorative material. The antibiofilm properties of ABR-MC were assessed using an exopolysaccharide-matrix-producing (EPS-matrix-producing) oral pathogen (Streptococcus mutans) in an experimental biofilm model. Using high-resolution confocal fluorescence imaging and biophysical methods, we observed remarkable disruption of bacterial accumulation and defective 3D matrix structure on the surface of ABR-MC. Specifically, the antibacterial composite impaired the ability of S. mutans to form organized bacterial clusters on the surface, resulting in altered biofilm architecture with sparse cell accumulation and reduced amounts of EPS matrix (versus control composite). Biofilm topology analyses on the control composite revealed a highly organized and weblike EPS structure that tethers the bacterial clusters to each other and to the surface, forming a highly cohesive unit. In contrast, such a structured matrix was absent on the surface of ABR-MC with mostly sparse and amorphous EPS, indicating disruption in the biofilm physical stability. Consistent with lack of structural organization, the defective biofilm on the surface of ABR-MC was readily detached when subjected to low shear stress, while most of the biofilm biomass remained on the control surface. Altogether, we demonstrate a new nonleachable antibacterial composite with excellent antibiofilm activity without affecting its mechanical properties, which may serve as a platform for development of alternative antifouling biomaterials.
Das, Anup; Sampson, Aaron L.; Lainscsek, Claudia; Muller, Lyle; Lin, Wutu; Doyle, John C.; Cash, Sydney S.; Halgren, Eric; Sejnowski, Terrence J.
2017-01-01
The correlation method from brain imaging has been used to estimate functional connectivity in the human brain. However, brain regions might show very high correlation even when the two regions are not directly connected due to the strong interaction of the two regions with common input from a third region. One previously proposed solution to this problem is to use a sparse regularized inverse covariance matrix or precision matrix (SRPM) assuming that the connectivity structure is sparse. This method yields partial correlations to measure strong direct interactions between pairs of regions while simultaneously removing the influence of the rest of the regions, thus identifying regions that are conditionally independent. To test our methods, we first demonstrated conditions under which the SRPM method could indeed find the true physical connection between a pair of nodes for a spring-mass example and an RC circuit example. The recovery of the connectivity structure using the SRPM method can be explained by energy models using the Boltzmann distribution. We then demonstrated the application of the SRPM method for estimating brain connectivity during stage 2 sleep spindles from human electrocorticography (ECoG) recordings using an 8 × 8 electrode array. The ECoG recordings that we analyzed were from a 32-year-old male patient with long-standing pharmaco-resistant left temporal lobe complex partial epilepsy. Sleep spindles were automatically detected using delay differential analysis and then analyzed with SRPM and the Louvain method for community detection. We found spatially localized brain networks within and between neighboring cortical areas during spindles, in contrast to the case when sleep spindles were not present. PMID:28095202
Efficient Implementations of the Quadrature-Free Discontinuous Galerkin Method
NASA Technical Reports Server (NTRS)
Lockard, David P.; Atkins, Harold L.
1999-01-01
The efficiency of the quadrature-free form of the dis- continuous Galerkin method in two dimensions, and briefly in three dimensions, is examined. Most of the work for constant-coefficient, linear problems involves the volume and edge integrations, and the transformation of information from the volume to the edges. These operations can be viewed as matrix-vector multiplications. Many of the matrices are sparse as a result of symmetry, and blocking and specialized multiplication routines are used to account for the sparsity. By optimizing these operations, a 35% reduction in total CPU time is achieved. For nonlinear problems, the calculation of the flux becomes dominant because of the cost associated with polynomial products and inversion. This component of the work can be reduced by up to 75% when the products are approximated by truncating terms. Because the cost is high for nonlinear problems on general elements, it is suggested that simplified physics and the most efficient element types be used over most of the domain.
Functional brain networks reconstruction using group sparsity-regularized learning.
Zhao, Qinghua; Li, Will X Y; Jiang, Xi; Lv, Jinglei; Lu, Jianfeng; Liu, Tianming
2018-06-01
Investigating functional brain networks and patterns using sparse representation of fMRI data has received significant interests in the neuroimaging community. It has been reported that sparse representation is effective in reconstructing concurrent and interactive functional brain networks. To date, most of data-driven network reconstruction approaches rarely take consideration of anatomical structures, which are the substrate of brain function. Furthermore, it has been rarely explored whether structured sparse representation with anatomical guidance could facilitate functional networks reconstruction. To address this problem, in this paper, we propose to reconstruct brain networks utilizing the structure guided group sparse regression (S2GSR) in which 116 anatomical regions from the AAL template, as prior knowledge, are employed to guide the network reconstruction when performing sparse representation of whole-brain fMRI data. Specifically, we extract fMRI signals from standard space aligned with the AAL template. Then by learning a global over-complete dictionary, with the learned dictionary as a set of features (regressors), the group structured regression employs anatomical structures as group information to regress whole brain signals. Finally, the decomposition coefficients matrix is mapped back to the brain volume to represent functional brain networks and patterns. We use the publicly available Human Connectome Project (HCP) Q1 dataset as the test bed, and the experimental results indicate that the proposed anatomically guided structure sparse representation is effective in reconstructing concurrent functional brain networks.
COMPUTATION OF GLOBAL PHOTOCHEMISTRY WITH SMVGEAR II (R823186)
A computer model was developed to simulate global gas-phase photochemistry. The model solves chemical equations with SMVGEAR II, a sparse-matrix, vectorized Gear-type code. To obtain SMVGEAR II, the original SMVGEAR code was modified to allow computation of different sets of chem...
NASA Astrophysics Data System (ADS)
Liu, Yang; Li, Feng; Xin, Lei; Fu, Jie; Huang, Puming
2017-10-01
Large amount of data is one of the most obvious features in satellite based remote sensing systems, which is also a burden for data processing and transmission. The theory of compressive sensing(CS) has been proposed for almost a decade, and massive experiments show that CS has favorable performance in data compression and recovery, so we apply CS theory to remote sensing images acquisition. In CS, the construction of classical sensing matrix for all sparse signals has to satisfy the Restricted Isometry Property (RIP) strictly, which limits applying CS in practical in image compression. While for remote sensing images, we know some inherent characteristics such as non-negative, smoothness and etc.. Therefore, the goal of this paper is to present a novel measurement matrix that breaks RIP. The new sensing matrix consists of two parts: the standard Nyquist sampling matrix for thumbnails and the conventional CS sampling matrix. Since most of sun-synchronous based satellites fly around the earth 90 minutes and the revisit cycle is also short, lots of previously captured remote sensing images of the same place are available in advance. This drives us to reconstruct remote sensing images through a deep learning approach with those measurements from the new framework. Therefore, we propose a novel deep convolutional neural network (CNN) architecture which takes in undersampsing measurements as input and outputs an intermediate reconstruction image. It is well known that the training procedure to the network costs long time, luckily, the training step can be done only once, which makes the approach attractive for a host of sparse recovery problems.
Optimization of sparse matrix-vector multiplication on emerging multicore platforms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Williams, Samuel; Oliker, Leonid; Vuduc, Richard
2007-01-01
We are witnessing a dramatic change in computer architecture due to the multicore paradigm shift, as every electronic device from cell phones to supercomputers confronts parallelism of unprecedented scale. To fully unleash the potential of these systems, the HPC community must develop multicore specific optimization methodologies for important scientific computations. In this work, we examine sparse matrix-vector multiply (SpMV) - one of the most heavily used kernels in scientific computing - across a broad spectrum of multicore designs. Our experimental platform includes the homogeneous AMD dual-core and Intel quad-core designs, the heterogeneous STI Cell, as well as the first scientificmore » study of the highly multithreaded Sun Niagara2. We present several optimization strategies especially effective for the multicore environment, and demonstrate significant performance improvements compared to existing state-of-the-art serial and parallel SpMV implementations. Additionally, we present key insights into the architectural tradeoffs of leading multicore design strategies, in the context of demanding memory-bound numerical algorithms.« less
Multitasking the Davidson algorithm for the large, sparse eigenvalue problem
DOE Office of Scientific and Technical Information (OSTI.GOV)
Umar, V.M.; Fischer, C.F.
1989-01-01
The authors report how the Davidson algorithm, developed for handling the eigenvalue problem for large and sparse matrices arising in quantum chemistry, was modified for use in atomic structure calculations. To date these calculations have used traditional eigenvalue methods, which limit the range of feasible calculations because of their excessive memory requirements and unsatisfactory performance attributed to time-consuming and costly processing of zero valued elements. The replacement of a traditional matrix eigenvalue method by the Davidson algorithm reduced these limitations. Significant speedup was found, which varied with the size of the underlying problem and its sparsity. Furthermore, the range ofmore » matrix sizes that can be manipulated efficiently was expended by more than one order or magnitude. On the CRAY X-MP the code was vectorized and the importance of gather/scatter analyzed. A parallelized version of the algorithm obtained an additional 35% reduction in execution time. Speedup due to vectorization and concurrency was also measured on the Alliant FX/8.« less
Holographic implementation of a binary associative memory for improved recognition
NASA Astrophysics Data System (ADS)
Bandyopadhyay, Somnath; Ghosh, Ajay; Datta, Asit K.
1998-03-01
Neural network associate memory has found wide application sin pattern recognition techniques. We propose an associative memory model for binary character recognition. The interconnection strengths of the memory are binary valued. The concept of sparse coding is sued to enhance the storage efficiency of the model. The question of imposed preconditioning of pattern vectors, which is inherent in a sparsely coded conventional memory, is eliminated by using a multistep correlation technique an the ability of correct association is enhanced in a real-time application. A potential optoelectronic implementation of the proposed associative memory is also described. The learning and recall is possible by using digital optical matrix-vector multiplication, where full use of parallelism and connectivity of optics is made. A hologram is used in the experiment as a longer memory (LTM) for storing all input information. The short-term memory or the interconnection weight matrix required during the recall process is configured by retrieving the necessary information from the holographic LTM.
Multiprocessor sparse L/U decomposition with controlled fill-in
NASA Technical Reports Server (NTRS)
Alaghband, G.; Jordan, H. F.
1985-01-01
Generation of the maximal compatibles of pivot elements for a class of small sparse matrices is studied. The algorithm involves a binary tree search and has a complexity exponential in the order of the matrix. Different strategies for selection of a set of compatible pivots based on the Markowitz criterion are investigated. The competing issues of parallelism and fill-in generation are studied and results are provided. A technque for obtaining an ordered compatible set directly from the ordered incompatible table is given. This technique generates a set of compatible pivots with the property of generating few fills. A new hueristic algorithm is then proposed that combines the idea of an ordered compatible set with a limited binary tree search to generate several sets of compatible pivots in linear time. Finally, an elimination set to reduce the matrix is selected. Parameters are suggested to obtain a balance between parallelism and fill-ins. Results of applying the proposed algorithms on several large application matrices are presented and analyzed.
A method of vehicle license plate recognition based on PCANet and compressive sensing
NASA Astrophysics Data System (ADS)
Ye, Xianyi; Min, Feng
2018-03-01
The manual feature extraction of the traditional method for vehicle license plates has no good robustness to change in diversity. And the high feature dimension that is extracted with Principal Component Analysis Network (PCANet) leads to low classification efficiency. For solving these problems, a method of vehicle license plate recognition based on PCANet and compressive sensing is proposed. First, PCANet is used to extract the feature from the images of characters. And then, the sparse measurement matrix which is a very sparse matrix and consistent with Restricted Isometry Property (RIP) condition of the compressed sensing is used to reduce the dimensions of extracted features. Finally, the Support Vector Machine (SVM) is used to train and recognize the features whose dimension has been reduced. Experimental results demonstrate that the proposed method has better performance than Convolutional Neural Network (CNN) in the recognition and time. Compared with no compression sensing, the proposed method has lower feature dimension for the increase of efficiency.
Horak, Jakub
2014-06-01
The conservation of traditional fruit orchards might be considered to be a fashion, and many people might find it difficult to accept that these artificial habitats can be significant for overall biodiversity. The main aim of this study was to identify possible roles of traditional fruit orchards for dead wood-dependent (saproxylic) beetles. The study was performed in the Central European landscape in the Czech Republic, which was historically covered by lowland sparse deciduous woodlands. Window traps were used to catch saproxylic beetles in 25 traditional fruit orchards. The species richness, as one of the best indicators of biodiversity, was positively driven by very high canopy openness and the rising proportion of deciduous woodlands in the matrix of the surrounding landscape. Due to the disappearance of natural and semi-natural habitats (i.e., sparse deciduous woodlands) of saproxylic beetles, orchards might complement the functions of suitable habitat fragments as the last biotic islands in the matrix of the cultural Central European landscape.
NASA Astrophysics Data System (ADS)
Yang, Yongchao; Nagarajaiah, Satish
2016-06-01
Randomly missing data of structural vibration responses time history often occurs in structural dynamics and health monitoring. For example, structural vibration responses are often corrupted by outliers or erroneous measurements due to sensor malfunction; in wireless sensing platforms, data loss during wireless communication is a common issue. Besides, to alleviate the wireless data sampling or communication burden, certain accounts of data are often discarded during sampling or before transmission. In these and other applications, recovery of the randomly missing structural vibration responses from the available, incomplete data, is essential for system identification and structural health monitoring; it is an ill-posed inverse problem, however. This paper explicitly harnesses the data structure itself-of the structural vibration responses-to address this (inverse) problem. What is relevant is an empirical, but often practically true, observation, that is, typically there are only few modes active in the structural vibration responses; hence a sparse representation (in frequency domain) of the single-channel data vector, or, a low-rank structure (by singular value decomposition) of the multi-channel data matrix. Exploiting such prior knowledge of data structure (intra-channel sparse or inter-channel low-rank), the new theories of ℓ1-minimization sparse recovery and nuclear-norm-minimization low-rank matrix completion enable recovery of the randomly missing or corrupted structural vibration response data. The performance of these two alternatives, in terms of recovery accuracy and computational time under different data missing rates, is investigated on a few structural vibration response data sets-the seismic responses of the super high-rise Canton Tower and the structural health monitoring accelerations of a real large-scale cable-stayed bridge. Encouraging results are obtained and the applicability and limitation of the presented methods are discussed.
Action Recognition Using Nonnegative Action Component Representation and Sparse Basis Selection.
Wang, Haoran; Yuan, Chunfeng; Hu, Weiming; Ling, Haibin; Yang, Wankou; Sun, Changyin
2014-02-01
In this paper, we propose using high-level action units to represent human actions in videos and, based on such units, a novel sparse model is developed for human action recognition. There are three interconnected components in our approach. First, we propose a new context-aware spatial-temporal descriptor, named locally weighted word context, to improve the discriminability of the traditionally used local spatial-temporal descriptors. Second, from the statistics of the context-aware descriptors, we learn action units using the graph regularized nonnegative matrix factorization, which leads to a part-based representation and encodes the geometrical information. These units effectively bridge the semantic gap in action recognition. Third, we propose a sparse model based on a joint l2,1-norm to preserve the representative items and suppress noise in the action units. Intuitively, when learning the dictionary for action representation, the sparse model captures the fact that actions from the same class share similar units. The proposed approach is evaluated on several publicly available data sets. The experimental results and analysis clearly demonstrate the effectiveness of the proposed approach.
Ren, Yudan; Fang, Jun; Lv, Jinglei; Hu, Xintao; Guo, Cong Christine; Guo, Lei; Xu, Jiansong; Potenza, Marc N; Liu, Tianming
2017-08-01
Assessing functional brain activation patterns in neuropsychiatric disorders such as cocaine dependence (CD) or pathological gambling (PG) under naturalistic stimuli has received rising interest in recent years. In this paper, we propose and apply a novel group-wise sparse representation framework to assess differences in neural responses to naturalistic stimuli across multiple groups of participants (healthy control, cocaine dependence, pathological gambling). Specifically, natural stimulus fMRI (N-fMRI) signals from all three groups of subjects are aggregated into a big data matrix, which is then decomposed into a common signal basis dictionary and associated weight coefficient matrices via an effective online dictionary learning and sparse coding method. The coefficient matrices associated with each common dictionary atom are statistically assessed for each group separately. With the inter-group comparisons based on the group-wise correspondence established by the common dictionary, our experimental results demonstrated that the group-wise sparse coding and representation strategy can effectively and specifically detect brain networks/regions affected by different pathological conditions of the brain under naturalistic stimuli.
Resolution enhancement of tri-stereo remote sensing images by super resolution methods
NASA Astrophysics Data System (ADS)
Tuna, Caglayan; Akoguz, Alper; Unal, Gozde; Sertel, Elif
2016-10-01
Super resolution (SR) refers to generation of a High Resolution (HR) image from a decimated, blurred, low-resolution (LR) image set, which can be either a single frame or multi-frame that contains a collection of several images acquired from slightly different views of the same observation area. In this study, we propose a novel application of tri-stereo Remote Sensing (RS) satellite images to the super resolution problem. Since the tri-stereo RS images of the same observation area are acquired from three different viewing angles along the flight path of the satellite, these RS images are properly suited to a SR application. We first estimate registration between the chosen reference LR image and other LR images to calculate the sub pixel shifts among the LR images. Then, the warping, blurring and down sampling matrix operators are created as sparse matrices to avoid high memory and computational requirements, which would otherwise make the RS-SR solution impractical. Finally, the overall system matrix, which is constructed based on the obtained operator matrices is used to obtain the estimate HR image in one step in each iteration of the SR algorithm. Both the Laplacian and total variation regularizers are incorporated separately into our algorithm and the results are presented to demonstrate an improved quantitative performance against the standard interpolation method as well as improved qualitative results due expert evaluations.
Jelsch, C
2001-09-01
The normal matrix in the least-squares refinement of macromolecules is very sparse when the resolution reaches atomic and subatomic levels. The elements of the normal matrix, related to coordinates, thermal motion and charge-density parameters, have a global tendency to decrease rapidly with the interatomic distance between the atoms concerned. For instance, in the case of the protein crambin at 0.54 A resolution, the elements are reduced by two orders of magnitude for distances above 1.5 A. The neglect a priori of most of the normal-matrix elements according to a distance criterion represents an approximation in the refinement of macromolecules, which is particularly valid at very high resolution. The analytical expressions of the normal-matrix elements, which have been derived for the coordinates and the thermal parameters, show that the degree of matrix sparsity increases with the diffraction resolution and the size of the asymmetric unit.
Ordering Unstructured Meshes for Sparse Matrix Computations on Leading Parallel Systems
NASA Technical Reports Server (NTRS)
Oliker, Leonid; Li, Xiaoye; Heber, Gerd; Biswas, Rupak
2000-01-01
The ability of computers to solve hitherto intractable problems and simulate complex processes using mathematical models makes them an indispensable part of modern science and engineering. Computer simulations of large-scale realistic applications usually require solving a set of non-linear partial differential equations (PDES) over a finite region. For example, one thrust area in the DOE Grand Challenge projects is to design future accelerators such as the SpaHation Neutron Source (SNS). Our colleagues at SLAC need to model complex RFQ cavities with large aspect ratios. Unstructured grids are currently used to resolve the small features in a large computational domain; dynamic mesh adaptation will be added in the future for additional efficiency. The PDEs for electromagnetics are discretized by the FEM method, which leads to a generalized eigenvalue problem Kx = AMx, where K and M are the stiffness and mass matrices, and are very sparse. In a typical cavity model, the number of degrees of freedom is about one million. For such large eigenproblems, direct solution techniques quickly reach the memory limits. Instead, the most widely-used methods are Krylov subspace methods, such as Lanczos or Jacobi-Davidson. In all the Krylov-based algorithms, sparse matrix-vector multiplication (SPMV) must be performed repeatedly. Therefore, the efficiency of SPMV usually determines the eigensolver speed. SPMV is also one of the most heavily used kernels in large-scale numerical simulations.
Structure-preserving and rank-revealing QR-factorizations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bischof, C.H.; Hansen, P.C.
1991-11-01
The rank-revealing QR-factorization (RRQR-factorization) is a special QR-factorization that is guaranteed to reveal the numerical rank of the matrix under consideration. This makes the RRQR-factorization a useful tool in the numerical treatment of many rank-deficient problems in numerical linear algebra. In this paper, a framework is presented for the efficient implementation of RRQR algorithms, in particular, for sparse matrices. A sparse RRQR-algorithm should seek to preserve the structure and sparsity of the matrix as much as possible while retaining the ability to capture safely the numerical rank. To this end, the paper proposes to compute an initial QR-factorization using amore » restricted pivoting strategy guarded by incremental condition estimation (ICE), and then applies the algorithm suggested by Chan and Foster to this QR-factorization. The column exchange strategy used in the initial QR factorization will exploit the fact that certain column exchanges do not change the sparsity structure, and compute a sparse QR-factorization that is a good approximation of the sought-after RRQR-factorization. Due to quantities produced by ICE, the Chan/Foster RRQR algorithm can be implemented very cheaply, thus verifying that the sought-after RRQR-factorization has indeed been computed. Experimental results on a model problem show that the initial QR-factorization is indeed very likely to produce RRQR-factorization.« less
A Shifted Block Lanczos Algorithm 1: The Block Recurrence
NASA Technical Reports Server (NTRS)
Grimes, Roger G.; Lewis, John G.; Simon, Horst D.
1990-01-01
In this paper we describe a block Lanczos algorithm that is used as the key building block of a software package for the extraction of eigenvalues and eigenvectors of large sparse symmetric generalized eigenproblems. The software package comprises: a version of the block Lanczos algorithm specialized for spectrally transformed eigenproblems; an adaptive strategy for choosing shifts, and efficient codes for factoring large sparse symmetric indefinite matrices. This paper describes the algorithmic details of our block Lanczos recurrence. This uses a novel combination of block generalizations of several features that have only been investigated independently in the past. In particular new forms of partial reorthogonalization, selective reorthogonalization and local reorthogonalization are used, as is a new algorithm for obtaining the M-orthogonal factorization of a matrix. The heuristic shifting strategy, the integration with sparse linear equation solvers and numerical experience with the code are described in a companion paper.
Color Sparse Representations for Image Processing: Review, Models, and Prospects.
Barthélemy, Quentin; Larue, Anthony; Mars, Jérôme I
2015-11-01
Sparse representations have been extended to deal with color images composed of three channels. A review of dictionary-learning-based sparse representations for color images is made here, detailing the differences between the models, and comparing their results on the real and simulated data. These models are considered in a unifying framework that is based on the degrees of freedom of the linear filtering/transformation of the color channels. Moreover, this allows it to be shown that the scalar quaternionic linear model is equivalent to constrained matrix-based color filtering, which highlights the filtering implicitly applied through this model. Based on this reformulation, the new color filtering model is introduced, using unconstrained filters. In this model, spatial morphologies of color images are encoded by atoms, and colors are encoded by color filters. Color variability is no longer captured in increasing the dictionary size, but with color filters, this gives an efficient color representation.
A sparse equivalent source method for near-field acoustic holography.
Fernandez-Grande, Efren; Xenaki, Angeliki; Gerstoft, Peter
2017-01-01
This study examines a near-field acoustic holography method consisting of a sparse formulation of the equivalent source method, based on the compressive sensing (CS) framework. The method, denoted Compressive-Equivalent Source Method (C-ESM), encourages spatially sparse solutions (based on the superposition of few waves) that are accurate when the acoustic sources are spatially localized. The importance of obtaining a non-redundant representation, i.e., a sensing matrix with low column coherence, and the inherent ill-conditioning of near-field reconstruction problems is addressed. Numerical and experimental results on a classical guitar and on a highly reactive dipole-like source are presented. C-ESM is valid beyond the conventional sampling limits, making wide-band reconstruction possible. Spatially extended sources can also be addressed with C-ESM, although in this case the obtained solution does not recover the spatial extent of the source.
On the sparseness of 1-norm support vector machines.
Zhang, Li; Zhou, Weida
2010-04-01
There is some empirical evidence available showing that 1-norm Support Vector Machines (1-norm SVMs) have good sparseness; however, both how good sparseness 1-norm SVMs can reach and whether they have a sparser representation than that of standard SVMs are not clear. In this paper we take into account the sparseness of 1-norm SVMs. Two upper bounds on the number of nonzero coefficients in the decision function of 1-norm SVMs are presented. First, the number of nonzero coefficients in 1-norm SVMs is at most equal to the number of only the exact support vectors lying on the +1 and -1 discriminating surfaces, while that in standard SVMs is equal to the number of support vectors, which implies that 1-norm SVMs have better sparseness than that of standard SVMs. Second, the number of nonzero coefficients is at most equal to the rank of the sample matrix. A brief review of the geometry of linear programming and the primal steepest edge pricing simplex method are given, which allows us to provide the proof of the two upper bounds and evaluate their tightness by experiments. Experimental results on toy data sets and the UCI data sets illustrate our analysis. Copyright 2009 Elsevier Ltd. All rights reserved.
Improving M-SBL for Joint Sparse Recovery Using a Subspace Penalty
NASA Astrophysics Data System (ADS)
Ye, Jong Chul; Kim, Jong Min; Bresler, Yoram
2015-12-01
The multiple measurement vector problem (MMV) is a generalization of the compressed sensing problem that addresses the recovery of a set of jointly sparse signal vectors. One of the important contributions of this paper is to reveal that the seemingly least related state-of-art MMV joint sparse recovery algorithms - M-SBL (multiple sparse Bayesian learning) and subspace-based hybrid greedy algorithms - have a very important link. More specifically, we show that replacing the $\\log\\det(\\cdot)$ term in M-SBL by a rank proxy that exploits the spark reduction property discovered in subspace-based joint sparse recovery algorithms, provides significant improvements. In particular, if we use the Schatten-$p$ quasi-norm as the corresponding rank proxy, the global minimiser of the proposed algorithm becomes identical to the true solution as $p \\rightarrow 0$. Furthermore, under the same regularity conditions, we show that the convergence to a local minimiser is guaranteed using an alternating minimization algorithm that has closed form expressions for each of the minimization steps, which are convex. Numerical simulations under a variety of scenarios in terms of SNR, and condition number of the signal amplitude matrix demonstrate that the proposed algorithm consistently outperforms M-SBL and other state-of-the art algorithms.
Accessing sparse arrays in parallel memories
DOE Office of Scientific and Technical Information (OSTI.GOV)
Banerjee, U.; Gajski, D.; Kuck, D.
The concept of dense and sparse execution of arrays is introduced. Arrays themselves can be stored in a dense or sparse manner in a parallel memory with m memory modules. The paper proposes hardware for speeding up the execution of array operations of the form c(c/sub 0/+ci)=a(a/sub 0/+ai) op b(b/sub 0/+bi), where a/sub 0/, a, b/sub 0/, b, c/sub 0/, c are integer constants and i is an index variable. The hardware handles 'sparse execution', in which the operation op is not executed for every value of i. The hardware also makes provision for 'sparse storage', in which memory spacemore » is not provided for every array element. It is shown how to access array elements of the above form without conflict in an efficient way. The efficiency is obtained by using some specialised units which are basically smart memories with priority detection, one's counting or associative searching. Generalisation to multidimensional arrays is shown possible under restrictions defined in the paper. 12 references.« less
Dynamic graph system for a semantic database
Mizell, David
2016-04-12
A method and system in a computer system for dynamically providing a graphical representation of a data store of entries via a matrix interface is disclosed. A dynamic graph system provides a matrix interface that exposes to an application program a graphical representation of data stored in a data store such as a semantic database storing triples. To the application program, the matrix interface represents the graph as a sparse adjacency matrix that is stored in compressed form. Each entry of the data store is considered to represent a link between nodes of the graph. Each entry has a first field and a second field identifying the nodes connected by the link and a third field with a value for the link that connects the identified nodes. The first, second, and third fields represent the rows, column, and elements of the adjacency matrix.
Dynamic graph system for a semantic database
Mizell, David
2015-01-27
A method and system in a computer system for dynamically providing a graphical representation of a data store of entries via a matrix interface is disclosed. A dynamic graph system provides a matrix interface that exposes to an application program a graphical representation of data stored in a data store such as a semantic database storing triples. To the application program, the matrix interface represents the graph as a sparse adjacency matrix that is stored in compressed form. Each entry of the data store is considered to represent a link between nodes of the graph. Each entry has a first field and a second field identifying the nodes connected by the link and a third field with a value for the link that connects the identified nodes. The first, second, and third fields represent the rows, column, and elements of the adjacency matrix.
LiDAR point classification based on sparse representation
NASA Astrophysics Data System (ADS)
Li, Nan; Pfeifer, Norbert; Liu, Chun
2017-04-01
In order to combine the initial spatial structure and features of LiDAR data for accurate classification. The LiDAR data is represented as a 4-order tensor. Sparse representation for classification(SRC) method is used for LiDAR tensor classification. It turns out SRC need only a few of training samples from each class, meanwhile can achieve good classification result. Multiple features are extracted from raw LiDAR points to generate a high-dimensional vector at each point. Then the LiDAR tensor is built by the spatial distribution and feature vectors of the point neighborhood. The entries of LiDAR tensor are accessed via four indexes. Each index is called mode: three spatial modes in direction X ,Y ,Z and one feature mode. Sparse representation for classification(SRC) method is proposed in this paper. The sparsity algorithm is to find the best represent the test sample by sparse linear combination of training samples from a dictionary. To explore the sparsity of LiDAR tensor, the tucker decomposition is used. It decomposes a tensor into a core tensor multiplied by a matrix along each mode. Those matrices could be considered as the principal components in each mode. The entries of core tensor show the level of interaction between the different components. Therefore, the LiDAR tensor can be approximately represented by a sparse tensor multiplied by a matrix selected from a dictionary along each mode. The matrices decomposed from training samples are arranged as initial elements in the dictionary. By dictionary learning, a reconstructive and discriminative structure dictionary along each mode is built. The overall structure dictionary composes of class-specified sub-dictionaries. Then the sparse core tensor is calculated by tensor OMP(Orthogonal Matching Pursuit) method based on dictionaries along each mode. It is expected that original tensor should be well recovered by sub-dictionary associated with relevant class, while entries in the sparse tensor associated with other classed should be nearly zero. Therefore, SRC use the reconstruction error associated with each class to do data classification. A section of airborne LiDAR points of Vienna city is used and classified into 6classes: ground, roofs, vegetation, covered ground, walls and other points. Only 6 training samples from each class are taken. For the final classification result, ground and covered ground are merged into one same class(ground). The classification accuracy for ground is 94.60%, roof is 95.47%, vegetation is 85.55%, wall is 76.17%, other object is 20.39%.
NASA Astrophysics Data System (ADS)
Zheng, Maoteng; Zhang, Yongjun; Zhou, Shunping; Zhu, Junfeng; Xiong, Xiaodong
2016-07-01
In recent years, new platforms and sensors in photogrammetry, remote sensing and computer vision areas have become available, such as Unmanned Aircraft Vehicles (UAV), oblique camera systems, common digital cameras and even mobile phone cameras. Images collected by all these kinds of sensors could be used as remote sensing data sources. These sensors can obtain large-scale remote sensing data which consist of a great number of images. Bundle block adjustment of large-scale data with conventional algorithm is very time and space (memory) consuming due to the super large normal matrix arising from large-scale data. In this paper, an efficient Block-based Sparse Matrix Compression (BSMC) method combined with the Preconditioned Conjugate Gradient (PCG) algorithm is chosen to develop a stable and efficient bundle block adjustment system in order to deal with the large-scale remote sensing data. The main contribution of this work is the BSMC-based PCG algorithm which is more efficient in time and memory than the traditional algorithm without compromising the accuracy. Totally 8 datasets of real data are used to test our proposed method. Preliminary results have shown that the BSMC method can efficiently decrease the time and memory requirement of large-scale data.
Shokrollahi, Mehrnaz; Krishnan, Sridhar; Dopsa, Dustin D; Muir, Ryan T; Black, Sandra E; Swartz, Richard H; Murray, Brian J; Boulos, Mark I
2016-11-01
Stroke is a leading cause of death and disability in adults, and incurs a significant economic burden to society. Periodic limb movements (PLMs) in sleep are repetitive movements involving the great toe, ankle, and hip. Evolving evidence suggests that PLMs may be associated with high blood pressure and stroke, but this relationship remains underexplored. Several issues limit the study of PLMs including the need to manually score them, which is time-consuming and costly. For this reason, we developed a novel automated method for nocturnal PLM detection, which was shown to be correlated with (a) the manually scored PLM index on polysomnography, and (b) white matter hyperintensities on brain imaging, which have been demonstrated to be associated with PLMs. Our proposed algorithm consists of three main stages: (1) representing the signal in the time-frequency plane using time-frequency matrices (TFM), (2) applying K-nonnegative matrix factorization technique to decompose the TFM matrix into its significant components, and (3) applying kernel sparse representation for classification (KSRC) to the decomposed signal. Our approach was applied to a dataset that consisted of 65 subjects who underwent polysomnography. An overall classification of 97 % was achieved for discrimination of the aforementioned signals, demonstrating the potential of the presented method.
Time integration algorithms for the two-dimensional Euler equations on unstructured meshes
NASA Technical Reports Server (NTRS)
Slack, David C.; Whitaker, D. L.; Walters, Robert W.
1994-01-01
Explicit and implicit time integration algorithms for the two-dimensional Euler equations on unstructured grids are presented. Both cell-centered and cell-vertex finite volume upwind schemes utilizing Roe's approximate Riemann solver are developed. For the cell-vertex scheme, a four-stage Runge-Kutta time integration, a fourstage Runge-Kutta time integration with implicit residual averaging, a point Jacobi method, a symmetric point Gauss-Seidel method and two methods utilizing preconditioned sparse matrix solvers are presented. For the cell-centered scheme, a Runge-Kutta scheme, an implicit tridiagonal relaxation scheme modeled after line Gauss-Seidel, a fully implicit lower-upper (LU) decomposition, and a hybrid scheme utilizing both Runge-Kutta and LU methods are presented. A reverse Cuthill-McKee renumbering scheme is employed for the direct solver to decrease CPU time by reducing the fill of the Jacobian matrix. A comparison of the various time integration schemes is made for both first-order and higher order accurate solutions using several mesh sizes, higher order accuracy is achieved by using multidimensional monotone linear reconstruction procedures. The results obtained for a transonic flow over a circular arc suggest that the preconditioned sparse matrix solvers perform better than the other methods as the number of elements in the mesh increases.
Structural performance analysis and redesign
NASA Technical Reports Server (NTRS)
Whetstone, W. D.
1978-01-01
Program performs stress buckling and vibrational analysis of large, linear, finite-element systems in excess of 50,000 degrees of freedom. Cost, execution time, and storage requirements are kept reasonable through use of sparse matrix solution techniques, and other computational and data management procedures designed for problems of very large size.
Mathematical foundations of hybrid data assimilation from a synchronization perspective
NASA Astrophysics Data System (ADS)
Penny, Stephen G.
2017-12-01
The state-of-the-art data assimilation methods used today in operational weather prediction centers around the world can be classified as generalized one-way coupled impulsive synchronization. This classification permits the investigation of hybrid data assimilation methods, which combine dynamic error estimates of the system state with long time-averaged (climatological) error estimates, from a synchronization perspective. Illustrative results show how dynamically informed formulations of the coupling matrix (via an Ensemble Kalman Filter, EnKF) can lead to synchronization when observing networks are sparse and how hybrid methods can lead to synchronization when those dynamic formulations are inadequate (due to small ensemble sizes). A large-scale application with a global ocean general circulation model is also presented. Results indicate that the hybrid methods also have useful applications in generalized synchronization, in particular, for correcting systematic model errors.
Mathematical foundations of hybrid data assimilation from a synchronization perspective.
Penny, Stephen G
2017-12-01
The state-of-the-art data assimilation methods used today in operational weather prediction centers around the world can be classified as generalized one-way coupled impulsive synchronization. This classification permits the investigation of hybrid data assimilation methods, which combine dynamic error estimates of the system state with long time-averaged (climatological) error estimates, from a synchronization perspective. Illustrative results show how dynamically informed formulations of the coupling matrix (via an Ensemble Kalman Filter, EnKF) can lead to synchronization when observing networks are sparse and how hybrid methods can lead to synchronization when those dynamic formulations are inadequate (due to small ensemble sizes). A large-scale application with a global ocean general circulation model is also presented. Results indicate that the hybrid methods also have useful applications in generalized synchronization, in particular, for correcting systematic model errors.
A Sparsity-Promoted Method Based on Majorization-Minimization for Weak Fault Feature Enhancement
Hao, Yansong; Song, Liuyang; Tang, Gang; Yuan, Hongfang
2018-01-01
Fault transient impulses induced by faulty components in rotating machinery usually contain substantial interference. Fault features are comparatively weak in the initial fault stage, which renders fault diagnosis more difficult. In this case, a sparse representation method based on the Majorzation-Minimization (MM) algorithm is proposed to enhance weak fault features and extract the features from strong background noise. However, the traditional MM algorithm suffers from two issues, which are the choice of sparse basis and complicated calculations. To address these challenges, a modified MM algorithm is proposed in which a sparse optimization objective function is designed firstly. Inspired by the Basis Pursuit (BP) model, the optimization function integrates an impulsive feature-preserving factor and a penalty function factor. Second, a modified Majorization iterative method is applied to address the convex optimization problem of the designed function. A series of sparse coefficients can be achieved through iterating, which only contain transient components. It is noteworthy that there is no need to select the sparse basis in the proposed iterative method because it is fixed as a unit matrix. Then the reconstruction step is omitted, which can significantly increase detection efficiency. Eventually, envelope analysis of the sparse coefficients is performed to extract weak fault features. Simulated and experimental signals including bearings and gearboxes are employed to validate the effectiveness of the proposed method. In addition, comparisons are made to prove that the proposed method outperforms the traditional MM algorithm in terms of detection results and efficiency. PMID:29597280
A Sparsity-Promoted Method Based on Majorization-Minimization for Weak Fault Feature Enhancement.
Ren, Bangyue; Hao, Yansong; Wang, Huaqing; Song, Liuyang; Tang, Gang; Yuan, Hongfang
2018-03-28
Fault transient impulses induced by faulty components in rotating machinery usually contain substantial interference. Fault features are comparatively weak in the initial fault stage, which renders fault diagnosis more difficult. In this case, a sparse representation method based on the Majorzation-Minimization (MM) algorithm is proposed to enhance weak fault features and extract the features from strong background noise. However, the traditional MM algorithm suffers from two issues, which are the choice of sparse basis and complicated calculations. To address these challenges, a modified MM algorithm is proposed in which a sparse optimization objective function is designed firstly. Inspired by the Basis Pursuit (BP) model, the optimization function integrates an impulsive feature-preserving factor and a penalty function factor. Second, a modified Majorization iterative method is applied to address the convex optimization problem of the designed function. A series of sparse coefficients can be achieved through iterating, which only contain transient components. It is noteworthy that there is no need to select the sparse basis in the proposed iterative method because it is fixed as a unit matrix. Then the reconstruction step is omitted, which can significantly increase detection efficiency. Eventually, envelope analysis of the sparse coefficients is performed to extract weak fault features. Simulated and experimental signals including bearings and gearboxes are employed to validate the effectiveness of the proposed method. In addition, comparisons are made to prove that the proposed method outperforms the traditional MM algorithm in terms of detection results and efficiency.
Numerical Technology for Large-Scale Computational Electromagnetics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sharpe, R; Champagne, N; White, D
The key bottleneck of implicit computational electromagnetics tools for large complex geometries is the solution of the resulting linear system of equations. The goal of this effort was to research and develop critical numerical technology that alleviates this bottleneck for large-scale computational electromagnetics (CEM). The mathematical operators and numerical formulations used in this arena of CEM yield linear equations that are complex valued, unstructured, and indefinite. Also, simultaneously applying multiple mathematical modeling formulations to different portions of a complex problem (hybrid formulations) results in a mixed structure linear system, further increasing the computational difficulty. Typically, these hybrid linear systems aremore » solved using a direct solution method, which was acceptable for Cray-class machines but does not scale adequately for ASCI-class machines. Additionally, LLNL's previously existing linear solvers were not well suited for the linear systems that are created by hybrid implicit CEM codes. Hence, a new approach was required to make effective use of ASCI-class computing platforms and to enable the next generation design capabilities. Multiple approaches were investigated, including the latest sparse-direct methods developed by our ASCI collaborators. In addition, approaches that combine domain decomposition (or matrix partitioning) with general-purpose iterative methods and special purpose pre-conditioners were investigated. Special-purpose pre-conditioners that take advantage of the structure of the matrix were adapted and developed based on intimate knowledge of the matrix properties. Finally, new operator formulations were developed that radically improve the conditioning of the resulting linear systems thus greatly reducing solution time. The goal was to enable the solution of CEM problems that are 10 to 100 times larger than our previous capability.« less
A Partitioning Algorithm for Block-Diagonal Matrices With Overlap
DOE Office of Scientific and Technical Information (OSTI.GOV)
Guy Antoine Atenekeng Kahou; Laura Grigori; Masha Sosonkina
2008-02-02
We present a graph partitioning algorithm that aims at partitioning a sparse matrix into a block-diagonal form, such that any two consecutive blocks overlap. We denote this form of the matrix as the overlapped block-diagonal matrix. The partitioned matrix is suitable for applying the explicit formulation of Multiplicative Schwarz preconditioner (EFMS) described in [3]. The graph partitioning algorithm partitions the graph of the input matrix into K partitions, such that every partition {Omega}{sub i} has at most two neighbors {Omega}{sub i-1} and {Omega}{sub i+1}. First, an ordering algorithm, such as the reverse Cuthill-McKee algorithm, that reduces the matrix profile ismore » performed. An initial overlapped block-diagonal partition is obtained from the profile of the matrix. An iterative strategy is then used to further refine the partitioning by allowing nodes to be transferred between neighboring partitions. Experiments are performed on matrices arising from real-world applications to show the feasibility and usefulness of this approach.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhu, Xiaojun; Lei, Guangtsai; Pan, Guangwen
In this paper, the continuous operator is discretized into matrix forms by Galerkin`s procedure, using periodic Battle-Lemarie wavelets as basis/testing functions. The polynomial decomposition of wavelets is applied to the evaluation of matrix elements, which makes the computational effort of the matrix elements no more expensive than that of method of moments (MoM) with conventional piecewise basis/testing functions. A new algorithm is developed employing the fast wavelet transform (FWT). Owing to localization, cancellation, and orthogonal properties of wavelets, very sparse matrices have been obtained, which are then solved by the LSQR iterative method. This algorithm is also adaptive in thatmore » one can add at will finer wavelet bases in the regions where fields vary rapidly, without any damage to the system orthogonality of the wavelet basis functions. To demonstrate the effectiveness of the new algorithm, we applied it to the evaluation of frequency-dependent resistance and inductance matrices of multiple lossy transmission lines. Numerical results agree with previously published data and laboratory measurements. The valid frequency range of the boundary integral equation results has been extended two to three decades in comparison with the traditional MoM approach. The new algorithm has been integrated into the computer aided design tool, MagiCAD, which is used for the design and simulation of high-speed digital systems and multichip modules Pan et al. 29 refs., 7 figs., 6 tabs.« less
NASA Astrophysics Data System (ADS)
Gao, Pengzhi; Wang, Meng; Chow, Joe H.; Ghiocel, Scott G.; Fardanesh, Bruce; Stefopoulos, George; Razanousky, Michael P.
2016-11-01
This paper presents a new framework of identifying a series of cyber data attacks on power system synchrophasor measurements. We focus on detecting "unobservable" cyber data attacks that cannot be detected by any existing method that purely relies on measurements received at one time instant. Leveraging the approximate low-rank property of phasor measurement unit (PMU) data, we formulate the identification problem of successive unobservable cyber attacks as a matrix decomposition problem of a low-rank matrix plus a transformed column-sparse matrix. We propose a convex-optimization-based method and provide its theoretical guarantee in the data identification. Numerical experiments on actual PMU data from the Central New York power system and synthetic data are conducted to verify the effectiveness of the proposed method.
Unified commutation-pruning technique for efficient computation of composite DFTs
NASA Astrophysics Data System (ADS)
Castro-Palazuelos, David E.; Medina-Melendrez, Modesto Gpe.; Torres-Roman, Deni L.; Shkvarko, Yuriy V.
2015-12-01
An efficient computation of a composite length discrete Fourier transform (DFT), as well as a fast Fourier transform (FFT) of both time and space data sequences in uncertain (non-sparse or sparse) computational scenarios, requires specific processing algorithms. Traditional algorithms typically employ some pruning methods without any commutations, which prevents them from attaining the potential computational efficiency. In this paper, we propose an alternative unified approach with automatic commutations between three computational modalities aimed at efficient computations of the pruned DFTs adapted for variable composite lengths of the non-sparse input-output data. The first modality is an implementation of the direct computation of a composite length DFT, the second one employs the second-order recursive filtering method, and the third one performs the new pruned decomposed transform. The pruned decomposed transform algorithm performs the decimation in time or space (DIT) data acquisition domain and, then, decimation in frequency (DIF). The unified combination of these three algorithms is addressed as the DFTCOMM technique. Based on the treatment of the combinational-type hypotheses testing optimization problem of preferable allocations between all feasible commuting-pruning modalities, we have found the global optimal solution to the pruning problem that always requires a fewer or, at most, the same number of arithmetic operations than other feasible modalities. The DFTCOMM method outperforms the existing competing pruning techniques in the sense of attainable savings in the number of required arithmetic operations. It requires fewer or at most the same number of arithmetic operations for its execution than any other of the competing pruning methods reported in the literature. Finally, we provide the comparison of the DFTCOMM with the recently developed sparse fast Fourier transform (SFFT) algorithmic family. We feature that, in the sensing scenarios with sparse/non-sparse data Fourier spectrum, the DFTCOMM technique manifests robustness against such model uncertainties in the sense of insensitivity for sparsity/non-sparsity restrictions and the variability of the operating parameters.
Progress on a generalized coordinates tensor product finite element 3DPNS algorithm for subsonic
NASA Technical Reports Server (NTRS)
Baker, A. J.; Orzechowski, J. A.
1983-01-01
A generalized coordinates form of the penalty finite element algorithm for the 3-dimensional parabolic Navier-Stokes equations for turbulent subsonic flows was derived. This algorithm formulation requires only three distinct hypermatrices and is applicable using any boundary fitted coordinate transformation procedure. The tensor matrix product approximation to the Jacobian of the Newton linear algebra matrix statement was also derived. Tne Newton algorithm was restructured to replace large sparse matrix solution procedures with grid sweeping using alpha-block tridiagonal matrices, where alpha equals the number of dependent variables. Numerical experiments were conducted and the resultant data gives guidance on potentially preferred tensor product constructions for the penalty finite element 3DPNS algorithm.
Iris recognition based on robust principal component analysis
NASA Astrophysics Data System (ADS)
Karn, Pradeep; He, Xiao Hai; Yang, Shuai; Wu, Xiao Hong
2014-11-01
Iris images acquired under different conditions often suffer from blur, occlusion due to eyelids and eyelashes, specular reflection, and other artifacts. Existing iris recognition systems do not perform well on these types of images. To overcome these problems, we propose an iris recognition method based on robust principal component analysis. The proposed method decomposes all training images into a low-rank matrix and a sparse error matrix, where the low-rank matrix is used for feature extraction. The sparsity concentration index approach is then applied to validate the recognition result. Experimental results using CASIA V4 and IIT Delhi V1iris image databases showed that the proposed method achieved competitive performances in both recognition accuracy and computational efficiency.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Carpenter, J.A.
This report is a sequel to ORNL/CSD-96 in the ongoing supplements to Professor A.S. Householder's KWIC Index for Numerical Algebra. With this supplement, the coverage has been restricted to Numerical Linear Algebra and is now roughly characterized by the American Mathematical Society's classification section 15 and 65F but with little coverage of inifinite matrices, matrices over fields of characteristics other than zero, operator theory, optimization and those parts of matrix theory primarily combinatorial in nature. Some recognition is made of the uses of graph theory in Numerical Linear Algebra, particularly as regards their use in algorithms for sparse matrix computations.more » The period covered by this report is roughly the calendar year 1981 as measured by the appearance of the articles in the American Mathematical Society's Contents of Mathematical Publications. The review citations are limited to the Mathematical Reviews (MR) and Das Zentralblatt fur Mathematik und Ihre Grenzgebiete (ZBL). Future reports will be made more timely by closer ovservation of the few journals which supply the bulk of the listings rather than what appears to be too much reliance on secondary sources. Some thought is being given to the physical appearance of these reports and the author welcomes comments concerning both their appearance and contents.« less
Application of a sparseness constraint in multivariate curve resolution - Alternating least squares.
Hugelier, Siewert; Piqueras, Sara; Bedia, Carmen; de Juan, Anna; Ruckebusch, Cyril
2018-02-13
The use of sparseness in chemometrics is a concept that has increased in popularity. The advantage is, above all, a better interpretability of the results obtained. In this work, sparseness is implemented as a constraint in multivariate curve resolution - alternating least squares (MCR-ALS), which aims at reproducing raw (mixed) data by a bilinear model of chemically meaningful profiles. In many cases, the mixed raw data analyzed are not sparse by nature, but their decomposition profiles can be, as it is the case in some instrumental responses, such as mass spectra, or in concentration profiles linked to scattered distribution maps of powdered samples in hyperspectral images. To induce sparseness in the constrained profiles, one-dimensional and/or two-dimensional numerical arrays can be fitted using a basis of Gaussian functions with a penalty on the coefficients. In this work, a least squares regression framework with L 0 -norm penalty is applied. This L 0 -norm penalty constrains the number of non-null coefficients in the fit of the array constrained without having an a priori on the number and their positions. It has been shown that the sparseness constraint induces the suppression of values linked to uninformative channels and noise in MS spectra and improves the location of scattered compounds in distribution maps, resulting in a better interpretability of the constrained profiles. An additional benefit of the sparseness constraint is a lower ambiguity in the bilinear model, since the major presence of null coefficients in the constrained profiles also helps to limit the solutions for the profiles in the counterpart matrix of the MCR bilinear model. Copyright © 2017 Elsevier B.V. All rights reserved.
Clutter Mitigation in Echocardiography Using Sparse Signal Separation
Yavneh, Irad
2015-01-01
In ultrasound imaging, clutter artifacts degrade images and may cause inaccurate diagnosis. In this paper, we apply a method called Morphological Component Analysis (MCA) for sparse signal separation with the objective of reducing such clutter artifacts. The MCA approach assumes that the two signals in the additive mix have each a sparse representation under some dictionary of atoms (a matrix), and separation is achieved by finding these sparse representations. In our work, an adaptive approach is used for learning the dictionary from the echo data. MCA is compared to Singular Value Filtering (SVF), a Principal Component Analysis- (PCA-) based filtering technique, and to a high-pass Finite Impulse Response (FIR) filter. Each filter is applied to a simulated hypoechoic lesion sequence, as well as experimental cardiac ultrasound data. MCA is demonstrated in both cases to outperform the FIR filter and obtain results comparable to the SVF method in terms of contrast-to-noise ratio (CNR). Furthermore, MCA shows a lower impact on tissue sections while removing the clutter artifacts. In experimental heart data, MCA obtains in our experiments clutter mitigation with an average CNR improvement of 1.33 dB. PMID:26199622
3D facial landmarks: Inter-operator variability of manual annotation
2014-01-01
Background Manual annotation of landmarks is a known source of variance, which exist in all fields of medical imaging, influencing the accuracy and interpretation of the results. However, the variability of human facial landmarks is only sparsely addressed in the current literature as opposed to e.g. the research fields of orthodontics and cephalometrics. We present a full facial 3D annotation procedure and a sparse set of manually annotated landmarks, in effort to reduce operator time and minimize the variance. Method Facial scans from 36 voluntary unrelated blood donors from the Danish Blood Donor Study was randomly chosen. Six operators twice manually annotated 73 anatomical and pseudo-landmarks, using a three-step scheme producing a dense point correspondence map. We analyzed both the intra- and inter-operator variability, using mixed-model ANOVA. We then compared four sparse sets of landmarks in order to construct a dense correspondence map of the 3D scans with a minimum point variance. Results The anatomical landmarks of the eye were associated with the lowest variance, particularly the center of the pupils. Whereas points of the jaw and eyebrows have the highest variation. We see marginal variability in regards to intra-operator and portraits. Using a sparse set of landmarks (n=14), that capture the whole face, the dense point mean variance was reduced from 1.92 to 0.54 mm. Conclusion The inter-operator variability was primarily associated with particular landmarks, where more leniently landmarks had the highest variability. The variables embedded in the portray and the reliability of a trained operator did only have marginal influence on the variability. Further, using 14 of the annotated landmarks we were able to reduced the variability and create a dense correspondences mesh to capture all facial features. PMID:25306436
Improved Personalized Recommendation Based on Causal Association Rule and Collaborative Filtering
ERIC Educational Resources Information Center
Lei, Wu; Qing, Fang; Zhou, Jin
2016-01-01
There are usually limited user evaluation of resources on a recommender system, which caused an extremely sparse user rating matrix, and this greatly reduce the accuracy of personalized recommendation, especially for new users or new items. This paper presents a recommendation method based on rating prediction using causal association rules.…
IMPLEMENTATION OF THE SMOKE EMISSION DATA PROCESSOR AND SMOKE TOOL INPUT DATA PROCESSOR IN MODELS-3
The U.S. Environmental Protection Agency has implemented Version 1.3 of SMOKE (Sparse Matrix Object Kernel Emission) processor for preparation of area, mobile, point, and biogenic sources emission data within Version 4.1 of the Models-3 air quality modeling framework. The SMOK...
Optimal Chebyshev polynomials on ellipses in the complex plane
NASA Technical Reports Server (NTRS)
Fischer, Bernd; Freund, Roland
1989-01-01
The design of iterative schemes for sparse matrix computations often leads to constrained polynomial approximation problems on sets in the complex plane. For the case of ellipses, we introduce a new class of complex polynomials which are in general very good approximations to the best polynomials and even optimal in most cases.
The Models-3 Community Multi-scale Air Quality (CMAQ) model, first released by the USEPA in 1999 (Byun and Ching. 1999), continues to be developed and evaluated. The principal components of the CMAQ system include a comprehensive emission processor known as the Sparse Matrix O...
Three-Dimensional Inverse Transport Solver Based on Compressive Sensing Technique
NASA Astrophysics Data System (ADS)
Cheng, Yuxiong; Wu, Hongchun; Cao, Liangzhi; Zheng, Youqi
2013-09-01
According to the direct exposure measurements from flash radiographic image, a compressive sensing-based method for three-dimensional inverse transport problem is presented. The linear absorption coefficients and interface locations of objects are reconstructed directly at the same time. It is always very expensive to obtain enough measurements. With limited measurements, compressive sensing sparse reconstruction technique orthogonal matching pursuit is applied to obtain the sparse coefficients by solving an optimization problem. A three-dimensional inverse transport solver is developed based on a compressive sensing-based technique. There are three features in this solver: (1) AutoCAD is employed as a geometry preprocessor due to its powerful capacity in graphic. (2) The forward projection matrix rather than Gauss matrix is constructed by the visualization tool generator. (3) Fourier transform and Daubechies wavelet transform are adopted to convert an underdetermined system to a well-posed system in the algorithm. Simulations are performed and numerical results in pseudo-sine absorption problem, two-cube problem and two-cylinder problem when using compressive sensing-based solver agree well with the reference value.
Optimization of Sparse Matrix-Vector Multiplication on Emerging Multicore Platforms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Williams, Samuel; Oliker, Leonid; Vuduc, Richard
2008-10-16
We are witnessing a dramatic change in computer architecture due to the multicore paradigm shift, as every electronic device from cell phones to supercomputers confronts parallelism of unprecedented scale. To fully unleash the potential of these systems, the HPC community must develop multicore specific-optimization methodologies for important scientific computations. In this work, we examine sparse matrix-vector multiply (SpMV) - one of the most heavily used kernels in scientific computing - across a broad spectrum of multicore designs. Our experimental platform includes the homogeneous AMD quad-core, AMD dual-core, and Intel quad-core designs, the heterogeneous STI Cell, as well as one ofmore » the first scientific studies of the highly multithreaded Sun Victoria Falls (a Niagara2 SMP). We present several optimization strategies especially effective for the multicore environment, and demonstrate significant performance improvements compared to existing state-of-the-art serial and parallel SpMV implementations. Additionally, we present key insights into the architectural trade-offs of leading multicore design strategies, in the context of demanding memory-bound numerical algorithms.« less
Cross-correlation matrix analysis of Chinese and American bank stocks in subprime crisis
NASA Astrophysics Data System (ADS)
Zhu, Shi-Zhao; Li, Xin-Li; Nie, Sen; Zhang, Wen-Qing; Yu, Gao-Feng; Han, Xiao-Pu; Wang, Bing-Hong
2015-05-01
In order to study the universality of the interactions among different markets, we analyze the cross-correlation matrix of the price of the Chinese and American bank stocks. We then find that the stock prices of the emerging market are more correlated than that of the developed market. Considering that the values of the components for the eigenvector may be positive or negative, we analyze the differences between two markets in combination with the endogenous and exogenous events which influence the financial markets. We find that the sparse pattern of components of eigenvectors out of the threshold value has no change in American bank stocks before and after the subprime crisis. However, it changes from sparse to dense for Chinese bank stocks. By using the threshold value to exclude the external factors, we simulate the interactions in financial markets. Project supported by the National Natural Science Foundation of China (Grant Nos. 11275186, 91024026, and FOM2014OF001) and the University of Shanghai for Science and Technology (USST) of Humanities and Social Sciences, China (Grant Nos. USST13XSZ05 and 11YJA790231).
Deterministic matrices matching the compressed sensing phase transitions of Gaussian random matrices
Monajemi, Hatef; Jafarpour, Sina; Gavish, Matan; Donoho, David L.; Ambikasaran, Sivaram; Bacallado, Sergio; Bharadia, Dinesh; Chen, Yuxin; Choi, Young; Chowdhury, Mainak; Chowdhury, Soham; Damle, Anil; Fithian, Will; Goetz, Georges; Grosenick, Logan; Gross, Sam; Hills, Gage; Hornstein, Michael; Lakkam, Milinda; Lee, Jason; Li, Jian; Liu, Linxi; Sing-Long, Carlos; Marx, Mike; Mittal, Akshay; Monajemi, Hatef; No, Albert; Omrani, Reza; Pekelis, Leonid; Qin, Junjie; Raines, Kevin; Ryu, Ernest; Saxe, Andrew; Shi, Dai; Siilats, Keith; Strauss, David; Tang, Gary; Wang, Chaojun; Zhou, Zoey; Zhu, Zhen
2013-01-01
In compressed sensing, one takes samples of an N-dimensional vector using an matrix A, obtaining undersampled measurements . For random matrices with independent standard Gaussian entries, it is known that, when is k-sparse, there is a precisely determined phase transition: for a certain region in the (,)-phase diagram, convex optimization typically finds the sparsest solution, whereas outside that region, it typically fails. It has been shown empirically that the same property—with the same phase transition location—holds for a wide range of non-Gaussian random matrix ensembles. We report extensive experiments showing that the Gaussian phase transition also describes numerous deterministic matrices, including Spikes and Sines, Spikes and Noiselets, Paley Frames, Delsarte-Goethals Frames, Chirp Sensing Matrices, and Grassmannian Frames. Namely, for each of these deterministic matrices in turn, for a typical k-sparse object, we observe that convex optimization is successful over a region of the phase diagram that coincides with the region known for Gaussian random matrices. Our experiments considered coefficients constrained to for four different sets , and the results establish our finding for each of the four associated phase transitions. PMID:23277588
Meng, Yuguang; Lei, Hao
2010-06-01
An efficient iterative gridding reconstruction method with correction of off-resonance artifacts was developed, which is especially tailored for multiple-shot non-Cartesian imaging. The novelty of the method lies in that the transformation matrix for gridding (T) was constructed as the convolution of two sparse matrices, among which the former is determined by the sampling interval and the spatial distribution of the off-resonance frequencies and the latter by the sampling trajectory and the target grid in the Cartesian space. The resulting T matrix is also sparse and can be solved efficiently with the iterative conjugate gradient algorithm. It was shown that, with the proposed method, the reconstruction speed in multiple-shot non-Cartesian imaging can be improved significantly while retaining high reconstruction fidelity. More important, the method proposed allows tradeoff between the accuracy and the computation time of reconstruction, making customization of the use of such a method in different applications possible. The performance of the proposed method was demonstrated by numerical simulation and multiple-shot spiral imaging on rat brain at 4.7 T. (c) 2010 Wiley-Liss, Inc.
Sequential time interleaved random equivalent sampling for repetitive signal.
Zhao, Yijiu; Liu, Jingjing
2016-12-01
Compressed sensing (CS) based sampling techniques exhibit many advantages over other existing approaches for sparse signal spectrum sensing; they are also incorporated into non-uniform sampling signal reconstruction to improve the efficiency, such as random equivalent sampling (RES). However, in CS based RES, only one sample of each acquisition is considered in the signal reconstruction stage, and it will result in more acquisition runs and longer sampling time. In this paper, a sampling sequence is taken in each RES acquisition run, and the corresponding block measurement matrix is constructed using a Whittaker-Shannon interpolation formula. All the block matrices are combined into an equivalent measurement matrix with respect to all sampling sequences. We implemented the proposed approach with a multi-cores analog-to-digital converter (ADC), whose ADC cores are time interleaved. A prototype realization of this proposed CS based sequential random equivalent sampling method has been developed. It is able to capture an analog waveform at an equivalent sampling rate of 40 GHz while sampled at 1 GHz physically. Experiments indicate that, for a sparse signal, the proposed CS based sequential random equivalent sampling exhibits high efficiency.
Petrov, Andrii Y; Herbst, Michael; Andrew Stenger, V
2017-08-15
Rapid whole-brain dynamic Magnetic Resonance Imaging (MRI) is of particular interest in Blood Oxygen Level Dependent (BOLD) functional MRI (fMRI). Faster acquisitions with higher temporal sampling of the BOLD time-course provide several advantages including increased sensitivity in detecting functional activation, the possibility of filtering out physiological noise for improving temporal SNR, and freezing out head motion. Generally, faster acquisitions require undersampling of the data which results in aliasing artifacts in the object domain. A recently developed low-rank (L) plus sparse (S) matrix decomposition model (L+S) is one of the methods that has been introduced to reconstruct images from undersampled dynamic MRI data. The L+S approach assumes that the dynamic MRI data, represented as a space-time matrix M, is a linear superposition of L and S components, where L represents highly spatially and temporally correlated elements, such as the image background, while S captures dynamic information that is sparse in an appropriate transform domain. This suggests that L+S might be suited for undersampled task or slow event-related fMRI acquisitions because the periodic nature of the BOLD signal is sparse in the temporal Fourier transform domain and slowly varying low-rank brain background signals, such as physiological noise and drift, will be predominantly low-rank. In this work, as a proof of concept, we exploit the L+S method for accelerating block-design fMRI using a 3D stack of spirals (SoS) acquisition where undersampling is performed in the k z -t domain. We examined the feasibility of the L+S method to accurately separate temporally correlated brain background information in the L component while capturing periodic BOLD signals in the S component. We present results acquired in control human volunteers at 3T for both retrospective and prospectively acquired fMRI data for a visual activation block-design task. We show that a SoS fMRI acquisition with an acceleration of four and L+S reconstruction can achieve a brain coverage of 40 slices at 2mm isotropic resolution and 64 x 64 matrix size every 500ms. Copyright © 2017 Elsevier Inc. All rights reserved.
An efficient optical architecture for sparsely connected neural networks
NASA Technical Reports Server (NTRS)
Hine, Butler P., III; Downie, John D.; Reid, Max B.
1990-01-01
An architecture for general-purpose optical neural network processor is presented in which the interconnections and weights are formed by directing coherent beams holographically, thereby making use of the space-bandwidth products of the recording medium for sparsely interconnected networks more efficiently that the commonly used vector-matrix multiplier, since all of the hologram area is in use. An investigation is made of the use of computer-generated holograms recorded on such updatable media as thermoplastic materials, in order to define the interconnections and weights of a neural network processor; attention is given to limits on interconnection densities, diffraction efficiencies, and weighing accuracies possible with such an updatable thin film holographic device.
An implementation of the look-ahead Lanczos algorithm for non-Hermitian matrices
NASA Technical Reports Server (NTRS)
Freund, Roland W.; Gutknecht, Martin H.; Nachtigal, Noel M.
1991-01-01
The nonsymmetric Lanczos method can be used to compute eigenvalues of large sparse non-Hermitian matrices or to solve large sparse non-Hermitian linear systems. However, the original Lanczos algorithm is susceptible to possible breakdowns and potential instabilities. An implementation is presented of a look-ahead version of the Lanczos algorithm that, except for the very special situation of an incurable breakdown, overcomes these problems by skipping over those steps in which a breakdown or near-breakdown would occur in the standard process. The proposed algorithm can handle look-ahead steps of any length and requires the same number of matrix-vector products and inner products as the standard Lanczos process without look-ahead.
Preconditioned conjugate gradient wave-front reconstructors for multiconjugate adaptive optics
NASA Astrophysics Data System (ADS)
Gilles, Luc; Ellerbroek, Brent L.; Vogel, Curtis R.
2003-09-01
Multiconjugate adaptive optics (MCAO) systems with 104-105 degrees of freedom have been proposed for future giant telescopes. Using standard matrix methods to compute, optimize, and implement wave-front control algorithms for these systems is impractical, since the number of calculations required to compute and apply the reconstruction matrix scales respectively with the cube and the square of the number of adaptive optics degrees of freedom. We develop scalable open-loop iterative sparse matrix implementations of minimum variance wave-front reconstruction for telescope diameters up to 32 m with more than 104 actuators. The basic approach is the preconditioned conjugate gradient method with an efficient preconditioner, whose block structure is defined by the atmospheric turbulent layers very much like the layer-oriented MCAO algorithms of current interest. Two cost-effective preconditioners are investigated: a multigrid solver and a simpler block symmetric Gauss-Seidel (BSGS) sweep. Both options require off-line sparse Cholesky factorizations of the diagonal blocks of the matrix system. The cost to precompute these factors scales approximately as the three-halves power of the number of estimated phase grid points per atmospheric layer, and their average update rate is typically of the order of 10-2 Hz, i.e., 4-5 orders of magnitude lower than the typical 103 Hz temporal sampling rate. All other computations scale almost linearly with the total number of estimated phase grid points. We present numerical simulation results to illustrate algorithm convergence. Convergence rates of both preconditioners are similar, regardless of measurement noise level, indicating that the layer-oriented BSGS sweep is as effective as the more elaborated multiresolution preconditioner.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hutchinson, S.A.; Shadid, J.N.; Tuminaro, R.S.
1995-10-01
Aztec is an iterative library that greatly simplifies the parallelization process when solving the linear systems of equations Ax = b where A is a user supplied n x n sparse matrix, b is a user supplied vector of length n and x is a vector of length n to be computed. Aztec is intended as a software tool for users who want to avoid cumbersome parallel programming details but who have large sparse linear systems which require an efficiently utilized parallel processing system. A collection of data transformation tools are provided that allow for easy creation of distributed sparsemore » unstructured matrices for parallel solution. Once the distributed matrix is created, computation can be performed on any of the parallel machines running Aztec: nCUBE 2, IBM SP2 and Intel Paragon, MPI platforms as well as standard serial and vector platforms. Aztec includes a number of Krylov iterative methods such as conjugate gradient (CG), generalized minimum residual (GMRES) and stabilized biconjugate gradient (BICGSTAB) to solve systems of equations. These Krylov methods are used in conjunction with various preconditioners such as polynomial or domain decomposition methods using LU or incomplete LU factorizations within subdomains. Although the matrix A can be general, the package has been designed for matrices arising from the approximation of partial differential equations (PDEs). In particular, the Aztec package is oriented toward systems arising from PDE applications.« less
Semi-automatic sparse preconditioners for high-order finite element methods on non-uniform meshes
NASA Astrophysics Data System (ADS)
Austin, Travis M.; Brezina, Marian; Jamroz, Ben; Jhurani, Chetan; Manteuffel, Thomas A.; Ruge, John
2012-05-01
High-order finite elements often have a higher accuracy per degree of freedom than the classical low-order finite elements. However, in the context of implicit time-stepping methods, high-order finite elements present challenges to the construction of efficient simulations due to the high cost of inverting the denser finite element matrix. There are many cases where simulations are limited by the memory required to store the matrix and/or the algorithmic components of the linear solver. We are particularly interested in preconditioned Krylov methods for linear systems generated by discretization of elliptic partial differential equations with high-order finite elements. Using a preconditioner like Algebraic Multigrid can be costly in terms of memory due to the need to store matrix information at the various levels. We present a novel method for defining a preconditioner for systems generated by high-order finite elements that is based on a much sparser system than the original high-order finite element system. We investigate the performance for non-uniform meshes on a cube and a cubed sphere mesh, showing that the sparser preconditioner is more efficient and uses significantly less memory. Finally, we explore new methods to construct the sparse preconditioner and examine their effectiveness for non-uniform meshes. We compare results to a direct use of Algebraic Multigrid as a preconditioner and to a two-level additive Schwarz method.
Data traffic reduction schemes for sparse Cholesky factorizations
NASA Technical Reports Server (NTRS)
Naik, Vijay K.; Patrick, Merrell L.
1988-01-01
Load distribution schemes are presented which minimize the total data traffic in the Cholesky factorization of dense and sparse, symmetric, positive definite matrices on multiprocessor systems with local and shared memory. The total data traffic in factoring an n x n sparse, symmetric, positive definite matrix representing an n-vertex regular 2-D grid graph using n (sup alpha), alpha is equal to or less than 1, processors are shown to be O(n(sup 1 + alpha/2)). It is O(n(sup 3/2)), when n (sup alpha), alpha is equal to or greater than 1, processors are used. Under the conditions of uniform load distribution, these results are shown to be asymptotically optimal. The schemes allow efficient use of up to O(n) processors before the total data traffic reaches the maximum value of O(n(sup 3/2)). The partitioning employed within the scheme, allows a better utilization of the data accessed from shared memory than those of previously published methods.
Social Collaborative Filtering by Trust.
Yang, Bo; Lei, Yu; Liu, Jiming; Li, Wenjie
2017-08-01
Recommender systems are used to accurately and actively provide users with potentially interesting information or services. Collaborative filtering is a widely adopted approach to recommendation, but sparse data and cold-start users are often barriers to providing high quality recommendations. To address such issues, we propose a novel method that works to improve the performance of collaborative filtering recommendations by integrating sparse rating data given by users and sparse social trust network among these same users. This is a model-based method that adopts matrix factorization technique that maps users into low-dimensional latent feature spaces in terms of their trust relationship, and aims to more accurately reflect the users reciprocal influence on the formation of their own opinions and to learn better preferential patterns of users for high-quality recommendations. We use four large-scale datasets to show that the proposed method performs much better, especially for cold start users, than state-of-the-art recommendation algorithms for social collaborative filtering based on trust.
Inference for High-dimensional Differential Correlation Matrices.
Cai, T Tony; Zhang, Anru
2016-01-01
Motivated by differential co-expression analysis in genomics, we consider in this paper estimation and testing of high-dimensional differential correlation matrices. An adaptive thresholding procedure is introduced and theoretical guarantees are given. Minimax rate of convergence is established and the proposed estimator is shown to be adaptively rate-optimal over collections of paired correlation matrices with approximately sparse differences. Simulation results show that the procedure significantly outperforms two other natural methods that are based on separate estimation of the individual correlation matrices. The procedure is also illustrated through an analysis of a breast cancer dataset, which provides evidence at the gene co-expression level that several genes, of which a subset has been previously verified, are associated with the breast cancer. Hypothesis testing on the differential correlation matrices is also considered. A test, which is particularly well suited for testing against sparse alternatives, is introduced. In addition, other related problems, including estimation of a single sparse correlation matrix, estimation of the differential covariance matrices, and estimation of the differential cross-correlation matrices, are also discussed.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jakeman, John D.; Narayan, Akil; Zhou, Tao
We propose an algorithm for recovering sparse orthogonal polynomial expansions via collocation. A standard sampling approach for recovering sparse polynomials uses Monte Carlo sampling, from the density of orthogonality, which results in poor function recovery when the polynomial degree is high. Our proposed approach aims to mitigate this limitation by sampling with respect to the weighted equilibrium measure of the parametric domain and subsequently solves a preconditionedmore » $$\\ell^1$$-minimization problem, where the weights of the diagonal preconditioning matrix are given by evaluations of the Christoffel function. Our algorithm can be applied to a wide class of orthogonal polynomial families on bounded and unbounded domains, including all classical families. We present theoretical analysis to motivate the algorithm and numerical results that show our method is superior to standard Monte Carlo methods in many situations of interest. In conclusion, numerical examples are also provided to demonstrate that our proposed algorithm leads to comparable or improved accuracy even when compared with Legendre- and Hermite-specific algorithms.« less
Jakeman, John D.; Narayan, Akil; Zhou, Tao
2017-06-22
We propose an algorithm for recovering sparse orthogonal polynomial expansions via collocation. A standard sampling approach for recovering sparse polynomials uses Monte Carlo sampling, from the density of orthogonality, which results in poor function recovery when the polynomial degree is high. Our proposed approach aims to mitigate this limitation by sampling with respect to the weighted equilibrium measure of the parametric domain and subsequently solves a preconditionedmore » $$\\ell^1$$-minimization problem, where the weights of the diagonal preconditioning matrix are given by evaluations of the Christoffel function. Our algorithm can be applied to a wide class of orthogonal polynomial families on bounded and unbounded domains, including all classical families. We present theoretical analysis to motivate the algorithm and numerical results that show our method is superior to standard Monte Carlo methods in many situations of interest. In conclusion, numerical examples are also provided to demonstrate that our proposed algorithm leads to comparable or improved accuracy even when compared with Legendre- and Hermite-specific algorithms.« less
Experiments with conjugate gradient algorithms for homotopy curve tracking
NASA Technical Reports Server (NTRS)
Irani, Kashmira M.; Ribbens, Calvin J.; Watson, Layne T.; Kamat, Manohar P.; Walker, Homer F.
1991-01-01
There are algorithms for finding zeros or fixed points of nonlinear systems of equations that are globally convergent for almost all starting points, i.e., with probability one. The essence of all such algorithms is the construction of an appropriate homotopy map and then tracking some smooth curve in the zero set of this homotopy map. HOMPACK is a mathematical software package implementing globally convergent homotopy algorithms with three different techniques for tracking a homotopy zero curve, and has separate routines for dense and sparse Jacobian matrices. The HOMPACK algorithms for sparse Jacobian matrices use a preconditioned conjugate gradient algorithm for the computation of the kernel of the homotopy Jacobian matrix, a required linear algebra step for homotopy curve tracking. Here, variants of the conjugate gradient algorithm are implemented in the context of homotopy curve tracking and compared with Craig's preconditioned conjugate gradient method used in HOMPACK. The test problems used include actual large scale, sparse structural mechanics problems.
NASA Astrophysics Data System (ADS)
Li, Gongxin; Li, Peng; Wang, Yuechao; Wang, Wenxue; Xi, Ning; Liu, Lianqing
2014-07-01
Scanning Ion Conductance Microscopy (SICM) is one kind of Scanning Probe Microscopies (SPMs), and it is widely used in imaging soft samples for many distinctive advantages. However, the scanning speed of SICM is much slower than other SPMs. Compressive sensing (CS) could improve scanning speed tremendously by breaking through the Shannon sampling theorem, but it still requires too much time in image reconstruction. Block compressive sensing can be applied to SICM imaging to further reduce the reconstruction time of sparse signals, and it has another unique application that it can achieve the function of image real-time display in SICM imaging. In this article, a new method of dividing blocks and a new matrix arithmetic operation were proposed to build the block compressive sensing model, and several experiments were carried out to verify the superiority of block compressive sensing in reducing imaging time and real-time display in SICM imaging.
Unsymmetric ordering using a constrained Markowitz scheme
DOE Office of Scientific and Technical Information (OSTI.GOV)
Amestoy, Patrick R.; Xiaoye S.; Pralet, Stephane
2005-01-18
We present a family of ordering algorithms that can be used as a preprocessing step prior to performing sparse LU factorization. The ordering algorithms simultaneously achieve the objectives of selecting numerically good pivots and preserving the sparsity. We describe the algorithmic properties and challenges in their implementation. By mixing the two objectives we show that we can reduce the amount of fill-in in the factors and reduce the number of numerical problems during factorization. On a set of large unsymmetric real problems, we obtained the median reductions of 12% in the factorization time, of 13% in the size of themore » LU factors, of 20% in the number of operations performed during the factorization phase, and of 11% in the memory needed by the multifrontal solver MA41-UNS. A byproduct of this ordering strategy is an incomplete LU-factored matrix that can be used as a preconditioner in an iterative solver.« less
People counting in classroom based on video surveillance
NASA Astrophysics Data System (ADS)
Zhang, Quanbin; Huang, Xiang; Su, Juan
2014-11-01
Currently, the switches of the lights and other electronic devices in the classroom are mainly relied on manual control, as a result, many lights are on while no one or only few people in the classroom. It is important to change the current situation and control the electronic devices intelligently according to the number and the distribution of the students in the classroom, so as to reduce the considerable waste of electronic resources. This paper studies the problem of people counting in classroom based on video surveillance. As the camera in the classroom can not get the full shape contour information of bodies and the clear features information of faces, most of the classical algorithms such as the pedestrian detection method based on HOG (histograms of oriented gradient) feature and the face detection method based on machine learning are unable to obtain a satisfied result. A new kind of dual background updating model based on sparse and low-rank matrix decomposition is proposed in this paper, according to the fact that most of the students in the classroom are almost in stationary state and there are body movement occasionally. Firstly, combining the frame difference with the sparse and low-rank matrix decomposition to predict the moving areas, and updating the background model with different parameters according to the positional relationship between the pixels of current video frame and the predicted motion regions. Secondly, the regions of moving objects are determined based on the updated background using the background subtraction method. Finally, some operations including binarization, median filtering and morphology processing, connected component detection, etc. are performed on the regions acquired by the background subtraction, in order to induce the effects of the noise and obtain the number of people in the classroom. The experiment results show the validity of the algorithm of people counting.
Decentralized state estimation for a large-scale spatially interconnected system.
Liu, Huabo; Yu, Haisheng
2018-03-01
A decentralized state estimator is derived for the spatially interconnected systems composed of many subsystems with arbitrary connection relations. An optimization problem on the basis of linear matrix inequality (LMI) is constructed for the computations of improved subsystem parameter matrices. Several computationally effective approaches are derived which efficiently utilize the block-diagonal characteristic of system parameter matrices and the sparseness of subsystem connection matrix. Moreover, this decentralized state estimator is proved to converge to a stable system and obtain a bounded covariance matrix of estimation errors under certain conditions. Numerical simulations show that the obtained decentralized state estimator is attractive in the synthesis of a large-scale networked system. Copyright © 2018 ISA. Published by Elsevier Ltd. All rights reserved.
Background recovery via motion-based robust principal component analysis with matrix factorization
NASA Astrophysics Data System (ADS)
Pan, Peng; Wang, Yongli; Zhou, Mingyuan; Sun, Zhipeng; He, Guoping
2018-03-01
Background recovery is a key technique in video analysis, but it still suffers from many challenges, such as camouflage, lighting changes, and diverse types of image noise. Robust principal component analysis (RPCA), which aims to recover a low-rank matrix and a sparse matrix, is a general framework for background recovery. The nuclear norm is widely used as a convex surrogate for the rank function in RPCA, which requires computing the singular value decomposition (SVD), a task that is increasingly costly as matrix sizes and ranks increase. However, matrix factorization greatly reduces the dimension of the matrix for which the SVD must be computed. Motion information has been shown to improve low-rank matrix recovery in RPCA, but this method still finds it difficult to handle original video data sets because of its batch-mode formulation and implementation. Hence, in this paper, we propose a motion-assisted RPCA model with matrix factorization (FM-RPCA) for background recovery. Moreover, an efficient linear alternating direction method of multipliers with a matrix factorization (FL-ADM) algorithm is designed for solving the proposed FM-RPCA model. Experimental results illustrate that the method provides stable results and is more efficient than the current state-of-the-art algorithms.
Reliability of the Colorado Family Support Assessment: A Self-Sufficiency Matrix for Families
ERIC Educational Resources Information Center
Richmond, Melissa K.; Pampel, Fred C.; Zarcula, Flavia; Howey, Virginia; McChesney, Brenda
2017-01-01
Purpose: Family support programs commonly use self-sufficiency matrices (SSMs) to measure family outcomes, however, validation research on SSMs is sparse. This study examined the reliability of the Colorado Family Support Assessment 2.0 (CFSA 2.0) to measure family self-reliance across 14 domains (e.g., employment). Methods: Ten written case…
A manual for PARTI runtime primitives
NASA Technical Reports Server (NTRS)
Berryman, Harry; Saltz, Joel
1990-01-01
Primitives are presented that are designed to help users efficiently program irregular problems (e.g., unstructured mesh sweeps, sparse matrix codes, adaptive mesh partial differential equations solvers) on distributed memory machines. These primitives are also designed for use in compilers for distributed memory multiprocessors. Communications patterns are captured at runtime, and the appropriate send and receive messages are automatically generated.
A radial basis function Galerkin method for inhomogeneous nonlocal diffusion
Lehoucq, Richard B.; Rowe, Stephen T.
2016-02-01
We introduce a discretization for a nonlocal diffusion problem using a localized basis of radial basis functions. The stiffness matrix entries are assembled by a special quadrature routine unique to the localized basis. Combining the quadrature method with the localized basis produces a well-conditioned, sparse, symmetric positive definite stiffness matrix. We demonstrate that both the continuum and discrete problems are well-posed and present numerical results for the convergence behavior of the radial basis function method. As a result, we explore approximating the solution to anisotropic differential equations by solving anisotropic nonlocal integral equations using the radial basis function method.
Comparing direct and iterative equation solvers in a large structural analysis software system
NASA Technical Reports Server (NTRS)
Poole, E. L.
1991-01-01
Two direct Choleski equation solvers and two iterative preconditioned conjugate gradient (PCG) equation solvers used in a large structural analysis software system are described. The two direct solvers are implementations of the Choleski method for variable-band matrix storage and sparse matrix storage. The two iterative PCG solvers include the Jacobi conjugate gradient method and an incomplete Choleski conjugate gradient method. The performance of the direct and iterative solvers is compared by solving several representative structural analysis problems. Some key factors affecting the performance of the iterative solvers relative to the direct solvers are identified.
Robust and sparse correlation matrix estimation for the analysis of high-dimensional genomics data.
Serra, Angela; Coretto, Pietro; Fratello, Michele; Tagliaferri, Roberto; Stegle, Oliver
2018-02-15
Microarray technology can be used to study the expression of thousands of genes across a number of different experimental conditions, usually hundreds. The underlying principle is that genes sharing similar expression patterns, across different samples, can be part of the same co-expression system, or they may share the same biological functions. Groups of genes are usually identified based on cluster analysis. Clustering methods rely on the similarity matrix between genes. A common choice to measure similarity is to compute the sample correlation matrix. Dimensionality reduction is another popular data analysis task which is also based on covariance/correlation matrix estimates. Unfortunately, covariance/correlation matrix estimation suffers from the intrinsic noise present in high-dimensional data. Sources of noise are: sampling variations, presents of outlying sample units, and the fact that in most cases the number of units is much larger than the number of genes. In this paper, we propose a robust correlation matrix estimator that is regularized based on adaptive thresholding. The resulting method jointly tames the effects of the high-dimensionality, and data contamination. Computations are easy to implement and do not require hand tunings. Both simulated and real data are analyzed. A Monte Carlo experiment shows that the proposed method is capable of remarkable performances. Our correlation metric is more robust to outliers compared with the existing alternatives in two gene expression datasets. It is also shown how the regularization allows to automatically detect and filter spurious correlations. The same regularization is also extended to other less robust correlation measures. Finally, we apply the ARACNE algorithm on the SyNTreN gene expression data. Sensitivity and specificity of the reconstructed network is compared with the gold standard. We show that ARACNE performs better when it takes the proposed correlation matrix estimator as input. The R software is available at https://github.com/angy89/RobustSparseCorrelation. aserra@unisa.it or robtag@unisa.it. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
Three-dimensional unstructured grid Euler computations using a fully-implicit, upwind method
NASA Technical Reports Server (NTRS)
Whitaker, David L.
1993-01-01
A method has been developed to solve the Euler equations on a three-dimensional unstructured grid composed of tetrahedra. The method uses an upwind flow solver with a linearized, backward-Euler time integration scheme. Each time step results in a sparse linear system of equations which is solved by an iterative, sparse matrix solver. Local-time stepping, switched evolution relaxation (SER), preconditioning and reuse of the Jacobian are employed to accelerate the convergence rate. Implicit boundary conditions were found to be extremely important for fast convergence. Numerical experiments have shown that convergence rates comparable to that of a multigrid, central-difference scheme are achievable on the same mesh. Results are presented for several grids about an ONERA M6 wing.
NASA Astrophysics Data System (ADS)
Li, Xianye; Meng, Xiangfeng; Yang, Xiulun; Wang, Yurong; Yin, Yongkai; Peng, Xiang; He, Wenqi; Dong, Guoyan; Chen, Hongyi
2018-03-01
A multiple-image encryption method via lifting wavelet transform (LWT) and XOR operation is proposed, which is based on a row scanning compressive ghost imaging scheme. In the encryption process, the scrambling operation is implemented for the sparse images transformed by LWT, then the XOR operation is performed on the scrambled images, and the resulting XOR images are compressed in the row scanning compressive ghost imaging, through which the ciphertext images can be detected by bucket detector arrays. During decryption, the participant who possesses his/her correct key-group, can successfully reconstruct the corresponding plaintext image by measurement key regeneration, compression algorithm reconstruction, XOR operation, sparse images recovery, and inverse LWT (iLWT). Theoretical analysis and numerical simulations validate the feasibility of the proposed method.
Working with Sparse Data in Rated Language Tests: Generalizability Theory Applications
ERIC Educational Resources Information Center
Lin, Chih-Kai
2017-01-01
Sparse-rated data are common in operational performance-based language tests, as an inevitable result of assigning examinee responses to a fraction of available raters. The current study investigates the precision of two generalizability-theory methods (i.e., the rating method and the subdividing method) specifically designed to accommodate the…
Sparse Logistic Regression for Diagnosis of Liver Fibrosis in Rat by Using SCAD-Penalized Likelihood
Yan, Fang-Rong; Lin, Jin-Guan; Liu, Yu
2011-01-01
The objective of the present study is to find out the quantitative relationship between progression of liver fibrosis and the levels of certain serum markers using mathematic model. We provide the sparse logistic regression by using smoothly clipped absolute deviation (SCAD) penalized function to diagnose the liver fibrosis in rats. Not only does it give a sparse solution with high accuracy, it also provides the users with the precise probabilities of classification with the class information. In the simulative case and the experiment case, the proposed method is comparable to the stepwise linear discriminant analysis (SLDA) and the sparse logistic regression with least absolute shrinkage and selection operator (LASSO) penalty, by using receiver operating characteristic (ROC) with bayesian bootstrap estimating area under the curve (AUC) diagnostic sensitivity for selected variable. Results show that the new approach provides a good correlation between the serum marker levels and the liver fibrosis induced by thioacetamide (TAA) in rats. Meanwhile, this approach might also be used in predicting the development of liver cirrhosis. PMID:21716672
Sparseness- and continuity-constrained seismic imaging
NASA Astrophysics Data System (ADS)
Herrmann, Felix J.
2005-04-01
Non-linear solution strategies to the least-squares seismic inverse-scattering problem with sparseness and continuity constraints are proposed. Our approach is designed to (i) deal with substantial amounts of additive noise (SNR < 0 dB); (ii) use the sparseness and locality (both in position and angle) of directional basis functions (such as curvelets and contourlets) on the model: the reflectivity; and (iii) exploit the near invariance of these basis functions under the normal operator, i.e., the scattering-followed-by-imaging operator. Signal-to-noise ratio and the continuity along the imaged reflectors are significantly enhanced by formulating the solution of the seismic inverse problem in terms of an optimization problem. During the optimization, sparseness on the basis and continuity along the reflectors are imposed by jointly minimizing the l1- and anisotropic diffusion/total-variation norms on the coefficients and reflectivity, respectively. [Joint work with Peyman P. Moghaddam was carried out as part of the SINBAD project, with financial support secured through ITF (the Industry Technology Facilitator) from the following organizations: BG Group, BP, ExxonMobil, and SHELL. Additional funding came from the NSERC Discovery Grants 22R81254.
NASA Astrophysics Data System (ADS)
Takasaki, Koichi
This paper presents a program for the multidisciplinary optimization and identification problem of the nonlinear model of large aerospace vehicle structures. The program constructs the global matrix of the dynamic system in the time direction by the p-version finite element method (pFEM), and the basic matrix for each pFEM node in the time direction is described by a sparse matrix similarly to the static finite element problem. The algorithm used by the program does not require the Hessian matrix of the objective function and so has low memory requirements. It also has a relatively low computational cost, and is suited to parallel computation. The program was integrated as a solver module of the multidisciplinary analysis system CUMuLOUS (Computational Utility for Multidisciplinary Large scale Optimization of Undense System) which is under development by the Aerospace Research and Development Directorate (ARD) of the Japan Aerospace Exploration Agency (JAXA).
Lee, Young-Beom; Lee, Jeonghyeon; Tak, Sungho; Lee, Kangjoo; Na, Duk L; Seo, Sang Won; Jeong, Yong; Ye, Jong Chul
2016-01-15
Recent studies of functional connectivity MR imaging have revealed that the default-mode network activity is disrupted in diseases such as Alzheimer's disease (AD). However, there is not yet a consensus on the preferred method for resting-state analysis. Because the brain is reported to have complex interconnected networks according to graph theoretical analysis, the independency assumption, as in the popular independent component analysis (ICA) approach, often does not hold. Here, rather than using the independency assumption, we present a new statistical parameter mapping (SPM)-type analysis method based on a sparse graph model where temporal dynamics at each voxel position are described as a sparse combination of global brain dynamics. In particular, a new concept of a spatially adaptive design matrix has been proposed to represent local connectivity that shares the same temporal dynamics. If we further assume that local network structures within a group are similar, the estimation problem of global and local dynamics can be solved using sparse dictionary learning for the concatenated temporal data across subjects. Moreover, under the homoscedasticity variance assumption across subjects and groups that is often used in SPM analysis, the aforementioned individual and group analyses using sparse dictionary learning can be accurately modeled by a mixed-effect model, which also facilitates a standard SPM-type group-level inference using summary statistics. Using an extensive resting fMRI data set obtained from normal, mild cognitive impairment (MCI), and Alzheimer's disease patient groups, we demonstrated that the changes in the default mode network extracted by the proposed method are more closely correlated with the progression of Alzheimer's disease. Copyright © 2015 Elsevier Inc. All rights reserved.
Two-stage sparse coding of region covariance via Log-Euclidean kernels to detect saliency.
Zhang, Ying-Ying; Yang, Cai; Zhang, Ping
2017-05-01
In this paper, we present a novel bottom-up saliency detection algorithm from the perspective of covariance matrices on a Riemannian manifold. Each superpixel is described by a region covariance matrix on Riemannian Manifolds. We carry out a two-stage sparse coding scheme via Log-Euclidean kernels to extract salient objects efficiently. In the first stage, given background dictionary on image borders, sparse coding of each region covariance via Log-Euclidean kernels is performed. The reconstruction error on the background dictionary is regarded as the initial saliency of each superpixel. In the second stage, an improvement of the initial result is achieved by calculating reconstruction errors of the superpixels on foreground dictionary, which is extracted from the first stage saliency map. The sparse coding in the second stage is similar to the first stage, but is able to effectively highlight the salient objects uniformly from the background. Finally, three post-processing methods-highlight-inhibition function, context-based saliency weighting, and the graph cut-are adopted to further refine the saliency map. Experiments on four public benchmark datasets show that the proposed algorithm outperforms the state-of-the-art methods in terms of precision, recall and mean absolute error, and demonstrate the robustness and efficiency of the proposed method. Copyright © 2017 Elsevier Ltd. All rights reserved.
Lafzi, A; Farahani, R M; Tubbs, R S; Roushangar, L; Shoja, M M
2007-05-01
Enamel matrix derivative (EMD), such as Emdogain, has been suggested for the improvement of wound healing in periodontal surgical therapy. The present qualitative study seeks to illustrate the ultrastructural changes associated with a human gingival wound at 10 days after the application of EMD as an adjunct to a laterally-positioned flap in a patient with gingival recession. An otherwise healthy patient, who had been suffering from bilateral gingival recession defects on teeth #23 and #26, was studied. One defect was treated with a laterally-positioned flap, while the other was treated with a combination of EMD and a laterally-positioned flap. Ten days after the operation gingival biopsy specimens were obtained from the dentogingival region and examined using a transmission electron microscope. A considerable difference was found in both the cellular and extracellular phases of EMD and non-EMD sites. The fibroblasts of EMD site were more rounded with plump cytoplasms and euchromatic nuclei. A well-developed rough endoplasmic reticulum and numerous mitochondria could be detected. In contrast, the fibroblasts of non-EMD site were of flattened spindle-like morphology. While the signs of apoptosis could rarely be detected at EMD site, apoptotic bodies and ultra-structural evidence of apoptosis (crescent-like heterochromatic nuclei and dilated nuclear envelopes) were consistent features at non-EMD site. The extracellular matrix at EMD site mainly consisted of well-organised collagen fibres, while non-EMD site contained sparse and incompletely-formed collagen fibres. Coccoid bacteria were noted within the extracellular matrix and neutrophils at non-EMD site. It seems that EMD may enhance certain features of gingival wound healing, which may be attributable to its anti-apoptotic, anti-bacterial or anti-inflammatory properties.
Preconditioned conjugate gradient wave-front reconstructors for multiconjugate adaptive optics.
Gilles, Luc; Ellerbroek, Brent L; Vogel, Curtis R
2003-09-10
Multiconjugate adaptive optics (MCAO) systems with 10(4)-10(5) degrees of freedom have been proposed for future giant telescopes. Using standard matrix methods to compute, optimize, and implement wavefront control algorithms for these systems is impractical, since the number of calculations required to compute and apply the reconstruction matrix scales respectively with the cube and the square of the number of adaptive optics degrees of freedom. We develop scalable open-loop iterative sparse matrix implementations of minimum variance wave-front reconstruction for telescope diameters up to 32 m with more than 10(4) actuators. The basic approach is the preconditioned conjugate gradient method with an efficient preconditioner, whose block structure is defined by the atmospheric turbulent layers very much like the layer-oriented MCAO algorithms of current interest. Two cost-effective preconditioners are investigated: a multigrid solver and a simpler block symmetric Gauss-Seidel (BSGS) sweep. Both options require off-line sparse Cholesky factorizations of the diagonal blocks of the matrix system. The cost to precompute these factors scales approximately as the three-halves power of the number of estimated phase grid points per atmospheric layer, and their average update rate is typically of the order of 10(-2) Hz, i.e., 4-5 orders of magnitude lower than the typical 10(3) Hz temporal sampling rate. All other computations scale almost linearly with the total number of estimated phase grid points. We present numerical simulation results to illustrate algorithm convergence. Convergence rates of both preconditioners are similar, regardless of measurement noise level, indicating that the layer-oriented BSGS sweep is as effective as the more elaborated multiresolution preconditioner.
Completing sparse and disconnected protein-protein network by deep learning.
Huang, Lei; Liao, Li; Wu, Cathy H
2018-03-22
Protein-protein interaction (PPI) prediction remains a central task in systems biology to achieve a better and holistic understanding of cellular and intracellular processes. Recently, an increasing number of computational methods have shifted from pair-wise prediction to network level prediction. Many of the existing network level methods predict PPIs under the assumption that the training network should be connected. However, this assumption greatly affects the prediction power and limits the application area because the current golden standard PPI networks are usually very sparse and disconnected. Therefore, how to effectively predict PPIs based on a training network that is sparse and disconnected remains a challenge. In this work, we developed a novel PPI prediction method based on deep learning neural network and regularized Laplacian kernel. We use a neural network with an autoencoder-like architecture to implicitly simulate the evolutionary processes of a PPI network. Neurons of the output layer correspond to proteins and are labeled with values (1 for interaction and 0 for otherwise) from the adjacency matrix of a sparse disconnected training PPI network. Unlike autoencoder, neurons at the input layer are given all zero input, reflecting an assumption of no a priori knowledge about PPIs, and hidden layers of smaller sizes mimic ancient interactome at different times during evolution. After the training step, an evolved PPI network whose rows are outputs of the neural network can be obtained. We then predict PPIs by applying the regularized Laplacian kernel to the transition matrix that is built upon the evolved PPI network. The results from cross-validation experiments show that the PPI prediction accuracies for yeast data and human data measured as AUC are increased by up to 8.4 and 14.9% respectively, as compared to the baseline. Moreover, the evolved PPI network can also help us leverage complementary information from the disconnected training network and multiple heterogeneous data sources. Tested by the yeast data with six heterogeneous feature kernels, the results show our method can further improve the prediction performance by up to 2%, which is very close to an upper bound that is obtained by an Approximate Bayesian Computation based sampling method. The proposed evolution deep neural network, coupled with regularized Laplacian kernel, is an effective tool in completing sparse and disconnected PPI networks and in facilitating integration of heterogeneous data sources.
Efficient Implementation of an Optimal Interpolator for Large Spatial Data Sets
NASA Technical Reports Server (NTRS)
Memarsadeghi, Nargess; Mount, David M.
2007-01-01
Scattered data interpolation is a problem of interest in numerous areas such as electronic imaging, smooth surface modeling, and computational geometry. Our motivation arises from applications in geology and mining, which often involve large scattered data sets and a demand for high accuracy. The method of choice is ordinary kriging. This is because it is a best unbiased estimator. Unfortunately, this interpolant is computationally very expensive to compute exactly. For n scattered data points, computing the value of a single interpolant involves solving a dense linear system of size roughly n x n. This is infeasible for large n. In practice, kriging is solved approximately by local approaches that are based on considering only a relatively small'number of points that lie close to the query point. There are many problems with this local approach, however. The first is that determining the proper neighborhood size is tricky, and is usually solved by ad hoc methods such as selecting a fixed number of nearest neighbors or all the points lying within a fixed radius. Such fixed neighborhood sizes may not work well for all query points, depending on local density of the point distribution. Local methods also suffer from the problem that the resulting interpolant is not continuous. Meyer showed that while kriging produces smooth continues surfaces, it has zero order continuity along its borders. Thus, at interface boundaries where the neighborhood changes, the interpolant behaves discontinuously. Therefore, it is important to consider and solve the global system for each interpolant. However, solving such large dense systems for each query point is impractical. Recently a more principled approach to approximating kriging has been proposed based on a technique called covariance tapering. The problems arise from the fact that the covariance functions that are used in kriging have global support. Our implementations combine, utilize, and enhance a number of different approaches that have been introduced in literature for solving large linear systems for interpolation of scattered data points. For very large systems, exact methods such as Gaussian elimination are impractical since they require 0(n(exp 3)) time and 0(n(exp 2)) storage. As Billings et al. suggested, we use an iterative approach. In particular, we use the SYMMLQ method, for solving the large but sparse ordinary kriging systems that result from tapering. The main technical issue that need to be overcome in our algorithmic solution is that the points' covariance matrix for kriging should be symmetric positive definite. The goal of tapering is to obtain a sparse approximate representation of the covariance matrix while maintaining its positive definiteness. Furrer et al. used tapering to obtain a sparse linear system of the form Ax = b, where A is the tapered symmetric positive definite covariance matrix. Thus, Cholesky factorization could be used to solve their linear systems. They implemented an efficient sparse Cholesky decomposition method. They also showed if these tapers are used for a limited class of covariance models, the solution of the system converges to the solution of the original system. Matrix A in the ordinary kriging system, while symmetric, is not positive definite. Thus, their approach is not applicable to the ordinary kriging system. Therefore, we use tapering only to obtain a sparse linear system. Then, we use SYMMLQ to solve the ordinary kriging system. We show that solving large kriging systems becomes practical via tapering and iterative methods, and results in lower estimation errors compared to traditional local approaches, and significant memory savings compared to the original global system. We also developed a more efficient variant of the sparse SYMMLQ method for large ordinary kriging systems. This approach adaptively finds the correct local neighborhood for each query point in the interpolation process.
A manual for PARTI runtime primitives, revision 1
NASA Technical Reports Server (NTRS)
Das, Raja; Saltz, Joel; Berryman, Harry
1991-01-01
Primitives are presented that are designed to help users efficiently program irregular problems (e.g., unstructured mesh sweeps, sparse matrix codes, adaptive mesh partial differential equations solvers) on distributed memory machines. These primitives are also designed for use in compilers for distributed memory multiprocessors. Communications patterns are captured at runtime, and the appropriate send and receive messages are automatically generated.
Big geo data surface approximation using radial basis functions: A comparative study
NASA Astrophysics Data System (ADS)
Majdisova, Zuzana; Skala, Vaclav
2017-12-01
Approximation of scattered data is often a task in many engineering problems. The Radial Basis Function (RBF) approximation is appropriate for big scattered datasets in n-dimensional space. It is a non-separable approximation, as it is based on the distance between two points. This method leads to the solution of an overdetermined linear system of equations. In this paper the RBF approximation methods are briefly described, a new approach to the RBF approximation of big datasets is presented, and a comparison for different Compactly Supported RBFs (CS-RBFs) is made with respect to the accuracy of the computation. The proposed approach uses symmetry of a matrix, partitioning the matrix into blocks and data structures for storage of the sparse matrix. The experiments are performed for synthetic and real datasets.
NASA Astrophysics Data System (ADS)
Sides, Scott; Jamroz, Ben; Crockett, Robert; Pletzer, Alexander
2012-02-01
Self-consistent field theory (SCFT) for dense polymer melts has been highly successful in describing complex morphologies in block copolymers. Field-theoretic simulations such as these are able to access large length and time scales that are difficult or impossible for particle-based simulations such as molecular dynamics. The modified diffusion equations that arise as a consequence of the coarse-graining procedure in the SCF theory can be efficiently solved with a pseudo-spectral (PS) method that uses fast-Fourier transforms on uniform Cartesian grids. However, PS methods can be difficult to apply in many block copolymer SCFT simulations (eg. confinement, interface adsorption) in which small spatial regions might require finer resolution than most of the simulation grid. Progress on using new solver algorithms to address these problems will be presented. The Tech-X Chompst project aims at marrying the best of adaptive mesh refinement with linear matrix solver algorithms. The Tech-X code PolySwift++ is an SCFT simulation platform that leverages ongoing development in coupling Chombo, a package for solving PDEs via block-structured AMR calculations and embedded boundaries, with PETSc, a toolkit that includes a large assortment of sparse linear solvers.
Cheng, Yih-Chun; Tsai, Pei-Yun; Huang, Ming-Hao
2016-05-19
Low-complexity compressed sensing (CS) techniques for monitoring electrocardiogram (ECG) signals in wireless body sensor network (WBSN) are presented. The prior probability of ECG sparsity in the wavelet domain is first exploited. Then, variable orthogonal multi-matching pursuit (vOMMP) algorithm that consists of two phases is proposed. In the first phase, orthogonal matching pursuit (OMP) algorithm is adopted to effectively augment the support set with reliable indices and in the second phase, the orthogonal multi-matching pursuit (OMMP) is employed to rescue the missing indices. The reconstruction performance is thus enhanced with the prior information and the vOMMP algorithm. Furthermore, the computation-intensive pseudo-inverse operation is simplified by the matrix-inversion-free (MIF) technique based on QR decomposition. The vOMMP-MIF CS decoder is then implemented in 90 nm CMOS technology. The QR decomposition is accomplished by two systolic arrays working in parallel. The implementation supports three settings for obtaining 40, 44, and 48 coefficients in the sparse vector. From the measurement result, the power consumption is 11.7 mW at 0.9 V and 12 MHz. Compared to prior chip implementations, our design shows good hardware efficiency and is suitable for low-energy applications.
The Development of a Finite Volume Method for Modeling Sound in Coastal Ocean Environment
DOE Office of Scientific and Technical Information (OSTI.GOV)
Long, Wen; Yang, Zhaoqing; Copping, Andrea E.
: As the rapid growth of marine renewable energy and off-shore wind energy, there have been concerns that the noises generated from construction and operation of the devices may interfere marine animals’ communication. In this research, a underwater sound model is developed to simulate sound prorogation generated by marine-hydrokinetic energy (MHK) devices or offshore wind (OSW) energy platforms. Finite volume and finite difference methods are developed to solve the 3D Helmholtz equation of sound propagation in the coastal environment. For finite volume method, the grid system consists of triangular grids in horizontal plane and sigma-layers in vertical dimension. A 3Dmore » sparse matrix solver with complex coefficients is formed for solving the resulting acoustic pressure field. The Complex Shifted Laplacian Preconditioner (CSLP) method is applied to efficiently solve the matrix system iteratively with MPI parallelization using a high performance cluster. The sound model is then coupled with the Finite Volume Community Ocean Model (FVCOM) for simulating sound propagation generated by human activities in a range-dependent setting, such as offshore wind energy platform constructions and tidal stream turbines. As a proof of concept, initial validation of the finite difference solver is presented for two coastal wedge problems. Validation of finite volume method will be reported separately.« less
Acceleration of Linear Finite-Difference Poisson-Boltzmann Methods on Graphics Processing Units.
Qi, Ruxi; Botello-Smith, Wesley M; Luo, Ray
2017-07-11
Electrostatic interactions play crucial roles in biophysical processes such as protein folding and molecular recognition. Poisson-Boltzmann equation (PBE)-based models have emerged as widely used in modeling these important processes. Though great efforts have been put into developing efficient PBE numerical models, challenges still remain due to the high dimensionality of typical biomolecular systems. In this study, we implemented and analyzed commonly used linear PBE solvers for the ever-improving graphics processing units (GPU) for biomolecular simulations, including both standard and preconditioned conjugate gradient (CG) solvers with several alternative preconditioners. Our implementation utilizes the standard Nvidia CUDA libraries cuSPARSE, cuBLAS, and CUSP. Extensive tests show that good numerical accuracy can be achieved given that the single precision is often used for numerical applications on GPU platforms. The optimal GPU performance was observed with the Jacobi-preconditioned CG solver, with a significant speedup over standard CG solver on CPU in our diversified test cases. Our analysis further shows that different matrix storage formats also considerably affect the efficiency of different linear PBE solvers on GPU, with the diagonal format best suited for our standard finite-difference linear systems. Further efficiency may be possible with matrix-free operations and integrated grid stencil setup specifically tailored for the banded matrices in PBE-specific linear systems.
Stability and stabilisation of a class of networked dynamic systems
NASA Astrophysics Data System (ADS)
Liu, H. B.; Wang, D. Q.
2018-04-01
We investigate the stability and stabilisation of a linear time invariant networked heterogeneous system with arbitrarily connected subsystems. A new linear matrix inequality based sufficient and necessary condition for the stability is derived, based on which the stabilisation is provided. The obtained conditions efficiently utilise the block-diagonal characteristic of system parameter matrices and the sparseness of subsystem connection matrix. Moreover, a sufficient condition only dependent on each individual subsystem is also presented for the stabilisation of the networked systems with a large scale. Numerical simulations show that these conditions are computationally valid in the analysis and synthesis of a large-scale networked system.
Effect of missing data on multitask prediction methods.
de la Vega de León, Antonio; Chen, Beining; Gillet, Valerie J
2018-05-22
There has been a growing interest in multitask prediction in chemoinformatics, helped by the increasing use of deep neural networks in this field. This technique is applied to multitarget data sets, where compounds have been tested against different targets, with the aim of developing models to predict a profile of biological activities for a given compound. However, multitarget data sets tend to be sparse; i.e., not all compound-target combinations have experimental values. There has been little research on the effect of missing data on the performance of multitask methods. We have used two complete data sets to simulate sparseness by removing data from the training set. Different models to remove the data were compared. These sparse sets were used to train two different multitask methods, deep neural networks and Macau, which is a Bayesian probabilistic matrix factorization technique. Results from both methods were remarkably similar and showed that the performance decrease because of missing data is at first small before accelerating after large amounts of data are removed. This work provides a first approximation to assess how much data is required to produce good performance in multitask prediction exercises.
Inference for High-dimensional Differential Correlation Matrices *
Cai, T. Tony; Zhang, Anru
2015-01-01
Motivated by differential co-expression analysis in genomics, we consider in this paper estimation and testing of high-dimensional differential correlation matrices. An adaptive thresholding procedure is introduced and theoretical guarantees are given. Minimax rate of convergence is established and the proposed estimator is shown to be adaptively rate-optimal over collections of paired correlation matrices with approximately sparse differences. Simulation results show that the procedure significantly outperforms two other natural methods that are based on separate estimation of the individual correlation matrices. The procedure is also illustrated through an analysis of a breast cancer dataset, which provides evidence at the gene co-expression level that several genes, of which a subset has been previously verified, are associated with the breast cancer. Hypothesis testing on the differential correlation matrices is also considered. A test, which is particularly well suited for testing against sparse alternatives, is introduced. In addition, other related problems, including estimation of a single sparse correlation matrix, estimation of the differential covariance matrices, and estimation of the differential cross-correlation matrices, are also discussed. PMID:26500380
Fast super-resolution estimation of DOA and DOD in bistatic MIMO Radar with off-grid targets
NASA Astrophysics Data System (ADS)
Zhang, Dong; Zhang, Yongshun; Zheng, Guimei; Feng, Cunqian; Tang, Jun
2018-05-01
In this paper, we focus on the problem of joint DOA and DOD estimation in Bistatic MIMO Radar using sparse reconstruction method. In traditional ways, we usually convert the 2D parameter estimation problem into 1D parameter estimation problem by Kronecker product which will enlarge the scale of the parameter estimation problem and bring more computational burden. Furthermore, it requires that the targets must fall on the predefined grids. In this paper, a 2D-off-grid model is built which can solve the grid mismatch problem of 2D parameters estimation. Then in order to solve the joint 2D sparse reconstruction problem directly and efficiently, three kinds of fast joint sparse matrix reconstruction methods are proposed which are Joint-2D-OMP algorithm, Joint-2D-SL0 algorithm and Joint-2D-SOONE algorithm. Simulation results demonstrate that our methods not only can improve the 2D parameter estimation accuracy but also reduce the computational complexity compared with the traditional Kronecker Compressed Sensing method.
Compressed Sensing Quantum Process Tomography for Superconducting Quantum Gates
NASA Astrophysics Data System (ADS)
Rodionov, Andrey
An important challenge in quantum information science and quantum computing is the experimental realization of high-fidelity quantum operations on multi-qubit systems. Quantum process tomography (QPT) is a procedure devised to fully characterize a quantum operation. We first present the results of the estimation of the process matrix for superconducting multi-qubit quantum gates using the full data set employing various methods: linear inversion, maximum likelihood, and least-squares. To alleviate the problem of exponential resource scaling needed to characterize a multi-qubit system, we next investigate a compressed sensing (CS) method for QPT of two-qubit and three-qubit quantum gates. Using experimental data for two-qubit controlled-Z gates, taken with both Xmon and superconducting phase qubits, we obtain estimates for the process matrices with reasonably high fidelities compared to full QPT, despite using significantly reduced sets of initial states and measurement configurations. We show that the CS method still works when the amount of data is so small that the standard QPT would have an underdetermined system of equations. We also apply the CS method to the analysis of the three-qubit Toffoli gate with simulated noise, and similarly show that the method works well for a substantially reduced set of data. For the CS calculations we use two different bases in which the process matrix is approximately sparse (the Pauli-error basis and the singular value decomposition basis), and show that the resulting estimates of the process matrices match with reasonably high fidelity. For both two-qubit and three-qubit gates, we characterize the quantum process by its process matrix and average state fidelity, as well as by the corresponding standard deviation defined via the variation of the state fidelity for different initial states. We calculate the standard deviation of the average state fidelity both analytically and numerically, using a Monte Carlo method. Overall, we show that CS QPT offers a significant reduction in the needed amount of experimental data for two-qubit and three-qubit quantum gates.
Blind source separation by sparse decomposition
NASA Astrophysics Data System (ADS)
Zibulevsky, Michael; Pearlmutter, Barak A.
2000-04-01
The blind source separation problem is to extract the underlying source signals from a set of their linear mixtures, where the mixing matrix is unknown. This situation is common, eg in acoustics, radio, and medical signal processing. We exploit the property of the sources to have a sparse representation in a corresponding signal dictionary. Such a dictionary may consist of wavelets, wavelet packets, etc., or be obtained by learning from a given family of signals. Starting from the maximum a posteriori framework, which is applicable to the case of more sources than mixtures, we derive a few other categories of objective functions, which provide faster and more robust computations, when there are an equal number of sources and mixtures. Our experiments with artificial signals and with musical sounds demonstrate significantly better separation than other known techniques.
NASA Astrophysics Data System (ADS)
Li, Jun; Song, Minghui; Peng, Yuanxi
2018-03-01
Current infrared and visible image fusion methods do not achieve adequate information extraction, i.e., they cannot extract the target information from infrared images while retaining the background information from visible images. Moreover, most of them have high complexity and are time-consuming. This paper proposes an efficient image fusion framework for infrared and visible images on the basis of robust principal component analysis (RPCA) and compressed sensing (CS). The novel framework consists of three phases. First, RPCA decomposition is applied to the infrared and visible images to obtain their sparse and low-rank components, which represent the salient features and background information of the images, respectively. Second, the sparse and low-rank coefficients are fused by different strategies. On the one hand, the measurements of the sparse coefficients are obtained by the random Gaussian matrix, and they are then fused by the standard deviation (SD) based fusion rule. Next, the fused sparse component is obtained by reconstructing the result of the fused measurement using the fast continuous linearized augmented Lagrangian algorithm (FCLALM). On the other hand, the low-rank coefficients are fused using the max-absolute rule. Subsequently, the fused image is superposed by the fused sparse and low-rank components. For comparison, several popular fusion algorithms are tested experimentally. By comparing the fused results subjectively and objectively, we find that the proposed framework can extract the infrared targets while retaining the background information in the visible images. Thus, it exhibits state-of-the-art performance in terms of both fusion effects and timeliness.
Enforced Sparse Non-Negative Matrix Factorization
2016-01-23
documents to find interesting pieces of information. With limited resources, analysts often employ automated text - mining tools that highlight common...represented as an undirected bipartite graph. It has become a common method for generating topic models of text data because it is known to produce good results...model and the convergence rate of the underlying algorithm. I. Introduction A common analyst challenge is searching through large quantities of text
Methods for design and evaluation of integrated hardware-software systems for concurrent computation
NASA Technical Reports Server (NTRS)
Pratt, T. W.
1985-01-01
Research activities and publications are briefly summarized. The major tasks reviewed are: (1) VAX implementation of the PISCES parallel programming environment; (2) Apollo workstation network implementation of the PISCES environment; (3) FLEX implementation of the PISCES environment; (4) sparse matrix iterative solver in PSICES Fortran; (5) image processing application of PISCES; and (6) a formal model of concurrent computation being developed.
Research on Synthesis of Concurrent Computing Systems.
1982-09-01
20 1.5.1 An Informal Description of the Techniques ....... ..................... 20 1.5 2 Formal Definitions of Aggregation and Virtualisation ...sparsely interconnected networks . We have also developed techniques to create Kung’s systolic array parallel structure from a specification of matrix...resufts of the computation of that element. For example, if A,j is computed using a single enumeration, then virtualisation would produce a three
NASA Astrophysics Data System (ADS)
Schrodt, Franziska; Shan, Hanhuai; Fazayeli, Farideh; Karpatne, Anuj; Kattge, Jens; Banerjee, Arindam; Reichstein, Markus; Reich, Peter
2013-04-01
With the advent of remotely sensed data and coordinated efforts to create global databases, the ecological community has progressively become more data-intensive. However, in contrast to other disciplines, statistical ways of handling these large data sets, especially the gaps which are inherent to them, are lacking. Widely used theoretical approaches, for example model averaging based on Akaike's information criterion (AIC), are sensitive to missing values. Yet, the most common way of handling sparse matrices - the deletion of cases with missing data (complete case analysis) - is known to severely reduce statistical power as well as inducing biased parameter estimates. In order to address these issues, we present novel approaches to gap filling in large ecological data sets using matrix factorization techniques. Factorization based matrix completion was developed in a recommender system context and has since been widely used to impute missing data in fields outside the ecological community. Here, we evaluate the effectiveness of probabilistic matrix factorization techniques for imputing missing data in ecological matrices using two imputation techniques. Hierarchical Probabilistic Matrix Factorization (HPMF) effectively incorporates hierarchical phylogenetic information (phylogenetic group, family, genus, species and individual plant) into the trait imputation. Advanced Hierarchical Probabilistic Matrix Factorization (aHPMF) on the other hand includes climate and soil information into the matrix factorization by regressing the environmental variables against residuals of the HPMF. One unique opportunity opened up by aHPMF is out-of-sample prediction, where traits can be predicted for specific species at locations different to those sampled in the past. This has potentially far-reaching consequences for the study of global-scale plant functional trait patterns. We test the accuracy and effectiveness of HPMF and aHPMF in filling sparse matrices, using the TRY database of plant functional traits (http://www.try-db.org). TRY is one of the largest global compilations of plant trait databases (750 traits of 1 million plants), encompassing data on morphological, anatomical, biochemical, phenological and physiological features of plants. However, despite of unprecedented coverage, the TRY database is still very sparse, severely limiting joint trait analyses. Plant traits are the key to understanding how plants as primary producers adjust to changes in environmental conditions and in turn influence them. Forming the basis for Dynamic Global Vegetation Models (DGVMs), plant traits are also fundamental in global change studies for predicting future ecosystem changes. It is thus imperative that missing data is imputed in as accurate and precise a way as possible. In this study, we show the advantages and disadvantages of applying probabilistic matrix factorization techniques in incorporating hierarchical and environmental information for the prediction of missing plant traits as compared to conventional imputation techniques such as the complete case and mean approaches. We will discuss the implications of using gap-filled data for global-scale studies of plant functional trait - environment relationship as opposed to the above-mentioned conventional techniques, using examples of out-of-sample predictions of foliar Nitrogen across several species' ranges and biomes.
NASA Technical Reports Server (NTRS)
Maliassov, Serguei
1996-01-01
In this paper an algebraic substructuring preconditioner is considered for nonconforming finite element approximations of second order elliptic problems in 3D domains with a piecewise constant diffusion coefficient. Using a substructuring idea and a block Gauss elimination, part of the unknowns is eliminated and the Schur complement obtained is preconditioned by a spectrally equivalent very sparse matrix. In the case of quasiuniform tetrahedral mesh an appropriate algebraic multigrid solver can be used to solve the problem with this matrix. Explicit estimates of condition numbers and implementation algorithms are established for the constructed preconditioner. It is shown that the condition number of the preconditioned matrix does not depend on either the mesh step size or the jump of the coefficient. Finally, numerical experiments are presented to illustrate the theory being developed.
Zhang, Ying-Ying; Yang, Cai; Zhang, Ping
2017-08-01
In this paper, we present a novel bottom-up saliency detection algorithm from the perspective of covariance matrices on a Riemannian manifold. Each superpixel is described by a region covariance matrix on Riemannian Manifolds. We carry out a two-stage sparse coding scheme via Log-Euclidean kernels to extract salient objects efficiently. In the first stage, given background dictionary on image borders, sparse coding of each region covariance via Log-Euclidean kernels is performed. The reconstruction error on the background dictionary is regarded as the initial saliency of each superpixel. In the second stage, an improvement of the initial result is achieved by calculating reconstruction errors of the superpixels on foreground dictionary, which is extracted from the first stage saliency map. The sparse coding in the second stage is similar to the first stage, but is able to effectively highlight the salient objects uniformly from the background. Finally, three post-processing methods-highlight-inhibition function, context-based saliency weighting, and the graph cut-are adopted to further refine the saliency map. Experiments on four public benchmark datasets show that the proposed algorithm outperforms the state-of-the-art methods in terms of precision, recall and mean absolute error, and demonstrate the robustness and efficiency of the proposed method. Copyright © 2017 Elsevier Ltd. All rights reserved.
Spectrum recovery method based on sparse representation for segmented multi-Gaussian model
NASA Astrophysics Data System (ADS)
Teng, Yidan; Zhang, Ye; Ti, Chunli; Su, Nan
2016-09-01
Hyperspectral images can realize crackajack features discriminability for supplying diagnostic characteristics with high spectral resolution. However, various degradations may generate negative influence on the spectral information, including water absorption, bands-continuous noise. On the other hand, the huge data volume and strong redundancy among spectrums produced intense demand on compressing HSIs in spectral dimension, which also leads to the loss of spectral information. The reconstruction of spectral diagnostic characteristics has irreplaceable significance for the subsequent application of HSIs. This paper introduces a spectrum restoration method for HSIs making use of segmented multi-Gaussian model (SMGM) and sparse representation. A SMGM is established to indicating the unsymmetrical spectral absorption and reflection characteristics, meanwhile, its rationality and sparse property are discussed. With the application of compressed sensing (CS) theory, we implement sparse representation to the SMGM. Then, the degraded and compressed HSIs can be reconstructed utilizing the uninjured or key bands. Finally, we take low rank matrix recovery (LRMR) algorithm for post processing to restore the spatial details. The proposed method was tested on the spectral data captured on the ground with artificial water absorption condition and an AVIRIS-HSI data set. The experimental results in terms of qualitative and quantitative assessments demonstrate that the effectiveness on recovering the spectral information from both degradations and loss compression. The spectral diagnostic characteristics and the spatial geometry feature are well preserved.
An analysis of spectral envelope-reduction via quadratic assignment problems
NASA Technical Reports Server (NTRS)
George, Alan; Pothen, Alex
1994-01-01
A new spectral algorithm for reordering a sparse symmetric matrix to reduce its envelope size was described. The ordering is computed by associating a Laplacian matrix with the given matrix and then sorting the components of a specified eigenvector of the Laplacian. In this paper, we provide an analysis of the spectral envelope reduction algorithm. We described related 1- and 2-sum problems; the former is related to the envelope size, while the latter is related to an upper bound on the work involved in an envelope Cholesky factorization scheme. We formulate the latter two problems as quadratic assignment problems, and then study the 2-sum problem in more detail. We obtain lower bounds on the 2-sum by considering a projected quadratic assignment problem, and then show that finding a permutation matrix closest to an orthogonal matrix attaining one of the lower bounds justifies the spectral envelope reduction algorithm. The lower bound on the 2-sum is seen to be tight for reasonably 'uniform' finite element meshes. We also obtain asymptotically tight lower bounds for the envelope size for certain classes of meshes.
Unifying model for random matrix theory in arbitrary space dimensions
NASA Astrophysics Data System (ADS)
Cicuta, Giovanni M.; Krausser, Johannes; Milkus, Rico; Zaccone, Alessio
2018-03-01
A sparse random block matrix model suggested by the Hessian matrix used in the study of elastic vibrational modes of amorphous solids is presented and analyzed. By evaluating some moments, benchmarked against numerics, differences in the eigenvalue spectrum of this model in different limits of space dimension d , and for arbitrary values of the lattice coordination number Z , are shown and discussed. As a function of these two parameters (and their ratio Z /d ), the most studied models in random matrix theory (Erdos-Renyi graphs, effective medium, and replicas) can be reproduced in the various limits of block dimensionality d . Remarkably, the Marchenko-Pastur spectral density (which is recovered by replica calculations for the Laplacian matrix) is reproduced exactly in the limit of infinite size of the blocks, or d →∞ , which clarifies the physical meaning of space dimension in these models. We feel that the approximate results for d =3 provided by our method may have many potential applications in the future, from the vibrational spectrum of glasses and elastic networks to wave localization, disordered conductors, random resistor networks, and random walks.
DNA melting profiles from a matrix method.
Poland, Douglas
2004-02-05
In this article we give a new method for the calculation of DNA melting profiles. Based on the matrix formulation of the DNA partition function, the method relies for its efficiency on the fact that the required matrices are very sparse, essentially reducing matrix multiplication to vector multiplication and thus making the computer time required to treat a DNA molecule containing N base pairs proportional to N(2). A key ingredient in the method is the result that multiplication by the inverse matrix can also be reduced to vector multiplication. The task of calculating the melting profile for the entire genome is further reduced by treating regions of the molecule between helix-plateaus, thus breaking the molecule up into independent parts that can each be treated individually. The method is easily modified to incorporate changes in the assignment of statistical weights to the different structural features of DNA. We illustrate the method using the genome of Haemophilus influenzae. Copyright 2003 Wiley Periodicals, Inc.
A General Sparse Tensor Framework for Electronic Structure Theory
Manzer, Samuel; Epifanovsky, Evgeny; Krylov, Anna I.; ...
2017-01-24
Linear-scaling algorithms must be developed in order to extend the domain of applicability of electronic structure theory to molecules of any desired size. But, the increasing complexity of modern linear-scaling methods makes code development and maintenance a significant challenge. A major contributor to this difficulty is the lack of robust software abstractions for handling block-sparse tensor operations. We therefore report the development of a highly efficient symbolic block-sparse tensor library in order to provide access to high-level software constructs to treat such problems. Our implementation supports arbitrary multi-dimensional sparsity in all input and output tensors. We then avoid cumbersome machine-generatedmore » code by implementing all functionality as a high-level symbolic C++ language library and demonstrate that our implementation attains very high performance for linear-scaling sparse tensor contractions.« less
Kernel Recursive Least-Squares Temporal Difference Algorithms with Sparsification and Regularization
Zhu, Qingxin; Niu, Xinzheng
2016-01-01
By combining with sparse kernel methods, least-squares temporal difference (LSTD) algorithms can construct the feature dictionary automatically and obtain a better generalization ability. However, the previous kernel-based LSTD algorithms do not consider regularization and their sparsification processes are batch or offline, which hinder their widespread applications in online learning problems. In this paper, we combine the following five techniques and propose two novel kernel recursive LSTD algorithms: (i) online sparsification, which can cope with unknown state regions and be used for online learning, (ii) L 2 and L 1 regularization, which can avoid overfitting and eliminate the influence of noise, (iii) recursive least squares, which can eliminate matrix-inversion operations and reduce computational complexity, (iv) a sliding-window approach, which can avoid caching all history samples and reduce the computational cost, and (v) the fixed-point subiteration and online pruning, which can make L 1 regularization easy to implement. Finally, simulation results on two 50-state chain problems demonstrate the effectiveness of our algorithms. PMID:27436996
Zhang, Chunyuan; Zhu, Qingxin; Niu, Xinzheng
2016-01-01
By combining with sparse kernel methods, least-squares temporal difference (LSTD) algorithms can construct the feature dictionary automatically and obtain a better generalization ability. However, the previous kernel-based LSTD algorithms do not consider regularization and their sparsification processes are batch or offline, which hinder their widespread applications in online learning problems. In this paper, we combine the following five techniques and propose two novel kernel recursive LSTD algorithms: (i) online sparsification, which can cope with unknown state regions and be used for online learning, (ii) L 2 and L 1 regularization, which can avoid overfitting and eliminate the influence of noise, (iii) recursive least squares, which can eliminate matrix-inversion operations and reduce computational complexity, (iv) a sliding-window approach, which can avoid caching all history samples and reduce the computational cost, and (v) the fixed-point subiteration and online pruning, which can make L 1 regularization easy to implement. Finally, simulation results on two 50-state chain problems demonstrate the effectiveness of our algorithms.
MGMRES: A generalization of GMRES for solving large sparse nonsymmetric linear systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Young, D.M.; Chen, J.Y.
1994-12-31
The authors are concerned with the solution of the linear system (1): Au = b, where A is a real square nonsingular matrix which is large, sparse and non-symmetric. They consider the use of Krylov subspace methods. They first choose an initial approximation u{sup (0)} to the solution {bar u} = A{sup {minus}1}B of (1). They also choose an auxiliary matrix Z which is nonsingular. For n = 1,2,{hor_ellipsis} they determine u{sup (n)} such that u{sup (n)} {minus} u{sup (0)}{epsilon}K{sub n}(r{sup (0)},A) where K{sub n}(r{sup (0)},A) is the (Krylov) subspace spanned by the Krylov vectors r{sup (0)}, Ar{sup (0)}, {hor_ellipsis},more » A{sup n{minus}1}r{sup 0} and where r{sup (0)} = b{minus}Au{sup (0)}. If ZA is SPD they also require that (u{sup (n)}{minus}{bar u}, ZA(u{sup (n)}{minus}{bar u})) be minimized. If, on the other hand, ZA is not SPD, then they require that the Galerkin condition, (Zr{sup n}, v) = 0, be satisfied for all v{epsilon}K{sub n}(r{sup (0)}, A) where r{sup n} = b{minus}Au{sup (n)}. In this paper the authors consider a generalization of GMRES. This generalized method, which they refer to as `MGMRES`, is very similar to GMRES except that they let Z = A{sup T}Y where Y is a nonsingular matrix which is symmetric by not necessarily SPD.« less
NASA Astrophysics Data System (ADS)
Chan, Garnet Kin-Lic; Keselman, Anna; Nakatani, Naoki; Li, Zhendong; White, Steven R.
2016-07-01
Current descriptions of the ab initio density matrix renormalization group (DMRG) algorithm use two superficially different languages: an older language of the renormalization group and renormalized operators, and a more recent language of matrix product states and matrix product operators. The same algorithm can appear dramatically different when written in the two different vocabularies. In this work, we carefully describe the translation between the two languages in several contexts. First, we describe how to efficiently implement the ab initio DMRG sweep using a matrix product operator based code, and the equivalence to the original renormalized operator implementation. Next we describe how to implement the general matrix product operator/matrix product state algebra within a pure renormalized operator-based DMRG code. Finally, we discuss two improvements of the ab initio DMRG sweep algorithm motivated by matrix product operator language: Hamiltonian compression, and a sum over operators representation that allows for perfect computational parallelism. The connections and correspondences described here serve to link the future developments with the past and are important in the efficient implementation of continuing advances in ab initio DMRG and related algorithms.
Chan, Garnet Kin-Lic; Keselman, Anna; Nakatani, Naoki; Li, Zhendong; White, Steven R
2016-07-07
Current descriptions of the ab initio density matrix renormalization group (DMRG) algorithm use two superficially different languages: an older language of the renormalization group and renormalized operators, and a more recent language of matrix product states and matrix product operators. The same algorithm can appear dramatically different when written in the two different vocabularies. In this work, we carefully describe the translation between the two languages in several contexts. First, we describe how to efficiently implement the ab initio DMRG sweep using a matrix product operator based code, and the equivalence to the original renormalized operator implementation. Next we describe how to implement the general matrix product operator/matrix product state algebra within a pure renormalized operator-based DMRG code. Finally, we discuss two improvements of the ab initio DMRG sweep algorithm motivated by matrix product operator language: Hamiltonian compression, and a sum over operators representation that allows for perfect computational parallelism. The connections and correspondences described here serve to link the future developments with the past and are important in the efficient implementation of continuing advances in ab initio DMRG and related algorithms.
Sparse distributed memory overview
NASA Technical Reports Server (NTRS)
Raugh, Mike
1990-01-01
The Sparse Distributed Memory (SDM) project is investigating the theory and applications of massively parallel computing architecture, called sparse distributed memory, that will support the storage and retrieval of sensory and motor patterns characteristic of autonomous systems. The immediate objectives of the project are centered in studies of the memory itself and in the use of the memory to solve problems in speech, vision, and robotics. Investigation of methods for encoding sensory data is an important part of the research. Examples of NASA missions that may benefit from this work are Space Station, planetary rovers, and solar exploration. Sparse distributed memory offers promising technology for systems that must learn through experience and be capable of adapting to new circumstances, and for operating any large complex system requiring automatic monitoring and control. Sparse distributed memory is a massively parallel architecture motivated by efforts to understand how the human brain works. Sparse distributed memory is an associative memory, able to retrieve information from cues that only partially match patterns stored in the memory. It is able to store long temporal sequences derived from the behavior of a complex system, such as progressive records of the system's sensory data and correlated records of the system's motor controls.
Numerical methods in Markov chain modeling
NASA Technical Reports Server (NTRS)
Philippe, Bernard; Saad, Youcef; Stewart, William J.
1989-01-01
Several methods for computing stationary probability distributions of Markov chains are described and compared. The main linear algebra problem consists of computing an eigenvector of a sparse, usually nonsymmetric, matrix associated with a known eigenvalue. It can also be cast as a problem of solving a homogeneous singular linear system. Several methods based on combinations of Krylov subspace techniques are presented. The performance of these methods on some realistic problems are compared.
A Stabilized Sparse-Matrix U-D Square-Root Implementation of a Large-State Extended Kalman Filter
NASA Technical Reports Server (NTRS)
Boggs, D.; Ghil, M.; Keppenne, C.
1995-01-01
The full nonlinear Kalman filter sequential algorithm is, in theory, well-suited to the four-dimensional data assimilation problem in large-scale atmospheric and oceanic problems. However, it was later discovered that this algorithm can be very sensitive to computer roundoff, and that results may cease to be meaningful as time advances. Implementations of a modified Kalman filter are given.
NASA Astrophysics Data System (ADS)
Imamura, Seigo; Ono, Kenji; Yokokawa, Mitsuo
2016-07-01
Ensemble computing, which is an instance of capacity computing, is an effective computing scenario for exascale parallel supercomputers. In ensemble computing, there are multiple linear systems associated with a common coefficient matrix. We improve the performance of iterative solvers for multiple vectors by solving them at the same time, that is, by solving for the product of the matrices. We implemented several iterative methods and compared their performance. The maximum performance on Sparc VIIIfx was 7.6 times higher than that of a naïve implementation. Finally, to deal with the different convergence processes of linear systems, we introduced a control method to eliminate the calculation of already converged vectors.
Recursive inverse factorization.
Rubensson, Emanuel H; Bock, Nicolas; Holmström, Erik; Niklasson, Anders M N
2008-03-14
A recursive algorithm for the inverse factorization S(-1)=ZZ(*) of Hermitian positive definite matrices S is proposed. The inverse factorization is based on iterative refinement [A.M.N. Niklasson, Phys. Rev. B 70, 193102 (2004)] combined with a recursive decomposition of S. As the computational kernel is matrix-matrix multiplication, the algorithm can be parallelized and the computational effort increases linearly with system size for systems with sufficiently sparse matrices. Recent advances in network theory are used to find appropriate recursive decompositions. We show that optimization of the so-called network modularity results in an improved partitioning compared to other approaches. In particular, when the recursive inverse factorization is applied to overlap matrices of irregularly structured three-dimensional molecules.
On the efficiency of a randomized mirror descent algorithm in online optimization problems
NASA Astrophysics Data System (ADS)
Gasnikov, A. V.; Nesterov, Yu. E.; Spokoiny, V. G.
2015-04-01
A randomized online version of the mirror descent method is proposed. It differs from the existing versions by the randomization method. Randomization is performed at the stage of the projection of a subgradient of the function being optimized onto the unit simplex rather than at the stage of the computation of a subgradient, which is common practice. As a result, a componentwise subgradient descent with a randomly chosen component is obtained, which admits an online interpretation. This observation, for example, has made it possible to uniformly interpret results on weighting expert decisions and propose the most efficient method for searching for an equilibrium in a zero-sum two-person matrix game with sparse matrix.
Partitioning sparse matrices with eigenvectors of graphs
NASA Technical Reports Server (NTRS)
Pothen, Alex; Simon, Horst D.; Liou, Kang-Pu
1990-01-01
The problem of computing a small vertex separator in a graph arises in the context of computing a good ordering for the parallel factorization of sparse, symmetric matrices. An algebraic approach for computing vertex separators is considered in this paper. It is shown that lower bounds on separator sizes can be obtained in terms of the eigenvalues of the Laplacian matrix associated with a graph. The Laplacian eigenvectors of grid graphs can be computed from Kronecker products involving the eigenvectors of path graphs, and these eigenvectors can be used to compute good separators in grid graphs. A heuristic algorithm is designed to compute a vertex separator in a general graph by first computing an edge separator in the graph from an eigenvector of the Laplacian matrix, and then using a maximum matching in a subgraph to compute the vertex separator. Results on the quality of the separators computed by the spectral algorithm are presented, and these are compared with separators obtained from other algorithms for computing separators. Finally, the time required to compute the Laplacian eigenvector is reported, and the accuracy with which the eigenvector must be computed to obtain good separators is considered. The spectral algorithm has the advantage that it can be implemented on a medium-size multiprocessor in a straightforward manner.
Comparison between sparsely distributed memory and Hopfield-type neural network models
NASA Technical Reports Server (NTRS)
Keeler, James D.
1986-01-01
The Sparsely Distributed Memory (SDM) model (Kanerva, 1984) is compared to Hopfield-type neural-network models. A mathematical framework for comparing the two is developed, and the capacity of each model is investigated. The capacity of the SDM can be increased independently of the dimension of the stored vectors, whereas the Hopfield capacity is limited to a fraction of this dimension. However, the total number of stored bits per matrix element is the same in the two models, as well as for extended models with higher order interactions. The models are also compared in their ability to store sequences of patterns. The SDM is extended to include time delays so that contextual information can be used to cover sequences. Finally, it is shown how a generalization of the SDM allows storage of correlated input pattern vectors.
Ma, Xu; Cheng, Yongmei; Hao, Shuai
2016-12-10
Automatic classification of terrain surfaces from an aerial image is essential for an autonomous unmanned aerial vehicle (UAV) landing at an unprepared site by using vision. Diverse terrain surfaces may show similar spectral properties due to the illumination and noise that easily cause poor classification performance. To address this issue, a multi-stage classification algorithm based on low-rank recovery and multi-feature fusion sparse representation is proposed. First, color moments and Gabor texture feature are extracted from training data and stacked as column vectors of a dictionary. Then we perform low-rank matrix recovery for the dictionary by using augmented Lagrange multipliers and construct a multi-stage terrain classifier. Experimental results on an aerial map database that we prepared verify the classification accuracy and robustness of the proposed method.
Joint seismic data denoising and interpolation with double-sparsity dictionary learning
NASA Astrophysics Data System (ADS)
Zhu, Lingchen; Liu, Entao; McClellan, James H.
2017-08-01
Seismic data quality is vital to geophysical applications, so that methods of data recovery, including denoising and interpolation, are common initial steps in the seismic data processing flow. We present a method to perform simultaneous interpolation and denoising, which is based on double-sparsity dictionary learning. This extends previous work that was for denoising only. The original double-sparsity dictionary learning algorithm is modified to track the traces with missing data by defining a masking operator that is integrated into the sparse representation of the dictionary. A weighted low-rank approximation algorithm is adopted to handle the dictionary updating as a sparse recovery optimization problem constrained by the masking operator. Compared to traditional sparse transforms with fixed dictionaries that lack the ability to adapt to complex data structures, the double-sparsity dictionary learning method learns the signal adaptively from selected patches of the corrupted seismic data, while preserving compact forward and inverse transform operators. Numerical experiments on synthetic seismic data indicate that this new method preserves more subtle features in the data set without introducing pseudo-Gibbs artifacts when compared to other directional multi-scale transform methods such as curvelets.
Parallel Preconditioning for CFD Problems on the CM-5
NASA Technical Reports Server (NTRS)
Simon, Horst D.; Kremenetsky, Mark D.; Richardson, John; Lasinski, T. A. (Technical Monitor)
1994-01-01
Up to today, preconditioning methods on massively parallel systems have faced a major difficulty. The most successful preconditioning methods in terms of accelerating the convergence of the iterative solver such as incomplete LU factorizations are notoriously difficult to implement on parallel machines for two reasons: (1) the actual computation of the preconditioner is not very floating-point intensive, but requires a large amount of unstructured communication, and (2) the application of the preconditioning matrix in the iteration phase (i.e. triangular solves) are difficult to parallelize because of the recursive nature of the computation. Here we present a new approach to preconditioning for very large, sparse, unsymmetric, linear systems, which avoids both difficulties. We explicitly compute an approximate inverse to our original matrix. This new preconditioning matrix can be applied most efficiently for iterative methods on massively parallel machines, since the preconditioning phase involves only a matrix-vector multiplication, with possibly a dense matrix. Furthermore the actual computation of the preconditioning matrix has natural parallelism. For a problem of size n, the preconditioning matrix can be computed by solving n independent small least squares problems. The algorithm and its implementation on the Connection Machine CM-5 are discussed in detail and supported by extensive timings obtained from real problem data.
Zhang, Guoqing; Sun, Huaijiang; Xia, Guiyu; Sun, Quansen
2016-07-07
Sparse representation based classification (SRC) has been developed and shown great potential for real-world application. Based on SRC, Yang et al. [10] devised a SRC steered discriminative projection (SRC-DP) method. However, as a linear algorithm, SRC-DP cannot handle the data with highly nonlinear distribution. Kernel sparse representation-based classifier (KSRC) is a non-linear extension of SRC and can remedy the drawback of SRC. KSRC requires the use of a predetermined kernel function and selection of the kernel function and its parameters is difficult. Recently, multiple kernel learning for SRC (MKL-SRC) [22] has been proposed to learn a kernel from a set of base kernels. However, MKL-SRC only considers the within-class reconstruction residual while ignoring the between-class relationship, when learning the kernel weights. In this paper, we propose a novel multiple kernel sparse representation-based classifier (MKSRC), and then we use it as a criterion to design a multiple kernel sparse representation based orthogonal discriminative projection method (MK-SR-ODP). The proposed algorithm aims at learning a projection matrix and a corresponding kernel from the given base kernels such that in the low dimension subspace the between-class reconstruction residual is maximized and the within-class reconstruction residual is minimized. Furthermore, to achieve a minimum overall loss by performing recognition in the learned low-dimensional subspace, we introduce cost information into the dimensionality reduction method. The solutions for the proposed method can be efficiently found based on trace ratio optimization method [33]. Extensive experimental results demonstrate the superiority of the proposed algorithm when compared with the state-of-the-art methods.
BCYCLIC: A parallel block tridiagonal matrix cyclic solver
NASA Astrophysics Data System (ADS)
Hirshman, S. P.; Perumalla, K. S.; Lynch, V. E.; Sanchez, R.
2010-09-01
A block tridiagonal matrix is factored with minimal fill-in using a cyclic reduction algorithm that is easily parallelized. Storage of the factored blocks allows the application of the inverse to multiple right-hand sides which may not be known at factorization time. Scalability with the number of block rows is achieved with cyclic reduction, while scalability with the block size is achieved using multithreaded routines (OpenMP, GotoBLAS) for block matrix manipulation. This dual scalability is a noteworthy feature of this new solver, as well as its ability to efficiently handle arbitrary (non-powers-of-2) block row and processor numbers. Comparison with a state-of-the art parallel sparse solver is presented. It is expected that this new solver will allow many physical applications to optimally use the parallel resources on current supercomputers. Example usage of the solver in magneto-hydrodynamic (MHD), three-dimensional equilibrium solvers for high-temperature fusion plasmas is cited.
Approximate equiangular tight frames for compressed sensing and CDMA applications
NASA Astrophysics Data System (ADS)
Tsiligianni, Evaggelia; Kondi, Lisimachos P.; Katsaggelos, Aggelos K.
2017-12-01
Performance guarantees for recovery algorithms employed in sparse representations, and compressed sensing highlights the importance of incoherence. Optimal bounds of incoherence are attained by equiangular unit norm tight frames (ETFs). Although ETFs are important in many applications, they do not exist for all dimensions, while their construction has been proven extremely difficult. In this paper, we construct frames that are close to ETFs. According to results from frame and graph theory, the existence of an ETF depends on the existence of its signature matrix, that is, a symmetric matrix with certain structure and spectrum consisting of two distinct eigenvalues. We view the construction of a signature matrix as an inverse eigenvalue problem and propose a method that produces frames of any dimensions that are close to ETFs. Due to the achieved equiangularity property, the so obtained frames can be employed as spreading sequences in synchronous code-division multiple access (s-CDMA) systems, besides compressed sensing.
NASA Astrophysics Data System (ADS)
Gerber, Florian; Mösinger, Kaspar; Furrer, Reinhard
2017-07-01
Software packages for spatial data often implement a hybrid approach of interpreted and compiled programming languages. The compiled parts are usually written in C, C++, or Fortran, and are efficient in terms of computational speed and memory usage. Conversely, the interpreted part serves as a convenient user-interface and calls the compiled code for computationally demanding operations. The price paid for the user friendliness of the interpreted component is-besides performance-the limited access to low level and optimized code. An example of such a restriction is the 64-bit vector support of the widely used statistical language R. On the R side, users do not need to change existing code and may not even notice the extension. On the other hand, interfacing 64-bit compiled code efficiently is challenging. Since many R packages for spatial data could benefit from 64-bit vectors, we investigate strategies to efficiently pass 64-bit vectors to compiled languages. More precisely, we show how to simply extend existing R packages using the foreign function interface to seamlessly support 64-bit vectors. This extension is shown with the sparse matrix algebra R package spam. The new capabilities are illustrated with an example of GIMMS NDVI3g data featuring a parametric modeling approach for a non-stationary covariance matrix.
Multimode waveguide speckle patterns for compressive sensing.
Valley, George C; Sefler, George A; Justin Shaw, T
2016-06-01
Compressive sensing (CS) of sparse gigahertz-band RF signals using microwave photonics may achieve better performances with smaller size, weight, and power than electronic CS or conventional Nyquist rate sampling. The critical element in a CS system is the device that produces the CS measurement matrix (MM). We show that passive speckle patterns in multimode waveguides potentially provide excellent MMs for CS. We measure and calculate the MM for a multimode fiber and perform simulations using this MM in a CS system. We show that the speckle MM exhibits the sharp phase transition and coherence properties needed for CS and that these properties are similar to those of a sub-Gaussian MM with the same mean and standard deviation. We calculate the MM for a multimode planar waveguide and find dimensions of the planar guide that give a speckle MM with a performance similar to that of the multimode fiber. The CS simulations show that all measured and calculated speckle MMs exhibit a robust performance with equal amplitude signals that are sparse in time, in frequency, and in wavelets (Haar wavelet transform). The planar waveguide results indicate a path to a microwave photonic integrated circuit for measuring sparse gigahertz-band RF signals using CS.
Segmentation of High Angular Resolution Diffusion MRI using Sparse Riemannian Manifold Clustering
Wright, Margaret J.; Thompson, Paul M.; Vidal, René
2015-01-01
We address the problem of segmenting high angular resolution diffusion imaging (HARDI) data into multiple regions (or fiber tracts) with distinct diffusion properties. We use the orientation distribution function (ODF) to represent HARDI data and cast the problem as a clustering problem in the space of ODFs. Our approach integrates tools from sparse representation theory and Riemannian geometry into a graph theoretic segmentation framework. By exploiting the Riemannian properties of the space of ODFs, we learn a sparse representation for each ODF and infer the segmentation by applying spectral clustering to a similarity matrix built from these representations. In cases where regions with similar (resp. distinct) diffusion properties belong to different (resp. same) fiber tracts, we obtain the segmentation by incorporating spatial and user-specified pairwise relationships into the formulation. Experiments on synthetic data evaluate the sensitivity of our method to image noise and the presence of complex fiber configurations, and show its superior performance compared to alternative segmentation methods. Experiments on phantom and real data demonstrate the accuracy of the proposed method in segmenting simulated fibers, as well as white matter fiber tracts of clinical importance in the human brain. PMID:24108748
An embedded system for face classification in infrared video using sparse representation
NASA Astrophysics Data System (ADS)
Saavedra M., Antonio; Pezoa, Jorge E.; Zarkesh-Ha, Payman; Figueroa, Miguel
2017-09-01
We propose a platform for robust face recognition in Infrared (IR) images using Compressive Sensing (CS). In line with CS theory, the classification problem is solved using a sparse representation framework, where test images are modeled by means of a linear combination of the training set. Because the training set constitutes an over-complete dictionary, we identify new images by finding their sparsest representation based on the training set, using standard l1-minimization algorithms. Unlike conventional face-recognition algorithms, we feature extraction is performed using random projections with a precomputed binary matrix, as proposed in the CS literature. This random sampling reduces the effects of noise and occlusions such as facial hair, eyeglasses, and disguises, which are notoriously challenging in IR images. Thus, the performance of our framework is robust to these noise and occlusion factors, achieving an average accuracy of approximately 90% when the UCHThermalFace database is used for training and testing purposes. We implemented our framework on a high-performance embedded digital system, where the computation of the sparse representation of IR images was performed by a dedicated hardware using a deeply pipelined architecture on an Field-Programmable Gate Array (FPGA).
NASA Astrophysics Data System (ADS)
Zhao, Jin; Han-Ming, Zhang; Bin, Yan; Lei, Li; Lin-Yuan, Wang; Ai-Long, Cai
2016-03-01
Sparse-view x-ray computed tomography (CT) imaging is an interesting topic in CT field and can efficiently decrease radiation dose. Compared with spatial reconstruction, a Fourier-based algorithm has advantages in reconstruction speed and memory usage. A novel Fourier-based iterative reconstruction technique that utilizes non-uniform fast Fourier transform (NUFFT) is presented in this work along with advanced total variation (TV) regularization for a fan sparse-view CT. The proposition of a selective matrix contributes to improve reconstruction quality. The new method employs the NUFFT and its adjoin to iterate back and forth between the Fourier and image space. The performance of the proposed algorithm is demonstrated through a series of digital simulations and experimental phantom studies. Results of the proposed algorithm are compared with those of existing TV-regularized techniques based on compressed sensing method, as well as basic algebraic reconstruction technique. Compared with the existing TV-regularized techniques, the proposed Fourier-based technique significantly improves convergence rate and reduces memory allocation, respectively. Projected supported by the National High Technology Research and Development Program of China (Grant No. 2012AA011603) and the National Natural Science Foundation of China (Grant No. 61372172).
Computational efficiency improvements for image colorization
NASA Astrophysics Data System (ADS)
Yu, Chao; Sharma, Gaurav; Aly, Hussein
2013-03-01
We propose an efficient algorithm for colorization of greyscale images. As in prior work, colorization is posed as an optimization problem: a user specifies the color for a few scribbles drawn on the greyscale image and the color image is obtained by propagating color information from the scribbles to surrounding regions, while maximizing the local smoothness of colors. In this formulation, colorization is obtained by solving a large sparse linear system, which normally requires substantial computation and memory resources. Our algorithm improves the computational performance through three innovations over prior colorization implementations. First, the linear system is solved iteratively without explicitly constructing the sparse matrix, which significantly reduces the required memory. Second, we formulate each iteration in terms of integral images obtained by dynamic programming, reducing repetitive computation. Third, we use a coarseto- fine framework, where a lower resolution subsampled image is first colorized and this low resolution color image is upsampled to initialize the colorization process for the fine level. The improvements we develop provide significant speedup and memory savings compared to the conventional approach of solving the linear system directly using off-the-shelf sparse solvers, and allow us to colorize images with typical sizes encountered in realistic applications on typical commodity computing platforms.
NASA Astrophysics Data System (ADS)
Orović, Irena; Stanković, Srdjan; Amin, Moeness
2013-05-01
A modified robust two-dimensional compressive sensing algorithm for reconstruction of sparse time-frequency representation (TFR) is proposed. The ambiguity function domain is assumed to be the domain of observations. The two-dimensional Fourier bases are used to linearly relate the observations to the sparse TFR, in lieu of the Wigner distribution. We assume that a set of available samples in the ambiguity domain is heavily corrupted by an impulsive type of noise. Consequently, the problem of sparse TFR reconstruction cannot be tackled using standard compressive sensing optimization algorithms. We introduce a two-dimensional L-statistics based modification into the transform domain representation. It provides suitable initial conditions that will produce efficient convergence of the reconstruction algorithm. This approach applies sorting and weighting operations to discard an expected amount of samples corrupted by noise. The remaining samples serve as observations used in sparse reconstruction of the time-frequency signal representation. The efficiency of the proposed approach is demonstrated on numerical examples that comprise both cases of monocomponent and multicomponent signals.
Track monitoring from the dynamic response of a passing train: A sparse approach
NASA Astrophysics Data System (ADS)
Lederman, George; Chen, Siheng; Garrett, James H.; Kovačević, Jelena; Noh, Hae Young; Bielak, Jacobo
2017-06-01
Collecting vibration data from revenue service trains could be a low-cost way to more frequently monitor railroad tracks, yet operational variability makes robust analysis a challenge. We propose a novel analysis technique for track monitoring that exploits the sparsity inherent in train-vibration data. This sparsity is based on the observation that large vertical train vibrations typically involve the excitation of the train's fundamental mode due to track joints, switchgear, or other discrete hardware. Rather than try to model the entire rail profile, in this study we examine a sparse approach to solving an inverse problem where (1) the roughness is constrained to a discrete and limited set of "bumps"; and (2) the train system is idealized as a simple damped oscillator that models the train's vibration in the fundamental mode. We use an expectation maximization (EM) approach to iteratively solve for the track profile and the train system properties, using orthogonal matching pursuit (OMP) to find the sparse approximation within each step. By enforcing sparsity, the inverse problem is well posed and the train's position can be found relative to the sparse bumps, thus reducing the uncertainty in the GPS data. We validate the sparse approach on two sections of track monitored from an operational train over a 16 month period of time, one where track changes did not occur during this period and another where changes did occur. We show that this approach can not only detect when track changes occur, but also offers insight into the type of such changes.
14 CFR 91.305 - Flight test areas.
Code of Federal Regulations, 2012 CFR
2012-01-01
... 14 Aeronautics and Space 2 2012-01-01 2012-01-01 false Flight test areas. 91.305 Section 91.305... AND GENERAL OPERATING RULES GENERAL OPERATING AND FLIGHT RULES Special Flight Operations § 91.305 Flight test areas. No person may flight test an aircraft except over open water, or sparsely populated...
14 CFR 91.305 - Flight test areas.
Code of Federal Regulations, 2013 CFR
2013-01-01
... 14 Aeronautics and Space 2 2013-01-01 2013-01-01 false Flight test areas. 91.305 Section 91.305... AND GENERAL OPERATING RULES GENERAL OPERATING AND FLIGHT RULES Special Flight Operations § 91.305 Flight test areas. No person may flight test an aircraft except over open water, or sparsely populated...
14 CFR 91.305 - Flight test areas.
Code of Federal Regulations, 2010 CFR
2010-01-01
... 14 Aeronautics and Space 2 2010-01-01 2010-01-01 false Flight test areas. 91.305 Section 91.305... AND GENERAL OPERATING RULES GENERAL OPERATING AND FLIGHT RULES Special Flight Operations § 91.305 Flight test areas. No person may flight test an aircraft except over open water, or sparsely populated...
NASA Technical Reports Server (NTRS)
Jones, H. W.; Hein, D. N.; Knauer, S. C.
1978-01-01
A general class of even/odd transforms is presented that includes the Karhunen-Loeve transform, the discrete cosine transform, the Walsh-Hadamard transform, and other familiar transforms. The more complex even/odd transforms can be computed by combining a simpler even/odd transform with a sparse matrix multiplication. A theoretical performance measure is computed for some even/odd transforms, and two image compression experiments are reported.
2013-06-16
Science Dept., University of California, Irvine, USA 92697. Email : a.anandkumar@uci.edu,mjanzami@uci.edu. Daniel Hsu and Sham Kakade are with...Microsoft Research New England, 1 Memorial Drive, Cambridge, MA 02142. Email : dahsu@microsoft.com, skakade@microsoft.com 1 a latent space dimensionality...Sparse coding for multitask and transfer learning. ArxXiv preprint, abs/1209.0738, 2012. [34] G.H. Golub and C.F. Van Loan. Matrix Computations. The
NASA Astrophysics Data System (ADS)
Wang, H. T.; Chen, T. T.; Yan, C.; Pan, H.
2018-05-01
For App recommended areas of mobile phone software, made while using conduct App application recommended combined weighted Slope One algorithm collaborative filtering algorithm items based on further improvement of the traditional collaborative filtering algorithm in cold start, data matrix sparseness and other issues, will recommend Spark stasis parallel algorithm platform, the introduction of real-time streaming streaming real-time computing framework to improve real-time software applications recommended.
Eichenberger, Alexandre E; Gschwind, Michael K; Gunnels, John A
2013-11-05
Mechanisms for performing matrix multiplication operations with data pre-conditioning in a high performance computing architecture are provided. A vector load operation is performed to load a first vector operand of the matrix multiplication operation to a first target vector register. A load and splat operation is performed to load an element of a second vector operand and replicating the element to each of a plurality of elements of a second target vector register. A multiply add operation is performed on elements of the first target vector register and elements of the second target vector register to generate a partial product of the matrix multiplication operation. The partial product of the matrix multiplication operation is accumulated with other partial products of the matrix multiplication operation.
Sparse Bayesian learning for DOA estimation with mutual coupling.
Dai, Jisheng; Hu, Nan; Xu, Weichao; Chang, Chunqi
2015-10-16
Sparse Bayesian learning (SBL) has given renewed interest to the problem of direction-of-arrival (DOA) estimation. It is generally assumed that the measurement matrix in SBL is precisely known. Unfortunately, this assumption may be invalid in practice due to the imperfect manifold caused by unknown or misspecified mutual coupling. This paper describes a modified SBL method for joint estimation of DOAs and mutual coupling coefficients with uniform linear arrays (ULAs). Unlike the existing method that only uses stationary priors, our new approach utilizes a hierarchical form of the Student t prior to enforce the sparsity of the unknown signal more heavily. We also provide a distinct Bayesian inference for the expectation-maximization (EM) algorithm, which can update the mutual coupling coefficients more efficiently. Another difference is that our method uses an additional singular value decomposition (SVD) to reduce the computational complexity of the signal reconstruction process and the sensitivity to the measurement noise.
Efficient spares matrix multiplication scheme for the CYBER 203
NASA Technical Reports Server (NTRS)
Lambiotte, J. J., Jr.
1984-01-01
This work has been directed toward the development of an efficient algorithm for performing this computation on the CYBER-203. The desire to provide software which gives the user the choice between the often conflicting goals of minimizing central processing (CPU) time or storage requirements has led to a diagonal-based algorithm in which one of three types of storage is selected for each diagonal. For each storage type, an initialization sub-routine estimates the CPU and storage requirements based upon results from previously performed numerical experimentation. These requirements are adjusted by weights provided by the user which reflect the relative importance the user places on the resources. The three storage types employed were chosen to be efficient on the CYBER-203 for diagonals which are sparse, moderately sparse, or dense; however, for many densities, no diagonal type is most efficient with respect to both resource requirements. The user-supplied weights dictate the choice.
Improved collaborative filtering recommendation algorithm of similarity measure
NASA Astrophysics Data System (ADS)
Zhang, Baofu; Yuan, Baoping
2017-05-01
The Collaborative filtering recommendation algorithm is one of the most widely used recommendation algorithm in personalized recommender systems. The key is to find the nearest neighbor set of the active user by using similarity measure. However, the methods of traditional similarity measure mainly focus on the similarity of user common rating items, but ignore the relationship between the user common rating items and all items the user rates. And because rating matrix is very sparse, traditional collaborative filtering recommendation algorithm is not high efficiency. In order to obtain better accuracy, based on the consideration of common preference between users, the difference of rating scale and score of common items, this paper presents an improved similarity measure method, and based on this method, a collaborative filtering recommendation algorithm based on similarity improvement is proposed. Experimental results show that the algorithm can effectively improve the quality of recommendation, thus alleviate the impact of data sparseness.
An Online Dictionary Learning-Based Compressive Data Gathering Algorithm in Wireless Sensor Networks
Wang, Donghao; Wan, Jiangwen; Chen, Junying; Zhang, Qiang
2016-01-01
To adapt to sense signals of enormous diversities and dynamics, and to decrease the reconstruction errors caused by ambient noise, a novel online dictionary learning method-based compressive data gathering (ODL-CDG) algorithm is proposed. The proposed dictionary is learned from a two-stage iterative procedure, alternately changing between a sparse coding step and a dictionary update step. The self-coherence of the learned dictionary is introduced as a penalty term during the dictionary update procedure. The dictionary is also constrained with sparse structure. It’s theoretically demonstrated that the sensing matrix satisfies the restricted isometry property (RIP) with high probability. In addition, the lower bound of necessary number of measurements for compressive sensing (CS) reconstruction is given. Simulation results show that the proposed ODL-CDG algorithm can enhance the recovery accuracy in the presence of noise, and reduce the energy consumption in comparison with other dictionary based data gathering methods. PMID:27669250
Wang, Donghao; Wan, Jiangwen; Chen, Junying; Zhang, Qiang
2016-09-22
To adapt to sense signals of enormous diversities and dynamics, and to decrease the reconstruction errors caused by ambient noise, a novel online dictionary learning method-based compressive data gathering (ODL-CDG) algorithm is proposed. The proposed dictionary is learned from a two-stage iterative procedure, alternately changing between a sparse coding step and a dictionary update step. The self-coherence of the learned dictionary is introduced as a penalty term during the dictionary update procedure. The dictionary is also constrained with sparse structure. It's theoretically demonstrated that the sensing matrix satisfies the restricted isometry property (RIP) with high probability. In addition, the lower bound of necessary number of measurements for compressive sensing (CS) reconstruction is given. Simulation results show that the proposed ODL-CDG algorithm can enhance the recovery accuracy in the presence of noise, and reduce the energy consumption in comparison with other dictionary based data gathering methods.
Mafusire, Cosmas; Krüger, Tjaart P J
2018-06-01
The concept of orthonormal vector circle polynomials is revisited by deriving a set from the Cartesian gradient of Zernike polynomials in a unit circle using a matrix-based approach. The heart of this model is a closed-form matrix equation of the gradient of Zernike circle polynomials expressed as a linear combination of lower-order Zernike circle polynomials related through a gradient matrix. This is a sparse matrix whose elements are two-dimensional standard basis transverse Euclidean vectors. Using the outer product form of the Cholesky decomposition, the gradient matrix is used to calculate a new matrix, which we used to express the Cartesian gradient of the Zernike circle polynomials as a linear combination of orthonormal vector circle polynomials. Since this new matrix is singular, the orthonormal vector polynomials are recovered by reducing the matrix to its row echelon form using the Gauss-Jordan elimination method. We extend the model to derive orthonormal vector general polynomials, which are orthonormal in a general pupil by performing a similarity transformation on the gradient matrix to give its equivalent in the general pupil. The outer form of the Gram-Schmidt procedure and the Gauss-Jordan elimination method are then applied to the general pupil to generate the orthonormal vector general polynomials from the gradient of the orthonormal Zernike-based polynomials. The performance of the model is demonstrated with a simulated wavefront in a square pupil inscribed in a unit circle.
Compressed multi-block local binary pattern for object tracking
NASA Astrophysics Data System (ADS)
Li, Tianwen; Gao, Yun; Zhao, Lei; Zhou, Hao
2018-04-01
Both robustness and real-time are very important for the application of object tracking under a real environment. The focused trackers based on deep learning are difficult to satisfy with the real-time of tracking. Compressive sensing provided a technical support for real-time tracking. In this paper, an object can be tracked via a multi-block local binary pattern feature. The feature vector was extracted based on the multi-block local binary pattern feature, which was compressed via a sparse random Gaussian matrix as the measurement matrix. The experiments showed that the proposed tracker ran in real-time and outperformed the existed compressive trackers based on Haar-like feature on many challenging video sequences in terms of accuracy and robustness.
Sparse regularization for force identification using dictionaries
NASA Astrophysics Data System (ADS)
Qiao, Baijie; Zhang, Xingwu; Wang, Chenxi; Zhang, Hang; Chen, Xuefeng
2016-04-01
The classical function expansion method based on minimizing l2-norm of the response residual employs various basis functions to represent the unknown force. Its difficulty lies in determining the optimum number of basis functions. Considering the sparsity of force in the time domain or in other basis space, we develop a general sparse regularization method based on minimizing l1-norm of the coefficient vector of basis functions. The number of basis functions is adaptively determined by minimizing the number of nonzero components in the coefficient vector during the sparse regularization process. First, according to the profile of the unknown force, the dictionary composed of basis functions is determined. Second, a sparsity convex optimization model for force identification is constructed. Third, given the transfer function and the operational response, Sparse reconstruction by separable approximation (SpaRSA) is developed to solve the sparse regularization problem of force identification. Finally, experiments including identification of impact and harmonic forces are conducted on a cantilever thin plate structure to illustrate the effectiveness and applicability of SpaRSA. Besides the Dirac dictionary, other three sparse dictionaries including Db6 wavelets, Sym4 wavelets and cubic B-spline functions can also accurately identify both the single and double impact forces from highly noisy responses in a sparse representation frame. The discrete cosine functions can also successfully reconstruct the harmonic forces including the sinusoidal, square and triangular forces. Conversely, the traditional Tikhonov regularization method with the L-curve criterion fails to identify both the impact and harmonic forces in these cases.
Robust model-based 3d/3D fusion using sparse matching for minimally invasive surgery.
Neumann, Dominik; Grbic, Sasa; John, Matthias; Navab, Nassir; Hornegger, Joachim; Ionasec, Razvan
2013-01-01
Classical surgery is being disrupted by minimally invasive and transcatheter procedures. As there is no direct view or access to the affected anatomy, advanced imaging techniques such as 3D C-arm CT and C-arm fluoroscopy are routinely used for intra-operative guidance. However, intra-operative modalities have limited image quality of the soft tissue and a reliable assessment of the cardiac anatomy can only be made by injecting contrast agent, which is harmful to the patient and requires complex acquisition protocols. We propose a novel sparse matching approach for fusing high quality pre-operative CT and non-contrasted, non-gated intra-operative C-arm CT by utilizing robust machine learning and numerical optimization techniques. Thus, high-quality patient-specific models can be extracted from the pre-operative CT and mapped to the intra-operative imaging environment to guide minimally invasive procedures. Extensive quantitative experiments demonstrate that our model-based fusion approach has an average execution time of 2.9 s, while the accuracy lies within expert user confidence intervals.
High-Dimensional Bayesian Geostatistics
Banerjee, Sudipto
2017-01-01
With the growing capabilities of Geographic Information Systems (GIS) and user-friendly software, statisticians today routinely encounter geographically referenced data containing observations from a large number of spatial locations and time points. Over the last decade, hierarchical spatiotemporal process models have become widely deployed statistical tools for researchers to better understand the complex nature of spatial and temporal variability. However, fitting hierarchical spatiotemporal models often involves expensive matrix computations with complexity increasing in cubic order for the number of spatial locations and temporal points. This renders such models unfeasible for large data sets. This article offers a focused review of two methods for constructing well-defined highly scalable spatiotemporal stochastic processes. Both these processes can be used as “priors” for spatiotemporal random fields. The first approach constructs a low-rank process operating on a lower-dimensional subspace. The second approach constructs a Nearest-Neighbor Gaussian Process (NNGP) that ensures sparse precision matrices for its finite realizations. Both processes can be exploited as a scalable prior embedded within a rich hierarchical modeling framework to deliver full Bayesian inference. These approaches can be described as model-based solutions for big spatiotemporal datasets. The models ensure that the algorithmic complexity has ~ n floating point operations (flops), where n the number of spatial locations (per iteration). We compare these methods and provide some insight into their methodological underpinnings. PMID:29391920
High-Dimensional Bayesian Geostatistics.
Banerjee, Sudipto
2017-06-01
With the growing capabilities of Geographic Information Systems (GIS) and user-friendly software, statisticians today routinely encounter geographically referenced data containing observations from a large number of spatial locations and time points. Over the last decade, hierarchical spatiotemporal process models have become widely deployed statistical tools for researchers to better understand the complex nature of spatial and temporal variability. However, fitting hierarchical spatiotemporal models often involves expensive matrix computations with complexity increasing in cubic order for the number of spatial locations and temporal points. This renders such models unfeasible for large data sets. This article offers a focused review of two methods for constructing well-defined highly scalable spatiotemporal stochastic processes. Both these processes can be used as "priors" for spatiotemporal random fields. The first approach constructs a low-rank process operating on a lower-dimensional subspace. The second approach constructs a Nearest-Neighbor Gaussian Process (NNGP) that ensures sparse precision matrices for its finite realizations. Both processes can be exploited as a scalable prior embedded within a rich hierarchical modeling framework to deliver full Bayesian inference. These approaches can be described as model-based solutions for big spatiotemporal datasets. The models ensure that the algorithmic complexity has ~ n floating point operations (flops), where n the number of spatial locations (per iteration). We compare these methods and provide some insight into their methodological underpinnings.
Using the Model Coupling Toolkit to couple earth system models
Warner, J.C.; Perlin, N.; Skyllingstad, E.D.
2008-01-01
Continued advances in computational resources are providing the opportunity to operate more sophisticated numerical models. Additionally, there is an increasing demand for multidisciplinary studies that include interactions between different physical processes. Therefore there is a strong desire to develop coupled modeling systems that utilize existing models and allow efficient data exchange and model control. The basic system would entail model "1" running on "M" processors and model "2" running on "N" processors, with efficient exchange of model fields at predetermined synchronization intervals. Here we demonstrate two coupled systems: the coupling of the ocean circulation model Regional Ocean Modeling System (ROMS) to the surface wave model Simulating WAves Nearshore (SWAN), and the coupling of ROMS to the atmospheric model Coupled Ocean Atmosphere Prediction System (COAMPS). Both coupled systems use the Model Coupling Toolkit (MCT) as a mechanism for operation control and inter-model distributed memory transfer of model variables. In this paper we describe requirements and other options for model coupling, explain the MCT library, ROMS, SWAN and COAMPS models, methods for grid decomposition and sparse matrix interpolation, and provide an example from each coupled system. Methods presented in this paper are clearly applicable for coupling of other types of models. ?? 2008 Elsevier Ltd. All rights reserved.
Overcoming Challenges in Kinetic Modeling of Magnetized Plasmas and Vacuum Electronic Devices
NASA Astrophysics Data System (ADS)
Omelchenko, Yuri; Na, Dong-Yeop; Teixeira, Fernando
2017-10-01
We transform the state-of-the art of plasma modeling by taking advantage of novel computational techniques for fast and robust integration of multiscale hybrid (full particle ions, fluid electrons, no displacement current) and full-PIC models. These models are implemented in 3D HYPERS and axisymmetric full-PIC CONPIC codes. HYPERS is a massively parallel, asynchronous code. The HYPERS solver does not step fields and particles synchronously in time but instead executes local variable updates (events) at their self-adaptive rates while preserving fundamental conservation laws. The charge-conserving CONPIC code has a matrix-free explicit finite-element (FE) solver based on a sparse-approximate inverse (SPAI) algorithm. This explicit solver approximates the inverse FE system matrix (``mass'' matrix) using successive sparsity pattern orders of the original matrix. It does not reduce the set of Maxwell's equations to a vector-wave (curl-curl) equation of second order but instead utilizes the standard coupled first-order Maxwell's system. We discuss the ability of our codes to accurately and efficiently account for multiscale physical phenomena in 3D magnetized space and laboratory plasmas and axisymmetric vacuum electronic devices.
A Computing Platform for Parallel Sparse Matrix Computations
2016-01-05
REPORT NUMBER 19a. NAME OF RESPONSIBLE PERSON 19b. TELEPHONE NUMBER Ahmed Sameh Ahmed H. Sameh, Alicia Klinvex, Yao Zhu 611103 c. THIS PAGE The...PERCENT_SUPPORTEDNAME FTE Equivalent: Total Number: Discipline Yao Zhu 0.50 Alicia Klinvex 0.10 0.60 2 Names of Post Doctorates Names of Faculty Supported...PERCENT_SUPPORTEDNAME FTE Equivalent: Total Number: NAME Total Number: NAME Total Number: Yao Zhu Alicia Klinvex 2 ...... ...... Sub Contractors (DD882) Names of other
Exploiting Data Sparsity in Parallel Matrix Powers Computations
2013-05-03
2013 Report Documentation Page Form ApprovedOMB No. 0704-0188 Public reporting burden for the collection of information is estimated to average 1 hour...matrices of the form A = D+USV H, where D is sparse and USV H has low rank but may be dense. Matrices of this form arise in many practical applications...methods numerical partial di erential equation solvers, and preconditioned iterative methods. If A has this form , our algorithm enables a communication
Groupwise registration of MR brain images with tumors.
Tang, Zhenyu; Wu, Yihong; Fan, Yong
2017-08-04
A novel groupwise image registration framework is developed for registering MR brain images with tumors. Our method iteratively estimates a normal-appearance counterpart for each tumor image to be registered and constructs a directed graph (digraph) of normal-appearance images to guide the groupwise image registration. Particularly, our method maps each tumor image to its normal appearance counterpart by identifying and inpainting brain tumor regions with intensity information estimated using a low-rank plus sparse matrix decomposition based image representation technique. The estimated normal-appearance images are groupwisely registered to a group center image guided by a digraph of images so that the total length of 'image registration paths' to be the minimum, and then the original tumor images are warped to the group center image using the resulting deformation fields. We have evaluated our method based on both simulated and real MR brain tumor images. The registration results were evaluated with overlap measures of corresponding brain regions and average entropy of image intensity information, and Wilcoxon signed rank tests were adopted to compare different methods with respect to their regional overlap measures. Compared with a groupwise image registration method that is applied to normal-appearance images estimated using the traditional low-rank plus sparse matrix decomposition based image inpainting, our method achieved higher image registration accuracy with statistical significance (p = 7.02 × 10 -9 ).
Dimension-Factorized Range Migration Algorithm for Regularly Distributed Array Imaging
Guo, Qijia; Wang, Jie; Chang, Tianying
2017-01-01
The two-dimensional planar MIMO array is a popular approach for millimeter wave imaging applications. As a promising practical alternative, sparse MIMO arrays have been devised to reduce the number of antenna elements and transmitting/receiving channels with predictable and acceptable loss in image quality. In this paper, a high precision three-dimensional imaging algorithm is proposed for MIMO arrays of the regularly distributed type, especially the sparse varieties. Termed the Dimension-Factorized Range Migration Algorithm, the new imaging approach factorizes the conventional MIMO Range Migration Algorithm into multiple operations across the sparse dimensions. The thinner the sparse dimensions of the array, the more efficient the new algorithm will be. Advantages of the proposed approach are demonstrated by comparison with the conventional MIMO Range Migration Algorithm and its non-uniform fast Fourier transform based variant in terms of all the important characteristics of the approaches, especially the anti-noise capability. The computation cost is analyzed as well to evaluate the efficiency quantitatively. PMID:29113083
Many-core graph analytics using accelerated sparse linear algebra routines
NASA Astrophysics Data System (ADS)
Kozacik, Stephen; Paolini, Aaron L.; Fox, Paul; Kelmelis, Eric
2016-05-01
Graph analytics is a key component in identifying emerging trends and threats in many real-world applications. Largescale graph analytics frameworks provide a convenient and highly-scalable platform for developing algorithms to analyze large datasets. Although conceptually scalable, these techniques exhibit poor performance on modern computational hardware. Another model of graph computation has emerged that promises improved performance and scalability by using abstract linear algebra operations as the basis for graph analysis as laid out by the GraphBLAS standard. By using sparse linear algebra as the basis, existing highly efficient algorithms can be adapted to perform computations on the graph. This approach, however, is often less intuitive to graph analytics experts, who are accustomed to vertex-centric APIs such as Giraph, GraphX, and Tinkerpop. We are developing an implementation of the high-level operations supported by these APIs in terms of linear algebra operations. This implementation is be backed by many-core implementations of the fundamental GraphBLAS operations required, and offers the advantages of both the intuitive programming model of a vertex-centric API and the performance of a sparse linear algebra implementation. This technology can reduce the number of nodes required, as well as the run-time for a graph analysis problem, enabling customers to perform more complex analysis with less hardware at lower cost. All of this can be accomplished without the requirement for the customer to make any changes to their analytics code, thanks to the compatibility with existing graph APIs.
The w-effect in interferometric imaging: from a fast sparse measurement operator to superresolution
NASA Astrophysics Data System (ADS)
Dabbech, A.; Wolz, L.; Pratley, L.; McEwen, J. D.; Wiaux, Y.
2017-11-01
Modern radio telescopes, such as the Square Kilometre Array, will probe the radio sky over large fields of view, which results in large w-modulations of the sky image. This effect complicates the relationship between the measured visibilities and the image under scrutiny. In algorithmic terms, it gives rise to massive memory and computational time requirements. Yet, it can be a blessing in terms of reconstruction quality of the sky image. In recent years, several works have shown that large w-modulations promote the spread spectrum effect. Within the compressive sensing framework, this effect increases the incoherence between the sensing basis and the sparsity basis of the signal to be recovered, leading to better estimation of the sky image. In this article, we revisit the w-projection approach using convex optimization in realistic settings, where the measurement operator couples the w-terms in Fourier and the de-gridding kernels. We provide sparse, thus fast, models of the Fourier part of the measurement operator through adaptive sparsification procedures. Consequently, memory requirements and computational cost are significantly alleviated at the expense of introducing errors on the radio interferometric data model. We present a first investigation of the impact of the sparse variants of the measurement operator on the image reconstruction quality. We finally analyse the interesting superresolution potential associated with the spread spectrum effect of the w-modulation, and showcase it through simulations. Our c++ code is available online on GitHub.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Linenberg, A.; Lander, N.J.
1994-12-31
The need for remote monitoring of certain compounds in a sparsely populated area with limited user assistance led to the development and manufacture of a self contained, portable gas chromatography with the appropriate software. Part per billion levels of vinyl chloride, cis 1,2 dichloroethylene and trichloroethylene were detected in air using a trap for preconcentration of the compounds. The units were continuously calibrated with certified standards from Scott Specialty Gases, which in one case was 1 part per billion of the aforementioned compounds. The entire operation of the units, including monitoring instrument responses, changing operating parameters, data transfer, data reviewmore » and data reporting was done entirely on a remote basis from approximately 600 miles away using a remote computer with a modem and remote operating software. The entire system concept promises the availability of highly sensitive remote monitoring in sparsely populated areas for long periods of time.« less
Notes on implementation of sparsely distributed memory
NASA Technical Reports Server (NTRS)
Keeler, J. D.; Denning, P. J.
1986-01-01
The Sparsely Distributed Memory (SDM) developed by Kanerva is an unconventional memory design with very interesting and desirable properties. The memory works in a manner that is closely related to modern theories of human memory. The SDM model is discussed in terms of its implementation in hardware. Two appendices discuss the unconventional approaches of the SDM: Appendix A treats a resistive circuit for fast, parallel address decoding; and Appendix B treats a systolic array for high throughput read and write operations.
Sparse representation based image interpolation with nonlocal autoregressive modeling.
Dong, Weisheng; Zhang, Lei; Lukac, Rastislav; Shi, Guangming
2013-04-01
Sparse representation is proven to be a promising approach to image super-resolution, where the low-resolution (LR) image is usually modeled as the down-sampled version of its high-resolution (HR) counterpart after blurring. When the blurring kernel is the Dirac delta function, i.e., the LR image is directly down-sampled from its HR counterpart without blurring, the super-resolution problem becomes an image interpolation problem. In such cases, however, the conventional sparse representation models (SRM) become less effective, because the data fidelity term fails to constrain the image local structures. In natural images, fortunately, many nonlocal similar patches to a given patch could provide nonlocal constraint to the local structure. In this paper, we incorporate the image nonlocal self-similarity into SRM for image interpolation. More specifically, a nonlocal autoregressive model (NARM) is proposed and taken as the data fidelity term in SRM. We show that the NARM-induced sampling matrix is less coherent with the representation dictionary, and consequently makes SRM more effective for image interpolation. Our extensive experimental results demonstrate that the proposed NARM-based image interpolation method can effectively reconstruct the edge structures and suppress the jaggy/ringing artifacts, achieving the best image interpolation results so far in terms of PSNR as well as perceptual quality metrics such as SSIM and FSIM.
Luo, Xin; You, Zhuhong; Zhou, Mengchu; Li, Shuai; Leung, Hareton; Xia, Yunni; Zhu, Qingsheng
2015-01-09
The comprehensive mapping of protein-protein interactions (PPIs) is highly desired for one to gain deep insights into both fundamental cell biology processes and the pathology of diseases. Finely-set small-scale experiments are not only very expensive but also inefficient to identify numerous interactomes despite their high accuracy. High-throughput screening techniques enable efficient identification of PPIs; yet the desire to further extract useful knowledge from these data leads to the problem of binary interactome mapping. Network topology-based approaches prove to be highly efficient in addressing this problem; however, their performance deteriorates significantly on sparse putative PPI networks. Motivated by the success of collaborative filtering (CF)-based approaches to the problem of personalized-recommendation on large, sparse rating matrices, this work aims at implementing a highly efficient CF-based approach to binary interactome mapping. To achieve this, we first propose a CF framework for it. Under this framework, we model the given data into an interactome weight matrix, where the feature-vectors of involved proteins are extracted. With them, we design the rescaled cosine coefficient to model the inter-neighborhood similarity among involved proteins, for taking the mapping process. Experimental results on three large, sparse datasets demonstrate that the proposed approach outperforms several sophisticated topology-based approaches significantly.
Luo, Xin; You, Zhuhong; Zhou, Mengchu; Li, Shuai; Leung, Hareton; Xia, Yunni; Zhu, Qingsheng
2015-01-01
The comprehensive mapping of protein-protein interactions (PPIs) is highly desired for one to gain deep insights into both fundamental cell biology processes and the pathology of diseases. Finely-set small-scale experiments are not only very expensive but also inefficient to identify numerous interactomes despite their high accuracy. High-throughput screening techniques enable efficient identification of PPIs; yet the desire to further extract useful knowledge from these data leads to the problem of binary interactome mapping. Network topology-based approaches prove to be highly efficient in addressing this problem; however, their performance deteriorates significantly on sparse putative PPI networks. Motivated by the success of collaborative filtering (CF)-based approaches to the problem of personalized-recommendation on large, sparse rating matrices, this work aims at implementing a highly efficient CF-based approach to binary interactome mapping. To achieve this, we first propose a CF framework for it. Under this framework, we model the given data into an interactome weight matrix, where the feature-vectors of involved proteins are extracted. With them, we design the rescaled cosine coefficient to model the inter-neighborhood similarity among involved proteins, for taking the mapping process. Experimental results on three large, sparse datasets demonstrate that the proposed approach outperforms several sophisticated topology-based approaches significantly. PMID:25572661
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brewster, Aaron S.; Sawaya, Michael R.; University of California, Los Angeles, CA 90095-1570
2015-02-01
Special methods are required to interpret sparse diffraction patterns collected from peptide crystals at X-ray free-electron lasers. Bragg spots can be indexed from composite-image powder rings, with crystal orientations then deduced from a very limited number of spot positions. Still diffraction patterns from peptide nanocrystals with small unit cells are challenging to index using conventional methods owing to the limited number of spots and the lack of crystal orientation information for individual images. New indexing algorithms have been developed as part of the Computational Crystallography Toolbox (cctbx) to overcome these challenges. Accurate unit-cell information derived from an aggregate data setmore » from thousands of diffraction patterns can be used to determine a crystal orientation matrix for individual images with as few as five reflections. These algorithms are potentially applicable not only to amyloid peptides but also to any set of diffraction patterns with sparse properties, such as low-resolution virus structures or high-throughput screening of still images captured by raster-scanning at synchrotron sources. As a proof of concept for this technique, successful integration of X-ray free-electron laser (XFEL) data to 2.5 Å resolution for the amyloid segment GNNQQNY from the Sup35 yeast prion is presented.« less
NASA Astrophysics Data System (ADS)
Luo, Xin; You, Zhuhong; Zhou, Mengchu; Li, Shuai; Leung, Hareton; Xia, Yunni; Zhu, Qingsheng
2015-01-01
The comprehensive mapping of protein-protein interactions (PPIs) is highly desired for one to gain deep insights into both fundamental cell biology processes and the pathology of diseases. Finely-set small-scale experiments are not only very expensive but also inefficient to identify numerous interactomes despite their high accuracy. High-throughput screening techniques enable efficient identification of PPIs; yet the desire to further extract useful knowledge from these data leads to the problem of binary interactome mapping. Network topology-based approaches prove to be highly efficient in addressing this problem; however, their performance deteriorates significantly on sparse putative PPI networks. Motivated by the success of collaborative filtering (CF)-based approaches to the problem of personalized-recommendation on large, sparse rating matrices, this work aims at implementing a highly efficient CF-based approach to binary interactome mapping. To achieve this, we first propose a CF framework for it. Under this framework, we model the given data into an interactome weight matrix, where the feature-vectors of involved proteins are extracted. With them, we design the rescaled cosine coefficient to model the inter-neighborhood similarity among involved proteins, for taking the mapping process. Experimental results on three large, sparse datasets demonstrate that the proposed approach outperforms several sophisticated topology-based approaches significantly.
TESTING HIGH-DIMENSIONAL COVARIANCE MATRICES, WITH APPLICATION TO DETECTING SCHIZOPHRENIA RISK GENES
Zhu, Lingxue; Lei, Jing; Devlin, Bernie; Roeder, Kathryn
2017-01-01
Scientists routinely compare gene expression levels in cases versus controls in part to determine genes associated with a disease. Similarly, detecting case-control differences in co-expression among genes can be critical to understanding complex human diseases; however statistical methods have been limited by the high dimensional nature of this problem. In this paper, we construct a sparse-Leading-Eigenvalue-Driven (sLED) test for comparing two high-dimensional covariance matrices. By focusing on the spectrum of the differential matrix, sLED provides a novel perspective that accommodates what we assume to be common, namely sparse and weak signals in gene expression data, and it is closely related with Sparse Principal Component Analysis. We prove that sLED achieves full power asymptotically under mild assumptions, and simulation studies verify that it outperforms other existing procedures under many biologically plausible scenarios. Applying sLED to the largest gene-expression dataset obtained from post-mortem brain tissue from Schizophrenia patients and controls, we provide a novel list of genes implicated in Schizophrenia and reveal intriguing patterns in gene co-expression change for Schizophrenia subjects. We also illustrate that sLED can be generalized to compare other gene-gene “relationship” matrices that are of practical interest, such as the weighted adjacency matrices. PMID:29081874
Zhu, Lingxue; Lei, Jing; Devlin, Bernie; Roeder, Kathryn
2017-09-01
Scientists routinely compare gene expression levels in cases versus controls in part to determine genes associated with a disease. Similarly, detecting case-control differences in co-expression among genes can be critical to understanding complex human diseases; however statistical methods have been limited by the high dimensional nature of this problem. In this paper, we construct a sparse-Leading-Eigenvalue-Driven (sLED) test for comparing two high-dimensional covariance matrices. By focusing on the spectrum of the differential matrix, sLED provides a novel perspective that accommodates what we assume to be common, namely sparse and weak signals in gene expression data, and it is closely related with Sparse Principal Component Analysis. We prove that sLED achieves full power asymptotically under mild assumptions, and simulation studies verify that it outperforms other existing procedures under many biologically plausible scenarios. Applying sLED to the largest gene-expression dataset obtained from post-mortem brain tissue from Schizophrenia patients and controls, we provide a novel list of genes implicated in Schizophrenia and reveal intriguing patterns in gene co-expression change for Schizophrenia subjects. We also illustrate that sLED can be generalized to compare other gene-gene "relationship" matrices that are of practical interest, such as the weighted adjacency matrices.
Using Perturbed QR Factorizations To Solve Linear Least-Squares Problems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Avron, Haim; Ng, Esmond G.; Toledo, Sivan
2008-03-21
We propose and analyze a new tool to help solve sparse linear least-squares problems min{sub x} {parallel}Ax-b{parallel}{sub 2}. Our method is based on a sparse QR factorization of a low-rank perturbation {cflx A} of A. More precisely, we show that the R factor of {cflx A} is an effective preconditioner for the least-squares problem min{sub x} {parallel}Ax-b{parallel}{sub 2}, when solved using LSQR. We propose applications for the new technique. When A is rank deficient we can add rows to ensure that the preconditioner is well-conditioned without column pivoting. When A is sparse except for a few dense rows we canmore » drop these dense rows from A to obtain {cflx A}. Another application is solving an updated or downdated problem. If R is a good preconditioner for the original problem A, it is a good preconditioner for the updated/downdated problem {cflx A}. We can also solve what-if scenarios, where we want to find the solution if a column of the original matrix is changed/removed. We present a spectral theory that analyzes the generalized spectrum of the pencil (A*A,R*R) and analyze the applications.« less
Huang, Jacob; Gholami, Behnood; Agar, Nathalie Y. R.; Norton, Isaiah; Haddad, Wassim M.; Tannenbaum, Allen R.
2013-01-01
Glioma histologies are the primary factor in prognostic estimates and are used in determining the proper course of treatment. Furthermore, due to the sensitivity of cranial environments, real-time tumor-cell classification and boundary detection can aid in the precision and completeness of tumor resection. A recent improvement to mass spectrometry known as desorption electrospray ionization operates in an ambient environment without the application of a preparation compound. This allows for a real-time acquisition of mass spectra during surgeries and other live operations. In this paper, we present a framework using sparse kernel machines to determine a glioma sample’s histopathological subtype by analyzing its chemical composition acquired by desorption electrospray ionization mass spectrometry. PMID:22256188
The Kepler DB: a database management system for arrays, sparse arrays, and binary data
NASA Astrophysics Data System (ADS)
McCauliff, Sean; Cote, Miles T.; Girouard, Forrest R.; Middour, Christopher; Klaus, Todd C.; Wohler, Bill
2010-07-01
The Kepler Science Operations Center stores pixel values on approximately six million pixels collected every 30 minutes, as well as data products that are generated as a result of running the Kepler science processing pipeline. The Kepler Database management system (Kepler DB)was created to act as the repository of this information. After one year of flight usage, Kepler DB is managing 3 TiB of data and is expected to grow to over 10 TiB over the course of the mission. Kepler DB is a non-relational, transactional database where data are represented as one-dimensional arrays, sparse arrays or binary large objects. We will discuss Kepler DB's APIs, implementation, usage and deployment at the Kepler Science Operations Center.
The Kepler DB, a Database Management System for Arrays, Sparse Arrays and Binary Data
NASA Technical Reports Server (NTRS)
McCauliff, Sean; Cote, Miles T.; Girouard, Forrest R.; Middour, Christopher; Klaus, Todd C.; Wohler, Bill
2010-01-01
The Kepler Science Operations Center stores pixel values on approximately six million pixels collected every 30-minutes, as well as data products that are generated as a result of running the Kepler science processing pipeline. The Kepler Database (Kepler DB) management system was created to act as the repository of this information. After one year of ight usage, Kepler DB is managing 3 TiB of data and is expected to grow to over 10 TiB over the course of the mission. Kepler DB is a non-relational, transactional database where data are represented as one dimensional arrays, sparse arrays or binary large objects. We will discuss Kepler DB's APIs, implementation, usage and deployment at the Kepler Science Operations Center.
NASA Astrophysics Data System (ADS)
Xue, Zhaohui; Du, Peijun; Li, Jun; Su, Hongjun
2017-02-01
The generally limited availability of training data relative to the usually high data dimension pose a great challenge to accurate classification of hyperspectral imagery, especially for identifying crops characterized with highly correlated spectra. However, traditional parametric classification models are problematic due to the need of non-singular class-specific covariance matrices. In this research, a novel sparse graph regularization (SGR) method is presented, aiming at robust crop mapping using hyperspectral imagery with very few in situ data. The core of SGR lies in propagating labels from known data to unknown, which is triggered by: (1) the fraction matrix generated for the large unknown data by using an effective sparse representation algorithm with respect to the few training data serving as the dictionary; (2) the prediction function estimated for the few training data by formulating a regularization model based on sparse graph. Then, the labels of large unknown data can be obtained by maximizing the posterior probability distribution based on the two ingredients. SGR is more discriminative, data-adaptive, robust to noise, and efficient, which is unique with regard to previously proposed approaches and has high potentials in discriminating crops, especially when facing insufficient training data and high-dimensional spectral space. The study area is located at Zhangye basin in the middle reaches of Heihe watershed, Gansu, China, where eight crop types were mapped with Compact Airborne Spectrographic Imager (CASI) and Shortwave Infrared Airborne Spectrogrpahic Imager (SASI) hyperspectral data. Experimental results demonstrate that the proposed method significantly outperforms other traditional and state-of-the-art methods.
A multiple hold-out framework for Sparse Partial Least Squares.
Monteiro, João M; Rao, Anil; Shawe-Taylor, John; Mourão-Miranda, Janaina
2016-09-15
Supervised classification machine learning algorithms may have limitations when studying brain diseases with heterogeneous populations, as the labels might be unreliable. More exploratory approaches, such as Sparse Partial Least Squares (SPLS), may provide insights into the brain's mechanisms by finding relationships between neuroimaging and clinical/demographic data. The identification of these relationships has the potential to improve the current understanding of disease mechanisms, refine clinical assessment tools, and stratify patients. SPLS finds multivariate associative effects in the data by computing pairs of sparse weight vectors, where each pair is used to remove its corresponding associative effect from the data by matrix deflation, before computing additional pairs. We propose a novel SPLS framework which selects the adequate number of voxels and clinical variables to describe each associative effect, and tests their reliability by fitting the model to different splits of the data. As a proof of concept, the approach was applied to find associations between grey matter probability maps and individual items of the Mini-Mental State Examination (MMSE) in a clinical sample with various degrees of dementia. The framework found two statistically significant associative effects between subsets of brain voxels and subsets of the questions/tasks. SPLS was compared with its non-sparse version (PLS). The use of projection deflation versus a classical PLS deflation was also tested in both PLS and SPLS. SPLS outperformed PLS, finding statistically significant effects and providing higher correlation values in hold-out data. Moreover, projection deflation provided better results. Copyright © 2016 The Author(s). Published by Elsevier B.V. All rights reserved.
Matrix decomposition graphics processing unit solver for Poisson image editing
NASA Astrophysics Data System (ADS)
Lei, Zhao; Wei, Li
2012-10-01
In recent years, gradient-domain methods have been widely discussed in the image processing field, including seamless cloning and image stitching. These algorithms are commonly carried out by solving a large sparse linear system: the Poisson equation. However, solving the Poisson equation is a computational and memory intensive task which makes it not suitable for real-time image editing. A new matrix decomposition graphics processing unit (GPU) solver (MDGS) is proposed to settle the problem. A matrix decomposition method is used to distribute the work among GPU threads, so that MDGS will take full advantage of the computing power of current GPUs. Additionally, MDGS is a hybrid solver (combines both the direct and iterative techniques) and has two-level architecture. These enable MDGS to generate identical solutions with those of the common Poisson methods and achieve high convergence rate in most cases. This approach is advantageous in terms of parallelizability, enabling real-time image processing, low memory-taken and extensive applications.
NASA Astrophysics Data System (ADS)
Georgakopoulos, A.; Politopoulos, K.; Georgiou, E.
2018-03-01
A new dynamic-system approach to the problem of radiative transfer inside scattering and absorbing media is presented, directly based on first-hand physical principles. This method, the Dynamic Radiative Transfer System (DRTS), employs a dynamical system formality using a global sparse matrix, which characterizes the physical, optical and geometrical properties of the material-volume of interest. The new system state is generated by the above time-independent matrix, using simple matrix-vector multiplication for each subsequent time step. DRTS is capable of calculating accurately the time evolution of photon propagation in media of complex structure and shape. The flexibility of DRTS allows the integration of time-dependent sources, boundary conditions, different media and several optical phenomena like reflection and refraction in a unified and consistent way. Various examples of DRTS simulation results are presented for ultra-fast light pulse 3-D propagation, demonstrating greatly reduced computational cost and resource requirements compared to other methods.
A modified dual-level algorithm for large-scale three-dimensional Laplace and Helmholtz equation
NASA Astrophysics Data System (ADS)
Li, Junpu; Chen, Wen; Fu, Zhuojia
2018-01-01
A modified dual-level algorithm is proposed in the article. By the help of the dual level structure, the fully-populated interpolation matrix on the fine level is transformed to a local supported sparse matrix to solve the highly ill-conditioning and excessive storage requirement resulting from fully-populated interpolation matrix. The kernel-independent fast multipole method is adopted to expediting the solving process of the linear equations on the coarse level. Numerical experiments up to 2-million fine-level nodes have successfully been achieved. It is noted that the proposed algorithm merely needs to place 2-3 coarse-level nodes in each wavelength per direction to obtain the reasonable solution, which almost down to the minimum requirement allowed by the Shannon's sampling theorem. In the real human head model example, it is observed that the proposed algorithm can simulate well computationally very challenging exterior high-frequency harmonic acoustic wave propagation up to 20,000 Hz.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schiffmann, Florian; VandeVondele, Joost, E-mail: Joost.VandeVondele@mat.ethz.ch
2015-06-28
We present an improved preconditioning scheme for electronic structure calculations based on the orbital transformation method. First, a preconditioner is developed which includes information from the full Kohn-Sham matrix but avoids computationally demanding diagonalisation steps in its construction. This reduces the computational cost of its construction, eliminating a bottleneck in large scale simulations, while maintaining rapid convergence. In addition, a modified form of Hotelling’s iterative inversion is introduced to replace the exact inversion of the preconditioner matrix. This method is highly effective during molecular dynamics (MD), as the solution obtained in earlier MD steps is a suitable initial guess. Filteringmore » small elements during sparse matrix multiplication leads to linear scaling inversion, while retaining robustness, already for relatively small systems. For system sizes ranging from a few hundred to a few thousand atoms, which are typical for many practical applications, the improvements to the algorithm lead to a 2-5 fold speedup per MD step.« less
Hybrid Weighted Minimum Norm Method A new method based LORETA to solve EEG inverse problem.
Song, C; Zhuang, T; Wu, Q
2005-01-01
This Paper brings forward a new method to solve EEG inverse problem. Based on following physiological characteristic of neural electrical activity source: first, the neighboring neurons are prone to active synchronously; second, the distribution of source space is sparse; third, the active intensity of the sources are high centralized, we take these prior knowledge as prerequisite condition to develop the inverse solution of EEG, and not assume other characteristic of inverse solution to realize the most commonly 3D EEG reconstruction map. The proposed algorithm takes advantage of LORETA's low resolution method which emphasizes particularly on 'localization' and FOCUSS's high resolution method which emphasizes particularly on 'separability'. The method is still under the frame of the weighted minimum norm method. The keystone is to construct a weighted matrix which takes reference from the existing smoothness operator, competition mechanism and study algorithm. The basic processing is to obtain an initial solution's estimation firstly, then construct a new estimation using the initial solution's information, repeat this process until the solutions under last two estimate processing is keeping unchanged.
Impact of mercury from the Canadian boreal forest widfires to New England
NASA Astrophysics Data System (ADS)
Hwang, G.; Talbot, R. W.
2010-12-01
Canadian Boreal forest fires release significant amounts of mercury and constitute several air quality episodes every year in New England, especially during summer. With continuous monitoring of mercury in two New England sites in both rural and elevated area from 2004 to date, several events of the wildfire transport was screened out using ensembles of backward trajectories to ensure the air parcels sampled spent substantial residence time within the box of burned area defined by the the Fire Information for Resource Management System(FIRMS) MODIS hotspot/fires data. Other biomass burning tracers, (such as HCN), were also used as criteria if they are were available during the events period. The mercury to CO ratios during the events were calculated as the input to the Sparse Matrix Operator Kernel Emissions System (SMOKE) model to simulate the high and low ranges of mercury emissions frorm the burned area. We are now using the Community Multiscale Air Quality Modeling System (CMAQ) to study the impact of the mercury emission from the Canadian boreal forest wildfires to the New England region in more details.
Source apportionment of polycyclic aromatic hydrocarbons in Louisiana
NASA Astrophysics Data System (ADS)
Han, F.; Zhang, H.
2017-12-01
Polycyclic aromatic hydrocarbons (PAHs) in the environment are of significant concern due to their high toxicity that may result in adverse health effects. PAHs measurements at the limited air quality monitoring stations alone are insufficient to gain a complete concept of ambient PAH levels. This study simulates the concentrations of PAHs in Louisiana and identifies the major emission sources. Speciation profiles for PAHs were prepared using data assembled from existing emission profile databases. The Sparse Matrix Operator Kernel Emission (SMOKE) model was used to generate the estimated gridded emissions of 16 priority PAH species directly associated with health risks. The estimated emissions were then applied to simulate ambient concentrations of PAHs in Louisiana for January, April, July and October 2011 using the Community Multiscale Air Quality (CMAQ) model (v5.0.1). Through the formation, transport and deposition of PAHs species, the concentrations of PAHs species in gas phase and particulate phase were obtained. The spatial and temporal variations were analyzed and contributions of both local and regional major sources were quantified. This study provides important information for the prevention and treatment of PAHs in Louisiana.
Matrix Algebra for GPU and Multicore Architectures (MAGMA) for Large Petascale Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dongarra, Jack J.; Tomov, Stanimire
2014-03-24
The goal of the MAGMA project is to create a new generation of linear algebra libraries that achieve the fastest possible time to an accurate solution on hybrid Multicore+GPU-based systems, using all the processing power that future high-end systems can make available within given energy constraints. Our efforts at the University of Tennessee achieved the goals set in all of the five areas identified in the proposal: 1. Communication optimal algorithms; 2. Autotuning for GPU and hybrid processors; 3. Scheduling and memory management techniques for heterogeneity and scale; 4. Fault tolerance and robustness for large scale systems; 5. Building energymore » efficiency into software foundations. The University of Tennessee’s main contributions, as proposed, were the research and software development of new algorithms for hybrid multi/many-core CPUs and GPUs, as related to two-sided factorizations and complete eigenproblem solvers, hybrid BLAS, and energy efficiency for dense, as well as sparse, operations. Furthermore, as proposed, we investigated and experimented with various techniques targeting the five main areas outlined.« less
Finite-time scaling at the Anderson transition for vibrations in solids
NASA Astrophysics Data System (ADS)
Beltukov, Y. M.; Skipetrov, S. E.
2017-11-01
A model in which a three-dimensional elastic medium is represented by a network of identical masses connected by springs of random strengths and allowed to vibrate only along a selected axis of the reference frame exhibits an Anderson localization transition. To study this transition, we assume that the dynamical matrix of the network is given by a product of a sparse random matrix with real, independent, Gaussian-distributed nonzero entries and its transpose. A finite-time scaling analysis of the system's response to an initial excitation allows us to estimate the critical parameters of the localization transition. The critical exponent is found to be ν =1.57 ±0.02 , in agreement with previous studies of the Anderson transition belonging to the three-dimensional orthogonal universality class.
Pseudo progression identification of glioblastoma with dictionary learning.
Zhang, Jian; Yu, Hengyong; Qian, Xiaohua; Liu, Keqin; Tan, Hua; Yang, Tielin; Wang, Maode; Li, King Chuen; Chan, Michael D; Debinski, Waldemar; Paulsson, Anna; Wang, Ge; Zhou, Xiaobo
2016-06-01
Although the use of temozolomide in chemoradiotherapy is effective, the challenging clinical problem of pseudo progression has been raised in brain tumor treatment. This study aims to distinguish pseudo progression from true progression. Between 2000 and 2012, a total of 161 patients with glioblastoma multiforme (GBM) were treated with chemoradiotherapy at our hospital. Among the patients, 79 had their diffusion tensor imaging (DTI) data acquired at the earliest diagnosed date of pseudo progression or true progression, and 23 had both DTI data and genomic data. Clinical records of all patients were kept in good condition. Volumetric fractional anisotropy (FA) images obtained from the DTI data were decomposed into a sequence of sparse representations. Then, a feature selection algorithm was applied to extract the critical features from the feature matrix to reduce the size of the feature matrix and to improve the classification accuracy. The proposed approach was validated using the 79 samples with clinical DTI data. Satisfactory results were obtained under different experimental conditions. The area under the receiver operating characteristic (ROC) curve (AUC) was 0.87 for a given dictionary with 1024 atoms. For the subgroup of 23 samples, genomics data analysis was also performed. Results implied further perspective on pseudo progression classification. The proposed method can determine pseudo progression and true progression with improved accuracy. Laboring segmentation is no longer necessary because this skillfully designed method is not sensitive to tumor location. Copyright © 2016 Elsevier Ltd. All rights reserved.
QEEN Workshop: "Quantifying Exposure to Engineered Nano ...
The measurement and characterization of nanomaterials in biological tissues is complicated by a number of factors including: the sensitivity of the assay to small sized particles or low concentrations of materials; the ability to distinguish different forms and transformations of the materials related to the biological matrix; distinguishing exogenous nanomaterials, which may be composed of biologically common elements such as carbon,from normal biological tissues; differentiating particle from ionic phases for materials that dissolve; localization of sparsely distributed materials in a complex substrate (the
Modal analysis of circular Bragg fibers with arbitrary index profiles
NASA Astrophysics Data System (ADS)
Horikis, Theodoros P.; Kath, William L.
2006-12-01
A finite-difference approach based upon the immersed interface method is used to analyze the mode structure of Bragg fibers with arbitrary index profiles. The method allows general propagation constants and eigenmodes to be calculated to a high degree of accuracy, while computation times are kept to a minimum by exploiting sparse matrix algebra. The method is well suited to handle complicated structures comprised of a large number of thin layers with high-index contrast and simultaneously determines multiple eigenmodes without modification.
ERIC Educational Resources Information Center
Digital Equipment Corp., Maynard, MA.
The curriculum materials and computer programs in this booklet introduce the idea of a matrix. They go on to discuss matrix operations of addition, subtraction, multiplication by a scalar, and matrix multiplication. The last section covers several contemporary applications of matrix multiplication, including problems of communication…
Spin-adapted matrix product states and operators
DOE Office of Scientific and Technical Information (OSTI.GOV)
Keller, Sebastian, E-mail: sebastian.keller@phys.chem.ethz.ch; Reiher, Markus, E-mail: markus.reiher@phys.chem.ethz.ch
Matrix product states (MPSs) and matrix product operators (MPOs) allow an alternative formulation of the density matrix renormalization group algorithm introduced by White. Here, we describe how non-abelian spin symmetry can be exploited in MPSs and MPOs by virtue of the Wigner–Eckart theorem at the example of the spin-adapted quantum chemical Hamiltonian operator.
Characterizations of matrix and operator-valued Φ-entropies, and operator Efron-Stein inequalities.
Cheng, Hao-Chung; Hsieh, Min-Hsiu
2016-03-01
We derive new characterizations of the matrix Φ-entropy functionals introduced in Chen & Tropp (Chen, Tropp 2014 Electron. J. Prob. 19 , 1-30. (doi:10.1214/ejp.v19-2964)). These characterizations help us to better understand the properties of matrix Φ-entropies, and are a powerful tool for establishing matrix concentration inequalities for random matrices. Then, we propose an operator-valued generalization of matrix Φ-entropy functionals, and prove the subadditivity under Löwner partial ordering. Our results demonstrate that the subadditivity of operator-valued Φ-entropies is equivalent to the convexity. As an application, we derive the operator Efron-Stein inequality.
NASA Astrophysics Data System (ADS)
Xu, Xiankun; Li, Peiwen
2017-11-01
Fixman's work in 1974 and the follow-up studies have developed a method that can factorize the inverse of mass matrix into an arithmetic combination of three sparse matrices-one of them is positive definite and needs to be further factorized by using the Cholesky decomposition or similar methods. When the molecule subjected to study is of serial chain structure, this method can achieve O (n) time complexity. However, for molecules with long branches, Cholesky decomposition about the corresponding positive definite matrix will introduce massive fill-in due to its nonzero structure. Although there are several methods can be used to reduce the number of fill-in, none of them could strictly guarantee for zero fill-in for all molecules according to our test, and thus cannot obtain O (n) time complexity by using these traditional methods. In this paper we present a new method that can guarantee for no fill-in in doing the Cholesky decomposition, which was developed based on the correlations between the mass matrix and the geometrical structure of molecules. As a result, the inverting of mass matrix will remain the O (n) time complexity, no matter the molecule structure has long branches or not.
Roy, P; Petroll, W M; Cavanagh, H D; Chuong, C J; Jester, J V
1997-04-10
An in vitro force measurement assay has been developed to quantify the forces exerted by single corneal fibroblasts during the early interaction with a collagen matrix. Corneal fibroblasts were sparsely seeded on top of collagen matrices whose stiffness was predetermined by micromanipulation with calibrated fine glass microneedles. The forces exerted by individual cells were calculated from time-lapse videomicroscopic recordings of the 2-D elastic distortion of the matrix. In additional experiments, the degree of permanent reorganization of the collagen matrices was assessed by lysing the cells with 1% Triton X-100 solution at the end of a 2-hour incubation and recording the subsequent relaxation. The data suggest that a cell can exert comparable centripetal force during either extension of a cell process or partial retraction of an extended pseudopodia. The rates of force associated with pseudopodial extension and partial retraction were 0.180 +/- 0.091 (x 10(-8)) N/min (n = 8 experiments) and 0.213 +/- 0.063 (x 10(-8)) N/min (n = 8 experiments), respectively. Rupture of pseudopodial adhesion associated with cell locomotion causes a release of force on the matrix and a complete recoil of the pseudopodia concerned; a simultaneous release of force on the matrix was also observed at the opposite end of the cell. Lysis of cells resulted in 84 +/- 18% relaxation of the matrix, suggesting that little permanent remodeling of matrix is produced by the actions of isolated migrating cells.
Characterizations of matrix and operator-valued Φ-entropies, and operator Efron–Stein inequalities
Cheng, Hao-Chung; Hsieh, Min-Hsiu
2016-01-01
We derive new characterizations of the matrix Φ-entropy functionals introduced in Chen & Tropp (Chen, Tropp 2014 Electron. J. Prob. 19, 1–30. (doi:10.1214/ejp.v19-2964)). These characterizations help us to better understand the properties of matrix Φ-entropies, and are a powerful tool for establishing matrix concentration inequalities for random matrices. Then, we propose an operator-valued generalization of matrix Φ-entropy functionals, and prove the subadditivity under Löwner partial ordering. Our results demonstrate that the subadditivity of operator-valued Φ-entropies is equivalent to the convexity. As an application, we derive the operator Efron–Stein inequality. PMID:27118909
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jakeman, John D.; Narayan, Akil; Zhou, Tao
We propose an algorithm for recovering sparse orthogonal polynomial expansions via collocation. A standard sampling approach for recovering sparse polynomials uses Monte Carlo sampling, from the density of orthogonality, which results in poor function recovery when the polynomial degree is high. Our proposed approach aims to mitigate this limitation by sampling with respect to the weighted equilibrium measure of the parametric domain and subsequently solves a preconditionedmore » $$\\ell^1$$-minimization problem, where the weights of the diagonal preconditioning matrix are given by evaluations of the Christoffel function. Our algorithm can be applied to a wide class of orthogonal polynomial families on bounded and unbounded domains, including all classical families. We present theoretical analysis to motivate the algorithm and numerical results that show our method is superior to standard Monte Carlo methods in many situations of interest. In conclusion, numerical examples are also provided to demonstrate that our proposed algorithm leads to comparable or improved accuracy even when compared with Legendre- and Hermite-specific algorithms.« less
Group sparse multiview patch alignment framework with view consistency for image classification.
Gui, Jie; Tao, Dacheng; Sun, Zhenan; Luo, Yong; You, Xinge; Tang, Yuan Yan
2014-07-01
No single feature can satisfactorily characterize the semantic concepts of an image. Multiview learning aims to unify different kinds of features to produce a consensual and efficient representation. This paper redefines part optimization in the patch alignment framework (PAF) and develops a group sparse multiview patch alignment framework (GSM-PAF). The new part optimization considers not only the complementary properties of different views, but also view consistency. In particular, view consistency models the correlations between all possible combinations of any two kinds of view. In contrast to conventional dimensionality reduction algorithms that perform feature extraction and feature selection independently, GSM-PAF enjoys joint feature extraction and feature selection by exploiting l(2,1)-norm on the projection matrix to achieve row sparsity, which leads to the simultaneous selection of relevant features and learning transformation, and thus makes the algorithm more discriminative. Experiments on two real-world image data sets demonstrate the effectiveness of GSM-PAF for image classification.
A comparison of SuperLU solvers on the intel MIC architecture
NASA Astrophysics Data System (ADS)
Tuncel, Mehmet; Duran, Ahmet; Celebi, M. Serdar; Akaydin, Bora; Topkaya, Figen O.
2016-10-01
In many science and engineering applications, problems may result in solving a sparse linear system AX=B. For example, SuperLU_MCDT, a linear solver, was used for the large penta-diagonal matrices for 2D problems and hepta-diagonal matrices for 3D problems, coming from the incompressible blood flow simulation (see [1]). It is important to test the status and potential improvements of state-of-the-art solvers on new technologies. In this work, sequential, multithreaded and distributed versions of SuperLU solvers (see [2]) are examined on the Intel Xeon Phi coprocessors using offload programming model at the EURORA cluster of CINECA in Italy. We consider a portfolio of test matrices containing patterned matrices from UFMM ([3]) and randomly located matrices. This architecture can benefit from high parallelism and large vectors. We find that the sequential SuperLU benefited up to 45 % performance improvement from the offload programming depending on the sparse matrix type and the size of transferred and processed data.
Brewster, Aaron S.; Sawaya, Michael R.; Rodriguez, Jose; ...
2015-01-23
Still diffraction patterns from peptide nanocrystals with small unit cells are challenging to index using conventional methods owing to the limited number of spots and the lack of crystal orientation information for individual images. New indexing algorithms have been developed as part of the Computational Crystallography Toolbox( cctbx) to overcome these challenges. Accurate unit-cell information derived from an aggregate data set from thousands of diffraction patterns can be used to determine a crystal orientation matrix for individual images with as few as five reflections. These algorithms are potentially applicable not only to amyloid peptides but also to any set ofmore » diffraction patterns with sparse properties, such as low-resolution virus structures or high-throughput screening of still images captured by raster-scanning at synchrotron sources. As a proof of concept for this technique, successful integration of X-ray free-electron laser (XFEL) data to 2.5 Å resolution for the amyloid segment GNNQQNY from the Sup35 yeast prion is presented.« less
Orthogonal sparse linear discriminant analysis
NASA Astrophysics Data System (ADS)
Liu, Zhonghua; Liu, Gang; Pu, Jiexin; Wang, Xiaohong; Wang, Haijun
2018-03-01
Linear discriminant analysis (LDA) is a linear feature extraction approach, and it has received much attention. On the basis of LDA, researchers have done a lot of research work on it, and many variant versions of LDA were proposed. However, the inherent problem of LDA cannot be solved very well by the variant methods. The major disadvantages of the classical LDA are as follows. First, it is sensitive to outliers and noises. Second, only the global discriminant structure is preserved, while the local discriminant information is ignored. In this paper, we present a new orthogonal sparse linear discriminant analysis (OSLDA) algorithm. The k nearest neighbour graph is first constructed to preserve the locality discriminant information of sample points. Then, L2,1-norm constraint on the projection matrix is used to act as loss function, which can make the proposed method robust to outliers in data points. Extensive experiments have been performed on several standard public image databases, and the experiment results demonstrate the performance of the proposed OSLDA algorithm.
Competition in high dimensional spaces using a sparse approximation of neural fields.
Quinton, Jean-Charles; Girau, Bernard; Lefort, Mathieu
2011-01-01
The Continuum Neural Field Theory implements competition within topologically organized neural networks with lateral inhibitory connections. However, due to the polynomial complexity of matrix-based implementations, updating dense representations of the activity becomes computationally intractable when an adaptive resolution or an arbitrary number of input dimensions is required. This paper proposes an alternative to self-organizing maps with a sparse implementation based on Gaussian mixture models, promoting a trade-off in redundancy for higher computational efficiency and alleviating constraints on the underlying substrate.This version reproduces the emergent attentional properties of the original equations, by directly applying them within a continuous approximation of a high dimensional neural field. The model is compatible with preprocessed sensory flows but can also be interfaced with artificial systems. This is particularly important for sensorimotor systems, where decisions and motor actions must be taken and updated in real-time. Preliminary tests are performed on a reactive color tracking application, using spatially distributed color features.
Data-driven cluster reinforcement and visualization in sparsely-matched self-organizing maps.
Manukyan, Narine; Eppstein, Margaret J; Rizzo, Donna M
2012-05-01
A self-organizing map (SOM) is a self-organized projection of high-dimensional data onto a typically 2-dimensional (2-D) feature map, wherein vector similarity is implicitly translated into topological closeness in the 2-D projection. However, when there are more neurons than input patterns, it can be challenging to interpret the results, due to diffuse cluster boundaries and limitations of current methods for displaying interneuron distances. In this brief, we introduce a new cluster reinforcement (CR) phase for sparsely-matched SOMs. The CR phase amplifies within-cluster similarity in an unsupervised, data-driven manner. Discontinuities in the resulting map correspond to between-cluster distances and are stored in a boundary (B) matrix. We describe a new hierarchical visualization of cluster boundaries displayed directly on feature maps, which requires no further clustering beyond what was implicitly accomplished during self-organization in SOM training. We use a synthetic benchmark problem and previously published microbial community profile data to demonstrate the benefits of the proposed methods.
Joint Feature Extraction and Classifier Design for ECG-Based Biometric Recognition.
Gutta, Sandeep; Cheng, Qi
2016-03-01
Traditional biometric recognition systems often utilize physiological traits such as fingerprint, face, iris, etc. Recent years have seen a growing interest in electrocardiogram (ECG)-based biometric recognition techniques, especially in the field of clinical medicine. In existing ECG-based biometric recognition methods, feature extraction and classifier design are usually performed separately. In this paper, a multitask learning approach is proposed, in which feature extraction and classifier design are carried out simultaneously. Weights are assigned to the features within the kernel of each task. We decompose the matrix consisting of all the feature weights into sparse and low-rank components. The sparse component determines the features that are relevant to identify each individual, and the low-rank component determines the common feature subspace that is relevant to identify all the subjects. A fast optimization algorithm is developed, which requires only the first-order information. The performance of the proposed approach is demonstrated through experiments using the MIT-BIH Normal Sinus Rhythm database.
Improving EEG-Based Driver Fatigue Classification Using Sparse-Deep Belief Networks.
Chai, Rifai; Ling, Sai Ho; San, Phyo Phyo; Naik, Ganesh R; Nguyen, Tuan N; Tran, Yvonne; Craig, Ashley; Nguyen, Hung T
2017-01-01
This paper presents an improvement of classification performance for electroencephalography (EEG)-based driver fatigue classification between fatigue and alert states with the data collected from 43 participants. The system employs autoregressive (AR) modeling as the features extraction algorithm, and sparse-deep belief networks (sparse-DBN) as the classification algorithm. Compared to other classifiers, sparse-DBN is a semi supervised learning method which combines unsupervised learning for modeling features in the pre-training layer and supervised learning for classification in the following layer. The sparsity in sparse-DBN is achieved with a regularization term that penalizes a deviation of the expected activation of hidden units from a fixed low-level prevents the network from overfitting and is able to learn low-level structures as well as high-level structures. For comparison, the artificial neural networks (ANN), Bayesian neural networks (BNN), and original deep belief networks (DBN) classifiers are used. The classification results show that using AR feature extractor and DBN classifiers, the classification performance achieves an improved classification performance with a of sensitivity of 90.8%, a specificity of 90.4%, an accuracy of 90.6%, and an area under the receiver operating curve (AUROC) of 0.94 compared to ANN (sensitivity at 80.8%, specificity at 77.8%, accuracy at 79.3% with AUC-ROC of 0.83) and BNN classifiers (sensitivity at 84.3%, specificity at 83%, accuracy at 83.6% with AUROC of 0.87). Using the sparse-DBN classifier, the classification performance improved further with sensitivity of 93.9%, a specificity of 92.3%, and an accuracy of 93.1% with AUROC of 0.96. Overall, the sparse-DBN classifier improved accuracy by 13.8, 9.5, and 2.5% over ANN, BNN, and DBN classifiers, respectively.
Improving EEG-Based Driver Fatigue Classification Using Sparse-Deep Belief Networks
Chai, Rifai; Ling, Sai Ho; San, Phyo Phyo; Naik, Ganesh R.; Nguyen, Tuan N.; Tran, Yvonne; Craig, Ashley; Nguyen, Hung T.
2017-01-01
This paper presents an improvement of classification performance for electroencephalography (EEG)-based driver fatigue classification between fatigue and alert states with the data collected from 43 participants. The system employs autoregressive (AR) modeling as the features extraction algorithm, and sparse-deep belief networks (sparse-DBN) as the classification algorithm. Compared to other classifiers, sparse-DBN is a semi supervised learning method which combines unsupervised learning for modeling features in the pre-training layer and supervised learning for classification in the following layer. The sparsity in sparse-DBN is achieved with a regularization term that penalizes a deviation of the expected activation of hidden units from a fixed low-level prevents the network from overfitting and is able to learn low-level structures as well as high-level structures. For comparison, the artificial neural networks (ANN), Bayesian neural networks (BNN), and original deep belief networks (DBN) classifiers are used. The classification results show that using AR feature extractor and DBN classifiers, the classification performance achieves an improved classification performance with a of sensitivity of 90.8%, a specificity of 90.4%, an accuracy of 90.6%, and an area under the receiver operating curve (AUROC) of 0.94 compared to ANN (sensitivity at 80.8%, specificity at 77.8%, accuracy at 79.3% with AUC-ROC of 0.83) and BNN classifiers (sensitivity at 84.3%, specificity at 83%, accuracy at 83.6% with AUROC of 0.87). Using the sparse-DBN classifier, the classification performance improved further with sensitivity of 93.9%, a specificity of 92.3%, and an accuracy of 93.1% with AUROC of 0.96. Overall, the sparse-DBN classifier improved accuracy by 13.8, 9.5, and 2.5% over ANN, BNN, and DBN classifiers, respectively. PMID:28326009
Filtered gradient reconstruction algorithm for compressive spectral imaging
NASA Astrophysics Data System (ADS)
Mejia, Yuri; Arguello, Henry
2017-04-01
Compressive sensing matrices are traditionally based on random Gaussian and Bernoulli entries. Nevertheless, they are subject to physical constraints, and their structure unusually follows a dense matrix distribution, such as the case of the matrix related to compressive spectral imaging (CSI). The CSI matrix represents the integration of coded and shifted versions of the spectral bands. A spectral image can be recovered from CSI measurements by using iterative algorithms for linear inverse problems that minimize an objective function including a quadratic error term combined with a sparsity regularization term. However, current algorithms are slow because they do not exploit the structure and sparse characteristics of the CSI matrices. A gradient-based CSI reconstruction algorithm, which introduces a filtering step in each iteration of a conventional CSI reconstruction algorithm that yields improved image quality, is proposed. Motivated by the structure of the CSI matrix, Φ, this algorithm modifies the iterative solution such that it is forced to converge to a filtered version of the residual ΦTy, where y is the compressive measurement vector. We show that the filtered-based algorithm converges to better quality performance results than the unfiltered version. Simulation results highlight the relative performance gain over the existing iterative algorithms.
Zhang, Kaihua; Zhang, Lei; Yang, Ming-Hsuan
2014-10-01
It is a challenging task to develop effective and efficient appearance models for robust object tracking due to factors such as pose variation, illumination change, occlusion, and motion blur. Existing online tracking algorithms often update models with samples from observations in recent frames. Despite much success has been demonstrated, numerous issues remain to be addressed. First, while these adaptive appearance models are data-dependent, there does not exist sufficient amount of data for online algorithms to learn at the outset. Second, online tracking algorithms often encounter the drift problems. As a result of self-taught learning, misaligned samples are likely to be added and degrade the appearance models. In this paper, we propose a simple yet effective and efficient tracking algorithm with an appearance model based on features extracted from a multiscale image feature space with data-independent basis. The proposed appearance model employs non-adaptive random projections that preserve the structure of the image feature space of objects. A very sparse measurement matrix is constructed to efficiently extract the features for the appearance model. We compress sample images of the foreground target and the background using the same sparse measurement matrix. The tracking task is formulated as a binary classification via a naive Bayes classifier with online update in the compressed domain. A coarse-to-fine search strategy is adopted to further reduce the computational complexity in the detection procedure. The proposed compressive tracking algorithm runs in real-time and performs favorably against state-of-the-art methods on challenging sequences in terms of efficiency, accuracy and robustness.
Macquarrie, K T B; Mayer, K U; Jin, B; Spiessl, S M
2010-03-01
Redox evolution in sparsely fractured crystalline rocks is a key, and largely unresolved, issue when assessing the geochemical suitability of deep geological repositories for nuclear waste. Redox zonation created by the influx of oxygenated waters has previously been simulated using reactive transport models that have incorporated a variety of processes, resulting in predictions for the depth of oxygen penetration that may vary greatly. An assessment and direct comparison of the various underlying conceptual models are therefore needed. In this work a reactive transport model that considers multiple processes in an integrated manner is used to investigate the ingress of oxygen for both single fracture and fracture zone scenarios. It is shown that the depth of dissolved oxygen migration is greatly influenced by the a priori assumptions that are made in the conceptual models. For example, the ability of oxygen to access and react with minerals in the rock matrix may be of paramount importance for single fracture conceptual models. For fracture zone systems, the abundance and reactivity of minerals within the fractures and thin matrix slabs between the fractures appear to provide key controls on O(2) attenuation. The findings point to the need for improved understanding of the coupling between the key transport-reaction feedbacks to determine which conceptual models are most suitable and to provide guidance for which parameters should be targeted in field and laboratory investigations. Copyright 2009 Elsevier B.V. All rights reserved.
Phase matrix induced symmetrics for multiple scattering using the matrix operator method
NASA Technical Reports Server (NTRS)
Hitzfelder, S. J.; Kattawar, G. W.
1973-01-01
Entirely rigorous proofs of the symmetries induced by the phase matrix into the reflection and transmission operators used in the matrix operator theory are given. Results are obtained for multiple scattering in both homogeneous and inhomogeneous atmospheres. These results will be useful to researchers using the method since large savings in computer time and storage are obtainable.
Non-uniform sampling: post-Fourier era of NMR data collection and processing.
Kazimierczuk, Krzysztof; Orekhov, Vladislav
2015-11-01
The invention of multidimensional techniques in the 1970s revolutionized NMR, making it the general tool of structural analysis of molecules and materials. In the most straightforward approach, the signal sampling in the indirect dimensions of a multidimensional experiment is performed in the same manner as in the direct dimension, i.e. with a grid of equally spaced points. This results in lengthy experiments with a resolution often far from optimum. To circumvent this problem, numerous sparse-sampling techniques have been developed in the last three decades, including two traditionally distinct approaches: the radial sampling and non-uniform sampling. This mini review discusses the sparse signal sampling and reconstruction techniques from the point of view of an underdetermined linear algebra problem that arises when a full, equally spaced set of sampled points is replaced with sparse sampling. Additional assumptions that are introduced to solve the problem, as well as the shape of the undersampled Fourier transform operator (visualized as so-called point spread function), are shown to be the main differences between various sparse-sampling methods. Copyright © 2015 John Wiley & Sons, Ltd.
Chen, Bo; Chen, Minhua; Paisley, John; Zaas, Aimee; Woods, Christopher; Ginsburg, Geoffrey S; Hero, Alfred; Lucas, Joseph; Dunson, David; Carin, Lawrence
2010-11-09
Nonparametric Bayesian techniques have been developed recently to extend the sophistication of factor models, allowing one to infer the number of appropriate factors from the observed data. We consider such techniques for sparse factor analysis, with application to gene-expression data from three virus challenge studies. Particular attention is placed on employing the Beta Process (BP), the Indian Buffet Process (IBP), and related sparseness-promoting techniques to infer a proper number of factors. The posterior density function on the model parameters is computed using Gibbs sampling and variational Bayesian (VB) analysis. Time-evolving gene-expression data are considered for respiratory syncytial virus (RSV), Rhino virus, and influenza, using blood samples from healthy human subjects. These data were acquired in three challenge studies, each executed after receiving institutional review board (IRB) approval from Duke University. Comparisons are made between several alternative means of per-forming nonparametric factor analysis on these data, with comparisons as well to sparse-PCA and Penalized Matrix Decomposition (PMD), closely related non-Bayesian approaches. Applying the Beta Process to the factor scores, or to the singular values of a pseudo-SVD construction, the proposed algorithms infer the number of factors in gene-expression data. For real data the "true" number of factors is unknown; in our simulations we consider a range of noise variances, and the proposed Bayesian models inferred the number of factors accurately relative to other methods in the literature, such as sparse-PCA and PMD. We have also identified a "pan-viral" factor of importance for each of the three viruses considered in this study. We have identified a set of genes associated with this pan-viral factor, of interest for early detection of such viruses based upon the host response, as quantified via gene-expression data.
Computer Sciences and Data Systems, volume 1
NASA Technical Reports Server (NTRS)
1987-01-01
Topics addressed include: software engineering; university grants; institutes; concurrent processing; sparse distributed memory; distributed operating systems; intelligent data management processes; expert system for image analysis; fault tolerant software; and architecture research.
NASA Astrophysics Data System (ADS)
Yenier, E.; Baturan, D.; Karimi, S.
2016-12-01
Monitoring of seismicity related to oil and gas operations is routinely performed nowadays using a number of different surface and downhole seismic array configurations and technologies. Here, we provide a hydraulic fracture (HF) monitoring case study that compares the data set generated by a sparse local surface network of broadband seismometers to a data set generated by a single downhole geophone string. Our data was collected during a 5-day single-well HF operation, by a temporary surface network consisting of 10 stations deployed within 5 km of the production well. The downhole data was recorded by a 20 geophone string deployed in an observation well located 15 m from the production well. Surface network data processing included standard STA/LTA event triggering enhanced by template-matching subspace detection, grid search locations which was improved using the double-differencing re-location technique, as well as Richter (ML) and moment (Mw) magnitude computations for all detected events. In addition, moment tensors were computed from first motion polarities and amplitudes for the subset of highest SNR events. The resulting surface event catalog shows a very weak spatio-temporal correlation to HF operations with only 43% of recorded seismicity occurring during HF stages times. This along with source mechanisms shows that the surface-recorded seismicity delineates the activation of several pre-existing structures striking NNE-SSW and consistent with regional stress conditions as indicated by the orientation of SHmax. Comparison of the sparse-surface and single downhole string datasets allows us to perform a cost-benefit analysis of the two monitoring methods. Our findings show that although the downhole array recorded ten times as many events, the surface network provides a more coherent delineation of the underlying structure and more accurate magnitudes for larger magnitude events. We attribute this to the enhanced focal coverage provided by the surface network and the use of broadband instrumentation. The results indicate that sparse surface networks of high quality instruments can provide rich and reliable datasets for evaluation of the impact and effectiveness of hydraulic fracture operations in regions with favorable surface noise, local stress and attenuation characteristics.
Evaluation of several non-reflecting computational boundary conditions for duct acoustics
NASA Technical Reports Server (NTRS)
Watson, Willie R.; Zorumski, William E.; Hodge, Steve L.
1994-01-01
Several non-reflecting computational boundary conditions that meet certain criteria and have potential applications to duct acoustics are evaluated for their effectiveness. The same interior solution scheme, grid, and order of approximation are used to evaluate each condition. Sparse matrix solution techniques are applied to solve the matrix equation resulting from the discretization. Modal series solutions for the sound attenuation in an infinite duct are used to evaluate the accuracy of each non-reflecting boundary conditions. The evaluations are performed for sound propagation in a softwall duct, for several sources, sound frequencies, and duct lengths. It is shown that a recently developed nonlocal boundary condition leads to sound attenuation predictions considerably more accurate for short ducts. This leads to a substantial reduction in the number of grid points when compared to other non-reflecting conditions.