matrix factorization algorithm: Topics by Science.gov

Sample records for matrix factorization algorithm

Recursive flexible multibody system dynamics using spatial operators

NASA Technical Reports Server (NTRS)

Jain, A.; Rodriguez, G.

1992-01-01

This paper uses spatial operators to develop new spatially recursive dynamics algorithms for flexible multibody systems. The operator description of the dynamics is identical to that for rigid multibody systems. Assumed-mode models are used for the deformation of each individual body. The algorithms are based on two spatial operator factorizations of the system mass matrix. The first (Newton-Euler) factorization of the mass matrix leads to recursive algorithms for the inverse dynamics, mass matrix evaluation, and composite-body forward dynamics for the systems. The second (innovations) factorization of the mass matrix leads to an operator expression for the mass matrix inverse and to a recursive articulated-body forward dynamics algorithm. The primary focus is on serial chains, but extensions to general topologies are also described. A comparison of computational costs shows that the articulated-body forward dynamics algorithm is much more efficient than the composite-body algorithm for most flexible multibody systems.
Computing row and column counts for sparse QR and LU factorization

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gilbert, John R.; Li, Xiaoye S.; Ng, Esmond G.

2001-01-01

We present algorithms to determine the number of nonzeros in each row and column of the factors of a sparse matrix, for both the QR factorization and the LU factorization with partial pivoting. The algorithms use only the nonzero structure of the input matrix, and run in time nearly linear in the number of nonzeros in that matrix. They may be used to set up data structures or schedule parallel operations in advance of the numerical factorization. The row and column counts we compute are upper bounds on the actual counts. If the input matrix is strong Hall and there is no coincidental numerical cancellation, the counts are exact for QR factorization and are the tightest bounds possible for LU factorization. These algorithms are based on our earlier work on computing row and column counts for sparse Cholesky factorization, plus an efficient method to compute the column elimination tree of a sparse matrix without explicitly forming the product of the matrix and its transpose.
Algorithms for Solvents and Spectral Factors of Matrix Polynomials

DTIC Science & Technology

1981-01-01

spectral factors of matrix polynomials. LEANG S. SHIEH, YIH T. TSAY and NORMAN P. COLEMAN. A generalized Newton method, based on the contracted gradient...of a matrix polynomial, is derived for solving the right (left) solvents and spectral factors of matrix polynomials. Two methods of selecting initial...estimates for rapid convergence of the newly developed numerical method are proposed. Also, new algorithms for solving complete sets of the right
Parallel O(log n) algorithms for open- and closed-chain rigid multibody systems based on a new mass matrix factorization technique

NASA Technical Reports Server (NTRS)

Fijany, Amir

1993-01-01

In this paper, parallel O(log n) algorithms for computation of rigid multibody dynamics are developed. These parallel algorithms are derived by parallelization of new O(n) algorithms for the problem. The underlying feature of these O(n) algorithms is a drastically different strategy for decomposition of interbody force which leads to a new factorization of the mass matrix (M). Specifically, it is shown that a factorization of the inverse of the mass matrix in the form of the Schur complement is derived as $M^{-1} = C - B^{*} A^{-1} B$, wherein matrices C, A, and B are block tridiagonal matrices. The new O(n) algorithm is then derived as a recursive implementation of this factorization of $M^{-1}$. For the closed-chain systems, similar factorizations and O(n) algorithms for computation of the Operational Space Mass Matrix $\Lambda$ and its inverse $\Lambda^{-1}$ are also derived. It is shown that these O(n) algorithms are strictly parallel, that is, they are less efficient than other algorithms for serial computation of the problem. But, to our knowledge, they are the only known algorithms that can be parallelized and that lead to both time- and processor-optimal parallel algorithms for the problem, i.e., parallel O(log n) algorithms with O(n) processors. The developed parallel algorithms, in addition to their theoretical significance, are also practical from an implementation point of view due to their simple architectural requirements.
Adaptive multi-view clustering based on nonnegative matrix factorization and pairwise co-regularization

NASA Astrophysics Data System (ADS)

Zhang, Tianzhen; Wang, Xiumei; Gao, Xinbo

2018-04-01

Many datasets today are described by multiple views, which usually include shared and complementary information. Multi-view clustering methods integrate the information from multiple views to obtain better clustering results. Nonnegative matrix factorization has become an essential and popular tool in clustering methods because of its interpretability. However, existing nonnegative matrix factorization based multi-view clustering algorithms do not consider the disagreement between views and neglect the fact that different views will have different contributions to the data distribution. In this paper, we propose a new multi-view clustering method, named adaptive multi-view clustering based on nonnegative matrix factorization and pairwise co-regularization. The proposed algorithm can obtain the parts-based representation of multi-view data by nonnegative matrix factorization. Then, pairwise co-regularization is used to measure the disagreement between views. Only one parameter is needed to automatically learn the weight values according to the contribution of each view to the data distribution. Experimental results show that the proposed algorithm outperforms several state-of-the-art algorithms for multi-view clustering.
Block LU factorization

NASA Technical Reports Server (NTRS)

Demmel, James W.; Higham, Nicholas J.; Schreiber, Robert S.

1992-01-01

Many of the currently popular 'block algorithms' are scalar algorithms in which the operations have been grouped and reordered into matrix operations. One genuine block algorithm in practical use is block LU factorization, and this has recently been shown by Demmel and Higham to be unstable in general. It is shown here that block LU factorization is stable if A is block diagonally dominant by columns. Moreover, for a general matrix the level of instability in block LU factorization can be bounded in terms of the condition number κ(A) and the growth factor for Gaussian elimination without pivoting. A consequence is that block LU factorization is stable for a matrix A that is symmetric positive definite or point diagonally dominant by rows or columns as long as A is well-conditioned.
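As a rough illustration of what a block LU step computes, the sketch below performs one level of block elimination on a 4 x 4 matrix split into 2 x 2 blocks: L21 = A21 A11^{-1} and the Schur complement S = A22 - L21 A12. The helper names (`inv2`, `matmul`, `block_lu_2x2`) and the test matrix are illustrative assumptions, not taken from the report, and the example matrix is simply chosen to be point diagonally dominant by columns.

```python
def inv2(M):
    """Closed-form inverse of a 2 x 2 matrix (assumes a nonzero determinant)."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul(A, B):
    """Plain triple-loop matrix product on lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*B)] for row in A]


def block_lu_2x2(A):
    """One-level block LU of a 4 x 4 matrix with 2 x 2 blocks, no pivoting.

    Returns L = [[I, 0], [L21, I]] and U = [[A11, A12], [0, S]].
    """
    A11 = [r[:2] for r in A[:2]]
    A12 = [r[2:] for r in A[:2]]
    A21 = [r[:2] for r in A[2:]]
    A22 = [r[2:] for r in A[2:]]
    L21 = matmul(A21, inv2(A11))          # block multiplier
    L21A12 = matmul(L21, A12)
    # Schur complement of A11 in A
    S = [[A22[i][j] - L21A12[i][j] for j in range(2)] for i in range(2)]
    L = [[1.0, 0.0, 0.0, 0.0],
         [0.0, 1.0, 0.0, 0.0],
         L21[0] + [1.0, 0.0],
         L21[1] + [0.0, 1.0]]
    U = [A11[0] + A12[0],
         A11[1] + A12[1],
         [0.0, 0.0] + S[0],
         [0.0, 0.0] + S[1]]
    return L, U


# Hypothetical test matrix, diagonally dominant by columns.
A = [[10.0, 1.0, 2.0, 0.0],
     [1.0, 9.0, 0.0, 1.0],
     [2.0, 0.0, 8.0, 1.0],
     [0.0, 1.0, 1.0, 7.0]]
L, U = block_lu_2x2(A)
LU = matmul(L, U)
```

Note that U here is block upper triangular, not triangular: the diagonal blocks A11 and S are left unfactored, which is what distinguishes a genuine block algorithm from a regrouped scalar one.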
State-Space System Realization with Input- and Output-Data Correlation

NASA Technical Reports Server (NTRS)

Juang, Jer-Nan

1997-01-01

This paper introduces a general version of the information matrix consisting of the autocorrelation and cross-correlation matrices of the shifted input and output data. Based on the concept of data correlation, a new system realization algorithm is developed to create a model directly from input and output data. The algorithm starts by computing a special type of correlation matrix derived from the information matrix. The special correlation matrix provides information on the system-observability matrix and the state-vector correlation. A system model is then developed from the observability matrix in conjunction with other algebraic manipulations. This approach leads to several different algorithms for computing system matrices for use in representing the system model. The relationship of the new algorithms with other realization algorithms in the time and frequency domains is established with matrix factorization of the information matrix. Several examples are given to illustrate the validity and usefulness of these new algorithms.
RECEPTOR MODELING OF AMBIENT PARTICULATE MATTER DATA USING POSITIVE MATRIX FACTORIZATION REVIEW OF EXISTING METHODS

EPA Science Inventory

Methods for apportioning sources of ambient particulate matter (PM) using the positive matrix factorization (PMF) algorithm are reviewed. Numerous procedural decisions must be made and algorithmic parameters selected when analyzing PM data with PMF. However, few publications docu...
A Hybrid Algorithm for Non-negative Matrix Factorization Based on Symmetric Information Divergence

PubMed Central

Devarajan, Karthik; Ebrahimi, Nader; Soofi, Ehsan

2017-01-01

The objective of this paper is to provide a hybrid algorithm for non-negative matrix factorization based on a symmetric version of Kullback-Leibler divergence, known as intrinsic information. The convergence of the proposed algorithm is shown for several members of the exponential family such as the Gaussian, Poisson, gamma and inverse Gaussian models. The speed of this algorithm is examined and its usefulness is illustrated through some applied problems. PMID:28868206
UDU/T/ covariance factorization for Kalman filtering

NASA Technical Reports Server (NTRS)

Thornton, C. L.; Bierman, G. J.

1980-01-01

There has been strong motivation to produce numerically stable formulations of the Kalman filter algorithms because it has long been known that the original discrete-time Kalman formulas are numerically unreliable. Numerical instability can be avoided by propagating certain factors of the estimate error covariance matrix rather than the covariance matrix itself. This paper documents filter algorithms that correspond to the covariance factorization P = UDU(T), where U is a unit upper triangular matrix and D is diagonal. Emphasis is on computational efficiency and numerical stability, since these properties are of key importance in real-time filter applications. The history of square-root and U-D covariance filters is reviewed. Simple examples are given to illustrate the numerical inadequacy of the Kalman covariance filter algorithms; these examples show how factorization techniques can give improved computational reliability.
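The factorization P = UDU(T) described above can be sketched in a few lines. The following is a minimal illustration of computing U and D for a small symmetric positive definite matrix, working backward from the last column; the function names and the sample matrix are assumptions made for this example, and the filter update formulas from the paper are not reproduced.

```python
def udu_factor(P):
    """Factor a symmetric positive definite P as U * diag(D) * U^T,
    with U unit upper triangular. Columns are processed last to first."""
    n = len(P)
    U = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    D = [0.0] * n
    for j in range(n - 1, -1, -1):
        # Diagonal entry: remove contributions of later columns.
        D[j] = P[j][j] - sum(D[k] * U[j][k] ** 2 for k in range(j + 1, n))
        for i in range(j):
            s = sum(D[k] * U[i][k] * U[j][k] for k in range(j + 1, n))
            U[i][j] = (P[i][j] - s) / D[j]
    return U, D


def reconstruct(U, D):
    """Rebuild U * diag(D) * U^T to check the factorization."""
    n = len(D)
    return [[sum(U[i][k] * D[k] * U[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Hypothetical 3 x 3 covariance matrix (symmetric positive definite).
P = [[4.0, 2.0, 0.6],
     [2.0, 2.0, 0.5],
     [0.6, 0.5, 1.0]]
U, D = udu_factor(P)
R = reconstruct(U, D)
```

For a positive definite P every entry of D is positive, so a filter that propagates U and D cannot produce an indefinite covariance through rounding in the way a direct covariance update can.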
Communication-avoiding symmetric-indefinite factorization

DOE PAGES

Ballard, Grey Malone; Becker, Dulcenia; Demmel, James; ...

2014-11-13

We describe and analyze a novel symmetric triangular factorization algorithm. The algorithm is essentially a block version of Aasen's triangular tridiagonalization. It factors a dense symmetric matrix A as the product $A = P L T L^{T} P^{T}$ where P is a permutation matrix, L is lower triangular, and T is block tridiagonal and banded. The algorithm is the first symmetric-indefinite communication-avoiding factorization: it performs an asymptotically optimal amount of communication in a two-level memory hierarchy for almost any cache-line size. Adaptations of the algorithm to parallel computers are likely to be communication efficient as well; one such adaptation has been recently published. As a result, the current paper describes the algorithm, proves that it is numerically stable, and proves that it is communication optimal.
Sparse non-negative matrix factorizations via alternating non-negativity-constrained least squares for microarray data analysis.

PubMed

Kim, Hyunsoo; Park, Haesun

2007-06-15

Many practical pattern recognition problems require non-negativity constraints. For example, pixels in digital images and chemical concentrations in bioinformatics are non-negative. Sparse non-negative matrix factorizations (NMFs) are useful when the degree of sparseness in the non-negative basis matrix or the non-negative coefficient matrix in an NMF needs to be controlled in approximating high-dimensional data in a lower dimensional space. In this article, we introduce a novel formulation of sparse NMF and show how the new formulation leads to a convergent sparse NMF algorithm via alternating non-negativity-constrained least squares. We apply our sparse NMF algorithm to cancer-class discovery and gene expression data analysis and offer biological analysis of the results obtained. Our experimental results illustrate that the proposed sparse NMF algorithm often achieves better clustering performance with shorter computing time compared to other existing NMF algorithms. The software is available as supplementary material.
Scalable non-negative matrix tri-factorization.

PubMed

Čopar, Andrej; Žitnik, Marinka; Zupan, Blaž

2017-01-01

Matrix factorization is a well established pattern discovery tool that has seen numerous applications in biomedical data analytics, such as gene expression co-clustering, patient stratification, and gene-disease association mining. Matrix factorization learns a latent data model that takes a data matrix and transforms it into a latent feature space enabling generalization, noise removal and feature discovery. However, factorization algorithms are numerically intensive, and hence there is a pressing challenge to scale current algorithms to work with large datasets. Our focus in this paper is matrix tri-factorization, a popular method that is not limited by the assumption of standard matrix factorization about data residing in one latent space. Matrix tri-factorization solves this by inferring a separate latent space for each dimension in a data matrix, and a latent mapping of interactions between the inferred spaces, making the approach particularly suitable for biomedical data mining. We developed a block-wise approach for latent factor learning in matrix tri-factorization. The approach partitions a data matrix into disjoint submatrices that are treated independently and fed into a parallel factorization system. An appealing property of the proposed approach is its mathematical equivalence with serial matrix tri-factorization. In a study on large biomedical datasets we show that our approach scales well on multi-processor and multi-GPU architectures. On a four-GPU system we demonstrate that our approach can be more than 100 times faster than its single-processor counterpart. A general approach for scaling non-negative matrix tri-factorization is proposed. The approach is especially useful for parallel matrix factorization implemented in a multi-GPU environment. We expect the new approach will be useful in emerging procedures for latent factor analysis, notably for data integration, where many large data matrices need to be collectively factorized.
Adaptive truncation of matrix decompositions and efficient estimation of NMR relaxation distributions

NASA Astrophysics Data System (ADS)

Teal, Paul D.; Eccles, Craig

2015-04-01

The two most successful methods of estimating the distribution of nuclear magnetic resonance relaxation times from two-dimensional data are data compression followed by application of the Butler-Reeds-Dawson algorithm, and a primal-dual interior point method using preconditioned conjugate gradient. Both of these methods have previously been presented using a truncated singular value decomposition of matrices representing the exponential kernel. In this paper it is shown that other matrix factorizations are applicable to each of these algorithms, and that these illustrate the different fundamental principles behind the operation of the algorithms. These are the rank-revealing QR (RRQR) factorization and the LDL factorization with diagonal pivoting, also known as the Bunch-Kaufman-Parlett factorization. It is shown that both algorithms can be improved by adaptation of the truncation as the optimization process progresses, improving the accuracy as the optimal value is approached. A variation on the interior point method, viz. the use of a barrier function instead of the primal-dual approach, is found to offer considerable improvement in terms of speed and reliability. A third type of algorithm, related to the fast iterative shrinkage-thresholding algorithm (FISTA), is applied to the problem. This method can be efficiently formulated without the use of a matrix decomposition.
Normalization Of Thermal-Radiation Form-Factor Matrix

NASA Technical Reports Server (NTRS)

Tsuyuki, Glenn T.

1994-01-01

Report describes algorithm that adjusts form-factor matrix in TRASYS computer program, which calculates intraspacecraft radiative interchange among various surfaces and environmental heat loading from sources such as sun.
Study on the Algorithm of Judgment Matrix in Analytic Hierarchy Process

NASA Astrophysics Data System (ADS)

Lu, Zhiyong; Qin, Futong; Jin, Yican

2017-10-01

A new algorithm is proposed for the non-consistent judgment matrix in the analytic hierarchy process (AHP). A primary judgment matrix is first generated by pre-ordering the targeted factor set, and a compared matrix is built through the top integral function. Then a relative error matrix is created by comparing the compared matrix with the primary judgment matrix, which is adjusted step by step under the control of the relative error matrix and the dissimilar degree of the matrix. Lastly, the targeted judgment matrix is generated to satisfy the requirement of consistency and the least dissimilar degree. The feasibility and validity of the proposed method are verified by simulation results.
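For context on what "consistency" of a judgment matrix means, the sketch below computes the standard AHP consistency index CI = (lambda_max - n) / (n - 1), using power iteration to estimate the principal eigenvalue. This is the textbook consistency check, not the adjustment algorithm proposed in the paper; the function names and both sample matrices are assumptions for illustration.

```python
def principal_eigen(A, iters=200):
    """Estimate the principal eigenvalue and weight vector of a positive
    matrix by power iteration. The weights are kept normalized to sum to 1,
    so the sum of A*w is the current eigenvalue estimate."""
    n = len(A)
    w = [1.0 / n] * n
    lam = 0.0
    for _ in range(iters):
        v = [sum(A[i][j] * w[j] for j in range(n)) for i in range(n)]
        lam = sum(v)
        w = [x / lam for x in v]
    return lam, w


def consistency_index(A):
    """CI = (lambda_max - n) / (n - 1); zero for a fully consistent matrix."""
    n = len(A)
    lam, _ = principal_eigen(A)
    return (lam - n) / (n - 1)


# Fully consistent matrix built from hypothetical weights: a_ij = w_i / w_j.
weights = [0.6, 0.3, 0.1]
consistent = [[wi / wj for wj in weights] for wi in weights]

# Same matrix with one pairwise judgment changed (6 -> 9), kept reciprocal.
inconsistent = [[1.0, 2.0, 9.0],
                [0.5, 1.0, 3.0],
                [1.0 / 9.0, 1.0 / 3.0, 1.0]]

ci_consistent = consistency_index(consistent)
ci_inconsistent = consistency_index(inconsistent)
```

A positive reciprocal matrix has lambda_max >= n, with equality only when every judgment satisfies a_ik = a_ij * a_jk, so a larger CI signals a less consistent set of pairwise comparisons.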
The Incremental Multiresolution Matrix Factorization Algorithm

PubMed Central

Ithapu, Vamsi K.; Kondor, Risi; Johnson, Sterling C.; Singh, Vikas

2017-01-01

Multiresolution analysis and matrix factorization are foundational tools in computer vision. In this work, we study the interface between these two distinct topics and obtain techniques to uncover hierarchical block structure in symmetric matrices – an important aspect in the success of many vision problems. Our new algorithm, the incremental multiresolution matrix factorization, uncovers such structure one feature at a time, and hence scales well to large matrices. We describe how this multiscale analysis goes much farther than what a direct “global” factorization of the data can identify. We evaluate the efficacy of the resulting factorizations for relative leveraging within regression tasks using medical imaging data. We also use the factorization on representations learned by popular deep networks, providing evidence of their ability to infer semantic relationships even when they are not explicitly trained to do so. We show that this algorithm can be used as an exploratory tool to improve the network architecture, and within numerous other settings in vision. PMID:29416293
Using Strassen's algorithm to accelerate the solution of linear systems

NASA Technical Reports Server (NTRS)

Bailey, David H.; Lee, King; Simon, Horst D.

1990-01-01

Strassen's algorithm for fast matrix-matrix multiplication has been implemented for matrices of arbitrary shapes on the CRAY-2 and CRAY Y-MP supercomputers. Several techniques have been used to reduce the scratch space requirement for this algorithm while simultaneously preserving a high level of performance. When the resulting Strassen-based matrix multiply routine is combined with some routines from the new LAPACK library, LU decomposition can be performed with rates significantly higher than those achieved by conventional means. We succeeded in factoring a 2048 x 2048 matrix on the CRAY Y-MP at a rate equivalent to 325 MFLOPS.
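To make the idea concrete, here is a minimal recursive sketch of Strassen's seven-multiplication scheme. It assumes square matrices whose size is a power of two and falls back to the ordinary product below a cutoff; the arbitrary-shape handling and scratch-space reductions described in the report are not attempted, and all function names are illustrative.

```python
def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def mat_sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def naive_mul(A, B):
    """Conventional O(n^3) product, used as the recursion base case."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def split(A):
    """Split a square matrix of even size into four quadrants."""
    h = len(A) // 2
    return ([r[:h] for r in A[:h]], [r[h:] for r in A[:h]],
            [r[:h] for r in A[h:]], [r[h:] for r in A[h:]])


def strassen(A, B, cutoff=2):
    """Strassen product; assumes n x n inputs with n a power of two."""
    n = len(A)
    if n <= cutoff:
        return naive_mul(A, B)
    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)
    # Seven recursive products instead of eight.
    M1 = strassen(mat_add(A11, A22), mat_add(B11, B22), cutoff)
    M2 = strassen(mat_add(A21, A22), B11, cutoff)
    M3 = strassen(A11, mat_sub(B12, B22), cutoff)
    M4 = strassen(A22, mat_sub(B21, B11), cutoff)
    M5 = strassen(mat_add(A11, A12), B22, cutoff)
    M6 = strassen(mat_sub(A21, A11), mat_add(B11, B12), cutoff)
    M7 = strassen(mat_sub(A12, A22), mat_add(B21, B22), cutoff)
    C11 = mat_add(mat_sub(mat_add(M1, M4), M5), M7)
    C12 = mat_add(M3, M5)
    C21 = mat_add(M2, M4)
    C22 = mat_add(mat_add(mat_sub(M1, M2), M3), M6)
    top = [left + right for left, right in zip(C11, C12)]
    bottom = [left + right for left, right in zip(C21, C22)]
    return top + bottom


# Small integer matrices so the comparison with the naive product is exact.
A = [[(i * 8 + j) % 7 - 3 for j in range(8)] for i in range(8)]
B = [[(i + 2 * j) % 5 - 2 for j in range(8)] for i in range(8)]
C = strassen(A, B)
```

In practice the cutoff is set much higher than 2, since the extra additions only pay off for large blocks; the value here is kept small so the recursion is actually exercised on an 8 x 8 example.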
Using Dynamic Multi-Task Non-Negative Matrix Factorization to Detect the Evolution of User Preferences in Collaborative Filtering

PubMed Central

Ju, Bin; Qian, Yuntao; Ye, Minchao; Ni, Rong; Zhu, Chenxi

2015-01-01

Predicting what items will be selected by a target user in the future is an important function for recommendation systems. Matrix factorization techniques have been shown to achieve good performance on temporal rating-type data, but little is known about temporal item selection data. In this paper, we developed a unified model that combines Multi-task Non-negative Matrix Factorization and Linear Dynamical Systems to capture the evolution of user preferences. Specifically, user and item features are projected into latent factor space by factoring co-occurrence matrices into a common basis item-factor matrix and multiple factor-user matrices. Moreover, we represented both within and between relationships of multiple factor-user matrices using a state transition matrix to capture the changes in user preferences over time. The experiments show that our proposed algorithm outperforms the other algorithms on two real datasets, which were extracted from Netflix movies and Last.fm music. Furthermore, our model provides a novel dynamic topic model for tracking the evolution of the behavior of a user over time. PMID:26270539

An improved semi-implicit method for structural dynamics analysis

NASA Technical Reports Server (NTRS)

Park, K. C.

1982-01-01

A semi-implicit algorithm is presented for direct time integration of the structural dynamics equations. The algorithm avoids the factoring of the implicit difference solution matrix and mitigates the unacceptable accuracy losses which plagued previous semi-implicit algorithms. This substantial accuracy improvement is achieved by augmenting the solution matrix with two simple diagonal matrices of the order of the integration truncation error.
Pattern identification in time-course gene expression data with the CoGAPS matrix factorization.

PubMed

Fertig, Elana J; Stein-O'Brien, Genevieve; Jaffe, Andrew; Colantuoni, Carlo

2014-01-01

Patterns in time-course gene expression data can represent the biological processes that are active over the measured time period. However, the orthogonality constraint in standard pattern-finding algorithms, including notably principal components analysis (PCA), confounds expression changes resulting from simultaneous, non-orthogonal biological processes. Previously, we have shown that Markov chain Monte Carlo nonnegative matrix factorization algorithms are particularly adept at distinguishing such concurrent patterns. One such matrix factorization is implemented in the software package CoGAPS. We describe the application of this software and several technical considerations for identification of age-related patterns in a public, prefrontal cortex gene expression dataset.
Genetic algorithm and graph theory based matrix factorization method for online friend recommendation.

PubMed

Li, Qu; Yao, Min; Yang, Jianhua; Xu, Ning

2014-01-01

Online friend recommendation is a fast developing topic in web mining. In this paper, we used SVD matrix factorization to model user and item feature vectors and used stochastic gradient descent to update the parameters and improve accuracy. To tackle the cold-start problem and data sparsity, we used a KNN model to influence the user feature vector. At the same time, we used graph theory to partition communities with fairly low time and space complexity. What is more, matrix factorization can combine online and offline recommendation. Experiments showed that the hybrid recommendation algorithm is able to recommend online friends with good accuracy.
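The combination of latent-factor matrix factorization and stochastic gradient descent mentioned above can be sketched as follows. This is a generic regularized SGD factorization on a toy set of (user, item, score) triples; the KNN, graph-partitioning, and genetic-algorithm components of the paper are not modeled, and the data, hyperparameters, and function names are all assumptions chosen for illustration.

```python
import random


def predict(P, Q, u, i):
    """Predicted score: dot product of user and item feature vectors."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))


def rmse(ratings, P, Q):
    """Root mean squared error over the observed entries."""
    total = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
    return (total / len(ratings)) ** 0.5


def sgd_mf(ratings, n_users, n_items, rank=2, lr=0.02, reg=0.02,
           epochs=500, seed=0):
    """Fit user factors P and item factors Q by SGD on observed entries only."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.0, 0.5) for _ in range(rank)] for _ in range(n_users)]
    Q = [[rng.uniform(0.0, 0.5) for _ in range(rank)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for f in range(rank):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


# Hypothetical observed (user, item, score) triples; missing pairs are unknown.
ratings = [(0, 0, 5), (0, 1, 3), (0, 3, 1), (1, 0, 4), (1, 3, 1), (2, 0, 1),
           (2, 1, 1), (2, 3, 5), (3, 0, 1), (3, 2, 5), (3, 3, 4)]

P0, Q0 = sgd_mf(ratings, 4, 4, epochs=0)   # untrained factors, same seed
P, Q = sgd_mf(ratings, 4, 4)
initial_rmse = rmse(ratings, P0, Q0)
trained_rmse = rmse(ratings, P, Q)
```

Only observed entries contribute to the updates, which is what lets this style of factorization cope with sparse data; `predict(P, Q, u, i)` can then be evaluated for pairs that were never observed.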
Background recovery via motion-based robust principal component analysis with matrix factorization

NASA Astrophysics Data System (ADS)

Pan, Peng; Wang, Yongli; Zhou, Mingyuan; Sun, Zhipeng; He, Guoping

2018-03-01

Background recovery is a key technique in video analysis, but it still suffers from many challenges, such as camouflage, lighting changes, and diverse types of image noise. Robust principal component analysis (RPCA), which aims to recover a low-rank matrix and a sparse matrix, is a general framework for background recovery. The nuclear norm is widely used as a convex surrogate for the rank function in RPCA, which requires computing the singular value decomposition (SVD), a task that is increasingly costly as matrix sizes and ranks increase. However, matrix factorization greatly reduces the dimension of the matrix for which the SVD must be computed. Motion information has been shown to improve low-rank matrix recovery in RPCA, but this method still finds it difficult to handle original video data sets because of its batch-mode formulation and implementation. Hence, in this paper, we propose a motion-assisted RPCA model with matrix factorization (FM-RPCA) for background recovery. Moreover, an efficient linear alternating direction method of multipliers with a matrix factorization (FL-ADM) algorithm is designed for solving the proposed FM-RPCA model. Experimental results illustrate that the method provides stable results and is more efficient than the current state-of-the-art algorithms.
Highly parallel sparse Cholesky factorization

NASA Technical Reports Server (NTRS)

Gilbert, John R.; Schreiber, Robert

1990-01-01

Several fine grained parallel algorithms were developed and compared to compute the Cholesky factorization of a sparse matrix. The experimental implementations are on the Connection Machine, a distributed memory SIMD machine whose programming model conceptually supplies one processor per data element. In contrast to special purpose algorithms in which the matrix structure conforms to the connection structure of the machine, the focus is on matrices with arbitrary sparsity structure. The most promising algorithm is one whose inner loop performs several dense factorizations simultaneously on a 2-D grid of processors. Virtually any massively parallel dense factorization algorithm can be used as the key subroutine. The sparse code attains execution rates comparable to those of the dense subroutine. Although at present architectural limitations prevent the dense factorization from realizing its potential efficiency, it is concluded that a regular data parallel architecture can be used efficiently to solve arbitrarily structured sparse problems. A performance model is also presented and it is used to analyze the algorithms.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Yeung, Yu-Hong; Pothen, Alex; Halappanavar, Mahantesh

We present an augmented matrix approach to update the solution to a linear system of equations when the coefficient matrix is modified by a few elements within a principal submatrix. This problem arises in the dynamic security analysis of a power grid, where operators need to perform $N-x$ contingency analysis, i.e., determine the state of the system when up to $x$ links from $N$ fail. Our algorithms augment the coefficient matrix to account for the changes in it, and then compute the solution to the augmented system without refactoring the modified matrix. We provide two algorithms, a direct method, and a hybrid direct-iterative method for solving the augmented system. We also exploit the sparsity of the matrices and vectors to accelerate the overall computation. Our algorithms are compared on three power grids with PARDISO, a parallel direct solver, and CHOLMOD, a direct solver with the ability to modify the Cholesky factors of the coefficient matrix. We show that our augmented algorithms outperform PARDISO (by two orders of magnitude), and CHOLMOD (by a factor of up to 5). Further, our algorithms scale better than CHOLMOD as the number of elements updated increases. The solutions are computed with high accuracy. Our algorithms are capable of computing $N-x$ contingency analysis on a $778K$ bus grid, updating a solution with $x=20$ elements in $1.6 \times 10^{-2}$ seconds on an Intel Xeon processor.
Symmetric nonnegative matrix factorization: algorithms and applications to probabilistic clustering.

PubMed

He, Zhaoshui; Xie, Shengli; Zdunek, Rafal; Zhou, Guoxu; Cichocki, Andrzej

2011-12-01

Nonnegative matrix factorization (NMF) is an unsupervised learning method useful in various applications including image processing and semantic analysis of documents. This paper focuses on symmetric NMF (SNMF), which is a special case of NMF decomposition. Three parallel multiplicative update algorithms using level 3 basic linear algebra subprograms directly are developed for this problem. First, by minimizing the Euclidean distance, a multiplicative update algorithm is proposed, and its convergence under mild conditions is proved. Based on it, we further propose another two fast parallel methods: α-SNMF and β-SNMF algorithms. All of them are easy to implement. These algorithms are applied to probabilistic clustering. We demonstrate their effectiveness for facial image clustering, document categorization, and pattern clustering in gene expression.
Active subspace: toward scalable low-rank learning.

PubMed

Liu, Guangcan; Yan, Shuicheng

2012-12-01

We address the scalability issues in low-rank matrix learning problems. Usually these problems resort to solving nuclear norm regularized optimization problems (NNROPs), which often suffer from high computational complexities if based on existing solvers, especially in large-scale settings. Based on the fact that the optimal solution matrix to an NNROP is often low rank, we revisit the classic mechanism of low-rank matrix factorization, based on which we present an active subspace algorithm for efficiently solving NNROPs by transforming large-scale NNROPs into small-scale problems. The transformation is achieved by factorizing the large solution matrix into the product of a small orthonormal matrix (active subspace) and another small matrix. Although such a transformation generally leads to nonconvex problems, we show that a suboptimal solution can be found by the augmented Lagrange alternating direction method. For the robust PCA (RPCA) (Candès, Li, Ma, & Wright, 2009) problem, a typical example of NNROPs, theoretical results verify the suboptimality of the solution produced by our algorithm. For the general NNROPs, we empirically show that our algorithm significantly reduces the computational complexity without loss of optimality.
A fast, preconditioned conjugate gradient Toeplitz solver

NASA Technical Reports Server (NTRS)

Pan, Victor; Schrieber, Robert

1989-01-01

A simple factorization is given of an arbitrary hermitian, positive definite matrix in which the factors are well-conditioned, hermitian, and positive definite. In fact, given knowledge of the extreme eigenvalues of the original matrix A, an optimal improvement can be achieved, making the condition numbers of each of the two factors equal to the square root of the condition number of A. This technique is applied to the solution of hermitian, positive definite Toeplitz systems. Large linear systems with hermitian, positive definite Toeplitz matrices arise in some signal processing applications. A stable fast algorithm is given for solving these systems that is based on the preconditioned conjugate gradient method. The algorithm exploits Toeplitz structure to reduce the cost of an iteration to O(n log n) by applying the fast Fourier transform to compute matrix-vector products. Matrix factorization is used as a preconditioner.
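The O(n log n) matrix-vector product mentioned above is usually obtained by embedding the Toeplitz matrix in a circulant matrix of twice the size, which the FFT diagonalizes. The sketch below shows that standard construction only; it does not implement the paper's preconditioner, and the helper names `toeplitz_matvec` and `dense_toeplitz` are hypothetical.

```python
import numpy as np

def toeplitz_matvec(c, r, x):
    """Multiply the Toeplitz matrix with first column c and first row r by x."""
    n = len(x)
    # First column of the 2n x 2n circulant matrix that embeds T.
    col = np.concatenate([c, [0.0], r[:0:-1]])
    # Circulant product = inverse FFT of the elementwise product of FFTs.
    y = np.fft.ifft(np.fft.fft(col) * np.fft.fft(x, 2 * n))
    return y[:n]

def dense_toeplitz(c, r):
    """Reference O(n^2) construction, used only to check the fast product."""
    n = len(c)
    return np.array([[c[i - j] if i >= j else r[j - i] for j in range(n)]
                     for i in range(n)])

rng = np.random.default_rng(0)
n = 64
c = rng.standard_normal(n) + 1j * rng.standard_normal(n)
c[0] = c[0].real + n      # real diagonal entry
r = np.conj(c)            # Hermitian Toeplitz: first row is the conjugate of the first column
x = rng.standard_normal(n)

fast = toeplitz_matvec(c, r, x)
slow = dense_toeplitz(c, r) @ x
print(np.max(np.abs(fast - slow)))
```

Inside a conjugate gradient loop, the FFT of `col` would be computed once and reused at every iteration.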
On non-negative matrix factorization algorithms for signal-dependent noise with application to electromyography data

PubMed Central

Devarajan, Karthik; Cheung, Vincent C.K.

2017-01-01

Non-negative matrix factorization (NMF) by the multiplicative updates algorithm is a powerful machine learning method for decomposing a high-dimensional nonnegative matrix V into two nonnegative matrices, W and H where V ~ WH. It has been successfully applied in the analysis and interpretation of large-scale data arising in neuroscience, computational biology and natural language processing, among other areas. A distinctive feature of NMF is its nonnegativity constraints that allow only additive linear combinations of the data, thus enabling it to learn parts that have distinct physical representations in reality. In this paper, we describe an information-theoretic approach to NMF for signal-dependent noise based on the generalized inverse Gaussian model. Specifically, we propose three novel algorithms in this setting, each based on multiplicative updates and prove monotonicity of updates using the EM algorithm. In addition, we develop algorithm-specific measures to evaluate their goodness-of-fit on data. Our methods are demonstrated using experimental data from electromyography studies as well as simulated data in the extraction of muscle synergies, and compared with existing algorithms for signal-dependent noise. PMID:24684448
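For orientation, the sketch below shows the classical Euclidean-distance multiplicative updates for V ≈ WH. It is the standard baseline rule, not one of the three signal-dependent-noise algorithms proposed in the paper; the function name `nmf_multiplicative` and the test data are assumptions.

```python
import numpy as np

def nmf_multiplicative(V, rank, n_iter=200, seed=0, eps=1e-12):
    """Classical multiplicative updates minimizing ||V - W H||_F with W, H >= 0."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank))
    H = rng.random((rank, m))
    errors = []
    for _ in range(n_iter):
        # Each factor is rescaled elementwise by a ratio of nonnegative terms,
        # so nonnegativity is preserved without any projection step.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        errors.append(np.linalg.norm(V - W @ H))
    return W, H, errors

rng = np.random.default_rng(1)
V = rng.random((40, 4)) @ rng.random((4, 25))   # nonnegative, rank 4
W, H, errors = nmf_multiplicative(V, rank=4)
print(errors[0], errors[-1])
```

The recorded reconstruction errors give a simple way to inspect how the fit evolves over the iterations.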
Predicting drug-target interactions by dual-network integrated logistic matrix factorization

NASA Astrophysics Data System (ADS)

Hao, Ming; Bryant, Stephen H.; Wang, Yanli

2017-01-01

In this work, we propose a dual-network integrated logistic matrix factorization (DNILMF) algorithm to predict potential drug-target interactions (DTI). The prediction procedure consists of four steps: (1) inferring new drug/target profiles and constructing profile kernel matrix; (2) diffusing drug profile kernel matrix with drug structure kernel matrix; (3) diffusing target profile kernel matrix with target sequence kernel matrix; and (4) building DNILMF model and smoothing new drug/target predictions based on their neighbors. We compare our algorithm with the state-of-the-art method based on the benchmark dataset. Results indicate that the DNILMF algorithm outperforms the previously reported approaches in terms of AUPR (area under precision-recall curve) and AUC (area under curve of receiver operating characteristic) based on the 5 trials of 10-fold cross-validation. We conclude that the performance improvement depends not only on the proposed objective function, but also on the nonlinear diffusion technique used, which is important but understudied in the DTI prediction field. In addition, we also compile a new DTI dataset for increasing the diversity of currently available benchmark datasets. The top prediction results for the new dataset are confirmed by experimental studies or supported by other computational research.
Recursive dynamics for flexible multibody systems using spatial operators

NASA Technical Reports Server (NTRS)

Jain, A.; Rodriguez, G.

1990-01-01

Due to their structural flexibility, spacecraft and space manipulators are multibody systems with complex dynamics and possess a large number of degrees of freedom. Here the spatial operator algebra methodology is used to develop a new dynamics formulation and spatially recursive algorithms for such flexible multibody systems. A key feature of the formulation is that the operator description of the flexible system dynamics is identical in form to the corresponding operator description of the dynamics of rigid multibody systems. A significant advantage of this unifying approach is that it allows ideas and techniques for rigid multibody systems to be easily applied to flexible multibody systems. The algorithms use standard finite-element and assumed modes models for the individual body deformation. A Newton-Euler Operator Factorization of the mass matrix of the multibody system is first developed. It forms the basis for recursive algorithms such as for the inverse dynamics, the computation of the mass matrix, and the composite body forward dynamics for the system. Subsequently, an alternative Innovations Operator Factorization of the mass matrix, each of whose factors is invertible, is developed. It leads to an operator expression for the inverse of the mass matrix, and forms the basis for the recursive articulated body forward dynamics algorithm for the flexible multibody system. For simplicity, most of the development here focuses on serial chain multibody systems. However, extensions of the algorithms to general topology flexible multibody systems are described. While the computational cost of the algorithms depends on factors such as the topology and the amount of flexibility in the multibody system, in general, it appears that in contrast to the rigid multibody case, the articulated body forward dynamics algorithm is the more efficient algorithm for flexible multibody systems containing even a small number of flexible bodies. 
The variety of algorithms described here permits a user to choose the algorithm which is optimal for the multibody system at hand. The availability of a number of algorithms is even more important for real-time applications, where implementation on parallel processors or custom computing hardware is often necessary to maximize speed.
New Factorization Techniques and Fast Serial and Parallel Algorithms for Operational Space Control of Robot Manipulators

NASA Technical Reports Server (NTRS)

Fijany, Amir; Djouani, Karim; Fried, George; Pontnau, Jean

1997-01-01

In this paper a new factorization technique for computation of inverse of mass matrix, and the operational space mass matrix, as arising in implementation of the operational space control scheme, is presented.
Efficient algorithms for computing a strong rank-revealing QR factorization

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gu, M.; Eisenstat, S.C.

1996-07-01

Given an m × n matrix M with m ≥ n, it is shown that there exists a permutation Π and an integer k such that the QR factorization given by equation (1) reveals the numerical rank of M: the k × k upper-triangular matrix A_k is well conditioned, ‖C_k‖_2 is small, and B_k is linearly dependent on A_k with coefficients bounded by a low-degree polynomial in n. Existing rank-revealing QR (RRQR) algorithms are related to such factorizations and two algorithms are presented for computing them. The new algorithms are nearly as efficient as QR with column pivoting for most problems and take O(mn^2) floating-point operations in the worst case.
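The baseline mentioned above, QR with column pivoting, can be sketched in a few lines. The version below uses modified Gram-Schmidt for brevity and is only an illustration of how pivoting exposes the numerical rank through the diagonal of R; it is not the strong RRQR algorithm of the paper, and the function name and tolerance are assumptions.

```python
import numpy as np

def qr_column_pivoting(M):
    """Return Q, R, perm with M[:, perm] = Q @ R (modified Gram-Schmidt)."""
    A = np.array(M, dtype=float)
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    perm = np.arange(n)
    for k in range(n):
        # Bring the remaining column with the largest residual norm to position k.
        j = k + int(np.argmax(np.linalg.norm(A[:, k:], axis=0)))
        A[:, [k, j]] = A[:, [j, k]]
        R[:, [k, j]] = R[:, [j, k]]
        perm[[k, j]] = perm[[j, k]]
        R[k, k] = np.linalg.norm(A[:, k])
        if R[k, k] == 0.0:
            break  # all remaining columns are exactly zero
        Q[:, k] = A[:, k] / R[k, k]
        # Remove the new direction from every remaining column.
        R[k, k + 1:] = Q[:, k] @ A[:, k + 1:]
        A[:, k + 1:] -= np.outer(Q[:, k], R[k, k + 1:])
    return Q, R, perm

rng = np.random.default_rng(0)
M = rng.standard_normal((8, 3)) @ rng.standard_normal((3, 5))   # rank 3 by construction
Q, R, perm = qr_column_pivoting(M)
diag = np.abs(np.diag(R))
numerical_rank = int(np.sum(diag > 1e-10 * diag[0]))
print(numerical_rank)
```

Column pivoting works well on most matrices but carries no guarantee of revealing the rank in every case, which is the gap the strong RRQR factorization addresses.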
Estimating gene function with least squares nonnegative matrix factorization.

PubMed

Wang, Guoli; Ochs, Michael F

2007-01-01

Nonnegative matrix factorization is a machine learning algorithm that has extracted information from data in a number of fields, including imaging and spectral analysis, text mining, and microarray data analysis. One limitation with the method for linking genes through microarray data in order to estimate gene function is the high variance observed in transcription levels between different genes. Least squares nonnegative matrix factorization uses estimates of the uncertainties on the mRNA levels for each gene in each condition, to guide the algorithm to a local minimum in normalized χ², rather than a Euclidean distance or divergence between the reconstructed data and the data itself. Herein, application of this method to microarray data is demonstrated in order to predict gene function.
A Chess-Like Game for Teaching Engineering Students to Solve Large System of Simultaneous Linear Equations

NASA Technical Reports Server (NTRS)

Nguyen, Duc T.; Mohammed, Ahmed Ali; Kadiam, Subhash

2010-01-01

Solving large (and sparse) systems of simultaneous linear equations has been (and continues to be) a major challenge for many real-world engineering/science applications [1-2]. For many practical/large-scale problems, the sparse, Symmetrical and Positive Definite (SPD) system of linear equations can be conveniently represented in matrix notation as [A] {x} = {b}, where the square coefficient matrix [A] and the Right-Hand-Side (RHS) vector {b} are known. The unknown solution vector {x} can be efficiently solved for by the following step-by-step procedure [1-2]: Reordering phase, Matrix Factorization phase, Forward solution phase, and Backward solution phase. In this research work, a Game-Based Learning (GBL) approach has been developed to help engineering students understand crucial details about the matrix reordering and factorization phases. A "chess-like" game has been developed and can be played by either a single player or two players. Through this "chess-like" open-ended game, the players/learners will not only understand the key concepts involved in reordering algorithms (based on existing algorithms), but also have the opportunity to "discover new algorithms" which are better than existing algorithms. Implementing the proposed "chess-like" game for matrix reordering and factorization phases can be enhanced by FLASH [3] computer environments, where computer simulation with animated human voice, sound effects, visual/graphical/colorful displays of matrix tables, score (or monetary) awards for the best game players, etc. can all be exploited. Preliminary demonstrations of the developed GBL approach can be viewed by anyone who has access to the internet web-site [4]!
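The four phases named above can be sketched on a tiny "arrowhead" SPD matrix, where the effect of reordering on fill-in is easy to see. This is a dense, pure-Python illustration under assumed data; the function names and the two orderings are hypothetical and are not taken from the game described in the abstract.

```python
import math

def cholesky(A):
    """Factorization phase: A = L L^T for an SPD matrix given as nested lists."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        L[j][j] = math.sqrt(A[j][j] - sum(L[j][k] ** 2 for k in range(j)))
        for i in range(j + 1, n):
            L[i][j] = (A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))) / L[j][j]
    return L

def forward_solve(L, b):
    """Forward solution phase: solve L y = b."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def backward_solve(L, y):
    """Backward solution phase: solve L^T x = y."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def solve_spd(A, b, p):
    """Reorder with permutation p, factor, solve, and undo the permutation."""
    Ap = [[A[i][j] for j in p] for i in p]      # reordering phase
    bp = [b[i] for i in p]
    L = cholesky(Ap)
    xp = backward_solve(L, forward_solve(L, bp))
    x = [0.0] * len(b)
    for new, old in enumerate(p):
        x[old] = xp[new]
    nnz = sum(1 for row in L for v in row if abs(v) > 1e-14)
    return x, nnz

n = 5
A = [[0.0] * n for _ in range(n)]
for i in range(n):
    A[i][i] = 10.0 if i == 0 else 2.0
    if i > 0:
        A[0][i] = A[i][0] = 1.0                 # dense first row and column
b = [1.0, 2.0, 3.0, 4.0, 5.0]

x_nat, nnz_nat = solve_spd(A, b, [0, 1, 2, 3, 4])   # dense row first
x_rev, nnz_rev = solve_spd(A, b, [4, 3, 2, 1, 0])   # dense row last
print(nnz_nat, nnz_rev)
```

With the dense row ordered first the factor fills in completely, while ordering it last produces no fill at all; both orderings return the same solution.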
A Spectral Algorithm for Envelope Reduction of Sparse Matrices

NASA Technical Reports Server (NTRS)

Barnard, Stephen T.; Pothen, Alex; Simon, Horst D.

1993-01-01

The problem of reordering a sparse symmetric matrix to reduce its envelope size is considered. A new spectral algorithm for computing an envelope-reducing reordering is obtained by associating a Laplacian matrix with the given matrix and then sorting the components of a specified eigenvector of the Laplacian. This Laplacian eigenvector solves a continuous relaxation of a discrete problem related to envelope minimization called the minimum 2-sum problem. The permutation vector computed by the spectral algorithm is a closest permutation vector to the specified Laplacian eigenvector. Numerical results show that the new reordering algorithm usually computes smaller envelope sizes than those obtained from the current standard algorithms such as Gibbs-Poole-Stockmeyer (GPS) or SPARSPAK reverse Cuthill-McKee (RCM), in some cases reducing the envelope by more than a factor of two.
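The procedure described above (form a Laplacian, take an eigenvector, sort its components) can be sketched as follows. The eigenvector is assumed here to be the one for the second-smallest eigenvalue (the Fiedler vector), and the envelope is measured on the lower triangle; dense linear algebra and the function names are illustrative choices for a small example.

```python
import numpy as np

def envelope_size(A):
    """Sum over rows of the distance from the first nonzero to the diagonal."""
    total = 0
    for i in range(A.shape[0]):
        cols = np.nonzero(A[i, : i + 1])[0]
        if cols.size:
            total += i - cols[0]
    return int(total)

def spectral_ordering(A):
    """Permutation obtained by sorting the components of the Fiedler vector."""
    adj = (A != 0).astype(float)       # off-diagonal nonzeros define the graph edges
    np.fill_diagonal(adj, 0.0)
    lap = np.diag(adj.sum(axis=1)) - adj
    _, vecs = np.linalg.eigh(lap)      # eigenvalues are returned in ascending order
    fiedler = vecs[:, 1]               # assumed choice of eigenvector
    return np.argsort(fiedler)

# A tridiagonal (path-graph) matrix whose rows and columns were scrambled.
n = 12
P = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
scramble = np.random.default_rng(0).permutation(n)
A = P[np.ix_(scramble, scramble)]

order = spectral_ordering(A)
B = A[np.ix_(order, order)]
env_before, env_after = envelope_size(A), envelope_size(B)
print(env_before, env_after)
```

For this example the ordering recovers the tridiagonal form, which has the smallest possible envelope for a path graph.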
A generalized leaky FxLMS algorithm for tuning the waterbed effect of feedback active noise control systems

NASA Astrophysics Data System (ADS)

Wu, Lifu; Qiu, Xiaojun; Guo, Yecai

2018-06-01

To tune the noise amplification in the feedback system caused by the waterbed effect effectively, an adaptive algorithm is proposed in this paper by replacing the scalar leaky factor of the leaky FxLMS algorithm with a real symmetric Toeplitz matrix. The elements in the matrix are calculated explicitly according to the noise amplification constraints, which are defined based on a simple but efficient method. Simulations in an ANC headphone application demonstrate that the proposed algorithm can adjust the frequency band of noise amplification more effectively than the FxLMS algorithm and the leaky FxLMS algorithm.
MPI-FAUN: An MPI-Based Framework for Alternating-Updating Nonnegative Matrix Factorization

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kannan, Ramakrishnan; Ballard, Grey; Park, Haesun

Non-negative matrix factorization (NMF) is the problem of determining two non-negative low rank factors W and H, for the given input matrix A, such that A≈WH. NMF is a useful tool for many applications in different domains such as topic modeling in text mining, background separation in video analysis, and community detection in social networks. Despite its popularity in the data mining community, there is a lack of efficient parallel algorithms to solve the problem for big data sets. The main contribution of this work is a new, high-performance parallel computational framework for a broad class of NMF algorithms that iteratively solves alternating non-negative least squares (NLS) subproblems for W and H. It maintains the data and factor matrices in memory (distributed across processors), uses MPI for interprocessor communication, and, in the dense case, provably minimizes communication costs (under mild assumptions). The framework is flexible and able to leverage a variety of NMF and NLS algorithms, including Multiplicative Update, Hierarchical Alternating Least Squares, and Block Principal Pivoting. Our implementation allows us to benchmark and compare different algorithms on massive dense and sparse data matrices of size that spans from a few hundred million to billions. We demonstrate the scalability of our algorithm and compare it with baseline implementations, showing significant performance improvements. The code and the datasets used for conducting the experiments are available online.

MPI-FAUN: An MPI-Based Framework for Alternating-Updating Nonnegative Matrix Factorization

DOE PAGES

Kannan, Ramakrishnan; Ballard, Grey; Park, Haesun

2017-10-30

Non-negative matrix factorization (NMF) is the problem of determining two non-negative low rank factors W and H, for the given input matrix A, such that A≈WH. NMF is a useful tool for many applications in different domains such as topic modeling in text mining, background separation in video analysis, and community detection in social networks. Despite its popularity in the data mining community, there is a lack of efficient parallel algorithms to solve the problem for big data sets. The main contribution of this work is a new, high-performance parallel computational framework for a broad class of NMF algorithms that iteratively solves alternating non-negative least squares (NLS) subproblems for W and H. It maintains the data and factor matrices in memory (distributed across processors), uses MPI for interprocessor communication, and, in the dense case, provably minimizes communication costs (under mild assumptions). The framework is flexible and able to leverage a variety of NMF and NLS algorithms, including Multiplicative Update, Hierarchical Alternating Least Squares, and Block Principal Pivoting. Our implementation allows us to benchmark and compare different algorithms on massive dense and sparse data matrices of size that spans from a few hundred million to billions. We demonstrate the scalability of our algorithm and compare it with baseline implementations, showing significant performance improvements. The code and the datasets used for conducting the experiments are available online.
Task Parallel Incomplete Cholesky Factorization using 2D Partitioned-Block Layout

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kim, Kyungjoo; Rajamanickam, Sivasankaran; Stelle, George Widgery

We introduce a task-parallel algorithm for sparse incomplete Cholesky factorization that utilizes a 2D sparse partitioned-block layout of a matrix. Our factorization algorithm follows the idea of algorithms-by-blocks by using the block layout. The algorithm-by-blocks approach induces a task graph for the factorization. These tasks are inter-related to each other through their data dependences in the factorization algorithm. To process the tasks on various manycore architectures in a portable manner, we also present a portable tasking API that incorporates different tasking backends and device-specific features using an open-source framework for manycore platforms, i.e., Kokkos. A performance evaluation is presented on both Intel Sandybridge and Xeon Phi platforms for matrices from the University of Florida sparse matrix collection to illustrate the merits of the proposed task-based factorization. Experimental results demonstrate that our task-parallel implementation delivers about 26.6x speedup (geometric mean) over single-threaded incomplete Cholesky-by-blocks and 19.2x speedup over serial Cholesky performance which does not carry tasking overhead using 56 threads on the Intel Xeon Phi processor for sparse matrices arising from various application problems.
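As background for the factorization being parallelized, the sketch below is a serial, scalar incomplete Cholesky with zero fill-in, IC(0): a Cholesky factorization in which updates are applied only to entries already present in the lower triangle of A. It does not reproduce the block layout, task graph, or Kokkos API of the paper, and the function name is an assumption.

```python
import numpy as np

def incomplete_cholesky0(A):
    """IC(0): Cholesky restricted to the nonzero pattern of the lower triangle of A."""
    n = A.shape[0]
    L = np.tril(A).astype(float)
    pattern = L != 0
    for k in range(n):
        # Note: for a general SPD matrix this pivot can become non-positive
        # (breakdown); no safeguard is included in this sketch.
        L[k, k] = np.sqrt(L[k, k])
        for i in range(k + 1, n):
            if pattern[i, k]:
                L[i, k] /= L[k, k]
        for j in range(k + 1, n):
            for i in range(j, n):
                # Update only entries already in the pattern (no fill-in).
                if pattern[i, j] and pattern[i, k] and pattern[j, k]:
                    L[i, j] -= L[i, k] * L[j, k]
    return L

# A tridiagonal SPD matrix: exact Cholesky creates no fill, so IC(0) is exact here.
n = 6
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
L = incomplete_cholesky0(A)
print(np.max(np.abs(L @ L.T - A)))
```

For matrices whose exact factor does fill in, L Lᵀ only approximates A, and the factor is typically used as a preconditioner rather than as a direct solver.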
Bunch-Kaufman factorization for real symmetric indefinite banded matrices

NASA Technical Reports Server (NTRS)

Jones, Mark T.; Patrick, Merrell L.

1989-01-01

The Bunch-Kaufman algorithm for factoring symmetric indefinite matrices was rejected for banded matrices because it destroys the banded structure of the matrix. Herein, it is shown that for a subclass of real symmetric matrices which arise in solving the generalized eigenvalue problem using Lanczos's method, the Bunch-Kaufman algorithm does not result in major destruction of the bandwidth. Space and time complexities of the algorithm are given and used to show that the Bunch-Kaufman algorithm is a significant improvement over LU factorization.
Spatial operator factorization and inversion of the manipulator mass matrix

NASA Technical Reports Server (NTRS)

Rodriguez, Guillermo; Kreutz-Delgado, Kenneth

1992-01-01

This paper advances two linear operator factorizations of the manipulator mass matrix. Embedded in the factorizations are many of the techniques that are regarded as very efficient computational solutions to inverse and forward dynamics problems. The operator factorizations provide a high-level architectural understanding of the mass matrix and its inverse, which is not visible in the detailed algorithms. They also lead to a new approach to the development of computer programs that organize complexity in robot dynamics.
Constrained low-rank matrix estimation: phase transitions, approximate message passing and applications

NASA Astrophysics Data System (ADS)

Lesieur, Thibault; Krzakala, Florent; Zdeborová, Lenka

2017-07-01

This article is an extended version of previous work of Lesieur et al (2015 IEEE Int. Symp. on Information Theory Proc. pp 1635-9 and 2015 53rd Annual Allerton Conf. on Communication, Control and Computing (IEEE) pp 680-7) on low-rank matrix estimation in the presence of constraints on the factors into which the matrix is factorized. Low-rank matrix factorization is one of the basic methods used in data analysis for unsupervised learning of relevant features and other types of dimensionality reduction. We present a framework to study the constrained low-rank matrix estimation for a general prior on the factors, and a general output channel through which the matrix is observed. We draw a parallel with the study of vector-spin glass models, presenting a unifying way to study a number of problems considered previously in separate statistical physics works. We present a number of applications for the problem in data analysis. We derive in detail a general form of the low-rank approximate message passing (Low-RAMP) algorithm, that is known in statistical physics as the TAP equations. We thus unify the derivation of the TAP equations for models as different as the Sherrington-Kirkpatrick model, the restricted Boltzmann machine, the Hopfield model or vector (xy, Heisenberg and other) spin glasses. The state evolution of the Low-RAMP algorithm is also derived, and is equivalent to the replica symmetric solution for the large class of vector-spin glass models. In the section devoted to results we study in detail phase diagrams and phase transitions for the Bayes-optimal inference in low-rank matrix estimation. We present a typology of phase transitions and their relation to performance of algorithms such as the Low-RAMP or commonly used spectral methods.
Graph regularized nonnegative matrix factorization for temporal link prediction in dynamic networks

NASA Astrophysics Data System (ADS)

Ma, Xiaoke; Sun, Penggang; Wang, Yu

2018-04-01

Many networks derived from society and nature are temporal and incomplete. The temporal link prediction problem in networks is to predict links at time T + 1 based on a given temporal network from time 1 to T, which is essential to important applications. The current algorithms either predict the temporal links by collapsing the dynamic networks or collapsing features derived from each network, which are criticized for ignoring the connection among slices. To overcome the issue, we propose a novel graph regularized nonnegative matrix factorization algorithm (GrNMF) for the temporal link prediction problem without collapsing the dynamic networks. To obtain the feature for each network from 1 to t, GrNMF factorizes the matrix associated with networks by using the remaining networks as regularization, which provides a better way to characterize the topological information of temporal links. Then, the GrNMF algorithm collapses the feature matrices to predict temporal links. Compared with state-of-the-art methods, the proposed algorithm exhibits significantly improved accuracy by avoiding the collapse of temporal networks. Experimental results of a number of artificial and real temporal networks illustrate that the proposed method is not only more accurate but also more robust than state-of-the-art approaches.
A quasi-likelihood approach to non-negative matrix factorization

PubMed Central

Devarajan, Karthik; Cheung, Vincent C.K.

2017-01-01

A unified approach to non-negative matrix factorization based on the theory of generalized linear models is proposed. This approach embeds a variety of statistical models, including the exponential family, within a single theoretical framework and provides a unified view of such factorizations from the perspective of quasi-likelihood. Using this framework, a family of algorithms for handling signal-dependent noise is developed and its convergence proven using the Expectation-Maximization algorithm. In addition, a measure to evaluate the goodness-of-fit of the resulting factorization is described. The proposed methods allow modeling of non-linear effects via appropriate link functions and are illustrated using an application in biomedical signal processing. PMID:27348511
Recursive inverse factorization.

PubMed

Rubensson, Emanuel H; Bock, Nicolas; Holmström, Erik; Niklasson, Anders M N

2008-03-14

A recursive algorithm for the inverse factorization S^{-1} = ZZ^* of Hermitian positive definite matrices S is proposed. The inverse factorization is based on iterative refinement [A.M.N. Niklasson, Phys. Rev. B 70, 193102 (2004)] combined with a recursive decomposition of S. As the computational kernel is matrix-matrix multiplication, the algorithm can be parallelized and the computational effort increases linearly with system size for systems with sufficiently sparse matrices. Recent advances in network theory are used to find appropriate recursive decompositions. We show that optimization of the so-called network modularity results in an improved partitioning compared to other approaches, in particular when the recursive inverse factorization is applied to overlap matrices of irregularly structured three-dimensional molecules.
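The refinement idea can be sketched with dense matrices: starting from a scaled identity, Z is repeatedly corrected using the residual δ = I − Z^* S Z until Z^* S Z ≈ I, at which point S^{-1} = Z Z^*. The first-order correction Z ← Z(I + δ/2) and the starting guess are assumptions chosen for simplicity; the recursive decomposition of S and the sparse matrix algebra of the paper are not shown, and `inverse_factor` is a hypothetical name.

```python
import numpy as np

def inverse_factor(S, n_iter=60, tol=1e-12):
    """Find Z with Z^* S Z = I, so that inv(S) = Z Z^*, using only matrix products."""
    n = S.shape[0]
    I = np.eye(n)
    # Scaled identity start: the eigenvalues of delta then lie in [0, 1).
    Z = I / np.sqrt(np.linalg.norm(S, 2))
    for _ in range(n_iter):
        delta = I - Z.conj().T @ S @ Z
        if np.linalg.norm(delta) < tol:
            break
        Z = Z @ (I + 0.5 * delta)      # first-order refinement step (assumed form)
    return Z

rng = np.random.default_rng(0)
n = 8
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
S = B @ B.conj().T + n * np.eye(n)     # Hermitian positive definite test matrix

Z = inverse_factor(S)
print(np.linalg.norm(Z @ Z.conj().T - np.linalg.inv(S)))
```

Because every step is a matrix-matrix product, the same loop carries over to sparse or distributed matrix types that provide multiplication.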
Reconstructing householder vectors from Tall-Skinny QR

DOE PAGES

Ballard, Grey Malone; Demmel, James; Grigori, Laura; ...

2015-08-05

The Tall-Skinny QR (TSQR) algorithm is more communication efficient than the standard Householder algorithm for QR decomposition of matrices with many more rows than columns. However, TSQR produces a different representation of the orthogonal factor and therefore requires more software development to support the new representation. Further, implicitly applying the orthogonal factor to the trailing matrix in the context of factoring a square matrix is more complicated and costly than with the Householder representation. We show how to perform TSQR and then reconstruct the Householder vector representation with the same asymptotic communication efficiency and little extra computational cost. We demonstrate the high performance and numerical stability of this algorithm both theoretically and empirically. The new Householder reconstruction algorithm allows us to design more efficient parallel QR algorithms, with significantly lower latency cost compared to Householder QR and lower bandwidth and latency costs compared with Communication-Avoiding QR (CAQR) algorithm. Experiments on supercomputers demonstrate the benefits of the communication cost improvements: in particular, our experiments show substantial improvements over tuned library implementations for tall-and-skinny matrices. Furthermore, we also provide algorithmic improvements to the Householder QR and CAQR algorithms, and we investigate several alternatives to the Householder reconstruction algorithm that sacrifice guarantees on numerical stability in some cases in order to obtain higher performance.
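The basic TSQR reduction can be sketched with a single level: each row block is factored independently, and the small R factors are stacked and factored again. The sketch computes only R and runs serially; the Householder reconstruction that is the subject of the paper is not shown, and the function names and block count are assumptions.

```python
import numpy as np

def tsqr_r(A, n_blocks=4):
    """R factor of a tall-skinny A via one level of blockwise QR reduction."""
    blocks = np.array_split(A, n_blocks, axis=0)
    # Local step: each block is factored on its own (one block per processor).
    local_rs = [np.linalg.qr(block, mode="r") for block in blocks]
    # Reduction step: factor the stacked small R factors.
    return np.linalg.qr(np.vstack(local_rs), mode="r")

def fix_signs(R):
    """Make the diagonal nonnegative; R is only unique up to row signs."""
    s = np.sign(np.diag(R))
    s[s == 0] = 1.0
    return s[:, None] * R

rng = np.random.default_rng(0)
A = rng.standard_normal((1000, 8))
R_tsqr = fix_signs(tsqr_r(A))
R_direct = fix_signs(np.linalg.qr(A, mode="r"))
print(np.max(np.abs(R_tsqr - R_direct)))
```

Only the small R factors need to be exchanged between blocks, which is the source of the communication savings for matrices with many more rows than columns.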
High-performance sparse matrix-matrix products on Intel KNL and multicore architectures

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nagasaka, Y; Matsuoka, S; Azad, A

Sparse matrix-matrix multiplication (SpGEMM) is a computational primitive that is widely used in areas ranging from traditional numerical applications to recent big data analysis and machine learning. Although many SpGEMM algorithms have been proposed, hardware specific optimizations for multi- and many-core processors are lacking and a detailed analysis of their performance under various use cases and matrices is not available. We first identify and mitigate multiple bottlenecks with memory management and thread scheduling on Intel Xeon Phi (Knights Landing or KNL). Specifically targeting multi- and many-core processors, we develop a hash-table-based algorithm and optimize a heap-based shared-memory SpGEMM algorithm. We examine their performance together with other publicly available codes. Different from the literature, our evaluation also includes use cases that are representative of real graph algorithms, such as multi-source breadth-first search or triangle counting. Our hash-table and heap-based algorithms are showing significant speedups from libraries in the majority of the cases while different algorithms dominate the other scenarios with different matrix size, sparsity, compression factor and operation type. We wrap up in-depth evaluation results and make a recipe to give the best SpGEMM algorithm for target scenario. A critical finding is that hash-table-based SpGEMM gets a significant performance boost if the nonzeros are not required to be sorted within each row of the output matrix.
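The row-wise, hash-accumulator idea behind hash-table SpGEMM can be shown in a few lines, with a Python dictionary standing in for the hash table. This is a conceptual sketch only: the dictionary-of-dictionaries storage and the name `spgemm_hash` are assumptions, and none of the memory-management or threading optimizations from the paper are represented.

```python
def spgemm_hash(A, B):
    """C = A @ B for sparse matrices stored as {row: {col: value}} dictionaries."""
    C = {}
    for i, row_a in A.items():
        acc = {}  # hash-table accumulator for output row i
        for k, a_ik in row_a.items():
            # Scale row k of B by a_ik and merge it into the accumulator.
            for j, b_kj in B.get(k, {}).items():
                acc[j] = acc.get(j, 0.0) + a_ik * b_kj
        if acc:
            C[i] = acc  # column indices stay in insertion order, not sorted
    return C

A = {0: {0: 1.0, 2: 2.0}, 1: {1: 3.0}}
B = {0: {1: 4.0}, 1: {0: 5.0}, 2: {1: 6.0, 2: 7.0}}
C = spgemm_hash(A, B)
print(C)
```

Leaving each output row unsorted, as here, avoids a per-row sort, which corresponds to the case the abstract identifies as giving a significant performance boost.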
Joint refinement model for the spin resolved one-electron reduced density matrix of YTiO3 using magnetic structure factors and magnetic Compton profiles data.

PubMed

Gueddida, Saber; Yan, Zeyin; Kibalin, Iurii; Voufack, Ariste Bolivard; Claiser, Nicolas; Souhassou, Mohamed; Lecomte, Claude; Gillon, Béatrice; Gillet, Jean-Michel

2018-04-28

In this paper, we propose a simple cluster model with limited basis sets to reproduce the unpaired electron distributions in a YTiO3 ferromagnetic crystal. The spin-resolved one-electron-reduced density matrix is reconstructed simultaneously from theoretical magnetic structure factors and directional magnetic Compton profiles using our joint refinement algorithm. This algorithm is guided by the rescaling of basis functions and the adjustment of the spin population matrix. The resulting spin electron density in both position and momentum spaces from the joint refinement model is in agreement with theoretical and experimental results. Benefits brought from magnetic Compton profiles to the entire spin density matrix are illustrated. We studied the magnetic properties of the YTiO3 crystal along the Ti-O1-Ti bonding. We found that the basis functions are mostly rescaled by means of magnetic Compton profiles, while the molecular occupation numbers are mainly modified by the magnetic structure factors.
Two-way learning with one-way supervision for gene expression data.

PubMed

Wong, Monica H T; Mutch, David M; McNicholas, Paul D

2017-03-04

A family of parsimonious Gaussian mixture models for the biclustering of gene expression data is introduced. Biclustering is accommodated by adopting a mixture of factor analyzers model with a binary, row-stochastic factor loadings matrix. This particular form of factor loadings matrix results in a block-diagonal covariance matrix, which is a useful property in gene expression analyses, specifically in biomarker discovery scenarios where blood can potentially act as a surrogate tissue for other less accessible tissues. Prior knowledge of the factor loadings matrix is useful in this application and is reflected in the one-way supervised nature of the algorithm. Additionally, the factor loadings matrix can be assumed to be constant across all components because of the relationship desired between the various types of tissue samples. Parameter estimates are obtained through a variant of the expectation-maximization algorithm and the best-fitting model is selected using the Bayesian information criterion. The family of models is demonstrated using simulated data and two real microarray data sets. The first real data set is from a rat study that investigated the influence of diabetes on gene expression in different tissues. The second real data set is from a human transcriptomics study that focused on blood and immune tissues. The microarray data sets illustrate the biclustering family's performance in biomarker discovery involving peripheral blood as surrogate biopsy material. The simulation studies indicate that the algorithm identifies the correct biclusters, most optimally when the number of observation clusters is known. Moreover, the biclustering algorithm identified biclusters comprised of biologically meaningful data related to insulin resistance and immune function in the rat and human real data sets, respectively. 
Initial results using real data show that this biclustering technique provides a novel approach for biomarker discovery by enabling blood to be used as a surrogate for hard-to-obtain tissues.
Decomposing Time Series Data by a Non-negative Matrix Factorization Algorithm with Temporally Constrained Coefficients

PubMed Central

Cheung, Vincent C. K.; Devarajan, Karthik; Severini, Giacomo; Turolla, Andrea; Bonato, Paolo

2017-01-01

The non-negative matrix factorization algorithm (NMF) decomposes a data matrix into a set of non-negative basis vectors, each scaled by a coefficient. In its original formulation, the NMF assumes the data samples and dimensions to be independently distributed, making it a less-than-ideal algorithm for the analysis of time series data with temporal correlations. Here, we seek to derive an NMF that accounts for temporal dependencies in the data by explicitly incorporating a very simple temporal constraint for the coefficients into the NMF update rules. We applied the modified algorithm to 2 multi-dimensional electromyographic data sets collected from the human upper-limb to identify muscle synergies. We found that because it reduced the number of free parameters in the model, our modified NMF made it possible to use the Akaike Information Criterion to objectively identify a model order (i.e., the number of muscle synergies composing the data) that is more functionally interpretable, and closer to the numbers previously determined using ad hoc measures. PMID:26737046
Identification of regional activation by factorization of high-density surface EMG signals: A comparison of Principal Component Analysis and Non-negative Matrix factorization.

PubMed

Gallina, Alessio; Garland, S Jayne; Wakeling, James M

2018-05-22

In this study, we investigated whether principal component analysis (PCA) and non-negative matrix factorization (NMF) perform similarly for the identification of regional activation within the human vastus medialis (VM). EMG signals from 64 locations over the VM were collected from twelve participants while performing a low-force isometric knee extension. The envelope of the EMG signal of each channel was calculated by low-pass filtering (8 Hz) the monopolar EMG signal after rectification. The data matrix was factorized using PCA and NMF, and up to 5 factors were considered for each algorithm. Explained variance, spatial weights and temporal scores were compared between the two algorithms using Pearson correlation. For both PCA and NMF, a single factor explained approximately 70% of the variance of the signal, while two and three factors explained just over 85% and 90%, respectively. The variance explained by PCA and NMF was highly comparable (R > 0.99). Spatial weights and temporal scores extracted with non-negative reconstruction of PCA and NMF were highly associated (all p < 0.001, mean R > 0.97). Regional VM activation can be identified using high-density surface EMG and factorization algorithms. Regional activation explains up to 30% of the variance of the signal, as identified through both PCA and NMF. Copyright © 2018 Elsevier Ltd. All rights reserved.
Structure-preserving and rank-revealing QR-factorizations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bischof, C.H.; Hansen, P.C.

1991-11-01

The rank-revealing QR-factorization (RRQR-factorization) is a special QR-factorization that is guaranteed to reveal the numerical rank of the matrix under consideration. This makes the RRQR-factorization a useful tool in the numerical treatment of many rank-deficient problems in numerical linear algebra. In this paper, a framework is presented for the efficient implementation of RRQR algorithms, in particular, for sparse matrices. A sparse RRQR-algorithm should seek to preserve the structure and sparsity of the matrix as much as possible while retaining the ability to capture safely the numerical rank. To this end, the paper proposes to compute an initial QR-factorization using a restricted pivoting strategy guarded by incremental condition estimation (ICE), and then applies the algorithm suggested by Chan and Foster to this QR-factorization. The column exchange strategy used in the initial QR factorization will exploit the fact that certain column exchanges do not change the sparsity structure, and compute a sparse QR-factorization that is a good approximation of the sought-after RRQR-factorization. Due to quantities produced by ICE, the Chan/Foster RRQR algorithm can be implemented very cheaply, thus verifying that the sought-after RRQR-factorization has indeed been computed. Experimental results on a model problem show that the initial QR-factorization is indeed very likely to produce an RRQR-factorization.
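To make the idea of reading the numerical rank off a QR factorization concrete, the following sketch uses plain column pivoting with modified Gram-Schmidt on a small dense matrix. This is the classical pivoting heuristic, not the restricted-pivoting/ICE scheme or the Chan/Foster algorithm from the abstract, and it carries no rank-revealing guarantee; the function name `pivoted_qr_rank` and the relative tolerance are assumptions.

```python
def pivoted_qr_rank(A, tol=1e-10):
    """Estimate the numerical rank of A (list of rows) by column-pivoted QR.

    Returns (rank, perm, diag), where perm is the column order chosen by
    pivoting and diag holds the magnitudes of the diagonal entries of R.
    """
    m, n = len(A), len(A[0])
    cols = [[A[i][j] for i in range(m)] for j in range(n)]  # work column by column
    perm = list(range(n))
    diag = []
    for k in range(min(m, n)):
        # Pivot: move the remaining column with the largest norm to position k.
        norms = [sum(x * x for x in cols[j]) ** 0.5 for j in range(k, n)]
        p = k + max(range(len(norms)), key=norms.__getitem__)
        cols[k], cols[p] = cols[p], cols[k]
        perm[k], perm[p] = perm[p], perm[k]
        r_kk = norms[p - k]
        diag.append(r_kk)
        if r_kk <= tol * diag[0]:
            break  # everything left is negligible relative to the first pivot
        q = [x / r_kk for x in cols[k]]
        # Remove the component along q from every remaining column.
        for j in range(k + 1, n):
            r_kj = sum(a * b for a, b in zip(q, cols[j]))
            cols[j] = [x - r_kj * y for x, y in zip(cols[j], q)]
    rank = sum(1 for d in diag if d > tol * diag[0])
    return rank, perm, diag
```

For a matrix whose third column is the sum of the first two, the third diagonal entry of R collapses to rounding-error size and the estimated rank is 2.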
New fast DCT algorithms based on Loeffler's factorization

NASA Astrophysics Data System (ADS)

Hong, Yoon Mi; Kim, Il-Koo; Lee, Tammy; Cheon, Min-Su; Alshina, Elena; Han, Woo-Jin; Park, Jeong-Hoon

2012-10-01

This paper proposes a new 32-point fast discrete cosine transform (DCT) algorithm based on Loeffler's 16-point transform. Fast integer realizations of 16-point and 32-point transforms are also provided based on the proposed transform. For the recent development of High Efficiency Video Coding (HEVC), simplified quantization and de-quantization processes are proposed. Three different forms of implementation with essentially the same performance, namely matrix multiplication, partial butterfly, and full factorization, can be chosen according to the given platform. In terms of the number of multiplications required for the realization, our proposed full factorization is 3 to 4 times faster than a partial butterfly, and about 10 times faster than direct matrix multiplication.
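The difference between direct matrix multiplication and a butterfly-style realization can be illustrated with the textbook even/odd split of the DCT-II. The sketch below is floating-point and unnormalized, assumes the length is a power of two, and is neither Loeffler's factorization nor the integer transforms of the paper; the function names are illustrative.

```python
import math

def dct_direct(x):
    """Unnormalized DCT-II by direct matrix multiplication (N * N products)."""
    N = len(x)
    return [sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N)) for n in range(N))
            for k in range(N)]

def dct_butterfly(x):
    """Same transform via a recursive even/odd split (len(x) must be a power of two)."""
    N = len(x)
    if N == 1:
        return [x[0]]
    half = N // 2
    # Butterfly stage: sums feed the even outputs, differences feed the odd outputs.
    s = [x[n] + x[N - 1 - n] for n in range(half)]
    d = [x[n] - x[N - 1 - n] for n in range(half)]
    even = dct_butterfly(s)  # an N/2-point DCT-II of the sums
    odd = [sum(d[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N)) for n in range(half))
           for k in range(1, N, 2)]  # small dense multiply on the differences
    out = [0.0] * N
    out[0::2] = even
    out[1::2] = odd
    return out
```

In this split the only products are the (N/2)^2 in the odd part at each level, so for N = 32 the recursion uses 256 + 64 + 16 + 4 + 1 = 341 cosine products against 1024 for the direct form. A full factorization reduces the odd part further, which this sketch does not do.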
On the effective implementation of a boundary element code on graphics processing units using an out-of-core LU algorithm

DOE Office of Scientific and Technical Information (OSTI.GOV)

D'Azevedo, Ed F; Nintcheu Fata, Sylvain

2012-01-01

A collocation boundary element code for solving the three-dimensional Laplace equation, publicly available from http://www.intetec.org, has been adapted to run on an Nvidia Tesla general purpose graphics processing unit (GPU). Global matrix assembly and LU factorization of the resulting dense matrix were performed on the GPU. Out-of-core techniques were used to solve problems larger than available GPU memory. The code achieved over eight times speedup in matrix assembly and about 56 Gflops/sec in the LU factorization using only 512 Mbytes of GPU memory. Details of the GPU implementation and comparisons with the standard sequential algorithm are included to illustrate the performance of the GPU code.
Factorization in large-scale many-body calculations

DOE PAGES

Johnson, Calvin W.; Ormand, W. Erich; Krastev, Plamen G.

2013-08-07

One approach for solving interacting many-fermion systems is the configuration-interaction method, also sometimes called the interacting shell model, where one finds eigenvalues of the Hamiltonian in a many-body basis of Slater determinants (antisymmetrized products of single-particle wavefunctions). The resulting Hamiltonian matrix is typically very sparse, but for large systems the nonzero matrix elements can nonetheless require terabytes or more of storage. An alternate algorithm, applicable to a broad class of systems with symmetry, in our case rotational invariance, is to exactly factorize both the basis and the interaction using additive/multiplicative quantum numbers; such an algorithm recreates the many-body matrix elements on the fly and can reduce the storage requirements by an order of magnitude or more. Here, we discuss factorization in general and introduce a novel, generalized factorization method, essentially a ‘double-factorization’ which speeds up basis generation and set-up of required arrays. Although we emphasize techniques, we also place factorization in the context of a specific (unpublished) configuration-interaction code, BIGSTICK, which runs both on serial and parallel machines, and discuss the savings in memory due to factorization.
Semi-supervised spectral algorithms for community detection in complex networks based on equivalence of clustering methods

NASA Astrophysics Data System (ADS)

Ma, Xiaoke; Wang, Bingbo; Yu, Liang

2018-01-01

Community detection is fundamental for revealing the structure-functionality relationship in complex networks, which involves two issues: the quantitative function for community as well as algorithms to discover communities. Despite significant research on each of them, few attempts have been made to establish the connection between the two issues. To attack this problem, a generalized quantification function is proposed for community in weighted networks, which provides a framework that unifies several well-known measures. Then, we prove that the trace optimization of the proposed measure is equivalent to the objective functions of algorithms such as nonnegative matrix factorization, kernel K-means and spectral clustering. It serves as the theoretical foundation for designing algorithms for community detection. On the second issue, a semi-supervised spectral clustering algorithm is developed by exploring the equivalence relation via combining nonnegative matrix factorization and spectral clustering. Different from traditional semi-supervised algorithms, the partial supervision is integrated into the objective of the spectral algorithm. Finally, through extensive experiments on both artificial and real-world networks, we demonstrate that the proposed method improves the accuracy of the traditional spectral algorithms in community detection.
Application of Improved 5th-Cubature Kalman Filter in Initial Strapdown Inertial Navigation System Alignment for Large Misalignment Angles.

PubMed

Wang, Wei; Chen, Xiyuan

2018-02-23

Because the accuracy of the third-degree Cubature Kalman Filter (CKF) used for initial alignment under large misalignment angle conditions is insufficient, an improved fifth-degree CKF algorithm is proposed in this paper. In order to make full use of the innovation in filtering, the innovation covariance matrix is calculated recursively from the innovation sequence with an exponential fading factor. Then a new adaptive error covariance matrix scaling algorithm is proposed. The Singular Value Decomposition (SVD) method is used for improving the numerical stability of the fifth-degree CKF in this paper. In order to avoid the overshoot caused by excessive scaling of the error covariance matrix during the convergence stage, the scaling scheme is terminated when the gradient of azimuth reaches the maximum. The experimental results show that the improved algorithm has better alignment accuracy with large misalignment angles than the traditional algorithm.

Algorithms for solving large sparse systems of simultaneous linear equations on vector processors

NASA Technical Reports Server (NTRS)

David, R. E.

1984-01-01

Very efficient algorithms for solving large sparse systems of simultaneous linear equations have been developed for serial processing computers. These involve a reordering of matrix rows and columns in order to obtain a near triangular pattern of nonzero elements. Then an LU factorization is developed to represent the matrix inverse in terms of a sequence of elementary Gaussian eliminations, or pivots. In this paper it is shown how these algorithms are adapted for efficient implementation on vector processors. Results obtained on the CYBER 200 Model 205 are presented for a series of large test problems which show the comparative advantages of the triangularization and vector processing algorithms.
Non-negative Matrix Factorization for Self-calibration of Photometric Redshift Scatter in Weak-lensing Surveys

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhang, Le; Yu, Yu; Zhang, Pengjie

Photo-z error is one of the major sources of systematics degrading the accuracy of weak-lensing cosmological inferences. Zhang et al. proposed a self-calibration method combining galaxy–galaxy correlations and galaxy–shear correlations between different photo-z bins. Fisher matrix analysis shows that it can determine the rate of photo-z outliers at a level of 0.01%–1% using only photometric data, and that it does not rely on any prior knowledge. In this paper, we develop a new algorithm to implement this method by solving a constrained nonlinear optimization problem arising in the self-calibration process. Based on the techniques of fixed-point iteration and non-negative matrix factorization, the proposed algorithm can efficiently and robustly reconstruct the scattering probabilities between the true-z and photo-z bins. The algorithm has been tested extensively by applying it to mock data from simulated stage IV weak-lensing projects. We find that the algorithm provides a successful recovery of the scatter rates at the level of 0.01%–1%, and the true mean redshifts of photo-z bins at the level of 0.001, which may satisfy the requirements in future lensing surveys.
Weighted graph based ordering techniques for preconditioned conjugate gradient methods

NASA Technical Reports Server (NTRS)

Clift, Simon S.; Tang, Wei-Pai

1994-01-01

We describe the basis of a matrix ordering heuristic for improving the incomplete factorization used in preconditioned conjugate gradient techniques applied to anisotropic PDE's. Several new matrix ordering techniques, derived from well-known algorithms in combinatorial graph theory, which attempt to implement this heuristic, are described. These ordering techniques are tested against a number of matrices arising from linear anisotropic PDE's, and compared with other matrix ordering techniques. A variation of RCM is shown to generally improve the quality of incomplete factorization preconditioners.
A fast algorithm for multiscale electromagnetic problems using interpolative decomposition and multilevel fast multipole algorithm

NASA Astrophysics Data System (ADS)

Pan, Xiao-Min; Wei, Jian-Gong; Peng, Zhen; Sheng, Xin-Qing

2012-02-01

The interpolative decomposition (ID) is combined with the multilevel fast multipole algorithm (MLFMA), denoted by ID-MLFMA, to handle multiscale problems. The ID-MLFMA first generates ID levels by recursively dividing the boxes at the finest MLFMA level into smaller boxes. It is specifically shown that near-field interactions with respect to the MLFMA, in the form of the matrix vector multiplication (MVM), are efficiently approximated at the ID levels. Meanwhile, computations on far-field interactions at the MLFMA levels remain unchanged. Only a small portion of matrix entries are required to approximate coupling among well-separated boxes at the ID levels, and these submatrices can be filled without computing the complete original coupling matrix. It follows that the matrix filling in the ID-MLFMA becomes much less expensive. The memory consumed is thus greatly reduced and the MVM is accelerated as well. Several factors that may influence the accuracy, efficiency and reliability of the proposed ID-MLFMA are investigated by numerical experiments. Complex targets are calculated to demonstrate the capability of the ID-MLFMA algorithm.
ROBNCA: robust network component analysis for recovering transcription factor activities.

PubMed

Noor, Amina; Ahmad, Aitzaz; Serpedin, Erchin; Nounou, Mohamed; Nounou, Hazem

2013-10-01

Network component analysis (NCA) is an efficient method of reconstructing the transcription factor activity (TFA), which makes use of the gene expression data and prior information available about transcription factor (TF)-gene regulations. Most of the contemporary algorithms either exhibit the drawback of inconsistency and poor reliability, or suffer from prohibitive computational complexity. In addition, the existing algorithms do not possess the ability to counteract the presence of outliers in the microarray data. Hence, robust and computationally efficient algorithms are needed to enable practical applications. We propose ROBust Network Component Analysis (ROBNCA), a novel iterative algorithm that explicitly models the possible outliers in the microarray data. An attractive feature of the ROBNCA algorithm is the derivation of a closed form solution for estimating the connectivity matrix, which was not available in prior contributions. The ROBNCA algorithm is compared with FastNCA and the non-iterative NCA (NI-NCA). ROBNCA estimates the TF activity profiles as well as the TF-gene control strength matrix with a much higher degree of accuracy than FastNCA and NI-NCA, irrespective of varying noise, correlation and/or amount of outliers in case of synthetic data. The ROBNCA algorithm is also tested on Saccharomyces cerevisiae data and Escherichia coli data, and it is observed to outperform the existing algorithms. The run time of the ROBNCA algorithm is comparable with that of FastNCA, and is hundreds of times faster than NI-NCA. The ROBNCA software is available at http://people.tamu.edu/~amina/ROBNCA
An analysis of spectral envelope-reduction via quadratic assignment problems

NASA Technical Reports Server (NTRS)

George, Alan; Pothen, Alex

1994-01-01

A new spectral algorithm for reordering a sparse symmetric matrix to reduce its envelope size was described. The ordering is computed by associating a Laplacian matrix with the given matrix and then sorting the components of a specified eigenvector of the Laplacian. In this paper, we provide an analysis of the spectral envelope reduction algorithm. We describe related 1- and 2-sum problems; the former is related to the envelope size, while the latter is related to an upper bound on the work involved in an envelope Cholesky factorization scheme. We formulate the latter two problems as quadratic assignment problems, and then study the 2-sum problem in more detail. We obtain lower bounds on the 2-sum by considering a projected quadratic assignment problem, and then show that finding a permutation matrix closest to an orthogonal matrix attaining one of the lower bounds justifies the spectral envelope reduction algorithm. The lower bound on the 2-sum is seen to be tight for reasonably 'uniform' finite element meshes. We also obtain asymptotically tight lower bounds for the envelope size for certain classes of meshes.
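The quantities named in the abstract above can be computed directly for a small example. The sketch below assumes the usual definitions: the envelope size is the sum over rows of the distance from the diagonal to the leftmost nonzero, the 1-sum is the sum of |i - j| over off-diagonal nonzeros below the diagonal, and the 2-sum is the sum of (i - j)^2. The function name `envelope_metrics` and the edge-list input format are assumptions, and no spectral ordering is computed here.

```python
def envelope_metrics(edges, order):
    """Return (envelope size, 1-sum, 2-sum) of a symmetric sparsity pattern.

    edges: (u, v) pairs, one per off-diagonal nonzero pair of the matrix.
    order: list of vertices; order[k] is the vertex placed in row/column k.
    """
    pos = {v: k for k, v in enumerate(order)}
    first = {v: pos[v] for v in order}  # leftmost nonzero column per row (diagonal counts)
    one_sum = two_sum = 0
    for u, v in edges:
        hi, lo = max(pos[u], pos[v]), min(pos[u], pos[v])
        row_vertex = order[hi]  # the entry lies in row hi, column lo of the lower triangle
        first[row_vertex] = min(first[row_vertex], lo)
        one_sum += hi - lo
        two_sum += (hi - lo) ** 2
    envelope = sum(pos[v] - first[v] for v in order)
    return envelope, one_sum, two_sum
```

For a five-vertex path graph, the natural ordering gives (4, 4, 4), while the interleaved ordering [0, 2, 4, 1, 3] gives (6, 10, 26), which shows how strongly all three measures depend on the permutation.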
Application of Improved 5th-Cubature Kalman Filter in Initial Strapdown Inertial Navigation System Alignment for Large Misalignment Angles

PubMed Central

Wang, Wei; Chen, Xiyuan

2018-01-01

Because the accuracy of the third-degree Cubature Kalman Filter (CKF) used for initial alignment under large misalignment angle conditions is insufficient, an improved fifth-degree CKF algorithm is proposed in this paper. In order to make full use of the innovation in filtering, the innovation covariance matrix is calculated recursively from the innovation sequence with an exponential fading factor. Then a new adaptive error covariance matrix scaling algorithm is proposed. The Singular Value Decomposition (SVD) method is used for improving the numerical stability of the fifth-degree CKF in this paper. In order to avoid the overshoot caused by excessive scaling of the error covariance matrix during the convergence stage, the scaling scheme is terminated when the gradient of azimuth reaches the maximum. The experimental results show that the improved algorithm has better alignment accuracy with large misalignment angles than the traditional algorithm. PMID:29473912
Unsupervised Learning for Monaural Source Separation Using Maximization–Minimization Algorithm with Time–Frequency Deconvolution

PubMed Central

Bouridane, Ahmed; Ling, Bingo Wing-Kuen

2018-01-01

This paper presents an unsupervised learning algorithm for sparse nonnegative matrix factor time–frequency deconvolution with optimized fractional β-divergence. The β-divergence is a group of cost functions parametrized by a single parameter β. The Itakura–Saito divergence, Kullback–Leibler divergence and Least Square distance are special cases that correspond to β=0, 1, 2, respectively. This paper presents a generalized algorithm that uses a flexible range of β that includes fractional values. It describes a maximization–minimization (MM) algorithm leading to the development of a fast convergence multiplicative update algorithm with guaranteed convergence. The proposed model operates in the time–frequency domain and decomposes an information-bearing matrix into two-dimensional deconvolution of factor matrices that represent the spectral dictionary and temporal codes. The deconvolution process has been optimized to yield sparse temporal codes through maximizing the likelihood of the observations. The paper also presents a method to estimate the fractional β value. The method is demonstrated on separating audio mixtures recorded from a single channel. The paper shows that the extraction of the spectral dictionary and temporal codes is significantly more efficient by using the proposed algorithm and subsequently leads to better source separation performance. Experimental tests and comparisons with other factorization methods have been conducted to verify its efficacy. PMID:29702629
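The family of cost functions described above can be written down in a few lines. The sketch below uses the common scalar definition of the β-divergence (with the least-squares case carrying a factor of 1/2) and handles β = 0 and β = 1 as the limiting cases; the function name `beta_divergence` is an assumption, and nothing here reproduces the paper's deconvolution model or its estimator for β.

```python
import math

def beta_divergence(x, y, beta):
    """Beta-divergence d_beta(x | y) between two positive scalars."""
    if beta == 0:
        # Itakura-Saito divergence
        return x / y - math.log(x / y) - 1
    if beta == 1:
        # Generalized Kullback-Leibler divergence
        return x * math.log(x / y) - x + y
    # General case, valid for fractional beta; beta = 2 gives (x - y)^2 / 2
    return (x ** beta + (beta - 1) * y ** beta - beta * x * y ** (beta - 1)) / (beta * (beta - 1))
```

For matrices, the cost is the sum of this quantity over all entries of the data and its reconstruction. The divergence is zero when x equals y for every β.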
An Efficient Scheme for Updating Sparse Cholesky Factors

NASA Technical Reports Server (NTRS)

Raghavan, Padma

2002-01-01

Raghavan had earlier developed the software package DSCPACK, which can be used for solving sparse linear systems where the coefficient matrix is symmetric and positive definite (this project was not funded by NASA but by agencies such as NSF). DSCPACK-S is the serial code and DSCPACK-P is a parallel implementation suitable for multiprocessors or networks-of-workstations with message passing using MPI. The main algorithm used is the Cholesky factorization of a sparse symmetric positive definite matrix A = LL^T. The code can also compute the factorization A = LDL^T. The complexity of the software arises from several factors relating to the sparsity of the matrix A. A sparse N x N matrix A typically has fewer than cN nonzeros, where c is a small constant. If the matrix were dense, it would have O(N^2) nonzeros. The most complicated part of such sparse Cholesky factorization relates to fill-in, i.e., zeros in the original matrix that become nonzeros in the factor L. An efficient implementation depends to a large extent on complex data structures and on techniques from graph theory to reduce, identify, and manage fill. DSCPACK is based on an efficient multifrontal implementation with fill-managing algorithms and implementation arising from earlier research by Raghavan and others. Sparse Cholesky factorization is typically a four-step process: (1) ordering to compute a fill-reducing numbering, (2) symbolic factorization to determine the nonzero structure of L, (3) numeric factorization to compute L, and (4) triangular solution to solve Ly = b and L^T x = y. The first two steps are symbolic and are performed using the graph of the matrix. The numeric factorization step is of dominant cost and there are several schemes for improving performance by exploiting the nested and dense structure of groups of columns in the factor.
The latter are aimed at better utilization of the cache-memory hierarchy on modern processors to prevent cache misses and provide execution rates (operations/second) that are close to the peak rates for dense matrix computations. Currently, DSCPACK is being used in an application at NASA directed by J. Newman and M. James. We propose the implementation of efficient schemes for updating the LL^T or LDL^T factors computed in DSCPACK-S to meet the computational requirements of their project. A brief description is provided in the next section.
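The role of ordering, fill-in, triangular solves and factor updating can be seen on a tiny dense example. The sketch below is plain dense Python, not DSCPACK and not a sparse or multifrontal code; the helper names (`cholesky`, `solve`, `nonzeros`, `arrow`, `chol_update`) are assumptions, and `chol_update` is the classical rank-one update rather than the updating scheme proposed in the report.

```python
import math

def cholesky(A):
    """Dense Cholesky factor L (lower triangular) with A = L L^T."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        d = A[j][j] - sum(L[j][k] ** 2 for k in range(j))
        L[j][j] = math.sqrt(d)  # raises ValueError if A is not positive definite
        for i in range(j + 1, n):
            L[i][j] = (A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))) / L[j][j]
    return L

def solve(L, b):
    """Solve L L^T x = b: forward substitution L y = b, then back substitution L^T x = y."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def nonzeros(L, tol=1e-14):
    """Count entries of L whose magnitude exceeds tol."""
    return sum(1 for row in L for v in row if abs(v) > tol)

def arrow(n, dense_first):
    """Arrow matrix: diagonal plus one dense row/column, numbered first or last."""
    hub = 0 if dense_first else n - 1
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = float(n)
        A[i][hub] = A[hub][i] = 1.0
    A[hub][hub] = float(n)  # restore the diagonal entry overwritten in the loop
    return A

def chol_update(L, x):
    """Return the factor of L L^T + x x^T in O(n^2) operations (classical rank-one update)."""
    n = len(x)
    L = [row[:] for row in L]
    x = list(x)
    for k in range(n):
        r = math.hypot(L[k][k], x[k])
        c, s = r / L[k][k], x[k] / L[k][k]
        L[k][k] = r
        for i in range(k + 1, n):
            L[i][k] = (L[i][k] + s * x[i]) / c
            x[i] = c * x[i] - s * L[i][k]
    return L
```

For n = 6, numbering the dense row first fills the whole lower triangle of L (21 nonzeros), while numbering it last leaves only 11, which is the effect a fill-reducing ordering in step (1) is after. The update routine avoids refactoring from scratch when the matrix changes by a rank-one term.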
An algorithm for the solution of dynamic linear programs

NASA Technical Reports Server (NTRS)

Psiaki, Mark L.

1989-01-01

The algorithm's objective is to efficiently solve Dynamic Linear Programs (DLP) by taking advantage of their special staircase structure. This algorithm constitutes a stepping stone to an improved algorithm for solving Dynamic Quadratic Programs, which, in turn, would make the nonlinear programming method of Successive Quadratic Programs more practical for solving trajectory optimization problems. The ultimate goal is to bring trajectory optimization solution speeds into the realm of real-time control. The algorithm exploits the staircase nature of the large constraint matrix of the equality-constrained DLPs encountered when solving inequality-constrained DLPs by an active set approach. A numerically stable, staircase QL factorization of the staircase constraint matrix is carried out starting from its last rows and columns. The resulting recursion is like the time-varying Riccati equation from multi-stage LQR theory. The resulting factorization increases the efficiency of all of the typical LP solution operations over that of a dense matrix LP code. At the same time numerical stability is ensured. The algorithm also takes advantage of dynamic programming ideas about the cost-to-go by relaxing active pseudo constraints in a backwards sweeping process. This further decreases the cost per update of the LP rank-1 updating procedure, although it may result in more changes of the active set than if pseudo constraints were relaxed in a non-stagewise fashion. The usual stability of closed-loop Linear/Quadratic optimally-controlled systems, if it carries over to strictly linear cost functions, implies that the saving due to reduced factor update effort may outweigh the cost of an increased number of updates. An aerospace example is presented in which a ground-to-ground rocket's distance is maximized. This example demonstrates the applicability of this class of algorithms to aerospace guidance.
It also sheds light on the efficacy of the proposed pseudo constraint relaxation scheme.
Singular value decomposition for collaborative filtering on a GPU

NASA Astrophysics Data System (ADS)

Kato, Kimikazu; Hosino, Tikara

2010-06-01

Collaborative filtering predicts customers' unknown preferences from known preferences. In collaborative filtering computations, a singular value decomposition (SVD) is needed to reduce the size of a large-scale matrix so that the burden on the next phase of computation is decreased. In this application, SVD means a roughly approximated factorization of a given matrix into smaller matrices. Webb (a.k.a. Simon Funk) showed an effective algorithm to compute the SVD as part of a solution to an open competition called the "Netflix Prize". The algorithm utilizes an iterative method so that the approximation error improves at each step of the iteration. We give a GPU version of Webb's algorithm. Our algorithm is implemented in CUDA and is shown experimentally to be efficient.
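The iterative factorization referred to above can be sketched on the CPU as stochastic gradient descent over the known ratings only. This is a simplified illustration in the spirit of the Funk approach, updating all latent features at once rather than training them one at a time, and it is not the GPU implementation from the paper; the function name `train_factors` and the hyperparameters (`rank`, `lr`, `reg`, `epochs`) are assumptions.

```python
import random

def train_factors(ratings, n_users, n_items, rank=2, lr=0.01, reg=0.02, epochs=500, seed=0):
    """Fit user factors P and item factors Q so that dot(P[u], Q[i]) approximates r.

    ratings: list of (user index, item index, rating) triples for known entries.
    Returns (P, Q, history), where history is the training RMSE before each epoch
    and after the last one.
    """
    rng = random.Random(seed)
    P = [[rng.uniform(0.05, 0.15) for _ in range(rank)] for _ in range(n_users)]
    Q = [[rng.uniform(0.05, 0.15) for _ in range(rank)] for _ in range(n_items)]

    def rmse():
        sq = [(r - sum(p * q for p, q in zip(P[u], Q[i]))) ** 2 for u, i, r in ratings]
        return (sum(sq) / len(sq)) ** 0.5

    history = [rmse()]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(p * q for p, q in zip(P[u], Q[i]))
            for f in range(rank):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on the squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
        history.append(rmse())
    return P, Q, history
```

An unknown preference is then predicted as the dot product of the corresponding rows of P and Q. Only observed entries enter the loss, which is why the result is an approximate factorization rather than a true SVD.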
Improving performances of suboptimal greedy iterative biclustering heuristics via localization.

PubMed

Erten, Cesim; Sözdinler, Melih

2010-10-15

Biclustering gene expression data is the problem of extracting submatrices of genes and conditions exhibiting significant correlation across both the rows and the columns of a data matrix of expression values. Even the simplest versions of the problem are computationally hard. Most of the proposed solutions therefore employ greedy iterative heuristics that locally optimize a suitably assigned scoring function. We provide a fast and simple pre-processing algorithm called localization that reorders the rows and columns of the input data matrix in such a way as to group correlated entries in small local neighborhoods within the matrix. The proposed localization algorithm takes its roots from effective use of graph-theoretical methods applied to problems exhibiting a similar structure to that of biclustering. In order to evaluate the effectiveness of the localization pre-processing algorithm, we focus on three representative greedy iterative heuristic methods. We show how the localization pre-processing can be incorporated into each representative algorithm to improve biclustering performance. Furthermore, we propose a simple biclustering algorithm, Random Extraction After Localization (REAL) that randomly extracts submatrices from the localization pre-processed data matrix, eliminates those with low similarity scores, and provides the rest as correlated structures representing biclusters. We compare the proposed localization pre-processing with another pre-processing alternative, non-negative matrix factorization. We show that our fast and simple localization procedure provides similar or even better results than the computationally heavy matrix factorization pre-processing with regard to H-value tests.
We next demonstrate that the performances of the three representative greedy iterative heuristic methods improve with localization pre-processing when biological correlations in the form of functional enrichment and PPI verification constitute the main performance criteria. The fact that the random extraction method based on localization, REAL, performs better than the representative greedy heuristic methods under the same criteria also confirms the effectiveness of the suggested pre-processing method. Supplementary material including code implementations in the LEDA C++ library, experimental data, and the results are available at http://code.google.com/p/biclustering/. Contact: cesim@khas.edu.tr; melihsozdinler@boun.edu.tr. Supplementary data are available at Bioinformatics online.
Stochastic quasi-Newton molecular simulations

NASA Astrophysics Data System (ADS)

Chau, C. D.; Sevink, G. J. A.; Fraaije, J. G. E. M.

2010-08-01

We report a new and efficient factorized algorithm for the determination of the adaptive compound mobility matrix B in a stochastic quasi-Newton method (S-QN) that does not require additional potential evaluations. For one-dimensional and two-dimensional test systems, we previously showed that S-QN gives rise to efficient configurational space sampling with good thermodynamic consistency [C. D. Chau, G. J. A. Sevink, and J. G. E. M. Fraaije, J. Chem. Phys. 128, 244110 (2008), doi:10.1063/1.2943313]. Potential applications of S-QN are quite ambitious, and include structure optimization, analysis of correlations and automated extraction of cooperative modes. However, the potential can only be fully exploited if the computational and memory requirements of the original algorithm are significantly reduced. In this paper, we consider a factorized mobility matrix B = JJ^T and focus on the nontrivial fundamentals of an efficient algorithm for updating the noise multiplier J. The new algorithm requires O(n^2) multiplications per time step instead of the O(n^3) multiplications in the original scheme due to Choleski decomposition. In a recursive form, the update scheme circumvents matrix storage and enables limited-memory implementation, in the spirit of the well-known limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) method, allowing for a further reduction of the computational effort to O(n). We analyze in detail the performance of the factorized (FSU) and limited-memory (L-FSU) algorithms in terms of convergence and (multiscale) sampling, for an elementary but relevant system that involves multiple time and length scales. Finally, we use this analysis to formulate conditions for the simulation of the complex high-dimensional potential energy landscapes of interest.
Introducing difference recurrence relations for faster semi-global alignment of long sequences.

PubMed

Suzuki, Hajime; Kasahara, Masahiro

2018-02-19

The read length of single-molecule DNA sequencers is reaching 1 Mb. Popular alignment software tools widely used for analyzing such long reads often take advantage of single-instruction multiple-data (SIMD) operations to accelerate calculation of dynamic programming (DP) matrices in the Smith-Waterman-Gotoh (SWG) algorithm with a fixed alignment start position at the origin. Nonetheless, 16-bit or 32-bit integers are necessary for storing the values in a DP matrix when sequences to be aligned are long; this situation hampers the use of the full SIMD width of modern processors. We proposed a faster semi-global alignment algorithm, "difference recurrence relations," that runs more rapidly than the state-of-the-art algorithm by a factor of 2.1. Instead of calculating and storing all the values in a DP matrix directly, our algorithm computes and stores mainly the differences between the values of adjacent cells in the matrix. Although the SWG algorithm and our algorithm can output exactly the same result, our algorithm mainly involves 8-bit integer operations, enabling us to exploit the full width of SIMD operations (e.g., 32) on modern processors. We also developed a library, libgaba, so that developers can easily integrate our algorithm into alignment programs. Our novel algorithm and optimized library implementation will facilitate accelerating nucleotide long-read analysis algorithms that use pairwise alignment stages. The library is implemented in the C programming language and available at https://github.com/ocxtal/libgaba .
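The observation behind storing differences instead of scores can be checked on a small example. The sketch below fills an ordinary score matrix for an alignment anchored at the origin, using a linear gap penalty for brevity (the abstract concerns the affine-gap Smith-Waterman-Gotoh recurrences), and then takes differences between horizontally adjacent cells. It is not the libgaba algorithm, which never materializes the full score matrix; the function names and scoring parameters are assumptions.

```python
def dp_matrix(a, b, match=2, mismatch=-3, gap=4):
    """Score matrix for an alignment anchored at the origin, linear gap penalty."""
    H = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for j in range(1, len(b) + 1):
        H[0][j] = -gap * j
    for i in range(1, len(a) + 1):
        H[i][0] = -gap * i
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(H[i - 1][j - 1] + s,  # align a[i-1] with b[j-1]
                          H[i - 1][j] - gap,    # gap in b
                          H[i][j - 1] - gap)    # gap in a
    return H

def horizontal_differences(H):
    """Differences between each cell and its left neighbour, row by row."""
    return [[row[j] - row[j - 1] for j in range(1, len(row))] for row in H]
```

With these recurrences every horizontal difference lies between -gap and match + gap regardless of sequence length, so the differences fit in 8-bit integers even when the scores themselves grow far beyond that range. That boundedness is what allows narrow SIMD lanes to be used.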
A Fine-Grained Pipelined Implementation for Large-Scale Matrix Inversion on FPGA

NASA Astrophysics Data System (ADS)

Zhou, Jie; Dou, Yong; Zhao, Jianxun; Xia, Fei; Lei, Yuanwu; Tang, Yuxing

Large-scale matrix inversion plays an important role in many applications. However, to the best of our knowledge, there is no FPGA-based implementation. In this paper, we explore the possibility of accelerating large-scale matrix inversion on FPGA. To exploit the computational potential of FPGA, we introduce a fine-grained parallel algorithm for matrix inversion. A scalable linear array of processing elements (PEs), which is the core component of the FPGA accelerator, is proposed to implement this algorithm. A total of 12 PEs can be integrated into an Altera StratixII EP2S130F1020C5 FPGA on our self-designed board. Experimental results show that a factor of 2.6 speedup and a maximum power-performance of 41 can be achieved compared to a Pentium Dual CPU with double SSE threads.
Data Reduction Algorithm Using Nonnegative Matrix Factorization with Nonlinear Constraints

NASA Astrophysics Data System (ADS)

Sembiring, Pasukat

2017-12-01

Processing of data with very large dimensions has been a hot topic in recent decades. Various techniques have been proposed in order to extract the desired information or structure. Non-negative matrix factorization (NMF), which operates on non-negative data, has become one of the popular methods for dimension reduction. The main strength of this method is its non-negativity: an object is modeled as a combination of basic non-negative parts, which provides a physical interpretation of how the object is constructed. NMF is a dimension reduction method that has been used widely in numerous applications, including computer vision, text mining, pattern recognition, and bioinformatics. The mathematical formulation of NMF is not a convex optimization problem, and various types of algorithms have been proposed to solve it. The Alternating Nonnegative Least Squares (ANLS) framework is a block coordinate descent approach that has been proven reliable theoretically and efficient empirically. This paper proposes a new algorithm to solve the NMF problem based on the ANLS framework. The algorithm extends the convergence property of the ANLS framework to NMF formulations with nonlinear constraints.
Joint estimation of 2D-DOA and frequency based on space-time matrix and conformal array.

PubMed

Wan, Liang-Tian; Liu, Lu-Tao; Si, Wei-Jian; Tian, Zuo-Xi

2013-01-01

Each element in a conformal array has a different pattern, which degrades the performance of conventional high-resolution direction-of-arrival (DOA) algorithms. In this paper, a joint frequency and two-dimensional DOA (2D-DOA) estimation algorithm for conformal arrays is proposed. The delay correlation function is used to suppress noise. Both spatial and time sampling are utilized to construct the space-time matrix. The frequency and 2D-DOA estimation are accomplished based on parallel factor (PARAFAC) analysis without spectral peak searching and parameter pairing. The proposed algorithm needs only four guiding elements with precise positions to estimate frequency and 2D-DOA. Other instrumental elements can be arranged flexibly on the surface of the carrier. Simulation results demonstrate the effectiveness of the proposed algorithm.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Chow, Edmond

Solving sparse problems is at the core of many DOE computational science applications. We focus on the challenge of developing sparse algorithms that can fully exploit the parallelism in extreme-scale computing systems, in particular systems with massive numbers of cores per node. Our approach is to express a sparse matrix factorization as a large number of bilinear constraint equations, and then solve these equations via an asynchronous iterative method. The unknowns in these equations are the matrix entries of the factorization that is desired.
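The idea of treating a factorization as a set of bilinear equations can be sketched on a small dense matrix. The code below is a toy illustration only: it assumes a full (dense) pattern, a unit-diagonal L, and synchronous Jacobi-style sweeps, whereas the abstract describes sparse patterns and asynchronous updates. The function name `fixed_point_lu` and the test matrix are hypothetical. Each unknown entry of L or U is updated from the constraint (LU)_ij = a_ij using only values from the previous sweep, so all updates within a sweep are independent of each other.

```python
import numpy as np

def fixed_point_lu(A, sweeps):
    """Solve the bilinear equations (L U)_ij = a_ij by Jacobi-style sweeps."""
    n = A.shape[0]
    # Initial guess: column-scaled strictly lower part of A for L
    # (with unit diagonal) and the upper triangle of A for U.
    L = np.tril(A, -1) / np.diag(A) + np.eye(n)
    U = np.triu(A).astype(float)
    for _ in range(sweeps):
        L_old, U_old = L.copy(), U.copy()
        for i in range(n):
            for j in range(n):
                if i > j:
                    # Unknown l_ij from the (i, j) constraint.
                    s = L_old[i, :j] @ U_old[:j, j]
                    L[i, j] = (A[i, j] - s) / U_old[j, j]
                else:
                    # Unknown u_ij from the (i, j) constraint.
                    s = L_old[i, :i] @ U_old[:i, j]
                    U[i, j] = A[i, j] - s
    return L, U

A = np.array([[4.0, 1.0, 0.0, 1.0],
              [1.0, 4.0, 1.0, 0.0],
              [0.0, 1.0, 4.0, 1.0],
              [1.0, 0.0, 1.0, 4.0]])
L, U = fixed_point_lu(A, sweeps=12)
print(np.max(np.abs(L @ U - A)))
```

In this dense toy case every unknown depends only on unknowns that precede it in elimination order, so a modest number of sweeps reproduces the exact LU factors. Restricting the updates to a sparsity pattern would give an incomplete factorization instead.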
The successive projection algorithm as an initialization method for brain tumor segmentation using non-negative matrix factorization.

PubMed

Sauwen, Nicolas; Acou, Marjan; Bharath, Halandur N; Sima, Diana M; Veraart, Jelle; Maes, Frederik; Himmelreich, Uwe; Achten, Eric; Van Huffel, Sabine

2017-01-01

Non-negative matrix factorization (NMF) has become a widely used tool for additive parts-based analysis in a wide range of applications. As NMF is a non-convex problem, the quality of the solution will depend on the initialization of the factor matrices. In this study, the successive projection algorithm (SPA) is proposed as an initialization method for NMF. SPA builds on convex geometry and allocates endmembers based on successive orthogonal subspace projections of the input data. SPA is a fast and reproducible method, and it aligns well with the assumptions made in near-separable NMF analyses. SPA was applied to multi-parametric magnetic resonance imaging (MRI) datasets for brain tumor segmentation using different NMF algorithms. Comparison with common initialization methods shows that SPA achieves similar segmentation quality and it is competitive in terms of convergence rate. Whereas SPA was previously applied as a direct endmember extraction tool, we have shown improved segmentation results when using SPA as an initialization method, as it allows further enhancement of the sources during the NMF iterative procedure.
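A minimal sketch of the successive projection idea follows, assuming exactly separable synthetic data (every source appears as a pure column and all other columns are convex combinations of the sources). The function name `successive_projection` and the synthetic matrices are illustrative, not taken from the study. At each step the column with the largest residual norm is selected and all columns are projected onto the orthogonal complement of that column.

```python
import numpy as np

def successive_projection(X, r):
    """Return indices of r columns of X chosen by successive orthogonal projections."""
    R = X.astype(float).copy()
    chosen = []
    for _ in range(r):
        norms = np.sum(R * R, axis=0)
        k = int(np.argmax(norms))          # column with the largest residual norm
        chosen.append(k)
        u = R[:, k] / np.sqrt(norms[k])
        R = R - np.outer(u, u @ R)         # remove the component along u
    return chosen

rng = np.random.default_rng(0)
W_true = rng.random((6, 3)) + 0.1          # three hypothetical source signatures
mix = rng.random((3, 7)) + 0.05
mix /= mix.sum(axis=0)                     # strictly convex mixing weights
H_true = np.hstack([np.eye(3), mix])       # first three columns are pure
X = W_true @ H_true
idx = successive_projection(X, 3)
print(sorted(idx))
```

The selected columns `X[:, idx]` could then serve as the initial source matrix for an iterative NMF algorithm, which is the role the abstract assigns to SPA.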
Recovering hidden diagonal structures via non-negative matrix factorization with multiple constraints.

PubMed

Yang, Xi; Han, Guoqiang; Cai, Hongmin; Song, Yan

2017-03-31

Revealing data with intrinsically diagonal block structures is particularly useful for analyzing groups of highly correlated variables. Earlier research based on non-negative matrix factorization (NMF) has been shown to be effective in representing such data by decomposing the observed data into two factors, where one factor is considered to be the feature and the other the expansion loading from a linear algebra perspective. If the data are sampled from multiple independent subspaces, the loading factor would possess a diagonal structure under an ideal matrix decomposition. However, the standard NMF method and its variants have not been reported to exploit this type of data via direct estimation. To address this issue, a non-negative matrix factorization with multiple constraints model is proposed in this paper. The constraints include a sparsity norm on the feature matrix and a total variational norm on each column of the loading matrix. The proposed model is shown to be capable of efficiently recovering diagonal block structures hidden in observed samples. An efficient numerical algorithm using the alternating direction method of multipliers model is proposed for optimizing the new model. Compared with several benchmark models, the proposed method performs robustly and effectively for simulated and real biological data.

Fast polar decomposition of an arbitrary matrix

NASA Technical Reports Server (NTRS)

Higham, Nicholas J.; Schreiber, Robert S.

1988-01-01

The polar decomposition of an m x n matrix A of full rank, where m is greater than or equal to n, can be computed using a quadratically convergent algorithm. The algorithm is based on a Newton iteration involving a matrix inverse. With the use of a preliminary complete orthogonal decomposition the algorithm can be extended to arbitrary A. How to use the algorithm to compute the positive semi-definite square root of a Hermitian positive semi-definite matrix is described. A hybrid algorithm which adaptively switches from the matrix inversion based iteration to a matrix multiplication based iteration due to Kovarik, and to Bjorck and Bowie is formulated. The decision when to switch is made using a condition estimator. This matrix multiplication rich algorithm is shown to be more efficient on machines for which matrix multiplication can be executed 1.5 times faster than matrix inversion.
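The inversion-based Newton iteration can be sketched for the simplest case, a real, square, nonsingular matrix, using the unscaled update X ← (X + X^{-T})/2. This sketch omits the scaling, the rectangular and rank-deficient extensions, and the hybrid switching strategy described in the abstract; the function name `polar_newton` and the stopping rule are assumptions made for illustration.

```python
import numpy as np

def polar_newton(A, tol=1e-12, max_iter=100):
    """Polar decomposition A = U H of a real nonsingular square matrix."""
    X = A.astype(float).copy()
    for _ in range(max_iter):
        X_next = 0.5 * (X + np.linalg.inv(X).T)   # Newton step
        done = (np.linalg.norm(X_next - X, "fro")
                <= tol * np.linalg.norm(X_next, "fro"))
        X = X_next
        if done:
            break
    U = X                       # orthogonal polar factor
    H = U.T @ A
    H = 0.5 * (H + H.T)         # symmetrize to remove rounding noise
    return U, H

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)) + 3.0 * np.eye(4)
U, H = polar_newton(A)
print(np.linalg.norm(U.T @ U - np.eye(4)), np.linalg.norm(U @ H - A))
```

Each step costs one matrix inversion, which is why a multiplication-based iteration can become preferable on machines where multiplication is sufficiently faster than inversion.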
A Study for Texture Feature Extraction of High-Resolution Satellite Images Based on a Direction Measure and Gray Level Co-Occurrence Matrix Fusion Algorithm

PubMed Central

Zhang, Xin; Cui, Jintian; Wang, Weisheng; Lin, Chao

2017-01-01

To address the problem of image texture feature extraction, a direction measure statistic that is based on the directionality of image texture is constructed, and a new method of texture feature extraction, which is based on the direction measure and a gray level co-occurrence matrix (GLCM) fusion algorithm, is proposed in this paper. This method applies the GLCM to extract the texture feature value of an image and integrates the weight factor that is introduced by the direction measure to obtain the final texture feature of an image. A set of classification experiments for the high-resolution remote sensing images were performed by using support vector machine (SVM) classifier with the direction measure and gray level co-occurrence matrix fusion algorithm. Both qualitative and quantitative approaches were applied to assess the classification results. The experimental results demonstrated that texture feature extraction based on the fusion algorithm achieved a better image recognition, and the accuracy of classification based on this method has been significantly improved. PMID:28640181
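A gray level co-occurrence matrix and a direction-weighted texture feature can be sketched as follows. The abstract does not give the formula for the direction measure, so the `weights` argument here is a hypothetical placeholder for whatever weights that measure would produce, and the contrast statistic is only one of several GLCM features that could be used.

```python
import numpy as np

def glcm(image, dx, dy, levels):
    """Normalized co-occurrence matrix for the pixel offset (dx, dy)."""
    P = np.zeros((levels, levels), dtype=float)
    rows, cols = image.shape
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                P[image[r, c], image[r2, c2]] += 1
    return P / P.sum()

def contrast(P):
    """GLCM contrast: expected squared gray-level difference."""
    i, j = np.indices(P.shape)
    return float(np.sum(P * (i - j) ** 2))

def weighted_texture_feature(image, levels, offsets, weights):
    """Combine per-direction contrast values with normalized direction weights."""
    feats = np.array([contrast(glcm(image, dx, dy, levels)) for dx, dy in offsets])
    w = np.asarray(weights, dtype=float)
    return float(feats @ (w / w.sum()))

# Horizontal stripes: uniform along rows, alternating along columns.
img = np.array([[0, 0, 0, 0],
                [1, 1, 1, 1],
                [0, 0, 0, 0],
                [1, 1, 1, 1]])
horizontal = contrast(glcm(img, 1, 0, 2))
vertical = contrast(glcm(img, 0, 1, 2))
fused = weighted_texture_feature(img, 2, [(1, 0), (0, 1)], [1.0, 3.0])
print(horizontal, vertical, fused)
```

For this striped image the horizontal contrast is 0 and the vertical contrast is 1, so weighting the directions changes the fused feature, which is the effect a direction measure is meant to exploit.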
Adaptive Inverse Control for Rotorcraft Vibration Reduction

NASA Technical Reports Server (NTRS)

Jacklin, Stephen A.

1985-01-01

This thesis extends the Least Mean Square (LMS) algorithm to solve the multiple-input, multiple-output problem of alleviating N/Rev (revolutions per minute by number of blades) helicopter fuselage vibration by means of adaptive inverse control. A frequency domain locally linear model is used to represent the transfer matrix relating the higher harmonic pitch control inputs to the harmonic vibration outputs to be controlled. By using the inverse matrix as the controller gain matrix, an adaptive inverse regulator is formed to alleviate the N/Rev vibration. The stability and rate of convergence properties of the extended LMS algorithm are discussed. It is shown that the stability ranges for the elements of the stability gain matrix are directly related to the eigenvalues of the vibration signal information matrix for the learning phase, but not for the control phase. The overall conclusion is that the LMS adaptive inverse control method can form a robust vibration control system, but will require some tuning of the input sensor gains, the stability gain matrix, and the amount of control relaxation to be used. The learning curve of the controller during the learning phase is shown to be quantitatively close to that predicted by averaging the learning curves of the normal modes. For higher order transfer matrices, a rough estimate of the inverse is needed to start the algorithm efficiently. The simulation results indicate that the factor which most influences LMS adaptive inverse control is the product of the control relaxation and the stability gain matrix. A small stability gain matrix makes the controller less sensitive to relaxation selection, and permits faster and more stable vibration reduction, than choosing the stability gain matrix large and the control relaxation term small.
It is shown that the best selections of the stability gain matrix elements and the amount of control relaxation is basically a compromise between slow, stable convergence and fast convergence with increased possibility of unstable identification. In the simulation studies, the LMS adaptive inverse control algorithm is shown to be capable of adapting the inverse (controller) matrix to track changes in the flight conditions. The algorithm converges quickly for moderate disturbances, while taking longer for larger disturbances. Perfect knowledge of the inverse matrix is not required for good control of the N/Rev vibration. However it is shown that measurement noise will prevent the LMS adaptive inverse control technique from controlling the vibration, unless the signal averaging method presented is incorporated into the algorithm.
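For readers unfamiliar with the underlying update, the basic single-channel LMS rule is sketched below on a simple system identification task. This is the textbook scalar algorithm, not the multiple-input, multiple-output frequency-domain extension developed in the thesis; the filter length, step size `mu` and the "unknown" system `true_w` are illustrative assumptions.

```python
import numpy as np

def lms_identify(x, d, n_taps, mu):
    """Adapt FIR weights w so that w applied to recent inputs tracks d."""
    w = np.zeros(n_taps)
    errors = np.zeros(len(x))
    for n in range(n_taps - 1, len(x)):
        window = x[n - n_taps + 1:n + 1][::-1]   # newest sample first
        e = d[n] - w @ window                    # prediction error
        w = w + mu * e * window                  # LMS weight update
        errors[n] = e
    return w, errors

rng = np.random.default_rng(1)
true_w = np.array([0.5, -0.3, 0.2])              # hypothetical unknown system
x = rng.standard_normal(5000)
d = np.convolve(x, true_w)[:len(x)]              # noiseless system output
w_hat, errors = lms_identify(x, d, n_taps=3, mu=0.01)
print(w_hat)
```

The step size plays a role analogous to the stability gain discussed in the abstract: a larger `mu` speeds up convergence but narrows the stability margin, and with measurement noise the weights would fluctuate around the true values rather than settle on them.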
Solution algorithms for nonlinear transient heat conduction analysis employing element-by-element iterative strategies

NASA Technical Reports Server (NTRS)

Winget, J. M.; Hughes, T. J. R.

1985-01-01

The particular problems investigated in the present study arise from nonlinear transient heat conduction. One of two types of nonlinearities considered is related to a material temperature dependence which is frequently needed to accurately model behavior over the range of temperature of engineering interest. The second nonlinearity is introduced by radiation boundary conditions. The finite element equations arising from the solution of nonlinear transient heat conduction problems are formulated. The finite element matrix equations are temporally discretized, and a nonlinear iterative solution algorithm is proposed. Algorithms for solving the linear problem are discussed, taking into account the form of the matrix equations, Gaussian elimination, cost, and iterative techniques. Attention is also given to approximate factorization, implementational aspects, and numerical results.
Computing rank-revealing QR factorizations of dense matrices.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bischof, C. H.; Quintana-Orti, G.; Mathematics and Computer Science

1998-06-01

We develop algorithms and implementations for computing rank-revealing QR (RRQR) factorizations of dense matrices. First, we develop an efficient block algorithm for approximating an RRQR factorization, employing a windowed version of the commonly used Golub pivoting strategy, aided by incremental condition estimation. Second, we develop efficiently implementable variants of guaranteed reliable RRQR algorithms for triangular matrices originally suggested by Chandrasekaran and Ipsen and by Pan and Tang. We suggest algorithmic improvements with respect to condition estimation, termination criteria, and Givens updating. By combining the block algorithm with one of the triangular postprocessing steps, we arrive at an efficient and reliable algorithm for computing an RRQR factorization of a dense matrix. Experimental results on IBM RS/6000 and SGI R8000 platforms show that this approach performs up to three times faster than the less reliable QR factorization with column pivoting as it is currently implemented in LAPACK, and comes within 15% of the performance of the LAPACK block algorithm for computing a QR factorization without any column exchanges. Thus, we expect this routine to be useful in many circumstances where numerical rank deficiency cannot be ruled out, but currently has been ignored because of the computational cost of dealing with it.
Algorithms and Application of Sparse Matrix Assembly and Equation Solvers for Aeroacoustics

NASA Technical Reports Server (NTRS)

Watson, W. R.; Nguyen, D. T.; Reddy, C. J.; Vatsa, V. N.; Tang, W. H.

2001-01-01

An algorithm for symmetric sparse equation solutions on an unstructured grid is described. Efficient, sequential sparse algorithms for degree-of-freedom reordering, supernodes, symbolic/numerical factorization, and forward/backward solution phases are reviewed. Three sparse algorithms for the generation and assembly of symmetric systems of matrix equations are presented. The accuracy and numerical performance of the sequential version of the sparse algorithms are evaluated over the frequency range of interest in a three-dimensional aeroacoustics application. Results show that the solver solutions are accurate using a discretization of 12 points per wavelength. Results also show that the first assembly algorithm is impractical for high-frequency noise calculations. The second and third assembly algorithms have nearly equal performance at low values of source frequencies, but at higher values of source frequencies the third algorithm saves CPU time and RAM. The CPU time and the RAM required by the second and third assembly algorithms are two orders of magnitude smaller than that required by the sparse equation solver. A sequential version of these sparse algorithms can, therefore, be conveniently incorporated into a substructuring for domain decomposition formulation to achieve parallel computation, where different substructures are handled by different parallel processors.
Multi-threaded Sparse Matrix Sparse Matrix Multiplication for Many-Core and GPU Architectures.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Deveci, Mehmet; Trott, Christian Robert; Rajamanickam, Sivasankaran

Sparse Matrix-Matrix multiplication is a key kernel that has applications in several domains such as scientific computing and graph analysis. Several algorithms have been studied in the past for this foundational kernel. In this paper, we develop parallel algorithms for sparse matrix-matrix multiplication with a focus on performance portability across different high performance computing architectures. The performance of these algorithms depends on the data structures used in them. We compare different types of accumulators in these algorithms and demonstrate the performance difference between these data structures. Furthermore, we develop a meta-algorithm, kkSpGEMM, to choose the right algorithm and data structure based on the characteristics of the problem. We show performance comparisons on three architectures and demonstrate the need for the community to develop two-phase sparse matrix-matrix multiplication implementations for efficient reuse of the data structures involved.
Multi-threaded Sparse Matrix-Matrix Multiplication for Many-Core and GPU Architectures.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Deveci, Mehmet; Rajamanickam, Sivasankaran; Trott, Christian Robert

Sparse Matrix-Matrix multiplication is a key kernel that has applications in several domains such as scientific computing and graph analysis. Several algorithms have been studied in the past for this foundational kernel. In this paper, we develop parallel algorithms for sparse matrix-matrix multiplication with a focus on performance portability across different high performance computing architectures. The performance of these algorithms depends on the data structures used in them. We compare different types of accumulators in these algorithms and demonstrate the performance difference between these data structures. Furthermore, we develop a meta-algorithm, kkSpGEMM, to choose the right algorithm and data structure based on the characteristics of the problem. We show performance comparisons on three architectures and demonstrate the need for the community to develop two-phase sparse matrix-matrix multiplication implementations for efficient reuse of the data structures involved.
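The notions of an accumulator and of a two-phase product can be illustrated with a small serial sketch. It assumes matrices stored as lists of per-row dictionaries mapping column index to value, and it uses a Python `dict` as a hash-map accumulator; this is only one of the accumulator types such a comparison might include, and none of the names below come from kkSpGEMM. The symbolic phase counts the nonzeros per output row, and the numeric phase computes the values row by row.

```python
def spgemm_symbolic(A, B):
    """Phase 1: count the nonzeros in each row of C = A * B without computing values."""
    counts = []
    for a_row in A:
        cols = set()
        for k in a_row:
            cols.update(B[k].keys())
        counts.append(len(cols))
    return counts

def spgemm_numeric(A, B):
    """Phase 2: row-wise product using one hash-map accumulator per output row."""
    C = []
    for a_row in A:
        acc = {}
        for k, a_val in a_row.items():
            for j, b_val in B[k].items():
                acc[j] = acc.get(j, 0.0) + a_val * b_val
        C.append(acc)
    return C

# A is 2 x 3 and B is 3 x 3, each row stored as {column: value}.
A = [{0: 1.0, 2: 2.0}, {1: 3.0}]
B = [{1: 4.0}, {0: 5.0, 2: 6.0}, {1: 7.0}]
row_counts = spgemm_symbolic(A, B)
C = spgemm_numeric(A, B)
print(row_counts, C)
```

Separating the phases lets the output structure be allocated once from `row_counts` and reused when the same sparsity pattern is multiplied repeatedly with different values.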
Detection of Protein Complexes Based on Penalized Matrix Decomposition in a Sparse Protein-Protein Interaction Network.

PubMed

Cao, Buwen; Deng, Shuguang; Qin, Hua; Ding, Pingjian; Chen, Shaopeng; Li, Guanghui

2018-06-15

High-throughput technology has generated large-scale protein interaction data, which is crucial in our understanding of biological organisms. Many complex identification algorithms have been developed to determine protein complexes. However, these methods are only suitable for dense protein interaction networks, because their capabilities decrease rapidly when applied to sparse protein-protein interaction (PPI) networks. In this study, based on penalized matrix decomposition (PMD), a novel method of penalized matrix decomposition for the identification of protein complexes (i.e., PMDpc) was developed to detect protein complexes in the human protein interaction network. This method mainly consists of three steps. First, the adjacency matrix of the protein interaction network is normalized. Second, the normalized matrix is decomposed into three factor matrices. The PMDpc method can detect protein complexes in sparse PPI networks by imposing appropriate constraints on factor matrices. Finally, the results of our method are compared with those of other methods in the human PPI network. Experimental results show that our method can not only outperform classical algorithms, such as CFinder, ClusterONE, RRW, HC-PIN, and PCE-FR, but can also achieve an ideal overall performance in terms of a composite score consisting of F-measure, accuracy (ACC), and the maximum matching ratio (MMR).
A fast efficient implicit scheme for the gasdynamic equations using a matrix reduction technique

NASA Technical Reports Server (NTRS)

Barth, T. J.; Steger, J. L.

1985-01-01

An efficient implicit finite-difference algorithm for the gasdynamic equations utilizing matrix reduction techniques is presented. A significant reduction in arithmetic operations is achieved without loss of the stability characteristics and generality found in the Beam and Warming approximate factorization algorithm. Steady-state solutions to the conservative Euler equations in generalized coordinates are obtained for transonic flows and used to show that the method offers computational advantages over the conventional Beam and Warming scheme. Existing Beam and Warming codes can be retrofitted with minimal effort. The theoretical extension of the matrix reduction technique to the full Navier-Stokes equations in Cartesian coordinates is presented in detail. Linear stability, using a Fourier stability analysis, is demonstrated and discussed for the one-dimensional Euler equations.
Non-negative Matrix Factorization and Co-clustering: A Promising Tool for Multi-tasks Bearing Fault Diagnosis

NASA Astrophysics Data System (ADS)

Shen, Fei; Chen, Chao; Yan, Ruqiang

2017-05-01

Classical bearing fault diagnosis methods, being designed according to one specific task, always pay attention to the effectiveness of extracted features and the final diagnostic performance. However, most of these approaches suffer from inefficiency when multiple tasks exist, especially in a real-time diagnostic scenario. A fault diagnosis method based on Non-negative Matrix Factorization (NMF) and a Co-clustering strategy is proposed to overcome this limitation. Firstly, some high-dimensional matrices are constructed using the Short-Time Fourier Transform (STFT) features, where the dimension of each matrix equals the number of target tasks. Then, the NMF algorithm is carried out to obtain different components in each dimension direction through optimized matching, such as Euclidean distance and divergence distance. Finally, a Co-clustering technique based on information entropy is utilized to realize classification of each component. To verify the effectiveness of the proposed approach, a series of bearing data sets were analysed in this research. The tests indicated that although the diagnostic performance of a single task is comparable to traditional clustering methods such as the K-means algorithm and the Gaussian Mixture Model, the accuracy and computational efficiency in multi-task fault diagnosis are improved.
HPC-NMF: A High-Performance Parallel Algorithm for Nonnegative Matrix Factorization

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kannan, Ramakrishnan; Sukumar, Sreenivas R.; Ballard, Grey M.

NMF is a useful tool for many applications in different domains such as topic modeling in text mining, background separation in video analysis, and community detection in social networks. Despite its popularity in the data mining community, there is a lack of efficient distributed algorithms to solve the problem for big data sets. We propose a high-performance distributed-memory parallel algorithm that computes the factorization by iteratively solving alternating non-negative least squares (NLS) subproblems for W and H. It maintains the data and factor matrices in memory (distributed across processors), uses MPI for interprocessor communication, and, in the dense case, provably minimizes communication costs (under mild assumptions). As opposed to previous implementations, our algorithm is also flexible: It performs well for both dense and sparse matrices, and allows the user to choose any one of the multiple algorithms for solving the updates to low rank factors W and H within the alternating iterations.
New matrix bounds and iterative algorithms for the discrete coupled algebraic Riccati equation

NASA Astrophysics Data System (ADS)

Liu, Jianzhou; Wang, Li; Zhang, Juan

2017-11-01

The discrete coupled algebraic Riccati equation (DCARE) has wide applications in control theory and linear systems. In general, for the DCARE, each term of the coupled term is treated separately. In this paper, we consider the coupled term as a whole, which is different from the recent results. When applying eigenvalue inequalities to discuss the coupled term, our method has less error. In terms of the properties of special matrices and eigenvalue inequalities, we propose several upper and lower matrix bounds for the solution of the DCARE. Further, we discuss the iterative algorithms for the solution of the DCARE. In the fixed point iterative algorithms, the scope of the Lipschitz factor is wider than in the recent results. Finally, we offer corresponding numerical examples to illustrate the effectiveness of the derived results.
A globally well-posed finite element algorithm for aerodynamics applications

NASA Technical Reports Server (NTRS)

Iannelli, G. S.; Baker, A. J.

1991-01-01

A finite element CFD algorithm is developed for Euler and Navier-Stokes aerodynamic applications. For the linear basis, the resultant approximation is at least second-order-accurate in time and space for synergistic use of three procedures: (1) a Taylor weak statement, which provides for derivation of companion conservation law systems with embedded dispersion-error control mechanisms; (2) a stiffly stable second-order-accurate implicit Rosenbrock-Runge-Kutta temporal algorithm; and (3) a matrix tensor product factorization that permits efficient numerical linear algebra handling of the terminal large-matrix statement. Thorough analyses are presented regarding well-posed boundary conditions for inviscid and viscous flow specifications. Numerical solutions are generated and compared for critical evaluation of quasi-one- and two-dimensional Euler and Navier-Stokes benchmark test problems.
Unsymmetric ordering using a constrained Markowitz scheme

DOE Office of Scientific and Technical Information (OSTI.GOV)

Amestoy, Patrick R.; Li, Xiaoye S.; Pralet, Stephane

2005-01-18

We present a family of ordering algorithms that can be used as a preprocessing step prior to performing sparse LU factorization. The ordering algorithms simultaneously achieve the objectives of selecting numerically good pivots and preserving the sparsity. We describe the algorithmic properties and challenges in their implementation. By mixing the two objectives we show that we can reduce the amount of fill-in in the factors and reduce the number of numerical problems during factorization. On a set of large unsymmetric real problems, we obtained the median reductions of 12% in the factorization time, of 13% in the size of the LU factors, of 20% in the number of operations performed during the factorization phase, and of 11% in the memory needed by the multifrontal solver MA41-UNS. A byproduct of this ordering strategy is an incomplete LU-factored matrix that can be used as a preconditioner in an iterative solver.
Robust extraction of basis functions for simultaneous and proportional myoelectric control via sparse non-negative matrix factorization

NASA Astrophysics Data System (ADS)

Lin, Chuang; Wang, Binghui; Jiang, Ning; Farina, Dario

2018-04-01

Objective. This paper proposes a novel simultaneous and proportional multiple degree of freedom (DOF) myoelectric control method for active prostheses. Approach. The approach is based on non-negative matrix factorization (NMF) of surface EMG signals with the inclusion of sparseness constraints. By applying a sparseness constraint to the control signal matrix, it is possible to extract the basis information from arbitrary movements (quasi-unsupervised approach) for multiple DOFs concurrently. Main Results. In online testing based on target hitting, able-bodied subjects reached a greater throughput (TP) when using sparse NMF (SNMF) than with classic NMF or with linear regression (LR). Accordingly, the completion time (CT) was shorter for SNMF than NMF or LR. The same observations were made in two patients with unilateral limb deficiencies. Significance. The addition of sparseness constraints to NMF allows for a quasi-unsupervised approach to myoelectric control with superior results with respect to previous methods for the simultaneous and proportional control of multi-DOF. The proposed factorization algorithm allows robust simultaneous and proportional control, is superior to previous supervised algorithms, and, because of minimal supervision, paves the way to online adaptation in myoelectric control.
A case study of view-factor rectification procedures for diffuse-gray radiation enclosure computations

NASA Technical Reports Server (NTRS)

Taylor, Robert P.; Luck, Rogelio

1995-01-01

The view factors which are used in diffuse-gray radiation enclosure calculations are often computed by approximate numerical integrations. These approximately calculated view factors will usually not satisfy the important physical constraints of reciprocity and closure. In this paper several view-factor rectification algorithms are reviewed and a rectification algorithm based on a least-squares numerical filtering scheme is proposed with both weighted and unweighted classes. A Monte-Carlo investigation is undertaken to study the propagation of view-factor and surface-area uncertainties into the heat transfer results of the diffuse-gray enclosure calculations. It is found that the weighted least-squares algorithm is vastly superior to the other rectification schemes for the reduction of the heat-flux sensitivities to view-factor uncertainties. In a sample problem, which has proven to be very sensitive to uncertainties in view factor, the heat transfer calculations with weighted least-squares rectified view factors are very good with an original view-factor matrix computed to only one-digit accuracy. All of the algorithms had roughly equivalent effects on the reduction in sensitivity to area uncertainty in this case study.
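The two physical constraints, and one simple way to restore them, can be sketched as follows. This is a generic unweighted least-squares projection onto the linear constraints (closure: each row sums to one; reciprocity: A_i F_ij = A_j F_ji), offered as an illustration rather than as the paper's filtering scheme; the function name, the areas and the one-digit input matrix are assumptions. Non-negativity of the rectified factors is not enforced here.

```python
import numpy as np

def rectify_view_factors(F0, areas):
    """Smallest least-squares change to F0 that satisfies closure and reciprocity."""
    n = len(areas)
    rows, rhs = [], []
    # Closure: sum_j F_ij = 1 for every surface i.
    for i in range(n):
        c = np.zeros((n, n))
        c[i, :] = 1.0
        rows.append(c.ravel())
        rhs.append(1.0)
    # Reciprocity: A_i F_ij - A_j F_ji = 0 for every pair i < j.
    for i in range(n):
        for j in range(i + 1, n):
            c = np.zeros((n, n))
            c[i, j] = areas[i]
            c[j, i] = -areas[j]
            rows.append(c.ravel())
            rhs.append(0.0)
    C, d = np.array(rows), np.array(rhs)
    x0 = np.asarray(F0, dtype=float).ravel()
    # Orthogonal projection of x0 onto the affine set {x : C x = d}.
    x = x0 - np.linalg.pinv(C) @ (C @ x0 - d)
    return x.reshape(n, n)

areas = np.array([1.0, 2.0, 3.0])
# View factors known to one digit only: closure holds, reciprocity does not.
F0 = np.array([[0.2, 0.3, 0.5],
               [0.2, 0.3, 0.5],
               [0.2, 0.3, 0.5]])
F = rectify_view_factors(F0, areas)
print(F)
```

A weighted variant would replace the plain Euclidean distance with one that penalizes changes to the more trustworthy entries more heavily.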
A Class of Manifold Regularized Multiplicative Update Algorithms for Image Clustering.

PubMed

Yang, Shangming; Yi, Zhang; He, Xiaofei; Li, Xuelong

2015-12-01

Multiplicative update algorithms are important tools for information retrieval, image processing, and pattern recognition. However, when the graph regularization is added to the cost function, different classes of sample data may be mapped to the same subspace, which leads to the increase of data clustering error rate. In this paper, an improved nonnegative matrix factorization (NMF) cost function is introduced. Based on the cost function, a class of novel graph regularized NMF algorithms is developed, which results in a class of extended multiplicative update algorithms with manifold structure regularization. Analysis shows that in the learning, the proposed algorithms can efficiently minimize the rank of the data representation matrix. Theoretical results presented in this paper are confirmed by simulations. For different initializations and data sets, variation curves of cost functions and decomposition data are presented to show the convergence features of the proposed update rules. Basis images, reconstructed images, and clustering results are utilized to present the efficiency of the new algorithms. Last, the clustering accuracies of different algorithms are also investigated, which shows that the proposed algorithms can achieve state-of-the-art performance in applications of image clustering.
Nonnegative matrix factorization with the Itakura-Saito divergence: with application to music analysis.

PubMed

Févotte, Cédric; Bertin, Nancy; Durrieu, Jean-Louis

2009-03-01

This letter presents theoretical, algorithmic, and experimental results about nonnegative matrix factorization (NMF) with the Itakura-Saito (IS) divergence. We describe how IS-NMF is underlaid by a well-defined statistical model of superimposed gaussian components and is equivalent to maximum likelihood estimation of variance parameters. This setting can accommodate regularization constraints on the factors through Bayesian priors. In particular, inverse-gamma and gamma Markov chain priors are considered in this work. Estimation can be carried out using a space-alternating generalized expectation-maximization (SAGE) algorithm; this leads to a novel type of NMF algorithm, whose convergence to a stationary point of the IS cost function is guaranteed. We also discuss the links between the IS divergence and other cost functions used in NMF, in particular, the Euclidean distance and the generalized Kullback-Leibler (KL) divergence. As such, we describe how IS-NMF can also be performed using a gradient multiplicative algorithm (a standard algorithm structure in NMF) whose convergence is observed in practice, though not proven. Finally, we report a furnished experimental comparative study of Euclidean-NMF, KL-NMF, and IS-NMF algorithms applied to the power spectrogram of a short piano sequence recorded in real conditions, with various initializations and model orders. Then we show how IS-NMF can successfully be employed for denoising and upmix (mono to stereo conversion) of an original piece of early jazz music. These experiments indicate that IS-NMF correctly captures the semantics of audio and is better suited to the representation of music signals than NMF with the usual Euclidean and KL costs.
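The multiplicative form of IS-NMF mentioned in the abstract can be sketched with the commonly used update rules for the Itakura-Saito cost, d_IS(x|y) = x/y - log(x/y) - 1. The code assumes a strictly positive data matrix `V` (for example a power spectrogram), and the synthetic data, rank and iteration count are illustrative. It does not implement the SAGE algorithm or the Bayesian priors discussed in the letter.

```python
import numpy as np

def is_divergence(V, V_hat):
    """Itakura-Saito divergence summed over all matrix entries."""
    ratio = V / V_hat
    return float(np.sum(ratio - np.log(ratio) - 1.0))

def is_nmf(V, rank, n_iter=200, seed=0):
    """NMF under the IS cost using multiplicative updates."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + 0.1
    H = rng.random((rank, V.shape[1])) + 0.1
    costs = [is_divergence(V, W @ H)]
    for _ in range(n_iter):
        V_hat = W @ H
        H *= (W.T @ (V / V_hat**2)) / (W.T @ (1.0 / V_hat))
        V_hat = W @ H
        W *= ((V / V_hat**2) @ H.T) / ((1.0 / V_hat) @ H.T)
        costs.append(is_divergence(V, W @ H))
    return W, H, costs

rng = np.random.default_rng(42)
V = (rng.random((20, 3)) + 0.1) @ (rng.random((3, 30)) + 0.1)  # synthetic positive data
W, H, costs = is_nmf(V, rank=3)
print(costs[0], costs[-1])
```

Because the IS divergence is scale-invariant, d_IS(λx|λy) = d_IS(x|y), low-energy entries of `V` carry the same weight as high-energy ones, which is one reason the cost is considered well suited to audio spectra.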
An isometric muscle force estimation framework based on a high-density surface EMG array and an NMF algorithm

NASA Astrophysics Data System (ADS)

Huang, Chengjun; Chen, Xiang; Cao, Shuai; Qiu, Bensheng; Zhang, Xu

2017-08-01

Objective. To realize accurate muscle force estimation, a novel framework is proposed in this paper which can extract the input of the prediction model from the appropriate activation area of the skeletal muscle. Approach. Surface electromyographic (sEMG) signals from the biceps brachii muscle during isometric elbow flexion were collected with a high-density (HD) electrode grid (128 channels) and the external force at three contraction levels was measured at the wrist synchronously. The sEMG envelope matrix was factorized into a matrix of basis vectors with each column representing an activation pattern and a matrix of time-varying coefficients by a nonnegative matrix factorization (NMF) algorithm. The activation pattern with the highest activation intensity, which was defined as the sum of the absolute values of the time-varying coefficient curve, was considered as the major activation pattern, and its channels with high weighting factors were selected to extract the input activation signal of a force estimation model based on the polynomial fitting technique. Main results. Compared with conventional methods using the whole channels of the grid, the proposed method could significantly improve the quality of force estimation and reduce the electrode number. Significance. The proposed method provides a way to find proper electrode placement for force estimation, which can be further employed in muscle heterogeneity analysis, myoelectric prostheses and the control of exoskeleton devices.

Second derivative time integration methods for discontinuous Galerkin solutions of unsteady compressible flows

NASA Astrophysics Data System (ADS)

Nigro, A.; De Bartolo, C.; Crivellini, A.; Bassi, F.

2017-12-01

In this paper we investigate the possibility of using the high-order accurate A(α)-stable Second Derivative (SD) schemes proposed by Enright for the implicit time integration of the Discontinuous Galerkin (DG) space-discretized Navier-Stokes equations. These multistep schemes are A-stable up to fourth order, but their use results in a system matrix that is difficult to compute. Furthermore, the evaluation of the nonlinear function is computationally very demanding. We propose here a Matrix-Free (MF) implementation of Enright schemes that avoids the costs of forming, storing and factorizing the system matrix, is much less computationally expensive than its matrix-explicit counterpart, and performs competitively with other implicit schemes, such as the Modified Extended Backward Differentiation Formulae (MEBDF). The algorithm uses the preconditioned GMRES algorithm to solve the linear system of equations. The preconditioner is based on the ILU(0) factorization of an approximated but computationally cheaper form of the system matrix, and it is reused for several time steps to improve the efficiency of the MF Newton-Krylov solver. We additionally employ a polynomial extrapolation technique to compute an accurate initial guess for the implicit nonlinear system. The stability properties of SD schemes have been analyzed by solving a linear model problem. For the analysis on the Navier-Stokes equations, two-dimensional inviscid and viscous test cases, both with a known analytical solution, are solved to assess the accuracy properties of the proposed time integration method for nonlinear autonomous and non-autonomous systems, respectively. The performance of the SD algorithm is compared with that obtained using an MF-MEBDF solver, in order to evaluate its effectiveness, identify its limitations and suggest possible further improvements.
An on-line calibration algorithm for external parameters of visual system based on binocular stereo cameras

NASA Astrophysics Data System (ADS)

Wang, Liqiang; Liu, Zhen; Zhang, Zhonghua

2014-11-01

Stereo vision is key to visual measurement, robot vision, and autonomous navigation. Before a stereo vision system can be used, the intrinsic parameters of each camera and the external parameters of the system must be calibrated. In engineering practice, the intrinsic parameters remain unchanged after the cameras are calibrated, but the positional relationship between the cameras can change because of vibration, knocks and pressures in the vicinity of railways or motor workshops. Especially for large baselines, even minute changes in translation or rotation can affect the epipolar geometry and scene triangulation to such a degree that the visual system becomes disabled. A technology that includes both real-time examination and on-line recalibration of the external parameters of the stereo system therefore becomes particularly important. This paper presents an on-line method for checking and recalibrating the positional relationship between stereo cameras. In epipolar geometry, the external parameters of the cameras can be obtained by factorization of the fundamental matrix. This offers a way to calculate the external camera parameters without any special targets. If the intrinsic camera parameters are known, the external parameters of the system can be calculated from a number of random matched points. The process is: (i) estimating the fundamental matrix from the feature point correspondences; (ii) computing the essential matrix from the fundamental matrix; (iii) obtaining the external parameters by decomposition of the essential matrix. In the step of computing the fundamental matrix, the traditional methods are sensitive to noise and cannot ensure the estimation accuracy. We consider the distribution of features in actual scene images and introduce a regional weighted normalization algorithm to improve the accuracy of the fundamental matrix estimation. In contrast to traditional algorithms, experiments on simulated data show that the method improves the robustness and accuracy of the fundamental matrix estimation. Finally, we run an experiment that computes the relationship between a pair of stereo cameras to demonstrate the accuracy of the algorithm.
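Steps (ii) and (iii) of the process above can be sketched with the textbook SVD-based decomposition of the essential matrix. This is an illustration under stated assumptions rather than the authors' implementation: it does not include their regional weighted normalization for step (i), the helper names are hypothetical, and it assumes the convention x2^T F x1 = 0 with intrinsic matrices K1 and K2.

```python
import numpy as np

def essential_from_fundamental(F, K1, K2):
    """Step (ii): E = K2^T F K1, assuming x2^T F x1 = 0 in pixel coordinates."""
    return K2.T @ F @ K1

def decompose_essential(E):
    """Step (iii): return the four (R, t) candidates encoded by E.

    t is recovered only up to scale and sign. A real system would keep the
    candidate that places triangulated points in front of both cameras;
    that check is omitted here."""
    U, _, Vt = np.linalg.svd(E)
    # Flip signs so that the candidates are proper rotations (det = +1).
    if np.linalg.det(U) < 0:
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0.0, -1.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0]])
    R1 = U @ W @ Vt
    R2 = U @ W.T @ Vt
    t = U[:, 2]  # left null vector of E, unit length
    return [(R1, t), (R1, -t), (R2, t), (R2, -t)]
```

Because the translation is recovered only up to scale, an on-line recalibration of this kind would need an additional reference, such as the known baseline length, to fix the metric scale.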
Efficient sparse matrix-matrix multiplication for computing periodic responses by shooting method on Intel Xeon Phi

NASA Astrophysics Data System (ADS)

Stoykov, S.; Atanassov, E.; Margenov, S.

2016-10-01

Many scientific applications involve sparse or dense matrix operations, such as solving linear systems, matrix-matrix products, eigensolvers, etc. In structural nonlinear dynamics, the computation of periodic responses and the determination of the stability of the solution are of primary interest. The shooting method is widely used for obtaining periodic responses of nonlinear systems. The method involves simultaneous operations with sparse and dense matrices. One of the computationally expensive operations in the method is the multiplication of sparse by dense matrices. In the current work, a new algorithm for sparse matrix by dense matrix products is presented. The algorithm takes into account the structure of the sparse matrix, which is obtained by space discretization of the nonlinear Mindlin's plate equation of motion by the finite element method. The algorithm is developed to use the vector engine of Intel Xeon Phi coprocessors. It is compared with the standard sparse matrix by dense matrix algorithm and the one developed by Intel MKL, and it is shown that better algorithms can be developed by considering the properties of the sparse matrix.
Accelerating scientific computations with mixed precision algorithms

NASA Astrophysics Data System (ADS)

Baboulin, Marc; Buttari, Alfredo; Dongarra, Jack; Kurzak, Jakub; Langou, Julie; Langou, Julien; Luszczek, Piotr; Tomov, Stanimire

2009-12-01

On modern architectures, the performance of 32-bit operations is often at least twice as fast as the performance of 64-bit operations. By using a combination of 32-bit and 64-bit floating point arithmetic, the performance of many dense and sparse linear algebra algorithms can be significantly enhanced while maintaining the 64-bit accuracy of the resulting solution. The approach presented here can apply not only to conventional processors but also to other technologies such as Field Programmable Gate Arrays (FPGA), Graphical Processing Units (GPU), and the STI Cell BE processor. Results on modern processor architectures and the STI Cell BE are presented.

Program summary
Program title: ITER-REF
Catalogue identifier: AECO_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AECO_v1_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html
No. of lines in distributed program, including test data, etc.: 7211
No. of bytes in distributed program, including test data, etc.: 41 862
Distribution format: tar.gz
Programming language: FORTRAN 77
Computer: desktop, server
Operating system: Unix/Linux
RAM: 512 Mbytes
Classification: 4.8
External routines: BLAS (optional)
Nature of problem: On modern architectures, the performance of 32-bit operations is often at least twice as fast as the performance of 64-bit operations. By using a combination of 32-bit and 64-bit floating point arithmetic, the performance of many dense and sparse linear algebra algorithms can be significantly enhanced while maintaining the 64-bit accuracy of the resulting solution.
Solution method: Mixed precision algorithms stem from the observation that, in many cases, a single precision solution of a problem can be refined to the point where double precision accuracy is achieved. A common approach to the solution of linear systems, either dense or sparse, is to perform the LU factorization of the coefficient matrix using Gaussian elimination. First, the coefficient matrix A is factored into the product of a lower triangular matrix L and an upper triangular matrix U. Partial row pivoting is in general used to improve numerical stability, resulting in a factorization PA=LU, where P is a permutation matrix. The solution for the system is achieved by first solving Ly=Pb (forward substitution) and then solving Ux=y (backward substitution). Due to round-off errors, the computed solution, x, carries a numerical error magnified by the condition number of the coefficient matrix A. In order to improve the computed solution, an iterative process can be applied, which produces a correction to the computed solution at each iteration; this yields the method commonly known as the iterative refinement algorithm. Provided that the system is not too ill-conditioned, the algorithm produces a solution correct to the working precision.
Running time: seconds/minutes
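The solution method described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the ITER-REF program (which is FORTRAN 77): the function names are hypothetical, the LU routine is a plain unblocked implementation written for clarity rather than speed, and the stopping rule on the relative residual is an assumption.

```python
import numpy as np

def lu_partial_pivot(A):
    """Factor P A = L U in A's own dtype; L and U share one array."""
    LU = A.copy()
    n = LU.shape[0]
    perm = np.arange(n)
    for k in range(n - 1):
        p = k + int(np.argmax(np.abs(LU[k:, k])))
        if p != k:
            LU[[k, p]] = LU[[p, k]]
            perm[[k, p]] = perm[[p, k]]
        LU[k + 1:, k] /= LU[k, k]
        LU[k + 1:, k + 1:] -= np.outer(LU[k + 1:, k], LU[k, k + 1:])
    return LU, perm

def lu_solve(LU, perm, b):
    """Solve L y = P b (forward) and then U x = y (backward)."""
    n = LU.shape[0]
    y = b[perm].astype(LU.dtype)
    for i in range(n):
        y[i] -= LU[i, :i] @ y[:i]
    for i in range(n - 1, -1, -1):
        y[i] = (y[i] - LU[i, i + 1:] @ y[i + 1:]) / LU[i, i]
    return y

def mixed_precision_solve(A, b, max_iter=10, tol=1e-13):
    """Factor once in float32, then refine the solution in float64."""
    LU, perm = lu_partial_pivot(A.astype(np.float32))
    x = lu_solve(LU, perm, b).astype(np.float64)
    for _ in range(max_iter):
        r = b - A @ x                      # residual in 64-bit arithmetic
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        x += lu_solve(LU, perm, r).astype(np.float64)  # 32-bit correction
    return x
```

The expensive cubic-cost factorization runs entirely in single precision, while each refinement step only needs a matrix-vector product and two triangular solves, which is where the performance gain described in the abstract comes from.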
An efficient quantum algorithm for spectral estimation

NASA Astrophysics Data System (ADS)

Steffens, Adrian; Rebentrost, Patrick; Marvian, Iman; Eisert, Jens; Lloyd, Seth

2017-03-01

We develop an efficient quantum implementation of an important signal processing algorithm for line spectral estimation: the matrix pencil method, which determines the frequencies and damping factors of signals consisting of finite sums of exponentially damped sinusoids. Our algorithm provides a quantum speedup in a natural regime where the sampling rate is much higher than the number of sinusoid components. Along the way, we develop techniques that are expected to be useful for other quantum algorithms as well—consecutive phase estimations to efficiently make products of asymmetric low rank matrices classically accessible and an alternative method to efficiently exponentiate non-Hermitian matrices. Our algorithm features an efficient quantum-classical division of labor: the time-critical steps are implemented in quantum superposition, while an interjacent step, requiring much fewer parameters, can operate classically. We show that frequencies and damping factors can be obtained in time logarithmic in the number of sampling points, exponentially faster than known classical algorithms.
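For context, the classical matrix pencil method that the quantum algorithm accelerates can be sketched as below. This is an assumed, noise-free textbook variant (Hankel matrix, truncated SVD, shifted-subspace eigenproblem), not the authors' quantum procedure; the function name `matrix_pencil` and its parameters are hypothetical.

```python
import numpy as np

def matrix_pencil(y, n_components, pencil=None):
    """Estimate the poles z_k of y[n] = sum_k a_k * z_k**n.

    With sampling interval dt, the frequencies are angle(z_k) / (2*pi*dt)
    and the damping factors are -log(abs(z_k)) / dt."""
    y = np.asarray(y)
    N = len(y)
    L = pencil if pencil is not None else N // 2
    # Hankel data matrix with rows y[i], ..., y[i+L].
    Y = np.array([y[i:i + L + 1] for i in range(N - L)])
    _, _, Vh = np.linalg.svd(Y, full_matrices=False)
    # Basis of the signal row space, one column per component.
    V = Vh[:n_components].T
    V1, V2 = V[:-1], V[1:]   # the same subspace shifted by one sample
    return np.linalg.eigvals(np.linalg.pinv(V1) @ V2)
```

The SVD of the Hankel matrix dominates the classical cost, which is consistent with the abstract's description of the time-critical steps being the ones moved to the quantum side.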
The improved Apriori algorithm based on matrix pruning and weight analysis

NASA Astrophysics Data System (ADS)

Lang, Zhenhong

2018-04-01

This paper draws on the matrix compression algorithm and the weight analysis algorithm and proposes an improved Apriori algorithm based on matrix pruning and weight analysis. After the transactional database is scanned only once, the algorithm constructs the boolean transaction matrix. By counting the ones in the rows and columns of the matrix, the infrequent item sets are pruned and a new candidate item set is formed. Then the item weights, the transaction weights, and the weighted support of the items are calculated, from which the frequent item sets are obtained. The experimental results show that the improved Apriori algorithm not only reduces the number of repeated scans of the database, but also improves the efficiency of data correlation mining.
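The matrix pruning idea can be illustrated with a small sketch. It covers only the boolean-matrix part (single scan, column and row pruning) and leaves out the weight analysis; the function name, the use of absolute support counts, and the pruning order are assumptions for this example rather than details taken from the paper.

```python
from itertools import combinations
import numpy as np

def frequent_itemsets(transactions, min_support):
    """Mine frequent itemsets from a boolean transaction matrix.

    transactions: list of sets of items; min_support: absolute count."""
    items = sorted({item for t in transactions for item in t})
    if not items:
        return {}
    # One scan of the database builds the matrix (rows: transactions).
    M = np.array([[item in t for item in items] for t in transactions], dtype=bool)
    frequent = {}
    k = 1
    while M.shape[1] >= k and M.shape[0] > 0:
        level = {}
        for cols in combinations(range(M.shape[1]), k):
            names = tuple(items[c] for c in cols)
            # Apriori property: every (k-1)-subset must already be frequent.
            if k > 1 and any(s not in frequent for s in combinations(names, k - 1)):
                continue
            support = int(M[:, list(cols)].all(axis=1).sum())
            if support >= min_support:
                level[names] = support
        if not level:
            break
        frequent.update(level)
        # Column pruning: drop items that occur in no frequent k-itemset.
        used = sorted({items.index(n) for names in level for n in names})
        M = M[:, used]
        items = [items[c] for c in used]
        k += 1
        # Row pruning: a transaction with fewer than k remaining items
        # cannot support any k-itemset.
        M = M[M.sum(axis=1) >= k]
    return frequent
```

After the first pass the database itself is never read again; every support count is a column-wise AND over the shrinking matrix.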
Brief announcement: Hypergraph partitioning for parallel sparse matrix-matrix multiplication

DOE PAGES

Ballard, Grey; Druinsky, Alex; Knight, Nicholas; ...

2015-01-01

The performance of parallel algorithms for sparse matrix-matrix multiplication is typically determined by the amount of interprocessor communication performed, which in turn depends on the nonzero structure of the input matrices. In this paper, we characterize the communication cost of a sparse matrix-matrix multiplication algorithm in terms of the size of a cut of an associated hypergraph that encodes the computation for a given input nonzero structure. Obtaining an optimal algorithm corresponds to solving a hypergraph partitioning problem. Furthermore, our hypergraph model generalizes several existing models for sparse matrix-vector multiplication, and we can leverage hypergraph partitioners developed for that computation to improve application-specific algorithms for multiplying sparse matrices.
Trace Norm Regularized CANDECOMP/PARAFAC Decomposition With Missing Data.

PubMed

Liu, Yuanyuan; Shang, Fanhua; Jiao, Licheng; Cheng, James; Cheng, Hong

2015-11-01

In recent years, low-rank tensor completion (LRTC) problems have received a significant amount of attention in computer vision, data mining, and signal processing. The existing trace norm minimization algorithms for iteratively solving LRTC problems involve multiple singular value decompositions of very large matrices at each iteration. Therefore, they suffer from high computational cost. In this paper, we propose a novel trace norm regularized CANDECOMP/PARAFAC decomposition (TNCP) method for simultaneous tensor decomposition and completion. We first formulate a factor matrix rank minimization model by deducing the relation between the rank of each factor matrix and the mode-n rank of a tensor. Then, we introduce a tractable relaxation of our rank function, and then achieve a convex combination problem of much smaller-scale matrix trace norm minimization. Finally, we develop an efficient algorithm based on alternating direction method of multipliers to solve our problem. The promising experimental results on synthetic and real-world data validate the effectiveness of our TNCP method. Moreover, TNCP is significantly faster than the state-of-the-art methods and scales to larger problems.
Developing the fuzzy c-means clustering algorithm based on maximum entropy for multitarget tracking in a cluttered environment

NASA Astrophysics Data System (ADS)

Chen, Xiao; Li, Yaan; Yu, Jing; Li, Yuxing

2018-01-01

For fast and more effective implementation of tracking multiple targets in a cluttered environment, we propose a multiple targets tracking (MTT) algorithm called maximum entropy fuzzy c-means clustering joint probabilistic data association that combines fuzzy c-means clustering and the joint probabilistic data association (PDA) algorithm. The algorithm uses the membership value to express the probability of the target originating from measurement. The membership value is obtained through fuzzy c-means clustering objective function optimized by the maximum entropy principle. When considering the effect of the public measurement, we use a correction factor to adjust the association probability matrix to estimate the state of the target. As this algorithm avoids confirmation matrix splitting, it can solve the high computational load problem of the joint PDA algorithm. The results of simulations and analysis conducted for tracking neighbor parallel targets and cross targets in a different density cluttered environment show that the proposed algorithm can realize MTT quickly and efficiently in a cluttered environment. Further, the performance of the proposed algorithm remains constant with increasing process noise variance. The proposed algorithm has the advantages of efficiency and low computational load, which can ensure optimum performance when tracking multiple targets in a dense cluttered environment.
Matrix product operators, matrix product states, and ab initio density matrix renormalization group algorithms

NASA Astrophysics Data System (ADS)

Chan, Garnet Kin-Lic; Keselman, Anna; Nakatani, Naoki; Li, Zhendong; White, Steven R.

2016-07-01

Current descriptions of the ab initio density matrix renormalization group (DMRG) algorithm use two superficially different languages: an older language of the renormalization group and renormalized operators, and a more recent language of matrix product states and matrix product operators. The same algorithm can appear dramatically different when written in the two different vocabularies. In this work, we carefully describe the translation between the two languages in several contexts. First, we describe how to efficiently implement the ab initio DMRG sweep using a matrix product operator based code, and the equivalence to the original renormalized operator implementation. Next we describe how to implement the general matrix product operator/matrix product state algebra within a pure renormalized operator-based DMRG code. Finally, we discuss two improvements of the ab initio DMRG sweep algorithm motivated by matrix product operator language: Hamiltonian compression, and a sum over operators representation that allows for perfect computational parallelism. The connections and correspondences described here serve to link the future developments with the past and are important in the efficient implementation of continuing advances in ab initio DMRG and related algorithms.
Matrix product operators, matrix product states, and ab initio density matrix renormalization group algorithms.

PubMed

Chan, Garnet Kin-Lic; Keselman, Anna; Nakatani, Naoki; Li, Zhendong; White, Steven R

2016-07-07

Current descriptions of the ab initio density matrix renormalization group (DMRG) algorithm use two superficially different languages: an older language of the renormalization group and renormalized operators, and a more recent language of matrix product states and matrix product operators. The same algorithm can appear dramatically different when written in the two different vocabularies. In this work, we carefully describe the translation between the two languages in several contexts. First, we describe how to efficiently implement the ab initio DMRG sweep using a matrix product operator based code, and the equivalence to the original renormalized operator implementation. Next we describe how to implement the general matrix product operator/matrix product state algebra within a pure renormalized operator-based DMRG code. Finally, we discuss two improvements of the ab initio DMRG sweep algorithm motivated by matrix product operator language: Hamiltonian compression, and a sum over operators representation that allows for perfect computational parallelism. The connections and correspondences described here serve to link the future developments with the past and are important in the efficient implementation of continuing advances in ab initio DMRG and related algorithms.
Fast and stable algorithms for computing the principal square root of a complex matrix

NASA Technical Reports Server (NTRS)

Shieh, Leang S.; Lian, Sui R.; Mcinnis, Bayliss C.

1987-01-01

This note presents recursive algorithms that are rapidly convergent and more stable for finding the principal square root of a complex matrix. Also, the developed algorithms are utilized to derive the fast and stable matrix sign algorithms which are useful in developing applications to control system problems.
Electromagnetic scattering calculations on the Intel Touchstone Delta

NASA Technical Reports Server (NTRS)

Cwik, Tom; Patterson, Jean; Scott, David

1992-01-01

During the first year's operation of the Intel Touchstone Delta system, software which solves the electric field integral equations for fields scattered from arbitrarily shaped objects has been transferred to the Delta. To fully realize the Delta's resources, an out-of-core dense matrix solution algorithm that utilizes some or all of the 90 Gbyte of concurrent file system (CFS) has been used. The largest calculation completed to date computes the fields scattered from a perfectly conducting sphere modeled by 48,672 unknown functions, resulting in a complex valued dense matrix needing 37.9 Gbyte of storage. The out-of-core LU matrix factorization algorithm was executed in 8.25 h at a rate of 10.35 Gflops. Total time to complete the calculation was 19.7 h; the additional time was used to compute the 48,672 x 48,672 matrix entries, solve the system for a given excitation, and compute observable quantities. The calculation was performed in 64-bit precision.
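The storage and throughput figures quoted above can be cross-checked with a short calculation. The sketch assumes 16 bytes per matrix entry (two 64-bit reals per complex number), decimal gigabytes, and the usual operation count of 8n^3/3 real floating-point operations for a complex LU factorization; none of these assumptions is stated in the abstract itself.

```python
n = 48_672                   # unknowns, from the abstract
bytes_per_entry = 16         # assumed: complex number stored as two 64-bit reals

# Dense n x n matrix storage in decimal gigabytes.
storage_gb = n * n * bytes_per_entry / 1e9

# Assumed operation count for complex LU: n^3/3 complex multiply-adds,
# each costing 8 real floating-point operations.
lu_flops = 8 / 3 * n**3

lu_seconds = 8.25 * 3600     # reported factorization time
rate_gflops = lu_flops / lu_seconds / 1e9

print(f"storage: {storage_gb:.1f} GB")
print(f"LU rate: {rate_gflops:.2f} Gflops")
```

Under these assumptions both quantities agree with the reported 37.9 Gbyte and 10.35 Gflops.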
Lanczos algorithm with matrix product states for dynamical correlation functions

NASA Astrophysics Data System (ADS)

Dargel, P. E.; Wöllert, A.; Honecker, A.; McCulloch, I. P.; Schollwöck, U.; Pruschke, T.

2012-05-01

The density-matrix renormalization group (DMRG) algorithm can be adapted to the calculation of dynamical correlation functions in various ways which all represent compromises between computational efficiency and physical accuracy. In this paper we reconsider the oldest approach based on a suitable Lanczos-generated approximate basis and implement it using matrix product states (MPS) for the representation of the basis states. The direct use of matrix product states combined with an ex post reorthogonalization method allows us to avoid several shortcomings of the original approach, namely the multitargeting and the approximate representation of the Hamiltonian inherent in earlier Lanczos-method implementations in the DMRG framework, and to deal with the ghost problem of Lanczos methods, leading to a much better convergence of the spectral weights and poles. We present results for the dynamic spin structure factor of the spin-1/2 antiferromagnetic Heisenberg chain. A comparison to Bethe ansatz results in the thermodynamic limit reveals that the MPS-based Lanczos approach is much more accurate than earlier approaches at minor additional numerical cost.
Comparison of eigensolvers for symmetric band matrices.

PubMed

Moldaschl, Michael; Gansterer, Wilfried N

2014-09-15

We compare different algorithms for computing eigenvalues and eigenvectors of a symmetric band matrix across a wide range of synthetic test problems. Of particular interest is a comparison of state-of-the-art tridiagonalization-based methods as implemented in Lapack or Plasma on the one hand, and the block divide-and-conquer (BD&C) algorithm as well as the block twisted factorization (BTF) method on the other hand. The BD&C algorithm does not require tridiagonalization of the original band matrix at all, and the current version of the BTF method tridiagonalizes the original band matrix only for computing the eigenvalues. Avoiding the tridiagonalization process sidesteps the cost of backtransformation of the eigenvectors. Beyond that, we discovered another disadvantage of the backtransformation process for band matrices: In several scenarios, a lot of gradual underflow is observed in the (optional) accumulation of the transformation matrix and in the (obligatory) backtransformation step. According to the IEEE 754 standard for floating-point arithmetic, this implies many operations with subnormal (denormalized) numbers, which causes severe slowdowns compared to the other algorithms without backtransformation of the eigenvectors. We illustrate that in these cases the performance of existing methods from Lapack and Plasma reaches a competitive level only if subnormal numbers are disabled (and thus the IEEE standard is violated). Overall, our performance studies illustrate that if the problem size is large enough relative to the bandwidth, BD&C tends to achieve the highest performance of all methods if the spectrum to be computed is clustered. For test problems with well separated eigenvalues, the BTF method tends to become the fastest algorithm with growing problem size.
Peak picking NMR spectral data using non-negative matrix factorization.

PubMed

Tikole, Suhas; Jaravine, Victor; Rogov, Vladimir; Dötsch, Volker; Güntert, Peter

2014-02-11

Simple peak-picking algorithms, such as those based on lineshape fitting, perform well when peaks are completely resolved in multidimensional NMR spectra, but often produce wrong intensities and frequencies for overlapping peak clusters. For example, NOESY-type spectra have considerable overlaps leading to significant peak-picking intensity errors, which can result in erroneous structural restraints. Precise frequencies are critical for unambiguous resonance assignments. To alleviate this problem, a more sophisticated peaks decomposition algorithm, based on non-negative matrix factorization (NMF), was developed. We produce peak shapes from Fourier-transformed NMR spectra. Apart from its main goal of deriving components from spectra and producing peak lists automatically, the NMF approach can also be applied if the positions of some peaks are known a priori, e.g. from consistently referenced spectral dimensions of other experiments. Application of the NMF algorithm to a three-dimensional peak list of the 23 kDa bi-domain section of the RcsD protein (RcsD-ABL-HPt, residues 688-890) as well as to synthetic HSQC data shows that peaks can be picked accurately also in spectral regions with strong overlap.
The Comparison Between NMF and ICA in Pigment Mixture Identification of Ancient Chinese Paintings

NASA Astrophysics Data System (ADS)

Liu, Y.; Lyu, S.; Hou, M.; Yin, Q.

2018-04-01

Since the colour in painting cultural relics observed by our naked eyes or hyperspectral cameras is usually a mixture of several kinds of pigments, the mixed pigments analysis will be an important subject in the field of ancient painting conservation and restoration. This paper aims to find a more effective method to confirm the types of every pure pigment from mixture on the surface of paintings. Firstly, we adopted two kinds of blind source separation algorithms, which are independent component analysis and non-negative matrix factorization, to extract the pure pigment component from mixed spectrum respectively. Moreover, we matched the separated pure spectrum with the pigments spectra library built by our team to determine the pigment type. Furthermore, three kinds of data including simulation data, mixed pigments spectral data measured in laboratory, and the spectral data of an ancient painting were chosen to evaluate the performance of the different algorithms. And the accuracy was compared between the two algorithms. Finally, the experimental results show that non-negative matrix factorization method is more suitable for endmember extraction in the field of ancient painting conservation and restoration.
FPGA-based coprocessor for matrix algorithms implementation

NASA Astrophysics Data System (ADS)

Amira, Abbes; Bensaali, Faycal

2003-03-01

Matrix algorithms are important in many types of applications including image and signal processing. These areas require enormous computing power. A close examination of the algorithms used in these, and related, applications reveals that many of the fundamental actions involve matrix operations such as matrix multiplication, which has complexity O(N^3) on a sequential computer and O(N^3/p) on a parallel system with p processors. This paper presents an investigation into the design and implementation of different matrix algorithms such as matrix operations, matrix transforms and matrix decompositions using an FPGA based environment. Solutions for the problem of processing large matrices have been proposed. The proposed system architectures are scalable, modular and require less area and time complexity with reduced latency when compared with existing structures.
Preconditioned conjugate gradient wave-front reconstructors for multiconjugate adaptive optics

NASA Astrophysics Data System (ADS)

Gilles, Luc; Ellerbroek, Brent L.; Vogel, Curtis R.

2003-09-01

Multiconjugate adaptive optics (MCAO) systems with 10^4-10^5 degrees of freedom have been proposed for future giant telescopes. Using standard matrix methods to compute, optimize, and implement wave-front control algorithms for these systems is impractical, since the number of calculations required to compute and apply the reconstruction matrix scales respectively with the cube and the square of the number of adaptive optics degrees of freedom. We develop scalable open-loop iterative sparse matrix implementations of minimum variance wave-front reconstruction for telescope diameters up to 32 m with more than 10^4 actuators. The basic approach is the preconditioned conjugate gradient method with an efficient preconditioner, whose block structure is defined by the atmospheric turbulent layers very much like the layer-oriented MCAO algorithms of current interest. Two cost-effective preconditioners are investigated: a multigrid solver and a simpler block symmetric Gauss-Seidel (BSGS) sweep. Both options require off-line sparse Cholesky factorizations of the diagonal blocks of the matrix system. The cost to precompute these factors scales approximately as the three-halves power of the number of estimated phase grid points per atmospheric layer, and their average update rate is typically of the order of 10^-2 Hz, i.e., 4-5 orders of magnitude lower than the typical 10^3 Hz temporal sampling rate. All other computations scale almost linearly with the total number of estimated phase grid points. We present numerical simulation results to illustrate algorithm convergence. Convergence rates of both preconditioners are similar, regardless of measurement noise level, indicating that the layer-oriented BSGS sweep is as effective as the more elaborated multiresolution preconditioner.
An efficient solution of real-time data processing for multi-GNSS network

NASA Astrophysics Data System (ADS)

Gong, Xiaopeng; Gu, Shengfeng; Lou, Yidong; Zheng, Fu; Ge, Maorong; Liu, Jingnan

2017-12-01

Global navigation satellite systems (GNSS) are an indispensable tool for geodetic research and global monitoring of the Earth, and they have developed rapidly over the past few years with abundant GNSS networks, modern constellations, and significant improvements in the mathematical models of data processing. However, due to the increasing number of satellites and stations, computational efficiency has become a key issue that could hamper the further development of GNSS applications. In this contribution, this problem is addressed from the aspects of both dense linear algebra algorithms and GNSS processing strategy. First, in order to fully exploit the power of modern microprocessors, a square root information filter solution based on the blocked QR factorization, which employs as many matrix-matrix operations as possible, is introduced. In addition, the algorithmic complexity of GNSS data processing is further decreased by centralizing the carrier-phase observations and ambiguity parameters, as well as by performing real-time ambiguity resolution and elimination. Based on the QR factorization of a simulated matrix, we conclude that, compared to unblocked QR factorization, the blocked QR factorization improves processing efficiency by nearly two orders of magnitude on a personal computer with four 3.30 GHz cores. Then, with 82 globally distributed stations, the processing efficiency is further validated in multi-GNSS (GPS/BDS/Galileo) satellite clock estimation. The results suggest that the unblocked method takes about 31.38 s per epoch, whereas our new algorithm, without any loss of accuracy, takes only 0.50 and 0.31 s per epoch for float and fixed clock solutions, respectively.
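A square root information filter measurement update built on a QR factorization can be sketched as follows. This is a generic textbook form, not the authors' GNSS software: it assumes the prior information and the measurements have already been whitened to unit noise covariance, the function name `srif_update` is hypothetical, and the blocking is left to the LAPACK routine that `numpy.linalg.qr` calls rather than implemented explicitly.

```python
import numpy as np

def srif_update(R, z, H, y):
    """Combine prior information R x = z with measurements y = H x.

    Both equations are assumed to carry unit-covariance noise. Returns the
    updated upper-triangular R_new and right-hand side z_new; the state
    estimate is the solution of R_new x = z_new."""
    n = R.shape[0]
    # Stack prior and measurement equations, right-hand side as last column.
    stacked = np.vstack([np.column_stack([R, z]),
                         np.column_stack([H, y])])
    # An orthogonal transformation leaves the least-squares problem
    # unchanged while making the stacked system upper triangular.
    _, T = np.linalg.qr(stacked)
    return T[:n, :n], T[:n, n]
```

Because only the triangular factor is needed, the orthogonal matrix never has to be formed explicitly in a production implementation, and the factorization can be carried out on large blocks of observations at once.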

Quantum Simulation of Tunneling in Small Systems

PubMed Central

Sornborger, Andrew T.

2012-01-01

A number of quantum algorithms have been performed on small quantum computers; these include Shor's prime factorization algorithm, error correction, Grover's search algorithm and a number of analog and digital quantum simulations. Because of the number of gates and qubits necessary, however, digital quantum particle simulations remain untested. A contributing factor to the system size required is the number of ancillary qubits needed to implement matrix exponentials of the potential operator. Here, we show that a set of tunneling problems may be investigated with no ancillary qubits and a cost of one single-qubit operator per time step for the potential evolution, eliminating at least half of the quantum gates required for the algorithm and more than that in the general case. Such simulations are within reach of current quantum computer architectures. PMID:22916333
Turbo-SMT: Parallel Coupled Sparse Matrix-Tensor Factorizations and Applications

PubMed Central

Papalexakis, Evangelos E.; Faloutsos, Christos; Mitchell, Tom M.; Talukdar, Partha Pratim; Sidiropoulos, Nicholas D.; Murphy, Brian

2016-01-01

How can we correlate the neural activity in the human brain as it responds to typed words, with properties of these terms (like ’edible’, ’fits in hand’)? In short, we want to find latent variables, that jointly explain both the brain activity, as well as the behavioral responses. This is one of many settings of the Coupled Matrix-Tensor Factorization (CMTF) problem. Can we enhance any CMTF solver, so that it can operate on potentially very large datasets that may not fit in main memory? We introduce Turbo-SMT, a meta-method capable of doing exactly that: it boosts the performance of any CMTF algorithm and parallelizes it, producing sparse and interpretable solutions (up to 65 fold). Additionally, we improve upon ALS, the work-horse algorithm for CMTF, with respect to efficiency and robustness to missing values. We apply Turbo-SMT to BrainQ, a dataset consisting of a (nouns, brain voxels, human subjects) tensor and a (nouns, properties) matrix, with coupling along the nouns dimension. Turbo-SMT is able to find meaningful latent variables, as well as to predict brain activity with competitive accuracy. Finally, we demonstrate the generality of Turbo-SMT, by applying it on a Facebook dataset (users, ’friends’, wall-postings); there, Turbo-SMT spots spammer-like anomalies. PMID:27672406
The discrete hungry Lotka Volterra system and a new algorithm for computing matrix eigenvalues

NASA Astrophysics Data System (ADS)

Fukuda, Akiko; Ishiwata, Emiko; Iwasaki, Masashi; Nakamura, Yoshimasa

2009-01-01

The discrete hungry Lotka-Volterra (dhLV) system is a generalization of the discrete Lotka-Volterra (dLV) system which stands for a prey-predator model in mathematical biology. In this paper, we show that (1) some invariants exist which are expressed by dhLV variables and are independent from the discrete time and (2) a dhLV variable converges to some positive constant or zero as the discrete time becomes sufficiently large. Some characteristic polynomial is then factorized with the help of the dhLV system. The asymptotic behaviour of the dhLV system enables us to design an algorithm for computing complex eigenvalues of a certain band matrix.
An algorithm for solving an arbitrary triangular fully fuzzy Sylvester matrix equations

NASA Astrophysics Data System (ADS)

Daud, Wan Suhana Wan; Ahmad, Nazihah; Malkawi, Ghassan

2017-11-01

Sylvester matrix equations play a prominent role in various areas, including control theory. Because uncertainty can arise at any time, the Sylvester matrix equation has to be adapted to the fuzzy environment. Therefore, in this study, an algorithm for solving an arbitrary triangular fully fuzzy Sylvester matrix equation is constructed. The construction of the algorithm is based on the max-min arithmetic multiplication operation. In addition, an associated arbitrary matrix equation is modified to obtain the final solution. Finally, some numerical examples are presented to illustrate the proposed algorithm.
A Distributed-Memory Package for Dense Hierarchically Semi-Separable Matrix Computations Using Randomization

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rouet, François-Henry; Li, Xiaoye S.; Ghysels, Pieter

In this paper, we present a distributed-memory library for computations with dense structured matrices. A matrix is considered structured if its off-diagonal blocks can be approximated by a rank-deficient matrix with low numerical rank. Here, we use Hierarchically Semi-Separable (HSS) representations. Such matrices appear in many applications, for example, finite-element methods, boundary element methods, and so on. Exploiting this structure allows for fast solution of linear systems and/or fast computation of matrix-vector products, which are the two main building blocks of matrix computations. The compression algorithm that we use, which computes the HSS form of an input dense matrix, relies on randomized sampling with a novel adaptive sampling mechanism. We discuss the parallelization of this algorithm and also present the parallelization of structured matrix-vector product, structured factorization, and solution routines. The efficiency of the approach is demonstrated on large problems from different academic and industrial applications, on up to 8,000 cores. Finally, this work is part of a more global effort, the STRUctured Matrices PACKage (STRUMPACK) software package for computations with sparse and dense structured matrices. Hence, although useful in their own right, the routines also represent a step in the direction of a distributed-memory sparse solver.
A Distributed-Memory Package for Dense Hierarchically Semi-Separable Matrix Computations Using Randomization

DOE PAGES

Rouet, François-Henry; Li, Xiaoye S.; Ghysels, Pieter; ...

2016-06-30

In this paper, we present a distributed-memory library for computations with dense structured matrices. A matrix is considered structured if its off-diagonal blocks can be approximated by a rank-deficient matrix with low numerical rank. Here, we use Hierarchically Semi-Separable (HSS) representations. Such matrices appear in many applications, for example, finite-element methods, boundary element methods, and so on. Exploiting this structure allows for fast solution of linear systems and/or fast computation of matrix-vector products, which are the two main building blocks of matrix computations. The compression algorithm that we use, which computes the HSS form of an input dense matrix, relies on randomized sampling with a novel adaptive sampling mechanism. We discuss the parallelization of this algorithm and also present the parallelization of structured matrix-vector product, structured factorization, and solution routines. The efficiency of the approach is demonstrated on large problems from different academic and industrial applications, on up to 8,000 cores. Finally, this work is part of a more global effort, the STRUctured Matrices PACKage (STRUMPACK) software package for computations with sparse and dense structured matrices. Hence, although useful in their own right, the routines also represent a step in the direction of a distributed-memory sparse solver.
An efficient parallel algorithm for matrix-vector multiplication

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hendrickson, B.; Leland, R.; Plimpton, S.

The multiplication of a vector by a matrix is the kernel computation of many algorithms in scientific computation. A fast parallel algorithm for this calculation is therefore necessary if one is to make full use of the new generation of parallel supercomputers. This paper presents a high performance, parallel matrix-vector multiplication algorithm that is particularly well suited to hypercube multiprocessors. For an n x n matrix on p processors, the communication cost of this algorithm is O(n/√p + log(p)), independent of the matrix sparsity pattern. The performance of the algorithm is demonstrated by employing it as the kernel in the well-known NAS conjugate gradient benchmark, where a run time of 6.09 seconds was observed. This is the best published performance on this benchmark achieved to date using a massively parallel supercomputer.
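The n/√p term in the communication cost can be made plausible with a serial simulation of a two-dimensional block decomposition. The sketch below is a generic illustration, not the hypercube algorithm of the paper: it assumes a q-by-q processor grid (so p = q²), in which each processor touches only an n/q-long slice of x and produces an n/q-long partial result. The function name `block_matvec` is hypothetical.

```python
import numpy as np

def block_matvec(A, x, q):
    """Simulate y = A x on a q-by-q processor grid (p = q*q), serially."""
    n = A.shape[0]
    edges = np.linspace(0, n, q + 1).astype(int)   # block boundaries
    y = np.zeros(n)
    for i in range(q):              # processor row
        r0, r1 = edges[i], edges[i + 1]
        for j in range(q):          # processor column
            c0, c1 = edges[j], edges[j + 1]
            # Processor (i, j) needs only the roughly n/q entries x[c0:c1] ...
            partial = A[r0:r1, c0:c1] @ x[c0:c1]
            # ... and contributes a roughly n/q-long partial sum to row block i.
            y[r0:r1] += partial
    return y

rng = np.random.default_rng(0)
A = rng.standard_normal((10, 10))
x = rng.standard_normal(10)
y = block_matvec(A, x, q=3)
```

In a real distributed code the `+=` would be a reduction along each processor row; here it only shows which data each processor would have to send and receive.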
Non-Rigid Structure Estimation in Trajectory Space from Monocular Vision

PubMed Central

Wang, Yaming; Tong, Lingling; Jiang, Mingfeng; Zheng, Junbao

2015-01-01

In this paper, the problem of non-rigid structure estimation in trajectory space from monocular vision is investigated. Similar to the Point Trajectory Approach (PTA), based on characteristic points’ trajectories described by a predefined Discrete Cosine Transform (DCT) basis, the structure matrix was also calculated by using a factorization method. To further optimize the non-rigid structure estimation from monocular vision, the rank minimization problem about structure matrix is proposed to implement the non-rigid structure estimation by introducing the basic low-rank condition. Moreover, the Accelerated Proximal Gradient (APG) algorithm is proposed to solve the rank minimization problem, and the initial structure matrix calculated by the PTA method is optimized. The APG algorithm can converge to efficient solutions quickly and lessen the reconstruction error obviously. The reconstruction results of real image sequences indicate that the proposed approach runs reliably, and effectively improves the accuracy of non-rigid structure estimation from monocular vision. PMID:26473863
Matrix multiplication on the Intel Touchstone Delta

DOE Office of Scientific and Technical Information (OSTI.GOV)

Huss-Lederman, S.; Jacobson, E.M.; Tsao, A.

1993-12-31

Matrix multiplication is a key primitive in block matrix algorithms such as those found in LAPACK. We present results from our study of matrix multiplication algorithms on the Intel Touchstone Delta, a distributed memory message-passing architecture with a two-dimensional mesh topology. We obtain an implementation that uses communication primitives highly suited to the Delta and exploits the single node assembly-coded matrix multiplication. Our algorithm is completely general, able to deal with arbitrary mesh aspect ratios and matrix dimensions, and has achieved parallel efficiency of 86% with overall peak performance in excess of 8 Gflops on 256 nodes for an 8800 × 8800 matrix. We describe our algorithm design and implementation, and present performance results that demonstrate scalability and robust behavior over varying mesh topologies.
Column Subset Selection, Matrix Factorization, and Eigenvalue Optimization

DTIC Science & Technology

2008-07-01

Pietsch and Grothendieck, which are regarded as basic instruments in modern functional analysis [Pis86]. • The methods for computing these... Pietsch factorization and the maxcut semidefinite program [GW95]. 1.2. Overview. We focus on the algorithmic version of the Kashin–Tzafriri theorem...will see that the desired subset is exposed by factoring the random submatrix. This factorization, which was invented by Pietsch, is regarded as a basic
A Shifted Block Lanczos Algorithm 1: The Block Recurrence

NASA Technical Reports Server (NTRS)

Grimes, Roger G.; Lewis, John G.; Simon, Horst D.

1990-01-01

In this paper we describe a block Lanczos algorithm that is used as the key building block of a software package for the extraction of eigenvalues and eigenvectors of large sparse symmetric generalized eigenproblems. The software package comprises: a version of the block Lanczos algorithm specialized for spectrally transformed eigenproblems; an adaptive strategy for choosing shifts, and efficient codes for factoring large sparse symmetric indefinite matrices. This paper describes the algorithmic details of our block Lanczos recurrence. This uses a novel combination of block generalizations of several features that have only been investigated independently in the past. In particular new forms of partial reorthogonalization, selective reorthogonalization and local reorthogonalization are used, as is a new algorithm for obtaining the M-orthogonal factorization of a matrix. The heuristic shifting strategy, the integration with sparse linear equation solvers and numerical experience with the code are described in a companion paper.
Study on the algorithm of computational ghost imaging based on discrete fourier transform measurement matrix

NASA Astrophysics Data System (ADS)

Zhang, Leihong; Liang, Dong; Li, Bei; Kang, Yi; Pan, Zilan; Zhang, Dawei; Gao, Xiumin; Ma, Xiuhua

2016-07-01

Based on an analysis of the cosine light field, which has a known analytic expression, and of the pseudo-inverse method, the object is illuminated by a preset light field defined by a deterministic discrete Fourier transform measurement matrix, and the object image is reconstructed by the pseudo-inverse method. The analytic expression of the computational ghost imaging algorithm based on the discrete Fourier transform measurement matrix is derived theoretically and compared with the compressive computational ghost imaging algorithm based on a random measurement matrix. The reconstruction process and the reconstruction error are analyzed. On this basis, simulations are carried out to verify the theoretical analysis. When the number of sampling measurements is similar to the number of object pixels, the rank of the discrete Fourier transform matrix is the same as that of the random measurement matrix, the PSNRs of the images reconstructed by the FGI and PGI algorithms are similar, and the reconstruction error of the traditional CGI algorithm is lower than that of the FGI and PGI algorithms. As the number of sampling measurements decreases, the PSNR of the image reconstructed by the FGI algorithm decreases slowly, whereas the PSNR of the images reconstructed by the PGI and CGI algorithms decreases sharply. The reconstruction time of the FGI algorithm is lower than that of the other algorithms and is not affected by the number of sampling measurements. The FGI algorithm can effectively filter out random white noise through a low-pass filter, achieving denoising during reconstruction with a higher denoising capability than the CGI algorithm. The FGI algorithm can improve the reconstruction accuracy and the reconstruction speed of computational ghost imaging.
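The pseudo-inverse reconstruction step can be sketched in a few lines. The example below is a simplified illustration under stated assumptions: the object is a flattened one-dimensional vector, the measurement matrix is made of rows of the complex unitary DFT matrix (a physical setup would use real cosine patterns), and the function names are hypothetical rather than taken from the paper.

```python
import numpy as np

def dft_measurement_matrix(n_pixels, n_measurements):
    """First n_measurements rows of the unitary n_pixels-point DFT matrix."""
    F = np.fft.fft(np.eye(n_pixels)) / np.sqrt(n_pixels)
    return F[:n_measurements, :]

def reconstruct(A, y):
    """Pseudo-inverse reconstruction of a real-valued object from y = A x."""
    return np.real(np.linalg.pinv(A) @ y)

rng = np.random.default_rng(0)
x = rng.random(16)                         # flattened toy object
A_full = dft_measurement_matrix(16, 16)
x_full = reconstruct(A_full, A_full @ x)   # as many measurements as pixels
A_half = dft_measurement_matrix(16, 8)
x_half = reconstruct(A_half, A_half @ x)   # undersampled: first 8 DFT rows only
```

With as many measurements as pixels the DFT matrix is unitary and the object is recovered exactly; with fewer rows the pseudo-inverse returns the minimum-norm estimate consistent with the measurements.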
Variational optimization algorithms for uniform matrix product states

NASA Astrophysics Data System (ADS)

Zauner-Stauber, V.; Vanderstraeten, L.; Fishman, M. T.; Verstraete, F.; Haegeman, J.

2018-01-01

We combine the density matrix renormalization group (DMRG) with matrix product state tangent space concepts to construct a variational algorithm for finding ground states of one-dimensional quantum lattices in the thermodynamic limit. A careful comparison of this variational uniform matrix product state algorithm (VUMPS) with infinite density matrix renormalization group (IDMRG) and with infinite time evolving block decimation (ITEBD) reveals substantial gains in convergence speed and precision. We also demonstrate that VUMPS works very efficiently for Hamiltonians with long-range interactions and also for the simulation of two-dimensional models on infinite cylinders. The new algorithm can be conveniently implemented as an extension of an already existing DMRG implementation.
A Partitioning Algorithm for Block-Diagonal Matrices With Overlap

DOE Office of Scientific and Technical Information (OSTI.GOV)

Guy Antoine Atenekeng Kahou; Laura Grigori; Masha Sosonkina

2008-02-02

We present a graph partitioning algorithm that aims at partitioning a sparse matrix into a block-diagonal form, such that any two consecutive blocks overlap. We denote this form of the matrix as the overlapped block-diagonal matrix. The partitioned matrix is suitable for applying the explicit formulation of Multiplicative Schwarz preconditioner (EFMS) described in [3]. The graph partitioning algorithm partitions the graph of the input matrix into K partitions, such that every partition Ω_i has at most two neighbors Ω_{i-1} and Ω_{i+1}. First, an ordering algorithm, such as the reverse Cuthill-McKee algorithm, that reduces the matrix profile is performed. An initial overlapped block-diagonal partition is obtained from the profile of the matrix. An iterative strategy is then used to further refine the partitioning by allowing nodes to be transferred between neighboring partitions. Experiments are performed on matrices arising from real-world applications to show the feasibility and usefulness of this approach.
Acoustooptic linear algebra processors - Architectures, algorithms, and applications

NASA Technical Reports Server (NTRS)

Casasent, D.

1984-01-01

Architectures, algorithms, and applications for systolic processors are described with attention to the realization of parallel algorithms on various optical systolic array processors. Systolic processors for matrices with special structure and matrices of general structure, and the realization of matrix-vector, matrix-matrix, and triple-matrix products and such architectures are described. Parallel algorithms for direct and indirect solutions to systems of linear algebraic equations and their implementation on optical systolic processors are detailed with attention to the pipelining and flow of data and operations. Parallel algorithms and their optical realization for LU and QR matrix decomposition are specifically detailed. These represent the fundamental operations necessary in the implementation of least squares, eigenvalue, and SVD solutions. Specific applications (e.g., the solution of partial differential equations, adaptive noise cancellation, and optimal control) are described to typify the use of matrix processors in modern advanced signal processing.
The Role of miRNAs in the Progression of Prostate Cancer from Androgen-Dependent to Androgen-Independent Stages

DTIC Science & Technology

2012-09-01

regulated by miR-99a/let7c/125b-2 cluster. Using bioinformatic prediction algorithm TargetScan, we identified 7 genes that are commonly targeted by miR-99a...HPeak, a Hidden Markov Model (HMM)-based peak identifying algorithm (http://www.sph.umich.edu/csg/qin/HPeak/). Seven AR binding sites were reported by...and ARBS2 by ALGGEN- PROMO, a matrix algorithm for predicting transcription factor binding sites based on TRANSFAC (http://alggen.lsi.upc.es/cgi- bin
Matrix Interdiction Problem

NASA Astrophysics Data System (ADS)

Kasiviswanathan, Shiva Prasad; Pan, Feng

In the matrix interdiction problem, a real-valued matrix and an integer k are given. The objective is to remove a set of k matrix columns that minimizes in the residual matrix the sum of the row values, where the value of a row is defined to be the largest entry in that row. This combinatorial problem is closely related to the bipartite network interdiction problem, which can be applied to minimize the probability that an adversary can successfully smuggle weapons. After introducing the matrix interdiction problem, we study the computational complexity of this problem. We show that the matrix interdiction problem is NP-hard and that there exists a constant γ such that it is even NP-hard to approximate this problem within an n^γ additive factor. We also present an algorithm for this problem that achieves an (n - k) multiplicative approximation ratio.
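The objective is easy to state in code. The following brute-force sketch enumerates every set of k columns, so it is exponential and only meant to illustrate the problem definition on tiny inputs; it is not the approximation algorithm from the paper. The function names and the 3 × 3 example matrix are made up for illustration.

```python
from itertools import combinations

def residual_value(M, removed):
    """Sum over rows of the largest entry among the columns that remain."""
    keep = [j for j in range(len(M[0])) if j not in removed]
    return sum(max(row[j] for j in keep) for row in M)

def interdict_brute_force(M, k):
    """Try every set of k columns (k < number of columns) and keep the best."""
    n_cols = len(M[0])
    best = min(combinations(range(n_cols), k),
               key=lambda cols: residual_value(M, set(cols)))
    return set(best), residual_value(M, set(best))

M = [[5, 1, 2],
     [4, 6, 1],
     [3, 2, 2]]
cols, value = interdict_brute_force(M, 1)
print(cols, value)   # {0} 10
```

Removing column 0 leaves row maxima 2, 6 and 2, for a total of 10; removing column 1 or column 2 gives 12 or 14 instead.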
A matrix-algebraic formulation of distributed-memory maximal cardinality matching algorithms in bipartite graphs

DOE PAGES

Azad, Ariful; Buluç, Aydın

2016-05-16

We describe parallel algorithms for computing maximal cardinality matching in a bipartite graph on distributed-memory systems. Unlike traditional algorithms that match one vertex at a time, our algorithms process many unmatched vertices simultaneously using a matrix-algebraic formulation of maximal matching. This generic matrix-algebraic framework is used to develop three efficient maximal matching algorithms with minimal changes. The newly developed algorithms have two benefits over existing graph-based algorithms. First, unlike existing parallel algorithms, cardinality of matching obtained by the new algorithms stays constant with increasing processor counts, which is important for predictable and reproducible performance. Second, relying on bulk-synchronous matrix operations, these algorithms expose a higher degree of parallelism on distributed-memory platforms than existing graph-based algorithms. We report high-performance implementations of three maximal matching algorithms using hybrid OpenMP-MPI and evaluate the performance of these algorithms using more than 35 real and randomly generated graphs. On real instances, our algorithms achieve up to 200× speedup on 2048 cores of a Cray XC30 supercomputer. Even higher speedups are obtained on larger synthetically generated graphs where our algorithms show good scaling on up to 16,384 cores.
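For readers unfamiliar with the term, a maximal matching is one to which no further edge can be added; it need not have the largest possible cardinality. The sketch below is the traditional sequential greedy baseline that matches one vertex at a time, i.e. the style of algorithm the abstract contrasts itself with, not the matrix-algebraic method. The function name and the toy edge list are illustrative assumptions.

```python
def greedy_maximal_matching(edges):
    """edges: iterable of (row_vertex, col_vertex) pairs of a bipartite graph."""
    mate_row, mate_col = {}, {}
    for u, v in edges:
        # Accept an edge only if both of its endpoints are still unmatched.
        if u not in mate_row and v not in mate_col:
            mate_row[u] = v
            mate_col[v] = u
    return mate_row

edges = [(0, 0), (0, 1), (1, 0), (2, 1), (2, 2)]
matching = greedy_maximal_matching(edges)
print(matching)   # {0: 0, 2: 1}
```

The result has two edges and cannot be extended, although a maximum matching of this graph has three edges (0-1, 1-0, 2-2).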
A Deep Stochastic Model for Detecting Community in Complex Networks

NASA Astrophysics Data System (ADS)

Fu, Jingcheng; Wu, Jianliang

2017-01-01

Discovering community structures is an important step toward understanding the structure and dynamics of real-world networks in social science, biology and technology. In this paper, we develop a deep stochastic model based on non-negative matrix factorization to identify communities, in which there are two sets of parameters. One is the community membership matrix, in which the elements of a row are the probabilities that the given node belongs to each of the given number of communities in our model; the other is the community-community connection matrix, in which the element in the i-th row and j-th column represents the probability of there being an edge between a randomly chosen node from the i-th community and a randomly chosen node from the j-th community. The parameters can be estimated by an efficient updating rule whose convergence can be guaranteed. The community-community connection matrix in our model is more precise than the community-community connection matrix in traditional non-negative matrix factorization methods. Furthermore, the method called symmetric nonnegative matrix factorization is a special case of our model. Finally, experiments on both synthetic and real-world network data demonstrate that our algorithm is highly effective in detecting communities.
Preconditioned conjugate gradient wave-front reconstructors for multiconjugate adaptive optics.

PubMed

Gilles, Luc; Ellerbroek, Brent L; Vogel, Curtis R

2003-09-10

Multiconjugate adaptive optics (MCAO) systems with 10^4-10^5 degrees of freedom have been proposed for future giant telescopes. Using standard matrix methods to compute, optimize, and implement wavefront control algorithms for these systems is impractical, since the number of calculations required to compute and apply the reconstruction matrix scales respectively with the cube and the square of the number of adaptive optics degrees of freedom. We develop scalable open-loop iterative sparse matrix implementations of minimum variance wave-front reconstruction for telescope diameters up to 32 m with more than 10^4 actuators. The basic approach is the preconditioned conjugate gradient method with an efficient preconditioner, whose block structure is defined by the atmospheric turbulent layers very much like the layer-oriented MCAO algorithms of current interest. Two cost-effective preconditioners are investigated: a multigrid solver and a simpler block symmetric Gauss-Seidel (BSGS) sweep. Both options require off-line sparse Cholesky factorizations of the diagonal blocks of the matrix system. The cost to precompute these factors scales approximately as the three-halves power of the number of estimated phase grid points per atmospheric layer, and their average update rate is typically of the order of 10^-2 Hz, i.e., 4-5 orders of magnitude lower than the typical 10^3 Hz temporal sampling rate. All other computations scale almost linearly with the total number of estimated phase grid points. We present numerical simulation results to illustrate algorithm convergence. Convergence rates of both preconditioners are similar, regardless of measurement noise level, indicating that the layer-oriented BSGS sweep is as effective as the more elaborated multiresolution preconditioner.
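The combination of conjugate gradients with a block symmetric Gauss-Seidel (BSGS) preconditioner can be sketched on a small dense system. This is a toy illustration under stated assumptions, not the authors' sparse implementation: the matrix is a random symmetric positive definite matrix, the blocks are arbitrary index ranges standing in for atmospheric layers, and the names `bsgs_preconditioner` and `pcg` are hypothetical. The diagonal blocks are Cholesky-factored once up front, mirroring the off-line factorization mentioned in the abstract.

```python
import numpy as np

def bsgs_preconditioner(A, blocks):
    """Return a function r -> z applying one block symmetric Gauss-Seidel sweep."""
    # Off-line step: Cholesky-factor each diagonal block once.
    chol = [np.linalg.cholesky(A[b, b]) for b in blocks]

    def solve_block(i, rhs):
        L = chol[i]
        # Dense solves keep the sketch short; triangular solves would be used in practice.
        return np.linalg.solve(L.T, np.linalg.solve(L, rhs))

    def apply(r):
        z = np.zeros_like(r)
        order = list(range(len(blocks)))
        for i in order + order[::-1]:          # forward sweep, then backward sweep
            b = blocks[i]
            # Right-hand side of block row i using current z, diagonal block excluded.
            rhs = r[b] - A[b, :] @ z + A[b, b] @ z[b]
            z[b] = solve_block(i, rhs)
        return z
    return apply

def pcg(A, b, apply_M_inv, tol=1e-10, max_iter=200):
    """Preconditioned conjugate gradients for a symmetric positive definite A."""
    x = np.zeros_like(b)
    r = b - A @ x
    z = apply_M_inv(r)
    p = z.copy()
    rz = r @ z
    for k in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            return x, k + 1
        z = apply_M_inv(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, max_iter

rng = np.random.default_rng(0)
n = 12
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n)                    # toy SPD system
b = rng.standard_normal(n)
blocks = [slice(0, 4), slice(4, 8), slice(8, 12)]
x, iters = pcg(A, b, bsgs_preconditioner(A, blocks))
```

Starting the sweep from z = 0 makes the forward-plus-backward pass a symmetric operator, which is what conjugate gradients requires of a preconditioner.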

A Novel Image Compression Algorithm for High Resolution 3D Reconstruction

NASA Astrophysics Data System (ADS)

Siddeq, M. M.; Rodrigues, M. A.

2014-06-01

This research presents a novel algorithm to compress high-resolution images for accurate structured light 3D reconstruction. Structured light images contain a pattern of light and shadows projected on the surface of the object, which are captured by the sensor at very high resolutions. Our algorithm is concerned with compressing such images to a high degree with minimum loss without adversely affecting 3D reconstruction. The Compression Algorithm starts with a single level discrete wavelet transform (DWT) for decomposing an image into four sub-bands. The sub-band LL is transformed by DCT yielding a DC-matrix and an AC-matrix. The Minimize-Matrix-Size Algorithm is used to compress the AC-matrix while a DWT is applied again to the DC-matrix resulting in LL2, HL2, LH2 and HH2 sub-bands. The LL2 sub-band is transformed by DCT, while the Minimize-Matrix-Size Algorithm is applied to the other sub-bands. The proposed algorithm has been tested with images of different sizes within a 3D reconstruction scenario. The algorithm is demonstrated to be more effective than JPEG2000 and JPEG concerning higher compression rates with equivalent perceived quality and the ability to more accurately reconstruct the 3D models.
BCYCLIC: A parallel block tridiagonal matrix cyclic solver

NASA Astrophysics Data System (ADS)

Hirshman, S. P.; Perumalla, K. S.; Lynch, V. E.; Sanchez, R.

2010-09-01

A block tridiagonal matrix is factored with minimal fill-in using a cyclic reduction algorithm that is easily parallelized. Storage of the factored blocks allows the application of the inverse to multiple right-hand sides which may not be known at factorization time. Scalability with the number of block rows is achieved with cyclic reduction, while scalability with the block size is achieved using multithreaded routines (OpenMP, GotoBLAS) for block matrix manipulation. This dual scalability is a noteworthy feature of this new solver, as well as its ability to efficiently handle arbitrary (non-powers-of-2) block row and processor numbers. Comparison with a state-of-the-art parallel sparse solver is presented. It is expected that this new solver will allow many physical applications to optimally use the parallel resources on current supercomputers. Example usage of the solver in magneto-hydrodynamic (MHD), three-dimensional equilibrium solvers for high-temperature fusion plasmas is cited.
Online blind source separation using incremental nonnegative matrix factorization with volume constraint.

PubMed

Zhou, Guoxu; Yang, Zuyuan; Xie, Shengli; Yang, Jun-Mei

2011-04-01

Online blind source separation (BSS) is proposed to overcome the high computational cost problem, which limits the practical applications of traditional batch BSS algorithms. However, the existing online BSS methods are mainly used to separate independent or uncorrelated sources. Recently, nonnegative matrix factorization (NMF) has shown great potential for separating correlated sources, where some constraints are often imposed to overcome the non-uniqueness of the factorization. In this paper, an incremental NMF with volume constraint is derived and utilized for solving online BSS. The volume constraint on the mixing matrix enhances the identifiability of the sources, while the incremental learning mode reduces the computational cost. The proposed method takes advantage of the natural-gradient-based multiplicative updating rule, and it performs especially well in the recovery of dependent sources. Simulations in BSS for dual-energy X-ray images, online encrypted speech signals, and highly correlated face images show the validity of the proposed method.
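As background for the multiplicative updating rule, here is a minimal sketch of plain batch NMF with the standard Lee-Seung multiplicative updates for the Frobenius-norm objective. It deliberately omits both the volume constraint and the incremental (online) learning mode described in the abstract, so it is a baseline illustration rather than the proposed method. The function name `nmf`, the small constant `eps`, and the toy data are assumptions.

```python
import numpy as np

def nmf(V, rank, n_iter=500, eps=1e-12, seed=0):
    """Factor a nonnegative matrix V (m x n) as W @ H with W, H >= 0."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, rank)) + eps
    H = rng.random((rank, n)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep every entry nonnegative by construction.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

rng = np.random.default_rng(1)
V = rng.random((6, 2)) @ rng.random((2, 8))    # toy rank-2 nonnegative mixture
W, H = nmf(V, rank=2)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

In a source-separation reading, W plays the role of the mixing matrix and the rows of H the recovered sources; without extra constraints this factorization is not unique, which is the gap the volume constraint is meant to close.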
Polar decomposition for attitude determination from vector observations

NASA Technical Reports Server (NTRS)

Bar-Itzhack, Itzhack Y.

1993-01-01

This work treats the problem of weighted least squares fitting of a 3D Euclidean-coordinate transformation matrix to a set of unit vectors measured in the reference and transformed coordinates. A closed-form analytic solution to the problem is re-derived. The fact that the solution is the closest orthogonal matrix to some matrix defined on the measured vectors and their weights is clearly demonstrated. Several known algorithms for computing the analytic closed form solution are considered. An algorithm is discussed which is based on the polar decomposition of matrices into the closest unitary matrix to the decomposed matrix and a Hermitian matrix. A somewhat longer improved algorithm is suggested too. A comparison of several algorithms is carried out using simulated data as well as real data from the Upper Atmosphere Research Satellite. The comparison is based on accuracy and time consumption. It is concluded that the algorithms based on polar decomposition yield a simple although somewhat less accurate solution. The precision of the latter algorithms increases with the number of the measured vectors and with the accuracy of their measurement.
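The "closest orthogonal matrix" statement can be illustrated with a short sketch that computes the polar decomposition through the SVD. It assumes the matrix built from the measurements is B = Σ w_i b_i r_iᵀ, where r_i are reference-frame unit vectors and b_i the same vectors measured in the transformed frame; this notation, the function names, and the noise-free toy data are illustrative and not taken from the report. Computing the polar factor via the SVD is one of several possible routes and is not necessarily the algorithm compared there.

```python
import numpy as np

def polar_orthogonal_factor(B):
    """Polar decomposition B = Q P via the SVD; Q is the closest orthogonal matrix."""
    U, s, Vt = np.linalg.svd(B)
    Q = U @ Vt
    P = Vt.T @ np.diag(s) @ Vt     # symmetric positive semidefinite factor
    return Q, P

def fit_attitude(ref_vecs, body_vecs, weights):
    # B accumulates weighted outer products of measured and reference unit vectors.
    B = sum(w * np.outer(b, r) for w, b, r in zip(weights, body_vecs, ref_vecs))
    Q, _ = polar_orthogonal_factor(B)
    return Q

# Noise-free toy data: a rotation of 0.3 rad about the z axis.
rng = np.random.default_rng(0)
c, s = np.cos(0.3), np.sin(0.3)
R_true = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
refs = rng.standard_normal((5, 3))
refs /= np.linalg.norm(refs, axis=1, keepdims=True)
bodies = refs @ R_true.T           # each row is R_true applied to a reference vector
A_est = fit_attitude(refs, bodies, np.ones(5))
```

One caveat on this sketch: the orthogonal polar factor is a proper rotation only when det(B) > 0, which holds for the noise-free data above but may need an explicit check with noisy or nearly coplanar measurements.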
Quantum algorithm for support matrix machines

NASA Astrophysics Data System (ADS)

Duan, Bojia; Yuan, Jiabin; Liu, Ying; Li, Dan

2017-09-01

We propose a quantum algorithm for support matrix machines (SMMs) that efficiently addresses an image classification problem by introducing a least-squares reformulation. This algorithm consists of two core subroutines: a quantum matrix inversion (Harrow-Hassidim-Lloyd, HHL) algorithm and a quantum singular value thresholding (QSVT) algorithm. The two algorithms can be implemented on a universal quantum computer with complexity O[log(npq)] and O[log(pq)], respectively, where n is the number of the training data and p × q is the size of the feature space. By iterating the algorithms, we can find the parameters for the SMM classification model. Our analysis shows that both HHL and QSVT algorithms achieve an exponential increase of speed over their classical counterparts.
A class of parallel algorithms for computation of the manipulator inertia matrix

NASA Technical Reports Server (NTRS)

Fijany, Amir; Bejczy, Antal K.

1989-01-01

Parallel and parallel/pipeline algorithms for computation of the manipulator inertia matrix are presented. An algorithm based on composite rigid-body spatial inertia method, which provides better features for parallelization, is used for the computation of the inertia matrix. Two parallel algorithms are developed which achieve the time lower bound in computation. Also described is the mapping of these algorithms with topological variation on a two-dimensional processor array, with nearest-neighbor connection, and with cardinality variation on a linear processor array. An efficient parallel/pipeline algorithm for the linear array was also developed, but at significantly higher efficiency.
Nonlinear hyperspectral unmixing based on sparse non-negative matrix factorization

NASA Astrophysics Data System (ADS)

Li, Jing; Li, Xiaorun; Zhao, Liaoying

2016-01-01

Hyperspectral unmixing aims at extracting pure material spectra, accompanied by their corresponding proportions, from a mixed pixel. Because they model the distribution of real materials more accurately, nonlinear mixing models (non-LMMs) are usually considered to perform better than linear mixing models (LMMs) in complicated scenarios. In the past years, numerous nonlinear models have been successfully applied to hyperspectral unmixing. However, most non-LMMs consider only the sum-to-one constraint or the positivity constraint, while the widespread sparsity of real material mixtures is a factor that cannot be ignored. That is, for non-LMMs, a pixel is usually composed of only a few spectral signatures of different materials from the whole pure pixel set. Thus, in this paper, a smooth sparsity constraint is incorporated into the state-of-the-art Fan nonlinear model to exploit the sparsity feature in the nonlinear model and use it to enhance the unmixing performance. This sparsity-constrained Fan model is solved with non-negative matrix factorization. The algorithm was applied to synthetic and real hyperspectral data and showed advantages over competing algorithms in the experiments.
Peak picking NMR spectral data using non-negative matrix factorization

PubMed Central

2014-01-01

Background: Simple peak-picking algorithms, such as those based on lineshape fitting, perform well when peaks are completely resolved in multidimensional NMR spectra, but often produce wrong intensities and frequencies for overlapping peak clusters. For example, NOESY-type spectra have considerable overlaps leading to significant peak-picking intensity errors, which can result in erroneous structural restraints. Precise frequencies are critical for unambiguous resonance assignments. Results: To alleviate this problem, a more sophisticated peak decomposition algorithm, based on non-negative matrix factorization (NMF), was developed. We produce peak shapes from Fourier-transformed NMR spectra. Apart from its main goal of deriving components from spectra and producing peak lists automatically, the NMF approach can also be applied if the positions of some peaks are known a priori, e.g. from consistently referenced spectral dimensions of other experiments. Conclusions: Application of the NMF algorithm to a three-dimensional peak list of the 23 kDa bi-domain section of the RcsD protein (RcsD-ABL-HPt, residues 688-890) as well as to synthetic HSQC data shows that peaks can be picked accurately also in spectral regions with strong overlap. PMID:24511909
Laplace-domain waveform modeling and inversion for the 3D acoustic-elastic coupled media

NASA Astrophysics Data System (ADS)

Shin, Jungkyun; Shin, Changsoo; Calandra, Henri

2016-06-01

Laplace-domain waveform inversion reconstructs long-wavelength subsurface models by using the zero-frequency component of damped seismic signals. Despite the computational advantages of Laplace-domain waveform inversion over conventional frequency-domain waveform inversion, an acoustic assumption and an iterative matrix solver have been used to invert 3D marine datasets to mitigate the intensive computing cost. In this study, we develop a Laplace-domain waveform modeling and inversion algorithm for 3D acoustic-elastic coupled media by using a parallel sparse direct solver library (MUltifrontal Massively Parallel Solver, MUMPS). We precisely simulate a real marine environment by coupling the 3D acoustic and elastic wave equations with the proper boundary condition at the fluid-solid interface. In addition, we can extract the elastic properties of the Earth below the sea bottom from the recorded acoustic pressure datasets. As a matrix solver, the parallel sparse direct solver is used to factorize the non-symmetric impedance matrix in a distributed memory architecture and rapidly solve the wave field for a number of shots by using the lower and upper matrix factors. Using both synthetic datasets and real datasets obtained by a 3D wide azimuth survey, the long-wavelength component of the P-wave and S-wave velocity models is reconstructed and the proposed modeling and inversion algorithm are verified. A cluster of 80 CPU cores is used for this study.
Gene expression based mouse brain parcellation using Markov random field regularized non-negative matrix factorization

NASA Astrophysics Data System (ADS)

Pathak, Sayan D.; Haynor, David R.; Thompson, Carol L.; Lein, Ed; Hawrylycz, Michael

2009-02-01

Understanding the geography of genetic expression in the mouse brain has opened previously unexplored avenues in neuroinformatics. The Allen Brain Atlas (www.brain-map.org) (ABA) provides genome-wide colorimetric in situ hybridization (ISH) gene expression images at high spatial resolution, all mapped to a common three-dimensional 200 μm³ spatial framework defined by the Allen Reference Atlas (ARA) and is a unique data set for studying expression based structural and functional organization of the brain. The goal of this study was to facilitate an unbiased data-driven structural partitioning of the major structures in the mouse brain. We have developed an algorithm that uses nonnegative matrix factorization (NMF) to perform parts based analysis of ISH gene expression images. The standard NMF approach and its variants are limited in their ability to flexibly integrate prior knowledge, in the context of spatial data. In this paper, we introduce spatial connectivity as an additional regularization in NMF decomposition via the use of Markov Random Fields (mNMF). The mNMF algorithm alternates neighborhood updates with iterations of the standard NMF algorithm to exploit spatial correlations in the data. We present the algorithm and show the subdivisions of hippocampus and somatosensory cortex obtained via this approach. The results are compared with established neuroanatomic knowledge. We also highlight novel gene expression based subdivisions of the hippocampus identified by using the mNMF algorithm.
Fast Algorithms for Structured Least Squares and Total Least Squares Problems

PubMed Central

Kalsi, Anoop; O’Leary, Dianne P.

2006-01-01

We consider the problem of solving least squares problems involving a matrix M of small displacement rank with respect to two matrices Z_1 and Z_2. We develop formulas for the generators of the matrix M^H M in terms of the generators of M and show that the Cholesky factorization of the matrix M^H M can be computed quickly if Z_1 is close to unitary and Z_2 is triangular and nilpotent. These conditions are satisfied for several classes of matrices, including Toeplitz, block Toeplitz, Hankel, and block Hankel, and for matrices whose blocks have such structure. Fast Cholesky factorization enables fast solution of least squares problems, total least squares problems, and regularized total least squares problems involving these classes of matrices. PMID:27274922
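The link between Cholesky factorization and least squares mentioned in this record can be illustrated with a minimal sketch. It assumes a real, dense, full-column-rank matrix (so M^H M becomes M^T M) and uses an ordinary cubic-cost Cholesky factorization; it is not the fast displacement-rank algorithm of the paper, and the function names are hypothetical.

```python
import math

def cholesky(a):
    """Return lower-triangular L with a = L L^T (a must be symmetric positive definite)."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                l[i][j] = math.sqrt(s)
            else:
                l[i][j] = s / l[j][j]
    return l

def solve_least_squares(m, b):
    """Minimize ||m x - b||_2 via the normal equations (M^T M) x = M^T b."""
    rows, cols = len(m), len(m[0])
    # Form the Gram matrix M^T M and the right-hand side M^T b explicitly.
    gram = [[sum(m[k][i] * m[k][j] for k in range(rows)) for j in range(cols)]
            for i in range(cols)]
    rhs = [sum(m[k][i] * b[k] for k in range(rows)) for i in range(cols)]
    l = cholesky(gram)
    # Forward substitution: L y = rhs.
    y = [0.0] * cols
    for i in range(cols):
        y[i] = (rhs[i] - sum(l[i][k] * y[k] for k in range(i))) / l[i][i]
    # Back substitution: L^T x = y.
    x = [0.0] * cols
    for i in reversed(range(cols)):
        x[i] = (y[i] - sum(l[k][i] * x[k] for k in range(i + 1, cols))) / l[i][i]
    return x

# Illustrative Toeplitz-structured example (values chosen only for the demo).
M = [[1.0, 0.0], [2.0, 1.0], [3.0, 2.0]]
b = [1.0, 1.0, 1.0]
x = solve_least_squares(M, b)
```

Forming M^T M squares the condition number, so this route is usually reserved for well-conditioned problems or, as in the record above, for cases where structure makes the factorization cheap.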
Fast Algorithms for Structured Least Squares and Total Least Squares Problems.

PubMed

Kalsi, Anoop; O'Leary, Dianne P

2006-01-01

We consider the problem of solving least squares problems involving a matrix M of small displacement rank with respect to two matrices Z_1 and Z_2. We develop formulas for the generators of the matrix M^H M in terms of the generators of M and show that the Cholesky factorization of the matrix M^H M can be computed quickly if Z_1 is close to unitary and Z_2 is triangular and nilpotent. These conditions are satisfied for several classes of matrices, including Toeplitz, block Toeplitz, Hankel, and block Hankel, and for matrices whose blocks have such structure. Fast Cholesky factorization enables fast solution of least squares problems, total least squares problems, and regularized total least squares problems involving these classes of matrices.
Near-optimal matrix recovery from random linear measurements.

PubMed

Romanov, Elad; Gavish, Matan

2018-06-25

In matrix recovery from random linear measurements, one is interested in recovering an unknown M-by-N matrix [Formula: see text] from [Formula: see text] measurements [Formula: see text], where each [Formula: see text] is an M-by-N measurement matrix with i.i.d. random entries, [Formula: see text] We present a matrix recovery algorithm, based on approximate message passing, which iteratively applies an optimal singular-value shrinker, a nonconvex nonlinearity tailored specifically for matrix estimation. Our algorithm typically converges exponentially fast, offering a significant speedup over previously suggested matrix recovery algorithms, such as iterative solvers for nuclear norm minimization (NNM). It is well known that there is a recovery tradeoff between the information content of the object [Formula: see text] to be recovered (specifically, its matrix rank r) and the number of linear measurements n from which recovery is to be attempted. The precise tradeoff between r and n, beyond which recovery by a given algorithm becomes possible, traces the so-called phase transition curve of that algorithm in the [Formula: see text] plane. The phase transition curve of our algorithm is noticeably better than that of NNM. Interestingly, it is close to the information-theoretic lower bound for the minimal number of measurements needed for matrix recovery, making it not only state of the art in terms of convergence rate, but also near optimal in terms of the matrices it successfully recovers. Copyright © 2018 the Author(s). Published by PNAS.
An algorithm for propagating the square-root covariance matrix in triangular form

NASA Technical Reports Server (NTRS)

Tapley, B. D.; Choe, C. Y.

1976-01-01

A method for propagating the square root of the state error covariance matrix in lower triangular form is described. The algorithm can be combined with any triangular square-root measurement update algorithm to obtain a triangular square-root sequential estimation algorithm. The triangular square-root algorithm compares favorably with the conventional sequential estimation algorithm with regard to computation time.
Using video-oriented instructions to speed up sequence comparison.

PubMed

Wozniak, A

1997-04-01

This document presents an implementation of the well-known Smith-Waterman algorithm for comparison of proteic and nucleic sequences, using specialized video instructions. These instructions, SIMD-like in their design, make possible parallelization of the algorithm at the instruction level. Benchmarks on an ULTRA SPARC running at 167 MHz show a speed-up factor of two compared to the same algorithm implemented with integer instructions on the same machine. Performance reaches over 18 million matrix cells per second on a single processor, giving to our knowledge the fastest implementation of the Smith-Waterman algorithm on a workstation. The accelerated procedure was introduced in LASSAP--a LArge Scale Sequence compArison Package software developed at INRIA--which handles parallelism at higher level. On a SUN Enterprise 6000 server with 12 processors, a speed of nearly 200 million matrix cells per second has been obtained. A sequence of length 300 amino acids is scanned against SWISSPROT R33 (18,531,385 residues) in 29 s. This procedure is not restricted to databank scanning. It applies to all cases handled by LASSAP (intra- and inter-bank comparisons, Z-score computation, etc.).
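The "matrix cells" counted in this record are the entries of the Smith-Waterman dynamic-programming table. A minimal scalar sketch of that recurrence follows. It assumes a linear gap penalty and simple match/mismatch scores chosen only for illustration; the paper's implementation uses SIMD-like video instructions and presumably a substitution matrix, neither of which is reproduced here.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of sequences a and b (linear gap penalty).

    The scoring parameters are illustrative assumptions, not values from the paper.
    Only two rows of the table are kept, so memory is O(len(b)).
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # Each cell takes the best of: restart (0), diagonal step, or a gap.
            curr[j] = max(0,
                          prev[j - 1] + sub,
                          prev[j] + gap,
                          curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best

score = smith_waterman_score("TTACGTTT", "GGACGGG")
```

Every cell depends only on its left, upper, and upper-left neighbours, which is what lets several cells be packed into one wide register and updated together.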
Generalized sidelobe canceller beamforming method for ultrasound imaging.

PubMed

Wang, Ping; Li, Na; Luo, Han-Wu; Zhu, Yong-Kun; Cui, Shi-Gang

2017-03-01

A modified generalized sidelobe canceller (IGSC) algorithm is proposed to enhance the resolution and robustness against the noise of the traditional generalized sidelobe canceller (GSC) and coherence factor combined method (GSC-CF). In the GSC algorithm, the weighting vector is divided into adaptive and non-adaptive parts, while the non-adaptive part does not block all the desired signal. A modified steer vector of the IGSC algorithm is generated by the projection of the non-adaptive vector on the signal space constructed by the covariance matrix of received data. The blocking matrix is generated based on the orthogonal complementary space of the modified steer vector and the weighting vector is updated subsequently. The performance of IGSC was investigated by simulations and experiments. In simulations, IGSC outperformed GSC-CF in spatial resolution by 0.1 mm, regardless of whether noise was present, and also in contrast ratio. The proposed IGSC can be further improved by combining with CF. The experimental results also validated the effectiveness of the proposed algorithm with a dataset provided by the University of Michigan.
THz spectral data analysis and components unmixing based on non-negative matrix factorization methods

NASA Astrophysics Data System (ADS)

Ma, Yehao; Li, Xian; Huang, Pingjie; Hou, Dibo; Wang, Qiang; Zhang, Guangxin

2017-04-01

In many situations the THz spectroscopic data observed from complex samples represent the integrated result of several interrelated variables or feature components acting together. The actual information contained in the original data might be overlapping and there is a necessity to investigate various approaches for model reduction and data unmixing. The development and use of low-rank approximate nonnegative matrix factorization (NMF) and smooth constraint NMF (CNMF) algorithms for feature components extraction and identification in the fields of terahertz time domain spectroscopy (THz-TDS) data analysis are presented. The evolution and convergence properties of NMF and CNMF methods based on sparseness, independence and smoothness constraints for the resulting nonnegative matrix factors are discussed. For general NMF, its cost function is nonconvex and the result is usually susceptible to initialization and noise corruption, and may fall into local minima and lead to unstable decomposition. To reduce these drawbacks, smoothness constraint is introduced to enhance the performance of NMF. The proposed algorithms are evaluated by several THz-TDS data decomposition experiments including a binary system and a ternary system simulating some applications such as medicine tablet inspection. Results show that CNMF is more capable of finding optimal solutions and more robust for random initialization in contrast to NMF. The investigated method is promising for THz data resolution contributing to unknown mixture identification.
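The sensitivity of plain NMF to initialization noted in this record is easier to see with a concrete iteration in hand. The sketch below uses the classical multiplicative updates for the Frobenius-norm cost, which is an assumption made for illustration; it is not the constrained CNMF variant of the paper, and the function names, seed handling, and test matrix are hypothetical.

```python
import random

def matmul(a, b):
    """Dense matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def frobenius_error(v, w, h):
    """Frobenius norm of v - w h."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0]))) ** 0.5

def nmf(v, rank, iterations=200, seed=0, eps=1e-12):
    """Factor a nonnegative matrix v (n x m) as w (n x rank) times h (rank x m)."""
    rng = random.Random(seed)  # the result depends on this random start
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # h <- h * (w^T v) / (w^T w h), elementwise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # w <- w * (v h^T) / (w h h^T), elementwise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h

V = [[1.0, 2.0, 0.0], [0.0, 1.0, 3.0], [1.0, 3.0, 3.0]]
W, H = nmf(V, rank=2)
residual = frobenius_error(V, W, H)
```

Running `nmf` with different `seed` values can give different factors, because the cost is nonconvex; added constraints such as smoothness are one way of reducing that variability.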
THz spectral data analysis and components unmixing based on non-negative matrix factorization methods.

PubMed

Ma, Yehao; Li, Xian; Huang, Pingjie; Hou, Dibo; Wang, Qiang; Zhang, Guangxin

2017-04-15

In many situations the THz spectroscopic data observed from complex samples represent the integrated result of several interrelated variables or feature components acting together. The actual information contained in the original data might be overlapping and there is a necessity to investigate various approaches for model reduction and data unmixing. The development and use of low-rank approximate nonnegative matrix factorization (NMF) and smooth constraint NMF (CNMF) algorithms for feature components extraction and identification in the fields of terahertz time domain spectroscopy (THz-TDS) data analysis are presented. The evolution and convergence properties of NMF and CNMF methods based on sparseness, independence and smoothness constraints for the resulting nonnegative matrix factors are discussed. For general NMF, its cost function is nonconvex and the result is usually susceptible to initialization and noise corruption, and may fall into local minima and lead to unstable decomposition. To reduce these drawbacks, smoothness constraint is introduced to enhance the performance of NMF. The proposed algorithms are evaluated by several THz-TDS data decomposition experiments including a binary system and a ternary system simulating some applications such as medicine tablet inspection. Results show that CNMF is more capable of finding optimal solutions and more robust for random initialization in contrast to NMF. The investigated method is promising for THz data resolution contributing to unknown mixture identification. Copyright © 2017 Elsevier B.V. All rights reserved.
Multi-Target Angle Tracking Algorithm for Bistatic MIMO Radar Based on the Elements of the Covariance Matrix

PubMed Central

Zhang, Zhengyan; Zhang, Jianyun; Zhou, Qingsong; Li, Xiaobo

2018-01-01

In this paper, we consider the problem of tracking the directions of arrival (DOA) and the directions of departure (DOD) of multiple targets for bistatic multiple-input multiple-output (MIMO) radar. A high-precision tracking algorithm for target angle is proposed. First, the linear relationship between the covariance matrix difference and the angle difference of the adjacent moment was obtained through three approximate relations. Then, the proposed algorithm obtained the relationship between the elements in the covariance matrix difference. On this basis, the performance of the algorithm was improved by averaging the covariance matrix elements. Finally, the least squares method was used to estimate the DOD and DOA. The algorithm realized the automatic correlation of the angle and provided better performance when compared with the adaptive asymmetric joint diagonalization (AAJD) algorithm. The simulation results demonstrated the effectiveness of the proposed algorithm. The algorithm provides the technical support for the practical application of MIMO radar. PMID:29518957
Multi-Target Angle Tracking Algorithm for Bistatic Multiple-Input Multiple-Output (MIMO) Radar Based on the Elements of the Covariance Matrix.

PubMed

Zhang, Zhengyan; Zhang, Jianyun; Zhou, Qingsong; Li, Xiaobo

2018-03-07

In this paper, we consider the problem of tracking the directions of arrival (DOA) and the directions of departure (DOD) of multiple targets for bistatic multiple-input multiple-output (MIMO) radar. A high-precision tracking algorithm for target angle is proposed. First, the linear relationship between the covariance matrix difference and the angle difference of the adjacent moment was obtained through three approximate relations. Then, the proposed algorithm obtained the relationship between the elements in the covariance matrix difference. On this basis, the performance of the algorithm was improved by averaging the covariance matrix elements. Finally, the least squares method was used to estimate the DOD and DOA. The algorithm realized the automatic correlation of the angle and provided better performance when compared with the adaptive asymmetric joint diagonalization (AAJD) algorithm. The simulation results demonstrated the effectiveness of the proposed algorithm. The algorithm provides the technical support for the practical application of MIMO radar.

Performance analysis of structured gradient algorithm. [for adaptive beamforming linear arrays]

NASA Technical Reports Server (NTRS)

Godara, Lal C.

1990-01-01

The structured gradient algorithm uses a structured estimate of the array correlation matrix (ACM) to estimate the gradient required for the constrained least-mean-square (LMS) algorithm. This structure reflects the structure of the exact array correlation matrix for an equispaced linear array and is obtained by spatial averaging of the elements of the noisy correlation matrix. In its standard form the LMS algorithm does not exploit the structure of the array correlation matrix. The gradient is estimated by multiplying the array output with the receiver outputs. An analysis of the two algorithms is presented to show that the covariance of the gradient estimated by the structured method is less sensitive to the look direction signal than that estimated by the standard method. The effect of the number of elements on the signal sensitivity of the two algorithms is studied.
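The spatial averaging described in this record can be sketched as follows, under the assumption that the structure being imposed is Hermitian Toeplitz (equal entries along each diagonal), which is the form the exact correlation matrix takes for an equispaced linear array with uncorrelated sources. The function name and the sample values are hypothetical, and the sketch covers only the structured estimate, not the LMS weight update.

```python
def toeplitz_average(r):
    """Replace each diagonal of a square matrix by its mean, keeping Hermitian symmetry.

    r is a noisy correlation-matrix estimate given as a list of rows.
    Only the upper triangle of r is read; the lower triangle of the result
    is filled with complex conjugates.
    """
    n = len(r)
    out = [[0j] * n for _ in range(n)]
    for lag in range(n):
        # Mean of the entries r[i][i + lag] along the lag-th superdiagonal.
        avg = sum(r[i][i + lag] for i in range(n - lag)) / (n - lag)
        for i in range(n - lag):
            out[i][i + lag] = avg
            out[i + lag][i] = avg.conjugate()
    return out

noisy = [[2, 1 + 1j, 0],
         [1 - 1j, 4, 3 + 1j],
         [0, 3 - 1j, 6]]
structured = toeplitz_average(noisy)
```

Averaging along diagonals pools several noisy estimates of the same lag, which is one intuition for why a gradient built from the structured estimate can have lower variance.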
scarlet: Source separation in multi-band images by Constrained Matrix Factorization

NASA Astrophysics Data System (ADS)

Melchior, Peter; Moolekamp, Fred; Jerdee, Maximilian; Armstrong, Robert; Sun, Ai-Lei; Bosch, James; Lupton, Robert

2018-03-01

SCARLET performs source separation (aka "deblending") on multi-band images. It is geared towards optical astronomy, where scenes are composed of stars and galaxies, but it is straightforward to apply it to other imaging data. Separation is achieved through a constrained matrix factorization, which models each source with a Spectral Energy Distribution (SED) and a non-parametric morphology, or multiple such components per source. The code performs forced photometry (with PSF matching if needed) using an optimal weight function given by the signal-to-noise weighted morphology across bands. The approach works well if the sources in the scene have different colors and can be further strengthened by imposing various additional constraints/priors on each source. Because of its generic utility, this package provides a stand-alone implementation that contains the core components of the source separation algorithm. However, the development of this package is part of the LSST Science Pipeline; the meas_deblender package contains a wrapper to implement the algorithms here for the LSST stack.
Recommendation based on trust diffusion model.

PubMed

Yuan, Jinfeng; Li, Li

2014-01-01

Recommender system is emerging as a powerful and popular tool for online information relevant to a given user. The traditional recommendation system suffers from the cold start problem and the data sparsity problem. Many methods have been proposed to solve these problems, but few can achieve satisfactory efficiency. In this paper, we present a method which combines the trust diffusion (DiffTrust) algorithm and the probabilistic matrix factorization (PMF). DiffTrust is first used to study the possible diffusions of trust between various users. It is able to make use of the implicit relationship of the trust network, thus alleviating the data sparsity problem. The probabilistic matrix factorization (PMF) is then employed to combine the users' tastes with their trusted friends' interests. We evaluate the algorithm on Flixster, Moviedata, and Epinions datasets, respectively. The experimental results show that the recommendation based on our proposed DiffTrust + PMF model achieves high performance in terms of the root mean square error (RMSE), Recall, and F Measure.
Recommendation Based on Trust Diffusion Model

PubMed Central

Li, Li

2014-01-01

Recommender system is emerging as a powerful and popular tool for online information relevant to a given user. The traditional recommendation system suffers from the cold start problem and the data sparsity problem. Many methods have been proposed to solve these problems, but few can achieve satisfactory efficiency. In this paper, we present a method which combines the trust diffusion (DiffTrust) algorithm and the probabilistic matrix factorization (PMF). DiffTrust is first used to study the possible diffusions of trust between various users. It is able to make use of the implicit relationship of the trust network, thus alleviating the data sparsity problem. The probabilistic matrix factorization (PMF) is then employed to combine the users' tastes with their trusted friends' interests. We evaluate the algorithm on Flixster, Moviedata, and Epinions datasets, respectively. The experimental results show that the recommendation based on our proposed DiffTrust + PMF model achieves high performance in terms of the root mean square error (RMSE), Recall, and F Measure. PMID:25009827
Majorization as a Tool for Optimizing a Class of Matrix Functions.

ERIC Educational Resources Information Center

Kiers, Henk A.

1990-01-01

General algorithms are presented that can be used for optimizing matrix trace functions subject to certain constraints on the parameters. The parameter set that minimizes the majorizing function also decreases the matrix trace function, providing a monotonically convergent algorithm for minimizing the matrix trace function iteratively. (SLD)
Efficient conjugate gradient algorithms for computation of the manipulator forward dynamics

NASA Technical Reports Server (NTRS)

Fijany, Amir; Scheid, Robert E.

1989-01-01

The applicability of conjugate gradient algorithms for computation of the manipulator forward dynamics is investigated. The redundancies in the previously proposed conjugate gradient algorithm are analyzed. A new version is developed which, by avoiding these redundancies, achieves a significantly greater efficiency. A preconditioned conjugate gradient algorithm is also presented. A diagonal matrix whose elements are the diagonal elements of the inertia matrix is proposed as the preconditioner. In order to increase the computational efficiency, an algorithm is developed which exploits the synergism between the computation of the diagonal elements of the inertia matrix and that required by the conjugate gradient algorithm.
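A minimal sketch of a conjugate gradient solver with the diagonal preconditioner named in this record follows. It assumes the inertia matrix is available as an explicit dense symmetric positive definite matrix, which is a simplification made for illustration: the algorithms in the paper avoid forming it that way. The function names and the 3-by-3 example are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(a, x):
    return [dot(row, x) for row in a]

def pcg_jacobi(a, b, tol=1e-10, max_iter=None):
    """Solve a x = b for symmetric positive definite a, preconditioned by diag(a)."""
    n = len(b)
    if max_iter is None:
        max_iter = 10 * n
    inv_diag = [1.0 / a[i][i] for i in range(n)]  # inverse of the preconditioner
    x = [0.0] * n
    r = list(b)                                   # residual b - a x for x = 0
    z = [inv_diag[i] * r[i] for i in range(n)]    # preconditioned residual
    p = list(z)
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 <= tol:
            break
        ap = matvec(a, p)
        alpha = rz / dot(p, ap)
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * ap[i] for i in range(n)]
        z = [inv_diag[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [z[i] + beta * p[i] for i in range(n)]
        rz = rz_new
    return x

# Stand-in for an inertia matrix and a generalized-force vector (demo values only).
A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
tau = [6.0, 10.0, 8.0]
qdd = pcg_jacobi(A, tau)
```

The preconditioner costs one division per unknown per iteration, so it pays off whenever the diagonal entries come nearly for free from work the solver already does.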
Algorithms for computing solvents of unilateral second-order matrix polynomials over prime finite fields using lambda-matrices

NASA Astrophysics Data System (ADS)

Burtyka, Filipp

2018-01-01

The paper considers algorithms for finding diagonalizable and non-diagonalizable roots (so-called solvents) of an arbitrary monic unilateral second-order matrix polynomial over a prime finite field. These algorithms are based on polynomial matrices (lambda-matrices). This is an extension of existing general methods for computing solvents of matrix polynomials over the field of complex numbers. We analyze how techniques for complex numbers can be adapted to finite fields and estimate the asymptotic complexity of the resulting algorithms.
Progress on a Taylor weak statement finite element algorithm for high-speed aerodynamic flows

NASA Technical Reports Server (NTRS)

Baker, A. J.; Freels, J. D.

1989-01-01

A new finite element numerical Computational Fluid Dynamics (CFD) algorithm has matured to the point of efficiently solving two-dimensional high speed real-gas compressible flow problems in generalized coordinates on modern vector computer systems. The algorithm employs a Taylor Weak Statement classical Galerkin formulation, a variably implicit Newton iteration, and a tensor matrix product factorization of the linear algebra Jacobian under a generalized coordinate transformation. Allowing for a general two-dimensional conservation law system, the algorithm has been exercised on the Euler and laminar forms of the Navier-Stokes equations. Real-gas fluid properties are admitted, and numerical results verify solution accuracy, efficiency, and stability over a range of test problem parameters.
Multi scales based sparse matrix spectral clustering image segmentation

NASA Astrophysics Data System (ADS)

Liu, Zhongmin; Chen, Zhicai; Li, Zhanming; Hu, Wenjin

2018-04-01

In image segmentation, spectral clustering algorithms have to adopt an appropriate scaling parameter to calculate the similarity matrix between the pixels, which may have a great impact on the clustering result. Moreover, when the number of data instances is large, the computational complexity and memory use of the algorithm increase greatly. To solve these two problems, we propose a new spectral clustering image segmentation algorithm based on multiple scales and a sparse matrix. We first devise a new feature extraction method, then extract image features at different scales, and finally use the feature information to construct a sparse similarity matrix, which improves computational efficiency. Image segmentation experiments show that, compared with the traditional spectral clustering algorithm, our algorithm has better accuracy and robustness.
Lanczos eigensolution method for high-performance computers

NASA Technical Reports Server (NTRS)

Bostic, Susan W.

1991-01-01

The theory, computational analysis, and applications are presented of a Lanczos algorithm on high performance computers. The computationally intensive steps of the algorithm are identified as: the matrix factorization, the forward/backward equation solution, and the matrix-vector multiplies. These computational steps are optimized to exploit the vector and parallel capabilities of high performance computers. The savings in computational time from applying optimization techniques such as: variable band and sparse data storage and access, loop unrolling, use of local memory, and compiler directives are presented. Two large scale structural analysis applications are described: the buckling of a composite blade stiffened panel with a cutout, and the vibration analysis of a high speed civil transport. The sequential computational time for the panel problem executed on a CONVEX computer of 181.6 seconds was decreased to 14.1 seconds with the optimized vector algorithm. The best computational time of 23 seconds for the transport problem with 17,000 degrees of freedom was on the Cray-YMP using an average of 3.63 processors.
Triangular covariance factorizations for. Ph.D. Thesis. - Calif. Univ.

NASA Technical Reports Server (NTRS)

Thornton, C. L.

1976-01-01

An improved computational form of the discrete Kalman filter is derived using an upper triangular factorization of the error covariance matrix. The covariance P is factored such that P = UDU^T where U is unit upper triangular and D is diagonal. Recursions are developed for propagating the U-D covariance factors together with the corresponding state estimate. The resulting algorithm, referred to as the U-D filter, combines the superior numerical precision of square root filtering techniques with an efficiency comparable to that of Kalman's original formula. Moreover, this method is easily implemented and involves no more computer storage than the Kalman algorithm. These characteristics make the U-D method an attractive realtime filtering technique. A new covariance error analysis technique is obtained from an extension of the U-D filter equations. This evaluation method is flexible and efficient and may provide significantly improved numerical results. Cost comparisons show that for a large class of problems the U-D evaluation algorithm is noticeably less expensive than conventional error analysis methods.
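The factorization at the heart of this record, P = U D U^T with U unit upper triangular and D diagonal, can be computed column by column from the last column backwards. The sketch below shows only that factorization step for a symmetric positive definite P; it does not include the thesis's time or measurement update recursions, and the function names and example matrix are hypothetical.

```python
def ud_factorize(p):
    """Return (u, d) with p = u * diag(d) * u^T, u unit upper triangular.

    p must be symmetric positive definite; only its upper triangle is read.
    """
    n = len(p)
    a = [row[:] for row in p]  # working copy, reduced as columns are peeled off
    u = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    d = [0.0] * n
    for j in range(n - 1, -1, -1):
        d[j] = a[j][j]
        if d[j] <= 0.0:
            raise ValueError("matrix is not positive definite")
        for i in range(j):
            u[i][j] = a[i][j] / d[j]
        # Remove column j's contribution from the remaining upper-left block.
        for i in range(j):
            for k in range(i + 1):
                a[k][i] -= u[k][j] * d[j] * u[i][j]
    return u, d

def ud_reconstruct(u, d):
    """Multiply the factors back together (for checking)."""
    n = len(d)
    return [[sum(u[i][k] * d[k] * u[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]

P = [[4.0, 2.0, 0.6], [2.0, 5.0, 1.0], [0.6, 1.0, 3.0]]
U, D = ud_factorize(P)
P_check = ud_reconstruct(U, D)
```

No square roots are taken, which is one reason the U-D form can match the cost of the conventional covariance recursion while keeping the numerical benefits of a factored representation.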
XDATA

DTIC Science & Technology

2017-05-01

Parallelizing PINT The main focus of our research into the parallelization of the PINT algorithm has been to find appropriately scalable matrix math algorithms...leading eigenvector of the adjacency matrix of the pairwise affinity graph. We reviewed the matrix math implementation currently being used in PINT and...the new versions support a feature called matrix.distributed, which is some level of support for distributed matrix math ; however our code is not
Spatial operator approach to flexible multibody system dynamics and control

NASA Technical Reports Server (NTRS)

Rodriguez, G.

1991-01-01

The inverse and forward dynamics problems for flexible multibody systems were solved using the techniques of spatially recursive Kalman filtering and smoothing. These algorithms are easily developed using a set of identities associated with mass matrix factorization and inversion. These identities are easily derived using the spatial operator algebra developed by the author. Current work is aimed at computational experiments with the described algorithms and at modelling for control design of limber manipulator systems. It is also aimed at handling and manipulation of flexible objects.
An Improved DOA Estimation Approach Using Coarray Interpolation and Matrix Denoising

PubMed Central

Guo, Muran; Chen, Tao; Wang, Ben

2017-01-01

Co-prime arrays can estimate the directions of arrival (DOAs) of O(MN) sources with O(M+N) sensors, and are convenient to analyze due to their closed-form expression for the locations of virtual lags. However, the number of degrees of freedom is limited due to the existence of holes in difference coarrays if subspace-based algorithms such as the spatial smoothing multiple signal classification (MUSIC) algorithm are utilized. To address this issue, techniques such as positive definite Toeplitz completion and array interpolation have been proposed in the literature. Another factor that compromises the accuracy of DOA estimation is the limitation of the number of snapshots. Coarray-based processing is particularly sensitive to the discrepancy between the sample covariance matrix and the ideal covariance matrix due to the finite number of snapshots. In this paper, coarray interpolation based on matrix completion (MC) followed by a denoising operation is proposed to detect more sources with a higher accuracy. The effectiveness of the proposed method is based on the capability of MC to fill in holes in the virtual sensors and that of MC denoising operation to reduce the perturbation in the sample covariance matrix. The results of numerical simulations verify the superiority of the proposed approach. PMID:28509886
An Improved DOA Estimation Approach Using Coarray Interpolation and Matrix Denoising.

PubMed

Guo, Muran; Chen, Tao; Wang, Ben

2017-05-16

Co-prime arrays can estimate the directions of arrival (DOAs) of O(MN) sources with O(M+N) sensors, and are convenient to analyze due to their closed-form expression for the locations of virtual lags. However, the number of degrees of freedom is limited due to the existence of holes in difference coarrays if subspace-based algorithms such as the spatial smoothing multiple signal classification (MUSIC) algorithm are utilized. To address this issue, techniques such as positive definite Toeplitz completion and array interpolation have been proposed in the literature. Another factor that compromises the accuracy of DOA estimation is the limitation of the number of snapshots. Coarray-based processing is particularly sensitive to the discrepancy between the sample covariance matrix and the ideal covariance matrix due to the finite number of snapshots. In this paper, coarray interpolation based on matrix completion (MC) followed by a denoising operation is proposed to detect more sources with a higher accuracy. The effectiveness of the proposed method is based on the capability of MC to fill in holes in the virtual sensors and that of MC denoising operation to reduce the perturbation in the sample covariance matrix. The results of numerical simulations verify the superiority of the proposed approach.
Clustering Tree-structured Data on Manifold

PubMed Central

Lu, Na; Miao, Hongyu

2016-01-01

Tree-structured data usually contain both topological and geometrical information, and are necessarily considered on a manifold instead of Euclidean space for appropriate data parameterization and analysis. In this study, we propose a novel tree-structured data parameterization, called the Topology-Attribute matrix (T-A matrix), so that the data clustering task can be conducted on a matrix manifold. We incorporate the structure constraints embedded in the data into the non-negative matrix factorization method to determine meta-trees from the T-A matrix, and the signature vector of each single tree can then be extracted by meta-tree decomposition. The meta-tree space turns out to be a cone space, in which we explore the distance metric and implement the clustering algorithm based on concepts such as the Fréchet mean. Finally, the T-A matrix based clustering (TAMBAC) framework is evaluated and compared using both simulated data and real retinal images to illustrate its efficiency and accuracy. PMID:26660696
Comparison between iterative wavefront control algorithm and direct gradient wavefront control algorithm for adaptive optics system

NASA Astrophysics Data System (ADS)

Cheng, Sheng-Yi; Liu, Wen-Jin; Chen, Shan-Qiu; Dong, Li-Zhi; Yang, Ping; Xu, Bing

2015-08-01

Among the wavefront control algorithms used in adaptive optics systems, the direct gradient wavefront control algorithm is the most widespread. It obtains the actuator voltages directly from the wavefront slopes by pre-measuring the relational matrix between the deformable mirror actuators and the Hartmann wavefront sensor, and it offers good real-time performance and stability. However, as the number of sub-apertures in the wavefront sensor and the number of deformable mirror actuators increase, the matrix operation in the direct gradient algorithm takes too much time, which becomes a major factor limiting the control performance of adaptive optics systems. In this paper we apply an iterative wavefront control algorithm to high-resolution adaptive optics systems, in which the voltage of each actuator is obtained through iteration, which offers a great advantage in computation and storage. For an AO system with thousands of actuators, the estimated computational complexity is about O(n^2) to O(n^3) for the direct gradient wavefront control algorithm, while that of the iterative wavefront control algorithm is about O(n) to O(n^(3/2)), where n is the number of actuators of the AO system. The larger the numbers of sub-apertures and deformable mirror actuators, the more significant the advantage of the iterative wavefront control algorithm. Project supported by the National Key Scientific and Research Equipment Development Project of China (Grant No. ZDYZ2013-2), the National Natural Science Foundation of China (Grant No. 11173008), and the Sichuan Provincial Outstanding Youth Academic Technology Leaders Program, China (Grant No. 2012JQ0012).
Matrix preconditioning: a robust operation for optical linear algebra processors.

PubMed

Ghosh, A; Paparao, P

1987-07-15

Analog electrooptical processors are best suited for applications demanding high computational throughput with tolerance for inaccuracies. Matrix preconditioning is one such application. Matrix preconditioning is a preprocessing step for reducing the condition number of a matrix and is used extensively with gradient algorithms for increasing the rate of convergence and improving the accuracy of the solution. In this paper, we describe a simple parallel algorithm for matrix preconditioning, which can be implemented efficiently on a pipelined optical linear algebra processor. From the results of our numerical experiments we show that the efficacy of the preconditioning algorithm is affected very little by the errors of the optical system.
Constructing the tree-level Yang-Mills S-matrix using complex factorization

NASA Astrophysics Data System (ADS)

Schuster, Philip C.; Toro, Natalia

2009-06-01

A remarkable connection between BCFW recursion relations and constraints on the S-matrix was made by Benincasa and Cachazo in 0705.4305, who noted that mutual consistency of different BCFW constructions of four-particle amplitudes generates non-trivial (but familiar) constraints on three-particle coupling constants; these include gauge invariance, the equivalence principle, and the lack of non-trivial couplings for spins > 2. These constraints can also be derived with weaker assumptions, by demanding the existence of four-point amplitudes that factorize properly in all unitarity limits with complex momenta. From this starting point, we show that the BCFW prescription can be interpreted as an algorithm for fully constructing a tree-level S-matrix, and that complex factorization of general BCFW amplitudes follows from the factorization of four-particle amplitudes. The allowed set of BCFW deformations is identified, formulated entirely as a statement on the three-particle sector, and using only complex factorization as a guide. Consequently, our analysis based on the physical consistency of the S-matrix is entirely independent of field theory. We analyze the case of pure Yang-Mills, and outline a proof for gravity. For Yang-Mills, we also show that the well-known scaling behavior of BCFW-deformed amplitudes at large z is a simple consequence of factorization. For gravity, factorization in certain channels requires asymptotic behavior ~ 1/z^2.
Decoding the encoding of functional brain networks: An fMRI classification comparison of non-negative matrix factorization (NMF), independent component analysis (ICA), and sparse coding algorithms.

PubMed

Xie, Jianwen; Douglas, Pamela K; Wu, Ying Nian; Brody, Arthur L; Anderson, Ariana E

2017-04-15

Brain networks in fMRI are typically identified using spatial independent component analysis (ICA), yet other mathematical constraints provide alternate biologically-plausible frameworks for generating brain networks. Non-negative matrix factorization (NMF) would suppress negative BOLD signal by enforcing positivity. Spatial sparse coding algorithms (L1 Regularized Learning and K-SVD) would impose local specialization and a discouragement of multitasking, where the total observed activity in a single voxel originates from a restricted number of possible brain networks. The assumptions of independence, positivity, and sparsity to encode task-related brain networks are compared; the resulting brain networks within scan for different constraints are used as basis functions to encode observed functional activity. These encodings are then decoded using machine learning, by using the time series weights to predict within scan whether a subject is viewing a video, listening to an audio cue, or at rest, in 304 fMRI scans from 51 subjects. The sparse coding algorithm of L1 Regularized Learning outperformed 4 variations of ICA (p<0.001) for predicting the task being performed within each scan using artifact-cleaned components. The NMF algorithms, which suppressed negative BOLD signal, had the poorest accuracy compared to the ICA and sparse coding algorithms. Holding constant the effect of the extraction algorithm, encodings using sparser spatial networks (containing more zero-valued voxels) had higher classification accuracy (p<0.001). Lower classification accuracy occurred when the extracted spatial maps contained more CSF regions (p<0.001). The success of sparse coding algorithms suggests that algorithms which enforce sparsity, discourage multitasking, and promote local specialization may capture better the underlying source processes than those which allow inexhaustible local processes such as ICA. Negative BOLD signal may capture task-related activations. 
Development of a Real Time Sparse Non-Negative Matrix Factorization Module for Cochlear Implants by Using xPC Target

PubMed Central

Hu, Hongmei; Krasoulis, Agamemnon; Lutman, Mark; Bleeck, Stefan

2013-01-01

Cochlear implants (CIs) require efficient speech processing to maximize information transmission to the brain, especially in noise. A novel CI processing strategy was proposed in our previous studies, in which sparsity-constrained non-negative matrix factorization (NMF) was applied to the envelope matrix in order to improve the CI performance in noisy environments. It showed that the algorithm needs to be adaptive, rather than fixed, in order to adjust to acoustical conditions and individual characteristics. Here, we explore the benefit of a system that allows the user to adjust the signal processing in real time according to their individual listening needs and their individual hearing capabilities. In this system, which is based on MATLAB®, SIMULINK® and the xPC Target™ environment, the input/output (I/O) boards are interfaced between the SIMULINK blocks and the CI stimulation system, such that the output can be controlled successfully in the manner of a hardware-in-the-loop (HIL) simulation, hence offering a convenient way to implement a real time signal processing module that does not require any low level language. The sparsity constrained parameter of the algorithm was adapted online subjectively during an experiment with normal-hearing subjects and noise vocoded speech simulation. Results show that subjects chose different parameter values according to their own intelligibility preferences, indicating that adaptive real time algorithms are beneficial to fully explore subjective preferences. We conclude that the adaptive real time systems are beneficial for the experimental design, and such systems allow one to conduct psychophysical experiments with high ecological validity. PMID:24129021
Development of a real time sparse non-negative matrix factorization module for cochlear implants by using xPC target.

PubMed

Hu, Hongmei; Krasoulis, Agamemnon; Lutman, Mark; Bleeck, Stefan

2013-10-14

Cochlear implants (CIs) require efficient speech processing to maximize information transmission to the brain, especially in noise. A novel CI processing strategy was proposed in our previous studies, in which sparsity-constrained non-negative matrix factorization (NMF) was applied to the envelope matrix in order to improve the CI performance in noisy environments. It showed that the algorithm needs to be adaptive, rather than fixed, in order to adjust to acoustical conditions and individual characteristics. Here, we explore the benefit of a system that allows the user to adjust the signal processing in real time according to their individual listening needs and their individual hearing capabilities. In this system, which is based on MATLAB®, SIMULINK® and the xPC Target™ environment, the input/output (I/O) boards are interfaced between the SIMULINK blocks and the CI stimulation system, such that the output can be controlled successfully in the manner of a hardware-in-the-loop (HIL) simulation, hence offering a convenient way to implement a real time signal processing module that does not require any low level language. The sparsity constrained parameter of the algorithm was adapted online subjectively during an experiment with normal-hearing subjects and noise vocoded speech simulation. Results show that subjects chose different parameter values according to their own intelligibility preferences, indicating that adaptive real time algorithms are beneficial to fully explore subjective preferences. We conclude that the adaptive real time systems are beneficial for the experimental design, and such systems allow one to conduct psychophysical experiments with high ecological validity.
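As a rough illustration of what a sparsity-constrained NMF with a tunable sparsity parameter can look like, the sketch below uses the standard multiplicative updates with an L1 penalty on the activation matrix H. The function name, the `sparsity` parameter, and the column normalization are assumptions made for this example; the abstract does not specify the authors' exact cost function or update rules, and the original system runs in MATLAB/SIMULINK rather than Python.

```python
import numpy as np


def sparse_nmf(V, rank, sparsity=0.1, n_iter=200, seed=0, eps=1e-9):
    """Approximate V >= 0 by W @ H with W, H >= 0 and an L1 penalty on H.

    Assumed objective: ||V - W H||_F^2 + sparsity * sum(H), minimized with
    multiplicative updates. Larger `sparsity` pushes more entries of H toward 0.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # The penalty enters the denominator of the H update.
        H *= (W.T @ V) / (W.T @ W @ H + sparsity + eps)
        W *= (V @ H.T) / (W @ (H @ H.T) + eps)
        # Fix the scale ambiguity: unit-sum columns of W, rescale H to keep W @ H.
        scale = W.sum(axis=0) + eps
        W /= scale
        H *= scale[:, None]
    return W, H


rng = np.random.default_rng(1)
V = rng.random((20, 3)) @ rng.random((3, 30))  # synthetic non-negative "envelope" matrix
W, H = sparse_nmf(V, rank=3, sparsity=0.01)
print("relative error:", np.linalg.norm(V - W @ H) / np.linalg.norm(V))
```

Exposing `sparsity` as a single scalar is what makes a user-adjustable, real-time control plausible: only that one value changes between updates.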
The wavenumber algorithm for full-matrix imaging using an ultrasonic array.

PubMed

Hunter, Alan J; Drinkwater, Bruce W; Wilcox, Paul D

2008-11-01

Ultrasonic imaging using full-matrix capture, e.g., via the total focusing method (TFM), has been shown to increase angular inspection coverage and improve sensitivity to small defects in nondestructive evaluation. In this paper, we develop a Fourier-domain approach to full-matrix imaging based on the wavenumber algorithm used in synthetic aperture radar and sonar. The extension to the wavenumber algorithm for full-matrix data is described and the performance of the new algorithm compared with the TFM, which we use as a representative benchmark for the time-domain algorithms. The wavenumber algorithm provides a mathematically rigorous solution to the inverse problem for the assumed forward wave propagation model, whereas the TFM employs heuristic delay-and-sum beamforming. Consequently, the wavenumber algorithm has an improved point-spread function and provides better imagery. However, the major advantage of the wavenumber algorithm is its superior computational performance. For large arrays and images, the wavenumber algorithm is several orders of magnitude faster than the TFM. On the other hand, the key advantage of the TFM is its flexibility. The wavenumber algorithm requires a regularly sampled linear array, while the TFM can handle arbitrary imaging geometries. The TFM and the wavenumber algorithm are compared using simulated and experimental data.
A robust method of computing finite difference coefficients based on Vandermonde matrix

NASA Astrophysics Data System (ADS)

Zhang, Yijie; Gao, Jinghuai; Peng, Jigen; Han, Weimin

2018-05-01

When the finite difference (FD) method is employed to simulate wave propagation, a high-order FD method is preferred in order to achieve better accuracy. However, if the order of the FD scheme is high enough, the coefficient matrix of the formula for calculating the finite difference coefficients is close to singular. In this case, when the FD coefficients are computed with the matrix inverse operator of MATLAB, inaccuracy can be produced. To overcome this problem, we suggest an algorithm based on the Vandermonde matrix. After a specified mathematical transformation, the coefficient matrix is transformed into a Vandermonde matrix. The FD coefficients of the high-order FD method can then be computed by the algorithm for Vandermonde matrices, which avoids inverting the nearly singular matrix. The dispersion analysis and numerical results for a homogeneous elastic model and a geophysical model of an oil and gas reservoir demonstrate that the algorithm based on the Vandermonde matrix has better accuracy than the matrix inverse operator of MATLAB.
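To show where a Vandermonde system arises, the sketch below sets up the standard moment conditions for FD weights on arbitrary grid offsets and solves them in exact rational arithmetic. This is not the algorithm of the paper (whose transformation and Vandermonde solver are not given in the abstract); the function name and the use of exact fractions are assumptions chosen to sidestep floating-point ill-conditioning in a small example.

```python
from fractions import Fraction
from math import factorial


def fd_weights(offsets, order):
    """Weights w_j such that sum_j w_j * f(x + offsets[j] * h) ~ h**order * f^(order)(x).

    Solves the transposed Vandermonde system
        sum_j w_j * offsets[j]**k = k! if k == order else 0,   k = 0..n-1
    by Gauss-Jordan elimination over exact rationals.
    """
    n = len(offsets)
    A = [[Fraction(x) ** k for x in offsets] for k in range(n)]
    b = [Fraction(factorial(k) if k == order else 0) for k in range(n)]
    for col in range(n):
        # Distinct offsets make the Vandermonde matrix nonsingular, so a pivot exists.
        pivot = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * p for a, p in zip(A[r], A[col])]
                b[r] -= factor * b[col]
    return [b[i] / A[i][i] for i in range(n)]


print(fd_weights([-1, 0, 1], 2))          # second derivative, 3-point stencil
print(fd_weights([-2, -1, 0, 1, 2], 1))   # first derivative, 5-point stencil
```

In floating point the same system becomes badly conditioned as the stencil widens, which is the difficulty the abstract describes.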
Distributed Matrix Completion: Application to Cooperative Positioning in Noisy Environments

DTIC Science & Technology

2013-12-11

positioning, and a gossip version of low-rank approximation were developed. A convex relaxation for positioning in the presence of noise was shown to...of a large data matrix through gossip algorithms. A new algorithm is proposed that amounts to iteratively multiplying a vector by independent random...sparsification of the original matrix and averaging the resulting normalized vectors. This can be viewed as a generalization of gossip algorithms for
Development of a validation model for the defense meteorological satellite program's special sensor microwave imager

NASA Technical Reports Server (NTRS)

Swift, C. T.; Goodberlet, M. A.; Wilkerson, J. C.

1990-01-01

For the Defense Meteorological Satellite Program's (DMSP) Special Sensor Microwave/Imager (SSM/I), an operational wind speed algorithm was developed. The algorithm is based on the D-matrix approach, which seeks a linear relationship between measured SSM/I brightness temperatures and environmental parameters. D-matrix performance was validated by comparing algorithm-derived wind speeds with near-simultaneous and co-located measurements made by off-shore ocean buoys. Other topics include error budget modeling, alternate wind speed algorithms, and D-matrix performance with one or more inoperative SSM/I channels.
Parallel-vector unsymmetric Eigen-Solver on high performance computers

NASA Technical Reports Server (NTRS)

Nguyen, Duc T.; Jiangning, Qin

1993-01-01

The popular QR algorithm for computing all eigenvalues of an unsymmetric matrix is reviewed. Among the basic components of the QR algorithm, this study concludes that the reduction of an unsymmetric matrix to Hessenberg form (before applying the QR algorithm itself) can be done effectively by exploiting the vector speed and multiple processors offered by modern high-performance computers. Numerical examples of several test cases indicate that the proposed parallel-vector algorithm for converting a given unsymmetric matrix to Hessenberg form offers computational advantages over the existing algorithm. The time saving obtained by the proposed methods increases as the problem size increases.
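For readers unfamiliar with the Hessenberg step, the sketch below is the textbook serial Householder reduction, written with NumPy. It is only meant to show what is being computed; it is not the parallel-vector algorithm of the report, and the function name is an assumption.

```python
import numpy as np


def hessenberg_householder(A):
    """Return (H, Q) with A = Q @ H @ Q.T, Q orthogonal, H upper Hessenberg."""
    H = np.array(A, dtype=float)
    n = H.shape[0]
    Q = np.eye(n)
    for k in range(n - 2):
        x = H[k + 1:, k]
        norm_x = np.linalg.norm(x)
        if norm_x == 0.0:
            continue  # column already has the required zeros
        # Householder vector that maps x onto a multiple of the first unit vector.
        v = x.copy()
        v[0] += np.copysign(norm_x, x[0])
        v /= np.linalg.norm(v)
        # Similarity transform with P = I - 2 v v^T acting on rows/columns k+1..n-1.
        H[k + 1:, :] -= 2.0 * np.outer(v, v @ H[k + 1:, :])
        H[:, k + 1:] -= 2.0 * np.outer(H[:, k + 1:] @ v, v)
        Q[:, k + 1:] -= 2.0 * np.outer(Q[:, k + 1:] @ v, v)
    return H, Q


rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
H, Q = hessenberg_householder(A)
print(np.round(H, 3))
```

Each step is dominated by rank-one updates of large blocks, which is the kind of work that vector units and multiple processors can speed up.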
Model reduction of nonsquare linear MIMO systems using multipoint matrix continued-fraction expansions

NASA Technical Reports Server (NTRS)

Guo, Tong-Yi; Hwang, Chyi; Shieh, Leang-San

1994-01-01

This paper deals with the multipoint Cauer matrix continued-fraction expansion (MCFE) for model reduction of linear multi-input multi-output (MIMO) systems with various numbers of inputs and outputs. A salient feature of the proposed MCFE approach to model reduction of MIMO systems with square transfer matrices is its equivalence to the matrix Pade approximation approach. The Cauer second form of the ordinary MCFE for a square transfer function matrix is generalized in this paper to a multipoint and nonsquare-matrix version. An interesting connection of the multipoint Cauer MCFE method to the multipoint matrix Pade approximation method is established. Also, algorithms for obtaining the reduced-degree matrix-fraction descriptions and reduced-dimensional state-space models from a transfer function matrix via the multipoint Cauer MCFE algorithm are presented. Practical advantages of using the multipoint Cauer MCFE are discussed and a numerical example is provided to illustrate the algorithms.
A Deep Learning Algorithm of Neural Network for the Parameterization of Typhoon-Ocean Feedback in Typhoon Forecast Models

NASA Astrophysics Data System (ADS)

Jiang, Guo-Qing; Xu, Jing; Wei, Jun

2018-04-01

Two algorithms based on machine learning neural networks are proposed, the shallow learning (S-L) and deep learning (D-L) algorithms, that can potentially be used in atmosphere-only typhoon forecast models to provide flow-dependent typhoon-induced sea surface temperature cooling (SSTC) for improving typhoon predictions. The major challenge of existing SSTC algorithms in forecast models is how to accurately predict SSTC induced by an upcoming typhoon, which requires information not only from historical data but more importantly also from the target typhoon itself. The S-L algorithm is composed of a single layer of neurons with mixed atmospheric and oceanic factors. Such a structure is found to be unable to represent correctly the physical typhoon-ocean interaction. It tends to produce an unstable SSTC distribution, for which any perturbations may lead to changes in both SSTC pattern and strength. The D-L algorithm extends the neural network to a 4 × 5 neuron matrix with atmospheric and oceanic factors being separated in different layers of neurons, so that the machine learning can determine the roles of atmospheric and oceanic factors in shaping the SSTC. Therefore, it produces a stable crescent-shaped SSTC distribution, with its large-scale pattern determined mainly by atmospheric factors (e.g., winds) and small-scale features by oceanic factors (e.g., eddies). Sensitivity experiments reveal that the D-L algorithms improve maximum wind intensity errors by 60-70% for four case study simulations, compared to their atmosphere-only model runs.
Face recognition using tridiagonal matrix enhanced multivariance products representation

NASA Astrophysics Data System (ADS)

Özay, Evrim Korkmaz

2017-01-01

This study aims to retrieve face images from a database according to a target face image. For this purpose, Tridiagonal Matrix Enhanced Multivariance Products Representation (TMEMPR) is taken into consideration. TMEMPR is a recursive algorithm based on Enhanced Multivariance Products Representation (EMPR). TMEMPR decomposes a matrix into three components which are a matrix of left support terms, a tridiagonal matrix of weight parameters for each recursion, and a matrix of right support terms, respectively. In this sense, there is an analogy between Singular Value Decomposition (SVD) and TMEMPR. However TMEMPR is a more flexible algorithm since its initial support terms (or vectors) can be chosen as desired. Low computational complexity is another advantage of TMEMPR because the algorithm has been constructed with recursions of certain arithmetic operations without requiring any iteration. The algorithm has been trained and tested with ORL face image database with 400 different grayscale images of 40 different people. TMEMPR's performance has been compared with SVD's performance as a result.
Efficient block preconditioned eigensolvers for linear response time-dependent density functional theory

NASA Astrophysics Data System (ADS)

Vecharynski, Eugene; Brabec, Jiri; Shao, Meiyue; Govind, Niranjan; Yang, Chao

2017-12-01

We present two efficient iterative algorithms for solving the linear response eigenvalue problem arising from the time dependent density functional theory. Although the matrix to be diagonalized is nonsymmetric, it has a special structure that can be exploited to save both memory and floating point operations. In particular, the nonsymmetric eigenvalue problem can be transformed into an eigenvalue problem that involves the product of two matrices M and K. We show that, because MK is self-adjoint with respect to the inner product induced by the matrix K, this product eigenvalue problem can be solved efficiently by a modified Davidson algorithm and a modified locally optimal block preconditioned conjugate gradient (LOBPCG) algorithm that make use of the K-inner product. The solution of the product eigenvalue problem yields one component of the eigenvector associated with the original eigenvalue problem. We show that the other component of the eigenvector can be easily recovered in an inexpensive postprocessing procedure. As a result, the algorithms we present here become more efficient than existing methods that try to approximate both components of the eigenvectors simultaneously. In particular, our numerical experiments demonstrate that the new algorithms presented here consistently outperform the existing state-of-the-art Davidson type solvers by a factor of two in both solution time and storage.
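The reduction to a product eigenvalue problem can be checked numerically on a small dense example. The sketch below assumes the common linear response form H = [[A, B], [-B, -A]] with M = A + B and K = A - B both symmetric positive definite; these definitions and the function names are assumptions for illustration (the abstract names M and K but does not define them), and the dense Cholesky route shown here is not the Davidson or LOBPCG solver of the paper.

```python
import numpy as np


def linear_response_matrices(n, seed=0):
    """Build a random test problem; returns (H, M, K) under the assumed definitions."""
    rng = np.random.default_rng(seed)

    def random_spd():
        X = rng.standard_normal((n, n))
        return X @ X.T + n * np.eye(n)

    M = random_spd()  # assumed to play the role of A + B
    K = random_spd()  # assumed to play the role of A - B
    A = 0.5 * (M + K)
    B = 0.5 * (M - K)
    H = np.block([[A, B], [-B, -A]])
    return H, M, K


def product_eigenvalues(M, K):
    """Positive eigenvalues of H, obtained from the n-by-n product M K.

    With K = L L^T, M K is similar to the symmetric matrix L^T M L, whose
    eigenvalues are the squares of the eigenvalues of H.
    """
    L = np.linalg.cholesky(K)
    return np.sqrt(np.linalg.eigvalsh(L.T @ M @ L))


H, M, K = linear_response_matrices(4)
print("from product problem:", product_eigenvalues(M, K))
print("from full 2n x 2n H :", np.sort(np.real(np.linalg.eigvals(H))))
```

The product problem has half the dimension of H, which is the source of the memory and operation savings the abstract mentions.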
A Novel 2D Image Compression Algorithm Based on Two Levels DWT and DCT Transforms with Enhanced Minimize-Matrix-Size Algorithm for High Resolution Structured Light 3D Surface Reconstruction

NASA Astrophysics Data System (ADS)

Siddeq, M. M.; Rodrigues, M. A.

2015-09-01

Image compression techniques are widely used on 2D images, 2D video, 3D images, and 3D video. There are many types of compression techniques, and among the most popular are JPEG and JPEG2000. In this research, we introduce a new compression method based on applying a two-level discrete cosine transform (DCT) and a two-level discrete wavelet transform (DWT) in connection with novel compression steps for high-resolution images. The proposed image compression algorithm consists of four steps: (1) transform an image by a two-level DWT followed by a DCT to produce two matrices, the DC-Matrix and the AC-Matrix, or low- and high-frequency matrix, respectively; (2) apply a second-level DCT on the DC-Matrix to generate two arrays, namely the nonzero-array and the zero-array; (3) apply the Minimize-Matrix-Size algorithm to the AC-Matrix and to the other high frequencies generated by the second-level DWT; (4) apply arithmetic coding to the output of the previous steps. A novel decompression algorithm, the Fast-Match-Search (FMS) algorithm, is used to reconstruct all high-frequency matrices. The FMS algorithm computes all compressed data probabilities by using a table of data, and then uses a binary search algorithm to find the decompressed data inside the table. Thereafter, all decoded DC-values are combined with the decoded AC-coefficients in one matrix, followed by the inverse two-level DCT and two-level DWT. The technique is tested by compression and reconstruction of 3D surface patches. Additionally, this technique is compared with the JPEG and JPEG2000 algorithms through 2D and 3D root-mean-square error following reconstruction. The results demonstrate that the proposed compression method has better visual properties than JPEG and JPEG2000 and is able to more accurately reconstruct surface patches in 3D.
Multiway analysis methods applied to the fluorescence excitation-emission dataset for the simultaneous quantification of valsartan and amlodipine in tablets

NASA Astrophysics Data System (ADS)

Dinç, Erdal; Ertekin, Zehra Ceren; Büker, Eda

2017-09-01

In this study, excitation-emission matrix datasets, which have strongly overlapping bands, were processed by using four different chemometric calibration algorithms consisting of parallel factor analysis, Tucker3, three-way partial least squares and unfolded partial least squares for the simultaneous quantitative estimation of valsartan and amlodipine besylate in tablets. In the analyses, no preliminary separation step was used before the application of the parallel factor analysis, Tucker3, three-way partial least squares and unfolded partial least squares approaches to the related drug substances in samples. A three-way excitation-emission matrix data array was obtained by concatenating the excitation-emission matrices of the calibration set, validation set, and commercial tablet samples. The excitation-emission matrix data array was used to obtain parallel factor analysis, Tucker3, three-way partial least squares and unfolded partial least squares calibrations and to predict the amounts of valsartan and amlodipine besylate in samples. For all the methods, calibration and prediction of valsartan and amlodipine besylate were performed in the working concentration ranges of 0.25-4.50 μg/mL. The validity and the performance of all the proposed methods were checked by using the validation parameters. From the analysis results, it was concluded that the described two-way and three-way algorithmic methods were very useful for the simultaneous quantitative resolution and routine analysis of the related drug substances in marketed samples.
Efficient computer algebra algorithms for polynomial matrices in control design

NASA Technical Reports Server (NTRS)

Baras, J. S.; Macenany, D. C.; Munach, R.

1989-01-01

The theory of polynomial matrices plays a key role in the design and analysis of multi-input multi-output control and communications systems using frequency domain methods. Examples include coprime factorizations of transfer functions, canonical realizations from matrix fraction descriptions, and the transfer function design of feedback compensators. Typically, such problems abstract in a natural way to the need to solve systems of Diophantine equations or systems of linear equations over polynomials. These and other problems involving polynomial matrices can in turn be reduced to polynomial matrix triangularization procedures, a result which is not surprising given the importance of matrix triangularization techniques in numerical linear algebra. Matrices with entries from a field and Gaussian elimination play a fundamental role in understanding the triangularization process. Polynomial matrices, however, have entries from a ring, for which Gaussian elimination is not defined, and triangularization is accomplished by what is quite properly called Euclidean elimination. Unfortunately, the numerical stability and sensitivity issues which accompany floating point approaches to Euclidean elimination are not very well understood. New algorithms are presented which circumvent entirely such numerical issues through the use of exact, symbolic methods in computer algebra. The use of such error-free algorithms guarantees that the results are accurate to within the precision of the model data, the best that can be hoped for. Care must be taken in the design of such algorithms due to the phenomenon of intermediate expression swell.
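The idea of Euclidean elimination with exact arithmetic can be sketched on the smallest possible case, a two-entry polynomial column. The code below uses Python's exact rationals and repeated polynomial division to reduce the column [a, b]^T to [g, 0]^T, where g is the monic greatest common divisor. The function names and the coefficient-list representation (lowest degree first) are assumptions for this example, not the algorithms of the report.

```python
from fractions import Fraction


def poly_trim(p):
    """Drop trailing zero coefficients (coefficients are stored lowest degree first)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def poly_divmod(a, b):
    """Exact quotient and remainder of a / b; b must be a nonzero polynomial."""
    a = [Fraction(c) for c in poly_trim(a)]
    b = [Fraction(c) for c in poly_trim(b)]
    q = [Fraction(0)] * max(len(a) - len(b) + 1, 0)
    while len(a) >= len(b):
        shift = len(a) - len(b)
        coeff = a[-1] / b[-1]
        q[shift] = coeff
        for i, c in enumerate(b):
            a[shift + i] -= coeff * c
        a = poly_trim(a)  # leading term cancels exactly, so the degree drops
    return q, a


def euclidean_eliminate(a, b):
    """Reduce the column [a, b]^T to [g, 0]^T; returns the monic g (a, b not both zero)."""
    a = [Fraction(c) for c in poly_trim(a)]
    b = [Fraction(c) for c in poly_trim(b)]
    while b:
        _, r = poly_divmod(a, b)
        a, b = b, r  # row swap plus subtraction of a polynomial multiple
    return [c / a[-1] for c in a]


# (x - 1)(x + 2) and (x - 1)(x - 3) share the factor x - 1.
print(euclidean_eliminate([-2, 1, 1], [3, -4, 1]))
```

Because every step is exact, there is no rounding error to analyze; the price is that intermediate numerators and denominators can grow quickly on larger inputs.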
A Note on Alternating Minimization Algorithm for the Matrix Completion Problem

DOE PAGES

Gamarnik, David; Misra, Sidhant

2016-06-06

Here, we consider the problem of reconstructing a low-rank matrix from a subset of its entries and analyze two variants of the so-called alternating minimization algorithm, which has been proposed in the past. We establish that when the underlying matrix has rank one, has positive bounded entries, and the graph underlying the revealed entries has diameter which is logarithmic in the size of the matrix, both algorithms succeed in reconstructing the matrix approximately in polynomial time starting from an arbitrary initialization. We further provide simulation results which suggest that the second variant, which is based on message passing type updates, performs significantly better.
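A minimal sketch of alternating minimization for the rank-one case is given below. It uses plain alternating least squares over the revealed entries and is not claimed to match either of the two variants analyzed in the paper; the function name is an assumption, and every row and column is assumed to contain at least one revealed entry.

```python
import numpy as np


def rank_one_altmin(M_obs, mask, n_iter=100, seed=0):
    """Fit u, v so that u[i] * v[j] matches M_obs[i, j] on the revealed entries.

    mask[i, j] is True where the entry is revealed. Each half-step is the exact
    least-squares minimizer over one factor with the other held fixed.
    """
    rng = np.random.default_rng(seed)
    W = mask.astype(float)
    X = np.where(mask, M_obs, 0.0)
    v = rng.random(M_obs.shape[1]) + 0.5  # arbitrary positive initialization
    u = np.ones(M_obs.shape[0])
    for _ in range(n_iter):
        u = (X @ v) / (W @ (v ** 2))
        v = (X.T @ u) / (W.T @ (u ** 2))
    return u, v


# Synthetic rank-one matrix with positive entries and a structured reveal pattern.
u0 = np.linspace(1.0, 2.0, 12)
v0 = np.linspace(0.5, 1.5, 9)
M = np.outer(u0, v0)
i, j = np.indices(M.shape)
mask = (i + j) % 3 != 0
u, v = rank_one_altmin(M, mask)
print("max reconstruction error:", np.max(np.abs(np.outer(u, v) - M)))
```

Only the product u v^T is identifiable; u and v individually are determined up to a reciprocal scaling.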
Near-lossless multichannel EEG compression based on matrix and tensor decompositions.

PubMed

Dauwels, Justin; Srinivasan, K; Reddy, M Ramasubba; Cichocki, Andrzej

2013-05-01

A novel near-lossless compression algorithm for multichannel electroencephalogram (MC-EEG) is proposed based on matrix/tensor decomposition models. MC-EEG is represented in suitable multiway (multidimensional) forms to efficiently exploit temporal and spatial correlations simultaneously. Several matrix/tensor decomposition models are analyzed in view of efficient decorrelation of the multiway forms of MC-EEG. A compression algorithm is built based on the principle of “lossy plus residual coding,” consisting of a matrix/tensor decomposition-based coder in the lossy layer followed by arithmetic coding in the residual layer. This approach guarantees a specifiable maximum absolute error between original and reconstructed signals. The compression algorithm is applied to three different scalp EEG datasets and an intracranial EEG dataset, each with different sampling rate and resolution. The proposed algorithm achieves attractive compression ratios compared to compressing individual channels separately. For similar compression ratios, the proposed algorithm achieves nearly fivefold lower average error compared to a similar wavelet-based volumetric MC-EEG compression algorithm.
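The "lossy plus residual coding" principle and its error guarantee can be sketched as follows. A truncated SVD stands in for the lossy layer, and the residual is quantized with an odd step 2*delta + 1 so that the reconstruction error of integer samples never exceeds delta. The choice of SVD, the function names, and the omission of the arithmetic coder are assumptions for illustration; the paper evaluates several matrix/tensor decompositions.

```python
import numpy as np


def near_lossless_encode(X, rank, delta):
    """X: integer array (channels x samples). Returns lossy factors and quantized residual."""
    U, s, Vt = np.linalg.svd(X.astype(float), full_matrices=False)
    U_r, s_r, Vt_r = U[:, :rank], s[:rank], Vt[:rank]
    approx = np.rint((U_r * s_r) @ Vt_r).astype(np.int64)  # lossy layer
    step = 2 * delta + 1
    q = np.rint((X - approx) / step).astype(np.int64)      # residual layer
    return (U_r, s_r, Vt_r), q


def near_lossless_decode(factors, q, delta):
    U_r, s_r, Vt_r = factors
    approx = np.rint((U_r * s_r) @ Vt_r).astype(np.int64)
    return approx + q * (2 * delta + 1)


rng = np.random.default_rng(0)
X = rng.integers(-500, 500, size=(8, 64))  # synthetic stand-in for multichannel EEG samples
factors, q = near_lossless_encode(X, rank=2, delta=3)
X_hat = near_lossless_decode(factors, q, delta=3)
print("max abs error:", np.max(np.abs(X - X_hat)))
```

In a real codec the factors would also be quantized and the integers in `q` entropy coded; a better lossy layer makes `q` smaller and therefore cheaper to code, while the bound on the error stays the same.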
Recursive Factorization of the Inverse Overlap Matrix in Linear-Scaling Quantum Molecular Dynamics Simulations.

PubMed

Negre, Christian F A; Mniszewski, Susan M; Cawkwell, Marc J; Bock, Nicolas; Wall, Michael E; Niklasson, Anders M N

2016-07-12

We present a reduced complexity algorithm to compute the inverse overlap factors required to solve the generalized eigenvalue problem in a quantum-based molecular dynamics (MD) simulation. Our method is based on the recursive, iterative refinement of an initial guess of Z (inverse square root of the overlap matrix S). The initial guess of Z is obtained beforehand by using either an approximate divide-and-conquer technique or dynamical methods, propagated within an extended Lagrangian dynamics from previous MD time steps. With this formulation, we achieve long-term stability and energy conservation even under the incomplete, approximate, iterative refinement of Z. Linear-scaling performance is obtained using numerically thresholded sparse matrix algebra based on the ELLPACK-R sparse matrix data format, which also enables efficient shared-memory parallelization. As we show in this article using self-consistent density-functional-based tight-binding MD, our approach is faster than conventional methods based on the diagonalization of overlap matrix S for systems as small as a few hundred atoms, substantially accelerating quantum-based simulations even for molecular structures of intermediate size. For a 4158-atom water-solvated polyalanine system, we find an average speedup factor of 122 for the computation of Z in each MD step.
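The refinement of an approximate Z can be illustrated with a first-order iteration: with delta = I - Z^T S Z, the update Z <- Z (I + delta/2) drives Z^T S Z toward the identity when the initial guess is close enough. The dense NumPy sketch below is an assumption-laden illustration: the update order, the scaled-identity starting guess, and the function name are choices made for this example, whereas the paper uses thresholded sparse algebra and starting guesses from divide-and-conquer or from previous MD steps.

```python
import numpy as np


def refine_inverse_sqrt(S, Z0, n_iter=20):
    """Iteratively refine Z so that Z.T @ S @ Z approaches the identity."""
    I = np.eye(S.shape[0])
    Z = Z0.copy()
    for _ in range(n_iter):
        delta = I - Z.T @ S @ Z    # deviation from congruence to the identity
        Z = Z @ (I + 0.5 * delta)  # first-order correction
    return Z


rng = np.random.default_rng(0)
G = rng.standard_normal((6, 6))
S = G @ G.T / 6 + np.eye(6)                     # synthetic symmetric positive definite "overlap"
Z0 = np.eye(6) / np.sqrt(np.linalg.norm(S, 2))  # crude guess: eigenvalues of Z0.T S Z0 lie in (0, 1]
Z = refine_inverse_sqrt(S, Z0)
print("deviation:", np.linalg.norm(Z.T @ S @ Z - np.eye(6)))
```

Each iteration needs only matrix products, which is why the scheme combines well with sparse, thresholded matrix algebra and with a good starting guess carried over from the previous time step.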
Recursive Factorization of the Inverse Overlap Matrix in Linear Scaling Quantum Molecular Dynamics Simulations

DOE PAGES

Negre, Christian F. A; Mniszewski, Susan M.; Cawkwell, Marc Jon; ...

2016-06-06

We present a reduced complexity algorithm to compute the inverse overlap factors required to solve the generalized eigenvalue problem in a quantum-based molecular dynamics (MD) simulation. Our method is based on the recursive iterative refinement of an initial guess Z of the inverse overlap matrix S. The initial guess of Z is obtained beforehand either by using an approximate divide and conquer technique or dynamically, propagated within an extended Lagrangian dynamics from previous MD time steps. With this formulation, we achieve long-term stability and energy conservation even under incomplete approximate iterative refinement of Z. Linear scaling performance is obtained using numerically thresholded sparse matrix algebra based on the ELLPACK-R sparse matrix data format, which also enables efficient shared memory parallelization. As we show in this article using self-consistent density functional based tight-binding MD, our approach is faster than conventional methods based on the direct diagonalization of the overlap matrix S for systems as small as a few hundred atoms, substantially accelerating quantum-based simulations even for molecular structures of intermediate size. For a 4,158 atom water-solvated polyalanine system we find an average speedup factor of 122 for the computation of Z in each MD step.
Efficient parallel linear scaling construction of the density matrix for Born-Oppenheimer molecular dynamics.

PubMed

Mniszewski, S M; Cawkwell, M J; Wall, M E; Mohd-Yusof, J; Bock, N; Germann, T C; Niklasson, A M N

2015-10-13

We present an algorithm for the calculation of the density matrix that for insulators scales linearly with system size and parallelizes efficiently on multicore, shared memory platforms with small and controllable numerical errors. The algorithm is based on an implementation of the second-order spectral projection (SP2) algorithm [Niklasson, A. M. N. Phys. Rev. B 2002, 66, 155115] in sparse matrix algebra with the ELLPACK-R data format. We illustrate the performance of the algorithm within self-consistent tight binding theory by total energy calculations of gas phase poly(ethylene) molecules and periodic liquid water systems containing up to 15,000 atoms on up to 16 CPU cores. We consider algorithm-specific performance aspects, such as local vs nonlocal memory access and the degree of matrix sparsity. Comparisons to sparse matrix algebra implementations using off-the-shelf libraries on multicore CPUs, graphics processing units (GPUs), and the Intel many integrated core (MIC) architecture are also presented. The accuracy and stability of the algorithm are illustrated with long duration Born-Oppenheimer molecular dynamics simulations of 1000 water molecules and a 303 atom Trp cage protein solvated by 2682 water molecules.
Reduced order feedback control equations for linear time and frequency domain analysis

NASA Technical Reports Server (NTRS)

Frisch, H. P.

1981-01-01

An algorithm was developed which can be used to obtain the equations. In a more general context, the algorithm computes a real nonsingular similarity transformation matrix which reduces a real nonsymmetric matrix to block diagonal form, each block of which is a real quasi upper triangular matrix. The algorithm works with both defective and derogatory matrices and when and if it fails, the resultant output can be used as a guide for the reformulation of the mathematical equations that lead up to the ill conditioned matrix which could not be block diagonalized.

Multidimensional Hermite-Gaussian quadrature formulae and their application to nonlinear estimation

NASA Technical Reports Server (NTRS)

Mcreynolds, S. R.

1975-01-01

A simplified technique is proposed for calculating multidimensional Hermite-Gaussian quadratures that involves taking the square root of a matrix by the Cholesky algorithm rather than computation of the eigenvectors of the matrix. Ways of reducing the dimension, number, and order of the quadratures are set forth. If the function f(x) under the integral sign is not well approximated by a low-order algebraic expression, the order of the quadrature may be reduced by factoring f(x) into an expression that is nearly algebraic and one that is Gaussian.
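The idea of using a Cholesky factor as the matrix square root can be sketched as follows: a tensor-product Gauss-Hermite rule for a standard normal variable z is mapped to a Gaussian with mean μ and covariance P through x = μ + L z, where P = L Lᵀ. This is a minimal illustration and not the report's procedure; the function name, the tensor-product grid, and the test integrand are assumptions.

```python
import itertools
import numpy as np

def gauss_hermite_expectation(f, mean, cov, order=3):
    """Approximate E[f(x)] for x ~ N(mean, cov) with a tensor-product rule."""
    # Nodes and weights for the weight function exp(-z^2 / 2)
    z, w = np.polynomial.hermite_e.hermegauss(order)
    w = w / np.sqrt(2.0 * np.pi)          # normalize so the weights sum to 1
    L = np.linalg.cholesky(cov)           # matrix square root via Cholesky, no eigenvectors
    n = len(mean)
    total = 0.0
    for idx in itertools.product(range(order), repeat=n):
        idx = list(idx)
        point = mean + L @ z[idx]         # map standard-normal node to x-space
        total += np.prod(w[idx]) * f(point)
    return total

mean = np.array([1.0, -2.0])
cov = np.array([[2.0, 0.5], [0.5, 1.0]])
# E[x0 * x1] = cov[0, 1] + mean[0] * mean[1]
value = gauss_hermite_expectation(lambda x: x[0] * x[1], mean, cov)
```

The tensor-product grid grows as order**n, which is one reason the abstract discusses ways of reducing the dimension, number, and order of the quadratures.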
Progress on a generalized coordinates tensor product finite element 3DPNS algorithm for subsonic

NASA Technical Reports Server (NTRS)

Baker, A. J.; Orzechowski, J. A.

1983-01-01

A generalized coordinates form of the penalty finite element algorithm for the 3-dimensional parabolic Navier-Stokes equations for turbulent subsonic flows was derived. This algorithm formulation requires only three distinct hypermatrices and is applicable using any boundary fitted coordinate transformation procedure. The tensor matrix product approximation to the Jacobian of the Newton linear algebra matrix statement was also derived. The Newton algorithm was restructured to replace large sparse matrix solution procedures with grid sweeping using alpha-block tridiagonal matrices, where alpha equals the number of dependent variables. Numerical experiments were conducted and the resultant data gives guidance on potentially preferred tensor product constructions for the penalty finite element 3DPNS algorithm.
Parallel algorithms for computation of the manipulator inertia matrix

NASA Technical Reports Server (NTRS)

Amin-Javaheri, Masoud; Orin, David E.

1989-01-01

The development of an O(log₂ N) parallel algorithm for the manipulator inertia matrix is presented. It is based on the most efficient serial algorithm which uses the composite rigid body method. Recursive doubling is used to reformulate the linear recurrence equations which are required to compute the diagonal elements of the matrix. It results in O(log₂ N) levels of computation. Computation of the off-diagonal elements involves N linear recurrences of varying-size and a new method, which avoids redundant computation of position and orientation transforms for the manipulator, is developed. The O(log₂ N) algorithm is presented in both equation and graphic forms which clearly show the parallelism inherent in the algorithm.
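Recursive doubling can be illustrated on a scalar first-order recurrence x_i = a_i x_{i-1} + b_i with x_{-1} = 0. The sketch below is a generic version, assuming scalar coefficients rather than the spatial transforms and inertias of the manipulator problem. At each level every element combines with the element d positions earlier, and d doubles, so about log₂ N levels suffice; within a level all updates are independent and could run in parallel.

```python
import numpy as np

def serial_recurrence(a, b):
    """Reference: x_i = a_i * x_{i-1} + b_i, with x_{-1} = 0."""
    x = np.zeros(len(a))
    prev = 0.0
    for i in range(len(a)):
        prev = a[i] * prev + b[i]
        x[i] = prev
    return x

def recursive_doubling(a, b):
    """Same recurrence evaluated in ceil(log2(N)) data-parallel levels."""
    a = np.array(a, dtype=float)
    b = np.array(b, dtype=float)
    d = 1
    while d < len(a):
        a_new, b_new = a.copy(), b.copy()
        # Compose the affine map at i with the one at i - d.
        b_new[d:] = a[d:] * b[:-d] + b[d:]
        a_new[d:] = a[d:] * a[:-d]
        a, b = a_new, b_new
        d *= 2
    return b

a = np.array([0.5, -1.0, 2.0, 0.3, 1.5, -0.7, 0.9])
b = np.array([1.0, 2.0, -1.0, 0.5, 0.0, 3.0, -2.0])
x_parallel = recursive_doubling(a, b)
x_serial = serial_recurrence(a, b)
```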
Filtered gradient reconstruction algorithm for compressive spectral imaging

NASA Astrophysics Data System (ADS)

Mejia, Yuri; Arguello, Henry

2017-04-01

Compressive sensing matrices are traditionally based on random Gaussian and Bernoulli entries. Nevertheless, they are subject to physical constraints, and their structure unusually follows a dense matrix distribution, such as the case of the matrix related to compressive spectral imaging (CSI). The CSI matrix represents the integration of coded and shifted versions of the spectral bands. A spectral image can be recovered from CSI measurements by using iterative algorithms for linear inverse problems that minimize an objective function including a quadratic error term combined with a sparsity regularization term. However, current algorithms are slow because they do not exploit the structure and sparse characteristics of the CSI matrices. A gradient-based CSI reconstruction algorithm, which introduces a filtering step in each iteration of a conventional CSI reconstruction algorithm that yields improved image quality, is proposed. Motivated by the structure of the CSI matrix, Φ, this algorithm modifies the iterative solution such that it is forced to converge to a filtered version of the residual Φ^T y, where y is the compressive measurement vector. We show that the filtered-based algorithm converges to better quality performance results than the unfiltered version. Simulation results highlight the relative performance gain over the existing iterative algorithms.
SMI adaptive antenna arrays for weak interfering signals. [Sample Matrix Inversion]

NASA Technical Reports Server (NTRS)

Gupta, Inder J.

1986-01-01

The performance of adaptive antenna arrays in the presence of weak interfering signals (below thermal noise) is studied. It is shown that a conventional adaptive antenna array sample matrix inversion (SMI) algorithm is unable to suppress such interfering signals. To overcome this problem, the SMI algorithm is modified. In the modified algorithm, the covariance matrix is redefined such that the effect of thermal noise on the weights of adaptive arrays is reduced. Thus, the weights are dictated by relatively weak signals. It is shown that the modified algorithm provides the desired interference protection.
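For orientation, the conventional SMI weight computation that the report takes as its baseline can be sketched as w = R̂⁻¹ s, with R̂ the sample covariance of the array snapshots and s the steering vector of the desired direction. The sketch assumes a half-wavelength uniform linear array and a strong interferer so that the null is visible; the report's modified covariance for weak interferers is not specified in the abstract and is not reproduced. All names and parameter values are hypothetical.

```python
import numpy as np

def steering_vector(n_elements, angle_deg):
    # Half-wavelength uniform linear array (an assumption of this sketch)
    k = np.arange(n_elements)
    return np.exp(1j * np.pi * k * np.sin(np.deg2rad(angle_deg)))

def smi_weights(snapshots, steering):
    # snapshots: (n_elements, n_snapshots) complex array of received data
    n_snap = snapshots.shape[1]
    r_hat = snapshots @ snapshots.conj().T / n_snap   # sample covariance matrix
    return np.linalg.solve(r_hat, steering)           # w = R_hat^{-1} s

rng = np.random.default_rng(1)
n_el, n_snap = 4, 200
a_desired = steering_vector(n_el, 0.0)
a_interf = steering_vector(n_el, 20.0)
# Interferer 20 dB above unit-power thermal noise
interferer = 10.0 * (rng.standard_normal(n_snap) + 1j * rng.standard_normal(n_snap)) / np.sqrt(2)
noise = (rng.standard_normal((n_el, n_snap)) + 1j * rng.standard_normal((n_el, n_snap))) / np.sqrt(2)
x = np.outer(a_interf, interferer) + noise
w = smi_weights(x, a_desired)
# Array response toward the interferer relative to the desired direction
rejection = abs(w.conj() @ a_interf) ** 2 / abs(w.conj() @ a_desired) ** 2
```

When the interferer power falls below the noise level, the noise term dominates R̂ and the weights stay close to the steering vector, which is the failure mode the report addresses.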
Analysis of facial motion patterns during speech using a matrix factorization algorithm

PubMed Central

Lucero, Jorge C.; Munhall, Kevin G.

2008-01-01

This paper presents an analysis of facial motion during speech to identify linearly independent kinematic regions. The data consists of three-dimensional displacement records of a set of markers located on a subject’s face while producing speech. A QR factorization with column pivoting algorithm selects a subset of markers with independent motion patterns. The subset is used as a basis to fit the motion of the other facial markers, which determines facial regions of influence of each of the linearly independent markers. Those regions constitute kinematic “eigenregions” whose combined motion produces the total motion of the face. Facial animations may be generated by driving the independent markers with collected displacement records. PMID:19062866
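The marker-selection step can be sketched with a greedy pivoting loop that picks, at each step, the column with the largest remaining norm and then removes its direction from all columns; this produces the same pivot order as QR factorization with column pivoting (library routines such as SciPy's `scipy.linalg.qr` with `pivoting=True` do this more robustly). The data below are synthetic, and the function and variable names are hypothetical rather than the paper's.

```python
import numpy as np

def select_independent_columns(A, k):
    """Return indices of k columns chosen by greedy column pivoting."""
    R = np.array(A, dtype=float)
    chosen = []
    for _ in range(k):
        norms = np.linalg.norm(R, axis=0)
        j = int(np.argmax(norms))            # pivot: largest residual column
        chosen.append(j)
        q = R[:, j] / norms[j]
        R = R - np.outer(q, q @ R)           # project the pivot direction out of every column
    return chosen

rng = np.random.default_rng(0)
base = rng.standard_normal((50, 3))          # three independent "marker" motions
mix = rng.standard_normal((3, 5))
A = np.hstack([base, base @ mix])            # eight columns, rank three
chosen = select_independent_columns(A, 3)
# Fit every column as a linear combination of the selected ones
coeffs = np.linalg.lstsq(A[:, chosen], A, rcond=None)[0]
fit = A[:, chosen] @ coeffs
```

The rows of `coeffs` play the role of the regions of influence: they show how strongly each remaining column follows each selected one.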
Application of symbolic/numeric matrix solution techniques to the NASTRAN program

NASA Technical Reports Server (NTRS)

Buturla, E. M.; Burroughs, S. H.

1977-01-01

The matrix solving algorithm of any finite element algorithm is extremely important since solution of the matrix equations requires a large amount of elapsed time due to null calculations and excessive input/output operations. An alternate method of solving the matrix equations is presented. A symbolic processing step followed by numeric solution yields the solution very rapidly and is especially useful for nonlinear problems.
Pure endmember extraction using robust kernel archetypoid analysis for hyperspectral imagery

NASA Astrophysics Data System (ADS)

Sun, Weiwei; Yang, Gang; Wu, Ke; Li, Weiyue; Zhang, Dianfa

2017-09-01

A robust kernel archetypoid analysis (RKADA) method is proposed to extract pure endmembers from hyperspectral imagery (HSI). The RKADA assumes that each pixel is a sparse linear mixture of all endmembers and each endmember corresponds to a real pixel in the image scene. First, it improves the regular archetypal analysis with a new binary sparse constraint, and the adoption of the kernel function constructs the principal convex hull in an infinite Hilbert space and enlarges the divergences between pairwise pixels. Second, the RKADA transfers the pure endmember extraction problem into an optimization problem by minimizing residual errors with the Huber loss function. The Huber loss function reduces the effects from big noises and outliers in the convergence procedure of RKADA and enhances the robustness of the optimization function. Third, the random kernel sinks for fast kernel matrix approximation and the two-stage algorithm for optimizing initial pure endmembers are utilized to improve its computational efficiency in realistic implementations of RKADA, respectively. The optimization equation of RKADA is solved by using the block coordinate descent scheme and the desired pure endmembers are finally obtained. Six state-of-the-art pure endmember extraction methods are employed to make comparisons with the RKADA on both synthetic and real Cuprite HSI datasets, including three geometrical algorithms, vertex component analysis (VCA), alternative volume maximization (AVMAX) and orthogonal subspace projection (OSP), and three matrix factorization algorithms, the preconditioning for successive projection algorithm (PreSPA), hierarchical clustering based on rank-two nonnegative matrix factorization (H2NMF) and self-dictionary multiple measurement vector (SDMMV). Experimental results show that the RKADA outperforms all the six methods in terms of spectral angle distance (SAD) and root-mean-square-error (RMSE). Moreover, the RKADA has short computational times in offline operations and shows significant improvement in identifying pure endmembers for ground objects with smaller spectrum differences. Therefore, the RKADA could be an alternative for pure endmember extraction from hyperspectral images.
Sparse subspace clustering for data with missing entries and high-rank matrix completion.

PubMed

Fan, Jicong; Chow, Tommy W S

2017-09-01

Many methods have recently been proposed for subspace clustering, but they are often unable to handle incomplete data because of missing entries. Using matrix completion methods to recover missing entries is a common way to solve the problem. Conventional matrix completion methods require that the matrix should be of low-rank intrinsically, but most matrices are of high-rank or even full-rank in practice, especially when the number of subspaces is large. In this paper, a new method called Sparse Representation with Missing Entries and Matrix Completion is proposed to solve the problems of incomplete-data subspace clustering and high-rank matrix completion. The proposed algorithm alternately computes the matrix of sparse representation coefficients and recovers the missing entries of a data matrix. The proposed algorithm recovers missing entries through minimizing the representation coefficients, representation errors, and matrix rank. Thorough experimental study and comparative analysis based on synthetic data and natural images were conducted. The presented results demonstrate that the proposed algorithm is more effective in subspace clustering and matrix completion compared with other existing methods. Copyright © 2017 Elsevier Ltd. All rights reserved.
Computing the Density Matrix in Electronic Structure Theory on Graphics Processing Units.

PubMed

Cawkwell, M J; Sanville, E J; Mniszewski, S M; Niklasson, Anders M N

2012-11-13

The self-consistent solution of a Schrödinger-like equation for the density matrix is a critical and computationally demanding step in quantum-based models of interatomic bonding. This step was tackled historically via the diagonalization of the Hamiltonian. We have investigated the performance and accuracy of the second-order spectral projection (SP2) algorithm for the computation of the density matrix via a recursive expansion of the Fermi operator in a series of generalized matrix-matrix multiplications. We demonstrate that owing to its simplicity, the SP2 algorithm [Niklasson, A. M. N. Phys. Rev. B 2002, 66, 155115] is exceptionally well suited to implementation on graphics processing units (GPUs). The performance in double and single precision arithmetic of a hybrid GPU/central processing unit (CPU) and full GPU implementation of the SP2 algorithm exceeds that of a CPU-only implementation of the SP2 algorithm and traditional matrix diagonalization when the dimensions of the matrices exceed about 2000 × 2000. Padding schemes for arrays allocated in the GPU memory that optimize the performance of the CUBLAS implementations of the level 3 BLAS DGEMM and SGEMM subroutines for generalized matrix-matrix multiplications are described in detail. The analysis of the relative performance of the hybrid CPU/GPU and full GPU implementations indicates that the transfer of arrays between the GPU and CPU constitutes only a small fraction of the total computation time. The errors measured in the self-consistent density matrices computed using the SP2 algorithm are generally smaller than those measured in matrices computed via diagonalization. Furthermore, the errors in the density matrices computed using the SP2 algorithm do not exhibit any dependence on system size, whereas the errors increase linearly with the number of orbitals when diagonalization is employed.
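A dense-matrix sketch of the SP2 recursion may help make the "series of matrix-matrix multiplications" concrete. It assumes a symmetric Hamiltonian with a gap between occupied and unoccupied levels, uses Gershgorin estimates for the spectral bounds, and applies the simple trace-based choice between X² and 2X − X²; GPU kernels, padding, and sparse storage are outside its scope. The function name and the test Hamiltonian are hypothetical.

```python
import numpy as np

def sp2_density_matrix(H, n_occ, max_iter=100, tol=1e-10):
    """Project onto the n_occ lowest states using only matrix products."""
    n = H.shape[0]
    diag = np.diag(H)
    radii = np.sum(np.abs(H), axis=1) - np.abs(diag)   # Gershgorin radii
    e_min, e_max = np.min(diag - radii), np.max(diag + radii)
    # Map the spectrum into [0, 1] with occupied states near 1
    X = (e_max * np.eye(n) - H) / (e_max - e_min)
    for _ in range(max_iter):
        X2 = X @ X
        if abs(np.trace(X) - np.trace(X2)) < tol:       # idempotency reached
            break
        # Squaring lowers the trace; 2X - X^2 raises it
        X = X2 if np.trace(X) > n_occ else 2.0 * X - X2
    return X

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((6, 6)))
levels = np.array([-2.0, -1.8, -1.5, 1.0, 1.3, 2.0])   # gap between 3rd and 4th level
H = Q @ np.diag(levels) @ Q.T
D = sp2_density_matrix(H, n_occ=3)
D_ref = Q[:, :3] @ Q[:, :3].T                          # projector from diagonalization
```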
Strong Tracking Spherical Simplex-Radial Cubature Kalman Filter for Maneuvering Target Tracking.

PubMed

Liu, Hua; Wu, Wen

2017-03-31

Conventional spherical simplex-radial cubature Kalman filter (SSRCKF) for maneuvering target tracking may decline in accuracy and even diverge when a target makes abrupt state changes. To overcome this problem, a novel algorithm named strong tracking spherical simplex-radial cubature Kalman filter (STSSRCKF) is proposed in this paper. The proposed algorithm uses the spherical simplex-radial (SSR) rule to obtain a higher accuracy than cubature Kalman filter (CKF) algorithm. Meanwhile, by introducing strong tracking filter (STF) into SSRCKF and modifying the predicted states' error covariance with a time-varying fading factor, the gain matrix is adjusted on line so that the robustness of the filter and the capability of dealing with uncertainty factors is improved. In this way, the proposed algorithm has the advantages of both STF's strong robustness and SSRCKF's high accuracy. Finally, a maneuvering target tracking problem with abrupt state changes is used to test the performance of the proposed filter. Simulation results show that the STSSRCKF algorithm can get better estimation accuracy and greater robustness for maneuvering target tracking.
Strong Tracking Spherical Simplex-Radial Cubature Kalman Filter for Maneuvering Target Tracking

PubMed Central

Liu, Hua; Wu, Wen

2017-01-01

Conventional spherical simplex-radial cubature Kalman filter (SSRCKF) for maneuvering target tracking may decline in accuracy and even diverge when a target makes abrupt state changes. To overcome this problem, a novel algorithm named strong tracking spherical simplex-radial cubature Kalman filter (STSSRCKF) is proposed in this paper. The proposed algorithm uses the spherical simplex-radial (SSR) rule to obtain a higher accuracy than cubature Kalman filter (CKF) algorithm. Meanwhile, by introducing strong tracking filter (STF) into SSRCKF and modifying the predicted states’ error covariance with a time-varying fading factor, the gain matrix is adjusted on line so that the robustness of the filter and the capability of dealing with uncertainty factors is improved. In this way, the proposed algorithm has the advantages of both STF’s strong robustness and SSRCKF’s high accuracy. Finally, a maneuvering target tracking problem with abrupt state changes is used to test the performance of the proposed filter. Simulation results show that the STSSRCKF algorithm can get better estimation accuracy and greater robustness for maneuvering target tracking. PMID:28362347
Amesos2 and Belos: Direct and Iterative Solvers for Large Sparse Linear Systems

DOE PAGES

Bavier, Eric; Hoemmen, Mark; Rajamanickam, Sivasankaran; ...

2012-01-01

Solvers for large sparse linear systems come in two categories: direct and iterative. Amesos2, a package in the Trilinos software project, provides direct methods, and Belos, another Trilinos package, provides iterative methods. Amesos2 offers a common interface to many different sparse matrix factorization codes, and can handle any implementation of sparse matrices and vectors, via an easy-to-extend C++ traits interface. It can also factor matrices whose entries have arbitrary “Scalar” type, enabling extended-precision and mixed-precision algorithms. Belos includes many different iterative methods for solving large sparse linear systems and least-squares problems. Unlike competing iterative solver libraries, Belos completely decouples the algorithms from the implementations of the underlying linear algebra objects. This lets Belos exploit the latest hardware without changes to the code. Belos favors algorithms that solve higher-level problems, such as multiple simultaneous linear systems and sequences of related linear systems, faster than standard algorithms. The package also supports extended-precision and mixed-precision algorithms. Together, Amesos2 and Belos form a complete suite of sparse linear solvers.
Speech enhancement on smartphone voice recording

NASA Astrophysics Data System (ADS)

Tris Atmaja, Bagus; Nur Farid, Mifta; Arifianto, Dhany

2016-11-01

Speech enhancement is a challenging task in audio signal processing: it aims to improve the quality of a target speech signal while suppressing other noises. Speech enhancement algorithms have developed rapidly, from spectral subtraction, Wiener filtering, and the spectral amplitude MMSE estimator to non-negative matrix factorization (NMF). The smartphone, as a revolutionary device, is now used in all aspects of life, including journalism, both personally and professionally. Although many smartphones have two microphones (main and rear), only the main microphone is widely used for voice recording. This is why the NMF algorithm is widely used for this purpose of speech enhancement. This paper evaluates speech enhancement on smartphone voice recordings using the algorithms mentioned above. We also extend the NMF algorithm to Kullback-Leibler NMF with supervised separation. The last algorithm shows improved results compared to the others in spectrogram and PESQ score evaluation.
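For readers unfamiliar with Kullback-Leibler NMF, the sketch below shows the standard multiplicative updates that factor a nonnegative matrix V (for example a magnitude spectrogram) as V ≈ W H. It is a generic unsupervised version run on synthetic data; the supervised separation step, the dictionary training, and the audio front end used in the paper are not shown, and all names are hypothetical.

```python
import numpy as np

def kl_divergence(V, WH, eps=1e-12):
    """Generalized KL divergence D(V || WH)."""
    return float(np.sum(V * np.log((V + eps) / (WH + eps)) - V + WH))

def kl_nmf(V, rank, n_iter=200, eps=1e-12, seed=0):
    """Multiplicative updates that decrease D(V || WH)."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, rank)) + eps
    H = rng.random((rank, n)) + eps
    for _ in range(n_iter):
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)
        WH = W @ H + eps
        W *= ((V / WH) @ H.T) / (H.sum(axis=1)[None, :] + eps)
    return W, H

rng = np.random.default_rng(42)
V = rng.random((20, 4)) @ rng.random((4, 30))   # synthetic nonnegative "spectrogram"
W1, H1 = kl_nmf(V, rank=4, n_iter=1)
W, H = kl_nmf(V, rank=4, n_iter=200)
d_start = kl_divergence(V, W1 @ H1)
d_end = kl_divergence(V, W @ H)
```

In a supervised setting, one common arrangement is to learn the columns of W for speech and noise separately beforehand and update only H on the noisy recording.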
ORACLS: A system for linear-quadratic-Gaussian control law design

NASA Technical Reports Server (NTRS)

Armstrong, E. S.

1978-01-01

A modern control theory design package (ORACLS) for constructing controllers and optimal filters for systems modeled by linear time-invariant differential or difference equations is described. Numerical linear-algebra procedures are used to implement the linear-quadratic-Gaussian (LQG) methodology of modern control theory. Algorithms are included for computing eigensystems of real matrices, the relative stability of a matrix, factored forms for nonnegative definite matrices, the solutions and least squares approximations to the solutions of certain linear matrix algebraic equations, the controllability properties of a linear time-invariant system, and the steady state covariance matrix of an open-loop stable system forced by white noise. Subroutines are provided for solving both the continuous and discrete optimal linear regulator problems with noise free measurements and the sampled-data optimal linear regulator problem. For measurement noise, duality theory and the optimal regulator algorithms are used to solve the continuous and discrete Kalman-Bucy filter problems. Subroutines are also included which give control laws causing the output of a system to track the output of a prescribed model.
A General Exponential Framework for Dimensionality Reduction.

PubMed

Wang, Su-Jing; Yan, Shuicheng; Yang, Jian; Zhou, Chun-Guang; Fu, Xiaolan

2014-02-01

As a general framework, Laplacian embedding, based on a pairwise similarity matrix, infers low dimensional representations from high dimensional data. However, it generally suffers from three issues: 1) algorithmic performance is sensitive to the size of neighbors; 2) the algorithm encounters the well known small sample size (SSS) problem; and 3) the algorithm de-emphasizes small distance pairs. To address these issues, here we propose exponential embedding using matrix exponential and provide a general framework for dimensionality reduction. In the framework, the matrix exponential can be roughly interpreted by the random walk over the feature similarity matrix, and thus is more robust. The positive definite property of matrix exponential deals with the SSS problem. The behavior of the decay function of exponential embedding is more significant in emphasizing small distance pairs. Under this framework, we apply matrix exponential to extend many popular Laplacian embedding algorithms, e.g., locality preserving projections, unsupervised discriminant projections, and marginal fisher analysis. Experiments conducted on the synthesized data, UCI, and the Georgia Tech face database show that the proposed new framework can well address the issues mentioned above.
Improving the Numerical Stability of Fast Matrix Multiplication

DOE PAGES

Ballard, Grey; Benson, Austin R.; Druinsky, Alex; ...

2016-10-04

Fast algorithms for matrix multiplication, namely those that perform asymptotically fewer scalar operations than the classical algorithm, have been considered primarily of theoretical interest. Apart from Strassen's original algorithm, few fast algorithms have been efficiently implemented or used in practical applications. However, there exist many practical alternatives to Strassen's algorithm with varying performance and numerical properties. Fast algorithms are known to be numerically stable, but because their error bounds are slightly weaker than the classical algorithm, they are not used even in cases where they provide a performance benefit. We argue in this study that the numerical sacrifice of fast algorithms, particularly for the typical use cases of practical algorithms, is not prohibitive, and we explore ways to improve the accuracy both theoretically and empirically. The numerical accuracy of fast matrix multiplication depends on properties of the algorithm and of the input matrices, and we consider both contributions independently. We generalize and tighten previous error analyses of fast algorithms and compare their properties. We discuss algorithmic techniques for improving the error guarantees from two perspectives: manipulating the algorithms, and reducing input anomalies by various forms of diagonal scaling. In conclusion, we benchmark performance and demonstrate our improved numerical accuracy.
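Strassen's algorithm, the reference point of this abstract, can be written in a few lines. The sketch assumes square matrices, recurses only while the dimension is even and above a cutoff, and falls back to the classical product otherwise. It is meant to show where the seven sub-products come from, not to reproduce the alternative algorithms or the scaling techniques studied in the paper.

```python
import numpy as np

def strassen(A, B, cutoff=64):
    """Multiply square matrices with 7 recursive products instead of 8."""
    n = A.shape[0]
    if n <= cutoff or n % 2:
        return A @ B                      # classical product at the base case
    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
    M1 = strassen(A11 + A22, B11 + B22, cutoff)
    M2 = strassen(A21 + A22, B11, cutoff)
    M3 = strassen(A11, B12 - B22, cutoff)
    M4 = strassen(A22, B21 - B11, cutoff)
    M5 = strassen(A11 + A12, B22, cutoff)
    M6 = strassen(A21 - A11, B11 + B12, cutoff)
    M7 = strassen(A12 - A22, B21 + B22, cutoff)
    return np.block([[M1 + M4 - M5 + M7, M3 + M5],
                     [M2 + M4, M1 - M2 + M3 + M6]])

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 8))
B = rng.standard_normal((8, 8))
C_fast = strassen(A, B, cutoff=2)         # two levels of recursion
C_ref = A @ B
max_diff = float(np.max(np.abs(C_fast - C_ref)))
```

The additions and subtractions of blocks before each product are what weaken the error bound relative to the classical algorithm, and the cutoff limits how many recursive levels contribute to that growth.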
Muscle-tendon units localization and activation level analysis based on high-density surface EMG array and NMF algorithm

NASA Astrophysics Data System (ADS)

Huang, Chengjun; Chen, Xiang; Cao, Shuai; Zhang, Xu

2016-12-01

Objective. Some skeletal muscles can be subdivided into smaller segments called muscle-tendon units (MTUs). The purpose of this paper is to propose a framework to locate the active region of the corresponding MTUs within a single skeletal muscle and to analyze the activation level variations of different MTUs during a dynamic motion task. Approach. Biceps brachii and gastrocnemius were selected as targeted muscles and three dynamic motion tasks were designed and studied. Eight healthy male subjects participated in the data collection experiments, and 128-channel surface electromyographic (sEMG) signals were collected with a high-density sEMG electrode grid (a grid consists of 8 rows and 16 columns). Then the sEMG envelopes matrix was factorized into a matrix of weighting vectors and a matrix of time-varying coefficients by nonnegative matrix factorization algorithm. Main results. The experimental results demonstrated that the weighting vectors, which represent the invariant pattern of muscle activity across all channels, could be used to estimate the location of MTUs and the time-varying coefficients could be used to depict the variation of MTUs activation level during dynamic motion task. Significance. The proposed method provides one way to analyze in-depth the functional state of MTUs during dynamic tasks and thus can be employed on multiple noteworthy sEMG-based applications such as muscle force estimation, muscle fatigue research and the control of myoelectric prostheses. This work was supported by the National Nature Science Foundation of China under Grant 61431017 and 61271138.
Exploiting Symmetry on Parallel Architectures.

NASA Astrophysics Data System (ADS)

Stiller, Lewis Benjamin

1995-01-01

This thesis describes techniques for the design of parallel programs that solve well-structured problems with inherent symmetry. Part I demonstrates the reduction of such problems to generalized matrix multiplication by a group-equivariant matrix. Fast techniques for this multiplication are described, including factorization, orbit decomposition, and Fourier transforms over finite groups. Our algorithms entail interaction between two symmetry groups: one arising at the software level from the problem's symmetry and the other arising at the hardware level from the processors' communication network. Part II illustrates the applicability of our symmetry-exploitation techniques by presenting a series of case studies of the design and implementation of parallel programs. First, a parallel program that solves chess endgames by factorization of an associated dihedral group-equivariant matrix is described. This code runs faster than previous serial programs, and it discovered a number of results. Second, parallel algorithms for Fourier transforms for finite groups are developed, and preliminary parallel implementations for group transforms of dihedral and of symmetric groups are described. Applications in learning, vision, pattern recognition, and statistics are proposed. Third, parallel implementations solving several computational science problems are described, including the direct n-body problem, convolutions arising from molecular biology, and some communication primitives such as broadcast and reduce. Some of our implementations ran orders of magnitude faster than previous techniques, and were used in the investigation of various physical phenomena.
Modifying a numerical algorithm for solving the matrix equation X + AX^T B = C

NASA Astrophysics Data System (ADS)

Vorontsov, Yu. O.

2013-06-01

Certain modifications are proposed for a numerical algorithm solving the matrix equation X + AX^T B = C. By keeping the intermediate results in storage and repeatedly using them, it is possible to reduce the total complexity of the algorithm from O(n^4) to O(n^3) arithmetic operations.
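To make the equation concrete, the brute-force sketch below treats X ↦ X + A Xᵀ B as a linear map on the n² entries of X, builds its matrix column by column, and solves the resulting dense system. This costs O(n⁶) operations and is only a reference solver for small n, assuming square matrices and a uniquely solvable equation; it is not the algorithm discussed in the paper, whose details the abstract does not give.

```python
import numpy as np

def solve_x_plus_axtb(A, B, C):
    """Solve X + A @ X.T @ B = C by assembling the n^2 x n^2 linear system."""
    n = C.shape[0]
    M = np.empty((n * n, n * n))
    for k in range(n * n):
        E = np.zeros(n * n)
        E[k] = 1.0
        E = E.reshape(n, n)                       # k-th basis matrix
        M[:, k] = (E + A @ E.T @ B).ravel()       # image of the basis matrix
    return np.linalg.solve(M, C.ravel()).reshape(n, n)

rng = np.random.default_rng(0)
n = 4
A = 0.1 * rng.standard_normal((n, n))             # small A, B keep the map close to identity
B = 0.1 * rng.standard_normal((n, n))
X_true = rng.standard_normal((n, n))
C = X_true + A @ X_true.T @ B
X = solve_x_plus_axtb(A, B, C)
```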

An implicit iterative algorithm with a tuning parameter for Itô Lyapunov matrix equations

NASA Astrophysics Data System (ADS)

Zhang, Ying; Wu, Ai-Guo; Sun, Hui-Jie

2018-01-01

In this paper, an implicit iterative algorithm is proposed for solving a class of Lyapunov matrix equations arising in Itô stochastic linear systems. A tuning parameter is introduced in this algorithm, and thus the convergence rate of the algorithm can be changed. Some conditions are presented such that the developed algorithm is convergent. In addition, an explicit expression is also derived for the optimal tuning parameter, which guarantees that the obtained algorithm achieves its fastest convergence rate. Finally, numerical examples are employed to illustrate the effectiveness of the given algorithm.
Multivariable frequency domain identification via 2-norm minimization

NASA Technical Reports Server (NTRS)

Bayard, David S.

1992-01-01

The author develops a computational approach to multivariable frequency domain identification, based on 2-norm minimization. In particular, a Gauss-Newton (GN) iteration is developed to minimize the 2-norm of the error between frequency domain data and a matrix fraction transfer function estimate. To improve the global performance of the optimization algorithm, the GN iteration is initialized using the solution to a particular sequentially reweighted least squares problem, denoted as the SK iteration. The least squares problems which arise from both the SK and GN iterations are shown to involve sparse matrices with identical block structure. A sparse matrix QR factorization method is developed to exploit the special block structure, and to efficiently compute the least squares solution. A numerical example involving the identification of a multiple-input multiple-output (MIMO) plant having 286 unknown parameters is given to illustrate the effectiveness of the algorithm.
Phase diagram of matrix compressed sensing

NASA Astrophysics Data System (ADS)

Schülke, Christophe; Schniter, Philip; Zdeborová, Lenka

2016-12-01

In the problem of matrix compressed sensing, we aim to recover a low-rank matrix from a few noisy linear measurements. In this contribution, we analyze the asymptotic performance of a Bayes-optimal inference procedure for a model where the matrix to be recovered is a product of random matrices. The results that we obtain using the replica method describe the state evolution of the Parametric Bilinear Generalized Approximate Message Passing (P-BiG-AMP) algorithm, recently introduced in J. T. Parker and P. Schniter [IEEE J. Select. Top. Signal Process. 10, 795 (2016), 10.1109/JSTSP.2016.2539123]. We show the existence of two different types of phase transition and their implications for the solvability of the problem, and we compare the results of our theoretical analysis to the numerical performance reached by P-BiG-AMP. Remarkably, the asymptotic replica equations for matrix compressed sensing are the same as those for a related but formally different problem of matrix factorization.
Distributed Matrix Completion: Applications to Cooperative Positioning in Noisy Environments

DTIC Science & Technology

2013-12-11

positioning, and a gossip version of low-rank approximation were developed. A convex relaxation for positioning in the presence of noise was shown...computing the leading eigenvectors of a large data matrix through gossip algorithms. A new algorithm is proposed that amounts to iteratively multiplying...generalization of gossip algorithms for consensus. The algorithms outperform state-of-the-art methods in a communication-limited scenario. Positioning via
Decoding algorithm for vortex communications receiver

NASA Astrophysics Data System (ADS)

Kupferman, Judy; Arnon, Shlomi

2018-01-01

Vortex light beams can provide a tremendous alphabet for encoding information. We derive a symbol decoding algorithm for a direct detection matrix detector vortex beam receiver using Laguerre Gauss (LG) modes, and develop a mathematical model of symbol error rate (SER) for this receiver. We compare SER as a function of signal to noise ratio (SNR) for our algorithm and for the Pearson correlation algorithm. To our knowledge, this is the first comprehensive treatment of a decoding algorithm of a matrix detector for an LG receiver.
A parallel algorithm for the eigenvalues and eigenvectors for a general complex matrix

NASA Technical Reports Server (NTRS)

Shroff, Gautam

1989-01-01

A new parallel Jacobi-like algorithm is developed for computing the eigenvalues of a general complex matrix. Most parallel methods for this problem typically display only linear convergence. Sequential norm-reducing algorithms also exist and they display quadratic convergence in most cases. The new algorithm is a parallel form of the norm-reducing algorithm due to Eberlein. It is proven that the asymptotic convergence rate of this algorithm is quadratic. Numerical experiments are presented which demonstrate the quadratic convergence of the algorithm, and certain situations where the convergence is slow are also identified. The algorithm promises to be very competitive on a variety of parallel architectures.
Massive data compression for parameter-dependent covariance matrices

NASA Astrophysics Data System (ADS)

Heavens, Alan F.; Sellentin, Elena; de Mijolla, Damien; Vianello, Alvise

2017-12-01

We show how the massive data compression algorithm MOPED can be used to reduce, by orders of magnitude, the number of simulated data sets which are required to estimate the covariance matrix required for the analysis of Gaussian-distributed data. This is relevant when the covariance matrix cannot be calculated directly. The compression is especially valuable when the covariance matrix varies with the model parameters. In this case, it may be prohibitively expensive to run enough simulations to estimate the full covariance matrix throughout the parameter space. This compression may be particularly valuable for the next generation of weak lensing surveys, such as proposed for Euclid and Large Synoptic Survey Telescope, for which the number of summary data (such as band power or shear correlation estimates) is very large, ∼10^4, due to the large number of tomographic redshift bins which the data will be divided into. In the pessimistic case where the covariance matrix is estimated separately for all points in a Monte Carlo Markov Chain analysis, this may require an unfeasible 10^9 simulations. We show here that MOPED can reduce this number by a factor of 1000, or a factor of ∼10^6 if some regularity in the covariance matrix is assumed, reducing the number of simulations required to a manageable 10^3, making an otherwise intractable analysis feasible.
A Tensor Product Formulation of Strassen's Matrix Multiplication Algorithm with Memory Reduction

DOE PAGES

Kumar, B.; Huang, C. -H.; Sadayappan, P.; ...

1995-01-01

In this article, we present a program generation strategy of Strassen's matrix multiplication algorithm using a programming methodology based on tensor product formulas. In this methodology, block recursive programs such as the fast Fourier transform and Strassen's matrix multiplication algorithm are expressed as algebraic formulas involving tensor products and other matrix operations. Such formulas can be systematically translated to high-performance parallel/vector codes for various architectures. In this article, we present a nonrecursive implementation of Strassen's algorithm for shared memory vector processors such as the Cray Y-MP. A previous implementation of Strassen's algorithm synthesized from tensor product formulas required working storage of size O(7^n) for multiplying 2^n × 2^n matrices. We present a modified formulation in which the working storage requirement is reduced to O(4^n). The modified formulation exhibits sufficient parallelism for efficient implementation on a shared memory multiprocessor. Performance results on a Cray Y-MP8/64 are presented.
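As an illustration of the block-recursive structure mentioned in the abstract above, the following is a minimal sketch of the textbook recursive form of Strassen's algorithm for 2^n × 2^n matrices. It is not the nonrecursive tensor-product program described in the article; the function names (`add`, `sub`, `split`, `strassen`) are hypothetical and chosen only for this example. Each level performs 7 half-size products instead of 8, which is where the 7^n count of scalar multiplications comes from.

```python
def add(A, B):
    # Element-wise sum of two equally sized matrices (lists of rows).
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def sub(A, B):
    # Element-wise difference of two equally sized matrices.
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def split(A):
    # Split a square matrix of even order into its four quadrants.
    h = len(A) // 2
    return ([r[:h] for r in A[:h]], [r[h:] for r in A[:h]],
            [r[:h] for r in A[h:]], [r[h:] for r in A[h:]])

def strassen(A, B):
    # Assumes A and B are square with order a power of two.
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]]
    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)
    # The seven recursive products of Strassen's scheme.
    M1 = strassen(add(A11, A22), add(B11, B22))
    M2 = strassen(add(A21, A22), B11)
    M3 = strassen(A11, sub(B12, B22))
    M4 = strassen(A22, sub(B21, B11))
    M5 = strassen(add(A11, A12), B22)
    M6 = strassen(sub(A21, A11), add(B11, B12))
    M7 = strassen(sub(A12, A22), add(B21, B22))
    # Recombine the products into the quadrants of C = A * B.
    C11 = add(sub(add(M1, M4), M5), M7)
    C12 = add(M3, M5)
    C21 = add(M2, M4)
    C22 = add(add(sub(M1, M2), M3), M6)
    top = [left + right for left, right in zip(C11, C12)]
    bottom = [left + right for left, right in zip(C21, C22)]
    return top + bottom
```

Calling `strassen(A, B)` on two 4 × 4 integer matrices should reproduce the ordinary matrix product; the intermediate matrices `M1` to `M7` are the temporaries whose storage a nonrecursive formulation has to manage.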
A simple suboptimal least-squares algorithm for attitude determination with multiple sensors

NASA Technical Reports Server (NTRS)

Brozenec, Thomas F.; Bender, Douglas J.

1994-01-01

Three-axis attitude determination is equivalent to finding a coordinate transformation matrix which transforms a set of reference vectors fixed in inertial space to a set of measurement vectors fixed in the spacecraft. The attitude determination problem can be expressed as a constrained optimization problem. The constraint is that a coordinate transformation matrix must be proper, real, and orthogonal. A transformation matrix can be thought of as optimal in the least-squares sense if it maps the measurement vectors to the reference vectors with minimal 2-norm errors and meets the above constraint. This constrained optimization problem is known as Wahba's problem. Several algorithms which solve Wahba's problem exactly have been developed and used. These algorithms, while steadily improving, are all rather complicated. Furthermore, they involve such numerically unstable or sensitive operations as matrix determinant, matrix adjoint, and Newton-Raphson iterations. This paper describes an algorithm which minimizes Wahba's loss function, but without the constraint. When the constraint is ignored, the problem can be solved by a straightforward, numerically stable least-squares algorithm such as QR decomposition. Even though the algorithm does not explicitly take the constraint into account, it still yields a nearly orthogonal matrix for most practical cases; orthogonality only becomes corrupted when the sensor measurements are very noisy, on the same order of magnitude as the attitude rotations. The algorithm can be simplified if the attitude rotations are small enough so that the approximation sin(theta) ≈ theta holds. We then compare the computational requirements for several well-known algorithms. For the general large-angle case, the QR least-squares algorithm is competitive with all other known algorithms and faster than most. If attitude rotations are small, the least-squares algorithm can be modified to run faster, and this modified algorithm is faster than all but a similarly specialized version of the QUEST algorithm. We also introduce a novel measurement averaging technique which reduces the n-measurement case to the two-measurement case for our particular application, a star tracker and earth sensor mounted on an earth-pointed geosynchronous communications satellite. Using this technique, many n-measurement problems reduce to less than or equal to 3 measurements; this reduces the amount of required calculation without significant degradation in accuracy. Finally, we present the results of some tests which compare the least-squares algorithm with the QUEST and FOAM algorithms in the two-measurement case. For our example case, all three algorithms performed with similar accuracy.
Fast Electromagnetic Solvers for Large-Scale Naval Scattering Problems

DTIC Science & Technology

2008-09-27

IEEE Trans. Antennas Propag., vol. 52, no. 8, pp. 2141–2146, 2004. [12] R. J. Burkholder and J. F. Lee, “Fast dual-MGS block-factorization algorithm...Golub and C. F. V. Loan, Matrix Computations. Baltimore: The Johns Hopkins University Press, 1996. [20] W. D. Li, W. Hong, and H. X. Zhou, “Integral
Efficient block preconditioned eigensolvers for linear response time-dependent density functional theory

DOE Office of Scientific and Technical Information (OSTI.GOV)

Vecharynski, Eugene; Brabec, Jiri; Shao, Meiyue

Within this paper, we present two efficient iterative algorithms for solving the linear response eigenvalue problem arising from the time dependent density functional theory. Although the matrix to be diagonalized is nonsymmetric, it has a special structure that can be exploited to save both memory and floating point operations. In particular, the nonsymmetric eigenvalue problem can be transformed into an eigenvalue problem that involves the product of two matrices M and K. We show that, because MK is self-adjoint with respect to the inner product induced by the matrix K, this product eigenvalue problem can be solved efficiently by a modified Davidson algorithm and a modified locally optimal block preconditioned conjugate gradient (LOBPCG) algorithm that make use of the K-inner product. Additionally, the solution of the product eigenvalue problem yields one component of the eigenvector associated with the original eigenvalue problem. We show that the other component of the eigenvector can be easily recovered in an inexpensive postprocessing procedure. As a result, the algorithms we present here become more efficient than existing methods that try to approximate both components of the eigenvectors simultaneously. In particular, our numerical experiments demonstrate that the new algorithms presented here consistently outperform the existing state-of-the-art Davidson type solvers by a factor of two in both solution time and storage.
Efficient block preconditioned eigensolvers for linear response time-dependent density functional theory

DOE PAGES

Vecharynski, Eugene; Brabec, Jiri; Shao, Meiyue; ...

2017-12-01

In this article, we present two efficient iterative algorithms for solving the linear response eigenvalue problem arising from the time dependent density functional theory. Although the matrix to be diagonalized is nonsymmetric, it has a special structure that can be exploited to save both memory and floating point operations. In particular, the nonsymmetric eigenvalue problem can be transformed into an eigenvalue problem that involves the product of two matrices M and K. We show that, because MK is self-adjoint with respect to the inner product induced by the matrix K, this product eigenvalue problem can be solved efficiently by a modified Davidson algorithm and a modified locally optimal block preconditioned conjugate gradient (LOBPCG) algorithm that make use of the K-inner product. The solution of the product eigenvalue problem yields one component of the eigenvector associated with the original eigenvalue problem. We show that the other component of the eigenvector can be easily recovered in an inexpensive postprocessing procedure. As a result, the algorithms we present here become more efficient than existing methods that try to approximate both components of the eigenvectors simultaneously. In particular, our numerical experiments demonstrate that the new algorithms presented here consistently outperform the existing state-of-the-art Davidson type solvers by a factor of two in both solution time and storage.
Efficient block preconditioned eigensolvers for linear response time-dependent density functional theory

DOE PAGES

Vecharynski, Eugene; Brabec, Jiri; Shao, Meiyue; ...

2017-08-24

Within this paper, we present two efficient iterative algorithms for solving the linear response eigenvalue problem arising from the time dependent density functional theory. Although the matrix to be diagonalized is nonsymmetric, it has a special structure that can be exploited to save both memory and floating point operations. In particular, the nonsymmetric eigenvalue problem can be transformed into an eigenvalue problem that involves the product of two matrices M and K. We show that, because MK is self-adjoint with respect to the inner product induced by the matrix K, this product eigenvalue problem can be solved efficiently by a modified Davidson algorithm and a modified locally optimal block preconditioned conjugate gradient (LOBPCG) algorithm that make use of the K-inner product. Additionally, the solution of the product eigenvalue problem yields one component of the eigenvector associated with the original eigenvalue problem. We show that the other component of the eigenvector can be easily recovered in an inexpensive postprocessing procedure. As a result, the algorithms we present here become more efficient than existing methods that try to approximate both components of the eigenvectors simultaneously. In particular, our numerical experiments demonstrate that the new algorithms presented here consistently outperform the existing state-of-the-art Davidson type solvers by a factor of two in both solution time and storage.
Design of Robust Adaptive Unbalance Response Controllers for Rotors with Magnetic Bearings

NASA Technical Reports Server (NTRS)

Knospe, Carl R.; Tamer, Samir M.; Fedigan, Stephen J.

1996-01-01

Experimental results have recently demonstrated that an adaptive open loop control strategy can be highly effective in the suppression of unbalance induced vibration on rotors supported in active magnetic bearings. This algorithm, however, relies upon a predetermined gain matrix. Typically, this matrix is determined by an optimal control formulation resulting in the choice of the pseudo-inverse of the nominal influence coefficient matrix as the gain matrix. This solution may result in problems with stability and performance robustness since the estimated influence coefficient matrix is not equal to the actual influence coefficient matrix. Recently, analysis tools have been developed to examine the robustness of this control algorithm with respect to structured uncertainty. Herein, these tools are extended to produce a design procedure for determining the adaptive law's gain matrix. The resulting control algorithm has a guaranteed convergence rate and steady state performance in spite of the uncertainty in the rotor system. Several examples are presented which demonstrate the effectiveness of this approach and its advantages over the standard optimal control formulation.
Accuracy and speed in computing the Chebyshev collocation derivative

NASA Technical Reports Server (NTRS)

Don, Wai-Sun; Solomonoff, Alex

1991-01-01

We study several algorithms for computing the Chebyshev spectral derivative and compare their roundoff error. For a large number of collocation points, the elements of the Chebyshev differentiation matrix, if constructed in the usual way, are not computed accurately. A subtle cause is found to account for the poor accuracy when computing the derivative by the matrix-vector multiplication method. Methods for accurately computing the elements of the matrix are presented, and we find that if the entries of the matrix are computed accurately, the roundoff error of the matrix-vector multiplication is as small as that of the transform-recursion algorithm. Results of CPU time usage are shown for several different algorithms for computing the derivative by the Chebyshev collocation method for a wide variety of two-dimensional grid sizes on both an IBM and a Cray 2 computer. We find that which algorithm is fastest on a particular machine depends not only on the grid size but also on small details of the computer hardware. For most practical grid sizes used in computation, the even-odd decomposition algorithm is found to be faster than the transform-recursion method.
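To make the matrix-vector multiplication method concrete, here is a minimal sketch of the standard Chebyshev differentiation matrix on the Gauss-Lobatto points x_j = cos(pi*j/N). The off-diagonal entries use the usual closed form; the diagonal is filled with the commonly used "negative row sum" device, which is one known way to limit roundoff and is an assumption of this sketch, not necessarily the technique the authors propose. The function name `cheb_diff_matrix` is hypothetical.

```python
import math

def cheb_diff_matrix(N):
    # Chebyshev-Gauss-Lobatto collocation points x_0 = 1, ..., x_N = -1.
    x = [math.cos(math.pi * j / N) for j in range(N + 1)]
    # Endpoint weights c_0 = c_N = 2, interior weights 1.
    c = [2.0 if j in (0, N) else 1.0 for j in range(N + 1)]
    D = [[0.0] * (N + 1) for _ in range(N + 1)]
    for i in range(N + 1):
        for j in range(N + 1):
            if i != j:
                # Standard off-diagonal formula.
                D[i][j] = (c[i] / c[j]) * (-1) ** (i + j) / (x[i] - x[j])
        # Diagonal chosen so each row sums to zero, since the derivative
        # of a constant function must vanish.
        D[i][i] = -sum(D[i][j] for j in range(N + 1) if j != i)
    return x, D
```

Multiplying `D` by the vector of samples f(x_j) approximates f'(x_j); for polynomials of degree at most N the result is exact up to roundoff, which gives a simple check on how accurately the entries were formed.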
A physiologically motivated sparse, compact, and smooth (SCS) approach to EEG source localization.

PubMed

Cao, Cheng; Akalin Acar, Zeynep; Kreutz-Delgado, Kenneth; Makeig, Scott

2012-01-01

Here, we introduce a novel approach to the EEG inverse problem based on the assumption that principal cortical sources of multi-channel EEG recordings may be assumed to be spatially sparse, compact, and smooth (SCS). To enforce these characteristics of solutions to the EEG inverse problem, we propose a correlation-variance model which factors a cortical source space covariance matrix into the multiplication of a pre-given correlation coefficient matrix and the square root of the diagonal variance matrix learned from the data under a Bayesian learning framework. We tested the SCS method using simulated EEG data with various SNR and applied it to a real ECOG data set. We compare the results of SCS to those of an established SBL algorithm.
[Orthogonal Vector Projection Algorithm for Spectral Unmixing].

PubMed

Song, Mei-ping; Xu, Xing-wei; Chang, Chein-I; An, Ju-bai; Yao, Li

2015-12-01

Spectral unmixing is an important part of hyperspectral technologies and is essential for material quantity analysis in hyperspectral imagery. Most linear unmixing algorithms require matrix multiplication and matrix inversion or the computation of matrix determinants. These operations are difficult to program and especially hard to realize in hardware. At the same time, the computational cost of the algorithms increases significantly as the number of endmembers grows. Here, based on the traditional Orthogonal Subspace Projection algorithm, a new method called Orthogonal Vector Projection is proposed using the orthogonality principle. It simplifies the process by avoiding matrix multiplication and inversion. It first computes the final orthogonal vector via the Gram-Schmidt process for each endmember spectrum. These orthogonal vectors are then used as projection vectors for the pixel signature. The unconstrained abundance can be obtained directly by projecting the signature onto the projection vectors and computing the ratio of the projected vector length to the orthogonal vector length. Compared to the Orthogonal Subspace Projection and Least Squares Error algorithms, this method does not need matrix inversion, which is computationally costly and hard to implement in hardware. It completes the orthogonalization process by repeated vector operations alone, which makes it easy to apply in both parallel computation and hardware. The validity of the algorithm is proved by its relationship with the Orthogonal Subspace Projection and Least Squares Error algorithms. Its computational complexity is also compared with that of the other two algorithms and is the lowest of the three. Finally, experimental results on synthetic and real images are provided as further evidence of the effectiveness of the method.
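The projection step described above can be sketched in a few lines. This is an illustrative reading of the abstract, not the authors' code: for each endmember, Gram-Schmidt is used to extract the part of its spectrum orthogonal to all other endmembers, and the unconstrained abundance is the ratio of the pixel's projection onto that vector to the vector's own squared length. The helper names (`dot`, `orthogonal_component`, `ovp_abundances`) are hypothetical, and the endmembers are assumed to be linearly independent.

```python
def dot(u, v):
    # Euclidean inner product of two equal-length vectors.
    return sum(a * b for a, b in zip(u, v))

def orthogonal_component(target, others):
    # Run Gram-Schmidt over `others` followed by `target`; the last vector
    # produced is the part of `target` orthogonal to span(others).
    basis = []
    for v in others + [target]:
        w = list(v)
        for b in basis:
            coeff = dot(w, b) / dot(b, b)
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        basis.append(w)
    return basis[-1]

def ovp_abundances(endmembers, pixel):
    # One unconstrained abundance per endmember, using only vector operations.
    abundances = []
    for i, m in enumerate(endmembers):
        others = endmembers[:i] + endmembers[i + 1:]
        q = orthogonal_component(m, others)
        # Projected length of the pixel on q divided by the length of q.
        abundances.append(dot(pixel, q) / dot(q, q))
    return abundances
```

For a noise-free pixel that is an exact linear mixture of the endmembers, `ovp_abundances` should return the mixing coefficients, because every other endmember is orthogonal to `q` and drops out of the projection.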
Convergence Results on Iteration Algorithms to Linear Systems

PubMed Central

Wang, Zhuande; Yang, Chuansheng; Yuan, Yubo

2014-01-01

To solve large-scale linear systems, backward and Jacobi iteration algorithms are employed. Convergence is the most important issue. In this paper, a unified backward iterative matrix is proposed. It is shown that some well-known iterative algorithms can be deduced from it. The most important result is that the convergence results have been proved. Firstly, the spectral radius of the Jacobi iterative matrix is positive and that of the backward iterative matrix is strongly positive (larger than a positive constant). Secondly, the two iterations have the same convergence behavior (they converge or diverge simultaneously). Finally, some numerical experiments show that the proposed algorithms are correct and have the merit of backward methods. PMID:24991640
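For reference, the classical Jacobi iteration mentioned in the abstract can be sketched as follows. This is the textbook scheme x_new[i] = (b[i] - sum over j != i of A[i][j] * x[j]) / A[i][i], not the unified backward iteration proposed in the paper; the function name `jacobi` and its tolerance parameters are assumptions made for this example.

```python
def jacobi(A, b, tol=1e-12, max_iter=500):
    # Solve A x = b by Jacobi iteration, starting from the zero vector.
    # Convergence is guaranteed, for example, when A is strictly
    # diagonally dominant.
    n = len(b)
    x = [0.0] * n
    for _ in range(max_iter):
        x_new = [
            (b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
            for i in range(n)
        ]
        # Stop when successive iterates agree to within `tol`.
        if max(abs(p - q) for p, q in zip(x_new, x)) < tol:
            return x_new
        x = x_new
    raise RuntimeError("Jacobi iteration did not converge")
```

Whether the loop converges is governed by the spectral radius of the Jacobi iteration matrix, the quantity the abstract's convergence results are stated in terms of.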
Partitioning sparse matrices with eigenvectors of graphs

NASA Technical Reports Server (NTRS)

Pothen, Alex; Simon, Horst D.; Liou, Kang-Pu

1990-01-01

The problem of computing a small vertex separator in a graph arises in the context of computing a good ordering for the parallel factorization of sparse, symmetric matrices. An algebraic approach for computing vertex separators is considered in this paper. It is shown that lower bounds on separator sizes can be obtained in terms of the eigenvalues of the Laplacian matrix associated with a graph. The Laplacian eigenvectors of grid graphs can be computed from Kronecker products involving the eigenvectors of path graphs, and these eigenvectors can be used to compute good separators in grid graphs. A heuristic algorithm is designed to compute a vertex separator in a general graph by first computing an edge separator in the graph from an eigenvector of the Laplacian matrix, and then using a maximum matching in a subgraph to compute the vertex separator. Results on the quality of the separators computed by the spectral algorithm are presented, and these are compared with separators obtained from other algorithms for computing separators. Finally, the time required to compute the Laplacian eigenvector is reported, and the accuracy with which the eigenvector must be computed to obtain good separators is considered. The spectral algorithm has the advantage that it can be implemented on a medium-size multiprocessor in a straightforward manner.

V2.2 L2AS Detailed Release Description April 15, 2002

Atmospheric Science Data Center

2013-03-14

... 'optically thick atmosphere' algorithm. Implement new experimental aerosol retrieval algorithm over homogeneous surface types. ... Change values: cloud_mask_decision_matrix(1,1): .true. -> .false. cloud_mask_decision_matrix(2,1): .true. -> .false. ...
Attitude determination using vector observations: A fast optimal matrix algorithm

NASA Technical Reports Server (NTRS)

Markley, F. Landis

1993-01-01

The attitude matrix minimizing Wahba's loss function is computed directly by a method that is competitive with the fastest known algorithm for finding this optimal estimate. The method also provides an estimate of the attitude error covariance matrix. Analysis of the special case of two vector observations identifies those cases for which the TRIAD or algebraic method minimizes Wahba's loss function.
The bilinear complexity and practical algorithms for matrix multiplication

NASA Astrophysics Data System (ADS)

Smirnov, A. V.

2013-12-01

A method for deriving bilinear algorithms for matrix multiplication is proposed. New estimates for the bilinear complexity of a number of problems of the exact and approximate multiplication of rectangular matrices are obtained. In particular, the estimate for the boundary rank of multiplying 3 × 3 matrices is improved and a practical algorithm for the exact multiplication of square n × n matrices is proposed. The asymptotic arithmetic complexity of this algorithm is O(n^2.7743).
An Algorithm for the Hierarchical Organization of Path Diagrams and Calculation of Components of Expected Covariance.

ERIC Educational Resources Information Center

Boker, Steven M.; McArdle, J. J.; Neale, Michael

2002-01-01

Presents an algorithm for the production of a graphical diagram from a matrix formula in such a way that its components are logically and hierarchically arranged. The algorithm, which relies on the matrix equations of J. McArdle and R. McDonald (1984), calculates the individual path components of expected covariance between variables and…
Link predication based on matrix factorization by fusion of multi class organizations of the network.

PubMed

Jiao, Pengfei; Cai, Fei; Feng, Yiding; Wang, Wenjun

2017-08-21

Link prediction aims at forecasting the latent or unobserved edges in complex networks and has a wide range of real-world applications. Almost all existing methods and models take advantage of only one class of organization of the network, and so lose important information hidden in its other organizations. In this paper, we propose a link prediction framework, called NMF^3 here, which makes the best of the structure of networks at different levels of organization based on nonnegative matrix factorization. We first map the observed network into another space by kernel functions, which yields the organizations of different orders. Then we combine the adjacency matrix of the network with one of the other organizations, which gives the objective function of our framework for link prediction based on nonnegative matrix factorization. Third, we derive an iterative algorithm to optimize the objective function, which converges to a local optimum, and we propose a fast optimization strategy for large networks. Lastly, we test the proposed framework based on two kernel functions on a series of real-world networks under different sizes of training set, and the experimental results show the feasibility, effectiveness, and competitiveness of the proposed framework.
A projected preconditioned conjugate gradient algorithm for computing many extreme eigenpairs of a Hermitian matrix [A projected preconditioned conjugate gradient algorithm for computing a large eigenspace of a Hermitian matrix]

DOE PAGES

Vecharynski, Eugene; Yang, Chao; Pask, John E.

2015-02-25

Here, we present an iterative algorithm for computing an invariant subspace associated with the algebraically smallest eigenvalues of a large sparse or structured Hermitian matrix A. We are interested in the case in which the dimension of the invariant subspace is large (e.g., over several hundreds or thousands) even though it may still be small relative to the dimension of A. These problems arise from, for example, density functional theory (DFT) based electronic structure calculations for complex materials. The key feature of our algorithm is that it performs fewer Rayleigh–Ritz calculations compared to existing algorithms such as the locally optimal block preconditioned conjugate gradient or the Davidson algorithm. It is a block algorithm, and hence can take advantage of efficient BLAS3 operations and be implemented with multiple levels of concurrency. We discuss a number of practical issues that must be addressed in order to implement the algorithm efficiently on a high performance computer.
Unified treatment of microscopic boundary conditions and efficient algorithms for estimating tangent operators of the homogenized behavior in the computational homogenization method

NASA Astrophysics Data System (ADS)

Nguyen, Van-Dung; Wu, Ling; Noels, Ludovic

2017-03-01

This work provides a unified treatment of arbitrary kinds of microscopic boundary conditions usually considered in the multi-scale computational homogenization method for nonlinear multi-physics problems. An efficient procedure is developed to enforce the multi-point linear constraints arising from the microscopic boundary condition either by the direct constraint elimination or by the Lagrange multiplier elimination methods. The macroscopic tangent operators are computed in an efficient way from a multiple right hand sides linear system whose left hand side matrix is the stiffness matrix of the microscopic linearized system at the converged solution. The number of vectors at the right hand side is equal to the number of the macroscopic kinematic variables used to formulate the microscopic boundary condition. As the resolution of the microscopic linearized system often follows a direct factorization procedure, the computation of the macroscopic tangent operators is then performed using this factorized matrix at a reduced computational time.
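The idea of reusing one factorization for several right-hand sides, as described in the abstract above, can be illustrated with a small dense LU sketch. This is a generic example under stated assumptions, not the authors' homogenization code: the names `lu_factor` and `lu_solve` are defined locally here (they are not library calls), and the matrix `K` in the usage note is a made-up stand-in for a microscopic stiffness matrix.

```python
def lu_factor(A):
    # In-place style LU factorization with partial pivoting on a copy of A.
    # Returns the packed factors and the row permutation.
    n = len(A)
    LU = [row[:] for row in A]
    perm = list(range(n))
    for k in range(n):
        # Choose the largest pivot in column k to limit roundoff growth.
        p = max(range(k, n), key=lambda r: abs(LU[r][k]))
        if LU[p][k] == 0.0:
            raise ValueError("matrix is singular")
        LU[k], LU[p] = LU[p], LU[k]
        perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            LU[i][k] /= LU[k][k]
            for j in range(k + 1, n):
                LU[i][j] -= LU[i][k] * LU[k][j]
    return LU, perm

def lu_solve(LU, perm, b):
    # Solve A x = b using factors from lu_factor; costs O(n^2) per vector.
    n = len(b)
    y = [b[p] for p in perm]
    for i in range(n):
        # Forward substitution with the unit lower triangle.
        y[i] -= sum(LU[i][j] * y[j] for j in range(i))
    for i in reversed(range(n)):
        # Back substitution with the upper triangle.
        y[i] = (y[i] - sum(LU[i][j] * y[j] for j in range(i + 1, n))) / LU[i][i]
    return y
```

In use, `LU, perm = lu_factor(K)` is called once, and `lu_solve(LU, perm, rhs)` is then called for each right-hand side, one per macroscopic kinematic variable in the setting of the abstract. The O(n^3) factorization cost is paid a single time and each additional vector costs only two triangular solves.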
Attributed community mining using joint general non-negative matrix factorization with graph Laplacian

NASA Astrophysics Data System (ADS)

Chen, Zigang; Li, Lixiang; Peng, Haipeng; Liu, Yuhong; Yang, Yixian

2018-04-01

Community mining for complex social networks with link and attribute information plays an important role in many applications. In this paper, building on the general non-negative matrix factorization (GNMF) algorithm without dimension matching constraints proposed in our previous work, we propose the joint GNMF with graph Laplacian (LJGNMF) to implement community mining of complex social networks with link and attribute information according to different application needs. Theoretical derivation shows that the proposed LJGNMF is fully compatible with previous methods of integrating traditional NMF and symmetric NMF. In addition, experimental results show that the proposed LJGNMF can meet the needs of different community mining tasks by adjusting its parameters, and that it performs better than traditional NMF in terms of the community vertices attributes entropy.
A parallel algorithm for 2D visco-acoustic frequency-domain full-waveform inversion: application to a dense OBS data set

NASA Astrophysics Data System (ADS)

Sourbier, F.; Operto, S.; Virieux, J.

2006-12-01

We present a distributed-memory parallel algorithm for 2D visco-acoustic full-waveform inversion of wide-angle seismic data. Our code is written in fortran90 and use MPI for parallelism. The algorithm was applied to real wide-angle data set recorded by 100 OBSs with a 1-km spacing in the eastern-Nankai trough (Japan) to image the deep structure of the subduction zone. Full-waveform inversion is applied sequentially to discrete frequencies by proceeding from the low to the high frequencies. The inverse problem is solved with a classic gradient method. Full-waveform modeling is performed with a frequency-domain finite-difference method. In the frequency-domain, solving the wave equation requires resolution of a large unsymmetric system of linear equations. We use the massively parallel direct solver MUMPS (http://www.enseeiht.fr/irit/apo/MUMPS) for distributed-memory computer to solve this system. The MUMPS solver is based on a multifrontal method for the parallel factorization. The MUMPS algorithm is subdivided in 3 main steps: a symbolic analysis step that performs re-ordering of the matrix coefficients to minimize the fill-in of the matrix during the subsequent factorization and an estimation of the assembly tree of the matrix. Second, the factorization is performed with dynamic scheduling to accomodate numerical pivoting and provides the LU factors distributed over all the processors. Third, the resolution is performed for multiple sources. To compute the gradient of the cost function, 2 simulations per shot are required (one to compute the forward wavefield and one to back-propagate residuals). The multi-source resolutions can be performed in parallel with MUMPS. In the end, each processor stores in core a sub-domain of all the solutions. These distributed solutions can be exploited to compute in parallel the gradient of the cost function. 
Since the gradient of the cost function is a weighted stack of the shot and residual solutions of MUMPS, each processor computes the corresponding sub-domain of the gradient. In the end, the gradient is centralized on the master processor using a collective communication. The gradient is scaled by the diagonal elements of the Hessian matrix. This scaling is computed only once per frequency, before the first iteration of the inversion. Estimation of the diagonal terms of the Hessian requires performing one simulation per non-redundant shot and receiver position. The same strategy as the one used for the gradient is used to compute the diagonal Hessian in parallel. This algorithm was applied to a dense wide-angle data set recorded by 100 OBSs in the eastern Nankai trough, offshore Japan. Thirteen frequencies ranging from 3 to 15 Hz were inverted. Twenty iterations per frequency were computed, leading to 260 tomographic velocity models of increasing resolution. The velocity model dimensions are 105 km x 25 km, corresponding to a finite-difference grid of 4201 x 1001 points with a 25-m grid interval. The number of shots was 1005 and the number of inverted OBS gathers was 93. The inversion requires 20 days on 6 32-bit bi-processor nodes with 4 Gbytes of RAM per node when only the LU factorization is performed in parallel. Preliminary estimates of the time required to perform the inversion with the fully parallelized code are 6 and 4 days using 20 and 50 processors, respectively.
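The "factorize once, solve for many sources" pattern described in this abstract can be illustrated with a small sketch. It is not MUMPS and not the authors' Fortran 90 code: it is a dense, serial LU factorization with partial pivoting in plain Python, and the matrix `system` and the right-hand sides in `sources` are made-up values used only to show that the factors are computed once and reused for every source.

```python
def lu_factor(a):
    """Factor a square matrix with partial pivoting; return (lu, perm)."""
    n = len(a)
    lu = [row[:] for row in a]          # work on a copy
    perm = list(range(n))               # perm[i] = original row now at position i
    for k in range(n):
        # Choose the largest entry in column k as pivot (numerical pivoting).
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0:
            raise ValueError("matrix is singular")
        if p != k:
            lu[k], lu[p] = lu[p], lu[k]
            perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]        # multiplier, stored in the L part
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, perm


def lu_solve(lu, perm, b):
    """Solve A x = b using factors from lu_factor (no refactorization)."""
    n = len(lu)
    y = [b[perm[i]] for i in range(n)]  # apply the row permutation to b
    for i in range(n):                  # forward substitution with unit L
        for j in range(i):
            y[i] -= lu[i][j] * y[j]
    x = y[:]
    for i in reversed(range(n)):        # back substitution with U
        for j in range(i + 1, n):
            x[i] -= lu[i][j] * x[j]
        x[i] /= lu[i][i]
    return x


# Hypothetical small unsymmetric system and two "sources" (right-hand sides).
system = [[4.0, -1.0, 0.0], [-2.0, 4.0, -1.0], [0.0, -2.0, 4.0]]
sources = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]

factors, permutation = lu_factor(system)             # done once
solutions = [lu_solve(factors, permutation, b) for b in sources]
```

The expensive step is `lu_factor`; each additional source costs only one forward and one back substitution, which is why a direct solver suits problems with many shots.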
Matched field localization based on CS-MUSIC algorithm

NASA Astrophysics Data System (ADS)

Guo, Shuangle; Tang, Ruichun; Peng, Linhui; Ji, Xiaopeng

2016-04-01

The problems caused by too few or too many snapshots and by coherent sources in underwater acoustic positioning are considered. A matched field localization algorithm based on CS-MUSIC (Compressive Sensing Multiple Signal Classification) is proposed, built on the sparse mathematical model of underwater positioning. The signal matrix is calculated through the SVD (Singular Value Decomposition) of the observation matrix. The observation matrix in the sparse mathematical model is replaced by the signal matrix, and a new concise sparse mathematical model is obtained, which reduces not only the scale of the localization problem but also the noise level; the new sparse mathematical model is then solved by the CS-MUSIC algorithm, which is a combination of the CS (Compressive Sensing) method and the MUSIC (Multiple Signal Classification) method. The algorithm proposed in this paper can effectively overcome the difficulties caused by correlated sources and a shortage of snapshots, and it can also reduce the time complexity and noise level of the localization problem by using the SVD of the observation matrix when the number of snapshots is large, as is proved in this paper.
Implementation of the SU(2) Hamiltonian Symmetry for the DMRG Algorithm

DOE Office of Scientific and Technical Information (OSTI.GOV)

Alvarez, Gonzalo

2012-01-01

In the Density Matrix Renormalization Group (DMRG) algorithm (White, 1992, 1993), Hamiltonian symmetries play an important role. Using symmetries, the matrix representation of the Hamiltonian can be blocked. Diagonalizing each matrix block is more efficient than diagonalizing the original matrix. This paper explains how the DMRG++ code (Alvarez, 2009) has been extended to handle the non-local SU(2) symmetry in a model-independent way. Improvements in CPU times compared to runs with only local symmetries are discussed for the one-orbital Hubbard model, and for a two-orbital Hubbard model for iron-based superconductors. The computational bottleneck of the algorithm and the use of shared memory parallelization are also addressed.
An approximately factored incremental strategy for calculating consistent discrete aerodynamic sensitivity derivatives

NASA Technical Reports Server (NTRS)

Korivi, V. M.; Taylor, A. C., III; Newman, P. A.; Hou, G. J.-W.; Jones, H. E.

1992-01-01

An incremental strategy is presented for iteratively solving very large systems of linear equations, which are associated with aerodynamic sensitivity derivatives for advanced CFD codes. It is shown that the left-hand side matrix operator and the well-known factorization algorithm used to solve the nonlinear flow equations can also be used to efficiently solve the linear sensitivity equations. Two airfoil problems are considered as examples: subsonic low Reynolds number laminar flow and transonic high Reynolds number turbulent flow.
Biclustering sparse binary genomic data.

PubMed

van Uitert, Miranda; Meuleman, Wouter; Wessels, Lodewyk

2008-12-01

Genomic datasets often consist of large, binary, sparse data matrices. In such a dataset, one is often interested in finding contiguous blocks that (mostly) contain ones. This is a biclustering problem, and while many algorithms have been proposed to deal with gene expression data, only two algorithms have been proposed that specifically deal with binary matrices. None of the gene expression biclustering algorithms can handle the large number of zeros in sparse binary matrices. The two proposed binary algorithms failed to produce meaningful results. In this article, we present a new algorithm that is able to extract biclusters from sparse, binary datasets. A powerful feature is that biclusters with different numbers of rows and columns can be detected, varying from many rows to few columns and few rows to many columns. It allows the user to guide the search towards biclusters of specific dimensions. When applying our algorithm to an input matrix derived from TRANSFAC, we find transcription factors with distinctly dissimilar binding motifs, but a clear set of common targets that are significantly enriched for GO categories.
Research on numerical algorithms for large space structures

NASA Technical Reports Server (NTRS)

Denman, E. D.

1981-01-01

Numerical algorithms for analysis and design of large space structures are investigated. The sign algorithm and its application to decoupling of differential equations are presented. The generalized sign algorithm is given and its application to several problems is discussed. The Laplace transforms of matrix functions and the diagonalization procedure for a finite element equation are discussed. The diagonalization of matrix polynomials is considered. The quadrature method and Laplace transforms are discussed, and the identification of linear systems by the quadrature method is investigated.
Affine Projection Algorithm with Improved Data-Selective Method Using the Condition Number

NASA Astrophysics Data System (ADS)

Ban, Sung Jun; Lee, Chang Woo; Kim, Sang Woo

Recently, a data-selective method has been proposed to achieve low misalignment in the affine projection algorithm (APA) by keeping the condition number of an input data matrix small. We present an improved method, and a complexity reduction algorithm for the APA with the data-selective method. Experimental results show that the proposed algorithm has lower misalignment and a lower condition number for an input data matrix than both the conventional APA and the APA with the previous data-selective method.
Fast matrix multiplication and its algebraic neighbourhood

NASA Astrophysics Data System (ADS)

Pan, V. Ya.

2017-11-01

Matrix multiplication is among the most fundamental operations of modern computations. By 1969 it was still commonly believed that the classical algorithm was optimal, although the experts already knew that this was not so. Worldwide interest in matrix multiplication instantly exploded in 1969, when Strassen decreased the exponent 3 of cubic time to 2.807. Then everyone expected to see matrix multiplication performed in quadratic or nearly quadratic time very soon. Further progress, however, turned out to be capricious. It was at stalemate for almost a decade, then a combination of surprising techniques (completely independent of Strassen's original ones and much more advanced) enabled a new decrease of the exponent in 1978-1981 and then again in 1986, to 2.376. By 2017 the exponent has still not passed through the barrier of 2.373, but most disturbing was the curse of recursion — even the decrease of exponents below 2.7733 required numerous recursive steps, and each of them squared the problem size. As a result, all algorithms supporting such exponents supersede the classical algorithm only for inputs of immense sizes, far beyond any potential interest for the user. We survey the long study of fast matrix multiplication, focusing on neglected algorithms for feasible matrix multiplication. We comment on their design, the techniques involved, implementation issues, the impact of their study on the modern theory and practice of Algebraic Computations, and perspectives for fast matrix multiplication. Bibliography: 163 titles.
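As a concrete reminder of where the exponent 2.807 comes from, the following is a minimal sketch of Strassen's recursion in plain Python. It replaces the 8 block products of the classical method with 7, giving an exponent of log2(7) ≈ 2.807. The sketch assumes square matrices stored as lists of lists; the `cutoff` parameter and the fallback to the classical method for odd sizes are illustrative choices, not something taken from the survey.

```python
def add(a, b):
    return [[x + y for x, y in zip(r, s)] for r, s in zip(a, b)]


def sub(a, b):
    return [[x - y for x, y in zip(r, s)] for r, s in zip(a, b)]


def classical(a, b):
    """Cubic-time product, used as the base case."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def strassen(a, b, cutoff=2):
    """Multiply square matrices with 7 recursive block products per level."""
    n = len(a)
    if n <= cutoff or n % 2:            # small or odd size: use the classical method
        return classical(a, b)
    h = n // 2
    a11 = [r[:h] for r in a[:h]]; a12 = [r[h:] for r in a[:h]]
    a21 = [r[:h] for r in a[h:]]; a22 = [r[h:] for r in a[h:]]
    b11 = [r[:h] for r in b[:h]]; b12 = [r[h:] for r in b[:h]]
    b21 = [r[:h] for r in b[h:]]; b22 = [r[h:] for r in b[h:]]
    m1 = strassen(add(a11, a22), add(b11, b22), cutoff)
    m2 = strassen(add(a21, a22), b11, cutoff)
    m3 = strassen(a11, sub(b12, b22), cutoff)
    m4 = strassen(a22, sub(b21, b11), cutoff)
    m5 = strassen(add(a11, a12), b22, cutoff)
    m6 = strassen(sub(a21, a11), add(b11, b12), cutoff)
    m7 = strassen(sub(a12, a22), add(b21, b22), cutoff)
    c11 = add(sub(add(m1, m4), m5), m7)
    c12 = add(m3, m5)
    c21 = add(m2, m4)
    c22 = add(add(sub(m1, m2), m3), m6)
    # Stitch the four blocks back together.
    return [r + s for r, s in zip(c11, c12)] + [r + s for r, s in zip(c21, c22)]


left = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
right = [[2, 0, 1, 3], [1, 1, 0, 2], [0, 4, 2, 1], [3, 2, 1, 0]]
product = strassen(left, right, cutoff=1)
```

In practice the cutoff is set much higher than here, because the extra additions only pay off for large blocks.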
Use of a quality improvement tool, the prioritization matrix, to identify and prioritize triage software algorithm enhancement.

PubMed

North, Frederick; Varkey, Prathiba; Caraballo, Pedro; Vsetecka, Darlene; Bartel, Greg

2007-10-11

Complex decision support software can require significant effort in maintenance and enhancement. A quality improvement tool, the prioritization matrix, was successfully used to guide software enhancement of algorithms in a symptom assessment call center.
Distance descending ordering method: An O(n) algorithm for inverting the mass matrix in simulation of macromolecules with long branches

NASA Astrophysics Data System (ADS)

Xu, Xiankun; Li, Peiwen

2017-11-01

Fixman's work in 1974 and the follow-up studies have developed a method that can factorize the inverse of the mass matrix into an arithmetic combination of three sparse matrices; one of them is positive definite and needs to be further factorized by using the Cholesky decomposition or similar methods. When the molecule under study has a serial chain structure, this method can achieve O(n) time complexity. However, for molecules with long branches, Cholesky decomposition of the corresponding positive definite matrix will introduce massive fill-in due to its nonzero structure. Although there are several methods that can be used to reduce the amount of fill-in, none of them could strictly guarantee zero fill-in for all molecules according to our tests, and thus O(n) time complexity cannot be obtained with these traditional methods. In this paper we present a new method that can guarantee no fill-in in the Cholesky decomposition, which was developed based on the correlations between the mass matrix and the geometrical structure of molecules. As a result, inverting the mass matrix retains O(n) time complexity, whether or not the molecule structure has long branches.
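The effect of ordering on Cholesky fill-in can be illustrated with a small symbolic elimination sketch. It assumes, purely for illustration, that the nonzero pattern of the positive definite matrix matches the bond connectivity of a branched (tree-shaped) molecule; the abstract does not state the actual pattern or the details of the authors' ordering. The tree in `bonds` and the helper names are hypothetical. Eliminating nodes in order of descending distance from the root produces no fill-in on such a tree, while eliminating the root first does.

```python
from collections import deque


def depths(adjacency, root):
    """Breadth-first distances (in bonds) from the root node."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        v = queue.popleft()
        for w in adjacency[v]:
            if w not in depth:
                depth[w] = depth[v] + 1
                queue.append(w)
    return depth


def fill_in(adjacency, order):
    """Count fill edges created by symbolic elimination in the given order."""
    adj = {v: set(nbrs) for v, nbrs in adjacency.items()}
    fill = 0
    for v in order:
        nbrs = sorted(adj[v])
        # Eliminating v couples all of its remaining neighbours pairwise.
        for i, a in enumerate(nbrs):
            for b in nbrs[i + 1:]:
                if b not in adj[a]:
                    adj[a].add(b)
                    adj[b].add(a)
                    fill += 1
        for a in nbrs:
            adj[a].discard(v)
        del adj[v]
    return fill


# Hypothetical branched molecule: root 0 with branches 0-1-2-3, 0-4-5 and 0-6.
bonds = {0: [1, 4, 6], 1: [0, 2], 2: [1, 3], 3: [2], 4: [0, 5], 5: [4], 6: [0]}

depth = depths(bonds, root=0)
descending = sorted(bonds, key=lambda v: -depth[v])   # farthest atoms first
root_first = sorted(bonds)                            # 0, 1, 2, ...

fill_descending = fill_in(bonds, descending)
fill_root_first = fill_in(bonds, root_first)
```

With the farthest-first order every node has at most one uneliminated neighbour (its parent) when it is eliminated, so no new nonzeros appear.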
Fluorescent quantification of terazosin hydrochloride content in human plasma and tablets using second-order calibration based on both parallel factor analysis and alternating penalty trilinear decomposition.

PubMed

Zou, Hong-Yan; Wu, Hai-Long; OuYang, Li-Qun; Zhang, Yan; Nie, Jin-Fang; Fu, Hai-Yan; Yu, Ru-Qin

2009-09-14

Two second-order calibration methods, based on parallel factor analysis (PARAFAC) and the alternating penalty trilinear decomposition (APTLD) method, have been utilized for the direct determination of terazosin hydrochloride (THD) in human plasma samples, coupled with excitation-emission matrix fluorescence spectroscopy. Meanwhile, the two algorithms combined with the standard addition procedure have been applied for the determination of terazosin hydrochloride in tablets, and the results were validated by high-performance liquid chromatography with fluorescence detection. These second-order calibrations all adequately exploited the second-order advantage. For human plasma samples, the average recoveries by the PARAFAC and APTLD algorithms with a factor number of 2 (N=2) were 100.4±2.7% and 99.2±2.4%, respectively. The accuracy of the two algorithms was also evaluated through elliptical joint confidence region (EJCR) tests and the t-test. It was found that both algorithms could give accurate results, and the performance of APTLD was only slightly better than that of PARAFAC. Figures of merit, such as sensitivity (SEN), selectivity (SEL) and limit of detection (LOD), were also calculated to compare the performances of the two strategies. For tablets, the average concentrations of THD were 63.5 and 63.2 ng mL(-1) by using the PARAFAC and APTLD algorithms, respectively. The accuracy was evaluated by the t-test, and both algorithms again gave accurate results.
FPGA implementation of sparse matrix algorithm for information retrieval

NASA Astrophysics Data System (ADS)

Bojanic, Slobodan; Jevtic, Ruzica; Nieto-Taladriz, Octavio

2005-06-01

Text information retrieval requires a tremendous amount of processing time because of the size of the data and the complexity of information retrieval algorithms. In this paper a solution to this problem is proposed via hardware-supported information retrieval algorithms. Reconfigurable computing can accommodate frequent hardware modifications through its tailorable hardware and exploits parallelism for a given application through reconfigurable and flexible hardware units. The degree of parallelism can be tuned to the data. In this work we implemented a standard BLAS (Basic Linear Algebra Subprograms) sparse matrix algorithm named Compressed Sparse Row (CSR), which is shown to be more efficient in terms of storage space and query-processing time than the other sparse matrix algorithms for information retrieval applications. Although the inverted index algorithm has been treated as the de facto standard for information retrieval for years, an alternative approach that stores the index of a text collection in a sparse matrix structure is gaining attention. This approach performs query processing using sparse matrix-vector multiplication and, thanks to parallelization, achieves substantially higher efficiency than the sequential inverted index. The parallel implementations of the information retrieval kernel presented in this work target the Virtex II Field Programmable Gate Array (FPGA) board from Xilinx. A recent development in scientific applications is the use of FPGAs to achieve high performance. Computational results are compared to implementations on other platforms. The design achieves a high level of parallelism for the overall function while retaining highly optimised hardware within each processing unit.
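The query-processing step described here, a sparse matrix-vector product over a CSR-stored index, can be sketched in software as follows. This is a plain Python illustration of the CSR layout and the multiplication loop, not the FPGA design; the small document-by-term matrix `index` and the query vector are invented for the example.

```python
def to_csr(dense):
    """Convert a dense row-major matrix to CSR arrays (values, col_idx, row_ptr)."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))     # where the next row starts
    return values, col_idx, row_ptr


def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = A x, touching only the stored nonzeros of A."""
    y = []
    for i in range(len(row_ptr) - 1):
        total = 0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            total += values[k] * x[col_idx[k]]
        y.append(total)
    return y


# Hypothetical index: rows are documents, columns are terms, entries are term counts.
index = [
    [2, 0, 0, 1],
    [0, 1, 0, 0],
    [0, 0, 3, 1],
]
query = [1, 0, 0, 1]                    # the query mentions terms 0 and 3

values, col_idx, row_ptr = to_csr(index)
scores = csr_matvec(values, col_idx, row_ptr, query)   # one score per document
```

Each row's dot product is independent of the others, which is the parallelism a hardware implementation can exploit.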

D-MATRIX: A web tool for constructing weight matrix of conserved DNA motifs

PubMed Central

Sen, Naresh; Mishra, Manoj; Khan, Feroz; Meena, Abha; Sharma, Ashok

2009-01-01

Despite considerable efforts to date, DNA motif prediction in whole genomes remains a challenge for researchers. Currently, genome-wide motif prediction tools require either a direct pattern sequence (for a single motif) or a weight matrix (for multiple motifs). Although there are known motif pattern databases and tools for genome-level prediction, there is no tool for weight matrix construction. Considering this, we developed the D-MATRIX tool, which predicts different types of weight matrix based on a user-defined aligned motif sequence set and motif width. To retrieve known motif sequences, the user can access commonly used databases such as TFD, RegulonDB, DBTBS and Transfac. The D-MATRIX program uses a simple statistical approach for weight matrix construction, and the matrix can be converted into different file formats according to user requirements. It provides the possibility to identify conserved motifs in co-regulated genes or a whole genome. As an example, we successfully constructed the weight matrix of the LexA transcription factor binding site with the help of known SOS-box cis-regulatory elements in the Deinococcus radiodurans genome. The algorithm is implemented in C-Sharp and wrapped in ASP.Net to maintain a user-friendly web interface. The D-MATRIX tool is accessible through the CIMAP domain network. Availability: http://203.190.147.116/dmatrix/ PMID:19759861
D-MATRIX: a web tool for constructing weight matrix of conserved DNA motifs.

PubMed

Sen, Naresh; Mishra, Manoj; Khan, Feroz; Meena, Abha; Sharma, Ashok

2009-07-27

Despite considerable efforts to date, DNA motif prediction in whole genomes remains a challenge for researchers. Currently, genome-wide motif prediction tools require either a direct pattern sequence (for a single motif) or a weight matrix (for multiple motifs). Although there are known motif pattern databases and tools for genome-level prediction, there is no tool for weight matrix construction. Considering this, we developed the D-MATRIX tool, which predicts different types of weight matrix based on a user-defined aligned motif sequence set and motif width. To retrieve known motif sequences, the user can access commonly used databases such as TFD, RegulonDB, DBTBS and Transfac. The D-MATRIX program uses a simple statistical approach for weight matrix construction, and the matrix can be converted into different file formats according to user requirements. It provides the possibility to identify conserved motifs in co-regulated genes or a whole genome. As an example, we successfully constructed the weight matrix of the LexA transcription factor binding site with the help of known SOS-box cis-regulatory elements in the Deinococcus radiodurans genome. The algorithm is implemented in C-Sharp and wrapped in ASP.Net to maintain a user-friendly web interface. The D-MATRIX tool is accessible through the CIMAP domain network. http://203.190.147.116/dmatrix/
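One common way to build a weight matrix from aligned motif sequences is sketched below. The abstract does not say which statistics D-MATRIX computes, so this is only a generic illustration: a per-position count matrix followed by a log-odds matrix. The pseudocount of 1, the uniform background of 0.25 and the four example sequences are all assumptions.

```python
import math


def count_matrix(sequences, alphabet="ACGT"):
    """Count each base at each position of equally long, aligned sequences."""
    width = len(sequences[0])
    if any(len(s) != width for s in sequences):
        raise ValueError("aligned sequences must share the motif width")
    counts = {base: [0] * width for base in alphabet}
    for seq in sequences:
        for pos, base in enumerate(seq.upper()):
            counts[base][pos] += 1
    return counts


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Turn counts into log2(frequency / background) weights per position."""
    bases = list(counts)
    width = len(counts[bases[0]])
    weights = {base: [] for base in bases}
    for pos in range(width):
        total = sum(counts[b][pos] for b in bases) + pseudocount * len(bases)
        for b in bases:
            freq = (counts[b][pos] + pseudocount) / total
            weights[b].append(math.log2(freq / background))
    return weights


# Hypothetical aligned binding-site sequences of width 4.
sites = ["ACGT", "ACGA", "ACTT", "ACGT"]

counts = count_matrix(sites)
weights = log_odds_matrix(counts)
```

Positive weights mark bases that are enriched at a position relative to the background, and negative weights mark depleted ones; scoring a candidate site means summing the weights of its bases.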
Least-Squares Approximation of an Improper by a Proper Correlation Matrix Using a Semi-Infinite Convex Program. Research Report 87-7.

ERIC Educational Resources Information Center

Knol, Dirk L.; ten Berge, Jos M. F.

An algorithm is presented for the best least-squares fitting correlation matrix approximating a given missing value or improper correlation matrix. The proposed algorithm is based on a solution for C. I. Mosier's oblique Procrustes rotation problem offered by J. M. F. ten Berge and K. Nevels (1977). It is shown that the minimization problem…
Recurrent procedure for constructing nonisotropic matrix elements of the collision integral of the nonlinear Boltzmann equation

NASA Astrophysics Data System (ADS)

Ender, I. A.; Bakaleinikov, L. A.; Flegontova, E. Yu.; Gerasimenko, A. B.

2017-08-01

We have proposed an algorithm for the sequential construction of nonisotropic matrix elements of the collision integral, which are required to solve the nonlinear Boltzmann equation using the moments method. The starting elements of the matrix are isotropic and assumed to be known. The algorithm can be used for an arbitrary law of interactions for any ratio of the masses of colliding particles.
Fast Compressive Tracking.

PubMed

Zhang, Kaihua; Zhang, Lei; Yang, Ming-Hsuan

2014-10-01

It is a challenging task to develop effective and efficient appearance models for robust object tracking due to factors such as pose variation, illumination change, occlusion, and motion blur. Existing online tracking algorithms often update models with samples from observations in recent frames. Although much success has been demonstrated, numerous issues remain to be addressed. First, while these adaptive appearance models are data-dependent, there is not a sufficient amount of data for online algorithms to learn from at the outset. Second, online tracking algorithms often encounter drift problems. As a result of self-taught learning, misaligned samples are likely to be added and degrade the appearance models. In this paper, we propose a simple yet effective and efficient tracking algorithm with an appearance model based on features extracted from a multiscale image feature space with a data-independent basis. The proposed appearance model employs non-adaptive random projections that preserve the structure of the image feature space of objects. A very sparse measurement matrix is constructed to efficiently extract the features for the appearance model. We compress sample images of the foreground target and the background using the same sparse measurement matrix. The tracking task is formulated as a binary classification via a naive Bayes classifier with online update in the compressed domain. A coarse-to-fine search strategy is adopted to further reduce the computational complexity in the detection procedure. The proposed compressive tracking algorithm runs in real time and performs favorably against state-of-the-art methods on challenging sequences in terms of efficiency, accuracy and robustness.
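A very sparse random measurement matrix of the kind mentioned here can be sketched as follows. The abstract does not give the construction, so the sketch assumes the standard sparse random projection in which each entry is +sqrt(s) or -sqrt(s) with probability 1/(2s) each and zero otherwise; the sparsity parameter `s`, the dimensions and the feature vectors are illustrative values only.

```python
import math
import random


def sparse_measurement_matrix(n_rows, n_cols, s, rng):
    """Return rows as lists of (column, weight) pairs; most entries are zero."""
    scale = math.sqrt(s)
    rows = []
    for _ in range(n_rows):
        row = []
        for j in range(n_cols):
            u = rng.random()
            if u < 1.0 / (2 * s):
                row.append((j, scale))      # +sqrt(s) with probability 1/(2s)
            elif u < 1.0 / s:
                row.append((j, -scale))     # -sqrt(s) with probability 1/(2s)
        rows.append(row)                    # zero with probability 1 - 1/s
    return rows


def compress(rows, features):
    """Project a high-dimensional feature vector to len(rows) measurements."""
    return [sum(w * features[j] for j, w in row) for row in rows]


rng = random.Random(0)
matrix = sparse_measurement_matrix(n_rows=5, n_cols=200, s=50, rng=rng)

# The same fixed matrix is applied to every sample (target and background).
sample_a = [float(i % 7) for i in range(200)]
sample_b = [float(i % 3) for i in range(200)]
compressed_a = compress(matrix, sample_a)
compressed_b = compress(matrix, sample_b)
```

Because only the nonzero entries are stored and used, compressing a sample costs a few additions per measurement rather than a full dense product.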
Dual energy CT with one full scan and a second sparse-view scan using structure preserving iterative reconstruction (SPIR)

NASA Astrophysics Data System (ADS)

Wang, Tonghe; Zhu, Lei

2016-09-01

Conventional dual-energy CT (DECT) reconstruction requires two full-size projection datasets with two different energy spectra. In this study, we propose an iterative algorithm to enable a new data acquisition scheme which requires one full scan and a second sparse-view scan for potential reduction in imaging dose and engineering cost of DECT. A bilateral filter is calculated as a similarity matrix from the first full-scan CT image to quantify the similarity between any two pixels, which is assumed unchanged on a second CT image since DECT scans are performed on the same object. The second CT image from reduced projections is reconstructed by an iterative algorithm which updates the image by minimizing the total variation of the difference between the image and its filtered image by the similarity matrix under a data fidelity constraint. As the redundant structural information of the two CT images is contained in the similarity matrix for CT reconstruction, we refer to the algorithm as structure preserving iterative reconstruction (SPIR). The proposed method is evaluated on both digital and physical phantoms, and is compared with the filtered-backprojection (FBP) method, the conventional total-variation-regularization-based algorithm (TVR) and prior-image-constrained-compressed-sensing (PICCS). SPIR with a second 10-view scan reduces the image noise STD by one order of magnitude with the same spatial resolution as the full-view FBP image. SPIR substantially improves over TVR on the reconstruction accuracy of a 10-view scan by decreasing the reconstruction error from 6.18% to 1.33%, and outperforms TVR at 50- and 20-view scans on spatial resolution with a higher frequency at the modulation transfer function value of 10% by an average factor of 4. Compared with the 20-view scan PICCS result, the SPIR image has 7 times lower noise STD with similar spatial resolution.
The electron density map obtained from the SPIR-based DECT images with a second 10-view scan has an average error of less than 1%.
Improving stochastic estimates with inference methods: calculating matrix diagonals.

PubMed

Selig, Marco; Oppermann, Niels; Ensslin, Torsten A

2012-02-01

Estimating the diagonal entries of a matrix that is not directly accessible, but only available as a linear operator in the form of a computer routine, is a common necessity in many computational applications, especially in image reconstruction and statistical inference. Here, methods of statistical inference are used to improve the accuracy or the computational costs of matrix probing methods to estimate matrix diagonals. In particular, the generalized Wiener filter methodology, as developed within information field theory, is shown to significantly improve estimates based on only a few sampling probes, in cases in which some form of continuity of the solution can be assumed. The strength, length scale, and precise functional form of the exploited autocorrelation function of the matrix diagonal are determined from the probes themselves. The developed algorithm is successfully applied to mock and real-world problems. These performance tests show that, in situations where a matrix diagonal has to be calculated from only a small number of computationally expensive probes, a speedup by a factor of 2 to 10 is possible with the proposed method. © 2012 American Physical Society
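The basic matrix probing estimator that this work sets out to improve can be sketched as follows. This shows only plain stochastic probing with random sign vectors, not the Wiener-filter refinement from the paper: for a probe z with independent ±1 entries, the expectation of z_i (A z)_i equals A_ii, so averaging over probes estimates the diagonal. The operator `apply_matrix` and its entries are made up for the example.

```python
import random


def probe_diagonal(apply_op, n, n_probes, rng):
    """Estimate diag(A) using only calls to apply_op(v) = A v."""
    acc = [0.0] * n
    for _ in range(n_probes):
        z = [rng.choice((-1.0, 1.0)) for _ in range(n)]   # random sign probe
        az = apply_op(z)
        for i in range(n):
            acc[i] += z[i] * az[i]          # unbiased sample of A[i][i]
    return [v / n_probes for v in acc]


# Hypothetical operator: the matrix is hidden behind a function call.
_hidden = [
    [2.0, 0.1, 0.0, 0.1],
    [0.1, 3.0, 0.1, 0.0],
    [0.0, 0.1, 4.0, 0.1],
    [0.1, 0.0, 0.1, 5.0],
]


def apply_matrix(v):
    return [sum(a * x for a, x in zip(row, v)) for row in _hidden]


estimate = probe_diagonal(apply_matrix, n=4, n_probes=2000, rng=random.Random(0))
```

The error of this plain estimator shrinks only with the square root of the number of probes, which is why exploiting smoothness of the diagonal, as the paper proposes, can save expensive operator calls.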
A weighted information criterion for multiple minor components and its adaptive extraction algorithms.

PubMed

Gao, Yingbin; Kong, Xiangyu; Zhang, Huihui; Hou, Li'an

2017-05-01

Minor components (MCs) play an important role in signal processing and data analysis, so developing MC extraction algorithms is valuable work. Based on the concepts of weighted subspace and optimum theory, a weighted information criterion is proposed for searching for the optimum solution of a linear neural network. This information criterion exhibits a unique global minimum, attained if and only if the state matrix is composed of the desired MCs of the autocorrelation matrix of an input signal. By using the gradient ascent method and the recursive least squares (RLS) method, two algorithms are developed for multiple MC extraction. The global convergence of the proposed algorithms is also analyzed by the Lyapunov method. The proposed algorithms can extract multiple MCs in parallel and have an advantage in dealing with high-dimensional matrices. Since the weighted matrix does not require an accurate value, it facilitates the system design of the proposed algorithms for practical applications. The speed and computation advantages of the proposed algorithms are verified through simulations. Copyright © 2017 Elsevier Ltd. All rights reserved.
Randomized Dynamic Mode Decomposition

NASA Astrophysics Data System (ADS)

Erichson, N. Benjamin; Brunton, Steven L.; Kutz, J. Nathan

2017-11-01

The dynamic mode decomposition (DMD) is an equation-free, data-driven matrix decomposition that is capable of providing accurate reconstructions of spatio-temporal coherent structures arising in dynamical systems. We present randomized algorithms to compute the near-optimal low-rank dynamic mode decomposition for massive datasets. Randomized algorithms are simple, accurate and able to ease the computational challenges arising with 'big data'. Moreover, randomized algorithms are amenable to modern parallel and distributed computing. The idea is to derive a smaller matrix from the high-dimensional input data matrix using randomness as a computational strategy. Then, the dynamic modes and eigenvalues are accurately learned from this smaller representation of the data, whereby the approximation quality can be controlled via oversampling and power iterations. Here, we present randomized DMD algorithms that are categorized by how many passes the algorithm takes through the data. Specifically, the single-pass randomized DMD does not require data to be stored for subsequent passes. Thus, it is possible to approximately decompose massive fluid flows (stored out of core memory, or not stored at all) using single-pass algorithms, which is infeasible with traditional DMD algorithms.
Optimizations for the EcoPod field identification tool

PubMed Central

Manoharan, Aswath; Stamberger, Jeannie; Yu, YuanYuan; Paepcke, Andreas

2008-01-01

Background: We sketch our species identification tool for palm-sized computers that helps knowledgeable observers with census activities. An algorithm turns an identification matrix into a minimal-length series of questions that guide the operator towards identification. Historic observation data from the census geographic area helps minimize question volume. We explore how much historic data is required to boost performance, and whether the use of history negatively impacts identification of rare species. We also explore how characteristics of the matrix interact with the algorithm, and how best to predict the probability of observing a previously unseen species. Results: Point counts of birds taken at Stanford University's Jasper Ridge Biological Preserve between 2000 and 2005 were used to examine the algorithm. A computer identified species by correctly answering and counting the algorithm's questions. We also explored how the character density of the key matrix and the theoretical minimum number of questions for each bird in the matrix influenced the algorithm. Our investigation of the required probability smoothing determined whether Laplace smoothing of observation probabilities was sufficient, or whether the more complex Good-Turing technique is required. Conclusion: Historic data improved identification speed, but only impacted the top 25% most frequently observed birds. For rare birds the history-based algorithms did not impose a noticeable penalty in the number of questions required for identification. For our dataset neither the age of the historic data nor the number of observation years impacted the algorithm. Density of characters for different taxa in the identification matrix did not impact the algorithms. Intrinsic differences in identifying different birds did affect the algorithm, but the differences affected the baseline method of not using historic data to exactly the same degree.
We found that Laplace smoothing performed better for rare species than Simple Good-Turing, and that, contrary to expectation, the technique did not then adversely affect identification performance for frequently observed birds. PMID:18366649
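Laplace (add-one) smoothing of observation probabilities, as referred to in this abstract, can be sketched in a few lines. The species names and counts below are invented for illustration, and `alpha` is the usual smoothing constant; the point is that a species never seen in the historic data still receives a small nonzero probability.

```python
def laplace_smooth(counts, alpha=1.0):
    """Return smoothed probabilities (count + alpha) / (total + alpha * K)."""
    total = sum(counts.values())
    n_species = len(counts)             # K includes species with zero sightings
    denom = total + alpha * n_species
    return {name: (c + alpha) / denom for name, c in counts.items()}


# Hypothetical historic point counts; "owl" is in the key but was never observed.
history = {"robin": 8, "jay": 2, "owl": 0}

probabilities = laplace_smooth(history)
```

With these counts the unsmoothed estimate for the unseen species would be exactly zero, which would prevent a history-guided question sequence from ever reaching it.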
Personalized recommendation via an improved NBI algorithm and user influence model in a Microblog network

NASA Astrophysics Data System (ADS)

Lian, Jie; Liu, Yun; Zhang, Zhen-jiang; Gui, Chang-ni

2013-10-01

Bipartite-network-based recommendations have attracted extensive attention in recent years. Recommendation in a Microblog network differs from traditional object-oriented recommendation in two crucial ways. One is that high-authority users or one's special friends usually play a very active role in tweet-oriented recommendation. The other is that the object in a Microblog network corresponds to a set of tweets on the same topic instead of an actual, single entity, e.g. goods or movies in traditional networks. Thus repeated recommendations of the tweets in one's collected topics are indispensable. Therefore, this paper improves the network based inference (NBI) algorithm by using the original link matrix and link weights in the resource allocation process. This paper finally proposes a Microblog recommendation model based on two factors: the improved network based inference and a user influence model. Adjusting the weights of these two factors can generate the best recommendation results in terms of algorithm accuracy and recommendation personalization.
Unsupervised Bayesian linear unmixing of gene expression microarrays.

PubMed

Bazot, Cécile; Dobigeon, Nicolas; Tourneret, Jean-Yves; Zaas, Aimee K; Ginsburg, Geoffrey S; Hero, Alfred O

2013-03-19

This paper introduces a new constrained model and the corresponding algorithm, called unsupervised Bayesian linear unmixing (uBLU), to identify biological signatures from high dimensional assays like gene expression microarrays. The basis for uBLU is a Bayesian model for the data samples which are represented as an additive mixture of random positive gene signatures, called factors, with random positive mixing coefficients, called factor scores, that specify the relative contribution of each signature to a specific sample. The particularity of the proposed method is that uBLU constrains the factor loadings to be non-negative and the factor scores to be probability distributions over the factors. Furthermore, it also provides estimates of the number of factors. A Gibbs sampling strategy is adopted here to generate random samples according to the posterior distribution of the factors, factor scores, and number of factors. These samples are then used to estimate all the unknown parameters. Firstly, the proposed uBLU method is applied to several simulated datasets with known ground truth and compared with previous factor decomposition methods, such as principal component analysis (PCA), non negative matrix factorization (NMF), Bayesian factor regression modeling (BFRM), and the gradient-based algorithm for general matrix factorization (GB-GMF). Secondly, we illustrate the application of uBLU on a real time-evolving gene expression dataset from a recent viral challenge study in which individuals have been inoculated with influenza A/H3N2/Wisconsin. We show that the uBLU method significantly outperforms the other methods on the simulated and real data sets considered here. The results obtained on synthetic and real data illustrate the accuracy of the proposed uBLU method when compared to other factor decomposition methods from the literature (PCA, NMF, BFRM, and GB-GMF). 
The uBLU method identifies an inflammatory component closely associated with clinical symptom scores collected during the study. Using a constrained model allows recovery of all the inflammatory genes in a single factor.
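The constraints described in this abstract can be written compactly. The following is only a sketch in notation assumed here, not taken from the paper: y_i is the i-th sample, m_r the r-th factor (gene signature), a_{r,i} the score of factor r in sample i, R the number of factors, and n_i an additive noise term whose form is an assumption.

```latex
\mathbf{y}_i = \sum_{r=1}^{R} a_{r,i}\,\mathbf{m}_r + \mathbf{n}_i,
\qquad \mathbf{m}_r \ge \mathbf{0},
\qquad a_{r,i} \ge 0,
\qquad \sum_{r=1}^{R} a_{r,i} = 1 .
```

The sum-to-one condition is what makes each sample's factor scores a probability distribution over the factors.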
AMLSA Algorithm for Hybrid Precoding in Millimeter Wave MIMO Systems

NASA Astrophysics Data System (ADS)

Liu, Fulai; Sun, Zhenxing; Du, Ruiyan; Bai, Xiaoyu

2017-10-01

In this paper, an effective algorithm will be proposed for hybrid precoding in mmWave MIMO systems, referred to as alternating minimization algorithm with the least squares amendment (AMLSA algorithm). To be specific, for the fully-connected structure, the presented algorithm is exploited to minimize the classical objective function and obtain the hybrid precoding matrix. It introduces an orthogonal constraint to the digital precoding matrix which is amended subsequently by the least squares after obtaining its alternating minimization iterative result. Simulation results confirm that the achievable spectral efficiency of our proposed algorithm is better to some extent than that of the existing algorithm without the least squares amendment. Furthermore, the number of iterations is reduced slightly via improving the initialization procedure.
Encryption and decryption algorithm using algebraic matrix approach

NASA Astrophysics Data System (ADS)

Thiagarajan, K.; Balasubramanian, P.; Nagaraj, J.; Padmashree, J.

2018-04-01

Cryptographic algorithms secure data against attacks during encryption and decryption. However, they are computationally intensive processes that consume a large amount of CPU time and space during encryption and decryption. The goal of this paper is to study the encryption and decryption algorithm and to find the space complexity of the encrypted and decrypted data. We encrypt and decrypt the message using a key with the help of a cyclic square matrix; the approach is applicable to any number of words, including long words with many characters. We also discuss the time complexity of the algorithm. The proposed algorithm is simple but difficult to break.
Enforced Sparse Non-Negative Matrix Factorization

DTIC Science & Technology

2016-01-23

documents to find interesting pieces of information. With limited resources, analysts often employ automated text - mining tools that highlight common...represented as an undirected bipartite graph. It has become a common method for generating topic models of text data because it is known to produce good results...model and the convergence rate of the underlying algorithm. I. Introduction A common analyst challenge is searching through large quantities of text
A fast new algorithm for a robot neurocontroller using inverse QR decomposition

DOE Office of Scientific and Technical Information (OSTI.GOV)

Morris, A.S.; Khemaissia, S.

2000-01-01

A new adaptive neural network controller for robots is presented. The controller is based on direct adaptive techniques. Unlike many neural network controllers in the literature, inverse dynamical model evaluation is not required. A numerically robust, computationally efficient processing scheme for neural network weight estimation is described, namely, the inverse QR decomposition (INVQR). The inverse QR decomposition and a weighted recursive least-squares (WRLS) method for neural network weight estimation are derived using Cholesky factorization of the data matrix. The algorithm that performs the efficient INVQR of the underlying space-time data matrix may be implemented in parallel on a triangular array. Furthermore, its systolic architecture is well suited for VLSI implementation. Another important benefit of the INVQR decomposition is that it solves directly for the time-recursive least-squares filter vector, while avoiding the sequential back-substitution step required by the QR decomposition approaches.
Comparison of two matrix data structures for advanced CSM testbed applications

NASA Technical Reports Server (NTRS)

Regelbrugge, M. E.; Brogan, F. A.; Nour-Omid, B.; Rankin, C. C.; Wright, M. A.

1989-01-01

The first section describes data storage schemes presently used by the Computational Structural Mechanics (CSM) testbed sparse matrix facilities and similar skyline (profile) matrix facilities. The second section contains a discussion of certain features required for the implementation of particular advanced CSM algorithms, and how these features might be incorporated into the data storage schemes described previously. The third section presents recommendations, based on the discussions of the prior sections, for directing future CSM testbed development to provide necessary matrix facilities for advanced algorithm implementation and use. The objective is to lend insight into the matrix structures discussed and to help explain the process of evaluating alternative matrix data structures and utilities for subsequent use in the CSM testbed.
Multiple-algorithm parallel fusion of infrared polarization and intensity images based on algorithmic complementarity and synergy

NASA Astrophysics Data System (ADS)

Zhang, Lei; Yang, Fengbao; Ji, Linna; Lv, Sheng

2018-01-01

Diverse image fusion methods perform differently. Each method has advantages and disadvantages compared with others. One notion is that the advantages of different image methods can be effectively combined. A multiple-algorithm parallel fusion method based on algorithmic complementarity and synergy is proposed. First, in view of the characteristics of the different algorithms and difference-features among images, an index vector-based feature-similarity is proposed to define the degree of complementarity and synergy. This proposed index vector is a reliable evidence indicator for algorithm selection. Second, the algorithms with a high degree of complementarity and synergy are selected. Then, the different degrees of various features and infrared intensity images are used as the initial weights for the nonnegative matrix factorization (NMF). This avoids randomness of the NMF initialization parameter. Finally, the fused images of different algorithms are integrated using the NMF because of its excellent data fusing performance on independent features. Experimental results demonstrate that the visual effect and objective evaluation index of the fused images obtained using the proposed method are better than those obtained using traditional methods. The proposed method retains all the advantages that individual fusion algorithms have.
Efficient multitasking of Choleski matrix factorization on CRAY supercomputers

NASA Technical Reports Server (NTRS)

Overman, Andrea L.; Poole, Eugene L.

1991-01-01

A Choleski method is described and used to solve linear systems of equations that arise in large scale structural analysis. The method uses a novel variable-band storage scheme and is structured to exploit fast local memory caches while minimizing data access delays between main memory and vector registers. Several parallel implementations of this method are described for the CRAY-2 and CRAY Y-MP computers demonstrating the use of microtasking and autotasking directives. A portable parallel language, FORCE, is used for comparison with the microtasked and autotasked implementations. Results are presented comparing the matrix factorization times for three representative structural analysis problems from runs made in both dedicated and multi-user modes on both computers. CPU and wall clock timings are given for the parallel implementations and are compared to single processor timings of the same algorithm.
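As a rough illustration of why a variable-band (skyline) layout helps a Choleski solver, the sketch below restricts the inner loops to each row's profile, that is, to the columns from the first nonzero entry up to the diagonal. It is not the CRAY implementation from the report: the function names are hypothetical, and the factor is kept in a dense NumPy array purely for readability, so only the work is reduced, not the storage.

```python
import numpy as np

def skyline_cholesky(A):
    """Return (L, first) with A = L @ L.T for a symmetric positive definite A.

    first[i] is the column of the first nonzero in row i of the lower
    triangle; entries left of it stay zero in L, so they are skipped.
    """
    n = A.shape[0]
    first = [int(np.flatnonzero(A[i, : i + 1])[0]) for i in range(n)]
    L = np.zeros((n, n), dtype=float)
    for i in range(n):
        for j in range(first[i], i + 1):
            k0 = max(first[i], first[j])          # overlap of the two row profiles
            s = A[i, j] - L[i, k0:j] @ L[j, k0:j]
            if i == j:
                L[i, i] = np.sqrt(s)
            else:
                L[i, j] = s / L[j, j]
    return L, first

def solve_spd(A, b):
    """Solve A x = b via the factor; dense triangular solves used for brevity."""
    L, _ = skyline_cholesky(A)
    y = np.linalg.solve(L, b)        # forward substitution step
    return np.linalg.solve(L.T, y)   # back substitution step

if __name__ == "__main__":
    # Small tridiagonal stiffness-like matrix (illustrative data only).
    A = np.array([[4.0, 1, 0, 0], [1, 4, 1, 0], [0, 1, 4, 1], [0, 0, 1, 4]])
    b = np.ones(4)
    x = solve_spd(A, b)
    print(np.allclose(A @ x, b))
```

Because fill-in during the factorization never falls outside the profile, a production code can store only the profile entries of each row.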
The impact of initialization procedures on unsupervised unmixing of hyperspectral imagery using the constrained positive matrix factorization

NASA Astrophysics Data System (ADS)

Masalmah, Yahya M.; Vélez-Reyes, Miguel

2007-04-01

The authors proposed in previous papers the use of the constrained Positive Matrix Factorization (cPMF) to perform unsupervised unmixing of hyperspectral imagery. Two iterative algorithms were proposed to compute the cPMF based on the Gauss-Seidel and penalty approaches to solve optimization problems. Results presented in previous papers have shown the potential of the proposed method to perform unsupervised unmixing in HYPERION and AVIRIS imagery. The performance of iterative methods is highly dependent on the initialization scheme. Good initialization schemes can improve convergence speed, whether or not a global minimum is found, and whether or not spectra with physical relevance are retrieved as endmembers. In this paper, different initializations using random selection, longest norm pixels, and standard endmembers selection routines are studied and compared using simulated and real data.
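To make the role of initialization concrete, here is a minimal sketch that runs a factorization from a caller-supplied starting point. It deliberately swaps in the plain Lee-Seung multiplicative updates for the Frobenius objective rather than the Gauss-Seidel or penalty cPMF algorithms of the paper, and the helper `init_largest_norm` is only one assumed reading of the "longest norm pixels" initialization. All names and the synthetic data are hypothetical.

```python
import numpy as np

def init_largest_norm(V, r):
    """Pick the r columns (pixels) of V with the largest Euclidean norm."""
    idx = np.argsort(np.linalg.norm(V, axis=0))[-r:]
    return V[:, idx].astype(float).copy()

def nmf_multiplicative(V, W0, H0, n_iters=200, eps=1e-12):
    """Nonnegative V ~= W @ H, starting from the given W0 and H0."""
    W = np.array(W0, dtype=float)
    H = np.array(H0, dtype=float)
    for _ in range(n_iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # update abundances
        W *= (V @ H.T) / (W @ H @ H.T + eps)   # update spectra
    return W, H

def frobenius_error(V, W, H):
    return float(np.linalg.norm(V - W @ H))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    V = rng.random((20, 3)) @ rng.random((3, 50))   # bands x pixels, synthetic
    H0 = rng.random((3, 50))
    for name, W0 in [("random", rng.random((20, 3))),
                     ("largest-norm pixels", init_largest_norm(V, 3))]:
        W, H = nmf_multiplicative(V, W0, H0, n_iters=100)
        print(name, frobenius_error(V, W, H))
```

Running the same number of iterations from different starting points gives a simple way to compare how quickly each initialization reduces the reconstruction error.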

Spectral Regularization Algorithms for Learning Large Incomplete Matrices.

PubMed

Mazumder, Rahul; Hastie, Trevor; Tibshirani, Robert

2010-03-01

We use convex relaxation techniques to provide a sequence of regularized low-rank solutions for large-scale matrix completion problems. Using the nuclear norm as a regularizer, we provide a simple and very efficient convex algorithm for minimizing the reconstruction error subject to a bound on the nuclear norm. Our algorithm Soft-Impute iteratively replaces the missing elements with those obtained from a soft-thresholded SVD. With warm starts this allows us to efficiently compute an entire regularization path of solutions on a grid of values of the regularization parameter. The computationally intensive part of our algorithm is in computing a low-rank SVD of a dense matrix. Exploiting the problem structure, we show that the task can be performed with a complexity linear in the matrix dimensions. Our semidefinite-programming algorithm is readily scalable to large matrices: for example it can obtain a rank-80 approximation of a 10^6 × 10^6 incomplete matrix with 10^5 observed entries in 2.5 hours, and can fit a rank 40 approximation to the full Netflix training set in 6.6 hours. Our methods show very good performance both in training and test error when compared to other competitive state-of-the-art techniques.
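The iteration described in the abstract (fill the missing entries with the current estimate, then soft-threshold the singular values) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it uses a full dense SVD from NumPy instead of the structured low-rank SVD that gives the paper its scalability, and the function name, arguments, and stopping rule are assumptions.

```python
import numpy as np

def soft_impute(X, observed, lam, n_iters=200, tol=1e-6):
    """Estimate a low-rank completion of X.

    X        : 2-D array; values at unobserved positions are ignored
    observed : boolean mask, True where X is known
    lam      : soft-threshold applied to the singular values
    """
    Z = np.zeros_like(X, dtype=float)
    for _ in range(n_iters):
        # Keep observed entries, fill the rest with the current estimate.
        filled = np.where(observed, X, Z)
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        s_thr = np.maximum(s - lam, 0.0)          # soft-thresholding
        Z_new = (U * s_thr) @ Vt
        converged = np.linalg.norm(Z_new - Z) <= tol * max(1.0, np.linalg.norm(Z))
        Z = Z_new
        if converged:
            break
    return Z

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    truth = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 20))
    mask = rng.random(truth.shape) < 0.6          # about 60% of entries observed
    estimate = soft_impute(truth, mask, lam=0.1)
    print(np.linalg.norm((estimate - truth)[~mask]) / np.linalg.norm(truth[~mask]))
```

A regularization path with warm starts would call the same update for a decreasing grid of `lam` values, passing each solution in as the starting `Z` for the next; that extension is omitted here.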
Spectral Regularization Algorithms for Learning Large Incomplete Matrices

PubMed Central

Mazumder, Rahul; Hastie, Trevor; Tibshirani, Robert

2010-01-01

We use convex relaxation techniques to provide a sequence of regularized low-rank solutions for large-scale matrix completion problems. Using the nuclear norm as a regularizer, we provide a simple and very efficient convex algorithm for minimizing the reconstruction error subject to a bound on the nuclear norm. Our algorithm Soft-Impute iteratively replaces the missing elements with those obtained from a soft-thresholded SVD. With warm starts this allows us to efficiently compute an entire regularization path of solutions on a grid of values of the regularization parameter. The computationally intensive part of our algorithm is in computing a low-rank SVD of a dense matrix. Exploiting the problem structure, we show that the task can be performed with a complexity linear in the matrix dimensions. Our semidefinite-programming algorithm is readily scalable to large matrices: for example it can obtain a rank-80 approximation of a 10^6 × 10^6 incomplete matrix with 10^5 observed entries in 2.5 hours, and can fit a rank 40 approximation to the full Netflix training set in 6.6 hours. Our methods show very good performance both in training and test error when compared to other competitive state-of-the-art techniques. PMID:21552465
Tensor completion for estimating missing values in visual data.

PubMed

Liu, Ji; Musialski, Przemyslaw; Wonka, Peter; Ye, Jieping

2013-01-01

In this paper, we propose an algorithm to estimate missing values in tensors of visual data. The values can be missing due to problems in the acquisition process or because the user manually identified unwanted outliers. Our algorithm works even with a small amount of samples and it can propagate structure to fill larger missing regions. Our methodology is built on recent studies about matrix completion using the matrix trace norm. The contribution of our paper is to extend the matrix case to the tensor case by proposing the first definition of the trace norm for tensors and then by building a working algorithm. First, we propose a definition for the tensor trace norm that generalizes the established definition of the matrix trace norm. Second, similarly to matrix completion, the tensor completion is formulated as a convex optimization problem. Unfortunately, the straightforward problem extension is significantly harder to solve than the matrix case because of the dependency among multiple constraints. To tackle this problem, we developed three algorithms: simple low rank tensor completion (SiLRTC), fast low rank tensor completion (FaLRTC), and high accuracy low rank tensor completion (HaLRTC). The SiLRTC algorithm is simple to implement and employs a relaxation technique to separate the dependent relationships and uses the block coordinate descent (BCD) method to achieve a globally optimal solution; the FaLRTC algorithm utilizes a smoothing scheme to transform the original nonsmooth problem into a smooth one and can be used to solve a general tensor trace norm minimization problem; the HaLRTC algorithm applies the alternating direction method of multipliers (ADMMs) to our problem. Our experiments show potential applications of our algorithms and the quantitative evaluation indicates that our methods are more accurate and robust than heuristic approaches. 
The efficiency comparison indicates that FaLRTC and HaLRTC are more efficient than SiLRTC; between FaLRTC and HaLRTC, the former is more efficient for obtaining a low-accuracy solution and the latter is preferred if a high-accuracy solution is desired.
Deep data analysis via physically constrained linear unmixing: universal framework, domain examples, and a community-wide platform.

PubMed

Kannan, R; Ievlev, A V; Laanait, N; Ziatdinov, M A; Vasudevan, R K; Jesse, S; Kalinin, S V

2018-01-01

Many spectral responses in materials science, physics, and chemistry experiments can be characterized as resulting from the superposition of a number of more basic individual spectra. In this context, unmixing is defined as the problem of determining the individual spectra, given measurements of multiple spectra that are spatially resolved across samples, as well as the determination of the corresponding abundance maps indicating the local weighting of each individual spectrum. Matrix factorization is a popular linear unmixing technique that considers that the mixture model between the individual spectra and the spatial maps is linear. Here, we present a tutorial paper targeted at domain scientists to introduce linear unmixing techniques, to facilitate greater understanding of spectroscopic imaging data. We detail a matrix factorization framework that can incorporate different domain information through various parameters of the matrix factorization method. We demonstrate many domain-specific examples to explain the expressivity of the matrix factorization framework and show how the appropriate use of domain-specific constraints such as non-negativity and sum-to-one abundance result in physically meaningful spectral decompositions that are more readily interpretable. Our aim is not only to explain the off-the-shelf available tools, but to add additional constraints when ready-made algorithms are unavailable for the task. All examples use the scalable open source implementation from https://github.com/ramkikannan/nmflibrary that can run from small laptops to supercomputers, creating a user-wide platform for rapid dissemination and adoption across scientific disciplines.
Statistical analysis and machine learning algorithms for optical biopsy

NASA Astrophysics Data System (ADS)

Wu, Binlin; Liu, Cheng-hui; Boydston-White, Susie; Beckman, Hugh; Sriramoju, Vidyasagar; Sordillo, Laura; Zhang, Chunyuan; Zhang, Lin; Shi, Lingyan; Smith, Jason; Bailin, Jacob; Alfano, Robert R.

2018-02-01

Analyzing spectral or imaging data collected with various optical biopsy methods is often difficult due to the complexity of the biological basis. Robust methods that can utilize the spectral or imaging data and detect the characteristic spectral or spatial signatures of different types of tissue are challenging to develop but highly desired. In this study, we used various machine learning algorithms to analyze a spectral dataset acquired from normal and cancerous human skin tissue samples using resonance Raman spectroscopy with 532 nm excitation. The algorithms, including principal component analysis, nonnegative matrix factorization, and an autoencoder artificial neural network, are used to reduce the dimension of the dataset and detect features. A support vector machine with a linear kernel is used to classify the normal and cancerous tissue samples. The efficacies of the methods are compared.
Abnormal global and local event detection in compressive sensing domain

NASA Astrophysics Data System (ADS)

Wang, Tian; Qiao, Meina; Chen, Jie; Wang, Chuanyun; Zhang, Wenjia; Snoussi, Hichem

2018-05-01

Abnormal event detection, also known as anomaly detection, is one challenging task in security video surveillance. It is important to develop effective and robust movement representation models for global and local abnormal event detection to fight against factors such as occlusion and illumination change. In this paper, a new algorithm is proposed. It can locate the abnormal events on one frame, and detect the global abnormal frame. The proposed algorithm employs a sparse measurement matrix designed to represent the movement feature based on optical flow efficiently. Then, the abnormal detection mission is constructed as a one-class classification task via merely learning from the training normal samples. Experiments demonstrate that our algorithm performs well on the benchmark abnormal detection datasets against state-of-the-art methods.
MUSIC algorithm for location searching of dielectric anomalies from S-parameters using microwave imaging

NASA Astrophysics Data System (ADS)

Park, Won-Kwang; Kim, Hwa Pyung; Lee, Kwang-Jae; Son, Seong-Ho

2017-11-01

Motivated by the biomedical engineering used in early-stage breast cancer detection, we investigated the use of MUltiple SIgnal Classification (MUSIC) algorithm for location searching of small anomalies using S-parameters. We considered the application of MUSIC to functional imaging where a small number of dipole antennas are used. Our approach is based on the application of Born approximation or physical factorization. We analyzed cases in which the anomaly is respectively small and large in relation to the wavelength, and the structure of the left-singular vectors is linked to the nonzero singular values of a Multi-Static Response (MSR) matrix whose elements are the S-parameters. Using simulations, we demonstrated the strengths and weaknesses of the MUSIC algorithm in detecting both small and extended anomalies.
On distribution reduction and algorithm implementation in inconsistent ordered information systems.

PubMed

Zhang, Yanqin

2014-01-01

As one part of our work on ordered information systems, distribution reduction is studied in inconsistent ordered information systems (OISs). Some important properties of distribution reduction are studied and discussed. The dominance matrix is restated for reduction acquisition in information systems based on dominance relations. A matrix algorithm for distribution reduction acquisition is presented step by step, and a program implementing the algorithm is given. The approach provides an effective tool for theoretical research on and practical applications of ordered information systems. For more detailed and valid illustration, cases are employed to explain and verify the algorithm and the program, which shows the effectiveness of the algorithm in complicated information systems.
First-Principle Construction of U(1) Symmetric Matrix Product States

NASA Astrophysics Data System (ADS)

Rakov, Mykhailo V.

2018-07-01

The algorithm to calculate the sets of symmetry sectors for virtual indices of U(1) symmetric matrix product states (MPS) is described. The principal differences between open (OBC) and periodic (PBC) boundary conditions are stressed, and the extension of PBC MPS algorithm to projected entangled pair states is outlined.
The comparison of automated clustering algorithms for resampling representative conformer ensembles with RMSD matrix.

PubMed

Kim, Hyoungrae; Jang, Cheongyun; Yadav, Dharmendra K; Kim, Mi-Hyun

2017-03-23

The accuracy of any 3D-QSAR, pharmacophore, or 3D-similarity based chemometric target fishing model is highly dependent on a reasonable sample of active conformations. A number of diverse conformational sampling algorithms exist that exhaustively generate enough conformers; model building methods, however, rely on an explicit number of common conformers. In this work, we have attempted to make clustering algorithms that can automatically find a reasonable number of representative conformer ensembles from an asymmetric dissimilarity matrix generated with the OpenEye toolkit. RMSD was the important descriptor (variable) of each column of the N × N matrix, whose columns are considered as N variables describing the relationship (network) between the conformer (in a row) and the other N conformers. This approach was used to evaluate the performance of well-known clustering algorithms by comparing them in terms of generating representative conformer ensembles, and to test them over different matrix transformation functions with regard to stability. In the network, the representative conformer group could be resampled for four kinds of algorithms with implicit parameters. The directed dissimilarity matrix becomes the only input to the clustering algorithms. The Dunn index, Davies-Bouldin index, eta-squared values, and omega-squared values were used to evaluate the clustering algorithms with respect to compactness and explanatory power. The evaluation also includes the reduction (abstraction) rate of the data, the correlation between the sizes of the population and the samples, the computational complexity, and the memory usage. Every algorithm could find representative conformers automatically without any user intervention, and they reduced the data to 14-19% of the original values within 1.13 s per sample at the most. The clustering methods are simple and practical as they are fast and do not ask for any explicit parameters. 
RCDTC presented the maximum Dunn and omega-squared values of the four algorithms in addition to consistent reduction rate between the population size and the sample size. The performance of the clustering algorithms was consistent over different transformation functions. Moreover, the clustering method can also be applied to molecular dynamics sampling simulation results.
CPDES3: A preconditioned conjugate gradient solver for linear asymmetric matrix equations arising from coupled partial differential equations in three dimensions

NASA Astrophysics Data System (ADS)

Anderson, D. V.; Koniges, A. E.; Shumaker, D. E.

1988-11-01

Many physical problems require the solution of coupled partial differential equations on three-dimensional domains. When the time scales of interest dictate an implicit discretization of the equations a rather complicated global matrix system needs solution. The exact form of the matrix depends on the choice of spatial grids and on the finite element or finite difference approximations employed. CPDES3 allows each spatial operator to have 7, 15, 19, or 27 point stencils and allows for general couplings between all of the component PDE's and it automatically generates the matrix structures needed to perform the algorithm. The resulting sparse matrix equation is solved by either the preconditioned conjugate gradient (CG) method or by the preconditioned biconjugate gradient (BCG) algorithm. An arbitrary number of component equations are permitted only limited by available memory. In the sub-band representation used, we generate an algorithm that is written compactly in terms of indirect indices which is vectorizable on some of the newer scientific computers.
CPDES2: A preconditioned conjugate gradient solver for linear asymmetric matrix equations arising from coupled partial differential equations in two dimensions

NASA Astrophysics Data System (ADS)

Anderson, D. V.; Koniges, A. E.; Shumaker, D. E.

1988-11-01

Many physical problems require the solution of coupled partial differential equations on two-dimensional domains. When the time scales of interest dictate an implicit discretization of the equations a rather complicated global matrix system needs solution. The exact form of the matrix depends on the choice of spatial grids and on the finite element or finite difference approximations employed. CPDES2 allows each spatial operator to have 5 or 9 point stencils and allows for general couplings between all of the component PDE's and it automatically generates the matrix structures needed to perform the algorithm. The resulting sparse matrix equation is solved by either the preconditioned conjugate gradient (CG) method or by the preconditioned biconjugate gradient (BCG) algorithm. An arbitrary number of component equations are permitted only limited by available memory. In the sub-band representation used, we generate an algorithm that is written compactly in terms of indirect indices which is vectorizable on some of the newer scientific computers.
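For orientation, the sketch below shows a preconditioned conjugate gradient loop applied matrix-free to a single 5-point stencil. It is a simplification under stated assumptions, not the CPDES2 code: the operator is the symmetric positive definite 2-D Laplacian with zero Dirichlet boundaries (the asymmetric coupled systems in the report would need the biconjugate gradient variant), the preconditioner is plain Jacobi (diagonal) scaling, and all function names are hypothetical.

```python
import numpy as np

def apply_5pt(u):
    """Apply the 5-point Laplacian stencil to grid u (zero values outside the grid)."""
    out = 4.0 * u
    out[1:, :] -= u[:-1, :]    # neighbor above
    out[:-1, :] -= u[1:, :]    # neighbor below
    out[:, 1:] -= u[:, :-1]    # neighbor to the left
    out[:, :-1] -= u[:, 1:]    # neighbor to the right
    return out

def pcg(apply_A, b, diag, tol=1e-10, max_iter=500):
    """Jacobi-preconditioned CG for a symmetric positive definite operator."""
    x = np.zeros_like(b)
    r = b - apply_A(x)
    z = r / diag                       # preconditioner solve
    p = z.copy()
    rz = np.sum(r * z)
    for _ in range(max_iter):
        Ap = apply_A(p)
        alpha = rz / np.sum(p * Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        z = r / diag
        rz_new = np.sum(r * z)
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

if __name__ == "__main__":
    b = np.ones((16, 16))
    x = pcg(apply_5pt, b, diag=4.0)
    print(np.linalg.norm(apply_5pt(x) - b) / np.linalg.norm(b))
```

Expressing the operator as shifted array slices, rather than as an assembled matrix, is one simple way to keep the inner loop in a form that vectorizes well.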
Using a multifrontal sparse solver in a high performance, finite element code

NASA Technical Reports Server (NTRS)

King, Scott D.; Lucas, Robert; Raefsky, Arthur

1990-01-01

We consider the performance of the finite element method on a vector supercomputer. The computationally intensive parts of the finite element method are typically the individual element forms and the solution of the global stiffness matrix, both of which are vectorized in high performance codes. To further increase throughput, new algorithms are needed. We compare a multifrontal sparse solver to a traditional skyline solver in a finite element code on a vector supercomputer. The multifrontal solver uses the Multiple-Minimum Degree reordering heuristic to reduce the number of operations required to factor a sparse matrix and full matrix computational kernels (e.g., BLAS3) to enhance vector performance. The net result is an order-of-magnitude reduction in run time for a finite element application on one processor of a Cray X-MP.
A spatial operator algebra for manipulator modeling and control

NASA Technical Reports Server (NTRS)

Rodriguez, G.; Kreutz, K.; Milman, M.

1988-01-01

A powerful new spatial operator algebra for modeling, control, and trajectory design of manipulators is discussed along with its implementation in the Ada programming language. Applications of this algebra to robotics include an operator representation of the manipulator Jacobian matrix; the robot dynamical equations formulated in terms of the spatial algebra, showing the complete equivalence between the recursive Newton-Euler formulations to robot dynamics; the operator factorization and inversion of the manipulator mass matrix which immediately results in O(N) recursive forward dynamics algorithms; the joint accelerations of a manipulator due to a tip contact force; the recursive computation of the equivalent mass matrix as seen at the tip of a manipulator; and recursive forward dynamics of a closed chain system. Finally, additional applications and current research involving the use of the spatial operator algebra are discussed in general terms.
GPS-Free Localization Algorithm for Wireless Sensor Networks

PubMed Central

Wang, Lei; Xu, Qingzheng

2010-01-01

Localization is one of the most fundamental problems in wireless sensor networks, since the locations of the sensor nodes are critical to both network operations and most application level tasks. A GPS-free localization scheme for wireless sensor networks is presented in this paper. First, we develop a standardized clustering-based approach for the local coordinate system formation wherein a multiplication factor is introduced to regulate the number of master and slave nodes and the degree of connectivity among master nodes. Second, using homogeneous coordinates, we derive a transformation matrix between two Cartesian coordinate systems to efficiently merge them into a global coordinate system and effectively overcome the flip ambiguity problem. The algorithm operates asynchronously without a centralized controller and does not require that the location of the sensors be known a priori. A set of parameter-setting guidelines for the proposed algorithm is derived based on a probability model and the energy requirements are also investigated. A simulation analysis on a specific numerical example is conducted to validate the mathematical analytical results. We also compare the performance of the proposed algorithm under a variety of multiplication factor, node density, and node communication radius scenarios. Experiments show that our algorithm outperforms existing mechanisms in terms of accuracy and convergence time. PMID:22219694
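The idea of merging two local coordinate systems through a homogeneous transformation matrix can be illustrated as follows. This is a sketch under assumptions, not the derivation in the paper: the 3 × 3 matrix is fitted by least squares from nodes known in both systems, the function names are hypothetical, and a flip (mirror image) between the two systems is detected from the sign of the determinant of the upper-left 2 × 2 block.

```python
import numpy as np

def to_homogeneous(points):
    """Append a column of ones: (n, 2) -> (n, 3)."""
    return np.hstack([points, np.ones((len(points), 1))])

def estimate_transform(pts_b, pts_a):
    """Least-squares 3x3 matrix T with T @ [x_b, y_b, 1] ~= [x_a, y_a, 1].

    Needs at least three shared nodes that are not collinear.
    """
    M, *_ = np.linalg.lstsq(to_homogeneous(pts_b), pts_a, rcond=None)
    T = np.eye(3)
    T[:2, :] = M.T
    return T

def apply_transform(T, points):
    """Map (n, 2) points from system B into system A."""
    return (to_homogeneous(points) @ T.T)[:, :2]

if __name__ == "__main__":
    theta = np.deg2rad(30.0)
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
    flip = np.diag([1.0, -1.0])                 # system B is mirrored
    T_true = np.eye(3)
    T_true[:2, :2] = rot @ flip
    T_true[:2, 2] = [2.0, -1.0]

    shared_b = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 3.0]])
    shared_a = apply_transform(T_true, shared_b)

    T = estimate_transform(shared_b, shared_a)
    print("flipped:", np.linalg.det(T[:2, :2]) < 0)
```

Once `T` is known, every node located only in system B can be placed in system A with `apply_transform`, which is how two clusters would be merged into one frame in this simplified picture.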
Reconstruction of 2D PET data with Monte Carlo generated system matrix for generalized natural pixels

NASA Astrophysics Data System (ADS)

Vandenberghe, Stefaan; Staelens, Steven; Byrne, Charles L.; Soares, Edward J.; Lemahieu, Ignace; Glick, Stephen J.

2006-06-01

In discrete detector PET, natural pixels are image basis functions calculated from the response of detector pairs. By using reconstruction with natural pixel basis functions, the discretization of the object into a predefined grid can be avoided. Here, we propose to use generalized natural pixel reconstruction. Using this approach, the basis functions are not the detector sensitivity functions as in the natural pixel case but uniform parallel strips. The backprojection of the strip coefficients results in the reconstructed image. This paper proposes an easy and efficient way to generate the matrix M directly by Monte Carlo simulation. Elements of the generalized natural pixel system matrix are formed by calculating the intersection of a parallel strip with the detector sensitivity function. These generalized natural pixels are easier to use than conventional natural pixels because the final step from solution to a square pixel representation is done by simple backprojection. Due to rotational symmetry in the PET scanner, the matrix M is block circulant and only the first blockrow needs to be stored. Data were generated using a fast Monte Carlo simulator using ray tracing. The proposed method was compared to a listmode MLEM algorithm, which used ray tracing for doing forward and backprojection. Comparison of the algorithms with different phantoms showed that an improved resolution can be obtained using generalized natural pixel reconstruction with accurate system modelling. In addition, it was noted that for the same resolution a lower noise level is present in this reconstruction. A numerical observer study showed the proposed method exhibited increased performance as compared to a standard listmode EM algorithm. In another study, more realistic data were generated using the GATE Monte Carlo simulator. For these data, a more uniform contrast recovery and a better contrast-to-noise performance were observed. 
It was observed that major improvements in contrast recovery were obtained with MLEM when the correct system matrix was used instead of simple ray tracing. The correct modelling was the major cause of improved contrast for the same background noise. Less important factors were the choice of the algorithm (MLEM performed better than ART) and the basis functions (generalized natural pixels gave better results than pixels).
Investigation of the Capability of Compact Polarimetric SAR Interferometry to Estimate Forest Height

NASA Astrophysics Data System (ADS)

Zhang, Hong; Xie, Lei; Wang, Chao; Chen, Jiehong

2013-08-01

The main objective of this paper is to investigate the capability of compact Polarimetric SAR Interferometry (C-PolInSAR) on forest height estimation. For this, the pseudo fully polarimetric interferometric (F-PolInSAR) covariance matrix is first reconstructed; then the three-stage inversion algorithm, hybrid algorithm, MUSIC and Capon algorithms are applied to both the C-PolInSAR covariance matrix and the pseudo F-PolInSAR covariance matrix. The feasibility of forest height estimation is demonstrated using L-band data generated by the simulator PolSARProSim and X-band airborne data acquired by the East China Research Institute of Electronic Engineering, China Electronics Technology Group Corporation.
Parallel matrix multiplication on the Connection Machine

NASA Technical Reports Server (NTRS)

Tichy, Walter F.

1988-01-01

Matrix multiplication is a computation and communication intensive problem. Six parallel algorithms for matrix multiplication on the Connection Machine are presented and compared with respect to their performance and processor usage. For n by n matrices, the algorithms have theoretical running times of O(n^2 log n), O(n log n), O(n), and O(log n), and require n, n^2, n^2, and n^3 processors, respectively. With careful attention to communication patterns, the theoretically predicted runtimes can indeed be achieved in practice. The parallel algorithms illustrate the tradeoffs between performance, communication cost, and processor usage.
A Scalable O(N) Algorithm for Large-Scale Parallel First-Principles Molecular Dynamics Simulations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Osei-Kuffuor, Daniel; Fattebert, Jean-Luc

2014-01-01

Traditional algorithms for first-principles molecular dynamics (FPMD) simulations only gain a modest capability increase from current petascale computers, due to their O(N^3) complexity and their heavy use of global communications. To address this issue, we are developing a truly scalable O(N) complexity FPMD algorithm, based on density functional theory (DFT), which avoids global communications. The computational model uses a general nonorthogonal orbital formulation for the DFT energy functional, which requires knowledge of selected elements of the inverse of the associated overlap matrix. We present a scalable algorithm for approximately computing selected entries of the inverse of the overlap matrix, based on an approximate inverse technique, by inverting local blocks corresponding to principal submatrices of the global overlap matrix. The new FPMD algorithm exploits sparsity and uses nearest neighbor communication to provide a computational scheme capable of extreme scalability. Accuracy is controlled by the mesh spacing of the finite difference discretization, the size of the localization regions in which the electronic orbitals are confined, and a cutoff beyond which the entries of the overlap matrix can be omitted when computing selected entries of its inverse. We demonstrate the algorithm's excellent parallel scaling for up to O(100K) atoms on O(100K) processors, with a wall-clock time of O(1) minute per molecular dynamics time step.
Sparse matrix methods based on orthogonality and conjugacy

NASA Technical Reports Server (NTRS)

Lawson, C. L.

1973-01-01

A matrix having a high percentage of zero elements is called sparse. In the solution of systems of linear equations or linear least squares problems involving large sparse matrices, significant savings in computer cost can be achieved by taking advantage of the sparsity. The conjugate gradient algorithm and a set of related algorithms are described.
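As a rough illustration of how the conjugate gradient algorithm can exploit sparsity, the following minimal sketch stores only the nonzero entries of each row and touches nothing else in the matrix-vector product. It is a textbook version for symmetric positive definite systems, not the report's own code; the names `sparse_matvec`, `conjugate_gradient`, and `tridiagonal`, as well as the dict-of-rows storage, are assumptions made for this example.

```python
import math


def sparse_matvec(rows, x):
    """Multiply a sparse matrix by a dense vector.

    rows[i] is a dict {column: value} holding only the nonzeros of row i,
    so the cost is proportional to the number of nonzeros.
    """
    return [sum(v * x[j] for j, v in row.items()) for row in rows]


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def conjugate_gradient(rows, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite sparse A."""
    x = [0.0] * len(b)
    r = list(b)            # residual b - A x for the starting guess x = 0
    p = list(r)            # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if math.sqrt(rs_old) < tol:
            break
        Ap = sparse_matvec(rows, p)
        alpha = rs_old / dot(p, Ap)              # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old                   # makes the next p A-conjugate
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


def tridiagonal(n):
    """Hypothetical test matrix: 2 on the diagonal, -1 on the off-diagonals."""
    rows = []
    for i in range(n):
        row = {i: 2.0}
        if i > 0:
            row[i - 1] = -1.0
        if i < n - 1:
            row[i + 1] = -1.0
        rows.append(row)
    return rows


A = tridiagonal(5)
x = conjugate_gradient(A, [1.0] * 5)
```

For this 5 by 5 test system the exact solution is (2.5, 4, 4.5, 4, 2.5), and each iteration costs one sparse matrix-vector product plus a few vector operations.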

Detecting an atomic clock frequency anomaly using an adaptive Kalman filter algorithm

NASA Astrophysics Data System (ADS)

Song, Huijie; Dong, Shaowu; Wu, Wenjun; Jiang, Meng; Wang, Weixiong

2018-06-01

The abnormal frequencies of an atomic clock mainly include frequency jump and frequency drift jump. Atomic clock frequency anomaly detection is a key technique in time-keeping. The Kalman filter algorithm, as a linear optimal algorithm, has been widely used in real-time detection of abnormal frequency. In order to obtain an optimal state estimation, the observation model and dynamic model of the Kalman filter algorithm should satisfy Gaussian white noise conditions. The detection performance is degraded if anomalies affect the observation model or dynamic model. The adaptive Kalman filter algorithm, applied to clock frequency anomaly detection, uses the residuals given by the prediction to build an 'adaptive factor'; the prediction state covariance matrix is then corrected in real time by the adaptive factor. The results show that the model error is reduced and the detection performance is improved. The effectiveness of the algorithm is verified by the frequency jump simulation, the frequency drift jump simulation and the measured data of the atomic clock by using the chi-square test.
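The idea of correcting the predicted covariance with a residual-based factor can be sketched with a scalar filter. The abstract does not give the form of the adaptive factor, so everything specific here is an assumption: the random-walk state model, the noise variances `q` and `r`, and the rule `factor = max(1, innovation^2 / S)`, which is one common variance-inflation choice and not necessarily the authors'.

```python
def kalman_scalar(measurements, q=1e-4, r=1.0, adaptive=False):
    """Scalar Kalman filter for a random-walk state observed directly.

    With adaptive=True, the predicted variance is inflated whenever the
    squared innovation is larger than its predicted variance (assumed rule).
    """
    x, p = 0.0, 1.0                  # initial state estimate and variance
    estimates = []
    for z in measurements:
        p_pred = p + q               # prediction: state unchanged, variance grows
        innovation = z - x           # residual given by the prediction
        if adaptive:
            s = p_pred + r           # predicted innovation variance
            factor = max(1.0, innovation ** 2 / s)
            p_pred *= factor         # real-time correction of the covariance
        k = p_pred / (p_pred + r)    # Kalman gain
        x = x + k * innovation
        p = (1.0 - k) * p_pred
        estimates.append(x)
    return estimates


# Hypothetical data: a constant reading followed by a sudden jump.
data = [0.0] * 20 + [10.0] * 5
plain = kalman_scalar(data)
adapted = kalman_scalar(data, adaptive=True)
```

On this synthetic jump the adaptive variant raises its gain as soon as the residual becomes inconsistent with the predicted variance, so it follows the new level within a few samples, while the plain filter lags.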
Turbo-SMT: Accelerating Coupled Sparse Matrix-Tensor Factorizations by 200×

PubMed Central

Papalexakis, Evangelos E.; Faloutsos, Christos; Mitchell, Tom M.; Talukdar, Partha Pratim; Sidiropoulos, Nicholas D.; Murphy, Brian

2015-01-01

How can we correlate the neural activity in the human brain as it responds to typed words, with properties of these terms (like ‘edible’, ‘fits in hand’)? In short, we want to find latent variables, that jointly explain both the brain activity, as well as the behavioral responses. This is one of many settings of the Coupled Matrix-Tensor Factorization (CMTF) problem. Can we accelerate any CMTF solver, so that it runs within a few minutes instead of tens of hours to a day, while maintaining good accuracy? We introduce TURBO-SMT, a meta-method capable of doing exactly that: it boosts the performance of any CMTF algorithm, by up to 200×, along with an up to 65 fold increase in sparsity, with comparable accuracy to the baseline. We apply TURBO-SMT to BRAINQ, a dataset consisting of a (nouns, brain voxels, human subjects) tensor and a (nouns, properties) matrix, with coupling along the nouns dimension. TURBO-SMT is able to find meaningful latent variables, as well as to predict brain activity with competitive accuracy. PMID:26473087
Discriminative non-negative matrix factorization (DNMF) and its application to the fault diagnosis of diesel engine

NASA Astrophysics Data System (ADS)

Yang, Yong-sheng; Ming, An-bo; Zhang, You-yun; Zhu, Yong-sheng

2017-10-01

Diesel engines, widely used in engineering, are very important for the running of equipment, and their fault diagnosis has attracted much attention. In the past several decades, image-based fault diagnosis methods have provided efficient ways for diesel engine fault diagnosis. By introducing the class information into the traditional non-negative matrix factorization (NMF), an improved NMF algorithm named discriminative NMF (DNMF) was developed, and a novel image-based fault diagnosis method was proposed by combining the DNMF and the KNN classifier. Experiments performed on the fault diagnosis of a diesel engine were used to validate the efficacy of the proposed method. It is shown that the fault conditions of the diesel engine can be efficiently classified by the proposed method using the coefficient matrix obtained by DNMF. Compared with the original NMF (ONMF) and principal component analysis (PCA), the DNMF can represent the class information more efficiently because the class characters of the basis matrices obtained by the DNMF are more visible than those in the basis matrices obtained by the ONMF and PCA.
Bilinear Factor Matrix Norm Minimization for Robust PCA: Algorithms and Applications.

PubMed

Shang, Fanhua; Cheng, James; Liu, Yuanyuan; Luo, Zhi-Quan; Lin, Zhouchen

2017-09-04

The heavy-tailed distributions of corrupted outliers and singular values of all channels in low-level vision have proven effective priors for many applications such as background modeling, photometric stereo and image alignment. And they can be well modeled by a hyper-Laplacian. However, the use of such distributions generally leads to challenging non-convex, non-smooth and non-Lipschitz problems, and makes existing algorithms very slow for large-scale applications. Together with the analytic solutions to Lp-norm minimization with two specific values of p, i.e., p=1/2 and p=2/3, we propose two novel bilinear factor matrix norm minimization models for robust principal component analysis. We first define the double nuclear norm and Frobenius/nuclear hybrid norm penalties, and then prove that they are in essence the Schatten-1/2 and 2/3 quasi-norms, respectively, which lead to much more tractable and scalable Lipschitz optimization problems. Our experimental analysis shows that both our methods yield more accurate solutions than original Schatten quasi-norm minimization, even when the number of observations is very limited. Finally, we apply our penalties to various low-level vision problems, e.g. moving object detection, image alignment and inpainting, and show that our methods usually outperform the state-of-the-art methods.
AMA- and RWE- Based Adaptive Kalman Filter for Denoising Fiber Optic Gyroscope Drift Signal

PubMed Central

Yang, Gongliu; Liu, Yuanyuan; Li, Ming; Song, Shunguang

2015-01-01

An improved double-factor adaptive Kalman filter called AMA-RWE-DFAKF is proposed to denoise fiber optic gyroscope (FOG) drift signal in both static and dynamic conditions. The first factor is Kalman gain updated by random weighting estimation (RWE) of the covariance matrix of innovation sequence at any time to ensure the lowest noise level of output, but the inertia of KF response increases in dynamic condition. To decrease the inertia, the second factor is the covariance matrix of predicted state vector adjusted by RWE only when discontinuities are detected by adaptive moving average (AMA). The AMA-RWE-DFAKF is applied for denoising FOG static and dynamic signals, and its performance is compared with conventional KF (CKF), RWE-based adaptive KF with gain correction (RWE-AKFG), and AMA- and RWE-based dual mode adaptive KF (AMA-RWE-DMAKF). Results of Allan variance on static signal and root mean square error (RMSE) on dynamic signal show that the proposed algorithm outperforms all the considered methods in denoising FOG signal. PMID:26512665
AMA- and RWE- Based Adaptive Kalman Filter for Denoising Fiber Optic Gyroscope Drift Signal.

PubMed

Yang, Gongliu; Liu, Yuanyuan; Li, Ming; Song, Shunguang

2015-10-23

An improved double-factor adaptive Kalman filter called AMA-RWE-DFAKF is proposed to denoise fiber optic gyroscope (FOG) drift signal in both static and dynamic conditions. The first factor is Kalman gain updated by random weighting estimation (RWE) of the covariance matrix of innovation sequence at any time to ensure the lowest noise level of output, but the inertia of KF response increases in dynamic condition. To decrease the inertia, the second factor is the covariance matrix of predicted state vector adjusted by RWE only when discontinuities are detected by adaptive moving average (AMA). The AMA-RWE-DFAKF is applied for denoising FOG static and dynamic signals, and its performance is compared with conventional KF (CKF), RWE-based adaptive KF with gain correction (RWE-AKFG), and AMA- and RWE-based dual mode adaptive KF (AMA-RWE-DMAKF). Results of Allan variance on static signal and root mean square error (RMSE) on dynamic signal show that the proposed algorithm outperforms all the considered methods in denoising FOG signal.
Numerical study of a matrix-free trust-region SQP method for equality constrained optimization.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Heinkenschloss, Matthias; Ridzal, Denis; Aguilo, Miguel Antonio

2011-12-01

This is a companion publication to the paper 'A Matrix-Free Trust-Region SQP Algorithm for Equality Constrained Optimization' [11]. In [11], we develop and analyze a trust-region sequential quadratic programming (SQP) method that supports the matrix-free (iterative, inexact) solution of linear systems. In this report, we document the numerical behavior of the algorithm applied to a variety of equality constrained optimization problems, with constraints given by partial differential equations (PDEs).
Non-Linear Optimization Applied to Angle-of-Arrival Satellite-Based Geolocation with Correlated Measurements

DTIC Science & Technology

2015-03-01

General covariance intersection covariance matrix Σ1 Measurement 1’s covariance matrix I(X) Fisher information matrix g Confidence region L Lower... information in this chapter will discuss the motivation and background of the geolocation algorithm with the scope of the applications for this research. The...algorithm is able to produce the best description of an object given the information from a set of measurements. Determining a position requires the use of a
Exploiting Multiple Levels of Parallelism in Sparse Matrix-Matrix Multiplication

DOE PAGES

Azad, Ariful; Ballard, Grey; Buluc, Aydin; ...

2016-11-08

Sparse matrix-matrix multiplication (or SpGEMM) is a key primitive for many high-performance graph algorithms as well as for some linear solvers, such as algebraic multigrid. The scaling of existing parallel implementations of SpGEMM is heavily bound by communication. Even though 3D (or 2.5D) algorithms have been proposed and theoretically analyzed in the flat MPI model on Erdös-Rényi matrices, those algorithms had not been implemented in practice and their complexities had not been analyzed for the general case. In this work, we present the first implementation of the 3D SpGEMM formulation that exploits multiple (intranode and internode) levels of parallelism, achieving significant speedups over the state-of-the-art publicly available codes at all levels of concurrencies. We extensively evaluate our implementation and identify bottlenecks that should be subject to further research.
The U.S. Geological Survey Modular Ground-Water Model - PCGN: A Preconditioned Conjugate Gradient Solver with Improved Nonlinear Control

USGS Publications Warehouse

Naff, Richard L.; Banta, Edward R.

2008-01-01

The preconditioned conjugate gradient with improved nonlinear control (PCGN) package provides additional means by which the solution of nonlinear ground-water flow problems can be controlled as compared to existing solver packages for MODFLOW. Picard iteration is used to solve nonlinear ground-water flow equations by iteratively solving a linear approximation of the nonlinear equations. The linear solution is provided by means of the preconditioned conjugate gradient algorithm where preconditioning is provided by the modified incomplete Cholesky algorithm. The incomplete Cholesky scheme incorporates two levels of fill, 0 and 1, in which the pivots can be modified so that the row sums of the preconditioning matrix and the original matrix are approximately equal. A relaxation factor is used to implement the modified pivots, which determines the degree of modification allowed. The effects of fill level and degree of pivot modification are briefly explored by means of a synthetic, heterogeneous finite-difference matrix; results are reported in the final section of this report. The preconditioned conjugate gradient method is coupled with Picard iteration so as to efficiently solve the nonlinear equations associated with many ground-water flow problems. The description of this coupling of the linear solver with Picard iteration is a primary concern of this document.
A Novel Riemannian Metric Based on Riemannian Structure and Scaling Information for Fixed Low-Rank Matrix Completion.

PubMed

Mao, Shasha; Xiong, Lin; Jiao, Licheng; Feng, Tian; Yeung, Sai-Kit

2017-05-01

Riemannian optimization has been widely used to deal with the fixed low-rank matrix completion problem, and the Riemannian metric is a crucial factor in obtaining the search direction in Riemannian optimization. This paper proposes a new Riemannian metric that simultaneously considers the Riemannian geometry structure and the scaling information, and which is smoothly varying and invariant along the equivalence class. The proposed metric can effectively make a tradeoff between the Riemannian geometry structure and the scaling information. Essentially, it can be viewed as a generalization of some existing metrics. Based on the proposed Riemannian metric, we also design a Riemannian nonlinear conjugate gradient algorithm, which can efficiently solve the fixed low-rank matrix completion problem. Experiments on fixed low-rank matrix completion, collaborative filtering, and image and video recovery illustrate that the proposed method is superior to the state-of-the-art methods in convergence efficiency and numerical performance.
A Framework to Debug Diagnostic Matrices

NASA Technical Reports Server (NTRS)

Kodal, Anuradha; Robinson, Peter; Patterson-Hine, Ann

2013-01-01

Diagnostics is an important concept in system health and monitoring of space operations. Many of the existing diagnostic algorithms utilize system knowledge in the form of a diagnostic matrix (D-matrix, also popularly known as diagnostic dictionary, fault signature matrix or reachability matrix) gleaned from physical models. Sometimes, however, this matrix may not be adequate to obtain high diagnostic performance. In such a case, it is important to modify this D-matrix based on knowledge obtained from other sources such as time-series data streams (simulated or maintenance data) within the context of a framework that includes the diagnostic/inference algorithm. A systematic and sequential update procedure, the diagnostic modeling evaluator (DME), is proposed to modify the D-matrix and wrapper logic, considering the least expensive solution first. This iterative procedure includes conditions ranging from modifying 0s and 1s in the matrix to adding or removing rows (failure sources) or columns (tests). We will test this framework on datasets from DX challenge 2009.
Multiple feature fusion via covariance matrix for visual tracking

NASA Astrophysics Data System (ADS)

Jin, Zefenfen; Hou, Zhiqiang; Yu, Wangsheng; Wang, Xin; Sun, Hui

2018-04-01

Aiming at the problem of complicated dynamic scenes in visual target tracking, a multi-feature fusion tracking algorithm based on covariance matrix is proposed to improve the robustness of the tracking algorithm. In the framework of quantum genetic algorithm, this paper uses the region covariance descriptor to fuse the color, edge and texture features. It also uses a fast covariance intersection algorithm to update the model. The low dimension of region covariance descriptor, the fast convergence speed and strong global optimization ability of quantum genetic algorithm, and the fast computation of fast covariance intersection algorithm are used to improve the computational efficiency of the fusion, matching, and updating process, so that the algorithm achieves fast and effective multi-feature fusion tracking. The experiments show that the proposed algorithm can not only achieve fast and robust tracking but also effectively handle interference from occlusion, rotation, deformation, motion blur and so on.
Parallel Clustering Algorithm for Large-Scale Biological Data Sets

PubMed Central

Wang, Minchao; Zhang, Wu; Ding, Wang; Dai, Dongbo; Zhang, Huiran; Xie, Hao; Chen, Luonan; Guo, Yike; Xie, Jiang

2014-01-01

Background: The recent explosion of biological data poses a great challenge for traditional clustering algorithms. With the increasing scale of data sets, much larger memory and longer runtime are required for cluster identification problems. The affinity propagation algorithm outperforms many other classical clustering algorithms and is widely applied in biological research. However, its time and space complexity become a great bottleneck when handling large-scale data sets. Moreover, the similarity matrix, whose construction takes a long runtime, is required before running the affinity propagation algorithm, since the algorithm clusters data sets based on the similarities between data pairs. Methods: Two types of parallel architectures are proposed in this paper to accelerate the similarity matrix construction and the affinity propagation algorithm. The memory-shared architecture is used to construct the similarity matrix, and the distributed system is used for the affinity propagation algorithm, because of its large memory size and great computing capacity. An appropriate way of data partition and reduction is designed in our method, in order to minimize the global communication cost among processes. Results: A speedup of 100 is gained with 128 cores. The runtime is reduced from several hours to a few seconds, which indicates that the parallel algorithm is capable of handling large-scale data sets effectively. The parallel affinity propagation also achieves good performance when clustering large-scale gene data (microarray) and detecting families in large protein superfamilies. PMID:24705246
Fully Decentralized Semi-supervised Learning via Privacy-preserving Matrix Completion.

PubMed

Fierimonte, Roberto; Scardapane, Simone; Uncini, Aurelio; Panella, Massimo

2016-08-26

Distributed learning refers to the problem of inferring a function when the training data are distributed among different nodes. While significant work has been done in the contexts of supervised and unsupervised learning, the intermediate case of Semi-supervised learning in the distributed setting has received less attention. In this paper, we propose an algorithm for this class of problems, by extending the framework of manifold regularization. The main component of the proposed algorithm consists of a fully distributed computation of the adjacency matrix of the training patterns. To this end, we propose a novel algorithm for low-rank distributed matrix completion, based on the framework of diffusion adaptation. Overall, the distributed Semi-supervised algorithm is efficient and scalable, and it can preserve privacy by the inclusion of flexible privacy-preserving mechanisms for similarity computation. The experimental results and comparison on a wide range of standard Semi-supervised benchmarks validate our proposal.
Diagonalization of complex symmetric matrices: Generalized Householder reflections, iterative deflation and implicit shifts

NASA Astrophysics Data System (ADS)

Noble, J. H.; Lubasch, M.; Stevens, J.; Jentschura, U. D.

2017-12-01

We describe a matrix diagonalization algorithm for complex symmetric (not Hermitian) matrices, $A = A^{T}$, which is based on a two-step algorithm involving generalized Householder reflections based on the indefinite inner product $\langle u, v \rangle_{*} = \sum_i u_i v_i$. This inner product is linear in both arguments and avoids complex conjugation. The complex symmetric input matrix is transformed to tridiagonal form using generalized Householder transformations (first step). An iterative, generalized QL decomposition of the tridiagonal matrix employing an implicit shift converges toward diagonal form (second step). The QL algorithm employs iterative deflation techniques when a machine-precision zero is encountered "prematurely" on the super-/sub-diagonal. The algorithm allows for a reliable and computationally efficient computation of resonance and antiresonance energies which emerge from complex-scaled Hamiltonians, and for the numerical determination of the real energy eigenvalues of pseudo-Hermitian and PT-symmetric Hamilton matrices. Numerical reference values are provided.
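To make the role of the conjugation-free inner product concrete, here is a minimal sketch of a Householder-type reflection built from the bilinear form sum_i u_i v_i. It follows the textbook construction H = I - 2 v v^T / (v^T v) with the Hermitian product replaced by the bilinear one; the function names and the sign choice for `alpha` are assumptions of this sketch, and the paper's own conventions and safeguards are not reproduced.

```python
import cmath


def bilinear(u, v):
    """Indefinite inner product sum_i u_i v_i (no complex conjugation)."""
    return sum(ui * vi for ui, vi in zip(u, v))


def householder_vector(x):
    """Return (v, alpha) with alpha^2 = x^T x and v = x + alpha * e_1.

    The reflection H = I - 2 v v^T / (v^T v) then maps x to -alpha * e_1.
    Caveat: v^T v can vanish for a nonzero complex v, in which case this
    simple sketch breaks down.
    """
    alpha = cmath.sqrt(bilinear(x, x))
    v = list(x)
    v[0] += alpha
    return v, alpha


def apply_reflection(v, x):
    """Compute H x = x - 2 v (v^T x) / (v^T v) without forming H."""
    scale = 2 * bilinear(v, x) / bilinear(v, v)
    return [xi - scale * vi for xi, vi in zip(x, v)]


# Hypothetical complex test vector.
x = [1 + 2j, 3 - 1j, 0.5j]
v, alpha = householder_vector(x)
y = apply_reflection(v, x)   # every entry after the first is (numerically) zero
```

Because H is symmetric and satisfies H H^T = I with respect to the plain transpose, applying it from both sides to a complex symmetric matrix keeps the matrix complex symmetric, which is what allows the reduction to tridiagonal form.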
Markov chains: computing limit existence and approximations with DNA.

PubMed

Cardona, M; Colomer, M A; Conde, J; Miret, J M; Miró, J; Zaragoza, A

2005-09-01

We present two algorithms to perform computations over Markov chains. The first one determines whether the sequence of powers of the transition matrix of a Markov chain converges or not to a limit matrix. If it does converge, the second algorithm enables us to estimate this limit. The combination of these algorithms allows the computation of a limit using DNA computing. In this sense, we have encoded the states and the transition probabilities using strands of DNA for generating paths of the Markov chain.
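For comparison with a conventional computer, the two questions (does the sequence of powers converge, and to what limit) can be approximated numerically as sketched below. This is a plain floating-point heuristic, not the DNA-based procedure of the paper; the function names, the tolerance, and the step limit are assumptions, and a slowly mixing chain may need more steps than the limit allows.

```python
def matmul(a, b):
    """Dense product of two matrices given as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[t] * b[t][j] for t in range(inner)) for j in range(cols)]
            for row in a]


def power_limit(P, tol=1e-12, max_steps=10_000):
    """Return an approximation of lim P^k, or None if the powers do not settle.

    Successive powers are compared entrywise; the sequence is treated as
    convergent once two consecutive powers differ by less than tol.
    """
    current = P
    for _ in range(max_steps):
        nxt = matmul(current, P)
        diff = max(abs(x - y)
                   for row_n, row_c in zip(nxt, current)
                   for x, y in zip(row_n, row_c))
        if diff < tol:
            return nxt
        current = nxt
    return None


# Hypothetical examples: an aperiodic chain and a periodic one.
aperiodic = [[0.9, 0.1], [0.5, 0.5]]
periodic = [[0.0, 1.0], [1.0, 0.0]]
limit = power_limit(aperiodic)                 # both rows approach (5/6, 1/6)
no_limit = power_limit(periodic, max_steps=200)  # powers alternate, so None
```

The periodic example shows why an existence check is needed at all: its powers alternate between two matrices forever, so no limit matrix exists even though every power is a valid transition matrix.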
Density-matrix-based algorithm for solving eigenvalue problems

NASA Astrophysics Data System (ADS)

Polizzi, Eric

2009-03-01

A fast and stable numerical algorithm for solving the symmetric eigenvalue problem is presented. The technique deviates fundamentally from the traditional Krylov subspace iteration based techniques (Arnoldi and Lanczos algorithms) or other Davidson-Jacobi techniques and takes its inspiration from the contour integration and density-matrix representation in quantum mechanics. It will be shown that this algorithm—named FEAST—exhibits high efficiency, robustness, accuracy, and scalability on parallel architectures. Examples from electronic structure calculations of carbon nanotubes are presented, and numerical performances and capabilities are discussed.
Efficient tree tensor network states (TTNS) for quantum chemistry: Generalizations of the density matrix renormalization group algorithm

NASA Astrophysics Data System (ADS)

Nakatani, Naoki; Chan, Garnet Kin-Lic

2013-04-01

We investigate tree tensor network states for quantum chemistry. Tree tensor network states represent one of the simplest generalizations of matrix product states and the density matrix renormalization group. While matrix product states encode a one-dimensional entanglement structure, tree tensor network states encode a tree entanglement structure, allowing for a more flexible description of general molecules. We describe an optimal tree tensor network state algorithm for quantum chemistry. We introduce the concept of half-renormalization which greatly improves the efficiency of the calculations. Using our efficient formulation we demonstrate the strengths and weaknesses of tree tensor network states versus matrix product states. We carry out benchmark calculations both on tree systems (hydrogen trees and π-conjugated dendrimers) as well as non-tree molecules (hydrogen chains, nitrogen dimer, and chromium dimer). In general, tree tensor network states require far fewer renormalized states to achieve the same accuracy as matrix product states. In non-tree molecules, whether this translates into computational savings is system dependent, due to the higher prefactor and computational scaling associated with tree algorithms. In tree-like molecules, tree network states are easily superior to matrix product states. As an illustration, our largest dendrimer calculation with tree tensor network states correlates 110 electrons in 110 active orbitals.
Comparative study of original recover and recover KL in separable non-negative matrix factorization for topic detection in Twitter

NASA Astrophysics Data System (ADS)

Prabandari, R. D.; Murfi, H.

2017-07-01

An increasing amount of information on social media such as Twitter requires an efficient way to find the topics so that the information can be well managed. One automated method for topic detection is separable non-negative matrix factorization (SNMF). SNMF assumes that each topic has at least one word that does not appear in other topics. This method uses a direct approach and has polynomial-time complexity, while the previous methods are iterative approaches and have NP-hard complexity. The SNMF algorithm has three steps, i.e. constructing word co-occurrences, finding anchor words, and recovering topics. In this paper, we examine two topic recovery methods, namely original recover, which uses algebraic manipulation, and recover KL, which uses a probabilistic approach with Kullback-Leibler divergence. Our simulations show that recover KL provides better accuracy in terms of topic recall than original recover.

Matrix Factorizations at Scale: a Comparison of Scientific Data Analytics in Spark and C+MPI Using Three Case Studies

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gittens, Alex; Devarakonda, Aditya; Racah, Evan

We explore the trade-offs of performing linear algebra using Apache Spark, compared to traditional C and MPI implementations on HPC platforms. Spark is designed for data analytics on cluster computing platforms with access to local disks and is optimized for data-parallel tasks. We examine three widely-used and important matrix factorizations: NMF (for physical plausibility), PCA (for its ubiquity) and CX (for data interpretability). We apply these methods to 1.6TB particle physics, 2.2TB and 16TB climate modeling and 1.1TB bioimaging data. The data matrices are tall-and-skinny, which enables the algorithms to map conveniently into Spark's data parallel model. We perform scaling experiments on up to 1600 Cray XC40 nodes, describe the sources of slowdowns, and provide tuning guidance to obtain high performance.
Fast and anisotropic flexibility-rigidity index for protein flexibility and fluctuation analysis

NASA Astrophysics Data System (ADS)

Opron, Kristopher; Xia, Kelin; Wei, Guo-Wei

2014-06-01

Protein structural fluctuation, typically measured by Debye-Waller factors, or B-factors, is a manifestation of protein flexibility, which strongly correlates to protein function. The flexibility-rigidity index (FRI) is a newly proposed method for the construction of atomic rigidity functions required in the theory of continuum elasticity with atomic rigidity, which is a new multiscale formalism for describing excessively large biomolecular systems. The FRI method analyzes protein rigidity and flexibility and is capable of predicting protein B-factors without resorting to matrix diagonalization. A fundamental assumption used in the FRI is that protein structures are uniquely determined by various internal and external interactions, while the protein functions, such as stability and flexibility, are solely determined by the structure. As such, one can predict protein flexibility without resorting to the protein interaction Hamiltonian. Consequently, bypassing the matrix diagonalization, the original FRI has a computational complexity of O(N^2). This work introduces a fast FRI (fFRI) algorithm for the flexibility analysis of large macromolecules. The proposed fFRI further reduces the computational complexity to O(N). Additionally, we propose anisotropic FRI (aFRI) algorithms for the analysis of protein collective dynamics. The aFRI algorithms permit adaptive Hessian matrices, from a completely global 3N × 3N matrix to completely local 3 × 3 matrices. These 3 × 3 matrices, despite being calculated locally, also contain non-local correlation information. Eigenvectors obtained from the proposed aFRI algorithms are able to demonstrate collective motions. Moreover, we investigate the performance of FRI by employing four families of radial basis correlation functions. Both parameter optimized and parameter-free FRI methods are explored. 
Furthermore, we compare the accuracy and efficiency of FRI with some established approaches to flexibility analysis, namely, normal mode analysis and Gaussian network model (GNM). The accuracy of the FRI method is tested using four sets of proteins, three sets of relatively small-, medium-, and large-sized structures and an extended set of 365 proteins. A fifth set of proteins is used to compare the efficiency of the FRI, fFRI, aFRI, and GNM methods. Intensive validation and comparison indicate that the FRI, particularly the fFRI, is orders of magnitude more efficient and about 10% more accurate overall than some of the most popular methods in the field. The proposed fFRI is able to predict B-factors for α-carbons of the HIV virus capsid (313 236 residues) in less than 30 seconds on a single processor using only one core. Finally, we demonstrate the application of FRI and aFRI to protein domain analysis.
Fast and anisotropic flexibility-rigidity index for protein flexibility and fluctuation analysis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Opron, Kristopher; Xia, Kelin; Wei, Guo-Wei, E-mail: wei@math.msu.edu

Protein structural fluctuation, typically measured by Debye-Waller factors, or B-factors, is a manifestation of protein flexibility, which strongly correlates to protein function. The flexibility-rigidity index (FRI) is a newly proposed method for the construction of atomic rigidity functions required in the theory of continuum elasticity with atomic rigidity, which is a new multiscale formalism for describing excessively large biomolecular systems. The FRI method analyzes protein rigidity and flexibility and is capable of predicting protein B-factors without resorting to matrix diagonalization. A fundamental assumption used in the FRI is that protein structures are uniquely determined by various internal and external interactions, while the protein functions, such as stability and flexibility, are solely determined by the structure. As such, one can predict protein flexibility without resorting to the protein interaction Hamiltonian. Consequently, bypassing the matrix diagonalization, the original FRI has a computational complexity of O(N^2). This work introduces a fast FRI (fFRI) algorithm for the flexibility analysis of large macromolecules. The proposed fFRI further reduces the computational complexity to O(N). Additionally, we propose anisotropic FRI (aFRI) algorithms for the analysis of protein collective dynamics. The aFRI algorithms permit adaptive Hessian matrices, from a completely global 3N × 3N matrix to completely local 3 × 3 matrices. These 3 × 3 matrices, despite being calculated locally, also contain non-local correlation information. Eigenvectors obtained from the proposed aFRI algorithms are able to demonstrate collective motions. Moreover, we investigate the performance of FRI by employing four families of radial basis correlation functions. Both parameter optimized and parameter-free FRI methods are explored.
Furthermore, we compare the accuracy and efficiency of FRI with some established approaches to flexibility analysis, namely, normal mode analysis and Gaussian network model (GNM). The accuracy of the FRI method is tested using four sets of proteins, three sets of relatively small-, medium-, and large-sized structures and an extended set of 365 proteins. A fifth set of proteins is used to compare the efficiency of the FRI, fFRI, aFRI, and GNM methods. Intensive validation and comparison indicate that the FRI, particularly the fFRI, is orders of magnitude more efficient and about 10% more accurate overall than some of the most popular methods in the field. The proposed fFRI is able to predict B-factors for α-carbons of the HIV virus capsid (313 236 residues) in less than 30 seconds on a single processor using only one core. Finally, we demonstrate the application of FRI and aFRI to protein domain analysis.
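To illustrate how a flexibility estimate can be obtained without matrix diagonalization, the following minimal sketch sums a radial basis correlation function over atom pairs and takes the reciprocal as a flexibility index. The exponential kernel form, the parameters `eta` and `kappa`, and the function name are assumptions made for illustration; the abstract does not specify them. This is the direct O(N^2) pair sum, not the fFRI algorithm.

```python
import math

def flexibility_index(coords, eta=3.0, kappa=1.0):
    """Return one flexibility value per atom.

    Rigidity of atom i is the sum over all other atoms j of an assumed
    exponential correlation exp(-(r_ij / eta) ** kappa); flexibility is
    taken as its reciprocal. Requires at least two atoms.
    """
    n = len(coords)
    flex = []
    for i in range(n):
        rigidity = 0.0
        for j in range(n):
            if i == j:
                continue
            r = math.dist(coords[i], coords[j])  # Euclidean distance
            rigidity += math.exp(-((r / eta) ** kappa))
        flex.append(1.0 / rigidity)
    return flex

# Four hypothetical atoms on a line, 3.8 units apart.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
flex = flexibility_index(coords)
```

In this toy chain the end atoms have fewer close neighbours than the interior atoms, so they receive a larger flexibility value.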
Convex Banding of the Covariance Matrix

PubMed Central

Bien, Jacob; Bunea, Florentina; Xiao, Luo

2016-01-01

We introduce a new sparse estimator of the covariance matrix for high-dimensional models in which the variables have a known ordering. Our estimator, which is the solution to a convex optimization problem, is equivalently expressed as an estimator which tapers the sample covariance matrix by a Toeplitz, sparsely-banded, data-adaptive matrix. As a result of this adaptivity, the convex banding estimator enjoys theoretical optimality properties not attained by previous banding or tapered estimators. In particular, our convex banding estimator is minimax rate adaptive in Frobenius and operator norms, up to log factors, over commonly-studied classes of covariance matrices, and over more general classes. Furthermore, it correctly recovers the bandwidth when the true covariance is exactly banded. Our convex formulation admits a simple and efficient algorithm. Empirical studies demonstrate its practical effectiveness and illustrate that our exactly-banded estimator works well even when the true covariance matrix is only close to a banded matrix, confirming our theoretical results. Our method compares favorably with all existing methods, in terms of accuracy and speed. We illustrate the practical merits of the convex banding estimator by showing that it can be used to improve the performance of discriminant analysis for classifying sound recordings. PMID:28042189
Convex Banding of the Covariance Matrix.

PubMed

Bien, Jacob; Bunea, Florentina; Xiao, Luo

2016-01-01

We introduce a new sparse estimator of the covariance matrix for high-dimensional models in which the variables have a known ordering. Our estimator, which is the solution to a convex optimization problem, is equivalently expressed as an estimator which tapers the sample covariance matrix by a Toeplitz, sparsely-banded, data-adaptive matrix. As a result of this adaptivity, the convex banding estimator enjoys theoretical optimality properties not attained by previous banding or tapered estimators. In particular, our convex banding estimator is minimax rate adaptive in Frobenius and operator norms, up to log factors, over commonly-studied classes of covariance matrices, and over more general classes. Furthermore, it correctly recovers the bandwidth when the true covariance is exactly banded. Our convex formulation admits a simple and efficient algorithm. Empirical studies demonstrate its practical effectiveness and illustrate that our exactly-banded estimator works well even when the true covariance matrix is only close to a banded matrix, confirming our theoretical results. Our method compares favorably with all existing methods, in terms of accuracy and speed. We illustrate the practical merits of the convex banding estimator by showing that it can be used to improve the performance of discriminant analysis for classifying sound recordings.
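The idea of tapering a sample covariance matrix by a banded Toeplitz matrix can be sketched as follows. This is only an illustration with a fixed, hand-chosen linear taper: the function names and the `bandwidth` parameter are hypothetical, and the data-adaptive taper obtained from the convex optimization problem in the abstract is not reproduced here.

```python
def taper_weights(p, bandwidth):
    """Toeplitz taper: the weight depends only on |i - j| and is zero
    once |i - j| exceeds the bandwidth (assumed linear decay)."""
    return [[max(0.0, 1.0 - abs(i - j) / (bandwidth + 1)) for j in range(p)]
            for i in range(p)]

def tapered_covariance(S, bandwidth):
    """Elementwise product of the sample covariance S with the taper."""
    p = len(S)
    T = taper_weights(p, bandwidth)
    return [[T[i][j] * S[i][j] for j in range(p)] for i in range(p)]

# Hypothetical 3 x 3 sample covariance for ordered variables.
S = [[1.0, 0.6, 0.3],
     [0.6, 1.0, 0.6],
     [0.3, 0.6, 1.0]]
S_tapered = tapered_covariance(S, bandwidth=1)
```

With `bandwidth=1` the diagonal is kept, the first off-diagonal is halved, and entries two or more positions from the diagonal are set to zero.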
Improved hybrid algorithm with Gaussian basis sets and plane waves: First-principles calculations of ethylene adsorption on β-SiC(001)-(3×2)

NASA Astrophysics Data System (ADS)

Wieferink, Jürgen; Krüger, Peter; Pollmann, Johannes

2006-11-01

We present an algorithm for DFT calculations employing Gaussian basis sets for the wave function and a Fourier basis for the potential representation. In particular, a numerically very efficient calculation of the local potential matrix elements and the charge density is described. Special emphasis is placed on the consequences of periodicity and explicit k-vector dependence. The algorithm is tested by comparison with more straightforward ones for the case of adsorption of ethylene on the silicon-rich SiC(001)-(3×2) surface, clearly revealing its substantial advantages. A complete self-consistency cycle is speeded up by roughly one order of magnitude since the calculation of matrix elements and of the charge density are accelerated by factors of 10 and 80, respectively, as compared to their straightforward calculation. Our results for C2H4:SiC(001)-(3×2) show that ethylene molecules preferentially adsorb in on-top positions above Si dimers on the substrate surface, saturating both dimer dangling bonds per unit cell. In addition, a twist of the molecules around a surface-perpendicular axis is slightly favored energetically, similar to the case of a complete monolayer of ethylene adsorbed on the Si(001)-(2×1) surface.
Direct Retrieval of Exterior Orientation Parameters Using A 2-D Projective Transformation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Seedahmed, Gamal H.

2006-09-01

Direct solutions are very attractive because they obviate the need for initial approximations associated with non-linear solutions. The Direct Linear Transformation (DLT) establishes itself as a method of choice for direct solutions in photogrammetry and other fields. The use of the DLT with coplanar object space points leads to a rank deficient model. This rank deficient model leaves the DLT defined up to a 2-D projective transformation, which makes the direct retrieval of the exterior orientation parameters (EOPs) a non-trivial task. This paper presents a novel direct algorithm to retrieve the EOPs from the 2-D projective transformation. It is based on a direct relationship between the 2-D projective transformation and the collinearity model using homogeneous coordinates representation. This representation offers a direct matrix correspondence between the 2-D projective transformation parameters and the collinearity model parameters. This correspondence lends itself to a direct matrix factorization to retrieve the EOPs. An important step in the proposed algorithm is a normalization process that provides the actual link between the 2-D projective transformation and the collinearity model. This paper explains the theoretical basis of the proposed algorithm as well as the necessary steps for its practical implementation. In addition, numerical examples are provided to demonstrate its validity.
QR-decomposition based SENSE reconstruction using parallel architecture.

PubMed

Ullah, Irfan; Nisar, Habab; Raza, Haseeb; Qasim, Malik; Inam, Omair; Omer, Hammad

2018-04-01

Magnetic Resonance Imaging (MRI) is a powerful medical imaging technique that provides essential clinical information about the human body. One major limitation of MRI is its long scan time. Implementation of advanced MRI algorithms on a parallel architecture (to exploit inherent parallelism) has a great potential to reduce the scan time. Sensitivity Encoding (SENSE) is a Parallel Magnetic Resonance Imaging (pMRI) algorithm that utilizes receiver coil sensitivities to reconstruct MR images from the acquired under-sampled k-space data. At the heart of SENSE lies inversion of a rectangular encoding matrix. This work presents a novel implementation of GPU based SENSE algorithm, which employs QR decomposition for the inversion of the rectangular encoding matrix. For a fair comparison, the performance of the proposed GPU based SENSE reconstruction is evaluated against single and multicore CPU using OpenMP. Several experiments against various acceleration factors (AFs) are performed using multichannel (8, 12 and 30) phantom and in-vivo human head and cardiac datasets. Experimental results show that GPU significantly reduces the computation time of SENSE reconstruction as compared to multi-core CPU (approximately 12x speedup) and single-core CPU (approximately 53x speedup) without any degradation in the quality of the reconstructed images. Copyright © 2018 Elsevier Ltd. All rights reserved.
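Inverting a rectangular matrix by QR decomposition amounts to solving a least-squares problem: factor E = QR, then back-substitute in R x = Q^T y. The sketch below shows this on a small real-valued example using modified Gram-Schmidt on the CPU. It is an assumption-laden illustration, not the GPU implementation from the abstract: actual SENSE encoding matrices are complex-valued, and all names here are hypothetical.

```python
def qr_gram_schmidt(A):
    """Thin QR of an m x n matrix (m >= n, full column rank) by modified
    Gram-Schmidt. Q is returned as a list of n orthonormal columns."""
    m, n = len(A), len(A[0])
    cols = [[A[i][j] for i in range(m)] for j in range(n)]
    Q = []
    R = [[0.0] * n for _ in range(n)]
    for j in range(n):
        v = cols[j][:]
        for k in range(j):
            # Remove the component of v along the k-th orthonormal column.
            R[k][j] = sum(Q[k][i] * v[i] for i in range(m))
            v = [v[i] - R[k][j] * Q[k][i] for i in range(m)]
        R[j][j] = sum(x * x for x in v) ** 0.5
        Q.append([x / R[j][j] for x in v])
    return Q, R

def solve_least_squares(A, y):
    """Minimize ||A x - y||_2 via QR and back-substitution."""
    Q, R = qr_gram_schmidt(A)
    n = len(R)
    qty = [sum(q[i] * y[i] for i in range(len(y))) for q in Q]  # Q^T y
    x = [0.0] * n
    for i in reversed(range(n)):
        s = qty[i] - sum(R[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / R[i][i]
    return x

# Overdetermined 3 x 2 system whose data lie exactly on y = 1 + 2 t.
E = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
y = [1.0, 3.0, 5.0]
x = solve_least_squares(E, y)
```

Because R is triangular, the final solve needs only back-substitution and the explicit pseudo-inverse of the rectangular matrix is never formed.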
Analysis of modified SMI method for adaptive array weight control

NASA Technical Reports Server (NTRS)

Dilsavor, R. L.; Moses, R. L.

1989-01-01

An adaptive array is applied to the problem of receiving a desired signal in the presence of weak interference signals which need to be suppressed. A modification, suggested by Gupta, of the sample matrix inversion (SMI) algorithm controls the array weights. In the modified SMI algorithm, interference suppression is increased by subtracting a fraction F of the noise power from the diagonal elements of the estimated covariance matrix. Given the true covariance matrix and the desired signal direction, the modified algorithm is shown to maximize a well-defined, intuitive output power ratio criterion. Expressions are derived for the expected value and variance of the array weights and output powers as a function of the fraction F and the number of snapshots used in the covariance matrix estimate. These expressions are compared with computer simulation and good agreement is found. A trade-off is found to exist between the desired level of interference suppression and the number of snapshots required in order to achieve that level with some certainty. The removal of noise eigenvectors from the covariance matrix inverse is also discussed with respect to this application. Finally, the type and severity of errors which occur in the covariance matrix estimate are characterized through simulation.
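The diagonal modification described above can be sketched in a few lines. The sketch assumes the standard SMI form in which the weights solve R w = s for a steering vector s; the abstract does not state this formula explicitly. It also uses real-valued data for brevity, and the function names are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x

def modified_smi_weights(R_hat, steering, noise_power, F):
    """Subtract a fraction F of the noise power from the diagonal of the
    estimated covariance, then solve for the weights (assumed w = R^-1 s)."""
    n = len(R_hat)
    R_mod = [[R_hat[i][j] - (F * noise_power if i == j else 0.0)
              for j in range(n)] for i in range(n)]
    return solve(R_mod, steering)

# Hypothetical two-element array with noise-only covariance 2 * I.
R_hat = [[2.0, 0.0], [0.0, 2.0]]
w_plain = modified_smi_weights(R_hat, [1.0, 1.0], noise_power=1.0, F=0.0)
w_mod = modified_smi_weights(R_hat, [1.0, 1.0], noise_power=1.0, F=0.5)
```

Setting `F=0.0` recovers the unmodified SMI weights, which makes the effect of the diagonal subtraction easy to compare.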
Acoustic 3D modeling by the method of integral equations

NASA Astrophysics Data System (ADS)

Malovichko, M.; Khokhlov, N.; Yavich, N.; Zhdanov, M.

2018-02-01

This paper presents a parallel algorithm for frequency-domain acoustic modeling by the method of integral equations (IE). The algorithm is applied to seismic simulation. The IE method reduces the size of the problem but leads to a dense system matrix. A tolerable memory consumption and numerical complexity were achieved by applying an iterative solver, accompanied by an effective matrix-vector multiplication operation, based on the fast Fourier transform (FFT). We demonstrate that the IE system matrix is better conditioned than that of the finite-difference (FD) method, and discuss its relation to a specially preconditioned FD matrix. We considered several methods of matrix-vector multiplication for the free-space and layered host models. The developed algorithm and computer code were benchmarked against the FD time-domain solution. It was demonstrated that the method could accurately calculate the seismic field for the models with sharp material boundaries and a point source and receiver located close to the free surface. We used OpenMP to speed up the matrix-vector multiplication, while MPI was used to speed up the solution of the system equations, and also for parallelizing across multiple sources. The practical examples and efficiency tests are presented as well.
A Practical Approximation Algorithm for the LTS Estimator

DTIC Science & Technology

2015-07-02

are computed quantities. We begin with a few standard definitions [19, 20]. Consider a d-vector x and a d × d matrix X. Let Xi, j denote the element...separately. By definition , the middle factor is just κ(XE). To analyze the first factor, observe that ‖yE − y∗E‖2 = ∑ j∈E(y j − y∗j)2. By our earlier...y j − y∗j)2 ≤ med j∈H(y j − y∗j)2 (h/2) med j∈H(y j − y∗j)2 ≤ 2 h . To analyze the third factor, recall that from the definition of the Frobenius
SAMSAN- MODERN NUMERICAL METHODS FOR CLASSICAL SAMPLED SYSTEM ANALYSIS

NASA Technical Reports Server (NTRS)

Frisch, H. P.

1994-01-01

SAMSAN was developed to aid the control system analyst by providing a self-consistent set of computer algorithms that support large order control system design and evaluation studies, with an emphasis placed on sampled system analysis. Control system analysts have access to a vast array of published algorithms to solve an equally large spectrum of controls related computational problems. The analyst usually spends considerable time and effort bringing these published algorithms to an integrated operational status and often finds them less general than desired. SAMSAN reduces the burden on the analyst by providing a set of algorithms that have been well tested and documented, and that can be readily integrated for solving control system problems. Algorithm selection for SAMSAN has been biased toward numerical accuracy for large order systems with computational speed and portability being considered important but not paramount. In addition to containing relevant subroutines from EISPACK for eigen-analysis and from LINPACK for the solution of linear systems and related problems, SAMSAN contains the following not so generally available capabilities: 1) Reduction of a real non-symmetric matrix to block diagonal form via a real similarity transformation matrix which is well conditioned with respect to inversion, 2) Solution of the generalized eigenvalue problem with balancing and grading, 3) Computation of all zeros of the determinant of a matrix of polynomials, 4) Matrix exponentiation and the evaluation of integrals involving the matrix exponential, with option to first block diagonalize, 5) Root locus and frequency response for single variable transfer functions in the S, Z, and W domains, 6) Several methods of computing zeros for linear systems, and 7) The ability to generate documentation "on demand". All matrix operations in the SAMSAN algorithms assume non-symmetric matrices with real double precision elements.
There is no fixed size limit on any matrix in any SAMSAN algorithm; however, it is generally agreed by experienced users, and in the numerical error analysis literature, that computation with non-symmetric matrices of order greater than about 200 should be avoided or treated with extreme care. SAMSAN attempts to support the needs of application oriented analysis by providing: 1) a methodology with unlimited growth potential, 2) a methodology to ensure that associated documentation is current and available "on demand", 3) a foundation of basic computational algorithms that most controls analysis procedures are based upon, 4) a set of check out and evaluation programs which demonstrate usage of the algorithms on a series of problems which are structured to expose the limits of each algorithm's applicability, and 5) capabilities which support both a priori and a posteriori error analysis for the computational algorithms provided. The SAMSAN algorithms are coded in FORTRAN 77 for batch or interactive execution and have been implemented on a DEC VAX computer under VMS 4.7. An effort was made to assure that the FORTRAN source code was portable and thus SAMSAN may be adaptable to other machine environments. The documentation is included on the distribution tape or can be purchased separately at the price below. SAMSAN version 2.0 was developed in 1982 and updated to version 3.0 in 1988.
Correlated Noise: How it Breaks NMF, and What to Do About It.

PubMed

Plis, Sergey M; Potluru, Vamsi K; Lane, Terran; Calhoun, Vince D

2011-01-12

Non-negative matrix factorization (NMF) is a problem of decomposing multivariate data into a set of features and their corresponding activations. When applied to experimental data, NMF has to cope with noise, which is often highly correlated. We show that correlated noise can break the Donoho and Stodden separability conditions of a dataset and a regular NMF algorithm will fail to decompose it, even when given freedom to be able to represent the noise as a separate feature. To cope with this issue, we present an algorithm for NMF with a generalized least squares objective function (glsNMF) and derive multiplicative updates for the method together with proving their convergence. The new algorithm successfully recovers the true representation from the noisy data. Robust performance can make glsNMF a valuable tool for analyzing empirical data.
Correlated Noise: How it Breaks NMF, and What to Do About It

PubMed Central

Plis, Sergey M.; Potluru, Vamsi K.; Lane, Terran; Calhoun, Vince D.

2010-01-01

Non-negative matrix factorization (NMF) is a problem of decomposing multivariate data into a set of features and their corresponding activations. When applied to experimental data, NMF has to cope with noise, which is often highly correlated. We show that correlated noise can break the Donoho and Stodden separability conditions of a dataset and a regular NMF algorithm will fail to decompose it, even when given freedom to be able to represent the noise as a separate feature. To cope with this issue, we present an algorithm for NMF with a generalized least squares objective function (glsNMF) and derive multiplicative updates for the method together with proving their convergence. The new algorithm successfully recovers the true representation from the noisy data. Robust performance can make glsNMF a valuable tool for analyzing empirical data. PMID:23750288
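For readers unfamiliar with multiplicative updates, the sketch below shows the classic updates for the ordinary least-squares (Frobenius norm) NMF objective. It is not the glsNMF algorithm of the abstract, whose generalized least squares updates are not given in this record; the function names, the random initialization, and the small `eps` guard against division by zero are assumptions for illustration.

```python
import random

def matmul(A, B):
    """Plain dense matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def nmf(V, rank, iters=200, eps=1e-9, seed=0):
    """Factor a non-negative matrix V (m x n) as W (m x rank) times H (rank x n)."""
    rng = random.Random(seed)
    m, n = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(m)]
    H = [[rng.random() + 0.1 for _ in range(n)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), elementwise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), elementwise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(m)]
    return W, H

def frob_error(V, W, H):
    """Frobenius norm of V - W H."""
    WH = matmul(W, H)
    return sum((v - w) ** 2 for rv, rw in zip(V, WH) for v, w in zip(rv, rw)) ** 0.5

# Small non-negative matrix of rank 2.
V = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [1.0, 1.0, 1.0]]
W, H = nmf(V, rank=2)
```

Because every update multiplies by a non-negative ratio, W and H stay non-negative throughout without any explicit projection step.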
A Parallel Multiclassification Algorithm for Big Data Using an Extreme Learning Machine.

PubMed

Duan, Mingxing; Li, Kenli; Liao, Xiangke; Li, Keqin

2018-06-01

As data sets become larger and more complicated, an extreme learning machine (ELM) that runs in a traditional serial environment cannot realize its ability to be fast and effective. Although a parallel ELM (PELM) based on MapReduce to process large-scale data shows more efficient learning speed than identical ELM algorithms in a serial environment, some operations, such as intermediate results stored on disks and multiple copies for each task, are indispensable, and these operations create a large amount of extra overhead and degrade the learning speed and efficiency of the PELMs. In this paper, an efficient ELM based on the Spark framework (SELM), which includes three parallel subalgorithms, is proposed for big data classification. By partitioning the corresponding data sets reasonably, the hidden layer output matrix calculation algorithm and the two matrix decomposition algorithms perform most of the computations locally. At the same time, they retain the intermediate results in distributed memory and cache the diagonal matrix as broadcast variables instead of several copies for each task to reduce a large amount of the costs, and these actions strengthen the learning ability of the SELM. Finally, we implement our SELM algorithm to classify large data sets. Extensive experiments have been conducted to validate the effectiveness of the proposed algorithms. As shown, our SELM achieves speedups on clusters with 10, 15, 20, 25, 30, and 35 nodes.
A space efficient flexible pivot selection approach to evaluate determinant and inverse of a matrix.

PubMed

Jafree, Hafsa Athar; Imtiaz, Muhammad; Inayatullah, Syed; Khan, Fozia Hanif; Nizami, Tajuddin

2014-01-01

This paper presents new simple approaches for evaluating the determinant and inverse of a matrix. The choice of pivot has been kept arbitrary, so the approaches can reduce the error when solving an ill-conditioned system. Computation of the determinant of a matrix has been made more efficient by saving unnecessary data storage and also by reducing the order of the matrix at each iteration, while dictionary notation [1] has been incorporated for computing the matrix inverse, thereby saving unnecessary calculations. These algorithms are highly classroom oriented and easy for students to use and implement. By taking advantage of the flexibility in pivot selection, one may largely avoid the development of fractions. Unlike the matrix inversion methods [2] and [3], the presented algorithms obviate the use of permutations and inverse permutations.
A Branch-and-Bound Algorithm for Fitting Anti-Robinson Structures to Symmetric Dissimilarity Matrices.

ERIC Educational Resources Information Center

Brusco, Michael J.

2002-01-01

Developed a branch-and-bound algorithm that can be used to seriate a symmetric dissimilarity matrix by identifying a reordering of rows and columns of the matrix optimizing an anti-Robinson criterion. Computational results suggest that with respect to computational efficiency, the approach is generally competitive with dynamic programming. (SLD)
A Strassen-Newton algorithm for high-speed parallelizable matrix inversion

NASA Technical Reports Server (NTRS)

Bailey, David H.; Ferguson, Helaman R. P.

1988-01-01

Techniques are described for computing matrix inverses by algorithms that are highly suited to massively parallel computation. The techniques are based on an algorithm suggested by Strassen (1969). Variations of this scheme use matrix Newton iterations and other methods to improve the numerical stability while at the same time preserving a very high level of parallelism. One-processor Cray-2 implementations of these schemes range from one that is up to 55 percent faster than a conventional library routine to one that is slower than a library routine but achieves excellent numerical stability. The problem of computing the solution to a single set of linear equations is discussed, and it is shown that this problem can also be solved efficiently using these techniques.
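A matrix Newton iteration for the inverse can be written using nothing but matrix multiplications, which is what makes it attractive for parallel hardware. The sketch below uses the classic iteration X <- X(2I - AX) with an ordinary dense product standing in for a Strassen-style multiplication. The scaled-transpose starting guess is a standard choice assumed here, not taken from the abstract, and the function names are hypothetical.

```python
def matmul(A, B):
    """Plain dense product; a Strassen-style product could be substituted."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def newton_inverse(A, iters=30):
    """Approximate the inverse of a nonsingular square matrix A by
    repeating X <- X (2I - A X)."""
    n = len(A)
    norm1 = max(sum(abs(A[i][j]) for i in range(n)) for j in range(n))
    norminf = max(sum(abs(x) for x in row) for row in A)
    # Starting guess A^T / (||A||_1 ||A||_inf), assumed for convergence.
    X = [[A[j][i] / (norm1 * norminf) for j in range(n)] for i in range(n)]
    for _ in range(iters):
        AX = matmul(A, X)
        corr = [[(2.0 if i == j else 0.0) - AX[i][j] for j in range(n)]
                for i in range(n)]
        X = matmul(X, corr)
    return X

A = [[4.0, 1.0], [2.0, 3.0]]
X = newton_inverse(A)
```

Each step roughly squares the residual I - AX, so a modest, fixed number of iterations suffices for a well-conditioned matrix such as this one.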
Mechanobiological simulations of peri-acetabular bone ingrowth: a comparative analysis of cell-phenotype specific and phenomenological algorithms.

PubMed

Mukherjee, Kaushik; Gupta, Sanjay

2017-03-01

Several mechanobiology algorithms have been employed to simulate bone ingrowth around porous coated implants. However, there is a scarcity of quantitative comparison between the efficacies of commonly used mechanoregulatory algorithms. The objectives of this study are: (1) to predict peri-acetabular bone ingrowth using cell-phenotype specific algorithm and to compare these predictions with those obtained using phenomenological algorithm and (2) to investigate the influences of cellular parameters on bone ingrowth. The variation in host bone material property and interfacial micromotion of the implanted pelvis were mapped onto the microscale model of implant-bone interface. An overall variation of 17-88 % in peri-acetabular bone ingrowth was observed. Despite differences in predicted tissue differentiation patterns during the initial period, both the algorithms predicted similar spatial distribution of neo-tissue layer, after attainment of equilibrium. Results indicated that phenomenological algorithm, being computationally faster than the cell-phenotype specific algorithm, might be used to predict peri-prosthetic bone ingrowth. The cell-phenotype specific algorithm, however, was found to be useful in numerically investigating the influence of alterations in cellular activities on bone ingrowth, owing to biologically related factors. Amongst the host of cellular activities, matrix production rate of bone tissue was found to have predominant influence on peri-acetabular bone ingrowth.
Face verification with balanced thresholds.

PubMed

Yan, Shuicheng; Xu, Dong; Tang, Xiaoou

2007-01-01

The process of face verification is guided by a pre-learned global threshold, which, however, is often inconsistent with class-specific optimal thresholds. It is, hence, beneficial to pursue a balance of the class-specific thresholds in the model-learning stage. In this paper, we present a new dimensionality reduction algorithm tailored to the verification task that ensures threshold balance. This is achieved by the following aspects. First, feasibility is guaranteed by employing an affine transformation matrix, instead of the conventional projection matrix, for dimensionality reduction, and, hence, we call the proposed algorithm threshold balanced transformation (TBT). Then, the affine transformation matrix, constrained as the product of an orthogonal matrix and a diagonal matrix, is optimized to improve the threshold balance and classification capability in an iterative manner. Unlike most algorithms for face verification which are directly transplanted from face identification literature, TBT is specifically designed for face verification and clarifies the intrinsic distinction between these two tasks. Experiments on three benchmark face databases demonstrate that TBT significantly outperforms the state-of-the-art subspace techniques for face verification.

VIP: Vortex Image Processing Package for High-contrast Direct Imaging

NASA Astrophysics Data System (ADS)

Gomez Gonzalez, Carlos Alberto; Wertz, Olivier; Absil, Olivier; Christiaens, Valentin; Defrère, Denis; Mawet, Dimitri; Milli, Julien; Absil, Pierre-Antoine; Van Droogenbroeck, Marc; Cantalloube, Faustine; Hinz, Philip M.; Skemer, Andrew J.; Karlsson, Mikael; Surdej, Jean

2017-07-01

We present the Vortex Image Processing (VIP) library, a python package dedicated to astronomical high-contrast imaging. Our package relies on the extensive python stack of scientific libraries and aims to provide a flexible framework for high-contrast data and image processing. In this paper, we describe the capabilities of VIP related to processing image sequences acquired using the angular differential imaging (ADI) observing technique. VIP implements functionalities for building high-contrast data processing pipelines, encompassing pre- and post-processing algorithms, potential source position and flux estimation, and sensitivity curve generation. Among the reference point-spread function subtraction techniques for ADI post-processing, VIP includes several flavors of principal component analysis (PCA) based algorithms, such as annular PCA and incremental PCA algorithms capable of processing big datacubes (of several gigabytes) on a computer with limited memory. Also, we present a novel ADI algorithm based on non-negative matrix factorization, which comes from the same family of low-rank matrix approximations as PCA and provides fairly similar results. We showcase the ADI capabilities of the VIP library using a deep sequence on HR 8799 taken with the LBTI/LMIRCam and its recently commissioned L-band vortex coronagraph. Using VIP, we investigated the presence of additional companions around HR 8799 and did not find any significant additional point source beyond the four known planets. VIP is available at http://github.com/vortex-exoplanet/VIP and is accompanied with Jupyter notebook tutorials illustrating the main functionalities of the library.
Preserving Symmetry in Preconditioned Krylov Subspace Methods

NASA Technical Reports Server (NTRS)

Chan, Tony F.; Chow, E.; Saad, Y.; Yeung, M. C.

1996-01-01

We consider the problem of solving a linear system Ax = b when A is nearly symmetric and when the system is preconditioned by a symmetric positive definite matrix M. In the symmetric case, one can recover symmetry by using M-inner products in the conjugate gradient (CG) algorithm. This idea can also be used in the nonsymmetric case, and near symmetry can be preserved similarly. Like CG, the new algorithms are mathematically equivalent to split preconditioning, but do not require M to be factored. Better robustness in a specific sense can also be observed. When combined with truncated versions of iterative methods, tests show that this is more effective than the common practice of forfeiting near-symmetry altogether.
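For the symmetric case mentioned above, the preconditioned conjugate gradient method needs only the action of the inverse of M on a vector; M is never split into factors. The following minimal sketch uses a diagonal (Jacobi) preconditioner as a stand-in for a general symmetric positive definite M. The function name, tolerance, and test system are assumptions for illustration, and the nonsymmetric variants proposed in the paper are not reproduced.

```python
def pcg(A, b, M_inv_diag, tol=1e-12, max_iter=100):
    """Preconditioned CG for symmetric positive definite A.
    M_inv_diag holds the diagonal of the inverse preconditioner."""
    n = len(b)

    def matvec(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(a * c for a, c in zip(u, v))

    x = [0.0] * n
    r = b[:]                                   # residual b - A x for x = 0
    z = [m * ri for m, ri in zip(M_inv_diag, r)]  # preconditioned residual
    p = z[:]
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            break
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ai for ri, ai in zip(r, Ap)]
        z = [m * ri for m, ri in zip(M_inv_diag, r)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x

A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
x = pcg(A, b, [1.0 / A[i][i] for i in range(3)])
```

The quantity `rz` is the inner product of the residual with the preconditioned residual, which is where the preconditioner enters the recurrences in place of an explicit factorization.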
CP decomposition approach to blind separation for DS-CDMA system using a new performance index

NASA Astrophysics Data System (ADS)

Rouijel, Awatif; Minaoui, Khalid; Comon, Pierre; Aboutajdine, Driss

2014-12-01

In this paper, we present a canonical polyadic (CP) tensor decomposition isolating the scaling matrix. This has two major implications: (i) the problem conditioning shows up explicitly and could be controlled through a constraint on the so-called coherences and (ii) a performance criterion concerning the factor matrices can be exactly calculated and is more realistic than performance metrics used in the literature. Two new algorithms optimizing the CP decomposition based on gradient descent are proposed. This decomposition is illustrated by an application to direct-sequence code division multiplexing access (DS-CDMA) systems; computer simulations are provided and demonstrate the good behavior of these algorithms, compared to others in the literature.
Simultaneous Denoising, Deconvolution, and Demixing of Calcium Imaging Data

PubMed Central

Pnevmatikakis, Eftychios A.; Soudry, Daniel; Gao, Yuanjun; Machado, Timothy A.; Merel, Josh; Pfau, David; Reardon, Thomas; Mu, Yu; Lacefield, Clay; Yang, Weijian; Ahrens, Misha; Bruno, Randy; Jessell, Thomas M.; Peterka, Darcy S.; Yuste, Rafael; Paninski, Liam

2016-01-01

SUMMARY We present a modular approach for analyzing calcium imaging recordings of large neuronal ensembles. Our goal is to simultaneously identify the locations of the neurons, demix spatially overlapping components, and denoise and deconvolve the spiking activity from the slow dynamics of the calcium indicator. Our approach relies on a constrained nonnegative matrix factorization that expresses the spatiotemporal fluorescence activity as the product of a spatial matrix that encodes the spatial footprint of each neuron in the optical field and a temporal matrix that characterizes the calcium concentration of each neuron over time. This framework is combined with a novel constrained deconvolution approach that extracts estimates of neural activity from fluorescence traces, to create a spatiotemporal processing algorithm that requires minimal parameter tuning. We demonstrate the general applicability of our method by applying it to in vitro and in vivo multineuronal imaging data, whole-brain light-sheet imaging data, and dendritic imaging data. PMID:26774160
Application of majority voting and consensus voting algorithms in N-version software

NASA Astrophysics Data System (ADS)

Tsarev, R. Yu; Durmuş, M. S.; Üstoglu, I.; Morozov, V. A.

2018-05-01

N-version programming is one of the most common techniques used to improve the reliability of software by building in fault tolerance and redundancy and by decreasing common cause failures. N different equivalent software versions are developed by N different and isolated workgroups from the same software specifications. The versions solve the same task and return results that have to be compared to determine the correct result. Decisions of N different versions are evaluated by a voting algorithm, the so-called voter. In this paper, two of the most commonly used software voting algorithms, the majority voting algorithm and the consensus voting algorithm, are studied. The distinctive features of N-version programming with majority voting and N-version programming with consensus voting are described. These two algorithms make a decision about the correct result on the basis of the agreement matrix. However, if the equivalence relation on the agreement matrix is not satisfied it is impossible to make a decision. It is shown that the agreement matrix can be transformed into an appropriate form by using the Boolean compositions when the equivalence relation is satisfied.
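The difference between the two voters can be shown with a minimal sketch. It assumes that two version outputs "agree" only when they are exactly equal (a simplification of the agreement matrix), and it returns `None` when no decision is possible; the function names and the tie-handling rule are assumptions, not taken from the paper.

```python
from collections import Counter

def majority_vote(outputs):
    """Return the value produced by a strict majority of versions, else None.

    Assumes a non-empty list of hashable outputs and exact-equality agreement.
    """
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) / 2 else None

def consensus_vote(outputs):
    """Return the value of the largest agreement group.

    Simplification (an assumption): if two groups tie for largest,
    no decision is made and None is returned.
    """
    ranked = Counter(outputs).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]
```

With five versions returning `[1, 1, 2, 3, 4]`, the majority voter cannot decide (two of five is not a majority), while the consensus voter selects `1` as the largest agreement group.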
Parallel Preconditioning for CFD Problems on the CM-5

NASA Technical Reports Server (NTRS)

Simon, Horst D.; Kremenetsky, Mark D.; Richardson, John; Lasinski, T. A. (Technical Monitor)

1994-01-01

Up to today, preconditioning methods on massively parallel systems have faced a major difficulty. The most successful preconditioning methods in terms of accelerating the convergence of the iterative solver, such as incomplete LU factorizations, are notoriously difficult to implement on parallel machines for two reasons: (1) the actual computation of the preconditioner is not very floating-point intensive, but requires a large amount of unstructured communication, and (2) the application of the preconditioning matrix in the iteration phase (i.e. triangular solves) is difficult to parallelize because of the recursive nature of the computation. Here we present a new approach to preconditioning for very large, sparse, unsymmetric, linear systems, which avoids both difficulties. We explicitly compute an approximate inverse to our original matrix. This new preconditioning matrix can be applied most efficiently for iterative methods on massively parallel machines, since the preconditioning phase involves only a matrix-vector multiplication, with possibly a dense matrix. Furthermore the actual computation of the preconditioning matrix has natural parallelism. For a problem of size n, the preconditioning matrix can be computed by solving n independent small least squares problems. The algorithm and its implementation on the Connection Machine CM-5 are discussed in detail and supported by extensive timings obtained from real problem data.
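The "n independent small least squares problems" can be sketched as follows: column j of the approximate inverse M minimizes the norm of A m_j - e_j over a restricted set of allowed nonzeros. The choice of sparsity pattern here (the nonzero pattern of column j of A) is an assumption for illustration, as are the function name and the use of dense NumPy arrays; the abstract does not specify these details.

```python
import numpy as np

def sparse_approximate_inverse(A):
    """Build M ~= inv(A) column by column, minimizing ||A @ M - I||_F.

    Each column is an independent least squares problem, so the loop
    body could run in parallel. Assumes every column of A has at least
    one nonzero. The sparsity pattern of column j of M is taken from
    column j of A (an illustrative assumption).
    """
    n = A.shape[0]
    M = np.zeros((n, n))
    for j in range(n):
        J = np.flatnonzero(A[:, j])          # allowed nonzero rows of M[:, j]
        e_j = np.zeros(n)
        e_j[j] = 1.0
        m_J, *_ = np.linalg.lstsq(A[:, J], e_j, rcond=None)
        M[J, j] = m_J
    return M
```

Applying the preconditioner is then a single matrix-vector product with M, which is the property the abstract highlights for massively parallel machines.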
Multi-site Field Verification of Laboratory Derived FDOM Sensor Corrections: The Good, the Bad and the Ugly

NASA Astrophysics Data System (ADS)

Saraceno, J.; Shanley, J. B.; Aulenbach, B. T.

2014-12-01

Fluorescent dissolved organic matter (FDOM) is an excellent proxy for dissolved organic carbon (DOC) in natural waters. Through this relationship, in situ FDOM can be utilized to capture both high frequency time series and long term fluxes of DOC in small streams. However, in order to calculate accurate DOC fluxes for comparison across sites, in situ FDOM data must be compensated for matrix effects. Key matrix effects include temperature, turbidity and the inner filter effect due to color. These interferences must be compensated for to develop a reasonable relationship between FDOM and DOC. In this study, we applied laboratory-derived correction factors to real time data from the five USGS WEBB headwater streams in order to gauge their effectiveness across a range of matrix effects. The good news is that laboratory-derived correction factors improved the predictive relationship (higher r²) between DOC and FDOM when compared to uncorrected data. The relative importance of each matrix effect (e.g. temperature) varied by site and by time, implying that each and every matrix effect should be compensated for when available. In general, temperature effects were more important on longer time scales, while corrections for turbidity and DOC inner filter effects were most prevalent during hydrologic events, when the highest instantaneous flux of DOC occurred. Unfortunately, even when corrected for matrix effects, in situ FDOM is a weaker predictor of DOC than A254, a common surrogate for DOC, implying that either DOC fluoresces at varying degrees (but should average out over time), that some matrix effects (e.g. pH) are unaccounted for, or that laboratory-derived correction factors do not encompass the site variability of particles and organics. The least impressive finding is that the inherent dependence on three variables in the FDOM correction algorithm increases the likelihood of record data gaps, which increases the uncertainty in calculated DOC flux values.
Research on sparse feature matching of improved RANSAC algorithm

NASA Astrophysics Data System (ADS)

Kong, Xiangsi; Zhao, Xian

2018-04-01

In this paper, a sparse feature matching method based on a modified RANSAC algorithm is proposed to improve precision and speed. First, the feature points of the images are extracted using the SIFT algorithm. Then, the image pair is matched roughly by generating SIFT feature descriptors. Finally, the precision of image matching is optimized by the modified RANSAC algorithm. The RANSAC algorithm is improved in three respects: instead of the homography matrix, this paper uses the fundamental matrix generated by the 8-point algorithm as the model; the sample is selected by a random block selecting method, which ensures uniform distribution and accuracy; and a sequential probability ratio test (SPRT) is added to standard RANSAC, which cuts down the overall running time of the algorithm. The experimental results show that this method not only achieves higher matching accuracy, but also greatly reduces the computation and improves the matching speed.
Generating probabilistic Boolean networks from a prescribed transition probability matrix.

PubMed

Ching, W-K; Chen, X; Tsing, N-K

2009-11-01

Probabilistic Boolean networks (PBNs) have received much attention in modeling genetic regulatory networks. A PBN can be regarded as a Markov chain process and is characterised by a transition probability matrix. In this study, the authors propose efficient algorithms for constructing a PBN when its transition probability matrix is given. The complexities of the algorithms are also analysed. This is an interesting inverse problem in network inference using steady-state data. The problem is important as most microarray data sets are assumed to be obtained from sampling the steady-state.
Matrix elements of vibration kinetic energy operator of tetrahedral molecules in non-orthogonal-dependent coordinates

NASA Astrophysics Data System (ADS)

Protasevich, Alexander E.; Nikitin, Andrei V.

2018-01-01

In this work, we propose an algorithm for calculating the matrix elements of the kinetic energy operator for tetrahedral molecules. This algorithm uses the dependent six-angle coordinates (6A) and takes into account the full symmetry of molecules. Unlike A.V. Nikitin, M. Rey, and Vl. G. Tyuterev who operate with the kinetic energy operator only in Radau orthogonal coordinates, we consider a general case. The matrix elements are shown to be a sum of products of one-dimensional integrals.
New method for propagating the square root covariance matrix in triangular form. [using Kalman-Bucy filter]

NASA Technical Reports Server (NTRS)

Choe, C. Y.; Tapley, B. D.

1975-01-01

A method proposed by Potter of applying the Kalman-Bucy filter to the problem of estimating the state of a dynamic system is described, in which the square root of the state error covariance matrix is used to process the observations. A new technique which propagates the covariance square root matrix in lower triangular form is given for the discrete observation case. The technique is faster than previously proposed algorithms and is well-adapted for use with the Carlson square root measurement algorithm.
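To make the idea of propagating a triangular covariance square root concrete, here is a minimal sketch of the generic QR-based time update. It is a standard textbook construction and is not claimed to be the technique of this report; the names `Phi` (state transition matrix), `G` (a square root of the process noise covariance), and the function name are assumptions.

```python
import numpy as np

def propagate_sqrt_covariance(S, Phi, G):
    """Return lower-triangular S_new with S_new @ S_new.T == Phi P Phi^T + G G^T.

    S is a square root of the current covariance (P = S @ S.T).
    The covariance itself is never formed: the stacked factor
    W = [Phi @ S, G] satisfies W @ W.T = Phi P Phi^T + G G^T, and a QR
    factorization of W.T yields an upper-triangular R with R.T @ R = W @ W.T.
    """
    W = np.hstack([Phi @ S, G])
    R = np.linalg.qr(W.T, mode="r")       # upper triangular, n x n
    S_new = R.T                           # lower triangular
    signs = np.sign(np.diag(S_new))
    signs[signs == 0] = 1.0
    return S_new * signs                  # flip columns so the diagonal is >= 0
```

Working with the square root rather than the covariance keeps the implied covariance symmetric and positive semidefinite by construction, which is the usual motivation for square-root filters.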
Quantum Support Vector Machine for Big Data Classification

NASA Astrophysics Data System (ADS)

Rebentrost, Patrick; Mohseni, Masoud; Lloyd, Seth

2014-09-01

Supervised machine learning is the classification of new data based on already classified training examples. In this work, we show that the support vector machine, an optimized binary classifier, can be implemented on a quantum computer, with complexity logarithmic in the size of the vectors and the number of training examples. In cases where classical sampling algorithms require polynomial time, an exponential speedup is obtained. At the core of this quantum big data algorithm is a nonsparse matrix exponentiation technique for efficiently performing a matrix inversion of the training data inner-product (kernel) matrix.
Accelerated optimization and automated discovery with covariance matrix adaptation for experimental quantum control

NASA Astrophysics Data System (ADS)

Roslund, Jonathan; Shir, Ofer M.; Bäck, Thomas; Rabitz, Herschel

2009-10-01

Optimization of quantum systems by closed-loop adaptive pulse shaping offers a rich domain for the development and application of specialized evolutionary algorithms. Derandomized evolution strategies (DESs) are presented here as a robust class of optimizers for experimental quantum control. The combination of stochastic and quasi-local search embodied by these algorithms is especially amenable to the inherent topology of quantum control landscapes. Implementation of DES in the laboratory results in efficiency gains of up to ~9 times that of the standard genetic algorithm, and thus is a promising tool for optimization of unstable or fragile systems. The statistical learning upon which these algorithms are predicated also provides the means for obtaining a control problem's Hessian matrix with no additional experimental overhead. The forced optimal covariance adaptive learning (FOCAL) method is introduced to enable retrieval of the Hessian matrix, which can reveal information about the landscape's local structure and dynamic mechanism. Exploitation of such algorithms in quantum control experiments should enhance their efficiency and provide additional fundamental insights.
Formulating face verification with semidefinite programming.

PubMed

Yan, Shuicheng; Liu, Jianzhuang; Tang, Xiaoou; Huang, Thomas S

2007-11-01

This paper presents a unified solution to three unsolved problems existing in face verification with subspace learning techniques: selection of verification threshold, automatic determination of subspace dimension, and deducing feature fusing weights. In contrast to previous algorithms which search for the projection matrix directly, our new algorithm investigates a similarity metric matrix (SMM). With a certain verification threshold, this matrix is learned by a semidefinite programming approach, along with the constraints of the kindred pairs with similarity larger than the threshold, and inhomogeneous pairs with similarity smaller than the threshold. Then, the subspace dimension and the feature fusing weights are simultaneously inferred from the singular value decomposition of the derived SMM. In addition, the weighted and tensor extensions are proposed to further improve the algorithmic effectiveness and efficiency, respectively. Essentially, the verification is conducted within an affine subspace in this new algorithm and is, hence, called the affine subspace for verification (ASV). Extensive experiments show that the ASV can achieve encouraging face verification accuracy in comparison to other subspace algorithms, even without the need to explore any parameters.
Numerical Differentiation Methods for Computing Error Covariance Matrices in Item Response Theory Modeling: An Evaluation and a New Proposal

ERIC Educational Resources Information Center

Tian, Wei; Cai, Li; Thissen, David; Xin, Tao

2013-01-01

In item response theory (IRT) modeling, the item parameter error covariance matrix plays a critical role in statistical inference procedures. When item parameters are estimated using the EM algorithm, the parameter error covariance matrix is not an automatic by-product of item calibration. Cai proposed the use of Supplemented EM algorithm for…
Preconditioned Alternating Projection Algorithms for Maximum a Posteriori ECT Reconstruction

PubMed Central

Krol, Andrzej; Li, Si; Shen, Lixin; Xu, Yuesheng

2012-01-01

We propose a preconditioned alternating projection algorithm (PAPA) for solving the maximum a posteriori (MAP) emission computed tomography (ECT) reconstruction problem. Specifically, we formulate the reconstruction problem as a constrained convex optimization problem with the total variation (TV) regularization. We then characterize the solution of the constrained convex optimization problem and show that it satisfies a system of fixed-point equations defined in terms of two proximity operators arising from the convex functions that define the TV-norm and the constraint involved in the problem. The characterization (of the solution) via the proximity operators that define two projection operators naturally leads to an alternating projection algorithm for finding the solution. For efficient numerical computation, we introduce to the alternating projection algorithm a preconditioning matrix (the EM-preconditioner) for the dense system matrix involved in the optimization problem. We prove theoretically the convergence of the preconditioned alternating projection algorithm. In numerical experiments, performance of our algorithms, with an appropriately selected preconditioning matrix, is compared with performance of the conventional MAP expectation-maximization (MAP-EM) algorithm with TV regularizer (EM-TV) and that of the recently developed nested EM-TV algorithm for ECT reconstruction. Based on the numerical experiments performed in this work, we observe that the alternating projection algorithm with the EM-preconditioner significantly outperforms the EM-TV in all aspects including the convergence speed, the noise in the reconstructed images and the image quality. It also outperforms the nested EM-TV in the convergence speed while providing comparable image quality. PMID:23271835
Towards Batched Linear Solvers on Accelerated Hardware Platforms

DOE Office of Scientific and Technical Information (OSTI.GOV)

Haidar, Azzam; Dong, Tingzing Tim; Tomov, Stanimire

2015-01-01

As hardware evolves, an increasingly effective approach to develop energy efficient, high-performance solvers is to design them to work on many small and independent problems. Indeed, many applications already need this functionality, especially for GPUs, which are known to be currently about four to five times more energy efficient than multicore CPUs for every floating-point operation. In this paper, we describe the development of the main one-sided factorizations (LU, QR, and Cholesky) that are needed for a set of small dense matrices to work in parallel. We refer to such algorithms as batched factorizations. Our approach is based on representing the algorithms as a sequence of batched BLAS routines for GPU-contained execution. Note that this is similar in functionality to the LAPACK and the hybrid MAGMA algorithms for large-matrix factorizations. But it is different from a straightforward approach, whereby each of GPU's symmetric multiprocessors factorizes a single problem at a time. We illustrate how our performance analysis together with the profiling and tracing tools guided the development of batched factorizations to achieve up to 2-fold speedup and 3-fold better energy efficiency compared to our highly optimized batched CPU implementations based on the MKL library on a two-socket Intel Sandy Bridge server. Compared to a batched LU factorization featured in NVIDIA's CUBLAS library for GPUs, we achieve up to 2.5-fold speedup on the K40 GPU.
Algorithm for optimizing bipolar interconnection weights with applications in associative memories and multitarget classification.

PubMed

Chang, S; Wong, K W; Zhang, W; Zhang, Y

1999-08-10

An algorithm for optimizing a bipolar interconnection weight matrix with the Hopfield network is proposed. The effectiveness of this algorithm is demonstrated by computer simulation and optical implementation. In the optical implementation of the neural network the interconnection weights are biased to yield a nonnegative weight matrix. Moreover, a threshold subchannel is added so that the system can realize, in real time, the bipolar weighted summation in a single channel. Preliminary experimental results obtained from the applications in associative memories and multitarget classification with rotation invariance are shown.
Algorithm for Optimizing Bipolar Interconnection Weights with Applications in Associative Memories and Multitarget Classification

NASA Astrophysics Data System (ADS)

Chang, Shengjiang; Wong, Kwok-Wo; Zhang, Wenwei; Zhang, Yanxin

1999-08-01

An algorithm for optimizing a bipolar interconnection weight matrix with the Hopfield network is proposed. The effectiveness of this algorithm is demonstrated by computer simulation and optical implementation. In the optical implementation of the neural network the interconnection weights are biased to yield a nonnegative weight matrix. Moreover, a threshold subchannel is added so that the system can realize, in real time, the bipolar weighted summation in a single channel. Preliminary experimental results obtained from the applications in associative memories and multitarget classification with rotation invariance are shown.
Solving periodic block tridiagonal systems using the Sherman-Morrison-Woodbury formula

NASA Technical Reports Server (NTRS)

Yarrow, Maurice

1989-01-01

Many algorithms for solving the Navier-Stokes equations require the solution of periodic block tridiagonal systems of equations. By applying a splitting to the matrix representing this system of equations, it may first be reduced to a block tridiagonal matrix plus an outer product of two block vectors. The Sherman-Morrison-Woodbury formula is then applied. The algorithm thus reduces a periodic banded system to a non-periodic banded system with additional right-hand sides and is of higher efficiency than standard Thomas algorithm/LU decompositions.
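The reduction described above can be sketched for the scalar (non-block) case, where the rank-one Sherman-Morrison form of the formula suffices: the periodic corners are moved into an outer product u v^T, and two ordinary tridiagonal solves recover the periodic solution. The array layout (corner entries stored in `a[0]` and `c[-1]`), the function names, and the choice `gamma = -b[0]` are assumptions for illustration; the report itself treats block systems.

```python
def thomas(a, b, c, d):
    """Solve a tridiagonal system (sub a, diag b, super c); a[0], c[-1] ignored."""
    n = len(b)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = c[0] / b[0], d[0] / b[0]
    for i in range(1, n):
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

def solve_periodic_tridiagonal(a, b, c, d):
    """Solve a periodic tridiagonal system (n >= 3) via Sherman-Morrison.

    a[0] is the top-right corner entry and c[-1] the bottom-left corner.
    Write A = T + u v^T with u = (gamma, 0, ..., 0, beta) and
    v = (1, 0, ..., 0, alpha/gamma); T is then purely tridiagonal.
    """
    n = len(b)
    alpha, beta = a[0], c[-1]
    gamma = -b[0]                          # any nonzero value keeping T nonsingular
    bb = list(b)
    bb[0] = b[0] - gamma
    bb[-1] = b[-1] - alpha * beta / gamma
    u = [0.0] * n
    u[0], u[-1] = gamma, beta
    y = thomas(a, bb, c, d)                # T y = d
    z = thomas(a, bb, c, u)                # T z = u (the extra right-hand side)
    v_dot_y = y[0] + alpha * y[-1] / gamma
    v_dot_z = z[0] + alpha * z[-1] / gamma
    factor = v_dot_y / (1.0 + v_dot_z)
    return [yi - factor * zi for yi, zi in zip(y, z)]
```

The second solve with right-hand side `u` is the "additional right-hand side" mentioned in the abstract; in the block case u and v become block vectors and the scalar division becomes a small dense solve.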

A parallel algorithm for Hamiltonian matrix construction in electron-molecule collision calculations: MPI-SCATCI

NASA Astrophysics Data System (ADS)

Al-Refaie, Ahmed F.; Tennyson, Jonathan

2017-12-01

Construction and diagonalization of the Hamiltonian matrix is the rate-limiting step in most low-energy electron-molecule collision calculations. Tennyson (1996) implemented a novel algorithm for Hamiltonian construction which took advantage of the structure of the wavefunction in such calculations. This algorithm is re-engineered to make use of modern computer architectures and the use of appropriate diagonalizers is considered. Test calculations demonstrate that significant speed-ups can be gained using multiple CPUs. This opens the way to calculations which consider higher collision energies, larger molecules and/or more target states. The methodology, which is implemented as part of the UK molecular R-matrix codes (UKRMol and UKRMol+), can also be used for studies of bound molecular Rydberg states, photoionization and positron-molecule collisions.
Layer-oriented multigrid wavefront reconstruction algorithms for multi-conjugate adaptive optics

NASA Astrophysics Data System (ADS)

Gilles, Luc; Ellerbroek, Brent L.; Vogel, Curtis R.

2003-02-01

Multi-conjugate adaptive optics (MCAO) systems with 10^4-10^5 degrees of freedom have been proposed for future giant telescopes. Using standard matrix methods to compute, optimize, and implement wavefront control algorithms for these systems is impractical, since the number of calculations required to compute and apply the reconstruction matrix scales respectively with the cube and the square of the number of AO degrees of freedom. In this paper, we develop an iterative sparse matrix implementation of minimum variance wavefront reconstruction for telescope diameters up to 32 m with more than 10^4 actuators. The basic approach is the preconditioned conjugate gradient method, using a multigrid preconditioner incorporating a layer-oriented (block) symmetric Gauss-Seidel iterative smoothing operator. We present open-loop numerical simulation results to illustrate algorithm convergence.
The density matrix renormalization group algorithm on kilo-processor architectures: Implementation and trade-offs

NASA Astrophysics Data System (ADS)

Nemes, Csaba; Barcza, Gergely; Nagy, Zoltán; Legeza, Örs; Szolgay, Péter

2014-06-01

In the numerical analysis of strongly correlated quantum lattice models one of the leading algorithms developed to balance the size of the effective Hilbert space and the accuracy of the simulation is the density matrix renormalization group (DMRG) algorithm, in which the run-time is dominated by the iterative diagonalization of the Hamilton operator. As the most time-dominant step of the diagonalization can be expressed as a list of dense matrix operations, the DMRG is an appealing candidate to fully utilize the computing power residing in novel kilo-processor architectures. In the paper a smart hybrid CPU-GPU implementation is presented, which exploits the power of both CPU and GPU and tolerates problems exceeding the GPU memory size. Furthermore, a new CUDA kernel has been designed for asymmetric matrix-vector multiplication to accelerate the rest of the diagonalization. Besides the evaluation of the GPU implementation, the practical limits of an FPGA implementation are also discussed.
Blockwise conjugate gradient methods for image reconstruction in volumetric CT.

PubMed

Qiu, W; Titley-Peloquin, D; Soleimani, M

2012-11-01

Cone beam computed tomography (CBCT) enables volumetric image reconstruction from 2D projection data and plays an important role in image guided radiation therapy (IGRT). Filtered back projection is still the most frequently used algorithm in applications. The algorithm discretizes the scanning process (forward projection) into a system of linear equations, which must then be solved to recover images from measured projection data. The conjugate gradients (CG) algorithm and its variants can be used to solve (possibly regularized) linear systems of equations Ax = b and linear least squares problems min_x ‖b - Ax‖_2, especially when the matrix A is very large and sparse. Their applications can be found in a general CT context, but in tomography problems (e.g. CBCT reconstruction) they have not widely been used. Hence, CBCT reconstruction using the CG-type algorithm LSQR was implemented and studied in this paper. In CBCT reconstruction, the main computational challenge is that the matrix A usually is very large, and storing it in full requires an amount of memory well beyond the reach of commodity computers. Because of these memory capacity constraints, only a small fraction of the weighting matrix A is typically used, leading to a poor reconstruction. In this paper, to overcome this difficulty, the matrix A is partitioned and stored blockwise, and blockwise matrix-vector multiplications are implemented within LSQR. This implementation allows us to use the full weighting matrix A for CBCT reconstruction without further enhancing computer standards. Tikhonov regularization can also be implemented in this fashion, and can produce significant improvement in the reconstructed images. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
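The blockwise idea can be sketched with a CG-type least squares solver that touches A only through products with one row block at a time. The sketch uses CGLS, which is mathematically equivalent to LSQR in exact arithmetic but is not the LSQR implementation of the paper; the in-memory list `blocks` stands in for blocks that would be loaded from disk, and all names are assumptions.

```python
import numpy as np

def matvec(blocks, x):
    """Compute A @ x where A is stored as a list of row blocks."""
    return np.concatenate([B @ x for B in blocks])

def rmatvec(blocks, y):
    """Compute A.T @ y, accumulating one row block at a time."""
    out = np.zeros(blocks[0].shape[1])
    start = 0
    for B in blocks:
        rows = B.shape[0]
        out += B.T @ y[start:start + rows]
        start += rows
    return out

def cgls_blockwise(blocks, b, iterations=50, tol=1e-12):
    """Minimize ||b - A x||_2 with CGLS, using only blockwise products."""
    x = np.zeros(blocks[0].shape[1])
    r = b - matvec(blocks, x)              # residual in data space
    s = rmatvec(blocks, r)                 # gradient direction A.T @ r
    p = s.copy()
    gamma = s @ s
    for _ in range(iterations):
        if gamma <= tol:                   # normal-equation residual is tiny
            break
        q = matvec(blocks, p)
        alpha = gamma / (q @ q)
        x += alpha * p
        r -= alpha * q
        s = rmatvec(blocks, r)
        gamma_new = s @ s
        p = s + (gamma_new / gamma) * p
        gamma = gamma_new
    return x
```

Only one block needs to be resident in memory during each product, so peak memory is set by the block size rather than by the full matrix.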
Clustering PPI data by combining FA and SHC method.

PubMed

Lei, Xiujuan; Ying, Chao; Wu, Fang-Xiang; Xu, Jin

2015-01-01

Clustering is one of the main methods to identify functional modules from protein-protein interaction (PPI) data. Nevertheless, traditional clustering methods may not be effective for clustering PPI data. In this paper, we proposed a novel method for clustering PPI data by combining the firefly algorithm (FA) and the synchronization-based hierarchical clustering (SHC) algorithm. Firstly, the PPI data are preprocessed via spectral clustering (SC), which transforms the high-dimensional similarity matrix into a low-dimensional matrix. Then the SHC algorithm is used to perform clustering. In the SHC algorithm, hierarchical clustering is achieved by enlarging the neighborhood radius of synchronized objects continuously, but the hierarchical search has great difficulty finding the optimal neighborhood radius of synchronization and its efficiency is not high. So we adopt the firefly algorithm to determine the optimal threshold of the neighborhood radius of synchronization automatically. The proposed algorithm is tested on the MIPS PPI dataset. The results show that our proposed algorithm is better than the traditional algorithms in precision, recall and f-measure value.
Clustering PPI data by combining FA and SHC method

PubMed Central

2015-01-01

Clustering is one of the main methods to identify functional modules from protein-protein interaction (PPI) data. Nevertheless, traditional clustering methods may not be effective for clustering PPI data. In this paper, we proposed a novel method for clustering PPI data by combining the firefly algorithm (FA) and the synchronization-based hierarchical clustering (SHC) algorithm. Firstly, the PPI data are preprocessed via spectral clustering (SC), which transforms the high-dimensional similarity matrix into a low-dimensional matrix. Then the SHC algorithm is used to perform clustering. In the SHC algorithm, hierarchical clustering is achieved by enlarging the neighborhood radius of synchronized objects continuously, but the hierarchical search has great difficulty finding the optimal neighborhood radius of synchronization and its efficiency is not high. So we adopt the firefly algorithm to determine the optimal threshold of the neighborhood radius of synchronization automatically. The proposed algorithm is tested on the MIPS PPI dataset. The results show that our proposed algorithm is better than the traditional algorithms in precision, recall and f-measure value. PMID:25707632
Coarse Alignment Technology on Moving base for SINS Based on the Improved Quaternion Filter Algorithm.

PubMed

Zhang, Tao; Zhu, Yongyun; Zhou, Feng; Yan, Yaxiong; Tong, Jinwu

2017-06-17

Initial alignment of the strapdown inertial navigation system (SINS) is intended to determine the initial attitude matrix in a short time with certain accuracy. The alignment accuracy of the quaternion filter algorithm is remarkable, but the convergence rate is slow. To solve this problem, this paper proposes an improved quaternion filter algorithm for faster initial alignment based on the error model of the quaternion filter algorithm. The improved quaternion filter algorithm constructs the K matrix based on the principle of the optimal quaternion algorithm, and rebuilds the measurement model to include acceleration and velocity errors, making the convergence rate faster. A Doppler velocity log (DVL) provides the reference velocity for the improved quaternion filter alignment algorithm. In order to demonstrate the performance of the improved quaternion filter algorithm in the field, a turntable experiment and a vehicle test are carried out. The results of the experiments show that the convergence rate of the proposed improved quaternion filter is faster than that of the traditional quaternion filter algorithm. In addition, the improved quaternion filter algorithm also demonstrates advantages in terms of correctness, effectiveness, and practicability.
A modified dual-level algorithm for large-scale three-dimensional Laplace and Helmholtz equation

NASA Astrophysics Data System (ADS)

Li, Junpu; Chen, Wen; Fu, Zhuojia

2018-01-01

A modified dual-level algorithm is proposed in the article. With the help of the dual-level structure, the fully-populated interpolation matrix on the fine level is transformed into a locally supported sparse matrix, which addresses the severe ill-conditioning and excessive storage requirements resulting from the fully-populated interpolation matrix. The kernel-independent fast multipole method is adopted to expedite the solution of the linear equations on the coarse level. Numerical experiments with up to 2 million fine-level nodes have been carried out successfully. It is noted that the proposed algorithm merely needs to place 2-3 coarse-level nodes in each wavelength per direction to obtain a reasonable solution, which is almost down to the minimum requirement allowed by Shannon's sampling theorem. In the real human head model example, it is observed that the proposed algorithm can simulate well the computationally very challenging exterior high-frequency harmonic acoustic wave propagation up to 20,000 Hz.
Taming the Wild: A Unified Analysis of Hogwild!-Style Algorithms.

PubMed

De Sa, Christopher; Zhang, Ce; Olukotun, Kunle; Ré, Christopher

2015-12-01

Stochastic gradient descent (SGD) is a ubiquitous algorithm for a variety of machine learning problems. Researchers and industry have developed several techniques to optimize SGD's runtime performance, including asynchronous execution and reduced precision. Our main result is a martingale-based analysis that enables us to capture the rich noise models that may arise from such techniques. Specifically, we use our new analysis in three ways: (1) we derive convergence rates for the convex case (Hogwild!) with relaxed assumptions on the sparsity of the problem; (2) we analyze asynchronous SGD algorithms for non-convex matrix problems including matrix completion; and (3) we design and analyze an asynchronous SGD algorithm, called Buckwild!, that uses lower-precision arithmetic. We show experimentally that our algorithms run efficiently for a variety of problems on modern hardware.
Scalable and fault tolerant orthogonalization based on randomized distributed data aggregation

PubMed Central

Gansterer, Wilfried N.; Niederbrucker, Gerhard; Straková, Hana; Schulze Grotthoff, Stefan

2013-01-01

The construction of distributed algorithms for matrix computations built on top of distributed data aggregation algorithms with randomized communication schedules is investigated. For this purpose, a new aggregation algorithm for summing or averaging distributed values, the push-flow algorithm, is developed, which achieves superior resilience properties with respect to failures compared to existing aggregation methods. It is illustrated that on a hypercube topology it asymptotically requires the same number of iterations as the optimal all-to-all reduction operation and that it scales well with the number of nodes. Orthogonalization is studied as a prototypical matrix computation task. A new fault tolerant distributed orthogonalization method rdmGS, which can produce accurate results even in the presence of node failures, is built on top of distributed data aggregation algorithms. PMID:24748902
On structure-exploiting trust-region regularized nonlinear least squares algorithms for neural-network learning.

PubMed

Mizutani, Eiji; Demmel, James W

2003-01-01

This paper briefly introduces our numerical linear algebra approaches for solving structured nonlinear least squares problems arising from 'multiple-output' neural-network (NN) models. Our algorithms feature trust-region regularization, and exploit sparsity of either the 'block-angular' residual Jacobian matrix or the 'block-arrow' Gauss-Newton Hessian (or Fisher information matrix in statistical sense) depending on problem scale so as to render a large class of NN-learning algorithms 'efficient' in both memory and operation costs. Using a relatively large real-world nonlinear regression application, we shall explain algorithmic strengths and weaknesses, analyzing simulation results obtained by both direct and iterative trust-region algorithms with two distinct NN models: 'multilayer perceptrons' (MLP) and 'complementary mixtures of MLP-experts' (or neuro-fuzzy modular networks).
Testing of Lagrange multiplier damped least-squares control algorithm for woofer-tweeter adaptive optics

PubMed Central

Zou, Weiyao; Burns, Stephen A.

2012-01-01

A Lagrange multiplier-based damped least-squares control algorithm for woofer-tweeter (W-T) dual deformable-mirror (DM) adaptive optics (AO) is tested with a breadboard system. We show that the algorithm can complementarily command the two DMs to correct wavefront aberrations within a single optimization process: the woofer DM correcting the high-stroke, low-order aberrations, and the tweeter DM correcting the low-stroke, high-order aberrations. The optimal damping factor for a DM is found to be the median of the eigenvalue spectrum of the influence matrix of that DM. Wavefront control accuracy is maximized with the optimized control parameters. For the breadboard system, the residual wavefront error can be controlled to the precision of 0.03 μm in root mean square. The W-T dual-DM AO has applications in both ophthalmology and astronomy. PMID:22441462
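The damping idea in this abstract can be sketched for a single mirror as follows. This is a simplified illustration, not the authors' Lagrange multiplier formulation for two mirrors. The influence matrix `A`, the slope vector `s`, and the reading of "eigenvalue spectrum of the influence matrix" as the eigenvalues of A^T A are all assumptions.

```python
import numpy as np

def damped_least_squares(A, s, damping):
    """Minimize ||A c - s||^2 + damping * ||c||^2 over actuator commands c."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + damping * np.eye(n), A.T @ s)

rng = np.random.default_rng(0)
A = rng.normal(size=(40, 10))   # hypothetical influence matrix: 40 measurements x 10 actuators
s = rng.normal(size=40)         # hypothetical wavefront measurement vector

# Assumption: take the damping factor as the median eigenvalue of A^T A.
eigvals = np.linalg.eigvalsh(A.T @ A)
damping = np.median(eigvals)

c_damped = damped_least_squares(A, s, damping)
c_plain = damped_least_squares(A, s, 0.0)   # undamped least squares for comparison
```

A positive damping factor shrinks the command vector relative to the undamped solution, which limits actuator stroke at the cost of a larger fitting residual.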
F-MAP: A Bayesian approach to infer the gene regulatory network using external hints

PubMed Central

Shahdoust, Maryam; Mahjub, Hossein; Sadeghi, Mehdi

2017-01-01

The common topological features of the gene regulatory networks of related species suggest that the network of one species can be reconstructed using additional information from the gene expression profiles of related species. We present an algorithm, named F-MAP, to reconstruct the gene regulatory network by applying knowledge about gene interactions from related species. Our algorithm sets up a Bayesian framework to estimate the precision matrix of one species' microarray gene expression dataset in order to infer the Gaussian graphical model of the network. The conjugate Wishart prior is used, and the information from related species is applied to estimate the hyperparameters of the prior distribution by using factor analysis. Applying the proposed algorithm to six related species of Drosophila shows that the precision of the reconstructed networks is improved considerably compared to the precision of networks constructed by other Bayesian approaches. PMID:28938012
Testing of Lagrange multiplier damped least-squares control algorithm for woofer-tweeter adaptive optics.

PubMed

Zou, Weiyao; Burns, Stephen A

2012-03-20

A Lagrange multiplier-based damped least-squares control algorithm for woofer-tweeter (W-T) dual deformable-mirror (DM) adaptive optics (AO) is tested with a breadboard system. We show that the algorithm can complementarily command the two DMs to correct wavefront aberrations within a single optimization process: the woofer DM correcting the high-stroke, low-order aberrations, and the tweeter DM correcting the low-stroke, high-order aberrations. The optimal damping factor for a DM is found to be the median of the eigenvalue spectrum of the influence matrix of that DM. Wavefront control accuracy is maximized with the optimized control parameters. For the breadboard system, the residual wavefront error can be controlled to the precision of 0.03 μm in root mean square. The W-T dual-DM AO has applications in both ophthalmology and astronomy. © 2012 Optical Society of America
Electromagnetic MUSIC-type imaging of perfectly conducting, arc-like cracks at single frequency

NASA Astrophysics Data System (ADS)

Park, Won-Kwang; Lesselier, Dominique

2009-11-01

We propose a non-iterative MUSIC (MUltiple SIgnal Classification)-type algorithm for the time-harmonic electromagnetic imaging of one or more perfectly conducting, arc-like cracks found within a homogeneous space R^2. The algorithm is based on a factorization of the Multi-Static Response (MSR) matrix collected in the far-field at a single, nonzero frequency in either Transverse Magnetic (TM) mode (Dirichlet boundary condition) or Transverse Electric (TE) mode (Neumann boundary condition), followed by the calculation of a MUSIC cost functional expected to exhibit peaks along the crack curves each half a wavelength. Numerical experimentation from exact, noiseless and noisy data shows that this is indeed the case and that the proposed algorithm behaves in a robust manner, with better results in the TM mode than in the TE mode, for which one would have to estimate the normal to the crack to get the best results.
GAGA: a new algorithm for genomic inference of geographic ancestry reveals fine level population substructure in Europeans.

PubMed

Lao, Oscar; Liu, Fan; Wollstein, Andreas; Kayser, Manfred

2014-02-01

Attempts to detect genetic population substructure in humans are troubled by the fact that the vast majority of the total amount of observed genetic variation is present within populations rather than between populations. Here we introduce a new algorithm for transforming a genetic distance matrix that reduces the within-population variation considerably. Extensive computer simulations revealed that the transformed matrix captured the genetic population differentiation better than the original one which was based on the T1 statistic. In an empirical genomic data set comprising 2,457 individuals from 23 different European subpopulations, the proportion of individuals that were determined as a genetic neighbour to another individual from the same sampling location increased from 25% with the original matrix to 52% with the transformed matrix. Similarly, the percentage of genetic variation explained between populations by means of Analysis of Molecular Variance (AMOVA) increased from 1.62% to 7.98%. Furthermore, the first two dimensions of a classical multidimensional scaling (MDS) using the transformed matrix explained 15% of the variance, compared to 0.7% obtained with the original matrix. Application of MDS with Mclust, SPA with Mclust, and GemTools algorithms to the same dataset also showed that the transformed matrix gave a better association of the genetic clusters with the sampling locations, and particularly so when it was used in the AMOVA framework with a genetic algorithm. Overall, the new matrix transformation introduced here substantially reduces the within population genetic differentiation, and can be broadly applied to methods such as AMOVA to enhance their sensitivity to reveal population substructure. 
We herewith provide a publicly available (http://www.erasmusmc.nl/fmb/resources/GAGA) model-free method for improved genetic population substructure detection that can be applied to human as well as any other species data in future studies relevant to evolutionary biology, behavioural ecology, medicine, and forensics.
A sparse matrix algorithm on the Boolean vector machine

NASA Technical Reports Server (NTRS)

Wagner, Robert A.; Patrick, Merrell L.

1988-01-01

VLSI technology is being used to implement a prototype Boolean Vector Machine (BVM), which is a large network of very small processors with equally small memories that operate in SIMD mode; these use bit-serial arithmetic, and communicate via a cube-connected cycles network. The BVM's bit-serial arithmetic and the small memories of individual processors are noted to compromise the system's effectiveness in large numerical problem applications. Attention is presently given to the implementation of a basic matrix-vector iteration algorithm for sparse matrices on the BVM, in order to generate over 1 billion useful floating-point operations/sec for this iteration algorithm. The algorithm is expressed in a novel language designated 'BVM'.
Quantum-inspired algorithm for estimating the permanent of positive semidefinite matrices

NASA Astrophysics Data System (ADS)

Chakhmakhchyan, L.; Cerf, N. J.; Garcia-Patron, R.

2017-08-01

We construct a quantum-inspired classical algorithm for computing the permanent of Hermitian positive semidefinite matrices by exploiting a connection between these mathematical structures and the boson sampling model. Specifically, the permanent of a Hermitian positive semidefinite matrix can be expressed in terms of the expected value of a random variable, which stands for a specific photon-counting probability when measuring a linear-optically evolved random multimode coherent state. Our algorithm then approximates the matrix permanent from the corresponding sample mean and is shown to run in polynomial time for various sets of Hermitian positive semidefinite matrices, achieving a precision that improves over known techniques. This work illustrates how quantum optics may benefit algorithm development.
Co-clustering phenome–genome for phenotype classification and disease gene discovery

PubMed Central

Hwang, TaeHyun; Atluri, Gowtham; Xie, MaoQiang; Dey, Sanjoy; Hong, Changjin; Kumar, Vipin; Kuang, Rui

2012-01-01

Understanding the categorization of human diseases is critical for reliably identifying disease causal genes. Recently, genome-wide studies of abnormal chromosomal locations related to diseases have mapped >2000 phenotype–gene relations, which provide valuable information for classifying diseases and identifying candidate genes as drug targets. In this article, a regularized non-negative matrix tri-factorization (R-NMTF) algorithm is introduced to co-cluster phenotypes and genes, and simultaneously detect associations between the detected phenotype clusters and gene clusters. The R-NMTF algorithm factorizes the phenotype–gene association matrix under the prior knowledge from phenotype similarity network and protein–protein interaction network, supervised by the label information from known disease classes and biological pathways. In the experiments on disease phenotype–gene associations in OMIM and KEGG disease pathways, R-NMTF significantly improved the classification of disease phenotypes and disease pathway genes compared with support vector machines and Label Propagation in cross-validation on the annotated phenotypes and genes. The newly predicted phenotypes in each disease class are highly consistent with human phenotype ontology annotations. The roles of the new member genes in the disease pathways are examined and validated in the protein–protein interaction subnetworks. Extensive literature review also confirmed many new members of the disease classes and pathways as well as the predicted associations between disease phenotype classes and pathways. PMID:22735708
Directional variance adjustment: bias reduction in covariance matrices based on factor analysis with an application to portfolio optimization.

PubMed

Bartz, Daniel; Hatrick, Kerr; Hesse, Christian W; Müller, Klaus-Robert; Lemm, Steven

2013-01-01

Robust and reliable covariance estimates play a decisive role in financial and many other applications. An important class of estimators is based on factor models. Here, we show by extensive Monte Carlo simulations that covariance matrices derived from the statistical Factor Analysis model exhibit a systematic error, which is similar to the well-known systematic error of the spectrum of the sample covariance matrix. Moreover, we introduce the Directional Variance Adjustment (DVA) algorithm, which diminishes the systematic error. In a thorough empirical study for the US, European, and Hong Kong stock market we show that our proposed method leads to improved portfolio allocation.

Directional Variance Adjustment: Bias Reduction in Covariance Matrices Based on Factor Analysis with an Application to Portfolio Optimization

PubMed Central

Bartz, Daniel; Hatrick, Kerr; Hesse, Christian W.; Müller, Klaus-Robert; Lemm, Steven

2013-01-01

Robust and reliable covariance estimates play a decisive role in financial and many other applications. An important class of estimators is based on factor models. Here, we show by extensive Monte Carlo simulations that covariance matrices derived from the statistical Factor Analysis model exhibit a systematic error, which is similar to the well-known systematic error of the spectrum of the sample covariance matrix. Moreover, we introduce the Directional Variance Adjustment (DVA) algorithm, which diminishes the systematic error. In a thorough empirical study for the US, European, and Hong Kong stock market we show that our proposed method leads to improved portfolio allocation. PMID:23844016
Spin-adapted matrix product states and operators

DOE Office of Scientific and Technical Information (OSTI.GOV)

Keller, Sebastian, E-mail: sebastian.keller@phys.chem.ethz.ch; Reiher, Markus, E-mail: markus.reiher@phys.chem.ethz.ch

Matrix product states (MPSs) and matrix product operators (MPOs) allow an alternative formulation of the density matrix renormalization group algorithm introduced by White. Here, we describe how non-abelian spin symmetry can be exploited in MPSs and MPOs by virtue of the Wigner–Eckart theorem, using the example of the spin-adapted quantum chemical Hamiltonian operator.
Least-Squares Approximation of an Improper Correlation Matrix by a Proper One.

ERIC Educational Resources Information Center

Knol, Dirk L.; ten Berge, Jos M. F.

1989-01-01

An algorithm, based on a solution for C. I. Mosier's oblique Procrustes rotation problem, is presented for the best least-squares fitting correlation matrix approximating a given missing value or improper correlation matrix. Results are of interest for missing value and tetrachoric correlation, indefinite matrix correlation, and constrained…
Modified multiblock partial least squares path modeling algorithm with backpropagation neural networks approach

NASA Astrophysics Data System (ADS)

Yuniarto, Budi; Kurniawan, Robert

2017-03-01

PLS Path Modeling (PLS-PM) differs from covariance-based SEM in that PLS-PM uses an approach based on variance or components; PLS-PM is therefore also known as component-based SEM. Multiblock Partial Least Squares (MBPLS) is a PLS regression method that can be used in PLS Path Modeling, where it is known as Multiblock PLS Path Modeling (MBPLS-PM). This method uses an iterative procedure in its algorithm. This research aims to modify MBPLS-PM with a Back Propagation Neural Network approach. The result is that the MBPLS-PM algorithm can be modified using the Back Propagation Neural Network approach to replace the iterative process in the backward and forward steps that obtains the matrix t and the matrix u in the algorithm. With this modification, the model parameters obtained do not differ significantly from those obtained by the original MBPLS-PM algorithm.
Advancing X-ray scattering metrology using inverse genetic algorithms.

PubMed

Hannon, Adam F; Sunday, Daniel F; Windover, Donald; Kline, R Joseph

2016-01-01

We compare the speed and effectiveness of two genetic optimization algorithms to the results of statistical sampling via a Markov chain Monte Carlo algorithm to find which is the most robust method for determining real space structure in periodic gratings measured using critical dimension small angle X-ray scattering. Both a covariance matrix adaptation evolutionary strategy and differential evolution algorithm are implemented and compared using various objective functions. The algorithms and objective functions are used to minimize differences between diffraction simulations and measured diffraction data. These simulations are parameterized with an electron density model known to roughly correspond to the real space structure of our nanogratings. The study shows that for X-ray scattering data, the covariance matrix adaptation coupled with a mean-absolute error log objective function is the most efficient combination of algorithm and goodness of fit criterion for finding structures with little foreknowledge about the underlying fine scale structure features of the nanograting.
Discriminant projective non-negative matrix factorization.

PubMed

Guan, Naiyang; Zhang, Xiang; Luo, Zhigang; Tao, Dacheng; Yang, Xuejun

2013-01-01

Projective non-negative matrix factorization (PNMF) projects high-dimensional non-negative examples X onto a lower-dimensional subspace spanned by a non-negative basis W and considers W^T X as their coefficients, i.e., X ≈ W W^T X. Since PNMF learns the natural parts-based representation W of X, it has been widely used in many fields such as pattern recognition and computer vision. However, PNMF does not perform well in classification tasks because it completely ignores the label information of the dataset. This paper proposes a Discriminant PNMF method (DPNMF) to overcome this deficiency. In particular, DPNMF applies Fisher's criterion to PNMF to utilize the label information. Similar to PNMF, DPNMF learns a single non-negative basis matrix and has a lower computational burden than NMF. In contrast to PNMF, DPNMF maximizes the distance between centers of any two classes of examples while minimizing the distance between any two examples of the same class in the lower-dimensional subspace, and thus has more discriminant power. We develop a multiplicative update rule to solve DPNMF and prove its convergence. Experimental results on four popular face image datasets confirm its effectiveness compared with the representative NMF and PNMF algorithms.
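The projective model X ≈ W W^T X can be illustrated with a small hand-built example. This sketch only shows how the coefficients and the reconstruction are formed from a given basis; it does not implement the multiplicative update rule or the discriminant term from the paper. The data matrix `X` and the basis `W` are hypothetical.

```python
import numpy as np

# Hypothetical non-negative data: 4 features x 3 examples.
X = np.array([[1.0, 2.0, 0.0],
              [1.0, 2.0, 0.0],
              [0.0, 1.0, 3.0],
              [0.0, 1.0, 3.0]])

# Hypothetical non-negative basis with two orthonormal columns ("parts").
W = np.array([[1.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0],
              [0.0, 1.0]]) / np.sqrt(2.0)

H = W.T @ X                       # coefficients of each example in the basis
X_hat = W @ H                     # reconstruction W W^T X
error = np.linalg.norm(X - X_hat) # Frobenius norm of the residual
```

Because each column of W selects a disjoint group of features, W W^T averages the features within a group. The rows of X are constant within each group here, so the reconstruction reproduces X up to rounding.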
Discriminant Projective Non-Negative Matrix Factorization

PubMed Central

Guan, Naiyang; Zhang, Xiang; Luo, Zhigang; Tao, Dacheng; Yang, Xuejun

2013-01-01

Projective non-negative matrix factorization (PNMF) projects high-dimensional non-negative examples X onto a lower-dimensional subspace spanned by a non-negative basis W and considers W^T X as their coefficients, i.e., X ≈ W W^T X. Since PNMF learns the natural parts-based representation W of X, it has been widely used in many fields such as pattern recognition and computer vision. However, PNMF does not perform well in classification tasks because it completely ignores the label information of the dataset. This paper proposes a Discriminant PNMF method (DPNMF) to overcome this deficiency. In particular, DPNMF applies Fisher's criterion to PNMF to utilize the label information. Similar to PNMF, DPNMF learns a single non-negative basis matrix and has a lower computational burden than NMF. In contrast to PNMF, DPNMF maximizes the distance between centers of any two classes of examples while minimizing the distance between any two examples of the same class in the lower-dimensional subspace, and thus has more discriminant power. We develop a multiplicative update rule to solve DPNMF and prove its convergence. Experimental results on four popular face image datasets confirm its effectiveness compared with the representative NMF and PNMF algorithms. PMID:24376680
A novel chaos-based image encryption algorithm using DNA sequence operations

NASA Astrophysics Data System (ADS)

Chai, Xiuli; Chen, Yiran; Broyde, Lucie

2017-01-01

An image encryption algorithm based on a chaotic system and deoxyribonucleic acid (DNA) sequence operations is proposed in this paper. First, the plain image is encoded into a DNA matrix, and then a new wave-based permutation scheme is performed on it. The chaotic sequences produced by the 2D Logistic chaotic map are employed for row circular permutation (RCP) and column circular permutation (CCP). Initial values and parameters of the chaotic system are calculated from the SHA-256 hash of the plain image and the given values. Then, a row-by-row image diffusion method at the DNA level is applied. A key matrix generated from the chaotic map is used to fuse the confused DNA matrix; the initial values and system parameters of the chaotic system are also renewed by the Hamming distance of the plain image. Finally, after decoding the diffused DNA matrix, we obtain the cipher image. The DNA encoding/decoding rules of the plain image and the key matrix are determined by the plain image. Experimental results and security analyses both confirm that the proposed algorithm not only gives an excellent encryption result but also resists various typical attacks.
Preconditioned alternating projection algorithms for maximum a posteriori ECT reconstruction

NASA Astrophysics Data System (ADS)

Krol, Andrzej; Li, Si; Shen, Lixin; Xu, Yuesheng

2012-11-01

We propose a preconditioned alternating projection algorithm (PAPA) for solving the maximum a posteriori (MAP) emission computed tomography (ECT) reconstruction problem. Specifically, we formulate the reconstruction problem as a constrained convex optimization problem with the total variation (TV) regularization. We then characterize the solution of the constrained convex optimization problem and show that it satisfies a system of fixed-point equations defined in terms of two proximity operators arising from the convex functions that define the TV-norm and the constraint involved in the problem. The characterization of the solution via the proximity operators that define two projection operators naturally leads to an alternating projection algorithm for finding the solution. For efficient numerical computation, we introduce to the alternating projection algorithm a preconditioning matrix (the EM-preconditioner) for the dense system matrix involved in the optimization problem. We prove convergence of the PAPA theoretically. In numerical experiments, the performance of our algorithms, with an appropriately selected preconditioning matrix, is compared with the performance of the conventional MAP expectation-maximization (MAP-EM) algorithm with TV regularizer (EM-TV) and that of the recently developed nested EM-TV algorithm for ECT reconstruction. Based on the numerical experiments performed in this work, we observe that the alternating projection algorithm with the EM-preconditioner significantly outperforms the EM-TV in all aspects, including the convergence speed, the noise in the reconstructed images and the image quality. It also outperforms the nested EM-TV in convergence speed while providing comparable image quality.
An experimental SMI adaptive antenna array simulator for weak interfering signals

NASA Technical Reports Server (NTRS)

Dilsavor, Ronald S.; Gupta, Inder J.

1991-01-01

An experimental sample matrix inversion (SMI) adaptive antenna array for suppressing weak interfering signals is described. The experimental adaptive array uses a modified SMI algorithm to increase the interference suppression. In the modified SMI algorithm, the sample covariance matrix is redefined to reduce the effect of thermal noise on the weights of an adaptive array. This is accomplished by subtracting a fraction of the smallest eigenvalue of the original covariance matrix from its diagonal entries. The test results obtained using the experimental system are compared with theoretical results. The two show a good agreement.
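The covariance modification described in this abstract can be sketched as below. This is an illustrative reading, not the authors' experimental implementation. The function `smi_weights`, the `fraction` parameter, the four-element half-wavelength linear array, and the simulated snapshots are assumptions.

```python
import numpy as np

def smi_weights(R, steering, fraction=0.0):
    """SMI weights w = R_mod^{-1} s, where R_mod = R - fraction * lambda_min * I."""
    lam_min = np.linalg.eigvalsh(R)[0]   # eigvalsh returns eigenvalues in ascending order
    R_mod = R - fraction * lam_min * np.eye(R.shape[0])
    return np.linalg.solve(R_mod, steering)

def steer(theta, n_elem):
    """Steering vector of a uniform linear array with half-wavelength spacing."""
    return np.exp(1j * np.pi * np.arange(n_elem) * np.sin(theta))

rng = np.random.default_rng(1)
n_elem, n_snap = 4, 200

# Simulated snapshots: unit-power thermal noise plus one weak interferer.
noise = (rng.normal(size=(n_elem, n_snap))
         + 1j * rng.normal(size=(n_elem, n_snap))) / np.sqrt(2)
amplitude = rng.normal(size=n_snap) + 1j * rng.normal(size=n_snap)
snapshots = noise + 0.3 * np.outer(steer(0.6, n_elem), amplitude)

R = snapshots @ snapshots.conj().T / n_snap    # sample covariance matrix
s = steer(0.0, n_elem)                         # desired look direction (broadside)

w_standard = smi_weights(R, s)                 # conventional SMI
w_modified = smi_weights(R, s, fraction=0.9)   # modified SMI
```

Keeping `fraction` below 1 leaves the modified matrix positive definite, since its smallest eigenvalue becomes (1 - fraction) times the original smallest eigenvalue.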
Computing sparse derivatives and consecutive zeros problem

NASA Astrophysics Data System (ADS)

Chandra, B. V. Ravi; Hossain, Shahadat

2013-02-01

We describe a substitution based sparse Jacobian matrix determination method using algorithmic differentiation. Utilizing the a priori known sparsity pattern, a compression scheme is determined using graph coloring. The "compressed pattern" of the Jacobian matrix is then reordered into a form suitable for computation by substitution. We show that the column reordering of the compressed pattern matrix (so as to align the zero entries into consecutive locations in each row) can be viewed as a variant of traveling salesman problem. Preliminary computational results show that on the test problems the performance of nearest-neighbor type heuristic algorithms is highly encouraging.
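The compression step can be illustrated with a greedy column coloring on a known sparsity pattern. Note that this sketch recovers the entries directly from the compressed matrix; it does not implement the substitution-based recovery or the column reordering studied in the paper. The helper names and the tridiagonal test matrix are hypothetical, and the compressed matrix is formed from a known Jacobian rather than by algorithmic differentiation.

```python
def greedy_column_coloring(pattern, n_cols):
    """Color columns so that columns sharing a nonzero row get different colors."""
    rows_of = [set() for _ in range(n_cols)]
    for i, j in pattern:
        rows_of[j].add(i)
    colors = []
    for j in range(n_cols):
        used = {colors[k] for k in range(j) if rows_of[j] & rows_of[k]}
        c = 0
        while c in used:
            c += 1
        colors.append(c)
    return colors

def compress(J, colors):
    """Sum the columns of J within each color group (the product J @ seed_matrix)."""
    n_colors = max(colors) + 1
    B = [[0.0] * n_colors for _ in J]
    for i, row in enumerate(J):
        for j, value in enumerate(row):
            B[i][colors[j]] += value
    return B

def recover(B, pattern, colors, n_rows, n_cols):
    """Read each structural nonzero back from its color group's column."""
    J = [[0.0] * n_cols for _ in range(n_rows)]
    for i, j in pattern:
        J[i][j] = B[i][colors[j]]
    return J

# Hypothetical 5x5 tridiagonal Jacobian with known sparsity pattern.
n = 5
pattern = {(i, j) for i in range(n) for j in range(n) if abs(i - j) <= 1}
J_true = [[float(10 * i + j + 1) if (i, j) in pattern else 0.0 for j in range(n)]
          for i in range(n)]

colors = greedy_column_coloring(pattern, n)
B = compress(J_true, colors)
J_recovered = recover(B, pattern, colors, n, n)
```

For the tridiagonal pattern, three colors suffice, so three directional derivatives would replace five.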
Implementing the SU(2) Symmetry for the DMRG

NASA Astrophysics Data System (ADS)

Alvarez, Gonzalo

2010-03-01

In the Density Matrix Renormalization Group (DMRG) algorithm (White, 1992), Hamiltonian symmetries play an important role. Using symmetries, the matrix representation of the Hamiltonian can be blocked. Diagonalizing each matrix block is more efficient than diagonalizing the original matrix. This talk will explain how the DMRG++ code (arXiv:0902.3185 or Computer Physics Communications 180 (2009) 1572-1578) has been extended to handle the non-local SU(2) symmetry in a model independent way. Improvements in CPU times compared to runs with only local symmetries will be discussed for typical tight-binding models of strongly correlated electronic systems. The computational bottleneck of the algorithm, and the use of shared memory parallelization will also be addressed. Finally, a roadmap for future work on DMRG++ will be presented.
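The claim that a blocked Hamiltonian can be diagonalized block by block can be checked numerically with a small sketch. The random symmetric blocks below stand in for symmetry sectors and are purely hypothetical; nothing here is taken from the DMRG++ code.

```python
import numpy as np

rng = np.random.default_rng(2)

def random_symmetric(n):
    """Random real symmetric matrix standing in for one symmetry sector."""
    M = rng.normal(size=(n, n))
    return (M + M.T) / 2

blocks = [random_symmetric(n) for n in (3, 2, 4)]

# Assemble the full block-diagonal matrix.
dim = sum(b.shape[0] for b in blocks)
H = np.zeros((dim, dim))
offset = 0
for b in blocks:
    n = b.shape[0]
    H[offset:offset + n, offset:offset + n] = b
    offset += n

# Spectrum from the full matrix versus the union of the block spectra.
full_spectrum = np.linalg.eigvalsh(H)
blockwise_spectrum = np.sort(np.concatenate([np.linalg.eigvalsh(b) for b in blocks]))
```

Dense diagonalization costs on the order of n^3 operations, so splitting one matrix of dimension n into k blocks of dimension n/k reduces the work by roughly a factor of k^2.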
Projection matrix acquisition for cone-beam computed tomography iterative reconstruction

NASA Astrophysics Data System (ADS)

Yang, Fuqiang; Zhang, Dinghua; Huang, Kuidong; Shi, Wenlong; Zhang, Caixin; Gao, Zongzhao

2017-02-01

The projection matrix is an essential and time-consuming part of computed tomography (CT) iterative reconstruction. In this article, a novel algorithm for calculating the three-dimensional (3D) projection matrix is proposed to quickly acquire the matrix for cone-beam CT (CBCT). The CT data to be reconstructed are considered as consisting of three orthogonal sets of equally spaced, parallel planes rather than individual voxels. After the intersections of the rays with the surfaces of the voxels are obtained, the coordinate points and vertices are compared to obtain the indices of the voxels that the ray traversed. Without considering the slope of the ray relative to the voxel, the method only needs to compare the positions of two points. Finally, computer simulation is used to verify the effectiveness of the algorithm.
Effect of molecular anisotropy on backscattered ultraviolet radiance.

PubMed

Ahmad, Z; Bhartia, P K

1995-12-20

The effect of molecular anisotropy on backscattered UV (BUV) radiances is computed by accounting for it in both Rayleigh optical thickness and the scattering-phase matrix. If the effect of molecular anisotropy is included only in the optical thickness and not in the phase matrix, then for high sun (θ_0 ∼ 0°), the nadir radiance (I_0) leaving the top of the atmosphere is approximately 1.8% higher than the radiance (I_op) computed with the effect included in the phase matrix. For very low sun (θ_0 > 80°), I_0 is approximately 2.3% lower than I_op. For off-nadir radiances the relative increase (decrease) depends on both the local zenith angle as well as the azimuth angle. Also, an increase in the surface reflectivity decreases the effect of molecular anisotropy on the upwelling radiances. Exclusion of the anisotropy factor in the Rayleigh-phase matrix has very little effect (<1%) on ozone retrieval from the BUV-type instruments. This is because of the ratio technique used in the retrieval algorithm, which practically cancels out the anisotropy effect.
Toward the optimization of normalized graph Laplacian.

PubMed

Xie, Bo; Wang, Meng; Tao, Dacheng

2011-04-01

Normalized graph Laplacian has been widely used in many practical machine learning algorithms, e.g., spectral clustering and semisupervised learning. However, all of them use the Euclidean distance to construct the graph Laplacian, which does not necessarily reflect the inherent distribution of the data. In this brief, we propose a method to directly optimize the normalized graph Laplacian by using pairwise constraints. The learned graph is consistent with equivalence and nonequivalence pairwise relationships, and thus it can better represent similarity between samples. Meanwhile, our approach, unlike metric learning, automatically determines the scale factor during the optimization. The learned normalized Laplacian matrix can be directly applied in spectral clustering and semisupervised learning algorithms. Comprehensive experiments demonstrate the effectiveness of the proposed approach.
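For context, the conventional construction that this abstract takes as its starting point can be sketched as follows: a Gaussian affinity built from Euclidean distances, followed by symmetric normalization. This is the baseline, not the proposed optimization with pairwise constraints. The function name, the kernel width `sigma`, and the random sample points are assumptions.

```python
import numpy as np

def normalized_laplacian(X, sigma=1.0):
    """L = I - D^{-1/2} W D^{-1/2} with a Gaussian affinity on Euclidean distances."""
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dist / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)              # no self-loops
    degree = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    return np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]

rng = np.random.default_rng(3)
points = rng.normal(size=(6, 2))          # hypothetical samples in the plane
L = normalized_laplacian(points)
eigenvalues = np.linalg.eigvalsh(L)       # ascending order
```

The resulting matrix is symmetric with eigenvalues in [0, 2], and its smallest eigenvalue is zero. Spectral clustering uses the eigenvectors belonging to the smallest eigenvalues.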
On the degree conjecture for separability of multipartite quantum states

NASA Astrophysics Data System (ADS)

Hassan, Ali Saif M.; Joag, Pramod S.

2008-01-01

We settle the so-called degree conjecture for the separability of multipartite quantum states, which are normalized graph Laplacians, first given by Braunstein et al. [Phys. Rev. A 73, 012320 (2006)]. The conjecture states that a multipartite quantum state is separable if and only if the degree matrix of the graph associated with the state is equal to the degree matrix of the partial transpose of this graph. We call this statement the strong form of the conjecture. In its weak version, the conjecture requires only the necessity, that is, if the state is separable, the corresponding degree matrices match. We prove the strong form of the conjecture for pure multipartite quantum states using the modified tensor product of graphs defined by Hassan and Joag [J. Phys. A 40, 10251 (2007)], as both a necessary and a sufficient condition for separability. Based on this proof, we give a polynomial-time algorithm for completely factorizing any pure multipartite quantum state. By polynomial-time algorithm, we mean that the execution time of this algorithm increases as a polynomial in m, where m is the number of parts of the quantum system. We give a counterexample to show that the conjecture fails, in general, even in its weak form, for multipartite mixed states. Finally, we prove this conjecture, in its weak form, for a class of multipartite mixed states, giving only a necessary condition for separability.
Blind color isolation for color-channel-based fringe pattern profilometry using digital projection

NASA Astrophysics Data System (ADS)

Hu, Yingsong; Xi, Jiangtao; Chicharo, Joe; Yang, Zongkai

2007-08-01

We present an algorithm for estimating the color demixing matrix based on the color fringe patterns captured from the reference plane or the surface of the object. The advantage of this algorithm is that it is a blind approach to calculating the demixing matrix in the sense that no extra images are required for color calibration before performing profile measurement. Simulation and experimental results convince us that the proposed algorithm can significantly reduce the influence of the color cross talk and at the same time improve the measurement accuracy of the color-channel-based phase-shifting profilometry.
Multi-contrast imaging of human posterior eye by Jones matrix optical coherence tomography

NASA Astrophysics Data System (ADS)

Yasuno, Yoshiaki

2017-04-01

Multi-contrast imaging of pathologic posterior eyes is demonstrated by Jones matrix optical coherence tomography (Jones matrix OCT). Jones matrix OCT provides five tomographies (scattering, local attenuation, birefringence, polarization uniformity, and optical coherence angiography) from a single scan. The hardware configuration and algorithms of Jones matrix OCT, as well as its application to ophthalmology, are discussed.
Contact solution algorithms

NASA Technical Reports Server (NTRS)

Tielking, John T.

1989-01-01

Two algorithms for obtaining static contact solutions are described in this presentation. Although they were derived for contact problems involving specific structures (a tire and a solid rubber cylinder), they are sufficiently general to be applied to other shell-of-revolution and solid-body contact problems. The shell-of-revolution contact algorithm is a method of obtaining a point load influence coefficient matrix for the portion of shell surface that is expected to carry a contact load. If the shell is sufficiently linear with respect to contact loading, a single influence coefficient matrix can be used to obtain a good approximation of the contact pressure distribution. Otherwise, the matrix will be updated to reflect nonlinear load-deflection behavior. The solid-body contact algorithm utilizes a Lagrange multiplier to include the contact constraint in a potential energy functional. The solution is found by applying the principle of minimum potential energy. The Lagrange multiplier is identified as the contact load resultant for a specific deflection. At present, only frictionless contact solutions have been obtained with these algorithms. A sliding tread element has been developed to calculate friction shear force in the contact region of the rolling shell-of-revolution tire model.
A new implementation of the CMRH method for solving dense linear systems

NASA Astrophysics Data System (ADS)

Heyouni, M.; Sadok, H.

2008-04-01

The CMRH method [H. Sadok, Methodes de projections pour les systemes lineaires et non lineaires, Habilitation thesis, University of Lille1, Lille, France, 1994; H. Sadok, CMRH: A new method for solving nonsymmetric linear systems based on the Hessenberg reduction algorithm, Numer. Algorithms 20 (1999) 303-321] is an algorithm for solving nonsymmetric linear systems in which the Arnoldi component of GMRES is replaced by the Hessenberg process, which generates Krylov basis vectors that are orthogonal to standard unit basis vectors rather than mutually orthogonal. The iterate is formed from these vectors by solving a small least squares problem involving a Hessenberg matrix. Like GMRES, this method requires one matrix-vector product per iteration. However, it can be implemented to require half as much arithmetic work and less storage. Moreover, numerical experiments show that this method performs accurately and reduces the residual about as fast as GMRES. With this new implementation, we show that the CMRH method is the only method with long-term recurrence that does not require storing the entire Krylov vector basis and the original matrix at the same time, as the GMRES algorithm does. A comparison with Gaussian elimination is provided.

On Some Separated Algorithms for Separable Nonlinear Least Squares Problems.

PubMed

Gan, Min; Chen, C L Philip; Chen, Guang-Yong; Chen, Long

2017-10-03

For a class of nonlinear least squares problems, it is usually very beneficial to separate the variables into a linear and a nonlinear part and take full advantage of reliable linear least squares techniques. Consequently, the original problem is turned into a reduced problem which involves only nonlinear parameters. We consider in this paper four separated algorithms for such problems. The first one is the variable projection (VP) algorithm with the full Jacobian matrix of Golub and Pereyra. The second and third ones are VP algorithms with simplified Jacobian matrices proposed by Kaufman and Ruano et al., respectively. The fourth one only uses the gradient of the reduced problem. Monte Carlo experiments are conducted to compare the performance of these four algorithms. From the results of the experiments, we find that: 1) the simplified Jacobian proposed by Ruano et al. is not a good choice for the VP algorithm; moreover, it may make it hard for the algorithm to converge; 2) the fourth algorithm performs moderately among these four algorithms; 3) the VP algorithm with the full Jacobian matrix is more stable than the VP algorithm with Kaufman's simplified one; and 4) the combination of the VP algorithm and the Levenberg-Marquardt method is more effective than the combination of the VP algorithm and the Gauss-Newton method.
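The separation idea in this abstract can be illustrated on a toy model. The following is a minimal sketch, assuming a hypothetical one-term model y = c * exp(-a * t): the linear coefficient c is eliminated in closed form for each trial a, and the reduced problem in a alone is minimized. It uses a derivative-free golden-section search instead of the Jacobian-based Gauss-Newton or Levenberg-Marquardt iterations compared in the paper, and all function names are illustrative.

```python
import math

def fit_linear_part(a, t, y):
    """For a fixed nonlinear parameter a, solve the linear least squares
    problem for c in y ~ c * exp(-a * t); return c and the residual sum of squares."""
    phi = [math.exp(-a * ti) for ti in t]
    c = sum(p * yi for p, yi in zip(phi, y)) / sum(p * p for p in phi)
    rss = sum((yi - c * p) ** 2 for p, yi in zip(phi, y))
    return c, rss

def variable_projection(t, y, lo=0.0, hi=5.0, tol=1e-8):
    """Minimize the reduced objective rss(a) by golden-section search.
    The bracket [lo, hi] is an assumption of this sketch."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    x1 = hi - g * (hi - lo)
    x2 = lo + g * (hi - lo)
    while hi - lo > tol:
        if fit_linear_part(x1, t, y)[1] < fit_linear_part(x2, t, y)[1]:
            hi, x2 = x2, x1
            x1 = hi - g * (hi - lo)
        else:
            lo, x1 = x1, x2
            x2 = lo + g * (hi - lo)
    a = 0.5 * (lo + hi)
    return a, fit_linear_part(a, t, y)[0]

# Noise-free synthetic data generated with a = 0.7 and c = 3.0.
t = [0.5 * k for k in range(10)]
y = [3.0 * math.exp(-0.7 * ti) for ti in t]
a_hat, c_hat = variable_projection(t, y)
```

The search only ever sees the nonlinear parameter, which is the point of the reduced problem; the linear coefficient is recovered at the end from one more linear solve.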
SMI adaptive antenna arrays for weak interfering signals

NASA Technical Reports Server (NTRS)

Gupta, I. J.

1987-01-01

The performance of adaptive antenna arrays is studied when a sample matrix inversion (SMI) algorithm is used to control array weights. It is shown that conventional SMI adaptive antennas, like other adaptive antennas, are unable to suppress weak interfering signals (below thermal noise) encountered in broadcasting satellite communication systems. To overcome this problem, the SMI algorithm is modified. In the modified algorithm, the covariance matrix is modified such that the effect of thermal noise on the weights of the adaptive array is reduced. Thus, the weights are dictated by relatively weak coherent signals. It is shown that the modified algorithm provides the desired interference protection. The use of defocused feeds as auxiliary elements of an SMI adaptive array is also discussed.
Matrix-Product-State Algorithm for Finite Fractional Quantum Hall Systems

NASA Astrophysics Data System (ADS)

Liu, Zhao; Bhatt, R. N.

2015-09-01

Exact diagonalization is a powerful tool to study fractional quantum Hall (FQH) systems. However, its capability is limited by the exponentially increasing computational cost. In order to overcome this difficulty, density-matrix-renormalization-group (DMRG) algorithms were developed for much larger system sizes. Very recently, it was realized that some model FQH states have an exact matrix-product-state (MPS) representation. Motivated by this, here we report an MPS code, which is closely related to, but different from, the traditional DMRG language, for finite FQH systems on the cylinder geometry. By representing the many-body Hamiltonian as a matrix-product-operator (MPO) and using single-site update and density matrix correction, we show that our code can efficiently search for the ground state of various FQH systems. We also compare the performance of our code with traditional DMRG. The possible generalization of our code to infinite FQH systems and other physical systems is also discussed.
Optimization of matrix tablets controlled drug release using Elman dynamic neural networks and decision trees.

PubMed

Petrović, Jelena; Ibrić, Svetlana; Betz, Gabriele; Đurić, Zorica

2012-05-30

The main objective of the study was to develop artificial intelligence methods for optimization of drug release from matrix tablets regardless of the matrix type. Static and dynamic artificial neural networks of the same topology were developed to model dissolution profiles of different matrix tablet types (hydrophilic/lipid) using formulation composition, compression force used for tableting, and tablet porosity and tensile strength as input data. Potential application of decision trees in discovering knowledge from experimental data was also investigated. Polyethylene oxide polymer and glyceryl palmitostearate were used as matrix forming materials for hydrophilic and lipid matrix tablets, respectively, whereas the selected model drugs were diclofenac sodium and caffeine. Matrix tablets were prepared by the direct compression method and tested for in vitro dissolution profiles. Optimization of static and dynamic neural networks used for modeling of drug release was performed using Monte Carlo simulations or a genetic algorithms optimizer. Decision trees were constructed following discretization of data. Calculated difference (f(1)) and similarity (f(2)) factors for predicted and experimentally obtained dissolution profiles of test matrix tablet formulations indicate that Elman dynamic neural networks as well as decision trees are capable of accurate predictions of both hydrophilic and lipid matrix tablet dissolution profiles. Elman neural networks were compared to the most frequently used static network, the multi-layered perceptron, and the superiority of Elman networks has been demonstrated. The developed methods allow a simple, yet very precise way of predicting drug release for both hydrophilic and lipid matrix tablets having controlled drug release. Copyright © 2012 Elsevier B.V. All rights reserved.
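The abstract compares predicted and measured dissolution profiles through the difference factor f(1) and similarity factor f(2) without stating the formulas. The sketch below assumes the definitions commonly used in dissolution testing (f1 as the summed absolute difference relative to the reference, f2 as a logarithmic transform of the mean squared difference); the profile values are made up for illustration.

```python
import math

def difference_factor(reference, test):
    """f1: summed absolute difference as a percentage of the summed reference."""
    return 100.0 * sum(abs(r - t) for r, t in zip(reference, test)) / sum(reference)

def similarity_factor(reference, test):
    """f2: 50 * log10(100 / sqrt(1 + mean squared difference))."""
    n = len(reference)
    msd = sum((r - t) ** 2 for r, t in zip(reference, test)) / n
    return 50.0 * math.log10(100.0 / math.sqrt(1.0 + msd))

# Hypothetical percent-dissolved values at matching time points.
measured = [20.0, 40.0, 60.0, 80.0]
predicted_exact = [20.0, 40.0, 60.0, 80.0]
predicted_offset = [30.0, 50.0, 70.0, 90.0]  # 10 percentage points too high everywhere

f1_exact = difference_factor(measured, predicted_exact)
f2_exact = similarity_factor(measured, predicted_exact)
f2_offset = similarity_factor(measured, predicted_offset)
```

Under these definitions identical profiles give f1 = 0 and f2 = 100, and a uniform offset of 10 percentage points puts f2 just below 50.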
Efficient Tensor Completion for Color Image and Video Recovery: Low-Rank Tensor Train.

PubMed

Bengua, Johann A; Phien, Ho N; Tuan, Hoang Duong; Do, Minh N

2017-05-01

This paper proposes a novel approach to tensor completion, which recovers missing entries of data represented by tensors. The approach is based on the tensor train (TT) rank, which is able to capture hidden information from tensors thanks to its definition from a well-balanced matricization scheme. Accordingly, new optimization formulations for tensor completion are proposed as well as two new algorithms for their solution. The first one called simple low-rank tensor completion via TT (SiLRTC-TT) is intimately related to minimizing a nuclear norm based on TT rank. The second one is from a multilinear matrix factorization model to approximate the TT rank of a tensor, and is called tensor completion by parallel matrix factorization via TT (TMac-TT). A tensor augmentation scheme of transforming a low-order tensor to higher orders is also proposed to enhance the effectiveness of SiLRTC-TT and TMac-TT. Simulation results for color image and video recovery show the clear advantage of our method over all other methods.
Using Separable Nonnegative Matrix Factorization Techniques for the Analysis of Time-Resolved Raman Spectra

NASA Astrophysics Data System (ADS)

Luce, R.; Hildebrandt, P.; Kuhlmann, U.; Liesen, J.

2016-09-01

The key challenge of time-resolved Raman spectroscopy is the identification of the constituent species and the analysis of the kinetics of the underlying reaction network. In this work we present an integral approach that allows for determining both the component spectra and the rate constants simultaneously from a series of vibrational spectra. It is based on an algorithm for non-negative matrix factorization which is applied to the experimental data set following a few pre-processing steps. As a prerequisite for physically unambiguous solutions, each component spectrum must include one vibrational band that does not significantly interfere with vibrational bands of other species. The approach is applied to synthetic "experimental" spectra derived from model systems comprising a set of species with component spectra differing with respect to their degree of spectral interferences and signal-to-noise ratios. In each case, the species involved are connected via monomolecular reaction pathways. The potential and limitations of the approach for recovering the respective rate constants and component spectra are discussed.
Algorithm Optimally Allocates Actuation of a Spacecraft

NASA Technical Reports Server (NTRS)

Motaghedi, Shi

2007-01-01

A report presents an algorithm that solves the following problem: Allocate the force and/or torque to be exerted by each thruster and reaction-wheel assembly on a spacecraft for best performance, defined as minimizing the error between (1) the total force and torque commanded by the spacecraft control system and (2) the total of forces and torques actually exerted by all the thrusters and reaction wheels. The algorithm incorporates the matrix vector relationship between (1) the total applied force and torque and (2) the individual actuator force and torque values. It takes account of such constraints as lower and upper limits on the force or torque that can be applied by a given actuator. The algorithm divides the aforementioned problem into two optimization problems that it solves sequentially. These problems are of a type, known in the art as semi-definite programming problems, that involve linear matrix inequalities. The algorithm incorporates, as sub-algorithms, prior algorithms that solve such optimization problems very efficiently. The algorithm affords the additional advantage that the solution requires the minimum rate of consumption of fuel for the given best performance.
Imaging of downward-looking linear array SAR using three-dimensional spatial smoothing MUSIC algorithm

NASA Astrophysics Data System (ADS)

Zhang, Siqian; Kuang, Gangyao

2014-10-01

In this paper, a novel three-dimensional imaging algorithm for downward-looking linear array SAR is presented. To improve the resolution, the multiple signal classification (MUSIC) algorithm is used. However, since the scattering centers are always correlated in a real SAR system, the estimated covariance matrix becomes singular. To address the problem, a three-dimensional spatial smoothing method is proposed in this paper to restore the singular covariance matrix to a full-rank one. The three-dimensional signal matrix can be divided into a set of orthogonal three-dimensional subspaces. The main idea of the method is to extract the array correlation matrix as the average of all correlation matrices from the subspaces. In addition, the spectral height of the peaks contains no information with regard to the scattering intensity of the different scattering centers, so it is difficult to reconstruct the backscattering information. A least squares strategy is used to estimate the amplitude of the scattering centers in this paper. The results of the theoretical analysis are verified by 3-D scene simulations and experiments on real data.
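The rank-restoring effect of spatial smoothing can be seen in a one-dimensional analogue. The sketch below is not the paper's three-dimensional method: it assumes a hypothetical four-element uniform linear array, two fully coherent sources, and a single noise-free snapshot, and averages the covariance matrices of overlapping two-element subarrays.

```python
import cmath

def steering(n, phase):
    """Response of an n-element uniform linear array to one source."""
    return [cmath.exp(1j * phase * k) for k in range(n)]

def outer(x):
    """Rank-one covariance estimate x x^H from a single snapshot."""
    return [[a * b.conjugate() for b in x] for a in x]

def smoothed_covariance(x, sub):
    """Average the covariance matrices of all overlapping subarrays of length sub."""
    count = len(x) - sub + 1
    acc = [[0j] * sub for _ in range(sub)]
    for s in range(count):
        r = outer(x[s:s + sub])
        for i in range(sub):
            for j in range(sub):
                acc[i][j] += r[i][j] / count
    return acc

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

# Two coherent sources: the snapshot is the sum of their steering vectors.
snapshot = [a + b for a, b in zip(steering(4, 0.5), steering(4, 1.7))]
single = outer(snapshot[:2])                 # one subarray only: singular
smooth = smoothed_covariance(snapshot, 2)    # averaged over three subarrays
```

The single-subarray matrix has zero determinant because it is a rank-one outer product, while the averaged matrix is nonsingular, which is what a subspace method such as MUSIC needs to separate the two sources.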
Graphs, matrices, and the GraphBLAS: Seven good reasons

DOE PAGES

Kepner, Jeremy; Bader, David; Buluç, Aydın; ...

2015-01-01

The analysis of graphs has become increasingly important to a wide range of applications. Graph analysis presents a number of unique challenges in the areas of (1) software complexity, (2) data complexity, (3) security, (4) mathematical complexity, (5) theoretical analysis, (6) serial performance, and (7) parallel performance. Implementing graph algorithms using matrix-based approaches provides a number of promising solutions to these challenges. The GraphBLAS standard (istcbigdata.org/GraphBlas) is being developed to bring the potential of matrix-based graph algorithms to the broadest possible audience. The GraphBLAS mathematically defines a core set of matrix-based graph operations that can be used to implement a wide class of graph algorithms in a wide range of programming environments. This paper provides an introduction to the GraphBLAS and describes how the GraphBLAS can be used to address many of the challenges associated with analysis of graphs.
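As an illustration of what a matrix-based graph algorithm looks like, the sketch below writes breadth-first search as repeated matrix-vector products over the Boolean (OR, AND) semiring. It uses plain Python lists and does not call any GraphBLAS library; the function name and the example graph are assumptions made for this example.

```python
def bfs_levels(adj, source):
    """Breadth-first search as repeated Boolean matrix-vector products.

    adj[i][j] is True when there is an edge i -> j. Each step forms the
    next frontier as (A^T frontier) over the (OR, AND) semiring, masked
    by the set of vertices that have not been visited yet.
    """
    n = len(adj)
    level = [-1] * n                      # -1 marks "not reached"
    frontier = [i == source for i in range(n)]
    depth = 0
    while any(frontier):
        for i in range(n):
            if frontier[i]:
                level[i] = depth
        # OR over i of (frontier[i] AND adj[i][j]), masked by unvisited j
        frontier = [
            level[j] == -1 and any(frontier[i] and adj[i][j] for i in range(n))
            for j in range(n)
        ]
        depth += 1
    return level

# Directed graph 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3; vertex 4 is isolated.
edges = {(0, 1), (0, 2), (1, 3), (2, 3)}
A = [[(i, j) in edges for j in range(5)] for i in range(5)]
levels = bfs_levels(A, 0)  # [0, 1, 1, 2, -1]
```

Swapping the semiring (for example to (min, +)) turns the same loop structure into a shortest-path computation, which is the kind of reuse the matrix formulation is meant to enable.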
A Unified Matrix Polynomial Approach to Modal Identification

NASA Astrophysics Data System (ADS)

Allemang, R. J.; Brown, D. L.

1998-04-01

One important current focus of modal identification is a reformulation of modal parameter estimation algorithms into a single, consistent mathematical formulation with a corresponding set of definitions and unifying concepts. Particularly, a matrix polynomial approach is used to unify the presentation with respect to current algorithms such as the least-squares complex exponential (LSCE), the polyreference time domain (PTD), Ibrahim time domain (ITD), eigensystem realization algorithm (ERA), rational fraction polynomial (RFP), polyreference frequency domain (PFD) and the complex mode indication function (CMIF) methods. Using this unified matrix polynomial approach (UMPA) allows a discussion of the similarities and differences of the commonly used methods. The use of least squares (LS), total least squares (TLS), double least squares (DLS) and singular value decomposition (SVD) methods is discussed in order to take advantage of redundant measurement data. Eigenvalue and SVD transformation methods are utilized to reduce the effective size of the resulting eigenvalue-eigenvector problem as well.
Complex Langevin simulation of a random matrix model at nonzero chemical potential

NASA Astrophysics Data System (ADS)

Bloch, J.; Glesaaen, J.; Verbaarschot, J. J. M.; Zafeiropoulos, S.

2018-03-01

In this paper we test the complex Langevin algorithm for numerical simulations of a random matrix model of QCD with a first order phase transition to a phase of finite baryon density. We observe that a naive implementation of the algorithm leads to phase quenched results, which were also derived analytically in this article. We test several fixes for the convergence issues of the algorithm, in particular the method of gauge cooling, the shifted representation, the deformation technique and reweighted complex Langevin, but only the latter method reproduces the correct analytical results in the region where the quark mass is inside the domain of the eigenvalues. In order to shed more light on the issues of the methods we also apply them to a similar random matrix model with a milder sign problem and no phase transition, and in that case gauge cooling solves the convergence problems as was shown before in the literature.
New algorithms to compute the nearness symmetric solution of the matrix equation.

PubMed

Peng, Zhen-Yun; Fang, Yang-Zhi; Xiao, Xian-Wei; Du, Dan-Dan

2016-01-01

In this paper we consider the nearness symmetric solution of the matrix equation AXB = C to a given matrix [Formula: see text] in the sense of the Frobenius norm. By discussing an equivalent form of the considered problem, we derive some necessary and sufficient conditions for the matrix [Formula: see text] to be a solution of the considered problem. Based on the idea of the alternating variable minimization with multiplier method, we propose two iterative methods to compute the solution of the considered problem, and analyze the global convergence of the proposed algorithms. Numerical results illustrate that the proposed methods are more effective than the two existing methods proposed in Peng et al. (Appl Math Comput 160:763-777, 2005) and Peng (Int J Comput Math 87: 1820-1830, 2010).
A Global Registration Algorithm of the Single-Closed Ring Multi-Stations Point Cloud

NASA Astrophysics Data System (ADS)

Yang, R.; Pan, L.; Xiang, Z.; Zeng, H.

2018-04-01

Aimed at the global registration problem of the single-closed ring multi-stations point cloud, a formula for calculating the error of the rotation matrix was constructed according to the definition of error. The global registration algorithm for multi-station point clouds was derived by minimizing the error of the rotation matrix. Fast-computing formulas for the transformation matrix, together with their implementation steps and a simulation experiment scheme, were given. A comparison of three different processing schemes for multi-station point clouds verified the effectiveness of the new global registration method and showed that it can effectively complete the global registration of point clouds.
Graphic matching based on shape contexts and reweighted random walks

NASA Astrophysics Data System (ADS)

Zhang, Mingxuan; Niu, Dongmei; Zhao, Xiuyang; Liu, Mingjun

2018-04-01

Graphic matching is a critical issue in many aspects of computer vision. In this paper, a new graphic matching algorithm combining shape contexts and reweighted random walks is proposed. On the basis of the local descriptor, shape contexts, the reweighted random walks algorithm is modified to be more robust and more accurate in its final result. The main step is to use the shape context descriptor in each iteration of the random walk in order to control the random walk probability matrix. We calculate a bias matrix from the descriptors and use it during the iteration to improve the accuracy of the random walks and random jumps; finally, we obtain the one-to-one registration result by discretizing the matrix. The algorithm not only preserves the noise robustness of reweighted random walks but also possesses the rotation, translation, and scale invariance of shape contexts. Extensive experiments on real images and random synthetic point sets, and comparisons with other algorithms, confirm that this new method can produce excellent results in graphic matching.
Gene selection heuristic algorithm for nutrigenomics studies.

PubMed

Valour, D; Hue, I; Grimard, B; Valour, B

2013-07-15

Large datasets from -omics studies need to be deeply investigated. The aim of this paper is to provide a new method (LEM method) for the search of transcriptome and metabolome connections. The heuristic algorithm described here extends the classical canonical correlation analysis (CCA) to a high number of variables (without regularization) and combines well-conditioning and fast-computing in "R." Reduced CCA models are summarized in PageRank matrices, the product of which gives a stochastic matrix that summarizes the self-avoiding walk covered by the algorithm. Then, a homogeneous Markov process applied to this stochastic matrix makes the probabilities of interconnection between genes converge, providing a selection of disjoint subsets of genes. This is an alternative to regularized generalized CCA for the determination of blocks within the structure matrix. Each gene subset is thus linked to the whole metabolic or clinical dataset that represents the biological phenotype of interest. Moreover, this selection process meets the needs of biologists, who often need small sets of genes for further validation or extended phenotyping. The algorithm is shown to work efficiently on three published datasets, resulting in meaningfully broadened gene networks.
Research on AHP decision algorithms based on BP algorithm

NASA Astrophysics Data System (ADS)

Ma, Ning; Guan, Jianhe

2017-10-01

Decision making is the mental activity by which people choose or judge, and scientific decision-making has always been a hot research topic. The Analytic Hierarchy Process (AHP) is a simple and practical multi-criteria, multi-objective decision-making method that combines quantitative and qualitative analysis and can express and calculate subjective judgments in numerical form. In decision analysis using the AHP method, the rationality of the two-dimensional judgment matrix has a great influence on the decision result. However, in dealing with real problems, the judgment matrix produced by the two-dimensional comparison is often inconsistent, that is, it does not meet the consistency requirements. The BP neural network algorithm is an adaptive nonlinear dynamic system. It has powerful collective computing ability and learning ability. It can refine the data by constantly modifying the weights and thresholds of the network to minimize the mean square error. In this paper, the BP algorithm is used to deal with the consistency of the two-dimensional judgment matrix of the AHP.
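The consistency requirement mentioned in the abstract is usually checked with a consistency ratio. The sketch below shows only that check, not the paper's BP-based repair of the matrix. It assumes the conventional definitions: the principal eigenvalue is estimated by power iteration, CI = (lambda_max - n) / (n - 1), and CR = CI / RI, where the random index values are the commonly tabulated ones and the 0.1 threshold is the usual rule of thumb.

```python
# Commonly tabulated random index values by matrix size (assumed here).
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}

def consistency_ratio(judgment, iterations=200):
    """Return (CR, priority weights) for a positive reciprocal judgment matrix."""
    n = len(judgment)
    weights = [1.0 / n] * n
    lam = float(n)
    for _ in range(iterations):
        product = [sum(judgment[i][j] * weights[j] for j in range(n)) for i in range(n)]
        lam = sum(product)                # eigenvalue estimate, since sum(weights) == 1
        weights = [x / lam for x in product]
    ci = (lam - n) / (n - 1)              # consistency index
    return ci / RANDOM_INDEX[n], weights

# Perfectly consistent: entries are ratios of the weights 4 : 2 : 1.
consistent = [[1, 2, 4], [1 / 2, 1, 2], [1 / 4, 1 / 2, 1]]
# Inconsistent: A beats B, B beats C, yet C strongly beats A.
inconsistent = [[1, 2, 1 / 4], [1 / 2, 1, 2], [4, 1 / 2, 1]]

cr_ok, w_ok = consistency_ratio(consistent)
cr_bad, _ = consistency_ratio(inconsistent)
```

A ratio near zero means the pairwise judgments agree with a single set of weights; the second matrix is cyclic and its ratio lands far above 0.1, which is the situation a repair procedure has to handle.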
Person Re-Identification via Distance Metric Learning With Latent Variables.

PubMed

Sun, Chong; Wang, Dong; Lu, Huchuan

2017-01-01

In this paper, we propose an effective person re-identification method with latent variables, which represents a pedestrian as the mixture of a holistic model and a number of flexible models. Three types of latent variables are introduced to model uncertain factors in the re-identification problem, including vertical misalignments, horizontal misalignments and leg posture variations. The distance between two pedestrians can be determined by minimizing a given distance function with respect to latent variables, and then be used to conduct the re-identification task. In addition, we develop a latent metric learning method for learning the effective metric matrix, which can be solved in an iterative manner: once latent information is specified, the metric matrix can be obtained based on some typical metric learning methods; with the computed metric matrix, the latent variables can be determined by searching the state space exhaustively. Finally, extensive experiments are conducted on seven databases to evaluate the proposed method. The experimental results demonstrate that our method achieves better performance than other competing algorithms.
Inductively Coupled Plasma Optical Emission Spectrometry for Rare Earth Elements Analysis

NASA Astrophysics Data System (ADS)

He, Man; Hu, Bin; Chen, Beibei; Jiang, Zucheng

2017-01-01

Inductively coupled plasma optical emission spectrometry (ICP-OES) offers multielement capability, high sensitivity, good reproducibility, low matrix effect and wide dynamic linear range for rare earth elements (REEs) analysis. However, spectral interference in trace REEs analysis by ICP-OES is a serious problem due to the complicated emission spectra of REEs, which demands correction techniques including the interference factor method, derivative spectrum, Kalman filtering algorithm and partial least-squares (PLS) method. Matrix-matching calibration, internal standard, correction factor and sample dilution are usually employed to overcome or decrease the matrix effect. Coupled with various sample introduction techniques, the analytical performance of ICP-OES for REEs analysis can be improved. Compared with conventional pneumatic nebulization (PN), acid effect and matrix effect are decreased to some extent in flow injection ICP-OES, with higher tolerable matrix concentration and better reproducibility. By using electrothermal vaporization as the sample introduction system, direct analysis of solid samples by ICP-OES is achieved, and the vaporization behavior of refractory REEs with high boiling points, which can easily form involatile carbides in the graphite tube, can be improved by using a chemical modifier, such as polytetrafluoroethylene and 1-phenyl-3-methyl-4-benzoyl-5-pyrazone. Laser ablation-ICP-OES is suitable for the analysis of both conductive and nonconductive solid samples, with an absolute detection limit at the ng-pg level and extremely low sample consumption (0.2 % of that in conventional PN introduction). ICP-OES has been extensively employed for trace REEs analysis in high-purity materials, and environmental and biological samples.
Achieving Passive Localization with Traffic Light Schedules in Urban Road Sensor Networks

PubMed Central

Niu, Qiang; Yang, Xu; Gao, Shouwan; Chen, Pengpeng; Chan, Shibing

2016-01-01

Localization is crucial for the monitoring applications of cities, such as road monitoring, environment surveillance, vehicle tracking, etc. In urban road sensor networks, sensors are often sparsely deployed due to the hardware cost. Under this sparse deployment, sensors cannot communicate with each other via ranging hardware or one-hop connectivity, rendering the existing localization solutions ineffective. To address this issue, this paper proposes a novel Traffic Lights Schedule-based localization algorithm (TLS), which is built on the fact that vehicles move through the intersection with a known traffic light schedule. We first obtain the pattern of vehicle movement from binary vehicle detection time stamps and describe it as a matrix, called a detection matrix. At the same time, we use the known traffic light information to construct matrices that form a collection called a known matrix collection. The detection matrix is then matched against the known matrix collection to identify where sensors are located on urban roads. We evaluate our algorithm by extensive simulation. The results show that the localization accuracy of intersection sensors can reach more than 90%. In addition, we compare it with a state-of-the-art algorithm and show that it has a wider operational region. PMID:27735871
The tensor hypercontracted parametric reduced density matrix algorithm: coupled-cluster accuracy with O(r^4) scaling.

PubMed

Shenvi, Neil; van Aggelen, Helen; Yang, Yang; Yang, Weitao; Schwerdtfeger, Christine; Mazziotti, David

2013-08-07

Tensor hypercontraction is a method that allows the representation of a high-rank tensor as a product of lower-rank tensors. In this paper, we show how tensor hypercontraction can be applied to both the electron repulsion integral tensor and the two-particle excitation amplitudes used in the parametric 2-electron reduced density matrix (p2RDM) algorithm. Because only O(r) auxiliary functions are needed in both of these approximations, our overall algorithm can be shown to scale as O(r^4), where r is the number of single-particle basis functions. We apply our algorithm to several small molecules, hydrogen chains, and alkanes to demonstrate its low formal scaling and practical utility. Provided we use enough auxiliary functions, we obtain accuracy similar to that of the standard p2RDM algorithm, somewhere between that of CCSD and CCSD(T).

160-fold acceleration of the Smith-Waterman algorithm using a field programmable gate array (FPGA)

PubMed Central

Li, Isaac TS; Shum, Warren; Truong, Kevin

2007-01-01

Background: To infer homology and subsequently gene function, the Smith-Waterman (SW) algorithm is used to find the optimal local alignment between two sequences. When searching sequence databases that may contain hundreds of millions of sequences, this algorithm becomes computationally expensive. Results: In this paper, we focused on accelerating the Smith-Waterman algorithm by using FPGA-based hardware that implemented a module for computing the score of a single cell of the SW matrix. Then, using a grid of this module, the entire SW matrix was computed at the speed of field propagation through the FPGA circuit. These modifications dramatically accelerated the algorithm's computation time, by up to 160-fold compared to a pure software implementation running on the same FPGA with an Altera Nios II softprocessor. Conclusion: This design of FPGA-accelerated hardware offers a promising new direction for improving the computation of genomic database searching. PMID:17555593
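For reference, the per-cell score that the hardware module computes follows the standard Smith-Waterman recurrence. The sketch below is a plain software version with a linear gap penalty; the scoring values (match 2, mismatch -1, gap -1) and function names are assumptions for illustration and are not taken from the paper.

```python
def cell_score(diag, up, left, substitution, gap):
    """Score of one cell of the SW matrix from its three neighbours."""
    return max(0, diag + substitution, up + gap, left + gap)

def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Fill the SW matrix and return (best local alignment score, matrix)."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]   # first row and column stay 0
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = cell_score(H[i - 1][j - 1], H[i - 1][j], H[i][j - 1], s, gap)
            best = max(best, H[i][j])
    return best, H

score_same, _ = smith_waterman("ACGT", "ACGT")   # four matches: 8
score_poor, _ = smith_waterman("ACGT", "TTTT")   # a single matching T: 2
```

Each cell depends only on its upper, left, and upper-left neighbours, so all cells on one anti-diagonal are independent of each other; that independence is what makes a grid of identical cell modules a natural hardware layout.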
160-fold acceleration of the Smith-Waterman algorithm using a field programmable gate array (FPGA).

PubMed

Li, Isaac T S; Shum, Warren; Truong, Kevin

2007-06-07

To infer homology and subsequently gene function, the Smith-Waterman (SW) algorithm is used to find the optimal local alignment between two sequences. When searching sequence databases that may contain hundreds of millions of sequences, this algorithm becomes computationally expensive. In this paper, we focused on accelerating the Smith-Waterman algorithm by using FPGA-based hardware that implemented a module for computing the score of a single cell of the SW matrix. Then, using a grid of this module, the entire SW matrix was computed at the speed of field propagation through the FPGA circuit. These modifications dramatically accelerated the algorithm's computation time, by up to 160-fold compared to a pure software implementation running on the same FPGA with an Altera Nios II softprocessor. This design of FPGA-accelerated hardware offers a promising new direction for improving the computation of genomic database searching.
Low complexity Reed-Solomon-based low-density parity-check design for software defined optical transmission system based on adaptive puncturing decoding algorithm

NASA Astrophysics Data System (ADS)

Pan, Xiaolong; Liu, Bo; Zheng, Jianglong; Tian, Qinghua

2016-08-01

We propose and demonstrate a low complexity Reed-Solomon-based low-density parity-check (RS-LDPC) code with an adaptive puncturing decoding algorithm for elastic optical transmission systems. Part of the received codes and the relevant columns in the parity-check matrix can be punctured to reduce the calculation complexity by adapting the parity-check matrix during the decoding process. The results show that the complexity of the proposed decoding algorithm is reduced by 30% compared with the regular RS-LDPC system. The optimized code rate of the RS-LDPC code can be obtained after five iterations.
Efficient Numerical Diagonalization of Hermitian 3 × 3 Matrices

NASA Astrophysics Data System (ADS)

Kopp, Joachim

A very common problem in science is the numerical diagonalization of symmetric or hermitian 3 × 3 matrices. Since standard "black box" packages may be too inefficient if the number of matrices is large, we study several alternatives. We consider optimized implementations of the Jacobi, QL, and Cuppen algorithms and compare them with an analytical method relying on Cardano's formula for the eigenvalues and on vector cross products for the eigenvectors. Jacobi is the most accurate, but also the slowest method, while QL and Cuppen are good general purpose algorithms. The analytical algorithm outperforms the others by more than a factor of 2, but becomes inaccurate or may even fail completely if the matrix entries differ greatly in magnitude. This can mostly be circumvented by using a hybrid method, which falls back to QL if conditions are such that the analytical calculation might become too inaccurate. For all algorithms, we give an overview of the underlying mathematical ideas, and present detailed benchmark results. C and Fortran implementations of our code are available for download from .
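The closed-form route can be sketched for the real symmetric case. The following is not the author's C or Fortran code; it is a minimal illustration, assuming the well-known trigonometric form of the cubic solution: shift and scale the matrix so that it is traceless, then read the three roots from a single arccosine. Eigenvectors and the hybrid fallback to QL are left out.

```python
import math

def symmetric_3x3_eigenvalues(a):
    """Eigenvalues (ascending) of a real symmetric 3x3 matrix in closed form."""
    p1 = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
    if p1 == 0.0:
        return sorted([a[0][0], a[1][1], a[2][2]])   # already diagonal
    q = (a[0][0] + a[1][1] + a[2][2]) / 3.0          # mean eigenvalue
    p2 = sum((a[i][i] - q) ** 2 for i in range(3)) + 2.0 * p1
    p = math.sqrt(p2 / 6.0)
    # B = (A - q I) / p is traceless with eigenvalues 2 cos(theta)
    b = [[(a[i][j] - (q if i == j else 0.0)) / p for j in range(3)] for i in range(3)]
    det_b = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
             - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
             + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
    r = max(-1.0, min(1.0, det_b / 2.0))             # clamp against rounding error
    phi = math.acos(r) / 3.0
    largest = q + 2.0 * p * math.cos(phi)
    smallest = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    middle = 3.0 * q - largest - smallest            # the trace fixes the third root
    return [smallest, middle, largest]

eigs = symmetric_3x3_eigenvalues([[2.0, 0.0, 0.0], [0.0, 3.0, 4.0], [0.0, 4.0, 9.0]])
```

For this test matrix the exact eigenvalues are 1, 2, and 11. The subtraction of nearly equal quantities in the shifted matrix is one place where precision can be lost when entries differ greatly in magnitude, consistent with the accuracy caveat in the abstract.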
Endmember extraction from hyperspectral image based on discrete firefly algorithm (EE-DFA)

NASA Astrophysics Data System (ADS)

Zhang, Chengye; Qin, Qiming; Zhang, Tianyuan; Sun, Yuanheng; Chen, Chao

2017-04-01

This study proposed a novel method to extract endmembers from hyperspectral images based on the discrete firefly algorithm (EE-DFA). Endmembers are the input of many spectral unmixing algorithms. Hence, in this paper, endmember extraction from hyperspectral images is regarded as a combinatorial optimization problem to get the best spectral unmixing results, which can be solved by the discrete firefly algorithm. Two series of experiments were conducted on synthetic hyperspectral datasets with different SNR and on the AVIRIS Cuprite dataset, respectively. The experimental results were compared with the endmembers extracted by four popular methods: the sequential maximum angle convex cone (SMACC), N-FINDR, Vertex Component Analysis (VCA), and Minimum Volume Constrained Nonnegative Matrix Factorization (MVC-NMF). In addition, the effect of the parameters in the proposed method was tested on both the synthetic hyperspectral datasets and the AVIRIS Cuprite dataset, and recommended parameter settings were proposed. The results in this study demonstrated that the proposed EE-DFA method showed better performance than the existing popular methods. Moreover, EE-DFA is robust under different SNR conditions.
A new method for computation of eigenvector derivatives with distinct and repeated eigenvalues in structural dynamic analysis

NASA Astrophysics Data System (ADS)

Li, Zhengguang; Lai, Siu-Kai; Wu, Baisheng

2018-07-01

Determining eigenvector derivatives is a challenging task due to the singularity of the coefficient matrices of the governing equations, especially for those structural dynamic systems with repeated eigenvalues. An effective strategy is proposed to construct a non-singular coefficient matrix, which can be directly used to obtain the eigenvector derivatives with distinct and repeated eigenvalues. This approach also has the advantage that it requires only the eigenvalues and eigenvectors of interest, without solving for the particular solutions of eigenvector derivatives. The Symmetric Quasi-Minimal Residual (SQMR) method is then adopted to solve the governing equations, using only the existing factored (shifted) stiffness matrix from an iterative eigensolution such as the subspace iteration method or the Lanczos algorithm. The present method can deal with both cases of simple and repeated eigenvalues in a unified manner. Three numerical examples are given to illustrate the accuracy and validity of the proposed algorithm. Highly accurate approximations to the eigenvector derivatives are obtained within a few iteration steps, significantly reducing the computational effort. This method can be incorporated into a coupled eigensolver/derivative software module. In particular, it is applicable to finite element models with large sparse matrices.
N-Dimensional LLL Reduction Algorithm with Pivoted Reflection

PubMed Central

Deng, Zhongliang; Zhu, Di

2018-01-01

The Lenstra-Lenstra-Lovász (LLL) lattice reduction algorithm and many of its variants have been widely used in cryptography, multiple-input-multiple-output (MIMO) communication systems and carrier phase positioning in global navigation satellite systems (GNSS) to solve the integer least squares (ILS) problem. In this paper, we propose an n-dimensional LLL reduction algorithm (n-LLL), expanding the Lovász condition in the LLL algorithm to n-dimensional space in order to obtain a further reduced basis. We also introduce pivoted Householder reflection into the algorithm to optimize the reduction time. For an m-order positive definite matrix, analysis shows that the n-LLL reduction algorithm will converge within finite steps and always produce better results than the original LLL reduction algorithm with n > 2. The simulations clearly show that n-LLL is better than the original LLL in reducing the condition number of an ill-conditioned input matrix, with a 39% improvement on average for typical cases, which can significantly reduce the search space for solving the ILS problem. The simulation results also show that the pivoted reflection reduces the number of swaps in the algorithm by 57%, making n-LLL a more practical reduction algorithm. PMID:29351224
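For orientation, the classic LLL algorithm that this record extends can be sketched as follows. This is the textbook two-vector Lovász condition with exact rational arithmetic, not the n-LLL variant or the pivoted Householder reflection proposed by the authors. The function names and the default `delta = 3/4` are illustrative assumptions.

```python
from fractions import Fraction


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def gram_schmidt(basis):
    """Return the Gram-Schmidt vectors and the mu coefficients (exact)."""
    ortho, mu = [], []
    for i, b in enumerate(basis):
        v = [Fraction(x) for x in b]
        coeffs = []
        for j in range(i):
            c = Fraction(dot(b, ortho[j])) / dot(ortho[j], ortho[j])
            coeffs.append(c)
            v = [x - c * y for x, y in zip(v, ortho[j])]
        ortho.append(v)
        mu.append(coeffs)
    return ortho, mu


def lll_reduce(basis, delta=Fraction(3, 4)):
    """Classic LLL reduction of a list of linearly independent integer rows."""
    basis = [list(b) for b in basis]
    n = len(basis)
    k = 1
    while k < n:
        # Size reduction: make |mu[k][j]| <= 1/2 for all j < k.
        for j in range(k - 1, -1, -1):
            _, mu = gram_schmidt(basis)  # recomputed for clarity, not speed
            q = round(mu[k][j])
            if q:
                basis[k] = [x - q * y for x, y in zip(basis[k], basis[j])]
        ortho, mu = gram_schmidt(basis)
        # Lovasz condition between consecutive Gram-Schmidt vectors.
        lhs = dot(ortho[k], ortho[k])
        rhs = (delta - mu[k][k - 1] ** 2) * dot(ortho[k - 1], ortho[k - 1])
        if lhs >= rhs:
            k += 1
        else:
            basis[k], basis[k - 1] = basis[k - 1], basis[k]
            k = max(k - 1, 1)
    return basis
```

Each failed Lovász test triggers a swap, so the swap count that the abstract reports reducing corresponds to the `else` branch above.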
Attractive electron-electron interactions within robust local fitting approximations.

PubMed

Merlot, Patrick; Kjærgaard, Thomas; Helgaker, Trygve; Lindh, Roland; Aquilante, Francesco; Reine, Simen; Pedersen, Thomas Bondo

2013-06-30

An analysis of Dunlap's robust fitting approach reveals that the resulting two-electron integral matrix is not manifestly positive semidefinite when local fitting domains or non-Coulomb fitting metrics are used. We present a highly local approximate method for evaluating four-center two-electron integrals based on the resolution-of-the-identity (RI) approximation and apply it to the construction of the Coulomb and exchange contributions to the Fock matrix. In this pair-atomic resolution-of-the-identity (PARI) approach, atomic-orbital (AO) products are expanded in auxiliary functions centered on the two atoms associated with each product. Numerical tests indicate that in 1% or less of all Hartree-Fock and Kohn-Sham calculations, the indefinite integral matrix causes nonconvergence in the self-consistent-field iterations. In these cases, the two-electron contribution to the total energy becomes negative, meaning that the electronic interaction is effectively attractive, and the total energy is dramatically lower than that obtained with exact integrals. In the vast majority of our test cases, however, the indefiniteness does not interfere with convergence. The total energy accuracy is comparable to that of the standard Coulomb-metric RI method. The speed-up compared with conventional algorithms is similar to the RI method for Coulomb contributions; exchange contributions are accelerated by a factor of up to eight with a triple-zeta quality basis set. A positive semidefinite integral matrix is recovered within PARI by introducing local auxiliary basis functions spanning the full AO product space, as may be achieved by using Cholesky-decomposition techniques. Local completion, however, slows down the algorithm to a level comparable with or below conventional calculations. Copyright © 2013 Wiley Periodicals, Inc.
Optical implementation of systolic array processing

NASA Technical Reports Server (NTRS)

Caulfield, H. J.; Rhodes, W. T.; Foster, M. J.; Horvitz, S.

1981-01-01

Algorithms for matrix vector multiplication are implemented using acousto-optic cells for multiplication and input data transfer and using charge coupled devices detector arrays for accumulation and output of the results. No two dimensional matrix mask is required; matrix changes are implemented electronically. A system for multiplying a 50 component nonnegative real vector by a 50 by 50 nonnegative real matrix is described. Modifications for bipolar real and complex valued processing are possible, as are extensions to matrix-matrix multiplication and multiplication of a vector by multiple matrices.
CCOMP: An efficient algorithm for complex roots computation of determinantal equations

NASA Astrophysics Data System (ADS)

Zouros, Grigorios P.

2018-01-01

In this paper a free Python algorithm, entitled CCOMP (Complex roots COMPutation), is developed for the efficient computation of complex roots of determinantal equations inside a prescribed complex domain. The key to the method presented is the efficient determination of the candidate points inside the domain in whose close neighborhood a complex root may lie. Once these points are detected, the algorithm proceeds to a two-dimensional minimization problem with respect to the minimum modulus eigenvalue of the system matrix. At the core of CCOMP are three sub-algorithms whose tasks are the efficient estimation of the minimum modulus eigenvalues of the system matrix inside the prescribed domain, the efficient computation of candidate points which guarantee the existence of minima, and finally, the computation of minima via bound constrained minimization algorithms. Theoretical results and heuristics support the development and the performance of the algorithm, which is discussed in detail. CCOMP supports general complex matrices, and its efficiency, applicability and validity are demonstrated on a variety of microwave applications.
Noise-immune complex correlation for optical coherence angiography based on standard and Jones matrix optical coherence tomography

PubMed Central

Makita, Shuichi; Kurokawa, Kazuhiro; Hong, Young-Joo; Miura, Masahiro; Yasuno, Yoshiaki

2016-01-01

This paper describes a complex correlation mapping algorithm for optical coherence angiography (cmOCA). The proposed algorithm avoids the signal-to-noise ratio dependence and exhibits low noise in vasculature imaging. The complex correlation coefficient of the signals, rather than that of the measured data, is estimated, and two-step averaging is introduced. Algorithms for motion artifact removal based on non-perfusing tissue detection using correlation are developed. The algorithms are implemented with Jones-matrix OCT. Simultaneous imaging of pigmented tissue and vasculature is also achieved using degree of polarization uniformity imaging with cmOCA. An application of cmOCA to in vivo posterior human eyes is presented to demonstrate that high-contrast images of patients’ eyes can be obtained. PMID:27446673
A deconvolution extraction method for 2D multi-object fibre spectroscopy based on the regularized least-squares QR-factorization algorithm

NASA Astrophysics Data System (ADS)

Yu, Jian; Yin, Qian; Guo, Ping; Luo, A.-li

2014-09-01

This paper presents an efficient method for the extraction of astronomical spectra from two-dimensional (2D) multifibre spectrographs based on the regularized least-squares QR-factorization (LSQR) algorithm. We address two issues: we propose a modified Gaussian point spread function (PSF) for modelling the 2D PSF from multi-emission-line gas-discharge lamp images (arc images), and we develop an efficient deconvolution method to extract spectra in real circumstances. The proposed modified 2D Gaussian PSF model can fit various types of 2D PSFs, including different radial distortion angles and ellipticities. We adopt the regularized LSQR algorithm to solve the sparse linear equations constructed from the sparse convolution matrix, which we designate the deconvolution spectrum extraction method. Furthermore, we implement a parallelized LSQR algorithm based on graphics processing unit programming in the Compute Unified Device Architecture to accelerate the computational processing. Experimental results illustrate that the proposed extraction method can greatly reduce the computational cost and memory use of the deconvolution method and, consequently, increase its efficiency and practicability. In addition, the proposed extraction method has a stronger noise tolerance than other methods, such as the boxcar (aperture) extraction and profile extraction methods. Finally, we present an analysis of the sensitivity of the extraction results to the radius and full width at half-maximum of the 2D PSF.
Performance comparisons on spatial lattice algorithm and direct matrix inverse method with application to adaptive arrays processing

NASA Technical Reports Server (NTRS)

An, S. H.; Yao, K.

1986-01-01

Lattice algorithm has been employed in numerous adaptive filtering applications such as speech analysis/synthesis, noise canceling, spectral analysis, and channel equalization. In this paper the application to adaptive-array processing is discussed. The advantages are fast convergence rate as well as computational accuracy independent of the noise and interference conditions. The results produced by this technique are compared to those obtained by the direct matrix inverse method.
Communication Lower Bounds and Optimal Algorithms for Programs that Reference Arrays - Part 1

DTIC Science & Technology

2013-05-14

include tensor contractions, the direct N-body algorithm, and database join. 1This indicates that this is the first of 5 times that matrix multiplication...and database join. Section 8 summarizes our results, and outlines the contents of Part 2 of this paper. Part 2 will discuss how to compute lower...contractions, the direct N–body algorithm, database join, and computing matrix powers Ak. 2 Geometric Model We begin by reviewing the geometric
A fast fully constrained geometric unmixing of hyperspectral images

NASA Astrophysics Data System (ADS)

Zhou, Xin; Li, Xiao-run; Cui, Jian-tao; Zhao, Liao-ying; Zheng, Jun-peng

2014-11-01

A great challenge in hyperspectral image analysis is decomposing a mixed pixel into a collection of endmembers and their corresponding abundance fractions. This paper presents an improved implementation of the Barycentric Coordinate approach to unmix hyperspectral images, integrated with the Most-Negative Remove Projection method to meet the abundance sum-to-one constraint (ASC) and the abundance non-negativity constraint (ANC). The original barycentric coordinate approach interprets the endmember unmixing problem as a simplex volume ratio problem, which is solved by calculating the determinants of two augmented matrices. One consists of all the endmembers, and the other consists of the to-be-unmixed pixel and all the endmembers except for the one corresponding to the specific abundance that is to be estimated. In this paper, we first modify the Barycentric Coordinate algorithm by bringing in the Matrix Determinant Lemma to simplify the unmixing process, so that the calculation contains only linear matrix and vector operations. The per-pixel matrix determinant calculation required by the original algorithm is thus avoided. By the end of this step, the estimated abundances meet the ASC. Then, the Most-Negative Remove Projection method is used to make the abundance fractions meet the full constraints. The algorithm is demonstrated on both synthetic and real images. It yields abundance maps similar to those obtained by FCLS, while its runtime is lower owing to its computational simplicity.
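The determinant-ratio idea behind the original barycentric coordinate approach can be sketched for the smallest interesting case: three endmembers in a two-band (2-D) feature space. This is only an illustration of the volume-ratio formula. It does not include the Matrix Determinant Lemma speed-up or the Most-Negative Remove Projection step described in the abstract, and the function names are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def augmented(points):
    """Augmented matrix: a row of ones above the point coordinates."""
    return [[1.0, 1.0, 1.0],
            [p[0] for p in points],
            [p[1] for p in points]]


def barycentric_abundances(endmembers, pixel):
    """Abundance of each endmember as a ratio of two determinants.

    endmembers: three 2-D points (the simplex vertices).
    pixel: the 2-D point to unmix.
    """
    full = det3(augmented(endmembers))
    abundances = []
    for i in range(3):
        # Replace endmember i by the pixel, keep all the others.
        replaced = list(endmembers)
        replaced[i] = pixel
        abundances.append(det3(augmented(replaced)) / full)
    return abundances
```

The abundances sum to one by construction, so the ASC holds automatically. A pixel outside the simplex yields at least one negative abundance, which is the case a projection step would have to handle to satisfy the ANC.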
A novel image encryption algorithm based on synchronized random bit generated in cascade-coupled chaotic semiconductor ring lasers

NASA Astrophysics Data System (ADS)

Li, Jiafu; Xiang, Shuiying; Wang, Haoning; Gong, Junkai; Wen, Aijun

2018-03-01

In this paper, a novel image encryption algorithm based on synchronization of physical random bit generated in a cascade-coupled semiconductor ring lasers (CCSRL) system is proposed, and the security analysis is performed. In both transmitter and receiver parts, the CCSRL system is a master-slave configuration consisting of a master semiconductor ring laser (M-SRL) with cross-feedback and a solitary SRL (S-SRL). The proposed image encryption algorithm includes image preprocessing based on conventional chaotic maps, pixel confusion based on control matrix extracted from physical random bit, and pixel diffusion based on random bit stream extracted from physical random bit. Firstly, the preprocessing method is used to eliminate the correlation between adjacent pixels. Secondly, physical random bit with verified randomness is generated based on chaos in the CCSRL system, and is used to simultaneously generate the control matrix and random bit stream. Finally, the control matrix and random bit stream are used for the encryption algorithm in order to change the position and the values of pixels, respectively. Simulation results and security analysis demonstrate that the proposed algorithm is effective and able to resist various typical attacks, and thus is an excellent candidate for secure image communication application.
FFT Computation with Systolic Arrays, A New Architecture

NASA Technical Reports Server (NTRS)

Boriakoff, Valentin

1994-01-01

The use of the Cooley-Tukey algorithm for computing the 1-D FFT lends itself to a particular matrix factorization which suggests direct implementation by linearly-connected systolic arrays. Here we present a new systolic architecture that embodies this algorithm. This implementation requires a smaller number of processors and a smaller number of memory cells than other recent implementations, as well as having all the advantages of systolic arrays. For the implementation of the decimation-in-frequency case, word-serial data input allows continuous real-time operation without the need of a serial-to-parallel conversion device. No control or data stream switching is necessary. Computer simulation of this architecture was done in the context of a 1024 point DFT with a fixed point processor, and CMOS processor implementation has started.
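As background for this record, a radix-2 Cooley-Tukey FFT can be written in a few lines. The sketch below is a plain recursive decimation-in-time software version, chosen for brevity. It is not the systolic architecture or the decimation-in-frequency data flow of the paper, and the function name `fft` is an assumption. Each recursion level corresponds to one sparse butterfly factor in the matrix factorization of the DFT matrix.

```python
import cmath


def fft(x):
    """Radix-2 decimation-in-time Cooley-Tukey FFT (illustrative sketch).

    The input length must be a power of two.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])  # DFT of the even-indexed samples
    odd = fft(x[1::2])   # DFT of the odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        # Butterfly: combine the two half-size DFTs with a twiddle factor.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out
```

For a 1024-point transform, as in the simulation mentioned above, this recursion has ten levels of 512 butterflies each.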
An extension of the QZ algorithm for solving the generalized matrix eigenvalue problem

NASA Technical Reports Server (NTRS)

Ward, R. C.

1973-01-01

This algorithm is an extension of Moler and Stewart's QZ algorithm with some added features for saving time and operations. Also, some additional properties of the QR algorithm which were not practical to implement in the QZ algorithm can be generalized with the combination shift QZ algorithm. Numerous test cases are presented to give practical application tests for algorithm. Based on results, this algorithm should be preferred over existing algorithms which attempt to solve the class of generalized eigenproblems where both matrices are singular or nearly singular.
Three-dimensional magnetotelluric inversion including topography using deformed hexahedral edge finite elements, direct solvers and data space Gauss-Newton, parallelized on SMP computers

NASA Astrophysics Data System (ADS)

Kordy, M. A.; Wannamaker, P. E.; Maris, V.; Cherkaev, E.; Hill, G. J.

2014-12-01

We have developed an algorithm for 3D simulation and inversion of magnetotelluric (MT) responses using deformable hexahedral finite elements that permits incorporation of topography. Direct solvers parallelized on symmetric multiprocessor (SMP), single-chassis workstations with large RAM are used for the forward solution, parameter jacobians, and model update. The forward simulator, jacobians calculations, as well as synthetic and real data inversion are presented. We use first-order edge elements to represent the secondary electric field (E), yielding accuracy O(h) for E and its curl (magnetic field). For very low frequency or small material admittivity, the E-field requires divergence correction. Using Hodge decomposition, correction may be applied after the forward solution is calculated. It allows accurate E-field solutions in dielectric air. The system matrix factorization is computed using the MUMPS library, which shows moderately good scalability through 12 processor cores but limited gains beyond that. The factored matrix is used to calculate the forward response as well as the jacobians of field and MT responses using the reciprocity theorem. Comparison with other codes demonstrates accuracy of our forward calculations. We consider a popular conductive/resistive double brick structure and several topographic models. In particular, the ability of finite elements to represent smooth topographic slopes permits accurate simulation of refraction of electromagnetic waves normal to the slopes at high frequencies. Run time tests indicate that for meshes as large as 150x150x60 elements, MT forward response and jacobians can be calculated in ~2.5 hours per frequency. For inversion, we implemented data space Gauss-Newton method, which offers reduction in memory requirement and a significant speedup of the parameter step versus model space approach. For dense matrix operations we use tiling approach of PLASMA library, which shows very good scalability. In synthetic inversions we examine the importance of including the topography in the inversion and we test different regularization schemes using weighted second norm of model gradient as well as inverting for a static distortion matrix following Miensopust/Avdeeva approach. We also apply our algorithm to invert MT data collected at Mt St Helens.
Improving recovery of ECG signal with deterministic guarantees using split signal for multiple supports of matching pursuit (SS-MSMP) algorithm.

PubMed

Tawfic, Israa Shaker; Kayhan, Sema Koc

2017-02-01

Compressed sensing (CS) is a new field used for signal acquisition and sensor design that has brought a large drop in the cost of acquiring sparse signals. In this paper, a new greedy pursuit algorithm, SS-MSMP (Split Signal for Multiple Support of Matching Pursuit), is introduced to improve the performance of greedy algorithms, and theoretical analyses are given. SS-MSMP is suggested for sparse data acquisition, in order to reconstruct analog and efficient signals via a small set of general measurements. This paper proposes a new fast method which depends on a study of the behavior of the support indices through picking the best estimation of the correlation between the residual and the measurement matrix. The term multiple supports originates from the algorithm itself: in each iteration, the best support indices are picked based on maximum quality created by discovering correlation for a particular length of support. The new algorithm builds on our previous derivation of the halting condition for Least Support Orthogonal Matching Pursuit (LS-OMP) for clear and noisy signals. For better reconstruction results, the SS-MSMP algorithm provides recovery of the support set for long signals such as those used in WBAN. Numerical experiments demonstrate that the new suggested algorithm performs well compared to existing algorithms in terms of many factors used for reconstruction performance. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

A Coupled/Uncoupled Computational Scheme for Deformation and Fatigue Damage Analysis of Unidirectional Metal-Matrix Composites

NASA Technical Reports Server (NTRS)

Wilt, Thomas E.; Arnold, Steven M.; Saleeb, Atef F.

1997-01-01

A fatigue damage computational algorithm utilizing a multiaxial, isothermal, continuum-based fatigue damage model for unidirectional metal-matrix composites has been implemented into the commercial finite element code MARC using MARC user subroutines. Damage is introduced into the finite element solution through the concept of effective stress that fully couples the fatigue damage calculations with the finite element deformation solution. Two applications using the fatigue damage algorithm are presented. First, an axisymmetric stress analysis of a circumferentially reinforced ring, wherein both the matrix cladding and the composite core were assumed to behave elastic-perfectly plastic. Second, a micromechanics analysis of a fiber/matrix unit cell using both the finite element method and the generalized method of cells (GMC). Results are presented in the form of S-N curves and damage distribution plots.
Efficient two-dimensional compressive sensing in MIMO radar

NASA Astrophysics Data System (ADS)

Shahbazi, Nafiseh; Abbasfar, Aliazam; Jabbarian-Jahromi, Mohammad

2017-12-01

Compressive sensing (CS) has been used to lower the sampling rate, leading to data reduction for processing in multiple-input multiple-output (MIMO) radar systems. In this paper, we further reduce the computational complexity of a pulse-Doppler collocated MIMO radar by introducing two-dimensional (2D) compressive sensing. To do so, we first introduce a new 2D formulation for the compressed received signals and then propose a new measurement matrix design for our 2D compressive sensing model that is based on minimizing the coherence of the sensing matrix using a gradient descent algorithm. The simulation results show that our proposed 2D measurement matrix design using gradient descent algorithm (2D-MMDGD) has much lower computational complexity than one-dimensional (1D) methods while performing better than conventional methods such as the Gaussian random measurement matrix.
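The quantity that the measurement matrix design minimizes, the coherence of the sensing matrix, is commonly defined as the largest normalized inner product between two distinct columns. The sketch below computes that standard mutual coherence for a real-valued matrix. It does not reproduce the authors' 2D formulation or their gradient descent design, and the function name is hypothetical.

```python
import math


def mutual_coherence(matrix):
    """Largest |<a_i, a_j>| / (||a_i|| ||a_j||) over distinct columns.

    matrix: list of rows with real entries and no all-zero column.
    """
    cols = list(zip(*matrix))  # transpose rows into columns
    norms = [math.sqrt(sum(v * v for v in c)) for c in cols]
    best = 0.0
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):
            inner = sum(a * b for a, b in zip(cols[i], cols[j]))
            best = max(best, abs(inner) / (norms[i] * norms[j]))
    return best
```

A value near 0 means nearly orthogonal columns. A value of 1 means two columns are parallel and cannot be told apart by any sparse recovery method.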
Matrix form for the instrument line shape of Fourier-transform spectrometers yielding a fast integration algorithm to theoretical spectra.

PubMed

Desbiens, Raphaël; Tremblay, Pierre; Genest, Jérôme; Bouchard, Jean-Pierre

2006-01-20

The instrument line shape (ILS) of a Fourier-transform spectrometer is expressed in a matrix form. For all line shape effects that scale with wavenumber, the ILS matrix is shown to be transposed in the spectral and interferogram domains. The novel representation of the ILS matrix in the interferogram domain yields an insightful physical interpretation of the underlying process producing self-apodization. Working in the interferogram domain circumvents the problem of taking into account the effects of finite optical path difference and permits a proper discretization of the equations. A fast algorithm in O(N log2 N), based on the fractional Fourier transform, is introduced that permits the application of a constant resolving power line shape to theoretical spectra or forward models. The ILS integration formalism is validated with experimental data.
Fast and accurate matrix completion via truncated nuclear norm regularization.

PubMed

Hu, Yao; Zhang, Debing; Ye, Jieping; Li, Xuelong; He, Xiaofei

2013-09-01

Recovering a large matrix from a small subset of its entries is a challenging problem arising in many real applications, such as image inpainting and recommender systems. Many existing approaches formulate this problem as a general low-rank matrix approximation problem. Since the rank operator is nonconvex and discontinuous, most of the recent theoretical studies use the nuclear norm as a convex relaxation. One major limitation of the existing approaches based on nuclear norm minimization is that all the singular values are simultaneously minimized, and thus the rank may not be well approximated in practice. In this paper, we propose to achieve a better approximation to the rank of matrix by truncated nuclear norm, which is given by the nuclear norm subtracted by the sum of the largest few singular values. In addition, we develop a novel matrix completion algorithm by minimizing the Truncated Nuclear Norm. We further develop three efficient iterative procedures, TNNR-ADMM, TNNR-APGL, and TNNR-ADMMAP, to solve the optimization problem. TNNR-ADMM utilizes the alternating direction method of multipliers (ADMM), while TNNR-APGL applies the accelerated proximal gradient line search method (APGL) for the final optimization. For TNNR-ADMMAP, we make use of an adaptive penalty according to a novel update rule for ADMM to achieve a faster convergence rate. Our empirical study shows encouraging results of the proposed algorithms in comparison to the state-of-the-art matrix completion algorithms on both synthetic and real visual datasets.
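The definition in this abstract (nuclear norm minus the sum of the largest few singular values) translates directly into code. The following sketch assumes NumPy is available and evaluates the truncated nuclear norm of a given matrix. It is not any of the TNNR-ADMM, TNNR-APGL, or TNNR-ADMMAP solvers, and the function name and the parameter `r` (the number of leading singular values left out) are illustrative choices.

```python
import numpy as np


def truncated_nuclear_norm(x, r):
    """Sum of all singular values of x except the r largest."""
    # numpy.linalg.svd returns singular values in descending order.
    s = np.linalg.svd(np.asarray(x, dtype=float), compute_uv=False)
    return float(s[r:].sum())
```

With `r = 0` this reduces to the ordinary nuclear norm. For a matrix of rank at most `r` the value is zero, which is why the truncated norm tracks the rank more closely than the full nuclear norm.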
Nonnegative matrix factorization and sparse representation for the automated detection of periodic limb movements in sleep.

PubMed

Shokrollahi, Mehrnaz; Krishnan, Sridhar; Dopsa, Dustin D; Muir, Ryan T; Black, Sandra E; Swartz, Richard H; Murray, Brian J; Boulos, Mark I

2016-11-01

Stroke is a leading cause of death and disability in adults, and incurs a significant economic burden to society. Periodic limb movements (PLMs) in sleep are repetitive movements involving the great toe, ankle, and hip. Evolving evidence suggests that PLMs may be associated with high blood pressure and stroke, but this relationship remains underexplored. Several issues limit the study of PLMs including the need to manually score them, which is time-consuming and costly. For this reason, we developed a novel automated method for nocturnal PLM detection, which was shown to be correlated with (a) the manually scored PLM index on polysomnography, and (b) white matter hyperintensities on brain imaging, which have been demonstrated to be associated with PLMs. Our proposed algorithm consists of three main stages: (1) representing the signal in the time-frequency plane using time-frequency matrices (TFM), (2) applying K-nonnegative matrix factorization technique to decompose the TFM matrix into its significant components, and (3) applying kernel sparse representation for classification (KSRC) to the decomposed signal. Our approach was applied to a dataset that consisted of 65 subjects who underwent polysomnography. An overall classification of 97 % was achieved for discrimination of the aforementioned signals, demonstrating the potential of the presented method.
Single-phase power distribution system power flow and fault analysis

NASA Technical Reports Server (NTRS)

Halpin, S. M.; Grigsby, L. L.

1992-01-01

Alternative methods for power flow and fault analysis of single-phase distribution systems are presented. The algorithms for both power flow and fault analysis utilize a generalized approach to network modeling. The generalized admittance matrix, formed using elements of linear graph theory, is an accurate network model for all possible single-phase network configurations. Unlike the standard nodal admittance matrix formulation algorithms, the generalized approach uses generalized component models for the transmission line and transformer. The standard assumption of a common node voltage reference point is not required to construct the generalized admittance matrix. Therefore, truly accurate simulation results can be obtained for networks that cannot be modeled using traditional techniques.
Implementation and analysis of list mode algorithm using tubes of response on a dedicated brain and breast PET

NASA Astrophysics Data System (ADS)

Moliner, L.; Correcher, C.; González, A. J.; Conde, P.; Hernández, L.; Orero, A.; Rodríguez-Álvarez, M. J.; Sánchez, F.; Soriano, A.; Vidal, L. F.; Benlloch, J. M.

2013-02-01

In this work we present an innovative algorithm for the reconstruction of PET images based on the List-Mode (LM) technique which improves their spatial resolution compared to results obtained with current MLEM algorithms. This study is part of a larger project with the aim of improving diagnosis in early Alzheimer disease stages by means of a newly developed hybrid PET-MR insert. At present, Alzheimer's disease is the most relevant neurodegenerative disease, and the best way to apply an effective treatment is early diagnosis. The PET device will consist of several monolithic LYSO crystals coupled to SiPM detectors. Monolithic crystals can reduce scanner costs, with the added advantage of enabling the implementation of very small virtual pixels in their geometry. This is especially useful for LM reconstruction algorithms, since they do not need a pre-calculated system matrix. We have developed an LM algorithm which has been initially tested with a large aperture (186 mm) breast PET system. Instead of using the common lines of response, this algorithm incorporates a novel calculation of tubes of response. The new approach improves the volumetric spatial resolution by about a factor of 2 at the border of the field of view when compared with the traditionally used MLEM algorithm. Moreover, it has also been shown to decrease the image noise, thus increasing the image quality.
Minimal parameter solution of the orthogonal matrix differential equation

NASA Technical Reports Server (NTRS)

Bar-Itzhack, Itzhack Y.; Markley, F. Landis

1990-01-01

As demonstrated in this work, all orthogonal matrices solve a first order differential equation. The straightforward solution of this equation requires n^2 integrations to obtain the elements of the nth order matrix. There are, however, only n(n-1)/2 independent parameters which determine an orthogonal matrix. The questions of choosing them, finding their differential equation and expressing the orthogonal matrix in terms of these parameters are considered. Several possibilities which are based on attitude determination in three dimensions are examined. It is shown that not all 3-D methods have useful extensions to higher dimensions. It is also shown why the rate of change of the matrix elements, which are the elements of the angular rate vector in 3-D, are the elements of a tensor of the second rank (dyadic) in spaces other than three dimensional. It is proven that the 3-D Gibbs vector (or Cayley Parameters) are extendable to other dimensions. An algorithm is developed employing the resulting parameters, which are termed Extended Rodrigues Parameters, and numerical results are presented of the application of the algorithm to a fourth order matrix.
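In three dimensions the parameter count n(n-1)/2 equals 3, and the Gibbs vector (Rodrigues parameters) is one such minimal set. The sketch below builds a 3 × 3 rotation matrix from a Gibbs vector g = tan(θ/2)·n using the standard closed form. The sign of the cross-product term depends on the active/passive rotation convention and is an assumption here. The n-dimensional extension developed in the paper is not reproduced, and the function name is hypothetical.

```python
def rotation_from_gibbs(g):
    """3x3 rotation matrix from a Gibbs vector g = tan(theta / 2) * axis.

    Closed form: R = ((1 - g.g) I + 2 g g^T + 2 [g x]) / (1 + g.g),
    where [g x] is the skew-symmetric cross-product matrix of g.
    """
    gx, gy, gz = g
    g2 = gx * gx + gy * gy + gz * gz
    cross = [[0.0, -gz, gy],
             [gz, 0.0, -gx],
             [-gy, gx, 0.0]]
    return [[((1.0 - g2) * (1.0 if i == j else 0.0)
              + 2.0 * g[i] * g[j]
              + 2.0 * cross[i][j]) / (1.0 + g2)
             for j in range(3)]
            for i in range(3)]
```

Three numbers determine all nine matrix elements, and the result is orthogonal for any finite g. The parameterization becomes singular only at a rotation angle of 180 degrees, where tan(θ/2) is unbounded.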
Minimal parameter solution of the orthogonal matrix differential equation

NASA Technical Reports Server (NTRS)

Baritzhack, Itzhack Y.; Markley, F. Landis

1988-01-01

As demonstrated in this work, all orthogonal matrices solve a first order differential equation. The straightforward solution of this equation requires n^2 integrations to obtain the elements of the nth order matrix. There are, however, only n(n-1)/2 independent parameters which determine an orthogonal matrix. The questions of choosing them, finding their differential equation and expressing the orthogonal matrix in terms of these parameters are considered. Several possibilities which are based on attitude determination in three dimensions are examined. It is shown that not all 3-D methods have useful extensions to higher dimensions. It is also shown why the rate of change of the matrix elements, which are the elements of the angular rate vector in 3-D, are the elements of a tensor of the second rank (dyadic) in spaces other than three dimensional. It is proven that the 3-D Gibbs vector (or Cayley Parameters) are extendable to other dimensions. An algorithm is developed employing the resulting parameters, which are termed Extended Rodrigues Parameters, and numerical results are presented of the application of the algorithm to a fourth order matrix.
On the degree conjecture for separability of multipartite quantum states

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hassan, Ali Saif M.; Joag, Pramod S.

2008-01-15

We settle the so-called degree conjecture for the separability of multipartite quantum states, which are normalized graph Laplacians, first given by Braunstein et al. [Phys. Rev. A 73, 012320 (2006)]. The conjecture states that a multipartite quantum state is separable if and only if the degree matrix of the graph associated with the state is equal to the degree matrix of the partial transpose of this graph. We call this statement the strong form of the conjecture. In its weak version, the conjecture requires only the necessity, that is, if the state is separable, the corresponding degree matrices match. We prove the strong form of the conjecture for pure multipartite quantum states using the modified tensor product of graphs defined by Hassan and Joag [J. Phys. A 40, 10251 (2007)], as both a necessary and sufficient condition for separability. Based on this proof, we give a polynomial-time algorithm for completely factorizing any pure multipartite quantum state. By polynomial-time algorithm, we mean that the execution time of this algorithm increases as a polynomial in m, where m is the number of parts of the quantum system. We give a counterexample to show that the conjecture fails, in general, even in its weak form, for multipartite mixed states. Finally, we prove this conjecture, in its weak form, for a class of multipartite mixed states, giving only a necessary condition for separability.
3D frequency-domain finite-difference modeling of acoustic wave propagation

NASA Astrophysics Data System (ADS)

Operto, S.; Virieux, J.

2006-12-01

We present a 3D frequency-domain finite-difference method for acoustic wave propagation modeling. This method is developed as a tool to perform 3D frequency-domain full-waveform inversion of wide-angle seismic data. For wide-angle data, frequency-domain full-waveform inversion can be applied to only a few discrete frequencies to develop a reliable velocity model. Frequency-domain finite-difference (FD) modeling of wave propagation requires the solution of a huge sparse system of linear equations. If this system can be solved with a direct method, solutions for multiple sources can be computed efficiently once the underlying matrix has been factorized. The drawback of the direct method is the memory requirement resulting from the fill-in of the matrix during factorization. We assess in this study whether representative problems can be addressed in 3D geometry with such an approach. We start from the velocity-stress formulation of the 3D acoustic wave equation. The spatial derivatives are discretized with second-order accurate staggered-grid stencils on different coordinate systems such that the axes span as many directions as possible. Once the discrete equations are developed on each coordinate system, the particle velocity fields are eliminated from the first-order hyperbolic system (following the so-called parsimonious staggered-grid method), leading to second-order elliptic wave equations in pressure. The second-order wave equations discretized on each coordinate system are combined linearly to mitigate the numerical anisotropy. Secondly, grid dispersion is minimized by replacing the mass term at the collocation point by its weighted average over all the grid points of the stencil. Use of second-order accurate staggered-grid stencils reduces the bandwidth of the matrix to be factorized. The final stencil incorporates 27 points. Absorbing conditions are PML.
The system is solved using the parallel direct solver MUMPS developed for distributed-memory computers. The MUMPS solver is based on a multifrontal method for LU factorization. We used the METIS algorithm to perform re-ordering of the matrix coefficients before factorization. Four grid points per minimum wavelength are used for discretization. We applied our algorithm to the 3D SEG/EAGE synthetic onshore OVERTHRUST model of dimensions 20 x 20 x 4.65 km. The velocities range between 2 and 6 km/s. We performed the simulations using 192 processors with 2 Gbytes of RAM per processor. We performed simulations for the 5 Hz, 7 Hz and 10 Hz frequencies in some fractions of the OVERTHRUST model. The grid interval was 100 m, 75 m and 50 m respectively. The grid dimensions were 207x207x53, 275x218x71 and 409x109x102, corresponding to 100, 80 and 25 percent of the model respectively. The time for factorization is 20 min, 108 min and 163 min respectively. The solution time was 3.8, 9.3 and 10.3 s per source. The total memory used during factorization is 143, 384 and 449 Gbytes respectively. One can note the huge memory requirement for factorization and the efficiency of the direct method in computing solutions for a large number of sources. This highlights the respective drawback and merit of the frequency-domain approach with respect to its time-domain counterpart. These results show that 3D acoustic frequency-domain wave propagation modeling can be performed at low frequencies using a direct solver on large clusters of PCs. This forward modeling algorithm may be used in the future as a tool to image the first kilometers of the crust by frequency-domain full-waveform inversion. For larger problems, we will use the out-of-core memory during factorization that has been implemented by the authors of MUMPS.
A high order accurate finite element algorithm for high Reynolds number flow prediction

NASA Technical Reports Server (NTRS)

Baker, A. J.

1978-01-01

A Galerkin-weighted residuals formulation is employed to establish an implicit finite element solution algorithm for generally nonlinear initial-boundary value problems. Solution accuracy, and convergence rate with discretization refinement, are quantized in several error norms, by a systematic study of numerical solutions to several nonlinear parabolic and a hyperbolic partial differential equation characteristic of the equations governing fluid flows. Solutions are generated using selective linear, quadratic and cubic basis functions. Richardson extrapolation is employed to generate a higher-order accurate solution to facilitate isolation of truncation error in all norms. Extension of the mathematical theory underlying accuracy and convergence concepts for linear elliptic equations is predicted for equations characteristic of laminar and turbulent fluid flows at nonmodest Reynolds number. The nondiagonal initial-value matrix structure introduced by the finite element theory is determined intrinsic to improved solution accuracy and convergence. A factored Jacobian iteration algorithm is derived and evaluated to yield a consequential reduction in both computer storage and execution CPU requirements while retaining solution accuracy.
Regularization Parameter Selection for Nonlinear Iterative Image Restoration and MRI Reconstruction Using GCV and SURE-Based Methods

PubMed Central

Ramani, Sathish; Liu, Zhihao; Rosen, Jeffrey; Nielsen, Jon-Fredrik; Fessler, Jeffrey A.

2012-01-01

Regularized iterative reconstruction algorithms for imaging inverse problems require selection of appropriate regularization parameter values. We focus on the challenging problem of tuning regularization parameters for nonlinear algorithms for the case of additive (possibly complex) Gaussian noise. Generalized cross-validation (GCV) and (weighted) mean-squared error (MSE) approaches (based on Stein's Unbiased Risk Estimate, SURE) need the Jacobian matrix of the nonlinear reconstruction operator (representative of the iterative algorithm) with respect to the data. We derive the desired Jacobian matrix for two types of nonlinear iterative algorithms: a fast variant of the standard iterative reweighted least-squares method and the contemporary split-Bregman algorithm, both of which can accommodate a wide variety of analysis- and synthesis-type regularizers. The proposed approach iteratively computes two weighted SURE-type measures: Predicted-SURE and Projected-SURE (that require knowledge of noise variance σ²), and GCV (that does not need σ²) for these algorithms. We apply the methods to image restoration and to magnetic resonance image (MRI) reconstruction using total variation (TV) and an analysis-type ℓ1-regularization. We demonstrate through simulations and experiments with real data that minimizing Predicted-SURE and Projected-SURE consistently leads to near-MSE-optimal reconstructions. We also observed that minimizing GCV yields reconstruction results that are near-MSE-optimal for image restoration and slightly sub-optimal for MRI. Theoretical derivations in this work related to Jacobian matrix evaluations can be extended, in principle, to other types of regularizers and reconstruction algorithms. PMID:22531764
Parallel conjugate gradient algorithms for manipulator dynamic simulation

NASA Technical Reports Server (NTRS)

Fijany, Amir; Scheld, Robert E.

1989-01-01

Parallel conjugate gradient algorithms for the computation of multibody dynamics are developed for the specialized case of a robot manipulator. For an n-dimensional positive-definite linear system, the Classical Conjugate Gradient (CCG) algorithms are guaranteed to converge in n iterations, each with a computation cost of O(n); this leads to a total computational cost of O(n sq) on a serial processor. Conjugate gradient algorithms are presented that provide greater efficiency by using a preconditioner, which reduces the number of iterations required, and by exploiting parallelism, which reduces the cost of each iteration. Two Preconditioned Conjugate Gradient (PCG) algorithms are proposed which respectively use a diagonal and a tridiagonal matrix, composed of the diagonal and tridiagonal elements of the mass matrix, as preconditioners. Parallel algorithms are developed to compute the preconditioners and their inversions in O(log sub 2 n) steps using n processors. A parallel algorithm is also presented which, on the same architecture, achieves the computational time of O(log sub 2 n) for each iteration. Simulation results for a seven degree-of-freedom manipulator are presented. Variants of the proposed algorithms are also developed which can be efficiently implemented on the Robot Mathematics Processor (RMP).
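A serial sketch of the diagonally preconditioned variant may help make the idea concrete. It assumes a dense symmetric positive-definite matrix stored as a list of rows and uses the textbook PCG recurrence with the inverse of the diagonal as preconditioner; it does not reproduce the parallel scheduling described in the abstract, and the function name and the 3 x 3 test system are hypothetical.

```python
def pcg_jacobi(a, b, tol=1e-12, max_iter=None):
    """Solve a x = b for symmetric positive-definite a with diagonal preconditioning.

    Returns the solution and the number of iterations performed.
    """
    n = len(b)
    if max_iter is None:
        max_iter = 10 * n
    x = [0.0] * n
    r = list(b)                                  # residual b - a x for x = 0
    inv_diag = [1.0 / a[i][i] for i in range(n)]  # preconditioner M^-1 = diag(a)^-1
    z = [inv_diag[i] * r[i] for i in range(n)]
    p = list(z)
    rz = sum(ri * zi for ri, zi in zip(r, z))
    for it in range(max_iter):
        if sum(ri * ri for ri in r) ** 0.5 < tol:
            return x, it
        # Dense matrix-vector product; this is the dominant cost of each iteration.
        ap = [sum(a[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rz / sum(pi * api for pi, api in zip(p, ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        z = [inv_diag[i] * r[i] for i in range(n)]
        rz_new = sum(ri * zi for ri, zi in zip(r, z))
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


# Small stand-in for a mass matrix (hypothetical values).
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
x, iterations = pcg_jacobi(A, b)
```

Applying the diagonal preconditioner is an elementwise scaling, which is why it parallelizes trivially; a tridiagonal preconditioner would replace that scaling with a tridiagonal solve.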
Nonnegative Matrix Factorization for identification of unknown number of sources emitting delayed signals

PubMed Central

Iliev, Filip L.; Stanev, Valentin G.; Vesselinov, Velimir V.

2018-01-01

Factor analysis is broadly used as a powerful unsupervised machine learning tool for reconstruction of hidden features in recorded mixtures of signals. In the case of a linear approximation, the mixtures can be decomposed by a variety of model-free Blind Source Separation (BSS) algorithms. Most of the available BSS algorithms consider an instantaneous mixing of signals, while the case when the mixtures are linear combinations of signals with delays is less explored. Especially difficult is the case when the number of sources of the signals with delays is unknown and has to be determined from the data as well. To address this problem, in this paper, we present a new method based on Nonnegative Matrix Factorization (NMF) that is capable of identifying: (a) the unknown number of the sources, (b) the delays and speed of propagation of the signals, and (c) the locations of the sources. Our method can be used to decompose records of mixtures of signals with delays emitted by an unknown number of sources in a nondispersive medium, based only on recorded data. This is the case, for example, when electromagnetic signals from multiple antennas are received asynchronously; or mixtures of acoustic or seismic signals recorded by sensors located at different positions; or when a shift in frequency is induced by the Doppler effect. By applying our method to synthetic datasets, we demonstrate its ability to identify the unknown number of sources as well as the waveforms, the delays, and the strengths of the signals. Using Bayesian analysis, we also evaluate estimation uncertainties and identify the region of likelihood where the positions of the sources can be found. PMID:29518126
Nonnegative Matrix Factorization for identification of unknown number of sources emitting delayed signals.

PubMed

Iliev, Filip L; Stanev, Valentin G; Vesselinov, Velimir V; Alexandrov, Boian S

2018-01-01

Factor analysis is broadly used as a powerful unsupervised machine learning tool for reconstruction of hidden features in recorded mixtures of signals. In the case of a linear approximation, the mixtures can be decomposed by a variety of model-free Blind Source Separation (BSS) algorithms. Most of the available BSS algorithms consider an instantaneous mixing of signals, while the case when the mixtures are linear combinations of signals with delays is less explored. Especially difficult is the case when the number of sources of the signals with delays is unknown and has to be determined from the data as well. To address this problem, in this paper, we present a new method based on Nonnegative Matrix Factorization (NMF) that is capable of identifying: (a) the unknown number of the sources, (b) the delays and speed of propagation of the signals, and (c) the locations of the sources. Our method can be used to decompose records of mixtures of signals with delays emitted by an unknown number of sources in a nondispersive medium, based only on recorded data. This is the case, for example, when electromagnetic signals from multiple antennas are received asynchronously; or mixtures of acoustic or seismic signals recorded by sensors located at different positions; or when a shift in frequency is induced by the Doppler effect. By applying our method to synthetic datasets, we demonstrate its ability to identify the unknown number of sources as well as the waveforms, the delays, and the strengths of the signals. Using Bayesian analysis, we also evaluate estimation uncertainties and identify the region of likelihood where the positions of the sources can be found.
Efficient Algorithms for Estimating the Absorption Spectrum within Linear Response TDDFT

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brabec, Jiri; Lin, Lin; Shao, Meiyue

We present two iterative algorithms for approximating the absorption spectrum of molecules within linear response of time-dependent density functional theory (TDDFT) framework. These methods do not attempt to compute eigenvalues or eigenvectors of the linear response matrix. They are designed to approximate the absorption spectrum as a function directly. They take advantage of the special structure of the linear response matrix. Neither method requires the linear response matrix to be constructed explicitly. They only require a procedure that performs the multiplication of the linear response matrix with a vector. These methods can also be easily modified to efficiently estimate the density of states (DOS) of the linear response matrix without computing the eigenvalues of this matrix. We show by computational experiments that the methods proposed in this paper can be much more efficient than methods that are based on the exact diagonalization of the linear response matrix. We show that they can also be more efficient than real-time TDDFT simulations. We compare the pros and cons of these methods in terms of their accuracy as well as their computational and storage cost.
A novel image encryption algorithm based on the chaotic system and DNA computing

NASA Astrophysics Data System (ADS)

Chai, Xiuli; Gan, Zhihua; Lu, Yang; Chen, Yiran; Han, Daojun

A novel image encryption algorithm using the chaotic system and deoxyribonucleic acid (DNA) computing is presented. Different from the traditional encryption methods, the permutation and diffusion of our method are manipulated on the 3D DNA matrix. Firstly, a 3D DNA matrix is obtained through bit plane splitting, bit plane recombination, DNA encoding of the plain image. Secondly, 3D DNA level permutation based on position sequence group (3DDNALPBPSG) is introduced, and chaotic sequences generated from the chaotic system are employed to permutate the positions of the elements of the 3D DNA matrix. Thirdly, 3D DNA level diffusion (3DDNALD) is given, the confused 3D DNA matrix is split into sub-blocks, and XOR operation by block is manipulated to the sub-DNA matrix and the key DNA matrix from the chaotic system. At last, by decoding the diffused DNA matrix, we get the cipher image. SHA 256 hash of the plain image is employed to calculate the initial values of the chaotic system to avoid chosen plaintext attack. Experimental results and security analyses show that our scheme is secure against several known attacks, and it can effectively protect the security of the images.
Development and Evaluation of an Order-N Formulation for Multi-Flexible Body Space Systems

NASA Technical Reports Server (NTRS)

Ghosh, Tushar K.; Quiocho, Leslie J.

2013-01-01

This paper presents development of a generic recursive Order-N algorithm for systems with rigid and flexible bodies, in tree or closed-loop topology, with N being the number of bodies of the system. Simulation results are presented for several test cases to verify and evaluate the performance of the code compared to an existing efficient dense mass matrix-based code. The comparison brought out situations where Order-N or mass matrix-based algorithms could be useful.
Survey of Quantification and Distance Functions Used for Internet-based Weak-link Sociological Phenomena

DTIC Science & Technology

2016-03-01

well as the Yahoo search engine and a classic SearchKing HIST algorithm. The co-PI immersed herself in the sociology literature for the relevant...Google matrix, PageRank as well as the Yahoo search engine and a classic SearchKing HIST algorithm. The co-PI immersed herself in the sociology...The PI studied all mathematical literature he can find related to the Google search engine, Google matrix, PageRank as well as the Yahoo search

Extending the range of real time density matrix renormalization group simulations

NASA Astrophysics Data System (ADS)

Kennes, D. M.; Karrasch, C.

2016-03-01

We discuss a few simple modifications to time-dependent density matrix renormalization group (DMRG) algorithms which allow access to larger time scales. We specifically aim at beginners and present practical aspects of how to implement these modifications within any standard matrix product state (MPS) based formulation of the method. Most importantly, we show how to 'combine' the Schrödinger and Heisenberg time evolutions of arbitrary pure states | ψ 〉 and operators A in the evaluation of 〈A〉ψ(t) = 〈 ψ | A(t) | ψ 〉. This includes quantum quenches. The generalization to (non-)thermal mixed state dynamics 〈A〉ρ(t) = Tr [ ρA(t) ] induced by an initial density matrix ρ is straightforward. In the context of linear response (ground state or finite temperature T > 0) correlation functions, one can extend the simulation time by a factor of two by 'exploiting time translation invariance', which is efficiently implementable within MPS DMRG. We present a simple analytic argument for why a recently-introduced disentangler succeeds in reducing the effort of time-dependent simulations at T > 0. Finally, we advocate the Python programming language as an elegant option for beginners to set up a DMRG code.
Streaming PCA with many missing entries.

DOT National Transportation Integrated Search

2015-12-01

This paper considers the problem of matrix completion when some number of the columns are : completely and arbitrarily corrupted, potentially by a malicious adversary. It is well-known that standard : algorithms for matrix completion can return arbit...
Subspace aware recovery of low rank and jointly sparse signals

PubMed Central

Biswas, Sampurna; Dasgupta, Soura; Mudumbai, Raghuraman; Jacob, Mathews

2017-01-01

We consider the recovery of a matrix X, which is simultaneously low rank and joint sparse, from few measurements of its columns using a two-step algorithm. Each column of X is measured using a combination of two measurement matrices: one which is the same for every column, while the second measurement matrix varies from column to column. The recovery proceeds by first estimating the row subspace vectors from the measurements corresponding to the common matrix. The estimated row subspace vectors are then used to recover X from all the measurements using a convex program of joint sparsity minimization. Our main contribution is to provide sufficient conditions on the measurement matrices that guarantee the recovery of such a matrix using the above two-step algorithm. The results demonstrate quite significant savings in the number of measurements when compared to the standard multiple measurement vector (MMV) scheme, which assumes the same time-invariant measurement pattern for all the time frames. We illustrate the impact of the sampling pattern on reconstruction quality using breath-held cardiac cine MRI and cardiac perfusion MRI data, while the utility of the algorithm to accelerate the acquisition is demonstrated on MR parameter mapping. PMID:28630889
Arikan and Alamouti matrices based on fast block-wise inverse Jacket transform

NASA Astrophysics Data System (ADS)

Lee, Moon Ho; Khan, Md Hashem Ali; Kim, Kyeong Jin

2013-12-01

Recently, Lee and Hou (IEEE Signal Process Lett 13: 461-464, 2006) proposed one-dimensional and two-dimensional fast algorithms for block-wise inverse Jacket transforms (BIJTs). Their BIJTs are not real inverse Jacket transforms from a mathematical point of view because their inverses do not satisfy the usual condition, i.e., the multiplication of a matrix with its inverse matrix is not equal to the identity matrix. Therefore, we mathematically propose a fast block-wise inverse Jacket transform of orders N = 2^k, 3^k, 5^k, and 6^k, where k is a positive integer. Based on the Kronecker product of the successive lower order Jacket matrices and the basis matrix, the fast algorithms for realizing these transforms are obtained. Due to the simple inverse and fast algorithms of Arikan polar binary and Alamouti multiple-input multiple-output (MIMO) non-binary matrices, which are obtained from BIJTs, they can be applied in areas such as 3GPP physical layer for ultra mobile broadband permutation matrices design, first-order q-ary Reed-Muller code design, diagonal channel design, diagonal subchannel decomposition for interference alignment, and 4G MIMO long-term evolution Alamouti precoding design.
Singularity-free extraction of a quaternion from a direction-cosine matrix. [for spacecraft control and guidance

NASA Technical Reports Server (NTRS)

Klumpp, A. R.

1976-01-01

A computer algorithm for extracting a quaternion from a direction-cosine matrix (DCM) is described. The quaternion provides a four-parameter representation of rotation, as against the nine-parameter representation afforded by a DCM. Commanded attitude in space shuttle steering is conveniently computed by DCM, while actual attitude is computed most compactly as a quaternion, as is attitude error. The unit length of the rotation quaternion, and the interchangeability of a quaternion and its negative, are used to advantage in the extraction algorithm. Protection of the algorithm against square root failure and division overflow is considered. Necessary and sufficient conditions for handling the rotation vector element of largest magnitude are discussed.
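The largest-element idea mentioned in this abstract can be sketched with the widely used textbook extraction shown below. It is not claimed to be Klumpp's exact procedure. It assumes a scalar-first unit quaternion (w, x, y, z) and the matrix convention written out in `quat_to_dcm`; both function names are hypothetical.

```python
import math


def quat_to_dcm(q):
    """Rotation matrix for a unit quaternion (w, x, y, z); this fixes the convention used here."""
    w, x, y, z = q
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def dcm_to_quat(r):
    """Recover a unit quaternion (up to overall sign) from a rotation matrix."""
    t = r[0][0] + r[1][1] + r[2][2]
    # These four values equal 4w^2, 4x^2, 4y^2, 4z^2 and always sum to 4,
    # so the largest is at least 1: the square root and the divisions are safe.
    cand = [1 + t, 1 + 2 * r[0][0] - t, 1 + 2 * r[1][1] - t, 1 + 2 * r[2][2] - t]
    k = max(range(4), key=lambda i: cand[i])
    s = 2.0 * math.sqrt(cand[k])  # four times the magnitude of the chosen component
    if k == 0:
        return [s / 4, (r[2][1] - r[1][2]) / s, (r[0][2] - r[2][0]) / s, (r[1][0] - r[0][1]) / s]
    if k == 1:
        return [(r[2][1] - r[1][2]) / s, s / 4, (r[0][1] + r[1][0]) / s, (r[0][2] + r[2][0]) / s]
    if k == 2:
        return [(r[0][2] - r[2][0]) / s, (r[0][1] + r[1][0]) / s, s / 4, (r[1][2] + r[2][1]) / s]
    return [(r[1][0] - r[0][1]) / s, (r[0][2] + r[2][0]) / s, (r[1][2] + r[2][1]) / s, s / 4]


# A 180-degree rotation (w = 0) is the case where dividing by w would fail.
q_in = [0.0, 0.6, 0.8, 0.0]
q_out = dcm_to_quat(quat_to_dcm(q_in))
```

Because q and -q describe the same rotation, the chosen component is simply taken as positive and the remaining signs follow from the off-diagonal sums and differences.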
An O(log sup 2 N) parallel algorithm for computing the eigenvalues of a symmetric tridiagonal matrix

NASA Technical Reports Server (NTRS)

Swarztrauber, Paul N.

1989-01-01

An O(log sup 2 N) parallel algorithm is presented for computing the eigenvalues of a symmetric tridiagonal matrix using a parallel algorithm for computing the zeros of the characteristic polynomial. The method is based on a quadratic recurrence in which the characteristic polynomial is constructed on a binary tree from polynomials whose degree doubles at each level. Intervals that contain exactly one zero are determined by the zeros of polynomials at the previous level which ensures that different processors compute different zeros. The exact behavior of the polynomials at the interval endpoints is used to eliminate the usual problems induced by finite precision arithmetic.
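For orientation, the classical serial counterpart of this idea is bisection driven by a Sturm-type count: the signs of a three-term recurrence tell how many eigenvalues lie below a trial value, which isolates each zero of the characteristic polynomial in its own interval. The sketch below shows that serial technique only, not the parallel quadratic recurrence of the paper; the function names and the test matrix are hypothetical.

```python
import math


def count_below(diag, off, lam):
    """Count eigenvalues of the symmetric tridiagonal matrix that are less than lam.

    diag holds the n diagonal entries and off the n-1 off-diagonal entries.
    The count is the number of negative pivots of (T - lam I).
    """
    count = 0
    d = 1.0
    for i, a in enumerate(diag):
        b2 = off[i - 1] ** 2 if i > 0 else 0.0
        d = (a - lam) - b2 / d
        if d == 0.0:
            d = 1e-300  # nudge an exact zero pivot to keep the recurrence defined
        if d < 0.0:
            count += 1
    return count


def tridiag_eigenvalues(diag, off, tol=1e-12):
    """Return all eigenvalues in ascending order by bisection on the count."""
    n = len(diag)
    # Gershgorin discs give an interval that contains every eigenvalue.
    radius = [(abs(off[i - 1]) if i > 0 else 0.0) + (abs(off[i]) if i < n - 1 else 0.0)
              for i in range(n)]
    lo = min(d - r for d, r in zip(diag, radius))
    hi = max(d + r for d, r in zip(diag, radius))
    eigs = []
    for k in range(n):  # each k is independent, so this loop could run in parallel
        a, b = lo, hi
        while b - a > tol:
            mid = 0.5 * (a + b)
            if count_below(diag, off, mid) > k:
                b = mid
            else:
                a = mid
        eigs.append(0.5 * (a + b))
    return eigs


# Discrete 1D Laplacian of order 4; its eigenvalues are 2 - 2 cos(j pi / 5), j = 1..4.
eigs = tridiag_eigenvalues([2.0, 2.0, 2.0, 2.0], [-1.0, -1.0, -1.0])
expected = [2.0 - 2.0 * math.cos(j * math.pi / 5) for j in range(1, 5)]
```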
A Thick-Restart Lanczos Algorithm with Polynomial Filtering for Hermitian Eigenvalue Problems

DOE PAGES

Li, Ruipeng; Xi, Yuanzhe; Vecharynski, Eugene; ...

2016-08-16

Polynomial filtering can provide a highly effective means of computing all eigenvalues of a real symmetric (or complex Hermitian) matrix that are located in a given interval, anywhere in the spectrum. This paper describes a technique for tackling this problem by combining a thick-restart version of the Lanczos algorithm with deflation ("locking") and a new type of polynomial filter obtained from a least-squares technique. Furthermore, the resulting algorithm can be utilized in a "spectrum-slicing" approach whereby a very large number of eigenvalues and associated eigenvectors of the matrix are computed by extracting eigenpairs located in different subintervals independently from one another.
Second kind Chebyshev operational matrix algorithm for solving differential equations of Lane-Emden type

NASA Astrophysics Data System (ADS)

Doha, E. H.; Abd-Elhameed, W. M.; Youssri, Y. H.

2013-10-01

In this paper, we present a new second kind Chebyshev (S2KC) operational matrix of derivatives. With the aid of S2KC, an algorithm is described to obtain numerical solutions of a class of linear and nonlinear Lane-Emden type singular initial value problems (IVPs). The idea of obtaining such solutions is essentially based on reducing the differential equation with its initial conditions to a system of algebraic equations. Two illustrative examples concerning relevant physical problems (the Lane-Emden equations of the first and second kind) are discussed to demonstrate the validity and applicability of the suggested algorithm. The numerical results obtained compare favorably with the known analytical solutions.
Invisible data matrix detection with smart phone using geometric correction and Hough transform

NASA Astrophysics Data System (ADS)

Sun, Halit; Uysalturk, Mahir C.; Karakaya, Mahmut

2016-04-01

Two-dimensional data matrices are used in many different areas to provide quick and automatic data entry to a computer system. Their most common usage is to automatically read labeled products (books, medicines, food, etc.) and recognize them. In Turkey, alcoholic beverages and tobacco products are labeled and tracked with invisible data matrices for public safety and tax purposes. In this application, since the data matrices are printed on a special paper with a pigmented ink, they cannot be seen under daylight. When red LEDs are utilized for illumination and the reflected light is filtered, invisible data matrices become visible and can be decoded by special barcode readers. Because of the physical dimensions and price of these readers and the special training needed to use them, cheap, small and easily carried domestic mobile invisible data matrix reader systems need to be delivered to every inspector in the law enforcement units. In this paper, we first developed an apparatus attached to the smartphone, including a red LED light and a high pass filter. Then, we proposed an algorithm to process images captured by smartphones and to decode all information stored in the invisible data matrix images. The proposed algorithm mainly involves four stages. In the first step, the data matrix code is processed by the Hough transform to find the "L" shaped pattern. In the second step, the borders of the data matrix are found by using the convex hull and corner detection methods. Afterwards, the distortion of the invisible data matrix is corrected by a geometric correction technique and the size of every module is fixed in rectangular shape. Finally, the invisible data matrix is scanned line by line along the horizontal axis to decode it. Based on the results obtained from real test images of invisible data matrices captured with a smartphone, the proposed algorithm shows high accuracy and a low error rate.
Research on fast algorithm of small UAV navigation in non-linear matrix reductionism method

NASA Astrophysics Data System (ADS)

Zhang, Xiao; Fang, Jiancheng; Sheng, Wei; Cao, Juanjuan

2008-10-01

The low Reynolds numbers at which small UAVs fly result in unfavorable aerodynamic conditions for controlled flight. Because it operates near the ground, a small UAV is also seriously affected by low-frequency interference caused by atmospheric disturbance. The GNC system therefore needs high-frequency attitude estimation and control to keep the UAV steady. As small UAVs shrink, their GNC systems increasingly rely on embedded design to achieve compactness, light weight and low power consumption, which in turn limits the computational capability of the GNC system to a certain extent. A high-speed navigation algorithm is therefore urgently needed. To meet this requirement, this paper adopts a nonlinear matrix reduction approach to create a new high-speed navigation algorithm that holds the radii of the meridian circle and the prime vertical circle constant and linearizes the position matrix calculation formulae of the navigation equation. Compared with the normal navigation algorithm, this high-speed navigation algorithm requires 17.3% fewer operations. Within the small UAV's mission radius (20 km), the position error is less than 0.13 m. The results of semi-physical experiments and small-UAV autopilot testing show that this algorithm can realize high-frequency attitude estimation and control, and that it properly counters the low-frequency interference caused by atmospheric disturbance.
Exponentially more precise quantum simulation of fermions in the configuration interaction representation

NASA Astrophysics Data System (ADS)

Babbush, Ryan; Berry, Dominic W.; Sanders, Yuval R.; Kivlichan, Ian D.; Scherer, Artur; Wei, Annie Y.; Love, Peter J.; Aspuru-Guzik, Alán

2018-01-01

We present a quantum algorithm for the simulation of molecular systems that is asymptotically more efficient than all previous algorithms in the literature in terms of the main problem parameters. As in Babbush et al (2016 New Journal of Physics 18, 033032), we employ a recently developed technique for simulating Hamiltonian evolution using a truncated Taylor series to obtain logarithmic scaling with the inverse of the desired precision. The algorithm of this paper involves simulation under an oracle for the sparse, first-quantized representation of the molecular Hamiltonian known as the configuration interaction (CI) matrix. We construct and query the CI matrix oracle to allow for on-the-fly computation of molecular integrals in a way that is exponentially more efficient than classical numerical methods. Whereas second-quantized representations of the wavefunction require $\widetilde{O}(N)$ qubits, where $N$ is the number of single-particle spin-orbitals, the CI matrix representation requires $\widetilde{O}(\eta)$ qubits, where $\eta \ll N$ is the number of electrons in the molecule of interest. We show that the gate count of our algorithm scales at most as $\widetilde{O}(\eta^2 N^3 t)$.
Numerical Aspects of Atomic Physics: Helium Basis Sets and Matrix Diagonalization

NASA Astrophysics Data System (ADS)

Jentschura, Ulrich; Noble, Jonathan

2014-03-01

We present a matrix diagonalization algorithm for complex symmetric matrices, which can be used in order to determine the resonance energies of auto-ionizing states of comparatively simple quantum many-body systems such as helium. The algorithm is based on multi-precision arithmetic and proceeds via a tridiagonalization of the complex symmetric (not necessarily Hermitian) input matrix using generalized Householder transformations. Example calculations involving so-called PT-symmetric quantum systems lead to reference values which pertain to the imaginary cubic perturbation (the imaginary cubic anharmonic oscillator). We then proceed to novel basis sets for the helium atom and present results for Bethe logarithms in hydrogen and helium, obtained using the enhanced numerical techniques. Some intricacies of "canned" algorithms such as those used in LAPACK will be discussed. Our algorithm, for complex symmetric matrices such as those describing cubic resonances after complex scaling, is faster than LAPACK's built-in routines for specific classes of input matrices. It also offers flexibility in terms of the calculation of the so-called implicit shift, which is used in order to "pivot" the system toward convergence to diagonal form. We conclude with a wider overview.
ChromAlign: A two-step algorithmic procedure for time alignment of three-dimensional LC-MS chromatographic surfaces.

PubMed

Sadygov, Rovshan G; Maroto, Fernando Martin; Hühmer, Andreas F R

2006-12-15

We present an algorithmic approach to align three-dimensional chromatographic surfaces of LC-MS data of complex mixture samples. The approach consists of two steps. In the first step, we prealign chromatographic profiles: two-dimensional projections of chromatographic surfaces. This is accomplished by correlation analysis using fast Fourier transforms. In this step, a temporal offset that maximizes the overlap and dot product between two chromatographic profiles is determined. In the second step, the algorithm generates correlation matrix elements between full mass scans of the reference and sample chromatographic surfaces. The temporal offset from the first step indicates a range of the mass scans that are possibly correlated, then the correlation matrix is calculated only for these mass scans. The correlation matrix carries information on highly correlated scans, but it does not itself determine the scan or time alignment. Alignment is determined as a path in the correlation matrix that maximizes the sum of the correlation matrix elements. The computational complexity of the optimal path generation problem is reduced by the use of dynamic programming. The program produces time-aligned surfaces. The use of the temporal offset from the first step in the second step reduces the computation time for generating the correlation matrix and speeds up the process. The algorithm has been implemented in a program, ChromAlign, developed in C++ language for the .NET2 environment in WINDOWS XP. In this work, we demonstrate the applications of ChromAlign to alignment of LC-MS surfaces of several datasets: a mixture of known proteins, samples from digests of surface proteins of T-cells, and samples prepared from digests of cerebrospinal fluid. ChromAlign accurately aligns the LC-MS surfaces we studied. In these examples, we discuss various aspects of the alignment by ChromAlign, such as constant time axis shifts and warping of chromatographic surfaces.
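The second step described above, finding a path through the correlation matrix that maximizes the summed correlations, can be sketched with a small dynamic program. This is only an illustration and not the ChromAlign implementation: the function name `best_alignment_path`, the three allowed moves (diagonal, down, right) and the toy matrix are assumptions. Note that with a plain max-sum objective, positive off-path entries can make longer zigzag paths attractive, so the toy matrix uses negative correlations away from the true alignment.

```python
def best_alignment_path(corr):
    """Return (score, path) for the monotone path with the largest summed correlation.

    corr[i][j] is the correlation between reference scan i and sample scan j.
    The path runs from (0, 0) to the last cell; allowed moves (an assumption of
    this sketch) are diagonal, down, and right.
    """
    n, m = len(corr), len(corr[0])
    neg_inf = float("-inf")
    score = [[neg_inf] * m for _ in range(n)]
    back = [[None] * m for _ in range(n)]
    score[0][0] = corr[0][0]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            best, prev = neg_inf, None
            # Predecessors: diagonal first, so ties prefer the diagonal move.
            for di, dj in ((1, 1), (1, 0), (0, 1)):
                pi, pj = i - di, j - dj
                if pi >= 0 and pj >= 0 and score[pi][pj] > best:
                    best, prev = score[pi][pj], (pi, pj)
            score[i][j] = best + corr[i][j]
            back[i][j] = prev
    # Trace the optimal path back from the last cell.
    path = [(n - 1, m - 1)]
    while back[path[-1][0]][path[-1][1]] is not None:
        path.append(back[path[-1][0]][path[-1][1]])
    path.reverse()
    return score[n - 1][m - 1], path


# Hypothetical 3 x 4 correlation matrix: the sample lags the reference by one scan.
corr = [
    [0.2, 0.9, -0.1, -0.2],
    [-0.3, -0.2, 0.8, -0.1],
    [-0.4, -0.1, -0.3, 0.95],
]
score, path = best_alignment_path(corr)
print(path)
```

Restricting the computation to a band around the temporal offset from the first step, as the abstract describes, would only change which cells of `corr` are filled in.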
[The design and experiment of complementary S coding matrix based on digital micromirror spectrometer].

PubMed

Zhang, Zhi-Hai; Gao, Ling-Xiao; Guo, Yuan-Jun; Wang, Wei; Mo, Xiang-Xia

2012-12-01

Template selection is essential in applications of the digital micromirror spectrometer. The theoretically optimal coding matrix, the H-matrix, is not widely used because it is acyclic, complex to encode, and difficult to implement. The best practical alternative, the S-matrix, gives a slightly smaller noise improvement than the H-matrix. We therefore designed a new type of complementary S-matrix. A study of its noise-improvement theory shows that it combines the advantages of the H-matrix and the S-matrix. Experiments showed that the SNR is increased by a factor of 2.05 relative to the S-template.
Table-sized matrix model in fractional learning

NASA Astrophysics Data System (ADS)

Soebagyo, J.; Wahyudin; Mulyaning, E. C.

2018-05-01

This article explains a model for learning fractions, the Table-Sized Matrix model, in which the representation of fractions and their operations is symbolized by a matrix. Like the area model, the Table-Sized Matrix is employed to develop problem-solving capabilities. The Table-Sized Matrix model referred to in this article is used to develop elementary school students' understanding of the fraction concept, which can then be generalized into procedural fluency (algorithms) in solving problems involving fractions and their operations.
On the multivariate total least-squares approach to empirical coordinate transformations. Three algorithms

NASA Astrophysics Data System (ADS)

Schaffrin, Burkhard; Felus, Yaron A.

2008-06-01

The multivariate total least-squares (MTLS) approach aims at estimating a matrix of parameters, $\Xi$, from a linear model ($Y - E_Y = (X - E_X) \cdot \Xi$) that includes an observation matrix, $Y$, another observation matrix, $X$, and matrices of randomly distributed errors, $E_Y$ and $E_X$. Two special cases of the MTLS approach include the standard multivariate least-squares approach where only the observation matrix, $Y$, is perturbed by random errors and, on the other hand, the data least-squares approach where only the coefficient matrix $X$ is affected by random errors. In a previous contribution, the authors derived an iterative algorithm to solve the MTLS problem by using the nonlinear Euler-Lagrange conditions. In this contribution, new lemmas are developed to analyze the iterative algorithm, modify it, and compare it with a new 'closed form' solution that is based on the singular-value decomposition. For an application, the total least-squares approach is used to estimate the affine transformation parameters that convert cadastral data from the old to the new Israeli datum. Technical aspects of this approach, such as scaling the data and fixing the columns in the coefficient matrix, are investigated. This case study illuminates the issue of "symmetry" in the treatment of two sets of coordinates for identical point fields, a topic that had already been emphasized by Teunissen (1989, Festschrift to Torben Krarup, Geodetic Institute Bull no. 58, Copenhagen, Denmark, pp 335-342). The differences between the standard least-squares and the TLS approach are analyzed in terms of the estimated variance component and a first-order approximation of the dispersion matrix of the estimated parameters.
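For readers unfamiliar with SVD-based total least squares, the following sketch shows the classical textbook construction: stack the two observation matrices, take the SVD, and read the parameter matrix off the right singular vectors that belong to the smallest singular values. It is not claimed to be the closed form derived in the paper; the function name `mtls_svd` and the toy data are assumptions, and no column of the coefficient matrix is held fixed.

```python
import numpy as np


def mtls_svd(X, Y):
    """Classical SVD solution of (X - E_X) Xi = Y - E_Y (textbook TLS, a sketch).

    X is n x m, Y is n x d, and the returned Xi is m x d.
    """
    m = X.shape[1]
    # Right singular vectors of the stacked matrix [X Y].
    _, _, Vt = np.linalg.svd(np.hstack([X, Y]))
    V = Vt.T
    V12 = V[:m, m:]   # top-right block: m x d
    V22 = V[m:, m:]   # bottom-right block: d x d, assumed invertible
    return -V12 @ np.linalg.inv(V22)


# Hypothetical noise-free 2D coordinate data: Y is an exact linear map of X.
X = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0],
              [2.0, 1.0], [1.0, 3.0], [3.0, 2.0]])
Xi_true = np.array([[1.0, 0.2], [-0.3, 0.9]])
Y = X @ Xi_true
Xi_est = mtls_svd(X, Y)
print(np.round(Xi_est, 6))
```

With noise-free data the estimate coincides with the true parameter matrix; with noisy data the TLS and ordinary least-squares estimates generally differ, which is the comparison the abstract refers to.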
pyRMSD: a Python package for efficient pairwise RMSD matrix calculation and handling.

PubMed

Gil, Víctor A; Guallar, Víctor

2013-09-15

We introduce pyRMSD, an open source standalone Python package that aims at offering an integrative and efficient way of performing Root Mean Square Deviation (RMSD)-related calculations of large sets of structures. It is specially tuned to do fast collective RMSD calculations, as pairwise RMSD matrices, implementing up to three well-known superposition algorithms. pyRMSD provides its own symmetric distance matrix class that, besides the fact that it can be used as a regular matrix, helps to save memory and increases memory access speed. This last feature can dramatically improve the overall performance of any Python algorithm using it. In addition, its extensibility, testing suites and documentation make it a good choice to those in need of a workbench for developing or testing new algorithms. The source code (under MIT license), installer, test suites and benchmarks can be found at https://pele.bsc.es/ under the tools section. victor.guallar@bsc.es Supplementary data are available at Bioinformatics online.
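The memory saving of a symmetric distance matrix comes from storing only the n(n-1)/2 entries above the diagonal. The sketch below illustrates that idea in plain Python. It does not use or reproduce the pyRMSD API: the class `CondensedSymmetricMatrix` and the helper `rmsd_no_fit` are hypothetical names, and the RMSD is computed without the superposition step that pyRMSD performs.

```python
import math


class CondensedSymmetricMatrix:
    """Symmetric matrix with zero diagonal that stores only the upper triangle."""

    def __init__(self, n):
        self.n = n
        self.data = [0.0] * (n * (n - 1) // 2)

    def _index(self, i, j):
        if i > j:
            i, j = j, i
        # Rows 0..i-1 hold (n-1) + (n-2) + ... + (n-i) entries before row i.
        return i * self.n - i * (i + 1) // 2 + (j - i - 1)

    def __getitem__(self, ij):
        i, j = ij
        return 0.0 if i == j else self.data[self._index(i, j)]

    def __setitem__(self, ij, value):
        i, j = ij
        if i == j:
            raise ValueError("diagonal entries are fixed at zero")
        self.data[self._index(i, j)] = value


def rmsd_no_fit(a, b):
    """RMSD between two conformations given as lists of (x, y, z), without superposition."""
    total = sum((p - q) ** 2 for pa, pb in zip(a, b) for p, q in zip(pa, pb))
    return math.sqrt(total / len(a))


# Three hypothetical two-atom conformations.
confs = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)],
    [(0.0, 3.0, 0.0), (1.0, 3.0, 0.0)],
]
m = CondensedSymmetricMatrix(len(confs))
for i in range(len(confs)):
    for j in range(i + 1, len(confs)):
        m[i, j] = rmsd_no_fit(confs[i], confs[j])
print(m[0, 1], m[1, 0], len(m.data))
```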
Parallel computing of a digital hologram and particle searching for microdigital-holographic particle-tracking velocimetry

DOE Office of Scientific and Technical Information (OSTI.GOV)

Satake, Shin-ichi; Kanamori, Hiroyuki; Kunugi, Tomoaki

2007-02-01

We have developed a parallel algorithm for microdigital-holographic particle-tracking velocimetry. The algorithm is used in (1) numerical reconstruction of a particle image computer using a digital hologram, and (2) searching for particles. The numerical reconstruction from the digital hologram makes use of the Fresnel diffraction equation and the FFT (fast Fourier transform), whereas the particle search algorithm looks for local maximum graduation in a reconstruction field represented by a 3D matrix. To achieve high performance computing for both calculations (reconstruction and particle search), two memory partitions are allocated to the 3D matrix. In this matrix, the reconstruction part consists of horizontally placed 2D memory partitions on the x-y plane for the FFT, whereas the particle search part consists of vertically placed 2D memory partitions set along the z axes. Consequently, the scalability can be obtained for the proportion of processor elements, where the benchmarks are carried out for parallel computation by a SGI Altix machine.
Research on Ratio of Dosage of Drugs in Traditional Chinese Prescriptions by Data Mining.

PubMed

Yu, Xing-Wen; Gong, Qing-Yue; Hu, Kong-Fa; Mao, Wen-Jing; Zhang, Wei-Ming

2017-01-01

Maximizing the effectiveness of prescriptions and minimizing adverse effects of drugs is a key component of the health care of patients. In the practice of traditional Chinese medicine (TCM), it is important to provide clinicians with a reference for dosing of prescribed drugs. The traditional Cheng-Church biclustering algorithm (CC) is optimized, and the TCM prescription dose data are analyzed using the optimized algorithm. Based on an analysis of 212 prescriptions related to TCM treatment of kidney diseases, the study generated 87 prescription dose quantum matrices, and each sub-matrix represents the referential value of the doses of drugs in different recipes. The optimized CC algorithm can effectively eliminate the interference of zeros in the original dose matrix of TCM prescriptions and avoid zeros appearing in the output sub-matrix. This makes it possible to effectively analyze the reference value of drugs in different prescriptions related to kidney diseases, so as to provide a valuable reference for clinicians to use drugs rationally.
Complex Langevin simulation of a random matrix model at nonzero chemical potential

DOE PAGES

Bloch, Jacques; Glesaaen, Jonas; Verbaarschot, Jacobus J. M.; ...

2018-03-06

In this study we test the complex Langevin algorithm for numerical simulations of a random matrix model of QCD with a first order phase transition to a phase of finite baryon density. We observe that a naive implementation of the algorithm leads to phase quenched results, which were also derived analytically in this article. We test several fixes for the convergence issues of the algorithm, in particular the method of gauge cooling, the shifted representation, the deformation technique and reweighted complex Langevin, but only the latter method reproduces the correct analytical results in the region where the quark mass is inside the domain of the eigenvalues. In order to shed more light on the issues of the methods we also apply them to a similar random matrix model with a milder sign problem and no phase transition, and in that case gauge cooling solves the convergence problems as was shown before in the literature.
Floating-point scaling technique for sources separation automatic gain control

NASA Astrophysics Data System (ADS)

Fermas, A.; Belouchrani, A.; Ait-Mohamed, O.

2012-07-01

Based on the floating-point representation and taking advantage of scaling factor indetermination in blind source separation (BSS) processing, we propose a scaling technique applied to the separation matrix, to avoid the saturation or the weakness in the recovered source signals. This technique performs an automatic gain control in an on-line BSS environment. We demonstrate the effectiveness of this technique by using the implementation of a division-free BSS algorithm with two inputs, two outputs. The proposed technique is computationally cheaper and efficient for a hardware implementation compared to the Euclidean normalisation.
CRRES microelectronic test chip orbital data. II

NASA Technical Reports Server (NTRS)

Soli, G. A.; Blaes, B. R.; Buehler, M. G.; Ray, K.; Lin, Y.-S.

1992-01-01

Data from a MOSFET matrix on two JPL (CIT Jet Propulsion Laboratory) CRRES (Combined Release and Radiation Effects Satellite) chips, each behind different amounts of shielding, are presented. Space damage factors are nearly identical to ground test values for pMOSFETs. The results from neighboring rows of MOSFETs show similar radiation degradation. The SRD (Space Radiation Dosimeter) is used to measure the total dose accumulated by the JPL chips. A parameter extraction algorithm that does not underestimate threshold voltage shifts is used. Temperature effects are removed from the MOSFET data.
Iterative matrix algorithm for high precision temperature and force decoupling in multi-parameter FBG sensing.

PubMed

Hopf, Barbara; Dutz, Franz J; Bosselmann, Thomas; Willsch, Michael; Koch, Alexander W; Roths, Johannes

2018-04-30

A new iterative matrix algorithm has been applied to improve the precision of temperature and force decoupling in multi-parameter FBG sensing. For the first time, this evaluation technique allows the integration of nonlinearities in the sensor's temperature characteristic and the temperature dependence of the sensor's force sensitivity. Applied to a sensor cable consisting of two FBGs in fibers with 80 µm and 125 µm cladding diameter installed in a 7 m-long coiled PEEK capillary, this technique significantly reduced the uncertainties in friction-compensated temperature measurements. In the presence of high friction-induced forces of up to 1.6 N the uncertainties in temperature evaluation were reduced from several degrees Celsius if using a standard linear matrix approach to less than 0.5°C if using the iterative matrix approach in an extended temperature range between -35°C and 125°C.
Analytic Confusion Matrix Bounds for Fault Detection and Isolation Using a Sum-of-Squared-Residuals Approach

NASA Technical Reports Server (NTRS)

Simon, Dan; Simon, Donald L.

2009-01-01

Given a system which can fail in 1 of n different ways, a fault detection and isolation (FDI) algorithm uses sensor data in order to determine which fault is the most likely to have occurred. The effectiveness of an FDI algorithm can be quantified by a confusion matrix, which indicates the probability that each fault is isolated given that each fault has occurred. Confusion matrices are often generated with simulation data, particularly for complex systems. In this paper we perform FDI using sums of squares of sensor residuals (SSRs). We assume that the sensor residuals are Gaussian, which gives the SSRs a chi-squared distribution. We then generate analytic lower and upper bounds on the confusion matrix elements. This allows for the generation of optimal sensor sets without numerical simulations. The confusion matrix bounds are verified with simulated aircraft engine data.
A fully redundant double difference algorithm for obtaining minimum variance estimates from GPS observations

NASA Technical Reports Server (NTRS)

Melbourne, William G.

1986-01-01

In double differencing a regression system obtained from concurrent Global Positioning System (GPS) observation sequences, one either undersamples the system to avoid introducing colored measurement statistics, or one fully samples the system incurring the resulting non-diagonal covariance matrix for the differenced measurement errors. A suboptimal estimation result will be obtained in the undersampling case and will also be obtained in the fully sampled case unless the color noise statistics are taken into account. The latter approach requires a least squares weighting matrix derived from inversion of a non-diagonal covariance matrix for the differenced measurement errors instead of inversion of the customary diagonal one associated with white noise processes. Presented is the so-called fully redundant double differencing algorithm for generating a weighted double differenced regression system that yields equivalent estimation results, but features for certain cases a diagonal weighting matrix even though the differenced measurement error statistics are highly colored.
A comparison of companion matrix methods to find roots of a trigonometric polynomial

NASA Astrophysics Data System (ADS)

Boyd, John P.

2013-08-01

A trigonometric polynomial is a truncated Fourier series of the form $f_N(t) \equiv \sum_{j=0}^{N} a_j \cos(jt) + \sum_{j=1}^{N} b_j \sin(jt)$. It has been previously shown by the author that zeros of such a polynomial can be computed as the eigenvalues of a companion matrix with elements which are complex valued combinations of the Fourier coefficients, the "CCM" method. However, previous work provided no examples, so one goal of this new work is to experimentally test the CCM method. A second goal is to introduce a new alternative, the elimination/Chebyshev algorithm, and experimentally compare it with the CCM scheme. The elimination/Chebyshev matrix (ECM) algorithm yields a companion matrix with real-valued elements, albeit at the price of usefulness only for real roots. The new elimination scheme first converts the trigonometric rootfinding problem to a pair of polynomial equations in the variables $(c, s)$ where $c \equiv \cos(t)$ and $s \equiv \sin(t)$. The elimination method next reduces the system to a single univariate polynomial $P(c)$. We show that this same polynomial is the resultant of the system and is also a generator of the Groebner basis with lexicographic ordering for the system. Both methods give very high numerical accuracy for real-valued roots, typically at least 11 decimal places in Matlab/IEEE 754 16 digit floating point arithmetic. The CCM algorithm is typically one or two decimal places more accurate, though these differences disappear if the roots are "Newton-polished" by a single Newton's iteration. The complex-valued matrix is accurate for complex-valued roots, too, though accuracy decreases with the magnitude of the imaginary part of the root. The cost of both methods scales as $O(N^3)$ floating point operations. 
In spite of intimate connections of the elimination/Chebyshev scheme to two well-established technologies for solving systems of equations, resultants and Groebner bases, and the advantages of using only real-valued arithmetic to obtain a companion matrix with real-valued elements, the ECM algorithm is noticeably inferior to the complex-valued companion matrix in simplicity, ease of programming, and accuracy.
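The complex-valued route can be illustrated with a short sketch: substituting z = exp(it) turns the trigonometric polynomial into an ordinary polynomial of degree 2N in z, whose roots `numpy.roots` finds as eigenvalues of a companion matrix. This is a generic version of the idea and not the specific CCM matrix of the paper; the function name `trig_roots` and the tolerance used to select real roots are assumptions.

```python
import numpy as np


def trig_roots(a, b, imag_tol=1e-8):
    """Real roots in (-pi, pi] of sum_j a[j] cos(jt) + sum_j b[j-1] sin(jt).

    a = [a_0, ..., a_N], b = [b_1, ..., b_N]; a_N and b_N must not both be zero.
    """
    a = np.asarray(a, dtype=complex)
    b = np.asarray(b, dtype=complex)
    N = len(a) - 1
    # Coefficients h[k] of z**k in z**N * f, using cos(jt) = (z**j + z**-j) / 2
    # and sin(jt) = (z**j - z**-j) / (2i).
    h = np.zeros(2 * N + 1, dtype=complex)
    h[N] = a[0]
    for j in range(1, N + 1):
        h[N + j] = (a[j] - 1j * b[j - 1]) / 2
        h[N - j] = (a[j] + 1j * b[j - 1]) / 2
    z = np.roots(h[::-1])          # numpy.roots expects the highest degree first
    t = -1j * np.log(z)            # t = arg(z) - i * ln|z|
    real = t[np.abs(t.imag) < imag_tol].real
    return np.sort(real)


# f(t) = cos(t) - 0.5 has the real roots t = -pi/3 and t = +pi/3.
roots = trig_roots([-0.5, 1.0], [0.0])
print(roots)
```

Roots with |z| different from 1 correspond to complex t and are discarded here; keeping them only requires dropping the final filter.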
Explaining and Controlling for the Psychometric Properties of Computer-Generated Figural Matrix Items

ERIC Educational Resources Information Center

Freund, Philipp Alexander; Hofer, Stefan; Holling, Heinz

2008-01-01

Figural matrix items are a popular task type for assessing general intelligence (Spearman's g). Items of this kind can be constructed rationally, allowing the implementation of computerized generation algorithms. In this study, the influence of different task parameters on the degree of difficulty in matrix items was investigated. A sample of N =…
On Rank and Nullity

ERIC Educational Resources Information Center

Dobbs, David E.

2012-01-01

This note explains how Emil Artin's proof that row rank equals column rank for a matrix with entries in a field leads naturally to the formula for the nullity of a matrix and also to an algorithm for solving any system of linear equations in any number of variables. This material could be used in any course on matrix theory or linear algebra.
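As a concrete companion to the rank and nullity formula, the sketch below row-reduces a matrix over the rationals, reads the rank off the pivot columns, and builds a null-space basis from the free columns, so that rank plus nullity equals the number of columns. It is ordinary Gauss-Jordan elimination, not a transcription of Artin's argument; the function names are assumptions.

```python
from fractions import Fraction


def rref(rows):
    """Reduced row echelon form over the rationals; returns (matrix, pivot columns)."""
    A = [[Fraction(x) for x in row] for row in rows]
    n_rows, n_cols = len(A), len(A[0])
    pivots = []
    r = 0
    for c in range(n_cols):
        if r == n_rows:
            break
        # Find a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, n_rows) if A[i][c] != 0), None)
        if p is None:
            continue
        A[r], A[p] = A[p], A[r]
        pivot = A[r][c]
        A[r] = [x / pivot for x in A[r]]
        for i in range(n_rows):
            if i != r and A[i][c] != 0:
                factor = A[i][c]
                A[i] = [x - factor * y for x, y in zip(A[i], A[r])]
        pivots.append(c)
        r += 1
    return A, pivots


def null_space_basis(rows):
    """One basis vector per free column, so len(basis) is the nullity."""
    R, pivots = rref(rows)
    n_cols = len(R[0])
    basis = []
    for free in (c for c in range(n_cols) if c not in pivots):
        v = [Fraction(0)] * n_cols
        v[free] = Fraction(1)
        for row_idx, p in enumerate(pivots):
            v[p] = -R[row_idx][free]
        basis.append(v)
    return basis


A = [[1, 2, 3], [2, 4, 6], [1, 0, 1]]
_, pivots = rref(A)
rank = len(pivots)
basis = null_space_basis(A)
nullity = len(basis)
print(rank, nullity, basis)
```

Solving a general system Ax = b follows the same pattern: row-reduce the augmented matrix, take one particular solution, and add any combination of the null-space basis vectors.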
Three list scheduling temporal partitioning algorithm of time space characteristic analysis and compare for dynamic reconfigurable computing

NASA Astrophysics Data System (ADS)

Chen, Naijin

2013-03-01

The Level Based Partitioning (LBP), Cluster Based Partitioning (CBP) and Enhance Static List (ESL) temporal partitioning algorithms, based on the adjacency matrix and the adjacency table, are designed and implemented in this paper. The partitioning time and memory occupation of the three algorithms are also compared. Experimental results show that the LBP algorithm has the shortest partitioning time and better parallelism. As far as memory occupation and partitioning time are concerned, the algorithms based on the adjacency table need less partitioning time and less memory.
Quantum algorithm for linear systems of equations.

PubMed

Harrow, Aram W; Hassidim, Avinatan; Lloyd, Seth

2009-10-09

Solving linear systems of equations is a common problem that arises both on its own and as a subroutine in more complex problems: given a matrix $A$ and a vector $\vec{b}$, find a vector $\vec{x}$ such that $A\vec{x} = \vec{b}$. We consider the case where one does not need to know the solution $\vec{x}$ itself, but rather an approximation of the expectation value of some operator associated with $\vec{x}$, e.g., $\vec{x}^{\dagger} M \vec{x}$ for some matrix $M$. In this case, when $A$ is sparse, $N \times N$ and has condition number $\kappa$, the fastest known classical algorithms can find $\vec{x}$ and estimate $\vec{x}^{\dagger} M \vec{x}$ in time scaling roughly as $N\sqrt{\kappa}$. Here, we exhibit a quantum algorithm for estimating $\vec{x}^{\dagger} M \vec{x}$ whose runtime is a polynomial of $\log(N)$ and $\kappa$. Indeed, for small values of $\kappa$ [i.e., $\mathrm{poly}\log(N)$], we prove (using some common complexity-theoretic assumptions) that any classical algorithm for this problem generically requires exponentially more time than our quantum algorithm.
A fast identification algorithm for Box-Cox transformation based radial basis function neural network.

PubMed

Hong, Xia

2006-07-01

In this letter, a Box-Cox transformation-based radial basis function (RBF) neural network is introduced using the RBF neural network to represent the transformed system output. Initially a fixed and moderate sized RBF model base is derived based on a rank revealing orthogonal matrix triangularization (QR decomposition). Then a new fast identification algorithm is introduced using Gauss-Newton algorithm to derive the required Box-Cox transformation, based on a maximum likelihood estimator. The main contribution of this letter is to explore the special structure of the proposed RBF neural network for computational efficiency by utilizing the inverse of matrix block decomposition lemma. Finally, the Box-Cox transformation-based RBF neural network, with good generalization and sparsity, is identified based on the derived optimal Box-Cox transformation and a D-optimality-based orthogonal forward regression algorithm. The proposed algorithm and its efficacy are demonstrated with an illustrative example in comparison with support vector machine regression.
Computational complexities and storage requirements of some Riccati equation solvers

NASA Technical Reports Server (NTRS)

Utku, Senol; Garba, John A.; Ramesh, A. V.

1989-01-01

The linear optimal control problem of an nth-order time-invariant dynamic system with a quadratic performance functional is usually solved by the Hamilton-Jacobi approach. This leads to the solution of the differential matrix Riccati equation with a terminal condition. The bulk of the computation for the optimal control problem is related to the solution of this equation. There are various algorithms in the literature for solving the matrix Riccati equation. However, computational complexities and storage requirements as a function of numbers of state variables, control variables, and sensors are not available for all these algorithms. In this work, the computational complexities and storage requirements for some of these algorithms are given. These expressions show the immensity of the computational requirements of the algorithms in solving the Riccati equation for large-order systems such as the control of highly flexible space structures. The expressions are also needed to compute the speedup and efficiency of any implementation of these algorithms on concurrent machines.
An order (n) algorithm for the dynamics simulation of robotic systems

NASA Technical Reports Server (NTRS)

Chun, H. M.; Turner, J. D.; Frisch, Harold P.

1989-01-01

The formulation of an Order (n) algorithm for DISCOS (Dynamics Interaction Simulation of Controls and Structures), which is an industry-standard software package for simulation and analysis of flexible multibody systems, is presented. For systems involving many bodies, the new Order (n) version of DISCOS is much faster than the current version. Results of the experimental validation of the dynamics software are also presented. The experiment is carried out on a seven-joint robot arm at NASA's Goddard Space Flight Center. The algorithm used in the current version of DISCOS requires the inverse of a matrix whose dimension is equal to the number of constraints in the system. Generally, the number of constraints in a system is roughly proportional to the number of bodies in the system, and matrix inversion requires $O(p^3)$ operations, where $p$ is the dimension of the matrix. The current version of DISCOS is therefore considered an Order ($n^3$) algorithm. In contrast, the Order (n) algorithm requires inversion of matrices which are small, and the number of matrices to be inverted increases only linearly with the number of bodies. The newly-developed Order (n) DISCOS is currently capable of handling chain and tree topologies as well as multiple closed loops. Continuing development will extend the capability of the software to deal with typical robotics applications such as put-and-place, multi-arm hand-off and surface sliding.
Numerical implementation of the S-matrix algorithm for modeling of relief diffraction gratings

NASA Astrophysics Data System (ADS)

Yaremchuk, Iryna; Tamulevičius, Tomas; Fitio, Volodymyr; Gražulevičiūte, Ieva; Bobitski, Yaroslav; Tamulevičius, Sigitas

2013-11-01

A new numerical implementation is developed to calculate the diffraction efficiency of relief diffraction gratings. In the new formulation, vectors containing the expansion coefficients of electric and magnetic fields on boundaries of the grating layer are expressed by additional constants. An S-matrix algorithm has been systematically described in detail and adapted to a simple matrix form. This implementation is suitable for the study of optical characteristics of periodic structures by using modern object-oriented programming languages and different standard mathematical software. The modeling program has been developed on the basis of this numerical implementation and tested by comparison with other commercially available programs and experimental data. Numerical examples are given to show the usefulness of the new implementation.
Dual signal subspace projection (DSSP): a novel algorithm for removing large interference in biomagnetic measurements

NASA Astrophysics Data System (ADS)

Sekihara, Kensuke; Kawabata, Yuya; Ushio, Shuta; Sumiya, Satoshi; Kawabata, Shigenori; Adachi, Yoshiaki; Nagarajan, Srikantan S.

2016-06-01

Objective. In functional electrophysiological imaging, signals are often contaminated by interference that can be of considerable magnitude compared to the signals of interest. This paper proposes a novel algorithm for removing such interferences that does not require separate noise measurements. Approach. The algorithm is based on a dual definition of the signal subspace in the spatial- and time-domains. Since the algorithm makes use of this duality, it is named the dual signal subspace projection (DSSP). The DSSP algorithm first projects the columns of the measured data matrix onto the inside and outside of the spatial-domain signal subspace, creating a set of two preprocessed data matrices. The intersection of the row spans of these two matrices is estimated as the time-domain interference subspace. The original data matrix is projected onto the subspace that is orthogonal to this interference subspace. Main results. The DSSP algorithm is validated by using the computer simulation, and using two sets of real biomagnetic data: spinal cord evoked field data measured from a healthy volunteer and magnetoencephalography data from a patient with a vagus nerve stimulator. Significance. The proposed DSSP algorithm is effective for removing overlapped interference in a wide variety of biomagnetic measurements.
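The three steps in the Approach paragraph (split the data with a spatial projector, intersect the row spans of the two parts, project the intersection out in time) can be sketched in a few lines of NumPy. This is a simplified reading of the abstract and not the authors' code: the function names, the tolerances, the use of principal angles to estimate the intersection, and the toy sensor array are all assumptions. The matrix `F` stands for any matrix whose columns span the spatial-domain signal subspace.

```python
import numpy as np


def orth_rows(B, tol=1e-10):
    """Orthonormal basis (as columns) of the row space of B."""
    _, s, Vt = np.linalg.svd(B, full_matrices=False)
    if s.size == 0 or s[0] == 0:
        return np.zeros((B.shape[1], 0))
    rank = int(np.sum(s > tol * s[0]))
    return Vt[:rank].T


def dssp(B, F, angle_tol=1e-6):
    """Remove time courses shared by the in-subspace and out-of-subspace data."""
    U, s, _ = np.linalg.svd(F, full_matrices=False)
    Us = U[:, s > 1e-10 * s[0]]
    B_in = Us @ (Us.T @ B)         # columns projected inside the signal subspace
    B_out = B - B_in               # columns projected outside of it
    V_in, V_out = orth_rows(B_in), orth_rows(B_out)
    if V_in.shape[1] == 0 or V_out.shape[1] == 0:
        return B.copy()
    # Cosines of the principal angles between the two row spans; values close
    # to 1 mark directions that lie in both spans (the intersection).
    Q, c, _ = np.linalg.svd(V_in.T @ V_out, full_matrices=False)
    k = int(np.sum(c > 1 - angle_tol))
    Pi = V_in @ Q[:, :k]           # time-domain interference basis
    return B - (B @ Pi) @ Pi.T


# Toy data: 6 sensors, 50 samples, one source and one strong interferer whose
# time courses are orthogonal by construction.
T = 50
t = np.arange(T)
s_t = np.sin(2 * np.pi * 3 * t / T)
i_t = np.cos(2 * np.pi * 5 * t / T)
F = np.eye(6)[:, :2]                             # signal subspace: sensors 0 and 1
f = F @ np.array([1.0, -0.5])                    # spatial pattern of the source
g = np.array([1.0, 0.5, 2.0, -1.0, 0.5, 3.0])    # interference leaks everywhere
B_signal = np.outer(f, s_t)
B = B_signal + 10.0 * np.outer(g, i_t)
B_clean = dssp(B, F)
print(np.max(np.abs(B_clean - B_signal)))
```

In this toy case the recovery is essentially exact because the source and interference time courses are orthogonal; when they overlap, projecting out the interference subspace also removes the overlapping part of the signal.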
Recursive mass matrix factorization and inversion: An operator approach to open- and closed-chain multibody dynamics

NASA Technical Reports Server (NTRS)

Rodriguez, G.; Kreutz, K.

1988-01-01

This report advances a linear operator approach for analyzing the dynamics of systems of joint-connected rigid bodies. It is established that the mass matrix $M$ for such a system can be factored as $M = (I + H\phi L)\,D\,(I + H\phi L)^T$. This yields an immediate inversion $M^{-1} = (I - H\psi L)^T D^{-1} (I - H\psi L)$, where $H$ and $\phi$ are given by known link geometric parameters, and $L$, $\psi$ and $D$ are obtained recursively by a spatial discrete-step Kalman filter and by the corresponding Riccati equation associated with this filter. The factors $(I + H\phi L)$ and $(I - H\psi L)$ are lower triangular matrices which are inverses of each other, and $D$ is a diagonal matrix. This factorization and inversion of the mass matrix leads to recursive algorithms for forward dynamics based on spatially recursive filtering and smoothing. The primary motivation for advancing the operator approach is to provide a better means to formulate, analyze and understand spatial recursions in multibody dynamics. This is achieved because the linear operator notation allows manipulation of the equations of motion using a very high-level analytical framework (a spatial operator algebra) that is easy to understand and use. Detailed lower-level recursive algorithms can readily be obtained for inspection from the expressions involving spatial operators. The report consists of two main sections. In Part 1, the problem of serial chain manipulators is analyzed and solved. Extensions to a closed-chain system formed by multiple manipulators moving a common task object are contained in Part 2. To retain ease of exposition in the report, only these two types of multibody systems are considered. However, the same methods can be easily applied to arbitrary multibody systems formed by a collection of joint-connected rigid bodies.
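The structure of the factorization, a lower triangular factor times a diagonal matrix times the transposed factor, with the inverse assembled from the inverse factor, can be checked numerically on any symmetric positive definite matrix. The sketch below uses a dense Cholesky-based decomposition as a stand-in: it is not the recursive spatial operator algorithm of the report, and the 3 x 3 matrix is a hypothetical stand-in for a mass matrix.

```python
import numpy as np


def lower_diag_factor(M):
    """Return (A, D) with M = A D A^T, A unit lower triangular and D diagonal."""
    C = np.linalg.cholesky(M)      # M = C C^T with C lower triangular
    d = np.diag(C)
    A = C / d                      # divide column j by C[j, j] to get a unit diagonal
    D = np.diag(d ** 2)
    return A, D


# Hypothetical symmetric positive definite matrix.
M = np.array([[4.0, 2.0, 0.0],
              [2.0, 5.0, 3.0],
              [0.0, 3.0, 6.0]])
A, D = lower_diag_factor(M)
A_inv = np.linalg.inv(A)           # also lower triangular; plays the role of the inverse factor
M_inv = A_inv.T @ np.diag(1.0 / np.diag(D)) @ A_inv
print(np.round(M_inv @ M, 12))
```

The dense route costs on the order of n cubed operations; the point of the recursive operator formulation in the report is to obtain the same factors from link-by-link recursions instead.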
An Efficient Distributed Compressed Sensing Algorithm for Decentralized Sensor Network.

PubMed

Liu, Jing; Huang, Kaiyu; Zhang, Guoxian

2017-04-20

We consider the joint sparsity Model 1 (JSM-1) in a decentralized scenario, where a number of sensors are connected through a network and there is no fusion center. A novel algorithm, named distributed compact sensing matrix pursuit (DCSMP), is proposed to exploit the computational and communication capabilities of the sensor nodes. In contrast to the conventional distributed compressed sensing algorithms adopting a random sensing matrix, the proposed algorithm focuses on the deterministic sensing matrices built directly on the real acquisition systems. The proposed DCSMP algorithm can be divided into two independent parts, the common and innovation support set estimation processes. The goal of the common support set estimation process is to obtain an estimated common support set by fusing the candidate support set information from an individual node and its neighboring nodes. In the following innovation support set estimation process, the measurement vector is projected into a subspace that is perpendicular to the subspace spanned by the columns indexed by the estimated common support set, to remove the impact of the estimated common support set. We can then search the innovation support set using an orthogonal matching pursuit (OMP) algorithm based on the projected measurement vector and projected sensing matrix. In the proposed DCSMP algorithm, the process of estimating the common component/support set is decoupled from that of estimating the innovation component/support set. Thus, the inaccurately estimated common support set will have no impact on estimating the innovation support set. It is proven that under the condition that the estimated common support set contains the true common support set, the proposed algorithm can find the true innovation set correctly. Moreover, since the innovation support set estimation process is independent of the common support set estimation process, there is no requirement for the cardinality of both sets; thus, the proposed DCSMP algorithm is capable of tackling the unknown sparsity problem successfully.
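The projection-then-OMP step described in the abstract above can be sketched as follows. This is a minimal single-node illustration, not the DCSMP algorithm itself: the function names `omp` and `innovation_support`, the fixed number of innovation atoms `k`, and the toy sensing matrix are all assumptions, and the network fusion of the common support is omitted.

```python
import numpy as np

def omp(A, y, k):
    """Greedy orthogonal matching pursuit: choose k columns of A to explain y."""
    support = []
    residual = y.copy()
    for _ in range(k):
        # Pick the column most correlated with the current residual.
        scores = np.abs(A.T @ residual)
        if support:
            scores[support] = 0.0  # never pick the same column twice
        support.append(int(np.argmax(scores)))
        # Refit on all chosen columns, then update the residual.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    return sorted(support)

def innovation_support(A, y, common, k):
    """Project out the columns indexed by `common`, then run OMP."""
    Ac = A[:, common]
    # P projects onto the orthogonal complement of span(Ac).
    P = np.eye(A.shape[0]) - Ac @ np.linalg.pinv(Ac)
    return omp(P @ A, P @ y, k)

# Toy 4x5 sensing matrix (hypothetical): four unit vectors plus one dense column.
A = np.column_stack([np.eye(4), np.full(4, 0.5)])
x = np.array([2.0, 0.0, 0.0, 0.0, 3.0])  # common atom 0, innovation atom 4
y = A @ x

found = innovation_support(A, y, common=[0], k=1)
```

In this toy case the projection zeroes out column 0, so the search over the projected matrix returns `[4]`, the innovation atom.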
UTOPIAN: user-driven topic modeling based on interactive nonnegative matrix factorization.

PubMed

Choo, Jaegul; Lee, Changhyun; Reddy, Chandan K; Park, Haesun

2013-12-01

Topic modeling has been widely used for analyzing text document collections. Recently, there have been significant advancements in various topic modeling techniques, particularly in the form of probabilistic graphical modeling. State-of-the-art techniques such as Latent Dirichlet Allocation (LDA) have been successfully applied in visual text analytics. However, most of the widely-used methods based on probabilistic modeling have drawbacks in terms of consistency from multiple runs and empirical convergence. Furthermore, due to the complexity of the formulation and the algorithm, LDA cannot easily incorporate various types of user feedback. To tackle this problem, we propose a reliable and flexible visual analytics system for topic modeling called UTOPIAN (User-driven Topic modeling based on Interactive Nonnegative Matrix Factorization). Centered around its semi-supervised formulation, UTOPIAN enables users to interact with the topic modeling method and steer the result in a user-driven manner. We demonstrate the capability of UTOPIAN via several usage scenarios with real-world document corpora such as the InfoVis/VAST paper data set and product review data sets.
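To make the underlying idea of NMF-based topic modeling concrete, the sketch below factors a tiny term-by-document matrix into nonnegative term-topic and topic-document factors using the standard multiplicative update rules. It is plain unsupervised NMF, not UTOPIAN's semi-supervised formulation; the function name `nmf`, the iteration count, the random seed, and the count matrix are assumptions for illustration.

```python
import numpy as np

def nmf(X, rank, iters=200, seed=0):
    """Approximate X (nonnegative) by W @ H with W, H >= 0,
    using multiplicative updates for the Frobenius-norm objective."""
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], rank)) + 0.1
    H = rng.random((rank, X.shape[1])) + 0.1
    eps = 1e-9  # guards against division by zero
    for _ in range(iters):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Hypothetical term-by-document counts: rows are terms, columns are documents.
X = np.array([[3, 2, 0, 0],
              [2, 3, 0, 0],
              [0, 0, 4, 1],
              [0, 0, 1, 4]], dtype=float)

W, H = nmf(X, rank=2)           # W: term-topic weights, H: topic-document weights
err = np.linalg.norm(X - W @ H)  # Frobenius reconstruction error
```

Each column of `W` can be read as a topic (a weighting over terms) and each column of `H` as the topic mixture of one document. A user-steered system would add constraints or reference terms to this objective, which is beyond this sketch.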
Using Separable Nonnegative Matrix Factorization Techniques for the Analysis of Time-Resolved Raman Spectra.

PubMed

Luce, Robert; Hildebrandt, Peter; Kuhlmann, Uwe; Liesen, Jörg

2016-09-01

The key challenge of time-resolved Raman spectroscopy is the identification of the constituent species and the analysis of the kinetics of the underlying reaction network. In this work we present an integral approach that allows for determining both the component spectra and the rate constants simultaneously from a series of vibrational spectra. It is based on an algorithm for nonnegative matrix factorization that is applied to the experimental data set following a few pre-processing steps. As a prerequisite for physically unambiguous solutions, each component spectrum must include one vibrational band that does not significantly interfere with the vibrational bands of other species. The approach is applied to synthetic "experimental" spectra derived from model systems comprising a set of species with component spectra differing with respect to their degree of spectral interferences and signal-to-noise ratios. In each case, the species involved are connected via monomolecular reaction pathways. The potential and limitations of the approach for recovering the respective rate constants and component spectra are discussed.
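The prerequisite stated in the abstract above, that each component spectrum has a band free of interference from other species, corresponds to the separability assumption in NMF. A minimal sketch of how such "pure" bands can be located uses the successive projection algorithm (SPA), a standard separable-NMF method. The abstract does not name the algorithm actually used, so SPA is substituted here as an assumption, and the function name `pure_rows` and the noise-free toy spectra are hypothetical.

```python
import numpy as np

def pure_rows(X, r):
    """Successive projection algorithm on the rows of X:
    return indices of r rows that act as pure-component bands."""
    R = X.astype(float).copy()
    picked = []
    for _ in range(r):
        # Take the row with the largest remaining norm.
        i = int(np.argmax(np.sum(R ** 2, axis=1)))
        picked.append(i)
        # Project every row onto the orthogonal complement of that row.
        u = R[i] / np.linalg.norm(R[i])
        R -= np.outer(R @ u, u)
    return picked

# Toy data: 4 wavenumbers x 2 species; rows 0 and 1 are interference-free bands.
W_true = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [0.5, 0.5],
                   [0.3, 0.7]])
# Concentration profiles of the 2 species at 3 time points (species 1 -> species 2).
H_true = np.array([[1.0, 0.6, 0.3],
                   [0.0, 0.4, 0.7]])
X = W_true @ H_true              # noise-free "measured" spectra over time

bands = pure_rows(X, 2)           # indices of the pure bands
H_est = X[bands]                  # time profiles read off at those bands
W_est = X @ np.linalg.pinv(H_est) # component spectra by least squares
```

With real spectra the rows would first need the pre-processing and scaling the abstract mentions, and noise makes the selection only approximate. Rate constants could then be fitted to the recovered time profiles in `H_est`.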
