Domain decomposition: A bridge between nature and parallel computers
NASA Technical Reports Server (NTRS)
Keyes, David E.
1992-01-01
Domain decomposition is an intuitive organizing principle for a partial differential equation (PDE) computation, both physically and architecturally. However, its significance extends beyond the readily apparent issues of geometry and discretization, on one hand, and of modular software and distributed hardware, on the other. Engineering and computer science aspects are bridged by an old but recently enriched mathematical theory that offers the subject not only unity, but also tools for analysis and generalization. Domain decomposition induces function-space and operator decompositions with valuable properties. Function-space bases and operator splittings that are not derived from domain decompositions generally lack one or more of these properties. The evolution of domain decomposition methods for elliptically dominated problems has linked two major algorithmic developments of the last 15 years: multilevel and Krylov methods. Domain decomposition methods may be considered descendants of both classes with an inheritance from each: they are nearly optimal and at the same time efficiently parallelizable. Many computationally driven application areas are ripe for these developments. A progression is made from a mathematically informal motivation for domain decomposition methods to a specific focus on fluid dynamics applications. To be introductory rather than comprehensive, simple examples are provided while convergence proofs and algorithmic details are left to the original references; however, an attempt is made to convey their most salient features, especially where this leads to algorithmic insight.
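To make the operator-splitting idea above concrete, the following toy sketch (not from the report; a minimal illustration assuming NumPy) applies a damped one-level additive Schwarz iteration with two overlapping subdomains to a 1D Poisson problem. Each pass performs independent subdomain solves on the restricted operator, which is exactly the structure that parallelizes; the coarse level, omitted here, is what makes such methods nearly optimal.

```python
import numpy as np

def poisson_matrix(n, h):
    """3-point Laplacian with homogeneous Dirichlet ends."""
    return (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

n, h = 99, 1.0 / 100
A = poisson_matrix(n, h)
b = np.ones(n)                                   # f(x) = 1
overlap = 10                                     # generous overlap aids convergence
dom = [np.arange(0, n // 2 + overlap),           # two overlapping subdomains
       np.arange(n // 2 - overlap, n)]
theta = 0.5                                      # damping keeps the iteration contractive

x = np.zeros(n)
for it in range(300):
    r = b - A @ x
    if np.linalg.norm(r) < 1e-8 * np.linalg.norm(b):
        break
    dx = np.zeros(n)
    for idx in dom:                              # independent local solves (parallel)
        Ai = A[np.ix_(idx, idx)]                 # operator restricted to subdomain
        dx[idx] += np.linalg.solve(Ai, r[idx])
    x += theta * dx
print(f"stopped at iteration {it}, residual {np.linalg.norm(b - A @ x):.2e}")
```

In practice the same splitting is used as a preconditioner inside a Krylov method, which is the multilevel-Krylov synthesis the abstract describes.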
Parallel CE/SE Computations via Domain Decomposition
NASA Technical Reports Server (NTRS)
Himansu, Ananda; Jorgenson, Philip C. E.; Wang, Xiao-Yen; Chang, Sin-Chung
2000-01-01
This paper describes the parallelization strategy and achieved parallel efficiency of an explicit time-marching algorithm for solving conservation laws. The Space-Time Conservation Element and Solution Element (CE/SE) algorithm for solving the 2D and 3D Euler equations is parallelized with the aid of domain decomposition. The parallel efficiency of the resultant algorithm on a Silicon Graphics Origin 2000 parallel computer is checked.
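The paper itself lists no code, but the communication pattern typical of such explicit time-marching solvers is easy to sketch. The fragment below (an illustration assuming mpi4py and NumPy, not the CE/SE implementation) advances a 1D upwind advection scheme on a decomposed domain, exchanging one layer of ghost cells per step; this nearest-neighbor-only pattern is why explicit schemes of this kind can reach high parallel efficiency.

```python
# Run with e.g.:  mpiexec -n 4 python sketch.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

n_local = 100                        # cells owned by this rank
u = np.zeros(n_local + 2)            # +2 ghost cells
if rank == 0:
    u[1:11] = 1.0                    # square pulse initial condition
c = 0.5                              # CFL number for upwind advection

left = rank - 1 if rank > 0 else MPI.PROC_NULL
right = rank + 1 if rank < size - 1 else MPI.PROC_NULL

for step in range(200):
    # Exchange ghost cells with neighbors; communication is nearest-neighbor only.
    comm.Sendrecv(u[n_local:n_local + 1], dest=right, recvbuf=u[0:1], source=left)
    comm.Sendrecv(u[1:2], dest=left, recvbuf=u[n_local + 1:n_local + 2], source=right)
    u[1:-1] -= c * (u[1:-1] - u[:-2])    # first-order upwind update

print(rank, "mass:", u[1:-1].sum())
```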
A Domain Decomposition Parallelization of the Fast Marching Method
NASA Technical Reports Server (NTRS)
Herrmann, M.
2003-01-01
In this paper, the first domain decomposition parallelization of the Fast Marching Method for level sets is presented. Parallel speedup is demonstrated in both the optimal and non-optimal domain decomposition cases. The parallel performance of the proposed method depends strongly on separately load balancing the number of nodes on each side of the interface. A load imbalance of nodes on either side of the domain leads to an increase in communication and rollback operations. Furthermore, the amount of inter-domain communication can be reduced by aligning the inter-domain boundaries with the interface normal vectors. In the case of optimal load balancing and aligned inter-domain boundaries, the proposed parallel FMM algorithm is highly efficient, reaching efficiency factors of up to 0.98. Future work will focus on extending the proposed parallel algorithm to higher-order accuracy. Also, to further enhance parallel performance, the coupling of the domain decomposition parallelization to the G₀-based parallelization will be investigated.
A Parallel Algorithm for Contact in a Finite Element Hydrocode
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pierce, Timothy G.
A parallel algorithm is developed for contact/impact of multiple three dimensional bodies undergoing large deformation. As time progresses the relative positions of contact between the multiple bodies change as collision and sliding occur. The parallel algorithm is capable of tracking these changes and enforcing an impenetrability constraint and momentum transfer across the surfaces in contact. Portions of the various surfaces of the bodies are assigned to the processors of a distributed-memory parallel machine in an arbitrary fashion, known as the primary decomposition. A secondary, dynamic decomposition is utilized to bring opposing sections of the contacting surfaces together on the same processors, so that opposing forces may be balanced and the resultant deformation of the bodies calculated. The secondary decomposition is accomplished and updated using only local communication with a limited subset of neighbor processors. Each processor represents both a domain of the primary decomposition and a domain of the secondary, or contact, decomposition. Thus each processor has four sets of neighbor processors: (a) those processors which represent regions adjacent to it in the primary decomposition, (b) those processors which represent regions adjacent to it in the contact decomposition, (c) those processors which send it the data from which it constructs its contact domain, and (d) those processors to which it sends its primary domain data, from which they construct their contact domains. The latter three of these neighbor sets change dynamically as the simulation progresses. By constraining all communication to these sets of neighbors, all global communication, with its attendant nonscalable performance, is avoided. A set of tests is provided to measure the degree of scalability achieved by this algorithm on up to 1024 processors. Issues related to the operating system of the test platform which lead to some degradation of the results are analyzed. This algorithm has been implemented as the contact capability of the ALE3D multiphysics code, and is currently in production use.
Parallelization of PANDA discrete ordinates code using spatial decomposition
DOE Office of Scientific and Technical Information (OSTI.GOV)
Humbert, P.
2006-07-01
We present the parallel method, based on spatial domain decomposition, implemented in the 2D and 3D versions of the discrete ordinates code PANDA. The spatial mesh is orthogonal and the spatial domain decomposition is Cartesian. For 3D problems a 3D Cartesian domain topology is created and the parallel method is based on a domain diagonal-plane ordered sweep algorithm. The parallel efficiency of the method is improved by pipelining directions and octants. The implementation of the algorithm is straightforward using MPI blocking point-to-point communications. The efficiency of the method is illustrated by an application to the 3D-Ext C5G7 benchmark of the OECD/NEA.
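The diagonal-plane ordering is simple to express. The sketch below (hypothetical, not PANDA source) groups subdomains of a 3D Cartesian topology into wavefronts on which sweep work is mutually independent for an octant with positive increments; pipelining further octants and angular directions fills the ramp-up and ramp-down stages.

```python
import itertools

def sweep_schedule(px, py, pz):
    """Group subdomain coordinates into concurrent wavefronts i + j + k = const."""
    planes = {}
    for i, j, k in itertools.product(range(px), range(py), range(pz)):
        planes.setdefault(i + j + k, []).append((i, j, k))
    return [planes[d] for d in sorted(planes)]

for stage, group in enumerate(sweep_schedule(3, 3, 2)):
    print(f"stage {stage}: {group}")   # subdomains in a stage can run concurrently
```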
Dynamic load balancing algorithm for molecular dynamics based on Voronoi cells domain decompositions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fattebert, J.-L.; Richards, D.F.; Glosli, J.N.
2012-12-01
We present a new algorithm for automatic parallel load balancing in classical molecular dynamics. It assumes a spatial domain decomposition of particles into Voronoi cells. It is a gradient method which attempts to minimize a cost function by displacing Voronoi sites associated with each processor/sub-domain along steepest descent directions. Excellent load balance has been obtained for quasi-2D and 3D practical applications, with up to 440·10⁶ particles on 65,536 MPI tasks.
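As a rough illustration of the idea (a toy heuristic with invented parameters, not the authors' gradient of the actual cost function), the sketch below assigns particles to the nearest of a set of sites and then drifts each site toward more heavily loaded sites, shrinking overloaded Voronoi cells:

```python
import numpy as np

rng = np.random.default_rng(0)
particles = rng.normal(0.5, 0.15, size=(20000, 2))   # clustered, hence imbalanced
sites = rng.uniform(0, 1, size=(16, 2))              # one site per "processor"
eta = 0.05                                           # step size (invented)

for it in range(60):
    owner = np.argmin(((particles[:, None, :] - sites[None, :, :])**2).sum(-1), axis=1)
    load = np.bincount(owner, minlength=len(sites)).astype(float)
    imbalance = load.max() / load.mean()
    # Drift each site toward sites carrying more work than itself.
    for i in range(len(sites)):
        d = sites - sites[i]
        dist = np.linalg.norm(d, axis=1) + 1e-12
        w = (load - load[i]) / load.mean()
        sites[i] += eta * (w[:, None] * d / dist[:, None]).sum(axis=0)
print("final max/mean load imbalance:", imbalance)
```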
Scalable domain decomposition solvers for stochastic PDEs in high performance computing
Desai, Ajit; Khalil, Mohammad; Pettit, Chris; ...
2017-09-21
Stochastic spectral finite element models of practical engineering systems may involve solutions of linear systems or linearized systems for non-linear problems with billions of unknowns. For stochastic modeling, it is therefore essential to design robust, parallel and scalable algorithms that can efficiently utilize high-performance computing to tackle such large-scale systems. Domain decomposition based iterative solvers can handle such systems. Although these algorithms exhibit excellent scalability, significant algorithmic and implementational challenges exist in extending them to solve extreme-scale stochastic systems on emerging computing platforms. Intrusive polynomial chaos expansion based domain decomposition algorithms are extended here to concurrently handle high resolution in both the spatial and stochastic domains using an in-house implementation. Sparse iterative solvers with efficient preconditioners are employed to solve the resulting global and subdomain-level local systems through multi-level iterative solvers. We also use parallel sparse matrix-vector operations to reduce the floating-point operations and memory requirements. Numerical and parallel scalabilities of these algorithms are presented for the diffusion equation having a spatially varying diffusion coefficient modeled by a non-Gaussian stochastic process. Scalability of the solvers with respect to the number of random variables is also investigated.
NASA Astrophysics Data System (ADS)
Yusa, Yasunori; Okada, Hiroshi; Yamada, Tomonori; Yoshimura, Shinobu
2018-04-01
A domain decomposition method for large-scale elastic-plastic problems is proposed. The proposed method is based on a quasi-Newton method in conjunction with a balancing domain decomposition preconditioner. The use of a quasi-Newton method overcomes two problems associated with the conventional domain decomposition method based on the Newton-Raphson method: (1) it avoids a double-loop iteration algorithm, which generally has large computational complexity, and (2) it accounts for the local concentration of nonlinear deformation, which is observed in elastic-plastic problems with stress concentration. Moreover, the application of a balancing domain decomposition preconditioner ensures scalability. Using the conventional and proposed domain decomposition methods, several numerical tests, including weak scaling tests, were performed. The convergence performance of the proposed method is comparable to that of the conventional method. In particular, in elastic-plastic analysis, the proposed method exhibits better convergence performance than the conventional method.
Domain Decomposition: A Bridge between Nature and Parallel Computers
1992-09-01
B., "Domain Decomposition Algorithms for Indefinite Elliptic Problems," S"IAM Journal of S; cientific and Statistical (’omputing, Vol. 13, 1992, pp...AD-A256 575 NASA Contractor Report 189709 ICASE Report No. 92-44 ICASE DOMAIN DECOMPOSITION: A BRIDGE BETWEEN NATURE AND PARALLEL COMPUTERS DTIC dE...effectively implemented on dis- tributed memory multiprocessors. In 1990 (as reported in Ref. 38 using the tile algo- rithm), a 103,201-unknown 2D elliptic
Optimal domain decomposition strategies
NASA Technical Reports Server (NTRS)
Yoon, Yonghyun; Soni, Bharat K.
1995-01-01
The primary interest of the authors is in the area of grid generation, in particular, optimal domain decomposition about realistic configurations. A grid generation procedure with optimal blocking strategies has been developed to generate multi-block grids for a circular-to-rectangular transition duct. The focus of this study is the domain decomposition which optimizes solution algorithm/block compatibility based on geometrical complexities as well as the physical characteristics of the flow field. The progress realized in this study is summarized in this paper.
Rank-based decompositions of morphological templates.
Sussner, P; Ritter, G X
2000-01-01
Methods for matrix decomposition have found numerous applications in image processing, in particular for the problem of template decomposition. Since existing matrix decomposition techniques are mainly concerned with the linear domain, we consider it timely to investigate matrix decomposition techniques in the nonlinear domain with applications in image processing. The mathematical basis for these investigations is the new theory of rank within minimax algebra. Thus far, only minimax decompositions of rank 1 and rank 2 matrices into outer product expansions are known to the image processing community. We derive a heuristic algorithm for the decomposition of matrices having arbitrary rank.
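In minimax (max-plus) algebra the outer product of a column a and row b is the matrix with entries t_ij = a_i + b_j, so a rank-1 template is exactly a separable one. The sketch below (illustrative NumPy, not the paper's heuristic for arbitrary rank) builds the canonical rank-1 factor candidate and verifies it:

```python
import numpy as np

def minimax_rank1_factor(T):
    """Return (a, b) with T[i, j] == a[i] + b[j] if T is rank 1 in max-plus, else None."""
    a = T[:, 0].astype(float)            # canonical candidate: first column...
    b = T[0, :] - T[0, 0]                # ...and first row, shifted to b[0] = 0
    return (a, b) if np.allclose(T, a[:, None] + b[None, :]) else None

a = np.array([0.0, 2.0, 5.0])
b = np.array([1.0, -1.0, 3.0, 0.0])
T = a[:, None] + b[None, :]              # separable template => rank 1
print(minimax_rank1_factor(T))
```

Such a decomposition cuts the cost of a morphological dilation with T into two 1D dilations with a and b, mirroring the separable-filter trick of the linear domain.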
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pautz, Shawn D.; Bailey, Teresa S.
2016-11-29
Here, the efficiency of discrete ordinates transport sweeps depends on the scheduling algorithm, the domain decomposition, the problem to be solved, and the computational platform. Sweep scheduling algorithms may be categorized by their approach to several issues. In this paper we examine the strategy of domain overloading for mesh partitioning as one of the components of such algorithms. In particular, we extend the domain overloading strategy, previously defined and analyzed for structured meshes, to the general case of unstructured meshes. We also present computational results for both the structured and unstructured domain overloading cases. We find that an appropriate amount of domain overloading can greatly improve the efficiency of parallel sweeps for both structured and unstructured partitionings of the test problems examined on up to 10⁵ processor cores.
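A minimal sketch of the overloading idea (hypothetical round-robin assignment, not the paper's partitioner): each rank owns several spatial blocks, so it usually has at least one block whose upstream sweep dependencies are already satisfied.

```python
def overloaded_partition(n_blocks_x, n_blocks_y, n_ranks):
    """Map blocks to ranks with several blocks per rank (block map only;
    the sweep scheduler that exploits the overloading is a separate concern)."""
    blocks = [(i, j) for i in range(n_blocks_x) for j in range(n_blocks_y)]
    assert len(blocks) % n_ranks == 0, "choose block counts divisible by ranks"
    return {blk: k % n_ranks for k, blk in enumerate(blocks)}  # interleaves ranks spatially

owner = overloaded_partition(4, 4, 4)   # overload factor 4 in this example
for blk, r in sorted(owner.items()):
    print(blk, "-> rank", r)
```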
Ceberio, Josu; Calvo, Borja; Mendiburu, Alexander; Lozano, Jose A
2018-02-15
In the last decade, many works in combinatorial optimisation have shown that, due to the advances in multi-objective optimisation, the algorithms from this field could be used for solving single-objective problems as well. In this sense, a number of papers have proposed multi-objectivising single-objective problems in order to use multi-objective algorithms in their optimisation. In this article, we follow up this idea by presenting a methodology for multi-objectivising combinatorial optimisation problems based on elementary landscape decompositions of their objective function. Under this framework, each of the elementary landscapes obtained from the decomposition is considered as an independent objective function to optimise. In order to illustrate this general methodology, we consider four problems from different domains: the quadratic assignment problem and the linear ordering problem (permutation domain), the 0-1 unconstrained quadratic optimisation problem (binary domain), and the frequency assignment problem (integer domain). We implemented two widely known multi-objective algorithms, NSGA-II and SPEA2, and compared their performance with that of a single-objective GA. The experiments conducted on a large benchmark of instances of the four problems show that the multi-objective algorithms clearly outperform the single-objective approaches. Furthermore, a discussion on the results suggests that the multi-objective space generated by this decomposition enhances the exploration ability, thus permitting NSGA-II and SPEA2 to obtain better results in the majority of the tested instances.
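A toy sketch of the framework's mechanics (assuming a simple additive split of the objective; the paper uses true elementary landscape decompositions, which this example does not compute): each component is exposed as a separate objective, and candidate solutions are then compared by Pareto dominance.

```python
import random

def f1(x):   # component 1: disagreements between neighboring bits
    return sum(x[i] != x[i + 1] for i in range(len(x) - 1))

def f2(x):   # component 2: number of zero bits
    return x.count(0)

def dominates(u, v):
    """Pareto dominance for minimization of objective vectors."""
    return all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))

random.seed(1)
pop = [[random.randint(0, 1) for _ in range(20)] for _ in range(8)]
objs = [(f1(x), f2(x)) for x in pop]
front = [x for x, u in zip(pop, objs)
         if not any(dominates(v, u) for v in objs if v != u)]
print(len(front), "non-dominated solutions of", len(pop))
```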
Distributed-Memory Computing With the Langley Aerothermodynamic Upwind Relaxation Algorithm (LAURA)
NASA Technical Reports Server (NTRS)
Riley, Christopher J.; Cheatwood, F. McNeil
1997-01-01
The Langley Aerothermodynamic Upwind Relaxation Algorithm (LAURA), a Navier-Stokes solver, has been modified for use in a parallel, distributed-memory environment using the Message-Passing Interface (MPI) standard. A standard domain decomposition strategy is used in which the computational domain is divided into subdomains with each subdomain assigned to a processor. Performance is examined on dedicated parallel machines and a network of desktop workstations. The effect of domain decomposition and frequency of boundary updates on performance and convergence is also examined for several realistic configurations and conditions typical of large-scale computational fluid dynamic analysis.
Scalable and fast heterogeneous molecular simulation with predictive parallelization schemes
NASA Astrophysics Data System (ADS)
Guzman, Horacio V.; Junghans, Christoph; Kremer, Kurt; Stuehn, Torsten
2017-11-01
Multiscale and inhomogeneous molecular systems are challenging topics in the field of molecular simulation. In particular, modeling biological systems in the context of multiscale simulations and exploring material properties are driving a permanent development of new simulation methods and optimization algorithms. In computational terms, those methods require parallelization schemes that make productive use of computational resources for each simulation and from its genesis. Here, we introduce the heterogeneous domain decomposition approach, which is a combination of a heterogeneity-sensitive spatial domain decomposition with an a priori rearrangement of subdomain walls. Within this approach, theoretical models and scaling laws for the force computation time are proposed and studied as functions of the number of particles and the spatial resolution ratio. We also demonstrate the capabilities of the new approach by comparing it to both static domain decomposition algorithms and dynamic load-balancing schemes. Specifically, two representative molecular systems have been simulated and compared to the heterogeneous domain decomposition proposed in this work. These two systems comprise an adaptive resolution simulation of a biomolecule solvated in water and a phase-separated binary Lennard-Jones fluid.
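The "a priori rearrangement of subdomain walls" can be illustrated in one dimension (a sketch with invented cost weights, not the authors' code): given per-particle cost estimates, walls are placed at equal-cost quantiles rather than at equal spacing.

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.sort(rng.uniform(0.0, 1.0, 100000))   # particle positions
cost = np.where(x < 0.3, 10.0, 1.0)          # high-resolution region 10x dearer (invented)

def equal_cost_walls(x, cost, n_domains):
    """Place n_domains - 1 walls so each subdomain carries equal estimated cost."""
    cum = np.cumsum(cost)
    targets = cum[-1] * np.arange(1, n_domains) / n_domains
    return x[np.searchsorted(cum, targets)]

walls = equal_cost_walls(x, cost, 8)
print("wall positions:", np.round(walls, 3))
# Walls crowd into [0, 0.3): the expensive region is split among more ranks.
```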
Domain Decomposition Algorithms for First-Order System Least Squares Methods
NASA Technical Reports Server (NTRS)
Pavarino, Luca F.
1996-01-01
Least squares methods based on first-order systems have been recently proposed and analyzed for second-order elliptic equations and systems. They produce symmetric and positive definite discrete systems by using standard finite element spaces, which are not required to satisfy the inf-sup condition. In this paper, several domain decomposition algorithms for these first-order least squares methods are studied. Some representative overlapping and substructuring algorithms are considered in their additive and multiplicative variants. The theoretical and numerical results obtained show that the classical convergence bounds (on the iteration operator) for standard Galerkin discretizations are also valid for least squares methods.
Convergence issues in domain decomposition parallel computation of hovering rotor
NASA Astrophysics Data System (ADS)
Xiao, Zhongyun; Liu, Gang; Mou, Bin; Jiang, Xiong
2018-05-01
The implicit LU-SGS time integration algorithm has been widely used in parallel computation despite its lack of information from adjacent domains. When applied to parallel computation of hovering rotor flows in a rotating frame, it gives rise to convergence issues. To remedy the problem, three LU factorization-based implicit schemes (LU-SGS, DP-LUR and HLU-SGS) are investigated comparatively. A test case of pure grid rotation is designed to verify these algorithms; it shows that the LU-SGS algorithm introduces errors on boundary cells. When partition boundaries are circumferential, errors arise in proportion to grid speed, accumulate as the rotation proceeds, and ultimately lead to computational failure. Meanwhile, the DP-LUR and HLU-SGS methods show good convergence owing to their boundary treatment, which makes them desirable in domain decomposition parallel computations.
Terascale Optimal PDE Simulations (TOPS) Center
DOE Office of Scientific and Technical Information (OSTI.GOV)
Professor Olof B. Widlund
2007-07-09
Our work has focused on the development and analysis of domain decomposition algorithms for a variety of problems arising in continuum mechanics modeling. In particular, we have extended and analyzed FETI-DP and BDDC algorithms; these iterative solvers were first introduced and studied by Charbel Farhat and his collaborators, see [11, 45, 12], and by Clark Dohrmann of SANDIA, Albuquerque, see [43, 2, 1], respectively. These two closely related families of methods are of particular interest since they are used more extensively than other iterative substructuring methods to solve very large and difficult problems. Thus, the FETI algorithms are part of the SALINAS system developed by the SANDIA National Laboratories for very large scale computations, and as already noted, BDDC was first developed by a SANDIA scientist, Dr. Clark Dohrmann. The FETI algorithms are also making inroads in commercial engineering software systems. We also note that the analysis of these algorithms poses very real mathematical challenges. The success in developing this theory has, in several instances, led to significant improvements in the performance of these algorithms. A very desirable feature of these iterative substructuring and other domain decomposition algorithms is that they respect the memory hierarchy of modern parallel and distributed computing systems, which is essential for approaching peak floating point performance. The development of improved methods, together with more powerful computer systems, is making it possible to carry out simulations in three dimensions, with quite high resolution, relatively easily. This work is supported by high quality software systems, such as Argonne's PETSc library, which facilitates code development as well as the access to a variety of parallel and distributed computer systems. The success in finding scalable and robust domain decomposition algorithms for very large number of processors and very large finite element problems is, e.g., illustrated in [24, 25, 26]. This work is based on [29, 31]. Our work over these five and half years has, in our opinion, helped advance the knowledge of domain decomposition methods significantly. We see these methods as providing valuable alternatives to other iterative methods, in particular, those based on multi-grid. In our opinion, our accomplishments also match the goals of the TOPS project quite closely.
Using domain decomposition in the multigrid NAS parallel benchmark on the Fujitsu VPP500
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, J.C.H.; Lung, H.; Katsumata, Y.
1995-12-01
In this paper, we demonstrate how domain decomposition can be applied to the multigrid algorithm to convert the code for MPP architectures. We also discuss the performance and scalability of this implementation on the new product line of Fujitsu's vector parallel computer, the VPP500. This computer uses Fujitsu's well-known vector processor as the PE, each rated at 1.6 GFLOPS. The high speed crossbar network, rated at 800 MB/s, provides the inter-PE communication. The results show that physical domain decomposition is the best way to solve MG problems on the VPP500.
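For reference, the kernel being parallelized has the classic multigrid structure. The sketch below (a 1D two-grid correction scheme with damped Jacobi smoothing; illustrative, unrelated to the benchmark source) shows the smooth-restrict-correct-smooth cycle whose grids a domain decomposition would distribute block-wise at every level:

```python
import numpy as np

def residual(u, f, h):
    r = np.zeros_like(u)
    r[1:-1] = f[1:-1] - (2 * u[1:-1] - u[:-2] - u[2:]) / h**2
    return r

def jacobi(u, f, h, sweeps, omega=2/3):
    """Damped Jacobi: cheap, and quickly damps oscillatory error components."""
    for _ in range(sweeps):
        u[1:-1] += omega * 0.5 * (u[:-2] + u[2:] + h**2 * f[1:-1] - 2 * u[1:-1])
    return u

n = 129                                   # fine grid points incl. boundaries
h = 1.0 / (n - 1)
f, u = np.ones(n), np.zeros(n)
for cycle in range(10):
    u = jacobi(u, f, h, sweeps=3)         # pre-smooth
    r = residual(u, f, h)
    rc = np.zeros((n + 1) // 2)           # full-weighting restriction to coarse grid
    rc[1:-1] = (r[1:-2:2] + 2 * r[2:-1:2] + r[3::2]) / 4
    m = len(rc) - 2
    Ac = (np.diag(2 * np.ones(m)) - np.diag(np.ones(m - 1), 1)
          - np.diag(np.ones(m - 1), -1)) / (2 * h)**2
    e = np.zeros(len(rc))
    e[1:-1] = np.linalg.solve(Ac, rc[1:-1])               # exact coarse solve
    u += np.interp(np.arange(n), np.arange(0, n, 2), e)   # prolongate and correct
    u = jacobi(u, f, h, sweeps=3)         # post-smooth
print("residual norm:", np.linalg.norm(residual(u, f, h)))
```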
NASA Astrophysics Data System (ADS)
Lezina, Natalya; Agoshkov, Valery
2017-04-01
Domain decomposition method (DDM) allows one to represent a domain with complex geometry as a set of essentially simpler subdomains. This method is particularly suited to the hydrodynamics of oceans and seas. In each subdomain the system of thermo-hydrodynamic equations in the Boussinesq and hydrostatic approximations is solved. The difficulty in obtaining the solution in the whole domain is that the solutions in the subdomains must be combined. For this purpose, an iterative algorithm is constructed, and numerical experiments are conducted to investigate the effectiveness of the developed algorithm using DDM. For symmetric operators in DDM, Poincare-Steklov operators [1] are used, but for problems of hydrodynamics they are not suitable. In this case the adjoint equation method [2] and inverse problem theory are used for the problem. In addition, it is possible to create algorithms for parallel calculations using DDM on multiprocessor computer systems. DDM for the model of the Baltic Sea dynamics is numerically studied. The results of numerical experiments using DDM are compared with the solution of the system of hydrodynamic equations in the whole domain. The work was supported by the Russian Science Foundation (project 14-11-00609, the formulation of the iterative process and numerical experiments). [1] V.I. Agoshkov, Domain Decomposition Methods in the Mathematical Physics Problem // Numerical processes and systems, No. 8, Moscow, 1991 (in Russian). [2] V.I. Agoshkov, Optimal Control Approaches and Adjoint Equations in the Mathematical Physics Problem, Institute of Numerical Mathematics, RAS, Moscow, 2003 (in Russian).
Iterative image-domain decomposition for dual-energy CT
DOE Office of Scientific and Technical Information (OSTI.GOV)
Niu, Tianye; Dong, Xue; Petrongolo, Michael
2014-04-15
Purpose: Dual energy CT (DECT) imaging plays an important role in advanced imaging applications due to its capability of material decomposition. Direct decomposition via matrix inversion suffers from significant degradation of image signal-to-noise ratios, which reduces clinical values of DECT. Existing denoising algorithms achieve suboptimal performance since they suppress image noise either before or after the decomposition and do not fully explore the noise statistical properties of the decomposition process. In this work, the authors propose an iterative image-domain decomposition method for noise suppression in DECT, using the full variance-covariance matrix of the decomposed images. Methods: The proposed algorithm is formulated in the form of least-square estimation with smoothness regularization. Based on the design principles of a best linear unbiased estimator, the authors include the inverse of the estimated variance-covariance matrix of the decomposed images as the penalty weight in the least-square term. The regularization term enforces the image smoothness by calculating the square sum of neighboring pixel value differences. To retain the boundary sharpness of the decomposed images, the authors detect the edges in the CT images before decomposition. These edge pixels have small weights in the calculation of the regularization term. Distinct from the existing denoising algorithms applied on the images before or after decomposition, the method has an iterative process for noise suppression, with decomposition performed in each iteration. The authors implement the proposed algorithm using a standard conjugate gradient algorithm. The method performance is evaluated using an evaluation phantom (Catphan©600) and an anthropomorphic head phantom. The results are compared with those generated using direct matrix inversion with no noise suppression, a denoising method applied on the decomposed images, and an existing algorithm with similar formulation as the proposed method but with an edge-preserving regularization term. Results: On the Catphan phantom, the method maintains the same spatial resolution on the decomposed images as that of the CT images before decomposition (8 pairs/cm) while significantly reducing their noise standard deviation. Compared to that obtained by the direct matrix inversion, the noise standard deviation in the images decomposed by the proposed algorithm is reduced by over 98%. Without considering the noise correlation properties in the formulation, the denoising scheme degrades the spatial resolution to 6 pairs/cm for the same level of noise suppression. Compared to the edge-preserving algorithm, the method achieves better low-contrast detectability. A quantitative study is performed on the contrast-rod slice of the Catphan phantom. The proposed method achieves lower electron density measurement error as compared to that by the direct matrix inversion, and significantly reduces the error variation by over 97%. On the head phantom, the method reduces the noise standard deviation of decomposed images by over 97% without blurring the sinus structures. Conclusions: The authors propose an iterative image-domain decomposition method for DECT. The method combines noise suppression and material decomposition into an iterative process and achieves both goals simultaneously. By exploring the full variance-covariance properties of the decomposed images and utilizing edge pre-detection, the proposed algorithm shows superior performance on noise suppression with high image spatial resolution and low-contrast detectability.
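The shape of the estimator is worth seeing in miniature. The sketch below (a 1D stand-in with a diagonal weight, assuming NumPy; the paper uses the full variance-covariance matrix, 2D neighborhoods, and edge weighting) solves the penalized weighted least-squares problem min_x (x - x0)'W(x - x0) + lam*||Dx||^2 with conjugate gradients on its normal operator:

```python
import numpy as np

def cg(apply_A, b, iters=200, tol=1e-10):
    """Plain conjugate gradients for an SPD operator given as a function."""
    x = np.zeros_like(b); r = b.copy(); p = r.copy(); rs = r @ r
    for _ in range(iters):
        Ap = apply_A(p); alpha = rs / (p @ Ap)
        x += alpha * p; r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p; rs = rs_new
    return x

rng = np.random.default_rng(3)
truth = np.repeat([0.0, 1.0, 0.3], 100)        # piecewise-constant "material image"
sigma = 0.2 + 0.3 * rng.random(truth.size)     # spatially varying noise level
x0 = truth + sigma * rng.normal(size=truth.size)   # noisy direct-inversion result
W = 1.0 / sigma**2                             # diagonal stand-in for Sigma^{-1}
lam = 5.0

def D(x):  return np.diff(x)                   # neighbor differences
def Dt(y): return np.concatenate([[-y[0]], -np.diff(y), [y[-1]]])   # its transpose

apply_A = lambda x: W * x + lam * Dt(D(x))     # normal operator of the objective
x = cg(apply_A, W * x0)
rmse = lambda v: np.sqrt(((v - truth) ** 2).mean())
print("rmse before/after:", rmse(x0), rmse(x))
```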
Multitasking domain decomposition fast Poisson solvers on the Cray Y-MP
NASA Technical Reports Server (NTRS)
Chan, Tony F.; Fatoohi, Rod A.
1990-01-01
The results of multitasking implementation of a domain decomposition fast Poisson solver on eight processors of the Cray Y-MP are presented. The object of this research is to study the performance of domain decomposition methods on a Cray supercomputer and to analyze the performance of different multitasking techniques using highly parallel algorithms. Two implementations of multitasking are considered: macrotasking (parallelism at the subroutine level) and microtasking (parallelism at the do-loop level). A conventional FFT-based fast Poisson solver is also multitasked. The results of different implementations are compared and analyzed. A speedup of over 7.4 on the Cray Y-MP running in a dedicated environment is achieved for all cases.
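The "conventional FFT-based fast Poisson solver" referred to above has a compact structure: a discrete sine transform diagonalizes the 5-point Laplacian with homogeneous Dirichlet data, so the solve is transform, scale, inverse transform. A sketch (assuming SciPy's scipy.fft, not the Cray code):

```python
import numpy as np
from scipy.fft import dstn, idstn

n = 255                                # interior points per direction
h = 1.0 / (n + 1)
x = np.arange(1, n + 1) * h
X, Y = np.meshgrid(x, x, indexing="ij")
f = 2 * np.pi**2 * np.sin(np.pi * X) * np.sin(np.pi * Y)   # so u = sin*sin exactly

# Eigenvalues of the 1D 3-point Laplacian under DST-I modes sin(k*pi*x).
lam = (2 - 2 * np.cos(np.pi * np.arange(1, n + 1) / (n + 1))) / h**2
fhat = dstn(f, type=1)                 # forward 2D DST-I
uhat = fhat / (lam[:, None] + lam[None, :])
u = idstn(uhat, type=1)                # inverse transform
print("max error:", np.abs(u - np.sin(np.pi * X) * np.sin(np.pi * Y)).max())
```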
Remote-sensing image encryption in hybrid domains
NASA Astrophysics Data System (ADS)
Zhang, Xiaoqiang; Zhu, Guiliang; Ma, Shilong
2012-04-01
Remote-sensing technology plays an important role in military and industrial fields. Remote-sensing images are the main means of acquiring information from satellites, and they always contain some confidential information. To securely transmit and store remote-sensing images, we propose a new image encryption algorithm in hybrid domains. This algorithm makes full use of the advantages of image encryption in both the spatial domain and the transform domain. First, the low-pass subband coefficients of the image's DWT (discrete wavelet transform) decomposition are sorted by a PWLCM system in the transform domain. Second, the image after IDWT (inverse discrete wavelet transform) reconstruction is diffused with a 2D (two-dimensional) logistic map and an XOR operation in the spatial domain. The experimental results and algorithm analyses show that the new algorithm possesses a large key space and can resist brute-force, statistical and differential attacks. Meanwhile, the proposed algorithm has the encryption efficiency needed to satisfy requirements in practice.
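To give a flavor of the spatial-domain diffusion stage (a simplified sketch using a plain logistic-map keystream; the paper's pipeline additionally permutes DWT coefficients with a PWLCM system and uses a 2D logistic map):

```python
import numpy as np

def logistic_keystream(n, x0=0.3141, mu=3.99):
    """Bytes derived from the logistic map x <- mu*x*(1-x); (x0, mu) act as the key."""
    ks = np.empty(n, dtype=np.uint8)
    x = x0
    for i in range(n):
        x = mu * x * (1.0 - x)           # chaotic iterate in (0, 1)
        ks[i] = int(x * 256) & 0xFF
    return ks

img = np.arange(64, dtype=np.uint8).reshape(8, 8)    # stand-in "image"
ks = logistic_keystream(img.size)
cipher = (img.ravel() ^ ks).reshape(img.shape)       # diffusion by XOR
plain = (cipher.ravel() ^ ks).reshape(img.shape)     # XOR is its own inverse
assert (plain == img).all()
print(cipher)
```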
MARS-MD: rejection based image domain material decomposition
NASA Astrophysics Data System (ADS)
Bateman, C. J.; Knight, D.; Brandwacht, B.; McMahon, J.; Healy, J.; Panta, R.; Aamir, R.; Rajendran, K.; Moghiseh, M.; Ramyar, M.; Rundle, D.; Bennett, J.; de Ruiter, N.; Smithies, D.; Bell, S. T.; Doesburg, R.; Chernoglazov, A.; Mandalika, V. B. H.; Walsh, M.; Shamshad, M.; Anjomrouz, M.; Atharifard, A.; Vanden Broeke, L.; Bheesette, S.; Kirkbride, T.; Anderson, N. G.; Gieseg, S. P.; Woodfield, T.; Renaud, P. F.; Butler, A. P. H.; Butler, P. H.
2018-05-01
This paper outlines image domain material decomposition algorithms that have been routinely used in MARS spectral CT systems. These algorithms (known collectively as MARS-MD) are based on a pragmatic heuristic for solving the under-determined problem where there are more materials than energy bins. This heuristic contains three parts: (1) splitting the problem into a number of possible sub-problems, each containing fewer materials; (2) solving each sub-problem; and (3) applying rejection criteria to eliminate all but one sub-problem's solution. An advantage of this process is that different constraints can be applied to each sub-problem if necessary. In addition, the result of this process is that solutions will be sparse in the material domain, which reduces crossover of signal between material images. Two algorithms based on this process are presented: the Segmentation variant, which uses segmented material classes to define each sub-problem; and the Angular Rejection variant, which defines the rejection criteria using the angle between reconstructed attenuation vectors.
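The three-part heuristic is easy to sketch for a single voxel (illustrative NumPy/SciPy with an invented attenuation basis; the actual MARS-MD variants apply per-sub-problem constraints and rejection criteria beyond the plain residual used here):

```python
from itertools import combinations
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(4)
n_bins = 4
materials = ["water", "bone", "iodine", "gadolinium", "fat"]
M = rng.random((n_bins, len(materials)))             # attenuation basis (made up)
x_true = np.array([0.7, 0.0, 0.05, 0.0, 0.25])       # sparse in the material domain
y = M @ x_true + 0.001 * rng.normal(size=n_bins)     # measured voxel attenuation

best = None
for subset in combinations(range(len(materials)), 3):   # (1) sub-problems of 3 materials
    sol, res = nnls(M[:, subset], y)                    # (2) solve with non-negativity
    if best is None or res < best[1]:                   # (3) reject all but one solution
        best = (subset, res, sol)
subset, res, sol = best
print({materials[i]: round(v, 3) for i, v in zip(subset, sol)})
```

By construction the accepted solution involves at most three materials, which is the sparsity in the material domain that the paper notes reduces signal crossover between material images.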
DOE Office of Scientific and Technical Information (OSTI.GOV)
Niu, T; Dong, X; Petrongolo, M
Purpose: Dual energy CT (DECT) imaging plays an important role in advanced imaging applications due to its material decomposition capability. Direct decomposition via matrix inversion suffers from significant degradation of image signal-to-noise ratios, which reduces clinical value. Existing de-noising algorithms achieve suboptimal performance since they suppress image noise either before or after the decomposition and do not fully explore the noise statistical properties of the decomposition process. We propose an iterative image-domain decomposition method for noise suppression in DECT, using the full variance-covariance matrix of the decomposed images. Methods: The proposed algorithm is formulated in the form of least-square estimation with smoothness regularization. It includes the inverse of the estimated variance-covariance matrix of the decomposed images as the penalty weight in the least-square term. Performance is evaluated using an evaluation phantom (Catphan 600) and an anthropomorphic head phantom. Results are compared to those generated using direct matrix inversion with no noise suppression, a de-noising method applied on the decomposed images, and an existing algorithm with similar formulation but with an edge-preserving regularization term. Results: On the Catphan phantom, our method retains the same spatial resolution as the CT images before decomposition while reducing the noise standard deviation of decomposed images by over 98%. The other methods either degrade spatial resolution or achieve less low-contrast detectability. Also, our method yields lower electron density measurement error than direct matrix inversion and reduces error variation by over 97%. On the head phantom, it reduces the noise standard deviation of decomposed images by over 97% without blurring the sinus structures. Conclusion: We propose an iterative image-domain decomposition method for DECT. The method combines noise suppression and material decomposition into an iterative process and achieves both goals simultaneously. The proposed algorithm shows superior performance on noise suppression with high image spatial resolution and low-contrast detectability. This work is supported by a Varian MRA grant.
Domain Decomposition with Local Mesh Refinement.
1989-08-01
smooth coefficients, or non-smooth solutions. We employ from 1 to 1024 tiles on problems containing up to 161K degrees of freedom. The methodology survives such compromises and is even sequentially advantageous in many problems. The domain decomposition algorithms we employ (section 3) ... [the remainder of this scanned abstract is largely illegible; a legible fragment mentions a problem on the unit square with an outward-normal boundary condition, and an example from [1, 27] with a smooth solution but rapidly ...]
Yang, L. H.; Brooks III, E. D.; Belak, J.
1992-01-01
A molecular dynamics algorithm for performing large-scale simulations using the Parallel C Preprocessor (PCP) programming paradigm on the BBN TC2000, a massively parallel computer, is discussed. The algorithm uses a linked-cell data structure to obtain the near neighbors of each atom as time evolves. Each processor is assigned to a geometric domain containing many subcells and the storage for that domain is private to the processor. Within this scheme, the interdomain (i.e., interprocessor) communication is minimized.
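The linked-cell structure works as follows: atoms are binned into subcells no smaller than the interaction cutoff, so the candidate neighbors of an atom are confined to its own and the 26 adjacent subcells. A serial sketch (illustrative, not the PCP code; in the parallel scheme each processor owns a block of subcells and only the one-cell-deep boundary skin is communicated):

```python
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(5)
L, rc = 10.0, 1.0                       # box size and interaction cutoff
pos = rng.uniform(0, L, size=(5000, 3))
nc = int(L / rc)                        # subcells per direction (side >= cutoff)

cells = defaultdict(list)               # bin atoms into subcells
for a, p in enumerate(pos):
    cells[tuple((p / rc).astype(int) % nc)].append(a)

def neighbors_within_cutoff(a):
    """Scan only the 27 subcells around atom a instead of all atoms."""
    ci = tuple((pos[a] / rc).astype(int) % nc)
    out = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for dz in (-1, 0, 1):
                cj = ((ci[0]+dx) % nc, (ci[1]+dy) % nc, (ci[2]+dz) % nc)
                for b in cells[cj]:
                    d = pos[b] - pos[a]
                    d -= L * np.round(d / L)          # minimum-image convention
                    if b != a and (d @ d) < rc * rc:
                        out.append(b)
    return out

print("atom 0 has", len(neighbors_within_cutoff(0)), "neighbors within cutoff")
```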
An asymptotic induced numerical method for the convection-diffusion-reaction equation
NASA Technical Reports Server (NTRS)
Scroggs, Jeffrey S.; Sorensen, Danny C.
1988-01-01
A parallel algorithm for the efficient solution of a time-dependent convection-diffusion-reaction equation with a small parameter on the diffusion term is presented. The method is based on a domain decomposition that is dictated by singular perturbation analysis. The analysis is used to determine regions where certain reduced equations may be solved in place of the full equation. Parallelism is evident at two levels. Domain decomposition provides parallelism at the highest level, and within each domain there is ample opportunity to exploit parallelism. Run-time results demonstrate the viability of the method.
Efficient Delaunay Tessellation through K-D Tree Decomposition
DOE Office of Scientific and Technical Information (OSTI.GOV)
Morozov, Dmitriy; Peterka, Tom
Delaunay tessellations are fundamental data structures in computational geometry. They are important in data analysis, where they can represent the geometry of a point set or approximate its density. The algorithms for computing these tessellations at scale perform poorly when the input data is unbalanced. We investigate the use of k-d trees to evenly distribute points among processes and compare two strategies for picking split points between domain regions. Because the resulting point distributions no longer satisfy the assumptions of existing parallel Delaunay algorithms, we develop a new parallel algorithm that adapts to its input and prove its correctness. We evaluate the new algorithm using two late-stage cosmology datasets. The new running times are up to 50 times faster using the k-d tree compared with regular grid decomposition. Moreover, in the unbalanced data sets, decomposing the domain into a k-d tree is up to five times faster than decomposing it into a regular grid.
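The core k-d idea, splitting at a median so both halves carry equal point counts regardless of clustering, is compact; the sketch below (illustrative, with median splits along the widest axis, one of several split-point strategies of the kind the paper compares) decomposes a deliberately unbalanced point set:

```python
import numpy as np

def kdtree_domains(points, n_leaves):
    """Recursively halve the largest domain at a median until n_leaves remain."""
    domains = [points]
    while len(domains) < n_leaves:
        big = max(range(len(domains)), key=lambda i: len(domains[i]))
        pts = domains.pop(big)
        axis = np.argmax(pts.max(axis=0) - pts.min(axis=0))   # widest extent
        order = np.argsort(pts[:, axis])
        half = len(pts) // 2
        domains += [pts[order[:half]], pts[order[half:]]]
    return domains

rng = np.random.default_rng(6)
pts = rng.normal(0, 1, size=(100000, 3))**3       # deliberately clustered input
doms = kdtree_domains(pts, 16)
print("points per domain:", sorted(len(d) for d in doms))
```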
A stable and accurate partitioned algorithm for conjugate heat transfer
NASA Astrophysics Data System (ADS)
Meng, F.; Banks, J. W.; Henshaw, W. D.; Schwendeman, D. W.
2017-09-01
We describe a new partitioned approach for solving conjugate heat transfer (CHT) problems where the governing temperature equations in different material domains are time-stepped in an implicit manner, but where the interface coupling is explicit. The new approach, called the CHAMP scheme (Conjugate Heat transfer Advanced Multi-domain Partitioned), is based on a discretization of the interface coupling conditions using a generalized Robin (mixed) condition. The weights in the Robin condition are determined from the optimization of a condition derived from a local stability analysis of the coupling scheme. The interface treatment combines ideas from optimized-Schwarz methods for domain-decomposition problems together with the interface jump conditions and additional compatibility jump conditions derived from the governing equations. For many problems (i.e. for a wide range of material properties, grid-spacings and time-steps) the CHAMP algorithm is stable and second-order accurate using no sub-time-step iterations (i.e. a single implicit solve of the temperature equation in each domain). In extreme cases (e.g. very fine grids with very large time-steps) it may be necessary to perform one or more sub-iterations. Each sub-iteration generally increases the range of stability substantially and thus one sub-iteration is likely sufficient for the vast majority of practical problems. The CHAMP algorithm is developed first for a model problem and analyzed using normal-mode theory. The theory provides a mechanism for choosing optimal parameters in the mixed interface condition. A comparison is made to the classical Dirichlet-Neumann (DN) method and, where applicable, to the optimized-Schwarz (OS) domain-decomposition method. For problems with different thermal conductivities and diffusivities, the CHAMP algorithm outperforms the DN scheme. For domain-decomposition problems with uniform conductivities and diffusivities, the CHAMP algorithm performs better than the typical OS scheme with one grid-cell overlap. The CHAMP scheme is also developed for general curvilinear grids and CHT examples are presented using composite overset grids that confirm the theory and demonstrate the effectiveness of the approach.
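The role of the generalized Robin condition can be seen in a deliberately tiny setting. The sketch below (a steady two-domain 1D heat problem where each subdomain solution is linear; illustrative only, not the CHAMP discretization) alternates subdomain "solves" that each take Robin data from the neighbor. At the fixed point both temperature and flux are continuous for any positive weights p and q, and the weights control how fast the iteration converges, which is the quantity CHAMP optimizes:

```python
# u(0) = 0, u(1) = 1, conductivities k1 != k2, interface at x = g.
# Sub-solutions are linear: u1(x) = a1*x and u2(x) = 1 + a2*(x - 1).
k1, k2, g = 1.0, 20.0, 0.4
p = q = 5.0                       # Robin weights; try p = q = 0.5 vs 50.0
a1 = a2 = 0.0

for it in range(100):
    # Domain 1 solve:  k1*u1' + p*u1 = k2*u2' + p*u2  at x = g (u2 from last pass)
    a1 = (k2 * a2 + p * (1 + a2 * (g - 1))) / (k1 + p * g)
    # Domain 2 solve: -k2*u2' + q*u2 = -k1*u1' + q*u1  at x = g
    a2 = (q * g * a1 - k1 * a1 - q) / (q * (g - 1) - k2)
    if abs(k1 * a1 - k2 * a2) < 1e-12:    # flux jump ~ 0 => converged
        break

exact = k2 / (k2 * g + k1 * (1 - g))      # monolithic solution's slope in domain 1
print(f"converged in {it + 1} iterations; slope error {abs(a1 - exact):.2e}")
```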
Progressive Precision Surface Design
DOE Office of Scientific and Technical Information (OSTI.GOV)
Duchaineau, M; Joy, KJ
2002-01-11
We introduce a novel wavelet decomposition algorithm that makes a number of powerful new surface design operations practical. Wavelets, and hierarchical representations generally, have held promise to facilitate a variety of design tasks in a unified way by approximating results very precisely, thus avoiding a proliferation of undergirding mathematical representations. However, traditional wavelet decomposition is defined from fine to coarse resolution, thus limiting its efficiency for highly precise surface manipulation when attempting to create new non-local editing methods. Our key contribution is the progressive wavelet decomposition algorithm, a general-purpose coarse-to-fine method for hierarchical fitting, based in this paper on an underlying multiresolution representation called dyadic splines. The algorithm requests input via a generic interval query mechanism, allowing a wide variety of non-local operations to be quickly implemented. The algorithm performs work proportionate to the tiny compressed output size, rather than to some arbitrarily high resolution that would otherwise be required, thus increasing performance by several orders of magnitude. We describe several design operations that are made tractable because of the progressive decomposition. Free-form pasting is a generalization of the traditional control-mesh edit, but for which the shape of the change is completely general and where the shape can be placed using a free-form deformation within the surface domain. Smoothing and roughening operations are enhanced so that an arbitrary loop in the domain specifies the area of effect. Finally, the sculpting effect of moving a tool shape along a path is simulated.
Scalable Domain Decomposed Monte Carlo Particle Transport
NASA Astrophysics Data System (ADS)
O'Brien, Matthew Joseph
In this dissertation, we present the parallel algorithms necessary to run domain decomposed Monte Carlo particle transport on large numbers of processors (millions of processors). Previous algorithms were not scalable, and the parallel overhead became more computationally costly than the numerical simulation. The main algorithms we consider are:
• Domain decomposition of constructive solid geometry: enables extremely large calculations in which the background geometry is too large to fit in the memory of a single computational node.
• Load balancing: keeps the workload per processor as even as possible so the calculation runs efficiently.
• Global particle find: if particles are on the wrong processor, globally resolve their locations to the correct processor based on particle coordinate and background domain.
• Visualizing constructive solid geometry, sourcing particles, deciding that particle streaming communication is completed, and spatial redecomposition.
These algorithms are some of the most important parallel algorithms required for domain decomposed Monte Carlo particle transport. We demonstrate that our previous algorithms were not scalable, prove that our new algorithms are scalable, and run some of the algorithms up to 2 million MPI processes on the Sequoia supercomputer.
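The "global particle find" step listed above, for example, reduces to routing each particle to the rank whose domain contains it, using only the decomposition geometry rather than any all-to-all exchange of particle data. A sketch (assuming a Cartesian block decomposition for clarity; the dissertation handles general constructive solid geometry):

```python
import numpy as np

def owner_rank(coords, lo, hi, blocks):
    """Map particle coordinates to ranks of a blocks[0] x blocks[1] x blocks[2] grid."""
    frac = (coords - lo) / (hi - lo)
    ijk = np.minimum((frac * blocks).astype(int), np.array(blocks) - 1)
    return (ijk[:, 0] * blocks[1] + ijk[:, 1]) * blocks[2] + ijk[:, 2]

rng = np.random.default_rng(7)
particles = rng.uniform(0, 100, size=(10, 3))
ranks = owner_rank(particles, lo=np.zeros(3), hi=np.full(3, 100.0), blocks=(4, 4, 2))
for p, r in zip(particles, ranks):
    print(np.round(p, 1), "-> rank", r)   # each particle is sent only to its owner
```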
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zheng, Xiang; Yang, Chao
2015-03-15
We present a numerical algorithm for simulating the spinodal decomposition described by the three dimensional Cahn–Hilliard–Cook (CHC) equation, which is a fourth-order stochastic partial differential equation with a noise term. The equation is discretized in space and time based on a fully implicit, cell-centered finite difference scheme, with an adaptive time-stepping strategy designed to accelerate the progress to equilibrium. At each time step, a parallel Newton–Krylov–Schwarz algorithm is used to solve the nonlinear system. We discuss various numerical and computational challenges associated with the method. The numerical scheme is validated by a comparison with an explicit scheme of high accuracy (and unreasonably high cost). We present steady state solutions of the CHC equation in two and three dimensions. The effect of the thermal fluctuation on the spinodal decomposition process is studied. We show that the existence of the thermal fluctuation accelerates the spinodal decomposition process and that the final steady morphology is sensitive to the stochastic noise. We also show the evolution of the energies and statistical moments. In terms of the parallel performance, it is found that the implicit domain decomposition approach scales well on supercomputers with a large number of processors.
NASA Technical Reports Server (NTRS)
Farhat, Charbel; Rixen, Daniel
1996-01-01
We present an optimal preconditioning algorithm that is equally applicable to the dual (FETI) and primal (Balancing) Schur complement domain decomposition methods, and which successfully addresses the problems of subdomain heterogeneities including the effects of large jumps of coefficients. The proposed preconditioner is derived from energy principles and embeds a new coarsening operator that propagates the error globally and accelerates convergence. The resulting iterative solver is illustrated with the solution of highly heterogeneous elasticity problems.
NASA Astrophysics Data System (ADS)
Cafiero, M.; Lloberas-Valls, O.; Cante, J.; Oliver, J.
2016-04-01
A domain decomposition technique is proposed which is capable of properly connecting arbitrary non-conforming interfaces. The strategy essentially consists in considering a fictitious zero-width interface between the non-matching meshes which is discretized using a Delaunay triangulation. Continuity is satisfied across domains through normal and tangential stresses provided by the discretized interface and inserted in the formulation in the form of Lagrange multipliers. The final structure of the global system of equations resembles the dual assembly of substructures where the Lagrange multipliers are employed to nullify the gap between domains. A new approach to handle floating subdomains is outlined which can be implemented without significantly altering the structure of standard industrial finite element codes. The effectiveness of the developed algorithm is demonstrated through a patch test example and a number of tests that highlight the accuracy of the methodology and independence of the results with respect to the framework parameters. Considering its high degree of flexibility and non-intrusive character, the proposed domain decomposition framework is regarded as an attractive alternative to other established techniques such as the mortar approach.
An MPI + X implementation of contact global search using Kokkos
Hansen, Glen A.; Xavier, Patrick G.; Mish, Sam P.; ...
2015-10-05
This paper describes an approach that seeks to parallelize the spatial search associated with computational contact mechanics. In contact mechanics, the purpose of the spatial search is to find “nearest neighbors,” which is the prelude to an imprinting search that resolves the interactions between the external surfaces of contacting bodies. In particular, we are interested in the contact global search portion of the spatial search associated with this operation on domain-decomposition-based meshes. Specifically, we describe an implementation that combines standard domain-decomposition-based MPI-parallel spatial search with thread-level parallelism (MPI-X) available on advanced computer architectures (those with GPU coprocessors). Our goal is to demonstrate the efficacy of the MPI-X paradigm in the overall contact search. Standard MPI-parallel implementations typically use a domain decomposition of the external surfaces of bodies within the domain in an attempt to efficiently distribute computational work. This decomposition may or may not be the same as the volume decomposition associated with the host physics. The parallel contact global search phase is then employed to find and distribute surface entities (nodes and faces) that are needed to compute contact constraints between entities owned by different MPI ranks without further inter-rank communication. Key steps of the contact global search include computing bounding boxes, building surface entity (node and face) search trees and finding and distributing entities required to complete on-rank (local) spatial searches. To enable source-code portability and performance across a variety of different computer architectures, we implemented the algorithm using the Kokkos hardware abstraction library. While we targeted development towards machines with a GPU accelerator per MPI rank, we also report performance results for OpenMP with a conventional multi-core compute node per rank. Results here demonstrate a 47% decrease in the time spent within the global search algorithm, comparing the reference ACME algorithm with the GPU implementation, on an 18M face problem using four MPI ranks. As a result, while further work remains to maximize performance on the GPU, this result illustrates the potential of the proposed implementation.
Regularization of nonlinear decomposition of spectral x-ray projection images.
Ducros, Nicolas; Abascal, Juan Felipe Perez-Juste; Sixou, Bruno; Rit, Simon; Peyrin, Françoise
2017-09-01
Exploiting the x-ray measurements obtained in different energy bins, spectral computed tomography (CT) has the ability to recover the 3-D description of a patient in a material basis. This may be achieved solving two subproblems, namely the material decomposition and the tomographic reconstruction problems. In this work, we address the material decomposition of spectral x-ray projection images, which is a nonlinear ill-posed problem. Our main contribution is to introduce a material-dependent spatial regularization in the projection domain. The decomposition problem is solved iteratively using a Gauss-Newton algorithm that can benefit from fast linear solvers. A Matlab implementation is available online. The proposed regularized weighted least squares Gauss-Newton algorithm (RWLS-GN) is validated on numerical simulations of a thorax phantom made of up to five materials (soft tissue, bone, lung, adipose tissue, and gadolinium), which is scanned with a 120 kV source and imaged by a 4-bin photon counting detector. To evaluate the performance of our algorithm, different scenarios are created by varying the number of incident photons, the concentration of the marker and the configuration of the phantom. The RWLS-GN method is compared to the reference maximum likelihood Nelder-Mead algorithm (ML-NM). The convergence of the proposed method and its dependence on the regularization parameter are also studied. We show that material decomposition is feasible with the proposed method and that it converges in a few iterations. Material decomposition with ML-NM was very sensitive to noise, leading to decomposed images highly affected by noise and artifacts even in the best-case scenario. The proposed method was less sensitive to noise and improved the contrast-to-noise ratio of the gadolinium image. Results were superior to those provided by ML-NM in terms of image quality, and decomposition was 70 times faster. For the assessed experiments, material decomposition was possible with the proposed method when the number of incident photons was equal to or larger than 10⁵ and when the marker concentration was equal to or larger than 0.03 g·cm⁻³. The proposed method efficiently solves the nonlinear decomposition problem for spectral CT, which opens up new possibilities such as material-specific regularization in the projection domain and a parallelization framework, in which projections are solved in parallel. © 2017 American Association of Physicists in Medicine.
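The paper's RWLS-GN solver is not reproduced here, but the core update for a regularized weighted least-squares problem of the form min_x r(x)ᵀW r(x) + λ‖Lx‖² has a compact generic form. A sketch assuming user-supplied residual and Jacobian callables (all names are illustrative, not the authors' code):

```python
import numpy as np

def gauss_newton(residual, jacobian, x0, W, L, lam=1e-2, tol=1e-8, max_iter=50):
    """Generic regularized, weighted Gauss-Newton iteration (illustrative).

    Minimizes r(x)^T W r(x) + lam * ||L x||^2, with W a weight (inverse
    covariance) matrix and L a regularization operator.
    """
    x = x0.copy()
    for _ in range(max_iter):
        r = residual(x)
        J = jacobian(x)
        # Normal equations of the linearized, regularized problem.
        A = J.T @ W @ J + lam * (L.T @ L)
        g = J.T @ W @ r + lam * (L.T @ (L @ x))
        dx = np.linalg.solve(A, -g)
        x += dx
        if np.linalg.norm(dx) < tol * (1.0 + np.linalg.norm(x)):
            break
    return x
```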
Computational mechanics analysis tools for parallel-vector supercomputers
NASA Technical Reports Server (NTRS)
Storaasli, O. O.; Nguyen, D. T.; Baddourah, M. A.; Qin, J.
1993-01-01
Computational algorithms for structural analysis on parallel-vector supercomputers are reviewed. These parallel algorithms, developed by the authors, are for the assembly of structural equations, 'out-of-core' strategies for linear equation solution, massively distributed-memory equation solution, unsymmetric equation solution, general eigen-solution, geometrically nonlinear finite element analysis, design sensitivity analysis for structural dynamics, optimization algorithm and domain decomposition. The source code for many of these algorithms is available from NASA Langley.
Newton-Krylov-Schwarz: An implicit solver for CFD
NASA Technical Reports Server (NTRS)
Cai, Xiao-Chuan; Keyes, David E.; Venkatakrishnan, V.
1995-01-01
Newton-Krylov methods and Krylov-Schwarz (domain decomposition) methods have begun to become established in computational fluid dynamics (CFD) over the past decade. The former employ a Krylov method inside of Newton's method in a Jacobian-free manner, through directional differencing. The latter employ an overlapping Schwarz domain decomposition to derive a preconditioner for the Krylov accelerator that relies primarily on local information, for data-parallel concurrency. They may be composed as Newton-Krylov-Schwarz (NKS) methods, which seem particularly well suited for solving nonlinear elliptic systems in high-latency, distributed-memory environments. We give a brief description of this family of algorithms, with an emphasis on domain decomposition iterative aspects. We then describe numerical simulations with Newton-Krylov-Schwarz methods on aerodynamics applications emphasizing comparisons with a standard defect-correction approach, subdomain preconditioner consistency, subdomain preconditioner quality, and the effect of a coarse grid.
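The "Krylov inside Newton, Jacobian-free through directional differencing" idea can be made concrete in a few lines: the Jacobian-vector product is approximated by a finite difference of the nonlinear residual, so GMRES only ever sees an operator. A minimal SciPy sketch (the Schwarz preconditioner is omitted; it would enter as the M argument of gmres):

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, gmres

def jfnk(F, u0, newton_tol=1e-8, max_newton=30, fd_eps=1e-7):
    """Jacobian-free Newton-Krylov: GMRES solves J(u) du = -F(u), where the
    action J(u) v is approximated by directional differencing of F."""
    u = u0.copy()
    for _ in range(max_newton):
        Fu = F(u)
        if np.linalg.norm(Fu) < newton_tol:
            break
        def Jv(v, u=u, Fu=Fu):
            # Directional-difference approximation of the Jacobian action.
            return (F(u + fd_eps * v) - Fu) / fd_eps
        J = LinearOperator((u.size, u.size), matvec=Jv, dtype=float)
        du, info = gmres(J, -Fu)
        u = u + du
    return u
```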
NASA Astrophysics Data System (ADS)
Le, Thien-Phu
2017-10-01
The frequency-scale domain decomposition technique has recently been proposed for operational modal analysis. The technique is based on the Cauchy mother wavelet. In this paper, the approach is extended to the Morlet mother wavelet, which is very popular in signal processing due to its superior time-frequency localization. Based on the regressive form and an appropriate norm of the Morlet mother wavelet, the continuous wavelet transform of the power spectral density of ambient responses enables modes in the frequency-scale domain to be highlighted. Analytical developments first demonstrate the link between modal parameters and the local maxima of the continuous wavelet transform modulus. The link formula is then used as the foundation of the proposed modal identification method. Its practical procedure, combined with the singular value decomposition algorithm, is presented step by step. The proposition is finally verified using numerical examples and a laboratory test.
Domain decomposition algorithms and computational fluid dynamics
NASA Technical Reports Server (NTRS)
Chan, Tony F.
1988-01-01
In the past several years, domain decomposition has been a very popular topic, motivated in part by its potential for parallelization. While a large body of theory and algorithms has been developed for model elliptic problems, these methods are only recently starting to be tested on realistic applications. The application of some of these methods to two model problems in computational fluid dynamics is investigated: two-dimensional convection-diffusion problems and the incompressible driven cavity flow problem. The construction and analysis of efficient preconditioners for the interface operator, to be used in the iterative solution of the interface equation, are described. For the convection-diffusion problems, the effect of the convection term and its discretization on the performance of some of the preconditioners is discussed. For the driven cavity problem, the effectiveness of a class of boundary probe preconditioners is discussed.
Scalable High-order Methods for Multi-Scale Problems: Analysis, Algorithms and Application
2016-02-26
The objective of this project was to develop a general CFD framework for multifidelity simulations targeting multiscale problems, as well as resilience to gappy data. Keywords: simulation, domain decomposition, CFD, gappy data, estimation theory, gap-tooth algorithm. Related publication: Karniadakis, “Resilient algorithms for reconstructing and simulating gappy flow fields in CFD”, Fluid Dynamics Research, vol. 47, 051402, 2015.
Jung, Jaewoon; Mori, Takaharu; Kobayashi, Chigusa; Matsunaga, Yasuhiro; Yoda, Takao; Feig, Michael; Sugita, Yuji
2015-07-01
GENESIS (Generalized-Ensemble Simulation System) is a new software package for molecular dynamics (MD) simulations of macromolecules. It has two MD simulators, called ATDYN and SPDYN. ATDYN is parallelized based on an atomic decomposition algorithm for the simulations of all-atom force-field models as well as coarse-grained Go-like models. SPDYN is highly parallelized based on a domain decomposition scheme, allowing large-scale MD simulations on supercomputers. Hybrid schemes combining OpenMP and MPI are used in both simulators to target modern multicore computer architectures. Key advantages of GENESIS are (1) the highly parallel performance of SPDYN for very large biological systems consisting of more than one million atoms and (2) the availability of various REMD algorithms (T-REMD, REUS, multi-dimensional REMD for both all-atom and Go-like models under the NVT, NPT, NPAT, and NPγT ensembles). The former is achieved by a combination of the midpoint cell method and the efficient three-dimensional Fast Fourier Transform algorithm, where the domain decomposition space is shared in real-space and reciprocal-space calculations. Other features in SPDYN, such as avoiding concurrent memory access, reducing communication times, and usage of parallel input/output files, also contribute to the performance. We show the REMD simulation results of a mixed (POPC/DMPC) lipid bilayer as a real application using GENESIS. GENESIS is released as free software under the GPLv2 licence and can be easily modified for the development of new algorithms and molecular models. WIREs Comput Mol Sci 2015, 5:310-323. doi: 10.1002/wcms.1220.
A Framework for Parallel Unstructured Grid Generation for Complex Aerodynamic Simulations
NASA Technical Reports Server (NTRS)
Zagaris, George; Pirzadeh, Shahyar Z.; Chrisochoides, Nikos
2009-01-01
A framework for parallel unstructured grid generation targeting both shared memory multi-processors and distributed memory architectures is presented. The two fundamental building blocks of the framework are: (1) the Advancing-Partition (AP) method, used for domain decomposition, and (2) the Advancing Front (AF) method, used for mesh generation. Starting from the surface mesh of the computational domain, the AP method is applied recursively to generate a set of sub-domains. Next, the sub-domains are meshed in parallel using the AF method. The recursive nature of the domain decomposition naturally maps to a divide-and-conquer algorithm which exhibits inherent parallelism. For the parallel implementation, the Master/Worker pattern is employed to dynamically balance the varying workloads of each task on the set of available CPUs. Performance results for this approach are presented and discussed in detail, along with future work and improvements.
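The Advancing-Partition method itself is mesh-specific, but its recursive divide-and-conquer structure maps naturally onto code like the following coordinate-bisection sketch (a generic stand-in for AP, not the authors' algorithm):

```python
import numpy as np

def recursive_partition(points, ids, depth):
    """Recursively split a point cloud into 2**depth sub-domains by
    bisecting along the longest bounding-box axis (illustrative stand-in
    for the Advancing-Partition method)."""
    if depth == 0:
        return [ids]
    axis = np.argmax(np.ptp(points, axis=0))   # axis of longest extent
    median = np.median(points[:, axis])
    left = points[:, axis] <= median
    return (recursive_partition(points[left], ids[left], depth - 1) +
            recursive_partition(points[~left], ids[~left], depth - 1))

pts = np.random.rand(1000, 3)
subdomains = recursive_partition(pts, np.arange(len(pts)), depth=3)  # 8 parts
```

Each recursion level is independent of its sibling, which is exactly the inherent parallelism the abstract refers to: sub-trees can be dispatched to different workers.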
Hierarchical fractional-step approximations and parallel kinetic Monte Carlo algorithms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Arampatzis, Giorgos, E-mail: garab@math.uoc.gr; Katsoulakis, Markos A., E-mail: markos@math.umass.edu; Plechac, Petr, E-mail: plechac@math.udel.edu
2012-10-01
We present a mathematical framework for constructing and analyzing parallel algorithms for lattice kinetic Monte Carlo (KMC) simulations. The resulting algorithms have the capacity to simulate a wide range of spatio-temporal scales in spatially distributed, non-equilibrium physicochemical processes with complex chemistry and transport micro-mechanisms. Rather than focusing on constructing exactly the stochastic trajectories, our approach relies on approximating the evolution of observables, such as density, coverage, correlations and so on. More specifically, we develop a spatial domain decomposition of the Markov operator (generator) that describes the evolution of all observables according to the kinetic Monte Carlo algorithm. This domain decomposition corresponds to a decomposition of the Markov generator into a hierarchy of operators and can be tailored to specific hierarchical parallel architectures such as multi-core processors or clusters of Graphical Processing Units (GPUs). Based on this operator decomposition, we formulate parallel fractional-step kinetic Monte Carlo algorithms by employing the Trotter Theorem and its randomized variants; these schemes (a) are partially asynchronous on each fractional step time-window, and (b) are characterized by their communication schedule between processors. The proposed mathematical framework allows us to rigorously justify the numerical and statistical consistency of the proposed algorithms, showing the convergence of our approximating schemes to the original serial KMC. The approach also provides a systematic evaluation of different processor communicating schedules. We carry out a detailed benchmarking of the parallel KMC schemes using available exact solutions, for example, in Ising-type systems, and we demonstrate the capabilities of the method to simulate complex spatially distributed reactions at very large scales on GPUs. Finally, we discuss work load balancing between processors and propose a re-balancing scheme based on probabilistic mass transport methods.
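The fractional-step idea rests on the Trotter factorization exp((Q₁+Q₂)Δt) ≈ exp(Q₁Δt)·exp(Q₂Δt) of the Markov generator, with Q₁ and Q₂ owned by different processor groups. A tiny toy illustration of the consistency of this splitting on a 3-state continuous-time Markov generator (not the paper's lattice KMC code; the matrices are invented for the demo):

```python
import numpy as np
from scipy.linalg import expm

# Toy 3-state Markov generators (each row sums to zero), standing in for
# the sub-lattice operators owned by different processor groups.
Q1 = np.array([[-1.0, 1.0, 0.0], [0.0, 0.0, 0.0], [0.0, 2.0, -2.0]])
Q2 = np.array([[0.0, 0.0, 0.0], [0.5, -1.0, 0.5], [0.0, 0.0, 0.0]])

dt, nsteps = 0.01, 100
exact = expm((Q1 + Q2) * dt * nsteps)                 # unsplit evolution
lie = np.linalg.matrix_power(expm(Q1 * dt) @ expm(Q2 * dt), nsteps)
print(np.abs(exact - lie).max())   # O(dt) splitting error; shrinks with dt
```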
NASA Technical Reports Server (NTRS)
Farhat, Charbel; Lesoinne, Michel
1993-01-01
Most of the recently proposed computational methods for solving partial differential equations on multiprocessor architectures stem from the 'divide and conquer' paradigm and involve some form of domain decomposition. For those methods which also require grids of points or patches of elements, it is often necessary to explicitly partition the underlying mesh, especially when working with local memory parallel processors. In this paper, a family of cost-effective algorithms for the automatic partitioning of arbitrary two- and three-dimensional finite element and finite difference meshes is presented and discussed in view of a domain decomposed solution procedure and parallel processing. The influence of the algorithmic aspects of a solution method (implicit/explicit computations), and the architectural specifics of a multiprocessor (SIMD/MIMD, startup/transmission time), on the design of a mesh partitioning algorithm are discussed. The impact of the partitioning strategy on load balancing, operation count, operator conditioning, rate of convergence and processor mapping is also addressed. Finally, the proposed mesh decomposition algorithms are demonstrated with realistic examples of finite element, finite volume, and finite difference meshes associated with the parallel solution of solid and fluid mechanics problems on the iPSC/2 and iPSC/860 multiprocessors.
Bai, Yulei; Jia, Quanjie; Zhang, Yun; Huang, Qiquan; Yang, Qiyu; Ye, Shuangli; He, Zhaoshui; Zhou, Yanzhou; Xie, Shengli
2016-05-01
It is important to improve the depth resolution in depth-resolved wavenumber-scanning interferometry (DRWSI) owing to the limited range of wavenumber scanning. In this work, a new nonlinear iterative least-squares algorithm called the wavenumber-domain least-squares algorithm (WLSA) is proposed for evaluating the phase of DRWSI. The simulated and experimental results of the Fourier transform (FT), complex-number least-squares algorithm (CNLSA), eigenvalue-decomposition and least-squares algorithm (EDLSA), and WLSA were compared and analyzed. According to the results, the WLSA is less dependent on the initial values, and the depth resolution is improved from δz to approximately δz/6. Thus, the WLSA exhibits a better performance than the FT, CNLSA, and EDLSA.
NASA Technical Reports Server (NTRS)
Luke, Edward Allen
1993-01-01
Two algorithms capable of computing a transonic 3-D inviscid flow field about rotating machines are considered for parallel implementation. During the study of these algorithms, a significant new method of measuring the performance of parallel algorithms is developed. The theory that supports this new method creates an empirical definition of scalable parallel algorithms that is used to produce quantifiable evidence that a scalable parallel application was developed. The implementation of the parallel application and an automated domain decomposition tool are also discussed.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhao, W; Niu, T; Xing, L
2015-06-15
Purpose: To significantly improve dual energy CT (DECT) imaging by establishing a new theoretical framework of image-domain material decomposition with incorporation of edge-preserving techniques. Methods: The proposed algorithm, HYPR-NLM, combines the edge-preserving non-local mean filter (NLM) with the HYPR-LR (Local HighlY constrained backPRojection Reconstruction) framework. Image denoising using the HYPR-LR framework depends on the noise level of the composite image, which is the average of the different energy images; for DECT, the composite image is the average of the high- and low-energy images. To further reduce noise, one may want to increase the window size of the HYPR-LR filter, leading to resolution degradation. By incorporating NLM filtering into the HYPR-LR framework, HYPR-NLM reduces the boosted material decomposition noise using energy information redundancies as well as the non-local mean. We demonstrate the noise reduction and resolution preservation of the algorithm with both an iodine concentration numerical phantom and clinical patient data, by comparing the HYPR-NLM algorithm to direct matrix inversion, HYPR-LR and iterative image-domain material decomposition (Iter-DECT). Results: The results show that the iterative material decomposition method reduces noise to the lowest level and provides improved DECT images. HYPR-NLM significantly reduces noise while preserving the accuracy of quantitative measurement and resolution. For the iodine concentration numerical phantom, the averaged noise levels are about 2.0, 0.7, 0.2 and 0.4 for direct inversion, HYPR-LR, Iter-DECT and HYPR-NLM, respectively. For the patient data, the noise levels of the water images are about 0.36, 0.16, 0.12 and 0.13 for direct inversion, HYPR-LR, Iter-DECT and HYPR-NLM, respectively. Difference images of both HYPR-LR and Iter-DECT show an edge effect, while no significant edge effect is shown for HYPR-NLM, suggesting spatial resolution is well preserved for HYPR-NLM. Conclusion: HYPR-NLM provides an effective way to reduce the noise magnification inherent in dual-energy material decomposition while preserving resolution. This work is supported in part by NIH grants 7R01HL111141 and 1R01-EB016777. This work is also supported by the Natural Science Foundation of China (NSFC Grant No. 81201091), Fundamental Research Funds for the Central Universities in China, and the Fund Project for Excellent Abroad Scholar Personnel in Science and Technology.
Guo, Qiang; Qi, Liangang
2017-04-10
In the coexistence of multiple types of interfering signals, the performance of interference suppression methods based on the time and frequency domains degrades seriously, and techniques using an antenna array require a large array size and considerable hardware cost. To better combat multi-type interference for GNSS receivers, this paper proposes a cascaded multi-type interference mitigation method combining improved double chain quantum genetic matching pursuit (DCQGMP)-based sparse decomposition and an MPDR beamformer. The key idea behind the proposed method is that the multiple types of interfering signals can be excised by taking advantage of their sparse features in different domains. In the first stage, the single-tone (multi-tone) and linear chirp interfering signals are canceled by sparse decomposition according to their sparsity in the over-complete dictionary. In order to improve the timeliness of matching pursuit (MP)-based sparse decomposition, a DCQGMP is introduced by combining an improved double chain quantum genetic algorithm (DCQGA) and the MP algorithm, and the DCQGMP algorithm is extended to handle multi-channel signals according to the correlation among the signals in different channels. In the second stage, the minimum power distortionless response (MPDR) beamformer is utilized to nullify the residual interferences (e.g., wideband Gaussian noise interferences). Several simulation results show that the proposed method can not only improve the interference mitigation degrees of freedom (DoF) of the array antenna, but also effectively deal with interference arriving from the same direction as the GNSS signal, provided it can be sparsely represented in the over-complete dictionary. Moreover, it does not introduce serious distortions into the navigation signal.
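The DCQGMP accelerates the atom-selection step with a quantum genetic search; the underlying matching-pursuit recursion it speeds up is standard. A plain (non-genetic) MP sketch over a dictionary with unit-norm columns:

```python
import numpy as np

def matching_pursuit(x, D, n_atoms=10):
    """Plain matching pursuit: greedily subtract the dictionary atom most
    correlated with the residual. D must have unit-norm columns."""
    residual = x.astype(float).copy()
    coeffs = np.zeros(D.shape[1])
    for _ in range(n_atoms):
        corr = D.T @ residual
        k = np.argmax(np.abs(corr))   # the step DCQGMP searches genetically
        coeffs[k] += corr[k]
        residual -= corr[k] * D[:, k]
    return coeffs, residual
```

The argmax over all atoms is the expensive part for large chirp dictionaries, which is why the paper replaces the exhaustive scan with an evolutionary search.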
Algebraic multigrid domain and range decomposition (AMG-DD / AMG-RD)*
Bank, R.; Falgout, R. D.; Jones, T.; ...
2015-10-29
In modern large-scale supercomputing applications, algebraic multigrid (AMG) is a leading choice for solving matrix equations. However, the high cost of communication relative to that of computation is a concern for the scalability of traditional implementations of AMG on emerging architectures. This paper introduces two new algebraic multilevel algorithms, algebraic multigrid domain decomposition (AMG-DD) and algebraic multigrid range decomposition (AMG-RD), that replace traditional AMG V-cycles with a fully overlapping domain decomposition approach. While the methods introduced here are similar in spirit to the geometric methods developed by Brandt and Diskin [Multigrid solvers on decomposed domains, in Domain Decomposition Methods in Science and Engineering, Contemp. Math. 157, AMS, Providence, RI, 1994, pp. 135--155], Mitchell [Electron. Trans. Numer. Anal., 6 (1997), pp. 224--233], and Bank and Holst [SIAM J. Sci. Comput., 22 (2000), pp. 1411--1443], they differ primarily in that they are purely algebraic: AMG-RD and AMG-DD trade communication for computation by forming global composite “grids” based only on the matrix, not the geometry. (As is the usual AMG convention, “grids” here should be taken only in the algebraic sense, regardless of whether or not they correspond to any geometry.) Another important distinguishing feature of AMG-RD and AMG-DD is their novel residual communication process that enables effective parallel computation on composite grids, avoiding the all-to-all communication costs of the geometric methods. The main purpose of this paper is to study the potential of these two algebraic methods as possible alternatives to existing AMG approaches for future parallel machines. As a result, this paper develops some theoretical properties of these methods and reports on serial numerical tests of their convergence properties over a spectrum of problem parameters.
a Unified Matrix Polynomial Approach to Modal Identification
NASA Astrophysics Data System (ADS)
Allemang, R. J.; Brown, D. L.
1998-04-01
One important current focus of modal identification is a reformulation of modal parameter estimation algorithms into a single, consistent mathematical formulation with a corresponding set of definitions and unifying concepts. In particular, a matrix polynomial approach is used to unify the presentation with respect to current algorithms such as the least-squares complex exponential (LSCE), the polyreference time domain (PTD), Ibrahim time domain (ITD), eigensystem realization algorithm (ERA), rational fraction polynomial (RFP), polyreference frequency domain (PFD) and the complex mode indication function (CMIF) methods. Using this unified matrix polynomial approach (UMPA) allows a discussion of the similarities and differences of the commonly used methods. The use of least squares (LS), total least squares (TLS), double least squares (DLS) and singular value decomposition (SVD) methods is discussed in order to take advantage of redundant measurement data. Eigenvalue and SVD transformation methods are utilized to reduce the effective size of the resulting eigenvalue-eigenvector problem as well.
Shatokhina, Iuliia; Obereder, Andreas; Rosensteiner, Matthias; Ramlau, Ronny
2013-04-20
We present a fast method for the wavefront reconstruction from pyramid wavefront sensor (P-WFS) measurements. The method is based on an analytical relation between pyramid and Shack-Hartmann sensor (SH-WFS) data. The algorithm consists of two steps: a transformation of the P-WFS data to SH data, followed by the application of the Cumulative Reconstructor with Domain Decomposition (CuReD), a wavefront reconstructor for SH-WFS measurements. The closed loop simulations confirm that our method provides the same quality as the standard matrix-vector multiplication method. A complexity analysis as well as speed tests confirm that the method is very fast. Thus, the method can be used on extremely large telescopes, e.g., for eXtreme adaptive optics systems.
NASA Astrophysics Data System (ADS)
Sourbier, Florent; Operto, Stéphane; Virieux, Jean; Amestoy, Patrick; L'Excellent, Jean-Yves
2009-03-01
This is the first paper in a two-part series that describes a massively parallel code that performs 2D frequency-domain full-waveform inversion of wide-aperture seismic data for imaging complex structures. Full-waveform inversion methods, namely quantitative seismic imaging methods based on the solution of the full wave equation, are computationally expensive. Therefore, designing efficient algorithms which take advantage of parallel computing facilities is critical for the appraisal of these approaches when applied to representative case studies and for further improvements. Full-waveform modelling requires the solution of a large sparse system of linear equations, which is performed with the massively parallel direct solver MUMPS for efficient multiple-shot simulations. Efficiency of the multiple-shot solution phase (forward/backward substitutions) is improved by using the BLAS3 library. The inverse problem relies on a classic local optimization approach implemented with a gradient method. The direct solver returns the multiple-shot wavefield solutions distributed over the processors according to a domain decomposition driven by the distribution of the LU factors. The domain decomposition of the wavefield solutions is used to compute in parallel the gradient of the objective function and the diagonal Hessian, this latter providing a suitable scaling of the gradient. The algorithm allows one to test different strategies for multiscale frequency inversion, ranging from successive mono-frequency inversion to simultaneous multifrequency inversion. These different inversion strategies will be illustrated in the following companion paper. The parallel efficiency and the scalability of the code will also be quantified.
Data decomposition of Monte Carlo particle transport simulations via tally servers
DOE Office of Scientific and Technical Information (OSTI.GOV)
Romano, Paul K.; Siegel, Andrew R.; Forget, Benoit
An algorithm for decomposing large tally data in Monte Carlo particle transport simulations is developed, analyzed, and implemented in a continuous-energy Monte Carlo code, OpenMC. The algorithm is based on a non-overlapping decomposition of compute nodes into tracking processors and tally servers. The former are used to simulate the movement of particles through the domain while the latter continuously receive and update tally data. A performance model for this approach is developed, suggesting that, for a range of parameters relevant to LWR analysis, the tally server algorithm should perform with minimal overhead on contemporary supercomputers. An implementation of the algorithm in OpenMC is then tested on the Intrepid and Titan supercomputers, supporting the key predictions of the model over a wide range of parameters. We thus conclude that the tally server algorithm is a successful approach to circumventing classical on-node memory constraints en route to unprecedentedly detailed Monte Carlo reactor simulations.
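The split of ranks into particle trackers and tally servers can be sketched with mpi4py; the rank assignment, message format, and sentinel protocol below are illustrative assumptions, not OpenMC's implementation:

```python
# Sketch of a tracker/tally-server split (illustrative, not OpenMC's code).
# Run with e.g.: mpiexec -n 4 python tally_sketch.py
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()
n_servers = max(size // 2, 1)
servers = range(size - n_servers, size)      # high ranks act as tally servers

if rank in servers:                          # tally server: accumulate scores
    tally, done = {}, 0
    while done < size - n_servers:
        msg = comm.recv(source=MPI.ANY_SOURCE)
        if msg is None:                      # sentinel: one tracker finished
            done += 1
        else:
            cell, score = msg
            tally[cell] = tally.get(cell, 0.0) + score
else:                                        # tracking processor: fake scores
    rng = np.random.default_rng(rank)
    for _ in range(1000):
        cell = int(rng.integers(0, 10_000))
        dest = servers[cell % n_servers]     # the cell id selects its server
        comm.send((cell, float(rng.random())), dest=dest)
    for srv in servers:                      # tell every server we are done
        comm.send(None, dest=srv)
```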
Planning paths through a spatial hierarchy - Eliminating stair-stepping effects
NASA Technical Reports Server (NTRS)
Slack, Marc G.
1989-01-01
Stair-stepping effects result from the loss of spatial continuity caused by the decomposition of space into a grid. This paper presents a path planning algorithm which eliminates the stair-stepping effects induced by a grid-based spatial representation. The algorithm exploits a hierarchical spatial model to efficiently plan paths for a mobile robot operating in dynamic domains. The spatial model and path planning algorithm map to a parallel machine, allowing the system to operate incrementally, thereby accounting for unexpected events in the operating space.
NASA Technical Reports Server (NTRS)
Nguyen, D. T.; Watson, Willie R. (Technical Monitor)
2005-01-01
The overall objectives of this research work are to formulate and validate efficient parallel algorithms, and to efficiently design and implement computer software, for solving large-scale acoustic problems arising from the unified frameworks of finite element procedures. The adopted parallel Finite Element (FE) Domain Decomposition (DD) procedures should take full advantage of the multiple processing capabilities offered by most modern high-performance computing platforms for efficient parallel computation. To achieve this objective, the formulation needs to integrate efficient sparse (and dense) assembly techniques, hybrid (or mixed) direct and iterative equation solvers, proper preconditioning strategies, unrolling strategies, and effective processor communication schemes. Finally, the numerical performance of the developed parallel finite element procedures will be evaluated by solving a series of structural and acoustic (symmetric and unsymmetric) problems on different computing platforms. Comparisons with existing "commercialized" and/or "public domain" software are also included, whenever possible.
GROMACS 4: Algorithms for Highly Efficient, Load-Balanced, and Scalable Molecular Simulation.
Hess, Berk; Kutzner, Carsten; van der Spoel, David; Lindahl, Erik
2008-03-01
Molecular simulation is an extremely useful, but computationally very expensive tool for studies of chemical and biomolecular systems. Here, we present a new implementation of our molecular simulation toolkit GROMACS which now both achieves extremely high performance on single processors from algorithmic optimizations and hand-coded routines and simultaneously scales very well on parallel machines. The code encompasses a minimal-communication domain decomposition algorithm, full dynamic load balancing, a state-of-the-art parallel constraint solver, and efficient virtual site algorithms that allow removal of hydrogen atom degrees of freedom to enable integration time steps up to 5 fs for atomistic simulations also in parallel. To improve the scaling properties of the common particle mesh Ewald electrostatics algorithms, we have in addition used a Multiple-Program, Multiple-Data approach, with separate node domains responsible for direct and reciprocal space interactions. Not only does this combination of algorithms enable extremely long simulations of large systems but also it provides that simulation performance on quite modest numbers of standard cluster nodes.
Adaptive Fourier decomposition based R-peak detection for noisy ECG Signals.
Ze Wang; Chi Man Wong; Feng Wan
2017-07-01
An adaptive Fourier decomposition (AFD) based R-peak detection method is proposed for noisy ECG signals. Although many QRS detection methods have been proposed in the literature, most require high signal quality. The proposed method extracts the R waves from the energy domain using the AFD and determines the R-peak locations based on the key decomposition parameters, achieving denoising and R-peak detection at the same time. Validated on clinical ECG signals from the MIT-BIH Arrhythmia Database, the proposed method shows better performance than the Pan-Tompkins (PT) algorithm, both in its native form and when combined with a denoising preprocessing step.
A tightly-coupled domain-decomposition approach for highly nonlinear stochastic multiphysics systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Taverniers, Søren; Tartakovsky, Daniel M., E-mail: dmt@ucsd.edu
2017-02-01
Multiphysics simulations often involve nonlinear components that are driven by internally generated or externally imposed random fluctuations. When used with a domain-decomposition (DD) algorithm, such components have to be coupled in a way that both accurately propagates the noise between the subdomains and lends itself to a stable and cost-effective temporal integration. We develop a conservative DD approach in which tight coupling is obtained by using a Jacobian-free Newton–Krylov (JfNK) method with a generalized minimum residual iterative linear solver. This strategy is tested on a coupled nonlinear diffusion system forced by a truncated Gaussian noise at the boundary. Enforcement of path-wise continuity of the state variable and its flux, as opposed to continuity in the mean, at interfaces between subdomains enables the DD algorithm to correctly propagate boundary fluctuations throughout the computational domain. Reliance on a single Newton iteration (explicit coupling), rather than on the fully converged JfNK (implicit) coupling, may increase the solution error by an order of magnitude. Increase in communication frequency between the DD components reduces the explicit coupling's error, but makes it less efficient than the implicit coupling at comparable error levels for all noise strengths considered. Finally, the DD algorithm with the implicit JfNK coupling resolves temporally-correlated fluctuations of the boundary noise when the correlation time of the latter exceeds some multiple of an appropriately defined characteristic diffusion time.
A physics-motivated Centroidal Voronoi Particle domain decomposition method
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fu, Lin, E-mail: lin.fu@tum.de; Hu, Xiangyu Y., E-mail: xiangyu.hu@tum.de; Adams, Nikolaus A., E-mail: nikolaus.adams@tum.de
2017-04-15
In this paper, we propose a novel domain decomposition method for large-scale simulations in continuum mechanics by merging the concepts of Centroidal Voronoi Tessellation (CVT) and Voronoi Particle dynamics (VP). The CVT is introduced to achieve a high-level compactness of the partitioning subdomains by the Lloyd algorithm which monotonically decreases the CVT energy. The number of computational elements between neighboring partitioning subdomains, which scales the communication effort for parallel simulations, is optimized implicitly as the generated partitioning subdomains are convex and simply connected with small aspect-ratios. Moreover, Voronoi Particle dynamics employing physical analogy with a tailored equation of state is developed, which relaxes the particle system towards the target partition with good load balance. Since the equilibrium is computed by an iterative approach, the partitioning subdomains exhibit locality and the incremental property. Numerical experiments reveal that the proposed Centroidal Voronoi Particle (CVP) based algorithm produces high-quality partitioning with high efficiency, independently of computational-element types. Thus it can be used for a wide range of applications in computational science and engineering.
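The Lloyd algorithm at the heart of the CVT step is essentially a k-means-style fixed point: assign elements to the nearest generator, then move each generator to its cell's centroid. A compact sketch assuming uniform element weights (the paper's Voronoi Particle relaxation is not included):

```python
import numpy as np

def lloyd_cvt(points, k, n_iter=50, seed=0):
    """Lloyd iteration toward a centroidal Voronoi tessellation of a point
    set; each sweep monotonically decreases the CVT energy."""
    rng = np.random.default_rng(seed)
    gens = points[rng.choice(len(points), k, replace=False)]
    for _ in range(n_iter):
        # Assign every point to its nearest generator (its Voronoi cell).
        d = np.linalg.norm(points[:, None, :] - gens[None, :, :], axis=2)
        cell = np.argmin(d, axis=1)
        # Move each generator to the centroid of its cell.
        for j in range(k):
            members = points[cell == j]
            if len(members):
                gens[j] = members.mean(axis=0)
    return gens, cell

parts, labels = lloyd_cvt(np.random.rand(2000, 2), k=8)   # 8 compact parts
```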
Fast non-overlapping Schwarz domain decomposition methods for solving the neutron diffusion equation
NASA Astrophysics Data System (ADS)
Jamelot, Erell; Ciarlet, Patrick
2013-05-01
Studying numerically the steady state of a nuclear core reactor is expensive, in terms of memory storage and computational time. In order to address both requirements, one can use a domain decomposition method, implemented on a parallel computer. We present here such a method for the mixed neutron diffusion equations, discretized with Raviart-Thomas-Nédélec finite elements. This method is based on the Schwarz iterative algorithm with Robin interface conditions to handle communications. We analyse this method from both the continuous and the discrete points of view, and we give some numerical results in a realistic highly heterogeneous 3D configuration. Computations are carried out with the MINOS solver of the APOLLO3® neutronics code. APOLLO3 is a registered trademark in France.
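As a concrete picture of the Schwarz iteration, here is a toy 1D Poisson problem split into two overlapping subdomains, using simple Dirichlet rather than the paper's Robin transmission conditions (grid sizes and overlap are arbitrary demo choices):

```python
import numpy as np

N = 99                                  # global interior points on (0, 1)
h = 1.0 / (N + 1)
f = np.ones(N)                          # -u'' = 1 with u(0) = u(1) = 0
u = np.zeros(N)

def subsolve(rhs, left, right):
    """Direct solve of -u'' = rhs on a subgrid with Dirichlet data."""
    n = len(rhs)
    A = (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
    b = rhs.copy()
    b[0] += left / h**2
    b[-1] += right / h**2
    return np.linalg.solve(A, b)

lo, hi = 59, 39                         # Omega1 = [0:lo), Omega2 = [hi:N)
for _ in range(20):                     # alternating Schwarz sweeps
    u[:lo] = subsolve(f[:lo], 0.0, u[lo])       # right datum from Omega2
    u[hi:] = subsolve(f[hi:], u[hi - 1], 0.0)   # left datum from Omega1

x = np.linspace(h, 1 - h, N)
print(np.abs(u - 0.5 * x * (1 - x)).max())      # converges to the exact u
```

In the paper's setting, each subsolve is a full 3D neutron diffusion solve and the Dirichlet exchange is replaced by Robin (flux plus trace) conditions, which typically converge faster and need less overlap.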
Supercomputer simulations of structure formation in the Universe
NASA Astrophysics Data System (ADS)
Ishiyama, Tomoaki
2017-06-01
We describe the implementation and performance results of our massively parallel MPI/OpenMP hybrid TreePM code for large-scale cosmological N-body simulations. For domain decomposition, a recursive multi-section algorithm is used, and the sizes of the domains are automatically set so that the total calculation time is the same for all processes. We developed a highly-tuned gravity kernel for short-range forces and a novel communication algorithm for long-range forces. For a benchmark simulation with two trillion particles, the average performance on the full system of the K computer (82,944 nodes, 663,552 cores in total) is 5.8 Pflops, which corresponds to 55% of the peak speed.
Convolution of large 3D images on GPU and its decomposition
NASA Astrophysics Data System (ADS)
Karas, Pavel; Svoboda, David
2011-12-01
In this article, we propose a method for computing the convolution of large 3D images. The convolution is performed in the frequency domain using the convolution theorem. The algorithm is accelerated on a graphics card by means of the CUDA parallel computing model. The convolution is decomposed in the frequency domain using the decimation-in-frequency algorithm. We pay attention to keeping our approach efficient in terms of both time and memory consumption, and also in terms of memory transfers between CPU and GPU, which have a significant influence on the overall computational time. We also study the implementation on multiple GPUs and compare the results between the multi-GPU and multi-CPU implementations.
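The convolution theorem the method relies on states that convolution becomes a pointwise product of spectra; in NumPy the idea reads as follows (a CPU sketch in 2D for brevity, zero-padded to the full linear-convolution size to avoid circular wrap-around):

```python
import numpy as np

def fft_convolve(img, kernel):
    """Linear convolution via the convolution theorem: multiply spectra
    pointwise after zero-padding both inputs to the full output size."""
    shape = [s1 + s2 - 1 for s1, s2 in zip(img.shape, kernel.shape)]
    F = np.fft.rfftn(img, shape) * np.fft.rfftn(kernel, shape)
    return np.fft.irfftn(F, shape)

img = np.random.rand(256, 256)
ker = np.random.rand(7, 7)
out = fft_convolve(img, ker)   # matches direct convolution up to roundoff
```

The paper's contribution is doing this for volumes too large for GPU memory, which is where the decimation-in-frequency decomposition into independently transformable pieces comes in.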
Wavelet decomposition based principal component analysis for face recognition using MATLAB
NASA Astrophysics Data System (ADS)
Sharma, Mahesh Kumar; Sharma, Shashikant; Leeprechanon, Nopbhorn; Ranjan, Aashish
2016-03-01
For the realization of face recognition systems in the static as well as in the real-time frame, algorithms such as principal component analysis, independent component analysis, linear discriminant analysis, neural networks and genetic algorithms have been used for decades. This paper discusses a wavelet decomposition based principal component analysis approach for face recognition. Principal component analysis is chosen over other algorithms due to its relative simplicity, efficiency, and robustness. Face recognition means identifying a person from facial features, and it bears some resemblance to factor analysis, i.e., the extraction of the principal components of an image. Principal component analysis suffers from some drawbacks, mainly poor discriminatory power and, in particular, the large computational load of finding eigenvectors. These drawbacks can be greatly reduced by combining wavelet transform decomposition for feature extraction with principal component analysis for pattern representation and classification, analyzing the facial images in both the space and time domains, where frequency and time are used interchangeably. From the experimental results, it is envisaged that this face recognition method yields a significant improvement in recognition rate as well as better computational efficiency.
The detection of flaws in austenitic welds using the decomposition of the time-reversal operator
NASA Astrophysics Data System (ADS)
Cunningham, Laura J.; Mulholland, Anthony J.; Tant, Katherine M. M.; Gachagan, Anthony; Harvey, Gerry; Bird, Colin
2016-04-01
The non-destructive testing of austenitic welds using ultrasound plays an important role in the assessment of the structural integrity of safety critical structures. The internal microstructure of these welds is highly scattering and can lead to the obscuration of defects when investigated by traditional imaging algorithms. This paper proposes an alternative objective method for the detection of flaws embedded in austenitic welds based on the singular value decomposition of the time-frequency domain response matrices. The distribution of the singular values is examined in the cases where a flaw exists and where there is no flaw present. A lower threshold on the singular values, specific to austenitic welds, is derived which, when exceeded, indicates the presence of a flaw. The detection criterion is successfully implemented on both synthetic and experimental data. The datasets arising from welds containing a flaw are further interrogated using the decomposition of the time-reversal operator (DORT) method and the total focusing method (TFM), and it is shown that images constructed via the DORT algorithm typically exhibit a higher signal-to-noise ratio than those constructed by the TFM algorithm.
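The detection criterion reduces to examining the leading singular values of the inter-element response matrix frequency by frequency; a schematic of that test, with the paper's weld-specific threshold replaced by a generic user-supplied bound:

```python
import numpy as np

def flaw_detected(K_f, threshold):
    """K_f: (n_freq, n_tx, n_rx) array of inter-element response matrices.
    Flags a flaw when the leading singular value exceeds the threshold at
    any frequency (generic stand-in for the paper's weld-specific bound)."""
    lead = np.array([np.linalg.svd(K, compute_uv=False)[0] for K in K_f])
    return bool(np.any(lead > threshold)), lead
```

In the DORT method proper, the singular vectors associated with those dominant singular values are then back-propagated to focus selectively on each scatterer.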
Liu, Liangbing; Tao, Chao; Liu, XiaoJun; Deng, Mingxi; Wang, Senhua; Liu, Jun
2015-10-19
Photoacoustic tomography is a promising and rapidly developing methodology of biomedical imaging. Reconstructing images from weak and noisy photoacoustic signals is an increasingly urgent problem, owing to its high benefit in extending the imaging depth and decreasing the dose of laser exposure. Based on the time-domain characteristics of photoacoustic signals, a pulse decomposition algorithm is proposed to reconstruct a photoacoustic image from signals with low signal-to-noise ratio. In this method, a photoacoustic signal is decomposed as the weighted summation of a set of pulses in the time domain. Images are reconstructed from the weight factors, which are directly related to the optical absorption coefficient. Both simulation and experiment are conducted to test the performance of the method. Numerical simulations show that when the signal-to-noise ratio is -4 dB, the proposed method decreases the reconstruction error to about 17%, in comparison with the conventional back-projection method. Moreover, it can produce acceptable images even when the signal-to-noise ratio is decreased to -10 dB. Experiments show that, when the laser fluence level is low, the proposed method achieves a relatively clean image of a hair phantom with some well preserved pattern details. The proposed method demonstrates the imaging potential of photoacoustic tomography in expanding applications.
Fast algorithm for wavefront reconstruction in XAO/SCAO with pyramid wavefront sensor
NASA Astrophysics Data System (ADS)
Shatokhina, Iuliia; Obereder, Andreas; Ramlau, Ronny
2014-08-01
We present a fast wavefront reconstruction algorithm developed for an extreme adaptive optics system equipped with a pyramid wavefront sensor on a 42m telescope. The method is called the Preprocessed Cumulative Reconstructor with domain decomposition (P-CuReD). The algorithm is based on the theoretical relationship between pyramid and Shack-Hartmann wavefront sensor data. The algorithm consists of two consecutive steps - a data preprocessing, and an application of the CuReD algorithm, which is a fast method for wavefront reconstruction from Shack-Hartmann sensor data. The closed loop simulation results show that the P-CuReD method provides the same reconstruction quality and is significantly faster than an MVM.
NASA Astrophysics Data System (ADS)
Qarib, Hossein; Adeli, Hojjat
2015-12-01
In this paper, the authors introduce a new adaptive signal processing technique for feature extraction and parameter estimation in noisy exponentially damped signals. The iterative three-stage method is based on the adroit integration of the strengths of parametric and nonparametric methods such as the multiple signal classification, matrix pencil, and empirical mode decomposition algorithms. The first stage is a new adaptive filtration or noise removal scheme. The second stage is a hybrid parametric-nonparametric signal parameter estimation technique based on an output-only system identification technique. The third stage is optimization of the estimated parameters using a combination of the primal-dual path-following interior point algorithm and a genetic algorithm. The methodology is evaluated using a synthetic signal and a signal obtained experimentally from transverse vibrations of a steel cantilever beam. The method is successful in estimating the frequencies accurately, and it further estimates the damping exponents. The proposed adaptive filtration method does not include any frequency-domain manipulation; consequently, the time-domain signal is not affected by frequency-domain and inverse transformations.
A parallel algorithm for nonlinear convection-diffusion equations
NASA Technical Reports Server (NTRS)
Scroggs, Jeffrey S.
1990-01-01
A parallel algorithm for the efficient solution of nonlinear time-dependent convection-diffusion equations with small parameter on the diffusion term is presented. The method is based on a physically motivated domain decomposition that is dictated by singular perturbation analysis. The analysis is used to determine regions where certain reduced equations may be solved in place of the full equation. The method is suitable for the solution of problems arising in the simulation of fluid dynamics. Experimental results for a nonlinear equation in two-dimensions are presented.
Employing multi-GPU power for molecular dynamics simulation: an extension of GALAMOST
NASA Astrophysics Data System (ADS)
Zhu, You-Liang; Pan, Deng; Li, Zhan-Wei; Liu, Hong; Qian, Hu-Jun; Zhao, Yang; Lu, Zhong-Yuan; Sun, Zhao-Yan
2018-04-01
We describe the algorithm of employing multi-GPU power on the basis of Message Passing Interface (MPI) domain decomposition in a molecular dynamics code, GALAMOST, which is designed for the coarse-grained simulation of soft matter. The multi-GPU version of the code is developed based on our previous single-GPU version. In multi-GPU runs, each GPU takes charge of one domain and runs the single-GPU code path. The communication between neighbouring domains follows an algorithm similar to that of the CPU-based code LAMMPS, but is optimised specifically for GPUs. We employ a memory-saving design which can enlarge the maximum system size on the same device. An optimisation algorithm is employed to prolong the update period of the neighbour list. We demonstrate good performance of multi-GPU runs on simulations of a Lennard-Jones liquid, a dissipative particle dynamics liquid, a polymer and nanoparticle composite, and two-patch particles on a workstation. Good scaling over many cluster nodes is presented for two-patch particles.
Application of higher order SVD to vibration-based system identification and damage detection
NASA Astrophysics Data System (ADS)
Chao, Shu-Hsien; Loh, Chin-Hsiung; Weng, Jian-Huang
2012-04-01
Singular value decomposition (SVD) is a powerful linear algebra tool. It is widely used in many different signal processing methods, such as principal component analysis (PCA), singular spectrum analysis (SSA), frequency domain decomposition (FDD), and the subspace identification and stochastic subspace identification methods (SI and SSI). In each case, the data is arranged appropriately in matrix form and SVD is used to extract the features of the data set. In this study, three different signal processing and system identification algorithms are proposed: SSA, SSI-COV and SSI-DATA. Based on the subspace and null-space extracted from the SVD of the data matrix, damage detection algorithms can be developed. The proposed algorithm is used to process the shaking table test data of a 6-story steel frame. Features contained in the vibration data are extracted by the proposed method, and damage detection can then be investigated from the test data of the frame structure through subspace-based and null-space-based damage indices.
Multiple image encryption scheme based on pixel exchange operation and vector decomposition
NASA Astrophysics Data System (ADS)
Xiong, Y.; Quan, C.; Tay, C. J.
2018-02-01
We propose a new multiple-image encryption scheme based on a pixel exchange operation and a basic vector decomposition in the Fourier domain. In this algorithm, the original images are first processed by a pixel exchange operator, from which scrambled images and pixel position matrices are obtained. The scrambled images are encrypted into phase information using the proposed algorithm, and phase keys are obtained from the difference between the scrambled images and synthesized vectors in a charge-coupled device (CCD) plane. The final synthesized vector is used as the input to a double random phase encoding (DRPE) scheme. In the proposed encryption scheme, the pixel position matrices and phase keys serve as additional private keys to enhance the security of the cryptosystem, which is based on a 4-f system. Numerical simulations are presented to demonstrate the feasibility and robustness of the proposed encryption scheme.
Final Report, DE-FG01-06ER25718 Domain Decomposition and Parallel Computing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Widlund, Olof B.
2015-06-09
The goal of this project is to develop and improve domain decomposition algorithms for a variety of partial differential equations, such as those of linear elasticity and electromagnetics. These iterative methods are designed for massively parallel computing systems and allow the fast solution of the very large systems of algebraic equations that arise in large-scale and complicated simulations. A special emphasis is placed on problems arising from Maxwell's equations. The approximate solvers, the preconditioners, are combined with the conjugate gradient method and must always include a solver for a coarse model in order to achieve performance that is independent of the number of processors used in the computer simulation. A recent development allows for an adaptive construction of this coarse component of the preconditioner.
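As a toy model of the ingredients mentioned, a preconditioner, conjugate gradients, and a mandatory coarse solve, the sketch below builds a one-level overlapping Schwarz preconditioner with a piecewise-constant coarse space for a 1-D Poisson matrix. Sizes, overlap, and the coarse space are illustrative assumptions, far simpler than the project's elasticity and Maxwell solvers.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n, nsub, ovl = 256, 8, 4
A = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(n, n), format='csr')
b = np.ones(n)

# Overlapping subdomains and their local LU factorizations
bounds = np.linspace(0, n, nsub + 1).astype(int)
subs = [np.arange(max(lo - ovl, 0), min(hi + ovl, n))
        for lo, hi in zip(bounds[:-1], bounds[1:])]
local = [spla.splu(A[np.ix_(s, s)].tocsc()) for s in subs]

# Coarse space: one piecewise-constant basis vector per (non-overlapping) subdomain
R0 = sp.lil_matrix((nsub, n))
for k, (lo, hi) in enumerate(zip(bounds[:-1], bounds[1:])):
    R0[k, lo:hi] = 1.0
R0 = R0.tocsr()
A0 = spla.splu((R0 @ A @ R0.T).tocsc())

def apply_preconditioner(r):
    z = np.zeros_like(r)
    for s, lu in zip(subs, local):       # additive local subdomain solves
        z[s] += lu.solve(r[s])
    z += R0.T @ A0.solve(R0 @ r)         # coarse correction, key to processor-independent rates
    return z

M = spla.LinearOperator((n, n), matvec=apply_preconditioner)
x, info = spla.cg(A, b, M=M)
print(info, np.linalg.norm(A @ x - b))   # info == 0 on convergence, small residual
```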
A novel spatial-temporal detection method of dim infrared moving small target
NASA Astrophysics Data System (ADS)
Chen, Zhong; Deng, Tao; Gao, Lei; Zhou, Heng; Luo, Song
2014-09-01
Detection of small moving targets against complex backgrounds in infrared image sequences is a major challenge for modern military Early Warning Systems (EWS) and Long-Range Strike (LRS) applications. Because of low SNR and undulating backgrounds, infrared detection of small moving targets has long been a difficult problem. To address it, a novel spatial-temporal detection method based on bi-dimensional empirical mode decomposition (EMD) and time-domain differencing is proposed in this paper. The method is entirely data-driven and does not rely on any transition kernel function, so it has strong adaptive capacity. First, we generalize the 1D EMD algorithm to the 2D case, resolving several issues that arise in 2D EMD, such as the large volume of data operations, the definition and identification of extrema in the 2D case, and boundary corrosion of two-dimensional signals. The EMD algorithm studied here adapts well to the automatic detection of small targets under low SNR and complex backgrounds. Second, considering the characteristics of moving targets, we propose an improved filtering method based on a three-frame difference, building on the original time-domain difference filtering, which greatly improves the anti-jamming capability of the algorithm. Finally, we propose a new time-space fusion method that combines 2D EMD with the improved time-domain differential filtering. Experimental results show that this method performs well for the detection of small moving infrared targets under low SNR and complex backgrounds.
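The temporal stage alone is simple enough to sketch. A minimal three-frame difference (the threshold and toy frames are assumptions) keeps only responses present in both consecutive frame pairs, which suppresses background that merely drifts between frames:

```python
import numpy as np

def three_frame_difference(f_prev, f_curr, f_next, thresh):
    d1 = np.abs(f_curr.astype(float) - f_prev)
    d2 = np.abs(f_next.astype(float) - f_curr)
    motion = np.minimum(d1, d2)          # a moving target shows up in both differences
    return motion > thresh

rng = np.random.default_rng(3)
frames = [rng.normal(100, 2, (64, 64)) for _ in range(3)]
frames[1][30, 30] += 25                  # dim target present only in the middle frame
mask = three_frame_difference(*frames, thresh=10)
print(np.argwhere(mask))                 # expected: ~ [[30, 30]]
```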
Cui, Xinchun; Niu, Yuying; Zheng, Xiangwei; Han, Yingshuai
2018-01-01
In this paper, a new color watermarking algorithm based on differential evolution is proposed. A color host image is first converted from RGB space to YIQ space, which is better suited to the human visual system. A three-level discrete wavelet transform is then applied to the luminance component Y, generating four frequency sub-bands, and singular value decomposition is performed on these sub-bands. In the watermark embedding process, a discrete wavelet transform is applied to the watermark image after scrambling encryption. The new algorithm uses a differential evolution algorithm with adaptive optimization to choose appropriate scaling factors. Experimental results show that the proposed algorithm performs well in terms of invisibility and robustness.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dong, Xue; Niu, Tianye; Zhu, Lei, E-mail: leizhu@gatech.edu
2014-05-15
Purpose: Dual-energy CT (DECT) is being increasingly used for its capability of material decomposition and energy-selective imaging. A generic problem of DECT, however, is that the decomposition process is unstable, in the sense that the relative magnitude of the decomposed signals is reduced by signal cancellation while image noise accumulates from the two CT images of independent scans. Direct image decomposition therefore leads to severe degradation of the signal-to-noise ratio of the resultant images. Existing noise suppression techniques are typically implemented in DECT with the reconstruction and decomposition procedures performed independently, which does not exploit the statistical properties of the decomposed images during reconstruction for noise reduction. In this work, the authors propose an iterative approach that combines the reconstruction and signal decomposition procedures to minimize DECT image noise without noticeable loss of resolution. Methods: The proposed algorithm is formulated as an optimization problem which balances the data fidelity and the total variation of the decomposed images in one framework, and the decomposition step is carried out iteratively together with the reconstruction. The noise in the CT images from the proposed algorithm becomes well correlated even though the noise of the raw projections is independent between the two CT scans. Owing to this feature, the proposed algorithm avoids noise accumulation during the decomposition process. The authors evaluate the method's performance on noise suppression and spatial resolution using phantom studies and compare the algorithm with conventional denoising approaches as well as with combined iterative reconstruction methods using different forms of regularization. Results: On the Catphan®600 phantom, the proposed method outperforms the existing denoising methods in preserving spatial resolution at the same level of noise suppression, i.e., a reduction of noise standard deviation by one order of magnitude. This improvement is mainly attributed to the high noise correlation in the CT images reconstructed by the proposed algorithm. Iterative reconstruction using different regularization, including quadratic or q-generalized Gaussian Markov random field regularization, achieves similar noise suppression from high noise correlation; however, the proposed TV regularization obtains better edge-preserving performance. Studies of electron density measurement also show that the method reduces the average estimation error from 9.5% to 7.1%. On the anthropomorphic head phantom, the proposed method suppresses the noise standard deviation of the decomposed images by a factor of ~14 without blurring the fine structures in the sinus area. Conclusions: The authors propose a practical method for DECT image reconstruction which combines image reconstruction and material decomposition into one optimization framework. Compared to existing approaches, the method achieves superior DECT imaging performance with respect to decomposition accuracy, noise reduction, and spatial resolution.
López-Rodríguez, Patricia; Escot-Bocanegra, David; Fernández-Recio, Raúl; Bravo, Ignacio
2015-01-01
Radar high resolution range profiles are widely used among the target recognition community for the detection and identification of flying targets. In this paper, singular value decomposition is applied to extract the relevant information and to model each aircraft as a subspace. The identification algorithm is based on the angle between subspaces and takes place in a transformed domain. In order to have a wide database of radar signatures and evaluate the performance, simulated range profiles are used as the recognition database, while the test samples comprise actual range profiles collected in a measurement campaign. Thanks to the modeling of aircraft as subspaces, only the valuable information of each target is used in the recognition process. Thus, one of the main advantages of using singular value decomposition is that it helps to overcome the notable dissimilarities in shape and signal-to-noise ratio between actual and simulated profiles due to their difference in nature. Despite these differences, the recognition rates obtained with the algorithm are quite promising. PMID:25551484
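A minimal version of the decision rule, with synthetic stand-ins for the range-profile database, might look as follows; the subspace rank and data sizes are illustrative assumptions.

```python
import numpy as np
from scipy.linalg import subspace_angles

def class_subspace(profiles, rank):
    """profiles: (n_range_bins, n_samples) training matrix for one class."""
    U, _, _ = np.linalg.svd(profiles, full_matrices=False)
    return U[:, :rank]

def classify(x, subspaces):
    # Angle between the 1-D span of the test profile and each class subspace
    angles = [subspace_angles(S, x[:, None])[0] for S in subspaces]
    return int(np.argmin(angles))

rng = np.random.default_rng(4)
basis_a, basis_b = rng.normal(size=(128, 3)), rng.normal(size=(128, 3))
train_a = basis_a @ rng.normal(size=(3, 40))      # 40 training profiles per class
train_b = basis_b @ rng.normal(size=(3, 40))
subspaces = [class_subspace(train_a, 3), class_subspace(train_b, 3)]
test = basis_a @ rng.normal(size=3) + 0.1 * rng.normal(size=128)
print(classify(test, subspaces))                  # -> 0 (class A)
```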
NASA Technical Reports Server (NTRS)
Keyes, David E.; Smooke, Mitchell D.
1987-01-01
A parallelized finite difference code based on the Newton method for systems of nonlinear elliptic boundary value problems in two dimensions is analyzed in terms of computational complexity and parallel efficiency. An approximate cost function depending on 15 dimensionless parameters is derived for algorithms based on stripwise and boxwise decompositions of the domain and a one-to-one assignment of the strip or box subdomains to processors. The sensitivity of the cost functions to the parameters is explored in regions of parameter space corresponding to model small-order systems with inexpensive function evaluations and also a coupled system of nineteen equations with very expensive function evaluations. The algorithm was implemented on the Intel Hypercube, and some experimental results for the model problems with stripwise decompositions are presented and compared with the theory. In the context of computational combustion problems, multiprocessors of either message-passing or shared-memory type may be employed with stripwise decompositions to realize speedup of O(n), where n is mesh resolution in one direction, for reasonable n.
Independent EEG Sources Are Dipolar
Delorme, Arnaud; Palmer, Jason; Onton, Julie; Oostenveld, Robert; Makeig, Scott
2012-01-01
Independent component analysis (ICA) and blind source separation (BSS) methods are increasingly used to separate individual brain and non-brain source signals mixed by volume conduction in electroencephalographic (EEG) and other electrophysiological recordings. We compared results of decomposing thirteen 71-channel human scalp EEG datasets by 22 ICA and BSS algorithms, assessing the pairwise mutual information (PMI) in scalp channel pairs, the remaining PMI in component pairs, the overall mutual information reduction (MIR) effected by each decomposition, and decomposition ‘dipolarity’ defined as the number of component scalp maps matching the projection of a single equivalent dipole with less than a given residual variance. The least well-performing algorithm was principal component analysis (PCA); best performing were AMICA and other likelihood/mutual information based ICA methods. Though these and other commonly-used decomposition methods returned many similar components, across 18 ICA/BSS algorithms mean dipolarity varied linearly with both MIR and with PMI remaining between the resulting component time courses, a result compatible with an interpretation of many maximally independent EEG components as being volume-conducted projections of partially-synchronous local cortical field activity within single compact cortical domains. To encourage further method comparisons, the data and software used to prepare the results have been made available (http://sccn.ucsd.edu/wiki/BSSComparison). PMID:22355308
On some Aitken-like acceleration of the Schwarz method
NASA Astrophysics Data System (ADS)
Garbey, M.; Tromeur-Dervout, D.
2002-12-01
In this paper we present a family of domain decomposition methods based on Aitken-like acceleration of the Schwarz method, viewed as an iterative procedure with a linear rate of convergence. We first present the so-called Aitken-Schwarz procedure for linear differential operators; the solver can be a direct solver when applied to the Helmholtz problem with a five-point finite difference scheme on regular grids. We then introduce the Steffensen-Schwarz variant, an iterative domain decomposition solver that can be applied to linear and nonlinear problems. We show that these solvers have reasonable numerical efficiency compared to classical fast solvers for the Poisson problem or to multigrid for more general linear and nonlinear elliptic problems. The salient feature of our method, however, is that the algorithm has high tolerance to slow networks in the context of distributed parallel computing and is attractive, generally speaking, for computer architectures whose performance is limited by memory bandwidth rather than by the flop performance of the CPU. This is nowadays the case for most parallel computers based on RISC processor architectures. We illustrate this highly desirable property of our algorithm with large-scale computing experiments.
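The scalar mechanism behind the acceleration is worth a few lines: if the interface trace converges linearly, three successive Schwarz iterates determine the limit via the Aitken delta-squared formula, exactly so when a single linear mode dominates. A minimal entrywise sketch:

```python
import numpy as np

def aitken(x0, x1, x2):
    """Aitken delta-squared extrapolation, applied entrywise."""
    denom = x2 - 2 * x1 + x0
    safe = np.abs(denom) > 1e-14          # avoid division where already converged
    out = x2.copy()
    out[safe] = x2[safe] - (x2[safe] - x1[safe]) ** 2 / denom[safe]
    return out

# Toy fixed-point iteration with linear rate 0.9 toward the limit L
L = np.array([1.0, -2.0, 0.5])
x0 = np.zeros(3)
x1 = L + 0.9 * (x0 - L)
x2 = L + 0.9 * (x1 - L)
print(aitken(x0, x1, x2))   # recovers L exactly for a single linear mode
```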
Spatiotemporal Domain Decomposition for Massive Parallel Computation of Space-Time Kernel Density
NASA Astrophysics Data System (ADS)
Hohl, A.; Delmelle, E. M.; Tang, W.
2015-07-01
Accelerated processing capabilities are deemed critical when conducting analysis on spatiotemporal datasets of increasing size, diversity, and availability. High-performance parallel computing offers the capacity to solve computationally demanding problems within a limited timeframe, but likewise poses the challenge of preventing processing inefficiency due to workload imbalance between computing resources. Therefore, when designing new algorithms capable of implementing parallel strategies, careful spatiotemporal domain decomposition is necessary to account for heterogeneity in the data. In this study, we perform octree-based adaptive decomposition of the spatiotemporal domain for parallel computation of the space-time kernel density. In order to avoid edge effects near subdomain boundaries, we establish spatiotemporal buffers that include adjacent data points lying within the spatial and temporal kernel bandwidths. We then quantify the computational intensity of each subdomain to balance workloads among processors. We illustrate the benefits of our methodology using a space-time epidemiological dataset of Dengue fever, an infectious vector-borne disease that poses a severe threat to communities in tropical climates. Our parallel implementation of the kernel density reaches substantial speedup compared to sequential processing and achieves a high level of workload balance among processors due to great accuracy in quantifying computational intensity. Our approach is portable to other space-time analytical tests.
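The kernel evaluated inside every subdomain explains why finite buffers suffice: the product kernel vanishes beyond one spatial and one temporal bandwidth. A small sketch, with Epanechnikov kernels and normalizing constants assumed for illustration (the paper's exact kernel may differ):

```python
import numpy as np

def stkde(grid_xy, grid_t, pts_xy, pts_t, hs, ht):
    """Space-time kernel density (up to kernel constants) at evaluation points."""
    d2 = ((grid_xy[:, None, :] - pts_xy[None, :, :]) ** 2).sum(-1)
    u2 = d2 / hs**2                                  # squared scaled spatial distance
    v = np.abs(grid_t[:, None] - pts_t[None, :]) / ht
    ks = np.where(u2 < 1, 1 - u2, 0.0)               # spatial kernel: zero beyond hs
    kt = np.where(v < 1, 1 - v**2, 0.0)              # temporal kernel: zero beyond ht
    return (ks * kt).sum(axis=1) / (len(pts_t) * hs**2 * ht)

rng = np.random.default_rng(5)
pts_xy, pts_t = rng.uniform(0, 10, (500, 2)), rng.uniform(0, 30, 500)
eval_xy = np.array([[5.0, 5.0], [9.0, 1.0]])
eval_t = np.array([15.0, 2.0])
print(stkde(eval_xy, eval_t, pts_xy, pts_t, hs=2.0, ht=5.0))
```

Only points within (hs, ht) of a subdomain contribute to its estimates, which is exactly the buffer width used for the decomposition.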
Approximating the 0-1 Multiple Knapsack Problem with Agent Decomposition and Market Negotiation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Smolinski, B.
The 0-1 multiple knapsack problem appears in many domains, from financial portfolio management to cargo ship stowing. Methods for solving it range from approximate algorithms, such as greedy algorithms, to exact algorithms, such as branch and bound. Approximate algorithms have no bounds on how poorly they perform, and exact algorithms can suffer from exponential time and space complexities on large data sets. This paper introduces a market model based on agent decomposition and market auctions for approximating the 0-1 multiple knapsack problem, and an algorithm that implements the model (M(x)). M(x) traverses the solution space rather than getting caught in a local maximum, overcoming an inherent problem of many greedy algorithms. The use of agents ensures that infeasible solutions are not considered while traversing the solution space and that the traversal is not merely random but directed. M(x) is compared to a branch and bound algorithm (BB) and a simple greedy algorithm with a random shuffle (G(x)). The results suggest that M(x) is a good algorithm for approximating the 0-1 multiple knapsack problem: it almost always found solutions close to optimal in a fraction of the time it took BB to run and with much less memory on large test data sets, and it usually performed better than G(x) on hard problems with correlated data.
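For reference, the kind of baseline G(x) represents fits in a dozen lines: shuffle to break ties at random, then place items by value density into the first knapsack with room. This is an assumed reading of "greedy with a random shuffle", not the paper's code; the agent/auction mechanics of M(x) are beyond this sketch.

```python
import random

def greedy_mkp(items, capacities, seed=0):
    """items: list of (value, weight); returns {item: knapsack} and total value."""
    rng = random.Random(seed)
    order = list(range(len(items)))
    rng.shuffle(order)                                   # random tie-breaking
    order.sort(key=lambda i: items[i][0] / items[i][1], reverse=True)
    free = list(capacities)
    assign, total = {}, 0
    for i in order:
        value, weight = items[i]
        for k, cap in enumerate(free):                   # first knapsack with room
            if weight <= cap:
                free[k] -= weight
                assign[i] = k
                total += value
                break
    return assign, total

items = [(10, 5), (8, 4), (6, 3), (4, 5), (9, 6)]
print(greedy_mkp(items, capacities=[10, 8]))
```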
A resilient domain decomposition polynomial chaos solver for uncertain elliptic PDEs
NASA Astrophysics Data System (ADS)
Mycek, Paul; Contreras, Andres; Le Maître, Olivier; Sargsyan, Khachik; Rizzi, Francesco; Morris, Karla; Safta, Cosmin; Debusschere, Bert; Knio, Omar
2017-07-01
A resilient method is developed for the solution of uncertain elliptic PDEs on extreme scale platforms. The method is based on a hybrid domain decomposition, polynomial chaos (PC) framework that is designed to address soft faults. Specifically, parallel and independent solves of multiple deterministic local problems are used to define PC representations of local Dirichlet boundary-to-boundary maps that are used to reconstruct the global solution. A LAD-lasso type regression is developed for this purpose. The performance of the resulting algorithm is tested on an elliptic equation with an uncertain diffusivity field. Different test cases are considered in order to analyze the impacts of correlation structure of the uncertain diffusivity field, the stochastic resolution, as well as the probability of soft faults. In particular, the computations demonstrate that, provided sufficiently many samples are generated, the method effectively overcomes the occurrence of soft faults.
Efficient implementation of a 3-dimensional ADI method on the iPSC/860
DOE Office of Scientific and Technical Information (OSTI.GOV)
Van der Wijngaart, R.F.
1993-12-31
A comparison is made between several domain decomposition strategies for the solution of three-dimensional partial differential equations on a MIMD distributed memory parallel computer. The grids used are structured, and the numerical algorithm is ADI. Important implementation issues regarding load balancing, storage requirements, network latency, and overlap of computations and communications are discussed. Results of the solution of the three-dimensional heat equation on the Intel iPSC/860 are presented for the three most viable methods. It is found that the Bruno-Cappello decomposition delivers optimal computational speed through an almost complete elimination of processor idle time, while providing good memory efficiency.
Conception of discrete systems decomposition algorithm using p-invariants and hypergraphs
NASA Astrophysics Data System (ADS)
Stefanowicz, Ł.
2016-09-01
In this article, the author presents an idea for a decomposition algorithm for discrete systems described by Petri nets using p-invariants. The decomposition process is significant from the point of view of discrete system design, because it allows separation into smaller sequential parts. The proposed algorithm uses a modified Martinez-Silva method as well as the author's selection algorithm. The developed method is a good complement to classical decomposition algorithms based on graphs and hypergraphs.
HMC algorithm with multiple time scale integration and mass preconditioning
NASA Astrophysics Data System (ADS)
Urbach, C.; Jansen, K.; Shindler, A.; Wenger, U.
2006-01-01
We present a variant of the HMC algorithm with mass preconditioning (Hasenbusch acceleration) and multiple time scale integration. We have tested this variant for standard Wilson fermions at β=5.6 and at pion masses ranging from 380 to 680 MeV. We show that in this situation its performance is comparable to the recently proposed HMC variant with domain decomposition as preconditioner. We give an update of the "Berlin Wall" figure, comparing the performance of our variant of the HMC algorithm to other published performance data. Advantages of the HMC algorithm with mass preconditioning and multiple time scale integration are that it is straightforward to implement and can be used in combination with a wide variety of lattice Dirac operators.
Hasani, Mojtaba H; Gharibzadeh, Shahriar; Farjami, Yaghoub; Tavakkoli, Jahan
2013-09-01
Various numerical algorithms have been developed to solve the Khokhlov-Kuznetsov-Zabolotskaya (KZK) parabolic nonlinear wave equation. In this work, a generalized time-domain numerical algorithm is proposed to solve the diffraction term of the KZK equation. This algorithm solves the transverse Laplacian operator of the KZK equation in three-dimensional (3D) Cartesian coordinates using a finite-difference method based on the five-point implicit backward and five-point Crank-Nicolson finite difference discretization techniques. This leads to a more uniform discretization of the Laplacian operator, which in turn requires fewer grid nodes without compromising accuracy in the diffraction term. In addition, a new empirical algorithm based on the LU decomposition technique is proposed to solve the system of linear equations obtained from this discretization. The proposed empirical algorithm improves calculation speed and memory usage, while the order of computational complexity remains linear in the calculation of the diffraction term of the KZK equation. To evaluate the accuracy of the proposed algorithm, two previously published algorithms are used as comparison references: the conventional 2D Texas code and its generalization to 3D geometries. The results show that the accuracy/efficiency performance of the proposed algorithm is comparable with established time-domain methods.
High performance computation of radiative transfer equation using the finite element method
NASA Astrophysics Data System (ADS)
Badri, M. A.; Jolivet, P.; Rousseau, B.; Favennec, Y.
2018-05-01
This article deals with an efficient strategy for numerically simulating radiative transfer phenomena using distributed computing. The finite element method, alongside the discrete ordinate method, is used for the spatio-angular discretization of the monochromatic steady-state radiative transfer equation in anisotropically scattering media. Two very different methods of parallelization, angular and spatial decomposition, are presented. To do so, the finite element method is used in a vectorial way. A detailed comparison of scalability, performance, and efficiency on thousands of processors is established for two- and three-dimensional heterogeneous test cases. Timings show that both algorithms scale well when using proper preconditioners. It is also observed that our angular decomposition scheme outperforms our domain decomposition method. Overall, we perform numerical simulations at scales that were previously unattainable by standard radiative transfer equation solvers.
NASA Technical Reports Server (NTRS)
Chew, W. C.; Song, J. M.; Lu, C. C.; Weedon, W. H.
1995-01-01
In the first phase of our work, we have concentrated on laying the foundation to develop fast algorithms, including the use of recursive structure like the recursive aggregate interaction matrix algorithm (RAIMA), the nested equivalence principle algorithm (NEPAL), the ray-propagation fast multipole algorithm (RPFMA), and the multi-level fast multipole algorithm (MLFMA). We have also investigated the use of curvilinear patches to build a basic method of moments code where these acceleration techniques can be used later. In the second phase, which is mainly reported on here, we have concentrated on implementing three-dimensional NEPAL on a massively parallel machine, the Connection Machine CM-5, and have been able to obtain some 3D scattering results. In order to understand the parallelization of codes on the Connection Machine, we have also studied the parallelization of 3D finite-difference time-domain (FDTD) code with PML material absorbing boundary condition (ABC). We found that simple algorithms like the FDTD with material ABC can be parallelized very well allowing us to solve within a minute a problem of over a million nodes. In addition, we have studied the use of the fast multipole method and the ray-propagation fast multipole algorithm to expedite matrix-vector multiplication in a conjugate-gradient solution to integral equations of scattering. We find that these methods are faster than LU decomposition for one incident angle, but are slower than LU decomposition when many incident angles are needed as in the monostatic RCS calculations.
Multiscale Simulations of Magnetic Island Coalescence
NASA Technical Reports Server (NTRS)
Dorelli, John C.
2010-01-01
We describe a new interactive parallel Adaptive Mesh Refinement (AMR) framework written in the Python programming language. This new framework, PyAMR, hides the details of parallel AMR data structures and algorithms (e.g., domain decomposition, grid partitioning, and inter-process communication), allowing the user to focus on developing algorithms for advancing the solution of a system of partial differential equations on a single uniform mesh. We demonstrate the use of PyAMR by simulating the pairwise coalescence of magnetic islands using the resistive Hall MHD equations. Techniques for coupling different physics models on different levels of the AMR grid hierarchy are discussed.
Evaluating the performance of distributed approaches for modal identification
NASA Astrophysics Data System (ADS)
Krishnan, Sriram S.; Sun, Zhuoxiong; Irfanoglu, Ayhan; Dyke, Shirley J.; Yan, Guirong
2011-04-01
In this paper, two modal identification approaches appropriate for use in a distributed computing environment are applied to a full-scale, complex structure: the natural excitation technique (NExT) used in conjunction with a condensed eigensystem realization algorithm (ERA), and frequency domain decomposition with peak-picking (FDD-PP). Both are applied to sensor data acquired from a 57.5-ft, 10-bay highway sign truss structure. Monte Carlo simulations are performed on a numerical example to investigate the statistical properties and noise sensitivity of the two distributed algorithms. Experimental results are provided and discussed.
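Basic FDD with peak-picking fits in a short function: form the cross-spectral density matrix, take an SVD per frequency line, and pick peaks of the first singular value; the first singular vector at a peak approximates the mode shape. Welch settings, the peak-prominence rule, and the two-channel toy record are assumptions.

```python
import numpy as np
from scipy.signal import csd, find_peaks

def fdd(Y, fs, nperseg=1024):
    """Y: (n_channels, n_samples) ambient responses."""
    n = Y.shape[0]
    f, _ = csd(Y[0], Y[0], fs=fs, nperseg=nperseg)        # frequency grid
    G = np.empty((len(f), n, n), dtype=complex)           # cross-spectral density matrix
    for i in range(n):
        for j in range(n):
            _, G[:, i, j] = csd(Y[i], Y[j], fs=fs, nperseg=nperseg)
    s1 = np.empty(len(f))
    shapes = np.empty((len(f), n), dtype=complex)
    for k in range(len(f)):
        U, s, _ = np.linalg.svd(G[k])
        s1[k], shapes[k] = s[0], U[:, 0]                  # first singular value / vector
    peaks, _ = find_peaks(s1, prominence=s1.max() * 0.05)
    return f, s1, shapes, peaks

rng = np.random.default_rng(12)
t = np.arange(0, 120, 0.01)                               # two-channel toy: one 3 Hz mode
mode = np.sin(2 * np.pi * 3 * t)
Y = np.vstack([mode + 0.5 * rng.standard_normal(t.size),
               0.8 * mode + 0.5 * rng.standard_normal(t.size)])
f, s1, shapes, peaks = fdd(Y, fs=100.0)
print(f[peaks])                                           # expect a peak near 3 Hz
```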
Reconstructing photorealistic 3D models from image sequence using domain decomposition method
NASA Astrophysics Data System (ADS)
Xiong, Hanwei; Pan, Ming; Zhang, Xiangwei
2009-11-01
In the fields of industrial design, artistic design, and heritage conservation, physical objects are usually digitalized by reverse engineering through 3D scanning. Structured light and photogrammetry are the two main methods for acquiring 3D information, and both are expensive; even when these expensive instruments are used, photorealistic 3D models are seldom obtained. In this paper, a new method to reconstruct photorealistic 3D models using a single camera is proposed. A square plate bearing coded marks is used to hold the object, and a sequence of about 20 images is taken. From the coded marks, the images are calibrated, and a snake algorithm is used to segment the object from the background. A rough 3D model is obtained using a shape-from-silhouettes algorithm. The silhouettes are decomposed into combinations of convex curves, which are used to partition the rough 3D model into convex mesh patches. For each patch, the multi-view photo-consistency constraints and smoothness regularizations are expressed as a finite element formulation that can be resolved locally, with information exchanged along the patch boundaries. The rough model is deformed into a fine 3D model through this domain decomposition finite element method. Textures are assigned to each mesh element, and a photorealistic 3D model is finally obtained. A toy pig is used to verify the algorithm, and the results are encouraging.
Interface conditions for domain decomposition with radical grid refinement
NASA Technical Reports Server (NTRS)
Scroggs, Jeffrey S.
1991-01-01
Interface conditions for coupling the domains in a physically motivated domain decomposition method are discussed. The domain decomposition is based on an asymptotic-induced method for the numerical solution of hyperbolic conservation laws with small viscosity. The method consists of multiple stages. The first stage obtains a first approximation using a first-order method, such as the Godunov scheme. Subsequent stages of the method involve solving internal-layer problems via domain decomposition. The method is derived and justified via singular perturbation techniques.
Information criteria for quantifying loss of reversibility in parallelized KMC
NASA Astrophysics Data System (ADS)
Gourgoulias, Konstantinos; Katsoulakis, Markos A.; Rey-Bellet, Luc
2017-01-01
Parallel Kinetic Monte Carlo (KMC) is a potent tool to simulate stochastic particle systems efficiently. However, despite literature on quantifying domain decomposition errors of the particle system for this class of algorithms in the short and in the long time regime, no study yet explores and quantifies the loss of time-reversibility in Parallel KMC. Inspired by concepts from non-equilibrium statistical mechanics, we propose the entropy production per unit time, or entropy production rate, given in terms of an observable and a corresponding estimator, as a metric that quantifies the loss of reversibility. Typically, this is a quantity that cannot be computed explicitly for Parallel KMC, which is why we develop a posteriori estimators that have good scaling properties with respect to the size of the system. Through these estimators, we can connect the different parameters of the scheme, such as the communication time step of the parallelization, the choice of the domain decomposition, and the computational schedule, with its performance in controlling the loss of reversibility. From this point of view, the entropy production rate can be seen both as an information criterion to compare the reversibility of different parallel schemes and as a tool to diagnose reversibility issues with a particular scheme. As a demonstration, we use Sandia Lab's SPPARKS software to compare different parallelization schemes and different domain (lattice) decompositions.
A Fast Solver for Implicit Integration of the Vlasov-Poisson System in the Eulerian Framework
DOE Office of Scientific and Technical Information (OSTI.GOV)
Garrett, C. Kristopher; Hauck, Cory D.
In this paper, we present a domain decomposition algorithm to accelerate the solution of Eulerian-type discretizations of the linear, steady-state Vlasov equation. The steady-state solver then forms a key component in the implementation of fully implicit or nearly fully implicit temporal integrators for the nonlinear Vlasov-Poisson system. The solver relies on a particular decomposition of phase space that enables the use of sweeping techniques commonly used in radiation transport applications. The original linear system for the phase-space unknowns is then replaced by a smaller linear system involving only unknowns on the boundary between subdomains, which can be solved efficiently with Krylov methods such as GMRES. Steady-state solves are combined to form an implicit Runge-Kutta time integrator, and the Vlasov equation is coupled self-consistently to the Poisson equation via a linearized procedure or a nonlinear fixed-point method for the electric field. Finally, numerical results for standard test problems demonstrate the efficiency of the domain decomposition approach when compared to the direct application of an iterative solver to the original linear system.
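The algebraic core of that interface reduction is a Schur complement solve. The sketch below eliminates the interior unknowns of a 1-D Laplacian split into two subdomains (a stand-in for the phase-space sweeps) and solves the one-unknown interface system with GMRES, without ever forming the Schur matrix:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 101                                   # one shared interface node at n // 2
A = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(n, n), format='csr')
f = np.ones(n)
B = np.array([n // 2])                    # interface unknowns
I = np.setdiff1d(np.arange(n), B)         # interior unknowns (both subdomains)

AII = spla.splu(A[np.ix_(I, I)].tocsc())  # interior solves; block-diagonal across subdomains
AIB, ABI, ABB = A[np.ix_(I, B)], A[np.ix_(B, I)], A[np.ix_(B, B)]

def schur_mv(xB):                         # apply S = ABB - ABI * AII^{-1} * AIB
    return ABB @ xB - ABI @ AII.solve(AIB @ xB)

S = spla.LinearOperator((len(B), len(B)), matvec=schur_mv)
rhs = f[B] - ABI @ AII.solve(f[I])        # condensed right-hand side
xB, info = spla.gmres(S, rhs)             # small interface system
xI = AII.solve(f[I] - AIB @ xB)           # back-substitute for the interiors
x = np.empty(n); x[I], x[B] = xI, xB
print(info, np.linalg.norm(A @ x - f))    # residual ~ 0
```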
Detection of small surface defects using DCT based enhancement approach in machine vision systems
NASA Astrophysics Data System (ADS)
He, Fuqiang; Wang, Wen; Chen, Zichen
2005-12-01
Utilizing a DCT-based enhancement approach, an improved small-defect detection algorithm for real-time leather surface inspection was developed. A two-stage decomposition procedure is proposed to extract an odd-odd frequency matrix after the digital image has been transformed to the DCT domain. A reverse cumulative sum algorithm is then proposed to detect the transition points of the gentle curves plotted from the odd-odd frequency matrix. The best radius of the cutting sector is computed from the transition points, and a high-pass filtering operation is applied. The filtered image is then inverse-transformed back to the spatial domain. Finally, the restored image is segmented by an entropy method and defect features are calculated. Experimental results show that the proposed method achieves a small-defect detection rate of 94%.
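The enhancement chain can be miniaturized as follows: 2-D DCT, removal of a low-frequency sector of radius r, inverse DCT, thresholding. The adaptive radius selection by the reverse cumulative sum algorithm is replaced here by a fixed r, and the synthetic "leather" image is an assumption.

```python
import numpy as np
from scipy.fft import dctn, idctn

def dct_highpass(img, r):
    C = dctn(img.astype(float), norm='ortho')
    u, v = np.ogrid[:C.shape[0], :C.shape[1]]
    C[u**2 + v**2 < r**2] = 0.0            # cut the low-frequency sector
    return idctn(C, norm='ortho')

rng = np.random.default_rng(6)
x, y = np.meshgrid(np.linspace(0, 3, 128), np.linspace(0, 3, 128))
leather = 20 * np.sin(x) + rng.normal(0, 1, (128, 128))   # smooth background texture
leather[60:63, 60:63] += 8                                # small, dim defect
restored = dct_highpass(leather, r=10)                    # background removed, defect kept
mask = restored > restored.mean() + 4 * restored.std()
print(np.argwhere(mask)[:5])                              # pixels inside the defect patch
```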
SVD compression for magnetic resonance fingerprinting in the time domain.
McGivney, Debra F; Pierre, Eric; Ma, Dan; Jiang, Yun; Saybasili, Haris; Gulani, Vikas; Griswold, Mark A
2014-12-01
Magnetic resonance (MR) fingerprinting is a technique for acquiring and processing MR data that simultaneously provides quantitative maps of different tissue parameters through a pattern recognition algorithm. A predefined dictionary models the possible signal evolutions, simulated using the Bloch equations with different combinations of various MR parameters, and pattern recognition is completed by computing the inner product between the observed signal and each of the predicted signals within the dictionary. Though this matching algorithm has been shown to accurately predict the MR parameters of interest, a more efficient method of obtaining the quantitative images is desirable. We propose to compress the dictionary using the singular value decomposition, which provides a low-rank approximation. By compressing the size of the dictionary in the time domain, we are able to speed up the pattern recognition algorithm by a factor of 3.4 to 4.8 without sacrificing the high signal-to-noise ratio of the original scheme presented previously.
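The compression step is a few lines of linear algebra: project the dictionary and the measured signals onto the top-k left singular vectors, then match with inner products in the k-dimensional space. In the sketch below, a smooth two-parameter family of damped sinusoids stands in for the Bloch-simulated dictionary; sizes, rank k, and noise level are illustrative assumptions, not the paper's values.

```python
import numpy as np

t = np.linspace(0, 1, 1000)[:, None]
D = np.hstack([np.exp(-d * t) * np.sin(2 * np.pi * f * t)
               for d in np.linspace(0.5, 5, 10)
               for f in np.linspace(1, 10, 100)])
D /= np.linalg.norm(D, axis=0)                # 1000 time points x 1000 unit-norm atoms

U, s, Vh = np.linalg.svd(D, full_matrices=False)
k = 50
Uk = U[:, :k]                                 # top-k temporal singular vectors
Dk = Uk.T @ D                                 # compressed dictionary, k x atoms

rng = np.random.default_rng(7)
x = D[:, 321] + 0.005 * rng.standard_normal(1000)   # noisy observed evolution
print(np.argmax(np.abs(D.T @ x)),             # full-length inner-product match
      np.argmax(np.abs(Dk.T @ (Uk.T @ x))))   # same match at roughly k/1000 of the work
```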
NASA Astrophysics Data System (ADS)
Pioldi, Fabio; Rizzi, Egidio
2017-07-01
Output-only structural identification is developed by a refined Frequency Domain Decomposition (rFDD) approach, towards assessing the current modal properties of heavily damped buildings (a challenging identification task) under strong ground motions. Structural responses from earthquake excitations are taken as input signals for the identification algorithm. A new dedicated computational procedure, based on coupled Chebyshev Type II bandpass filters, is outlined for the effective estimation of natural frequencies, mode shapes, and modal damping ratios. The identification technique is also coupled with a Gabor Wavelet Transform, resulting in an effective and self-contained time-frequency analysis framework. Simulated response signals generated by shear-type frames (with variable structural features) are used as a necessary validation condition. In this context, use is made of a complete set of seismic records taken from the FEMA P695 database, i.e., all 44 "Far-Field" (22 NS, 22 WE) earthquake signals. The modal estimates are statistically compared to their target values, proving the accuracy of the developed algorithm in providing prompt and accurate estimates of all current strong-ground-motion modal parameters. At this stage, such an analysis tool may be employed for convenient application in the realm of Earthquake Engineering, towards potential Structural Health Monitoring and damage detection purposes.
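The filtering ingredient maps directly onto standard SciPy calls: a Chebyshev Type II bandpass (flat passband, equiripple stopband attenuation), applied forward-backward so that modal phase estimates are not biased. Filter order, stopband attenuation, band edges, and the toy record are illustrative assumptions:

```python
import numpy as np
from scipy.signal import cheby2, sosfiltfilt

fs = 200.0                                  # sampling rate, Hz
# 8th order, 40 dB stopband attenuation, 0.5-5 Hz passband
sos = cheby2(8, 40, [0.5, 5.0], btype='bandpass', fs=fs, output='sos')

rng = np.random.default_rng(8)
t = np.arange(0, 60, 1 / fs)
accel = np.sin(2 * np.pi * 1.7 * t) + 0.5 * rng.standard_normal(t.size)
band = sosfiltfilt(sos, accel)              # zero-phase: isolates the 1.7 Hz mode
```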
Optimal mapping of irregular finite element domains to parallel processors
NASA Technical Reports Server (NTRS)
Flower, J.; Otto, S.; Salama, M.
1987-01-01
Mapping the solution domain of n finite elements onto N subdomains that may be processed in parallel by N processors is optimal if the subdomain decomposition results in a well-balanced workload distribution among the processors. The problem is discussed in the context of irregular finite element domains as an important aspect of the efficient utilization of the capabilities of emerging multiprocessor computers. Finding the optimal mapping is an intractable combinatorial optimization problem, for which a satisfactory approximate solution is obtained here by analogy to a method used in statistical mechanics for simulating the annealing process in solids. The simulated annealing analogy and algorithm are described, and numerical results are given for mapping an irregular two-dimensional finite element domain containing a singularity onto the Hypercube computer.
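A compact version of the annealing analogy, with the cost taken as load variance plus cut edges (the weights and cooling schedule are assumptions), can be exercised on a small ring of elements:

```python
import numpy as np

def anneal(adj, nproc, steps=5000, T0=2.0, alpha=0.999, seed=9):
    rng = np.random.default_rng(seed)
    n = adj.shape[0]
    part = rng.integers(nproc, size=n)

    def cost(p):
        loads = np.bincount(p, minlength=nproc)
        cut = ((p[:, None] != p[None, :]) & (adj > 0)).sum() / 2
        return loads.var() + cut               # imbalance + communication proxy

    c, T = cost(part), T0
    for _ in range(steps):
        i = rng.integers(n)
        old, part[i] = part[i], rng.integers(nproc)   # propose moving one element
        c_new = cost(part)
        if c_new <= c or rng.random() < np.exp((c - c_new) / T):
            c = c_new                          # accept (Metropolis rule)
        else:
            part[i] = old                      # reject uphill move
        T *= alpha                             # cool
    return part, c

adj = np.zeros((32, 32)); idx = np.arange(32)  # ring of 32 elements
adj[idx, (idx + 1) % 32] = adj[(idx + 1) % 32, idx] = 1
print(anneal(adj, 4))    # expect ~4 contiguous blocks of 8 (4 cut edges)
```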
Domain Decomposition By the Advancing-Partition Method for Parallel Unstructured Grid Generation
NASA Technical Reports Server (NTRS)
Pirzadeh, Shahyar Z.; Zagaris, George
2009-01-01
A new method of domain decomposition has been developed for generating unstructured grids in subdomains either sequentially or using multiple computers in parallel. Domain decomposition is a crucial and challenging step for parallel grid generation. Prior methods are generally based on auxiliary, complex, and computationally intensive operations for defining partition interfaces and usually produce grids of lower quality than those generated in single domains. The new technique, referred to as "Advancing Partition," is based on the Advancing-Front method, which partitions a domain as part of the volume mesh generation in a consistent and "natural" way. The benefits of this approach are: 1) the process of domain decomposition is highly automated, 2) partitioning of domain does not compromise the quality of the generated grids, and 3) the computational overhead for domain decomposition is minimal. The new method has been implemented in NASA's unstructured grid generation code VGRID.
GPU-based parallel algorithm for blind image restoration using midfrequency-based methods
NASA Astrophysics Data System (ADS)
Xie, Lang; Luo, Yi-han; Bao, Qi-liang
2013-08-01
GPU-based general-purpose computing is a new branch of modern parallel computing, so the study of parallel algorithms specially designed for the GPU hardware architecture is of great significance. To address the high computational complexity and poor real-time performance of blind image restoration, the midfrequency-based algorithm for blind image restoration is analyzed and improved in this paper, and a midfrequency-based filtering method is used to restore the image with hardly any recursion or iteration. Combining the algorithm's data intensiveness and data-parallel character with the GPU execution model of single instruction, multiple threads, a new parallel midfrequency-based algorithm for blind image restoration, suitable for GPU stream computing, is proposed. In this algorithm, the GPU accelerates the estimation of class-G point spread functions and the midfrequency-based filtering. For better management of the GPU threads, the threads in a grid are scheduled according to the decomposition of the filtering data in the frequency domain, after optimization of data access and of the communication between host and device. The kernel parallelism structure is determined by the decomposition of the filtering data so that the transmission rate circumvents the memory bandwidth limitation. The results show that the new algorithm significantly increases operational speed and effectively improves the real-time performance of image restoration, especially for high-resolution images.
Morphological decomposition of 2-D binary shapes into convex polygons: a heuristic algorithm.
Xu, J
2001-01-01
In many morphological shape decomposition algorithms, either a shape can only be decomposed into shape components of extremely simple forms or a time consuming search process is employed to determine a decomposition. In this paper, we present a morphological shape decomposition algorithm that decomposes a two-dimensional (2-D) binary shape into a collection of convex polygonal components. A single convex polygonal approximation for a given image is first identified. This first component is determined incrementally by selecting a sequence of basic shape primitives. These shape primitives are chosen based on shape information extracted from the given shape at different scale levels. Additional shape components are identified recursively from the difference image between the given image and the first component. Simple operations are used to repair certain concavities caused by the set difference operation. The resulting hierarchical structure provides descriptions for the given shape at different detail levels. The experiments show that the decomposition results produced by the algorithm seem to be in good agreement with the natural structures of the given shapes. The computational cost of the algorithm is significantly lower than that of an earlier search-based convex decomposition algorithm. Compared to nonconvex decomposition algorithms, our algorithm allows accurate approximations for the given shapes at low coding costs.
A general framework of noise suppression in material decomposition for dual-energy CT
DOE Office of Scientific and Technical Information (OSTI.GOV)
Petrongolo, Michael; Dong, Xue; Zhu, Lei, E-mail: leizhu@gatech.edu
Purpose: As a general problem of dual-energy CT (DECT), noise amplification in material decomposition severely reduces the signal-to-noise ratio on the decomposed images compared to that of the original CT images. In this work, the authors propose a general framework for noise suppression in material decomposition for DECT. The method is based on an iterative algorithm recently developed in their group for image-domain decomposition of DECT, extended to include nonlinear decomposition models. The generalized framework of iterative DECT decomposition enables beam-hardening correction with simultaneous noise suppression, which improves the clinical benefits of DECT. Methods: The authors propose to suppress noise on the decomposed images of DECT using convex optimization, formulated as least-squares estimation with smoothness regularization. Based on the design principles of a best linear unbiased estimator, the inverse of the estimated variance-covariance matrix of the decomposed images is included as the penalty weight in the least-squares term. Analytical formulas are derived to compute the variance-covariance matrix for decomposed images with general-form numerical or analytical decomposition. As a demonstration, the authors implement the proposed algorithm on phantom data using an empirical polynomial decomposition function measured on a calibration scan. The polynomial coefficients are determined from projection data acquired on a wedge phantom, and the signal decomposition is performed in the projection domain. Results: On the Catphan®600 phantom, the proposed noise suppression method reduces the average noise standard deviation of the basis material images by one to two orders of magnitude, with superior performance on spatial resolution as shown by comparisons of line-pair images and modulation transfer function measurements. On the synthesized monoenergetic CT images, the noise standard deviation is reduced by a factor of 2-3. By using nonlinear decomposition on projections, the method effectively suppresses the streaking artifacts of beam hardening and obtains more uniform images than the authors' previous approach based on a linear model. Similar noise suppression performance is observed in the results for an anthropomorphic head phantom and a pediatric chest phantom. With the beam-hardening correction enabled by the approach, the image spatial nonuniformity on the head phantom is reduced from around 10% on the original CT images to 4.9% on the synthesized monoenergetic CT image. On the pediatric chest phantom, the method suppresses the image noise standard deviation by a factor of around 7.5 and, compared with linear decomposition, reduces the estimation error of electron densities from 33.3% to 8.6%. Conclusions: The authors propose a general framework for noise suppression in material decomposition for DECT. Phantom studies show that the proposed method improves image uniformity and the accuracy of electron density measurements through effective beam-hardening correction, and reduces noise levels without noticeable resolution loss.
NASA Astrophysics Data System (ADS)
Compton, Duane C.; Snapp, Robert R.
2007-09-01
TWiGS (two-dimensional wavelet transform with generalized cross validation and soft thresholding) is a novel algorithm for denoising liquid chromatography-mass spectrometry (LC-MS) data for use in "shot-gun" proteomics. Proteomics, the study of all proteins in an organism, is an emerging field that has already proven successful for drug and disease discovery in humans. A number of constraints limit the effectiveness of LC-MS for shot-gun proteomics, where the chemical signals are typically weak and the data sets are computationally large. Most algorithms suffer greatly from researcher-driven bias, making the results irreproducible and unusable by other laboratories. We thus introduce a new algorithm, TWiGS, that removes electrical (additive white) and chemical noise from LC-MS data sets. TWiGS is developed as a true two-dimensional algorithm, which operates in the time-frequency domain and minimizes researcher bias. It is based on the traditional discrete wavelet transform (DWT), which allows for fast and reproducible analysis. The separable two-dimensional DWT decomposition is paired with generalized cross validation and soft thresholding. The choice of wavelet (Haar, Coiflet-6, or Daubechies-4) and the number of decomposition levels are determined from observed experimental results. Using a synthetic LC-MS data model, TWiGS accurately retains key characteristics of the peaks in both the time and m/z domains, and can detect peaks in noise of the same intensity. TWiGS is applied to angiotensin I and II samples run on an LC-ESI-TOF-MS (liquid chromatography-electrospray-ionization time-of-flight mass spectrometry) instrument to demonstrate its utility for the detection of low-lying peaks obscured by noise.
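The denoising skeleton that TWiGS builds on can be written with the PyWavelets package; here the generalized cross validation step is replaced by the simpler universal threshold, the Haar wavelet is chosen for brevity, and the Gaussian "peak" image is a stand-in for LC-MS data:

```python
import numpy as np
import pywt

def dwt2_denoise(img, wavelet='haar', level=3):
    coeffs = pywt.wavedec2(img, wavelet, level=level)
    # Noise scale from the finest diagonal detail (robust MAD estimate)
    sigma = np.median(np.abs(coeffs[-1][2])) / 0.6745
    t = sigma * np.sqrt(2 * np.log(img.size))      # universal threshold (stands in for GCV)
    out = [coeffs[0]] + [tuple(pywt.threshold(c, t, mode='soft') for c in lvl)
                         for lvl in coeffs[1:]]
    return pywt.waverec2(out, wavelet)

rng = np.random.default_rng(10)
x, y = np.meshgrid(np.linspace(-1, 1, 256), np.linspace(-1, 1, 256))
peak = 50 * np.exp(-((x - 0.1)**2 + (y + 0.2)**2) / 0.002)   # narrow LC-MS-like peak
noisy = peak + rng.normal(0, 5, peak.shape)                  # additive white noise
print(np.abs(dwt2_denoise(noisy) - peak).mean())             # small mean reconstruction error
```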
NASA Astrophysics Data System (ADS)
Voznyuk, I.; Litman, A.; Tortel, H.
2015-08-01
A Quasi-Newton method for reconstructing the constitutive parameters of three-dimensional (3D) penetrable scatterers from scattered field measurements is presented. This method is adapted to handling large-scale electromagnetic problems while keeping the memory requirements and time flexibility as low as possible. The forward scattering problem is solved by applying the finite-element tearing and interconnecting full-dual-primal (FETI-FDP2) method, which shares the same spirit as domain decomposition methods for finite elements: the computational domain is split into smaller non-overlapping subdomains so that local subproblems can be solved simultaneously. Various strategies are proposed to efficiently couple the inversion algorithm with the FETI-FDP2 method: a separation into permanent and non-permanent subdomains is performed, iterative solvers are favored for resolving the interface problem, and a marching-on-in-anything initial-guess selection further accelerates the process. The computational burden is also reduced by applying the adjoint state vector methodology. Finally, the inversion algorithm is validated against measurements extracted from the 3D Fresnel database.
NASA Astrophysics Data System (ADS)
Vera, N. C.; GMMC
2013-05-01
In this paper we present results for macrohybrid mixed Darcian flow in porous media in a general three-dimensional domain. The global problem is solved as a set of local subproblems posed via a domain decomposition method. The unknown fields of the local problems, velocity and pressure, are approximated using mixed finite elements. A general three-dimensional domain is considered and discretized using tetrahedra. The discrete domain is decomposed into subdomains, and the original problem is reformulated as a set of subproblems communicating through their interfaces. To solve this set of subproblems, we use mixed finite elements and parallel computing. Parallelizing a problem with this methodology can, in principle, fully exploit the computing hardware and also provides results in less time, two very important elements in modeling.
Fast divide-and-conquer algorithm for evaluating polarization in classical force fields
NASA Astrophysics Data System (ADS)
Nocito, Dominique; Beran, Gregory J. O.
2017-03-01
Evaluation of the self-consistent polarization energy forms a major computational bottleneck in polarizable force fields. In large systems, the linear polarization equations are typically solved iteratively with techniques based on Jacobi iterations (JI) or preconditioned conjugate gradients (PCG). Two new variants of JI are proposed here that exploit domain decomposition to accelerate the convergence of the induced dipoles. The first, divide-and-conquer JI (DC-JI), is a block Jacobi algorithm which solves the polarization equations within non-overlapping sub-clusters of atoms directly via Cholesky decomposition, and iterates to capture interactions between sub-clusters. The second, fuzzy DC-JI, achieves further acceleration by employing overlapping blocks. Fuzzy DC-JI is analogous to an additive Schwarz method, but with distance-based weighting when averaging the fuzzy dipoles from different blocks. Key to the success of these algorithms is the use of K-means clustering to identify natural atomic sub-clusters automatically for both algorithms and to determine the appropriate weights in fuzzy DC-JI. The algorithm employs knowledge of the 3-D spatial interactions to group important elements in the 2-D polarization matrix. When coupled with direct inversion in the iterative subspace (DIIS) extrapolation, fuzzy DC-JI/DIIS in particular converges in a comparable number of iterations as PCG, but with lower computational cost per iteration. In the end, the new algorithms demonstrated here accelerate the evaluation of the polarization energy by 2-3 fold compared to existing implementations of PCG or JI/DIIS.
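A dense toy analogue of the DC-JI idea (minus the K-means clustering and the DIIS extrapolation) illustrates the mechanics: factor each diagonal block once by Cholesky, then iterate independent block solves against the global residual. The matrix here is a generic well-conditioned SPD stand-in for the polarization equations:

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def dc_jacobi(B, r, blocks, iters=50):
    """Block Jacobi solve of B x = r with direct solves inside each block."""
    x = np.zeros_like(r)
    factors = [cho_factor(B[np.ix_(b, b)]) for b in blocks]   # one Cholesky per block
    for _ in range(iters):
        resid = r - B @ x                      # global residual couples the blocks
        for b, f in zip(blocks, factors):      # independent (parallelizable) block solves
            x[b] += cho_solve(f, resid[b])
    return x

rng = np.random.default_rng(11)
n = 120
B = np.eye(n) + 0.01 * rng.standard_normal((n, n))
B = 0.5 * (B + B.T) + n * 0.001 * np.eye(n)    # symmetric, diagonally dominant, SPD
r = rng.standard_normal(n)
blocks = np.array_split(np.arange(n), 6)       # fixed contiguous partition (vs. K-means)
x = dc_jacobi(B, r, blocks)
print(np.linalg.norm(B @ x - r))               # converges to machine precision here
```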
Modal Identification of Tsing Ma Bridge by Using an Improved Eigensystem Realization Algorithm
NASA Astrophysics Data System (ADS)
QIN, Q.; LI, H. B.; QIAN, L. Z.; LAU, C.-K.
2001-10-01
This paper presents the results of research on modal identification from Tsing Ma bridge ambient testing data using an improved eigensystem realization algorithm. The testing was carried out before the bridge was opened to traffic and after the completion of surfacing. Without traffic load, ambient excitations were much less intensive, and the bridge responses to such excitation were also less intensive; consequently, the responses were significantly influenced by the random movement of heavy construction vehicles on the deck. To remove noise from the testing data and make the ambient signals more stationary, a Chebyshev digital filter was used instead of a digital filter with a Hanning window. Random decrement (RD) functions were built to convert the ambient responses to free vibrations. An improved eigensystem realization algorithm was employed to improve the accuracy and efficiency of modal identification: it uses cross-correlation functions of RD functions, rather than the RD functions themselves, to form the Hankel matrix, and eigenvalue decomposition instead of singular value decomposition. The response accelerations were acquired group by group because of the limited number of high-quality accelerometers and data-logger channels available. The modes were identified group by group and then assembled, using response accelerations acquired at reference points, to form the modes of the complete bridge. Seventy-nine modes of the Tsing Ma bridge were identified, including five complex modes formed in accordance with unevenly distributed damping in the bridge. The modes identified in the time domain were then compared with those identified in the frequency domain and with finite element analytical results.
Domain decomposition for aerodynamic and aeroacoustic analyses, and optimization
NASA Technical Reports Server (NTRS)
Baysal, Oktay
1995-01-01
The overarching theme was domain decomposition, which was intended to improve the numerical solution technique for the partial differential equations at hand; in the present study, those that governed either the fluid flow, the aeroacoustic wave propagation, or the sensitivity analysis for a gradient-based optimization. The role of the domain decomposition extended beyond the original impetus of discretizing geometrically complex regions or writing modular software for distributed-hardware computers. It induced function-space decompositions and operator decompositions that offered the valuable property of near independence of operator evaluation tasks. The objectives centered on the extension and implementation of methodologies either previously developed or concurrently under development: (1) aerodynamic sensitivity analysis with domain decomposition (SADD); (2) computational aeroacoustics of cavities; and (3) dynamic, multibody computational fluid dynamics using unstructured meshes.
NASA Astrophysics Data System (ADS)
Pioldi, Fabio; Ferrari, Rosalba; Rizzi, Egidio
2016-02-01
The present paper deals with the seismic modal dynamic identification of frame structures by a refined Frequency Domain Decomposition (rFDD) algorithm, autonomously formulated and implemented within MATLAB. First, the output-only identification technique is outlined analytically and then employed to characterize all modal properties. Synthetic response signals generated prior to the dynamic identification are adopted as input channels, in view of assessing a necessary condition for the procedure's efficiency. Initially, the algorithm is verified on canonical input from random excitation. Then, modal identification is attempted successfully for given seismic input, taken as base excitation, including both strong-motion data and single and multiple input ground motions. In contrast to previous attempts, which investigated the role of seismic response signals in the time domain, this paper carries out the identification analysis in the frequency domain. Results turn out to be very consistent with the target values, with quite limited errors in the modal estimates, including the damping ratios, which range from about 1% to 10%. Seismic excitation and high damping values, which prove critical even for well-spaced modes, do not fulfill traditional FDD assumptions; that the method still succeeds demonstrates the robustness of the developed algorithm. Through original strategies and arrangements, the paper shows that a comprehensive rFDD modal dynamic identification of frames under seismic input is feasible, even at concomitantly high damping.
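The baseline FDD step that rFDD refines is short to state: estimate the cross-spectral density matrix of the response channels and track its first singular value over frequency. A minimal sketch under assumed inputs (Y is a channels-by-samples response array, fs the sampling rate); peak picking and the rFDD refinements are omitted.

```python
import numpy as np
from scipy.signal import csd

def fdd_first_singular_values(Y, fs, nperseg=1024):
    """Basic FDD: SVD of the response cross-spectral matrix G(f) at
    each frequency line. Peaks of s1(f) locate natural frequencies;
    the matching first singular vectors approximate mode shapes."""
    nch = Y.shape[0]
    f, _ = csd(Y[0], Y[0], fs=fs, nperseg=nperseg)
    G = np.empty((len(f), nch, nch), dtype=complex)
    for i in range(nch):
        for j in range(nch):
            _, G[:, i, j] = csd(Y[i], Y[j], fs=fs, nperseg=nperseg)
    s1 = np.empty(len(f))
    shapes = np.empty((len(f), nch), dtype=complex)
    for k in range(len(f)):
        U, s, _ = np.linalg.svd(G[k])
        s1[k], shapes[k] = s[0], U[:, 0]
    return f, s1, shapes
```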
An NN-Based SRD Decomposition Algorithm and Its Application in Nonlinear Compensation
Yan, Honghang; Deng, Fang; Sun, Jian; Chen, Jie
2014-01-01
In this study, a neural network-based square root of descending (SRD) order decomposition algorithm for compensating for nonlinear data generated by sensors is presented. The study aims at exploring the optimal decomposition of the data and minimizing the computational complexity and memory cost of the training process. A linear decomposition algorithm, which automatically finds the optimal decomposition into N subparts and reduces the training time to 1/N and the memory cost to 1/N, has been implemented on nonlinear data obtained from an encoder. Particular focus is given to the theoretical basis for estimating the number of hidden nodes and the precision of the varying decomposition method. Numerical experiments are designed to evaluate the effect of this algorithm. Moreover, a designed device for angular sensor calibration is presented. We conduct an experiment that samples the data of an encoder and compensates for the nonlinearity of the encoder to validate this novel algorithm. PMID:25232912
3-D modeling of ductile tearing using finite elements: Computational aspects and techniques
NASA Astrophysics Data System (ADS)
Gullerud, Arne Stewart
This research focuses on the development and application of computational tools to perform large-scale, 3-D modeling of ductile tearing in engineering components under quasi-static to mild loading rates. Two standard models for ductile tearing---the computational cell methodology and crack growth controlled by the crack tip opening angle (CTOA)---are described and their 3-D implementations are explored. For the computational cell methodology, quantification of the effects of several numerical issues---computational load step size, procedures for force release after cell deletion, and the porosity for cell deletion---enables construction of computational algorithms to remove the dependence of predicted crack growth on these issues. This work also describes two extensions of the CTOA approach into 3-D: a general 3-D method and a constant front technique. Analyses compare the characteristics of the extensions, and a validation study explores the ability of the constant front extension to predict crack growth in thin aluminum test specimens over a range of specimen geometries, absolute sizes, and levels of out-of-plane constraint. To provide a computational framework suitable for the solution of these problems, this work also describes the parallel implementation of a nonlinear, implicit finite element code. The implementation employs an explicit message-passing approach using the MPI standard to maintain portability, a domain decomposition of element data to provide parallel execution, and a master-worker organization of the computational processes to enhance future extensibility. A linear preconditioned conjugate gradient (LPCG) solver serves as the core of the solution process. The parallel LPCG solver utilizes an element-by-element (EBE) structure of the computations to permit a dual-level decomposition of the element data: domain decomposition of the mesh provides efficient coarse-grain parallel execution, while decomposition of the domains into blocks of similar elements (same type, constitutive model, etc.) provides fine-grain parallel computation on each processor. A major focus of the LPCG solver is a new implementation of the Hughes-Winget element-by-element (HW) preconditioner. The implementation employs a weighted dependency graph combined with a new coloring algorithm to provide load-balanced scheduling for the preconditioner and overlapped communication/computation. This approach enables efficient parallel application of the HW preconditioner for arbitrary unstructured meshes.
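The element-by-element structure at the heart of the LPCG solver means the global stiffness matrix is never assembled: the product K u is accumulated from small element matrices through their connectivity. A minimal matrix-free sketch, with a hypothetical `elems` list of (dof-index-array, element-matrix) pairs standing in for the dissertation's blocked, parallel data structures:

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

def ebe_operator(elems, ndof):
    """Matrix-free global operator: K @ u is the sum of element-level
    products gathered and scattered through each element's dof map."""
    def matvec(u):
        y = np.zeros(ndof)
        for dofs, ke in elems:   # dofs: index array, ke: element matrix
            y[dofs] += ke @ u[dofs]
        return y
    return LinearOperator((ndof, ndof), matvec=matvec)

# Usage sketch: x, info = cg(ebe_operator(elems, ndof), f)
# In the dual-level scheme, blocks of similar elements would be
# processed concurrently inside matvec.
```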
Gao, Yu-Fei; Gui, Guan; Xie, Wei; Zou, Yan-Bin; Yang, Yue; Wan, Qun
2017-01-01
This paper investigates a two-dimensional angle of arrival (2D AOA) estimation algorithm for the electromagnetic vector sensor (EMVS) array based on Type-2 block component decomposition (BCD) tensor modeling. Such a tensor decomposition method can take full advantage of the multidimensional structural information of electromagnetic signals to accomplish blind estimation for array parameters with higher resolution. However, existing tensor decomposition methods encounter many restrictions in applications of the EMVS array, such as the strict requirement for uniqueness conditions of decomposition, the inability to handle partially-polarized signals, etc. To solve these problems, this paper investigates tensor modeling for partially-polarized signals of an L-shaped EMVS array. The 2D AOA estimation algorithm based on rank-(L1,L2,·) BCD is developed, and the uniqueness condition of decomposition is analyzed. By means of the estimated steering matrix, the proposed algorithm can automatically achieve angle pair-matching. Numerical experiments demonstrate that the present algorithm has the advantages of both accuracy and robustness of parameter estimation. Even under the conditions of lower SNR, small angular separation and limited snapshots, the proposed algorithm still possesses better performance than subspace methods and the canonical polyadic decomposition (CPD) method. PMID:28448431
An ambiguity principle for assigning protein structural domains.
Postic, Guillaume; Ghouzam, Yassine; Chebrek, Romain; Gelly, Jean-Christophe
2017-01-01
Ambiguity is the quality of being open to several interpretations. For an image, it arises when the contained elements can be delimited in two or more distinct ways, which may cause confusion. We postulate that it also applies to the analysis of protein three-dimensional structure, which consists in dividing the molecule into subunits called domains. Because different definitions of what constitutes a domain can be used to partition a given structure, the same protein may have different but equally valid domain annotations. However, knowledge and experience generally displace our ability to accept more than one way to decompose the structure of an object, in this case a protein. This human bias in structure analysis is particularly harmful because it leads to ignoring potential avenues of research. We present an automated method capable of producing multiple alternative decompositions of protein structure (web server and source code available at www.dsimb.inserm.fr/sword/). Our innovative algorithm assigns structural domains through the hierarchical merging of protein units, which are evolutionarily preserved substructures that describe protein architecture at an intermediate level, between domain and secondary structure. To validate the use of these protein units for decomposing protein structures into domains, we set up an extensive benchmark made of expert annotations of structural domains and including state-of-the-art domain parsing algorithms. The relevance of our "multipartitioning" approach is shown through numerous examples of applications covering protein function, evolution, folding, and structure prediction. Finally, we introduce a measure for the structural ambiguity of protein molecules.
Marine Controlled-Source Electromagnetic 2D Inversion for synthetic models.
NASA Astrophysics Data System (ADS)
Liu, Y.; Li, Y.
2016-12-01
We present a 2D inverse algorithm for frequency-domain marine controlled-source electromagnetic (CSEM) data, which is based on the regularized Gauss-Newton approach. As a forward solver, our parallel adaptive finite element forward modeling program is employed. It is a self-adaptive, goal-oriented grid refinement algorithm in which a finite element analysis is performed on a sequence of refined meshes. The mesh refinement process is guided by a dual error estimate weighting to bias refinement towards elements that affect the solution at the EM receiver locations. With the use of the direct solver (MUMPS), we can effectively compute the electromagnetic fields for multiple sources and the parametric sensitivities. We also implement the parallel data-domain decomposition approach of Key and Ovall (2011), with the goal of being able to compute accurate responses in parallel for complicated models and a full suite of data parameters typical of offshore CSEM surveys. All minimizations are carried out using the Gauss-Newton algorithm, and model perturbations at each iteration step are obtained using the inexact conjugate gradient iteration method. Synthetic test inversions are presented.
NASA Technical Reports Server (NTRS)
Aftosmis, M. J.; Berger, M. J.; Adomavicius, G.
2000-01-01
Preliminary verification and validation of an efficient Euler solver for adaptively refined Cartesian meshes with embedded boundaries is presented. The parallel, multilevel method makes use of a new on-the-fly parallel domain decomposition strategy based upon the use of space-filling curves, and automatically generates a sequence of coarse meshes for processing by the multigrid smoother. The coarse mesh generation algorithm produces grids which completely cover the computational domain at every level in the mesh hierarchy. A series of examples on realistically complex three-dimensional configurations demonstrates that this new coarsening algorithm reliably achieves mesh coarsening ratios in excess of 7 on adaptively refined meshes. Numerical investigations of the scheme's local truncation error demonstrate an achieved order of accuracy between 1.82 and 1.88. Convergence results for the multigrid scheme are presented for both subsonic and transonic test cases and demonstrate W-cycle multigrid convergence rates between 0.84 and 0.94. Preliminary parallel scalability tests on both simple wing and complex complete aircraft geometries show a computational speedup of 52 on 64 processors using the run-time mesh partitioner.
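Space-filling-curve partitioning of this general kind is simple to sketch: encode each cell's integer coordinates as a Morton key, sort, and cut the sorted list into equal chunks, so each processor receives a contiguous, spatially compact piece of the curve. The following is an illustration under assumed integer (i, j, k) cell coordinates, not the solver's actual on-the-fly partitioner.

```python
import numpy as np

def morton3(i, j, k, bits=10):
    """Interleave the bits of integer cell coordinates into a Morton key."""
    key = 0
    for b in range(bits):
        key |= ((i >> b) & 1) << (3 * b)
        key |= ((j >> b) & 1) << (3 * b + 1)
        key |= ((k >> b) & 1) << (3 * b + 2)
    return key

def partition(cells, nproc):
    """Sort cells along the space-filling curve and cut into equal chunks;
    contiguity along the curve keeps each partition spatially compact."""
    keys = np.array([morton3(i, j, k) for i, j, k in cells])
    order = np.argsort(keys)
    return np.array_split(order, nproc)   # index sets, one per processor
```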
Integrated Network Decompositions and Dynamic Programming for Graph Optimization (INDDGO)
DOE Office of Scientific and Technical Information (OSTI.GOV)
The INDDGO software package offers a set of tools for finding exact solutions to graph optimization problems via tree decompositions and dynamic programming algorithms. Currently the framework offers serial and parallel (distributed memory) algorithms for finding tree decompositions and solving the maximum weighted independent set problem. The parallel dynamic programming algorithm is implemented on top of the MADNESS task-based runtime.
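For a flavor of the dynamic programming involved, consider maximum weighted independent set on a tree, which is the treewidth-1 special case; on general graphs the same exclude/include recursion runs over the bags of a tree decomposition. A sketch assuming a networkx tree and a weight dict, not INDDGO's serial or distributed implementations:

```python
import networkx as nx

def mwis_on_tree(T, root, weight):
    """DP over a tree: best[v] = (best weight with v excluded,
    best weight with v included)."""
    best = {}
    for v in nx.dfs_postorder_nodes(T, root):
        kids = [c for c in T.neighbors(v) if c in best]  # children done first
        best[v] = (sum(max(best[c]) for c in kids),            # v excluded
                   weight[v] + sum(best[c][0] for c in kids))  # v included
    return max(best[root])

# Usage sketch:
# T = nx.random_tree(20, seed=1); w = {v: v + 1 for v in T}
# print(mwis_on_tree(T, 0, w))
```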
A hybrid algorithm for parallel molecular dynamics simulations
NASA Astrophysics Data System (ADS)
Mangiardi, Chris M.; Meyer, R.
2017-10-01
This article describes algorithms for the hybrid parallelization and SIMD vectorization of molecular dynamics simulations with short-range forces. The parallelization method combines domain decomposition with a thread-based parallelization approach. The goal of the work is to enable efficient simulations of very large (tens of millions of atoms) and inhomogeneous systems on many-core processors with hundreds or thousands of cores and SIMD units with large vector sizes. In order to test the efficiency of the method, simulations of a variety of configurations with up to 74 million atoms have been performed. Results are shown that were obtained on multi-core systems with Sandy Bridge and Haswell processors as well as systems with Xeon Phi many-core processors.
NASA Astrophysics Data System (ADS)
Kang, Shouqiang; Ma, Danyang; Wang, Yujing; Lan, Chaofeng; Chen, Qingguo; Mikulovich, V. I.
2017-03-01
To effectively assess different fault locations and different degrees of performance degradation of a rolling bearing with a unified assessment index, a novel state assessment method based on the relative compensation distance of multiple-domain features and locally linear embedding is proposed. First, for a single-sample signal, time-domain and frequency-domain indexes can be calculated for the original vibration signal and each sensitive intrinsic mode function obtained by improved ensemble empirical mode decomposition, and the singular values of the sensitive intrinsic mode function matrix can be extracted by singular value decomposition to construct a high-dimensional hybrid-domain feature vector. Second, a feature matrix can be constructed by arranging each feature vector of multiple samples, the dimensions of each row vector of the feature matrix can be reduced by the locally linear embedding algorithm, and the compensation distance of each fault state of the rolling bearing can be calculated using the support vector machine. Finally, the relative distance between different fault locations and different degrees of performance degradation and the normal-state optimal classification surface can be compensated, and on the basis of the proposed relative compensation distance, the assessment model can be constructed and an assessment curve drawn. Experimental results show that the proposed method can effectively assess different fault locations and different degrees of performance degradation of the rolling bearing under certain conditions.
Matrix decomposition graphics processing unit solver for Poisson image editing
NASA Astrophysics Data System (ADS)
Lei, Zhao; Wei, Li
2012-10-01
In recent years, gradient-domain methods have been widely discussed in the image processing field, including seamless cloning and image stitching. These algorithms are commonly carried out by solving a large sparse linear system: the Poisson equation. However, solving the Poisson equation is a computation- and memory-intensive task, which makes it unsuitable for real-time image editing. A new matrix decomposition graphics processing unit (GPU) solver (MDGS) is proposed to settle the problem. A matrix decomposition method is used to distribute the work among GPU threads, so that MDGS takes full advantage of the computing power of current GPUs. Additionally, MDGS is a hybrid solver (combining both direct and iterative techniques) and has a two-level architecture. These enable MDGS to generate solutions identical to those of the common Poisson methods and to achieve a high convergence rate in most cases. This approach is advantageous in terms of parallelizability, enabling real-time image processing, low memory consumption, and a wide range of applications.
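The linear system in question is the discrete Poisson equation of gradient-domain editing: inside the cloned region the Laplacian of the result matches the divergence of the guidance field, with the target image fixed on the boundary. A plain Jacobi sketch of that equation is given below (assumed inputs: target image, guidance divergence div_g, and a boolean mask that stays away from the image border); MDGS replaces this slow iteration with its hybrid GPU solver.

```python
import numpy as np

def poisson_blend(target, div_g, mask, iters=2000):
    """Jacobi iterations for the discrete Poisson equation
    lap(u) = div_g inside the mask, with u = target on the boundary."""
    u = target.astype(float).copy()
    ys, xs = np.nonzero(mask)   # interior pixels (assumed off the border)
    for _ in range(iters):
        u[ys, xs] = 0.25 * (u[ys - 1, xs] + u[ys + 1, xs] +
                            u[ys, xs - 1] + u[ys, xs + 1] - div_g[ys, xs])
    return u
```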
Approximation, abstraction and decomposition in search and optimization
NASA Technical Reports Server (NTRS)
Ellman, Thomas
1992-01-01
In this paper, I discuss four different areas of my research. One portion of my research has focused on automatic synthesis of search control heuristics for constraint satisfaction problems (CSPs). I have developed techniques for automatically synthesizing two types of heuristics for CSPs, including filtering functions, which remove portions of a search space from consideration. Another portion of my research is focused on automatic synthesis of hierarchic algorithms for solving CSPs. I have developed a technique for constructing hierarchic problem solvers based on numeric interval algebra. Another portion of my research is focused on automatic decomposition of design optimization problems. We are using the design of racing yacht hulls as a testbed domain for this research. Decomposition is especially important in the design of complex physical shapes such as yacht hulls. The final portion of my research is focused on intelligent model selection in design optimization. The model selection problem results from the difficulty of using exact models to analyze the performance of candidate designs.
An efficient solution procedure for the thermoelastic analysis of truss space structures
NASA Technical Reports Server (NTRS)
Givoli, D.; Rand, O.
1992-01-01
A solution procedure is proposed for the thermal and thermoelastic analysis of truss space structures in periodic motion. In this method, the spatial domain is first discretized using a consistent finite element formulation. Then the resulting semi-discrete equations in time are solved analytically by using Fourier decomposition. Full advantage is taken of geometrical symmetry. An algorithm is presented for the calculation of the heat flux distribution. The method is demonstrated via a numerical example of a cylindrically shaped space structure.
A transient FETI methodology for large-scale parallel implicit computations in structural mechanics
NASA Technical Reports Server (NTRS)
Farhat, Charbel; Crivelli, Luis; Roux, Francois-Xavier
1992-01-01
Explicit codes are often used to simulate the nonlinear dynamics of large-scale structural systems, even for low frequency response, because the storage and CPU requirements entailed by the repeated factorizations traditionally found in implicit codes rapidly overwhelm the available computing resources. With the advent of parallel processing, this trend is accelerating because explicit schemes are also easier to parallelize than implicit ones. However, the time step restriction imposed by the Courant stability condition on all explicit schemes cannot yet -- and perhaps will never -- be offset by the speed of parallel hardware. Therefore, it is essential to develop efficient and robust alternatives to direct methods that are also amenable to massively parallel processing because implicit codes using unconditionally stable time-integration algorithms are computationally more efficient when simulating low-frequency dynamics. Here we present a domain decomposition method for implicit schemes that requires significantly less storage than factorization algorithms, that is several times faster than other popular direct and iterative methods, that can be easily implemented on both shared and local memory parallel processors, and that is both computationally and communication-wise efficient. The proposed transient domain decomposition method is an extension of the method of Finite Element Tearing and Interconnecting (FETI) developed by Farhat and Roux for the solution of static problems. Serial and parallel performance results on the CRAY Y-MP/8 and the iPSC-860/128 systems are reported and analyzed for realistic structural dynamics problems. These results establish the superiority of the FETI method over both the serial/parallel conjugate gradient algorithm with diagonal scaling and the serial/parallel direct method, and contrast the computational power of the iPSC-860/128 parallel processor with that of the CRAY Y-MP/8 system.
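FETI itself glues the torn subdomains with Lagrange multipliers, but the essence of substructuring, eliminating subdomain interiors so that only an interface problem couples the pieces, fits in a few lines. A minimal primal (Schur complement) sketch for illustration only; with several subdomains, Kii is block diagonal, so the interior solves are independent and can proceed in parallel.

```python
import numpy as np

def schur_interface_solve(K, f, interior, interface):
    """Eliminate interior unknowns to obtain the interface problem
    S u_b = g, solve it, then back-substitute for the interiors."""
    Kii = K[np.ix_(interior, interior)]
    Kib = K[np.ix_(interior, interface)]
    Kbb = K[np.ix_(interface, interface)]
    Kii_inv_Kib = np.linalg.solve(Kii, Kib)
    Kii_inv_fi = np.linalg.solve(Kii, f[interior])
    S = Kbb - Kib.T @ Kii_inv_Kib                 # Schur complement
    g = f[interface] - Kib.T @ Kii_inv_fi
    u = np.empty_like(f)
    u[interface] = np.linalg.solve(S, g)
    u[interior] = Kii_inv_fi - Kii_inv_Kib @ u[interface]
    return u
```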
Parallel Finite Element Domain Decomposition for Structural/Acoustic Analysis
NASA Technical Reports Server (NTRS)
Nguyen, Duc T.; Tungkahotara, Siroj; Watson, Willie R.; Rajan, Subramaniam D.
2005-01-01
A domain decomposition (DD) formulation for solving sparse linear systems of equations resulting from finite element analysis is presented. The formulation incorporates mixed direct and iterative equation solving strategies and other novel algorithmic ideas that are optimized to take advantage of sparsity and to exploit modern computer architecture, such as memory and parallel computing. The most time-consuming part of the formulation is identified, and the critical roles of direct sparse and iterative solvers within the framework of the formulation are discussed. Experiments on several computer platforms using several complex test matrices are conducted using software based on the formulation. Small-scale structural examples are used to validate the steps in the formulation, and large-scale (1,000,000+ unknowns) duct acoustic examples are used to evaluate performance on Origin 2000 processors and on a cluster of 6 PCs (running under the Windows environment). Statistics show that the formulation is efficient in both sequential and parallel computing environments and that it is significantly faster and consumes less memory than a formulation based on one of the best available commercial parallel sparse solvers.
Conservative tightly-coupled simulations of stochastic multiscale systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Taverniers, Søren; Pigarov, Alexander Y.; Tartakovsky, Daniel M., E-mail: dmt@ucsd.edu
2016-05-15
Multiphysics problems often involve components whose macroscopic dynamics is driven by microscopic random fluctuations. The fidelity of simulations of such systems depends on their ability to propagate these random fluctuations throughout a computational domain, including subdomains represented by deterministic solvers. When the constituent processes take place in nonoverlapping subdomains, system behavior can be modeled via a domain-decomposition approach that couples separate components at the interfaces between these subdomains. Its coupling algorithm has to maintain a stable and efficient numerical time integration even at high noise strength. We propose a conservative domain-decomposition algorithm in which tight coupling is achieved by employing either Picard's or Newton's iterative method. Coupled diffusion equations, one of which has a Gaussian white-noise source term, provide a computational testbed for analysis of these two coupling strategies. Fully-converged ("implicit") coupling with Newton's method typically outperforms its Picard counterpart, especially at high noise levels. This is because the number of Newton iterations scales linearly with the amplitude of the Gaussian noise, while the number of Picard iterations can scale superlinearly. At large time intervals between two subsequent inter-solver communications, the solution error for single-iteration ("explicit") Picard's coupling can be several orders of magnitude higher than that for implicit coupling. Increasing the explicit coupling's communication frequency reduces this difference, but the resulting increase in computational cost can make it less efficient than implicit coupling at similar levels of solution error, depending on the communication frequency of the latter and the noise strength. This trend carries over into higher dimensions, although at high noise strength explicit coupling may be the only computationally viable option.
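The contrast between the two coupling strategies is easiest to see on a toy problem. The sketch below, with illustrative constants rather than the paper's diffusion testbed, advances two linearly coupled "subdomain" states by implicit Euler and solves each step's system either by Picard fixed-point iteration or by Newton's method; because this toy is linear, Newton converges in one iteration, while the Picard sweep count grows with the coupling strength.

```python
import numpy as np

a, b, c, dt = 1.0, 2.0, 5.0, 0.05   # illustrative constants

def f(u, xi):
    """Two coupled 'subdomain' ODEs; xi is a white-noise sample."""
    return np.array([-a * u[0] + c * (u[1] - u[0]),
                     -b * u[1] + c * (u[0] - u[1]) + xi])

J = np.array([[-a - c, c], [c, -b - c]])   # Jacobian of f (constant here)

def step_picard(u, xi, iters=50, tol=1e-12):
    v = u.copy()
    for _ in range(iters):
        v_new = u + dt * f(v, xi)          # fixed-point sweep
        if np.linalg.norm(v_new - v) < tol:
            break
        v = v_new
    return v_new

def step_newton(u, xi, iters=50, tol=1e-12):
    v = u.copy()
    A = np.eye(2) - dt * J                 # implicit-Euler Jacobian
    for _ in range(iters):
        r = v - u - dt * f(v, xi)          # residual of implicit Euler
        if np.linalg.norm(r) < tol:
            break
        v = v - np.linalg.solve(A, r)
    return v
```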
Cheng, Yih-Chun; Tsai, Pei-Yun; Huang, Ming-Hao
2016-05-19
Low-complexity compressed sensing (CS) techniques for monitoring electrocardiogram (ECG) signals in a wireless body sensor network (WBSN) are presented. The prior probability of ECG sparsity in the wavelet domain is first exploited. Then, a variable orthogonal multi-matching pursuit (vOMMP) algorithm consisting of two phases is proposed. In the first phase, the orthogonal matching pursuit (OMP) algorithm is adopted to effectively augment the support set with reliable indices, and in the second phase, orthogonal multi-matching pursuit (OMMP) is employed to rescue the missing indices. The reconstruction performance is thus enhanced with the prior information and the vOMMP algorithm. Furthermore, the computation-intensive pseudo-inverse operation is simplified by a matrix-inversion-free (MIF) technique based on QR decomposition. The vOMMP-MIF CS decoder is then implemented in 90 nm CMOS technology. The QR decomposition is accomplished by two systolic arrays working in parallel. The implementation supports three settings for obtaining 40, 44, and 48 coefficients in the sparse vector. From the measurement results, the power consumption is 11.7 mW at 0.9 V and 12 MHz. Compared to prior chip implementations, our design shows good hardware efficiency and is suitable for low-energy applications.
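The OMP core of the first phase is the generic greedy loop: select the atom most correlated with the residual, then re-fit all selected coefficients by least squares. A sketch of that generic algorithm, not the vOMMP variant or its matrix-inversion-free hardware mapping:

```python
import numpy as np

def omp(Phi, y, k):
    """Orthogonal matching pursuit: grow the support greedily and
    re-fit all selected coefficients by least squares each round."""
    residual, support = y.copy(), []
    for _ in range(k):
        j = int(np.argmax(np.abs(Phi.T @ residual)))  # most correlated atom
        support.append(j)
        coef, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        residual = y - Phi[:, support] @ coef
    x = np.zeros(Phi.shape[1])
    x[support] = coef
    return x
```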
Application of composite dictionary multi-atom matching in gear fault diagnosis.
Cui, Lingli; Kang, Chenhui; Wang, Huaqing; Chen, Peng
2011-01-01
The sparse decomposition based on matching pursuit is an adaptive sparse expression method for signals. This paper proposes a composite dictionary multi-atom matching decomposition and reconstruction algorithm, and introduces threshold de-noising in the reconstruction algorithm. Based on the structural characteristics of gear fault signals, a composite dictionary combining the impulse time-frequency dictionary and the Fourier dictionary was constructed, and a genetic algorithm was applied to search for the best matching atom. The analysis results for gear fault simulation signals indicated the effectiveness of the hard threshold, and the impulse or harmonic characteristic components could be extracted separately. Meanwhile, the robustness of the composite dictionary multi-atom matching algorithm at different noise levels was investigated. To address the effect of data length on the computational efficiency of the algorithm, an improved segmented decomposition and reconstruction algorithm was proposed, and the computational efficiency of the decomposition algorithm was significantly enhanced. In addition, it is shown that the multi-atom matching algorithm was superior to the single-atom matching algorithm in both computational efficiency and robustness. Finally, the above algorithm was applied to gear fault engineering signals and achieved good results.
Hybrid-optimization strategy for the communication of large-scale Kinetic Monte Carlo simulation
NASA Astrophysics Data System (ADS)
Wu, Baodong; Li, Shigang; Zhang, Yunquan; Nie, Ningming
2017-02-01
The parallel Kinetic Monte Carlo (KMC) algorithm based on domain decomposition has been widely used in large-scale physical simulations. However, the communication overhead of the parallel KMC algorithm is critical and severely degrades the overall performance and scalability. In this paper, we present a hybrid optimization strategy to reduce the communication overhead for parallel KMC simulations. We first propose a communication aggregation algorithm to reduce the total number of messages and eliminate the communication redundancy. Then, we utilize shared memory to reduce the memory copy overhead of the intra-node communication. Finally, we optimize the communication scheduling using neighborhood collective operations. We demonstrate the scalability and high performance of our hybrid optimization strategy by both theoretical and experimental analysis. Results show that the optimized KMC algorithm exhibits better performance and scalability than the well-known open-source library SPPARKS. On a 32-node Xeon E5-2680 cluster (640 cores in total), the optimized algorithm reduces the communication time by 24.8% compared with SPPARKS.
Convolution- and Fourier-transform-based reconstructors for pyramid wavefront sensor.
Shatokhina, Iuliia; Ramlau, Ronny
2017-08-01
In this paper, we present two novel algorithms for wavefront reconstruction from pyramid-type wavefront sensor data. An overview of the current state-of-the-art in the application of pyramid-type wavefront sensors shows that the novel algorithms can be applied in various scientific fields such as astronomy, ophthalmology, and microscopy. Assuming a computationally very challenging setting corresponding to the extreme adaptive optics (XAO) on the European Extremely Large Telescope, we present the results of the performed end-to-end simulations and compare the achieved AO correction quality (in terms of the long-exposure Strehl ratio) to other methods, such as matrix-vector multiplication and preprocessed cumulative reconstructor with domain decomposition. Also, we provide a comparison in terms of applicability and computational complexity and closed-loop performance of our novel algorithms to other methods existing for this type of sensor.
Dual domain watermarking for authentication and compression of cultural heritage images.
Zhao, Yang; Campisi, Patrizio; Kundur, Deepa
2004-03-01
This paper proposes an approach for the combined image authentication and compression of color images by making use of a digital watermarking and data hiding framework. The digital watermark is comprised of two components: a soft-authenticator watermark for authentication and tamper assessment of the given image, and a chrominance watermark employed to improve the efficiency of compression. The multipurpose watermark is designed by exploiting the orthogonality of various domains used for authentication, color decomposition and watermark insertion. The approach is implemented as a DCT-DWT dual domain algorithm and is applied for the protection and compression of cultural heritage imagery. Analysis is provided to characterize the behavior of the scheme under ideal conditions. Simulations and comparisons of the proposed approach with state-of-the-art existing work demonstrate the potential of the overall scheme.
Applications of singular value analysis and partial-step algorithm for nonlinear orbit determination
NASA Technical Reports Server (NTRS)
Ryne, Mark S.; Wang, Tseng-Chan
1991-01-01
An adaptive method in which cruise and nonlinear orbit determination problems can be solved using a single program is presented. It involves singular value decomposition augmented with an extended partial step algorithm. The extended partial step algorithm constrains the size of the correction to the spacecraft state and other solve-for parameters. The correction is controlled by an a priori covariance and a user-supplied bounds parameter. The extended partial step method is an extension of the update portion of the singular value decomposition algorithm. It thus preserves the numerical stability of the singular value decomposition method, while extending the region over which it converges. In linear cases, this method reduces to the singular value decomposition algorithm with the full rank solution. Two examples are presented to illustrate the method's utility.
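The combination is straightforward to sketch: solve the linearized least-squares correction by SVD, discard negligible singular values for numerical stability, and cap the step length. The cap below is a simplification for illustration; the paper's extended partial-step method also folds in the a priori covariance.

```python
import numpy as np

def svd_partial_step(A, r, bound, rcond=1e-10):
    """Least-squares correction dx for A dx ~ r via truncated SVD,
    with the step length capped (a simplified 'partial step')."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    keep = s > rcond * s[0]                    # drop tiny singular values
    dx = Vt[keep].T @ ((U[:, keep].T @ r) / s[keep])
    norm = np.linalg.norm(dx)
    if norm > bound:                           # shrink, keep direction
        dx *= bound / norm
    return dx
```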
Optimal cost design of water distribution networks using a decomposition approach
NASA Astrophysics Data System (ADS)
Lee, Ho Min; Yoo, Do Guen; Sadollah, Ali; Kim, Joong Hoon
2016-12-01
Water distribution network decomposition, which is an engineering approach, is adopted to increase the efficiency of obtaining the optimal cost design of a water distribution network using an optimization algorithm. This study applied the source tracing tool in EPANET, a hydraulic and water quality analysis model, to the decomposition of a network to improve the efficiency of the optimal design process. The proposed approach was tested by carrying out the optimal cost design of two water distribution networks, and the results were compared with optimal cost designs derived from previously proposed optimization algorithms. The proposed decomposition approach using the source tracing technique enables the efficient decomposition of an actual large-scale network, and the results can be combined with the optimal cost design process using an optimization algorithm. The results show that the final design obtained in this study improves on those obtained with previously proposed optimization algorithms.
NASA Astrophysics Data System (ADS)
Hu, Xiaogang; Rymer, William Z.; Suresh, Nina L.
2014-04-01
Objective. The aim of this study is to assess the accuracy of a surface electromyogram (sEMG) motor unit (MU) decomposition algorithm during low levels of muscle contraction. Approach. A two-source method was used to verify the accuracy of the sEMG decomposition system, by utilizing simultaneous intramuscular and surface EMG recordings from the human first dorsal interosseous muscle recorded during isometric trapezoidal force contractions. Spike trains from each recording type were decomposed independently utilizing two different algorithms, EMGlab and dEMG decomposition algorithms. The degree of agreement of the decomposed spike timings was assessed for three different segments of the EMG signals, corresponding to specified regions in the force task. A regression analysis was performed to examine whether certain properties of the sEMG and force signal can predict the decomposition accuracy. Main results. The average accuracy of successful decomposition among the 119 MUs that were common to both intramuscular and surface records was approximately 95%, and the accuracy was comparable between the different segments of the sEMG signals (i.e., force ramp-up versus steady state force versus combined). The regression function between the accuracy and properties of sEMG and force signals revealed that the signal-to-noise ratio of the action potential and stability in the action potential records were significant predictors of the surface decomposition accuracy. Significance. The outcomes of our study confirm the accuracy of the sEMG decomposition algorithm during low muscle contraction levels and provide confidence in the overall validity of the surface dEMG decomposition algorithm.
Algorithms and Application of Sparse Matrix Assembly and Equation Solvers for Aeroacoustics
NASA Technical Reports Server (NTRS)
Watson, W. R.; Nguyen, D. T.; Reddy, C. J.; Vatsa, V. N.; Tang, W. H.
2001-01-01
An algorithm for symmetric sparse equation solutions on an unstructured grid is described. Efficient, sequential sparse algorithms for degree-of-freedom reordering, supernodes, symbolic/numerical factorization, and forward-backward solution phases are reviewed. Three sparse algorithms for the generation and assembly of symmetric systems of matrix equations are presented. The accuracy and numerical performance of the sequential version of the sparse algorithms are evaluated over the frequency range of interest in a three-dimensional aeroacoustics application. Results show that the solver solutions are accurate using a discretization of 12 points per wavelength. Results also show that the first assembly algorithm is impractical for high-frequency noise calculations. The second and third assembly algorithms have nearly equal performance at low source frequencies, but at higher source frequencies the third algorithm saves CPU time and RAM. The CPU time and the RAM required by the second and third assembly algorithms are two orders of magnitude smaller than those required by the sparse equation solver. A sequential version of these sparse algorithms can, therefore, be conveniently incorporated into a substructuring (domain decomposition) formulation to achieve parallel computation, where different substructures are handled by different parallel processors.
Uncertainty propagation in orbital mechanics via tensor decomposition
NASA Astrophysics Data System (ADS)
Sun, Yifei; Kumar, Mrinal
2016-03-01
Uncertainty forecasting in orbital mechanics is an essential but difficult task, primarily because the underlying Fokker-Planck equation (FPE) is defined on a relatively high dimensional (6-D) state-space and is driven by the nonlinear perturbed Keplerian dynamics. In addition, an enormously large solution domain is required for numerical solution of this FPE (e.g. encompassing the entire orbit in the x-y-z subspace), of which the state probability density function (pdf) occupies a tiny fraction at any given time. This coupling of large size, high dimensionality and nonlinearity makes for a formidable computational task, and has caused the FPE for orbital uncertainty propagation to remain an unsolved problem. To the best of the authors' knowledge, this paper presents the first successful direct solution of the FPE for perturbed Keplerian mechanics. To tackle the dimensionality issue, the time-varying state pdf is approximated in the CANDECOMP/PARAFAC decomposition tensor form where all six spatial dimensions as well as the time dimension are separated from one another. The pdf approximation for all times is obtained simultaneously via the alternating least squares algorithm. Chebyshev spectral differentiation is employed for discretization on account of its spectral ("super-fast") convergence rate. To facilitate the tensor decomposition and control the solution domain size, system dynamics is expressed using spherical coordinates in a noninertial reference frame. Numerical results obtained on a regular personal computer are compared with Monte Carlo simulations.
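Generic CP (CANDECOMP/PARAFAC) fitting by alternating least squares is compact in NumPy. The sketch below decomposes a given 3-way data tensor and only illustrates the representation; the paper itself solves the FPE directly for the factors across all six state dimensions plus time.

```python
import numpy as np
from scipy.linalg import khatri_rao

def cp_als(X, rank, iters=100):
    """Alternating least squares for a 3-way CP model
    X ~ sum_r a_r (outer) b_r (outer) c_r."""
    I, J, K = X.shape
    A, B, C = (np.random.rand(d, rank) for d in (I, J, K))
    X0 = X.reshape(I, -1)                       # mode-1 unfolding (C order)
    X1 = np.moveaxis(X, 1, 0).reshape(J, -1)    # mode-2 unfolding
    X2 = np.moveaxis(X, 2, 0).reshape(K, -1)    # mode-3 unfolding
    for _ in range(iters):
        A = X0 @ khatri_rao(B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = X1 @ khatri_rao(A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = X2 @ khatri_rao(A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
    return A, B, C
```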
A New Domain Decomposition Approach for the Gust Response Problem
NASA Technical Reports Server (NTRS)
Scott, James R.; Atassi, Hafiz M.; Susan-Resiga, Romeo F.
2002-01-01
A domain decomposition method is developed for solving the aerodynamic/aeroacoustic problem of an airfoil in a vortical gust. The computational domain is divided into inner and outer regions wherein the governing equations are cast in different forms suitable for accurate computations in each region. Boundary conditions which ensure continuity of pressure and velocity are imposed along the interface separating the two regions. A numerical study is presented for reduced frequencies ranging from 0.1 to 3.0. The domain decomposition approach is seen to provide robust and grid-independent solutions.
Automated Development of Accurate Algorithms and Efficient Codes for Computational Aeroacoustics
NASA Technical Reports Server (NTRS)
Goodrich, John W.; Dyson, Rodger W.
1999-01-01
The simulation of sound generation and propagation in three space dimensions with realistic aircraft components is a very large time-dependent computation with fine details. Simulations in open domains with embedded objects require accurate and robust algorithms for propagation, for artificial inflow and outflow boundaries, and for the definition of geometrically complex objects. The development, implementation, and validation of methods for solving these demanding problems is being done to support the NASA pillar goals for reducing aircraft noise levels. Our goal is to provide algorithms which are sufficiently accurate and efficient to produce usable results rapidly enough to allow design engineers to study the effects on sound levels of design changes in propulsion systems, and in the integration of propulsion systems with airframes. There is a lack of design tools for these purposes at this time. Our technical approach to this problem combines the development of new algorithms with the use of Mathematica and Unix utilities to automate algorithm development, code implementation, and validation. We use explicit methods to ensure effective implementation by domain decomposition for SPMD parallel computing. There are several orders of magnitude difference in the computational efficiencies of the algorithms which we have considered. We currently have new artificial inflow and outflow boundary conditions that are stable, accurate, and unobtrusive, with implementations that match the accuracy and efficiency of the propagation methods. The artificial numerical boundary treatments have been proven to have solutions which converge to the full open domain problems, so that the error from the boundary treatments can be driven as low as is required. The purpose of this paper is to briefly present a method for developing highly accurate algorithms for computational aeroacoustics, the use of computer automation in this process, and a brief survey of the algorithms that have resulted from this work. A review of computational aeroacoustics has recently been given by Lele.
FDTD modelling of induced polarization phenomena in transient electromagnetics
NASA Astrophysics Data System (ADS)
Commer, Michael; Petrov, Peter V.; Newman, Gregory A.
2017-04-01
The finite-difference time-domain scheme is augmented in order to treat the modelling of transient electromagnetic signals containing induced polarization effects from 3-D distributions of polarizable media. Compared to the non-dispersive problem, the discrete dispersive Maxwell system contains costly convolution operators. Key components of our solution for highly digitized model meshes are Debye decomposition and composite memory variables. We revert to the popular Cole-Cole model of dispersion to describe the frequency-dependent behaviour of electrical conductivity. Its inversely Laplace-transformed Debye decomposition results in a series of time convolutions between the electric field and exponential decay functions, with the latter reflecting each Debye constituent's individual relaxation time. These function types in the discrete-time convolution allow for their substitution by memory variables, annihilating the otherwise prohibitive computing demands. Numerical examples demonstrate the efficiency and practicality of our algorithm.
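The memory-variable device rests on the fact that convolution with an exponential kernel obeys a one-step recursion, so the field history never needs to be stored. A sketch under a piecewise-constant-field assumption; the exact update weights depend on the convolution scheme adopted.

```python
import numpy as np

def update_memory(psi, E, dt, tau, delta_sigma):
    """One step of the recursive convolution
    psi(t) = integral of delta_sigma * exp(-(t - s)/tau) * E(s) ds.
    Holding E constant over the step gives the closed-form weight."""
    decay = np.exp(-dt / tau)
    return decay * psi + delta_sigma * tau * (1.0 - decay) * E
```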
Near-lossless multichannel EEG compression based on matrix and tensor decompositions.
Dauwels, Justin; Srinivasan, K; Reddy, M Ramasubba; Cichocki, Andrzej
2013-05-01
A novel near-lossless compression algorithm for multichannel electroencephalogram (MC-EEG) is proposed based on matrix/tensor decomposition models. MC-EEG is represented in suitable multiway (multidimensional) forms to efficiently exploit temporal and spatial correlations simultaneously. Several matrix/tensor decomposition models are analyzed in view of efficient decorrelation of the multiway forms of MC-EEG. A compression algorithm is built based on the principle of “lossy plus residual coding,” consisting of a matrix/tensor decomposition-based coder in the lossy layer followed by arithmetic coding in the residual layer. This approach guarantees a specifiable maximum absolute error between original and reconstructed signals. The compression algorithm is applied to three different scalp EEG datasets and an intracranial EEG dataset, each with different sampling rate and resolution. The proposed algorithm achieves attractive compression ratios compared to compressing individual channels separately. For similar compression ratios, the proposed algorithm achieves nearly fivefold lower average error compared to a similar wavelet-based volumetric MC-EEG compression algorithm.
Visual based laser speckle pattern recognition method for structural health monitoring
NASA Astrophysics Data System (ADS)
Park, Kyeongtaek; Torbol, Marco
2017-04-01
This study performed the system identification of a target structure by analyzing the laser speckle pattern taken by a camera. The laser speckle pattern is generated by the diffuse reflection of the laser beam on a rough surface of the target structure. The camera, equipped with a red filter, records the scattered speckle particles of the laser light in real time, and the raw speckle image pixel data are fed to the graphics processing unit (GPU) in the system. The algorithm for laser speckle contrast analysis (LASCA) computes the laser speckle contrast images and the laser speckle flow images. The k-means clustering algorithm is used to classify the pixels in each frame, and the clusters' centroids, which function as virtual sensors, track the displacement between different frames in the time domain. The fast Fourier transform (FFT) and the frequency domain decomposition (FDD) compute the modal properties of the structure: natural frequencies and damping ratios. This study takes advantage of the large-scale computational capability of GPUs. The algorithm is written in Compute Unified Device Architecture C (CUDA C), which allows the processing of speckle images in real time.
NASA Technical Reports Server (NTRS)
Watson, Brian; Kamat, M. P.
1990-01-01
Element-by-element preconditioned conjugate gradient (EBE-PCG) algorithms have been advocated for use in parallel/vector processing environments as being superior to the conventional LDL^T decomposition algorithm for single load cases. Although there may be some advantages in using such algorithms for a single load case, when it comes to situations involving multiple load cases, the LDL^T decomposition algorithm would appear to be decidedly more cost-effective. The authors have outlined an EBE-PCG algorithm suitable for multiple load cases and compared its effectiveness to the highly efficient LDL^T decomposition scheme. The proposed algorithm offers almost no advantages over the LDL^T algorithm for the linear problems investigated on the Alliant FX/8. However, there may be some merit in the algorithm in solving nonlinear problems with load incrementation, but that remains to be investigated.
Randomized Dynamic Mode Decomposition
NASA Astrophysics Data System (ADS)
Erichson, N. Benjamin; Brunton, Steven L.; Kutz, J. Nathan
2017-11-01
The dynamic mode decomposition (DMD) is an equation-free, data-driven matrix decomposition that is capable of providing accurate reconstructions of spatio-temporal coherent structures arising in dynamical systems. We present randomized algorithms to compute the near-optimal low-rank dynamic mode decomposition for massive datasets. Randomized algorithms are simple, accurate and able to ease the computational challenges arising with `big data'. Moreover, randomized algorithms are amenable to modern parallel and distributed computing. The idea is to derive a smaller matrix from the high-dimensional input data matrix using randomness as a computational strategy. Then, the dynamic modes and eigenvalues are accurately learned from this smaller representation of the data, whereby the approximation quality can be controlled via oversampling and power iterations. Here, we present randomized DMD algorithms that are categorized by how many passes the algorithm takes through the data. Specifically, the single-pass randomized DMD does not require data to be stored for subsequent passes. Thus, it is possible to approximately decompose massive fluid flows (stored out of core memory, or not stored at all) using single-pass algorithms, which is infeasible with traditional DMD algorithms.
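A basic two-pass randomized DMD follows the sketch-then-solve recipe: compress the snapshots with a random test matrix, orthonormalize, and run exact DMD on the projected pair. The following is a hedged illustration (X holds snapshots as columns of length n, Y the time-shifted snapshots); the paper's single-pass variants avoid revisiting the data altogether.

```python
import numpy as np

def randomized_dmd(X, Y, rank, oversample=10, n_power=2):
    """Randomized DMD: sketch the snapshot matrix, then run exact DMD
    on the small projected pair (X, Y)."""
    rng = np.random.default_rng(0)
    Omega = rng.standard_normal((X.shape[1], rank + oversample))
    Q = np.linalg.qr(X @ Omega)[0]
    for _ in range(n_power):              # power iterations sharpen the range
        Q = np.linalg.qr(X @ (X.T @ Q))[0]
    Xs, Ys = Q.T @ X, Q.T @ Y             # sketched snapshot pair
    U, s, Vt = np.linalg.svd(Xs, full_matrices=False)
    Ur, sr, Vr = U[:, :rank], s[:rank], Vt[:rank].T
    Atil = Ur.T @ Ys @ Vr / sr            # projected linear operator
    evals, W = np.linalg.eig(Atil)
    modes = Q @ (Ys @ Vr / sr) @ W        # exact DMD modes, lifted back
    return evals, modes
```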
DOE Office of Scientific and Technical Information (OSTI.GOV)
Staschus, K.
1985-01-01
In this dissertation, efficient algorithms for electric-utility capacity expansion planning with renewable energy are developed. The algorithms include a deterministic phase that quickly finds a near-optimal expansion plan using derating and a linearized approximation to the time-dependent availability of nondispatchable energy sources. A probabilistic second phase needs comparatively few computer-time consuming probabilistic simulation iterations to modify this solution towards the optimal expansion plan. For the deterministic first phase, two algorithms, based on a Lagrangian dual decomposition and a Generalized Benders Decomposition, are developed. The probabilistic second phase uses a Generalized Benders Decomposition approach. Extensive computational tests of the algorithms are reported. Among the deterministic algorithms, the one based on Lagrangian duality proves fastest. The two-phase approach is shown to save up to 80% in computing time as compared to a purely probabilistic algorithm. The algorithms are applied to determine the optimal expansion plan for the Tijuana-Mexicali subsystem of the Mexican electric utility system. A strong recommendation to push conservation programs in the desert city of Mexicali results from this implementation.
Hou, Tingjun; Zhang, Wei; Case, David A; Wang, Wei
2008-02-29
Many important protein-protein interactions are mediated by peptide recognition modular domains, such as the Src homology 3 (SH3), SH2, PDZ, and WW domains. Characterizing the interaction interface of domain-peptide complexes and predicting binding specificity for modular domains are critical for deciphering protein-protein interaction networks. Here, we propose the use of an energetic decomposition analysis to characterize domain-peptide interactions and the molecular interaction energy components (MIECs), including van der Waals, electrostatic, and desolvation energy between residue pairs on the binding interface. We show a proof-of-concept study on the amphiphysin-1 SH3 domain interacting with its peptide ligands. The structures of the human amphiphysin-1 SH3 domain complexed with 884 peptides were first modeled using virtual mutagenesis and optimized by molecular mechanics (MM) minimization. Next, the MIECs between domain and peptide residues were computed using the MM/generalized Born decomposition analysis. We conducted two types of statistical analyses on the MIECs to demonstrate their usefulness for predicting binding affinities of peptides and for classifying peptides into binder and non-binder categories. First, combining partial least squares analysis and genetic algorithm, we fitted linear regression models between the MIECs and the peptide binding affinities on the training data set. These models were then used to predict binding affinities for peptides in the test data set; the predicted values have a correlation coefficient of 0.81 and an unsigned mean error of 0.39 compared with the experimentally measured ones. The partial least squares-genetic algorithm analysis on the MIECs revealed the critical interactions for the binding specificity of the amphiphysin-1 SH3 domain. Next, a support vector machine (SVM) was employed to build classification models based on the MIECs of peptides in the training set. A rigorous training-validation procedure was used to assess the performances of different kernel functions in SVM and different combinations of the MIECs. The best SVM classifier gave satisfactory predictions for the test set, indicated by average prediction accuracy rates of 78% and 91% for the binding and non-binding peptides, respectively. We also showed that the performance of our approach on both binding affinity prediction and binder/non-binder classification was superior to the performances of the conventional MM/Poisson-Boltzmann solvent-accessible surface area and MM/generalized Born solvent-accessible surface area calculations. Our study demonstrates that the analysis of the MIECs between peptides and the SH3 domain can successfully characterize the binding interface, and it provides a framework to derive integrated prediction models for different domain-peptide systems.
An analysis of scatter decomposition
NASA Technical Reports Server (NTRS)
Nicol, David M.; Saltz, Joel H.
1990-01-01
A formal analysis of a powerful mapping technique known as scatter decomposition is presented. Scatter decomposition divides an irregular computational domain into a large number of equal-sized pieces and distributes them modularly among processors. A probabilistic model of workload in one dimension is used to explain formally why and when scatter decomposition works. The first result is that if correlation in workload is a convex function of distance, then scattering a more finely decomposed domain yields a lower average processor workload variance. The second result shows that if the workload process is stationary Gaussian and the correlation function decreases linearly in distance until becoming zero and then remains zero, scattering a more finely decomposed domain yields a lower expected maximum processor workload. Finally, it is shown that if the correlation function decreases linearly across the entire domain, then among all mappings that assign an equal number of domain pieces to each processor, scatter decomposition minimizes the average processor workload variance. The dependence of these results on the assumption of decreasing correlation is illustrated with situations where a coarser granularity actually achieves better load balance.
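The mapping itself is one line, modular assignment of the finely divided pieces, and its load-balancing effect is easy to probe numerically, as in this small sketch with a hypothetical smoothly varying workload (not the paper's analytical model):

```python
import numpy as np

def scatter_map(n_pieces, n_procs):
    """Piece i goes to processor i mod P, spreading spatially
    correlated (hot) regions across all processors."""
    return np.arange(n_pieces) % n_procs

def per_proc_load(work, assign, n_procs):
    return np.array([work[assign == p].sum() for p in range(n_procs)])

# Hypothetical workload with smooth spatial correlation:
x = np.linspace(0.0, 1.0, 1024)
work = 1.0 + np.sin(2 * np.pi * x) ** 2
scattered = per_proc_load(work, scatter_map(1024, 8), 8)
blocked = per_proc_load(work, np.repeat(np.arange(8), 128), 8)
# scattered.std() comes out far smaller than blocked.std()
```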
Simulation results for a finite element-based cumulative reconstructor
NASA Astrophysics Data System (ADS)
Wagner, Roland; Neubauer, Andreas; Ramlau, Ronny
2017-10-01
Modern ground-based telescopes rely on adaptive optics (AO) systems for the compensation of image degradation caused by atmospheric turbulence. Within an AO system, measurements of incoming light from guide stars are used to adjust, in real time, deformable mirror(s) that correct for atmospheric distortions. The incoming wavefront has to be derived from sensor measurements, and this intermediate result is then translated into the shape(s) of the deformable mirror(s). Rapid changes of the atmosphere lead to the need for fast wavefront reconstruction algorithms. We review a fast matrix-free algorithm that was developed by Neubauer to reconstruct the incoming wavefront from Shack-Hartmann measurements based on a finite element discretization of the telescope aperture. The method is enhanced by a domain decomposition ansatz. We show that this algorithm reaches the quality of standard approaches in end-to-end simulation while maintaining the speed of recently introduced linear-order solvers.
Compressed-sensing wavenumber-scanning interferometry
NASA Astrophysics Data System (ADS)
Bai, Yulei; Zhou, Yanzhou; He, Zhaoshui; Ye, Shuangli; Dong, Bo; Xie, Shengli
2018-01-01
The Fourier transform (FT), the nonlinear least-squares algorithm (NLSA), and eigenvalue decomposition algorithm (EDA) are used to evaluate the phase field in depth-resolved wavenumber-scanning interferometry (DRWSI). However, because the wavenumber series of the laser's output is usually accompanied by nonlinearity and mode-hop, FT, NLSA, and EDA, which are only suitable for equidistant interference data, often lead to non-negligible phase errors. In this work, a compressed-sensing method for DRWSI (CS-DRWSI) is proposed to resolve this problem. By using the randomly spaced inverse Fourier matrix and solving the underdetermined equation in the wavenumber domain, CS-DRWSI determines the nonuniform sampling and spectral leakage of the interference spectrum. Furthermore, it can evaluate interference data without prior knowledge of the object. The experimental results show that CS-DRWSI improves the depth resolution and suppresses sidelobes. It can replace the FT as a standard algorithm for DRWSI.
Computational plasticity algorithm for particle dynamics simulations
NASA Astrophysics Data System (ADS)
Krabbenhoft, K.; Lyamin, A. V.; Vignes, C.
2018-01-01
The problem of particle dynamics simulation is interpreted in the framework of computational plasticity leading to an algorithm which is mathematically indistinguishable from the common implicit scheme widely used in the finite element analysis of elastoplastic boundary value problems. This algorithm provides somewhat of a unification of two particle methods, the discrete element method and the contact dynamics method, which usually are thought of as being quite disparate. In particular, it is shown that the former appears as the special case where the time stepping is explicit while the use of implicit time stepping leads to the kind of schemes usually labelled contact dynamics methods. The framing of particle dynamics simulation within computational plasticity paves the way for new approaches similar (or identical) to those frequently employed in nonlinear finite element analysis. These include mixed implicit-explicit time stepping, dynamic relaxation and domain decomposition schemes.
NASA Technical Reports Server (NTRS)
Dongarra, Jack (Editor); Messina, Paul (Editor); Sorensen, Danny C. (Editor); Voigt, Robert G. (Editor)
1990-01-01
Attention is given to such topics as an evaluation of block algorithm variants in LAPACK, a large-grain parallel sparse system solver, a multiprocessor method for the solution of the generalized eigenvalue problem on an interval, and a parallel QR algorithm for iterative subspace methods on the CM2. A discussion of numerical methods includes the topics of asynchronous numerical solutions of PDEs on parallel computers, parallel homotopy curve tracking on a hypercube, and solving Navier-Stokes equations on the Cedar Multi-Cluster system. A section on differential equations includes a discussion of a six-color procedure for the parallel solution of elliptic systems using the finite quadtree structure, data parallel algorithms for the finite element method, and domain decomposition methods in aerodynamics. Topics dealing with massively parallel computing include hypercube vs. 2-dimensional meshes and massively parallel computation of conservation laws. Performance and tools are also discussed.
Hybrid Nested Partitions and Math Programming Framework for Large-scale Combinatorial Optimization
2010-03-31
optimization problems: 1) exact algorithms and 2) metaheuristic algorithms. This project will integrate concepts from these two technologies to develop...optimal solutions within an acceptable amount of computation time, and 2) metaheuristic algorithms such as genetic algorithms, tabu search, and the...integer programming decomposition approaches, such as Dantzig-Wolfe decomposition and Lagrangian relaxation, and metaheuristics such as the Nested
SU-F-I-41: Calibration-Free Material Decomposition for Dual-Energy CT
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhao, W; Xing, L; Zhang, Q
2016-06-15
Purpose: To eliminate tedious phantom calibration or manual region of interest (ROI) selection as required in dual-energy CT material decomposition, we establish a new projection-domain material decomposition framework with incorporation of the energy spectrum. Methods: Similar to the case of dual-energy CT, the integral of the basis material image in our model is expressed as a linear combination of basis functions, which are the polynomials of high- and low-energy raw projection data. To yield the unknown coefficients of the linear combination, the proposed algorithm minimizes the quadratic error between the high- and low-energy raw projection data and the projection calculated using material images. We evaluate the algorithm with an iodine concentration numerical phantom at different dose and iodine concentration levels. The x-ray energy spectra of the high and low energy are estimated using an indirect transmission method. The derived monochromatic images are compared with the high- and low-energy CT images to demonstrate beam hardening artifacts reduction. Quantitative results were measured and compared to the true values. Results: The differences between the true density values used for simulation and those obtained from the monochromatic images are 1.8%, 1.3%, 2.3%, and 2.9% for the dose levels from standard dose to 1/8 dose, and are 0.4%, 0.7%, 1.5%, and 1.8% for the four iodine concentration levels from 6 mg/mL to 24 mg/mL. For all of the cases, beam hardening artifacts, especially streaks shown between dense inserts, are almost completely removed in the monochromatic images. Conclusion: The proposed algorithm provides an effective way to yield material images and artifact-free monochromatic images at different dose levels without the need for phantom calibration or ROI selection. Furthermore, the approach also yields accurate results when the concentration of the iodine insert is very low, suggesting the algorithm is robust with respect to the low-contrast scenario.
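A hedged sketch of the projection-domain model described above: the basis-material line integral is fit as a polynomial combination of the high- and low-energy raw projections by least squares. The quadratic basis, the synthetic "ground truth" coefficients, and the training pairs are illustrative assumptions, not values from the study.

```python
# Fit basis-material line integrals as polynomials of the raw projections.
import numpy as np

def poly_basis(pH, pL):
    # 2nd-order polynomial basis in the high/low-energy raw projections
    return np.column_stack([np.ones_like(pH), pH, pL,
                            pH ** 2, pL ** 2, pH * pL])

rng = np.random.default_rng(2)
pH = rng.uniform(0.0, 3.0, 500)            # high-energy projection samples
pL = rng.uniform(0.0, 3.0, 500)            # low-energy projection samples
A_iodine = 0.8 * pL - 0.5 * pH + 0.1 * pH * pL   # synthetic "true" integrals

# Coefficients found by minimizing the quadratic error, as in the abstract
coef, *_ = np.linalg.lstsq(poly_basis(pH, pL), A_iodine, rcond=None)
A_est = poly_basis(pH, pL) @ coef          # decomposed basis-material projections
```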
A frequency domain radar interferometric imaging (FII) technique based on high-resolution methods
NASA Astrophysics Data System (ADS)
Luce, H.; Yamamoto, M.; Fukao, S.; Helal, D.; Crochet, M.
2001-01-01
In the present work, we propose a frequency-domain interferometric imaging (FII) technique for better knowledge of the vertical distribution of the atmospheric scatterers detected by MST radars. This is an extension of the dual frequency-domain interferometry (FDI) technique to multiple frequencies. Its objective is to reduce the ambiguity (resulting from the use of only two adjacent frequencies) inherent in the FDI technique. Different methods, commonly used in antenna array processing, are first described within the context of application to the FII technique. These methods are Fourier-based imaging, the Capon method, and the singular value decomposition method used with the MUSIC algorithm. Some preliminary simulations and tests performed on data collected with the middle and upper atmosphere (MU) radar (Shigaraki, Japan) are also presented. This work is a first step in the development of the FII technique, which seems to be very promising.
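For readers unfamiliar with the MUSIC step, the following sketch computes a MUSIC pseudo-spectrum from the snapshot covariance; the channel count, random snapshot data, steering vectors, and assumed number of scatterers are placeholders, not values from the MU radar experiments.

```python
# MUSIC pseudo-spectrum from an eigendecomposition of the covariance.
import numpy as np

def music_spectrum(snapshots, steering, n_sources):
    """snapshots: (n_channels, n_snapshots) complex data;
    steering: (n_channels, n_scan) candidate steering vectors."""
    R = snapshots @ snapshots.conj().T / snapshots.shape[1]  # sample covariance
    w, V = np.linalg.eigh(R)               # eigenvalues in ascending order
    En = V[:, :-n_sources]                 # noise-subspace eigenvectors
    denom = np.sum(np.abs(En.conj().T @ steering) ** 2, axis=0)
    return 1.0 / denom                     # peaks mark scatterer locations

rng = np.random.default_rng(1)
snaps = rng.normal(size=(16, 64)) + 1j * rng.normal(size=(16, 64))
steer = np.exp(1j * np.outer(np.arange(16), np.linspace(0, np.pi, 200)))
spectrum = music_spectrum(snaps, steer, n_sources=2)
```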
Data-driven process decomposition and robust online distributed modelling for large-scale processes
NASA Astrophysics Data System (ADS)
Shu, Zhang; Lijuan, Li; Lijuan, Yao; Shipin, Yang; Tao, Zou
2018-02-01
With increasing attention to networked control, system decomposition and distributed models are of significant importance in the implementation of model-based control strategies. In this paper, a data-driven system decomposition and online distributed subsystem modelling algorithm is proposed for large-scale chemical processes. The key controlled variables are first partitioned by the affinity propagation clustering algorithm into several clusters, each of which can be regarded as a subsystem. Then the inputs of each subsystem are selected by offline canonical correlation analysis between all process variables and the subsystem's controlled variables. Process decomposition is thus realised after the screening of input and output variables. Once the system decomposition is finished, online subsystem modelling can be carried out by recursively renewing the samples block-wise. The proposed algorithm was applied to the Tennessee Eastman process and its validity was verified.
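A rough sketch of the decomposition stage under stated assumptions: controlled variables with block-correlated structure are clustered by affinity propagation (each cluster taken as one subsystem), and candidate inputs are then ranked by their leading canonical correlation with each cluster's outputs. The data shapes, the correlation-based similarity, and the top-5 cutoff are all hypothetical.

```python
# Affinity propagation clustering of outputs, then CCA-based input ranking.
import numpy as np
from sklearn.cluster import AffinityPropagation
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(3)
# 12 controlled variables built from 3 hidden modes -> 3 natural clusters
base = rng.normal(size=(2000, 3))
Y = np.repeat(base, 4, axis=1) + 0.3 * rng.normal(size=(2000, 12))
U = rng.normal(size=(2000, 40))            # candidate process inputs

S = np.corrcoef(Y, rowvar=False)           # variable-to-variable similarity
labels = AffinityPropagation(affinity="precomputed",
                             random_state=0).fit_predict(S)

for k in sorted(set(labels)):
    Yk = Y[:, labels == k]                 # outputs of subsystem k
    scores = []
    for j in range(U.shape[1]):            # rank each input by its CCA score
        cca = CCA(n_components=1).fit(U[:, [j]], Yk)
        a, b = cca.transform(U[:, [j]], Yk)
        scores.append(abs(np.corrcoef(a.ravel(), b.ravel())[0, 1]))
    top = np.argsort(scores)[-5:]          # keep the 5 strongest inputs
    print(f"subsystem {k}: candidate inputs {sorted(top.tolist())}")
```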
Power System Decomposition for Practical Implementation of Bulk-Grid Voltage Control Methods
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vallem, Mallikarjuna R.; Vyakaranam, Bharat GNVSR; Holzer, Jesse T.
Power system algorithms such as AC optimal power flow and coordinated volt/var control of the bulk power system are computationally intensive and become difficult to solve in operational time frames. The computational time required to run these algorithms increases exponentially as the size of the power system increases. The solution time for multiple subsystems is less than that for solving the entire system simultaneously, and the local nature of the voltage problem lends itself to such decomposition. This paper describes an algorithm that can be used to perform power system decomposition from the point of view of the voltage control problem. Our approach takes advantage of the dominant localized effect of voltage control and is based on clustering buses according to the electrical distances between them. One of the contributions of the paper is to use multidimensional scaling to compute n-dimensional Euclidean coordinates for each bus based on electrical distance to perform algorithms like K-means clustering. A simple coordinated reactive power control of photovoltaic inverters for voltage regulation is used to demonstrate the effectiveness of the proposed decomposition algorithm and its components. The proposed decomposition method is demonstrated on the IEEE 118-bus system.
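A small sketch of the clustering idea: embed buses in Euclidean space from a matrix of pairwise electrical distances via multidimensional scaling, then group them with K-means. The distance matrix below is synthetic; in the paper it would come from the network itself, and the 118 buses and 8 zones are merely illustrative.

```python
# MDS embedding of pairwise "electrical distances", then K-means zoning.
import numpy as np
from sklearn.manifold import MDS
from sklearn.cluster import KMeans

rng = np.random.default_rng(4)
pts = rng.uniform(size=(118, 2))           # stand-in bus locations
D = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)  # distances

coords = MDS(n_components=3, dissimilarity="precomputed",
             random_state=0).fit_transform(D)   # Euclidean bus coordinates
zones = KMeans(n_clusters=8, n_init=10, random_state=0).fit_predict(coords)
print(np.bincount(zones))                  # bus count per control zone
```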
Enhanced visualization of abnormalities in digital-mammographic images
NASA Astrophysics Data System (ADS)
Young, Susan S.; Moore, William E.
2002-05-01
This paper describes two new presentation methods that are intended to improve the ability of radiologists to visualize abnormalities in mammograms by enhancing the appearance of the breast parenchyma pattern relative to the fatty-tissue surroundings. The first method, referred to as mountain-view, is obtained via multiscale edge decomposition through filter banks. The image is displayed in a multiscale edge domain that causes the image to have a topographic-like appearance. The second method displays the image in the intensity domain and is referred to as contrast-enhancement presentation. The input image is first passed through a decomposition filter bank to produce a filtered output (Id). The image at the lowest resolution is processed using a LUT (look-up table) to produce a tone-scaled image (I'). The LUT is designed to optimally map the code value range corresponding to the parenchyma pattern in the mammographic image into the dynamic range of the output medium. The algorithm uses a contrast weight control mechanism to produce the desired weight factors to enhance the edge information corresponding to the parenchyma pattern. The output image is formed using a reconstruction filter bank through I' and enhanced Id.
Sublattice parallel replica dynamics.
Martínez, Enrique; Uberuaga, Blas P; Voter, Arthur F
2014-06-01
Exascale computing presents a challenge for the scientific community as new algorithms must be developed to take full advantage of the new computing paradigm. Atomistic simulation methods that offer full fidelity to the underlying potential, i.e., molecular dynamics (MD) and parallel replica dynamics, fail to use the whole machine speedup, leaving a region in time and sample size space that is unattainable with current algorithms. In this paper, we present an extension of the parallel replica dynamics algorithm [A. F. Voter, Phys. Rev. B 57, R13985 (1998)] by combining it with the synchronous sublattice approach of Shim and Amar [Phys. Rev. B 71, 125432 (2005)], thereby exploiting event locality to improve the algorithm scalability. This algorithm is based on a domain decomposition in which events happen independently in different regions in the sample. We develop an analytical expression for the speedup given by this sublattice parallel replica dynamics algorithm and compare it with parallel MD and traditional parallel replica dynamics. We demonstrate how this algorithm, which introduces a slight additional approximation of event locality, enables the study of physical systems unreachable with traditional methodologies and promises to better utilize the resources of current high performance and future exascale computers.
NASA Astrophysics Data System (ADS)
Cheng, Boyang; Jin, Longxu; Li, Guoning
2018-06-01
The fusion of visible-light and infrared images has been a significant subject in imaging science. As a new contribution to this field, a novel fusion framework for visible-light and infrared images based on adaptive dual-channel unit-linking pulse coupled neural networks with singular value decomposition (ADS-PCNN) in the non-subsampled shearlet transform (NSST) domain is presented in this paper. First, the source images are decomposed into multi-direction and multi-scale sub-images by NSST. Furthermore, an improved novel sum modified-Laplacian (INSML) of the low-pass sub-image and an improved average gradient (IAVG) of the high-pass sub-images are input to stimulate the ADS-PCNN, respectively. To address the large spectral difference between infrared and visible light and the occurrence of black artifacts in fused images, a local structure information operator (LSI), derived from local-area singular value decomposition in each source image, is used as the adaptive linking strength that enhances fusion accuracy. Compared with PCNN models in other studies, the proposed method simplifies certain peripheral parameters, and the time matrix is utilized to decide the iteration number adaptively. A series of images from diverse scenes are used for fusion experiments, and the fusion results are evaluated subjectively and objectively. The results show that our algorithm exhibits superior fusion performance and is more effective than existing typical fusion techniques.
Neuromimetic Sound Representation for Percept Detection and Manipulation
NASA Astrophysics Data System (ADS)
Zotkin, Dmitry N.; Chi, Taishih; Shamma, Shihab A.; Duraiswami, Ramani
2005-12-01
The acoustic wave received at the ears is processed by the human auditory system to separate different sounds along the intensity, pitch, and timbre dimensions. Conventional Fourier-based signal processing, while endowed with fast algorithms, is unable to easily represent a signal along these attributes. In this paper, we discuss the creation of maximally separable sounds in auditory user interfaces and use a recently proposed cortical sound representation, which performs a biomimetic decomposition of an acoustic signal, to represent and manipulate sound for this purpose. We briefly overview algorithms for obtaining, manipulating, and inverting a cortical representation of a sound and describe algorithms for manipulating signal pitch and timbre separately. The algorithms are also used to create the sound of an instrument between a "guitar" and a "trumpet." Excellent sound quality can be achieved if processing time is not a concern, and intelligible signals can be reconstructed in reasonable processing time (about ten seconds of computational time for a one-second signal). Work on bringing the algorithms into the real-time processing domain is ongoing.
Recent advances in the modeling of plasmas with the Particle-In-Cell methods
NASA Astrophysics Data System (ADS)
Vay, Jean-Luc; Lehe, Remi; Vincenti, Henri; Godfrey, Brendan; Lee, Patrick; Haber, Irv
2015-11-01
The Particle-In-Cell (PIC) approach is the method of choice for self-consistent simulations of plasmas from first principles. The fundamentals of the PIC method were established decades ago but improvements or variations are continuously being proposed. We report on several recent advances in PIC related algorithms, including: (a) detailed analysis of the numerical Cherenkov instability and its remediation, (b) analytic pseudo-spectral electromagnetic solvers in Cartesian and cylindrical (with azimuthal modes decomposition) geometries, (c) arbitrary-order finite-difference and generalized pseudo-spectral Maxwell solvers, (d) novel analysis of Maxwell's solvers' stencil variation and truncation, in application to domain decomposition strategies and implementation of Perfectly Matched Layers in high-order and pseudo-spectral solvers. Work supported by US-DOE Contracts DE-AC02-05CH11231 and the US-DOE SciDAC program ComPASS. Used resources of NERSC, supported by US-DOE Contract DE-AC02-05CH11231.
Automatic classification of visual evoked potentials based on wavelet decomposition
NASA Astrophysics Data System (ADS)
Stasiakiewicz, Paweł; Dobrowolski, Andrzej P.; Tomczykiewicz, Kazimierz
2017-04-01
Diagnosis of the part of the visual system that is responsible for conducting compound action potentials is generally based on visual evoked potentials generated as a result of stimulation of the eye by an external light source. The condition of the patient's visual path is assessed by a set of parameters that describe the time-domain characteristic extremes called waves. The decision process is complex; therefore, the diagnosis depends significantly on the doctor's experience. The authors developed a procedure - based on wavelet decomposition and linear discriminant analysis - that ensures automatic classification of visual evoked potentials. The algorithm assigns each individual case to the normal or pathological class. The proposed classifier has 96.4% sensitivity at a 10.4% probability of false alarm in a group of 220 cases, and the area under the ROC curve equals 0.96, which, from the medical point of view, is a very good result.
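A compact sketch of this pipeline with synthetic signals standing in for visual evoked potentials: band energies from a wavelet decomposition serve as features for a linear discriminant classifier. The db4 wavelet, decomposition level, and data split are assumptions, not the authors' exact settings.

```python
# Wavelet band energies as features for linear discriminant classification.
import numpy as np
import pywt
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(5)
signals = rng.normal(size=(220, 512))      # synthetic stand-ins for VEPs
labels = rng.integers(0, 2, size=220)      # 0 = normal, 1 = pathological

def band_energies(x):
    # energy of each wavelet decomposition band as the feature vector
    return np.array([np.sum(c ** 2) for c in pywt.wavedec(x, "db4", level=5)])

X = np.array([band_energies(s) for s in signals])
clf = LinearDiscriminantAnalysis().fit(X[:150], labels[:150])
print("held-out accuracy:", clf.score(X[150:], labels[150:]))
```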
Parallel Algorithms for Computational Models of Geophysical Systems
NASA Astrophysics Data System (ADS)
Carrillo Ledesma, A.; Herrera, I.; de la Cruz, L. M.; Hernández, G.; Grupo de Modelacion Matematica y Computacional
2013-05-01
Mathematical models of many systems of interest, including very important continuous systems of Earth Sciences and Engineering, lead to a great variety of partial differential equations (PDEs) whose solution methods are based on the computational processing of large-scale algebraic systems. Furthermore, the incredible expansion experienced by the existing computational hardware and software has made problems of ever-increasing diversity and complexity, posed by scientific and engineering applications, amenable to effective treatment. Parallel computing is outstanding among the new computational tools and, in order to effectively use the most advanced computers available today, massively parallel software is required. Domain decomposition methods (DDMs) have been developed precisely for effectively treating PDEs in parallel. Ideally, the main objective of domain decomposition research is to produce algorithms capable of 'obtaining the global solution by exclusively solving local problems', but up to now this has only been an aspiration; that is, a strong desire for achieving such a property, and so we call it 'the DDM-paradigm'. In recent times, numerically competitive DDM-algorithms are non-overlapping, preconditioned and necessarily incorporate constraints, which pose an additional challenge for achieving the DDM-paradigm. Recently a group of four algorithms, referred to as the 'DVS-algorithms', which fulfill the DDM-paradigm, was developed. To derive them a new discretization method, which uses a non-overlapping system of nodes (the derived-nodes), was introduced. This discretization procedure can be applied to any boundary-value problem, or system of such equations. In turn, the resulting system of discrete equations can be treated using any available DDM-algorithm. In particular, two of the four DVS-algorithms mentioned above were obtained by application of the well-known and very effective algorithms BDDC and FETI-DP; these will be referred to as the DVS-BDDC and DVS-FETI-DP algorithms. The other two, which will be referred to as the DVS-PRIMAL and DVS-DUAL algorithms, were obtained by application of two new algorithms that had not been previously reported in the literature. As said before, the four DVS-algorithms constitute a group of preconditioned and constrained algorithms that, for the first time, fulfill the DDM-paradigm. Both BDDC and FETI-DP are very well known, and both are highly efficient. Recently, it was established that these two methods are closely related and their numerical performances are quite similar. On the other hand, through numerical experiments, we have established that the numerical performances of the members of the DVS-algorithms group (DVS-BDDC, DVS-FETI-DP, DVS-PRIMAL and DVS-DUAL) are very similar too. Furthermore, we have carried out comparisons of the performances of the standard versions of BDDC and FETI-DP with DVS-BDDC and DVS-FETI-DP, and in all such numerical experiments the DVS algorithms have performed significantly better.
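As a concrete illustration of "obtaining the global solution by solving local problems", here is a damped additive Schwarz iteration for a 1D Poisson problem. This is a generic textbook scheme, not one of the DVS algorithms; the mesh size, overlap, and damping factor are arbitrary choices.

```python
# Damped additive Schwarz for a 1D Poisson problem with two subdomains.
import numpy as np

n = 200
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # 1D Laplacian matrix
b = np.ones(n)
x = np.zeros(n)

subdomains = [np.arange(0, 120), np.arange(80, 200)]   # overlap of 40 nodes
for it in range(100):
    r = b - A @ x
    dx = np.zeros(n)
    for d in subdomains:                   # independent local solves
        dx[d] += np.linalg.solve(A[np.ix_(d, d)], r[d])
    x += 0.5 * dx                          # damped additive correction
print("final residual norm:", np.linalg.norm(b - A @ x))
```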
Traffic Simulations on Parallel Computers Using Domain Decomposition Techniques
DOT National Transportation Integrated Search
1995-01-01
Large scale simulations of Intelligent Transportation Systems (ITS) can only be achieved by using the computing resources offered by parallel computing architectures. Domain decomposition techniques are proposed which allow the performance of traffic...
Frequency hopping signal detection based on wavelet decomposition and Hilbert-Huang transform
NASA Astrophysics Data System (ADS)
Zheng, Yang; Chen, Xihao; Zhu, Rui
2017-07-01
Frequency hopping (FH) signals are widely adopted by military communications as a kind of low probability of interception signal. Therefore, it is important to research FH signal detection algorithms. Existing detection algorithms for FH signals based on time-frequency analysis cannot satisfy the time and frequency resolution requirements at the same time, due to the influence of the window function. In order to solve this problem, an algorithm based on wavelet decomposition and the Hilbert-Huang transform (HHT) is proposed. The proposed algorithm removes the noise of the received signals by wavelet decomposition and detects the FH signals by the Hilbert-Huang transform. Simulation results show that the proposed algorithm takes into account both the time resolution and the frequency resolution. Correspondingly, the accuracy of FH signal detection can be improved.
About decomposition approach for solving the classification problem
NASA Astrophysics Data System (ADS)
Andrianova, A. A.
2016-11-01
This article describes the application of decomposition methods to the binary classification problem of constructing a linear classifier based on the Support Vector Machine method. Applying decomposition reduces the volume of calculations, in particular because it opens the possibility of building parallel versions of the algorithm, which is a very important advantage for solving problems with big data. We analyze the results of computational experiments conducted using the decomposition approach. The experiments use a known data set for the binary classification problem.
An Orthogonal Evolutionary Algorithm With Learning Automata for Multiobjective Optimization.
Dai, Cai; Wang, Yuping; Ye, Miao; Xue, Xingsi; Liu, Hailin
2016-12-01
Research on multiobjective optimization problems has become one of the hottest topics in intelligent computation. In order to improve the search efficiency of an evolutionary algorithm and maintain the diversity of solutions, in this paper, learning automata (LA) are first used for quantization orthogonal crossover (QOX), and a new fitness function based on decomposition is proposed to achieve these two purposes. Based on these, an orthogonal evolutionary algorithm with LA for complex multiobjective optimization problems with continuous variables is proposed. The experimental results show that in continuous states, the proposed algorithm is able to achieve accurate Pareto-optimal sets and wide Pareto-optimal fronts efficiently. Moreover, comparison with several existing well-known algorithms (nondominated sorting genetic algorithm II; decomposition-based multiobjective evolutionary algorithm; decomposition-based multiobjective evolutionary algorithm with an ensemble of neighborhood sizes; multiobjective optimization by LA; and multiobjective immune algorithm with nondominated neighbor-based selection) on 15 multiobjective benchmark problems shows that the proposed algorithm is able to find more accurate and evenly distributed Pareto-optimal fronts than the compared ones.
NASA Astrophysics Data System (ADS)
Feng, Zhipeng; Chu, Fulei; Zuo, Ming J.
2011-03-01
The energy separation algorithm is good at tracking instantaneous changes in the frequency and amplitude of modulated signals, but it is subject to the constraints of mono-component and narrow-band signals. In most cases, time-varying modulated vibration signals of machinery consist of multiple components and have instantaneous frequency trajectories on the time-frequency plane so complicated that they overlap in the frequency domain. For such signals, conventional filters fail to obtain mono-components of narrow band, and their rectangular decomposition of the time-frequency plane may split instantaneous frequency trajectories, thus resulting in information loss. Given the advantage of the generalized demodulation method in decomposing multi-component signals into mono-components, an iterative generalized demodulation method is used as a preprocessing tool to separate signals into mono-components, so as to satisfy the requirements of the energy separation algorithm. By this improvement, the energy separation algorithm can be generalized to a broad range of signals, as long as the instantaneous frequency trajectories of the signal components do not intersect on the time-frequency plane. Due to the good adaptability of the energy separation algorithm to instantaneous changes in signals and the mono-component decomposition nature of generalized demodulation, the derived time-frequency energy distribution has fine resolution and is free from cross-term interferences. The good performance of the proposed time-frequency analysis is illustrated by analyses of a simulated signal and the on-site recorded nonstationary vibration signal of a hydroturbine rotor during a shut-down transient process, showing that it has potential to analyze time-varying modulated signals of multiple components.
NASA Technical Reports Server (NTRS)
McDowell, Mark
2004-01-01
An integrated algorithm for decomposing overlapping particle images (multi-particle objects) along with determining each object's constituent particle centroid(s) has been developed using image analysis techniques. The centroid finding algorithm uses a modified eight-direction search method for finding the perimeter of any enclosed object. The centroid is calculated using the intensity-weighted center of mass of the object. The overlap decomposition algorithm further analyzes the object data and breaks it down into its constituent particle centroid(s). This is accomplished with an artificial neural network, feature-based technique and provides an efficient way of decomposing overlapping particles. Combining the centroid finding and overlap decomposition routines into a single algorithm allows us to accurately predict the error associated with finding the centroid(s) of particles in our experiments. This algorithm has been tested using real, simulated, and synthetic data and the results are presented and discussed.
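A minimal sketch of the centroid step only: the intensity-weighted center of mass of a segmented object. The eight-direction perimeter search and the neural-network overlap decomposition are not reproduced here, and the toy image is synthetic.

```python
# Intensity-weighted center of mass of one segmented object.
import numpy as np

def intensity_centroid(image, mask):
    """image: 2D array of intensities; mask: boolean pixels of the object."""
    ys, xs = np.nonzero(mask)
    w = image[ys, xs].astype(float)
    return (ys @ w) / w.sum(), (xs @ w) / w.sum()

img = np.zeros((32, 32))
img[10:14, 20:25] = 1.0                    # one synthetic particle
print(intensity_centroid(img, img > 0))    # approx (11.5, 22.0)
```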
Fast polar decomposition of an arbitrary matrix
NASA Technical Reports Server (NTRS)
Higham, Nicholas J.; Schreiber, Robert S.
1988-01-01
The polar decomposition of an m x n matrix A of full rank, where m is greater than or equal to n, can be computed using a quadratically convergent algorithm. The algorithm is based on a Newton iteration involving a matrix inverse. With the use of a preliminary complete orthogonal decomposition the algorithm can be extended to arbitrary A. How to use the algorithm to compute the positive semi-definite square root of a Hermitian positive semi-definite matrix is described. A hybrid algorithm which adaptively switches from the matrix inversion based iteration to a matrix multiplication based iteration due to Kovarik, and to Bjorck and Bowie is formulated. The decision when to switch is made using a condition estimator. This matrix multiplication rich algorithm is shown to be more efficient on machines for which matrix multiplication can be executed 1.5 times faster than matrix inversion.
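A bare-bones version of the Newton iteration named above, for a square nonsingular real matrix; the paper's acceleration scaling, condition estimation, hybrid switching, and the rectangular/rank-deficient extensions are omitted.

```python
# Newton iteration X <- (X + X^{-T})/2 converging to the polar factor.
import numpy as np

def polar_newton(A, tol=1e-12, max_iter=100):
    """Returns (U, H) with A = U H, U orthogonal and H symmetric
    positive semi-definite, for square nonsingular real A."""
    X = A.copy()
    for _ in range(max_iter):
        X_new = 0.5 * (X + np.linalg.inv(X).T)     # Newton step
        if np.linalg.norm(X_new - X) <= tol * np.linalg.norm(X):
            X = X_new
            break
        X = X_new
    H = X.T @ A
    return X, 0.5 * (H + H.T)                      # symmetrize against roundoff

A = np.random.default_rng(6).normal(size=(5, 5))
U, H = polar_newton(A)
print(np.linalg.norm(A - U @ H))                   # ~1e-15 if well conditioned
```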
Spectral Diffusion: An Algorithm for Robust Material Decomposition of Spectral CT Data
Clark, Darin P.; Badea, Cristian T.
2014-01-01
Clinical successes with dual energy CT, aggressive development of energy discriminating x-ray detectors, and novel, target-specific, nanoparticle contrast agents promise to establish spectral CT as a powerful functional imaging modality. Common to all of these applications is the need for a material decomposition algorithm which is robust in the presence of noise. Here, we develop such an algorithm which uses spectrally joint, piece-wise constant kernel regression and the split Bregman method to iteratively solve for a material decomposition which is gradient sparse, quantitatively accurate, and minimally biased. We call this algorithm spectral diffusion because it integrates structural information from multiple spectral channels and their corresponding material decompositions within the framework of diffusion-like denoising algorithms (e.g. anisotropic diffusion, total variation, bilateral filtration). Using a 3D, digital bar phantom and a material sensitivity matrix calibrated for use with a polychromatic x-ray source, we quantify the limits of detectability (CNR = 5) afforded by spectral diffusion in the triple-energy material decomposition of iodine (3.1 mg/mL), gold (0.9 mg/mL), and gadolinium (2.9 mg/mL) concentrations. We then apply spectral diffusion to the in vivo separation of these three materials in the mouse kidneys, liver, and spleen. PMID:25296173
Extending substructure based iterative solvers to multiple load and repeated analyses
NASA Technical Reports Server (NTRS)
Farhat, Charbel
1993-01-01
Direct solvers currently dominate commercial finite element structural software, but do not scale well in the fine granularity regime targeted by emerging parallel processors. Substructure based iterative solvers--often called also domain decomposition algorithms--lend themselves better to parallel processing, but must overcome several obstacles before earning their place in general purpose structural analysis programs. One such obstacle is the solution of systems with many or repeated right hand sides. Such systems arise, for example, in multiple load static analyses and in implicit linear dynamics computations. Direct solvers are well-suited for these problems because after the system matrix has been factored, the multiple or repeated solutions can be obtained through relatively inexpensive forward and backward substitutions. On the other hand, iterative solvers in general are ill-suited for these problems because they often must restart from scratch for every different right hand side. In this paper, we present a methodology for extending the range of applications of domain decomposition methods to problems with multiple or repeated right hand sides. Basically, we formulate the overall problem as a series of minimization problems over K-orthogonal and supplementary subspaces, and tailor the preconditioned conjugate gradient algorithm to solve them efficiently. The resulting solution method is scalable, whereas direct factorization schemes and forward and backward substitution algorithms are not. We illustrate the proposed methodology with the solution of static and dynamic structural problems, and highlight its potential to outperform forward and backward substitutions on parallel computers. As an example, we show that for a linear structural dynamics problem with 11640 degrees of freedom, every time-step beyond time-step 15 is solved in a single iteration and consumes 1.0 second on a 32 processor iPSC-860 system; for the same problem and the same parallel processor, a pair of forward/backward substitutions at each step consumes 15.0 seconds.
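In its simplest form, the idea reads as follows: search directions accumulated while solving earlier right-hand sides are K-orthogonal, so a new right-hand side can be seeded with its Galerkin projection onto their span before the conjugate gradient iteration resumes. The function below sketches only that seeding step; the names and shapes are hypothetical, and the surrounding CG loop is omitted.

```python
# Seed an initial guess for a new right-hand side from stored directions.
import numpy as np

def seeded_guess(P, KP, b):
    """P: previously generated search directions (n x m), pairwise
    K-orthogonal; KP = K @ P precomputed. Returns the K-orthogonal
    (Galerkin) projection of the new solution onto span(P), used as
    the starting vector so CG only resolves the remainder."""
    alphas = (P.T @ b) / np.einsum("ij,ij->j", P, KP)  # p_i.b / p_i.K.p_i
    return P @ alphas
```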
Multigrid treatment of implicit continuum diffusion
NASA Astrophysics Data System (ADS)
Francisquez, Manaure; Zhu, Ben; Rogers, Barrett
2017-10-01
Implicit treatment of diffusive terms of various differential orders common in continuum mechanics modeling, such as computational fluid dynamics, is investigated with spectral and multigrid algorithms in non-periodic 2D domains. In doubly periodic time dependent problems these terms can be efficiently and implicitly handled by spectral methods, but in non-periodic systems solved with distributed memory parallel computing and 2D domain decomposition, this efficiency is lost for large numbers of processors. We built and present here a multigrid algorithm for these types of problems which outperforms a spectral solution that employs the highly optimized FFTW library. This multigrid algorithm is not only suitable for high performance computing but may also be able to efficiently treat implicit diffusion of arbitrary order by introducing auxiliary equations of lower order. We test these solvers for fourth and sixth order diffusion with idealized harmonic test functions as well as a turbulent 2D magnetohydrodynamic simulation. It is also shown that an anisotropic operator without cross-terms can improve model accuracy and speed, and we examine the impact that the various diffusion operators have on the energy, the enstrophy, and the qualitative aspect of a simulation. This work was supported by DOE-SC-0010508. This research used resources of the National Energy Research Scientific Computing Center (NERSC).
NASA Astrophysics Data System (ADS)
Iwasawa, Masaki; Tanikawa, Ataru; Hosono, Natsuki; Nitadori, Keigo; Muranushi, Takayuki; Makino, Junichiro
2016-08-01
We present the basic idea, implementation, measured performance, and performance model of FDPS (Framework for Developing Particle Simulators). FDPS is an application-development framework which helps researchers to develop simulation programs using particle methods for large-scale distributed-memory parallel supercomputers. A particle-based simulation program for distributed-memory parallel computers needs to perform domain decomposition, exchange of particles which are not in the domain of each computing node, and gathering of the particle information in other nodes which are necessary for interaction calculation. Also, even if distributed-memory parallel computers are not used, in order to reduce the amount of computation, algorithms such as the Barnes-Hut tree algorithm or the Fast Multipole Method should be used in the case of long-range interactions. For short-range interactions, some methods to limit the calculation to neighbor particles are required. FDPS provides all of these functions which are necessary for efficient parallel execution of particle-based simulations as "templates," which are independent of the actual data structure of particles and the functional form of the particle-particle interaction. By using FDPS, researchers can write their programs with the amount of work necessary to write a simple, sequential and unoptimized program of O(N2) calculation cost, and yet the program, once compiled with FDPS, will run efficiently on large-scale parallel supercomputers. A simple gravitational N-body program can be written in around 120 lines. We report the actual performance of these programs and the performance model. The weak scaling performance is very good, and almost linear speed-up was obtained for up to the full system of the K computer. The minimum calculation time per timestep is in the range of 30 ms (N = 107) to 300 ms (N = 109). These are currently limited by the time for the calculation of the domain decomposition and communication necessary for the interaction calculation. We discuss how we can overcome these bottlenecks.
NASA Astrophysics Data System (ADS)
Furuichi, M.; Nishiura, D.
2015-12-01
Fully Lagrangian methods such as Smoothed Particle Hydrodynamics (SPH) and the Discrete Element Method (DEM) have been widely used to solve continuum and particle motions in the computational geodynamics field. These mesh-free methods are suitable for problems with complex geometry and boundaries. In addition, their Lagrangian nature allows non-diffusive advection, useful for tracking history-dependent properties (e.g. rheology) of the material. These potential advantages over mesh-based methods offer effective numerical applications to geophysical flow and tectonic processes, for example, tsunamis with free surfaces and floating bodies, magma intrusion with fracture of rock, and shear zone pattern generation of granular deformation. In order to investigate such geodynamical problems with particle-based methods, millions to billions of particles are required for a realistic simulation. Parallel computing is therefore important for handling such a huge computational cost. An efficient parallel implementation of SPH and DEM methods is, however, known to be difficult, especially for distributed-memory architectures. Lagrangian methods inherently show a workload imbalance problem when parallelized with domains fixed in space, because particles move around and workloads change during the simulation. Therefore, dynamic load balancing is a key technique for performing large scale SPH and DEM simulations. In this work, we present a parallel implementation technique for SPH and DEM methods utilizing dynamic load balancing algorithms, aimed at high resolution simulations over large domains on massively parallel supercomputer systems. Our method utilizes the imbalances in the execution time of each MPI process as the nonlinear term of the parallel domain decomposition and minimizes them with a Newton-like iteration method. In order to perform flexible domain decomposition in space, the slice-grid algorithm is used. Numerical tests show that our approach is suitable for handling particles with different calculation costs (e.g. boundary particles) as well as heterogeneous computer architectures. We analyze the parallel efficiency and scalability on supercomputer systems (K computer, Earth Simulator 3, etc.).
Incremental k-core decomposition: Algorithms and evaluation
Sariyuce, Ahmet Erdem; Gedik, Bugra; Jacques-SIlva, Gabriela; ...
2016-02-01
A k-core of a graph is a maximal connected subgraph in which every vertex is connected to at least k vertices in the subgraph. k-core decomposition is often used in large-scale network analysis, such as community detection, protein function prediction, visualization, and solving NP-hard problems on real networks efficiently, like maximal clique finding. In many real-world applications, networks change over time. As a result, it is essential to develop efficient incremental algorithms for dynamic graph data. In this paper, we propose a suite of incremental k-core decomposition algorithms for dynamic graph data. These algorithms locate a small subgraph that is guaranteed to contain the list of vertices whose maximum k-core values have changed and efficiently process this subgraph to update the k-core decomposition. We present incremental algorithms for both insertion and deletion operations, and propose auxiliary vertex state maintenance techniques that can further accelerate these operations. Our results show a significant reduction in runtime compared to non-incremental alternatives. We illustrate the efficiency of our algorithms on different types of real and synthetic graphs, at varying scales. Furthermore, for a graph of 16 million vertices, we observe relative throughput gains reaching a factor of a million compared to the non-incremental algorithms.
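For context, the classical non-incremental peeling algorithm that such incremental schemes improve upon can be written in a few lines: repeatedly remove a minimum-degree vertex, recording the running maximum degree at removal as the core number. The adjacency-set graph representation is an assumption.

```python
# Classical k-core peeling with a lazy min-heap of vertex degrees.
import heapq

def core_numbers(adj):
    deg = {v: len(ns) for v, ns in adj.items()}
    heap = [(d, v) for v, d in deg.items()]
    heapq.heapify(heap)
    core, removed, k = {}, set(), 0
    while heap:
        d, v = heapq.heappop(heap)
        if v in removed or d != deg[v]:
            continue                       # stale heap entry, skip it
        k = max(k, d)                      # running max degree at removal
        core[v] = k
        removed.add(v)
        for u in adj[v]:
            if u not in removed:
                deg[u] -= 1
                heapq.heappush(heap, (deg[u], u))
    return core

g = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(core_numbers(g))                     # triangle gets core 2, pendant 1
```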
The Speech multi features fusion perceptual hash algorithm based on tensor decomposition
NASA Astrophysics Data System (ADS)
Huang, Y. B.; Fan, M. H.; Zhang, Q. Y.
2018-03-01
With constant progress in modern speech communication technologies, speech data are prone to be attacked by noise or maliciously tampered with. In order to give a speech perceptual hash algorithm strong robustness and high efficiency, this paper puts forward a speech perceptual hash algorithm based on tensor decomposition and multiple features. The algorithm analyses the perceptual features of speech, obtaining each speech component by wavelet packet decomposition. The LPCC, LSP, and ISP features of each speech component are extracted to constitute the speech feature tensor. Speech authentication is done by generating hash values through feature matrix quantification using the median. Experimental results show that the proposed algorithm is robust to content-preserving operations compared with similar algorithms and is able to resist the attack of common background noise. Also, the algorithm is highly efficient in terms of arithmetic, and is able to meet the real-time requirements of speech communication and complete the speech authentication quickly.
Domain decomposition for a mixed finite element method in three dimensions
Cai, Z.; Parashkevov, R.R.; Russell, T.F.; Wilson, J.D.; Ye, X.
2003-01-01
We consider the solution of the discrete linear system resulting from a mixed finite element discretization applied to a second-order elliptic boundary value problem in three dimensions. Based on a decomposition of the velocity space, these equations can be reduced to a discrete elliptic problem by eliminating the pressure through the use of substructures of the domain. The practicality of the reduction relies on a local basis, presented here, for the divergence-free subspace of the velocity space. We consider additive and multiplicative domain decomposition methods for solving the reduced elliptic problem, and their uniform convergence is established.
Error reduction in EMG signal decomposition
Kline, Joshua C.
2014-01-01
Decomposition of the electromyographic (EMG) signal into constituent action potentials and the identification of individual firing instances of each motor unit in the presence of ambient noise are inherently probabilistic processes, whether performed manually or with automated algorithms. Consequently, they are subject to errors. We set out to classify and reduce these errors by analyzing 1,061 motor-unit action-potential trains (MUAPTs), obtained by decomposing surface EMG (sEMG) signals recorded during human voluntary contractions. Decomposition errors were classified into two general categories: location errors representing variability in the temporal localization of each motor-unit firing instance and identification errors consisting of falsely detected or missed firing instances. To mitigate these errors, we developed an error-reduction algorithm that combines multiple decomposition estimates to determine a more probable estimate of motor-unit firing instances with fewer errors. The performance of the algorithm is governed by a trade-off between the yield of MUAPTs obtained above a given accuracy level and the time required to perform the decomposition. When applied to a set of sEMG signals synthesized from real MUAPTs, the identification error was reduced by an average of 1.78%, improving the accuracy to 97.0%, and the location error was reduced by an average of 1.66 ms. The error-reduction algorithm in this study is not limited to any specific decomposition strategy. Rather, we propose it be used for other decomposition methods, especially when analyzing precise motor-unit firing instances, as occurs when measuring synchronization. PMID:25210159
Enhancement of lung sounds based on empirical mode decomposition and Fourier transform algorithm.
Mondal, Ashok; Banerjee, Poulami; Somkuwar, Ajay
2017-02-01
Heart sound (HS) signals always interfere with the recording of lung sound (LS) signals. This obscures the features of LS signals and creates confusion about pathological states, if any, of the lungs. In this work, a new method is proposed for the reduction of heart sound interference based on the empirical mode decomposition (EMD) technique and a prediction algorithm. In this approach, the mixed signal is first split into several components in terms of intrinsic mode functions (IMFs). Thereafter, HS-included segments are localized and removed from them. The missing values of the gap thus produced are predicted by a new Fast Fourier Transform (FFT) based prediction algorithm, and the time-domain LS signal is reconstructed by taking an inverse FFT of the estimated missing values. The experiments have been conducted on simulated and recorded HS-corrupted LS signals at three different flow rates and various SNR levels. The performance of the proposed method is evaluated by qualitative and quantitative analysis of the results. The proposed method is found to be superior to the baseline method in both quantitative and qualitative terms across the different SNR levels. Our method gives a cross correlation index (CCI) of 0.9488, a signal to deviation ratio (SDR) of 9.8262, and a normalized maximum amplitude error (NMAE) of 26.94 for a 0 dB SNR value.
Parallel discontinuous Galerkin FEM for computing hyperbolic conservation law on unstructured grids
NASA Astrophysics Data System (ADS)
Ma, Xinrong; Duan, Zhijian
2018-04-01
High-order resolution discontinuous Galerkin finite element methods (DGFEM) have been known as good methods for solving the Euler equations and Navier-Stokes equations on unstructured grids, but they cost too much in computational resources. An efficient parallel algorithm is presented for solving the compressible Euler equations. Moreover, a multigrid strategy based on a three-stage, third-order TVD Runge-Kutta scheme is used in order to improve the computational efficiency of DGFEM and accelerate the convergence of the solution of the unsteady compressible Euler equations. In order to keep each processor load-balanced, the domain decomposition method is employed. Numerical experiments were performed for inviscid transonic flow problems around the NACA0012 airfoil and the M6 wing. The results indicate that our parallel algorithm can improve acceleration and efficiency significantly, and is suitable for calculating complex flows.
Fast algorithm of adaptive Fourier series
NASA Astrophysics Data System (ADS)
Gao, You; Ku, Min; Qian, Tao
2018-05-01
Adaptive Fourier decomposition (AFD, precisely 1-D AFD or Core-AFD) was originated for the goal of positive frequency representations of signals. It achieved that goal and at the same time offered fast decompositions of signals. Several types of AFDs then arose. AFD merged with the greedy algorithm idea and, in particular, motivated the so-called pre-orthogonal greedy algorithm (Pre-OGA), which was proven to be the most efficient greedy algorithm. The cost of the advantages of the AFD-type decompositions is, however, high computational complexity, due to the maximal selections of the dictionary parameters involved. The present paper offers a formulation of the 1-D AFD algorithm that builds the FFT algorithm into it. Accordingly, the algorithm complexity is reduced from the original $\mathcal{O}(M N^2)$ to $\mathcal{O}(M N\log_2 N)$, where $N$ denotes the number of discretization points on the unit circle and $M$ denotes the number of points in $[0,1)$. This greatly enhances the applicability of AFD. Experiments are carried out to show the high efficiency of the proposed algorithm.
RS-Forest: A Rapid Density Estimator for Streaming Anomaly Detection
Wu, Ke; Zhang, Kun; Fan, Wei; Edwards, Andrea; Yu, Philip S.
2015-01-01
Anomaly detection in streaming data is of high interest in numerous application domains. In this paper, we propose a novel one-class semi-supervised algorithm to detect anomalies in streaming data. Underlying the algorithm is a fast and accurate density estimator implemented by multiple fully randomized space trees (RS-Trees), named RS-Forest. The piecewise constant density estimate of each RS-tree is defined on the tree node into which an instance falls. Each incoming instance in a data stream is scored by the density estimates averaged over all trees in the forest. Two strategies, statistical attribute range estimation of high probability guarantee and dual node profiles for rapid model update, are seamlessly integrated into RS-Forest to systematically address the ever-evolving nature of data streams. We derive the theoretical upper bound for the proposed algorithm and analyze its asymptotic properties via bias-variance decomposition. Empirical comparisons to the state-of-the-art methods on multiple benchmark datasets demonstrate that the proposed method features high detection rate, fast response, and insensitivity to most of the parameter settings. Algorithm implementations and datasets are available upon request. PMID:25685112
Implementing Linear Algebra Related Algorithms on the TI-92+ Calculator.
ERIC Educational Resources Information Center
Alexopoulos, John; Abraham, Paul
2001-01-01
Demonstrates a less utilized feature of the TI-92+: its natural and powerful programming language. Shows how to implement several linear algebra related algorithms including the Gram-Schmidt process, Least Squares Approximations, Wronskians, Cholesky Decompositions, and Generalized Linear Least Square Approximations with QR Decompositions.…
NASA Astrophysics Data System (ADS)
Loring, B.; Karimabadi, H.; Rortershteyn, V.
2015-10-01
The surface line integral convolution(LIC) visualization technique produces dense visualization of vector fields on arbitrary surfaces. We present a screen space surface LIC algorithm for use in distributed memory data parallel sort last rendering infrastructures. The motivations for our work are to support analysis of datasets that are too large to fit in the main memory of a single computer and compatibility with prevalent parallel scientific visualization tools such as ParaView and VisIt. By working in screen space using OpenGL we can leverage the computational power of GPUs when they are available and run without them when they are not. We address efficiency and performance issues that arise from the transformation of data from physical to screen space by selecting an alternate screen space domain decomposition. We analyze the algorithm's scaling behavior with and without GPUs on two high performance computing systems using data from turbulent plasma simulations.
Efficient parallel resolution of the simplified transport equations in mixed-dual formulation
NASA Astrophysics Data System (ADS)
Barrault, M.; Lathuilière, B.; Ramet, P.; Roman, J.
2011-03-01
A reactivity computation consists of computing the highest eigenvalue of a generalized eigenvalue problem, for which an inverse power algorithm is commonly used. Very fine models are difficult for our sequential solver, based on the simplified transport equations, to treat in terms of memory consumption and computational time. A first implementation of a Lagrangian-based domain decomposition method leads to poor parallel efficiency because of an increase in the number of power iterations [1]. In order to obtain high parallel efficiency, we improve the parallelization scheme by changing the location of the loop over the subdomains in the overall algorithm and by benefiting from the characteristics of the Raviart-Thomas finite element. The new parallel algorithm still allows us to locally adapt the numerical scheme (mesh, finite element order). However, it can be significantly optimized for the matching-grid case. The good behavior of the new parallelization scheme is demonstrated for the matching-grid case on several hundreds of nodes for computations based on a pin-by-pin discretization.
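A compact sketch of the outer loop discussed here: power iteration for the dominant eigenvalue of a generalized problem A x = k B x, where each step amounts to one fixed-source solve. The dense random matrices are stand-ins for the actual transport operators, which would be solved by the domain-decomposed scheme.

```python
# Power iteration on A^{-1} B for the dominant generalized eigenvalue.
import numpy as np

rng = np.random.default_rng(7)
n = 50
A = np.diag(rng.uniform(2.0, 3.0, n)) + 0.01 * rng.uniform(size=(n, n))
B = np.diag(rng.uniform(1.0, 2.0, n))

x = np.ones(n)
x /= np.linalg.norm(x)
for _ in range(500):
    y = np.linalg.solve(A, B @ x)          # one fixed-source solve per step
    k = x @ y                              # Rayleigh-quotient estimate of k
    if np.linalg.norm(y - k * x) < 1e-10 * np.linalg.norm(y):
        break                              # converged eigenpair
    x = y / np.linalg.norm(y)
print("dominant eigenvalue k:", k)
```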
Airbreathing Propulsion System Analysis Using Multithreaded Parallel Processing
NASA Technical Reports Server (NTRS)
Schunk, Richard Gregory; Chung, T. J.; Rodriguez, Pete (Technical Monitor)
2000-01-01
In this paper, parallel processing is used to analyze the mixing and combustion behavior of hypersonic flow. Preliminary work for a sonic transverse hydrogen jet injected from a slot into a Mach 4 airstream in a two-dimensional duct combustor has been completed [Moon and Chung, 1996]. Our aim is to extend this work to a three-dimensional domain using multithreaded domain decomposition parallel processing based on the flowfield-dependent variation theory. Numerical simulations of chemically reacting flows are difficult because of the strong interactions between the turbulent hydrodynamic and chemical processes. The algorithm must provide an accurate representation of the flowfield, since unphysical flowfield calculations will lead to the faulty loss or creation of species mass fraction, or even premature ignition, which in turn alters the flowfield information. Another difficulty arises from the disparity in time scales between the flowfield and chemical reactions, which may require the use of finite rate chemistry. The situation is more complex when there is a disparity in the length scales involved in turbulence. In order to cope with these complicated physical phenomena, it is our plan to utilize the flowfield-dependent variation theory mentioned above, facilitated by large eddy simulation. Undoubtedly, the proposed computation requires the most sophisticated computational strategies. Multithreaded domain decomposition parallel processing will be necessary in order to reduce both computational time and storage. Without the special treatments involved in computer engineering, our attempt to analyze airbreathing combustion appears to be difficult, if not impossible.
NASA Technical Reports Server (NTRS)
Barth, Timothy J.; Kutler, Paul (Technical Monitor)
1998-01-01
Several stabilized discretization procedures for conservation law equations on triangulated domains will be considered. Specifically, numerical schemes based on upwind finite volume, fluctuation splitting, Galerkin least-squares, and space discontinuous Galerkin discretization will be considered in detail. A standard energy analysis for several of these methods will be given via entropy symmetrization. Next, we will present some relatively new theoretical results concerning congruence relationships for left or right symmetrized equations. These results suggest new variants of existing FV, DG, GLS, and FS methods which are computationally more efficient while retaining the pleasant theoretical properties achieved by entropy symmetrization. In addition, the task of Jacobian linearization of these schemes for use in Newton's method is greatly simplified owing to exploitation of exact symmetries which exist in the system. The FV, FS and DG schemes also permit discrete maximum principle analysis and enforcement which greatly adds to the robustness of the methods. Discrete maximum principle theory will be presented for general finite volume approximations on unstructured meshes. Next, we consider embedding these nonlinear space discretizations into exact and inexact Newton solvers which are preconditioned using a nonoverlapping (Schur complement) domain decomposition technique. Elements of nonoverlapping domain decomposition for elliptic problems will be reviewed, followed by the present extension to hyperbolic and elliptic-hyperbolic problems. Other issues of practical relevance such as the meshing of geometries, code implementation, turbulence modeling, global convergence, etc., will be addressed as needed.
NASA Technical Reports Server (NTRS)
Barth, Timothy; Chancellor, Marisa K. (Technical Monitor)
1997-01-01
Several stabilized discretization procedures for conservation law equations on triangulated domains will be considered. Specifically, numerical schemes based on upwind finite volume, fluctuation splitting, Galerkin least-squares, and space discontinuous Galerkin discretization will be considered in detail. A standard energy analysis for several of these methods will be given via entropy symmetrization. Next, we will present some relatively new theoretical results concerning congruence relationships for left or right symmetrized equations. These results suggest new variants of existing FV, DG, GLS and FS methods which are computationally more efficient while retaining the pleasant theoretical properties achieved by entropy symmetrization. In addition, the task of Jacobian linearization of these schemes for use in Newton's method is greatly simplified owing to exploitation of exact symmetries which exist in the system. These variants have been implemented in the "ELF" library for which example calculations will be shown. The FV, FS and DG schemes also permit discrete maximum principle analysis and enforcement which greatly adds to the robustness of the methods. Some prevalent limiting strategies will be reviewed. Next, we consider embedding these nonlinear space discretizations into exact and inexact Newton solvers which are preconditioned using a nonoverlapping (Schur complement) domain decomposition technique. Elements of nonoverlapping domain decomposition for elliptic problems will be reviewed, followed by the present extension to hyperbolic and elliptic-hyperbolic problems. Other issues of practical relevance such as the meshing of geometries, code implementation, turbulence modeling, global convergence, etc. will be addressed as needed.
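Both abstracts rest on the nonoverlapping (Schur complement) reduction; a minimal Python illustration on a 1-D Poisson matrix split into two subdomains sharing one interface node (a toy example, not the ELF implementation):

```python
import numpy as np

n = 9                       # interior unknowns; node 4 is the interface
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)

I1, I2, G = list(range(0, 4)), list(range(5, 9)), [4]
AII = A[np.ix_(I1 + I2, I1 + I2)]      # block-diagonal: subdomains decouple
AIG = A[np.ix_(I1 + I2, G)]
AGI = A[np.ix_(G, I1 + I2)]
AGG = A[np.ix_(G, G)]

# Schur complement on the interface: S = A_GG - A_GI A_II^{-1} A_IG
S = AGG - AGI @ np.linalg.solve(AII, AIG)
rhs_G = b[G] - AGI @ np.linalg.solve(AII, b[I1 + I2])
xG = np.linalg.solve(S, rhs_G)                     # interface solve
xI = np.linalg.solve(AII, b[I1 + I2] - AIG @ xG)   # back-substitute interiors

x = np.empty(n); x[G] = xG; x[I1 + I2] = xI
print(np.allclose(A @ x, b))                       # True
```

In practice S is never formed explicitly; it is applied within a preconditioned Krylov iteration, with the interior solves done independently per subdomain.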
Harmonic analysis of traction power supply system based on wavelet decomposition
NASA Astrophysics Data System (ADS)
Dun, Xiaohong
2018-05-01
With the rapid development of high-speed railways and heavy-haul transport, AC-drive electric locomotives and EMUs operate on a large scale across the country, and the electrified railway has become the main harmonic source in China's power grid. In response to this situation, the power quality problems of the electrified railway need timely monitoring, assessment, and mitigation. The wavelet transform was developed on the basis of Fourier analysis; its basic idea comes from harmonic analysis and it rests on a rigorous theoretical model. It inherits and develops the localization idea of the Gabor transform while overcoming disadvantages such as the fixed window and the lack of discrete orthogonality, and it has therefore become a widely studied spectral analysis tool. Wavelet analysis takes progressively finer time-domain steps in the high-frequency part so as to focus on any detail of the signal being analyzed, thereby comprehensively analyzing the harmonics of the traction power supply system; the pyramid (Mallat) algorithm is used to increase the speed of the wavelet decomposition. A MATLAB simulation shows that using wavelet decomposition for harmonic spectrum analysis of the traction power supply system is effective.
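A minimal sketch of the kind of multilevel (pyramid-algorithm) decomposition described, using the PyWavelets package on a synthetic traction-network current (the sampling rate, wavelet choice, and harmonic amplitudes are assumptions):

```python
import numpy as np
import pywt  # PyWavelets

fs = 3200.0                                  # sampling rate, Hz (assumed)
t = np.arange(0, 0.2, 1 / fs)
# 50 Hz fundamental plus 3rd and 5th harmonics
i_t = (np.sin(2 * np.pi * 50 * t)
       + 0.30 * np.sin(2 * np.pi * 150 * t)
       + 0.15 * np.sin(2 * np.pi * 250 * t))

# Mallat pyramid: successive low/high-pass splits of the frequency band
coeffs = pywt.wavedec(i_t, "db4", level=4)
for k, c in enumerate(coeffs):
    label = "A4" if k == 0 else f"D{5 - k}"   # approximation, then details
    print(label, "band energy:", round(float(np.sum(c ** 2)), 3))
```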
Optimization by nonhierarchical asynchronous decomposition
NASA Technical Reports Server (NTRS)
Shankar, Jayashree; Ribbens, Calvin J.; Haftka, Raphael T.; Watson, Layne T.
1992-01-01
Large scale optimization problems are tractable only if they are somehow decomposed. Hierarchical decompositions are inappropriate for some types of problems and do not parallelize well. Sobieszczanski-Sobieski has proposed a nonhierarchical decomposition strategy for nonlinear constrained optimization that is naturally parallel. Despite some successes on engineering problems, the algorithm as originally proposed fails on simple two-dimensional quadratic programs. The algorithm is carefully analyzed for quadratic programs, and a number of modifications are suggested to improve its robustness.
Li, Yuxing; Li, Yaan; Chen, Xiao; Yu, Jing
2017-12-26
The sound signal of a ship obtained by sensors, called ship-radiated noise (SN), contains many significant characteristics of the ship, so research into denoising algorithms and their applications is of great significance. Exploiting the advantages of variational mode decomposition (VMD) combined with the correlation coefficient (CC) for denoising, a hybrid secondary denoising algorithm is proposed that applies VMD twice in combination with the CC. First, different kinds of simulation signals are decomposed into several bandwidth-limited intrinsic mode functions (IMFs) using VMD, where the number of modes extracted by VMD is set equal to the number obtained by empirical mode decomposition (EMD); then, the CCs between the IMFs and the simulation signal are calculated. The noise IMFs are identified by a CC threshold and the remaining IMFs are reconstructed to realize the first denoising step. Finally, secondary denoising of the simulation signal is accomplished by repeating the above steps of decomposition, screening, and reconstruction. The final denoising result is determined according to the CC threshold. The denoising effect is compared across different signal-to-noise ratios and numbers of VMD decompositions. Experimental results show the validity of the proposed denoising algorithm using secondary VMD (2VMD) combined with CC compared to EMD denoising, ensemble EMD (EEMD) denoising, VMD denoising and cubic VMD (3VMD) denoising, as well as two recently presented denoising algorithms. The proposed denoising algorithm is applied to feature extraction and classification of SN signals, and can effectively improve the recognition rate for different kinds of ships.
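A minimal sketch of the correlation-coefficient screening step, assuming the IMFs have already been produced by a VMD routine (the threshold value is an assumption; the VMD call itself is omitted):

```python
import numpy as np

def cc_screen(signal, imfs, cc_threshold=0.2):
    """Keep IMFs whose correlation with the raw signal exceeds the
    threshold, then reconstruct; the rest are treated as noise."""
    kept = [imf for imf in imfs
            if abs(np.corrcoef(signal, imf)[0, 1]) >= cc_threshold]
    return np.sum(kept, axis=0) if kept else np.zeros_like(signal)
```

The "secondary" step of the paper amounts to decomposing the reconstruction a second time and repeating this screening.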
NASA Astrophysics Data System (ADS)
Hasan, Mohammed A.
1997-11-01
In this dissertation, we present several novel approaches for detection and identification of targets of arbitrary shapes from acoustic backscattered data using the incident waveform. This problem is formulated as time-delay estimation and sinusoidal frequency estimation problems, both of which have applications in many other important areas of signal processing. Solving the time-delay estimation problem allows the identification of the specular components in the backscattered signal from elastic and non-elastic targets. Thus, accurate estimation of these time delays helps in determining the existence of certain clues for detecting targets. Several new methods for solving these two problems in the time, frequency and wavelet domains are developed. In the time domain, a new block fast transversal filter (BFTF) is proposed for a fast implementation of the least squares (LS) method. This BFTF algorithm is derived by using a data-related constrained block-LS cost function to guarantee global optimality. The new soft-constrained algorithm provides an efficient way of transferring weight information between blocks of data and thus is computationally very efficient compared with other LS-based schemes. Additionally, the tracking ability of the algorithm can be controlled by varying the block length and/or a soft-constraint parameter. The effectiveness of this algorithm is tested on several underwater acoustic backscattered data sets for elastic targets and non-elastic (cement chunk) objects. In the frequency domain, the time-delay estimation problem is converted to a sinusoidal frequency estimation problem by using the discrete Fourier transform. Then, the lagged sample covariance matrices of the resulting signal are computed and studied in terms of their eigenstructure. These matrices are shown to be robust and effective in extracting bases for the signal and noise subspaces. New MUSIC and matrix pencil-based methods are derived from these subspaces. The effectiveness of the method is demonstrated on the problem of detection of multiple specular components in acoustic backscattered data. Finally, a method for the estimation of time delays using wavelet decomposition is derived. The sub-band adaptive filtering uses the discrete wavelet transform for multi-resolution or sub-band decomposition. Joint time-delay estimation for identifying multi-specular components and subsequent adaptive filtering are performed on the signal in each sub-band. This provides multiple 'looks' at the signal at different resolution scales, which results in more accurate estimates of the delays associated with the specular components. Simulation results on simulated and real shallow-water data are provided which show the promise of this new scheme for target detection in a heavily cluttered environment.
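Of the frequency-domain tools mentioned, MUSIC is the most self-contained to sketch; a minimal subspace implementation for complex sinusoids (a generic textbook version, not the dissertation's lagged-covariance variant):

```python
import numpy as np

def music_spectrum(x, p, m=32, grid=512):
    """MUSIC pseudospectrum for p complex sinusoids in x, using an
    m x m sample covariance built from sliding snapshots."""
    n = len(x)
    snaps = np.array([x[i:i + m] for i in range(n - m + 1)])
    R = snaps.conj().T @ snaps / snaps.shape[0]
    w, V = np.linalg.eigh(R)                 # eigenvalues ascending
    En = V[:, : m - p]                       # noise subspace
    freqs = np.linspace(0.0, 0.5, grid)
    a = np.exp(2j * np.pi * np.outer(np.arange(m), freqs))  # steering vectors
    denom = np.sum(np.abs(En.conj().T @ a) ** 2, axis=0)
    return freqs, 1.0 / denom                # peaks at sinusoid frequencies

rng = np.random.default_rng(0)
t = np.arange(400)
x = (np.exp(2j * np.pi * 0.12 * t) + np.exp(2j * np.pi * 0.31 * t)
     + 0.1 * rng.standard_normal(400))
f, P = music_spectrum(x, p=2)
print(f[np.argmax(P)])   # near one of the true frequencies (0.12 or 0.31)
```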
Video Shot Boundary Detection Using QR-Decomposition and Gaussian Transition Detection
NASA Astrophysics Data System (ADS)
Amiri, Ali; Fathy, Mahmood
2010-12-01
This article explores the problem of video shot boundary detection and examines a novel shot boundary detection algorithm by using QR-decomposition and modeling of gradual transitions by Gaussian functions. Specifically, the authors attend to the challenges of detecting gradual shots and extracting appropriate spatiotemporal features that affect the ability of algorithms to efficiently detect shot boundaries. The algorithm utilizes the properties of QR-decomposition and extracts a block-wise probability function that illustrates the probability of video frames to be in shot transitions. The probability function has abrupt changes in hard cut transitions, and semi-Gaussian behavior in gradual transitions. The algorithm detects these transitions by analyzing the probability function. Finally, we will report the results of the experiments using large-scale test sets provided by the TRECVID 2006, which has assessments for hard cut and gradual shot boundary detection. These results confirm the high performance of the proposed algorithm.
Unified path integral approach to theories of diffusion-influenced reactions
NASA Astrophysics Data System (ADS)
Prüstel, Thorsten; Meier-Schellersheim, Martin
2017-08-01
Building on mathematical similarities between quantum mechanics and theories of diffusion-influenced reactions, we develop a general approach for computational modeling of diffusion-influenced reactions that is capable of capturing not only the classical Smoluchowski picture but also alternative theories, as is here exemplified by a volume reactivity model. In particular, we prove the path decomposition expansion of various Green's functions describing the irreversible and reversible reaction of an isolated pair of molecules. To this end, we exploit a connection between boundary value and interaction potential problems with δ- and δ′-function perturbations. We employ a known path-integral-based summation of a perturbation series to derive a number of exact identities relating propagators and survival probabilities satisfying different boundary conditions in a unified and systematic manner. Furthermore, we show how the path decomposition expansion represents the propagator as a product of three factors in the Laplace domain that correspond to quantities figuring prominently in stochastic spatially resolved simulation algorithms. This analysis will thus be useful for the interpretation of current and the design of future algorithms. Finally, we discuss the relation between the general approach and the theory of Brownian functionals and calculate the mean residence time for the case of irreversible and reversible reactions.
A singular-value method for reconstruction of nonradial and lossy objects.
Jiang, Wei; Astheimer, Jeffrey; Waag, Robert
2012-03-01
Efficient inverse scattering algorithms for nonradial lossy objects are presented using singular-value decomposition to form reduced-rank representations of the scattering operator. These algorithms extend eigenfunction methods that are not applicable to nonradial lossy scattering objects because the scattering operators for these objects do not have orthonormal eigenfunction decompositions. A method of local reconstruction by segregation of scattering contributions from different local regions is also presented. Scattering from each region is isolated by forming a reduced-rank representation of the scattering operator that has domain and range spaces comprised of far-field patterns with retransmitted fields that focus on the local region. Methods for the estimation of the boundary, average sound speed, and average attenuation slope of the scattering object are also given. These methods yielded approximations of scattering objects that were sufficiently accurate to allow residual variations to be reconstructed in a single iteration. Calculated scattering from a lossy elliptical object with a random background, internal features, and white noise is used to evaluate the proposed methods. Local reconstruction yielded images with spatial resolution that is finer than a half wavelength of the center frequency and reproduces sound speed and attenuation slope with relative root-mean-square errors of 1.09% and 11.45%, respectively.
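The core construction, a reduced-rank representation via the SVD, is easy to illustrate on a generic discretized operator (a sketch; K here is random, not a scattering matrix):

```python
import numpy as np

def reduced_rank(K, r):
    """Best rank-r approximation of K via truncated SVD (Eckart-Young)."""
    U, s, Vh = np.linalg.svd(K, full_matrices=False)
    return U[:, :r] @ np.diag(s[:r]) @ Vh[:r, :]

K = np.random.default_rng(0).standard_normal((64, 64))
K8 = reduced_rank(K, 8)
# spectral-norm error equals the 9th singular value
print(np.isclose(np.linalg.norm(K - K8, 2),
                 np.linalg.svd(K, compute_uv=False)[8]))
```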
Application of Time-Frequency Domain Transform to Three-Dimensional Interpolation of Medical Images.
Lv, Shengqing; Chen, Yimin; Li, Zeyu; Lu, Jiahui; Gao, Mingke; Lu, Rongrong
2017-11-01
Medical image three-dimensional (3D) interpolation is an important means of improving image quality in 3D reconstruction. In image processing, the time-frequency domain transform is an efficient tool. In this article, several time-frequency domain transform methods are applied and compared for 3D interpolation, and a Sobel edge detection and 3D matching interpolation method based on the wavelet transform is proposed. We combine the wavelet transform, traditional matching interpolation methods, and Sobel edge detection in our algorithm, exploiting the characteristics of the wavelet transform and the Sobel operator. The sub-images of the wavelet decomposition are treated separately: the Sobel edge detection 3D matching interpolation method is applied to the low-frequency sub-images while ensuring that the high frequencies remain undistorted. Through wavelet reconstruction, the target interpolated image is obtained. In this article, we perform 3D interpolation on real computed tomography (CT) images. Compared with other interpolation methods, our proposed method is verified to be effective and superior.
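A much-simplified sketch of wavelet-domain slice interpolation with PyWavelets: decompose two neighboring slices, combine subbands, and reconstruct the in-between slice. The paper applies Sobel-guided matching in the low-frequency subband; plain averaging is used here only to show the pipeline:

```python
import numpy as np
import pywt

def wavelet_interp_slice(slice_a, slice_b, wavelet="db2"):
    """Estimate the slice midway between two CT slices by averaging
    their wavelet subbands and reconstructing."""
    ca, (ha, va, da) = pywt.dwt2(slice_a, wavelet)
    cb, (hb, vb, db) = pywt.dwt2(slice_b, wavelet)
    mid = ((ca + cb) / 2, ((ha + hb) / 2, (va + vb) / 2, (da + db) / 2))
    return pywt.idwt2(mid, wavelet)
```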
Path planning of decentralized multi-quadrotor based on fuzzy-cell decomposition algorithm
NASA Astrophysics Data System (ADS)
Iswanto, Wahyunggoro, Oyas; Cahyadi, Adha Imam
2017-04-01
The paper aims to present a path-planning algorithm for multiple quadrotors so that they move toward a goal quickly while avoiding obstacles in a cluttered area. There are several problems in path planning, including how to reach the goal position quickly and how to avoid static and dynamic obstacles. To overcome these problems, the paper presents a fuzzy logic algorithm and a fuzzy-cell decomposition algorithm. The fuzzy logic algorithm is an artificial intelligence algorithm which can be applied to robot path planning and is able to detect static and dynamic obstacles. The cell decomposition algorithm is a graph-theoretic algorithm used to build a map for robot paths. By using the two algorithms the robot is able to reach the goal position and avoid obstacles, but it takes considerable time because they are unable to find the shortest path. Therefore, this paper describes a modification of the algorithms by adding a potential field algorithm, used to assign weight values on the map applied to each quadrotor under decentralized control, so that each quadrotor is able to move to the goal position quickly along the shortest path. The simulations conducted show that multiple quadrotors can avoid various obstacles and find the shortest path by using the proposed algorithms.
Primary decomposition of zero-dimensional ideals over finite fields
NASA Astrophysics Data System (ADS)
Gao, Shuhong; Wan, Daqing; Wang, Mingsheng
2009-03-01
A new algorithm is presented for computing primary decomposition of zero-dimensional ideals over finite fields. Like Berlekamp's algorithm for univariate polynomials, the new method is based on the invariant subspace of the Frobenius map acting on the quotient algebra. The dimension of the invariant subspace equals the number of primary components, and a basis of the invariant subspace yields a complete decomposition. Unlike previous approaches for decomposing multivariate polynomial systems, the new method does not need primality testing nor any generic projection; instead, it reduces the general decomposition problem directly to root finding of univariate polynomials over the ground field. Also, it is shown how Groebner basis structure can be used to get partial primary decomposition without any root finding.
System identification of timber masonry walls using shaking table test
NASA Astrophysics Data System (ADS)
Roy, Timir B.; Guerreiro, Luis; Bagchi, Ashutosh
2017-04-01
Dynamic studies are important for the design, repair, and rehabilitation of structures. They have played an important role in characterizing the behavior of structures such as bridges, dams, and high-rise buildings. There has been substantial development in this area over the last few decades, especially in the field of dynamic identification techniques for structural systems. Frequency Domain Decomposition (FDD) and Time Domain Decomposition are the most commonly used methods to identify modal parameters such as natural frequency, modal damping, and mode shape. The focus of the present research is to study the dynamic characteristics of typical timber masonry walls commonly used in Portugal. For that purpose, a multi-storey structural prototype of such a wall has been tested on a seismic shake table at the National Laboratory for Civil Engineering, Portugal (LNEC). Signal processing has been performed on the output response, collected from the shaking-table experiment of the prototype using accelerometers. In the present work, signal processing of the output response relative to the input has been done in two ways: FDD and Stochastic Subspace Identification (SSI). In order to estimate the values of the modal parameters, algorithms for FDD are formulated and parametric functions for the SSI are computed. Finally, estimated values from both methods are compared to measure the accuracy of the two techniques.
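A minimal sketch of the FDD step: build the cross-spectral density matrix channel by channel and take an SVD at each frequency; peaks of the first singular value indicate natural frequencies (nperseg and the peak-picking are left open here):

```python
import numpy as np
from scipy.signal import csd

def fdd_first_sv(acc, fs, nperseg=1024):
    """acc: (n_channels, n_samples) accelerometer array."""
    nch = acc.shape[0]
    f, _ = csd(acc[0], acc[0], fs=fs, nperseg=nperseg)
    G = np.zeros((len(f), nch, nch), dtype=complex)
    for i in range(nch):
        for j in range(nch):
            _, G[:, i, j] = csd(acc[i], acc[j], fs=fs, nperseg=nperseg)
    s1 = np.array([np.linalg.svd(G[k], compute_uv=False)[0]
                   for k in range(len(f))])
    return f, s1   # pick peaks of s1 to estimate natural frequencies
```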
Projection decomposition algorithm for dual-energy computed tomography via deep neural network.
Xu, Yifu; Yan, Bin; Chen, Jian; Zeng, Lei; Li, Lei
2018-03-15
Dual-energy computed tomography (DECT) has been widely used to improve the identification of substances from different spectral information. Decomposition of mixed test samples into two materials relies on a well-calibrated material decomposition function. This work aims to establish and validate a data-driven algorithm for estimation of the decomposition function. A deep neural network (DNN) consisting of two sub-nets is proposed to solve the projection decomposition problem. The compressing sub-net, substantially a stacked auto-encoder (SAE), learns a compact representation of the energy spectrum. The decomposing sub-net, with a two-layer structure, fits the nonlinear transform between the energy projection and the basis material thicknesses. The proposed DNN not only delivers images with lower standard deviation and higher quality in both simulated and real data, but also yields the best performance in cases mixed with photon noise. Moreover, the DNN costs only 0.4 s to generate a decomposition solution at a 360 × 512 size scale, which is about 200 times faster than the competing algorithms. The DNN model is applicable to decomposition tasks with different dual energies. Experimental results demonstrated the strong function-fitting ability of the DNN. Thus, the deep learning paradigm provides a promising approach to solving the nonlinear problem in DECT.
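A minimal PyTorch sketch of the two-sub-net layout described (layer sizes, activations, and the spectrum dimension are assumptions; the paper's SAE pre-training is not reproduced):

```python
import torch
import torch.nn as nn

class ProjectionDecompNet(nn.Module):
    def __init__(self, n_energy_bins=64, code_dim=8):
        super().__init__()
        # compressing sub-net: encoder half of a stacked auto-encoder
        self.encoder = nn.Sequential(
            nn.Linear(n_energy_bins, 32), nn.ReLU(),
            nn.Linear(32, code_dim), nn.ReLU())
        # decomposing sub-net: two-layer map to basis-material thicknesses
        self.decomposer = nn.Sequential(
            nn.Linear(code_dim + 2, 32), nn.Sigmoid(),
            nn.Linear(32, 2))

    def forward(self, spectrum, projections):
        code = self.encoder(spectrum)
        return self.decomposer(torch.cat([code, projections], dim=-1))

net = ProjectionDecompNet()
print(net(torch.rand(4, 64), torch.rand(4, 2)).shape)  # torch.Size([4, 2])
```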
A multilevel preconditioner for domain decomposition boundary systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bramble, J.H.; Pasciak, J.E.; Xu, Jinchao.
1991-12-11
In this note, we consider multilevel preconditioning of the reduced boundary systems which arise in non-overlapping domain decomposition methods. It will be shown that the resulting preconditioned systems have condition numbers which are bounded in the case of multilevel spaces on the whole domain and which grow at most in proportion to the number of levels in the case of multilevel boundary spaces without multilevel extensions into the interior.
NASA Astrophysics Data System (ADS)
Romano, Paul Kollath
Monte Carlo particle transport methods are being considered as a viable option for high-fidelity simulation of nuclear reactors. While Monte Carlo methods offer several potential advantages over deterministic methods, there are a number of algorithmic shortcomings that would prevent their immediate adoption for full-core analyses. In this thesis, algorithms are proposed both to ameliorate the degradation in parallel efficiency typically observed for large numbers of processors and to offer a means of decomposing the large tally data that will be needed for reactor analysis. A nearest-neighbor fission bank algorithm was proposed and subsequently implemented in the OpenMC Monte Carlo code. A theoretical analysis of the communication pattern shows that the expected cost is O(√N), whereas traditional fission bank algorithms are O(N) at best. The algorithm was tested on two supercomputers, the Intrepid Blue Gene/P and the Titan Cray XK7, and demonstrated nearly linear parallel scaling up to 163,840 processor cores on a full-core benchmark problem. An algorithm for reducing the network communication arising from tally reduction was analyzed and implemented in OpenMC. The proposed algorithm groups only particle histories on a single processor into batches for tally purposes; in doing so it prevents all network communication for tallies until the very end of the simulation. The algorithm was tested, again on a full-core benchmark, and shown to reduce network communication substantially. A model was developed to predict the impact of load imbalances on the performance of domain decomposed simulations. The analysis demonstrated that load imbalances in domain decomposed simulations arise from two distinct phenomena: non-uniform particle densities and non-uniform spatial leakage. The dominant performance penalty for domain decomposition was shown to come from these physical effects rather than insufficient network bandwidth or high latency. The model predictions were verified with measured data from simulations in OpenMC on a full-core benchmark problem. Finally, a novel algorithm for decomposing large tally data was proposed, analyzed, and implemented/tested in OpenMC. The algorithm relies on disjoint sets of compute processes and tally servers. The analysis showed that for a range of parameters relevant to LWR analysis, the tally server algorithm should perform with minimal overhead. Tests were performed on Intrepid and Titan and demonstrated that the algorithm did indeed perform well over a wide range of parameters. (Copies available exclusively from MIT Libraries, libraries.mit.edu/docs - docs@mit.edu)
Limited-memory adaptive snapshot selection for proper orthogonal decomposition
DOE Office of Scientific and Technical Information (OSTI.GOV)
Oxberry, Geoffrey M.; Kostova-Vassilevska, Tanya; Arrighi, Bill
2015-04-02
Reduced order models are useful for accelerating simulations in many-query contexts, such as optimization, uncertainty quantification, and sensitivity analysis. However, offline training of reduced order models can have prohibitively expensive memory and floating-point operation costs in high-performance computing applications, where memory per core is limited. To overcome this limitation for proper orthogonal decomposition, we propose a novel adaptive selection method for snapshots in time that limits offline training costs by selecting snapshots according to an error control mechanism similar to that found in adaptive time-stepping ordinary differential equation solvers. The error estimator used in this work is related to theory bounding the approximation error in time of proper orthogonal decomposition-based reduced order models, and memory usage is minimized by computing the singular value decomposition using a single-pass incremental algorithm. Results for a viscous Burgers' test problem demonstrate convergence in the limit as the algorithm error tolerances go to zero; in this limit, the full order model is recovered to within discretization error. The resulting method can be used on supercomputers to generate proper orthogonal decomposition-based reduced order models, or as a subroutine within hyperreduction algorithms that require taking snapshots in time, or within greedy algorithms for sampling parameter space.
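A minimal sketch of the error-controlled flavor of snapshot selection: a state is appended to the basis only when its projection error exceeds a tolerance (this is a simplification; the paper's estimator and single-pass incremental SVD are not reproduced):

```python
import numpy as np

def maybe_add_snapshot(basis, u, tol=1e-3):
    """Append u (orthonormalized) only if its residual w.r.t. the current
    basis is large, keeping offline memory bounded."""
    if basis is None:
        return (u / np.linalg.norm(u))[:, None]
    r = u - basis @ (basis.T @ u)            # residual of u
    if np.linalg.norm(r) > tol * np.linalg.norm(u):
        basis = np.hstack([basis, (r / np.linalg.norm(r))[:, None]])
    return basis
```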
Scenario Decomposition for 0-1 Stochastic Programs: Improvements and Asynchronous Implementation
Ryan, Kevin; Rajan, Deepak; Ahmed, Shabbir
2016-05-01
Our recently proposed scenario decomposition algorithm for stochastic 0-1 programs finds an optimal solution by evaluating and removing individual solutions that are discovered by solving scenario subproblems. In this work, we develop an asynchronous, distributed implementation of the algorithm which has computational advantages over existing synchronous implementations. Improvements to both the synchronous and asynchronous algorithms are proposed. We test the results on well-known stochastic 0-1 programs from the SIPLIB test library and are able to solve one previously unsolved instance from the test set.
NASA Astrophysics Data System (ADS)
Vatankhah, Saeed; Renaut, Rosemary A.; Ardestani, Vahid E.
2018-04-01
We present a fast algorithm for the total variation regularization of the 3-D gravity inverse problem. Through imposition of the total variation regularization, subsurface structures presenting with sharp discontinuities are preserved better than when using a conventional minimum-structure inversion. The associated problem formulation for the regularization is nonlinear but can be solved using an iteratively reweighted least-squares algorithm. For small-scale problems the regularized least-squares problem at each iteration can be solved using the generalized singular value decomposition. This is not feasible for large-scale, or even moderate-scale, problems. Instead we introduce the use of a randomized generalized singular value decomposition in order to reduce the dimensions of the problem and provide an effective and efficient solution technique. For further efficiency an alternating direction algorithm is used to implement the total variation weighting operator within the iteratively reweighted least-squares algorithm. Presented results for synthetic examples demonstrate that the novel randomized decomposition provides good accuracy for reduced computational and memory demands as compared to use of classical approaches.
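The randomized decomposition builds on the standard randomized SVD range-finder idea, sketched below for the plain (non-generalized) case (the paper itself uses a randomized generalized SVD inside an iteratively reweighted scheme):

```python
import numpy as np

def randomized_svd(A, k, q=8, seed=0):
    """Basic randomized SVD: sample the range with a Gaussian test
    matrix (q = oversampling), orthonormalize, then do a small SVD."""
    rng = np.random.default_rng(seed)
    Omega = rng.standard_normal((A.shape[1], k + q))
    Q, _ = np.linalg.qr(A @ Omega)            # approximate range of A
    U_small, s, Vh = np.linalg.svd(Q.T @ A, full_matrices=False)
    return (Q @ U_small)[:, :k], s[:k], Vh[:k, :]
```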
Spectral decompositions of multiple time series: a Bayesian non-parametric approach.
Macaro, Christian; Prado, Raquel
2014-01-01
We consider spectral decompositions of multiple time series that arise in studies where the interest lies in assessing the influence of two or more factors. We write the spectral density of each time series as a sum of the spectral densities associated with the different levels of the factors. We then use Whittle's approximation to the likelihood function and follow a Bayesian non-parametric approach to obtain posterior inference on the spectral densities based on Bernstein-Dirichlet prior distributions. The prior is strategically important as it carries identifiability conditions for the models and allows us to quantify our degree of confidence in such conditions. A Markov chain Monte Carlo (MCMC) algorithm for posterior inference within this class of frequency-domain models is presented. We illustrate the approach by analyzing simulated and real data via spectral one-way and two-way models. In particular, we present an analysis of functional magnetic resonance imaging (fMRI) brain responses measured in individuals who participated in a designed experiment to study pain perception in humans.
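For reference, the Whittle approximation the inference is built on compares the periodogram to a candidate spectral density; a minimal sketch (the Bernstein-Dirichlet prior and MCMC are far beyond this snippet):

```python
import numpy as np

def whittle_loglik(x, spec_fn):
    """Whittle log-likelihood: -sum(log f + I/f) over Fourier frequencies,
    with I the periodogram and f = spec_fn(freq) the candidate density."""
    n = len(x)
    I = np.abs(np.fft.rfft(x)) ** 2 / n       # periodogram
    freqs = np.fft.rfftfreq(n)                # cycles per sample
    f = spec_fn(freqs[1:])                    # skip the zero frequency
    return -np.sum(np.log(f) + I[1:] / f)

x = np.random.default_rng(0).standard_normal(512)
print(whittle_loglik(x, lambda fr: np.ones_like(fr)))  # flat-spectrum candidate
```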
Highly Scalable Matching Pursuit Signal Decomposition Algorithm
NASA Technical Reports Server (NTRS)
Christensen, Daniel; Das, Santanu; Srivastava, Ashok N.
2009-01-01
Matching Pursuit Decomposition (MPD) is a powerful iterative algorithm for signal decomposition and feature extraction. MPD decomposes any signal into linear combinations of its dictionary elements, or atoms. A best-fit atom from an arbitrarily defined dictionary is determined through cross-correlation. The selected atom is subtracted from the signal and this procedure is repeated on the residual in subsequent iterations until a stopping criterion is met. The reconstructed signal reveals the waveform structure of the original signal. However, a sufficiently large dictionary is required for an accurate reconstruction; this in turn increases the computational burden of the algorithm, thus limiting its applicability and level of adoption. The purpose of this research is to improve the scalability and performance of the classical MPD algorithm. Correlation thresholds were defined to prune insignificant atoms from the dictionary. The Coarse-Fine Grids and Multiple Atom Extraction techniques were proposed to decrease the computational burden of the algorithm. The Coarse-Fine Grids method enabled the approximation and refinement of the parameters for the best-fit atom. The ability to extract multiple atoms within a single iteration enhanced the effectiveness and efficiency of each iteration. These improvements were implemented to produce an improved Matching Pursuit Decomposition algorithm entitled MPD++. Disparate signal decomposition applications may require a particular emphasis on accuracy or computational efficiency. The prominence of the key signal features required for proper signal classification dictates the level of accuracy necessary in the decomposition. The MPD++ algorithm may be easily adapted to accommodate the imposed requirements. Certain feature extraction applications may require rapid signal decomposition. The full potential of MPD++ may be utilized to produce substantial performance gains while extracting only slightly less energy than the standard algorithm. When the utmost accuracy must be achieved, the modified algorithm extracts atoms more conservatively but still exhibits computational gains over classical MPD. The MPD++ algorithm was demonstrated using an over-complete dictionary on real-life data. Computational times were reduced by factors of 1.9 and 44 for the emphases of accuracy and performance, respectively. The modified algorithm extracted similar amounts of energy compared to classical MPD. The degree of improvement in computational time depends on the complexity of the data, the initialization parameters, and the breadth of the dictionary. The results of the research confirm that the three modifications successfully improved the scalability and computational efficiency of the MPD algorithm. Correlation Thresholding decreased the time complexity by reducing the dictionary size. Multiple Atom Extraction also reduced the time complexity by decreasing the number of iterations required for a stopping criterion to be reached. The Coarse-Fine Grids technique enabled complicated atoms with numerous variable parameters to be effectively represented in the dictionary. Due to the nature of the three proposed modifications, they are capable of being stacked and have cumulative effects on the reduction of the time complexity.
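A minimal sketch of classical matching pursuit with the correlation-threshold pruning the abstract mentions (the Coarse-Fine Grids and Multiple Atom Extraction refinements of MPD++ are not reproduced):

```python
import numpy as np

def matching_pursuit(signal, dictionary, n_atoms=10, corr_tol=0.0):
    """Rows of `dictionary` are unit-norm atoms; repeatedly pick the
    best-correlated atom and subtract its contribution from the residual."""
    residual = signal.astype(float).copy()
    atoms, coefs = [], []
    for _ in range(n_atoms):
        corr = dictionary @ residual
        k = int(np.argmax(np.abs(corr)))
        if np.abs(corr[k]) <= corr_tol:        # correlation threshold
            break
        atoms.append(k)
        coefs.append(corr[k])
        residual -= corr[k] * dictionary[k]
    return atoms, coefs, residual

rng = np.random.default_rng(0)
D = rng.standard_normal((256, 64))
D /= np.linalg.norm(D, axis=1, keepdims=True)
sig = 3.0 * D[10] - 2.0 * D[42]
print(matching_pursuit(sig, D, n_atoms=2)[0])  # typically [10, 42]
```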
DOE Office of Scientific and Technical Information (OSTI.GOV)
Oh, Duk-Soon; Widlund, Olof B.; Zampini, Stefano
2017-06-21
Here, a BDDC domain decomposition preconditioner is defined by a coarse component, expressed in terms of primal constraints, a weighted average across the interface between the subdomains, and local components given in terms of solvers of local subdomain problems. BDDC methods for vector field problems discretized with Raviart-Thomas finite elements are introduced. The methods are based on a new type of weighted average and an adaptive selection of primal constraints developed to deal with coefficients with high contrast even inside individual subdomains. For problems with very many subdomains, a third level of the preconditioner is introduced. Assuming that the subdomains are all built from elements of a coarse triangulation of the given domain, and that in each subdomain the material parameters are consistent, one obtains a bound for the preconditioned linear system's condition number which is independent of the values and jumps of these parameters across the subdomains' interface. Numerical experiments, using the PETSc library, are also presented which support the theory and show the algorithms' effectiveness even for problems not covered by the theory. Also included are experiments with Brezzi-Douglas-Marini finite-element approximations.
Bearing Fault Diagnosis Based on Statistical Locally Linear Embedding
Wang, Xiang; Zheng, Yuan; Zhao, Zhenzhou; Wang, Jinping
2015-01-01
Fault diagnosis is essentially a kind of pattern recognition. The measured signal samples usually lie on nonlinear low-dimensional manifolds embedded in the high-dimensional signal space, so how to implement feature extraction and dimensionality reduction and improve recognition performance is a crucial task. In this paper a novel machinery fault diagnosis approach based on a statistical locally linear embedding (S-LLE) algorithm, which is an extension of LLE that exploits the fault class label information, is proposed. The fault diagnosis approach first extracts the intrinsic manifold features from the high-dimensional feature vectors, which are obtained from vibration signals by time-domain, frequency-domain and empirical mode decomposition (EMD) feature extraction, and then translates the complex mode space into a salient low-dimensional feature space by the manifold learning algorithm S-LLE, which outperforms other feature reduction methods such as PCA, LDA and LLE. Finally, in the reduced feature space, pattern classification and fault diagnosis by a classifier are carried out easily and rapidly. Rolling bearing fault signals are used to validate the proposed fault diagnosis approach. The results indicate that the proposed approach obviously improves the classification performance of fault pattern recognition and outperforms the other traditional approaches. PMID:26153771
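The supervised S-LLE variant is not in standard libraries, but the shape of the pipeline (manifold reduction, then a classifier) can be sketched with scikit-learn's plain LLE on placeholder features:

```python
import numpy as np
from sklearn.manifold import LocallyLinearEmbedding
from sklearn.neighbors import KNeighborsClassifier

# rows = samples of time/frequency/EMD statistics; y = fault class labels
rng = np.random.default_rng(0)
features = rng.standard_normal((200, 30))    # placeholder feature vectors
y = rng.integers(0, 3, 200)                  # placeholder labels

low_dim = LocallyLinearEmbedding(n_components=5,
                                 n_neighbors=12).fit_transform(features)
clf = KNeighborsClassifier(n_neighbors=5).fit(low_dim, y)
print(clf.score(low_dim, y))                 # training accuracy
```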
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jiang, Huaiguang; Dai, Xiaoxiao; Gao, David Wenzhong
An approach to big data characterization for smart grids (SGs) and its applications in fault detection, identification, and causal impact analysis is proposed in this paper, which aims to provide substantial data volume reduction while keeping comprehensive information from synchrophasor measurements in the spatial and temporal domains. In particular, based on secondary voltage control (SVC) and a local SG observation algorithm, a two-layer dynamic optimal synchrophasor measurement devices selection algorithm (OSMDSA) is proposed to determine SVC zones, their corresponding pilot buses, and the optimal synchrophasor measurement devices. Combining the two-layer dynamic OSMDSA and matching pursuit decomposition, the synchrophasor data is completely characterized in the spatial-temporal domain. To demonstrate the effectiveness of the proposed characterization approach, SG situational awareness is investigated based on hidden Markov model based fault detection and identification using the spatial-temporal characteristics generated from the reduced data. To identify the major impact buses, weighted Granger causality for SGs is proposed to investigate the causal relationship of buses during system disturbances. The IEEE 39-bus system and IEEE 118-bus system are employed to validate and evaluate the proposed approach.
Achieving a high mode count in the exact electromagnetic simulation of diffractive optical elements.
Junker, André; Brenner, Karl-Heinz
2018-03-01
The application of rigorous optical simulation algorithms, both in the modal and in the time domain, is known to be limited to the nano-optical scale due to severe computing time and memory constraints. This is true even for today's high-performance computers. To address this problem, we develop the fast rigorous iterative method (FRIM), an algorithm based on an iterative approach, which, under certain conditions, allows solving also large-size problems approximation free. We achieve this in the case of a modal representation by avoiding the computationally complex eigenmode decomposition. Thereby, the numerical cost is reduced from O(N^3) to O(N log N), enabling a simulation of structures like certain diffractive optical elements with a significantly higher mode count than presently possible. Apart from speed, another major advantage of the iterative FRIM over standard modal methods is the possibility to trade runtime against accuracy.
On iterative processes in the Krylov-Sonneveld subspaces
NASA Astrophysics Data System (ADS)
Ilin, Valery P.
2016-10-01
The iterative Induced Dimension Reduction (IDR) methods are considered for solving large systems of linear algebraic equations (SLAEs) with nonsingular nonsymmetric matrices. These approaches have been investigated by many authors and are sometimes characterized as an alternative to the classical processes of Krylov type. The key ideas of the IDR algorithms are the construction of embedded Sonneveld subspaces, which have decreasing dimensions, and the use of orthogonalization against some fixed subspace. Other independent approaches for the study and optimization of such iterations are based on augmented and modified Krylov subspaces, using aggregation and deflation procedures that involve various low-rank approximations of the original matrices. The goal of this paper is to show that the IDR method in Sonneveld subspaces presents an original interpretation of the modified algorithms in Krylov subspaces. In particular, such a description is given for the multi-preconditioned semi-conjugate direction methods which are relevant for parallel algebraic domain decomposition approaches.
NASA Astrophysics Data System (ADS)
Hemker, Roy
1999-11-01
The advances in computational speed make it now possible to do full 3D PIC simulations of laser-plasma and beam-plasma interactions, but at the same time the increased complexity of these problems makes it necessary to apply modern approaches like object-oriented programming to the development of simulation codes. We report here on our progress in developing an object-oriented parallel 3D PIC code using Fortran 90. In its current state the code contains algorithms for 1D, 2D, and 3D simulations in Cartesian coordinates and for 2D cylindrically-symmetric geometry. For all of these algorithms the code allows for a moving simulation window and arbitrary domain decomposition for any number of dimensions. Recent 3D simulation results on the propagation of intense laser and electron beams through plasmas will be presented.
Three-Dimensional High-Lift Analysis Using a Parallel Unstructured Multigrid Solver
NASA Technical Reports Server (NTRS)
Mavriplis, Dimitri J.
1998-01-01
A directional implicit unstructured agglomeration multigrid solver is ported to shared and distributed memory massively parallel machines using the explicit domain-decomposition and message-passing approach. Because the algorithm operates on local implicit lines in the unstructured mesh, special care is required in partitioning the problem for parallel computing. A weighted partitioning strategy is described which avoids breaking the implicit lines across processor boundaries, while incurring minimal additional communication overhead. Good scalability is demonstrated on a 128 processor SGI Origin 2000 machine and on a 512 processor CRAY T3E machine for reasonably fine grids. The feasibility of performing large-scale unstructured grid calculations with the parallel multigrid algorithm is demonstrated by computing the flow over a partial-span flap wing high-lift geometry on a highly resolved grid of 13.5 million points in approximately 4 hours of wall clock time on the CRAY T3E.
An expert support system for breast cancer diagnosis using color wavelet features.
Issac Niwas, S; Palanisamy, P; Chibbar, Rajni; Zhang, W J
2012-10-01
Breast cancer diagnosis can be done through pathologic assessment of breast tissue samples obtained, for example, by the core needle biopsy technique. The result of the pathologist's analysis of this sample is crucial for the breast cancer patient. In this paper, the nuclei of tissue samples are investigated after decomposition by means of the Log-Gabor wavelet in the HSV color domain, and an algorithm is developed to compute the color wavelet features. These features are used for breast cancer diagnosis using the Support Vector Machine (SVM) classifier algorithm. The ability of a properly trained SVM to correctly classify patterns makes it particularly suitable for use in an expert system that aids in the diagnosis of cancer tissue samples. The results are compared with other multivariate classifiers such as the Naïve Bayes classifier and an Artificial Neural Network. The overall accuracy of the proposed method using the SVM classifier will be further useful for automation in cancer diagnosis.
NASA Astrophysics Data System (ADS)
Jia, Zhongxiao; Yang, Yanfei
2018-05-01
In this paper, we propose new randomization-based algorithms for large scale linear discrete ill-posed problems with general-form regularization: min ||Lx||_2 subject to ||Ax - b||_2 <= tau, where L is a regularization matrix and tau reflects the noise level. Our algorithms are inspired by the modified truncated singular value decomposition (MTSVD) method, which suits only small to medium scale problems, and randomized SVD (RSVD) algorithms that generate good low-rank approximations to A. We use rank-k truncated randomized SVD (TRSVD) approximations to A obtained by truncating the rank-(k + q) RSVD approximations to A, where q is an oversampling parameter. The resulting algorithms are called modified TRSVD (MTRSVD) methods. At every step, we use the LSQR algorithm to solve the resulting inner least squares problem, which is proved to become better conditioned as k increases, so that LSQR converges faster. We present sharp bounds for the approximation accuracy of the RSVDs and TRSVDs for severely, moderately and mildly ill-posed problems, and substantially improve a known basic bound for TRSVD approximations. We prove how to choose the stopping tolerance for LSQR in order to guarantee that the computed and exact best regularized solutions have the same accuracy. Numerical experiments illustrate that the best regularized solutions by MTRSVD are as accurate as the ones by the truncated generalized singular value decomposition (TGSVD) algorithm, and at least as accurate as those by some existing truncated randomized generalized singular value decomposition (TRGSVD) algorithms. This work was supported in part by the National Science Foundation of China (Nos. 11771249 and 11371219).
A CFD study of complex missile and store configurations in relative motion
NASA Technical Reports Server (NTRS)
Baysal, Oktay
1995-01-01
An investigation was conducted from May 16, 1990 to August 31, 1994 on the development of computational fluid dynamics (CFD) methodologies for complex missiles and the store separation problem. These flowfields involved multiple-component configurations, where at least one of the objects was engaged in relative motion. The two most important issues that had to be addressed were: (1) the unsteadiness of the flowfields (time-accurate and efficient CFD algorithms for the unsteady equations), and (2) the generation of grid systems which would permit multiple and moving bodies in the computational domain (dynamic domain decomposition). The study produced two competing and promising methodologies, and their proof-of-concept cases, which have been reported in the open literature: (1) Unsteady solutions on dynamic, overlapped grids, which may also be perceived as moving, locally-structured grids, and (2) Unsteady solutions on dynamic, unstructured grids.
NASA Technical Reports Server (NTRS)
Zhang, Yiqiang; Alexander, J. I. D.; Ouazzani, J.
1994-01-01
Free and moving boundary problems require the simultaneous solution of unknown field variables and the boundaries of the domains on which these variables are defined. There are many technologically important processes that lead to moving boundary problems associated with fluid surfaces and solid-fluid boundaries. These include crystal growth, metal alloy and glass solidification, melting, and flame propagation. The directional solidification of semiconductor crystals by the Bridgman-Stockbarger method is a typical example of such a complex process. A numerical model of this growth method must solve the appropriate heat, mass and momentum transfer equations and determine the location of the melt-solid interface. In this work, a Chebyshev pseudospectral collocation method is adapted to the problem of directional solidification. Implementation involves a solution algorithm that combines domain decomposition, a finite-difference preconditioned conjugate minimum residual method, and a Picard-type iterative scheme.
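A minimal sketch of the Chebyshev pseudospectral collocation ingredient, using the standard differentiation matrix (Trefethen's recipe) on a toy two-point boundary value problem rather than the solidification equations:

```python
import numpy as np

def cheb(N):
    """Chebyshev differentiation matrix D and points x on [-1, 1]."""
    x = np.cos(np.pi * np.arange(N + 1) / N)
    c = np.hstack([2.0, np.ones(N - 1), 2.0]) * (-1.0) ** np.arange(N + 1)
    X = np.tile(x, (N + 1, 1)).T
    D = np.outer(c, 1.0 / c) / (X - X.T + np.eye(N + 1))
    D -= np.diag(D.sum(axis=1))               # fix the diagonal entries
    return D, x

# solve u'' = exp(x) on [-1, 1] with u(-1) = u(1) = 0 by collocation
D, x = cheb(24)
D2 = (D @ D)[1:-1, 1:-1]                      # impose boundary conditions
u = np.zeros(25)
u[1:-1] = np.linalg.solve(D2, np.exp(x[1:-1]))
```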
Mapping to Irregular Torus Topologies and Other Techniques for Petascale Biomolecular Simulation
Phillips, James C.; Sun, Yanhua; Jain, Nikhil; Bohm, Eric J.; Kalé, Laxmikant V.
2014-01-01
Currently deployed petascale supercomputers typically use toroidal network topologies in three or more dimensions. While these networks perform well for topology-agnostic codes on a few thousand nodes, leadership machines with 20,000 nodes require topology awareness to avoid network contention for communication-intensive codes. Topology adaptation is complicated by irregular node allocation shapes and holes due to dedicated input/output nodes or hardware failure. In the context of the popular molecular dynamics program NAMD, we present methods for mapping a periodic 3-D grid of fixed-size spatial decomposition domains to 3-D Cray Gemini and 5-D IBM Blue Gene/Q toroidal networks to enable hundred-million atom full machine simulations, and to similarly partition node allocations into compact domains for smaller simulations using multiple-copy algorithms. Additional enabling techniques are discussed and performance is reported for NCSA Blue Waters, ORNL Titan, ANL Mira, TACC Stampede, and NERSC Edison. PMID:25594075
CP decomposition approach to blind separation for DS-CDMA system using a new performance index
NASA Astrophysics Data System (ADS)
Rouijel, Awatif; Minaoui, Khalid; Comon, Pierre; Aboutajdine, Driss
2014-12-01
In this paper, we present a canonical polyadic (CP) tensor decomposition isolating the scaling matrix. This has two major implications: (i) the problem conditioning shows up explicitly and can be controlled through a constraint on the so-called coherences, and (ii) a performance criterion concerning the factor matrices can be exactly calculated and is more realistic than the performance metrics used in the literature. Two new algorithms optimizing the CP decomposition based on gradient descent are proposed. The decomposition is illustrated by an application to direct-sequence code division multiple access (DS-CDMA) systems; computer simulations are provided and demonstrate the good behavior of these algorithms compared to others in the literature.
Ma, Jingjing; Liu, Jie; Ma, Wenping; Gong, Maoguo; Jiao, Licheng
2014-01-01
Community structure is one of the most important properties in social networks. In dynamic networks, there are two conflicting criteria that need to be considered. One is the snapshot quality, which evaluates the quality of the community partitions at the current time step. The other is the temporal cost, which evaluates the difference between communities at different time steps. In this paper, we propose a decomposition-based multiobjective community detection algorithm to simultaneously optimize these two objectives to reveal community structure and its evolution in dynamic networks. It employs the framework of multiobjective evolutionary algorithm based on decomposition to simultaneously optimize the modularity and normalized mutual information, which quantitatively measure the quality of the community partitions and temporal cost, respectively. A local search strategy dealing with the problem-specific knowledge is incorporated to improve the effectiveness of the new algorithm. Experiments on computer-generated and real-world networks demonstrate that the proposed algorithm can not only find community structure and capture community evolution more accurately, but also be steadier than the two compared algorithms. PMID:24723806
Polar decomposition for attitude determination from vector observations
NASA Technical Reports Server (NTRS)
Bar-Itzhack, Itzhack Y.
1993-01-01
This work treats the problem of weighted least squares fitting of a 3D Euclidean-coordinate transformation matrix to a set of unit vectors measured in the reference and transformed coordinates. A closed-form analytic solution to the problem is re-derived. The fact that the solution is the closest orthogonal matrix to some matrix defined on the measured vectors and their weights is clearly demonstrated. Several known algorithms for computing the analytic closed-form solution are considered. An algorithm is discussed which is based on the polar decomposition of matrices into the closest unitary matrix to the decomposed matrix and a Hermitian matrix. A somewhat longer improved algorithm is suggested too. A comparison of several algorithms is carried out using simulated data as well as real data from the Upper Atmosphere Research Satellite. The comparison is based on accuracy and time consumption. It is concluded that the algorithms based on polar decomposition yield a simple although somewhat less accurate solution. The precision of the latter algorithms increases with the number of measured vectors and with the accuracy of their measurement.
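The "closest orthogonal matrix" construction is compact enough to sketch; here via the SVD route to the polar decomposition's orthogonal factor (a generic Wahba-problem solver, not the paper's specific iterative algorithms):

```python
import numpy as np

def attitude_from_vectors(ref, obs, weights):
    """Closest proper rotation to B = sum_i w_i b_i r_i^T, i.e. the
    orthogonal polar factor of B, computed here via the SVD."""
    B = sum(w * np.outer(b, r) for w, b, r in zip(weights, obs, ref))
    U, _, Vt = np.linalg.svd(B)
    d = np.sign(np.linalg.det(U @ Vt))       # enforce det(R) = +1
    return U @ np.diag([1.0, 1.0, d]) @ Vt

rng = np.random.default_rng(2)
r = rng.standard_normal((5, 3))
r /= np.linalg.norm(r, axis=1, keepdims=True)
R_true = np.array([[0., -1., 0.], [1., 0., 0.], [0., 0., 1.]])
obs = r @ R_true.T                           # noise-free measurements
print(np.allclose(attitude_from_vectors(r, obs, np.ones(5)), R_true))
```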
A novel iterative scheme and its application to differential equations.
Khan, Yasir; Naeem, F; Šmarda, Zdeněk
2014-01-01
The purpose of this paper is to employ an alternative approach to reconstruct the standard variational iteration algorithm II proposed by He, including Lagrange multiplier, and to give a simpler formulation of Adomian decomposition and modified Adomian decomposition method in terms of newly proposed variational iteration method-II (VIM). Through careful investigation of the earlier variational iteration algorithm and Adomian decomposition method, we find unnecessary calculations for Lagrange multiplier and also repeated calculations involved in each iteration, respectively. Several examples are given to verify the reliability and efficiency of the method.
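Both methods hinge on expanding the nonlinearity in Adomian polynomials; a small sympy sketch (the helper name and the example nonlinearity u^2 are our choices, not the paper's):

```python
import sympy as sp

def adomian_polynomials(N, n_terms):
    """First n_terms Adomian polynomials A_k for a nonlinearity N(u):
    A_k = (1/k!) d^k/d(lambda)^k N(sum_i u_i lambda^i) at lambda = 0."""
    lam = sp.symbols('lambda')
    u = sp.symbols('u0:%d' % n_terms)                  # series terms u0, u1, ...
    series = sum(u[k] * lam**k for k in range(n_terms))
    return [sp.simplify(sp.diff(N(series), lam, k).subs(lam, 0) / sp.factorial(k))
            for k in range(n_terms)]

# Example: N(u) = u**2 gives A0 = u0**2, A1 = 2*u0*u1, A2 = u1**2 + 2*u0*u2, ...
print(adomian_polynomials(lambda x: x**2, 4))
```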
NASA Technical Reports Server (NTRS)
Plassman, Gerald E.
2005-01-01
This contractor report describes a performance comparison of available alternative complete Singular Value Decomposition (SVD) methods and implementations which are suitable for incorporation into point spread function deconvolution algorithms. The report also presents a survey of alternative algorithms, including partial SVDs, special-case SVDs, and others developed for concurrent processing systems.
Dual energy computed tomography for the head.
Naruto, Norihito; Itoh, Toshihide; Noguchi, Kyo
2018-02-01
Dual energy CT (DECT) is a promising technology that provides better diagnostic accuracy in several brain diseases. DECT can generate various types of CT images from a single acquisition data set at high kV and low kV based on material decomposition algorithms. The two-material decomposition algorithm can separate bone/calcification from iodine accurately. The three-material decomposition algorithm can generate a virtual non-contrast image, which helps to identify conditions such as brain hemorrhage. A virtual monochromatic image has the potential to eliminate metal artifacts by reducing beam-hardening effects. DECT also enables exploration of advanced imaging to make diagnosis easier. One such novel application of DECT is the X-Map, which helps to visualize ischemic stroke in the brain without using iodine contrast medium.
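Per voxel, the two-material decomposition reduces to inverting a 2x2 mixing matrix between the low- and high-kV attenuation values; a hedged numerical sketch (the basis attenuation values below are made-up placeholders for calibrated ones):

```python
import numpy as np

# Hypothetical basis attenuation coefficients (1/cm) of the two materials at
# the low- and high-kV spectra; in practice these come from calibration.
M = np.array([[4.9, 0.6],    # low-kV:  [iodine, bone]  (assumed values)
              [2.1, 0.4]])   # high-kV: [iodine, bone]

def two_material_decomposition(mu_low, mu_high):
    """Per-voxel 2x2 inversion: a pair of attenuation images -> a pair of
    basis-material concentration images."""
    rhs = np.stack([np.ravel(mu_low), np.ravel(mu_high)])   # 2 x Nvox
    conc = np.linalg.solve(M, rhs)
    return conc[0].reshape(np.shape(mu_low)), conc[1].reshape(np.shape(mu_low))
```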
A Graph Based Backtracking Algorithm for Solving General CSPs
NASA Technical Reports Server (NTRS)
Pang, Wanlin; Goodwin, Scott D.
2003-01-01
Many AI tasks can be formalized as constraint satisfaction problems (CSPs), which involve finding values for variables subject to constraints. While solving a CSP is an NP-complete task in general, tractable classes of CSPs have been identified based on the structure of the underlying constraint graphs. Much effort has been spent on exploiting structural properties of the constraint graph to improve the efficiency of finding a solution. These efforts contributed to development of a class of CSP solving algorithms called decomposition algorithms. The strength of CSP decomposition is that its worst-case complexity depends on the structural properties of the constraint graph and is usually better than the worst-case complexity of search methods. Its practical application is limited, however, since it cannot be applied if the CSP is not decomposable. In this paper, we propose a graph based backtracking algorithm called omega-CDBT, which shares merits and overcomes the weaknesses of both decomposition and search approaches.
Fast heap transform-based QR-decomposition of real and complex matrices: algorithms and codes
NASA Astrophysics Data System (ADS)
Grigoryan, Artyom M.
2015-03-01
In this paper, we describe a new look at the application of Givens rotations to the QR-decomposition problem, which is similar to the method of Householder transformations. We apply the concept of the discrete heap transform, or signal-induced unitary transforms, which was introduced by Grigoryan (2006) and used in signal and image processing. Both cases of real and complex nonsingular matrices are considered, and examples of performing QR-decomposition of square matrices are given. The proposed method of QR-decomposition for the complex matrix is novel, differs from the known method of complex Givens rotations, and is based on analytical equations for the heap transforms. Many examples illustrating the proposed heap transform method of QR-decomposition are given, algorithms are described in detail, and MATLAB-based codes are included.
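For reference, the classical Givens-rotation QR that the heap-transform construction is compared against can be sketched as follows (this is the textbook algorithm, not the paper's heap-transform variant):

```python
import numpy as np

def givens_qr(A):
    """QR-decomposition of a real matrix by classical Givens rotations:
    zero out each subdiagonal entry with a 2x2 plane rotation."""
    m, n = A.shape
    R = A.astype(float).copy()
    Q = np.eye(m)
    for j in range(n):
        for i in range(m - 1, j, -1):        # work upward within column j
            a, b = R[i - 1, j], R[i, j]
            r = np.hypot(a, b)
            if r == 0.0:
                continue
            c, s = a / r, b / r
            G = np.array([[c, s], [-s, c]])  # rotation acting on rows i-1, i
            R[[i - 1, i], :] = G @ R[[i - 1, i], :]
            Q[:, [i - 1, i]] = Q[:, [i - 1, i]] @ G.T   # accumulate A = Q R
    return Q, R

A = np.random.rand(4, 4)
Q, R = givens_qr(A)
assert np.allclose(Q @ R, A) and np.allclose(Q.T @ Q, np.eye(4))
```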
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tang Shaojie; Tang Xiangyang; School of Automation, Xi'an University of Posts and Telecommunications, Xi'an, Shaanxi 710121
2012-09-15
Purpose: The suppression of noise in x-ray computed tomography (CT) imaging is of clinical relevance for diagnostic image quality and the potential for radiation dose saving. Toward this purpose, statistical noise reduction methods in either the image or projection domain have been proposed, which employ a multiscale decomposition to enhance the performance of noise suppression while maintaining image sharpness. Recognizing the advantages of noise suppression in the projection domain, the authors propose a projection domain multiscale penalized weighted least squares (PWLS) method, in which the angular sampling rate is explicitly taken into consideration to account for the possible variation of interview sampling rate in advanced clinical or preclinical applications. Methods: The projection domain multiscale PWLS method is derived by converting an isotropic diffusion partial differential equation in the image domain into the projection domain, wherein a multiscale decomposition is carried out. With adoption of the Markov random field or soft thresholding objective function, the projection domain multiscale PWLS method deals with noise at each scale. To compensate for the degradation in image sharpness caused by the projection domain multiscale PWLS method, an edge enhancement is carried out following the noise reduction. The performance of the proposed method is experimentally evaluated and verified using projection data simulated by computer and acquired by a CT scanner. Results: The preliminary results show that the proposed projection domain multiscale PWLS method outperforms the projection domain single-scale PWLS method and the image domain multiscale anisotropic diffusion method in noise reduction. In addition, the proposed method can preserve image sharpness very well while avoiding the occurrence of 'salt-and-pepper' noise and mosaic artifacts. Conclusions: Since the interview sampling rate is taken into account in the projection domain multiscale decomposition, the proposed method is anticipated to be useful in advanced clinical and preclinical applications where the interview sampling rate varies.
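The soft-thresholding building block of such multiscale schemes can be illustrated in a few lines; this generic 1D sketch (assuming the PyWavelets package) omits the paper's PWLS weighting and sampling-rate handling:

```python
import numpy as np
import pywt   # assumes the PyWavelets package is available

def multiscale_soft_threshold(proj, wavelet='db2', level=3, sigma=1.0):
    """Generic multiscale shrinkage of a 1D projection: decompose,
    soft-threshold the detail coefficients, reconstruct. This shows only the
    multiscale building block, not the paper's sampling-rate-aware PWLS."""
    coeffs = pywt.wavedec(proj, wavelet, level=level)
    thr = sigma * np.sqrt(2.0 * np.log(len(proj)))        # universal threshold
    coeffs = [coeffs[0]] + [pywt.threshold(c, thr, mode='soft') for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)
```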
SOI layout decomposition for double patterning lithography on high-performance computer platforms
NASA Astrophysics Data System (ADS)
Verstov, Vladimir; Zinchenko, Lyudmila; Makarchuk, Vladimir
2014-12-01
In this paper, silicon-on-insulator layout decomposition algorithms for double patterning lithography on high-performance computing platforms are discussed. Our approach is based on the use of a contradiction graph and a modified concurrent breadth-first search algorithm. We evaluate our technique on the 45 nm Nangate Open Cell Library, including non-Manhattan geometry. Experimental results show that our soft computing algorithms decompose the layout successfully and that the minimal distance between polygons in the layout is increased.
A thermodynamic definition of protein domains.
Porter, Lauren L; Rose, George D
2012-06-12
Protein domains are conspicuous structural units in globular proteins, and their identification has been a topic of intense biochemical interest dating back to the earliest crystal structures. Numerous disparate domain identification algorithms have been proposed, all involving some combination of visual intuition and/or structure-based decomposition. Instead, we present a rigorous, thermodynamically-based approach that redefines domains as cooperative chain segments. In greater detail, most small proteins fold with high cooperativity, meaning that the equilibrium population is dominated by completely folded and completely unfolded molecules, with a negligible subpopulation of partially folded intermediates. Here, we redefine structural domains in thermodynamic terms as cooperative folding units, based on m-values, which measure the cooperativity of a protein or its substructures. In our analysis, a domain is equated to a contiguous segment of the folded protein whose m-value is largely unaffected when that segment is excised from its parent structure. Defined in this way, a domain is a self-contained cooperative unit; i.e., its cooperativity depends primarily upon intrasegment interactions, not intersegment interactions. Implementing this concept computationally, the domains in a large representative set of proteins were identified; all exhibit consistency with experimental findings. Specifically, our domain divisions correspond to the experimentally determined equilibrium folding intermediates in a set of nine proteins. The approach was also proofed against a representative set of 71 additional proteins, again with confirmatory results. Our reframed interpretation of a protein domain transforms an indeterminate structural phenomenon into a quantifiable molecular property grounded in solution thermodynamics.
NASA Astrophysics Data System (ADS)
Lahmiri, Salim
2016-08-01
The main purpose of this work is to explore the usefulness of fractal descriptors estimated in multi-resolution domains to characterize biomedical digital image texture. In this regard, three multi-resolution techniques are considered: the well-known discrete wavelet transform (DWT), the empirical mode decomposition (EMD), and the newly introduced variational mode decomposition (VMD). The original image is decomposed by the DWT, EMD, and VMD into different scales. Then, Fourier spectrum based fractal descriptors are estimated at specific scales and directions to characterize the image. The support vector machine (SVM) was used to perform supervised classification. The empirical study was applied to the problem of distinguishing between normal and abnormal brain magnetic resonance images (MRI) affected with Alzheimer disease (AD). Our results demonstrate that fractal descriptors estimated in the VMD domain outperform those estimated in the DWT and EMD domains, as well as those directly estimated from the original image.
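A hedged sketch of a Fourier-spectrum fractal descriptor of this kind, computed on the original image for simplicity rather than on the DWT/EMD/VMD sub-bands:

```python
import numpy as np

def spectral_fractal_descriptor(img):
    """Slope of the radially averaged log-log power spectrum. For fractional
    Brownian textures with P(f) ~ f**beta (beta < 0), the fractal dimension is
    often taken as D = (8 + beta) / 2; here the slope serves as a descriptor."""
    F = np.fft.fftshift(np.fft.fft2(img - img.mean()))
    P = np.abs(F) ** 2
    cy, cx = np.array(P.shape) // 2
    y, x = np.indices(P.shape)
    r = np.hypot(y - cy, x - cx).astype(int)           # radial frequency bins
    radial = np.bincount(r.ravel(), P.ravel()) / np.maximum(np.bincount(r.ravel()), 1)
    f_max = min(cy, cx)
    f = np.arange(1, f_max)                            # skip the DC bin
    slope = np.polyfit(np.log(f), np.log(radial[1:f_max] + 1e-30), 1)[0]
    return slope
```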
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dongarra, J.J.; Hewitt, T.
1985-08-01
This note describes some experiments on simple, dense linear algebra algorithms. These experiments show that the CRAY X-MP is capable of small-grain multitasking arising from standard implementations of LU and Cholesky decomposition. The implementation described here provides the "fastest" execution rate for LU decomposition, 718 MFLOPS for a matrix of order 1000.
NASA Astrophysics Data System (ADS)
Ma, Yehao; Li, Xian; Huang, Pingjie; Hou, Dibo; Wang, Qiang; Zhang, Guangxin
2017-04-01
In many situations the THz spectroscopic data observed from complex samples represent the integrated result of several interrelated variables or feature components acting together. The actual information contained in the original data might be overlapping, and there is a need to investigate various approaches for model reduction and data unmixing. The development and use of low-rank approximate nonnegative matrix factorization (NMF) and smooth constraint NMF (CNMF) algorithms for feature component extraction and identification in the field of terahertz time domain spectroscopy (THz-TDS) data analysis are presented. The evolution and convergence properties of NMF and CNMF methods based on sparseness, independence and smoothness constraints for the resulting nonnegative matrix factors are discussed. For general NMF, the cost function is nonconvex and the result is usually susceptible to initialization and noise corruption, may fall into local minima, and can lead to unstable decomposition. To reduce these drawbacks, a smoothness constraint is introduced to enhance the performance of NMF. The proposed algorithms are evaluated by several THz-TDS data decomposition experiments, including a binary system and a ternary system simulating applications such as medicine tablet inspection. Results show that CNMF is more capable of finding optimal solutions and more robust to random initialization than NMF. The investigated method is promising for THz data resolution and contributes to unknown mixture identification.
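A simplified multiplicative-update sketch of NMF with a quadratic smoothness penalty, standing in for (not reproducing) the paper's CNMF; the split of the difference operator keeps the update nonnegative:

```python
import numpy as np

def smooth_nmf(V, r, lam=0.1, n_iter=500, eps=1e-9):
    """NMF V ~ W @ H with a penalty lam * ||H D||^2 smoothing H along its
    second axis, via multiplicative updates. D D^T is split into nonnegative
    parts T_plus - T_minus so both updates stay elementwise nonnegative."""
    m, n = V.shape
    rng = np.random.default_rng(0)
    W, H = rng.random((m, r)) + eps, rng.random((r, n)) + eps
    D = np.diff(np.eye(n), axis=1)             # finite differences, n x (n-1)
    DDt = D @ D.T
    T_plus = np.diag(np.diag(DDt))
    T_minus = T_plus - DDt
    for _ in range(n_iter):
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        H *= (W.T @ V + lam * H @ T_minus) / (W.T @ W @ H + lam * H @ T_plus + eps)
    return W, H
```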
INDDGO: Integrated Network Decomposition & Dynamic programming for Graph Optimization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Groer, Christopher S; Sullivan, Blair D; Weerapurage, Dinesh P
2012-10-01
It is well-known that dynamic programming algorithms can utilize tree decompositions to provide a way to solve some NP-hard problems on graphs, where the complexity is polynomial in the number of nodes and edges in the graph but exponential in the width of the underlying tree decomposition. However, there has been relatively little computational work done to determine the practical utility of such dynamic programming algorithms. We have developed software to construct tree decompositions using various heuristics and have created a fast, memory-efficient dynamic programming implementation for solving maximum weighted independent set. We describe our software and the algorithms we have implemented, focusing on memory saving techniques for the dynamic programming. We compare the running time and memory usage of our implementation with other techniques for solving maximum weighted independent set, including a commercial integer programming solver and a semi-definite programming solver. Our results indicate that it is possible to solve some instances where the underlying decomposition has width much larger than suggested by the literature. For certain types of problems, our dynamic programming code runs several times faster than these other methods.
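The flavor of such dynamic programming is easiest to see in the width-1 special case, maximum weighted independent set on a tree; a sketch with networkx (INDDGO itself works on general tree decompositions):

```python
import networkx as nx

def mwis_tree(T, root):
    """Maximum weighted independent set on a tree by dynamic programming,
    the width-1 special case of the tree-decomposition DP described above.
    Node weights are stored in T.nodes[v]['w']."""
    parent = {root: None}
    for u, v in nx.dfs_edges(T, root):
        parent[v] = u
    take, skip = {}, {}
    for v in nx.dfs_postorder_nodes(T, root):          # children before parents
        kids = [c for c in T.neighbors(v) if parent.get(c) == v]
        take[v] = T.nodes[v]['w'] + sum(skip[c] for c in kids)   # v in the set
        skip[v] = sum(max(take[c], skip[c]) for c in kids)       # v excluded
    return max(take[root], skip[root])

# Path 0-1-2-3-4 with weights 3,5,2,7,1: the optimum picks nodes 1 and 3.
T = nx.path_graph(5)
nx.set_node_attributes(T, {i: w for i, w in enumerate([3, 5, 2, 7, 1])}, 'w')
print(mwis_tree(T, 0))   # 12
```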
xEMD procedures as a data-assisted filtering method
NASA Astrophysics Data System (ADS)
Machrowska, Anna; Jonak, Józef
2018-01-01
The article presents the possibility of using Empirical Mode Decomposition (EMD), Ensemble Empirical Mode Decomposition (EEMD), Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) and Improved Complete Ensemble Empirical Mode Decomposition (ICEEMD) algorithms for mechanical system condition monitoring applications. Results of the xEMD procedures applied to vibration signals of a system in different states of wear are presented.
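Assuming the third-party PyEMD (EMD-signal) package, the first three of these decompositions can be exercised on a toy signal along the following lines:

```python
import numpy as np
from PyEMD import EMD, EEMD, CEEMDAN   # assumes the PyEMD (EMD-signal) package

t = np.linspace(0, 1, 1000)
# Toy stand-in for a measured vibration record: two tones plus noise.
s = (np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 40 * t)
     + 0.1 * np.random.default_rng(0).normal(size=t.size))

imfs_emd = EMD()(s)                    # plain EMD
imfs_eemd = EEMD(trials=50)(s)         # ensemble EMD: averages noise realizations
imfs_ceemdan = CEEMDAN(trials=50)(s)   # complete ensemble EMD, adaptive noise

print(imfs_emd.shape, imfs_eemd.shape, imfs_ceemdan.shape)  # (n_imfs, n_samples)
```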
Extended FDD-WT method based on correcting the errors due to non-synchronous sensing of sensors
NASA Astrophysics Data System (ADS)
Tarinejad, Reza; Damadipour, Majid
2016-05-01
In this research, a combinational non-parametric method called frequency domain decomposition-wavelet transform (FDD-WT), recently presented by the authors, is extended to correct the errors resulting from non-synchronous sensing of sensors, in order to broaden the applicability of the algorithm to different kinds of structures, especially huge ones. The analysis process is based on time-frequency domain decomposition and is performed with emphasis on correcting time delays between sensors. Time delay estimation (TDE) methods are investigated for their efficiency and accuracy with noisy environmental records, and the Phase Transform-β (PHAT-β) technique was selected as an appropriate method to modify the operation of the traditional FDD-WT in order to achieve exact results. In this paper, a theoretical example (a 3DOF system) is provided to indicate the effects of non-synchronous sensing of the sensors on the modal parameters; moreover, the Pacoima dam subjected to the 13 Jan 2001 earthquake excitation was selected as a case study. The modal parameters of the dam obtained from the extended FDD-WT method were compared with the output of the classical signal processing method referred to as the 4-Spectral method, as well as with other literature on the dynamic characteristics of Pacoima dam. The comparison indicates that the values are correct and reliable.
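The phase-transform-weighted cross-correlation at the heart of such time-delay estimation can be sketched generically (plain GCC-PHAT, not the authors' PHAT-β variant):

```python
import numpy as np

def gcc_phat(x, y, fs):
    """Delay of y relative to x via the generalized cross-correlation with
    phase transform: the cross-spectrum is whitened so only phase remains."""
    n = 2 * max(len(x), len(y))                 # zero-pad against circular wrap
    X, Y = np.fft.rfft(x, n), np.fft.rfft(y, n)
    R = np.conj(X) * Y
    cc = np.fft.irfft(R / (np.abs(R) + 1e-12), n)
    cc = np.concatenate((cc[-n // 2:], cc[:n // 2]))   # center zero lag
    return (np.argmax(np.abs(cc)) - n // 2) / fs       # positive: y lags x

# Sanity check: a 25-sample delayed copy is recovered.
fs = 1000.0
x = np.random.default_rng(0).normal(size=1024)
print(gcc_phat(x, np.concatenate((np.zeros(25), x[:-25])), fs))  # ~ 0.025 s
```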
NASA Astrophysics Data System (ADS)
Magee, Daniel J.; Niemeyer, Kyle E.
2018-03-01
The expedient design of precision components in aerospace and other high-tech industries requires simulations of physical phenomena often described by partial differential equations (PDEs) without exact solutions. Modern design problems require simulations with a level of resolution difficult to achieve in reasonable amounts of time, even in effectively parallelized solvers. Though the scale of the problem relative to available computing power is the greatest impediment to accelerating these applications, significant performance gains can be achieved through careful attention to the details of memory communication and access. The swept time-space decomposition rule reduces communication between sub-domains by exhausting the domain of influence before communicating boundary values. Here we present a GPU implementation of the swept rule, which modifies the algorithm for improved performance on this processing architecture by prioritizing use of private (shared) memory, avoiding interblock communication, and overwriting unnecessary values. It shows significant improvement in the execution time of finite-difference solvers for one-dimensional unsteady PDEs, producing speedups of 2-9× for a range of problem sizes compared with simple GPU versions, and 7-300× compared with parallel CPU versions. However, for a more sophisticated one-dimensional system of equations discretized with a second-order finite-volume scheme, the swept rule performs 1.2-1.9× worse than a standard implementation for all problem sizes.
A Kullback-Leibler approach for 3D reconstruction of spectral CT data corrupted by Poisson noise
NASA Astrophysics Data System (ADS)
Hohweiller, Tom; Ducros, Nicolas; Peyrin, Françoise; Sixou, Bruno
2017-09-01
While standard computed tomography (CT) data do not depend on energy, spectral computed tomography (SPCT) acquires energy-resolved data, which allows material decomposition of the object of interest. Decompositions in the projection domain allow creating projection mass density (PMD) per material. From decomposed projections, a tomographic reconstruction creates a 3D material density volume. The decomposition is made possible by minimizing a cost function. The variational approach is preferred since this is an ill-posed non-linear inverse problem. Moreover, noise plays a critical role when decomposing data. That is why, in this paper, a new data fidelity term is used to account for the photonic noise. In this work two data fidelity terms were investigated: a weighted least squares (WLS) term, adapted to Gaussian noise, and the Kullback-Leibler distance (KL), adapted to Poisson noise. A regularized Gauss-Newton algorithm minimizes the cost function iteratively. Both methods decompose materials from a numerical phantom of a mouse. Soft tissues and bones are decomposed in the projection domain; then a tomographic reconstruction creates a 3D material density volume for each material. Comparing relative errors, KL is shown to outperform WLS for low photon counts, in 2D and 3D. This new method could be of particular interest when low-dose acquisitions are performed.
NASA Astrophysics Data System (ADS)
Huang, Yan; Wang, Zhihui
2015-12-01
With the development of FPGA, DSP Builder is widely applied to design system-level algorithms. The CL multi-wavelet algorithm is more advanced and effective than scalar wavelets for signal decomposition. Thus, a system of CL multi-wavelets based on DSP Builder is designed for the first time in this paper. The system mainly contains three parts: a pre-filtering subsystem, a one-level decomposition subsystem and a two-level decomposition subsystem. It can be converted into the hardware language VHDL by the Signal Compiler block that can be used in Quartus II. Analysis of the energy indicator shows that this system outperforms the Daubechies wavelet in signal decomposition. Furthermore, it has proved to be suitable for the implementation of signal fusion based on SoPC hardware, and it will become a solid foundation in this new field.
Mode Shape Estimation Algorithms Under Ambient Conditions: A Comparative Review
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dosiek, Luke; Zhou, Ning; Pierre, John W.
This paper provides a comparative review of five existing ambient electromechanical mode shape estimation algorithms, i.e., the Transfer Function (TF), Spectral, Frequency Domain Decomposition (FDD), Channel Matching, and Subspace Methods. It is also shown that the TF Method is a general approach to estimating mode shape and that the Spectral, FDD, and Channel Matching Methods are actually special cases of it. Additionally, some of the variations of the Subspace Method are reviewed and the Numerical algorithm for Subspace State Space System IDentification (N4SID) is implemented. The five algorithms are then compared using data simulated from a 17-machine model of the Western Electricity Coordinating Council (WECC) under ambient conditions with both low and high damping, as well as during the case where ambient data is disrupted by an oscillatory ringdown. The performance of the algorithms is compared using statistics from Monte Carlo simulations and results from measured WECC data, and a discussion of the practical issues surrounding their implementation, including cases where power system probing is an option, is provided. The paper concludes with some recommendations as to the appropriate use of the various techniques. Index Terms—Electromechanical mode shape, small-signal stability, phasor measurement units (PMU), system identification, N4SID, subspace.
Real-time trajectory optimization on parallel processors
NASA Technical Reports Server (NTRS)
Psiaki, Mark L.
1993-01-01
A parallel algorithm has been developed for rapidly solving trajectory optimization problems. The goal of the work has been to develop an algorithm that is suitable for real-time, on-line optimal guidance through repeated solution of a trajectory optimization problem. The algorithm has been developed on an INTEL iPSC/860 message passing parallel processor. It uses a zero-order-hold discretization of a continuous-time problem and solves the resulting nonlinear programming problem using a custom-designed augmented Lagrangian nonlinear programming algorithm. The algorithm achieves parallelism of function, derivative, and search direction calculations through the principle of domain decomposition applied along the time axis. It has been encoded and tested on 3 example problems: the Goddard problem, the acceleration-limited planar minimum-time-to-the-origin problem, and a National Aerospace Plane minimum-fuel ascent guidance problem. Execution times as fast as 118 sec of wall clock time have been achieved for a 128-stage Goddard problem solved on 32 processors. A 32-stage minimum-time problem has been solved in 151 sec on 32 processors. A 32-stage National Aerospace Plane problem required 2 hours when solved on 32 processors. A speed-up factor of 7.2 has been achieved by using 32 nodes instead of 1 node to solve a 64-stage Goddard problem.
NASA Astrophysics Data System (ADS)
Zapata, M. A. Uh; Van Bang, D. Pham; Nguyen, K. D.
2016-05-01
This paper presents a parallel algorithm for the finite-volume discretisation of the Poisson equation on three-dimensional arbitrary geometries. The proposed method is formulated by using a 2D horizontal block domain decomposition and interprocessor data communication techniques with the message passing interface. The horizontal unstructured-grid cells are reordered according to the neighbouring relations and decomposed into blocks using a load-balanced distribution to give all processors an equal number of elements. In this algorithm, two parallel successive over-relaxation methods are presented: a multi-colour ordering technique for unstructured grids based on distributed memory, and a block method using a reordering index following similar ideas to the partitioning for structured grids. In all cases, the parallel algorithms are implemented in combination with an iterative acceleration solver based on a parabolic-diffusion equation, introduced to obtain faster solutions of the linear systems arising from the discretisation. Numerical results are given to evaluate the performance of the methods, showing speedups better than linear.
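The multi-colour idea is easiest to see on a structured grid, where two colours suffice; a red-black SOR sketch for the Poisson equation (the unstructured multi-colour scheme in the paper generalizes the same independence property):

```python
import numpy as np

def redblack_sor(f, h, omega=1.8, n_iter=2000):
    """Red-black SOR for -laplacian(u) = f on a square grid with zero
    Dirichlet boundaries. Points of one colour have no same-colour
    neighbours, so each half-sweep can be updated fully in parallel."""
    u = np.zeros_like(f)
    for _ in range(n_iter):
        for colour in (0, 1):                  # update reds, then blacks
            for i in range(1, f.shape[0] - 1):
                j0 = 1 + (i + colour) % 2      # first interior index of this colour
                u[i, j0:-1:2] = (1 - omega) * u[i, j0:-1:2] + omega * 0.25 * (
                    u[i - 1, j0:-1:2] + u[i + 1, j0:-1:2]
                    + u[i, j0 - 1:-2:2] + u[i, j0 + 1::2]
                    + h * h * f[i, j0:-1:2])
    return u
```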
Peak picking NMR spectral data using non-negative matrix factorization.
Tikole, Suhas; Jaravine, Victor; Rogov, Vladimir; Dötsch, Volker; Güntert, Peter
2014-02-11
Simple peak-picking algorithms, such as those based on lineshape fitting, perform well when peaks are completely resolved in multidimensional NMR spectra, but often produce wrong intensities and frequencies for overlapping peak clusters. For example, NOESY-type spectra have considerable overlaps leading to significant peak-picking intensity errors, which can result in erroneous structural restraints. Precise frequencies are critical for unambiguous resonance assignments. To alleviate this problem, a more sophisticated peaks decomposition algorithm, based on non-negative matrix factorization (NMF), was developed. We produce peak shapes from Fourier-transformed NMR spectra. Apart from its main goal of deriving components from spectra and producing peak lists automatically, the NMF approach can also be applied if the positions of some peaks are known a priori, e.g. from consistently referenced spectral dimensions of other experiments. Application of the NMF algorithm to a three-dimensional peak list of the 23 kDa bi-domain section of the RcsD protein (RcsD-ABL-HPt, residues 688-890) as well as to synthetic HSQC data shows that peaks can be picked accurately also in spectral regions with strong overlap.
Emerging CFD technologies and aerospace vehicle design
NASA Technical Reports Server (NTRS)
Aftosmis, Michael J.
1995-01-01
With the recent focus on the needs of design and applications CFD, research groups have begun to address the traditional bottlenecks of grid generation and surface modeling. Now, a host of emerging technologies promise to shortcut or dramatically simplify the simulation process. This paper discusses the current status of these emerging technologies. It will argue that some tools are already available which can have a positive impact on portions of the design cycle. However, in most cases, these tools need to be integrated into specific engineering systems and process cycles to be used effectively. The rapidly maturing status of unstructured and Cartesian approaches for inviscid simulations suggests the possibility of highly automated Euler-boundary layer simulations with application to loads estimation and even preliminary design. Similarly, technology is available to link block-structured mesh generation algorithms with topology libraries to avoid tedious re-meshing of topologically similar configurations. Work in algorithm-based auto-blocking suggests that domain decomposition and point placement operations in multi-block mesh generation may be properly posed as problems in Computational Geometry, and following this approach may lead to robust algorithmic processes for automatic mesh generation.
A novel partitioning method for block-structured adaptive meshes
NASA Astrophysics Data System (ADS)
Fu, Lin; Litvinov, Sergej; Hu, Xiangyu Y.; Adams, Nikolaus A.
2017-07-01
We propose a novel partitioning method for block-structured adaptive meshes utilizing the meshless Lagrangian particle concept. With the observation that an optimum partitioning has high analogy to the relaxation of a multi-phase fluid to steady state, physically motivated model equations are developed to characterize the background mesh topology and are solved by multi-phase smoothed-particle hydrodynamics. In contrast to well established partitioning approaches, all optimization objectives are implicitly incorporated and achieved during the particle relaxation to stationary state. Distinct partitioning sub-domains are represented by colored particles and separated by a sharp interface with a surface tension model. In order to obtain the particle relaxation, special viscous and skin friction models, coupled with a tailored time integration algorithm are proposed. Numerical experiments show that the present method has several important properties: generation of approximately equal-sized partitions without dependence on the mesh-element type, optimized interface communication between distinct partitioning sub-domains, continuous domain decomposition which is physically localized and implicitly incremental. Therefore it is particularly suitable for load-balancing of high-performance CFD simulations.
NASA Technical Reports Server (NTRS)
Aftosmis, M. J.; Berger, M. J.; Murman, S. M.; Kwak, Dochan (Technical Monitor)
2002-01-01
The proposed paper will present recent extensions in the development of an efficient Euler solver for adaptively-refined Cartesian meshes with embedded boundaries. The paper will focus on extensions of the basic method to include solution adaptation, time-dependent flow simulation, and arbitrary rigid domain motion. The parallel multilevel method makes use of on-the-fly parallel domain decomposition to achieve extremely good scalability on large numbers of processors, and is coupled with an automatic coarse mesh generation algorithm for efficient processing by a multigrid smoother. Numerical results are presented demonstrating parallel speed-ups of up to 435 on 512 processors. Solution-based adaptation may be keyed off truncation error estimates using tau-extrapolation or a variety of feature-detection-based refinement parameters. The multigrid method is extended to time-dependent flows through the use of a dual-time approach. The extension to rigid domain motion uses an Arbitrary Lagrangian-Eulerian (ALE) formulation, and results will be presented for a variety of two- and three-dimensional example problems with both simple and complex geometry.
A Dual Super-Element Domain Decomposition Approach for Parallel Nonlinear Finite Element Analysis
NASA Astrophysics Data System (ADS)
Jokhio, G. A.; Izzuddin, B. A.
2015-05-01
This article presents a new domain decomposition method for nonlinear finite element analysis introducing the concept of dual partition super-elements. The method extends ideas from the displacement frame method and is ideally suited for parallel nonlinear static/dynamic analysis of structural systems. In the new method, domain decomposition is realized by replacing one or more subdomains in a "parent system," each with a placeholder super-element, where the subdomains are processed separately as "child partitions," each wrapped by a dual super-element along the partition boundary. The analysis of the overall system, including the satisfaction of equilibrium and compatibility at all partition boundaries, is realized through direct communication between all pairs of placeholder and dual super-elements. The proposed method has particular advantages for matrix solution methods based on the frontal scheme, and can be readily implemented for existing finite element analysis programs to achieve parallelization on distributed memory systems with minimal intervention, thus overcoming memory bottlenecks typically faced in the analysis of large-scale problems. Several examples are presented in this article which demonstrate the computational benefits of the proposed parallel domain decomposition approach and its applicability to the nonlinear structural analysis of realistic structural systems.
Fast immersed interface Poisson solver for 3D unbounded problems around arbitrary geometries
NASA Astrophysics Data System (ADS)
Gillis, T.; Winckelmans, G.; Chatelain, P.
2018-02-01
We present a fast and efficient Fourier-based solver for the Poisson problem around an arbitrary geometry in an unbounded 3D domain. This solver merges two rewarding approaches, the lattice Green's function method and the immersed interface method, using the Sherman-Morrison-Woodbury decomposition formula. The method is intended to be second order up to the boundary. This is verified on two potential flow benchmarks. We also further analyse the iterative process and the convergence behavior of the proposed algorithm. The method is applicable to a wide range of problems involving a Poisson equation around inner bodies, which goes well beyond the present validation on potential flows.
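The role of the Sherman-Morrison-Woodbury formula can be sketched generically: given a fast solver for A, a low-rank correction is handled through a small capacitance system (a toy dense stand-in, not the paper's lattice Green's function and immersed interface operators):

```python
import numpy as np

def smw_solve(solve_A, U, C, V, b):
    """Solve (A + U C V) x = b given a fast solver for A alone, via
    (A + UCV)^-1 = A^-1 - A^-1 U (C^-1 + V A^-1 U)^-1 V A^-1."""
    Ainv_b = solve_A(b)
    Ainv_U = solve_A(U)                 # k extra solves for a rank-k correction
    S = np.linalg.inv(C) + V @ Ainv_U   # small k x k capacitance matrix
    return Ainv_b - Ainv_U @ np.linalg.solve(S, V @ Ainv_b)

# Sanity check with a well-conditioned dense stand-in for A:
rng = np.random.default_rng(1)
n, k = 50, 3
A = np.eye(n) * 4 + rng.random((n, n)) * 0.1
U, C, V = rng.random((n, k)) * 0.1, np.eye(k), rng.random((k, n)) * 0.1
b = rng.random(n)
x = smw_solve(lambda y: np.linalg.solve(A, y), U, C, V, b)
assert np.allclose((A + U @ C @ V) @ x, b)
```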
Abraham, Mark James; Murtola, Teemu; Schulz, Roland; ...
2015-07-15
GROMACS is one of the most widely used open-source and free software codes in chemistry, used primarily for dynamical simulations of biomolecules. It provides a rich set of calculation types, preparation and analysis tools. Several advanced techniques for free-energy calculations are supported. In version 5, it reaches new performance heights through several new and enhanced parallelization algorithms. These work on every level: SIMD registers inside cores, multithreading, heterogeneous CPU–GPU acceleration, state-of-the-art 3D domain decomposition, and ensemble-level parallelization through built-in replica exchange and the separate Copernicus framework. Finally, the latest best-in-class compressed trajectory storage format is supported.
A Benders based rolling horizon algorithm for a dynamic facility location problem
Marufuzzaman, Mohammad; Gedik, Ridvan; Roni, Mohammad S.
2016-06-28
This study presents a well-known capacitated dynamic facility location problem (DFLP) that satisfies the customer demand at a minimum cost by determining the time period for opening, closing, or retaining an existing facility in a given location. To solve this challenging NP-hard problem, this paper develops a unique hybrid solution algorithm that combines a rolling horizon algorithm with an accelerated Benders decomposition algorithm. Extensive computational experiments are performed on benchmark test instances to evaluate the hybrid algorithm's efficiency and robustness in solving the DFLP problem. Computational results indicate that the hybrid Benders based rolling horizon algorithm consistently offers high quality feasible solutions in a much shorter computational time period than the standalone rolling horizon and accelerated Benders decomposition algorithms in the experimental range.
NASA Technical Reports Server (NTRS)
Choudhary, Alok Nidhi; Leung, Mun K.; Huang, Thomas S.; Patel, Janak H.
1989-01-01
Computer vision systems employ a sequence of vision algorithms in which the output of an algorithm is the input of the next algorithm in the sequence. Algorithms that constitute such systems exhibit vastly different computational characteristics, and therefore, require different data decomposition techniques and efficient load balancing techniques for parallel implementation. However, since the input data for a task is produced as the output data of the previous task, this information can be exploited to perform knowledge based data decomposition and load balancing. Presented here are algorithms for a motion estimation system. The motion estimation is based on the point correspondence between the involved images which are a sequence of stereo image pairs. Researchers propose algorithms to obtain point correspondences by matching feature points among stereo image pairs at any two consecutive time instants. Furthermore, the proposed algorithms employ non-iterative procedures, which results in saving considerable amounts of computation time. The system consists of the following steps: (1) extraction of features; (2) stereo match of images in one time instant; (3) time match of images from consecutive time instants; (4) stereo match to compute final unambiguous points; and (5) computation of motion parameters.
NASA Astrophysics Data System (ADS)
Teal, Paul D.; Eccles, Craig
2015-04-01
The two most successful methods of estimating the distribution of nuclear magnetic resonance relaxation times from two-dimensional data are data compression followed by application of the Butler-Reeds-Dawson algorithm, and a primal-dual interior point method using preconditioned conjugate gradient. Both of these methods have previously been presented using a truncated singular value decomposition of matrices representing the exponential kernel. In this paper it is shown that other matrix factorizations are applicable to each of these algorithms, and that these illustrate the different fundamental principles behind the operation of the algorithms. These are the rank-revealing QR (RRQR) factorization and the LDL factorization with diagonal pivoting, also known as the Bunch-Kaufman-Parlett factorization. It is shown that both algorithms can be improved by adapting the truncation as the optimization process progresses, improving the accuracy as the optimal value is approached. A variation on the interior point method, viz. the use of a barrier function instead of the primal-dual approach, is found to offer considerable improvement in terms of speed and reliability. A third type of algorithm, related to the fast iterative shrinkage-thresholding algorithm (FISTA), is applied to the problem. This method can be efficiently formulated without the use of a matrix decomposition.
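The truncated-SVD treatment of the exponential kernel that these variants start from can be sketched as follows (sizes, noise level, and truncation rule are illustrative; the nonnegativity constraints handled by the algorithms above are omitted):

```python
import numpy as np

# Ill-posed exponential-kernel inversion: data d = K f + noise, with
# K[i, j] = exp(-t_i / T_j). A plain least-squares solve amplifies noise;
# truncating the SVD regularizes the problem.
t = np.linspace(0.01, 3.0, 200)           # measurement times
T = np.logspace(-2, 1, 100)               # candidate relaxation times
K = np.exp(-np.outer(t, 1.0 / T))

rng = np.random.default_rng(0)
f_true = np.exp(-0.5 * np.log(T / 0.5) ** 2)     # a smooth peak in T
d = K @ f_true + 0.01 * rng.normal(size=t.size)

U, s, Vt = np.linalg.svd(K, full_matrices=False)
k = int(np.sum(s > 1e-3 * s[0]))          # truncation level, set by the noise floor
f_tsvd = Vt[:k].T @ ((U[:, :k].T @ d) / s[:k])
```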
Dictionary-Based Tensor Canonical Polyadic Decomposition
NASA Astrophysics Data System (ADS)
Cohen, Jeremy Emile; Gillis, Nicolas
2018-04-01
To ensure interpretability of extracted sources in tensor decomposition, we introduce in this paper a dictionary-based tensor canonical polyadic decomposition which enforces one factor to belong exactly to a known dictionary. A new formulation of sparse coding is proposed which enables dictionary-based canonical polyadic decomposition of high-dimensional tensors. The benefits of using a dictionary in tensor decomposition models are explored both in terms of parameter identifiability and estimation accuracy. The performance of the proposed algorithms is evaluated on the decomposition of simulated data and the unmixing of hyperspectral images.
Zhang, Shuangyue; Han, Dong; Politte, David G; Williamson, Jeffrey F; O'Sullivan, Joseph A
2018-05-01
The purpose of this study was to assess the performance of a novel dual-energy CT (DECT) approach for proton stopping power ratio (SPR) mapping that integrates image reconstruction and material characterization using a joint statistical image reconstruction (JSIR) method based on a linear basis vector model (BVM). A systematic comparison between the JSIR-BVM method and previously described DECT image- and sinogram-domain decomposition approaches is also carried out on synthetic data. The JSIR-BVM method was implemented to estimate the electron densities and mean excitation energies (I-values) required by the Bethe equation for SPR mapping. In addition, image- and sinogram-domain DECT methods based on three available SPR models including BVM were implemented for comparison. The intrinsic SPR modeling accuracy of the three models was first validated. Synthetic DECT transmission sinograms of two 330 mm diameter phantoms each containing 17 soft and bony tissues (for a total of 34) of known composition were then generated with spectra of 90 and 140 kVp. The estimation accuracy of the reconstructed SPR images was evaluated for the seven investigated methods. The impact of phantom size and insert location on SPR estimation accuracy was also investigated. All three selected DECT-SPR models predict the SPR of all tissue types with less than 0.2% RMS errors under idealized conditions with no reconstruction uncertainties. When applied to synthetic sinograms, the JSIR-BVM method achieves the best performance with mean and RMS-average errors of less than 0.05% and 0.3%, respectively, for all noise levels, while the image- and sinogram-domain decomposition methods show increasing mean and RMS-average errors with increasing noise level. The JSIR-BVM method also reduces statistical SPR variation by sixfold compared to other methods. A 25% phantom diameter change causes up to 4% SPR differences for the image-domain decomposition approach, while the JSIR-BVM method and sinogram-domain decomposition methods are insensitive to size change. Among all the investigated methods, the JSIR-BVM method achieves the best performance for SPR estimation in our simulation phantom study. This novel method is robust with respect to sinogram noise and residual beam-hardening effects, yielding SPR estimation errors comparable to intrinsic BVM modeling error. In contrast, the achievable SPR estimation accuracy of the image- and sinogram-domain decomposition methods is dominated by the CT image intensity uncertainties introduced by the reconstruction and decomposition processes. © 2018 American Association of Physicists in Medicine.
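Downstream of the decomposition, the SPR follows from the Bethe equation; a hedged sketch (shell and density-effect corrections omitted; the proton energy and the I-value of water below are illustrative assumptions, not the paper's settings):

```python
import numpy as np

def spr_bethe(rho_e_rel, I_eV, kinetic_MeV=150.0, I_water_eV=75.0):
    """Proton stopping power ratio relative to water from the Bethe equation,
    given the electron density relative to water and the mean excitation
    energy I (eV). Corrections beyond the leading Bethe terms are omitted."""
    mp = 938.272                    # proton rest energy, MeV
    me = 0.511e6                    # electron rest energy, eV
    gamma = 1.0 + kinetic_MeV / mp
    beta2 = 1.0 - 1.0 / gamma**2
    L = lambda I: np.log(2.0 * me * beta2 / (I * (1.0 - beta2))) - beta2
    return rho_e_rel * L(I_eV) / L(I_water_eV)

# e.g. a cortical-bone-like tissue: spr_bethe(1.78, 112.0) gives roughly 1.7.
```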
Decomposition of timed automata for solving scheduling problems
NASA Astrophysics Data System (ADS)
Nishi, Tatsushi; Wakatake, Masato
2014-03-01
A decomposition algorithm for scheduling problems based on a timed automata (TA) model is proposed. The problem is represented as an optimal state transition problem for TA. The model comprises the parallel composition of submodels such as jobs and resources. The procedure of the proposed methodology can be divided into two steps. The first step is to decompose the TA model into several submodels by using a decomposability condition. The second step is to combine the individual solutions of the subproblems for the decomposed submodels by the penalty function method. A feasible solution for the entire model is derived through the iterated computation of solving the subproblem for each submodel. The proposed methodology is applied to solve flowshop and jobshop scheduling problems. Computational experiments demonstrate the effectiveness of the proposed algorithm compared with a conventional TA scheduling algorithm without decomposition.
Cui, Lingli; Wu, Na; Wang, Wenjing; Kang, Chenhui
2014-01-01
This paper presents a new method for a composite dictionary matching pursuit algorithm, which is applied to vibration sensor signal feature extraction and fault diagnosis of a gearbox. Three advantages are highlighted in the new method. First, the composite dictionary in the algorithm has been changed from multi-atom matching to single-atom matching. Compared to non-composite dictionary single-atom matching, the original composite dictionary multi-atom matching pursuit (CD-MaMP) algorithm can achieve noise reduction in the reconstruction stage, but it cannot dramatically reduce the computational cost and improve the efficiency in the decomposition stage. Therefore, the optimized composite dictionary single-atom matching algorithm (CD-SaMP) is proposed. Second, the termination condition of iteration based on the attenuation coefficient is put forward to improve the sparsity and efficiency of the algorithm, which adjusts the parameters of the termination condition constantly in the process of decomposition to avoid noise. Third, composite dictionaries are enriched with the modulation dictionary, which is one of the important structural characteristics of gear fault signals. Meanwhile, the termination condition of iteration settings, sub-feature dictionary selections and operation efficiency between CD-MaMP and CD-SaMP are discussed, aiming at gear simulation vibration signals with noise. The simulation sensor-based vibration signal results show that the termination condition of iteration based on the attenuation coefficient enhances decomposition sparsity greatly and achieves a good effect of noise reduction. Furthermore, the modulation dictionary achieves a better matching effect compared to the Fourier dictionary, and CD-SaMP has a great advantage of sparsity and efficiency compared with the CD-MaMP. The sensor-based vibration signals measured from practical engineering gearbox analyses have further shown that the CD-SaMP decomposition and reconstruction algorithm is feasible and effective. PMID:25207870
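The single-atom pursuit step that CD-SaMP builds on can be sketched generically (plain matching pursuit with a fixed stopping threshold; the paper's attenuation-coefficient stopping rule and modulation dictionary are not reproduced):

```python
import numpy as np

def matching_pursuit(x, D, n_atoms=10, tol=1e-6):
    """Greedy single-atom matching pursuit: at each step pick the dictionary
    column most correlated with the residual and subtract its contribution.
    Assumes the columns of D are normalized to unit l2 norm."""
    r = x.astype(float).copy()
    coeffs = np.zeros(D.shape[1])
    for _ in range(n_atoms):
        c = D.T @ r                        # correlations with all atoms
        k = np.argmax(np.abs(c))
        if np.abs(c[k]) < tol:             # fixed threshold; the paper instead
            break                          # adapts this during the decomposition
        coeffs[k] += c[k]
        r -= c[k] * D[:, k]
    return coeffs, r
```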
Fat water decomposition using globally optimal surface estimation (GOOSE) algorithm.
Cui, Chen; Wu, Xiaodong; Newell, John D; Jacob, Mathews
2015-03-01
This article focuses on developing a novel noniterative fat water decomposition algorithm more robust to fat water swaps and related ambiguities. Field map estimation is reformulated as a constrained surface estimation problem to exploit the spatial smoothness of the field, thus minimizing the ambiguities in the recovery. Specifically, the differences in the field map-induced frequency shift between adjacent voxels are constrained to be in a finite range. The discretization of the above problem yields a graph optimization scheme, where each node of the graph is only connected with few other nodes. Thanks to the low graph connectivity, the problem is solved efficiently using a noniterative graph cut algorithm. The global minimum of the constrained optimization problem is guaranteed. The performance of the algorithm is compared with that of state-of-the-art schemes. Quantitative comparisons are also made against reference data. The proposed algorithm is observed to yield more robust fat water estimates with fewer fat water swaps and better quantitative results than other state-of-the-art algorithms in a range of challenging applications. The proposed algorithm is capable of considerably reducing the swaps in challenging fat water decomposition problems. The experiments demonstrate the benefit of using explicit smoothness constraints in field map estimation and solving the problem using a globally convergent graph-cut optimization algorithm. © 2014 Wiley Periodicals, Inc.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shadid, John Nicolas; Elman, Howard; Shuttleworth, Robert R.
2007-04-01
In recent years, considerable effort has been placed on developing efficient and robust solution algorithms for the incompressible Navier-Stokes equations based on preconditioned Krylov methods. These include physics-based methods, such as SIMPLE, and purely algebraic preconditioners based on the approximation of the Schur complement. All these techniques can be represented as approximate block factorization (ABF) type preconditioners. The goal is to decompose the application of the preconditioner into simplified sub-systems in which scalable multi-level type solvers can be applied. In this paper we develop a taxonomy of these ideas based on an adaptation of a generalized approximate factorization of the Navier-Stokes system first presented in [25]. This taxonomy illuminates the similarities and differences among these preconditioners and the central role played by efficient approximation of certain Schur complement operators. We then present a parallel computational study that examines the performance of these methods and compares them to an additive Schwarz domain decomposition (DD) algorithm. Results are presented for two and three-dimensional steady state problems for enclosed domains and inflow/outflow systems on both structured and unstructured meshes. The numerical experiments are performed using MPSalsa, a stabilized finite element code.
Domain decomposition by the advancing-partition method for parallel unstructured grid generation
NASA Technical Reports Server (NTRS)
Banihashemi, legal representative, Soheila (Inventor); Pirzadeh, Shahyar Z. (Inventor)
2012-01-01
In a method for domain decomposition for generating unstructured grids, a surface mesh is generated for a spatial domain. A location of a partition plane dividing the domain into two sections is determined. Triangular faces on the surface mesh that intersect the partition plane are identified. A partition grid of tetrahedral cells, dividing the domain into two sub-domains, is generated using a marching process in which a front comprises only faces of new cells which intersect the partition plane. The partition grid is generated until no active faces remain on the front. Triangular faces on each side of the partition plane are collected into two separate subsets. Each subset of triangular faces is renumbered locally and a local/global mapping is created for each sub-domain. A volume grid is generated for each sub-domain. The partition grid and volume grids are then merged using the local-global mapping.
Dong, Jianwu; Chen, Feng; Zhou, Dong; Liu, Tian; Yu, Zhaofei; Wang, Yi
2017-03-01
Existence of low SNR regions and rapid phase variations pose challenges to spatial phase unwrapping algorithms. Global optimization-based phase unwrapping methods are widely used, but are significantly slower than greedy methods. In this paper, dual decomposition acceleration is introduced to speed up a three-dimensional graph cut-based phase unwrapping algorithm. The phase unwrapping problem is formulated as a global discrete energy minimization problem, while the technique of dual decomposition is used to increase the computational efficiency by splitting the full problem into overlapping subproblems and enforcing the congruence of overlapping variables. Using three-dimensional (3D) multiecho gradient echo images from an agarose phantom and five brain hemorrhage patients, we compared this proposed method with an unaccelerated graph cut-based method. Experimental results show up to 18-fold acceleration in computation time. Dual decomposition significantly improves the computational efficiency of 3D graph cut-based phase unwrapping algorithms. Magn Reson Med 77:1353-1358, 2017. © 2016 International Society for Magnetic Resonance in Medicine.
NASA Astrophysics Data System (ADS)
Chen, Zhen; Chan, Tommy H. T.
2017-08-01
This paper proposes a new methodology for moving force identification (MFI) from the responses of a bridge deck. Based on the existing time domain method (TDM), the MFI problem eventually becomes one of solving the linear algebraic equation Ax = b. The vector b is usually contaminated by an unknown error e arising from measurement, often called 'noise'. Because the inverse problem is ill-posed, the identified forces are sensitive to the noise e. The proposed truncated generalized singular value decomposition method (TGSVD) aims at obtaining an acceptable solution that is less sensitive to perturbations given the ill-posedness of the problem. The illustrated results show that the TGSVD has many advantages, such as higher precision, better adaptability and noise immunity, compared with the TDM. In addition, choosing a proper regularization matrix L and a truncation parameter k is very useful for improving the identification accuracy and overcoming the ill-posedness when the method is used to identify moving forces on a bridge.
A new time-frequency method for identification and classification of ball bearing faults
NASA Astrophysics Data System (ADS)
Attoui, Issam; Fergani, Nadir; Boutasseta, Nadir; Oudjani, Brahim; Deliou, Adel
2017-06-01
For fault diagnosis of ball bearings, which are among the most critical components of rotating machinery, this paper presents a time-frequency procedure incorporating a new feature extraction step that combines the classical wavelet packet decomposition energy distribution technique with a new feature extraction technique based on the selection of the most impulsive frequency bands. In the proposed procedure, as a pre-processing step, the most impulsive frequency bands are first selected at different bearing conditions using a combination of the Fast Fourier Transform (FFT) and Short-Frequency Energy (SFE) algorithms. Secondly, once the most impulsive frequency bands are selected, the measured machinery vibration signals are decomposed into different frequency sub-bands using the discrete Wavelet Packet Decomposition (WPD) technique to maximize the detection of their frequency contents, and the most useful sub-bands are then represented in the time-frequency domain using the Short-Time Fourier Transform (STFT) algorithm to determine exactly which frequency components are present in those sub-bands. Once the proposed feature vector is obtained, three feature dimensionality reduction techniques are employed: Linear Discriminant Analysis (LDA), a feedback wrapper method, and Locality Sensitive Discriminant Analysis (LSDA). Lastly, the Adaptive Neuro-Fuzzy Inference System (ANFIS) algorithm is used for instantaneous identification and classification of bearing faults. To evaluate the performance of the proposed method, different testing data sets are applied to the trained ANFIS model, covering healthy and faulty bearings under various load levels, fault severities and rotating speeds. Experimental results demonstrate that the proposed method can serve as an intelligent bearing fault diagnosis system.
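A minimal sketch of the wavelet-packet energy-distribution part of the feature vector, using PyWavelets; the wavelet family, decomposition level, and the synthetic signal are assumptions for illustration, not the paper's settings.

```python
import numpy as np
import pywt

def wpd_energy_features(signal, wavelet="db4", level=3):
    """Relative energy of each terminal wavelet-packet sub-band."""
    wp = pywt.WaveletPacket(data=signal, wavelet=wavelet, maxlevel=level)
    nodes = wp.get_level(level, order="freq")  # sub-bands in frequency order
    energies = np.array([np.sum(node.data ** 2) for node in nodes])
    return energies / energies.sum()

# A fault would concentrate energy in a few impulsive sub-bands.
t = np.linspace(0.0, 1.0, 4096)
vib = np.sin(2 * np.pi * 50 * t) + 0.5 * np.sign(np.sin(2 * np.pi * 7 * t))
print(wpd_energy_features(vib))                # 2**3 = 8 relative energies
```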
Burger, Stefan; Fraunholz, Thomas; Leirer, Christian; Hoppe, Ronald H W; Wixforth, Achim; Peter, Malte A; Franke, Thomas
2013-06-25
Phase decomposition in lipid membranes has been the subject of numerous investigations by both experiment and theoretical simulation, yet quantitative comparisons of the simulated data to the experimental results are rare. In this work, we present a novel way of comparing the temporal development of liquid-ordered domains obtained by numerically solving the Cahn-Hilliard equation and by inducing a phase transition in giant unilamellar vesicles (GUVs). Quantitative comparison is done by calculating the structure factor of the domain pattern. It turns out that the decomposition takes place in three distinct regimes in both experiment and simulation. These regimes are characterized by different rates of growth of the mean domain diameter, and there is quantitative agreement between experiment and simulation as to the duration of each regime and the absolute rate of growth in each regime.
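As a sketch of the comparison metric, the following computes a radially averaged structure factor of a two-dimensional domain pattern with numpy; the binning choices and the toy pattern are illustrative assumptions, not the authors' processing pipeline.

```python
import numpy as np

def radial_structure_factor(field, nbins=64):
    """Radially averaged power spectrum S(k) of a 2-D composition field;
    the peak of S(k) tracks the mean domain diameter."""
    S = np.abs(np.fft.fftshift(np.fft.fft2(field - field.mean()))) ** 2
    ny, nx = field.shape
    ky, kx = np.indices((ny, nx))
    k = np.hypot(ky - ny // 2, kx - nx // 2)   # radial wavenumber
    bins = np.linspace(0.0, k.max(), nbins + 1)
    which = np.digitize(k.ravel(), bins)
    Sk = np.array([S.ravel()[which == i].mean() if np.any(which == i) else 0.0
                   for i in range(1, nbins + 1)])
    return 0.5 * (bins[1:] + bins[:-1]), Sk

rng = np.random.default_rng(1)
pattern = (rng.standard_normal((128, 128)) > 0).astype(float)  # toy pattern
kmid, Sk = radial_structure_factor(pattern)
```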
Decomposition of Proteins into Dynamic Units from Atomic Cross-Correlation Functions.
Calligari, Paolo; Gerolin, Marco; Abergel, Daniel; Polimeno, Antonino
2017-01-10
In this article, we present a clustering method for atoms in proteins based on the analysis of the correlation times of interatomic distance correlation functions computed from MD simulations. The goal is to provide a coarse-grained description of the protein in terms of fewer elements that can be treated as dynamically independent subunits. Importantly, this domain decomposition method does not take into account structural properties of the protein. Instead, the clustering of protein residues into networks of dynamically correlated domains is defined on the basis of the effective correlation times of the pair distance correlation functions. In this respect, our method is complementary to the customary decomposition of proteins into quasi-rigid, structure-based domains. Results obtained for a prototypal protein structure illustrate the proposed approach.
Combinatorial algorithms for design of DNA arrays.
Hannenhalli, Sridhar; Hubell, Earl; Lipshutz, Robert; Pevzner, Pavel A
2002-01-01
Optimal design of DNA arrays requires the development of algorithms with two-fold goals: reducing the effects caused by unintended illumination (the border length minimization problem) and reducing the complexity of masks (the mask decomposition problem). We describe algorithms that reduce the number of rectangles in mask decomposition by 20-30% as compared to a standard array design, under the assumption that the arrangement of oligonucleotides on the array is fixed. This algorithm produces provably optimal solutions for all studied real instances of array design. We also address the difficult problem of finding an arrangement that minimizes the border length, and we introduce a new idea of threading that significantly reduces the border length as compared to standard designs.
A Novel Two-Component Decomposition for Co-Polar Channels of GF-3 Quad-Pol Data
NASA Astrophysics Data System (ADS)
Kwok, E.; Li, C. H.; Zhao, Q. H.; Li, Y.
2018-04-01
Polarimetric target decomposition theory is the most dynamic and exploratory research area in the field of PolSAR. However, most target decomposition methods are based on fully polarized (quad-pol) data and seldom utilize dual-polar data. Given this, we propose a novel two-component decomposition method for the co-polar channels of GF-3 quad-pol data. This method decomposes the data into two scattering contributions, surface and double bounce, in the dual co-polar channels. To solve this underdetermined problem, a criterion for determining the model is proposed. The criterion, named the second-order averaged scattering angle, originates from the H/α decomposition, and we also put forward an alternative parameter for it. To validate the effectiveness of the proposed decomposition, Liaodong Bay is selected as the research area. The area, located in northeastern China, hosts various wetland resources and exhibits sea ice in winter. The study data are GF-3 quad-pol data; GF-3 is China's first C-band polarimetric synthetic aperture radar (PolSAR) satellite. The dependencies between the features of the proposed algorithm and comparison decompositions (Pauli decomposition, An&Yang decomposition, Yamaguchi S4R decomposition) were investigated in the study. Through several aspects of the experimental discussion, we draw the following conclusions: the proposed algorithm may be suitable for special scenes with low vegetation coverage, or with low vegetation in the non-growing season; the proposed decomposition features, using only co-polar data, are highly correlated with the corresponding comparison decomposition features computed from quad-polarization data. Moreover, the features could serve as input to subsequent classification or parameter inversion.
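A rough illustration of splitting co-polar channels into two scattering contributions. This Pauli-like sketch separates odd- and even-bounce powers from HH and VV only; it is a stand-in for the authors' model-based decomposition and does not implement their second-order averaged scattering angle criterion.

```python
import numpy as np

def copolar_two_component(S_hh, S_vv):
    """Surface and double-bounce powers from the HH and VV channels only."""
    p_surface = 0.5 * np.abs(S_hh + S_vv) ** 2   # odd-bounce mechanism
    p_double = 0.5 * np.abs(S_hh - S_vv) ** 2    # even-bounce mechanism
    return p_surface, p_double

# Two synthetic pixels: a surface-like return and a dihedral-like return.
S_hh = np.array([1.0 + 0.0j, 1.0 + 0.0j])
S_vv = np.array([1.0 + 0.0j, -1.0 + 0.0j])
print(copolar_two_component(S_hh, S_vv))  # pixel 0: surface; pixel 1: double
```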
A fast new algorithm for a robot neurocontroller using inverse QR decomposition
DOE Office of Scientific and Technical Information (OSTI.GOV)
Morris, A.S.; Khemaissia, S.
2000-01-01
A new adaptive neural network controller for robots is presented. The controller is based on direct adaptive techniques. Unlike many neural network controllers in the literature, inverse dynamical model evaluation is not required. A numerically robust, computationally efficient processing scheme for neural network weight estimation is described, namely, the inverse QR decomposition (INVQR). The inverse QR decomposition and a weighted recursive least-squares (WRLS) method for neural network weight estimation are derived using Cholesky factorization of the data matrix. The algorithm that performs the efficient INVQR of the underlying space-time data matrix may be implemented in parallel on a triangular array. Furthermore, its systolic architecture is well suited for VLSI implementation. Another important benefit of the INVQR decomposition is that it solves directly for the time-recursive least-squares filter vector, while avoiding the sequential back-substitution step required by QR decomposition approaches.
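The estimation problem the INVQR array addresses is weighted recursive least squares. Below is a plain covariance-form WRLS update as a minimal sketch; it is not the numerically robust inverse-QR systolic formulation of the paper, and all names are illustrative.

```python
import numpy as np

def wrls_step(w, P, x, d, lam=0.99):
    """One weighted-RLS update of weights w tracking d ~ x.w, with
    forgetting factor lam and inverse correlation matrix P."""
    Px = P @ x
    k = Px / (lam + x @ Px)           # gain vector
    w = w + k * (d - x @ w)           # correct using the a-priori error
    P = (P - np.outer(k, Px)) / lam   # update the inverse correlation matrix
    return w, P

rng = np.random.default_rng(0)
w_true = np.array([0.5, -1.0, 2.0])
w, P = np.zeros(3), 1e3 * np.eye(3)
for _ in range(200):
    x = rng.standard_normal(3)
    w, P = wrls_step(w, P, x, x @ w_true + 0.01 * rng.standard_normal())
print(w)                              # approaches w_true
```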
Optimal pattern synthesis for speech recognition based on principal component analysis
NASA Astrophysics Data System (ADS)
Korsun, O. N.; Poliyev, A. V.
2018-02-01
An algorithm for building an optimal pattern for automatic speech recognition, which increases the probability of correct recognition, is developed and presented in this work. The optimal pattern forming is based on the decomposition of an initial pattern into principal components, which makes it possible to reduce the dimensionality of the multi-parameter optimization problem. At the next step, training samples are introduced and the optimal estimates for the principal component decomposition coefficients are obtained by a numeric parameter optimization algorithm. Finally, we consider the experimental results, which show the improvement in speech recognition introduced by the proposed optimization algorithm.
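A minimal numpy sketch of the decomposition step: build a principal-component basis from training patterns and represent a pattern by a few coefficients, which are then what a parameter optimizer would adjust. The sizes and names are illustrative assumptions.

```python
import numpy as np

def pca_basis(patterns, n_components):
    """Rows of `patterns` are training patterns; returns the mean and
    the leading principal directions (rows of the returned basis)."""
    mean = patterns.mean(axis=0)
    _, _, Vt = np.linalg.svd(patterns - mean, full_matrices=False)
    return mean, Vt[:n_components]

rng = np.random.default_rng(2)
data = rng.standard_normal((100, 40))        # stand-in for training patterns
mean, basis = pca_basis(data, 5)
coeffs = (data[0] - mean) @ basis.T          # 5 coefficients to optimize
approx = mean + coeffs @ basis               # pattern rebuilt from them
```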
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mburu, Sarah; Kolli, R. Prakash; Perea, Daniel E.; ...
2017-03-06
The microstructure and mechanical properties in unaged and thermally aged (at 280 °C, 320 °C, 360 °C, and 400 °C to 4300 h) CF–3 and CF–8 cast duplex stainless steels (CDSS) are investigated. The unaged CF–8 steel has Cr-rich M23C6 carbides located at the δ–ferrite/γ–austenite heterophase interfaces that were not observed in the CF–3 steel, and this corresponds to a difference in mechanical properties. Both unaged steels exhibit incipient spinodal decomposition into Fe-rich α–domains and Cr-rich α'–domains. During aging, spinodal decomposition progresses and the mean wavelength (MW) and mean amplitude (MA) of the compositional fluctuations increase as a function of aging temperature. Additionally, G–phase precipitates form between the spinodal decomposition domains in CF–3 at 360 °C and 400 °C and in CF–8 at 400 °C. Finally, the microstructural evolution is correlated to changes in mechanical properties.
Alles, E. J.; Zhu, Y.; van Dongen, K. W. A.; McGough, R. J.
2013-01-01
The fast nearfield method, when combined with time-space decomposition, is a rapid and accurate approach for calculating transient nearfield pressures generated by ultrasound transducers. However, the standard time-space decomposition approach is only applicable to certain analytical representations of the temporal transducer surface velocity that, when applied to the fast nearfield method, are expressed as a finite sum of products of separate temporal and spatial terms. To extend time-space decomposition such that accelerated transient field simulations are enabled in the nearfield for an arbitrary transducer surface velocity, a new transient simulation method, frequency domain time-space decomposition (FDTSD), is derived. With this method, the temporal transducer surface velocity is transformed into the frequency domain, and then each complex-valued term is processed separately. Further improvements are achieved by spectral clipping, which reduces the number of terms and the computation time. Trade-offs between speed and accuracy are established for FDTSD calculations, and pressure fields obtained with the FDTSD method for a circular transducer are compared to those obtained with Field II and the impulse response method. The FDTSD approach, when combined with the fast nearfield method and spectral clipping, consistently achieves smaller errors in less time and requires less memory than Field II or the impulse response method. PMID:23160476
Distributed-Memory Breadth-First Search on Massive Graphs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Buluc, Aydin; Beamer, Scott; Madduri, Kamesh
This chapter studies the problem of traversing large graphs using the breadth-first search order on distributed-memory supercomputers. We consider both the traditional level-synchronous top-down algorithm as well as the recently discovered direction optimizing algorithm. We analyze the performance and scalability trade-offs in using different local data structures such as CSR and DCSC, enabling in-node multithreading, and graph decompositions such as 1D and 2D decomposition.
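For reference, the sketch below is the shared-memory skeleton of the level-synchronous top-down algorithm the chapter starts from; in the distributed-memory setting the adjacency structure would be partitioned (1D by vertex or 2D by adjacency-matrix blocks, stored locally as CSR or DCSC) and the frontier exchanged between ranks at every level.

```python
def bfs_levels(adj, source):
    """Level-synchronous BFS: expand the whole frontier once per level."""
    level = {source: 0}
    frontier = [source]
    depth = 0
    while frontier:
        depth += 1
        next_frontier = []
        for u in frontier:
            for v in adj[u]:
                if v not in level:       # first visit fixes the level
                    level[v] = depth
                    next_frontier.append(v)
        frontier = next_frontier
    return level

adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
print(bfs_levels(adj, 0))                # {0: 0, 1: 1, 2: 1, 3: 2, 4: 3}
```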
Optimum and Heuristic Algorithms for Finite State Machine Decomposition and Partitioning
1989-09-01
Pranav Ashar, Srinivas Devadas, and A. Richard Newton. Ashar and Newton: Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, CA 94720. Devadas: Department of Electrical Engineering and Computer Science, Room 36-848, MIT, Cambridge, MA 02139. A finite state machine is represented by its State Transition Graph.
A mixed parallel strategy for the solution of coupled multi-scale problems at finite strains
NASA Astrophysics Data System (ADS)
Lopes, I. A. Rodrigues; Pires, F. M. Andrade; Reis, F. J. P.
2018-02-01
A mixed parallel strategy for the solution of homogenization-based multi-scale constitutive problems undergoing finite strains is proposed. The approach aims to reduce the computational time and memory requirements of non-linear coupled simulations that use finite element discretization at both scales (FE^2). In the first level of the algorithm, a non-conforming domain decomposition technique, based on the FETI method combined with a mortar discretization at the interface of macroscopic subdomains, is employed. A master-slave scheme, which distributes tasks by macroscopic element and adopts dynamic scheduling, is then used for each macroscopic subdomain composing the second level of the algorithm. This strategy allows the parallelization of FE^2 simulations in computers with either shared memory or distributed memory architectures. The proposed strategy preserves the quadratic rates of asymptotic convergence that characterize the Newton-Raphson scheme. Several examples are presented to demonstrate the robustness and efficiency of the proposed parallel strategy.
Multilevel decomposition of complete vehicle configuration in a parallel computing environment
NASA Technical Reports Server (NTRS)
Bhatt, Vinay; Ragsdell, K. M.
1989-01-01
This research summarizes various approaches to multilevel decomposition to solve large structural problems. A linear decomposition scheme based on the Sobieski algorithm is selected as a vehicle for automated synthesis of a complete vehicle configuration in a parallel processing environment. The research is in a developmental state. Preliminary numerical results are presented for several example problems.
NASA Astrophysics Data System (ADS)
Zhou, T.; Popescu, S. C.; Krause, K.
2016-12-01
Waveform Light Detection and Ranging (LiDAR) data have advantages over discrete-return LiDAR data in accurately characterizing vegetation structure. However, we lack a comprehensive understanding of waveform data processing approaches under different topography and vegetation conditions. The objective of this paper is to highlight a novel deconvolution algorithm, the Gold algorithm, for processing waveform LiDAR data with optimal deconvolution parameters. Further, we present a comparative study of waveform processing methods to provide insight into selecting an approach for a given combination of vegetation and terrain characteristics. We employed two waveform processing methods: (1) direct decomposition, and (2) deconvolution and decomposition. In method two, we utilized two deconvolution algorithms, the Richardson-Lucy (RL) algorithm and the Gold algorithm. The comprehensive and quantitative comparisons were conducted in terms of the number of detected echoes, position accuracy, the bias of the end products (such as the digital terrain model (DTM) and canopy height model (CHM)) from discrete LiDAR data, along with parameter uncertainty for these end products obtained from different methods. This study was conducted at three study sites that include diverse ecological regions, vegetation and elevation gradients. Results demonstrate that the two deconvolution algorithms are sensitive to the pre-processing steps of the input data. The deconvolution and decomposition method is more capable of detecting hidden echoes with a lower false echo detection rate, especially for the Gold algorithm. Compared to the reference data, all approaches generate satisfactory accuracy assessment results with small mean spatial difference (<1.22 m for DTMs, <0.77 m for CHMs) and root mean square error (RMSE) (<1.26 m for DTMs, <1.93 m for CHMs). More specifically, the Gold algorithm is superior to the others with a smaller RMSE (<1.01 m), while the direct decomposition approach works better in terms of the percentage of spatial difference within 0.5 and 1 m. The parameter uncertainty analysis demonstrates that the Gold algorithm outperforms other approaches in dense vegetation areas, with the smallest RMSE, and the RL algorithm performs better in sparse vegetation areas in terms of RMSE.
An operational modal analysis method in frequency and spatial domain
NASA Astrophysics Data System (ADS)
Wang, Tong; Zhang, Lingmi; Tamura, Yukio
2005-12-01
A frequency and spatial domain decomposition method (FSDD) for operational modal analysis (OMA) is presented in this paper, which is an extension of the complex mode indicator function (CMIF) method for experimental modal analysis (EMA). The theoretical background of the FSDD method is clarified. Singular value decomposition is adopted to separate the signal space from the noise space. Finally, an enhanced power spectral density (PSD) is proposed to obtain more accurate modal parameters by curve fitting in the frequency domain. Moreover, a simulation case and an application case are used to validate this method.
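The singular-value-decomposition step at the heart of FDD-type methods can be sketched with scipy: form the cross-PSD matrix at each frequency line and inspect its singular values, whose peaks indicate modes. This illustrates only the classical frequency-domain decomposition core that FSDD builds on; the enhanced-PSD curve fitting is omitted, and all parameters are illustrative.

```python
import numpy as np
from scipy.signal import csd

def fdd_singular_values(responses, fs, nperseg=1024):
    """responses: (n_channels, n_samples) output-only measurements.
    Returns frequencies and the singular values of the cross-PSD
    matrix at every frequency line."""
    n = responses.shape[0]
    freqs, _ = csd(responses[0], responses[0], fs=fs, nperseg=nperseg)
    G = np.empty((len(freqs), n, n), dtype=complex)
    for i in range(n):
        for j in range(n):
            _, G[:, i, j] = csd(responses[i], responses[j], fs=fs,
                                nperseg=nperseg)
    return freqs, np.linalg.svd(G, compute_uv=False)

# Three sensors observing one 12 Hz mode in noise.
rng = np.random.default_rng(3)
t = np.arange(0, 20, 1 / 256.0)
mode = np.sin(2 * np.pi * 12.0 * t)
data = np.vstack([a * mode + 0.1 * rng.standard_normal(t.size)
                  for a in (1.0, 0.6, -0.8)])
freqs, sv = fdd_singular_values(data, fs=256.0)
print(freqs[np.argmax(sv[:, 0])])            # peak of 1st singular value: ~12 Hz
```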
Mechanical and Assembly Units of Viral Capsids Identified via Quasi-Rigid Domain Decomposition
Polles, Guido; Indelicato, Giuliana; Potestio, Raffaello; Cermelli, Paolo; Twarock, Reidun; Micheletti, Cristian
2013-01-01
Key steps in a viral life-cycle, such as self-assembly of a protective protein container or in some cases also subsequent maturation events, are governed by the interplay of physico-chemical mechanisms involving various spatial and temporal scales. These salient aspects of a viral life cycle are hence well described and rationalised from a mesoscopic perspective. Accordingly, various experimental and computational efforts have been directed towards identifying the fundamental building blocks that are instrumental for the mechanical response, or constitute the assembly units, of a few specific viral shells. Motivated by these earlier studies we introduce and apply a general and efficient computational scheme for identifying the stable domains of a given viral capsid. The method is based on elastic network models and quasi-rigid domain decomposition. It is first applied to a heterogeneous set of well-characterized viruses (CCMV, MS2, STNV, STMV) for which the known mechanical or assembly domains are correctly identified. The validated method is next applied to other viral particles such as L-A, Pariacoto and polyoma viruses, whose fundamental functional domains are still unknown or debated and for which we formulate verifiable predictions. The numerical code implementing the domain decomposition strategy is made freely available. PMID:24244139
Suppressing correlations in massively parallel simulations of lattice models
NASA Astrophysics Data System (ADS)
Kelling, Jeffrey; Ódor, Géza; Gemming, Sibylle
2017-11-01
For lattice Monte Carlo simulations parallelization is crucial to make studies of large systems and long simulation time feasible, while sequential simulations remain the gold-standard for correlation-free dynamics. Here, various domain decomposition schemes are compared, concluding with one which delivers virtually correlation-free simulations on GPUs. Extensive simulations of the octahedron model for 2 + 1 dimensional Kardar-Parisi-Zhang surface growth, which is very sensitive to correlation in the site-selection dynamics, were performed to show self-consistency of the parallel runs and agreement with the sequential algorithm. We present a GPU implementation providing a speedup of about 30 × over a parallel CPU implementation on a single socket and at least 180 × with respect to the sequential reference.
Underdetermined blind separation of three-way fluorescence spectra of PAHs in water
NASA Astrophysics Data System (ADS)
Yang, Ruifang; Zhao, Nanjing; Xiao, Xue; Zhu, Wei; Chen, Yunan; Yin, Gaofang; Liu, Jianguo; Liu, Wenqing
2018-06-01
In this work, an underdetermined blind decomposition method is developed to recognize individual components from the three-way fluorescence spectra of their mixtures using sparse component analysis (SCA). The mixing matrix is estimated from the mixtures using a fuzzy data clustering algorithm together with the scatters corresponding to local energy maxima in the time-frequency domain, and the spectra of the object components are recovered by the pseudo-inverse technique. As an example, the spectra of three and four pure components are blindly extracted from two mixture samples using this method, with similarities between resolved and reference spectra all above 0.80. This work opens a new and effective path to monitoring PAHs in water by the three-way fluorescence spectroscopy technique.
The Multiscale Robin Coupled Method for flows in porous media
NASA Astrophysics Data System (ADS)
Guiraldello, Rafael T.; Ausas, Roberto F.; Sousa, Fabricio S.; Pereira, Felipe; Buscaglia, Gustavo C.
2018-02-01
A multiscale mixed method aiming at the accurate approximation of velocity and pressure fields in heterogeneous porous media is proposed. The procedure is based on a new domain decomposition method in which the local problems are subject to Robin boundary conditions. The domain decomposition procedure is defined in terms of two independent spaces on the skeleton of the decomposition, corresponding to interface pressures and fluxes, that can be chosen with great flexibility to accommodate local features of the underlying permeability fields. The well-posedness of the new domain decomposition procedure is established and its connection with the method of Douglas et al. (1993) [12] is identified, also allowing us to reinterpret the known procedure as an optimized Schwarz (or Two-Lagrange-Multiplier) method. The multiscale property of the new domain decomposition method is indicated, and its relation with the Multiscale Mortar Mixed Finite Element Method (MMMFEM) and the Multiscale Hybrid-Mixed (MHM) Finite Element Method is discussed. Numerical simulations are presented aiming at illustrating several features of the new method. Initially we illustrate the possibility of switching from the MMMFEM to the MHM by suitably varying the Robin condition parameter in the new multiscale method. Then we turn our attention to realistic flows in high-contrast, channelized porous formations. We show that for a range of values of the Robin condition parameter our method provides better approximations for pressure and velocity than those computed with either the MMMFEM or the MHM. This is an indication that our method has the potential to produce more accurate velocity fields in the presence of rough, realistic permeability fields of petroleum reservoirs.
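To make the domain-decomposition idea concrete on the simplest possible case, here is a classical overlapping Schwarz iteration for a 1-D Poisson problem with Dirichlet transmission conditions. Note this is a toy stand-in: the paper couples nonoverlapping subdomains through Robin conditions on independent interface spaces, which is not what this sketch does.

```python
import numpy as np

def solve_poisson(n, h, left, right):
    """Direct solve of -u'' = 1 on n interior nodes, Dirichlet BCs."""
    A = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
    f = np.ones(n)
    f[0] += left / h**2
    f[-1] += right / h**2
    return np.linalg.solve(A, f)

N, h = 99, 1.0 / 100                 # interior nodes x_1 .. x_99 of (0, 1)
u = np.zeros(N)
lo, hi = 59, 39                      # overlapping subdomains 1..60 and 40..99
for _ in range(30):                  # alternating Schwarz sweeps
    u[:lo + 1] = solve_poisson(lo + 1, h, 0.0, u[lo + 1])   # left subdomain
    u[hi:] = solve_poisson(N - hi, h, u[hi - 1], 0.0)       # right subdomain
x = np.linspace(h, 1.0 - h, N)
print(np.max(np.abs(u - 0.5 * x * (1.0 - x))))  # -> ~0 (exact solution)
```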
Pascual-García, Alberto; Abia, David; Ortiz, Angel R; Bastolla, Ugo
2009-03-01
Structural classifications of proteins assume the existence of the fold, which is an intrinsic equivalence class of protein domains. Here, we test under which conditions such an equivalence class is compatible with objective similarity measures. We base our analysis on the transitive property of the equivalence relationship, requiring that similarity of A with B and of B with C implies that A and C are also similar. Divergent gene evolution leads us to expect that the transitive property should approximately hold. However, if protein domains are a combination of recurrent short polypeptide fragments, as proposed by several authors, then similarity of partial fragments may violate the transitive property, favouring the continuous view of the protein structure space. We propose a measure to quantify the violations of the transitive property when a clustering algorithm joins elements into clusters, and we find that such violations present a well-defined and detectable cross-over point, from an approximately transitive regime at high structure similarity to a regime with large transitivity violations and large differences in length at low similarity. We argue that protein structure space is discrete and hierarchic classification is justified up to this cross-over point, whereas at lower similarities the structure space is continuous and should be represented as a network. We have tested the qualitative behaviour of this measure, varying all the choices involved in the automatic classification procedure, i.e., domain decomposition, alignment algorithm, similarity score, and clustering algorithm, and we have found that this behaviour is quite robust. The final classification depends on the chosen algorithms. We used the values of the clustering coefficient and the transitivity violations to select the optimal choices among those that we tested. Interestingly, this criterion also favours the agreement between automatic and expert classifications. As a domain set, we selected a consensus set of 2,890 domains decomposed very similarly in SCOP and CATH. As an alignment algorithm, we used a global version of MAMMOTH developed in our group, which is both rapid and accurate. As a similarity measure, we used the size-normalized contact overlap, and as a clustering algorithm, we used average linkage. The resulting automatic classification at the cross-over point was more consistent than expert ones with respect to the structure similarity measure, with 86% of the clusters corresponding to subsets of either SCOP or CATH superfamilies and fewer than 5% containing domains in distinct folds according to both SCOP and CATH. Almost 15% of SCOP superfamilies and 10% of CATH superfamilies were split, consistent with the notion of fold change in protein evolution. These results were qualitatively robust for all choices that we tested, although we did not try to use alignment algorithms developed by other groups. Folds defined in SCOP and CATH would be completely joined in the regime of large transitivity violations, where clustering is more arbitrary. Consistently, the agreement between SCOP and CATH at fold level was lower than their agreement with the automatic classification obtained using as a clustering algorithm, respectively, average linkage (for SCOP) or single linkage (for CATH).
The networks representing significant evolutionary and structural relationships between clusters beyond the cross-over point may allow us to perform evolutionary, structural, or functional analyses beyond the limits of classification schemes. These networks and the underlying clusters are available at http://ub.cbm.uam.es/research/ProtNet.php.
The shifting zoom: new possibilities for inverse scattering on electrically large domains
NASA Astrophysics Data System (ADS)
Persico, Raffaele; Ludeno, Giovanni; Soldovieri, Francesco; De Coster, Alberic; Lambot, Sebastien
2017-04-01
Inverse scattering is a subject of great interest in diagnostic problems, which are in turn of interest for many applications such as the investigation of cultural heritage, the characterization of foundations or sub-services, the identification of unexploded ordnance, and so on [1-4]. In particular, GPR data are usually focused by means of migration algorithms, essentially based on a linear approximation of the scattering phenomenon. Migration algorithms are popular because they are computationally efficient and require neither the inversion of a matrix nor the calculation of the elements of a matrix. In fact, they are essentially based on the adjoint of the linearised scattering operator, which in the end allows the inversion formula to be written as a suitably weighted integral of the data [5]. In particular, this makes a migration algorithm more suitable than a linear microwave tomography inversion algorithm for the reconstruction of an electrically large investigation domain. However, this computational challenge can be overcome by making use of investigation domains joined side by side, as proposed e.g. in ref. [3]. This makes it possible to apply a microwave tomography algorithm even to large investigation domains. However, the joining side by side of sequential investigation domains introduces a problem of limited (and asymmetric) maximum view angle for targets occurring close to the edges between two adjacent domains, or possibly crossing these edges. The shifting zoom is a method that overcomes this difficulty by means of overlapped investigation and observation domains [6-7]. It requires more sequential inversions than adjacent investigation domains do, but the extra time actually required is minimal because the matrix to be inverted is calculated once and for all, as well as its singular value decomposition: what is repeated more times is only a fast matrix-vector multiplication. References [1] M. Pieraccini, L. Noferini, D. Mecatti, C. Atzeni, R. Persico, F. Soldovieri, Advanced Processing Techniques for Step-frequency Continuous-Wave Penetrating Radar: the Case Study of "Palazzo Vecchio" Walls (Firenze, Italy), Research on Nondestructive Evaluation, vol. 17, pp. 71-83, 2006. [2] N. Masini, R. Persico, E. Rizzo, A. Calia, M. T. Giannotta, G. Quarta, A. Pagliuca, "Integrated Techniques for Analysis and Monitoring of Historical Monuments: the case of S. Giovanni al Sepolcro in Brindisi (Southern Italy)," Near Surface Geophysics, vol. 8 (5), pp. 423-432, 2010. [3] E. Pettinelli, A. Di Matteo, E. Mattei, L. Crocco, F. Soldovieri, J. D. Redman, and A. P. Annan, "GPR response from buried pipes: Measurement on field site and tomographic reconstructions," IEEE Transactions on Geoscience and Remote Sensing, vol. 47, n. 8, pp. 2639-2645, Aug. 2009. [4] O. Lopera, E. C. Slob, N. Milisavljevic and S. Lambot, "Filtering soil surface and antenna effects from GPR data to enhance landmine detection," IEEE Transactions on Geoscience and Remote Sensing, vol. 45, n. 3, pp. 707-717, 2007. [5] R. Persico, "Introduction to Ground Penetrating Radar: Inverse Scattering and Data Processing," Wiley, 2014. [6] R. Persico, J. Sala, "The problem of the investigation domain subdivision in 2D linear inversions for large scale GPR data," IEEE Geoscience and Remote Sensing Letters, vol. 11, n. 7, pp. 1215-1219, doi 10.1109/LGRS.2013.2290008, July 2014. [7] R. Persico, F. Soldovieri, S. Lambot, "Shifting zoom in 2D linear inversions performed on GPR data gathered along an electrically large investigation domain," Proc. 16th International Conference on Ground Penetrating Radar GPR2016, Hong Kong, June 13-16, 2016.
Randomized interpolative decomposition of separated representations
NASA Astrophysics Data System (ADS)
Biagioni, David J.; Beylkin, Daniel; Beylkin, Gregory
2015-01-01
We introduce an algorithm to compute tensor interpolative decomposition (dubbed CTD-ID) for the reduction of the separation rank of Canonical Tensor Decompositions (CTDs). Tensor ID selects, for a user-defined accuracy ɛ, a near optimal subset of terms of a CTD to represent the remaining terms via a linear combination of the selected terms. CTD-ID can be used as an alternative to or in combination with the Alternating Least Squares (ALS) algorithm. We present examples of its use within a convergent iteration to compute inverse operators in high dimensions. We also briefly discuss the spectral norm as a computational alternative to the Frobenius norm in estimating approximation errors of tensor ID. We reduce the problem of finding tensor IDs to that of constructing interpolative decompositions of certain matrices. These matrices are generated via randomized projection of the terms of the given tensor. We provide cost estimates and several examples of the new approach to the reduction of separation rank.
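SciPy exposes the matrix building block the construction reduces to. The sketch below computes an interpolative decomposition of a low-rank matrix with scipy.linalg.interpolative, re-expressing all columns through a selected skeleton subset; the tolerance and matrix sizes are illustrative assumptions, and this is the matrix ID only, not the randomized tensor machinery of the paper.

```python
import numpy as np
from scipy.linalg import interpolative as sli

rng = np.random.default_rng(4)
A = rng.standard_normal((100, 8)) @ rng.standard_normal((8, 60))  # rank 8
k, idx, proj = sli.interp_decomp(A, 1e-10)    # rank found from the tolerance
B = sli.reconstruct_skel_matrix(A, k, idx)    # k actual columns of A
P = sli.reconstruct_interp_matrix(idx, proj)  # interpolation coefficients
print(k, np.linalg.norm(A - B @ P) / np.linalg.norm(A))  # 8, ~1e-15
```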
Hassan, Ahnaf Rashik; Bhuiyan, Mohammed Imamul Hassan
2017-03-01
Automatic sleep staging is essential for alleviating the burden on physicians of analyzing a large volume of data by visual inspection. It is also a precondition for making an automated sleep monitoring system feasible. Further, computerized sleep scoring will expedite large-scale data analysis in sleep research. Nevertheless, most of the existing works on sleep staging are based on either multichannel or multiple physiological signals, which are uncomfortable for the user and hinder the feasibility of an in-home sleep monitoring device. So a successful and reliable computer-assisted sleep staging scheme is yet to emerge. In this work, we propose a single-channel EEG-based algorithm for computerized sleep scoring. In the proposed algorithm, we decompose EEG signal segments using Ensemble Empirical Mode Decomposition (EEMD) and extract various statistical moment-based features. The effectiveness of EEMD and statistical features is investigated. Statistical analysis is performed for feature selection. A newly proposed classification technique, Random Under-Sampling Boosting (RUSBoost), is introduced for sleep stage classification. To the best of the authors' knowledge, this is the first implementation of EEMD in conjunction with RUSBoost. The proposed feature extraction scheme's performance is investigated for various choices of classification models. The algorithmic performance of our scheme is evaluated against contemporary works in the literature. The performance of the proposed method is comparable to or better than that of the state-of-the-art ones. The proposed algorithm gives 88.07%, 83.49%, 92.66%, 94.23%, and 98.15% accuracy for 6-state to 2-state classification of sleep stages on the Sleep-EDF database. Our experimental outcomes reveal that RUSBoost outperforms other classification models for the feature extraction framework presented in this work. Besides, the algorithm proposed in this work demonstrates high detection accuracy for the sleep states S1 and REM. Statistical moment-based features in the EEMD domain distinguish the sleep states successfully and efficaciously. The automated sleep scoring scheme proposed herein can ease the burden on clinicians, contribute to the device implementation of a sleep monitoring system, and benefit sleep research. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Integrand-level reduction of loop amplitudes by computational algebraic geometry methods
NASA Astrophysics Data System (ADS)
Zhang, Yang
2012-09-01
We present an algorithm for the integrand-level reduction of multi-loop amplitudes of renormalizable field theories, based on computational algebraic geometry. This algorithm uses (1) the Gröbner basis method to determine the basis for integrand-level reduction, and (2) the primary decomposition of an ideal to classify all inequivalent solutions of unitarity cuts. The resulting basis and cut solutions can be used to reconstruct the integrand from unitarity cuts via polynomial fitting techniques. The basis determination part of the algorithm has been implemented in the Mathematica package BasisDet. The primary decomposition part can be readily carried out by algebraic geometry software, with the output of the package BasisDet. The algorithm works in both D = 4 and D = 4 - 2ɛ dimensions, and we present some two- and three-loop examples of applications of this algorithm.
Phase unwrapping in digital holography based on non-subsampled contourlet transform
NASA Astrophysics Data System (ADS)
Zhang, Xiaolei; Zhang, Xiangchao; Xu, Min; Zhang, Hao; Jiang, Xiangqian
2018-01-01
In the digital holographic measurement of complex surfaces, phase unwrapping is a critical step for accurate reconstruction. The phases of the complex amplitudes calculated from interferometric holograms are disturbed by speckle noise, so reliable unwrapping results are difficult to obtain. Most existing unwrapping algorithms first implement denoising operations to obtain noise-free phases and then conduct phase unwrapping pixel by pixel. This approach is sensitive to spikes and prone to unreliable results in practice. In this paper, a robust unwrapping algorithm based on the non-subsampled contourlet transform (NSCT) is developed. The multiscale and directional decomposition of the NSCT enhances the boundary between adjacent phase levels, and hence the influence of local noise can be eliminated in the transform domain. The wrapped phase map is segmented into several regions corresponding to different phase levels. Finally, an unwrapped phase map is obtained by elevating the phases of a whole segment instead of individual pixels, to avoid unwrapping errors caused by local spikes. This algorithm is suitable for dealing with complex and noisy wavefronts. Its universality and superiority in digital holographic interferometry have been demonstrated by both numerical analysis and practical experiments.
Peak picking NMR spectral data using non-negative matrix factorization
2014-01-01
Background Simple peak-picking algorithms, such as those based on lineshape fitting, perform well when peaks are completely resolved in multidimensional NMR spectra, but often produce wrong intensities and frequencies for overlapping peak clusters. For example, NOESY-type spectra have considerable overlaps leading to significant peak-picking intensity errors, which can result in erroneous structural restraints. Precise frequencies are critical for unambiguous resonance assignments. Results To alleviate this problem, a more sophisticated peak decomposition algorithm, based on non-negative matrix factorization (NMF), was developed. We produce peak shapes from Fourier-transformed NMR spectra. Apart from its main goal of deriving components from spectra and producing peak lists automatically, the NMF approach can also be applied if the positions of some peaks are known a priori, e.g. from consistently referenced spectral dimensions of other experiments. Conclusions Application of the NMF algorithm to a three-dimensional peak list of the 23 kDa bi-domain section of the RcsD protein (RcsD-ABL-HPt, residues 688-890), as well as to synthetic HSQC data, shows that peaks can be picked accurately also in spectral regions with strong overlap. PMID:24511909
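The decomposition idea can be illustrated with a generic NMF implementation (scikit-learn here, not the authors' software): a set of spectra mixing two overlapping peak shapes is factorized into non-negative components whose maxima estimate the peak positions. The peak shapes and parameters are synthetic assumptions, and NMF recovery is only approximate for heavily overlapped components.

```python
import numpy as np
from sklearn.decomposition import NMF

x = np.linspace(0, 10, 500)

def lorentz(center, width):
    """1-D Lorentzian lineshape on the grid x."""
    return 1.0 / (1.0 + ((x - center) / width) ** 2)

rng = np.random.default_rng(5)
shapes = np.vstack([lorentz(4.0, 0.6), lorentz(6.0, 0.6)])  # overlapping peaks
V = rng.random((30, 2)) @ shapes + 1e-3      # 30 mixed, strictly positive rows
model = NMF(n_components=2, init="nndsvd", max_iter=2000)
W = model.fit_transform(V)                   # mixing weights
H = model.components_                        # recovered peak shapes
print(x[H.argmax(axis=1)])                   # near [4.0, 6.0], in either order
```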
NASA Astrophysics Data System (ADS)
Caplan, R. M.; Mikić, Z.; Linker, J. A.; Lionello, R.
2017-05-01
We explore the performance and advantages/disadvantages of using unconditionally stable explicit super time-stepping (STS) algorithms versus implicit schemes with Krylov solvers for integrating parabolic operators in thermodynamic MHD models of the solar corona. Specifically, we compare the second-order Runge-Kutta Legendre (RKL2) STS method with the implicit backward Euler scheme computed using the preconditioned conjugate gradient (PCG) solver with both a point-Jacobi and a non-overlapping domain decomposition ILU0 preconditioner. The algorithms are used to integrate anisotropic Spitzer thermal conduction and artificial kinematic viscosity at time-steps much larger than classic explicit stability criteria allow. A key component of the comparison is the use of an established MHD model (MAS) to compute a real-world simulation on a large HPC cluster. Special attention is placed on the parallel scaling of the algorithms. It is shown that, for a specific problem and model, the RKL2 method is comparable to or surpasses the implicit method with PCG solvers in performance and scaling, but suffers from some accuracy limitations. These limitations, and the applicability of RKL methods, are briefly discussed.
Trackside acoustic diagnosis of axle box bearing based on kurtosis-optimization wavelet denoising
NASA Astrophysics Data System (ADS)
Peng, Chaoyong; Gao, Xiaorong; Peng, Jianping; Wang, Ai
2018-04-01
The axle box bearing is one of the key components of railway vehicles, and its operating condition has a significant effect on traffic safety. Acoustic diagnosis is more suitable than vibration diagnosis for trackside monitoring. The acoustic signal generated by the train axle box bearing is an amplitude-modulated and frequency-modulated signal mixed with complex train running noise. Although empirical mode decomposition (EMD) and some improved time-frequency algorithms have proved useful in bearing vibration signal processing, it is hard to extract the bearing fault signal from severe trackside acoustic background noise using those algorithms. Therefore, a kurtosis-optimization-based wavelet packet (KWP) denoising algorithm is proposed, as kurtosis is the key indicator of a bearing fault signal in the time domain. Firstly, geometry-based Doppler correction is applied to the signals of each sensor, and through the signal superposition of multiple sensors, random noises and impulse noises, which interfere with the kurtosis indicator, are suppressed. Then, the KWP denoising is conducted. Finally, EMD and the Hilbert transform are applied to extract the fault feature. Experiment results indicate that the proposed method consisting of KWP and EMD is superior to EMD alone.
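A minimal sketch of the kurtosis-guided band selection underlying the KWP step, using PyWavelets and scipy. The wavelet, level, and synthetic impact signal are assumptions, and the Doppler-correction and multi-sensor superposition stages are omitted; a fuller implementation would also weight or threshold several sub-bands rather than keep a single one.

```python
import numpy as np
import pywt
from scipy.stats import kurtosis

def kwp_denoise(signal, wavelet="db8", level=4):
    """Keep only the most impulsive (highest-kurtosis) wavelet-packet
    sub-band and reconstruct the signal from it."""
    wp = pywt.WaveletPacket(signal, wavelet=wavelet, maxlevel=level)
    nodes = wp.get_level(level, order="freq")
    best = max(nodes, key=lambda node: kurtosis(node.data))
    out = pywt.WaveletPacket(None, wavelet=wavelet, maxlevel=level)
    out[best.path] = best.data               # single retained sub-band
    return out.reconstruct(update=False)

rng = np.random.default_rng(6)
sig = rng.standard_normal(4096)
sig[::512] += 8.0                            # periodic bearing-like impacts
denoised = kwp_denoise(sig)                  # impacts stand out from the noise
```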
Domain decomposition and matching for time-domain analysis of motions of ships advancing in head sea
NASA Astrophysics Data System (ADS)
Tang, Kai; Zhu, Ren-chuan; Miao, Guo-ping; Fan, Ju
2014-08-01
A domain decomposition and matching method in the time domain is outlined for simulating the motions of ships advancing in waves. The flow field is decomposed into inner and outer domains by an imaginary control surface, and the Rankine source method is applied to the inner domain while the transient Green function method is used in the outer domain. The two initial boundary value problems are matched on the control surface. The corresponding numerical codes are developed, and the added masses, wave exciting forces and motions of ships advancing in head sea are presented and verified for the Series 60 ship and the S175 containership. Good agreement is obtained when the numerical results are compared with experimental data and other references. The present method is more efficient because panel discretization is required only in the inner domain during the numerical calculation, and good numerical stability is demonstrated, avoiding the divergence problem for ships with flare.
Data decomposition method for parallel polygon rasterization considering load balancing
NASA Astrophysics Data System (ADS)
Zhou, Chen; Chen, Zhenjie; Liu, Yongxue; Li, Feixue; Cheng, Liang; Zhu, A.-xing; Li, Manchun
2015-12-01
It is essential to adopt parallel computing technology to rapidly rasterize massive polygon data. In parallel rasterization, it is difficult to design an effective data decomposition method. Conventional methods ignore load balancing of polygon complexity in parallel rasterization and thus fail to achieve high parallel efficiency. In this paper, a novel data decomposition method based on polygon complexity (DMPC) is proposed. First, four factors that possibly affect the rasterization efficiency were investigated. Then, a metric represented by the boundary number and raster pixel number in the minimum bounding rectangle was developed to calculate the complexity of each polygon. Using this metric, polygons were rationally allocated according to the polygon complexity, and each process could achieve balanced loads of polygon complexity. To validate the efficiency of DMPC, it was used to parallelize different polygon rasterization algorithms and tested on different datasets. Experimental results showed that DMPC could effectively parallelize polygon rasterization algorithms. Furthermore, the implemented parallel algorithms with DMPC could achieve good speedup ratios of at least 15.69 and generally outperformed conventional decomposition methods in terms of parallel efficiency and load balancing. In addition, the results showed that DMPC exhibited consistently better performance for different spatial distributions of polygons.
NASA Astrophysics Data System (ADS)
Furuichi, Mikito; Nishiura, Daisuke
2017-10-01
We developed dynamic load-balancing algorithms for particle simulation methods (PSM) involving short-range interactions, such as Smoothed Particle Hydrodynamics (SPH), the Moving Particle Semi-implicit method (MPS), and the Discrete Element Method (DEM). These are needed to handle billions of particles modeled in large distributed-memory computer systems. Our method utilizes flexible orthogonal domain decomposition, allowing the sub-domain boundaries in a column to be different for each row. The imbalances in execution time between parallel logical processes are treated as a nonlinear residual. Load balancing is achieved by minimizing the residual within the framework of an iterative nonlinear solver, combined with a multigrid technique in the local smoother. Our iterative method is suitable for adjusting the sub-domains frequently by monitoring the performance of each computational process, because it is computationally cheaper in terms of communication and memory costs than non-iterative methods. Numerical tests demonstrated the ability of our approach to handle workload imbalances arising from a non-uniform particle distribution, differences in particle types, or heterogeneous computer architecture, which was difficult with previously proposed methods. We analyzed the parallel efficiency and scalability of our method using the Earth Simulator and K computer supercomputer systems.
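The flavor of iterative boundary adjustment can be sketched in one dimension: measured per-subdomain costs define a piecewise-linear cumulative load whose equal quantiles give the new boundaries, and re-measuring and repeating relaxes the imbalance away. This toy is a 1-D stand-in for the paper's 2-D orthogonal decomposition with its nonlinear multigrid-like solver; all names are illustrative.

```python
import numpy as np

def rebalance(boundaries, costs):
    """Place interior boundaries at equal quantiles of the measured
    cumulative cost (cost assumed uniform within each subdomain)."""
    cum = np.concatenate([[0.0], np.cumsum(costs)])
    targets = np.linspace(0.0, cum[-1], len(costs) + 1)
    return np.interp(targets, cum, boundaries)

b = np.linspace(0.0, 1.0, 5)                 # four equal subdomains on [0, 1]
for _ in range(5):
    # "Measured" cost: particle density grows linearly toward x = 1,
    # so the cost of [x0, x1] is x1**2 - x0**2.
    costs = np.array([b[i + 1] ** 2 - b[i] ** 2 for i in range(4)])
    b = rebalance(b, costs)
print(b)   # boundaries crowd toward x = 1, equalizing the per-process work
```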
NASA Astrophysics Data System (ADS)
Bagherzadeh, Seyed Amin; Asadi, Davood
2017-05-01
In search of a precise method for analyzing nonlinear and non-stationary flight data of an aircraft in icing conditions, an Empirical Mode Decomposition (EMD) algorithm enhanced by multi-objective optimization is introduced. In the proposed method, dissimilar IMF definitions are considered by the Genetic Algorithm (GA) in order to find the best decision parameters of the signal trend. To resolve disadvantages of the classical algorithm caused by the envelope concept, the signal trend is estimated directly in the proposed method. Furthermore, in order to simplify the performance and understanding of the EMD algorithm, the proposed method obviates the need for a repeated sifting process. The proposed enhanced EMD algorithm is verified on several benchmark signals. Afterwards, the enhanced algorithm is applied to simulated flight data in icing conditions in order to detect ice accretion on the aircraft. The results demonstrate the effectiveness of the proposed EMD algorithm in aircraft ice detection by providing a figure of merit for the icing severity.
Software-engineering challenges of building and deploying reusable problem solvers.
O'Connor, Martin J; Nyulas, Csongor; Tu, Samson; Buckeridge, David L; Okhmatovskaia, Anna; Musen, Mark A
2009-11-01
Problem solving methods (PSMs) are software components that represent and encode reusable algorithms. They can be combined with representations of domain knowledge to produce intelligent application systems. A goal of research on PSMs is to provide principled methods and tools for composing and reusing algorithms in knowledge-based systems. The ultimate objective is to produce libraries of methods that can be easily adapted for use in these systems. Despite the intuitive appeal of PSMs as conceptual building blocks, in practice, these goals are largely unmet. There are no widely available tools for building applications using PSMs and no public libraries of PSMs available for reuse. This paper analyzes some of the reasons for the lack of widespread adoption of PSM techniques and illustrates our analysis by describing our experiences developing a complex, high-throughput software system based on PSM principles. We conclude that many fundamental principles in PSM research are useful for building knowledge-based systems. In particular, the task-method decomposition process, which provides a means for structuring knowledge-based tasks, is a powerful abstraction for building systems of analytic methods. However, despite the power of PSMs in the conceptual modeling of knowledge-based systems, software engineering challenges have been seriously underestimated. The complexity of integrating control knowledge modeled by developers using PSMs with the domain knowledge that they model using ontologies creates a barrier to widespread use of PSM-based systems. Nevertheless, the surge of recent interest in ontologies has led to the production of comprehensive domain ontologies and of robust ontology-authoring tools. These developments present new opportunities to leverage the PSM approach.
The Roadmaker's algorithm for the discrete pulse transform.
Laurie, Dirk P
2011-02-01
The discrete pulse transform (DPT) is a decomposition of an observed signal into a sum of pulses, i.e., signals that are constant on a connected set and zero elsewhere. Originally developed for 1-D signal processing, the DPT has recently been generalized to more dimensions. Applications in image processing are currently being investigated. The time required to compute the DPT as originally defined, via the successive application of LULU operators (members of a class of minimax filters studied by Rohwer), has been a severe drawback to its applicability. This paper introduces a fast method for obtaining such a decomposition, called the Roadmaker's algorithm because it involves filling pits and razing bumps. It acts selectively only on those features actually present in the signal, flattening them in order of increasing size by subtracting an appropriate positive or negative pulse, which is then appended to the decomposition. The implementation described here covers 1-D signal processing as well as 2-D and 3-D image processing in a single framework. This is achieved by considering the signal or image as a function defined on a graph, with the geometry specified by the edges of the graph. Whenever a feature is flattened, nodes in the graph are merged, until eventually only one node remains. At that stage, a new set of edges on the same nodes, forming a tree structure, defines the obtained decomposition. The Roadmaker's algorithm is shown to be equivalent to the DPT in the sense of obtaining the same decomposition. However, its simpler operators are not in general equivalent to the LULU operators in situations where those operators are not applied successively. A by-product of the Roadmaker's algorithm is that it yields a proof of the so-called Highlight Conjecture, stated as an open problem in 2006. We pay particular attention to algorithmic details and complexity, including a demonstration that in the 1-D case, and also in the case of a complete graph, the Roadmaker's algorithm has optimal complexity: it runs in time O(m), where m is the number of arcs in the graph.
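A simplified 1-D sketch of a pulse decomposition in this spirit: repeatedly flatten the smallest locally extremal constant run onto its nearest neighboring level, recording each removed pulse. This naive version reproduces "fill pits, raze bumps" and reconstructs the signal exactly, but not the O(m) node-merging machinery of the Roadmaker's algorithm; all names are illustrative.

```python
import numpy as np

def pulse_decompose(signal):
    """Decompose a 1-D signal into pulses: signal = residual + sum of
    (start, length, height) pulses, flattened smallest-feature first."""
    s = [float(v) for v in signal]
    pulses = []
    while True:
        runs, i = [], 0                      # maximal constant runs
        while i < len(s):
            j = i
            while j < len(s) and s[j] == s[i]:
                j += 1
            runs.append((i, j - i, s[i]))
            i = j
        if len(runs) == 1:
            return pulses, s[0]              # flat residual: done
        def extremal(k):
            nbrs = [runs[m][2] for m in (k - 1, k + 1) if 0 <= m < len(runs)]
            return all(v < runs[k][2] for v in nbrs) or \
                   all(v > runs[k][2] for v in nbrs)
        k = min((k for k in range(len(runs)) if extremal(k)),
                key=lambda k: runs[k][1])    # smallest extremal run
        start, length, value = runs[k]
        nbrs = [runs[m][2] for m in (k - 1, k + 1) if 0 <= m < len(runs)]
        new = min(nbrs, key=lambda v: abs(v - value))  # nearest level
        pulses.append((start, length, value - new))
        for idx in range(start, start + length):
            s[idx] = new                     # raze the bump / fill the pit

pulses, base = pulse_decompose([0, 0, 3, 3, 1, 4, 0, 0])
rec = np.full(8, base)
for start, length, height in pulses:
    rec[start:start + length] += height      # reconstructs the original
print(pulses, rec)
```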
NASA Technical Reports Server (NTRS)
Liu, Kuojuey Ray
1990-01-01
Least-squares (LS) estimation and spectral decomposition algorithms constitute the heart of modern signal processing and communication problems. Implementations of recursive LS and spectral decomposition algorithms onto parallel processing architectures such as systolic arrays with efficient fault-tolerant schemes are the major concerns of this dissertation. There are four major results in this dissertation. First, we propose the systolic block Householder transformation with application to recursive least-squares minimization. It is successfully implemented on a systolic array with a two-level pipelined implementation at the vector level as well as at the word level. Second, a real-time algorithm-based concurrent error detection scheme based on the residual method is proposed for the QRD RLS systolic array. The fault diagnosis, order-degraded reconfiguration, and performance analysis are also considered. Third, the dynamic range, stability, error detection capability under finite-precision implementation, order-degraded performance, and residual estimation under faulty situations for the QRD RLS systolic array are studied in detail. Finally, we propose the use of multi-phase systolic algorithms for spectral decomposition based on the QR algorithm. Two systolic architectures, one based on a triangular array and another based on a rectangular array, are presented for the multi-phase operations with fault-tolerant considerations. Eigenvectors and singular vectors can be easily obtained by using the multi-phase operations. Performance issues are also considered.
Study of Track Irregularity Time Series Calibration and Variation Pattern at Unit Section
Jia, Chaolong; Wei, Lili; Wang, Hanning; Yang, Jiulin
2014-01-01
Focusing on problems in track irregularity time series data quality, this paper first presents algorithms for abnormal data identification, data offset correction, local outlier detection, and noise cancellation. It then proposes track irregularity time series decomposition and reconstruction through a wavelet decomposition and reconstruction approach. Finally, the patterns and features of the track irregularity standard deviation data sequence in unit sections are studied, and the changing trend of the track irregularity time series is discovered and described. PMID:25435869
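As a rough illustration of the wavelet decomposition-and-reconstruction step, the sketch below de-noises a 1-D series with PyWavelets; the db4 wavelet, decomposition level, and universal soft threshold are generic assumptions, not the parameters used in the paper.

```python
import numpy as np
import pywt

def wavelet_denoise(series, wavelet="db4", level=4):
    """Decompose a 1-D series, shrink the detail coefficients, and
    reconstruct a de-noised trend (a generic sketch)."""
    coeffs = pywt.wavedec(series, wavelet, level=level)
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745     # noise scale estimate
    thr = sigma * np.sqrt(2 * np.log(len(series)))     # universal threshold
    coeffs[1:] = [pywt.threshold(c, thr, mode="soft") for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)[: len(series)]

t = np.linspace(0, 1, 1024)
noisy = np.sin(6 * np.pi * t) + 0.3 * np.random.default_rng(1).normal(size=t.size)
trend = wavelet_denoise(noisy)   # smooth trend with high-frequency noise removed
```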
Integration of progressive hedging and dual decomposition in stochastic integer programs
Watson, Jean-Paul; Guo, Ge; Hackebeil, Gabriel; ...
2015-04-07
We present a method for integrating the Progressive Hedging (PH) algorithm and the Dual Decomposition (DD) algorithm of Carøe and Schultz for stochastic mixed-integer programs. Based on the correspondence between lower bounds obtained with PH and DD, a method to transform weights from PH into Lagrange multipliers in DD is found. Fast progress in early iterations of PH speeds up convergence of DD to an exact solution. Finally, we report computational results on server location and unit commitment instances.
Gold - A novel deconvolution algorithm with optimization for waveform LiDAR processing
NASA Astrophysics Data System (ADS)
Zhou, Tan; Popescu, Sorin C.; Krause, Keith; Sheridan, Ryan D.; Putman, Eric
2017-07-01
Waveform Light Detection and Ranging (LiDAR) data have advantages over discrete-return LiDAR data in accurately characterizing vegetation structure. However, we lack a comprehensive understanding of waveform data processing approaches under different topography and vegetation conditions. The objective of this paper is to highlight a novel deconvolution algorithm, the Gold algorithm, for processing waveform LiDAR data with optimal deconvolution parameters. Further, we present a comparative study of waveform processing methods to provide insight into selecting an approach for a given combination of vegetation and terrain characteristics. We employed two waveform processing methods: (1) direct decomposition, and (2) deconvolution and decomposition. In the second method, we utilized two deconvolution algorithms - the Richardson-Lucy (RL) algorithm and the Gold algorithm. Comprehensive and quantitative comparisons were conducted in terms of the number of detected echoes, position accuracy, the bias of the end products (such as the digital terrain model (DTM) and canopy height model (CHM)) from the corresponding reference data, and the parameter uncertainty for the end products obtained from the different methods. The study was conducted at three study sites covering diverse ecological regions and vegetation and elevation gradients. Results demonstrate that both deconvolution algorithms are sensitive to the pre-processing of the input data. The deconvolution and decomposition method is more capable of detecting hidden echoes with a lower false echo detection rate, especially with the Gold algorithm. Compared to the reference data, all approaches generate satisfactory accuracy with small mean spatial differences (<1.22 m for DTMs, <0.77 m for CHMs) and root mean square error (RMSE) (<1.26 m for DTMs, <1.93 m for CHMs). More specifically, the Gold algorithm is superior to the others with a smaller RMSE (<1.01 m), while the direct decomposition approach works better in terms of the percentage of spatial difference within 0.5 and 1 m. The parameter uncertainty analysis demonstrates that the Gold algorithm outperforms the other approaches in dense vegetation areas, with the smallest RMSE, while the RL algorithm performs better in sparse vegetation areas in terms of RMSE. Additionally, higher levels of uncertainty occur in areas with steep slopes and dense vegetation. This study provides an alternative and innovative approach for waveform processing that will benefit high-fidelity processing of waveform LiDAR data to characterize vegetation structure.
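For concreteness, here is a minimal 1-D sketch of the two deconvolution updates compared in the paper. Both are standard multiplicative schemes: the Gold ratio method x <- x * (H'y)/(H'Hx) and Richardson-Lucy x <- x * H'(y/(Hx)). The kernel, iteration counts, and the synthetic two-echo waveform are illustrative assumptions, not the paper's data or parameters.

```python
import numpy as np

def gold_deconvolve(y, h, iters=500):
    """Gold ratio-method deconvolution; multiplicative updates keep the
    solution non-negative (a 1-D sketch with a discrete kernel h)."""
    H = lambda v: np.convolve(v, h, mode="same")         # forward blur
    Ht = lambda v: np.convolve(v, h[::-1], mode="same")  # adjoint (correlation)
    x = np.full_like(y, y.mean(), dtype=float)
    num = Ht(y)
    for _ in range(iters):
        x *= num / np.maximum(Ht(H(x)), 1e-12)
    return x

def richardson_lucy(y, h, iters=100):
    """Richardson-Lucy deconvolution, for comparison."""
    H = lambda v: np.convolve(v, h, mode="same")
    Ht = lambda v: np.convolve(v, h[::-1], mode="same")
    x = np.full_like(y, y.mean(), dtype=float)
    for _ in range(iters):
        x *= Ht(y / np.maximum(H(x), 1e-12))
    return x

# two overlapping Gaussian echoes blurred by a Gaussian system response h
t = np.arange(200, dtype=float)
h = np.exp(-0.5 * (np.arange(-20, 21) / 4.0) ** 2); h /= h.sum()
truth = np.exp(-0.5 * ((t - 90) / 2) ** 2) + 0.6 * np.exp(-0.5 * ((t - 105) / 2) ** 2)
y = np.convolve(truth, h, mode="same")
x_gold = gold_deconvolve(y, h)   # the two hidden echoes re-emerge as peaks
```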
Matching pursuit parallel decomposition of seismic data
NASA Astrophysics Data System (ADS)
Li, Chuanhui; Zhang, Fanchang
2017-07-01
In order to improve the computation speed of matching pursuit decomposition of seismic data, a matching pursuit parallel algorithm is designed in this paper. In every iteration, we pick a fixed number of envelope peaks from the current signal according to the number of compute nodes and distribute them evenly across the nodes, which search for the optimal Morlet wavelets in parallel. With the help of parallel computer systems and the Message Passing Interface, the parallel algorithm exploits the advantages of parallel computing to significantly improve the computation speed of matching pursuit decomposition, and it also scales well. Moreover, having each compute node search for only one optimal Morlet wavelet per iteration proves to be the most efficient implementation.
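The serial kernel being parallelized can be summarized as follows: each matching pursuit iteration searches a dictionary for the atom best correlated with the residual, and it is exactly this inner-product search that the paper distributes across compute nodes. The Morlet-like atom construction and the dictionary grid below are simplified assumptions for illustration.

```python
import numpy as np

def morlet_atom(n, center, scale, freq):
    """Discrete Morlet-like atom: Gaussian envelope times a cosine."""
    u = (np.arange(n) - center) / scale
    g = np.exp(-0.5 * u**2) * np.cos(2 * np.pi * freq * (np.arange(n) - center))
    nrm = np.linalg.norm(g)
    return g / nrm if nrm > 0 else g

def matching_pursuit(signal, atoms, n_iter=10):
    """Greedy MP: at each iteration pick the atom most correlated with the
    residual; the inner-product search is the parallelizable step."""
    residual = signal.astype(float).copy()
    decomposition = []
    for _ in range(n_iter):
        corr = atoms @ residual            # the search distributed across nodes
        k = np.argmax(np.abs(corr))
        decomposition.append((k, corr[k]))
        residual -= corr[k] * atoms[k]
    return decomposition, residual

n = 512
grid = [(c, s, f) for c in range(0, n, 32) for s in (8, 16, 32) for f in (0.05, 0.1)]
atoms = np.array([morlet_atom(n, c, s, f) for c, s, f in grid])
sig = 3.0 * morlet_atom(n, 256, 16, 0.1)
decomp, res = matching_pursuit(sig, atoms, n_iter=5)  # first pick recovers the atom
```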
Singular value decomposition for collaborative filtering on a GPU
NASA Astrophysics Data System (ADS)
Kato, Kimikazu; Hosino, Tikara
2010-06-01
Collaborative filtering predicts customers' unknown preferences from known preferences. In a collaborative filtering computation, a singular value decomposition (SVD) is needed to reduce the size of a large-scale matrix so that the burden of the next computation phase is decreased. In this application, SVD means a rough approximate factorization of a given matrix into smaller matrices. Webb (a.k.a. Simon Funk) showed an effective algorithm to compute such an SVD for the open competition called the "Netflix Prize". The algorithm uses an iterative method in which the approximation error improves at each step of the iteration. We give a GPU version of Webb's algorithm. Our algorithm is implemented in CUDA and an experiment shows it to be efficient.
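A hedged serial sketch of Webb's (Funk's) incomplete-SVD idea: the factors are learned by stochastic gradient descent over only the observed entries, which is the loop a GPU implementation would parallelize. The hyperparameters and the funk_svd name are illustrative, not taken from the paper.

```python
import numpy as np

def funk_svd(ratings, n_factors=10, lr=0.005, reg=0.02, epochs=50, rng=None):
    """Webb/Funk-style approximate SVD of an incomplete rating matrix by
    stochastic gradient descent; `ratings` is a list of (user, item, value)."""
    rng = rng or np.random.default_rng(0)
    n_users = 1 + max(u for u, _, _ in ratings)
    n_items = 1 + max(i for _, i, _ in ratings)
    P = 0.1 * rng.normal(size=(n_users, n_factors))   # user factors
    Q = 0.1 * rng.normal(size=(n_items, n_factors))   # item factors
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - P[u] @ Q[i]
            # simultaneous regularized updates (RHS uses the old factors)
            P[u], Q[i] = (P[u] + lr * (err * Q[i] - reg * P[u]),
                          Q[i] + lr * (err * P[u] - reg * Q[i]))
    return P, Q

ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (2, 1, 1.0)]
P, Q = funk_svd(ratings, n_factors=2)
print(P[0] @ Q[0])  # approximately 5.0 after training
```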
Accelerated decomposition techniques for large discounted Markov decision processes
NASA Astrophysics Data System (ADS)
Larach, Abdelhadi; Chafik, S.; Daoui, C.
2017-12-01
Many hierarchical techniques for solving large Markov decision processes (MDPs) are based on partitioning the state space into strongly connected components (SCCs) that can be organized into levels. In each level, smaller problems named restricted MDPs are solved, and these partial solutions are then combined to obtain the global solution. In this paper, we first propose a novel algorithm, a variant of Tarjan's algorithm, that simultaneously finds the SCCs and the levels they belong to. Second, a new definition of the restricted MDPs is presented to improve some hierarchical solutions of discounted MDPs using a value iteration (VI) algorithm based on a list of state-action successors. Finally, a robotic motion-planning example and experimental results are presented to illustrate the benefit of the proposed decomposition algorithms.
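The overall flow can be sketched as follows: find the SCCs and their levels, then run value iteration component by component from the most downstream level upward, so that each restricted MDP only references already-solved values. The sketch below uses networkx's condensation (built on SCC detection of the Tarjan kind) and a deterministic one-action-per-edge toy MDP for brevity; the paper's variant computes the SCCs and levels in a single pass.

```python
import networkx as nx

# toy deterministic MDP: one action per edge, unit rewards, discount gamma
G = nx.DiGraph([(0, 1), (1, 0), (1, 2), (2, 3), (3, 2)])
cond = nx.condensation(G)                                # SCC DAG
order = list(reversed(list(nx.topological_sort(cond))))  # downstream SCCs first

gamma, reward = 0.9, {e: 1.0 for e in G.edges}
V = {s: 0.0 for s in G.nodes}
for comp in order:                        # one restricted MDP per component
    states = cond.nodes[comp]["members"]
    for _ in range(200):                  # value iteration inside the component
        for s in states:
            succ = list(G.successors(s))
            if succ:
                V[s] = max(reward[(s, t)] + gamma * V[t] for t in succ)
print(V)
```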
Heterogeneous Tensor Decomposition for Clustering via Manifold Optimization.
Sun, Yanfeng; Gao, Junbin; Hong, Xia; Mishra, Bamdev; Yin, Baocai
2016-03-01
Tensor clustering is an important tool that exploits the intrinsically rich structures in real-world multiway or tensor datasets. The standard practice in dealing with such datasets is subspace clustering based on vectorizing the multiway data. However, vectorization of tensorial data does not exploit the complete structure information. In this paper, we propose a subspace clustering algorithm that avoids any vectorization. Our approach is based on a novel heterogeneous Tucker decomposition model that takes cluster membership information into account. We propose a new clustering algorithm that alternates between the different modes of the proposed heterogeneous tensor model. All but the last mode have closed-form updates. Updating the last mode reduces to optimizing over the multinomial manifold, for which we investigate second-order Riemannian geometry and propose a trust-region algorithm. Numerical experiments show that our proposed algorithm competes effectively with state-of-the-art clustering algorithms that are based on tensor factorization.
NASA Astrophysics Data System (ADS)
Gatto, Paolo; Lipparini, Filippo; Stamm, Benjamin
2017-12-01
The domain-decomposition (dd) paradigm, originally introduced for the conductor-like screening model, has recently been extended to the dielectric Polarizable Continuum Model (PCM), resulting in the ddPCM method. We present here a complete derivation of the analytical derivatives of the ddPCM energy with respect to the positions of the solute's atoms and discuss their efficient implementation. As is the case for the energy, we observe a quadratic scaling, which is discussed and demonstrated with numerical tests.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vincenti, H.; Vay, J.-L.
Due to discretization effects and truncation to finite domains, many electromagnetic simulations present non-physical modifications of Maxwell's equations in space that may generate spurious signals affecting the overall accuracy of the result. Such modifications for instance occur when Perfectly Matched Layers (PMLs) are used at simulation domain boundaries to simulate open media. Another example is the use of an arbitrary-order Maxwell solver with a domain-decomposition technique, which may under some conditions involve stencil truncations at subdomain boundaries, resulting in small spurious errors that do eventually build up. In each case, a careful evaluation of the characteristics and magnitude of the errors resulting from these approximations, and their impact at any frequency and angle, requires detailed analytical and numerical studies. To this end, we present a general analytical approach that enables the evaluation of numerical discretization errors of fully three-dimensional, arbitrary-order finite-difference Maxwell solvers, with arbitrary modification of the local stencil in the simulation domain. The analytical model is validated against simulations of the domain decomposition technique and PMLs, when these are used with very high-order Maxwell solvers, as well as in the infinite-order limit of pseudo-spectral solvers. Results confirm that the new analytical approach enables exact predictions in each case. It also confirms that the domain decomposition technique can be used with very high-order Maxwell solvers and a reasonably low number of guard cells with negligible effects on the whole accuracy of the simulation.
Structural system identification based on variational mode decomposition
NASA Astrophysics Data System (ADS)
Bagheri, Abdollah; Ozbulut, Osman E.; Harris, Devin K.
2018-03-01
In this paper, a new structural identification method is proposed to identify the modal properties of engineering structures based on dynamic response decomposition using the variational mode decomposition (VMD). The VMD approach is a decomposition algorithm that was developed to overcome some of the drawbacks and limitations of the empirical mode decomposition method. The VMD-based modal identification algorithm decomposes the acceleration signal into a series of distinct modal responses and their respective center frequencies, such that, when combined, the cumulative modal responses reproduce the original acceleration response. The decaying amplitude of the extracted modal responses is then used to identify the modal damping ratios using a linear fitting function on the modal response data. Finally, after extracting modal responses from the available sensors, the mode shape vector for each of the decomposed modes in the system is identified from all obtained modal response data. To demonstrate the efficiency of the algorithm, a series of numerical, laboratory, and field case studies were evaluated. The laboratory case study utilized the vibration response of a three-story shear frame, whereas the field study leveraged the ambient vibration response of a pedestrian bridge to characterize the modal properties of the structure. The modal properties of the shear frame were computed using an analytical approach for comparison with the experimental modal frequencies. Results from these case studies demonstrate that the proposed method is efficient and accurate in identifying the modal data of structures.
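The damping-ratio step can be illustrated compactly: for a decomposed single-mode response, the analytic-signal envelope decays as exp(-zeta*omega_n*t), so a linear fit to its logarithm yields the damping ratio. The sketch below assumes a known natural frequency and clean synthetic data; it is a generic stand-in, not the paper's exact fitting procedure.

```python
import numpy as np
from scipy.signal import hilbert

def damping_from_modal_response(resp, fs, f_n):
    """Estimate the damping ratio of a single-mode response by a linear fit
    to the log of its analytic-signal (Hilbert) envelope."""
    env = np.abs(hilbert(resp))
    t = np.arange(len(resp)) / fs
    keep = env > 0.05 * env.max()          # ignore the noisy tail
    slope, _ = np.polyfit(t[keep], np.log(env[keep]), 1)
    return -slope / (2 * np.pi * f_n)      # zeta = decay rate / omega_n

fs, f_n, zeta = 200.0, 5.0, 0.02
t = np.arange(0, 10, 1 / fs)
resp = np.exp(-zeta * 2 * np.pi * f_n * t) * np.sin(2 * np.pi * f_n * t)
print(damping_from_modal_response(resp, fs, f_n))  # approximately 0.02
```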
Niang, Oumar; Thioune, Abdoulaye; El Gueirea, Mouhamed Cheikh; Deléchelle, Eric; Lemoine, Jacques
2012-09-01
The major problem with the empirical mode decomposition (EMD) algorithm is its lack of a theoretical framework, which makes the approach difficult to characterize and evaluate. In this paper, we propose, in the 2-D case, an alternative implementation of the algorithmic definition of the so-called "sifting process" used in Huang's original EMD method. This approach, based on partial differential equations (PDEs), was presented by Niang in previous works in 2005 and 2007, and relies on a nonlinear diffusion-based filtering process to solve the mean envelope estimation problem. In the 1-D case, the efficiency of the PDE-based method compared to the original EMD algorithmic version was illustrated in a recent paper. Several 2-D extensions of the EMD method have recently been proposed; despite some effort, these 2-D versions perform poorly and are very time consuming. In this paper, an extension of the PDE-based approach to 2-D space is therefore described extensively and applied to the decomposition of both signals and images. The obtained results confirm the usefulness of the new PDE-based sifting process for the decomposition of various kinds of data. The effectiveness of the approach encourages its use in a number of signal and image applications such as denoising, detrending, or texture analysis.
Inferring Gene Regulatory Networks by Singular Value Decomposition and Gravitation Field Algorithm
Zheng, Ming; Wu, Jia-nan; Huang, Yan-xin; Liu, Gui-xia; Zhou, You; Zhou, Chun-guang
2012-01-01
Reconstruction of gene regulatory networks (GRNs) is of utmost interest and has become a challenging computational problem in systems biology. However, every existing inference algorithm based on gene expression profiles has its own advantages and disadvantages; in particular, the effectiveness and efficiency of previous algorithms are not high enough. In this work, we propose a novel inference algorithm for gene expression data based on a differential equation model. The algorithm combines two methods for inferring GRNs. Before reconstructing the GRNs, the singular value decomposition method is used to decompose the gene expression data, determine the algorithm's solution space, and obtain all candidate solutions. Within this generated family of candidate solutions, a modified gravitation field algorithm is used to infer the GRNs, optimizing the criteria of the differential equation model and searching for the best network structure. The proposed algorithm is validated on both a simulated scale-free network and real benchmark gene regulatory networks from network databases. A Bayesian method and the traditional differential equation model were also used to infer GRNs, and their results were compared with those of the proposed algorithm; genetic algorithms and simulated annealing were likewise used to benchmark the gravitation field algorithm. The cross-validation results confirm the effectiveness of our algorithm, which significantly outperforms previous algorithms. PMID:23226565
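The SVD step can be made concrete: for an underdetermined steady-state linear model, the SVD yields a minimum-norm particular solution and a null-space basis, and every candidate network is the particular solution plus a null-space combination; this is the family over which the gravitation field search then operates. The toy sizes below are illustrative assumptions.

```python
import numpy as np
from scipy.linalg import null_space

# Steady-state linear model X w = b inferred from expression data: the SVD
# gives one particular solution plus a null-space family of candidate GRNs.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 10))     # 6 measurements, 10 genes -> underdetermined
b = rng.normal(size=6)

w_part = np.linalg.pinv(X) @ b   # minimum-norm particular solution (via SVD)
N = null_space(X)                # basis of the free directions, shape (10, 4)

# every candidate network: w = w_part + N @ alpha, for any coefficient vector
alpha = rng.normal(size=N.shape[1])
w_candidate = w_part + N @ alpha
print(np.allclose(X @ w_candidate, b))   # True: still fits the data exactly
```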
NASA Astrophysics Data System (ADS)
Lohmann, Timo
Electric sector models are powerful tools that guide policy makers and stakeholders. Long-term power generation expansion planning models are a prominent example and determine a capacity expansion for an existing power system over a long planning horizon. With the changes in the power industry away from monopolies and regulation, the focus of these models has shifted to competing electric companies maximizing their profit in a deregulated electricity market. In recent years, consumers have started to participate in demand response programs, actively influencing electricity load and price in the power system. We introduce a model that features investment and retirement decisions over a long planning horizon of more than 20 years, as well as an hourly representation of day-ahead electricity markets in which sellers of electricity face buyers. This combination makes our model both unique and challenging to solve. Decomposition algorithms, and especially Benders decomposition, can exploit the model structure. We present a novel method that can be seen as an alternative to generalized Benders decomposition and relies on dynamic linear overestimation. We prove its finite convergence and present computational results, demonstrating its superiority over traditional approaches. In certain special cases of our model, all necessary solution values in the decomposition algorithms can be calculated directly, and solving mathematical programming problems becomes entirely obsolete. This leads to highly efficient algorithms that drastically outperform their programming-problem-based counterparts. Furthermore, we discuss the implementation of all tailored algorithms and the challenges from a modeling software developer's standpoint, providing an insider's look into the modeling language GAMS. Finally, we apply our model to the Texas power system and design two electricity policies motivated by the U.S. Environmental Protection Agency's recently proposed CO2 emissions targets for the power sector.
Source and listener directivity for interactive wave-based sound propagation.
Mehra, Ravish; Antani, Lakulish; Kim, Sujeong; Manocha, Dinesh
2014-04-01
We present an approach to model dynamic, data-driven source and listener directivity for interactive wave-based sound propagation in virtual environments and computer games. Our directional source representation is expressed as a linear combination of elementary spherical harmonic (SH) sources. In the preprocessing stage, we precompute and encode the propagated sound fields due to each SH source. At runtime, we perform the SH decomposition of the varying source directivity interactively and compute the total sound field at the listener position as a weighted sum of precomputed SH sound fields. We propose a novel plane-wave decomposition approach based on higher-order derivatives of the sound field that enables dynamic HRTF-based listener directivity at runtime. We provide a generic framework to incorporate our source and listener directivity in any offline or online frequency-domain wave-based sound propagation algorithm. We have integrated our sound propagation system in Valve's Source game engine and use it to demonstrate realistic acoustic effects such as sound amplification, diffraction low-passing, scattering, localization, externalization, and spatial sound, generated by wave-based propagation of directional sources and listener in complex scenarios. We also present results from our preliminary user study.
Problem decomposition by mutual information and force-based clustering
NASA Astrophysics Data System (ADS)
Otero, Richard Edward
The scale of engineering problems has sharply increased over the last twenty years. Larger coupled systems, increasing complexity, and limited resources create a need for methods that automatically decompose problems into manageable sub-problems by discovering and leveraging problem structure. The ability to learn the coupling (inter-dependence) structure and reorganize the original problem could lead to large reductions in the time to analyze complex problems. Such decomposition methods could also provide engineering insight into the fundamental physics driving problem solution. This work advances the current state of the art in engineering decomposition through the application of techniques originally developed within computer science and information theory. The work describes the current state of automatic problem decomposition in engineering and utilizes several promising ideas to advance the state of the practice. Mutual information is a novel metric for data dependence and works on both continuous and discrete data. Mutual information can measure both the linear and non-linear dependence between variables, without the limitations of linear dependence measured through covariance. Mutual information is also able to handle data that lacks derivative information, unlike other metrics that require it. The value of mutual information to engineering design work is demonstrated on a planetary entry problem, using a novel tool developed in this work for planetary entry system synthesis. A graphical method, force-based clustering, is used to discover related sub-graph structure as a function of problem structure and links ranked by their mutual information. This method does not require the stochastic use of neural networks and could be used with any link ranking method currently utilized in the field. Application of this method is demonstrated on a large, coupled low-thrust trajectory problem. Mutual information also serves as the basis for an alternative global optimizer, called MIMIC, which is unrelated to Genetic Algorithms. This work demonstrates the use of MIMIC as a global method that explicitly models problem structure with mutual information, providing an alternative method for globally searching multi-modal domains. By leveraging discovered problem inter-dependencies, MIMIC may be appropriate for highly coupled problems or those with large function evaluation cost. This work introduces a useful addition to the MIMIC algorithm that enables its use on continuous input variables. By leveraging automatic decision tree generation methods from machine learning and a set of randomly generated test problems, decision trees for which method to apply are also created, quantifying decomposition performance over a large region of the design space.
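A minimal histogram-based estimator shows why mutual information captures couplings that covariance misses; the bin count below is an assumption, and real applications would need bias correction for small samples.

```python
import numpy as np

def mutual_information(x, y, bins=32):
    """Histogram estimate of I(X;Y) in nats; detects non-linear dependence
    that a covariance-based measure would report as zero."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy /= pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of X
    py = pxy.sum(axis=0, keepdims=True)   # marginal of Y
    nz = pxy > 0
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 100_000)
print(mutual_information(x, rng.normal(size=x.size)))  # near 0: independent
print(mutual_information(x, x**2))   # large, although cov(x, x**2) is near 0
```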
Performance evaluation of canny edge detection on a tiled multicore architecture
NASA Astrophysics Data System (ADS)
Brethorst, Andrew Z.; Desai, Nehal; Enright, Douglas P.; Scrofano, Ronald
2011-01-01
In the last few years, a variety of multicore architectures have been used to parallelize image processing applications. In this paper, we focus on assessing the parallel speed-ups of different Canny edge detection parallelization strategies on the Tile64, a tiled multicore architecture developed by the Tilera Corporation. These strategies include different ways Canny edge detection can be parallelized, as well as differences in data management. The two parallelization strategies examined were loop-level parallelism and domain decomposition. Loop-level parallelism is achieved through the use of OpenMP, and it is capable of parallelizing across the range of values over which a loop iterates. Domain decomposition is the process of breaking down an image into subimages, where each subimage is processed independently, in parallel. The results of the two strategies show that, for the same number of threads, the programmer-implemented domain decomposition exhibits higher speed-ups than the compiler-managed loop-level parallelism implemented with OpenMP.
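A simplified sketch of the domain-decomposition strategy: split the image into row tiles with a halo wide enough for the Gaussian smoothing, run the edge detector on each tile independently (in parallel on a real multicore run), and crop the halos before reassembly. One caveat: per-tile hysteresis thresholds can differ slightly from a global Canny pass; skimage's canny stands in for the paper's implementation.

```python
import numpy as np
from skimage import feature

def canny_by_domain_decomposition(image, n_tiles=4, sigma=1.5):
    """Row-wise domain decomposition with halo exchange: each tile is
    processed independently and the halos are cropped before reassembly."""
    halo = int(4 * sigma) + 1                 # enough context for the smoothing
    bounds = np.linspace(0, image.shape[0], n_tiles + 1, dtype=int)
    pieces = []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        lo_h, hi_h = max(0, lo - halo), min(image.shape[0], hi + halo)
        edges = feature.canny(image[lo_h:hi_h], sigma=sigma)
        pieces.append(edges[lo - lo_h : edges.shape[0] - (hi_h - hi)])
    return np.vstack(pieces)

img = np.zeros((256, 256)); img[64:192, 64:192] = 1.0
edges = canny_by_domain_decomposition(img)    # same shape as the input image
```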
NASA Astrophysics Data System (ADS)
Negrello, Camille; Gosselet, Pierre; Rey, Christian
2018-05-01
An efficient method for solving large nonlinear problems combines Newton solvers and Domain Decomposition Methods (DDM). In the DDM framework, the boundary conditions can be chosen to be primal, dual, or mixed. The mixed approach has the advantage of admitting an optimal interface parameter (often called impedance) that can increase the convergence rate. The optimal value of this parameter is usually too expensive to compute exactly in practice: an approximation has to be sought, along with a compromise between efficiency and computational cost. In the context of parallel algorithms for solving nonlinear structural mechanics problems, we propose a new heuristic for the impedance that combines short- and long-range effects at a low computational cost.
Parallel DSMC Solution of Three-Dimensional Flow Over a Finite Flat Plate
NASA Technical Reports Server (NTRS)
Nance, Robert P.; Wilmoth, Richard G.; Moon, Bongki; Hassan, H. A.; Saltz, Joel
1994-01-01
This paper describes a parallel implementation of the direct simulation Monte Carlo (DSMC) method. Runtime library support is used for scheduling and execution of communication between nodes, and domain decomposition is performed dynamically to maintain a good load balance. Performance tests are conducted using the code to evaluate various remapping and remapping-interval policies, and it is shown that a one-dimensional chain-partitioning method works best for the problems considered. The parallel code is then used to simulate the Mach 20 nitrogen flow over a finite-thickness flat plate. It is shown that the parallel algorithm produces results which compare well with experimental data. Moreover, it yields significantly faster execution times than the scalar code, as well as very good load-balance characteristics.
Underdetermined blind separation of three-way fluorescence spectra of PAHs in water.
Yang, Ruifang; Zhao, Nanjing; Xiao, Xue; Zhu, Wei; Chen, Yunan; Yin, Gaofang; Liu, Jianguo; Liu, Wenqing
2018-06-15
In this work, an underdetermined blind decomposition method is developed to recognize individual components from the three-way fluorescence spectra of their mixtures by using sparse component analysis (SCA). The mixing matrix is estimated from the mixtures using a fuzzy data clustering algorithm together with the scatters corresponding to local energy maxima in the time-frequency domain, and the spectra of the individual components are recovered by the pseudo-inverse technique. As an example, three and four pure-component spectra are blindly extracted from two mixture samples using this method, with similarities between resolved and reference spectra all above 0.80. This work opens a new and effective path to monitoring PAHs in water by the three-way fluorescence spectroscopy technique.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jalas, S.; Dornmair, I.; Lehe, R.
Particle-in-Cell (PIC) simulations are a widely used tool for the investigation of both laser- and beam-driven plasma acceleration. It is a known issue that the beam quality can be artificially degraded by numerical Cherenkov radiation (NCR), resulting primarily from an incorrectly modeled dispersion relation. Pseudo-spectral solvers featuring infinite-order stencils can strongly reduce NCR - or even suppress it - and are therefore well suited to correctly model the beam properties. For efficient parallelization of the PIC algorithm, however, localized solvers are inevitable. Arbitrary-order pseudo-spectral methods provide this needed locality. Yet, these methods can again be prone to NCR. In this paper, we show that acceptably low solver orders are sufficient to correctly model the physics of interest, while allowing for parallel computation by domain decomposition.
FACETS: multi-faceted functional decomposition of protein interaction networks.
Seah, Boon-Siew; Bhowmick, Sourav S; Dewey, C Forbes
2012-10-15
The availability of large-scale curated protein interaction datasets has given rise to the opportunity to investigate higher-level organization and modularity within the protein-protein interaction (PPI) network using graph-theoretic analysis. Despite recent progress, systems-level analysis of high-throughput PPIs remains a daunting task because of the amount of data they present. In this article, we propose a novel PPI network decomposition algorithm called FACETS in order to make sense of the deluge of interaction data using Gene Ontology (GO) annotations. FACETS finds not just a single functional decomposition of the PPI network, but a multi-faceted atlas of functional decompositions that portray alternative perspectives of the functional landscape of the underlying PPI network. Each facet in the atlas represents a distinct interpretation of how the network can be functionally decomposed and organized. Our algorithm maximizes the interpretative value of the atlas by optimizing inter-facet orthogonality and intra-facet cluster modularity. We tested our algorithm on the global networks from IntAct, compared it with gold standard datasets from MIPS and KEGG, and demonstrated the performance of FACETS. We also performed a case study that illustrates the utility of our approach. Supplementary data are available at Bioinformatics online. Our software is available freely for non-commercial purposes from: http://www.cais.ntu.edu.sg/~assourav/Facets/
A Non-Overlapping Discretization Method for Partial Differential Equations
NASA Astrophysics Data System (ADS)
Rosas-Medina, A.; Herrera, I.
2013-05-01
Mathematical models of many systems of interest, including very important continuous systems of engineering and science, lead to a great variety of partial differential equations whose solution methods are based on the computational processing of large-scale algebraic systems. Furthermore, the incredible expansion experienced by the existing computational hardware and software has made problems of ever-increasing diversity and complexity, posed by engineering and scientific applications, amenable to effective treatment. The emergence of parallel computing prompted a continued and systematic effort on the part of the computational-modeling community to harness it for solving boundary-value problems (BVPs) of partial differential equations. Very early in this effort, it was recognized that domain decomposition methods (DDM) were the most effective technique for applying parallel computing to the solution of partial differential equations, since such an approach drastically simplifies the coordination of the many processors that carry out the different tasks and also greatly reduces the requirements for information transmission between them. Ideally, DDMs aim to produce algorithms that fulfill the DDM paradigm, i.e., such that "the global solution is obtained by solving local problems defined separately in each subdomain of the coarse mesh (or domain decomposition)". Stated simply, the basic idea is that, when the DDM paradigm is satisfied, full parallelization can be achieved by assigning each subdomain to a different processor. When intensive DDM research began, much attention was given to overlapping DDMs, but attention soon shifted to non-overlapping DDMs. This evolution seems natural when the DDM paradigm is taken into account: it is easier to uncouple the local problems when the subdomains are separated. However, an important limitation of non-overlapping domain decompositions, as that concept is usually understood today, is that interface nodes are shared by two or more subdomains of the coarse mesh; therefore, even non-overlapping DDMs are actually overlapping when seen from the perspective of the nodes used in the discretization. In this talk we present and discuss a discretization method in which the nodes used are non-overlapping, in the sense that each one of them belongs to one and only one subdomain of the coarse mesh.
Application of higher-order cepstral techniques in problems of fetal heart signal extraction
NASA Astrophysics Data System (ADS)
Sabry-Rizk, Madiha; Zgallai, Walid; Hardiman, P.; O'Riordan, J.
1996-10-01
Recently, cepstral analysis based on second-order statistics and homomorphic filtering techniques has been used in the adaptive decomposition of overlapping (or otherwise) and noise-contaminated ECG complexes of mothers and fetuses, obtained by transabdominal surface electrodes connected to a monitoring instrument, an interface card, and a PC. Differential time delays of fetal heart beats, measured from a reference point located on the maternal complex after transformation to the cepstral domain, are first obtained, followed by fetal heart rate variability computations. Homomorphic filtering in the complex cepstral domain and the subsequent transformation to the time domain result in fetal complex recovery. However, three problems have been identified with second-order cepstral techniques that need rectification in this paper. These are: (1) errors resulting from the phase unwrapping algorithms, leading to fetal complex perturbation; (2) the unavoidable conversion of noise statistics from Gaussian to non-Gaussian, due to the highly nonlinear nature of the homomorphic transform, which warrants stringent noise cancellation routines; (3) owing to the problems in (1) and (2), it is difficult to adaptively optimize windows to include all individual fetal complexes in the time domain based on amplitude-thresholding routines in the complex cepstral domain (i.e., the task of 'zooming' in on weak fetal complexes requires more processing time). The use of a third-order-based high-resolution differential cepstrum technique results in recovery of delays of the order of 120 milliseconds.
Kannan, R; Ievlev, A V; Laanait, N; Ziatdinov, M A; Vasudevan, R K; Jesse, S; Kalinin, S V
2018-01-01
Many spectral responses in materials science, physics, and chemistry experiments can be characterized as resulting from the superposition of a number of more basic individual spectra. In this context, unmixing is defined as the problem of determining the individual spectra, given measurements of multiple spectra that are spatially resolved across samples, as well as the determination of the corresponding abundance maps indicating the local weighting of each individual spectrum. Matrix factorization is a popular linear unmixing technique that considers that the mixture model between the individual spectra and the spatial maps is linear. Here, we present a tutorial paper targeted at domain scientists to introduce linear unmixing techniques, to facilitate greater understanding of spectroscopic imaging data. We detail a matrix factorization framework that can incorporate different domain information through various parameters of the matrix factorization method. We demonstrate many domain-specific examples to explain the expressivity of the matrix factorization framework and show how the appropriate use of domain-specific constraints such as non-negativity and sum-to-one abundance result in physically meaningful spectral decompositions that are more readily interpretable. Our aim is not only to explain the off-the-shelf available tools, but to add additional constraints when ready-made algorithms are unavailable for the task. All examples use the scalable open source implementation from https://github.com/ramkikannan/nmflibrary that can run from small laptops to supercomputers, creating a user-wide platform for rapid dissemination and adoption across scientific disciplines.
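In the spirit of the tutorial, here is a minimal unmixing example with scikit-learn's NMF (not the nmflibrary implementation the paper distributes): non-negativity comes from the factorization itself, while the sum-to-one abundance constraint is imposed by post-hoc normalization. The synthetic Gaussian endmembers are assumptions for illustration.

```python
import numpy as np
from sklearn.decomposition import NMF

# Synthetic spectral-imaging unmixing: 500 pixels, 100 spectral channels,
# mixtures of 3 non-negative endmember spectra.
rng = np.random.default_rng(0)
channels = np.linspace(0, 1, 100)
endmembers = np.stack([np.exp(-0.5 * ((channels - c) / 0.05) ** 2)
                       for c in (0.25, 0.5, 0.75)])          # (3, 100)
abundances = rng.dirichlet(np.ones(3), size=500)             # (500, 3), sum to one
data = abundances @ endmembers + 0.01 * rng.random((500, 100))

model = NMF(n_components=3, init="nndsvda", max_iter=500)
W = model.fit_transform(data)      # abundance maps (non-negative)
H = model.components_              # recovered endmember spectra
W_norm = W / W.sum(axis=1, keepdims=True)   # sum-to-one imposed post hoc
```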
NASA Astrophysics Data System (ADS)
Forsén, R.; Ghafoor, N.; Odén, M.
2013-12-01
A concept to improve hardness and thermal stability of unstable multilayer alloys is presented based on control of the coherency strain such that the driving force for decomposition is favorably altered. Cathodic arc evaporated cubic TiCrAlN/Ti1-xCrxN multilayer coatings are used as demonstrators. Upon annealing, the coatings undergo spinodal decomposition into nanometer-sized coherent Ti- and Al-rich cubic domains which is affected by the coherency strain. In addition, the growth of the domains is restricted by the surrounding TiCrN layer compared to a non-layered TiCrAlN coating which together results in an improved thermal stability of the cubic structure. A significant hardness increase is seen during decomposition for the case with high coherency strain while a low coherency strain results in a hardness decrease for high annealing temperatures. The metal diffusion paths during the domain coarsening are affected by strain which in turn is controlled by the Cr-content (x) in the Ti1-xCrxN layers. For x = 0 the diffusion occurs both parallel and perpendicular to the growth direction but for x > =0.9 the diffusion occurs predominantly parallel to the growth direction. Altogether this study shows a structural tool to alter and fine-tune high temperature properties of multicomponent materials.
NASA Astrophysics Data System (ADS)
Dana, Saumik; Ganis, Benjamin; Wheeler, Mary F.
2018-01-01
In coupled flow and poromechanics phenomena representing hydrocarbon production or CO2 sequestration in deep subsurface reservoirs, the spatial domain in which fluid flow occurs is usually much smaller than the spatial domain over which significant deformation occurs. The typical approach is to either impose an overburden pressure directly on the reservoir thus treating it as a coupled problem domain or to model flow on a huge domain with zero permeability cells to mimic the no flow boundary condition on the interface of the reservoir and the surrounding rock. The former approach precludes a study of land subsidence or uplift and further does not mimic the true effect of the overburden on stress sensitive reservoirs whereas the latter approach has huge computational costs. In order to address these challenges, we augment the fixed-stress split iterative scheme with upscaling and downscaling operators to enable modeling flow and mechanics on overlapping nonmatching hexahedral grids. Flow is solved on a finer mesh using a multipoint flux mixed finite element method and mechanics is solved on a coarse mesh using a conforming Galerkin method. The multiscale operators are constructed using a procedure that involves singular value decompositions, a surface intersections algorithm and Delaunay triangulations. We numerically demonstrate the convergence of the augmented scheme using the classical Mandel's problem solution.
Navarro, Pedro J; Fernández-Isla, Carlos; Alcover, Pedro María; Suardíaz, Juan
2016-07-27
This paper presents a robust method for defect detection in textures, entropy-based automatic selection of the wavelet decomposition level (EADL), based on a wavelet reconstruction scheme, for detecting defects in a wide variety of structural and statistical textures. Two main features are presented. One of the new features is an original use of the normalized absolute function value (NABS) calculated from the wavelet coefficients derived at various different decomposition levels in order to identify textures where the defect can be isolated by eliminating the texture pattern in the first decomposition level. The second is the use of Shannon's entropy, calculated over detail subimages, for automatic selection of the band for image reconstruction, which, unlike other techniques, such as those based on the co-occurrence matrix or on energy calculation, provides a lower decomposition level, thus avoiding excessive degradation of the image, allowing a more accurate defect segmentation. A metric analysis of the results of the proposed method with nine different thresholding algorithms determined that selecting the appropriate thresholding method is important to achieve optimum performance in defect detection. As a consequence, several different thresholding algorithms depending on the type of texture are proposed.
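A generic stand-in for the entropy-based level selection: compute Shannon's entropy over the detail sub-images at each level and pick the level minimizing it. The wavelet, the scoring rule, and the fact that the returned index counts from the coarsest level in PyWavelets' ordering are simplifying assumptions rather than the paper's exact EADL rule.

```python
import numpy as np
import pywt

def shannon_entropy(c):
    """Shannon entropy of the normalized coefficient magnitudes."""
    p = np.abs(c).ravel()
    p = p / p.sum() if p.sum() > 0 else p
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def select_level_by_entropy(image, wavelet="haar", max_level=4):
    """Score each decomposition level by the entropy of its detail
    sub-images and return the index of the lowest-entropy level."""
    coeffs = pywt.wavedec2(image, wavelet, level=max_level)
    scores = [sum(shannon_entropy(d) for d in details)
              for details in coeffs[1:]]        # one score per level
    return int(np.argmin(scores)) + 1           # counted from the coarsest level

img = np.kron(np.indices((8, 8)).sum(0) % 2, np.ones((16, 16)))  # checkerboard
print(select_level_by_entropy(img))
```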
Numeric Modified Adomian Decomposition Method for Power System Simulations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dimitrovski, Aleksandar D; Simunovic, Srdjan; Pannala, Sreekanth
This paper investigates the applicability of the numeric Wazwaz-El-Sayed modified Adomian Decomposition Method (WES-ADM) for time-domain simulation of power systems. WES-ADM is a numerical approximation method, based on a modified Adomian decomposition (ADM) technique, for the solution of nonlinear ordinary differential equations. The nonlinear terms in the differential equations are approximated using Adomian polynomials. In this paper WES-ADM is applied to time-domain simulations of multimachine power systems. The WECC 3-generator, 9-bus system and the IEEE 10-generator, 39-bus system have been used to test the applicability of the approach. Several fault scenarios have been tested. It has been found that the proposed approach is faster than the trapezoidal method with comparable accuracy.
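A small symbolic sketch of the plain ADM machinery that WES-ADM modifies: for y' = -y^2, y(0) = 1 (exact solution 1/(1+t)), each new series term integrates the Adomian polynomial of the nonlinearity, generated from the derivative definition A_n = (1/n!) d^n/d(lambda)^n f(sum lambda^k y_k) at lambda = 0. This is textbook ADM under illustrative assumptions, not the Wazwaz-El-Sayed variant itself.

```python
import sympy as sp

# ADM for y' = -y**2, y(0) = 1: each new term integrates the Adomian
# polynomial A_n of the nonlinearity f(y) = y**2.
t, lam, tau = sp.symbols("t lambda tau")
f = lambda y: y**2
ys = [sp.Integer(1)]                     # y_0 carries the initial condition
for n in range(4):
    phi = sum(lam**k * ys[k] for k in range(n + 1))
    A_n = sp.diff(f(phi), lam, n).subs(lam, 0) / sp.factorial(n)
    ys.append(-sp.integrate(A_n.subs(t, tau), (tau, 0, t)))
approx = sp.expand(sum(ys))
print(approx)                            # 1 - t + t**2 - t**3 + t**4
print(sp.series(1 / (1 + t), t, 0, 5))   # matches the exact solution's series
```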
An overlapped grid method for multigrid, finite volume/difference flow solvers: MaGGiE
NASA Technical Reports Server (NTRS)
Baysal, Oktay; Lessard, Victor R.
1990-01-01
The objective is to develop a domain decomposition method via overlapping/embedding the component grids, which is to be used by upwind, multi-grid, finite volume solution algorithms. A computer code, given the name MaGGiE (Multi-Geometry Grid Embedder), is developed to meet this objective. MaGGiE takes independently generated component grids as input and automatically constructs the composite mesh and interpolation data, which can be used by the finite volume solution methods with or without multigrid convergence acceleration. Six demonstrative examples showing various aspects of the overlap technique are presented and discussed. These cases are used for developing the procedure for overlapping grids of different topologies, and to evaluate the grid connection and interpolation data for finite volume calculations on a composite mesh. Time fluxes are transferred between mesh interfaces using a trilinear interpolation procedure. Conservation losses are minimal at the interfaces using this method. The multigrid solution algorithm, using the coarser grid connections, improves the convergence time history as compared to the solution on the composite mesh without multigridding.
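The interface transfer relies on standard trilinear interpolation, which reproduces linear fields exactly and therefore keeps interpolation losses small on smooth data; a minimal node-centred sketch (not MaGGiE's actual donor-cell search) is shown below.

```python
import numpy as np

def trilinear(field, p):
    """Interpolate a node-centred 3-D field at fractional index p = (x, y, z);
    the weights are the standard tensor-product hat functions."""
    i = np.floor(p).astype(int)
    f = p - i
    acc = 0.0
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                w = ((1 - f[0]) if dx == 0 else f[0]) \
                  * ((1 - f[1]) if dy == 0 else f[1]) \
                  * ((1 - f[2]) if dz == 0 else f[2])
                acc += w * field[i[0] + dx, i[1] + dy, i[2] + dz]
    return acc

grid = np.fromfunction(lambda x, y, z: x + 2 * y + 3 * z, (8, 8, 8))
print(trilinear(grid, np.array([2.5, 3.25, 1.75])))  # exact for linear fields: 14.25
```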
Polarizable Molecular Dynamics in a Polarizable Continuum Solvent
Lipparini, Filippo; Lagardère, Louis; Raynaud, Christophe; Stamm, Benjamin; Cancès, Eric; Mennucci, Benedetta; Schnieders, Michael; Ren, Pengyu; Maday, Yvon; Piquemal, Jean-Philip
2015-01-01
We present for the first time scalable polarizable molecular dynamics (MD) simulations within a polarizable continuum solvent with molecular-shape cavities and exact solution of the mutual polarization. The key ingredients are a very efficient algorithm for solving the equations associated with the polarizable continuum, in particular the domain decomposition Conductor-like Screening Model (ddCOSMO), a rigorous coupling of the continuum with the polarizable force field achieved through a robust variational formulation, and an effective strategy to solve the coupled equations. The coupling of ddCOSMO with non-variational force fields, including AMOEBA, is also addressed. The MD simulations are feasible, for real-life systems, on standard cluster nodes; a scalable parallel implementation allows for further speed-up in the context of a newly developed module in Tinker, named Tinker-HP. NVE simulations are stable and long-term energy conservation can be achieved. This paper focuses on the methodological developments, the analysis of the algorithm, and the stability of the simulations; a proof-of-concept application is also presented to attest to the possibilities of this newly developed technique. PMID:26516318
Research in Parallel Algorithms and Software for Computational Aerosciences
NASA Technical Reports Server (NTRS)
Domel, Neal D.
1996-01-01
Phase I is complete for the development of a Computational Fluid Dynamics parallel code with automatic grid generation and adaptation for the Euler analysis of flow over complex geometries. SPLITFLOW, an unstructured Cartesian grid code developed at Lockheed Martin Tactical Aircraft Systems, has been modified for a distributed memory/massively parallel computing environment. The parallel code is operational on an SGI network, Cray J90 and C90 vector machines, SGI Power Challenge, and Cray T3D and IBM SP2 massively parallel machines. Parallel Virtual Machine (PVM) is the message passing protocol for portability to various architectures. A domain decomposition technique was developed which enforces dynamic load balancing to improve solution speed and memory requirements. A host/node algorithm distributes the tasks. The solver parallelizes very well, and scales with the number of processors. Partially parallelized and non-parallelized tasks consume most of the wall clock time in a very fine grain environment. Timing comparisons on a Cray C90 demonstrate that Parallel SPLITFLOW runs 2.4 times faster on 8 processors than its non-parallel counterpart autotasked over 8 processors.
Seismpol: a Visual-Basic computer program for interactive and automatic earthquake waveform analysis
NASA Astrophysics Data System (ADS)
Patanè, Domenico; Ferrari, Ferruccio
1997-11-01
A Microsoft Visual-Basic computer program for waveform analysis of seismic signals is presented. The program combines interactive and automatic processing of digital signals using data recorded by three-component seismic stations. The analysis procedure can be used either in interactive earthquake analysis or in automatic on-line processing of seismic recordings. The algorithm works in the time domain using the Covariance Matrix Decomposition (CMD) method, so that polarization characteristics may be computed continuously in real time and seismic phases can be identified and discriminated. Visual inspection of the particle motion in orthogonal planes of projection (hodograms) reduces the danger of misinterpretation arising from the application of the polarization filter. The choice of time window and frequency intervals improves the quality of the extracted polarization information. The program uses a band-pass Butterworth filter to process the signals in the frequency domain, analyzing a selected signal window in a series of narrow frequency bands. Significant results, supported by well-defined polarizations and source azimuth estimates for P and S phases, are also obtained for short-period seismic events (local microearthquakes).
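The CMD step amounts to an eigen-decomposition of the 3-component covariance matrix within a sliding window: the dominant eigenvector gives the polarization direction, and the eigenvalue spread gives a rectilinearity measure. The sketch below uses one common rectilinearity definition and synthetic data; it is a stand-in for, not a copy of, the program's filter, and the recovered azimuth is ambiguous by 180 degrees.

```python
import numpy as np

def polarization(z, n, e):
    """Eigenstructure of the 3-component covariance over one time window:
    returns rectilinearity and the apparent azimuth of the principal motion."""
    X = np.vstack([z, n, e])                 # rows: vertical, north, east
    C = np.cov(X)
    w, V = np.linalg.eigh(C)                 # ascending eigenvalues
    rectilinearity = 1.0 - (w[1] + w[0]) / (2 * w[2])
    u = V[:, 2]                              # principal polarization vector
    azimuth = np.degrees(np.arctan2(u[2], u[1])) % 360.0  # from N and E parts
    return rectilinearity, azimuth

rng = np.random.default_rng(0)
sig = np.sin(2 * np.pi * 5 * np.linspace(0, 1, 500))
z, n, e = 0.8 * sig, 0.5 * sig, 0.3 * sig    # rectilinear "P arrival"
noise = lambda: 0.01 * rng.normal(size=500)
print(polarization(z + noise(), n + noise(), e + noise()))
```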
NASA Technical Reports Server (NTRS)
Lyster, P. M.; Liewer, P. C.; Decyk, V. K.; Ferraro, R. D.
1995-01-01
A three-dimensional electrostatic particle-in-cell (PIC) plasma simulation code has been developed on coarse-grain distributed-memory massively parallel computers with message-passing communications. Our implementation is the generalization to three dimensions of the general concurrent particle-in-cell (GCPIC) algorithm. In the GCPIC algorithm, the particle computation is divided among the processors using a domain decomposition of the simulation domain. In a three-dimensional simulation, the domain can be partitioned into one-, two-, or three-dimensional subdomains ("slabs," "rods," or "cubes") and we investigate the efficiency of the parallel implementation of the push for all three choices. The present implementation runs on the Intel Touchstone Delta machine at Caltech, a multiple-instruction-multiple-data (MIMD) parallel computer with 512 nodes. We find that the parallel efficiency of the push is very high, with the ratio of communication to computation time in the range 0.3%-10.0%. The highest efficiency (>99%) occurs for a large, scaled problem with 64(sup 3) particles per processing node (approximately 134 million particles on 512 nodes), which has a push time of about 250 ns per particle per time step. We have also developed expressions for the timing of the code which are functions of both code parameters (number of grid points, particles, etc.) and machine-dependent parameters (effective FLOP rate, and the effective interprocessor bandwidths for the communication of particles and grid points). These expressions can be used to estimate the performance of scaled problems--including those with inhomogeneous plasmas--on other parallel machines once the machine-dependent parameters are known.
Topology Optimization - Engineering Contribution to Architectural Design
NASA Astrophysics Data System (ADS)
Tajs-Zielińska, Katarzyna; Bochenek, Bogdan
2017-10-01
The idea of topology optimization is to find, within a considered design domain, the distribution of material that is optimal in some sense. During the optimization process, material is redistributed and parts that are not necessary from the objective's point of view are removed. The result is a solid/void structure for which an objective function is minimized. This paper presents an application of topology optimization to multi-material structures. The design domain defined by the shape of a structure is divided into sub-regions, to which different materials are assigned. During the design process material is relocated, but only within its selected region. The proposed idea has been inspired by architectural designs such as multi-material facades of buildings. The effectiveness of topology optimization is determined by the proper choice of numerical optimization algorithm. This paper utilizes a very efficient heuristic method called Cellular Automata. Cellular Automata are a mathematical, discrete idealization of physical systems. An engineering implementation of Cellular Automata requires decomposition of the design domain into a uniform lattice of cells. It is assumed that interaction between cells takes place only within neighbouring cells, governed by simple, local update rules based on heuristics or physical laws. Numerical studies show that this method can be an attractive alternative to traditional gradient-based algorithms. The proposed approach is evaluated on selected numerical examples of multi-material bridge structures, for which various material configurations are examined. The numerical studies demonstrate a significant influence of the location of material sub-regions on the final topologies. The influence of the assumed volume fraction on the final topologies of multi-material structures is also observed and discussed. The results of the numerical calculations show that this approach produces different results compared with classical one-material problems.
Teodoro, Douglas; Lovis, Christian
2013-01-01
Background Antibiotic resistance is a major worldwide public health concern. In clinical settings, timely antibiotic resistance information is key for care providers as it allows appropriate targeted treatment or improved empirical treatment when the specific results of the patient are not yet available. Objective To improve antibiotic resistance trend analysis algorithms by building a novel, fully data-driven forecasting method from the combination of trend extraction and machine learning models for enhanced biosurveillance systems. Methods We investigate a robust model for extraction and forecasting of antibiotic resistance trends using a decade of microbiology data. Our method consists of breaking down the resistance time series into independent oscillatory components via the empirical mode decomposition technique. The resulting waveforms describing intrinsic resistance trends serve as the input for the forecasting algorithm. The algorithm applies the delay coordinate embedding theorem together with the k-nearest neighbor framework to project mappings from past events into the future dimension and estimate the resistance levels. Results The algorithms that decompose the resistance time series and filter out high frequency components showed statistically significant performance improvements in comparison with a benchmark random walk model. We present further qualitative use-cases of antibiotic resistance trend extraction, where empirical mode decomposition was applied to highlight the specificities of the resistance trends. Conclusion The decomposition of the raw signal was found not only to yield valuable insight into the resistance evolution, but also to produce novel models of resistance forecasters with boosted prediction performance, which could be utilized as a complementary method in the analysis of antibiotic resistance trends. PMID:23637796
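The forecasting core can be sketched briefly: delay-coordinate vectors are built from the (decomposed) resistance series, and a k-nearest-neighbour regressor maps past embedded states to future values. The embedding dimension, lag, and k below are illustrative assumptions; the paper selects such settings from the data and applies the scheme to each empirical-mode component.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

def knn_forecast(series, dim=4, lag=1, horizon=1, k=5):
    """Delay-coordinate embedding plus k-nearest-neighbour regression:
    predict `horizon` steps ahead from the last embedded state."""
    span = (dim - 1) * lag
    X = np.array([series[i : i + span + 1 : lag]
                  for i in range(len(series) - span - horizon)])
    y = series[span + horizon :]
    model = KNeighborsRegressor(n_neighbors=k).fit(X, y)
    last = series[len(series) - span - 1 :: lag][:dim]   # current state
    return float(model.predict([last])[0])

t = np.arange(600)
series = np.sin(2 * np.pi * t / 50) + 0.05 * np.random.default_rng(0).normal(size=t.size)
print(knn_forecast(series))   # close to the sine's next value
```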
Dey, Nilanjan; Bose, Soumyo; Das, Achintya; Chaudhuri, Sheli Sinha; Saba, Luca; Shafique, Shoaib; Nicolaides, Andrew; Suri, Jasjit S
2016-04-01
Embedding of diagnostic and health care information requires secure encryption and watermarking. This research paper presents a comprehensive study of the behavior of some well-established frequency-domain watermarking algorithms for the preservation of stroke-based diagnostic parameters. Two different sets of watermarking algorithms, namely two correlation-based (binary logo hiding) and two singular value decomposition (SVD)-based (gray logo hiding) watermarking algorithms, are used for embedding an ownership logo. The diagnostic parameters in atherosclerotic plaque ultrasound video are: (a) bulb identification and recognition, which consists of identifying the bulb edge points in the far and near carotid walls; (b) carotid bulb diameter; and (c) carotid lumen thickness all along the carotid artery. The tested data set consists of carotid atherosclerotic movies taken under IRB protocol from the University of Indiana Hospital, USA-AtheroPoint™ (Roseville, CA, USA) joint pilot study. ROC (receiver operating characteristic) analysis performed on the bulb detection process showed an accuracy and a sensitivity of 100% each. The diagnostic preservation (DPsystem) for the SVD-based approach was above 99% with PSNR (peak signal-to-noise ratio) above 41, ensuring that the diagnostic parameters were not degraded by watermarking. Thus, the fully automated proposed system proved to be an efficient method for watermarking atherosclerotic ultrasound video for stroke applications.
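For background, SVD-based gray-logo embedding typically works as in this hedged sketch of the classic singular-value perturbation scheme (not necessarily the exact variant tested in the paper): the logo perturbs the frame's singular values, and extraction inverts the construction using side information kept as keys.

```python
import numpy as np

def svd_embed(frame, logo, alpha=0.05):
    """Embed a gray-scale logo by perturbing the frame's singular values;
    returns the watermarked frame plus the keys needed for extraction."""
    U, S, Vt = np.linalg.svd(frame, full_matrices=False)
    Uw, Sw, Vtw = np.linalg.svd(np.diag(S) + alpha * logo, full_matrices=False)
    return U @ np.diag(Sw) @ Vt, (Uw, Vtw, S)

def svd_extract(marked, keys, alpha=0.05):
    """Recover the logo from the marked frame and the stored keys."""
    Uw, Vtw, S = keys
    Sw = np.linalg.svd(marked, compute_uv=False)
    return (Uw @ np.diag(Sw) @ Vtw - np.diag(S)) / alpha

rng = np.random.default_rng(0)
frame = rng.random((64, 64))
logo = (rng.random((64, 64)) > 0.5).astype(float)
marked, keys = svd_embed(frame, logo)
print(np.abs(svd_extract(marked, keys) - logo).mean())  # near-zero residual
```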
Parallel text rendering by a PostScript interpreter
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kritskii, S.P.; Zastavnoi, B.A.
1994-11-01
The most radical method of increasing the performance of devices controlled by PostScript interpreters may be the use of multiprocessor controllers. This paper presents a method for parallelizing the operation of a PostScript interpreter for rendering text. The proposed method is based on decomposition of the outlines of letters into horizontal strips covering equal areas. The subroutines thus obtained are distributed to the processors in a network and then filled in by conventional sequential algorithms. A special algorithm has been developed for dividing the outlines of characters into subroutines so that each may be colored independently of the others. The algorithm uses special estimates of the correct partition so that the corresponding outlines are divided into horizontal strips. A method is presented for finding such estimates. Two different processing approaches are presented. In the first, one of the processors performs the decomposition of the outlines and distributes the strips to the remaining processors, which are responsible for the rendering. In the second approach, the decomposition process is itself distributed among the processors in the network.
Integrating a Genetic Algorithm Into a Knowledge-Based System for Ordering Complex Design Processes
NASA Technical Reports Server (NTRS)
Rogers, James L.; McCulley, Collin M.; Bloebaum, Christina L.
1996-01-01
The design cycle associated with large engineering systems requires an initial decomposition of the complex system into design processes which are coupled through the transference of output data. Some of these design processes may be grouped into iterative subcycles. In analyzing or optimizing such a coupled system, it is essential to be able to determine the best ordering of the processes within these subcycles to reduce design cycle time and cost. Many decomposition approaches assume the capability is available to determine what design processes and couplings exist and what order of execution will be imposed during the design cycle. Unfortunately, this is often a complex problem and beyond the capabilities of a human design manager. A new feature, a genetic algorithm, has been added to DeMAID (Design Manager's Aid for Intelligent Decomposition) to allow the design manager to rapidly examine many different combinations of ordering processes in an iterative subcycle and to optimize the ordering based on cost, time, and iteration requirements. Two sample test cases are presented to show the effects of optimizing the ordering with a genetic algorithm.
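As a rough illustration of the idea, the sketch below evolves process orderings with a simple permutation GA that penalizes feedback couplings; the coupling-matrix encoding, fitness function, and GA settings are illustrative assumptions, not DeMAID's actual cost, time, and iteration model.

```python
import random

def n_feedbacks(order, coupled):
    """Count couplings that point backwards in the given ordering."""
    # coupled[i][j] == 1 when process i sends output to process j;
    # the coupling forces iteration ("feedback") when j precedes i.
    pos = {p: i for i, p in enumerate(order)}
    n = len(order)
    return sum(1 for i in range(n) for j in range(n)
               if coupled[i][j] and pos[j] < pos[i])

def evolve(coupled, pop_size=50, generations=200, seed=0):
    """Evolve a process ordering that minimizes feedback couplings."""
    random.seed(seed)
    n = len(coupled)
    pop = [random.sample(range(n), n) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda o: n_feedbacks(o, coupled))
        survivors = pop[: pop_size // 2]
        children = []
        for parent in survivors:
            child = parent[:]
            a, b = random.sample(range(n), 2)   # swap mutation
            child[a], child[b] = child[b], child[a]
            children.append(child)
        pop = survivors + children
    return min(pop, key=lambda o: n_feedbacks(o, coupled))
```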
NASA Astrophysics Data System (ADS)
Agurto, C.; Barriga, S.; Murray, V.; Pattichis, M.; Soliz, P.
2010-03-01
Diabetic retinopathy (DR) is one of the leading causes of blindness among adult Americans. Automatic methods for detection of the disease have been developed in recent years, most of them addressing the segmentation of bright and red lesions. In this paper we present an automatic DR screening system that does not approach the problem through lesion segmentation. The algorithm separates non-diseased retinal images from those with pathology based on textural features obtained using multiscale Amplitude Modulation-Frequency Modulation (AM-FM) decompositions. The decomposition is represented as features that are the inputs to a classifier. The algorithm achieves 0.88 area under the ROC curve (AROC) for a set of 280 images from the MESSIDOR database. The algorithm is then used to analyze the effects of image compression and degradation, which will be present in most actual clinical or screening environments. Results show that the algorithm is insensitive to illumination variations, but high rates of compression and large blurring effects degrade its performance.
Hong, Xia
2006-07-01
In this letter, a Box-Cox transformation-based radial basis function (RBF) neural network is introduced, using the RBF neural network to represent the transformed system output. Initially, a fixed, moderately sized RBF model base is derived based on a rank-revealing orthogonal matrix triangularization (QR decomposition). Then a new fast identification algorithm is introduced that uses the Gauss-Newton algorithm to derive the required Box-Cox transformation, based on a maximum likelihood estimator. The main contribution of this letter is to explore the special structure of the proposed RBF neural network for computational efficiency by utilizing the matrix block decomposition inversion lemma. Finally, the Box-Cox transformation-based RBF neural network, with good generalization and sparsity, is identified based on the derived optimal Box-Cox transformation and a D-optimality-based orthogonal forward regression algorithm. The proposed algorithm and its efficacy are demonstrated with an illustrative example in comparison with support vector machine regression.
3D Staggered-Grid Finite-Difference Simulation of Acoustic Waves in Turbulent Moving Media
NASA Astrophysics Data System (ADS)
Symons, N. P.; Aldridge, D. F.; Marlin, D.; Wilson, D. K.; Sullivan, P.; Ostashev, V.
2003-12-01
Acoustic wave propagation in a three-dimensional heterogeneous moving atmosphere is accurately simulated with a numerical algorithm recently developed under the DOD Common High Performance Computing Software Support Initiative (CHSSI). Sound waves within such a dynamic environment are mathematically described by a set of four, coupled, first-order partial differential equations governing small-amplitude fluctuations in pressure and particle velocity. The system is rigorously derived from fundamental principles of continuum mechanics, ideal-fluid constitutive relations, and reasonable assumptions that the ambient atmospheric motion is adiabatic and divergence-free. An explicit, time-domain, finite-difference (FD) numerical scheme is used to solve the system for both pressure and particle velocity wavefields. The atmosphere is characterized by 3D gridded models of sound speed, mass density, and the three components of the wind velocity vector. Dependent variables are stored on staggered spatial and temporal grids, and centered FD operators possess 2nd-order and 4th-order space/time accuracy. Accurate sound wave simulation is achieved provided grid intervals are chosen appropriately. The gridding must be fine enough to reduce numerical dispersion artifacts to an acceptable level and maintain stability. The algorithm is designed to execute on parallel computational platforms by utilizing a spatial domain-decomposition strategy. Currently, the algorithm has been validated on four different computational platforms, and parallel scalability of approximately 85% has been demonstrated. Comparisons with analytic solutions for uniform and vertically stratified wind models indicate that the FD algorithm generates accurate results with either a vanishing pressure or vanishing vertical-particle velocity boundary condition. Simulations are performed using a kinematic turbulence wind profile developed with the quasi-wavelet method. In addition, preliminary results are presented using high-resolution 3D dynamic turbulent flowfields generated by a large-eddy simulation model of a stably stratified planetary boundary layer. Sandia National Laboratories is operated by Sandia Corporation, a Lockheed Martin Company, for the USDOE under contract 94-AL85000.
An epileptic seizures detection algorithm based on the empirical mode decomposition of EEG.
Orosco, Lorena; Laciar, Eric; Correa, Agustina Garces; Torres, Abel; Graffigna, Juan P
2009-01-01
Epilepsy is a neurological disorder that affects around 50 million people worldwide. Seizure detection is an important component in the diagnosis of epilepsy. In this study, the Empirical Mode Decomposition (EMD) method was applied to the development of an automatic epileptic seizure detection algorithm. The algorithm first computes the Intrinsic Mode Functions (IMFs) of the EEG records, then calculates the energy of each IMF, and performs the detection based on an energy threshold and a minimum duration decision. The algorithm was tested on 9 invasive EEG records provided and validated by the Epilepsy Center of the University Hospital of Freiburg. In the 90 segments analyzed (39 with epileptic seizures), the sensitivity and specificity obtained with the method were 56.41% and 75.86%, respectively. It can be concluded that EMD is a promising method for epileptic seizure detection in EEG records.
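A minimal sketch of the detection logic described above, assuming the PyEMD package for the decomposition; the energy threshold, window length, and minimum duration are hypothetical placeholders, not the values tuned in the study.

```python
import numpy as np
from PyEMD import EMD  # pip install EMD-signal

def detect_seizure(eeg, fs, energy_thresh, min_dur_s=5.0, win_s=1.0):
    """Flag a record when some IMF's windowed energy stays above threshold."""
    imfs = EMD().emd(eeg)                       # intrinsic mode functions
    win = int(win_s * fs)
    n_win = len(eeg) // win
    flags = np.zeros(n_win, dtype=bool)
    for imf in imfs:
        energy = np.array([np.sum(imf[i * win:(i + 1) * win] ** 2)
                           for i in range(n_win)])
        flags |= energy > energy_thresh         # energy threshold decision
    need = int(min_dur_s / win_s)               # minimum duration decision
    run = 0
    for f in flags:
        run = run + 1 if f else 0
        if run >= need:
            return True
    return False
```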
Variability of ICA decomposition may impact EEG signals when used to remove eyeblink artifacts
Pontifex, Matthew B.; Gwizdala, Kathryn L.; Parks, Andrew C.; Billinger, Martin; Brunner, Clemens
2017-01-01
Despite the growing use of independent component analysis (ICA) algorithms for isolating and removing eyeblink-related activity from EEG data, we have limited understanding of how variability associated with ICA uncertainty may be influencing the reconstructed EEG signal after removing the eyeblink artifact components. To characterize the magnitude of this ICA uncertainty and to understand the extent to which it may influence findings within ERP and EEG investigations, ICA decompositions of EEG data from 32 college-aged young adults were repeated 30 times for three popular ICA algorithms. Following each decomposition, eyeblink components were identified and removed. The remaining components were back-projected, and the resulting clean EEG data were further used to analyze ERPs. Findings revealed that ICA uncertainty results in variation in P3 amplitude as well as variation across all EEG sampling points, but differs across ICA algorithms as a function of the spatial location of the EEG channel. This investigation highlights the potential of ICA uncertainty to introduce additional sources of variance when the data are back-projected without artifact components. Careful selection of ICA algorithms and parameters can reduce the extent to which ICA uncertainty may introduce an additional source of variance within ERP/EEG studies. PMID:28026876
Reactor Dosimetry Applications Using RAPTOR-M3G:. a New Parallel 3-D Radiation Transport Code
NASA Astrophysics Data System (ADS)
Longoni, Gianluca; Anderson, Stanwood L.
2009-08-01
The numerical solution of the Linearized Boltzmann Equation (LBE) via the Discrete Ordinates method (SN) requires extensive computational resources for large 3-D neutron and gamma transport applications due to the concurrent discretization of the angular, spatial, and energy domains. This paper will discuss the development of RAPTOR-M3G (RApid Parallel Transport Of Radiation - Multiple 3D Geometries), a new 3-D parallel radiation transport code, and its application to the calculation of ex-vessel neutron dosimetry responses in the cavity of a commercial 2-loop Pressurized Water Reactor (PWR). RAPTOR-M3G is based on domain decomposition algorithms, where the spatial and angular domains are allocated and processed on multi-processor computer architectures. As compared to traditional single-processor applications, this approach reduces the computational load as well as the memory requirement per processor, yielding an efficient solution methodology for large 3-D problems. Measured neutron dosimetry responses in the reactor cavity air gap will be compared to the RAPTOR-M3G predictions. This paper is organized as follows: Section 1 discusses the RAPTOR-M3G methodology; Section 2 describes the 2-loop PWR model and the numerical results obtained; Section 3 addresses the parallel performance of the code; and Section 4 concludes this paper with final remarks and future work.
Identification of hierarchical chromatin domains
Weinreb, Caleb; Raphael, Benjamin J.
2016-01-01
Motivation: The three-dimensional structure of the genome is an important regulator of many cellular processes including differentiation and gene regulation. Recently, technologies such as Hi-C that combine proximity ligation with high-throughput sequencing have revealed domains of self-interacting chromatin, called topologically associating domains (TADs), in many organisms. Current methods for identifying TADs using Hi-C data assume that TADs are non-overlapping, despite evidence for a nested structure in which TADs and sub-TADs form a complex hierarchy. Results: We introduce a model for decomposition of contact frequencies into a hierarchy of nested TADs. This model is based on empirical distributions of contact frequencies within TADs, where positions that are far apart have a greater enrichment of contacts than positions that are close together. We find that the increase in contact enrichment with distance is stronger for the inner TAD than for the outer TAD in a TAD/sub-TAD pair. Using this model, we develop the TADtree algorithm for detecting hierarchies of nested TADs. TADtree compares favorably with previous methods, finding TADs with a greater enrichment of chromatin marks such as CTCF at their boundaries. Availability and implementation: A python implementation of TADtree is available at http://compbio.cs.brown.edu/software/ Contact: braphael@cs.brown.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26315910
Molloi, Sabee; Ding, Huanjun; Feig, Stephen
2015-01-01
Purpose: The purpose of this study was to compare the precision of mammographic breast density measurement using radiologist reader assessment, histogram threshold segmentation, fuzzy C-mean segmentation, and spectral material decomposition. Materials and Methods: Spectral mammography images from a total of 92 consecutive asymptomatic women (50–69 years old) who presented for annual screening mammography were retrospectively analyzed for this study. Breast density was estimated using assessments from 10 radiologist readers, standard histogram thresholding, the fuzzy C-mean algorithm, and spectral material decomposition. The breast density correlation between left and right breasts was used to assess the precision of these techniques in measuring breast composition relative to dual-energy material decomposition. Results: In comparison to the other techniques, breast density measurements using dual-energy material decomposition showed the highest correlation. The relative standard error of estimate for breast density measurements from left and right breasts using radiologist reader assessment, standard histogram thresholding, the fuzzy C-mean algorithm, and dual-energy material decomposition was calculated to be 1.95, 2.87, 2.07, and 1.00, respectively. Conclusion: The results indicate that the precision of dual-energy material decomposition was approximately a factor of two higher than the other techniques with regard to the correlation of breast density measurements from right and left breasts. PMID:26031229
NASA Astrophysics Data System (ADS)
Zhang, Xin; Liu, Zhiwen; Miao, Qiang; Wang, Lei
2018-03-01
A time-varying-filtering-based empirical mode decomposition (TVF-EMD) method was proposed recently to solve the mode mixing problem of the EMD method. Compared with classical EMD, TVF-EMD was proven to improve the frequency separation performance and to be robust to noise interference. However, the decomposition parameters (i.e., bandwidth threshold and B-spline order) significantly affect the decomposition results of this method. In the original TVF-EMD method, the parameter values are assigned in advance, which makes it difficult to achieve satisfactory analysis results. To solve this problem, this paper develops an optimized TVF-EMD method based on the grey wolf optimizer (GWO) algorithm for fault diagnosis of rotating machinery. Firstly, a measurement index termed the weighted kurtosis index is constructed from the kurtosis index and the correlation coefficient. Subsequently, the optimal TVF-EMD parameters that match the input signal are obtained by the GWO algorithm using the maximum weighted kurtosis index as the objective function. Finally, fault features are extracted by analyzing the sensitive intrinsic mode function (IMF) with the maximum weighted kurtosis index. Simulations and comparisons highlight the performance of the TVF-EMD method for signal decomposition, and meanwhile verify that the bandwidth threshold and B-spline order are critical to the decomposition results. Two case studies on rotating machinery fault diagnosis demonstrate the effectiveness and advantages of the proposed method.
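A sketch of the weighted kurtosis index the paper builds from the kurtosis index and the correlation coefficient; the exact weighting is an assumption here, shown as a plain product of the two terms. GWO would then search the (bandwidth threshold, B-spline order) pair that maximizes the largest index over the resulting IMFs.

```python
import numpy as np
from scipy.stats import kurtosis

def weighted_kurtosis_index(imf, raw_signal):
    """Combine an IMF's kurtosis index with its correlation to the raw signal."""
    k = kurtosis(imf, fisher=False)                 # kurtosis index of the IMF
    rho = abs(np.corrcoef(imf, raw_signal)[0, 1])   # correlation coefficient
    return k * rho                                  # assumed product weighting
```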
Intercomparison of Multiscale Modeling Approaches in Simulating Subsurface Flow and Transport
NASA Astrophysics Data System (ADS)
Yang, X.; Mehmani, Y.; Barajas-Solano, D. A.; Song, H. S.; Balhoff, M.; Tartakovsky, A. M.; Scheibe, T. D.
2016-12-01
Hybrid multiscale simulations that couple models across scales are critical to advancing predictions of larger system behavior using understanding of fundamental processes. In the current study, three hybrid multiscale methods are intercompared: the multiscale loose-coupling method, the multiscale finite volume (MsFV) method, and the multiscale mortar method. The loose-coupling method enables a parallel workflow structure based on the Swift scripting environment that manages the complex process of executing coupled micro- and macro-scale models without being intrusive to the at-scale simulators. The MsFV method applies microscale and macroscale models over overlapping subdomains of the modeling domain and enforces continuity of concentration and transport fluxes between models via restriction and prolongation operators. The mortar method is a non-overlapping domain decomposition approach capable of coupling all permutations of pore- and continuum-scale models with each other. In doing so, Lagrange multipliers are used at interfaces shared between the subdomains so as to establish continuity of species/fluid mass flux. Subdomain computations can be performed either concurrently or non-concurrently depending on the algorithm used. All the above methods have been proven to be accurate and efficient in studying flow and transport in porous media. However, there have been no field-scale applications or benchmarking among the various hybrid multiscale approaches. To address this challenge, we apply all three hybrid multiscale methods to simulate water flow and transport in a conceptualized 2D modeling domain of the hyporheic zone, where strong interactions between groundwater and surface water exist across multiple scales. In all three multiscale methods, fine-scale simulations are applied to a thin layer of riverbed alluvial sediments while the macroscopic simulations are used for the larger subsurface aquifer domain. Different numerical coupling methods are then applied between scales and intercompared. Comparisons are drawn in terms of velocity distributions, solute transport behavior, algorithm-induced numerical error, and computing cost. The intercomparison work provides support for confidence in a variety of hybrid multiscale methods and motivates further development and applications.
Using Multiple Grids To Compute Flows
NASA Technical Reports Server (NTRS)
Rai, Man Mohan
1991-01-01
This paper discusses the decomposition of global grids into multiple patched and/or overlaid local grids in computations of fluid flow. Such "domain decomposition" is particularly useful in the computation of flows about complicated bodies moving relative to each other; for example, flows associated with rotors and stators in turbomachinery and with rotors and fuselages in helicopters.
Hybridization of decomposition and local search for multiobjective optimization.
Ke, Liangjun; Zhang, Qingfu; Battiti, Roberto
2014-10-01
Combining ideas from evolutionary algorithms, decomposition approaches, and Pareto local search, this paper suggests a simple yet efficient memetic algorithm for combinatorial multiobjective optimization problems: the memetic algorithm based on decomposition (MOMAD). It decomposes a combinatorial multiobjective problem into a number of single-objective optimization problems using an aggregation method. MOMAD evolves three populations: 1) population P(L) for recording the current solution to each subproblem; 2) population P(P) for storing starting solutions for Pareto local search; and 3) an external population P(E) for maintaining all the nondominated solutions found so far during the search. A problem-specific single-objective heuristic can be applied to these subproblems to initialize the three populations. At each generation, a Pareto local search method is first applied to search a neighborhood of each solution in P(P) to update P(L) and P(E). Then a single-objective local search is applied to each perturbed solution in P(L) to improve P(L) and P(E), and to reinitialize P(P). The procedure is repeated until a stopping condition is met. MOMAD provides a generic hybrid multiobjective algorithmic framework in which problem-specific knowledge, well-developed single-objective local search methods and heuristics, and Pareto local search methods can be hybridized. It is a population-based iterative method and thus an anytime algorithm. Extensive experiments have been conducted in this paper to study MOMAD and compare it with some other state-of-the-art algorithms on the multiobjective traveling salesman problem and the multiobjective knapsack problem. The experimental results show that our proposed algorithm outperforms or performs similarly to the best-so-far heuristics on these two problems.
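Two of MOMAD's building blocks are easy to make concrete: the aggregation that turns the multiobjective problem into single-objective subproblems, and the update of the external nondominated population P(E). The sketch below uses a weighted sum as the aggregation, one common choice; the paper's framework admits other aggregation methods.

```python
def aggregate(objectives, weights):
    """Scalarize an objective vector for one subproblem (weighted-sum choice)."""
    return sum(w * f for w, f in zip(weights, objectives))

def dominates(a, b):
    """Pareto dominance for minimization: a is no worse everywhere, better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def update_archive(P_E, candidate):
    """Insert a candidate objective vector into the external population P(E)."""
    if any(dominates(old, candidate) for old in P_E):
        return False                                # dominated: archive unchanged
    P_E[:] = [old for old in P_E if not dominates(candidate, old)]
    P_E.append(candidate)                           # keep only nondominated points
    return True
```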
Park, Chunjae; Kwon, Ohin; Woo, Eung Je; Seo, Jin Keun
2004-03-01
In magnetic resonance electrical impedance tomography (MREIT), we try to visualize cross-sectional conductivity (or resistivity) images of a subject. We inject electrical currents into the subject through surface electrodes and measure the z component Bz of the induced internal magnetic flux density using an MRI scanner. Here, z is the direction of the main magnetic field of the MRI scanner. We formulate the conductivity image reconstruction problem in MREIT from a careful analysis of the relationship between the injection current and the induced magnetic flux density Bz. Based on the novel mathematical formulation, we propose the gradient Bz decomposition algorithm to reconstruct conductivity images. This new algorithm needs to differentiate Bz only once in contrast to the previously developed harmonic Bz algorithm where the numerical computation of ∇²Bz is required. The new algorithm, therefore, has the important advantage of much improved noise tolerance. Numerical simulations with added random noise of realistic amounts show the feasibility of the algorithm in practical applications and also its robustness against measurement noise.
NASA Astrophysics Data System (ADS)
Qian, Kun; Zhou, Huixin; Rong, Shenghui; Wang, Bingjian; Cheng, Kuanhong
2017-05-01
Infrared small target tracking plays an important role in applications including military reconnaissance, early warning, and terminal guidance. In this paper, an effective algorithm based on Singular Value Decomposition (SVD) and an improved Kernelized Correlation Filter (KCF) is presented for infrared small target tracking. Firstly, the strength of the SVD-based step is that it takes advantage of the target's global information to obtain a background estimate of the infrared image. A dim target is enhanced by subtracting the continuously updated background estimate from the original image. Secondly, the KCF algorithm is combined with a Gaussian Curvature Filter (GCF) to eliminate the excursion problem. The GCF technology is adopted to preserve the edges and eliminate the noise of the base sample in the KCF algorithm, helping to calculate the classifier parameters for a small target. Finally, the target position is estimated with a response map, which is obtained via the kernelized classifier. Experimental results demonstrate that the presented algorithm performs favorably in terms of efficiency and accuracy, compared with several state-of-the-art algorithms.
Empirical projection-based basis-component decomposition method
NASA Astrophysics Data System (ADS)
Brendel, Bernhard; Roessl, Ewald; Schlomka, Jens-Peter; Proksa, Roland
2009-02-01
Advances in the development of semiconductor based, photon-counting x-ray detectors stimulate research in the domain of energy-resolving pre-clinical and clinical computed tomography (CT). For counting detectors acquiring x-ray attenuation in at least three different energy windows, an extended basis component decomposition can be performed in which in addition to the conventional approach of Alvarez and Macovski a third basis component is introduced, e.g., a gadolinium based CT contrast material. After the decomposition of the measured projection data into the basis component projections, conventional filtered-backprojection reconstruction is performed to obtain the basis-component images. In recent work, this basis component decomposition was obtained by maximizing the likelihood-function of the measurements. This procedure is time consuming and often unstable for excessively noisy data or low intrinsic energy resolution of the detector. Therefore, alternative procedures are of interest. Here, we introduce a generalization of the idea of empirical dual-energy processing published by Stenner et al. to multi-energy, photon-counting CT raw data. Instead of working in the image-domain, we use prior spectral knowledge about the acquisition system (tube spectra, bin sensitivities) to parameterize the line-integrals of the basis component decomposition directly in the projection domain. We compare this empirical approach with the maximum-likelihood (ML) approach considering image noise and image bias (artifacts) and see that only moderate noise increase is to be expected for small bias in the empirical approach. Given the drastic reduction of pre-processing time, the empirical approach is considered a viable alternative to the ML approach.
NASA Astrophysics Data System (ADS)
Sun, Benyuan; Yue, Shihong; Cui, Ziqiang; Wang, Huaxiang
2015-12-01
As an advanced measurement technique that is non-radiative, non-intrusive, fast in response, and low in cost, the electrical tomography (ET) technique has developed rapidly in recent decades. The ET imaging algorithm plays an important role in the ET imaging process. Linear back projection (LBP) is the most widely used ET algorithm due to its advantages of a dynamic imaging process, real-time response, and easy realization. But the LBP algorithm has low spatial resolution due to the inherent ‘soft field’ effect and the ‘ill-posed solution’ problem; thus its applicable range is greatly limited. In this paper, an original data decomposition method is proposed: every ET measurement is decomposed into two independent new measurements based on the positive and negative sensing areas of that measurement. Consequently, the total number of measurements is extended to twice the number of the original data, effectively reducing the ‘ill-posed solution’ problem. On the other hand, an index to measure the ‘soft field’ effect is proposed. The index shows that the decomposed data can distinguish between the different contributions of various units (pixels) for any ET measurement, and can efficiently reduce the ‘soft field’ effect in the ET imaging process. In light of the data decomposition method, a new linear back projection algorithm is proposed to improve the spatial resolution of the ET image. A series of simulations and experiments validate the proposed algorithm in terms of real-time performance and improvement in spatial resolution.
Application of modified Martinez-Silva algorithm in determination of net cover
NASA Astrophysics Data System (ADS)
Stefanowicz, Łukasz; Grobelna, Iwona
2016-12-01
In this article we present modifications of the Martinez-Silva algorithm, which allows for the determination of place invariants (p-invariants) of a Petri net. Their generation time is important in the parallel decomposition of discrete systems described by Petri nets. The decomposition process is essential from the point of view of discrete system design, as it allows for the separation of smaller sequential parts. The proposed modifications of the Martinez-Silva method concern net cover by p-invariants and are focused on two important issues: cyclic reduction of the invariant matrix and cyclic checking of net cover.
Signal processing method and system for noise removal and signal extraction
Fu, Chi Yung; Petrich, Loren
2009-04-14
A signal processing method and system combining smooth level wavelet pre-processing together with artificial neural networks all in the wavelet domain for signal denoising and extraction. Upon receiving a signal corrupted with noise, an n-level decomposition of the signal is performed using a discrete wavelet transform to produce a smooth component and a rough component for each decomposition level. The nth level smooth component is then inputted into a corresponding neural network pre-trained to filter out noise in that component by pattern recognition in the wavelet domain. Additional rough components, beginning at the highest level, may also be retained and inputted into corresponding neural networks pre-trained to filter out noise in those components also by pattern recognition in the wavelet domain. In any case, an inverse discrete wavelet transform is performed on the combined output from all the neural networks to recover a clean signal back in the time domain.
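A minimal sketch of this pipeline, assuming PyWavelets for the discrete wavelet transform; `denoise_nets`, a mapping from component names to pre-trained callables, is a hypothetical stand-in for the patent's per-component neural networks.

```python
import pywt  # PyWavelets

def wavelet_nn_denoise(signal, denoise_nets, wavelet="db4", level=4):
    """n-level DWT, per-component neural filtering, inverse DWT."""
    # coeffs = [smooth_level, rough_level, ..., rough_1] in PyWavelets' ordering
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    cleaned = [denoise_nets["smooth"](coeffs[0])]       # nth level smooth component
    for i, rough in enumerate(coeffs[1:], start=1):
        lvl = level - i + 1                             # coeffs[1] is the highest level
        net = denoise_nets.get(f"rough_{lvl}")
        cleaned.append(net(rough) if net else rough)    # keep untrained levels as-is
    return pywt.waverec(cleaned, wavelet)               # back to the time domain
```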
Trace Norm Regularized CANDECOMP/PARAFAC Decomposition With Missing Data.
Liu, Yuanyuan; Shang, Fanhua; Jiao, Licheng; Cheng, James; Cheng, Hong
2015-11-01
In recent years, low-rank tensor completion (LRTC) problems have received a significant amount of attention in computer vision, data mining, and signal processing. The existing trace norm minimization algorithms for iteratively solving LRTC problems involve multiple singular value decompositions of very large matrices at each iteration. Therefore, they suffer from high computational cost. In this paper, we propose a novel trace norm regularized CANDECOMP/PARAFAC decomposition (TNCP) method for simultaneous tensor decomposition and completion. We first formulate a factor matrix rank minimization model by deducing the relation between the rank of each factor matrix and the mode-n rank of a tensor. Then, we introduce a tractable relaxation of our rank function, and thereby obtain a convex combination problem of much smaller-scale matrix trace norm minimization. Finally, we develop an efficient algorithm based on the alternating direction method of multipliers to solve our problem. The promising experimental results on synthetic and real-world data validate the effectiveness of our TNCP method. Moreover, TNCP is significantly faster than the state-of-the-art methods and scales to larger problems.
Calculation of excitation energies from the CC2 linear response theory using Cholesky decomposition
DOE Office of Scientific and Technical Information (OSTI.GOV)
Baudin, Pablo; Marín, José Sánchez
2014-03-14
A new implementation of the approximate coupled cluster singles and doubles CC2 linear response model is reported. It employs a Cholesky decomposition of the two-electron integrals that significantly reduces the computational cost and the storage requirements of the method compared to standard implementations. Our algorithm also exploits a partitioning form of the CC2 equations which reduces the dimension of the problem and avoids the storage of doubles amplitudes. We present calculations of excitation energies of benzene using a hierarchy of basis sets and compare the results with conventional CC2 calculations. The reduction of the scaling is evaluated, as well as the effect of the Cholesky decomposition parameter on the quality of the results. The new algorithm is used to perform an extrapolation to the complete basis set investigation on the spectroscopically interesting benzylallene conformers. A set of calculations on medium-sized molecules is carried out to check the dependence of the accuracy of the results on the decomposition thresholds. Moreover, CC2 singlet excitation energies of the free base porphin are also presented.
Devi, B Pushpa; Singh, Kh Manglem; Roy, Sudipta
2016-01-01
This paper proposes a new watermarking algorithm based on shuffled singular value decomposition and visual cryptography for copyright protection of digital images. It generates the ownership and identification shares of the image based on visual cryptography. It decomposes the image into low and high frequency sub-bands. The low frequency sub-band is further divided into blocks of the same size after shuffling it, and the singular value decomposition is then applied to each randomly selected block. Shares are generated by comparing one of the elements in the first column of the left orthogonal matrix with its corresponding element in the right orthogonal matrix of the singular value decomposition of the block of the low frequency sub-band. The experimental results show that the proposed scheme clearly verifies the copyright of the digital images, and is robust enough to withstand several image processing attacks. Comparison with other related visual cryptography-based algorithms reveals that the proposed method gives better performance. The proposed method is especially resilient against the rotation attack.
Jalas, S.; Dornmair, I.; Lehe, R.; ...
2017-03-20
Particle in Cell (PIC) simulations are a widely used tool for the investigation of both laser- and beam-driven plasma acceleration. It is a known issue that the beam quality can be artificially degraded by numerical Cherenkov radiation (NCR), resulting primarily from an incorrectly modeled dispersion relation. Pseudo-spectral solvers featuring infinite order stencils can strongly reduce NCR, or even suppress it, and are therefore well suited to correctly model the beam properties. For efficient parallelization of the PIC algorithm, however, localized solvers are inevitable. Arbitrary order pseudo-spectral methods provide this needed locality. Yet, these methods can again be prone to NCR. In this paper, we show that acceptably low solver orders are sufficient to correctly model the physics of interest, while allowing for parallel computation by domain decomposition.
Tool Wear Feature Extraction Based on Hilbert Marginal Spectrum
NASA Astrophysics Data System (ADS)
Guan, Shan; Song, Weijie; Pang, Hongyang
2017-09-01
In the metal cutting process, the signal contains a wealth of tool wear state information. A tool wear signal analysis and feature extraction method based on the Hilbert marginal spectrum is proposed. Firstly, the tool wear signal was decomposed by the empirical mode decomposition algorithm, and the intrinsic mode functions containing the main information were screened out by the correlation coefficient and the variance contribution rate. Secondly, the Hilbert transform was performed on the main intrinsic mode functions, yielding the Hilbert time-frequency spectrum and the Hilbert marginal spectrum. Finally, amplitude-domain indexes were extracted from the Hilbert marginal spectrum and used to construct the recognition feature vector of the tool wear state. The research results show that the extracted features can effectively characterize the different wear states of the tool, which provides a basis for monitoring tool wear condition.
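The marginal spectrum step can be sketched compactly: for each screened IMF, accumulate the instantaneous amplitude over time into the bin of its instantaneous frequency. This is a minimal illustration assuming SciPy; the frequency binning granularity is an illustrative choice.

```python
import numpy as np
from scipy.signal import hilbert

def marginal_spectrum(imfs, fs, n_bins=256):
    """Accumulate instantaneous amplitude over time per frequency bin."""
    edges = np.linspace(0.0, fs / 2, n_bins + 1)
    spectrum = np.zeros(n_bins)
    for imf in imfs:
        analytic = hilbert(imf)
        amp = np.abs(analytic)                       # instantaneous amplitude
        phase = np.unwrap(np.angle(analytic))
        inst_f = np.diff(phase) * fs / (2 * np.pi)   # instantaneous frequency
        idx = np.clip(np.digitize(inst_f, edges) - 1, 0, n_bins - 1)
        np.add.at(spectrum, idx, amp[:-1])           # sum amplitude into its bin
    return 0.5 * (edges[:-1] + edges[1:]), spectrum  # bin centers, marginal spectrum
```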
Semantic super networks: A case analysis of Wikipedia papers
NASA Astrophysics Data System (ADS)
Kostyuchenko, Evgeny; Lebedeva, Taisiya; Goritov, Alexander
2017-11-01
An algorithm for constructing super-large semantic networks is developed in the current work and tested using the "Cosmos" category of the Internet encyclopedia Wikipedia as an example. During the implementation, a parser for the syntactic analysis of Wikipedia pages was developed, and a graph was formed based on the list of articles and categories. On the basis of analysis of the obtained graph, algorithms for finding domains of high connectivity in a graph were proposed and tested. Algorithms for constructing a domain based on the number of links and on the number of articles in the current subject area are considered. The shortcomings of these algorithms are shown and explained, and an algorithm based on their joint use is developed. The possibility of applying the combined algorithm for obtaining the final domain is shown. The problem of instability of the resulting domain was discovered when starting the algorithm from two neighboring vertices related to the domain.
Optical systolic solutions of linear algebraic equations
NASA Technical Reports Server (NTRS)
Neuman, C. P.; Casasent, D.
1984-01-01
The philosophy and data encoding possible in the systolic array optical processor (SAOP) were reviewed. The multitude of linear algebraic operations achievable on this architecture is examined. These operations include such linear algebraic algorithms as matrix decomposition, direct and indirect solutions, implicit and explicit methods for partial differential equations, eigenvalue and eigenvector calculations, and singular value decomposition. This architecture can be utilized to realize general techniques for solving matrix linear and nonlinear algebraic equations, least mean square error solutions, FIR filters, and nested-loop algorithms for control engineering applications. The data flow and pipelining of operations, design of parallel algorithms and flexible architectures, application of these architectures to computationally intensive physical problems, error source modeling of optical processors, and matching of the computational needs of practical engineering problems to the capabilities of optical processors are emphasized.
NASA Astrophysics Data System (ADS)
Schultz, A.
2010-12-01
3D forward solvers lie at the core of inverse formulations used to image the variation of electrical conductivity within the Earth's interior. This property is associated with variations in temperature, composition, phase, presence of volatiles, and in specific settings, the presence of groundwater, geothermal resources, oil/gas or minerals. The high cost of 3D solutions has been a stumbling block to wider adoption of 3D methods. Parallel algorithms for modeling frequency domain 3D EM problems have not achieved wide scale adoption, with emphasis on fairly coarse grained parallelism using MPI and similar approaches. The communications bandwidth as well as the latency required to send and receive network communication packets is a limiting factor in implementing fine grained parallel strategies, inhibiting wide adoption of these algorithms. Leading Graphics Processor Unit (GPU) companies now produce GPUs with hundreds of GPU processor cores per die. The footprint, in silicon, of the GPU's restricted instruction set is much smaller than that of the general purpose instruction set required of a CPU. Consequently, the density of processor cores on a GPU can be much greater than on a CPU. GPUs also have local memory, registers and high speed communication with host CPUs, usually through PCIe type interconnects. The extremely low cost and high computational power of GPUs provides the EM geophysics community with an opportunity to achieve fine grained (i.e. massive) parallelization of codes on low cost hardware. The current generation of GPUs (e.g. NVidia Fermi) provides 3 billion transistors per chip die, with nearly 500 processor cores and up to 6 GB of fast (DDR5) GPU memory. This latest generation of GPU supports fast hardware double precision (64 bit) floating point operations of the type required for frequency domain EM forward solutions. Each Fermi GPU board can sustain nearly 1 TFLOP in double precision, and multiple boards can be installed in the host computer system. We describe our ongoing efforts to achieve massive parallelization on a novel hybrid GPU testbed machine currently configured with 12 Intel Westmere Xeon CPU cores (or 24 parallel computational threads) with 96 GB DDR3 system memory, 4 GPU subsystems which in aggregate contain 960 NVidia Tesla GPU cores with 16 GB dedicated DDR3 GPU memory, and a second interleaved bank of 4 GPU subsystems containing in aggregate 1792 NVidia Fermi GPU cores with 12 GB dedicated DDR5 GPU memory. We are applying domain decomposition methods to a modified version of Weiss' (2001) 3D frequency domain full physics EM finite difference code, an open source GPL licensed f90 code available for download from www.OpenEM.org. This will be the core of a new hybrid 3D inversion that parallelizes frequencies across CPUs and individual forward solutions across GPUs. We describe progress made in modifying the code to use direct solvers in GPU cores dedicated to each small subdomain, iteratively improving the solution by matching adjacent subdomain boundary solutions, rather than iterative Krylov space sparse solvers as currently applied to the whole domain.
Algorithms for Spectral Decomposition with Applications to Optical Plume Anomaly Detection
NASA Technical Reports Server (NTRS)
Srivastava, Ashok N.; Matthews, Bryan; Das, Santanu
2008-01-01
The analysis of spectral signals for features that represent physical phenomenon is ubiquitous in the science and engineering communities. There are two main approaches that can be taken to extract relevant features from these high-dimensional data streams. The first set of approaches relies on extracting features using a physics-based paradigm where the underlying physical mechanism that generates the spectra is used to infer the most important features in the data stream. We focus on a complementary methodology that uses a data-driven technique that is informed by the underlying physics but also has the ability to adapt to unmodeled system attributes and dynamics. We discuss the following four algorithms: Spectral Decomposition Algorithm (SDA), Non-Negative Matrix Factorization (NMF), Independent Component Analysis (ICA) and Principal Components Analysis (PCA) and compare their performance on a spectral emulator which we use to generate artificial data with known statistical properties. This spectral emulator mimics the real-world phenomena arising from the plume of the space shuttle main engine and can be used to validate the results that arise from various spectral decomposition algorithms and is very useful for situations where real-world systems have very low probabilities of fault or failure. Our results indicate that methods like SDA and NMF provide a straightforward way of incorporating prior physical knowledge while NMF with a tuning mechanism can give superior performance on some tests. We demonstrate these algorithms to detect potential system-health issues on data from a spectral emulator with tunable health parameters.
Parallel computing of a climate model on the dawn 1000 by domain decomposition method
NASA Astrophysics Data System (ADS)
Bi, Xunqiang
1997-12-01
In this paper the parallel computing of a grid-point nine-level atmospheric general circulation model on the Dawn 1000 is introduced. The model was developed by the Institute of Atmospheric Physics (IAP), Chinese Academy of Sciences (CAS). The Dawn 1000 is a MIMD massively parallel computer made by the National Research Center for Intelligent Computer (NCIC), CAS. A two-dimensional domain decomposition method is adopted to perform the parallel computing. Potential ways to increase the speed-up ratio and to exploit more resources of future massively parallel supercomputers are also discussed.
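The two-dimensional decomposition itself reduces to computing which latitude-longitude block each process owns. A minimal sketch follows, assuming an nx-by-ny grid split over a px-by-py logical process mesh; nothing here is specific to the IAP model or the Dawn 1000.

```python
def local_block(rank, px, py, nx, ny):
    """Return the (i0, i1, j0, j1) half-open index range owned by `rank`."""
    ix, iy = rank % px, rank // px          # position in the px-by-py process mesh
    def split(n, parts, p):
        base, extra = divmod(n, parts)      # spread the remainder over low ranks
        start = p * base + min(p, extra)
        return start, start + base + (1 if p < extra else 0)
    i0, i1 = split(nx, px, ix)
    j0, j1 = split(ny, py, iy)
    return i0, i1, j0, j1
```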
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gropp, W.D.; Keyes, D.E.
1988-03-01
The authors discuss the parallel implementation of preconditioned conjugate gradient (PCG)-based domain decomposition techniques for self-adjoint elliptic partial differential equations in two dimensions on several architectures. The complexity of these methods is described on a variety of message-passing parallel computers as a function of the size of the problem, number of processors and relative communication speeds of the processors. They show that communication startups are very important, and that even the small amount of global communication in these methods can significantly reduce the performance of many message-passing architectures.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kempka, S.N.; Strickland, J.H.; Glass, M.W.
1995-04-01
A formulation to satisfy velocity boundary conditions for the vorticity form of the incompressible, viscous fluid momentum equations is presented. The tangential and normal components of the velocity boundary condition are satisfied simultaneously by creating vorticity adjacent to boundaries. The newly created vorticity is determined using a kinematical formulation which is a generalization of Helmholtz's decomposition of a vector field. Though it has not been generally recognized, these formulations resolve the over-specification issue associated with creating vorticity to satisfy velocity boundary conditions. The generalized decomposition has not been widely used, apparently due to a lack of a useful physical interpretation. An analysis is presented which shows that the generalized decomposition has a relatively simple physical interpretation which facilitates its numerical implementation. The implementation of the generalized decomposition is discussed in detail. As an example, the flow in a two-dimensional lid-driven cavity is simulated. The solution technique is based on a Lagrangian transport algorithm in the hydrocode ALEGRA. ALEGRA's Lagrangian transport algorithm has been modified to solve the vorticity transport equation and the generalized decomposition, thus providing a new, accurate method to simulate incompressible flows. This numerical implementation and the new boundary condition formulation allow vorticity-based formulations to be used in a wider range of engineering problems.
Automated diagnoses of attention deficit hyperactive disorder using magnetic resonance imaging
Eloyan, Ani; Muschelli, John; Nebel, Mary Beth; Liu, Han; Han, Fang; Zhao, Tuo; Barber, Anita D.; Joel, Suresh; Pekar, James J.; Mostofsky, Stewart H.; Caffo, Brian
2012-01-01
Successful automated diagnoses of attention deficit hyperactive disorder (ADHD) using imaging and functional biomarkers would have fundamental consequences on the public health impact of the disease. In this work, we show results on the predictability of ADHD using imaging biomarkers and discuss the scientific and diagnostic impacts of the research. We created a prediction model using the landmark ADHD 200 data set focusing on resting state functional connectivity (rs-fc) and structural brain imaging. We predicted ADHD status and subtype, obtained by behavioral examination, using imaging data, intelligence quotients and other covariates. The novel contributions of this manuscript include a thorough exploration of prediction and image feature extraction methodology on this form of data, including the use of singular value decompositions (SVDs), CUR decompositions, random forest, gradient boosting, bagging, voxel-based morphometry, and support vector machines as well as important insights into the value, and potentially lack thereof, of imaging biomarkers of disease. The key results include the CUR-based decomposition of the rs-fc-fMRI along with gradient boosting and the prediction algorithm based on a motor network parcellation and random forest algorithm. We conjecture that the CUR decomposition is largely diagnosing common population directions of head motion. Of note, a byproduct of this research is a potential automated method for detecting subtle in-scanner motion. The final prediction algorithm, a weighted combination of several algorithms, had an external test set specificity of 94% with sensitivity of 21%. The most promising imaging biomarker was a correlation graph from a motor network parcellation. In summary, we have undertaken a large-scale statistical exploratory prediction exercise on the unique ADHD 200 data set. The exercise produced several potential leads for future scientific exploration of the neurological basis of ADHD. PMID:22969709
Tissue artifact removal from respiratory signals based on empirical mode decomposition.
Liu, Shaopeng; Gao, Robert X; John, Dinesh; Staudenmayer, John; Freedson, Patty
2013-05-01
On-line measurement of respiration plays an important role in monitoring human physical activities. Such measurement commonly employs sensing belts secured around the rib cage and abdomen of the subject. Affected by the movement of body tissues, respiratory signals typically have a low signal-to-noise ratio. Removing tissue artifacts is therefore critical to ensuring effective respiration analysis. This paper presents a signal decomposition technique for tissue artifact removal from respiratory signals, based on the empirical mode decomposition (EMD). An algorithm based on mutual information and power criteria was devised to automatically select appropriate intrinsic mode functions for tissue artifact removal and respiratory signal reconstruction. Performance of the EMD algorithm was evaluated through simulations and real-life experiments (N = 105). Comparison with the low-pass filtering that has conventionally been applied confirmed the effectiveness of the technique in tissue artifact removal.
ERIC Educational Resources Information Center
Ceulemans, Eva; Van Mechelen, Iven; Leenen, Iwin
2007-01-01
Hierarchical classes models are quasi-order retaining Boolean decomposition models for N-way N-mode binary data. To fit these models to data, rationally started alternating least squares (or, equivalently, alternating least absolute deviations) algorithms have been proposed. Extensive simulation studies showed that these algorithms succeed quite…
A Novel Multilevel-SVD Method to Improve Multistep Ahead Forecasting in Traffic Accidents Domain.
Barba, Lida; Rodríguez, Nibaldo
2017-01-01
A novel method is proposed for decomposing a nonstationary time series into components of low and high frequency. The method is based on Multilevel Singular Value Decomposition (MSVD) of a Hankel matrix. The decomposition is used to improve the forecasting accuracy of Multiple Input Multiple Output (MIMO) linear and nonlinear models. Three time series coming from the traffic accidents domain are used. They represent the number of persons with injuries in traffic accidents in Santiago, Chile. The data were continuously collected by the Chilean Police and were weekly sampled from 2000:1 to 2014:12. The performance of MSVD is compared with the decomposition into components of low and high frequency of a commonly accepted method based on the Stationary Wavelet Transform (SWT). SWT in conjunction with an autoregressive model (SWT + MIMO-AR) and SWT in conjunction with an autoregressive neural network (SWT + MIMO-ANN) were evaluated. The empirical results have shown that the best accuracy was achieved by the forecasting model based on the proposed decomposition method MSVD, in comparison with the forecasting models based on SWT.
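The core of the decomposition can be sketched as a single SVD level: embed the series in a Hankel matrix, keep the leading singular component as the low-frequency part, and leave the residual as the high-frequency part. This is a one-level illustration under an assumed window length L; the multilevel recursion of MSVD would reapply the split to the resulting components.

```python
import numpy as np

def hankel_svd_split(x, L=4):
    """Split a series into low- and high-frequency parts via SVD of a Hankel matrix."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    H = np.array([x[i:i + N - L + 1] for i in range(L)])   # L x (N-L+1) Hankel matrix
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    H_low = s[0] * np.outer(U[:, 0], Vt[0])                # leading rank-1 component
    # average each anti-diagonal to map the matrix back to a time series
    low = np.array([H_low[::-1].diagonal(k - L + 1).mean() for k in range(N)])
    return low, x - low                                    # low / high frequency parts
```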
Reactivity of fluoroalkanes in reactions of coordinated molecular decomposition
NASA Astrophysics Data System (ADS)
Pokidova, T. S.; Denisov, E. T.
2017-08-01
Experimental results on the coordinated molecular decomposition of RF fluoroalkanes to olefin and HF are analyzed using the model of intersecting parabolas (IPM). The kinetic parameters are calculated to allow estimates of the activation energy (E) and rate constant (k) of these reactions, based on enthalpy and IPM algorithms. Parameters E and k are found for the first time for eight RF decomposition reactions. The factors that affect activation energy E of RF decomposition (the enthalpy of the reaction, the electronegativity of the atoms of reaction centers, and the dipole-dipole interaction of polar groups) are determined. The values of E and k for reverse reactions of addition are estimated.
Structural optimization by multilevel decomposition
NASA Technical Reports Server (NTRS)
Sobieszczanski-Sobieski, J.; James, B.; Dovi, A.
1983-01-01
A method is described for decomposing an optimization problem into a set of subproblems and a coordination problem which preserves coupling between the subproblems. The method is introduced as a special case of multilevel, multidisciplinary system optimization and its algorithm is fully described for two level optimization for structures assembled of finite elements of arbitrary type. Numerical results are given for an example of a framework to show that the decomposition method converges and yields results comparable to those obtained without decomposition. It is pointed out that optimization by decomposition should reduce the design time by allowing groups of engineers, using different computers to work concurrently on the same large problem.
An integrated analysis-synthesis array system for spatial sound fields.
Bai, Mingsian R; Hua, Yi-Hsin; Kuo, Chia-Hao; Hsieh, Yu-Hao
2015-03-01
An integrated recording and reproduction array system for spatial audio is presented within a generic framework akin to the analysis-synthesis filterbanks in discrete time signal processing. In the analysis stage, a microphone array "encodes" the sound field by using the plane-wave decomposition. Direction of arrival of plane-wave components that comprise the sound field of interest are estimated by multiple signal classification. Next, the source signals are extracted by using a deconvolution procedure. In the synthesis stage, a loudspeaker array "decodes" the sound field by reconstructing the plane-wave components obtained in the analysis stage. This synthesis stage is carried out by pressure matching in the interior domain of the loudspeaker array. The deconvolution problem is solved by truncated singular value decomposition or convex optimization algorithms. For high-frequency reproduction that suffers from the spatial aliasing problem, vector panning is utilized. Listening tests are undertaken to evaluate the deconvolution method, vector panning, and a hybrid approach that combines both methods to cover frequency ranges below and above the spatial aliasing frequency. Localization and timbral attributes are considered in the subjective evaluation. The results show that the hybrid approach performs the best in overall preference. In addition, there is a trade-off between reproduction performance and the external radiation.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shimojo, Fuyuki; Hattori, Shinnosuke
We introduce an extension of the divide-and-conquer (DC) algorithmic paradigm called divide-conquer-recombine (DCR) to perform large quantum molecular dynamics (QMD) simulations on massively parallel supercomputers, in which interatomic forces are computed quantum mechanically in the framework of density functional theory (DFT). In DCR, the DC phase constructs globally informed, overlapping local-domain solutions, which in the recombine phase are synthesized into a global solution encompassing large spatiotemporal scales. For the DC phase, we design a lean divide-and-conquer (LDC) DFT algorithm, which significantly reduces the prefactor of the O(N) computational cost for N electrons by applying a density-adaptive boundary condition at the peripheries of the DC domains. Our globally scalable and locally efficient solver is based on a hybrid real-reciprocal space approach that combines: (1) a highly scalable real-space multigrid to represent the global charge density; and (2) a numerically efficient plane-wave basis for local electronic wave functions and charge density within each domain. Hybrid space-band decomposition is used to implement the LDC-DFT algorithm on parallel computers. A benchmark test on an IBM Blue Gene/Q computer exhibits an isogranular parallel efficiency of 0.984 on 786,432 cores for a 50.3 × 10⁶-atom SiC system. As a test of production runs, an LDC-DFT-based QMD simulation involving 16,661 atoms is performed on the Blue Gene/Q to study on-demand production of hydrogen gas from water using LiAl alloy particles. As an example of the recombine phase, LDC-DFT electronic structures are used as a basis set to describe global photoexcitation dynamics with nonadiabatic QMD (NAQMD) and kinetic Monte Carlo (KMC) methods. The NAQMD simulations are based on linear-response time-dependent density functional theory to describe electronic excited states and a surface-hopping approach to describe transitions between the excited states. A series of techniques are employed for efficiently calculating the long-range exact exchange correction and excited-state forces. The NAQMD trajectories are analyzed to extract the rates of various excitonic processes, which are then used in KMC simulation to study the dynamics of the global exciton flow network. This has allowed the study of large-scale photoexcitation dynamics in a 6400-atom amorphous molecular solid, reaching experimental time scales.
NASA Technical Reports Server (NTRS)
Bokhari, Shahid H.; Crockett, Thomas W.; Nicol, David M.
1993-01-01
Binary dissection is widely used to partition non-uniform domains over parallel computers. This algorithm does not consider the perimeter, surface area, or aspect ratio of the regions being generated and can yield decompositions that have a poor communication-to-computation ratio. Parametric Binary Dissection (PBD) is a new algorithm in which each cut is chosen to minimize load + λ × (shape). In a 2 (or 3) dimensional problem, load is the amount of computation to be performed in a subregion, and shape could refer to the perimeter (respectively, surface area) of that subregion. Shape is a measure of communication overhead, and the parameter λ permits us to trade off load imbalance against communication overhead. When λ is zero, the algorithm reduces to plain binary dissection. This algorithm can be used to partition graphs embedded in 2- or 3-d: load is the number of nodes in a subregion, shape the number of edges that leave that subregion, and λ the ratio of the time to communicate over an edge to the time to compute at a node. An algorithm is presented that finds the depth-d parametric dissection of an embedded graph with n vertices and e edges in O(max(n log n, de)) time, which is an improvement over the O(dn log n) time of plain binary dissection. Parallel versions of this algorithm are also presented; the best of these requires O((n/p) log³ p) time on a p-processor hypercube, assuming graphs of bounded degree. How PBD is applied to 3-d unstructured meshes and yields partitions that are better than those obtained by plain dissection is described. Its application to the color image quantization problem is also discussed, in which samples in a high-resolution color space are mapped onto a lower-resolution space in a way that minimizes the color error.
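The cut criterion is easy to picture with a toy version. The sketch below is one plausible reading of "load + λ × (shape)" for a single vertical cut of a 2-D load grid; the paper's actual cost model and search are more general, and the grid, λ value, and max-of-two-halves objective are assumptions of the example.

```python
import numpy as np

def best_cut(load, lam):
    """Pick a vertical cut of a 2-D load grid minimizing, over the two
    halves, the larger value of (load + lam * perimeter) -- a PBD-style
    trade-off between load imbalance and communication overhead."""
    rows, cols = load.shape
    best = (None, np.inf)
    for c in range(1, cols):
        left = load[:, :c].sum()
        right = load[:, c:].sum()
        perim_l = 2 * (rows + c)            # shape term of the left half
        perim_r = 2 * (rows + (cols - c))   # shape term of the right half
        cost = max(left + lam * perim_l, right + lam * perim_r)
        if cost < best[1]:
            best = (c, cost)
    return best

grid = np.random.rand(64, 64)   # per-cell computational load (toy data)
cut, cost = best_cut(grid, lam=0.1)
```

With lam=0 this degenerates to plain load-balancing bisection, mirroring the reduction to plain binary dissection noted above.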
NASA Astrophysics Data System (ADS)
Tønning, Erik; Polders, Daniel; Callaghan, Paul T.; Engelsen, Søren B.
2007-09-01
This paper demonstrates how the multi-linear PARAFAC model can be used to advantage to decompose 2D diffusion-relaxation correlation NMR spectra prior to 2D Laplace inversion to the T2-D domain. The decomposition is advantageous for better interpretation of the complex correlation maps as well as for the quantification of extracted T2-D components. To demonstrate the new method, seventeen mixtures of wheat flour, starch, gluten, oil and water were prepared and measured with a 300 MHz nuclear magnetic resonance (NMR) spectrometer using a pulsed gradient stimulated echo (PGSTE) pulse sequence followed by a Carr-Purcell-Meiboom-Gill (CPMG) pulse echo train. By varying the gradient strength, 2D diffusion-relaxation data were recorded for each sample. From these double-exponentially decaying relaxation data the PARAFAC algorithm extracted two unique diffusion-relaxation components, explaining 99.8% of the variation in the data set. These two components were subsequently transformed to the T2-D domain using 2D inverse Laplace transformation and quantitatively assigned to the oil and water components of the samples. The oil component was one distinct distribution with peak intensity at D = 3 × 10⁻¹² m² s⁻¹ and T2 = 180 ms. The water component consisted of two broad populations of water molecules with diffusion coefficients and relaxation times centered around the correlation pairs D = 10⁻⁹ m² s⁻¹, T2 = 10 ms and D = 3 × 10⁻¹³ m² s⁻¹, T2 = 13 ms. Small spurious peaks observed in the inverse Laplace transformation of the original complex data were effectively filtered out by the PARAFAC decomposition and thus considered artefacts of the complex Laplace transformation. The oil-to-water ratio determined by PARAFAC followed by 2D Laplace inversion was perfectly correlated with the known oil-to-water ratio of the samples. The new method of using PARAFAC prior to the 2D Laplace inversion proved to have superior potential in the analysis of diffusion-relaxation spectra, as it improves not only the interpretation, but also the quantification.
Origin of the Chemical and Kinetic Stability of Graphene Oxide
Zhou, Si; Bongiorno, Angelo
2013-01-01
At moderate temperatures (≤ 70°C), thermal reduction of graphene oxide is inefficient, and after its synthesis the material enters a metastable state. Here, first-principles and statistical calculations are used to investigate both the low-temperature processes leading to decomposition of graphene oxide and the role of ageing on the structure and stability of this material. Our study shows that the key factor underlying the stability of graphene oxide is the tendency of the oxygen functionalities to agglomerate and form highly oxidized domains surrounded by areas of pristine graphene. Within the agglomerates of functional groups, the primary decomposition reactions are hindered by both geometrical and energetic factors. The number of reacting sites is reduced by the occurrence of local order in the oxidized domains, and due to the close packing of the oxygen functionalities, the decomposition reactions become, on average, endothermic by more than 0.6 eV. PMID:23963517
LP and NLP decomposition without a master problem
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fuller, D.; Lan, B.
We describe a new algorithm for decomposition of linear programs and a class of convex nonlinear programs, together with theoretical properties and some test results. Its most striking feature is the absence of a master problem; the subproblems pass primal and dual proposals directly to one another. The algorithm is defined for multi-stage LPs or NLPs, in which the constraints link the current stage's variables to earlier stages' variables. This problem class is general enough to include many problem structures that do not immediately suggest stages, such as block diagonal problems. The basic algorithm is derived for two-stage problems and extended to more than two stages through nested decomposition. The main theoretical result assures convergence, to within any preset tolerance of the optimal value, in a finite number of iterations. This convergence result contrasts with the results of limited tests on LPs, in which the optimal solution is apparently found exactly, i.e., to machine accuracy, in a small number of iterations. The tests further suggest that for LPs, the new algorithm is faster than the simplex method applied to the whole problem, as long as the stages are linked loosely; that the speedup over the simplex method improves as the number of stages increases; and that the algorithm is more reliable than nested Dantzig-Wolfe or Benders' methods in its improvement over the simplex method.
Eliminating the zero spectrum in Fourier transform profilometry using empirical mode decomposition.
Li, Sikun; Su, Xianyu; Chen, Wenjing; Xiang, Liqun
2009-05-01
Empirical mode decomposition is introduced into Fourier transform profilometry to extract the zero spectrum included in the deformed fringe pattern, without the need to capture two fringe patterns with a π phase difference. The fringe pattern is subsequently demodulated using a standard Fourier transform profilometry algorithm. With this method, the deformed fringe pattern is adaptively decomposed into a finite number of intrinsic mode functions that vary from high frequency to low frequency by means of an algorithm referred to as a sifting process. The zero spectrum is then separated from the high-frequency components effectively. Experiments validate the feasibility of this method.
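A rough feel for the decomposition step can be had with the third-party PyEMD package (installed as `EMD-signal`); this is an illustrative stand-in, not the authors' implementation, and the synthetic fringe and the choice to treat the last, lowest-frequency component as the zero-spectrum background are assumptions of the sketch.

```python
import numpy as np
from PyEMD import EMD  # third-party package: pip install EMD-signal

# Synthetic deformed fringe: slowly varying background plus a phase-modulated carrier.
x = np.linspace(0, 1, 2048)
fringe = 0.8 + 0.5 * x + 0.4 * np.cos(2 * np.pi * 60 * x + 2 * np.sin(2 * np.pi * 3 * x))

imfs = EMD().emd(fringe)       # IMFs ordered from high to low frequency;
                               # the last row is the residue-like trend
background = imfs[-1]          # stand-in for the zero-spectrum component
carrier_only = fringe - background  # fringe with the background suppressed
```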
Cloud parallel processing of tandem mass spectrometry based proteomics data.
Mohammed, Yassene; Mostovenko, Ekaterina; Henneman, Alex A; Marissen, Rob J; Deelder, André M; Palmblad, Magnus
2012-10-05
Data analysis in mass spectrometry based proteomics struggles to keep pace with the advances in instrumentation and the increasing rate of data acquisition. Analyzing this data involves multiple steps requiring diverse software, using different algorithms and data formats. The speed and performance of mass spectral search engines are continuously improving, although not necessarily at the pace needed to meet the challenges posed by the volume of acquired data. Improving and parallelizing the search algorithms is one possibility; data decomposition presents another, simpler strategy for introducing parallelism. We describe a general method for parallelizing identification of tandem mass spectra using data decomposition that keeps the search engine intact and wraps the parallelization around it. We introduce two algorithms for decomposing mzXML files and recomposing the resulting pepXML files. This makes the approach applicable to different search engines, including those relying on sequence databases and those searching spectral libraries. We use cloud computing to deliver the computational power, and scientific workflow engines to interface and automate the different processing steps. We show how to leverage these technologies to achieve faster data analysis in proteomics, and present three scientific workflows for parallel database as well as spectral library search, using our data decomposition programs, X!Tandem and SpectraST.
Hong, X; Harris, C J
2000-01-01
This paper introduces a new neurofuzzy model construction algorithm for nonlinear dynamic systems based upon basis functions that are Bézier-Bernstein polynomial functions. The approach is general in that it copes with n-dimensional inputs by utilising an additive decomposition construction to overcome the curse of dimensionality associated with high n. The construction algorithm also introduces univariate Bézier-Bernstein polynomial functions for the completeness of the generalized procedure. Like B-spline expansion based neurofuzzy systems, Bézier-Bernstein polynomial function based neurofuzzy networks hold desirable properties such as nonnegativity of the basis functions, unity of support, and interpretability of the basis functions as fuzzy membership functions, with the additional advantages of structural parsimony and Delaunay input-space partition, essentially overcoming the curse of dimensionality associated with conventional fuzzy and RBF networks. The new modeling network is based on an additive decomposition approach together with two separate basis function formation approaches for the univariate and bivariate Bézier-Bernstein polynomial functions used in model construction. The overall network weights are then learnt using conventional least-squares methods. Numerical examples are included to demonstrate the effectiveness of this new data-based modeling approach.
FACETS: multi-faceted functional decomposition of protein interaction networks
Seah, Boon-Siew; Bhowmick, Sourav S.; Forbes Dewey, C.
2012-01-01
Motivation: The availability of large-scale curated protein interaction datasets has given rise to the opportunity to investigate higher-level organization and modularity within the protein-protein interaction (PPI) network using graph theoretic analysis. Despite recent progress, systems-level analysis of high-throughput PPIs remains a daunting task because of the amount of data they present. In this article, we propose a novel PPI network decomposition algorithm called FACETS in order to make sense of the deluge of interaction data using Gene Ontology (GO) annotations. FACETS finds not just a single functional decomposition of the PPI network, but a multi-faceted atlas of functional decompositions that portray alternative perspectives of the functional landscape of the underlying PPI network. Each facet in the atlas represents a distinct interpretation of how the network can be functionally decomposed and organized. Our algorithm maximizes the interpretative value of the atlas by optimizing inter-facet orthogonality and intra-facet cluster modularity. Results: We tested our algorithm on global networks from IntAct, and compared it with gold standard datasets from MIPS and KEGG, demonstrating the performance of FACETS. We also performed a case study that illustrates the utility of our approach. Contact: seah0097@ntu.edu.sg or assourav@ntu.edu.sg Supplementary information: Supplementary data are available at Bioinformatics online. Availability: Our software is available freely for non-commercial purposes from: http://www.cais.ntu.edu.sg/∼assourav/Facets/ PMID:22908217
NASA Astrophysics Data System (ADS)
Mechlem, Korbinian; Ehn, Sebastian; Sellerer, Thorsten; Pfeiffer, Franz; Noël, Peter B.
2017-03-01
In spectral computed tomography (spectral CT), the additional information about the energy dependence of attenuation coefficients can be exploited to generate material selective images. These images have found applications in various areas such as artifact reduction, quantitative imaging or clinical diagnosis. However, significant noise amplification on material decomposed images remains a fundamental problem of spectral CT. Most spectral CT algorithms separate the process of material decomposition and image reconstruction. Separating these steps is suboptimal because the full statistical information contained in the spectral tomographic measurements cannot be exploited. Statistical iterative reconstruction (SIR) techniques provide an alternative, mathematically elegant approach to obtaining material selective images with improved tradeoffs between noise and resolution. Furthermore, image reconstruction and material decomposition can be performed jointly. This is accomplished by a forward model which directly connects the (expected) spectral projection measurements and the material selective images. To obtain this forward model, detailed knowledge of the different photon energy spectra and the detector response was assumed in previous work. However, accurately determining the spectrum is often difficult in practice. In this work, a new algorithm for statistical iterative material decomposition is presented. It uses a semi-empirical forward model which relies on simple calibration measurements. Furthermore, an efficient optimization algorithm based on separable surrogate functions is employed. This partially negates one of the major shortcomings of SIR, namely high computational cost and long reconstruction times. Numerical simulations and real experiments show strongly improved image quality and reduced statistical bias compared to projection-based material decomposition.
High-performance computing on GPUs for resistivity logging of oil and gas wells
NASA Astrophysics Data System (ADS)
Glinskikh, V.; Dudaev, A.; Nechaev, O.; Surodina, I.
2017-10-01
We developed and implemented into software an algorithm for high-performance simulation of electrical logs of oil and gas wells using high-performance heterogeneous computing. The numerical solution of the 2D forward problem is based on the finite-element method and the Cholesky decomposition for solving a system of linear algebraic equations (SLAE). Software implementations of the algorithm were made using NVIDIA CUDA technology and computing libraries, allowing us to perform the decomposition of the SLAE and find its solution on the central processing unit (CPU) and on the graphics processing unit (GPU). The calculation time is analyzed as a function of the matrix size and the number of its non-zero elements. We estimated the computing speed on CPU and GPU, including high-performance heterogeneous CPU-GPU computing. Using the developed algorithm, we simulated resistivity data in realistic models.
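The Cholesky factor-once/solve-cheaply pattern at the heart of such solvers looks as follows in a CPU-side SciPy sketch (a GPU version would swap in CUDA libraries such as cuSOLVER); the matrix below is a toy stand-in for a finite-element system, not the paper's.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

# SPD system standing in for a finite-element discretization.
n = 500
M = np.random.rand(n, n)
A = M @ M.T + n * np.eye(n)   # symmetric positive definite by construction
b = np.random.rand(n)

c, low = cho_factor(A)        # Cholesky decomposition, performed once
x = cho_solve((c, low), b)    # cheap triangular solves per right-hand side
assert np.allclose(A @ x, b)
```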
Reduced Order Model Basis Vector Generation: Generates Basis Vectors for ROMs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Arrighi, Bill
2016-03-03
libROM is a library that implements order reduction via singular value decomposition (SVD) of sampled state vectors. It implements two parallel, incremental SVD algorithms and one serial, non-incremental algorithm. It also provides a mechanism for adaptive sampling of basis vectors.
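For orientation, the serial, non-incremental variant of this idea reduces to a truncated SVD of a snapshot matrix; the sketch below is a generic illustration (the energy criterion and sizes are assumptions of the example, not libROM's API).

```python
import numpy as np

def rom_basis(snapshots, energy=0.999):
    """Return an orthonormal reduced basis from a matrix whose columns are
    sampled state vectors, truncated at a retained-energy fraction -- a
    serial, non-incremental sketch of SVD-based order reduction."""
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    frac = np.cumsum(s**2) / np.sum(s**2)
    r = int(np.searchsorted(frac, energy)) + 1   # smallest rank reaching the target
    return U[:, :r]

X = np.random.rand(10_000, 40)   # 40 sampled state vectors (toy data)
V = rom_basis(X)                 # basis vectors for the reduced order model
```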
GPU-accelerated computing for Lagrangian coherent structures of multi-body gravitational regimes
NASA Astrophysics Data System (ADS)
Lin, Mingpei; Xu, Ming; Fu, Xiaoyu
2017-04-01
Based on a well-established theoretical foundation, Lagrangian Coherent Structures (LCSs) have elicited widespread research on the intrinsic structures of dynamical systems in many fields, including the field of astrodynamics. Although the application of LCSs in dynamical problems seems straightforward theoretically, its associated computational cost is prohibitive. We propose a block decomposition algorithm developed on Compute Unified Device Architecture (CUDA) platform for the computation of the LCSs of multi-body gravitational regimes. In order to take advantage of GPU's outstanding computing properties, such as Shared Memory, Constant Memory, and Zero-Copy, the algorithm utilizes a block decomposition strategy to facilitate computation of finite-time Lyapunov exponent (FTLE) fields of arbitrary size and timespan. Simulation results demonstrate that this GPU-based algorithm can satisfy double-precision accuracy requirements and greatly decrease the time needed to calculate final results, increasing speed by approximately 13 times. Additionally, this algorithm can be generalized to various large-scale computing problems, such as particle filters, constellation design, and Monte-Carlo simulation.
Repeated decompositions reveal the stability of infomax decomposition of fMRI data
Duann, Jeng-Ren; Jung, Tzyy-Ping; Sejnowski, Terrence J.; Makeig, Scott
2010-01-01
In this study, we decomposed 12 fMRI data sets from six subjects each 101 times using the infomax algorithm. The first decomposition was taken as a reference decomposition; the others were used to form a component matrix of 100 by 100 components. Equivalence relations between components in this matrix, defined as maximum spatial correlations to the components of the reference decomposition, were found by the Hungarian sorting method and used to form 100 equivalence classes for each data set. We then tested the reproducibility of the matched components in the equivalence classes using uncertainty measures based on component distributions, time courses, and ROC curves. Infomax ICA rarely failed to derive nearly the same components in different decompositions. Very few components per data set were poorly reproduced, even using vector angle uncertainty measures stricter than correlation and detection theory measures. PMID:17281453
A projection method for low speed flows
DOE Office of Scientific and Technical Information (OSTI.GOV)
Colella, P.; Pao, K.
The authors propose a decomposition applicable to low-speed, inviscid flows at all Mach numbers less than 1. Using the Hodge decomposition, they may write the velocity field as the sum of a divergence-free vector field and the gradient of a scalar function. Evolution equations for these parts are presented. A numerical procedure based on this decomposition is designed, using projection methods to solve for the incompressible variables and a backward-Euler method to solve for the potential variables. Numerical experiments are included to illustrate various aspects of the algorithm.
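Written out, the splitting referred to here is the standard Hodge/Helmholtz form (a textbook statement consistent with the abstract, not a quotation from the paper):

```latex
\mathbf{u} = \mathbf{u}_d + \nabla\phi, \qquad
\nabla\cdot\mathbf{u}_d = 0
\qquad\Longrightarrow\qquad
\nabla^2\phi = \nabla\cdot\mathbf{u},
```

so the divergence-free part is recovered as a projection, u_d = u - ∇φ, at the cost of one Poisson solve per application.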
Self-similar pyramidal structures and signal reconstruction
NASA Astrophysics Data System (ADS)
Benedetto, John J.; Leon, Manuel; Saliani, Sandra
1998-03-01
Pyramidal structures are defined which are locally a combination of low and highpass filtering. The structures are analogous to but different from wavelet packet structures. In particular, new frequency decompositions are obtained; and these decompositions can be parameterized to establish a correspondence with a large class of Cantor sets. Further correspondences are then established to relate such frequency decompositions with more general self- similarities. The role of the filters in defining these pyramidal structures gives rise to signal reconstruction algorithms, and these, in turn, are used in the analysis of speech data.
NASA Astrophysics Data System (ADS)
Gharekhan, Anita H.; Biswal, Nrusingh C.; Gupta, Sharad; Pradhan, Asima; Sureshkumar, M. B.; Panigrahi, Prasanta K.
2008-02-01
The statistical and characteristic features of the polarized fluorescence spectra from cancerous, normal and benign human breast tissues are studied through wavelet transform and singular value decomposition. The discrete wavelets enabled one to isolate high- and low-frequency spectral fluctuations, which revealed substantial randomization in the cancerous tissues, not present in the normal cases. In particular, the fluctuations fitted well with a Gaussian distribution for the cancerous tissues in the perpendicular component. One finds non-Gaussian behavior for normal and benign tissues' spectral variations. The study of the difference of intensities in the parallel and perpendicular channels, which is free from the diffusive component, revealed weak fluorescence activity in the 630 nm domain for the cancerous tissues. This may be ascribable to porphyrin emission. The role of both scatterers and fluorophores in the observed minor intensity peak for the cancer case is experimentally confirmed through tissue-phantom experiments. The continuous Morlet wavelet also highlighted this domain for the cancerous tissue fluorescence spectra. Correlation in the spectral fluctuation is further studied in different tissue types through singular value decomposition. Apart from identifying different domains of spectral activity for diseased and non-diseased tissues, we found random matrix support for the spectral fluctuations. The small eigenvalues of the perpendicular polarized fluorescence spectra of cancerous tissues fitted remarkably well with the random matrix prediction for Gaussian random variables, confirming our observations about spectral fluctuations in the wavelet domain.
A non-invasive implementation of a mixed domain decomposition method for frictional contact problems
NASA Astrophysics Data System (ADS)
Oumaziz, Paul; Gosselet, Pierre; Boucard, Pierre-Alain; Guinard, Stéphane
2017-11-01
A non-invasive implementation of the Latin domain decomposition method for frictional contact problems is described. The formulation requires dealing with mixed (Robin) conditions on the faces of the subdomains, which is not a classical feature of commercial software. Therefore we propose a new implementation of the linear stage of the Latin method with a non-local search direction built as the stiffness of a layer of elements on the interfaces. This choice enables us to implement the method within the open source software Code_Aster, and to derive 2D and 3D examples with performance similar to the standard Latin method.
Liu, Peilu; Li, Xinghua; Li, Haopeng; Su, Zhikun; Zhang, Hongxu
2017-01-01
In order to improve the accuracy of ultrasonic phased-array focusing time delay, we analyzed the original interpolation Cascade-Integrator-Comb (CIC) filter and proposed an 8× interpolation CIC filter parallel algorithm, so that interpolation and multichannel decomposition can be processed simultaneously. Moreover, we summarized the general formula of the arbitrary-multiple interpolation CIC filter parallel algorithm and established an ultrasonic phased-array focusing time delay system based on the 8× interpolation CIC filter parallel algorithm. By improving the algorithmic structure, additions were reduced by 12.5% and multiplications by 29.2%, while computation remains very fast. Considering the existing problems of the CIC filter, we compensated it: the compensated CIC filter's passband is flatter, its transition band steeper, and its stopband attenuation higher. Finally, we verified the feasibility of this algorithm on a Field-Programmable Gate Array (FPGA). With a 125 MHz system clock, after 8× interpolation filtering and decomposition, the time-delay accuracy of the defect echo becomes 1 ns. Simulation and experimental results both show that the proposed algorithm is highly feasible. Because of its fast calculation, small computational load and high resolution, this algorithm is especially suitable for applications requiring high time-delay accuracy and fast detection. PMID:29023385
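The underlying filter structure is standard; here is a textbook CIC interpolator sketch (plain single-channel form, without the paper's parallel decomposition or compensation stage; the stage count and test signal are assumptions of the example).

```python
import numpy as np

def cic_interpolate(x, R=8, N=3):
    """Textbook CIC interpolator: N comb (differentiator) stages at the low
    rate, a zero-stuffing upsampler by R, then N integrator stages at the
    high rate. Differential delay M = 1; output rescaled by the DC gain
    (R*M)**N / R = R**(N-1)."""
    y = x.astype(float)
    for _ in range(N):                # comb stages: y[n] - y[n-1]
        y = np.diff(y, prepend=0.0)
    up = np.zeros(len(y) * R)
    up[::R] = y                       # zero stuffing
    for _ in range(N):                # integrator stages: running sums
        up = np.cumsum(up)
    return up / R**(N - 1)            # normalize passband (DC) gain

t = np.arange(200)
x = np.sin(2 * np.pi * t / 40)
y = cic_interpolate(x, R=8, N=3)      # 8x interpolated output
```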
Efficient Credit Assignment through Evaluation Function Decomposition
NASA Technical Reports Server (NTRS)
Agogino, Adrian; Turner, Kagan; Mikkulainen, Risto
2005-01-01
Evolutionary methods are powerful tools for discovering solutions to difficult continuous tasks. When such a solution is encoded over multiple genes, a genetic algorithm faces the difficult credit assignment problem of evaluating how a single gene in a chromosome contributes to the full solution. Typically a single evaluation function is used for the entire chromosome, implicitly giving each gene in the chromosome the same evaluation. This method is inefficient because a gene will get credit for the contribution of all the other genes as well. Accurately measuring the fitness of individual genes in such a large search space requires many trials. This paper instead proposes turning this single complex search problem into a multi-agent search problem, where each agent has the simpler task of discovering a suitable gene. Gene-specific evaluation functions can then be created that have better theoretical properties than a single evaluation function over all genes. This method is tested on the difficult double-pole balancing problem, showing that agents using gene-specific evaluation functions can create a successful control policy in 20 percent fewer trials than the best existing genetic algorithms. The method is extended to more distributed problems, achieving 95 percent performance gains over traditional methods in the multi-rover domain.
NASA Astrophysics Data System (ADS)
Sim, Sung-Han; Spencer, Billie F., Jr.; Park, Jongwoong; Jung, Hyungjo
2012-04-01
Wireless Smart Sensor Networks (WSSNs) facilitate a new paradigm for structural identification and monitoring of civil infrastructure. Conventional monitoring systems based on wired sensors and centralized data acquisition and processing have proven challenging and costly due to cabling, expensive equipment, and maintenance costs. WSSNs have emerged as a technology that can overcome such difficulties, making deployment of a dense array of sensors on large civil structures both feasible and economical. However, as opposed to wired sensor networks, in which centralized data acquisition and processing is common practice, WSSNs require decentralized computing algorithms to reduce data transmission, due to the limitations associated with wireless communication. Thus, several system identification methods have been implemented to process sensor data and extract essential information, including the Natural Excitation Technique with the Eigensystem Realization Algorithm, Frequency Domain Decomposition (FDD), and the Random Decrement Technique (RDT); however, Stochastic Subspace Identification (SSI), which has strong potential to enhance system identification, has not been fully utilized in WSSNs. This study presents decentralized system identification using SSI in WSSNs. The approach is implemented on MEMSIC's Imote2 sensor platform and experimentally verified using a 5-story shear building model.
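Of the methods named above, FDD is the easiest to sketch: form the cross-spectral density matrix of the multichannel response and take an SVD at each frequency line. The snippet below is a minimal serial illustration, not the paper's decentralized SSI implementation; the channel count, sampling rate, and segment length are assumptions.

```python
import numpy as np
from scipy.signal import csd

def fdd_first_singular_values(Y, fs, nperseg=1024):
    """Frequency Domain Decomposition sketch: build the cross-spectral
    density matrix of multichannel data Y (channels x samples) and return
    the first singular value at each frequency line; its peaks indicate
    candidate natural frequencies."""
    n_ch = Y.shape[0]
    f, _ = csd(Y[0], Y[0], fs=fs, nperseg=nperseg)
    G = np.zeros((len(f), n_ch, n_ch), dtype=complex)
    for i in range(n_ch):
        for j in range(n_ch):
            _, G[:, i, j] = csd(Y[i], Y[j], fs=fs, nperseg=nperseg)
    s1 = np.array([np.linalg.svd(Gk, compute_uv=False)[0] for Gk in G])
    return f, s1

Y = np.random.randn(4, 8192)            # 4 accelerometer channels (toy data)
f, s1 = fdd_first_singular_values(Y, fs=100.0)
```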
An Algorithm for Integrated Subsystem Embodiment and System Synthesis
NASA Technical Reports Server (NTRS)
Lewis, Kemper
1997-01-01
Consider the statement, 'A system has two coupled subsystems, one of which dominates the design process. Each subsystem consists of discrete and continuous variables, and is solved using sequential analysis and solution.' To address this type of statement in the design of complex systems, three steps are required, namely, the embodiment of the statement in terms of entities on a computer, the mathematical formulation of subsystem models, and the resulting solution and system synthesis. In complex system decomposition, the subsystems are not isolated, self-supporting entities. Information such as constraints, goals, and design variables may be shared between entities. But many times in engineering problems, full communication and cooperation does not exist, information is incomplete, or one subsystem may dominate the design. Additionally, these engineering problems give rise to mathematical models involving nonlinear functions of both discrete and continuous design variables. In this dissertation an algorithm is developed to handle these types of scenarios for the domain-independent integration of subsystem embodiment, coordination, and system synthesis, using constructs from Decision-Based Design, Game Theory, and Multidisciplinary Design Optimization. Implementation of the concept in this dissertation involves testing of the hypotheses using example problems and a motivating case study involving the design of a subsonic passenger aircraft.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gamblin, T; de Supinski, B R; Schulz, M
Good load balance is crucial on very large parallel systems, but the most sophisticated algorithms introduce dynamic imbalances through adaptation in domain decomposition or use of adaptive solvers. To observe and diagnose imbalance, developers need system-wide, temporally-ordered measurements from full-scale runs. This potentially requires data collection from multiple code regions on all processors over the entire execution. Doing this instrumentation naively can, in combination with the application itself, exceed available I/O bandwidth and storage capacity, and can induce severe behavioral perturbations. We present and evaluate a novel technique for scalable, low-error load balance measurement. This uses a parallel wavelet transform and other parallel encoding methods. We show that our technique collects and reconstructs system-wide measurements with low error. Compression time scales sublinearly with system size and data volume is several orders of magnitude smaller than the raw data. The overhead is low enough for online use in a production environment.
NASA Astrophysics Data System (ADS)
Levin, Alan R.; Zhang, Deyin; Polizzi, Eric
2012-11-01
In a recent article Polizzi (2009) [15], the FEAST algorithm has been presented as a general purpose eigenvalue solver which is ideally suited for addressing the numerical challenges in electronic structure calculations. Here, FEAST is presented beyond the “black-box” solver as a fundamental modeling framework which can naturally address the original numerical complexity of the electronic structure problem as formulated by Slater in 1937 [3]. The non-linear eigenvalue problem arising from the muffin-tin decomposition of the real-space domain is first derived and then reformulated to be solved exactly within the FEAST framework. This new framework is presented as a fundamental and practical solution for performing both accurate and scalable electronic structure calculations, bypassing the various issues of using traditional approaches such as linearization and pseudopotential techniques. A finite element implementation of this FEAST framework along with simulation results for various molecular systems is also presented and discussed.
Strong scaling of general-purpose molecular dynamics simulations on GPUs
NASA Astrophysics Data System (ADS)
Glaser, Jens; Nguyen, Trung Dac; Anderson, Joshua A.; Lui, Pak; Spiga, Filippo; Millan, Jaime A.; Morse, David C.; Glotzer, Sharon C.
2015-07-01
We describe a highly optimized implementation of MPI domain decomposition in a GPU-enabled, general-purpose molecular dynamics code, HOOMD-blue (Anderson and Glotzer, 2013). Our approach is inspired by a traditional CPU-based code, LAMMPS (Plimpton, 1995), but is implemented within a code that was designed for execution on GPUs from the start (Anderson et al., 2008). The software supports short-ranged pair force and bond force fields and achieves optimal GPU performance using an autotuning algorithm. We are able to demonstrate equivalent or superior scaling on up to 3375 GPUs in Lennard-Jones and dissipative particle dynamics (DPD) simulations of up to 108 million particles. GPUDirect RDMA capabilities in recent GPU generations provide better performance in full double precision calculations. For a representative polymer physics application, HOOMD-blue 1.0 provides an effective GPU vs. CPU node speed-up of 12.5×.
Newmark-Beta-FDTD method for super-resolution analysis of time reversal waves
NASA Astrophysics Data System (ADS)
Shi, Sheng-Bing; Shao, Wei; Ma, Jing; Jin, Congjun; Wang, Xiao-Hua
2017-09-01
In this work, a new unconditionally stable finite-difference time-domain (FDTD) method with the split-field perfectly matched layer (PML) is proposed for the analysis of time reversal (TR) waves. The proposed method is very suitable for multiscale problems involving microstructures. The spatial and temporal derivatives in this method are discretized by the central difference technique and the Newmark-Beta algorithm, respectively, and the derivation results in a banded-sparse matrix equation. Since the coefficient matrix remains unchanged during the whole simulation process, the lower-upper (LU) decomposition of the matrix needs to be performed only once, at the beginning of the calculation. Moreover, the reverse Cuthill-McKee (RCM) technique, an effective preprocessing technique for bandwidth compression of sparse matrices, is used to improve computational efficiency. The super-resolution focusing of TR wave propagation in two- and three-dimensional spaces is included to validate the accuracy and efficiency of the proposed method.
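The factor-once pattern described here maps directly onto SciPy's sparse toolbox; the sketch below uses a toy banded matrix and random right-hand sides as stand-ins for the paper's FDTD system.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.csgraph import reverse_cuthill_mckee
from scipy.sparse.linalg import splu

# Toy banded matrix standing in for the Newmark-Beta-FDTD system matrix.
n = 2000
A = sp.diags([1.0, -4.0, 1.0], [-1, 0, 1], shape=(n, n), format="csr")

perm = reverse_cuthill_mckee(A, symmetric_mode=True)  # RCM bandwidth reduction
Ap = A[perm][:, perm].tocsc()
lu = splu(Ap)                                         # LU decomposition, done once

for step in range(100):                               # time marching
    b = np.random.rand(n)                             # per-step right-hand side
    x = np.empty(n)
    x[perm] = lu.solve(b[perm])                       # reuse the factors each step
```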
Retained energy-based coding for EEG signals.
Bazán-Prieto, Carlos; Blanco-Velasco, Manuel; Cárdenas-Barrera, Julián; Cruz-Roldán, Fernando
2012-09-01
The recent use of long-term records in electroencephalography is becoming more frequent due to its diagnostic potential and the growth of novel signal processing methods that deal with these types of recordings. In these cases, the considerable volume of data to be managed makes compression necessary to reduce the bit rate for transmission and storage applications. In this paper, a new compression algorithm specifically designed to encode electroencephalographic (EEG) signals is proposed. Cosine modulated filter banks are used to decompose the EEG signal into a set of subbands well adapted to the characteristic frequency bands of the EEG. Given that no regular pattern may be easily extracted from the signal in the time domain, a thresholding-based method is applied to quantize samples. The method of retained energy is designed to efficiently compute the threshold in the decomposition domain which, at the same time, allows the quality of the reconstructed EEG to be controlled. The experiments are conducted over a large set of signals taken from two public databases available at Physionet, and the results show that the compression scheme yields better compression than other reported methods.
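One plausible reading of the retained-energy idea (an illustrative sketch, not the paper's exact procedure): sort decomposition-domain coefficients by magnitude and keep just enough of them to reach a target energy fraction, which simultaneously fixes the threshold and controls reconstruction quality.

```python
import numpy as np

def retained_energy_threshold(coeffs, fraction=0.98):
    """Keep the largest-magnitude subband coefficients until the requested
    fraction of signal energy is retained; zero out the rest."""
    flat = np.abs(coeffs.ravel())
    order = np.argsort(flat)[::-1]                      # largest first
    energy = np.cumsum(flat[order] ** 2) / np.sum(flat ** 2)
    k = int(np.searchsorted(energy, fraction)) + 1      # coefficients to keep
    thr = flat[order[k - 1]]                            # implied threshold
    return np.where(np.abs(coeffs) >= thr, coeffs, 0.0)

c = np.random.laplace(size=(8, 256))          # stand-in for subband coefficients
c_kept = retained_energy_threshold(c, 0.98)   # quality-controlled input to quantization
```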
NASA Technical Reports Server (NTRS)
Frank, Andreas O.; Twombly, I. Alexander; Barth, Timothy J.; Smith, Jeffrey D.; Dalton, Bonnie P. (Technical Monitor)
2001-01-01
We have applied the linear elastic finite element method to compute haptic force feedback and domain deformations of soft tissue models for use in virtual reality simulators. Our results show that, for virtual object models of high-resolution 3D data (>10,000 nodes), haptic real-time computations (>500 Hz) are not currently possible using traditional methods. Current research efforts are focused on the following areas: 1) efficient implementation of fully adaptive multi-resolution methods and 2) multi-resolution methods with specialized basis functions to capture the singularity at the haptic interface (point loading). To achieve real-time computations, we propose parallel processing of a Jacobi-preconditioned conjugate gradient method applied to a reduced system of equations resulting from surface domain decomposition. This can be achieved effectively using reconfigurable computing systems such as field-programmable gate arrays (FPGAs), thereby providing a flexible solution that allows for new FPGA implementations as improved algorithms become available. The resulting soft tissue simulation system would meet NASA Virtual Glovebox requirements and, at the same time, provide a generalized simulation engine for any immersive environment application, such as biomedical/surgical procedures or interactive scientific applications.
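The solver kernel named above is standard; for reference, here is a serial sketch of Jacobi-preconditioned conjugate gradient on a toy SPD system (the matrix and sizes are stand-ins for the reduced surface system, and the FPGA parallelization is of course not represented).

```python
import numpy as np

def jacobi_pcg(A, b, tol=1e-8, max_iter=500):
    """Conjugate gradient with a Jacobi (diagonal) preconditioner."""
    Minv = 1.0 / np.diag(A)              # Jacobi preconditioner: inverse diagonal
    x = np.zeros_like(b)
    r = b - A @ x
    z = Minv * r
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol * np.linalg.norm(b):
            break
        z = Minv * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

n = 300
M = np.random.rand(n, n)
A = M @ M.T + n * np.eye(n)              # SPD stiffness-like matrix (toy)
b = np.random.rand(n)
x = jacobi_pcg(A, b)
```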
Elastic and acoustic wavefield decompositions and application to reverse time migrations
NASA Astrophysics Data System (ADS)
Wang, Wenlong
P- and S-waves coexist in elastic wavefields, and separation between them is an essential step in elastic reverse-time migrations (RTMs). Unlike the traditional separation methods that use curl and divergence operators, which do not preserve the wavefield vector component information, we propose and compare two vector decomposition methods, which preserve the same vector components that exist in the input elastic wavefield. The amplitude and phase information is automatically preserved, so no amplitude or phase corrections are required. The decoupled propagation method is extended from elastic to viscoelastic wavefields. To use the decomposed P and S vector wavefields and generate PP and PS images, we create a new 2D migration context for isotropic, elastic RTM which includes PS vector decomposition; the propagation directions of both incident and reflected P- and S-waves are calculated directly from the stress and particle-velocity definitions of the decomposed P- and S-wave Poynting vectors. Then an excitation-amplitude imaging condition that scales the receiver wavelet by the source vector magnitude produces angle-dependent images of PP and PS reflection coefficients with the correct polarities, polarization, and amplitudes. It thus simplifies the process of obtaining PP and PS angle-domain common-image gathers (ADCIGs); it is less effort to generate ADCIGs from vector data than from scalar data. Besides P- and S-wave decomposition, separation of up- and down-going waves is also part of the processing of multi-component recorded data and propagating wavefields. A complex-trace-based up/down separation approach is extended from acoustic to elastic wavefields and combined with P- and S-wave decomposition by decoupled propagation. This eliminates the need for a Fourier transform over time, thereby significantly reducing the storage cost and improving computational efficiency. Wavefield decomposition is applied to both synthetic elastic VSP data and propagating wavefield snapshots. Poynting vectors obtained from the particle-velocity and stress fields after P/S and up/down decompositions are much more accurate than those without. The up/down separation algorithm is also applicable in acoustic RTMs, where both (forward-time extrapolated) source and (reverse-time extrapolated) receiver wavefields are decomposed into up-going and down-going parts. Together with the crosscorrelation imaging condition, four images (down-up, up-down, up-up and down-down) are generated, which facilitates the analysis of the artifacts and imaging ability of the four images. Artifacts may exist in all the decomposed images, but their positions and types differ. The causes of artifacts in the different images are explained and illustrated with sketches and numerical tests.
NASA Astrophysics Data System (ADS)
Pankow, C.; Brady, P.; Ochsner, E.; O'Shaughnessy, R.
2015-07-01
We introduce a highly parallelizable architecture for estimating the parameters of compact binary coalescences using gravitational-wave data and waveform models. Using a spherical harmonic mode decomposition, the waveform is expressed as a sum over modes that depend on the intrinsic parameters (e.g., masses), with coefficients that depend on the observer-dependent extrinsic parameters (e.g., distance, sky position). The data is then prefiltered against those modes, at fixed intrinsic parameters, enabling efficient evaluation of the likelihood for generic source positions and orientations, independent of waveform length or generation time. We efficiently parallelize our intrinsic-space calculation by integrating over all extrinsic parameters using a Monte Carlo integration strategy. Since the waveform generation and prefiltering happen only once, the cost of integration dominates the procedure. Also, we operate hierarchically, using information from existing gravitational-wave searches to identify the regions of parameter space to emphasize in our sampling. As proof of concept and verification of the result, we have implemented this algorithm using standard time-domain waveforms, processing each event in less than one hour on recent computing hardware. For most events we evaluate the marginalized likelihood (evidence) with statistical errors of ≲5%, and even smaller in many cases. With a bounded runtime independent of the waveform model's starting frequency, a nearly unchanged strategy could estimate neutron star (NS)-NS parameters in the 2018 advanced LIGO era. Our algorithm is usable with any noise curve and any existing time-domain model at any mass, including some waveforms which are computationally costly to evolve.
Identification of Synchronous Machine Stability Parameters: An On-Line Time-Domain Approach.
NASA Astrophysics Data System (ADS)
Le, Loc Xuan
1987-09-01
A time-domain modeling approach is described which enables the stability-study parameters of the synchronous machine to be determined directly from input-output data measured at the terminals of the machine operating under normal conditions. The transient responses due to system perturbations are used to identify the parameters of the equivalent circuit models. The described models are verified by comparing their responses with the machine responses generated from the transient stability models of a small three-generator multi-bus power system and of a single -machine infinite-bus power network. The least-squares method is used for the solution of the model parameters. As a precaution against ill-conditioned problems, the singular value decomposition (SVD) is employed for its inherent numerical stability. In order to identify the equivalent-circuit parameters uniquely, the solution of a linear optimization problem with non-linear constraints is required. Here, the SVD appears to offer a simple solution to this otherwise difficult problem. Furthermore, the SVD yields solutions with small bias and, therefore, physically meaningful parameters even in the presence of noise in the data. The question concerning the need for a more advanced model of the synchronous machine which describes subtransient and even sub-subtransient behavior is dealt with sensibly by the concept of condition number. The concept provides a quantitative measure for determining whether such an advanced model is indeed necessary. Finally, the recursive SVD algorithm is described for real-time parameter identification and tracking of slowly time-variant parameters. The algorithm is applied to identify the dynamic equivalent power system model.
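The SVD-based least-squares step described here is standard numerical practice; a minimal sketch follows, with truncation of small singular values and the condition number read off the spectrum. The regression matrix below is a toy stand-in for measured input-output data, and the truncation tolerance is an assumption of the example.

```python
import numpy as np

def svd_least_squares(A, b, rcond=1e-10):
    """Solve min ||Ax - b|| via the SVD, discarding singular values below
    rcond * s_max; truncation stabilizes ill-conditioned identification
    problems and reduces bias from noisy, nearly dependent regressors."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    keep = s > rcond * s[0]
    print(f"condition number: {s[0] / s[-1]:.3e}; kept {keep.sum()} of {len(s)}")
    return Vt[keep].T @ ((U[:, keep].T @ b) / s[keep])

# Toy regression standing in for measured machine input-output data.
A = np.random.rand(200, 6)
A[:, 5] = A[:, 4] + 1e-12 * np.random.rand(200)   # nearly dependent column
x = svd_least_squares(A, np.random.rand(200))
```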
The Exact Solution to Rank-1 L1-Norm TUCKER2 Decomposition
NASA Astrophysics Data System (ADS)
Markopoulos, Panos P.; Chachlakis, Dimitris G.; Papalexakis, Evangelos E.
2018-04-01
We study rank-1 L1-norm-based TUCKER2 (L1-TUCKER2) decomposition of 3-way tensors, treated as a collection of $N$ $D \times M$ matrices that are to be jointly decomposed. Our contributions are as follows. i) We prove that the problem is equivalent to combinatorial optimization over $N$ antipodal-binary variables. ii) We derive the first two algorithms in the literature for its exact solution. The first algorithm has cost exponential in $N$; the second one has cost polynomial in $N$ (under a mild assumption). Our algorithms are accompanied by formal complexity analysis. iii) We conduct numerical studies to compare the performance of exact L1-TUCKER2 (proposed) with standard HOSVD, HOOI, GLRAM, PCA, L1-PCA, and TPCA-L1. Our studies show that L1-TUCKER2 outperforms (in tensor approximation) all the above counterparts when the processed data are outlier-corrupted.
NASA Astrophysics Data System (ADS)
Das, Siddhartha; Siopsis, George; Weedbrook, Christian
2018-02-01
With the significant advancement in quantum computation during the past couple of decades, the exploration of machine-learning subroutines using quantum strategies has become increasingly popular. Gaussian process regression is a widely used technique in supervised classical machine learning. Here we introduce an algorithm for Gaussian process regression using continuous-variable quantum systems that can be realized with technology based on photonic quantum computers, under certain assumptions regarding the distribution of data and the availability of efficient quantum access. Our algorithm shows that by using a continuous-variable quantum computer a dramatic speedup in computing Gaussian process regression can be achieved, i.e., the possibility of exponentially reducing the time to compute. Furthermore, our results also include a continuous-variable quantum-assisted singular value decomposition method for nonsparse low-rank matrices, which forms an important subroutine in our Gaussian process regression algorithm.
USDA-ARS?s Scientific Manuscript database
Hyperspectral scattering is a promising technique for rapid and noninvasive measurement of multiple quality attributes of apple fruit. A hierarchical evolutionary algorithm (HEA) approach, in combination with subspace decomposition and partial least squares (PLS) regression, was proposed to select o...
Recursive inverse factorization.
Rubensson, Emanuel H; Bock, Nicolas; Holmström, Erik; Niklasson, Anders M N
2008-03-14
A recursive algorithm for the inverse factorization S^{-1} = ZZ^* of Hermitian positive definite matrices S is proposed. The inverse factorization is based on iterative refinement [A.M.N. Niklasson, Phys. Rev. B 70, 193102 (2004)] combined with a recursive decomposition of S. As the computational kernel is matrix-matrix multiplication, the algorithm can be parallelized, and the computational effort increases linearly with system size for systems with sufficiently sparse matrices. Recent advances in network theory are used to find appropriate recursive decompositions. We show that optimization of the so-called network modularity results in an improved partitioning compared to other approaches, in particular when the recursive inverse factorization is applied to overlap matrices of irregularly structured three-dimensional molecules.
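As a rough illustration of the iterative-refinement idea (a dense, first-order sketch only; it omits the recursive decomposition and sparsity exploitation that give the actual method its linear scaling, and the update form is an assumption chosen for simplicity):

```python
import numpy as np

def inverse_factor(S, tol=1e-12, max_iter=100):
    """Refine Z toward S^{-1} = Z Z^T: with delta = I - Z^T S Z, the update
    Z <- Z (I + delta/2) drives ||delta|| to zero quadratically, provided
    ||delta|| < 1 initially (guaranteed by the scaling of Z below)."""
    n = S.shape[0]
    I = np.eye(n)
    Z = I / np.sqrt(np.linalg.norm(S, 2))   # scaled start: ||I - Z^T S Z|| < 1
    for _ in range(max_iter):
        delta = I - Z.T @ S @ Z
        if np.linalg.norm(delta, 2) < tol:
            break
        Z = Z @ (I + 0.5 * delta)
    return Z

M = np.random.rand(50, 50)
S = M @ M.T + 50 * np.eye(50)               # Hermitian positive definite (toy)
Z = inverse_factor(S)
assert np.allclose(Z @ Z.T, np.linalg.inv(S))
```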
Computationally efficient algorithm for high sampling-frequency operation of active noise control
NASA Astrophysics Data System (ADS)
Rout, Nirmal Kumar; Das, Debi Prasad; Panda, Ganapati
2015-05-01
In high sampling-frequency operation of an active noise control (ANC) system, the secondary path estimate and the ANC filter are very long. This increases the computational complexity of the conventional filtered-x least mean square (FXLMS) algorithm. To reduce the computational complexity of long-order ANC systems using the FXLMS algorithm, frequency-domain block ANC algorithms have been proposed in the past. These full-block frequency-domain ANC algorithms suffer from disadvantages such as large block delay, quantization error due to the computation of large-size transforms, and implementation difficulties on existing low-end DSP hardware. To overcome these shortcomings, a partitioned block ANC algorithm is newly proposed in which the long filters are divided into a number of equal partitions and suitably assembled to perform the FXLMS algorithm in the frequency domain. The complexity of the proposed frequency-domain partitioned block FXLMS (FPBFXLMS) algorithm is much reduced compared to the conventional FXLMS algorithm. It is reduced further by merging one fast Fourier transform (FFT)-inverse fast Fourier transform (IFFT) combination, yielding the reduced-structure FPBFXLMS (RFPBFXLMS) algorithm. Computational complexity analyses for different filter orders and partition sizes are presented. Systematic computer simulations are carried out for both proposed partitioned block ANC algorithms to show their accuracy compared to the time-domain FXLMS algorithm.
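The partitioning idea is easiest to see in the plain filtering case. Below is a uniformly partitioned overlap-save convolution sketch: the long impulse response is split into equal blocks, each handled with a small FFT and accumulated in a frequency-domain delay line. This shows only the filtering structure the algorithm builds on, not the FXLMS adaptation wrapped around it; the block size and signals are assumptions of the example.

```python
import numpy as np

def upols_filter(x, h, B=64):
    """Uniformly partitioned overlap-save convolution: impulse response h is
    split into length-B blocks, each processed with a 2B-point FFT and
    accumulated in a frequency-domain delay line (FDL)."""
    P = int(np.ceil(len(h) / B))
    h_pad = np.concatenate([h, np.zeros(P * B - len(h))])
    H = np.array([np.fft.rfft(h_pad[p*B:(p+1)*B], 2 * B) for p in range(P)])
    fdl = np.zeros((P, B + 1), dtype=complex)   # spectra of past input blocks
    buf = np.zeros(2 * B)
    y = np.zeros(len(x))
    for n in range(0, len(x) - B + 1, B):
        buf = np.concatenate([buf[B:], x[n:n+B]])           # slide input buffer
        fdl = np.roll(fdl, 1, axis=0)
        fdl[0] = np.fft.rfft(buf)
        y[n:n+B] = np.fft.irfft((H * fdl).sum(axis=0))[B:]  # keep last B samples
    return y

x = np.random.randn(4096)
h = np.random.randn(512)          # long filter, as in high-rate ANC
y = upols_filter(x, h, B=64)
assert np.allclose(y[512:], np.convolve(x, h)[512:4096], atol=1e-8)
```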
Reumann, Matthias; Fitch, Blake G; Rayshubskiy, Aleksandr; Keller, David U J; Seemann, Gunnar; Dossel, Olaf; Pitman, Michael C; Rice, John J
2009-01-01
The orthogonal recursive bisection (ORB) algorithm can be used as a data decomposition strategy to distribute a large cardiac-model data set across a distributed-memory supercomputer. It has been shown previously that good scaling results can be achieved using the ORB algorithm for data decomposition. However, the ORB algorithm depends on the distribution of computational load of each element in the data set. In this work we investigated the dependence of data decomposition and load balancing on different rotations of the anatomical data set, to optimize load balancing. The anatomical data set was given by both ventricles of the Visible Female data set at 0.2 mm resolution. Fiber orientation was included. The data set was rotated by 90 degrees around the x, y and z axes, respectively. By either translating or simply taking the magnitude of the resulting negative coordinates, we were able to create 14 data sets of the same anatomy with different orientations and positions in the overall volume. Computational load ratios for non-tissue vs. tissue elements used in the data decomposition were 1:1, 1:2, 1:5, 1:10, 1:25, 1:38.85, 1:50 and 1:100, to investigate the effect of different load ratios on the data decomposition. The ten Tusscher et al. (2004) electrophysiological cell model was used in monodomain simulations of 1 ms simulation time to compare performance across the different data sets and orientations. The simulations were carried out for load ratios 1:10, 1:25 and 1:38.85 on a 512-processor partition of the IBM Blue Gene/L supercomputer. The results show that the data decomposition does depend on the orientation and position of the anatomy in the global volume. The difference in total run time between the data sets is 10 s for a simulation time of 1 ms. This yields a difference of about 28 h for a simulation of 10 s simulation time. However, given larger processor partitions, the difference in run time decreases and becomes less significant. Depending on the processor partition size, future work will have to consider the orientation of the anatomy in the global volume for longer simulation runs.
Advancing MODFLOW Applying the Derived Vector Space Method
NASA Astrophysics Data System (ADS)
Herrera, G. S.; Herrera, I.; Lemus-García, M.; Hernandez-Garcia, G. D.
2015-12-01
The most effective domain decomposition methods (DDM) are non-overlapping DDMs. Recently a new approach, the DVS framework, based on an innovative discretization method that uses a non-overlapping system of nodes (the derived nodes), was introduced and developed by I. Herrera et al. [1, 2]. Using the DVS approach, a group of four algorithms, referred to as the 'DVS algorithms', which fulfill the DDM paradigm (i.e., the solution of global problems is obtained by the resolution of local problems exclusively), has been derived. Such procedures are applicable to any boundary-value problem, or system of such equations, for which a standard discretization method is available, and software with a high degree of parallelization can then be constructed. In a parallel talk at this AGU Fall Meeting, Ismael Herrera will introduce the general DVS methodology. The application of the DVS algorithms has been demonstrated in the solution of several boundary-value problems of interest in geophysics. Numerical examples for a single equation, for symmetric, non-symmetric and indefinite problems, were demonstrated previously [1, 2]. For these problems the DVS algorithms exhibited significantly improved numerical performance with respect to standard versions of DDM algorithms. In view of these results, our research group is applying the DVS method to a widely used simulator for the first time; here we present the progress of this application for the parallelization of MODFLOW. Efficiency results for a group of tests will be presented. References [1] I. Herrera, L.M. de la Cruz and A. Rosas-Medina, "Non-overlapping discretization methods for partial differential equations," Numer. Meth. Part. D. E. (2013). [2] I. Herrera and I. Contreras, "An Innovative Tool for Effectively Applying Highly Parallelized Software to Problems of Elasticity," Geofísica Internacional, 2015 (in press).
Task Decomposition Module For Telerobot Trajectory Generation
NASA Astrophysics Data System (ADS)
Wavering, Albert J.; Lumia, Ron
1988-10-01
A major consideration in the design of trajectory generation software for a Flight Telerobotic Servicer (FTS) is that the FTS will be called upon to perform tasks which require a diverse range of manipulator behaviors and capabilities. In a hierarchical control system where tasks are decomposed into simpler and simpler subtasks, the task decomposition module which performs trajectory planning and execution should therefore be able to accommodate a wide range of algorithms. In some cases, it will be desirable to plan a trajectory for an entire motion before manipulator motion commences, as when optimizing over the entire trajectory. Many FTS motions, however, will be highly sensory-interactive, such as moving to attain a desired position relative to a non-stationary object whose position is periodically updated by a vision system. In this case, the time-varying nature of the trajectory may be handled either by frequent replanning using updated sensor information, or by using an algorithm which creates a less specific state-dependent plan that determines the manipulator path as the trajectory is executed (rather than a priori). This paper discusses a number of trajectory generation techniques from these categories and how they may be implemented in a task decomposition module of a hierarchical control system. The structure, function, and interfaces of the proposed trajectory generation module are briefly described, followed by several examples of how different algorithms may be performed by the module. The proposed task decomposition module provides a logical structure for trajectory planning and execution, and supports a large number of published trajectory generation techniques.
Finding Hierarchical and Overlapping Dense Subgraphs using Nucleus Decompositions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Seshadhri, Comandur; Pinar, Ali; Sariyuce, Ahmet Erdem
Finding dense substructures in a graph is a fundamental graph mining operation, with applications in bioinformatics, social networks, and visualization to name a few. Yet most standard formulations of this problem (like clique, quasi-clique, k-densest subgraph) are NP-hard. Furthermore, the goal is rarely to find the "true optimum", but to identify many (if not all) dense substructures, understand their distribution in the graph, and ideally determine a hierarchical structure among them. Current dense subgraph finding algorithms usually optimize some objective, and only find a few such subgraphs without providing any hierarchy. It is also not clear how to account for overlaps in dense substructures. We define the nucleus decomposition of a graph, which represents the graph as a forest of nuclei. Each nucleus is a subgraph where smaller cliques are present in many larger cliques. The forest of nuclei is a hierarchy by containment, where the edge density increases as we proceed towards leaf nuclei. Sibling nuclei can have limited intersections, which allows for discovery of overlapping dense subgraphs. With the right parameters, the nucleus decomposition generalizes the classic notions of k-cores and k-trusses. We give provably efficient algorithms for nucleus decompositions, and empirically evaluate their behavior in a variety of real graphs. The tree of nuclei consistently gives a global, hierarchical snapshot of dense substructures, and outputs dense subgraphs of higher quality than other state-of-the-art solutions. Our algorithm can process graphs with tens of millions of edges in less than an hour.
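For orientation, the k-core special case that the nucleus decomposition generalizes is computed by a simple peeling procedure; a minimal Python sketch follows (the nucleus algorithms peel small cliques rather than single vertices):

    def core_numbers(adj):
        """Peeling: repeatedly remove a minimum-degree vertex; adj maps
        vertex -> set of neighbours. A bucket queue would make this O(m)."""
        deg = {v: len(ns) for v, ns in adj.items()}
        remaining, core, k = set(adj), {}, 0
        while remaining:
            v = min(remaining, key=deg.__getitem__)
            k = max(k, deg[v])          # core number is monotone under peeling
            core[v] = k
            remaining.remove(v)
            for u in adj[v]:
                if u in remaining:
                    deg[u] -= 1
        return core

    # triangle plus a pendant vertex: the triangle is the 2-core
    print(core_numbers({0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}))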
[Detection of constitutional types of EEG using the orthogonal decomposition method].
Kuznetsova, S M; Kudritskaia, O V
1987-01-01
The authors present an algorithm for investigating brain bioelectrical activity with the help of an orthogonal decomposition device intended for the identification of constitutional types of EEG. The method has helped to effectively solve the task of diagnosing constitutional EEG types, which are determined by varying degrees of hereditary predisposition to longevity or cerebral stroke.
NASA Astrophysics Data System (ADS)
Imamura, N.; Schultz, A.
2015-12-01
Recently, a full waveform time domain solution has been developed for the magnetotelluric (MT) and controlled-source electromagnetic (CSEM) methods. The ultimate goal of this approach is to obtain a computationally tractable direct waveform joint inversion for source fields and earth conductivity structure in three and four dimensions. This is desirable on several grounds, including the improved spatial resolving power expected from use of a multitude of source illuminations of non-zero wavenumber, the ability to operate in areas with highly complex and non-stationary source signals, etc. This goal would not be obtainable if one were to adopt the finite difference time-domain (FDTD) approach for the forward problem. This is particularly true for the case of MT surveys, since an enormous number of degrees of freedom are required to represent the observed MT waveforms across the large frequency bandwidth: for FDTD simulation, the smallest time step must be fine enough to represent the highest frequency, while the total number of time steps must also cover the lowest frequency. This leads to a linear system that is computationally burdensome to solve. We have implemented code that addresses this situation through the use of a fictitious wave domain method and GPUs to speed up the computation. We also substantially reduce the size of the linear systems by applying concepts from successive cascade decimation, through quasi-equivalent time domain decomposition. By combining these refinements, we have made good progress toward implementing the core of a full waveform joint source field/earth conductivity inverse modeling method. We found that the use of a previous-generation CPU/GPU combination speeds computations by an order of magnitude over a parallel CPU-only approach. In part, this arises from the use of the quasi-equivalent time domain decomposition, which shrinks the size of the linear system dramatically.
NASA Astrophysics Data System (ADS)
Imamura, N.; Schultz, A.
2016-12-01
Recently, a full waveform time domain inverse solution has been developed for the magnetotelluric (MT) and controlled-source electromagnetic (CSEM) methods. The ultimate goal of this approach is to obtain a computationally tractable direct waveform joint inversion that solves simultaneously for source fields and earth conductivity structure in three and four dimensions. This is desirable on several grounds, including the improved spatial resolving power expected from use of a multitude of source illuminations, and the ability to operate in areas with highly complex and non-stationary source signals. This goal would not be obtainable if one were to adopt a pure time domain solution for the inverse problem. This is particularly true for the case of MT surveys, since an enormous number of degrees of freedom are required to represent the observed MT waveforms across a large frequency bandwidth. For the forward simulation, the smallest time step must be fine enough to represent the highest frequency, while the total number of time steps must also cover the lowest frequency. This leads to a sensitivity matrix for which computing a model update is computationally burdensome. We have implemented a code that addresses this situation through the use of cascade decimation, which reduces the size of the sensitivity matrix substantially through quasi-equivalent time domain decomposition. We also use a fictitious wave domain method to speed up the forward simulation in the time domain. By combining these refinements, we have developed a full waveform joint source field/earth conductivity inverse modeling method. We found that cascade decimation speeds computation of the sensitivity matrices dramatically while keeping the solution close to that of the undecimated case. For example, for a model discretized into 2.6×10^5 cells, we obtain model updates in less than 1 hour on a 4U rack-mounted workgroup Linux server, which is a practical computational time for the inverse problem.
NASA Astrophysics Data System (ADS)
Gomez Gonzalez, C. A.; Absil, O.; Absil, P.-A.; Van Droogenbroeck, M.; Mawet, D.; Surdej, J.
2016-05-01
Context. Data processing constitutes a critical component of high-contrast exoplanet imaging. Its role is almost as important as the choice of a coronagraph or a wavefront control system, and it is intertwined with the chosen observing strategy. Among the data processing techniques for angular differential imaging (ADI), the most recent is the family of principal component analysis (PCA) based algorithms; PCA itself is a widely used statistical tool developed during the first half of the past century. PCA serves, in this case, as a subspace projection technique for constructing a reference point spread function (PSF) that can be subtracted from the science data to boost the detectability of potential companions present in the data. Unfortunately, when building this reference PSF from the science data itself, PCA comes with certain limitations, such as the sensitivity of the lower-dimensional orthogonal subspace to non-Gaussian noise. Aims: Inspired by recent advances in machine learning algorithms such as robust PCA, we aim to propose a localized subspace projection technique that surpasses current PCA-based post-processing algorithms in terms of the detectability of companions at near real-time speed, a quality that will be useful for future direct imaging surveys. Methods: We used randomized low-rank approximation methods recently proposed in the machine learning literature, coupled with entry-wise thresholding, to decompose an ADI image sequence locally into low-rank, sparse, and Gaussian noise components (LLSG). This local three-term decomposition separates the starlight and the associated speckle noise from the planetary signal, which mostly remains in the sparse term. We tested the performance of our new algorithm on a long ADI sequence obtained on β Pictoris with VLT/NACO. Results: Compared to a standard PCA approach, LLSG decomposition reaches a higher signal-to-noise ratio and has an overall better performance in the receiver operating characteristic space. This three-term decomposition brings a detectability boost compared to the full-frame standard PCA approach, especially in the small inner working angle region where complex speckle noise prevents PCA from discerning true companions from noise.
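The three-term split at the core of LLSG can be sketched in a few lines (illustrative Python: the published algorithm operates on local image patches and uses a randomized low-rank approximation, approximated here by a plain truncated SVD with a fixed entry-wise threshold; rank, thresh, and n_iter are assumed tuning parameters, not values from the paper):

    import numpy as np

    def llsg_like_split(M, rank=5, thresh=3.0, n_iter=10):
        """Split M ~ L + S + G: L low-rank (starlight and speckles),
        S sparse (where a companion signal would land), G residual noise."""
        S = np.zeros_like(M)
        for _ in range(n_iter):
            U, s, Vt = np.linalg.svd(M - S, full_matrices=False)
            L = (U[:, :rank] * s[:rank]) @ Vt[:rank]      # low-rank term
            R = M - L
            cut = thresh * np.median(np.abs(R))           # entry-wise threshold
            S = np.where(np.abs(R) > cut, R, 0.0)         # sparse term
        return L, S, M - L - S

Here M would be the matrix whose rows are the vectorized frames of an ADI patch sequence, so the rotating companion concentrates in the sparse term S.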
Hybrid protection algorithms based on game theory in multi-domain optical networks
NASA Astrophysics Data System (ADS)
Guo, Lei; Wu, Jingjing; Hou, Weigang; Liu, Yejun; Zhang, Lincong; Li, Hongming
2011-12-01
With increasing network size, the optical backbone is divided into multiple domains, each with its own network operator and management policy. At the same time, failures in an optical network may lead to huge data losses, since each wavelength carries a large amount of traffic. Survivability in multi-domain optical networks is therefore very important. However, existing survivable algorithms achieve only a unilateral optimization of profit, for either users or network operators. They cannot find the double-win optimal solution that considers economic factors for both users and network operators. In this paper we therefore develop a multi-domain network model involving multiple Quality of Service (QoS) parameters. After presenting a link evaluation approach based on fuzzy mathematics, we propose a game model to find the optimal solution that maximizes the user's utility, the network operator's utility, and the joint utility of user and network operator. Since the problem of finding the double-win optimal solution is NP-complete, we propose two new hybrid protection algorithms, the Intra-domain Sub-path Protection (ISP) algorithm and the Inter-domain End-to-end Protection (IEP) algorithm. In ISP and IEP, hybrid protection means that an intelligent algorithm based on Bacterial Colony Optimization (BCO) and a heuristic algorithm are used to solve the survivability of intra-domain routing and inter-domain routing, respectively. Simulation results show that ISP and IEP have similar comprehensive utility. In addition, ISP has better resource utilization efficiency, lower blocking probability, and higher network operator's utility, while IEP has better user's utility.
Performance of the Wavelet Decomposition on Massively Parallel Architectures
NASA Technical Reports Server (NTRS)
El-Ghazawi, Tarek A.; LeMoigne, Jacqueline; Zukor, Dorothy (Technical Monitor)
2001-01-01
Traditionally, Fourier transforms have been utilized for signal analysis and representation. Although it is straightforward to reconstruct a signal from its Fourier transform, no local description of the signal is included in its Fourier representation. To alleviate this problem, windowed Fourier transforms and then wavelet transforms were introduced, and it has been proven that wavelets give better localization than traditional Fourier transforms, as well as a better division of the time- or space-frequency plane than windowed Fourier transforms. Because of these properties, and after the development of several fast algorithms for computing the wavelet representation of any signal, in particular the Multi-Resolution Analysis (MRA) developed by Mallat, wavelet transforms have increasingly been applied to signal analysis problems, especially real-life problems in which speed is critical. In this paper we present and compare efficient wavelet decomposition algorithms on different parallel architectures. We report and analyze experimental measurements using NASA remotely sensed images. Results show that our algorithms achieve significant performance gains on current high performance parallel systems, and meet the requirements of scientific and multimedia applications. The extensive performance measurements collected over a number of high-performance computer systems have revealed important architectural characteristics of these systems in relation to the processing demands of the wavelet decomposition of digital images.
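As a point of reference, one level of the separable 2-D Haar decomposition used in such MRA pyramids takes only a few lines (illustrative Python; the parallel versions studied in the paper distribute the row and column filtering passes across processors):

    import numpy as np

    def haar2d_level(img):
        """One MRA level: approximation LL plus detail bands LH, HL, HH.
        Image dimensions are assumed even."""
        a = (img[:, 0::2] + img[:, 1::2]) / 2.0   # row low-pass
        d = (img[:, 0::2] - img[:, 1::2]) / 2.0   # row high-pass
        LL = (a[0::2] + a[1::2]) / 2.0            # column low-pass of a
        LH = (a[0::2] - a[1::2]) / 2.0
        HL = (d[0::2] + d[1::2]) / 2.0
        HH = (d[0::2] - d[1::2]) / 2.0
        return LL, LH, HL, HH

Recursing on LL yields the multi-resolution pyramid; the data-parallel question is then how to lay these passes out across processors and memory hierarchies.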
Adaptive bearing estimation and tracking of multiple targets in a realistic passive sonar scenario
NASA Astrophysics Data System (ADS)
Rajagopal, R.; Challa, Subhash; Faruqi, Farhan A.; Rao, P. R.
1997-06-01
In a realistic passive sonar environment, the received signal consists of multipath arrivals from closely separated moving targets, and the signals are contaminated by spatially correlated noise. Differential MUSIC has been proposed to estimate the DOAs in such a scenario. This method estimates the 'noise subspace' in order to estimate the DOAs. However, the noise subspace estimate has to be updated as and when new data become available. In order to save computational costs, a new adaptive noise subspace estimation algorithm is proposed in this paper. The salient features of the proposed algorithm are: (1) the noise subspace is estimated by QR decomposition of the difference matrix formed from the data covariance matrix; thus, compared to standard eigendecomposition-based methods which require O(N^3) computations, the proposed method requires only O(N^2) computations; (2) the noise subspace is updated by updating the QR decomposition; (3) the proposed algorithm works in a realistic sonar environment. In the second part of the paper, the estimated bearing values are used to track multiple targets. To achieve this, the proposed nonlinear-system/linear-measurement extended Kalman filter is applied. Computer simulation results are also presented to support the theory.
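Feature (1) can be sketched as follows (illustrative Python under simplifying assumptions; the paper forms the difference matrix from the data covariance so that the spatially correlated noise cancels, and feature (2) would update the Q and R factors incrementally rather than refactorize):

    import numpy as np

    def noise_subspace(R_a, R_b, n_sources):
        """Estimate the noise subspace from a covariance difference matrix
        via QR decomposition instead of eigendecomposition."""
        D = R_a - R_b                  # correlated-noise contribution cancels
        Q, R = np.linalg.qr(D)         # leading columns span the signal subspace
        return Q[:, n_sources:]        # remaining columns: noise-subspace estimate

A MUSIC-style spectrum then peaks where the steering vector a(theta) is nearly orthogonal to the returned columns En, i.e. at the maxima of 1 / ||En^H a(theta)||^2.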
Error analysis of multipoint flux domain decomposition methods for evolutionary diffusion problems
NASA Astrophysics Data System (ADS)
Arrarás, A.; Portero, L.; Yotov, I.
2014-01-01
We study space and time discretizations for mixed formulations of parabolic problems. The spatial approximation is based on the multipoint flux mixed finite element method, which reduces to an efficient cell-centered pressure system on general grids, including triangles, quadrilaterals, tetrahedra, and hexahedra. The time integration is performed by using a domain decomposition time-splitting technique combined with multiterm fractional step diagonally implicit Runge-Kutta methods. The resulting scheme is unconditionally stable and computationally efficient, as it reduces the global system to a collection of uncoupled subdomain problems that can be solved in parallel without the need for Schwarz-type iteration. Convergence analysis for both the semidiscrete and fully discrete schemes is presented.
NASA Astrophysics Data System (ADS)
Bertrand, G.; Comperat, M.; Lallemant, M.
1980-09-01
Copper sulfate pentahydrate dehydration into trihydrate was investigated using monocrystalline platelets with (110) crystallographic orientation. Temperature and pressure conditions were selected so as to obtain elliptical trihydrate domains. The study deals with the evolution of the elliptical domain dimensions over time and, as a function of water vapor pressure, with the D/d ratio of the ellipse axes and the interface displacement rate along a given direction. The phenomena observed are not fundamentally different from those yielded by the overall kinetic study of the solid sample; their magnitude, however, is modulated depending on displacement direction. The results are analyzed within the scope of our study of the endothermic decomposition of solids.
Domain decomposition methods in computational fluid dynamics
NASA Technical Reports Server (NTRS)
Gropp, William D.; Keyes, David E.
1991-01-01
The divide-and-conquer paradigm of iterative domain decomposition, or substructuring, has become a practical tool in computational fluid dynamic applications because of its flexibility in accommodating adaptive refinement through locally uniform (or quasi-uniform) grids, its ability to exploit multiple discretizations of the operator equations, and the modular pathway it provides towards parallelism. These features are illustrated on the classic model problem of flow over a backstep using Newton's method as the nonlinear iteration. Multiple discretizations (second-order in the operator and first-order in the preconditioner) and locally uniform mesh refinement pay dividends separately, and they can be combined synergistically. Sample performance results are included from an Intel iPSC/860 hypercube implementation.
Domain decomposition methods for nonconforming finite element spaces of Lagrange-type
NASA Technical Reports Server (NTRS)
Cowsar, Lawrence C.
1993-01-01
In this article, we consider the application of three popular domain decomposition methods to Lagrange-type nonconforming finite element discretizations of scalar, self-adjoint, second order elliptic equations. The additive Schwarz method of Dryja and Widlund, the vertex space method of Smith, and the balancing method of Mandel applied to nonconforming elements are shown to converge at a rate no worse than their applications to the standard conforming piecewise linear Galerkin discretization. Essentially, the theory for the nonconforming elements is inherited from the existing theory for the conforming elements with only modest modification by constructing an isomorphism between the nonconforming finite element space and a space of continuous piecewise linear functions.
Galerkin-collocation domain decomposition method for arbitrary binary black holes
NASA Astrophysics Data System (ADS)
Barreto, W.; Clemente, P. C. M.; de Oliveira, H. P.; Rodriguez-Mueller, B.
2018-05-01
We present a new computational framework for the Galerkin-collocation method for a double domain in the context of the ADM 3+1 approach in numerical relativity. This work enables us to perform high-resolution calculations of initial data for two arbitrary black holes. We use the Bowen-York method for binary systems and the puncture method to solve the Hamiltonian constraint. The nonlinear numerical code solves the set of equations for the spectral modes using the standard Newton-Raphson method, LU decomposition, and Gaussian quadratures. We show convergence of our code for the conformal factor and the ADM mass, and display features of the conformal factor for different masses, spins, and linear momenta.
Domain decomposition methods in aerodynamics
NASA Technical Reports Server (NTRS)
Venkatakrishnan, V.; Saltz, Joel
1990-01-01
Compressible Euler equations are solved for two-dimensional problems by a preconditioned conjugate gradient-like technique. An approximate Riemann solver is used to compute the numerical fluxes to second-order accuracy in space. Two ways to achieve parallelism are tested: one makes use of the parallelism inherent in triangular solves, and the other employs domain decomposition techniques. The vectorization/parallelism in triangular solves is realized by the use of a reordering technique called wavefront ordering. This process involves interpreting the triangular matrix as a directed graph and analyzing the data dependencies. It is noted that the factorization can also be done in parallel with wavefront ordering. The performances of two ways of partitioning the domain, strips and slabs, are compared. Results on a Cray Y-MP are reported for an inviscid transonic test case. The performances of linear algebra kernels are also reported.
Segmented Domain Decomposition Multigrid For 3-D Turbomachinery Flows
NASA Technical Reports Server (NTRS)
Celestina, M. L.; Adamczyk, J. J.; Rubin, S. G.
2001-01-01
A Segmented Domain Decomposition Multigrid (SDDMG) procedure was developed for three-dimensional viscous flow problems as they apply to turbomachinery flows. The procedure divides the computational domain into a coarse mesh comprised of uniformly spaced cells. To resolve smaller length scales such as the viscous layer near a surface, segments of the coarse mesh are subdivided into a finer mesh. This is repeated until adequate resolution of the smallest relevant length scale is obtained. Multigrid is used to communicate information between the different grid levels. To test the procedure, simulation results will be presented for a compressor and turbine cascade. These simulations are intended to show the ability of the present method to generate grid independent solutions. Comparisons with data will also be presented. These comparisons will further demonstrate the usefulness of the present work for they allow an estimate of the accuracy of the flow modeling equations independent of error attributed to numerical discretization.
Artifact removal from EEG data with empirical mode decomposition
NASA Astrophysics Data System (ADS)
Grubov, Vadim V.; Runnova, Anastasiya E.; Efremova, Tatyana Yu.; Hramov, Alexander E.
2017-03-01
In this paper we propose a novel method for dealing with physiological artifacts caused by intensive activity of facial and neck muscles and other movements in experimental human EEG recordings. The method is based on analysis of EEG signals with empirical mode decomposition (the Hilbert-Huang transform). We introduce the mathematical algorithm of the method with the following steps: empirical mode decomposition of the EEG signal, identification of the empirical modes containing artifacts, removal of those modes, and reconstruction of the initial EEG signal. We test the method by filtering movement artifacts from experimental human EEG signals and show its high efficiency.
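The four steps map directly onto code; a minimal sketch follows (assuming a PyEMD-style emd routine and a hypothetical is_artifact test, e.g. thresholding a mode's energy or its correlation with a muscle-activity reference):

    import numpy as np
    from PyEMD import EMD   # assumed available; any EMD implementation would do

    def remove_artifacts(eeg, is_artifact):
        imfs = EMD().emd(eeg)                            # step 1: decompose into IMFs
        clean = [m for m in imfs if not is_artifact(m)]  # steps 2-3: drop artifact modes
        return np.sum(clean, axis=0)                     # step 4: reconstruct the signal

    # hypothetical criterion: flag unusually energetic modes
    # (e.g. an amplitude threshold in microvolts, tuned per recording)
    is_artifact = lambda m: np.std(m) > 20.0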
NASA Astrophysics Data System (ADS)
Yamamoto, Takanori; Bannai, Hideo; Nagasaki, Masao; Miyano, Satoru
We present new decomposition heuristics for finding the optimal solution for the maximum-weight connected graph problem, which is known to be NP-hard. Previous optimal algorithms for solving the problem decompose the input graph into subgraphs using heuristics based on node degree. We propose new heuristics based on betweenness centrality measures, and show through computational experiments that our new heuristics tend to reduce the number of subgraphs in the decomposition, and therefore could lead to the reduction in computational time for finding the optimal solution. The method is further applied to analysis of biological pathway data.
NASA Technical Reports Server (NTRS)
Fiske, David R.
2004-01-01
In an earlier paper, Misner (2004, Class. Quant. Grav., 21, S243) presented a novel algorithm for computing the spherical harmonic components of data represented on a cubic grid. I extend Misner's original analysis by making detailed error estimates of the numerical errors accrued by the algorithm, by using symmetry arguments to suggest a more efficient implementation scheme, and by explaining how the algorithm can be applied efficiently on data with explicit reflection symmetries.
NASA Astrophysics Data System (ADS)
Safari, A.; Sharifi, M. A.; Amjadiparvar, B.
2010-05-01
The GRACE mission has substantiated the low-low satellite-to-satellite tracking (LL-SST) concept. The LL-SST configuration can be combined with the previously realized high-low SST concept of the CHAMP mission to provide much higher accuracy. The line-of-sight (LOS) acceleration difference between the GRACE satellite pair is the most commonly used observable for mapping the global gravity field of the Earth in terms of spherical harmonic coefficients. In this paper, mathematical formulae for LOS acceleration difference observations are derived and the corresponding linear system of equations is set up for spherical harmonics up to degree and order 120, giving 14641 unknowns in total. Such a linear system can be solved with iterative or direct solvers. However, the runtime of direct methods, or of iterative solvers without a suitable preconditioner, increases tremendously, which is why a more sophisticated method is needed to solve linear systems with a large number of unknowns. The multiplicative variant of the Schwarz alternating algorithm is a domain decomposition method that splits the normal matrix of the system into several smaller overlapping submatrices. In each iteration step, the algorithm solves the linear systems with the matrices obtained from the splitting successively, reducing both runtime and memory requirements drastically. In this paper we propose the Multiplicative Schwarz Alternating Algorithm (MSAA) for solving the large linear system of gravity field recovery. The proposed algorithm has been tested on International Association of Geodesy (IAG)-simulated data of the GRACE mission. The results indicate the validity and efficiency of the proposed algorithm in solving the linear system of equations, in terms of both accuracy and runtime. Keywords: Gravity field recovery, Multiplicative Schwarz Alternating Algorithm, Low-Low Satellite-to-Satellite Tracking
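One sweep of the multiplicative Schwarz iteration at the algebraic level is short enough to sketch (illustrative Python; the overlapping index blocks stand in for the submatrices obtained from splitting the 14641-unknown normal matrix):

    import numpy as np

    def multiplicative_schwarz(A, b, blocks, sweeps=50):
        """blocks: list of (possibly overlapping) index arrays covering all
        unknowns. Each local correction uses the current global residual."""
        x = np.zeros_like(b)
        for _ in range(sweeps):
            for idx in blocks:
                r = b - A @ x                                  # global residual
                x[idx] += np.linalg.solve(A[np.ix_(idx, idx)], r[idx])
        return x

Because each step factors only a small overlapped submatrix, both the memory footprint and the per-iteration cost drop sharply relative to factoring the full normal matrix.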
Data Mining and Optimization Tools for Developing Engine Parameters Tools
NASA Technical Reports Server (NTRS)
Dhawan, Atam P.
1998-01-01
This project was awarded for understanding the problem and developing a plan for data mining tools for use in designing and implementing an Engine Condition Monitoring System. From the total budget of $5,000, Tricia and I studied the problem domain for developing an Engine Condition Monitoring system using the sparse and non-standardized datasets to be made available through a consortium at NASA Lewis Research Center. We visited NASA three times to discuss additional issues related to the dataset, which was not made available to us. We discussed and developed a general framework of data mining and optimization tools to extract useful information from sparse and non-standard datasets. These discussions led to the training of Tricia Erhardt to develop genetic algorithm (GA) based search programs, which were written in C++ and used to demonstrate the capability of the GA algorithm to search for an optimal solution in noisy datasets. From the study and discussions with NASA LeRC personnel, we then prepared a proposal, which is being submitted to NASA, for future work on the development of data mining algorithms for engine condition monitoring. The proposed set of algorithms uses wavelet processing to create a multi-resolution pyramid of the data for GA-based multi-resolution optimal search. Wavelet processing is proposed to create a coarse-resolution representation of the data, providing two advantages in GA-based search: (1) there is less data to begin with when forming search sub-spaces, and (2) the search is robust against noise, because at every level of the wavelet decomposition the signal is separated into low-pass and high-pass components.
A simple and efficient algorithm operating with linear time for MCEEG data compression.
Titus, Geevarghese; Sudhakar, M S
2017-09-01
The popularisation of electroencephalograph (EEG) signals in diversified fields has increased the need for devices capable of operating at lower power and storage requirements. This has led to a great deal of research in data compression that can address (a) low latency in the coding of the signal, (b) reduced hardware and software dependencies, (c) quantification of system anomalies, and (d) effective reconstruction of the compressed signal. This paper proposes a computationally simple and novel coding scheme named spatial pseudo codec (SPC) to achieve lossy to near-lossless compression of multichannel EEG (MCEEG). In the proposed system, MCEEG signals are initially normalized, followed by two parallel processes: one operating on the integer part and the other on the fractional part of the normalized data. The redundancies in the integer part are exploited using a spatial domain encoder, and the fractional part is coded as pseudo integers. The proposed method has been tested on a wide range of databases having variable sampling rates and resolutions. Results indicate that the algorithm has a good recovery performance, with an average percentage root mean square deviation (PRD) of 2.72 for an average compression ratio (CR) of 3.16. Furthermore, the algorithm has a complexity of only O(n), with average encoding and decoding times per sample of 0.3 ms and 0.04 ms respectively. The performance of the algorithm is comparable with recent methods such as fast discrete cosine transform (fDCT) and tensor decomposition methods. The results validate the feasibility of the proposed compression scheme for practical MCEEG recording, archiving and brain-computer interfacing systems.
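The front end of the scheme, normalization followed by the integer/fraction split, can be sketched as follows (illustrative Python; the SPC spatial encoder and the pseudo-integer coder that follow are not reproduced, and the scale and precision choices here are assumptions):

    import numpy as np

    def split_streams(mceeg, scale=8, frac_digits=2):
        """mceeg: channels x samples array. Returns the integer stream
        (fed to the spatial-domain encoder) and the fractional stream
        (coded as pseudo integers)."""
        norm = scale * mceeg / np.max(np.abs(mceeg))     # normalization
        ipart = np.trunc(norm).astype(np.int16)
        fpart = np.round((norm - ipart) * 10**frac_digits).astype(np.int16)
        return ipart, fpart

Truncating toward zero keeps the fractional residue in (-1, 1), so both streams consist of small integers, which is what makes simple linear-time entropy coding effective.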
A Flexible Method for Multi-Material Decomposition of Dual-Energy CT Images.
Mendonca, Paulo R S; Lamb, Peter; Sahani, Dushyant V
2014-01-01
The ability of dual-energy computed-tomographic (CT) systems to determine the concentration of constituent materials in a mixture, known as material decomposition, is the basis for many of dual-energy CT's clinical applications. However, the complex composition of tissues and organs in the human body poses a challenge for many material decomposition methods, which assume the presence of only two, or at most three, materials in the mixture. We developed a flexible, model-based method that extends dual-energy CT's core material decomposition capability to handle more complex situations, in which it is necessary to disambiguate among and quantify the concentration of a larger number of materials. The proposed method, named multi-material decomposition (MMD), was used to develop two image analysis algorithms. The first was virtual unenhancement (VUE), which digitally removes the effect of contrast agents from contrast-enhanced dual-energy CT exams. VUE has the ability to reduce patient dose and improve clinical workflow, and can be used in a number of clinical applications such as CT urography and CT angiography. The second algorithm developed was liver-fat quantification (LFQ), which accurately quantifies the fat concentration in the liver from dual-energy CT exams. LFQ can form the basis of a clinical application targeting the diagnosis and treatment of fatty liver disease. Using image data collected from a cohort consisting of 50 patients and from phantoms, the application of MMD to VUE and LFQ yielded quantitatively accurate results when compared against gold standards. Furthermore, consistent results were obtained across all phases of imaging (contrast-free and contrast-enhanced). This is of particular importance since most clinical protocols for abdominal imaging with CT call for multi-phase imaging. We conclude that MMD can successfully form the basis of a number of dual-energy CT image analysis algorithms, and has the potential to improve the clinical utility of dual-energy CT in disease management.
Multiscale Methods, Parallel Computation, and Neural Networks for Real-Time Computer Vision.
NASA Astrophysics Data System (ADS)
Battiti, Roberto
1990-01-01
This thesis presents new algorithms for low and intermediate level computer vision. The guiding ideas in the presented approach are those of hierarchical and adaptive processing, concurrent computation, and supervised learning. Processing of the visual data at different resolutions is used not only to reduce the amount of computation necessary to reach the fixed point, but also to produce a more accurate estimation of the desired parameters. The presented adaptive multiple scale technique is applied to the problem of motion field estimation. Different parts of the image are analyzed at a resolution that is chosen in order to minimize the error in the coefficients of the differential equations to be solved. Tests with video-acquired images show that velocity estimation is more accurate over a wide range of motion with respect to the homogeneous scheme. In some cases introduction of explicit discontinuities coupled to the continuous variables can be used to avoid propagation of visual information from areas corresponding to objects with different physical and/or kinematic properties. The human visual system uses concurrent computation in order to process the vast amount of visual data in "real-time." Although with different technological constraints, parallel computation can be used efficiently for computer vision. All the presented algorithms have been implemented on medium grain distributed memory multicomputers with a speed-up approximately proportional to the number of processors used. A simple two-dimensional domain decomposition assigns regions of the multiresolution pyramid to the different processors. The inter-processor communication needed during the solution process is proportional to the linear dimension of the assigned domain, so that efficiency is close to 100% if a large region is assigned to each processor. Finally, learning algorithms are shown to be a viable technique to engineer computer vision systems for different applications starting from multiple-purpose modules. In the last part of the thesis a well known optimization method (the Broyden-Fletcher-Goldfarb-Shanno memoryless quasi-Newton method) is applied to simple classification problems and shown to be superior to the "error back-propagation" algorithm for numerical stability, automatic selection of parameters, and convergence properties.
Remote listening and passive acoustic detection in a 3-D environment
NASA Astrophysics Data System (ADS)
Barnhill, Colin
Teleconferencing environments are a necessity in business, education and personal communication. They allow for the communication of information to remote locations without the need for travel and the necessary time and expense required for that travel. Visual information can be communicated using cameras and monitors. The advantage of visual communication is that an image can capture multiple objects and convey them, using a monitor, to a large group of people regardless of the receiver's location. This is not the case for audio. Currently, most experimental teleconferencing systems' audio is based on stereo recording and reproduction techniques. The problem with this solution is that it is only effective for one or two receivers. To accurately capture a sound environment consisting of multiple sources and to recreate that for a group of people is an unsolved problem. This work will focus on new methods of multiple source 3-D environment sound capture and applications using these captured environments. Using spherical microphone arrays, it is now possible to capture a true 3-D environment. A spherical harmonic transform on the array's surface allows us to determine the basis functions (spherical harmonics) for all spherical wave solutions (up to a fixed order). This spherical harmonic decomposition (SHD) allows us to not only look at the time and frequency characteristics of an audio signal but also the spatial characteristics of an audio signal. In this way, a spherical harmonic transform is analogous to a Fourier transform in that a Fourier transform transforms a signal into the frequency domain and a spherical harmonic transform transforms a signal into the spatial domain. The SHD also decouples the input signals from the microphone locations. Using the SHD of a soundfield, new algorithms are available for remote listening, acoustic detection, and signal enhancement. The new algorithms presented in this paper show distinct advantages over previous detection and listening algorithms, especially for multiple speech sources and room environments. The algorithms use high order (spherical harmonic) beamforming and power signal characteristics for source localization and signal enhancement. These methods are applied to remote listening, surveillance, and teleconferencing.
Li, Jianjun; Zhang, Rubo; Yang, Yu
2017-01-01
We study a distributed task planning model for multiple autonomous underwater vehicles (MAUV). A scroll time domain quantum artificial bee colony (STDQABC) optimization algorithm is proposed to solve the multi-AUV optimal task planning problem. In the uncertain marine environment, the rolling time domain control technique is used to perform a numerical optimization over a narrowed time range. Rolling time domain control is one of the better task planning techniques, as it can greatly reduce the computational workload and realize the tradeoff between AUV dynamics, environment, and cost. Finally, a simulation experiment was performed to evaluate the distributed task planning performance of the STDQABC algorithm. The simulation results demonstrate that the STDQABC algorithm converges faster than the QABC and ABC algorithms in terms of both iterations and running time. The STDQABC algorithm can effectively improve MAUV distributed task planning performance, complete the task goal, and obtain an approximately optimal solution.
Laamiri, Imen; Khouaja, Anis; Messaoud, Hassani
2015-03-01
In this paper we provide a convergence analysis of the alternating RGLS (Recursive Generalized Least Squares) algorithm used for the identification of the reduced-complexity Volterra model describing stochastic non-linear systems. The reduced Volterra model used is the third-order SVD-PARAFAC-Volterra model, obtained using the Singular Value Decomposition (SVD) and the Parallel Factor (PARAFAC) tensor decomposition of the quadratic and cubic kernels, respectively, of the classical Volterra model. The Alternating RGLS (ARGLS) algorithm consists of executing the classical RGLS algorithm in an alternating way. The ARGLS convergence is proved using the Ordinary Differential Equation (ODE) method. It is noted that convergence cannot be ensured when the disturbance acting on the system to be identified has certain specific features. The ARGLS algorithm is tested in simulations on a numerical example satisfying the determined convergence conditions. To assess the merits of the proposed algorithm, we compare it with the classical Alternating Recursive Least Squares (ARLS) algorithm presented in the literature. The comparison is carried out on a non-linear satellite channel and a benchmark CSTR (Continuous Stirred Tank Reactor) system. Moreover, the efficiency of the proposed identification approach is demonstrated on an experimental Communicating Two-Tank System (CTTS).
A Nonlinear Calibration Algorithm Based on Harmonic Decomposition for Two-Axis Fluxgate Sensors
Liu, Shibin
2018-01-01
Nonlinearity is a prominent limitation on the calibration performance of two-axis fluxgate sensors. In this paper, a novel nonlinear calibration algorithm taking into account the nonlinearity of the errors is proposed. In order to establish the nonlinear calibration model, the combined effect of all time-invariant errors is analyzed in detail, and the harmonic decomposition method is then utilized to estimate the compensation coefficients. The proposed nonlinear calibration algorithm is validated and compared with a classical calibration algorithm by experiment. The experimental results show that, after the nonlinear calibration, the maximum deviation of the magnetic field magnitude is decreased from 1302 nT to 30 nT, compared with 81 nT after the classical calibration. Furthermore, for the two-axis fluxgate sensor used as a magnetic compass, the maximum heading error is corrected from 1.86° to 0.07°, approximately 11% of the 0.62° remaining after the classical calibration. The results suggest an effective way to improve the calibration performance of two-axis fluxgate sensors.
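The harmonic-decomposition idea can be illustrated by a least-squares fit of low-order Fourier coefficients to the heading-dependent error measured during a rotation experiment (a generic sketch under assumed conventions, not the authors' full error model):

    import numpy as np

    def fit_harmonics(theta, err, order=2):
        """Fit err(theta) ~ c0 + sum_k (a_k cos k*theta + b_k sin k*theta)."""
        cols = [np.ones_like(theta)]
        for k in range(1, order + 1):
            cols += [np.cos(k * theta), np.sin(k * theta)]
        H = np.column_stack(cols)
        coef, *_ = np.linalg.lstsq(H, err, rcond=None)   # compensation coefficients
        return coef                                       # subtract H @ coef to compensate

Offset and scale-factor/misalignment effects typically appear in distinct harmonics of the heading, which is why estimating the harmonic coefficients separately can capture nonlinearity that a single linear calibration misses.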
NASA Astrophysics Data System (ADS)
Pan, Xiao-Min; Wei, Jian-Gong; Peng, Zhen; Sheng, Xin-Qing
2012-02-01
The interpolative decomposition (ID) is combined with the multilevel fast multipole algorithm (MLFMA), denoted by ID-MLFMA, to handle multiscale problems. The ID-MLFMA first generates ID levels by recursively dividing the boxes at the finest MLFMA level into smaller boxes. It is specifically shown that near-field interactions with respect to the MLFMA, in the form of the matrix vector multiplication (MVM), are efficiently approximated at the ID levels. Meanwhile, computations on far-field interactions at the MLFMA levels remain unchanged. Only a small portion of matrix entries are required to approximate coupling among well-separated boxes at the ID levels, and these submatrices can be filled without computing the complete original coupling matrix. It follows that the matrix filling in the ID-MLFMA becomes much less expensive. The memory consumed is thus greatly reduced and the MVM is accelerated as well. Several factors that may influence the accuracy, efficiency and reliability of the proposed ID-MLFMA are investigated by numerical experiments. Complex targets are calculated to demonstrate the capability of the ID-MLFMA algorithm.
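For reference, an interpolative decomposition approximates a matrix from a subset of its own "skeleton" columns, A ≈ A[:, idx[:k]] @ P; SciPy ships an implementation, used here purely as a usage sketch (unrelated to the authors' MLFMA code):

    import numpy as np
    import scipy.linalg.interpolative as sli

    rng = np.random.default_rng(0)
    A = rng.standard_normal((200, 40)) @ rng.standard_normal((40, 300))  # low-rank block
    k, idx, proj = sli.interp_decomp(A, 1e-8)       # rank picked to meet the tolerance
    B = A[:, idx[:k]]                               # only the skeleton columns are stored
    P = sli.reconstruct_interp_matrix(idx, proj)
    print(k, np.linalg.norm(A - B @ P) / np.linalg.norm(A))

Because the skeleton columns are actual columns of the coupling block, they can be generated on the fly without ever forming the complete matrix, which is the source of the memory savings reported above.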
Robust global image registration based on a hybrid algorithm combining Fourier and spatial domain techniques
Crabtree, Peter N.; Seanor, Collin
2012-09-01
Performance of a hybrid Fourier/spatial-domain registration algorithm is demonstrated using a set of images of an ISO 12233 resolution chart.
Frequency-domain-independent vector analysis for mode-division multiplexed transmission
NASA Astrophysics Data System (ADS)
Liu, Yunhe; Hu, Guijun; Li, Jiao
2018-04-01
In this paper, we propose a demultiplexing method based on the frequency-domain independent vector analysis (FD-IVA) algorithm for mode-division multiplexing (MDM) systems. FD-IVA extends frequency-domain independent component analysis (FD-ICA) from univariate to multivariate variables, and provides an efficient way to eliminate the permutation ambiguity. In order to verify the performance of the FD-IVA algorithm, a 6×6 MDM system is simulated. The simulation results show that the FD-IVA algorithm has essentially the same bit-error-rate (BER) performance as the FD-ICA algorithm and the frequency-domain least mean squares (FD-LMS) algorithm, and the same convergence speed as FD-ICA. However, compared with FD-ICA and FD-LMS, FD-IVA has a markedly lower computational complexity.
Adaptive Fourier decomposition based ECG denoising.
Wang, Ze; Wan, Feng; Wong, Chi Man; Zhang, Liming
2016-10-01
A novel ECG denoising method is proposed based on the adaptive Fourier decomposition (AFD). The AFD decomposes a signal according to its energy distribution, making the algorithm suitable for separating a pure ECG signal from noise with overlapping frequency ranges but different energy distributions. A stop criterion for the iterative decomposition process in the AFD is calculated on the basis of the estimated signal-to-noise ratio (SNR) of the noisy signal. The proposed AFD-based method is validated on a synthetic ECG signal generated from an ECG model, and also on real ECG signals from the MIT-BIH Arrhythmia Database, both with additive Gaussian white noise. Simulation results show that the proposed method performs better in denoising and QRS detection than major ECG denoising schemes based on the wavelet transform, the Stockwell transform, empirical mode decomposition, and ensemble empirical mode decomposition.
NASA Astrophysics Data System (ADS)
Rai, Prashant; Sargsyan, Khachik; Najm, Habib; Hermes, Matthew R.; Hirata, So
2017-09-01
A new method is proposed for a fast evaluation of high-dimensional integrals of potential energy surfaces (PES) that arise in many areas of quantum dynamics. It decomposes a PES into a canonical low-rank tensor format, reducing its integral into a relatively short sum of products of low-dimensional integrals. The decomposition is achieved by the alternating least squares (ALS) algorithm, requiring only a small number of single-point energy evaluations. Therefore, it eradicates a force-constant evaluation as the hotspot of many quantum dynamics simulations and also possibly lifts the curse of dimensionality. This general method is applied to the anharmonic vibrational zero-point and transition energy calculations of molecules using the second-order diagrammatic vibrational many-body Green's function (XVH2) theory with a harmonic-approximation reference. In this application, high-dimensional PES and Green's functions are both subjected to a low-rank decomposition. Evaluating the molecular integrals over a low-rank PES and Green's functions as sums of low-dimensional integrals using the Gauss-Hermite quadrature, this canonical-tensor-decomposition-based XVH2 (CT-XVH2) achieves an accuracy of 0.1 cm^-1 or better and nearly an order of magnitude speedup as compared with the original algorithm using force constants for water and formaldehyde.
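The ALS step that produces the canonical (CP) format can be sketched for a third-order tensor (generic illustrative Python, not the authors' PES code; a real implementation would normalize the factors and monitor the fit):

    import numpy as np

    def unfold(T, mode):                      # mode-n matricization
        return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

    def khatri_rao(X, Y):                     # column-wise Kronecker product
        return (X[:, None, :] * Y[None, :, :]).reshape(-1, X.shape[1])

    def cp_als(T, rank, n_iter=100, seed=0):
        """Fit T ~ sum_r a_r (outer) b_r (outer) c_r by alternating least squares."""
        rng = np.random.default_rng(seed)
        A, B, C = (rng.standard_normal((n, rank)) for n in T.shape)
        for _ in range(n_iter):
            A = unfold(T, 0) @ np.linalg.pinv(khatri_rao(B, C).T)
            B = unfold(T, 1) @ np.linalg.pinv(khatri_rao(A, C).T)
            C = unfold(T, 2) @ np.linalg.pinv(khatri_rao(A, B).T)
        return A, B, C

Once a PES is stored this way, a high-dimensional quadrature factorizes into products of one-dimensional Gauss-Hermite sums over the factor matrices, which is what removes the exponential scaling.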
Empirical mode decomposition-based facial pose estimation inside video sequences
NASA Astrophysics Data System (ADS)
Qing, Chunmei; Jiang, Jianmin; Yang, Zhijing
2010-03-01
We describe a new pose-estimation algorithm that integrates the strengths of both empirical mode decomposition (EMD) and mutual information. While mutual information is exploited to measure the similarity between facial images in order to estimate poses, EMD is exploited to decompose input facial images into a number of intrinsic mode function (IMF) components, redistributing the effects of noise, expression changes, and illumination variations such that, when the input facial image is described by the selected IMF components, all of these negative effects can be minimized. Extensive experiments were carried out in comparison with existing representative techniques, and the results show that the proposed algorithm achieves better pose-estimation performance with robustness to noise corruption, illumination variation, and facial expressions.
A robust watermarking scheme using lifting wavelet transform and singular value decomposition
NASA Astrophysics Data System (ADS)
Bhardwaj, Anuj; Verma, Deval; Verma, Vivek Singh
2017-01-01
The present paper proposes a robust image watermarking scheme using the lifting wavelet transform (LWT) and singular value decomposition (SVD). A second-level LWT is applied to the host/cover image to decompose it into different subbands. SVD is used to obtain the singular values of the watermark image, and these singular values are then used to update the singular values of the LH2 subband. The algorithm is tested on a number of benchmark images and found to be robust against different geometric and image-processing operations. A comparison with other existing schemes shows that the present scheme is better not only in terms of robustness but also in terms of imperceptibility.
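The embedding rule can be sketched as follows (illustrative Python: pywt's wavedec2 serves as a stand-in for the second-level lifting transform, the horizontal-detail band stands in for LH2, and alpha is an assumed embedding strength):

    import numpy as np
    import pywt

    def embed(host, wm, alpha=0.05):
        """Add the watermark's singular values into the level-2 detail band."""
        cA2, (cH2, cV2, cD2), d1 = pywt.wavedec2(host, 'haar', level=2)
        sw = np.linalg.svd(wm, compute_uv=False)       # watermark singular values
        U, s, Vt = np.linalg.svd(cH2, full_matrices=False)
        n = min(len(s), len(sw))
        s[:n] += alpha * sw[:n]                        # update the subband spectrum
        return pywt.waverec2([cA2, ((U * s) @ Vt, cV2, cD2), d1], 'haar')

Extraction would run the same transform on the suspect image and recover the watermark singular values from the difference of the two subband spectra.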
NASA Astrophysics Data System (ADS)
Kendon, Vivien M.; Cates, Michael E.; Pagonabarraga, Ignacio; Desplat, J.-C.; Bladon, Peter
2001-08-01
The late-stage demixing following spinodal decomposition of a three-dimensional symmetric binary fluid mixture is studied numerically, using a thermodynamically consistent lattice Boltzmann method. We combine results from simulations with different numerical parameters to obtain an unprecedented range of length and time scales when expressed in reduced physical units (the length and time units derived from fluid density, viscosity, and interfacial tension). Using eight large (256^3) runs, the resulting composite graph of reduced domain size l against reduced time t covers 1 ≲ l ≲ 10^5, 10 ≲ t ≲ 10^8. Our data are consistent with the dynamical scaling hypothesis that l(t) is a universal scaling curve. We give the first detailed statistical analysis of fluid motion, rather than just domain evolution, in simulations of this kind, and introduce scaling plots for several quantities derived from the fluid velocity and velocity gradient fields. Using the conventional definition of Reynolds number for this problem, Re_φ = l dl/dt, we attain values approaching 350. At Re_φ ≳ 100 (which requires t ≳ 10^6) we find clear evidence of Furukawa's inertial scaling (l ~ t^(2/3)), although the crossover from the viscous regime (l ~ t) is both broad and late (10^2 ≲ t ≲ 10^6). Though it cannot be ruled out, we find no indication that Re_φ is self-limiting (l ~ t^(1/2)) at late times, as recently proposed by Grant & Elder. Detailed study of the velocity fields confirms that, for our most inertial runs, the RMS ratio of nonlinear to viscous terms in the Navier-Stokes equation, R2, is of order 10, with the fluid mixture showing incipient turbulent characteristics. However, we cannot go far enough into the inertial regime to obtain a clear length separation of domain size, Taylor microscale, and Kolmogorov scale, as would be needed to test a recent 'extended' scaling theory of Kendon (in which R2 is self-limiting but Re_φ is not). Obtaining our results has required careful steering of several numerical control parameters so as to maintain adequate algorithmic stability, efficiency and isotropy, while eliminating unwanted residual diffusion. (We argue that the latter affects some studies in the literature which report l ~ t^(2/3) for t ≲ 10^4.) We analyse the various sources of error and find them just within acceptable levels (a few percent each) in most of our datasets. To bring these under significantly better control, or to go much further into the inertial regime, would require much larger computational resources and/or a breakthrough in algorithm design.
Identifying Optimal Measurement Subspace for the Ensemble Kalman Filter
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhou, Ning; Huang, Zhenyu; Welch, Greg
2012-05-24
To reduce the computational load of the ensemble Kalman filter while maintaining its efficacy, an optimization algorithm based on the generalized eigenvalue decomposition method is proposed for identifying the most informative measurement subspace. When the number of measurements is large, the proposed algorithm can be used to make an effective tradeoff between computational complexity and estimation accuracy. The algorithm can also be extended to other Kalman filters for measurement subspace selection.
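A hedged sketch of the selection step (generic Python; the paper's specific criterion matrices are not reproduced here — A is assumed to score the information carried by each measurement direction and B the associated noise/cost metric):

    import numpy as np
    from scipy.linalg import eigh

    def measurement_subspace(A, B, m):
        """Solve the generalized eigenproblem A v = lambda B v and keep the
        m most informative directions as the reduced measurement subspace."""
        w, V = eigh(A, B)                 # symmetric-definite generalized problem
        order = np.argsort(w)[::-1]       # largest generalized eigenvalues first
        return V[:, order[:m]]

    # reduced measurement vector: z_m = measurement_subspace(A, B, m).T @ z

Choosing m then trades estimation accuracy against the cost of updating the ensemble with the retained measurement directions.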
Algorithm For Solution Of Subset-Regression Problems
NASA Technical Reports Server (NTRS)
Verhaegen, Michel
1991-01-01
A reliable and flexible algorithm for the solution of the subset-regression problem performs QR decomposition with a new column-pivoting strategy, enabling selection of the subset directly from the originally defined regression parameters. This feature, in combination with a number of extensions, makes the algorithm very flexible for use in the analysis of subset-regression problems in which the parameters have physical meanings. It is also extended to enable joint processing of columns contaminated by noise with those free of noise, without using scaling techniques.
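The core primitive is available off the shelf; a minimal sketch of subset selection via pivoted QR follows (using SciPy's standard pivoting rather than the article's custom strategy):

    import numpy as np
    from scipy.linalg import qr

    def select_subset(X, y, k):
        """Rank regressors by QR column pivoting, then fit on the chosen k."""
        Q, R, piv = qr(X, pivoting=True)   # pivot order ranks the columns
        cols = piv[:k]                     # k best-conditioned regressors
        beta, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
        return cols, beta

Because the pivoting operates on the original columns of X, the selected subset is expressed directly in the physically meaningful regression parameters, which is the property the article emphasizes.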
Domain decomposition methods for the parallel computation of reacting flows
NASA Technical Reports Server (NTRS)
Keyes, David E.
1988-01-01
Domain decomposition is a natural route to parallel computing for partial differential equation solvers. Subdomains of which the original domain of definition is comprised are assigned to independent processors at the price of periodic coordination between processors to compute global parameters and maintain the requisite degree of continuity of the solution at the subdomain interfaces. In the domain-decomposed solution of steady multidimensional systems of PDEs by finite difference methods using a pseudo-transient version of Newton iteration, the only portion of the computation which generally stands in the way of efficient parallelization is the solution of the large, sparse linear systems arising at each Newton step. For some Jacobian matrices drawn from an actual two-dimensional reacting flow problem, comparisons are made between relaxation-based linear solvers and also preconditioned iterative methods of Conjugate Gradient and Chebyshev type, focusing attention on both iteration count and global inner product count. The generalized minimum residual method with block-ILU preconditioning is judged the best serial method among those considered, and parallel numerical experiments on the Encore Multimax demonstrate for it approximately 10-fold speedup on 16 processors.
A spectral analysis of the domain decomposed Monte Carlo method for linear systems
Slattery, Stuart R.; Evans, Thomas M.; Wilson, Paul P. H.
2015-09-08
The domain decomposed behavior of the adjoint Neumann-Ulam Monte Carlo method for solving linear systems is analyzed using the spectral properties of the linear operator. Relationships for the average length of the adjoint random walks, a measure of convergence speed and serial performance, are made with respect to the eigenvalues of the linear operator. In addition, relationships for the effective optical thickness of a domain in the decomposition are presented based on the spectral analysis and diffusion theory. Using the effective optical thickness, the Wigner rational approximation and the mean chord approximation are applied to estimate the leakage fraction of random walks from a domain in the decomposition as a measure of parallel performance and potential communication costs. The one-speed, two-dimensional neutron diffusion equation is used as a model problem in numerical experiments to test the models for symmetric operators with spectral qualities similar to light water reactor problems. We find, in general, the derived approximations show good agreement with random walk lengths and leakage fractions computed by the numerical experiments.
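The flavour of the Neumann-Ulam estimator can be conveyed by a tiny forward-walk Monte Carlo solver for x = Hx + b (a generic sketch; the adjoint variant analyzed in the paper tallies contributions from each walk to all unknowns, and it is those walks that leak between subdomains):

    import numpy as np

    def mc_entry(H, b, i, n_walks=20000, seed=1):
        """Estimate x_i of x = H x + b by weighted random walks; assumes the
        row sums of |H| are below one so the walk weights decay."""
        rng = np.random.default_rng(seed)
        P = np.abs(H) / np.abs(H).sum(axis=1, keepdims=True)  # transition probabilities
        total = 0.0
        for _ in range(n_walks):
            state, w = i, 1.0
            total += w * b[state]
            while abs(w) > 1e-6:                    # truncate negligible tails
                nxt = rng.choice(len(b), p=P[state])
                w *= H[state, nxt] / P[state, nxt]  # importance-weight correction
                state = nxt
                total += w * b[state]
        return total / n_walks

Walk length grows as the spectral radius of H approaches one, which is exactly the connection to convergence speed and leakage that the spectral analysis above formalizes.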
Robust and automated three-dimensional segmentation of densely packed cell nuclei in different biological specimens with Lines-of-Sight decomposition
Mathew, B; Schmitz, A; Muñoz-Descalzo, S; Ansari, N; Pampaloni, F; Stelzer, E H K; Fischer, S C
2015-06-08
Due to the large amount of data produced by advanced microscopy, automated image analysis is crucial in modern biology. Most applications require reliable cell nuclei segmentation. However, in many biological specimens cell nuclei are densely packed and appear to touch one another in the images. Therefore, a major difficulty of three-dimensional cell nuclei segmentation is the decomposition of cell nuclei that apparently touch each other. Current methods are highly adapted to a certain biological specimen or a specific microscope. They do not ensure similarly accurate segmentation performance, i.e. their robustness for different datasets is not guaranteed. Hence, these methods require elaborate adjustments to each dataset. We present an advanced three-dimensional cell nuclei segmentation algorithm that is accurate and robust. Our approach combines local adaptive pre-processing with decomposition based on Lines-of-Sight (LoS) to separate apparently touching cell nuclei into approximately convex parts. We demonstrate the superior performance of our algorithm using data from different specimens recorded with different microscopes. The three-dimensional images were recorded with confocal and light sheet-based fluorescence microscopes. The specimens are an early mouse embryo and two different cellular spheroids. We compared the segmentation accuracy of our algorithm with ground truth data for the test images and results from state-of-the-art methods. The analysis shows that our method is accurate throughout all test datasets (mean F-measure: 91%) whereas the other methods each failed for at least one dataset (F-measure≤69%). Furthermore, nuclei volume measurements are improved for LoS decomposition. The state-of-the-art methods required laborious adjustments of parameter values to achieve these results. Our LoS algorithm did not require parameter value adjustments. The accurate performance was achieved with one fixed set of parameter values. We developed a novel and fully automated three-dimensional cell nuclei segmentation method incorporating LoS decomposition. LoS are easily accessible features that ensure correct splitting of apparently touching cell nuclei independent of their shape, size or intensity. Our method showed superior performance compared to state-of-the-art methods, performing accurately for a variety of test images. Hence, our LoS approach can be readily applied to quantitative evaluation in drug testing, developmental and cell biology.
A novel ECG data compression method based on adaptive Fourier decomposition
NASA Astrophysics Data System (ADS)
Tan, Chunyu; Zhang, Liming
2017-12-01
This paper presents a novel electrocardiogram (ECG) compression method based on adaptive Fourier decomposition (AFD). AFD is a newly developed signal decomposition approach that can decompose a signal with fast convergence, and hence reconstruct ECG signals with high fidelity. Unlike most high-performance algorithms, our method does not apply any preprocessing operation before compression. Huffman coding is employed for further compression. Validated with 48 ECG recordings of the MIT-BIH arrhythmia database, the proposed method achieves a compression ratio (CR) of 35.53 and a percentage root-mean-square difference (PRD) of 1.47% on average with N = 8 decomposition steps, and exhibits a robust PRD-CR relationship. The results demonstrate that the proposed method performs well compared with state-of-the-art ECG compressors.
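The two figures of merit quoted above can be stated compactly; the sketch below computes CR and PRD for a given reconstruction, with the AFD compressor itself omitted:

```python
# Compression ratio (CR) and percentage RMS difference (PRD) for ECG coding.
import numpy as np

def compression_ratio(original_bits, compressed_bits):
    """CR: original size divided by compressed size."""
    return original_bits / compressed_bits

def prd(x, x_rec):
    """PRD between the original signal x and its reconstruction x_rec, in %."""
    return 100.0 * np.sqrt(np.sum((x - x_rec) ** 2) / np.sum(x ** 2))
```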
The high performance parallel algorithm for Unified Gas-Kinetic Scheme
NASA Astrophysics Data System (ADS)
Li, Shiyi; Li, Qibing; Fu, Song; Xu, Jinxiu
2016-11-01
A high-performance parallel algorithm for UGKS is developed to simulate three-dimensional internal and external flows on arbitrary grid systems. The physical domain and the velocity domain are divided into different blocks and distributed according to a two-dimensional Cartesian topology, with intra-communicators in the physical domain for data exchange and other intra-communicators in the velocity domain for sum reduction of moment integrals. Numerical results for three-dimensional cavity flow and flow past a sphere agree well with results from existing studies and validate the applicability of the algorithm. The scalability of the algorithm is tested on both small (1-16) and large (729-5832) processor counts. The measured speed-up ratio is nearly linear and the efficiency is thus around 1, which reveals the good scalability of the present algorithm.
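A hedged sketch of the communicator layout described above, using mpi4py: a 2D Cartesian process grid whose sub-communicators separate physical-domain data exchange from velocity-domain moment reductions. The grid dimensions and names are assumptions, not taken from the paper:

```python
# 2D Cartesian topology with separate physical- and velocity-domain
# sub-communicators (illustrative sketch).
from mpi4py import MPI

comm = MPI.COMM_WORLD
cart = comm.Create_cart(dims=MPI.Compute_dims(comm.Get_size(), 2),
                        periods=[False, False])
px, pv = cart.Get_coords(cart.Get_rank())

# One sub-communicator spans physical-domain blocks (fixed velocity block),
# the other spans velocity-domain blocks (fixed physical block).
phys_comm = cart.Sub([True, False])   # halo exchange among physical blocks
vel_comm = cart.Sub([False, True])    # Allreduce for moment integrals

# moments = vel_comm.allreduce(local_moments, op=MPI.SUM)
```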
Petri nets SM-cover based on heuristic coloring algorithm
NASA Astrophysics Data System (ADS)
Tkacz, Jacek; Doligalski, Michał
2015-09-01
In this paper, a heuristic coloring algorithm for interpreted Petri nets is presented. Coloring is used to determine the state machine (SM) subnets. The algorithm reduces the Petri net in order to lower the computational complexity and finds one of its possible state machine covers. The proposed algorithm uses elements of the interpretation of Petri nets. The obtained result may not be the best possible, but it is sufficient for use in rapid prototyping of logic controllers. The found SM-cover will also be used in the development of algorithms for decomposition, and for modular synthesis and implementation of parallel logic controllers. The correctness of the developed heuristic algorithm was verified using the Gentzen formal reasoning system.
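As a loose illustration (not the paper's algorithm), a greedy coloring of a place-concurrency graph: places that can be marked concurrently receive different colors, so each color class can seed one state-machine subnet. The `conflicts` dictionary is a hypothetical input:

```python
# Greedy coloring of a place-concurrency graph (illustrative sketch).
def greedy_color(conflicts):
    """Assign each place the smallest color not used by a conflicting neighbor."""
    colors = {}
    # Color high-degree places first, a common greedy heuristic.
    for place in sorted(conflicts, key=lambda p: -len(conflicts[p])):
        taken = {colors[q] for q in conflicts[place] if q in colors}
        colors[place] = next(c for c in range(len(conflicts)) if c not in taken)
    return colors

# Example: three mutually concurrent places need three SM subnets.
print(greedy_color({"p1": {"p2", "p3"}, "p2": {"p1", "p3"}, "p3": {"p1", "p2"}}))
```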
Multipoint to multipoint routing and wavelength assignment in multi-domain optical networks
NASA Astrophysics Data System (ADS)
Qin, Panke; Wu, Jingru; Li, Xudong; Tang, Yongli
2018-01-01
In multipoint-to-multipoint (MP2MP) routing and wavelength assignment (RWA) problems, researchers usually assume the optical network to be a single domain. In practice, however, optical networks are developing toward multi-domain, larger-scale architectures. In this context, multi-core shared tree (MST)-based MP2MP RWA introduces new problems, including optimal multicast domain sequence selection and the choice of domains in which the core nodes reside. In this letter, we focus on MST-based MP2MP RWA problems in multi-domain optical networks and present mixed integer linear programming (MILP) formulations to optimally construct MP2MP multicast trees. A heuristic algorithm based on network virtualization and a weighted clustering algorithm (NV-WCA) is proposed. Simulation results show that, under different traffic patterns, the proposed algorithm achieves significant improvements in network resource occupation and multicast tree setup latency compared with conventional algorithms designed for single-domain networks.
Object detection with a multistatic array using singular value decomposition
Hallquist, Aaron T.; Chambers, David H.
2014-07-01
A method and system for detecting the presence of subsurface objects within a medium is provided. In some embodiments, the detection system operates in a multistatic mode to collect radar return signals generated by an array of transceiver antenna pairs that is positioned across a surface and that travels down the surface. The detection system converts the return signals from a time domain to a frequency domain, resulting in frequency return signals. The detection system then performs a singular value decomposition for each frequency to identify singular values for each frequency. The detection system then detects the presence of a subsurface object based on a comparison of the identified singular values to expected singular values when no subsurface object is present.
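A minimal sketch of the detection step described above, assuming the return signals have already been converted to the frequency domain; the array shapes and threshold rule are illustrative assumptions, not the patented method:

```python
# Per-frequency SVD detection against a no-target baseline (illustrative sketch).
import numpy as np

def detect(returns_fd, baseline_sv, threshold=3.0):
    """returns_fd: (n_freq, n_tx, n_rx) frequency-domain return matrices.
    baseline_sv: (n_freq, min(n_tx, n_rx)) expected singular values, no target.
    """
    flags = []
    for f, Kf in enumerate(returns_fd):
        sv = np.linalg.svd(Kf, compute_uv=False)  # singular values, descending
        flags.append(np.any(sv > threshold * baseline_sv[f]))
    return np.array(flags)  # True where a subsurface object is likely present
```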
Tensor Toolbox for MATLAB v. 3.0
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kolda, Tamara G.; Bader, Brett W.; Acar Ataman, Evrim
Tensors (also known as multidimensional arrays or N-way arrays) are used in a variety of applications ranging from chemometrics to network analysis. The Tensor Toolbox provides classes for manipulating dense, sparse, and structured tensors using MATLAB's object-oriented features. It also provides algorithms for tensor decomposition and factorization, algorithms for computing tensor eigenvalues, and methods for visualization of results.
Spectral CT Image Restoration via an Average Image-Induced Nonlocal Means Filter.
Zeng, Dong; Huang, Jing; Zhang, Hua; Bian, Zhaoying; Niu, Shanzhou; Zhang, Zhang; Feng, Qianjin; Chen, Wufan; Ma, Jianhua
2016-05-01
Spectral computed tomography (SCT) images reconstructed by an analytical approach often suffer from a poor signal-to-noise ratio and strong streak artifacts when sufficient photon counts are not available in SCT imaging. To reduce noise-induced artifacts in SCT images, in this study we propose an average image-induced nonlocal means (aviNLM) filter for restoring each energy-specific image. Methods: The present aviNLM algorithm exploits redundant information in the whole energy domain. Specifically, the proposed aviNLM algorithm yields the restored results by performing a nonlocal weighted average operation on the noisy energy-specific images with the nonlocal weight matrix between the target and prior images, in which the prior image is generated from all of the images reconstructed in each energy bin. Results: Qualitative and quantitative studies are conducted to evaluate the aviNLM filter using digital phantom, physical phantom, and clinical patient data acquired from energy-resolved and energy-integrating detectors, respectively. Experimental results show that the present aviNLM filter can achieve promising results for SCT image restoration in terms of noise-induced artifact suppression, cross profiles, contrast-to-noise ratio, and material decomposition assessment. Conclusion and Significance: The present aviNLM algorithm has useful potential for radiation dose reduction by lowering the mAs in SCT imaging, and it may be useful for some other clinical applications, such as myocardial perfusion imaging and radiotherapy.
Virtual network embedding in cross-domain network based on topology and resource attributes
NASA Astrophysics Data System (ADS)
Zhu, Lei; Zhang, Zhizhong; Feng, Linlin; Liu, Lilan
2018-03-01
To address network architecture ossification and the diversity of access technologies, this paper studies cross-domain virtual network embedding. By analysing the topological attributes of nodes in the virtual and physical networks from both local and global perspectives, combined with local network resource properties, we rank the embedding priority of the nodes with PCA and TOPSIS methods. The link load distribution is also considered. On this basis, we propose a cross-domain virtual network embedding algorithm based on topology and resource attributes. The simulation results show that our algorithm increases the acceptance rate of multi-domain virtual network requests compared with existing virtual network embedding algorithms.
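A compact sketch of TOPSIS ranking as it might be used for node embedding priority; the attribute matrix and weights are placeholders rather than the paper's features, and all attributes are assumed to be benefit-type:

```python
# TOPSIS ranking of candidate nodes (illustrative sketch).
import numpy as np

def topsis_rank(A, w):
    """A: (n_nodes, n_attrs) attribute matrix; w: attribute weights summing to 1."""
    N = A / np.linalg.norm(A, axis=0)           # vector-normalize each attribute
    V = N * w                                   # weighted normalized matrix
    best, worst = V.max(axis=0), V.min(axis=0)  # ideal and anti-ideal solutions
    d_best = np.linalg.norm(V - best, axis=1)
    d_worst = np.linalg.norm(V - worst, axis=1)
    score = d_worst / (d_best + d_worst)        # relative closeness to the ideal
    return np.argsort(score)[::-1]              # node indices, highest priority first
```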
Multi-variants synthesis of Petri nets for FPGA devices
NASA Astrophysics Data System (ADS)
Bukowiec, Arkadiusz; Doligalski, Michał
2015-09-01
A new method for the synthesis of application-specific logic controllers for FPGA devices is presented. The control algorithm is specified with a control-interpreted Petri net (PT type), which makes it easy to specify parallel processes. The Petri net is decomposed into state-machine-type subnets, each subnet representing one parallel process. For this purpose, Petri net coloring algorithms are applied. Two approaches to this decomposition are presented: with doublers of macroplaces, or with one global wait place. Next, the subnets are implemented in a two-level logic circuit of the controller, whose levels are obtained as a result of its architectural decomposition. The first-level combinational circuit is responsible for generating the next places, and the second-level decoder is responsible for generating the output symbols. Two variants of such circuits are worked out: with one shared operational memory, or with many flexible distributed memories serving as a decoder. The variants of Petri net decomposition and the structures of the logic circuits can be combined without restriction, which leads to four variants of multi-variant synthesis.
NASA Astrophysics Data System (ADS)
Hsiao, Y. R.; Tsai, C.
2017-12-01
As the WHO Air Quality Guideline indicates, ambient air pollution exposes world populations to the threat of fatal illnesses (e.g., heart disease, lung cancer, asthma), raising concerns about air pollution sources and related factors. This study presents a novel approach to investigating the multiscale variations of PM2.5 in southern Taiwan over the past decade, together with four meteorological influencing factors (temperature, relative humidity, precipitation, and wind speed), based on the Noise-Assisted Multivariate Empirical Mode Decomposition (NAMEMD) algorithm, Hilbert Spectral Analysis (HSA), and the Time-Dependent Intrinsic Correlation (TDIC) method. The NAMEMD algorithm is a fully data-driven approach designed for nonlinear and nonstationary multivariate signals, and is used to decompose multivariate signals into collections of channels of Intrinsic Mode Functions (IMFs). The TDIC method is an EMD-based method that uses a set of sliding window sizes to quantify localized correlation coefficients for multiscale signals. With the alignment property and quasi-dyadic filter bank of the NAMEMD algorithm, one is able to produce the same number of IMFs for all variables and estimate the cross-correlation more accurately. The performance of the spectral representation of the NAMEMD-HSA method is compared with Complementary Ensemble Empirical Mode Decomposition/Hilbert Spectral Analysis (CEEMD-HSA) and wavelet analysis. The NAMEMD-based TDIC analysis is then compared with CEEMD-based TDIC analysis and traditional correlation analysis.
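A rough sketch of the TDIC step under stated assumptions: given one aligned IMF from each signal (the NAMEMD decomposition itself is not reproduced here), compute a localized Pearson correlation in sliding windows; the window length would in practice be tied to the IMF's characteristic period:

```python
# Sliding-window (time-dependent) correlation of two aligned IMFs
# (illustrative sketch).
import numpy as np

def tdic(imf_a, imf_b, win):
    """Localized Pearson correlation over centered windows of length win."""
    half = win // 2
    corr = np.full(len(imf_a), np.nan)  # NaN where the window does not fit
    for t in range(half, len(imf_a) - half):
        a = imf_a[t - half:t + half + 1]
        b = imf_b[t - half:t + half + 1]
        corr[t] = np.corrcoef(a, b)[0, 1]
    return corr
```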
Sensor system for heart sound biomonitor
NASA Astrophysics Data System (ADS)
Maple, Jarrad L.; Hall, Leonard T.; Agzarian, John; Abbott, Derek
1999-09-01
Heart sounds can be utilized more efficiently by medical doctors when they are displayed visually, rather than heard through a conventional stethoscope. A system whereby a digital stethoscope interfaces directly to a PC is described, along with the signal processing algorithms adopted. The sensor is based on a noise-cancellation microphone with a 450 Hz bandwidth and is sampled at 2250 samples/sec with 12-bit resolution. For comparison, we also discuss a piezo-based sensor with a 1 kHz bandwidth. A major problem is that the recording of the heart sound into these devices is subject to unwanted background noise, which can override the heart sound and result in a poor visual representation. This noise originates from various sources such as skin contact with the stethoscope diaphragm, lung sounds, and other surrounding sounds such as speech. We demonstrate a solution using 'wavelet denoising'. The wavelet transform is used because of the similarity between the shape of wavelets and the time-domain shape of a heartbeat sound. Thus, coding of the waveform into the wavelet domain is achieved with relatively few wavelet coefficients, in contrast to the many Fourier components that would result from conventional decomposition. We show that the background noise can be dramatically reduced by a thresholding operation in the wavelet domain. The principle is that the background noise codes into many small broadband wavelet coefficients that can be removed without significant degradation of the signal of interest.
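A minimal sketch of the wavelet-thresholding step, using the PyWavelets package; the wavelet family, decomposition level, and universal-threshold rule are assumptions rather than the authors' exact choices:

```python
# Wavelet denoising by soft thresholding of detail coefficients
# (illustrative sketch).
import numpy as np
import pywt

def wavelet_denoise(signal, wavelet="db4", level=5):
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    # Noise scale estimated from the finest-scale (mostly noise) coefficients.
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745
    thr = sigma * np.sqrt(2.0 * np.log(len(signal)))  # universal threshold
    denoised = [coeffs[0]] + [pywt.threshold(c, thr, mode="soft")
                              for c in coeffs[1:]]
    return pywt.waverec(denoised, wavelet)[:len(signal)]
```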
Model based approach to UXO imaging using the time domain electromagnetic method
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lavely, E.M.
1999-04-01
Time domain electromagnetic (TDEM) sensors have emerged as a field-worthy technology for UXO detection in a variety of geological and environmental settings. This success has been achieved with commercial equipment that was not optimized for UXO detection and discrimination. The TDEM response displays a rich spatial and temporal behavior which is not currently utilized. Therefore, in this paper the author describes a research program for enhancing the effectiveness of the TDEM method for UXO detection and imaging. Fundamental research is required in at least three major areas: (a) model based imaging capability, i.e. the forward and inverse problem, (b) detector modeling and instrument design, and (c) target recognition and discrimination algorithms. These research problems are coupled and demand a unified treatment. For example: (1) the inverse solution depends on solution of the forward problem and knowledge of the instrument response; (2) instrument design with improved diagnostic power requires forward and inverse modeling capability; and (3) improved target recognition algorithms (such as neural nets) must be trained with data collected from the new instrument and with synthetic data computed using the forward model. Further, the design of the appropriate input and output layers of the net will be informed by the results of the forward and inverse modeling. A more fully developed model of the TDEM response would enable the joint inversion of data collected from multiple sensors (e.g., TDEM sensors and magnetometers). Finally, the author suggests that a complementary approach to joint inversions is the statistical recombination of data using principal component analysis. The decomposition into principal components is useful since the first principal component contains those features that are most strongly correlated from image to image.
An improved algorithm for balanced POD through an analytic treatment of impulse response tails
NASA Astrophysics Data System (ADS)
Tu, Jonathan H.; Rowley, Clarence W.
2012-06-01
We present a modification of the balanced proper orthogonal decomposition (balanced POD) algorithm for systems with simple impulse response tails. In this new method, we use dynamic mode decomposition (DMD) to estimate the slowly decaying eigenvectors that dominate the long-time behavior of the direct and adjoint impulse responses. This is done using a new, low-memory variant of the DMD algorithm, appropriate for large datasets. We then formulate analytic expressions for the contribution of these eigenvectors to the controllability and observability Gramians. These contributions can be accounted for in the balanced POD algorithm by simply appending the impulse response snapshot matrices (direct and adjoint, respectively) with particular linear combinations of the slow eigenvectors. Aside from these additions to the snapshot matrices, the algorithm remains unchanged. By treating the tails analytically, we eliminate the need to run long impulse response simulations, lowering storage requirements and speeding up ensuing computations. To demonstrate its effectiveness, we apply this method to two examples: the linearized, complex Ginzburg-Landau equation, and the two-dimensional fluid flow past a cylinder. As expected, reduced-order models computed using an analytic tail match or exceed the accuracy of those computed using the standard balanced POD procedure, at a fraction of the cost.
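For context, a standard exact-DMD sketch (not the paper's low-memory variant) that estimates the slowly decaying eigenvalues and modes from snapshot pairs, which is the quantity appended to the balanced POD snapshot matrices:

```python
# Exact DMD from snapshot pairs (illustrative sketch).
import numpy as np

def dmd_modes(X, Y, r):
    """X, Y: snapshot matrices with Y[:, k] one step after X[:, k]; r: rank."""
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    U, s, Vh = U[:, :r], s[:r], Vh[:r]
    Atilde = U.conj().T @ Y @ Vh.conj().T / s  # linear map projected onto POD basis
    lam, W = np.linalg.eig(Atilde)
    modes = (Y @ Vh.conj().T / s) @ W          # exact DMD modes
    slow = np.argsort(-np.abs(lam))            # for a stable system, largest
    return lam[slow], modes[:, slow]           # magnitude decays slowest
```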