The Mixed Finite Element Multigrid Method for Stokes Equations
Muzhinji, K.; Shateyi, S.; Motsa, S. S.
2015-01-01
The stable finite element discretization of the Stokes problem produces a symmetric indefinite system of linear algebraic equations. A variety of iterative solvers have been proposed for such systems in an attempt to construct efficient, fast, and robust solution techniques. This paper investigates one of such iterative solvers, the geometric multigrid solver, to find the approximate solution of the indefinite systems. The main ingredient of the multigrid method is the choice of an appropriate smoothing strategy. This study considers the application of different smoothers and compares their effects in the overall performance of the multigrid solver. We study the multigrid method with the following smoothers: distributed Gauss Seidel, inexact Uzawa, preconditioned MINRES, and Braess-Sarazin type smoothers. A comparative study of the smoothers shows that the Braess-Sarazin smoothers enhance good performance of the multigrid method. We study the problem in a two-dimensional domain using stable Hood-Taylor Q 2-Q 1 pair of finite rectangular elements. We also give the main theoretical convergence results. We present the numerical results to demonstrate the efficiency and robustness of the multigrid method and confirm the theoretical results. PMID:25945361
A Parallel Multigrid Solver for Viscous Flows on Anisotropic Structured Grids
NASA Technical Reports Server (NTRS)
Prieto, Manuel; Montero, Ruben S.; Llorente, Ignacio M.; Bushnell, Dennis M. (Technical Monitor)
2001-01-01
This paper presents an efficient parallel multigrid solver for speeding up the computation of a 3-D model that treats the flow of a viscous fluid over a flat plate. The main interest of this simulation lies in exhibiting some basic difficulties that prevent optimal multigrid efficiencies from being achieved. As the computing platform, we have used Coral, a Beowulf-class system based on Intel Pentium processors and equipped with GigaNet cLAN and switched Fast Ethernet networks. Our study not only examines the scalability of the solver but also includes a performance evaluation of Coral where the investigated solver has been used to compare several of its design choices, namely, the interconnection network (GigaNet versus switched Fast-Ethernet) and the node configuration (dual nodes versus single nodes). As a reference, the performance results have been compared with those obtained with the NAS-MG benchmark.
Evaluation of a Multigrid Scheme for the Incompressible Navier-Stokes Equations
NASA Technical Reports Server (NTRS)
Swanson, R. C.
2004-01-01
A fast multigrid solver for the steady, incompressible Navier-Stokes equations is presented. The multigrid solver is based upon a factorizable discrete scheme for the velocity-pressure form of the Navier-Stokes equations. This scheme correctly distinguishes between the advection-diffusion and elliptic parts of the operator, allowing efficient smoothers to be constructed. To evaluate the multigrid algorithm, solutions are computed for flow over a flat plate, parabola, and a Karman-Trefftz airfoil. Both nonlifting and lifting airfoil flows are considered, with a Reynolds number range of 200 to 800. Convergence and accuracy of the algorithm are discussed. Using Gauss-Seidel line relaxation in alternating directions, multigrid convergence behavior approaching that of O(N) methods is achieved. The computational efficiency of the numerical scheme is compared with that of Runge-Kutta and implicit upwind based multigrid methods.
Textbook Multigrid Efficiency for Leading Edge Stagnation
NASA Technical Reports Server (NTRS)
Diskin, Boris; Thomas, James L.; Mineck, Raymond E.
2004-01-01
A multigrid solver is defined as having textbook multigrid efficiency (TME) if the solutions to the governing system of equations are attained in a computational work which is a small (less than 10) multiple of the operation count in evaluating the discrete residuals. TME in solving the incompressible inviscid fluid equations is demonstrated for leading-edge stagnation flows. The contributions of this paper include (1) a special formulation of the boundary conditions near stagnation allowing convergence of the Newton iterations on coarse grids, (2) the boundary relaxation technique to facilitate relaxation and residual restriction near the boundaries, (3) a modified relaxation scheme to prevent initial error amplification, and (4) new general analysis techniques for multigrid solvers. Convergence of algebraic errors below the level of discretization errors is attained by a full multigrid (FMG) solver with one full approximation scheme (FAS) cycle per grid. Asymptotic convergence rates of the FAS cycles for the full system of flow equations are very fast, approaching those for scalar elliptic equations.
Textbook Multigrid Efficiency for Leading Edge Stagnation
NASA Technical Reports Server (NTRS)
Diskin, Boris; Thomas, James L.; Mineck, Raymond E.
2004-01-01
A multigrid solver is defined as having textbook multigrid efficiency (TME) if the solutions to the governing system of equations are attained in a computational work which is a small (less than 10) multiple of the operation count in evaluating the discrete residuals. TME in solving the incompressible inviscid fluid equations is demonstrated for leading- edge stagnation flows. The contributions of this paper include (1) a special formulation of the boundary conditions near stagnation allowing convergence of the Newton iterations on coarse grids, (2) the boundary relaxation technique to facilitate relaxation and residual restriction near the boundaries, (3) a modified relaxation scheme to prevent initial error amplification, and (4) new general analysis techniques for multigrid solvers. Convergence of algebraic errors below the level of discretization errors is attained by a full multigrid (FMG) solver with one full approximation scheme (F.4S) cycle per grid. Asymptotic convergence rates of the F.4S cycles for the full system of flow equations are very fast, approaching those for scalar elliptic equations.
Textbook Multigrid Efficiency for the Steady Euler Equations
NASA Technical Reports Server (NTRS)
Roberts, Thomas W.; Sidilkover, David; Swanson, R. C.
2004-01-01
A fast multigrid solver for the steady incompressible Euler equations is presented. Unlike time-marching schemes, this approach uses relaxation of the steady equations. Application of this method results in a discretization that correctly distinguishes between the advection and elliptic parts of the operator, allowing efficient smoothers to be constructed. Solvers for both unstructured triangular grids and structured quadrilateral grids have been written. Computations for channel flow and flow over a nonlifting airfoil have computed. Using Gauss-Seidel relaxation ordered in the flow direction, textbook multigrid convergence rates of nearly one order-of-magnitude residual reduction per multigrid cycle are achieved, independent of the grid spacing. This approach also may be applied to the compressible Euler equations and the incompressible Navier-Stokes equations.
NASA Astrophysics Data System (ADS)
Lv, X.; Zhao, Y.; Huang, X. Y.; Xia, G. H.; Su, X. H.
2007-07-01
A new three-dimensional (3D) matrix-free implicit unstructured multigrid finite volume (FV) solver for structural dynamics is presented in this paper. The solver is first validated using classical 2D and 3D cantilever problems. It is shown that very accurate predictions of the fundamental natural frequencies of the problems can be obtained by the solver with fast convergence rates. This method has been integrated into our existing FV compressible solver [X. Lv, Y. Zhao, et al., An efficient parallel/unstructured-multigrid preconditioned implicit method for simulating 3d unsteady compressible flows with moving objects, Journal of Computational Physics 215(2) (2006) 661-690] based on the immersed membrane method (IMM) [X. Lv, Y. Zhao, et al., as mentioned above]. Results for the interaction between the fluid and an immersed fixed-free cantilever are also presented to demonstrate the potential of this integrated fluid-structure interaction approach.
A fast direct solver for a class of two-dimensional separable elliptic equations on the sphere
NASA Technical Reports Server (NTRS)
Moorthi, Shrinivas; Higgins, R. Wayne
1992-01-01
An efficient, direct, second-order solver for the discrete solution of two-dimensional separable elliptic equations on the sphere is presented. The method involves a Fourier transformation in longitude and a direct solution of the resulting coupled second-order finite difference equations in latitude. The solver is made efficient by vectorizing over longitudinal wavenumber and by using a vectorized fast Fourier transform routine. It is evaluated using a prescribed solution method and compared with a multigrid solver and the standard direct solver from FISHPAK.
NASA Astrophysics Data System (ADS)
Cools, S.; Vanroose, W.
2016-03-01
This paper improves the convergence and robustness of a multigrid-based solver for the cross sections of the driven Schrödinger equation. Adding a Coupled Channel Correction Step (CCCS) after each multigrid (MG) V-cycle efficiently removes the errors that remain after the V-cycle sweep. The combined iterative solution scheme (MG-CCCS) is shown to feature significantly improved convergence rates over the classical MG method at energies where bound states dominate the solution, resulting in a fast and scalable solution method for the complex-valued Schrödinger break-up problem for any energy regime. The proposed solver displays optimal scaling; a solution is found in a time that is linear in the number of unknowns. The method is validated on a 2D Temkin-Poet model problem, and convergence results both as a solver and preconditioner are provided to support the O (N) scalability of the method. This paper extends the applicability of the complex contour approach for far field map computation (Cools et al. (2014) [10]).
A simplified analysis of the multigrid V-cycle as a fast elliptic solver
NASA Technical Reports Server (NTRS)
Decker, Naomi H.; Taasan, Shlomo
1988-01-01
For special model problems, Fourier analysis gives exact convergence rates for the two-grid multigrid cycle and, for more general problems, provides estimates of the two-grid convergence rates via local mode analysis. A method is presented for obtaining mutigrid convergence rate estimates for cycles involving more than two grids (using essentially the same analysis as for the two-grid cycle). For the simple cast of the V-cycle used as a fast Laplace solver on the unit square, the k-grid convergence rate bounds obtained by this method are sharper than the bounds predicted by the variational theory. Both theoretical justification and experimental evidence are presented.
The block adaptive multigrid method applied to the solution of the Euler equations
NASA Technical Reports Server (NTRS)
Pantelelis, Nikos
1993-01-01
In the present study, a scheme capable of solving very fast and robust complex nonlinear systems of equations is presented. The Block Adaptive Multigrid (BAM) solution method offers multigrid acceleration and adaptive grid refinement based on the prediction of the solution error. The proposed solution method was used with an implicit upwind Euler solver for the solution of complex transonic flows around airfoils. Very fast results were obtained (18-fold acceleration of the solution) using one fourth of the volumes of a global grid with the same solution accuracy for two test cases.
Recent Advances in Agglomerated Multigrid
NASA Technical Reports Server (NTRS)
Nishikawa, Hiroaki; Diskin, Boris; Thomas, James L.; Hammond, Dana P.
2013-01-01
We report recent advancements of the agglomerated multigrid methodology for complex flow simulations on fully unstructured grids. An agglomerated multigrid solver is applied to a wide range of test problems from simple two-dimensional geometries to realistic three- dimensional configurations. The solver is evaluated against a single-grid solver and, in some cases, against a structured-grid multigrid solver. Grid and solver issues are identified and overcome, leading to significant improvements over single-grid solvers.
Fast Multilevel Solvers for a Class of Discrete Fourth Order Parabolic Problems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zheng, Bin; Chen, Luoping; Hu, Xiaozhe
2016-03-05
In this paper, we study fast iterative solvers for the solution of fourth order parabolic equations discretized by mixed finite element methods. We propose to use consistent mass matrix in the discretization and use lumped mass matrix to construct efficient preconditioners. We provide eigenvalue analysis for the preconditioned system and estimate the convergence rate of the preconditioned GMRes method. Furthermore, we show that these preconditioners only need to be solved inexactly by optimal multigrid algorithms. Our numerical examples indicate that the proposed preconditioners are very efficient and robust with respect to both discretization parameters and diffusion coefficients. We also investigatemore » the performance of multigrid algorithms with either collective smoothers or distributive smoothers when solving the preconditioner systems.« less
Analysis Tools for CFD Multigrid Solvers
NASA Technical Reports Server (NTRS)
Mineck, Raymond E.; Thomas, James L.; Diskin, Boris
2004-01-01
Analysis tools are needed to guide the development and evaluate the performance of multigrid solvers for the fluid flow equations. Classical analysis tools, such as local mode analysis, often fail to accurately predict performance. Two-grid analysis tools, herein referred to as Idealized Coarse Grid and Idealized Relaxation iterations, have been developed and evaluated within a pilot multigrid solver. These new tools are applicable to general systems of equations and/or discretizations and point to problem areas within an existing multigrid solver. Idealized Relaxation and Idealized Coarse Grid are applied in developing textbook-efficient multigrid solvers for incompressible stagnation flow problems.
NASA Technical Reports Server (NTRS)
Moorthi, Shrinivas; Higgins, R. W.
1993-01-01
An efficient, direct, second-order solver for the discrete solution of a class of two-dimensional separable elliptic equations on the sphere (which generally arise in implicit and semi-implicit atmospheric models) is presented. The method involves a Fourier transformation in longitude and a direct solution of the resulting coupled second-order finite-difference equations in latitude. The solver is made efficient by vectorizing over longitudinal wave-number and by using a vectorized fast Fourier transform routine. It is evaluated using a prescribed solution method and compared with a multigrid solver and the standard direct solver from FISHPAK.
Geometric multigrid for an implicit-time immersed boundary method
DOE Office of Scientific and Technical Information (OSTI.GOV)
Guy, Robert D.; Philip, Bobby; Griffith, Boyce E.
2014-10-12
The immersed boundary (IB) method is an approach to fluid-structure interaction that uses Lagrangian variables to describe the deformations and resulting forces of the structure and Eulerian variables to describe the motion and forces of the fluid. Explicit time stepping schemes for the IB method require solvers only for Eulerian equations, for which fast Cartesian grid solution methods are available. Such methods are relatively straightforward to develop and are widely used in practice but often require very small time steps to maintain stability. Implicit-time IB methods permit the stable use of large time steps, but efficient implementations of such methodsmore » require significantly more complex solvers that effectively treat both Lagrangian and Eulerian variables simultaneously. Moreover, several different approaches to solving the coupled Lagrangian-Eulerian equations have been proposed, but a complete understanding of this problem is still emerging. This paper presents a geometric multigrid method for an implicit-time discretization of the IB equations. This multigrid scheme uses a generalization of box relaxation that is shown to handle problems in which the physical stiffness of the structure is very large. Numerical examples are provided to illustrate the effectiveness and efficiency of the algorithms described herein. Finally, these tests show that using multigrid as a preconditioner for a Krylov method yields improvements in both robustness and efficiency as compared to using multigrid as a solver. They also demonstrate that with a time step 100–1000 times larger than that permitted by an explicit IB method, the multigrid-preconditioned implicit IB method is approximately 50–200 times more efficient than the explicit method.« less
Multigrid accelerated simulations for Twisted Mass fermions
NASA Astrophysics Data System (ADS)
Bacchio, Simone; Alexandrou, Constantia; Finkerath, Jacob
2018-03-01
Simulations at physical quark masses are affected by the critical slowing down of the solvers. Multigrid preconditioning has proved to deal effectively with this problem. Multigrid accelerated simulations at the physical value of the pion mass are being performed to generate Nf = 2 and Nf = 2 + 1 + 1 gauge ensembles using twisted mass fermions. The adaptive aggregation-based domain decomposition multigrid solver, referred to as DD-αAMG method, is employed for these simulations. Our simulation strategy consists of an hybrid approach of different solvers, involving the Conjugate Gradient (CG), multi-mass-shift CG and DD-αAMG solvers. We present an analysis of the multigrid performance during the simulations discussing the stability of the method. This significant speeds up the Hybrid Monte Carlo simulation by more than a factor 4 at physical pion mass compared to the usage of the CG solver.
Solving the Fluid Pressure Poisson Equation Using Multigrid-Evaluation and Improvements.
Dick, Christian; Rogowsky, Marcus; Westermann, Rudiger
2016-11-01
In many numerical simulations of fluids governed by the incompressible Navier-Stokes equations, the pressure Poisson equation needs to be solved to enforce mass conservation. Multigrid solvers show excellent convergence in simple scenarios, yet they can converge slowly in domains where physically separated regions are combined at coarser scales. Moreover, existing multigrid solvers are tailored to specific discretizations of the pressure Poisson equation, and they cannot easily be adapted to other discretizations. In this paper we analyze the convergence properties of existing multigrid solvers for the pressure Poisson equation in different simulation domains, and we show how to further improve the multigrid convergence rate by using a graph-based extension to determine the coarse grid hierarchy. The proposed multigrid solver is generic in that it can be applied to different kinds of discretizations of the pressure Poisson equation, by using solely the specification of the simulation domain and pre-assembled computational stencils. We analyze the proposed solver in combination with finite difference and finite volume discretizations of the pressure Poisson equation. Our evaluations show that, despite the common assumption, multigrid schemes can exploit their potential even in the most complicated simulation scenarios, yet this behavior is obtained at the price of higher memory consumption.
Multigrid solutions to quasi-elliptic schemes
NASA Technical Reports Server (NTRS)
Brandt, A.; Taasan, S.
1985-01-01
Quasi-elliptic schemes arise from central differencing or finite element discretization of elliptic systems with odd order derivatives on non-staggered grids. They are somewhat unstable and less accurate then corresponding staggered-grid schemes. When usual multigrid solvers are applied to them, the asymptotic algebraic convergence is necessarily slow. Nevertheless, it is shown by mode analyses and numerical experiments that the usual FMG algorithm is very efficient in solving quasi-elliptic equations to the level of truncation errors. Also, a new type of multigrid algorithm is presented, mode analyzed and tested, for which even the asymptotic algebraic convergence is fast. The essence of that algorithm is applicable to other kinds of problems, including highly indefinite ones.
Multigrid solutions to quasi-elliptic schemes
NASA Technical Reports Server (NTRS)
Brandt, A.; Taasan, S.
1985-01-01
Quasi-elliptic schemes arise from central differencing or finite element discretization of elliptic systems with odd order derivatives on non-staggered grids. They are somewhat unstable and less accurate than corresponding staggered-grid schemes. When usual multigrid solvers are applied to them, the asymptotic algebraic convergence is necessarily slow. Nevertheless, it is shown by mode analyses and numerical experiments that the usual FMG algorithm is very efficient in solving quasi-elliptic equations to the level of truncation errors. Also, a new type of multigrid algorithm is presented, mode analyzed and tested, for which even the asymptotic algebraic convergence is fast. The essence of that algorithm is applicable to other kinds of problems, including highly indefinite ones.
Multigrid methods for numerical simulation of laminar diffusion flames
NASA Technical Reports Server (NTRS)
Liu, C.; Liu, Z.; Mccormick, S.
1993-01-01
This paper documents the result of a computational study of multigrid methods for numerical simulation of 2D diffusion flames. The focus is on a simplified combustion model, which is assumed to be a single step, infinitely fast and irreversible chemical reaction with five species (C3H8, O2, N2, CO2 and H2O). A fully-implicit second-order hybrid scheme is developed on a staggered grid, which is stretched in the streamwise coordinate direction. A full approximation multigrid scheme (FAS) based on line distributive relaxation is developed as a fast solver for the algebraic equations arising at each time step. Convergence of the process for the simplified model problem is more than two-orders of magnitude faster than other iterative methods, and the computational results show good grid convergence, with second-order accuracy, as well as qualitatively agreement with the results of other researchers.
Agglomeration Multigrid for an Unstructured-Grid Flow Solver
NASA Technical Reports Server (NTRS)
Frink, Neal; Pandya, Mohagna J.
2004-01-01
An agglomeration multigrid scheme has been implemented into the sequential version of the NASA code USM3Dns, tetrahedral cell-centered finite volume Euler/Navier-Stokes flow solver. Efficiency and robustness of the multigrid-enhanced flow solver have been assessed for three configurations assuming an inviscid flow and one configuration assuming a viscous fully turbulent flow. The inviscid studies include a transonic flow over the ONERA M6 wing and a generic business jet with flow-through nacelles and a low subsonic flow over a high-lift trapezoidal wing. The viscous case includes a fully turbulent flow over the RAE 2822 rectangular wing. The multigrid solutions converged with 12%-33% of the Central Processing Unit (CPU) time required by the solutions obtained without multigrid. For all of the inviscid cases, multigrid in conjunction with an explicit time-stepping scheme performed the best with regard to the run time memory and CPU time requirements. However, for the viscous case multigrid had to be used with an implicit backward Euler time-stepping scheme that increased the run time memory requirement by 22% as compared to the run made without multigrid.
Efficient Implementation of Multigrid Solvers on Message-Passing Parrallel Systems
NASA Technical Reports Server (NTRS)
Lou, John
1994-01-01
We discuss our implementation strategies for finite difference multigrid partial differential equation (PDE) solvers on message-passing systems. Our target parallel architecture is Intel parallel computers: the Delta and Paragon system.
NASA Astrophysics Data System (ADS)
Cao, Jian; Chen, Jing-Bo; Dai, Meng-Xue
2018-01-01
An efficient finite-difference frequency-domain modeling of seismic wave propagation relies on the discrete schemes and appropriate solving methods. The average-derivative optimal scheme for the scalar wave modeling is advantageous in terms of the storage saving for the system of linear equations and the flexibility for arbitrary directional sampling intervals. However, using a LU-decomposition-based direct solver to solve its resulting system of linear equations is very costly for both memory and computational requirements. To address this issue, we consider establishing a multigrid-preconditioned BI-CGSTAB iterative solver fit for the average-derivative optimal scheme. The choice of preconditioning matrix and its corresponding multigrid components is made with the help of Fourier spectral analysis and local mode analysis, respectively, which is important for the convergence. Furthermore, we find that for the computation with unequal directional sampling interval, the anisotropic smoothing in the multigrid precondition may affect the convergence rate of this iterative solver. Successful numerical applications of this iterative solver for the homogenous and heterogeneous models in 2D and 3D are presented where the significant reduction of computer memory and the improvement of computational efficiency are demonstrated by comparison with the direct solver. In the numerical experiments, we also show that the unequal directional sampling interval will weaken the advantage of this multigrid-preconditioned iterative solver in the computing speed or, even worse, could reduce its accuracy in some cases, which implies the need for a reasonable control of directional sampling interval in the discretization.
Multigrid solution of the Navier-Stokes equations on highly stretched grids with defect correction
NASA Technical Reports Server (NTRS)
Sockol, Peter M.
1993-01-01
Relaxation-based multigrid solvers for the steady incompressible Navier-Stokes equations are examined to determine their computational speed and robustness. Four relaxation methods with a common discretization have been used as smoothers in a single tailored multigrid procedure. The equations are discretized on a staggered grid with first order upwind used for convection in the relaxation process on all grids and defect correction to second order central on the fine grid introduced once per multigrid cycle. A fixed W(1,1) cycle with full weighting of residuals is used in the FAS multigrid process. The resulting solvers have been applied to three 2D flow problems, over a range of Reynolds numbers, on both uniform and highly stretched grids. In all cases the L(sub 2) norm of the velocity changes is reduced to 10(exp -6) in a few 10's of fine grid sweeps. The results from this study are used to draw conclusions on the strengths and weaknesses of the individual relaxation schemes as well as those of the overall multigrid procedure when used as a solver on highly stretched grids.
Parallel Element Agglomeration Algebraic Multigrid and Upscaling Library
DOE Office of Scientific and Technical Information (OSTI.GOV)
Barker, Andrew T.; Benson, Thomas R.; Lee, Chak Shing
ParELAG is a parallel C++ library for numerical upscaling of finite element discretizations and element-based algebraic multigrid solvers. It provides optimal complexity algorithms to build multilevel hierarchies and solvers that can be used for solving a wide class of partial differential equations (elliptic, hyperbolic, saddle point problems) on general unstructured meshes. Additionally, a novel multilevel solver for saddle point problems with divergence constraint is implemented.
A multigrid solver for the semiconductor equations
NASA Technical Reports Server (NTRS)
Bachmann, Bernhard
1993-01-01
We present a multigrid solver for the exponential fitting method. The solver is applied to the current continuity equations of semiconductor device simulation in two dimensions. The exponential fitting method is based on a mixed finite element discretization using the lowest-order Raviart-Thomas triangular element. This discretization method yields a good approximation of front layers and guarantees current conservation. The corresponding stiffness matrix is an M-matrix. 'Standard' multigrid solvers, however, cannot be applied to the resulting system, as this is dominated by an unsymmetric part, which is due to the presence of strong convection in part of the domain. To overcome this difficulty, we explore the connection between Raviart-Thomas mixed methods and the nonconforming Crouzeix-Raviart finite element discretization. In this way we can construct nonstandard prolongation and restriction operators using easily computable weighted L(exp 2)-projections based on suitable quadrature rules and the upwind effects of the discretization. The resulting multigrid algorithm shows very good results, even for real-world problems and for locally refined grids.
An O(Nm(sup 2)) Plane Solver for the Compressible Navier-Stokes Equations
NASA Technical Reports Server (NTRS)
Thomas, J. L.; Bonhaus, D. L.; Anderson, W. K.; Rumsey, C. L.; Biedron, R. T.
1999-01-01
A hierarchical multigrid algorithm for efficient steady solutions to the two-dimensional compressible Navier-Stokes equations is developed and demonstrated. The algorithm applies multigrid in two ways: a Full Approximation Scheme (FAS) for a nonlinear residual equation and a Correction Scheme (CS) for a linearized defect correction implicit equation. Multigrid analyses which include the effect of boundary conditions in one direction are used to estimate the convergence rate of the algorithm for a model convection equation. Three alternating-line- implicit algorithms are compared in terms of efficiency. The analyses indicate that full multigrid efficiency is not attained in the general case; the number of cycles to attain convergence is dependent on the mesh density for high-frequency cross-stream variations. However, the dependence is reasonably small and fast convergence is eventually attained for any given frequency with either the FAS or the CS scheme alone. The paper summarizes numerical computations for which convergence has been attained to within truncation error in a few multigrid cycles for both inviscid and viscous ow simulations on highly stretched meshes.
Advanced computational simulations of water waves interacting with wave energy converters
NASA Astrophysics Data System (ADS)
Pathak, Ashish; Freniere, Cole; Raessi, Mehdi
2017-03-01
Wave energy converter (WEC) devices harness the renewable ocean wave energy and convert it into useful forms of energy, e.g. mechanical or electrical. This paper presents an advanced 3D computational framework to study the interaction between water waves and WEC devices. The computational tool solves the full Navier-Stokes equations and considers all important effects impacting the device performance. To enable large-scale simulations in fast turnaround times, the computational solver was developed in an MPI parallel framework. A fast multigrid preconditioned solver is introduced to solve the computationally expensive pressure Poisson equation. The computational solver was applied to two surface-piercing WEC geometries: bottom-hinged cylinder and flap. Their numerically simulated response was validated against experimental data. Additional simulations were conducted to investigate the applicability of Froude scaling in predicting full-scale WEC response from the model experiments.
Multigrid approaches to non-linear diffusion problems on unstructured meshes
NASA Technical Reports Server (NTRS)
Mavriplis, Dimitri J.; Bushnell, Dennis M. (Technical Monitor)
2001-01-01
The efficiency of three multigrid methods for solving highly non-linear diffusion problems on two-dimensional unstructured meshes is examined. The three multigrid methods differ mainly in the manner in which the nonlinearities of the governing equations are handled. These comprise a non-linear full approximation storage (FAS) multigrid method which is used to solve the non-linear equations directly, a linear multigrid method which is used to solve the linear system arising from a Newton linearization of the non-linear system, and a hybrid scheme which is based on a non-linear FAS multigrid scheme, but employs a linear solver on each level as a smoother. Results indicate that all methods are equally effective at converging the non-linear residual in a given number of grid sweeps, but that the linear solver is more efficient in cpu time due to the lower cost of linear versus non-linear grid sweeps.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Srinath Vadlamani; Scott Kruger; Travis Austin
Extended magnetohydrodynamic (MHD) codes are used to model the large, slow-growing instabilities that are projected to limit the performance of International Thermonuclear Experimental Reactor (ITER). The multiscale nature of the extended MHD equations requires an implicit approach. The current linear solvers needed for the implicit algorithm scale poorly because the resultant matrices are so ill-conditioned. A new solver is needed, especially one that scales to the petascale. The most successful scalable parallel processor solvers to date are multigrid solvers. Applying multigrid techniques to a set of equations whose fundamental modes are dispersive waves is a promising solution to CEMM problems.more » For the Phase 1, we implemented multigrid preconditioners from the HYPRE project of the Center for Applied Scientific Computing at LLNL via PETSc of the DOE SciDAC TOPS for the real matrix systems of the extended MHD code NIMROD which is a one of the primary modeling codes of the OFES-funded Center for Extended Magnetohydrodynamic Modeling (CEMM) SciDAC. We implemented the multigrid solvers on the fusion test problem that allows for real matrix systems with success, and in the process learned about the details of NIMROD data structures and the difficulties of inverting NIMROD operators. The further success of this project will allow for efficient usage of future petascale computers at the National Leadership Facilities: Oak Ridge National Laboratory, Argonne National Laboratory, and National Energy Research Scientific Computing Center. The project will be a collaborative effort between computational plasma physicists and applied mathematicians at Tech-X Corporation, applied mathematicians Front Range Scientific Computations, Inc. (who are collaborators on the HYPRE project), and other computational plasma physicists involved with the CEMM project.« less
Multiscale solvers and systematic upscaling in computational physics
NASA Astrophysics Data System (ADS)
Brandt, A.
2005-07-01
Multiscale algorithms can overcome the scale-born bottlenecks that plague most computations in physics. These algorithms employ separate processing at each scale of the physical space, combined with interscale iterative interactions, in ways which use finer scales very sparingly. Having been developed first and well known as multigrid solvers for partial differential equations, highly efficient multiscale techniques have more recently been developed for many other types of computational tasks, including: inverse PDE problems; highly indefinite (e.g., standing wave) equations; Dirac equations in disordered gauge fields; fast computation and updating of large determinants (as needed in QCD); fast integral transforms; integral equations; astrophysics; molecular dynamics of macromolecules and fluids; many-atom electronic structures; global and discrete-state optimization; practical graph problems; image segmentation and recognition; tomography (medical imaging); fast Monte-Carlo sampling in statistical physics; and general, systematic methods of upscaling (accurate numerical derivation of large-scale equations from microscopic laws).
NASA Astrophysics Data System (ADS)
Debreu, Laurent; Neveu, Emilie; Simon, Ehouarn; Le Dimet, Francois Xavier; Vidard, Arthur
2014-05-01
In order to lower the computational cost of the variational data assimilation process, we investigate the use of multigrid methods to solve the associated optimal control system. On a linear advection equation, we study the impact of the regularization term on the optimal control and the impact of discretization errors on the efficiency of the coarse grid correction step. We show that even if the optimal control problem leads to the solution of an elliptic system, numerical errors introduced by the discretization can alter the success of the multigrid methods. The view of the multigrid iteration as a preconditioner for a Krylov optimization method leads to a more robust algorithm. A scale dependent weighting of the multigrid preconditioner and the usual background error covariance matrix based preconditioner is proposed and brings significant improvements. [1] Laurent Debreu, Emilie Neveu, Ehouarn Simon, François-Xavier Le Dimet and Arthur Vidard, 2014: Multigrid solvers and multigrid preconditioners for the solution of variational data assimilation problems, submitted to QJRMS, http://hal.inria.fr/hal-00874643 [2] Emilie Neveu, Laurent Debreu and François-Xavier Le Dimet, 2011: Multigrid methods and data assimilation - Convergence study and first experiments on non-linear equations, ARIMA, 14, 63-80, http://intranet.inria.fr/international/arima/014/014005.html
Final Report: Subcontract B623868 Algebraic Multigrid solvers for coupled PDE systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brannick, J.
The Pennsylvania State University (“Subcontractor”) continued to work on the design of algebraic multigrid solvers for coupled systems of partial differential equations (PDEs) arising in numerical modeling of various applications, with a main focus on solving the Dirac equation arising in Quantum Chromodynamics (QCD). The goal of the proposed work was to develop combined geometric and algebraic multilevel solvers that are robust and lend themselves to efficient implementation on massively parallel heterogeneous computers for these QCD systems. The research in these areas built on previous works, focusing on the following three topics: (1) the development of parallel full-multigrid (PFMG) andmore » non-Galerkin coarsening techniques in this frame work for solving the Wilson Dirac system; (2) the use of these same Wilson MG solvers for preconditioning the Overlap and Domain Wall formulations of the Dirac equation; and (3) the design and analysis of algebraic coarsening algorithms for coupled PDE systems including Stokes equation, Maxwell equation and linear elasticity.« less
Convergence of Defect-Correction and Multigrid Iterations for Inviscid Flows
NASA Technical Reports Server (NTRS)
Diskin, Boris; Thomas, James L.
2011-01-01
Convergence of multigrid and defect-correction iterations is comprehensively studied within different incompressible and compressible inviscid regimes on high-density grids. Good smoothing properties of the defect-correction relaxation have been shown using both a modified Fourier analysis and a more general idealized-coarse-grid analysis. Single-grid defect correction alone has some slowly converging iterations on grids of medium density. The convergence is especially slow for near-sonic flows and for very low compressible Mach numbers. Additionally, the fast asymptotic convergence seen on medium density grids deteriorates on high-density grids. Certain downstream-boundary modes are very slowly damped on high-density grids. Multigrid scheme accelerates convergence of the slow defect-correction iterations to the extent determined by the coarse-grid correction. The two-level asymptotic convergence rates are stable and significantly below one in most of the regions but slow convergence is noted for near-sonic and very low-Mach compressible flows. Multigrid solver has been applied to the NACA 0012 airfoil and to different flow regimes, such as near-tangency and stagnation. Certain convergence difficulties have been encountered within stagnation regions. Nonetheless, for the airfoil flow, with a sharp trailing-edge, residuals were fast converging for a subcritical flow on a sequence of grids. For supercritical flow, residuals converged slower on some intermediate grids than on the finest grid or the two coarsest grids.
Directional Agglomeration Multigrid Techniques for High Reynolds Number Viscous Flow Solvers
NASA Technical Reports Server (NTRS)
1998-01-01
A preconditioned directional-implicit agglomeration algorithm is developed for solving two- and three-dimensional viscous flows on highly anisotropic unstructured meshes of mixed-element types. The multigrid smoother consists of a pre-conditioned point- or line-implicit solver which operates on lines constructed in the unstructured mesh using a weighted graph algorithm. Directional coarsening or agglomeration is achieved using a similar weighted graph algorithm. A tight coupling of the line construction and directional agglomeration algorithms enables the use of aggressive coarsening ratios in the multigrid algorithm, which in turn reduces the cost of a multigrid cycle. Convergence rates which are independent of the degree of grid stretching are demonstrated in both two and three dimensions. Further improvement of the three-dimensional convergence rates through a GMRES technique is also demonstrated.
Vectorized multigrid Poisson solver for the CDC CYBER 205
NASA Technical Reports Server (NTRS)
Barkai, D.; Brandt, M. A.
1984-01-01
The full multigrid (FMG) method is applied to the two dimensional Poisson equation with Dirichlet boundary conditions. This has been chosen as a relatively simple test case for examining the efficiency of fully vectorizing of the multigrid method. Data structure and programming considerations and techniques are discussed, accompanied by performance details.
NASA Astrophysics Data System (ADS)
Kifonidis, K.; Müller, E.
2012-08-01
Aims: We describe and study a family of new multigrid iterative solvers for the multidimensional, implicitly discretized equations of hydrodynamics. Schemes of this class are free of the Courant-Friedrichs-Lewy condition. They are intended for simulations in which widely differing wave propagation timescales are present. A preferred solver in this class is identified. Applications to some simple stiff test problems that are governed by the compressible Euler equations, are presented to evaluate the convergence behavior, and the stability properties of this solver. Algorithmic areas are determined where further work is required to make the method sufficiently efficient and robust for future application to difficult astrophysical flow problems. Methods: The basic equations are formulated and discretized on non-orthogonal, structured curvilinear meshes. Roe's approximate Riemann solver and a second-order accurate reconstruction scheme are used for spatial discretization. Implicit Runge-Kutta (ESDIRK) schemes are employed for temporal discretization. The resulting discrete equations are solved with a full-coarsening, non-linear multigrid method. Smoothing is performed with multistage-implicit smoothers. These are applied here to the time-dependent equations by means of dual time stepping. Results: For steady-state problems, our results show that the efficiency of the present approach is comparable to the best implicit solvers for conservative discretizations of the compressible Euler equations that can be found in the literature. The use of red-black as opposed to symmetric Gauss-Seidel iteration in the multistage-smoother is found to have only a minor impact on multigrid convergence. This should enable scalable parallelization without having to seriously compromise the method's algorithmic efficiency. For time-dependent test problems, our results reveal that the multigrid convergence rate degrades with increasing Courant numbers (i.e. time step sizes). Beyond a Courant number of nine thousand, even complete multigrid breakdown is observed. Local Fourier analysis indicates that the degradation of the convergence rate is associated with the coarse-grid correction algorithm. An implicit scheme for the Euler equations that makes use of the present method was, nevertheless, able to outperform a standard explicit scheme on a time-dependent problem with a Courant number of order 1000. Conclusions: For steady-state problems, the described approach enables the construction of parallelizable, efficient, and robust implicit hydrodynamics solvers. The applicability of the method to time-dependent problems is presently restricted to cases with moderately high Courant numbers. This is due to an insufficient coarse-grid correction of the employed multigrid algorithm for large time steps. Further research will be required to help us to understand and overcome the observed multigrid convergence difficulties for time-dependent problems.
Architecting the Finite Element Method Pipeline for the GPU.
Fu, Zhisong; Lewis, T James; Kirby, Robert M; Whitaker, Ross T
2014-02-01
The finite element method (FEM) is a widely employed numerical technique for approximating the solution of partial differential equations (PDEs) in various science and engineering applications. Many of these applications benefit from fast execution of the FEM pipeline. One way to accelerate the FEM pipeline is by exploiting advances in modern computational hardware, such as the many-core streaming processors like the graphical processing unit (GPU). In this paper, we present the algorithms and data-structures necessary to move the entire FEM pipeline to the GPU. First we propose an efficient GPU-based algorithm to generate local element information and to assemble the global linear system associated with the FEM discretization of an elliptic PDE. To solve the corresponding linear system efficiently on the GPU, we implement a conjugate gradient method preconditioned with a geometry-informed algebraic multi-grid (AMG) method preconditioner. We propose a new fine-grained parallelism strategy, a corresponding multigrid cycling stage and efficient data mapping to the many-core architecture of GPU. Comparison of our on-GPU assembly versus a traditional serial implementation on the CPU achieves up to an 87 × speedup. Focusing on the linear system solver alone, we achieve a speedup of up to 51 × versus use of a comparable state-of-the-art serial CPU linear system solver. Furthermore, the method compares favorably with other GPU-based, sparse, linear solvers.
Assessment of Linear Finite-Difference Poisson-Boltzmann Solvers
Wang, Jun; Luo, Ray
2009-01-01
CPU time and memory usage are two vital issues that any numerical solvers for the Poisson-Boltzmann equation have to face in biomolecular applications. In this study we systematically analyzed the CPU time and memory usage of five commonly used finite-difference solvers with a large and diversified set of biomolecular structures. Our comparative analysis shows that modified incomplete Cholesky conjugate gradient and geometric multigrid are the most efficient in the diversified test set. For the two efficient solvers, our test shows that their CPU times increase approximately linearly with the numbers of grids. Their CPU times also increase almost linearly with the negative logarithm of the convergence criterion at very similar rate. Our comparison further shows that geometric multigrid performs better in the large set of tested biomolecules. However, modified incomplete Cholesky conjugate gradient is superior to geometric multigrid in molecular dynamics simulations of tested molecules. We also investigated other significant components in numerical solutions of the Poisson-Boltzmann equation. It turns out that the time-limiting step is the free boundary condition setup for the linear systems for the selected proteins if the electrostatic focusing is not used. Thus, development of future numerical solvers for the Poisson-Boltzmann equation should balance all aspects of the numerical procedures in realistic biomolecular applications. PMID:20063271
Topology-Aware Performance Optimization and Modeling of Adaptive Mesh Refinement Codes for Exascale
Chan, Cy P.; Bachan, John D.; Kenny, Joseph P.; ...
2017-01-26
Here, we introduce a topology-aware performance optimization and modeling workflow for AMR simulation that includes two new modeling tools, ProgrAMR and Mota Mapper, which interface with the BoxLib AMR framework and the SSTmacro network simulator. ProgrAMR allows us to generate and model the execution of task dependency graphs from high-level specifications of AMR-based applications, which we demonstrate by analyzing two example AMR-based multigrid solvers with varying degrees of asynchrony. Mota Mapper generates multiobjective, network topology-aware box mappings, which we apply to optimize the data layout for the example multigrid solvers. While the sensitivity of these solvers to layout and executionmore » strategy appears to be modest for balanced scenarios, the impact of better mapping algorithms can be significant when performance is highly constrained by network hop latency. Furthermore, we show that network latency in the multigrid bottom solve is the main contributing factor preventing good scaling on exascale-class machines.« less
Topology-Aware Performance Optimization and Modeling of Adaptive Mesh Refinement Codes for Exascale
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chan, Cy P.; Bachan, John D.; Kenny, Joseph P.
Here, we introduce a topology-aware performance optimization and modeling workflow for AMR simulation that includes two new modeling tools, ProgrAMR and Mota Mapper, which interface with the BoxLib AMR framework and the SSTmacro network simulator. ProgrAMR allows us to generate and model the execution of task dependency graphs from high-level specifications of AMR-based applications, which we demonstrate by analyzing two example AMR-based multigrid solvers with varying degrees of asynchrony. Mota Mapper generates multiobjective, network topology-aware box mappings, which we apply to optimize the data layout for the example multigrid solvers. While the sensitivity of these solvers to layout and executionmore » strategy appears to be modest for balanced scenarios, the impact of better mapping algorithms can be significant when performance is highly constrained by network hop latency. Furthermore, we show that network latency in the multigrid bottom solve is the main contributing factor preventing good scaling on exascale-class machines.« less
Three-dimensional unstructured grid Euler computations using a fully-implicit, upwind method
NASA Technical Reports Server (NTRS)
Whitaker, David L.
1993-01-01
A method has been developed to solve the Euler equations on a three-dimensional unstructured grid composed of tetrahedra. The method uses an upwind flow solver with a linearized, backward-Euler time integration scheme. Each time step results in a sparse linear system of equations which is solved by an iterative, sparse matrix solver. Local-time stepping, switched evolution relaxation (SER), preconditioning and reuse of the Jacobian are employed to accelerate the convergence rate. Implicit boundary conditions were found to be extremely important for fast convergence. Numerical experiments have shown that convergence rates comparable to that of a multigrid, central-difference scheme are achievable on the same mesh. Results are presented for several grids about an ONERA M6 wing.
NONLINEAR MULTIGRID SOLVER EXPLOITING AMGe COARSE SPACES WITH APPROXIMATION PROPERTIES
DOE Office of Scientific and Technical Information (OSTI.GOV)
Christensen, Max La Cour; Villa, Umberto E.; Engsig-Karup, Allan P.
The paper introduces a nonlinear multigrid solver for mixed nite element discretizations based on the Full Approximation Scheme (FAS) and element-based Algebraic Multigrid (AMGe). The main motivation to use FAS for unstruc- tured problems is the guaranteed approximation property of the AMGe coarse spaces that were developed recently at Lawrence Livermore National Laboratory. These give the ability to derive stable and accurate coarse nonlinear discretization problems. The previous attempts (including ones with the original AMGe method, [5, 11]), were less successful due to lack of such good approximation properties of the coarse spaces. With coarse spaces with approximation properties, ourmore » FAS approach on un- structured meshes should be as powerful/successful as FAS on geometrically re ned meshes. For comparison, Newton's method and Picard iterations with an inner state-of-the-art linear solver is compared to FAS on a nonlinear saddle point problem with applications to porous media ow. It is demonstrated that FAS is faster than Newton's method and Picard iterations for the experiments considered here. Due to the guaranteed approximation properties of our AMGe, the coarse spaces are very accurate, providing a solver with the potential for mesh-independent convergence on general unstructured meshes.« less
Advanced Multigrid Solvers for Fluid Dynamics
NASA Technical Reports Server (NTRS)
Brandt, Achi
1999-01-01
The main objective of this project has been to support the development of multigrid techniques in computational fluid dynamics that can achieve "textbook multigrid efficiency" (TME), which is several orders of magnitude faster than current industrial CFD solvers. Toward that goal we have assembled a detailed table which lists every foreseen kind of computational difficulty for achieving it, together with the possible ways for resolving the difficulty, their current state of development, and references. We have developed several codes to test and demonstrate, in the framework of simple model problems, several approaches for overcoming the most important of the listed difficulties that had not been resolved before. In particular, TME has been demonstrated for incompressible flows on one hand, and for near-sonic flows on the other hand. General approaches were advanced for the relaxation of stagnation points and boundary conditions under various situations. Also, new algebraic multigrid techniques were formed for treating unstructured grid formulations. More details on all these are given below.
A Fast and Robust Poisson-Boltzmann Solver Based on Adaptive Cartesian Grids
Boschitsch, Alexander H.; Fenley, Marcia O.
2011-01-01
An adaptive Cartesian grid (ACG) concept is presented for the fast and robust numerical solution of the 3D Poisson-Boltzmann Equation (PBE) governing the electrostatic interactions of large-scale biomolecules and highly charged multi-biomolecular assemblies such as ribosomes and viruses. The ACG offers numerous advantages over competing grid topologies such as regular 3D lattices and unstructured grids. For very large biological molecules and multi-biomolecule assemblies, the total number of grid-points is several orders of magnitude less than that required in a conventional lattice grid used in the current PBE solvers thus allowing the end user to obtain accurate and stable nonlinear PBE solutions on a desktop computer. Compared to tetrahedral-based unstructured grids, ACG offers a simpler hierarchical grid structure, which is naturally suited to multigrid, relieves indirect addressing requirements and uses fewer neighboring nodes in the finite difference stencils. Construction of the ACG and determination of the dielectric/ionic maps are straightforward, fast and require minimal user intervention. Charge singularities are eliminated by reformulating the problem to produce the reaction field potential in the molecular interior and the total electrostatic potential in the exterior ionic solvent region. This approach minimizes grid-dependency and alleviates the need for fine grid spacing near atomic charge sites. The technical portion of this paper contains three parts. First, the ACG and its construction for general biomolecular geometries are described. Next, a discrete approximation to the PBE upon this mesh is derived. Finally, the overall solution procedure and multigrid implementation are summarized. Results obtained with the ACG-based PBE solver are presented for: (i) a low dielectric spherical cavity, containing interior point charges, embedded in a high dielectric ionic solvent – analytical solutions are available for this case, thus allowing rigorous assessment of the solution accuracy; (ii) a pair of low dielectric charged spheres embedded in a ionic solvent to compute electrostatic interaction free energies as a function of the distance between sphere centers; (iii) surface potentials of proteins, nucleic acids and their larger-scale assemblies such as ribosomes; and (iv) electrostatic solvation free energies and their salt sensitivities – obtained with both linear and nonlinear Poisson-Boltzmann equation – for a large set of proteins. These latter results along with timings can serve as benchmarks for comparing the performance of different PBE solvers. PMID:21984876
On some Aitken-like acceleration of the Schwarz method
NASA Astrophysics Data System (ADS)
Garbey, M.; Tromeur-Dervout, D.
2002-12-01
In this paper we present a family of domain decomposition based on Aitken-like acceleration of the Schwarz method seen as an iterative procedure with a linear rate of convergence. We first present the so-called Aitken-Schwarz procedure for linear differential operators. The solver can be a direct solver when applied to the Helmholtz problem with five-point finite difference scheme on regular grids. We then introduce the Steffensen-Schwarz variant which is an iterative domain decomposition solver that can be applied to linear and nonlinear problems. We show that these solvers have reasonable numerical efficiency compared to classical fast solvers for the Poisson problem or multigrids for more general linear and nonlinear elliptic problems. However, the salient feature of our method is that our algorithm has high tolerance to slow network in the context of distributed parallel computing and is attractive, generally speaking, to use with computer architecture for which performance is limited by the memory bandwidth rather than the flop performance of the CPU. This is nowadays the case for most parallel. computer using the RISC processor architecture. We will illustrate this highly desirable property of our algorithm with large-scale computing experiments.
NASA Astrophysics Data System (ADS)
Moghaderi, Hamid; Dehghan, Mehdi; Donatelli, Marco; Mazza, Mariarosa
2017-12-01
Fractional diffusion equations (FDEs) are a mathematical tool used for describing some special diffusion phenomena arising in many different applications like porous media and computational finance. In this paper, we focus on a two-dimensional space-FDE problem discretized by means of a second order finite difference scheme obtained as combination of the Crank-Nicolson scheme and the so-called weighted and shifted Grünwald formula. By fully exploiting the Toeplitz-like structure of the resulting linear system, we provide a detailed spectral analysis of the coefficient matrix at each time step, both in the case of constant and variable diffusion coefficients. Such a spectral analysis has a very crucial role, since it can be used for designing fast and robust iterative solvers. In particular, we employ the obtained spectral information to define a Galerkin multigrid method based on the classical linear interpolation as grid transfer operator and damped-Jacobi as smoother, and to prove the linear convergence rate of the corresponding two-grid method. The theoretical analysis suggests that the proposed grid transfer operator is strong enough for working also with the V-cycle method and the geometric multigrid. On this basis, we introduce two computationally favourable variants of the proposed multigrid method and we use them as preconditioners for Krylov methods. Several numerical results confirm that the resulting preconditioning strategies still keep a linear convergence rate.
Three-Dimensional High-Lift Analysis Using a Parallel Unstructured Multigrid Solver
NASA Technical Reports Server (NTRS)
Mavriplis, Dimitri J.
1998-01-01
A directional implicit unstructured agglomeration multigrid solver is ported to shared and distributed memory massively parallel machines using the explicit domain-decomposition and message-passing approach. Because the algorithm operates on local implicit lines in the unstructured mesh, special care is required in partitioning the problem for parallel computing. A weighted partitioning strategy is described which avoids breaking the implicit lines across processor boundaries, while incurring minimal additional communication overhead. Good scalability is demonstrated on a 128 processor SGI Origin 2000 machine and on a 512 processor CRAY T3E machine for reasonably fine grids. The feasibility of performing large-scale unstructured grid calculations with the parallel multigrid algorithm is demonstrated by computing the flow over a partial-span flap wing high-lift geometry on a highly resolved grid of 13.5 million points in approximately 4 hours of wall clock time on the CRAY T3E.
ML 3.0 smoothed aggregation user's guide.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sala, Marzio; Hu, Jonathan Joseph; Tuminaro, Raymond Stephen
2004-05-01
ML is a multigrid preconditioning package intended to solve linear systems of equations Az = b where A is a user supplied n x n sparse matrix, b is a user supplied vector of length n and x is a vector of length n to be computed. ML should be used on large sparse linear systems arising from partial differential equation (PDE) discretizations. While technically any linear system can be considered, ML should be used on linear systems that correspond to things that work well with multigrid methods (e.g. elliptic PDEs). ML can be used as a stand-alone package ormore » to generate preconditioners for a traditional iterative solver package (e.g. Krylov methods). We have supplied support for working with the AZTEC 2.1 and AZTECOO iterative package [15]. However, other solvers can be used by supplying a few functions. This document describes one specific algebraic multigrid approach: smoothed aggregation. This approach is used within several specialized multigrid methods: one for the eddy current formulation for Maxwell's equations, and a multilevel and domain decomposition method for symmetric and non-symmetric systems of equations (like elliptic equations, or compressible and incompressible fluid dynamics problems). Other methods exist within ML but are not described in this document. Examples are given illustrating the problem definition and exercising multigrid options.« less
ML 3.1 smoothed aggregation user's guide.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sala, Marzio; Hu, Jonathan Joseph; Tuminaro, Raymond Stephen
2004-10-01
ML is a multigrid preconditioning package intended to solve linear systems of equations Ax = b where A is a user supplied n x n sparse matrix, b is a user supplied vector of length n and x is a vector of length n to be computed. ML should be used on large sparse linear systems arising from partial differential equation (PDE) discretizations. While technically any linear system can be considered, ML should be used on linear systems that correspond to things that work well with multigrid methods (e.g. elliptic PDEs). ML can be used as a stand-alone package ormore » to generate preconditioners for a traditional iterative solver package (e.g. Krylov methods). We have supplied support for working with the Aztec 2.1 and AztecOO iterative package [16]. However, other solvers can be used by supplying a few functions. This document describes one specific algebraic multigrid approach: smoothed aggregation. This approach is used within several specialized multigrid methods: one for the eddy current formulation for Maxwell's equations, and a multilevel and domain decomposition method for symmetric and nonsymmetric systems of equations (like elliptic equations, or compressible and incompressible fluid dynamics problems). Other methods exist within ML but are not described in this document. Examples are given illustrating the problem definition and exercising multigrid options.« less
Scalable smoothing strategies for a geometric multigrid method for the immersed boundary equations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bhalla, Amneet Pal Singh; Knepley, Matthew G.; Adams, Mark F.
2016-12-20
The immersed boundary (IB) method is a widely used approach to simulating fluid-structure interaction (FSI). Although explicit versions of the IB method can suffer from severe time step size restrictions, these methods remain popular because of their simplicity and generality. In prior work (Guy et al., Adv Comput Math, 2015), some of us developed a geometric multigrid preconditioner for a stable semi-implicit IB method under Stokes flow conditions; however, this solver methodology used a Vanka-type smoother that presented limited opportunities for parallelization. This work extends this Stokes-IB solver methodology by developing smoothing techniques that are suitable for parallel implementation. Specifically,more » we demonstrate that an additive version of the Vanka smoother can yield an effective multigrid preconditioner for the Stokes-IB equations, and we introduce an efficient Schur complement-based smoother that is also shown to be effective for the Stokes-IB equations. We investigate the performance of these solvers for a broad range of material stiffnesses, both for Stokes flows and flows at nonzero Reynolds numbers, and for thick and thin structural models. We show here that linear solver performance degrades with increasing Reynolds number and material stiffness, especially for thin interface cases. Nonetheless, the proposed approaches promise to yield effective solution algorithms, especially at lower Reynolds numbers and at modest-to-high elastic stiffnesses.« less
Wilson, John D.; Naff, Richard L.
2004-01-01
A geometric multigrid solver (GMG), based in the preconditioned conjugate gradient algorithm, has been developed for solving systems of equations resulting from applying the cell-centered finite difference algorithm to flow in porous media. This solver has been adapted to the U.S. Geological Survey ground-water flow model MODFLOW-2000. The documentation herein is a description of the solver and the adaptation to MODFLOW-2000.
On the Performance of an Algebraic MultigridSolver on Multicore Clusters
DOE Office of Scientific and Technical Information (OSTI.GOV)
Baker, A H; Schulz, M; Yang, U M
2010-04-29
Algebraic multigrid (AMG) solvers have proven to be extremely efficient on distributed-memory architectures. However, when executed on modern multicore cluster architectures, we face new challenges that can significantly harm AMG's performance. We discuss our experiences on such an architecture and present a set of techniques that help users to overcome the associated problems, including thread and process pinning and correct memory associations. We have implemented most of the techniques in a MultiCore SUPport library (MCSup), which helps to map OpenMP applications to multicore machines. We present results using both an MPI-only and a hybrid MPI/OpenMP model.
Multi-Level Adaptive Techniques (MLAT) for singular-perturbation problems
NASA Technical Reports Server (NTRS)
Brandt, A.
1978-01-01
The multilevel (multigrid) adaptive technique, a general strategy of solving continuous problems by cycling between coarser and finer levels of discretization is described. It provides very fast general solvers, together with adaptive, nearly optimal discretization schemes. In the process, boundary layers are automatically either resolved or skipped, depending on a control function which expresses the computational goal. The global error decreases exponentially as a function of the overall computational work, in a uniform rate independent of the magnitude of the singular-perturbation terms. The key is high-order uniformly stable difference equations, and uniformly smoothing relaxation schemes.
An Upwind Multigrid Algorithm for Calculating Flows on Unstructured Grids
NASA Technical Reports Server (NTRS)
Bonhaus, Daryl L.
1993-01-01
An algorithm is described that calculates inviscid, laminar, and turbulent flows on triangular meshes with an upwind discretization. A brief description of the base solver and the multigrid implementation is given, followed by results that consist mainly of convergence rates for inviscid and viscous flows over a NACA four-digit airfoil section. The results show that multigrid does accelerate convergence when the same relaxation parameters that yield good single-grid performance are used; however, larger gains in performance can be realized by doing less work in the relaxation scheme.
Implicit adaptive mesh refinement for 2D reduced resistive magnetohydrodynamics
NASA Astrophysics Data System (ADS)
Philip, Bobby; Chacón, Luis; Pernice, Michael
2008-10-01
An implicit structured adaptive mesh refinement (SAMR) solver for 2D reduced magnetohydrodynamics (MHD) is described. The time-implicit discretization is able to step over fast normal modes, while the spatial adaptivity resolves thin, dynamically evolving features. A Jacobian-free Newton-Krylov method is used for the nonlinear solver engine. For preconditioning, we have extended the optimal "physics-based" approach developed in [L. Chacón, D.A. Knoll, J.M. Finn, An implicit, nonlinear reduced resistive MHD solver, J. Comput. Phys. 178 (2002) 15-36] (which employed multigrid solver technology in the preconditioner for scalability) to SAMR grids using the well-known Fast Adaptive Composite grid (FAC) method [S. McCormick, Multilevel Adaptive Methods for Partial Differential Equations, SIAM, Philadelphia, PA, 1989]. A grid convergence study demonstrates that the solver performance is independent of the number of grid levels and only depends on the finest resolution considered, and that it scales well with grid refinement. The study of error generation and propagation in our SAMR implementation demonstrates that high-order (cubic) interpolation during regridding, combined with a robustly damping second-order temporal scheme such as BDF2, is required to minimize impact of grid errors at coarse-fine interfaces on the overall error of the computation for this MHD application. We also demonstrate that our implementation features the desired property that the overall numerical error is dependent only on the finest resolution level considered, and not on the base-grid resolution or on the number of refinement levels present during the simulation. We demonstrate the effectiveness of the tool on several challenging problems.
Multigrid time-accurate integration of Navier-Stokes equations
NASA Technical Reports Server (NTRS)
Arnone, Andrea; Liou, Meng-Sing; Povinelli, Louis A.
1993-01-01
Efficient acceleration techniques typical of explicit steady-state solvers are extended to time-accurate calculations. Stability restrictions are greatly reduced by means of a fully implicit time discretization. A four-stage Runge-Kutta scheme with local time stepping, residual smoothing, and multigridding is used instead of traditional time-expensive factorizations. Some applications to natural and forced unsteady viscous flows show the capability of the procedure.
The multigrid preconditioned conjugate gradient method
NASA Technical Reports Server (NTRS)
Tatebe, Osamu
1993-01-01
A multigrid preconditioned conjugate gradient method (MGCG method), which uses the multigrid method as a preconditioner of the PCG method, is proposed. The multigrid method has inherent high parallelism and improves convergence of long wavelength components, which is important in iterative methods. By using this method as a preconditioner of the PCG method, an efficient method with high parallelism and fast convergence is obtained. First, it is considered a necessary condition of the multigrid preconditioner in order to satisfy requirements of a preconditioner of the PCG method. Next numerical experiments show a behavior of the MGCG method and that the MGCG method is superior to both the ICCG method and the multigrid method in point of fast convergence and high parallelism. This fast convergence is understood in terms of the eigenvalue analysis of the preconditioned matrix. From this observation of the multigrid preconditioner, it is realized that the MGCG method converges in very few iterations and the multigrid preconditioner is a desirable preconditioner of the conjugate gradient method.
NASA Astrophysics Data System (ADS)
Guan, Zhen; Pekurovsky, Dmitry; Luce, Jason; Thornton, Katsuyo; Lowengrub, John
The structural phase field crystal (XPFC) model can be used to model grain growth in polycrystalline materials at diffusive time-scales while maintaining atomic scale resolution. However, the governing equation of the XPFC model is an integral-partial-differential-equation (IPDE), which poses challenges in implementation onto high performance computing (HPC) platforms. In collaboration with the XSEDE Extended Collaborative Support Service, we developed a distributed memory HPC solver for the XPFC model, which combines parallel multigrid and P3DFFT. The performance benchmarking on the Stampede supercomputer indicates near linear strong and weak scaling for both multigrid and transfer time between multigrid and FFT modules up to 1024 cores. Scalability of the FFT module begins to decline at 128 cores, but it is sufficient for the type of problem we will be examining. We have demonstrated simulations using 1024 cores, and we expect to achieve 4096 cores and beyond. Ongoing work involves optimization of MPI/OpenMP-based codes for the Intel KNL Many-Core Architecture. This optimizes the code for coming pre-exascale systems, in particular many-core systems such as Stampede 2.0 and Cori 2 at NERSC, without sacrificing efficiency on other general HPC systems.
Application of an unstructured grid flow solver to planes, trains and automobiles
NASA Technical Reports Server (NTRS)
Spragle, Gregory S.; Smith, Wayne A.; Yadlin, Yoram
1993-01-01
Rampant, an unstructured flow solver developed at Fluent Inc., is used to compute three-dimensional, viscous, turbulent, compressible flow fields within complex solution domains. Rampant is an explicit, finite-volume flow solver capable of computing flow fields using either triangular (2d) or tetrahedral (3d) unstructured grids. Local time stepping, implicit residual smoothing, and multigrid techniques are used to accelerate the convergence of the explicit scheme. The paper describes the Rampant flow solver and presents flow field solutions about a plane, train, and automobile.
Array-based, parallel hierarchical mesh refinement algorithms for unstructured meshes
Ray, Navamita; Grindeanu, Iulian; Zhao, Xinglin; ...
2016-08-18
In this paper, we describe an array-based hierarchical mesh refinement capability through uniform refinement of unstructured meshes for efficient solution of PDE's using finite element methods and multigrid solvers. A multi-degree, multi-dimensional and multi-level framework is designed to generate the nested hierarchies from an initial coarse mesh that can be used for a variety of purposes such as in multigrid solvers/preconditioners, to do solution convergence and verification studies and to improve overall parallel efficiency by decreasing I/O bandwidth requirements (by loading smaller meshes and in memory refinement). We also describe a high-order boundary reconstruction capability that can be used tomore » project the new points after refinement using high-order approximations instead of linear projection in order to minimize and provide more control on geometrical errors introduced by curved boundaries.The capability is developed under the parallel unstructured mesh framework "Mesh Oriented dAtaBase" (MOAB Tautges et al. (2004)). We describe the underlying data structures and algorithms to generate such hierarchies in parallel and present numerical results for computational efficiency and effect on mesh quality. Furthermore, we also present results to demonstrate the applicability of the developed capability to study convergence properties of different point projection schemes for various mesh hierarchies and to a multigrid finite-element solver for elliptic problems.« less
Directional Agglomeration Multigrid Techniques for High-Reynolds Number Viscous Flows
NASA Technical Reports Server (NTRS)
Mavriplis, Dimitri J.
1998-01-01
A preconditioned directional-implicit agglomeration algorithm is developed for solving two- and three-dimensional viscous flows on highly anisotropic unstructured meshes of mixed-element types. The multigrid smoother consists of a pre-conditioned point- or line-implicit solver which operates on lines constructed in the unstructured mesh using a weighted graph algorithm. Directional coarsening or agglomeration is achieved using a similar weighted graph algorithm. A tight coupling of the line construction and directional agglomeration algorithms enables the use of aggressive coarsening ratios in the multigrid algorithm, which in turn reduces the cost of a multigrid cycle. Convergence rates which are independent of the degree of grid stretching are demonstrated in both two and three dimensions. Further improvement of the three-dimensional convergence rates through a GMRES technique is also demonstrated.
Reducing Communication in Algebraic Multigrid Using Additive Variants
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vassilevski, Panayot S.; Yang, Ulrike Meier
Algebraic multigrid (AMG) has proven to be an effective scalable solver on many high performance computers. However, its increasing communication complexity on coarser levels has shown to seriously impact its performance on computers with high communication cost. Moreover, additive AMG variants provide not only increased parallelism as well as decreased numbers of messages per cycle but also generally exhibit slower convergence. Here we present various new additive variants with convergence rates that are significantly improved compared to the classical additive algebraic multigrid method and investigate their potential for decreased communication, and improved communication-computation overlap, features that are essential for goodmore » performance on future exascale architectures.« less
Reducing Communication in Algebraic Multigrid Using Additive Variants
Vassilevski, Panayot S.; Yang, Ulrike Meier
2014-02-12
Algebraic multigrid (AMG) has proven to be an effective scalable solver on many high performance computers. However, its increasing communication complexity on coarser levels has shown to seriously impact its performance on computers with high communication cost. Moreover, additive AMG variants provide not only increased parallelism as well as decreased numbers of messages per cycle but also generally exhibit slower convergence. Here we present various new additive variants with convergence rates that are significantly improved compared to the classical additive algebraic multigrid method and investigate their potential for decreased communication, and improved communication-computation overlap, features that are essential for goodmore » performance on future exascale architectures.« less
Array-based Hierarchical Mesh Generation in Parallel
Ray, Navamita; Grindeanu, Iulian; Zhao, Xinglin; ...
2015-11-03
In this paper, we describe an array-based hierarchical mesh generation capability through uniform refinement of unstructured meshes for efficient solution of PDE's using finite element methods and multigrid solvers. A multi-degree, multi-dimensional and multi-level framework is designed to generate the nested hierarchies from an initial mesh that can be used for a number of purposes such as multi-level methods to generating large meshes. The capability is developed under the parallel mesh framework “Mesh Oriented dAtaBase” a.k.a MOAB. We describe the underlying data structures and algorithms to generate such hierarchies and present numerical results for computational efficiency and mesh quality. Inmore » conclusion, we also present results to demonstrate the applicability of the developed capability to a multigrid finite-element solver.« less
Detwiler, R.L.; Mehl, S.; Rajaram, H.; Cheung, W.W.
2002-01-01
Numerical solution of large-scale ground water flow and transport problems is often constrained by the convergence behavior of the iterative solvers used to solve the resulting systems of equations. We demonstrate the ability of an algebraic multigrid algorithm (AMG) to efficiently solve the large, sparse systems of equations that result from computational models of ground water flow and transport in large and complex domains. Unlike geometric multigrid methods, this algorithm is applicable to problems in complex flow geometries, such as those encountered in pore-scale modeling of two-phase flow and transport. We integrated AMG into MODFLOW 2000 to compare two- and three-dimensional flow simulations using AMG to simulations using PCG2, a preconditioned conjugate gradient solver that uses the modified incomplete Cholesky preconditioner and is included with MODFLOW 2000. CPU times required for convergence with AMG were up to 140 times faster than those for PCG2. The cost of this increased speed was up to a nine-fold increase in required random access memory (RAM) for the three-dimensional problems and up to a four-fold increase in required RAM for the two-dimensional problems. We also compared two-dimensional numerical simulations of steady-state transport using AMG and the generalized minimum residual method with an incomplete LU-decomposition preconditioner. For these transport simulations, AMG yielded increased speeds of up to 17 times with only a 20% increase in required RAM. The ability of AMG to solve flow and transport problems in large, complex flow systems and its ready availability make it an ideal solver for use in both field-scale and pore-scale modeling.
Ringe, Stefan; Oberhofer, Harald; Hille, Christoph; Matera, Sebastian; Reuter, Karsten
2016-08-09
The size-modified Poisson-Boltzmann (MPB) equation is an efficient implicit solvation model which also captures electrolytic solvent effects. It combines an account of the dielectric solvent response with a mean-field description of solvated finite-sized ions. We present a general solution scheme for the MPB equation based on a fast function-space-oriented Newton method and a Green's function preconditioned iterative linear solver. In contrast to popular multigrid solvers, this approach allows us to fully exploit specialized integration grids and optimized integration schemes. We describe a corresponding numerically efficient implementation for the full-potential density-functional theory (DFT) code FHI-aims. We show that together with an additional Stern layer correction the DFT+MPB approach can describe the mean activity coefficient of a KCl aqueous solution over a wide range of concentrations. The high sensitivity of the calculated activity coefficient on the employed ionic parameters thereby suggests to use extensively tabulated experimental activity coefficients of salt solutions for a systematic parametrization protocol.
NASA Astrophysics Data System (ADS)
Tzanos, Constantine P.
1992-10-01
A higher-order differencing scheme (Tzanos, 1990) is used in conjunction with a multigrid approach to obtain accurate solutions of the Navier-Stokes convection-diffusion equations at high Re numbers. Flow in a square cavity with a moving lid is used as a test problem. a multigrid approach based on the additive correction method (Settari and Aziz) and an iterative incomplete lower and upper solver demonstrated good performance for the whole range of Re number under consideration (from 1000 to 10,000) and for both uniform and nonuniform grids. It is concluded that the combination of the higher-order differencing scheme with a multigrid approach proved to be an effective technique for giving accurate solutions of the Navier-Stokes equations at high Re numbers.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Druinsky, Alex; Ghysels, Pieter; Li, Xiaoye S.
In this paper, we study the performance of a two-level algebraic-multigrid algorithm, with a focus on the impact of the coarse-grid solver on performance. We consider two algorithms for solving the coarse-space systems: the preconditioned conjugate gradient method and a new robust HSS-embedded low-rank sparse-factorization algorithm. Our test data comes from the SPE Comparative Solution Project for oil-reservoir simulations. We contrast the performance of our code on one 12-core socket of a Cray XC30 machine with performance on a 60-core Intel Xeon Phi coprocessor. To obtain top performance, we optimized the code to take full advantage of fine-grained parallelism andmore » made it thread-friendly for high thread count. We also developed a bounds-and-bottlenecks performance model of the solver which we used to guide us through the optimization effort, and also carried out performance tuning in the solver’s large parameter space. Finally, as a result, significant speedups were obtained on both machines.« less
Parallel Solver for H(div) Problems Using Hybridization and AMG
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lee, Chak S.; Vassilevski, Panayot S.
2016-01-15
In this paper, a scalable parallel solver is proposed for H(div) problems discretized by arbitrary order finite elements on general unstructured meshes. The solver is based on hybridization and algebraic multigrid (AMG). Unlike some previously studied H(div) solvers, the hybridization solver does not require discrete curl and gradient operators as additional input from the user. Instead, only some element information is needed in the construction of the solver. The hybridization results in a H1-equivalent symmetric positive definite system, which is then rescaled and solved by AMG solvers designed for H1 problems. Weak and strong scaling of the method are examinedmore » through several numerical tests. Our numerical results show that the proposed solver provides a promising alternative to ADS, a state-of-the-art solver [12], for H(div) problems. In fact, it outperforms ADS for higher order elements.« less
Parallel performance investigations of an unstructured mesh Navier-Stokes solver
NASA Technical Reports Server (NTRS)
Mavriplis, Dimitri J.
2000-01-01
A Reynolds-averaged Navier-Stokes solver based on unstructured mesh techniques for analysis of high-lift configurations is described. The method makes use of an agglomeration multigrid solver for convergence acceleration. Implicit line-smoothing is employed to relieve the stiffness associated with highly stretched meshes. A GMRES technique is also implemented to speed convergence at the expense of additional memory usage. The solver is cache efficient and fully vectorizable, and is parallelized using a two-level hybrid MPI-OpenMP implementation suitable for shared and/or distributed memory architectures, as well as clusters of shared memory machines. Convergence and scalability results are illustrated for various high-lift cases.
Solving Upwind-Biased Discretizations. 2; Multigrid Solver Using Semicoarsening
NASA Technical Reports Server (NTRS)
Diskin, Boris
1999-01-01
This paper studies a novel multigrid approach to the solution for a second order upwind biased discretization of the convection equation in two dimensions. This approach is based on semi-coarsening and well balanced explicit correction terms added to coarse-grid operators to maintain on coarse-grid the same cross-characteristic interaction as on the target (fine) grid. Colored relaxation schemes are used on all the levels allowing a very efficient parallel implementation. The results of the numerical tests can be summarized as follows: 1) The residual asymptotic convergence rate of the proposed V(0, 2) multigrid cycle is about 3 per cycle. This convergence rate far surpasses the theoretical limit (4/3) predicted for standard multigrid algorithms using full coarsening. The reported efficiency does not deteriorate with increasing the cycle, depth (number of levels) and/or refining the target-grid mesh spacing. 2) The full multi-grid algorithm (FMG) with two V(0, 2) cycles on the target grid and just one V(0, 2) cycle on all the coarse grids always provides an approximate solution with the algebraic error less than the discretization error. Estimates of the total work in the FMG algorithm are ranged between 18 and 30 minimal work units (depending on the target (discretizatioin). Thus, the overall efficiency of the FMG solver closely approaches (if does not achieve) the goal of the textbook multigrid efficiency. 3) A novel approach to deriving a discrete solution approximating the true continuous solution with a relative accuracy given in advance is developed. An adaptive multigrid algorithm (AMA) using comparison of the solutions on two successive target grids to estimate the accuracy of the current target-grid solution is defined. A desired relative accuracy is accepted as an input parameter. The final target grid on which this accuracy can be achieved is chosen automatically in the solution process. the actual relative accuracy of the discrete solution approximation obtained by AMA is always better than the required accuracy; the computational complexity of the AMA algorithm is (nearly) optimal (comparable with the complexity of the FMG algorithm applied to solve the problem on the optimally spaced target grid).
New multigrid approach for three-dimensional unstructured, adaptive grids
NASA Technical Reports Server (NTRS)
Parthasarathy, Vijayan; Kallinderis, Y.
1994-01-01
A new multigrid method with adaptive unstructured grids is presented. The three-dimensional Euler equations are solved on tetrahedral grids that are adaptively refined or coarsened locally. The multigrid method is employed to propagate the fine grid corrections more rapidly by redistributing the changes-in-time of the solution from the fine grid to the coarser grids to accelerate convergence. A new approach is employed that uses the parent cells of the fine grid cells in an adapted mesh to generate successively coaser levels of multigrid. This obviates the need for the generation of a sequence of independent, nonoverlapping grids as well as the relatively complicated operations that need to be performed to interpolate the solution and the residuals between the independent grids. The solver is an explicit, vertex-based, finite volume scheme that employs edge-based data structures and operations. Spatial discretization is of central-differencing type combined with a special upwind-like smoothing operators. Application cases include adaptive solutions obtained with multigrid acceleration for supersonic and subsonic flow over a bump in a channel, as well as transonic flow around the ONERA M6 wing. Two levels of multigrid resulted in reduction in the number of iterations by a factor of 5.
A semi-Lagrangian approach to the shallow water equation
NASA Technical Reports Server (NTRS)
Bates, J. R.; Mccormick, Stephen F.; Ruge, John; Sholl, David S.; Yavneh, Irad
1993-01-01
We present a formulation of the shallow water equations that emphasizes the conservation of potential vorticity. A locally conservative semi-Lagrangian time-stepping scheme is developed, which leads to a system of three coupled PDE's to be solved at each time level. We describe a smoothing analysis of these equations, on which an effective multigrid solver is constructed. Some results from applying this solver to the static version of these equations are presented.
An overlapped grid method for multigrid, finite volume/difference flow solvers: MaGGiE
NASA Technical Reports Server (NTRS)
Baysal, Oktay; Lessard, Victor R.
1990-01-01
The objective is to develop a domain decomposition method via overlapping/embedding the component grids, which is to be used by upwind, multi-grid, finite volume solution algorithms. A computer code, given the name MaGGiE (Multi-Geometry Grid Embedder) is developed to meet this objective. MaGGiE takes independently generated component grids as input, and automatically constructs the composite mesh and interpolation data, which can be used by the finite volume solution methods with or without multigrid convergence acceleration. Six demonstrative examples showing various aspects of the overlap technique are presented and discussed. These cases are used for developing the procedure for overlapping grids of different topologies, and to evaluate the grid connection and interpolation data for finite volume calculations on a composite mesh. Time fluxes are transferred between mesh interfaces using a trilinear interpolation procedure. Conservation losses are minimal at the interfaces using this method. The multi-grid solution algorithm, using the coaser grid connections, improves the convergence time history as compared to the solution on composite mesh without multi-gridding.
NASA Astrophysics Data System (ADS)
Sanan, P.; Tackley, P. J.; Gerya, T.; Kaus, B. J. P.; May, D.
2017-12-01
StagBL is an open-source parallel solver and discretization library for geodynamic simulation,encapsulating and optimizing operations essential to staggered-grid finite volume Stokes flow solvers.It provides a parallel staggered-grid abstraction with a high-level interface in C and Fortran.On top of this abstraction, tools are available to define boundary conditions and interact with particle systems.Tools and examples to efficiently solve Stokes systems defined on the grid are provided in small (direct solver), medium (simple preconditioners), and large (block factorization and multigrid) model regimes.By working directly with leading application codes (StagYY, I3ELVIS, and LaMEM) and providing an API and examples to integrate with others, StagBL aims to become a community tool supplying scalable, portable, reproducible performance toward novel science in regional- and planet-scale geodynamics and planetary science.By implementing kernels used by many research groups beneath a uniform abstraction layer, the library will enable optimization for modern hardware, thus reducing community barriers to large- or extreme-scale parallel simulation on modern architectures. In particular, the library will include CPU-, Manycore-, and GPU-optimized variants of matrix-free operators and multigrid components.The common layer provides a framework upon which to introduce innovative new tools.StagBL will leverage p4est to provide distributed adaptive meshes, and incorporate a multigrid convergence analysis tool.These options, in addition to a wealth of solver options provided by an interface to PETSc, will make the most modern solution techniques available from a common interface. StagBL in turn provides a PETSc interface, DMStag, to its central staggered grid abstraction.We present public version 0.5 of StagBL, including preliminary integration with application codes and demonstrations with its own demonstration application, StagBLDemo. Central to StagBL is the notion of an uninterrupted pipeline from toy/teaching codes to high-performance, extreme-scale solves. StagBLDemo replicates the functionality of an advanced MATLAB-style regional geodynamics code, thus providing users with a concrete procedure to exceed the performance and scalability limitations of smaller-scale tools.
NASA Technical Reports Server (NTRS)
Yang, Cheng I.; Guo, Yan-Hu; Liu, C.- H.
1996-01-01
The analysis and design of a submarine propulsor requires the ability to predict the characteristics of both laminar and turbulent flows to a higher degree of accuracy. This report presents results of certain benchmark computations based on an upwind, high-resolution, finite-differencing Navier-Stokes solver. The purpose of the computations is to evaluate the ability, the accuracy and the performance of the solver in the simulation of detailed features of viscous flows. Features of interest include flow separation and reattachment, surface pressure and skin friction distributions. Those features are particularly relevant to the propulsor analysis. Test cases with a wide range of Reynolds numbers are selected; therefore, the effects of the convective and the diffusive terms of the solver can be evaluated separately. Test cases include flows over bluff bodies, such as circular cylinders and spheres, at various low Reynolds numbers, flows over a flat plate with and without turbulence effects, and turbulent flows over axisymmetric bodies with and without propulsor effects. Finally, to enhance the iterative solution procedure, a full approximation scheme V-cycle multigrid method is implemented. Preliminary results indicate that the method significantly reduces the computational effort.
FAS multigrid calculations of three dimensional flow using non-staggered grids
NASA Technical Reports Server (NTRS)
Matovic, D.; Pollard, A.; Becker, H. A.; Grandmaison, E. W.
1993-01-01
Grid staggering is a well known remedy for the problem of velocity/pressure coupling in incompressible flow calculations. Numerous inconveniences occur, however, when staggered grids are implemented, particularly when a general-purpose code, capable of handling irregular three-dimensional domains, is sought. In several non-staggered grid numerical procedures proposed in the literature, the velocity/pressure coupling is achieved by either pressure or velocity (momentum) averaging. This approach is not convenient for simultaneous (block) solvers that are preferred when using multigrid methods. A new method is introduced in this paper that is based upon non-staggered grid formulation with a set of virtual cell face velocities used for pressure/velocity coupling. Instead of pressure or velocity averaging, a momentum balance at the cell face is used as a link between the momentum and mass balance constraints. The numerical stencil is limited to 9 nodes (in 2D) or 27 nodes (in 3D), both during the smoothing and inter-grid transfer, which is a convenient feature when a block point solver is applied. The results for a lid-driven cavity and a cube in a lid-driven cavity are presented and compared to staggered grid calculations using the same multigrid algorithm. The method is shown to be stable and produce a smooth (wiggle-free) pressure field.
Transonic Drag Prediction Using an Unstructured Multigrid Solver
NASA Technical Reports Server (NTRS)
Mavriplis, D. J.; Levy, David W.
2001-01-01
This paper summarizes the results obtained with the NSU-3D unstructured multigrid solver for the AIAA Drag Prediction Workshop held in Anaheim, CA, June 2001. The test case for the workshop consists of a wing-body configuration at transonic flow conditions. Flow analyses for a complete test matrix of lift coefficient values and Mach numbers at a constant Reynolds number are performed, thus producing a set of drag polars and drag rise curves which are compared with experimental data. Results were obtained independently by both authors using an identical baseline grid and different refined grids. Most cases were run in parallel on commodity cluster-type machines while the largest cases were run on an SGI Origin machine using 128 processors. The objective of this paper is to study the accuracy of the subject unstructured grid solver for predicting drag in the transonic cruise regime, to assess the efficiency of the method in terms of convergence, cpu time, and memory, and to determine the effects of grid resolution on this predictive ability and its computational efficiency. A good predictive ability is demonstrated over a wide range of conditions, although accuracy was found to degrade for cases at higher Mach numbers and lift values where increasing amounts of flow separation occur. The ability to rapidly compute large numbers of cases at varying flow conditions using an unstructured solver on inexpensive clusters of commodity computers is also demonstrated.
Tuminaro, Raymond S.; Perego, Mauro; Tezaur, Irina Kalashnikova; ...
2016-10-06
A multigrid method is proposed that combines ideas from matrix dependent multigrid for structured grids and algebraic multigrid for unstructured grids. It targets problems where a three-dimensional mesh can be viewed as an extrusion of a two-dimensional, unstructured mesh in a third dimension. Our motivation comes from the modeling of thin structures via finite elements and, more specifically, the modeling of ice sheets. Extruded meshes are relatively common for thin structures and often give rise to anisotropic problems when the thin direction mesh spacing is much smaller than the broad direction mesh spacing. Within our approach, the first few multigridmore » hierarchy levels are obtained by applying matrix dependent multigrid to semicoarsen in a structured thin direction fashion. After sufficient structured coarsening, the resulting mesh contains only a single layer corresponding to a two-dimensional, unstructured mesh. Algebraic multigrid can then be employed in a standard manner to create further coarse levels, as the anisotropic phenomena is no longer present in the single layer problem. The overall approach remains fully algebraic, with the minor exception that some additional information is needed to determine the extruded direction. Furthermore, this facilitates integration of the solver with a variety of different extruded mesh applications.« less
Implementation and Optimization of miniGMG - a Compact Geometric Multigrid Benchmark
DOE Office of Scientific and Technical Information (OSTI.GOV)
Williams, Samuel; Kalamkar, Dhiraj; Singh, Amik
2012-12-01
Multigrid methods are widely used to accelerate the convergence of iterative solvers for linear systems used in a number of different application areas. In this report, we describe miniGMG, our compact geometric multigrid benchmark designed to proxy the multigrid solves found in AMR applications. We explore optimization techniques for geometric multigrid on existing and emerging multicore systems including the Opteron-based Cray XE6, Intel Sandy Bridge and Nehalem-based Infiniband clusters, as well as manycore-based architectures including NVIDIA's Fermi and Kepler GPUs and Intel's Knights Corner (KNC) co-processor. This report examines a variety of novel techniques including communication-aggregation, threaded wavefront-based DRAM communication-avoiding,more » dynamic threading decisions, SIMDization, and fusion of operators. We quantify performance through each phase of the V-cycle for both single-node and distributed-memory experiments and provide detailed analysis for each class of optimization. Results show our optimizations yield significant speedups across a variety of subdomain sizes while simultaneously demonstrating the potential of multi- and manycore processors to dramatically accelerate single-node performance. However, our analysis also indicates that improvements in networks and communication will be essential to reap the potential of manycore processors in large-scale multigrid calculations.« less
Multigrid Strategies for Viscous Flow Solvers on Anisotropic Unstructured Meshes
NASA Technical Reports Server (NTRS)
Movriplis, Dimitri J.
1998-01-01
Unstructured multigrid techniques for relieving the stiffness associated with high-Reynolds number viscous flow simulations on extremely stretched grids are investigated. One approach consists of employing a semi-coarsening or directional-coarsening technique, based on the directions of strong coupling within the mesh, in order to construct more optimal coarse grid levels. An alternate approach is developed which employs directional implicit smoothing with regular fully coarsened multigrid levels. The directional implicit smoothing is obtained by constructing implicit lines in the unstructured mesh based on the directions of strong coupling. Both approaches yield large increases in convergence rates over the traditional explicit full-coarsening multigrid algorithm. However, maximum benefits are achieved by combining the two approaches in a coupled manner into a single algorithm. An order of magnitude increase in convergence rate over the traditional explicit full-coarsening algorithm is demonstrated, and convergence rates for high-Reynolds number viscous flows which are independent of the grid aspect ratio are obtained. Further acceleration is provided by incorporating low-Mach-number preconditioning techniques, and a Newton-GMRES strategy which employs the multigrid scheme as a preconditioner. The compounding effects of these various techniques on speed of convergence is documented through several example test cases.
NASA Technical Reports Server (NTRS)
McCormick, S.; Ruge, John W.
1998-01-01
This work represents a part of a project to develop an atmospheric general circulation model based on the semi-Lagrangian advection of potential vorticity (PC) with divergence as the companion prognostic variable.
NASA Astrophysics Data System (ADS)
Lavery, N.; Taylor, C.
1999-07-01
Multigrid and iterative methods are used to reduce the solution time of the matrix equations which arise from the finite element (FE) discretisation of the time-independent equations of motion of the incompressible fluid in turbulent motion. Incompressible flow is solved by using the method of reduce interpolation for the pressure to satisfy the Brezzi-Babuska condition. The k-l model is used to complete the turbulence closure problem. The non-symmetric iterative matrix methods examined are the methods of least squares conjugate gradient (LSCG), biconjugate gradient (BCG), conjugate gradient squared (CGS), and the biconjugate gradient squared stabilised (BCGSTAB). The multigrid algorithm applied is based on the FAS algorithm of Brandt, and uses two and three levels of grids with a V-cycling schedule. These methods are all compared to the non-symmetric frontal solver. Copyright
NASA Technical Reports Server (NTRS)
Atkins, Harold
1991-01-01
A multiple block multigrid method for the solution of the three dimensional Euler and Navier-Stokes equations is presented. The basic flow solver is a cell vertex method which employs central difference spatial approximations and Runge-Kutta time stepping. The use of local time stepping, implicit residual smoothing, multigrid techniques and variable coefficient numerical dissipation results in an efficient and robust scheme is discussed. The multiblock strategy places the block loop within the Runge-Kutta Loop such that accuracy and convergence are not affected by block boundaries. This has been verified by comparing the results of one and two block calculations in which the two block grid is generated by splitting the one block grid. Results are presented for both Euler and Navier-Stokes computations of wing/fuselage combinations.
NASA Technical Reports Server (NTRS)
Duncan, Comer; Jones, Jim
1993-01-01
A key ingredient in the simulation of self-gravitating astrophysical fluid dynamical systems is the gravitational potential and its gradient. This paper focuses on the development of a mixed method multigrid solver of the Poisson equation formulated so that both the potential and the Cartesian components of its gradient are self-consistently and accurately generated. The method achieves this goal by formulating the problem as a system of four equations for the gravitational potential and the three Cartesian components of the gradient and solves them using a distributed relaxation technique combined with conventional full multigrid V-cycles. The method is described, some tests are presented, and the accuracy of the method is assessed. We also describe how the method has been incorporated into our three-dimensional hydrodynamics code and give an example of an application to the collision of two stars. We end with some remarks about the future developments of the method and some of the applications in which it will be used in astrophysics.
A Comparison of Solver Performance for Complex Gastric Electrophysiology Models
Sathar, Shameer; Cheng, Leo K.; Trew, Mark L.
2016-01-01
Computational techniques for solving systems of equations arising in gastric electrophysiology have not been studied for efficient solution process. We present a computationally challenging problem of simulating gastric electrophysiology in anatomically realistic stomach geometries with multiple intracellular and extracellular domains. The multiscale nature of the problem and mesh resolution required to capture geometric and functional features necessitates efficient solution methods if the problem is to be tractable. In this study, we investigated and compared several parallel preconditioners for the linear systems arising from tetrahedral discretisation of electrically isotropic and anisotropic problems, with and without stimuli. The results showed that the isotropic problem was computationally less challenging than the anisotropic problem and that the application of extracellular stimuli increased workload considerably. Preconditioning based on block Jacobi and algebraic multigrid solvers were found to have the best overall solution times and least iteration counts, respectively. The algebraic multigrid preconditioner would be expected to perform better on large problems. PMID:26736543
A multiblock multigrid three-dimensional Euler equation solver
NASA Technical Reports Server (NTRS)
Cannizzaro, Frank E.; Elmiligui, Alaa; Melson, N. Duane; Vonlavante, E.
1990-01-01
Current aerodynamic designs are often quite complex (geometrically). Flexible computational tools are needed for the analysis of a wide range of configurations with both internal and external flows. In the past, geometrically dissimilar configurations required different analysis codes with different grid topologies in each. The duplicity of codes can be avoided with the use of a general multiblock formulation which can handle any grid topology. Rather than hard wiring the grid topology into the program, it is instead dictated by input to the program. In this work, the compressible Euler equations, written in a body-fitted finite-volume formulation, are solved using a pseudo-time-marching approach. Two upwind methods (van Leer's flux-vector-splitting and Roe's flux-differencing) were investigated. Two types of explicit solvers (a two-step predictor-corrector and a modified multistage Runge-Kutta) were used with multigrid acceleration to enhance convergence. A multiblock strategy is used to allow greater geometric flexibility. A report on simple explicit upwind schemes for solving compressible flows is included.
One shot methods for optimal control of distributed parameter systems 1: Finite dimensional control
NASA Technical Reports Server (NTRS)
Taasan, Shlomo
1991-01-01
The efficient numerical treatment of optimal control problems governed by elliptic partial differential equations (PDEs) and systems of elliptic PDEs, where the control is finite dimensional is discussed. Distributed control as well as boundary control cases are discussed. The main characteristic of the new methods is that they are designed to solve the full optimization problem directly, rather than accelerating a descent method by an efficient multigrid solver for the equations involved. The methods use the adjoint state in order to achieve efficient smoother and a robust coarsening strategy. The main idea is the treatment of the control variables on appropriate scales, i.e., control variables that correspond to smooth functions are solved for on coarse grids depending on the smoothness of these functions. Solution of the control problems is achieved with the cost of solving the constraint equations about two to three times (by a multigrid solver). Numerical examples demonstrate the effectiveness of the method proposed in distributed control case, pointwise control and boundary control problems.
NASA Astrophysics Data System (ADS)
Feng, Wenqiang; Guo, Zhenlin; Lowengrub, John S.; Wise, Steven M.
2018-01-01
We present a mass-conservative full approximation storage (FAS) multigrid solver for cell-centered finite difference methods on block-structured, locally cartesian grids. The algorithm is essentially a standard adaptive FAS (AFAS) scheme, but with a simple modification that comes in the form of a mass-conservative correction to the coarse-level force. This correction is facilitated by the creation of a zombie variable, analogous to a ghost variable, but defined on the coarse grid and lying under the fine grid refinement patch. We show that a number of different types of fine-level ghost cell interpolation strategies could be used in our framework, including low-order linear interpolation. In our approach, the smoother, prolongation, and restriction operations need never be aware of the mass conservation conditions at the coarse-fine interface. To maintain global mass conservation, we need only modify the usual FAS algorithm by correcting the coarse-level force function at points adjacent to the coarse-fine interface. We demonstrate through simulations that the solver converges geometrically, at a rate that is h-independent, and we show the generality of the solver, applying it to several nonlinear, time-dependent, and multi-dimensional problems. In several tests, we show that second-order asymptotic (h → 0) convergence is observed for the discretizations, provided that (1) at least linear interpolation of the ghost variables is employed, and (2) the mass conservation corrections are applied to the coarse-level force term.
Higher-order ice-sheet modelling accelerated by multigrid on graphics cards
NASA Astrophysics Data System (ADS)
Brædstrup, Christian; Egholm, David
2013-04-01
Higher-order ice flow modelling is a very computer intensive process owing primarily to the nonlinear influence of the horizontal stress coupling. When applied for simulating long-term glacial landscape evolution, the ice-sheet models must consider very long time series, while both high temporal and spatial resolution is needed to resolve small effects. The use of higher-order and full stokes models have therefore seen very limited usage in this field. However, recent advances in graphics card (GPU) technology for high performance computing have proven extremely efficient in accelerating many large-scale scientific computations. The general purpose GPU (GPGPU) technology is cheap, has a low power consumption and fits into a normal desktop computer. It could therefore provide a powerful tool for many glaciologists working on ice flow models. Our current research focuses on utilising the GPU as a tool in ice-sheet and glacier modelling. To this extent we have implemented the Integrated Second-Order Shallow Ice Approximation (iSOSIA) equations on the device using the finite difference method. To accelerate the computations, the GPU solver uses a non-linear Red-Black Gauss-Seidel iterator coupled with a Full Approximation Scheme (FAS) multigrid setup to further aid convergence. The GPU finite difference implementation provides the inherent parallelization that scales from hundreds to several thousands of cores on newer cards. We demonstrate the efficiency of the GPU multigrid solver using benchmark experiments.
NASA Astrophysics Data System (ADS)
Augustin, Christoph M.; Neic, Aurel; Liebmann, Manfred; Prassl, Anton J.; Niederer, Steven A.; Haase, Gundolf; Plank, Gernot
2016-01-01
Electromechanical (EM) models of the heart have been used successfully to study fundamental mechanisms underlying a heart beat in health and disease. However, in all modeling studies reported so far numerous simplifications were made in terms of representing biophysical details of cellular function and its heterogeneity, gross anatomy and tissue microstructure, as well as the bidirectional coupling between electrophysiology (EP) and tissue distension. One limiting factor is the employed spatial discretization methods which are not sufficiently flexible to accommodate complex geometries or resolve heterogeneities, but, even more importantly, the limited efficiency of the prevailing solver techniques which is not sufficiently scalable to deal with the incurring increase in degrees of freedom (DOF) when modeling cardiac electromechanics at high spatio-temporal resolution. This study reports on the development of a novel methodology for solving the nonlinear equation of finite elasticity using human whole organ models of cardiac electromechanics, discretized at a high para-cellular resolution. Three patient-specific, anatomically accurate, whole heart EM models were reconstructed from magnetic resonance (MR) scans at resolutions of 220 μm, 440 μm and 880 μm, yielding meshes of approximately 184.6, 24.4 and 3.7 million tetrahedral elements and 95.9, 13.2 and 2.1 million displacement DOF, respectively. The same mesh was used for discretizing the governing equations of both electrophysiology (EP) and nonlinear elasticity. A novel algebraic multigrid (AMG) preconditioner for an iterative Krylov solver was developed to deal with the resulting computational load. The AMG preconditioner was designed under the primary objective of achieving favorable strong scaling characteristics for both setup and solution runtimes, as this is key for exploiting current high performance computing hardware. Benchmark results using the 220 μm, 440 μm and 880 μm meshes demonstrate efficient scaling up to 1024, 4096 and 8192 compute cores which allowed the simulation of a single heart beat in 44.3, 87.8 and 235.3 minutes, respectively. The efficiency of the method allows fast simulation cycles without compromising anatomical or biophysical detail.
Augustin, Christoph M.; Neic, Aurel; Liebmann, Manfred; Prassl, Anton J.; Niederer, Steven A.; Haase, Gundolf; Plank, Gernot
2016-01-01
Electromechanical (EM) models of the heart have been used successfully to study fundamental mechanisms underlying a heart beat in health and disease. However, in all modeling studies reported so far numerous simplifications were made in terms of representing biophysical details of cellular function and its heterogeneity, gross anatomy and tissue microstructure, as well as the bidirectional coupling between electrophysiology (EP) and tissue distension. One limiting factor is the employed spatial discretization methods which are not sufficiently flexible to accommodate complex geometries or resolve heterogeneities, but, even more importantly, the limited efficiency of the prevailing solver techniques which are not sufficiently scalable to deal with the incurring increase in degrees of freedom (DOF) when modeling cardiac electromechanics at high spatio-temporal resolution. This study reports on the development of a novel methodology for solving the nonlinear equation of finite elasticity using human whole organ models of cardiac electromechanics, discretized at a high para-cellular resolution. Three patient-specific, anatomically accurate, whole heart EM models were reconstructed from magnetic resonance (MR) scans at resolutions of 220 μm, 440 μm and 880 μm, yielding meshes of approximately 184.6, 24.4 and 3.7 million tetrahedral elements and 95.9, 13.2 and 2.1 million displacement DOF, respectively. The same mesh was used for discretizing the governing equations of both electrophysiology (EP) and nonlinear elasticity. A novel algebraic multigrid (AMG) preconditioner for an iterative Krylov solver was developed to deal with the resulting computational load. The AMG preconditioner was designed under the primary objective of achieving favorable strong scaling characteristics for both setup and solution runtimes, as this is key for exploiting current high performance computing hardware. Benchmark results using the 220 μm, 440 μm and 880 μm meshes demonstrate efficient scaling up to 1024, 4096 and 8192 compute cores which allowed the simulation of a single heart beat in 44.3, 87.8 and 235.3 minutes, respectively. The efficiency of the method allows fast simulation cycles without compromising anatomical or biophysical detail. PMID:26819483
NASA Technical Reports Server (NTRS)
Cannizzaro, Frank E.; Ash, Robert L.
1992-01-01
A state-of-the-art computer code has been developed that incorporates a modified Runge-Kutta time integration scheme, upwind numerical techniques, multigrid acceleration, and multi-block capabilities (RUMM). A three-dimensional thin-layer formulation of the Navier-Stokes equations is employed. For turbulent flow cases, the Baldwin-Lomax algebraic turbulence model is used. Two different upwind techniques are available: van Leer's flux-vector splitting and Roe's flux-difference splitting. Full approximation multi-grid plus implicit residual and corrector smoothing were implemented to enhance the rate of convergence. Multi-block capabilities were developed to provide geometric flexibility. This feature allows the developed computer code to accommodate any grid topology or grid configuration with multiple topologies. The results shown in this dissertation were chosen to validate the computer code and display its geometric flexibility, which is provided by the multi-block structure.
Efficient relaxed-Jacobi smoothers for multigrid on parallel computers
NASA Astrophysics Data System (ADS)
Yang, Xiang; Mittal, Rajat
2017-03-01
In this Technical Note, we present a family of Jacobi-based multigrid smoothers suitable for the solution of discretized elliptic equations. These smoothers are based on the idea of scheduled-relaxation Jacobi proposed recently by Yang & Mittal (2014) [18] and employ two or three successive relaxed Jacobi iterations with relaxation factors derived so as to maximize the smoothing property of these iterations. The performance of these new smoothers measured in terms of convergence acceleration and computational workload, is assessed for multi-domain implementations typical of parallelized solvers, and compared to the lexicographic point Gauss-Seidel smoother. The tests include the geometric multigrid method on structured grids as well as the algebraic grid method on unstructured grids. The tests demonstrate that unlike Gauss-Seidel, the convergence of these Jacobi-based smoothers is unaffected by domain decomposition, and furthermore, they outperform the lexicographic Gauss-Seidel by factors that increase with domain partition count.
Robust Multigrid Smoothers for Three Dimensional Elliptic Equations with Strong Anisotropies
NASA Technical Reports Server (NTRS)
Llorente, Ignacio M.; Melson, N. Duane
1998-01-01
We discuss the behavior of several plane relaxation methods as multigrid smoothers for the solution of a discrete anisotropic elliptic model problem on cell-centered grids. The methods compared are plane Jacobi with damping, plane Jacobi with partial damping, plane Gauss-Seidel, plane zebra Gauss-Seidel, and line Gauss-Seidel. Based on numerical experiments and local mode analysis, we compare the smoothing factor of the different methods in the presence of strong anisotropies. A four-color Gauss-Seidel method is found to have the best numerical and architectural properties of the methods considered in the present work. Although alternating direction plane relaxation schemes are simpler and more robust than other approaches, they are not currently used in industrial and production codes because they require the solution of a two-dimensional problem for each plane in each direction. We verify the theoretical predictions of Thole and Trottenberg that an exact solution of each plane is not necessary and that a single two-dimensional multigrid cycle gives the same result as an exact solution, in much less execution time. Parallelization of the two-dimensional multigrid cycles, the kernel of the three-dimensional implicit solver, is also discussed. Alternating-plane smoothers are found to be highly efficient multigrid smoothers for anisotropic elliptic problems.
NASA Astrophysics Data System (ADS)
Ying, Jinyong; Xie, Dexuan
2015-10-01
The Poisson-Boltzmann equation (PBE) is one widely-used implicit solvent continuum model for calculating electrostatics of ionic solvated biomolecule. In this paper, a new finite element and finite difference hybrid method is presented to solve PBE efficiently based on a special seven-overlapped box partition with one central box containing the solute region and surrounded by six neighboring boxes. In particular, an efficient finite element solver is applied to the central box while a fast preconditioned conjugate gradient method using a multigrid V-cycle preconditioning is constructed for solving a system of finite difference equations defined on a uniform mesh of each neighboring box. Moreover, the PBE domain, the box partition, and an interface fitted tetrahedral mesh of the central box can be generated adaptively for a given PQR file of a biomolecule. This new hybrid PBE solver is programmed in C, Fortran, and Python as a software tool for predicting electrostatics of a biomolecule in a symmetric 1:1 ionic solvent. Numerical results on two test models with analytical solutions and 12 proteins validate this new software tool, and demonstrate its high performance in terms of CPU time and memory usage.
Variational optical flow computation in real time.
Bruhn, Andrés; Weickert, Joachim; Feddern, Christian; Kohlberger, Timo; Schnörr, Christoph
2005-05-01
This paper investigates the usefulness of bidirectional multigrid methods for variational optical flow computations. Although these numerical schemes are among the fastest methods for solving equation systems, they are rarely applied in the field of computer vision. We demonstrate how to employ those numerical methods for the treatment of variational optical flow formulations and show that the efficiency of this approach even allows for real-time performance on standard PCs. As a representative for variational optic flow methods, we consider the recently introduced combined local-global method. It can be considered as a noise-robust generalization of the Horn and Schunck technique. We present a decoupled, as well as a coupled, version of the classical Gauss-Seidel solver, and we develop several multgrid implementations based on a discretization coarse grid approximation. In contrast, with standard bidirectional multigrid algorithms, we take advantage of intergrid transfer operators that allow for nondyadic grid hierarchies. As a consequence, no restrictions concerning the image size or the number of traversed levels have to be imposed. In the experimental section, we juxtapose the developed multigrid schemes and demonstrate their superior performance when compared to unidirectional multgrid methods and nonhierachical solvers. For the well-known 316 x 252 Yosemite sequence, we succeeded in computing the complete set of dense flow fields in three quarters of a second on a 3.06-GHz Pentium4 PC. This corresponds to a frame rate of 18 flow fields per second which outperforms the widely-used Gauss-Seidel method by almost three orders of magnitude.
pyro: Python-based tutorial for computational methods for hydrodynamics
NASA Astrophysics Data System (ADS)
Zingale, Michael
2015-07-01
pyro is a simple python-based tutorial on computational methods for hydrodynamics. It includes 2-d solvers for advection, compressible, incompressible, and low Mach number hydrodynamics, diffusion, and multigrid. It is written with ease of understanding in mind. An extensive set of notes that is part of the Open Astrophysics Bookshelf project provides details of the algorithms.
Multigrid Equation Solvers for Large Scale Nonlinear Finite Element Simulations
1999-01-01
purpose of the second partitioning phase , on each SMP, is to minimize the communication within the SMP; even if a multi - threaded matrix vector product...8.7 Comparison of model with experimental data for send phase of matrix vector product on ne grid...140 8.4 Matrix vector product phase times : : : : : : : : : : : : : : : : : : : : : : : 145 9.1 Flat and
DOE Office of Scientific and Technical Information (OSTI.GOV)
D'Ambra, P.; Vassilevski, P. S.
2014-05-30
Adaptive Algebraic Multigrid (or Multilevel) Methods (αAMG) are introduced to improve robustness and efficiency of classical algebraic multigrid methods in dealing with problems where no a-priori knowledge or assumptions on the near-null kernel of the underlined matrix are available. Recently we proposed an adaptive (bootstrap) AMG method, αAMG, aimed to obtain a composite solver with a desired convergence rate. Each new multigrid component relies on a current (general) smooth vector and exploits pairwise aggregation based on weighted matching in a matrix graph to define a new automatic, general-purpose coarsening process, which we refer to as “the compatible weighted matching”. Inmore » this work, we present results that broaden the applicability of our method to different finite element discretizations of elliptic PDEs. In particular, we consider systems arising from displacement methods in linear elasticity problems and saddle-point systems that appear in the application of the mixed method to Darcy problems.« less
Womack, James C; Anton, Lucian; Dziedzic, Jacek; Hasnip, Phil J; Probert, Matt I J; Skylaris, Chris-Kriton
2018-03-13
The solution of the Poisson equation is a crucial step in electronic structure calculations, yielding the electrostatic potential-a key component of the quantum mechanical Hamiltonian. In recent decades, theoretical advances and increases in computer performance have made it possible to simulate the electronic structure of extended systems in complex environments. This requires the solution of more complicated variants of the Poisson equation, featuring nonhomogeneous dielectric permittivities, ionic concentrations with nonlinear dependencies, and diverse boundary conditions. The analytic solutions generally used to solve the Poisson equation in vacuum (or with homogeneous permittivity) are not applicable in these circumstances, and numerical methods must be used. In this work, we present DL_MG, a flexible, scalable, and accurate solver library, developed specifically to tackle the challenges of solving the Poisson equation in modern large-scale electronic structure calculations on parallel computers. Our solver is based on the multigrid approach and uses an iterative high-order defect correction method to improve the accuracy of solutions. Using two chemically relevant model systems, we tested the accuracy and computational performance of DL_MG when solving the generalized Poisson and Poisson-Boltzmann equations, demonstrating excellent agreement with analytic solutions and efficient scaling to ∼10 9 unknowns and 100s of CPU cores. We also applied DL_MG in actual large-scale electronic structure calculations, using the ONETEP linear-scaling electronic structure package to study a 2615 atom protein-ligand complex with routinely available computational resources. In these calculations, the overall execution time with DL_MG was not significantly greater than the time required for calculations using a conventional FFT-based solver.
Multigrid Method for Modeling Multi-Dimensional Combustion with Detailed Chemistry
NASA Technical Reports Server (NTRS)
Zheng, Xiaoqing; Liu, Chaoqun; Liao, Changming; Liu, Zhining; McCormick, Steve
1996-01-01
A highly accurate and efficient numerical method is developed for modeling 3-D reacting flows with detailed chemistry. A contravariant velocity-based governing system is developed for general curvilinear coordinates to maintain simplicity of the continuity equation and compactness of the discretization stencil. A fully-implicit backward Euler technique and a third-order monotone upwind-biased scheme on a staggered grid are used for the respective temporal and spatial terms. An efficient semi-coarsening multigrid method based on line-distributive relaxation is used as the flow solver. The species equations are solved in a fully coupled way and the chemical reaction source terms are treated implicitly. Example results are shown for a 3-D gas turbine combustor with strong swirling inflows.
Multigrid treatment of implicit continuum diffusion
NASA Astrophysics Data System (ADS)
Francisquez, Manaure; Zhu, Ben; Rogers, Barrett
2017-10-01
Implicit treatment of diffusive terms of various differential orders common in continuum mechanics modeling, such as computational fluid dynamics, is investigated with spectral and multigrid algorithms in non-periodic 2D domains. In doubly periodic time dependent problems these terms can be efficiently and implicitly handled by spectral methods, but in non-periodic systems solved with distributed memory parallel computing and 2D domain decomposition, this efficiency is lost for large numbers of processors. We built and present here a multigrid algorithm for these types of problems which outperforms a spectral solution that employs the highly optimized FFTW library. This multigrid algorithm is not only suitable for high performance computing but may also be able to efficiently treat implicit diffusion of arbitrary order by introducing auxiliary equations of lower order. We test these solvers for fourth and sixth order diffusion with idealized harmonic test functions as well as a turbulent 2D magnetohydrodynamic simulation. It is also shown that an anisotropic operator without cross-terms can improve model accuracy and speed, and we examine the impact that the various diffusion operators have on the energy, the enstrophy, and the qualitative aspect of a simulation. This work was supported by DOE-SC-0010508. This research used resources of the National Energy Research Scientific Computing Center (NERSC).
A Kernel-free Boundary Integral Method for Elliptic Boundary Value Problems ⋆
Ying, Wenjun; Henriquez, Craig S.
2013-01-01
This paper presents a class of kernel-free boundary integral (KFBI) methods for general elliptic boundary value problems (BVPs). The boundary integral equations reformulated from the BVPs are solved iteratively with the GMRES method. During the iteration, the boundary and volume integrals involving Green's functions are approximated by structured grid-based numerical solutions, which avoids the need to know the analytical expressions of Green's functions. The KFBI method assumes that the larger regular domain, which embeds the original complex domain, can be easily partitioned into a hierarchy of structured grids so that fast elliptic solvers such as the fast Fourier transform (FFT) based Poisson/Helmholtz solvers or those based on geometric multigrid iterations are applicable. The structured grid-based solutions are obtained with standard finite difference method (FDM) or finite element method (FEM), where the right hand side of the resulting linear system is appropriately modified at irregular grid nodes to recover the formal accuracy of the underlying numerical scheme. Numerical results demonstrating the efficiency and accuracy of the KFBI methods are presented. It is observed that the number of GM-RES iterations used by the method for solving isotropic and moderately anisotropic BVPs is independent of the sizes of the grids that are employed to approximate the boundary and volume integrals. With the standard second-order FEMs and FDMs, the KFBI method shows a second-order convergence rate in accuracy for all of the tested Dirichlet/Neumann BVPs when the anisotropy of the diffusion tensor is not too strong. PMID:23519600
Performance of Nonlinear Finite-Difference Poisson-Boltzmann Solvers
Cai, Qin; Hsieh, Meng-Juei; Wang, Jun; Luo, Ray
2014-01-01
We implemented and optimized seven finite-difference solvers for the full nonlinear Poisson-Boltzmann equation in biomolecular applications, including four relaxation methods, one conjugate gradient method, and two inexact Newton methods. The performance of the seven solvers was extensively evaluated with a large number of nucleic acids and proteins. Worth noting is the inexact Newton method in our analysis. We investigated the role of linear solvers in its performance by incorporating the incomplete Cholesky conjugate gradient and the geometric multigrid into its inner linear loop. We tailored and optimized both linear solvers for faster convergence rate. In addition, we explored strategies to optimize the successive over-relaxation method to reduce its convergence failures without too much sacrifice in its convergence rate. Specifically we attempted to adaptively change the relaxation parameter and to utilize the damping strategy from the inexact Newton method to improve the successive over-relaxation method. Our analysis shows that the nonlinear methods accompanied with a functional-assisted strategy, such as the conjugate gradient method and the inexact Newton method, can guarantee convergence in the tested molecules. Especially the inexact Newton method exhibits impressive performance when it is combined with highly efficient linear solvers that are tailored for its special requirement. PMID:24723843
DOE Office of Scientific and Technical Information (OSTI.GOV)
Clark, M. A.; Strelchenko, Alexei; Vaquero, Alejandro
Lattice quantum chromodynamics simulations in nuclear physics have benefited from a tremendous number of algorithmic advances such as multigrid and eigenvector deflation. These improve the time to solution but do not alleviate the intrinsic memory-bandwidth constraints of the matrix-vector operation dominating iterative solvers. Batching this operation for multiple vectors and exploiting cache and register blocking can yield a super-linear speed up. Block-Krylov solvers can naturally take advantage of such batched matrix-vector operations, further reducing the iterations to solution by sharing the Krylov space between solves. However, practical implementations typically suffer from the quadratic scaling in the number of vector-vector operations.more » Using the QUDA library, we present an implementation of a block-CG solver on NVIDIA GPUs which reduces the memory-bandwidth complexity of vector-vector operations from quadratic to linear. We present results for the HISQ discretization, showing a 5x speedup compared to highly-optimized independent Krylov solves on NVIDIA's SaturnV cluster.« less
NASA Technical Reports Server (NTRS)
Sidilkover, David
1997-01-01
Some important advances took place during the last several years in the development of genuinely multidimensional upwind schemes for the compressible Euler equations. In particular, a robust, high-resolution genuinely multidimensional scheme which can be used for any of the flow regimes computations was constructed. This paper summarizes briefly these developments and outlines the fundamental advantages of this approach.
A Parallel Multigrid Solver for Viscous Flows on Anisotropic Structured Grids
2001-10-01
the US-Spain Joint Commission for Scienti c and Technological Cooperation. yDepartamento Arquitectura de Computadores y Automatica, Universidad...Complutense, 28040 Madrid, Spain. E-mail: mpmatias@dacya.ucm.es zDepartamento Arquitectura de Computadores y Automatica, Universidad Complutense, 28040...Madrid, Spain. E-mail: rubensm@dacya.ucm.es xDepartamento Arquitectura de Computadores y Automatica, Universidad Complutense, 28040 Madrid, Spain. E-mail
NASA Technical Reports Server (NTRS)
Aftosmis, M. J.; Berger, M. J.; Adomavicius, G.
2000-01-01
Preliminary verification and validation of an efficient Euler solver for adaptively refined Cartesian meshes with embedded boundaries is presented. The parallel, multilevel method makes use of a new on-the-fly parallel domain decomposition strategy based upon the use of space-filling curves, and automatically generates a sequence of coarse meshes for processing by the multigrid smoother. The coarse mesh generation algorithm produces grids which completely cover the computational domain at every level in the mesh hierarchy. A series of examples on realistically complex three-dimensional configurations demonstrate that this new coarsening algorithm reliably achieves mesh coarsening ratios in excess of 7 on adaptively refined meshes. Numerical investigations of the scheme's local truncation error demonstrate an achieved order of accuracy between 1.82 and 1.88. Convergence results for the multigrid scheme are presented for both subsonic and transonic test cases and demonstrate W-cycle multigrid convergence rates between 0.84 and 0.94. Preliminary parallel scalability tests on both simple wing and complex complete aircraft geometries shows a computational speedup of 52 on 64 processors using the run-time mesh partitioner.
NASA Astrophysics Data System (ADS)
Esmaily, M.; Jofre, L.; Mani, A.; Iaccarino, G.
2018-03-01
A geometric multigrid algorithm is introduced for solving nonsymmetric linear systems resulting from the discretization of the variable density Navier-Stokes equations on nonuniform structured rectilinear grids and high-Reynolds number flows. The restriction operation is defined such that the resulting system on the coarser grids is symmetric, thereby allowing for the use of efficient smoother algorithms. To achieve an optimal rate of convergence, the sequence of interpolation and restriction operations are determined through a dynamic procedure. A parallel partitioning strategy is introduced to minimize communication while maintaining the load balance between all processors. To test the proposed algorithm, we consider two cases: 1) homogeneous isotropic turbulence discretized on uniform grids and 2) turbulent duct flow discretized on stretched grids. Testing the algorithm on systems with up to a billion unknowns shows that the cost varies linearly with the number of unknowns. This O (N) behavior confirms the robustness of the proposed multigrid method regarding ill-conditioning of large systems characteristic of multiscale high-Reynolds number turbulent flows. The robustness of our method to density variations is established by considering cases where density varies sharply in space by a factor of up to 104, showing its applicability to two-phase flow problems. Strong and weak scalability studies are carried out, employing up to 30,000 processors, to examine the parallel performance of our implementation. Excellent scalability of our solver is shown for a granularity as low as 104 to 105 unknowns per processor. At its tested peak throughput, it solves approximately 4 billion unknowns per second employing over 16,000 processors with a parallel efficiency higher than 50%.
Factorizable Upwind Schemes: The Triangular Unstructured Grid Formulation
NASA Technical Reports Server (NTRS)
Sidilkover, David; Nielsen, Eric J.
2001-01-01
The upwind factorizable schemes for the equations of fluid were introduced recently. They facilitate achieving the Textbook Multigrid Efficiency (TME) and are expected also to result in the solvers of unparalleled robustness. The approach itself is very general. Therefore, it may well become a general framework for the large-scale, Computational Fluid Dynamics. In this paper we outline the triangular grid formulation of the factorizable schemes. The derivation is based on the fact that the factorizable schemes can be expressed entirely using vector notation. without explicitly mentioning a particular coordinate frame. We, describe the resulting discrete scheme in detail and present some computational results verifying the basic properties of the scheme/solver.
A parallel finite-difference method for computational aerodynamics
NASA Technical Reports Server (NTRS)
Swisshelm, Julie M.
1989-01-01
A finite-difference scheme for solving complex three-dimensional aerodynamic flow on parallel-processing supercomputers is presented. The method consists of a basic flow solver with multigrid convergence acceleration, embedded grid refinements, and a zonal equation scheme. Multitasking and vectorization have been incorporated into the algorithm. Results obtained include multiprocessed flow simulations from the Cray X-MP and Cray-2. Speedups as high as 3.3 for the two-dimensional case and 3.5 for segments of the three-dimensional case have been achieved on the Cray-2. The entire solver attained a factor of 2.7 improvement over its unitasked version on the Cray-2. The performance of the parallel algorithm on each machine is analyzed.
NASA Astrophysics Data System (ADS)
Southworth, Benjamin Scott
PART I: One of the most fascinating questions to humans has long been whether life exists outside of our planet. To our knowledge, water is a fundamental building block of life, which makes liquid water on other bodies in the universe a topic of great interest. In fact, there are large bodies of water right here in our solar system, underneath the icy crust of moons around Saturn and Jupiter. The NASA-ESA Cassini Mission spent two decades studying the Saturnian system. One of the many exciting discoveries was a "plume" on the south pole of Enceladus, emitting hundreds of kg/s of water vapor and frozen water-ice particles from Enceladus' subsurface ocean. It has since been determined that Enceladus likely has a global liquid water ocean separating its rocky core from icy surface, with conditions that are relatively favorable to support life. The plume is of particular interest because it gives direct access to ocean particles from space, by flying through the plume. Recently, evidence has been found for similar geological activity occurring on Jupiter's moon Europa, long considered one of the most likely candidate bodies to support life in our solar system. Here, a model for plume-particle dynamics is developed based on studies of the Enceladus plume and data from the Cassini Cosmic Dust Analyzer. A C++, OpenMP/MPI parallel software package is then built to run large scale simulations of dust plumes on planetary satellites. In the case of Enceladus, data from simulations and the Cassini mission provide insight into the structure of emissions on the surface, the total mass production of the plume, and the distribution of particles being emitted. Each of these are fundamental to understanding the plume and, for Europa and Enceladus, simulation data provide important results for the planning of future missions to these icy moons. In particular, this work has contributed to the Europa Clipper mission and proposed Enceladus Life Finder. PART II: Solving large, sparse linear systems arises often in the modeling of biological and physical phenomenon, data analysis through graphs and networks, and other scientific applications. This work focuses primarily on linear systems resulting from the discretization of partial differential equations (PDEs). Because solving linear systems is the bottleneck of many large simulation codes, there is a rich field of research in developing "fast" solvers, with the ultimate goal being a method that solves an n x n linear system in O(n) operations. One of the most effective classes of solvers is algebraic multigrid (AMG), which is a multilevel iterative method based on projecting the problem into progressively smaller spaces, and scales like O(n) or O(nlog n) for certain classes of problems. The field of AMG is well-developed for symmetric positive definite matrices, and is typically most effective on linear systems resulting from the discretization of scalar elliptic PDEs, such as the heat equation. Systems of PDEs can add additional difficulties, but the underlying linear algebraic theory is consistent and, in many cases, an elliptic system of PDEs can be handled well by AMG with appropriate modifications of the solver. Solving general, nonsymmetric linear systems remains the wild west of AMG (and other fast solvers), lacking significant results in convergence theory as well as robust methods. Here, we develop new theoretical motivation and practical variations of AMG to solve nonsymmetric linear systems, often resulting from the discretization of hyperbolic PDEs. In particular, multilevel convergence of AMG for nonsymmetric systems is proven for the first time. A new nonsymmetric AMG solver is also developed based on an approximate ideal restriction, referred to as AIR, which is able to solve advection-dominated, hyperbolic-type problems that are outside the scope of existing AMG solvers and other fast iterative methods. AIR demonstrates scalable convergence on unstructured meshes, in multiple dimensions, and with high-order finite elements, expanding the applicability of AMG to a new class of problems.
Eigenvalue Solvers for Modeling Nuclear Reactors on Leadership Class Machines
Slaybaugh, R. N.; Ramirez-Zweiger, M.; Pandya, Tara; ...
2018-02-20
In this paper, three complementary methods have been implemented in the code Denovo that accelerate neutral particle transport calculations with methods that use leadership-class computers fully and effectively: a multigroup block (MG) Krylov solver, a Rayleigh quotient iteration (RQI) eigenvalue solver, and a multigrid in energy (MGE) preconditioner. The MG Krylov solver converges more quickly than Gauss Seidel and enables energy decomposition such that Denovo can scale to hundreds of thousands of cores. RQI should converge in fewer iterations than power iteration (PI) for large and challenging problems. RQI creates shifted systems that would not be tractable without the MGmore » Krylov solver. It also creates ill-conditioned matrices. The MGE preconditioner reduces iteration count significantly when used with RQI and takes advantage of the new energy decomposition such that it can scale efficiently. Each individual method has been described before, but this is the first time they have been demonstrated to work together effectively. The combination of solvers enables the RQI eigenvalue solver to work better than the other available solvers for large reactors problems on leadership-class machines. Using these methods together, RQI converged in fewer iterations and in less time than PI for a full pressurized water reactor core. These solvers also performed better than an Arnoldi eigenvalue solver for a reactor benchmark problem when energy decomposition is needed. The MG Krylov, MGE preconditioner, and RQI solver combination also scales well in energy. Finally, this solver set is a strong choice for very large and challenging problems.« less
Eigenvalue Solvers for Modeling Nuclear Reactors on Leadership Class Machines
DOE Office of Scientific and Technical Information (OSTI.GOV)
Slaybaugh, R. N.; Ramirez-Zweiger, M.; Pandya, Tara
In this paper, three complementary methods have been implemented in the code Denovo that accelerate neutral particle transport calculations with methods that use leadership-class computers fully and effectively: a multigroup block (MG) Krylov solver, a Rayleigh quotient iteration (RQI) eigenvalue solver, and a multigrid in energy (MGE) preconditioner. The MG Krylov solver converges more quickly than Gauss Seidel and enables energy decomposition such that Denovo can scale to hundreds of thousands of cores. RQI should converge in fewer iterations than power iteration (PI) for large and challenging problems. RQI creates shifted systems that would not be tractable without the MGmore » Krylov solver. It also creates ill-conditioned matrices. The MGE preconditioner reduces iteration count significantly when used with RQI and takes advantage of the new energy decomposition such that it can scale efficiently. Each individual method has been described before, but this is the first time they have been demonstrated to work together effectively. The combination of solvers enables the RQI eigenvalue solver to work better than the other available solvers for large reactors problems on leadership-class machines. Using these methods together, RQI converged in fewer iterations and in less time than PI for a full pressurized water reactor core. These solvers also performed better than an Arnoldi eigenvalue solver for a reactor benchmark problem when energy decomposition is needed. The MG Krylov, MGE preconditioner, and RQI solver combination also scales well in energy. Finally, this solver set is a strong choice for very large and challenging problems.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bui, Quan M.; Wang, Lu; Osei-Kuffuor, Daniel
Multiphase flow is a critical process in a wide range of applications, including oil and gas recovery, carbon sequestration, and contaminant remediation. Numerical simulation of multiphase flow requires solving of a large, sparse linear system resulting from the discretization of the partial differential equations modeling the flow. In the case of multiphase multicomponent flow with miscible effect, this is a very challenging task. The problem becomes even more difficult if phase transitions are taken into account. A new approach to handle phase transitions is to formulate the system as a nonlinear complementarity problem (NCP). Unlike in the primary variable switchingmore » technique, the set of primary variables in this approach is fixed even when there is phase transition. Not only does this improve the robustness of the nonlinear solver, it opens up the possibility to use multigrid methods to solve the resulting linear system. The disadvantage of the complementarity approach, however, is that when a phase disappears, the linear system has the structure of a saddle point problem and becomes indefinite, and current algebraic multigrid (AMG) algorithms cannot be applied directly. In this study, we explore the effectiveness of a new multilevel strategy, based on the multigrid reduction technique, to deal with problems of this type. We demonstrate the effectiveness of the method through numerical results for the case of two-phase, two-component flow with phase appearance/disappearance. In conclusion, we also show that the strategy is efficient and scales optimally with problem size.« less
Bui, Quan M.; Wang, Lu; Osei-Kuffuor, Daniel
2018-02-06
Multiphase flow is a critical process in a wide range of applications, including oil and gas recovery, carbon sequestration, and contaminant remediation. Numerical simulation of multiphase flow requires solving of a large, sparse linear system resulting from the discretization of the partial differential equations modeling the flow. In the case of multiphase multicomponent flow with miscible effect, this is a very challenging task. The problem becomes even more difficult if phase transitions are taken into account. A new approach to handle phase transitions is to formulate the system as a nonlinear complementarity problem (NCP). Unlike in the primary variable switchingmore » technique, the set of primary variables in this approach is fixed even when there is phase transition. Not only does this improve the robustness of the nonlinear solver, it opens up the possibility to use multigrid methods to solve the resulting linear system. The disadvantage of the complementarity approach, however, is that when a phase disappears, the linear system has the structure of a saddle point problem and becomes indefinite, and current algebraic multigrid (AMG) algorithms cannot be applied directly. In this study, we explore the effectiveness of a new multilevel strategy, based on the multigrid reduction technique, to deal with problems of this type. We demonstrate the effectiveness of the method through numerical results for the case of two-phase, two-component flow with phase appearance/disappearance. In conclusion, we also show that the strategy is efficient and scales optimally with problem size.« less
Aerodynamic Shape Optimization of Complex Aircraft Configurations via an Adjoint Formulation
NASA Technical Reports Server (NTRS)
Reuther, James; Jameson, Antony; Farmer, James; Martinelli, Luigi; Saunders, David
1996-01-01
This work describes the implementation of optimization techniques based on control theory for complex aircraft configurations. Here control theory is employed to derive the adjoint differential equations, the solution of which allows for a drastic reduction in computational costs over previous design methods (13, 12, 43, 38). In our earlier studies (19, 20, 22, 23, 39, 25, 40, 41, 42) it was shown that this method could be used to devise effective optimization procedures for airfoils, wings and wing-bodies subject to either analytic or arbitrary meshes. Design formulations for both potential flows and flows governed by the Euler equations have been demonstrated, showing that such methods can be devised for various governing equations (39, 25). In our most recent works (40, 42) the method was extended to treat wing-body configurations with a large number of mesh points, verifying that significant computational savings can be gained for practical design problems. In this paper the method is extended for the Euler equations to treat complete aircraft configurations via a new multiblock implementation. New elements include a multiblock-multigrid flow solver, a multiblock-multigrid adjoint solver, and a multiblock mesh perturbation scheme. Two design examples are presented in which the new method is used for the wing redesign of a transonic business jet.
Emulsion droplet interactions: a front-tracking treatment
NASA Astrophysics Data System (ADS)
Mason, Lachlan; Juric, Damir; Chergui, Jalel; Shin, Seungwon; Craster, Richard V.; Matar, Omar K.
2017-11-01
Emulsion coalescence influences a multitude of industrial applications including solvent extraction, oil recovery and the manufacture of fast-moving consumer goods. Droplet interaction models are vital for the design and scale-up of processing systems, however predictive modelling at the droplet-scale remains a research challenge. This study simulates industrially relevant moderate-inertia collisions for which a high degree of droplet deformation occurs. A hybrid front-tracking/level-set approach is used to automatically account for interface merging without the need for `bookkeeping' of interface connectivity. The model is implemented in Code BLUE using a parallel multi-grid solver, allowing both film and droplet-scale dynamics to be resolved efficiently. Droplet interaction simulations are validated using experimental sequences from the literature in the presence and absence of background turbulence. The framework is readily extensible for modelling the influence of surfactants and non-Newtonian fluids on droplet interaction processes. EPSRC, UK, MEMPHIS program Grant (EP/K003976/1), RAEng Research Chair (OKM), PETRONAS.
Adaptive Skin Meshes Coarsening for Biomolecular Simulation
Shi, Xinwei; Koehl, Patrice
2011-01-01
In this paper, we present efficient algorithms for generating hierarchical molecular skin meshes with decreasing size and guaranteed quality. Our algorithms generate a sequence of coarse meshes for both the surfaces and the bounded volumes. Each coarser surface mesh is adaptive to the surface curvature and maintains the topology of the skin surface with guaranteed mesh quality. The corresponding tetrahedral mesh is conforming to the interface surface mesh and contains high quality tetrahedral that decompose both the interior of the molecule and the surrounding region (enclosed in a sphere). Our hierarchical tetrahedral meshes have a number of advantages that will facilitate fast and accurate multigrid PDE solvers. Firstly, the quality of both the surface triangulations and tetrahedral meshes is guaranteed. Secondly, the interface in the tetrahedral mesh is an accurate approximation of the molecular boundary. In particular, all the boundary points lie on the skin surface. Thirdly, our meshes are Delaunay meshes. Finally, the meshes are adaptive to the geometry. PMID:21779137
Unweighted least squares phase unwrapping by means of multigrid techniques
NASA Astrophysics Data System (ADS)
Pritt, Mark D.
1995-11-01
We present a multigrid algorithm for unweighted least squares phase unwrapping. This algorithm applies Gauss-Seidel relaxation schemes to solve the Poisson equation on smaller, coarser grids and transfers the intermediate results to the finer grids. This approach forms the basis of our multigrid algorithm for weighted least squares phase unwrapping, which is described in a separate paper. The key idea of our multigrid approach is to maintain the partial derivatives of the phase data in separate arrays and to correct these derivatives at the boundaries of the coarser grids. This maintains the boundary conditions necessary for rapid convergence to the correct solution. Although the multigrid algorithm is an iterative algorithm, we demonstrate that it is nearly as fast as the direct Fourier-based method. We also describe how to parallelize the algorithm for execution on a distributed-memory parallel processor computer or a network-cluster of workstations.
NASA Astrophysics Data System (ADS)
Koldan, Jelena; Puzyrev, Vladimir; de la Puente, Josep; Houzeaux, Guillaume; Cela, José María
2014-06-01
We present an elaborate preconditioning scheme for Krylov subspace methods which has been developed to improve the performance and reduce the execution time of parallel node-based finite-element (FE) solvers for 3-D electromagnetic (EM) numerical modelling in exploration geophysics. This new preconditioner is based on algebraic multigrid (AMG) that uses different basic relaxation methods, such as Jacobi, symmetric successive over-relaxation (SSOR) and Gauss-Seidel, as smoothers and the wave front algorithm to create groups, which are used for a coarse-level generation. We have implemented and tested this new preconditioner within our parallel nodal FE solver for 3-D forward problems in EM induction geophysics. We have performed series of experiments for several models with different conductivity structures and characteristics to test the performance of our AMG preconditioning technique when combined with biconjugate gradient stabilized method. The results have shown that, the more challenging the problem is in terms of conductivity contrasts, ratio between the sizes of grid elements and/or frequency, the more benefit is obtained by using this preconditioner. Compared to other preconditioning schemes, such as diagonal, SSOR and truncated approximate inverse, the AMG preconditioner greatly improves the convergence of the iterative solver for all tested models. Also, when it comes to cases in which other preconditioners succeed to converge to a desired precision, AMG is able to considerably reduce the total execution time of the forward-problem code-up to an order of magnitude. Furthermore, the tests have confirmed that our AMG scheme ensures grid-independent rate of convergence, as well as improvement in convergence regardless of how big local mesh refinements are. In addition, AMG is designed to be a black-box preconditioner, which makes it easy to use and combine with different iterative methods. Finally, it has proved to be very practical and efficient in the parallel context.
Numerical Analysis of Shear Thickening Fluids for Blast Mitigation Applications
2011-12-01
integrate with other types of physics simulation technologies ( ANSYS , 2011). One well-known product offered by ANSYS is the ANSYS CFX . The ANSYS CFD...centered. The ANSYS CFX solver uses coupled algebraic multigrid to achieve its solutions and its engineered scalability ensures a linear increase in CPU...on the user-defined distribution and size. As the numerical analysis focused on the behavior of each individual particle, the ANSYS CFX Rigid Body
The ANTARES Code: New Developments
NASA Astrophysics Data System (ADS)
Blies, P. M.; Kupka, F.; Muthsam, H. J.
2015-10-01
We give an update on the ANTARES code. It was presented by Muthsam et al. (2010) and has since experienced various improvements and has also been extended by new features which we will mention in this paper. Two new features will be presented in a bit more detail: the parallel multigrid solver for the 2D non-linear, generalized Helmholtz equation by Happenhofer (2014) and the capability to use curvilinear grids by Grimm-Strele (2014).
An installed nacelle design code using a multiblock Euler solver. Volume 1: Theory document
NASA Technical Reports Server (NTRS)
Chen, H. C.
1992-01-01
An efficient multiblock Euler design code was developed for designing a nacelle installed on geometrically complex airplane configurations. This approach employed a design driver based on a direct iterative surface curvature method developed at LaRC. A general multiblock Euler flow solver was used for computing flow around complex geometries. The flow solver used a finite-volume formulation with explicit time-stepping to solve the Euler Equations. It used a multiblock version of the multigrid method to accelerate the convergence of the calculations. The design driver successively updated the surface geometry to reduce the difference between the computed and target pressure distributions. In the flow solver, the change in surface geometry was simulated by applying surface transpiration boundary conditions to avoid repeated grid generation during design iterations. Smoothness of the designed surface was ensured by alternate application of streamwise and circumferential smoothings. The capability and efficiency of the code was demonstrated through the design of both an isolated nacelle and an installed nacelle at various flow conditions. Information on the execution of the computer program is provided in volume 2.
Algebraic multigrid domain and range decomposition (AMG-DD / AMG-RD)*
Bank, R.; Falgout, R. D.; Jones, T.; ...
2015-10-29
In modern large-scale supercomputing applications, algebraic multigrid (AMG) is a leading choice for solving matrix equations. However, the high cost of communication relative to that of computation is a concern for the scalability of traditional implementations of AMG on emerging architectures. This paper introduces two new algebraic multilevel algorithms, algebraic multigrid domain decomposition (AMG-DD) and algebraic multigrid range decomposition (AMG-RD), that replace traditional AMG V-cycles with a fully overlapping domain decomposition approach. While the methods introduced here are similar in spirit to the geometric methods developed by Brandt and Diskin [Multigrid solvers on decomposed domains, in Domain Decomposition Methods inmore » Science and Engineering, Contemp. Math. 157, AMS, Providence, RI, 1994, pp. 135--155], Mitchell [Electron. Trans. Numer. Anal., 6 (1997), pp. 224--233], and Bank and Holst [SIAM J. Sci. Comput., 22 (2000), pp. 1411--1443], they differ primarily in that they are purely algebraic: AMG-RD and AMG-DD trade communication for computation by forming global composite “grids” based only on the matrix, not the geometry. (As is the usual AMG convention, “grids” here should be taken only in the algebraic sense, regardless of whether or not it corresponds to any geometry.) Another important distinguishing feature of AMG-RD and AMG-DD is their novel residual communication process that enables effective parallel computation on composite grids, avoiding the all-to-all communication costs of the geometric methods. The main purpose of this paper is to study the potential of these two algebraic methods as possible alternatives to existing AMG approaches for future parallel machines. As a result, this paper develops some theoretical properties of these methods and reports on serial numerical tests of their convergence properties over a spectrum of problem parameters.« less
A new extrapolation cascadic multigrid method for three dimensional elliptic boundary value problems
NASA Astrophysics Data System (ADS)
Pan, Kejia; He, Dongdong; Hu, Hongling; Ren, Zhengyong
2017-09-01
In this paper, we develop a new extrapolation cascadic multigrid method, which makes it possible to solve three dimensional elliptic boundary value problems with over 100 million unknowns on a desktop computer in half a minute. First, by combining Richardson extrapolation and quadratic finite element (FE) interpolation for the numerical solutions on two-level of grids (current and previous grids), we provide a quite good initial guess for the iterative solution on the next finer grid, which is a third-order approximation to the FE solution. And the resulting large linear system from the FE discretization is then solved by the Jacobi-preconditioned conjugate gradient (JCG) method with the obtained initial guess. Additionally, instead of performing a fixed number of iterations as used in existing cascadic multigrid methods, a relative residual tolerance is introduced in the JCG solver, which enables us to obtain conveniently the numerical solution with the desired accuracy. Moreover, a simple method based on the midpoint extrapolation formula is proposed to achieve higher-order accuracy on the finest grid cheaply and directly. Test results from four examples including two smooth problems with both constant and variable coefficients, an H3-regular problem as well as an anisotropic problem are reported to show that the proposed method has much better efficiency compared to the classical V-cycle and W-cycle multigrid methods. Finally, we present the reason why our method is highly efficient for solving these elliptic problems.
Algebraic multigrid preconditioners for two-phase flow in porous media with phase transitions
NASA Astrophysics Data System (ADS)
Bui, Quan M.; Wang, Lu; Osei-Kuffuor, Daniel
2018-04-01
Multiphase flow is a critical process in a wide range of applications, including oil and gas recovery, carbon sequestration, and contaminant remediation. Numerical simulation of multiphase flow requires solving of a large, sparse linear system resulting from the discretization of the partial differential equations modeling the flow. In the case of multiphase multicomponent flow with miscible effect, this is a very challenging task. The problem becomes even more difficult if phase transitions are taken into account. A new approach to handle phase transitions is to formulate the system as a nonlinear complementarity problem (NCP). Unlike in the primary variable switching technique, the set of primary variables in this approach is fixed even when there is phase transition. Not only does this improve the robustness of the nonlinear solver, it opens up the possibility to use multigrid methods to solve the resulting linear system. The disadvantage of the complementarity approach, however, is that when a phase disappears, the linear system has the structure of a saddle point problem and becomes indefinite, and current algebraic multigrid (AMG) algorithms cannot be applied directly. In this study, we explore the effectiveness of a new multilevel strategy, based on the multigrid reduction technique, to deal with problems of this type. We demonstrate the effectiveness of the method through numerical results for the case of two-phase, two-component flow with phase appearance/disappearance. We also show that the strategy is efficient and scales optimally with problem size.
A Multigrid Approach to Embedded-Grid Solvers
1992-08-01
previously as a Master’s Thesis at the University of Florida. Not edited by TESCO , Inc. 12a. DISTRIBUTION / AVAILABILITY STATEMENT 12b. DISTRIBUTION CODE...domain decomposition techniques in order to accurately model the aerodynamics of complex geometries , 5, 11, 12, 13, 24’. Although these high...quantities subscripted by oc denote reference values in the undisturbed gas. Uv v, e e P - (10) Where • = (7b,/•)1/2, is the speed of sound in the
A geometric multigrid preconditioning strategy for DPG system matrices
Roberts, Nathan V.; Chan, Jesse
2017-08-23
Here, the discontinuous Petrov–Galerkin (DPG) methodology of Demkowicz and Gopalakrishnan (2010, 2011) guarantees the optimality of the solution in an energy norm, and provides several features facilitating adaptive schemes. A key question that has not yet been answered in general – though there are some results for Poisson, e.g.– is how best to precondition the DPG system matrix, so that iterative solvers may be used to allow solution of large-scale problems.
Fully anisotropic 3-D EM modelling on a Lebedev grid with a multigrid pre-conditioner
NASA Astrophysics Data System (ADS)
Jaysaval, Piyoosh; Shantsev, Daniil V.; de la Kethulle de Ryhove, Sébastien; Bratteland, Tarjei
2016-12-01
We present a numerical algorithm for 3-D electromagnetic (EM) simulations in conducting media with general electric anisotropy. The algorithm is based on the finite-difference discretization of frequency-domain Maxwell's equations on a Lebedev grid, in which all components of the electric field are collocated but half a spatial step staggered with respect to the magnetic field components, which also are collocated. This leads to a system of linear equations that is solved using a stabilized biconjugate gradient method with a multigrid preconditioner. We validate the accuracy of the numerical results for layered and 3-D tilted transverse isotropic (TTI) earth models representing typical scenarios used in the marine controlled-source EM method. It is then demonstrated that not taking into account the full anisotropy of the conductivity tensor can lead to misleading inversion results. For synthetic data corresponding to a 3-D model with a TTI anticlinal structure, a standard vertical transverse isotropic (VTI) inversion is not able to image a resistor, while for a 3-D model with a TTI synclinal structure it produces a false resistive anomaly. However, if the VTI forward solver used in the inversion is replaced by the proposed TTI solver with perfect knowledge of the strike and dip of the dipping structures, the resulting resistivity images become consistent with the true models.
Operator induced multigrid algorithms using semirefinement
NASA Technical Reports Server (NTRS)
Decker, Naomi; Vanrosendale, John
1989-01-01
A variant of multigrid, based on zebra relaxation, and a new family of restriction/prolongation operators is described. Using zebra relaxation in combination with an operator-induced prolongation leads to fast convergence, since the coarse grid can correct all error components. The resulting algorithms are not only fast, but are also robust, in the sense that the convergence rate is insensitive to the mesh aspect ratio. This is true even though line relaxation is performed in only one direction. Multigrid becomes a direct method if an operator-induced prolongation is used, together with the induced coarse grid operators. Unfortunately, this approach leads to stencils which double in size on each coarser grid. The use of an implicit three point restriction can be used to factor these large stencils, in order to retain the usual five or nine point stencils, while still achieving fast convergence. This algorithm achieves a V-cycle convergence rate of 0.03 on Poisson's equation, using 1.5 zebra sweeps per level, while the convergence rate improves to 0.003 if optimal nine point stencils are used. Numerical results for two and three dimensional model problems are presented, together with a two level analysis explaining these results.
Implementing High-Performance Geometric Multigrid Solver with Naturally Grained Messages
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shan, Hongzhang; Williams, Samuel; Zheng, Yili
2015-10-26
Structured-grid linear solvers often require manually packing and unpacking of communication data to achieve high performance.Orchestrating this process efficiently is challenging, labor-intensive, and potentially error-prone.In this paper, we explore an alternative approach that communicates the data with naturally grained messagesizes without manual packing and unpacking. This approach is the distributed analogue of shared-memory programming, taking advantage of the global addressspace in PGAS languages to provide substantial programming ease. However, its performance may suffer from the large number of small messages. We investigate theruntime support required in the UPC ++ library for this naturally grained version to close the performance gapmore » between the two approaches and attain comparable performance at scale using the High-Performance Geometric Multgrid (HPGMG-FV) benchmark as a driver.« less
A general multiblock Euler code for propulsion integration. Volume 1: Theory document
NASA Technical Reports Server (NTRS)
Chen, H. C.; Su, T. Y.; Kao, T. J.
1991-01-01
A general multiblock Euler solver was developed for the analysis of flow fields over geometrically complex configurations either in free air or in a wind tunnel. In this approach, the external space around a complex configuration was divided into a number of topologically simple blocks, so that surface-fitted grids and an efficient flow solution algorithm could be easily applied in each block. The computational grid in each block is generated using a combination of algebraic and elliptic methods. A grid generation/flow solver interface program was developed to facilitate the establishment of block-to-block relations and the boundary conditions for each block. The flow solver utilizes a finite volume formulation and an explicit time stepping scheme to solve the Euler equations. A multiblock version of the multigrid method was developed to accelerate the convergence of the calculations. The generality of the method was demonstrated through the analysis of two complex configurations at various flow conditions. Results were compared to available test data. Two accompanying volumes, user manuals for the preparation of multi-block grids (vol. 2) and for the Euler flow solver (vol. 3), provide information on input data format and program execution.
A coarse-grid projection method for accelerating incompressible flow computations
NASA Astrophysics Data System (ADS)
San, Omer; Staples, Anne
2011-11-01
We present a coarse-grid projection (CGP) algorithm for accelerating incompressible flow computations, which is applicable to methods involving Poisson equations as incompressibility constraints. CGP methodology is a modular approach that facilitates data transfer with simple interpolations and uses black-box solvers for the Poisson and advection-diffusion equations in the flow solver. Here, we investigate a particular CGP method for the vorticity-stream function formulation that uses the full weighting operation for mapping from fine to coarse grids, the third-order Runge-Kutta method for time stepping, and finite differences for the spatial discretization. After solving the Poisson equation on a coarsened grid, bilinear interpolation is used to obtain the fine data for consequent time stepping on the full grid. We compute several benchmark flows: the Taylor-Green vortex, a vortex pair merging, a double shear layer, decaying turbulence and the Taylor-Green vortex on a distorted grid. In all cases we use either FFT-based or V-cycle multigrid linear-cost Poisson solvers. Reducing the number of degrees of freedom of the Poisson solver by powers of two accelerates these computations while, for the first level of coarsening, retaining the same level of accuracy in the fine resolution vorticity field.
A High Order, Locally-Adaptive Method for the Navier-Stokes Equations
NASA Astrophysics Data System (ADS)
Chan, Daniel
1998-11-01
I have extended the FOSLS method of Cai, Manteuffel and McCormick (1997) and implemented it within the framework of a spectral element formulation using the Legendre polynomial basis function. The FOSLS method solves the Navier-Stokes equations as a system of coupled first-order equations and provides the ellipticity that is needed for fast iterative matrix solvers like multigrid to operate efficiently. Each element is treated as an object and its properties are self-contained. Only C^0 continuity is imposed across element interfaces; this design allows local grid refinement and coarsening without the burden of having an elaborate data structure, since only information along element boundaries is needed. With the FORTRAN 90 programming environment, I can maintain a high computational efficiency by employing a hybrid parallel processing model. The OpenMP directives provides parallelism in the loop level which is executed in a shared-memory SMP and the MPI protocol allows the distribution of elements to a cluster of SMP's connected via a commodity network. This talk will provide timing results and a comparison with a second order finite difference method.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chacon, Luis; Stanier, Adam John
Here, we demonstrate a scalable fully implicit algorithm for the two-field low-β extended MHD model. This reduced model describes plasma behavior in the presence of strong guide fields, and is of significant practical impact both in nature and in laboratory plasmas. The model displays strong hyperbolic behavior, as manifested by the presence of fast dispersive waves, which make a fully implicit treatment very challenging. In this study, we employ a Jacobian-free Newton–Krylov nonlinear solver, for which we propose a physics-based preconditioner that renders the linearized set of equations suitable for inversion with multigrid methods. As a result, the algorithm ismore » shown to scale both algorithmically (i.e., the iteration count is insensitive to grid refinement and timestep size) and in parallel in a weak-scaling sense, with the wall-clock time scaling weakly with the number of cores for up to 4096 cores. For a 4096 × 4096 mesh, we demonstrate a wall-clock-time speedup of ~6700 with respect to explicit algorithms. The model is validated linearly (against linear theory predictions) and nonlinearly (against fully kinetic simulations), demonstrating excellent agreement.« less
Construction, classification and parametrization of complex Hadamard matrices
NASA Astrophysics Data System (ADS)
Szöllősi, Ferenc
To improve the design of nuclear systems, high-fidelity neutron fluxes are required. Leadership-class machines provide platforms on which very large problems can be solved. Computing such fluxes efficiently requires numerical methods with good convergence properties and algorithms that can scale to hundreds of thousands of cores. Many 3-D deterministic transport codes are decomposable in space and angle only, limiting them to tens of thousands of cores. Most codes rely on methods such as Gauss Seidel for fixed source problems and power iteration for eigenvalue problems, which can be slow to converge for challenging problems like those with highly scattering materials or high dominance ratios. Three methods have been added to the 3-D SN transport code Denovo that are designed to improve convergence and enable the full use of cutting-edge computers. The first is a multigroup Krylov solver that converges more quickly than Gauss Seidel and parallelizes the code in energy such that Denovo can use hundreds of thousand of cores effectively. The second is Rayleigh quotient iteration (RQI), an old method applied in a new context. This eigenvalue solver finds the dominant eigenvalue in a mathematically optimal way and should converge in fewer iterations than power iteration. RQI creates energy-block-dense equations that the new Krylov solver treats efficiently. However, RQI can have convergence problems because it creates poorly conditioned systems. This can be overcome with preconditioning. The third method is a multigrid-in-energy preconditioner. The preconditioner takes advantage of the new energy decomposition because the grids are in energy rather than space or angle. The preconditioner greatly reduces iteration count for many problem types and scales well in energy. It also allows RQI to be successful for problems it could not solve otherwise. The methods added to Denovo accomplish the goals of this work. They converge in fewer iterations than traditional methods and enable the use of hundreds of thousands of cores. Each method can be used individually, with the multigroup Krylov solver and multigrid-in-energy preconditioner being particularly successful on their own. The largest benefit, though, comes from using these methods in concert.
NASA Astrophysics Data System (ADS)
van Dyk, Danny; Geveler, Markus; Mallach, Sven; Ribbrock, Dirk; Göddeke, Dominik; Gutwenger, Carsten
2009-12-01
We present HONEI, an open-source collection of libraries offering a hardware oriented approach to numerical calculations. HONEI abstracts the hardware, and applications written on top of HONEI can be executed on a wide range of computer architectures such as CPUs, GPUs and the Cell processor. We demonstrate the flexibility and performance of our approach with two test applications, a Finite Element multigrid solver for the Poisson problem and a robust and fast simulation of shallow water waves. By linking against HONEI's libraries, we achieve a two-fold speedup over straight forward C++ code using HONEI's SSE backend, and additional 3-4 and 4-16 times faster execution on the Cell and a GPU. A second important aspect of our approach is that the full performance capabilities of the hardware under consideration can be exploited by adding optimised application-specific operations to the HONEI libraries. HONEI provides all necessary infrastructure for development and evaluation of such kernels, significantly simplifying their development. Program summaryProgram title: HONEI Catalogue identifier: AEDW_v1_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/AEDW_v1_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: GPLv2 No. of lines in distributed program, including test data, etc.: 216 180 No. of bytes in distributed program, including test data, etc.: 1 270 140 Distribution format: tar.gz Programming language: C++ Computer: x86, x86_64, NVIDIA CUDA GPUs, Cell blades and PlayStation 3 Operating system: Linux RAM: at least 500 MB free Classification: 4.8, 4.3, 6.1 External routines: SSE: none; [1] for GPU, [2] for Cell backend Nature of problem: Computational science in general and numerical simulation in particular have reached a turning point. The revolution developers are facing is not primarily driven by a change in (problem-specific) methodology, but rather by the fundamental paradigm shift of the underlying hardware towards heterogeneity and parallelism. This is particularly relevant for data-intensive problems stemming from discretisations with local support, such as finite differences, volumes and elements. Solution method: To address these issues, we present a hardware aware collection of libraries combining the advantages of modern software techniques and hardware oriented programming. Applications built on top of these libraries can be configured trivially to execute on CPUs, GPUs or the Cell processor. In order to evaluate the performance and accuracy of our approach, we provide two domain specific applications; a multigrid solver for the Poisson problem and a fully explicit solver for 2D shallow water equations. Restrictions: HONEI is actively being developed, and its feature list is continuously expanded. Not all combinations of operations and architectures might be supported in earlier versions of the code. Obtaining snapshots from http://www.honei.org is recommended. Unusual features: The considered applications as well as all library operations can be run on NVIDIA GPUs and the Cell BE. Running time: Depending on the application, and the input sizes. The Poisson solver executes in few seconds, while the SWE solver requires up to 5 minutes for large spatial discretisations or small timesteps. References:http://www.nvidia.com/cuda. http://www.ibm.com/developerworks/power/cell.
Hydrodynamics of suspensions of passive and active rigid particles: a rigid multiblob approach
Usabiaga, Florencio Balboa; Kallemov, Bakytzhan; Delmotte, Blaise; ...
2016-01-12
We develop a rigid multiblob method for numerically solving the mobility problem for suspensions of passive and active rigid particles of complex shape in Stokes flow in unconfined, partially confined, and fully confined geometries. As in a number of existing methods, we discretize rigid bodies using a collection of minimally resolved spherical blobs constrained to move as a rigid body, to arrive at a potentially large linear system of equations for the unknown Lagrange multipliers and rigid-body motions. Here we develop a block-diagonal preconditioner for this linear system and show that a standard Krylov solver converges in a modest numbermore » of iterations that is essentially independent of the number of particles. Key to the efficiency of the method is a technique for fast computation of the product of the blob-blob mobility matrix and a vector. For unbounded suspensions, we rely on existing analytical expressions for the Rotne-Prager-Yamakawa tensor combined with a fast multipole method (FMM) to obtain linear scaling in the number of particles. For suspensions sedimented against a single no-slip boundary, we use a direct summation on a graphical processing unit (GPU), which gives quadratic asymptotic scaling with the number of particles. For fully confined domains, such as periodic suspensions or suspensions confined in slit and square channels, we extend a recently developed rigid-body immersed boundary method by B. Kallemov, A. P. S. Bhalla, B. E. Griffith, and A. Donev (Commun. Appl. Math. Comput. Sci. 11 (2016), no. 1, 79-141) to suspensions of freely moving passive or active rigid particles at zero Reynolds number. We demonstrate that the iterative solver for the coupled fluid and rigid-body equations converges in a bounded number of iterations regardless of the system size. In our approach, each iteration only requires a few cycles of a geometric multigrid solver for the Poisson equation, and an application of the block-diagonal preconditioner, leading to linear scaling with the number of particles. We optimize a number of parameters in the iterative solvers and apply our method to a variety of benchmark problems to carefully assess the accuracy of the rigid multiblob approach as a function of the resolution. We also model the dynamics of colloidal particles studied in recent experiments, such as passive boomerangs in a slit channel, as well as a pair of non-Brownian active nanorods sedimented against a wall.« less
NASA Astrophysics Data System (ADS)
Lashkin, S. V.; Kozelkov, A. S.; Yalozo, A. V.; Gerasimov, V. Yu.; Zelensky, D. K.
2017-12-01
This paper describes the details of the parallel implementation of the SIMPLE algorithm for numerical solution of the Navier-Stokes system of equations on arbitrary unstructured grids. The iteration schemes for the serial and parallel versions of the SIMPLE algorithm are implemented. In the description of the parallel implementation, special attention is paid to computational data exchange among processors under the condition of the grid model decomposition using fictitious cells. We discuss the specific features for the storage of distributed matrices and implementation of vector-matrix operations in parallel mode. It is shown that the proposed way of matrix storage reduces the number of interprocessor exchanges. A series of numerical experiments illustrates the effect of the multigrid SLAE solver tuning on the general efficiency of the algorithm; the tuning involves the types of the cycles used (V, W, and F), the number of iterations of a smoothing operator, and the number of cells for coarsening. Two ways (direct and indirect) of efficiency evaluation for parallelization of the numerical algorithm are demonstrated. The paper presents the results of solving some internal and external flow problems with the evaluation of parallelization efficiency by two algorithms. It is shown that the proposed parallel implementation enables efficient computations for the problems on a thousand processors. Based on the results obtained, some general recommendations are made for the optimal tuning of the multigrid solver, as well as for selecting the optimal number of cells per processor.
NASA Astrophysics Data System (ADS)
Kaus, B.; Popov, A.
2015-12-01
The analytical expression for the Jacobian is a key component to achieve fast and robust convergence of the nonlinear Newton-Raphson iterative solver. Accomplishing this task in practice often requires a significant algebraic effort. Therefore it is quite common to use a cheap alternative instead, for example by approximating the Jacobian with a finite difference estimation. Despite its simplicity it is a relatively fragile and unreliable technique that is sensitive to the scaling of the residual and unknowns, as well as to the perturbation parameter selection. Unfortunately no universal rule can be applied to provide both a robust scaling and a perturbation. The approach we use here is to derive the analytical Jacobian for the coupled set of momentum, mass, and energy conservation equations together with the elasto-visco-plastic rheology and a marker in cell/staggered finite difference method. The software project LaMEM (Lithosphere and Mantle Evolution Model) is primarily developed for the thermo-mechanically coupled modeling of the 3D lithospheric deformation. The code is based on a staggered grid finite difference discretization in space, and uses customized scalable solvers form PETSc library to efficiently run on the massively parallel machines (such as IBM Blue Gene/Q). Currently LaMEM relies on the Jacobian-Free Newton-Krylov (JFNK) nonlinear solver, which approximates the Jacobian-vector product using a simple finite difference formula. This approach never requires an assembled Jacobian matrix and uses only the residual computation routine. We use an approximate Jacobian (Picard) matrix to precondition the Krylov solver with the Galerkin geometric multigrid. Because of the inherent problems of the finite difference Jacobian estimation, this approach doesn't always result in stable convergence. In this work we present and discuss a matrix-free technique in which the Jacobian-vector product is replaced by analytically-derived expressions and compare results with those obtained with a finite difference approximation of the Jacobian. This project is funded by ERC Starting Grant 258830 and computer facilities were provided by Jülich supercomputer center (Germany).
NASA Technical Reports Server (NTRS)
Smith, Crawford F.; Podleski, Steve D.
1993-01-01
The proper use of a computational fluid dynamics code requires a good understanding of the particular code being applied. In this report the application of CFL3D, a thin-layer Navier-Stokes code, is compared with the results obtained from PARC3D, a full Navier-Stokes code. In order to gain an understanding of the use of this code, a simple problem was chosen in which several key features of the code could be exercised. The problem chosen is a cone in supersonic flow at an angle of attack. The issues of grid resolution, grid blocking, and multigridding with CFL3D are explored. The use of multigridding resulted in a significant reduction in the computational time required to solve the problem. Solutions obtained are compared with the results using the full Navier-Stokes equations solver PARC3D. The results obtained with the CFL3D code compared well with the PARC3D solutions.
NASA Technical Reports Server (NTRS)
Bates, J. R.; Semazzi, F. H. M.; Higgins, R. W.; Barros, Saulo R. M.
1990-01-01
A vector semi-Lagrangian semi-implicit two-time-level finite-difference integration scheme for the shallow water equations on the sphere is presented. A C-grid is used for the spatial differencing. The trajectory-centered discretization of the momentum equation in vector form eliminates pole problems and, at comparable cost, gives greater accuracy than a previous semi-Lagrangian finite-difference scheme which used a rotated spherical coordinate system. In terms of the insensitivity of the results to increasing timestep, the new scheme is as successful as recent spectral semi-Lagrangian schemes. In addition, the use of a multigrid method for solving the elliptic equation for the geopotential allows efficient integration with an operation count which, at high resolution, is of lower order than in the case of the spectral models. The properties of the new scheme should allow finite-difference models to compete with spectral models more effectively than has previously been possible.
NASA Astrophysics Data System (ADS)
Caughey, David A.; Jameson, Antony
2003-10-01
New versions of implicit algorithms are developed for the efficient solution of the Euler and Navier-Stokes equations of compressible flow. The methods are based on a preconditioned, lower-upper (LU) implementation of a non-linear, symmetric Gauss-Seidel (SGS) algorithm for use as a smoothing algorithm in a multigrid method. Previously, this method had been implemented for flows in quasi-one-dimensional ducts and for two-dimensional flows past airfoils on boundary-conforming O-type grids for a variety of symmetric limited positive (SLIP) spatial approximations, including the scalar dissipation and convective upwind split pressure (CUSP) schemes. Here results are presented for both inviscid and viscous (laminar) flows past airfoils on boundary-conforming C-type grids. The method is significantly faster than earlier explicit or implicit methods for inviscid problems, allowing solution of these problems to the level of truncation error in three to five multigrid cycles. Viscous solutions still require as many as twenty multigrid cycles.
General relaxation schemes in multigrid algorithms for higher order singularity methods
NASA Technical Reports Server (NTRS)
Oskam, B.; Fray, J. M. J.
1981-01-01
Relaxation schemes based on approximate and incomplete factorization technique (AF) are described. The AF schemes allow construction of a fast multigrid method for solving integral equations of the second and first kind. The smoothing factors for integral equations of the first kind, and comparison with similar results from the second kind of equations are a novel item. Application of the MD algorithm shows convergence to the level of truncation error of a second order accurate panel method.
A comparison of locally adaptive multigrid methods: LDC, FAC and FIC
NASA Technical Reports Server (NTRS)
Khadra, Khodor; Angot, Philippe; Caltagirone, Jean-Paul
1993-01-01
This study is devoted to a comparative analysis of three 'Adaptive ZOOM' (ZOom Overlapping Multi-level) methods based on similar concepts of hierarchical multigrid local refinement: LDC (Local Defect Correction), FAC (Fast Adaptive Composite), and FIC (Flux Interface Correction)--which we proposed recently. These methods are tested on two examples of a bidimensional elliptic problem. We compare, for V-cycle procedures, the asymptotic evolution of the global error evaluated by discrete norms, the corresponding local errors, and the convergence rates of these algorithms.
Adaptive Mesh Refinement in Curvilinear Body-Fitted Grid Systems
NASA Technical Reports Server (NTRS)
Steinthorsson, Erlendur; Modiano, David; Colella, Phillip
1995-01-01
To be truly compatible with structured grids, an AMR algorithm should employ a block structure for the refined grids to allow flow solvers to take advantage of the strengths of unstructured grid systems, such as efficient solution algorithms for implicit discretizations and multigrid schemes. One such algorithm, the AMR algorithm of Berger and Colella, has been applied to and adapted for use with body-fitted structured grid systems. Results are presented for a transonic flow over a NACA0012 airfoil (AGARD-03 test case) and a reflection of a shock over a double wedge.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Grout, Ray W. S.
Convergence of spectral deferred correction (SDC), where low-order time integration methods are used to construct higher-order methods through iterative refinement, can be accelerated in terms of computational effort by using mixed-precision methods. Using ideas from multi-level SDC (in turn based on FAS multigrid ideas), some of the SDC correction sweeps can use function values computed in reduced precision without adversely impacting the accuracy of the final solution. This is particularly beneficial for the performance of combustion solvers such as S3D [6] which require double precision accuracy but are performance limited by the cost of data motion.
NASA Technical Reports Server (NTRS)
Goodrich, John W.
1991-01-01
An algorithm is presented for unsteady two-dimensional incompressible Navier-Stokes calculations. This algorithm is based on the fourth order partial differential equation for incompressible fluid flow which uses the streamfunction as the only dependent variable. The algorithm is second order accurate in both time and space. It uses a multigrid solver at each time step. It is extremely efficient with respect to the use of both CPU time and physical memory. It is extremely robust with respect to Reynolds number.
Adaptive Meshing Techniques for Viscous Flow Calculations on Mixed Element Unstructured Meshes
NASA Technical Reports Server (NTRS)
Mavriplis, D. J.
1997-01-01
An adaptive refinement strategy based on hierarchical element subdivision is formulated and implemented for meshes containing arbitrary mixtures of tetrahendra, hexahendra, prisms and pyramids. Special attention is given to keeping memory overheads as low as possible. This procedure is coupled with an algebraic multigrid flow solver which operates on mixed-element meshes. Inviscid flows as well as viscous flows are computed an adaptively refined tetrahedral, hexahedral, and hybrid meshes. The efficiency of the method is demonstrated by generating an adapted hexahedral mesh containing 3 million vertices on a relatively inexpensive workstation.
NASA Technical Reports Server (NTRS)
Mineck, Raymond E.; Thomas, James L.; Biedron, Robert T.; Diskin, Boris
2005-01-01
FMG3D (full multigrid 3 dimensions) is a pilot computer program that solves equations of fluid flow using a finite difference representation on a structured grid. Infrastructure exists for three dimensions but the current implementation treats only two dimensions. Written in Fortran 90, FMG3D takes advantage of the recursive subroutine feature, dynamic memory allocation, and structured-programming constructs of that language. FMG3D supports multi-block grids with three types of block-to-block interfaces: periodic, C-zero, and C-infinity. For all three types, grid points must match at interfaces. For periodic and C-infinity types, derivatives of grid metrics must be continuous at interfaces. The available equation sets are as follows: scalar elliptic equations, scalar convection equations, and the pressure-Poisson formulation of the Navier-Stokes equations for an incompressible fluid. All the equation sets are implemented with nonzero forcing functions to enable the use of user-specified solutions to assist in verification and validation. The equations are solved with a full multigrid scheme using a full approximation scheme to converge the solution on each succeeding grid level. Restriction to the next coarser mesh uses direct injection for variables and full weighting for residual quantities; prolongation of the coarse grid correction from the coarse mesh to the fine mesh uses bilinear interpolation; and prolongation of the coarse grid solution uses bicubic interpolation.
Parallel computation of three-dimensional aeroelastic fluid-structure interaction
NASA Astrophysics Data System (ADS)
Sadeghi, Mani
This dissertation presents a numerical method for the parallel computation of aeroelasticity (ParCAE). A flow solver is coupled to a structural solver by use of a fluid-structure interface method. The integration of the three-dimensional unsteady Navier-Stokes equations is performed in the time domain, simultaneously to the integration of a modal three-dimensional structural model. The flow solution is accelerated by using a multigrid method and a parallel multiblock approach. Fluid-structure coupling is achieved by subiteration. A grid-deformation algorithm is developed to interpolate the deformation of the structural boundaries onto the flow grid. The code is formulated to allow application to general, three-dimensional, complex configurations with multiple independent structures. Computational results are presented for various configurations, such as turbomachinery blade rows and aircraft wings. Investigations are performed on vortex-induced vibrations, effects of cascade mistuning on flutter, and cases of nonlinear cascade and wing flutter.
NASA Technical Reports Server (NTRS)
Chen, H. C.; Yu, N. Y.
1991-01-01
An Euler flow solver was developed for predicting the airframe/propulsion integration effects for an aft-mounted turboprop transport. This solver employs a highly efficient multigrid scheme, with a successive mesh-refinement procedure to accelerate the convergence of the solution. A new dissipation model was also implemented to render solutions that are grid insensitive. The propeller power effects are simulated by the actuator disk concept. An embedded flow solution method was developed for predicting the detailed flow characteristics in the local vicinity of an aft-mounted propfan engine in the presence of a flow field induced by a complete aircraft. Results from test case analysis are presented. A user's guide for execution of computer programs, including format of various input files, sample job decks, and sample input files, is provided in an accompanying volume.
Efficiency optimization of a fast Poisson solver in beam dynamics simulation
NASA Astrophysics Data System (ADS)
Zheng, Dawei; Pöplau, Gisela; van Rienen, Ursula
2016-01-01
Calculating the solution of Poisson's equation relating to space charge force is still the major time consumption in beam dynamics simulations and calls for further improvement. In this paper, we summarize a classical fast Poisson solver in beam dynamics simulations: the integrated Green's function method. We introduce three optimization steps of the classical Poisson solver routine: using the reduced integrated Green's function instead of the integrated Green's function; using the discrete cosine transform instead of discrete Fourier transform for the Green's function; using a novel fast convolution routine instead of an explicitly zero-padded convolution. The new Poisson solver routine preserves the advantages of fast computation and high accuracy. This provides a fast routine for high performance calculation of the space charge effect in accelerators.
NASA Astrophysics Data System (ADS)
Sanan, Patrick; May, Dave A.; Schenk, Olaf; Bollhöffer, Matthias
2017-04-01
Geodynamics simulations typically involve the repeated solution of saddle-point systems arising from the Stokes equations. These computations often dominate the time to solution. Direct solvers are known for their robustness and ``black box'' properties, yet exhibit superlinear memory requirements and time to solution. More complex multilevel-preconditioned iterative solvers have been very successful for large problems, yet their use can require more effort from the practitioner in terms of setting up a solver and choosing its parameters. We champion an intermediate approach, based on leveraging the power of modern incomplete factorization techniques for indefinite symmetric matrices. These provide an interesting alternative in situations in between the regimes where direct solvers are an obvious choice and those where complex, scalable, iterative solvers are an obvious choice. That is, much like their relatives for definite systems, ILU/ICC-preconditioned Krylov methods and ILU/ICC-smoothed multigrid methods, the approaches demonstrated here provide a useful addition to the solver toolkit. We present results with a simple, PETSc-based, open-source Q2-Q1 (Taylor-Hood) finite element discretization, in 2 and 3 dimensions, with the Stokes and Lamé (linear elasticity) saddle point systems. Attention is paid to cases in which full-operator incomplete factorization gives an improvement in time to solution over direct solution methods (which may not even be feasible due to memory limitations), without the complication of more complex (or at least, less-automatic) preconditioners or smoothers. As an important factor in the relevance of these tools is their availability in portable software, we also describe open-source PETSc interfaces to the factorization routines.
High order spectral volume and spectral difference methods on unstructured grids
NASA Astrophysics Data System (ADS)
Kannan, Ravishekar
The spectral volume (SV) and the spectral difference (SD) methods were developed by Wang and Liu and their collaborators for conservation laws on unstructured grids. They were introduced to achieve high-order accuracy in an efficient manner. Recently, these methods were extended to three-dimensional systems and to the Navier Stokes equations. The simplicity and robustness of these methods have made them competitive against other higher order methods such as the discontinuous Galerkin and residual distribution methods. Although explicit TVD Runge-Kutta schemes for the temporal advancement are easy to implement, they suffer from small time step limited by the Courant-Friedrichs-Lewy (CFL) condition. When the polynomial order is high or when the grid is stretched due to complex geometries or boundary layers, the convergence rate of explicit schemes slows down rapidly. Solution strategies to remedy this problem include implicit methods and multigrid methods. A novel implicit lower-upper symmetric Gauss-Seidel (LU-SGS) relaxation method is employed as an iterative smoother. It is compared to the explicit TVD Runge-Kutta smoothers. For some p-multigrid calculations, combining implicit and explicit smoothers for different p-levels is also studied. The multigrid method considered is nonlinear and uses Full Approximation Scheme (FAS). An overall speed-up factor of up to 150 is obtained using a three-level p-multigrid LU-SGS approach in comparison with the single level explicit method for the Euler equations for the 3rd order SD method. A study of viscous flux formulations was carried out for the SV method. Three formulations were used to discretize the viscous fluxes: local discontinuous Galerkin (LDG), a penalty method and the 2nd method of Bassi and Rebay. Fourier analysis revealed some interesting advantages for the penalty method. These were implemented in the Navier Stokes solver. An implicit and p-multigrid method was also implemented for the above. An overall speed-up factor of up to 1500 is obtained using a three-level p-multigrid LU-SGS approach in comparison with the single level explicit method for the Navier-Stokes equations. The SV method was also extended to turbulent flows. The RANS based SA model was used to close the Reynolds stresses. The numerical results are very promising and indicate that the approaches have great potentials for 3D flow problems.
A scalable, fully implicit algorithm for the reduced two-field low-β extended MHD model
Chacon, Luis; Stanier, Adam John
2016-12-01
Here, we demonstrate a scalable fully implicit algorithm for the two-field low-β extended MHD model. This reduced model describes plasma behavior in the presence of strong guide fields, and is of significant practical impact both in nature and in laboratory plasmas. The model displays strong hyperbolic behavior, as manifested by the presence of fast dispersive waves, which make a fully implicit treatment very challenging. In this study, we employ a Jacobian-free Newton–Krylov nonlinear solver, for which we propose a physics-based preconditioner that renders the linearized set of equations suitable for inversion with multigrid methods. As a result, the algorithm ismore » shown to scale both algorithmically (i.e., the iteration count is insensitive to grid refinement and timestep size) and in parallel in a weak-scaling sense, with the wall-clock time scaling weakly with the number of cores for up to 4096 cores. For a 4096 × 4096 mesh, we demonstrate a wall-clock-time speedup of ~6700 with respect to explicit algorithms. The model is validated linearly (against linear theory predictions) and nonlinearly (against fully kinetic simulations), demonstrating excellent agreement.« less
Real-time haptic cutting of high-resolution soft tissues.
Wu, Jun; Westermann, Rüdiger; Dick, Christian
2014-01-01
We present our systematic efforts in advancing the computational performance of physically accurate soft tissue cutting simulation, which is at the core of surgery simulators in general. We demonstrate a real-time performance of 15 simulation frames per second for haptic soft tissue cutting of a deformable body at an effective resolution of 170,000 finite elements. This is achieved by the following innovative components: (1) a linked octree discretization of the deformable body, which allows for fast and robust topological modifications of the simulation domain, (2) a composite finite element formulation, which thoroughly reduces the number of simulation degrees of freedom and thus enables to carefully balance simulation performance and accuracy, (3) a highly efficient geometric multigrid solver for solving the linear systems of equations arising from implicit time integration, (4) an efficient collision detection algorithm that effectively exploits the composition structure, and (5) a stable haptic rendering algorithm for computing the feedback forces. Considering that our method increases the finite element resolution for physically accurate real-time soft tissue cutting simulation by an order of magnitude, our technique has a high potential to significantly advance the realism of surgery simulators.
Fast solution of elliptic partial differential equations using linear combinations of plane waves.
Pérez-Jordá, José M
2016-02-01
Given an arbitrary elliptic partial differential equation (PDE), a procedure for obtaining its solution is proposed based on the method of Ritz: the solution is written as a linear combination of plane waves and the coefficients are obtained by variational minimization. The PDE to be solved is cast as a system of linear equations Ax=b, where the matrix A is not sparse, which prevents the straightforward application of standard iterative methods in order to solve it. This sparseness problem can be circumvented by means of a recursive bisection approach based on the fast Fourier transform, which makes it possible to implement fast versions of some stationary iterative methods (such as Gauss-Seidel) consuming O(NlogN) memory and executing an iteration in O(Nlog(2)N) time, N being the number of plane waves used. In a similar way, fast versions of Krylov subspace methods and multigrid methods can also be implemented. These procedures are tested on Poisson's equation expressed in adaptive coordinates. It is found that the best results are obtained with the GMRES method using a multigrid preconditioner with Gauss-Seidel relaxation steps.
Block-accelerated aggregation multigrid for Markov chains with application to PageRank problems
NASA Astrophysics Data System (ADS)
Shen, Zhao-Li; Huang, Ting-Zhu; Carpentieri, Bruno; Wen, Chun; Gu, Xian-Ming
2018-06-01
Recently, the adaptive algebraic aggregation multigrid method has been proposed for computing stationary distributions of Markov chains. This method updates aggregates on every iterative cycle to keep high accuracies of coarse-level corrections. Accordingly, its fast convergence rate is well guaranteed, but often a large proportion of time is cost by aggregation processes. In this paper, we show that the aggregates on each level in this method can be utilized to transfer the probability equation of that level into a block linear system. Then we propose a Block-Jacobi relaxation that deals with the block system on each level to smooth error. Some theoretical analysis of this technique is presented, meanwhile it is also adapted to solve PageRank problems. The purpose of this technique is to accelerate the adaptive aggregation multigrid method and its variants for solving Markov chains and PageRank problems. It also attempts to shed some light on new solutions for making aggregation processes more cost-effective for aggregation multigrid methods. Numerical experiments are presented to illustrate the effectiveness of this technique.
A fast mass spring model solver for high-resolution elastic objects
NASA Astrophysics Data System (ADS)
Zheng, Mianlun; Yuan, Zhiyong; Zhu, Weixu; Zhang, Guian
2017-03-01
Real-time simulation of elastic objects is of great importance for computer graphics and virtual reality applications. The fast mass spring model solver can achieve visually realistic simulation in an efficient way. Unfortunately, this method suffers from resolution limitations and lack of mechanical realism for a surface geometry model, which greatly restricts its application. To tackle these problems, in this paper we propose a fast mass spring model solver for high-resolution elastic objects. First, we project the complex surface geometry model into a set of uniform grid cells as cages through *cages mean value coordinate method to reflect its internal structure and mechanics properties. Then, we replace the original Cholesky decomposition method in the fast mass spring model solver with a conjugate gradient method, which can make the fast mass spring model solver more efficient for detailed surface geometry models. Finally, we propose a graphics processing unit accelerated parallel algorithm for the conjugate gradient method. Experimental results show that our method can realize efficient deformation simulation of 3D elastic objects with visual reality and physical fidelity, which has a great potential for applications in computer animation.
A new fast direct solver for the boundary element method
NASA Astrophysics Data System (ADS)
Huang, S.; Liu, Y. J.
2017-09-01
A new fast direct linear equation solver for the boundary element method (BEM) is presented in this paper. The idea of the new fast direct solver stems from the concept of the hierarchical off-diagonal low-rank matrix. The hierarchical off-diagonal low-rank matrix can be decomposed into the multiplication of several diagonal block matrices. The inverse of the hierarchical off-diagonal low-rank matrix can be calculated efficiently with the Sherman-Morrison-Woodbury formula. In this paper, a more general and efficient approach to approximate the coefficient matrix of the BEM with the hierarchical off-diagonal low-rank matrix is proposed. Compared to the current fast direct solver based on the hierarchical off-diagonal low-rank matrix, the proposed method is suitable for solving general 3-D boundary element models. Several numerical examples of 3-D potential problems with the total number of unknowns up to above 200,000 are presented. The results show that the new fast direct solver can be applied to solve large 3-D BEM models accurately and with better efficiency compared with the conventional BEM.
Preconditioned augmented Lagrangian formulation for nearly incompressible cardiac mechanics.
Campos, Joventino Oliveira; Dos Santos, Rodrigo Weber; Sundnes, Joakim; Rocha, Bernardo Martins
2018-04-01
Computational modeling of the heart is a subject of substantial medical and scientific interest, which may contribute to increase the understanding of several phenomena associated with cardiac physiological and pathological states. Modeling the mechanics of the heart have led to considerable insights, but it still represents a complex and a demanding computational problem, especially in a strongly coupled electromechanical setting. Passive cardiac tissue is commonly modeled as hyperelastic and is characterized by quasi-incompressible, orthotropic, and nonlinear material behavior. These factors are known to be very challenging for the numerical solution of the model. The near-incompressibility is known to cause numerical issues such as the well-known locking phenomenon and ill-conditioning of the stiffness matrix. In this work, the augmented Lagrangian method is used to handle the nearly incompressible condition. This approach can potentially improve computational performance by reducing the condition number of the stiffness matrix and thereby improving the convergence of iterative solvers. We also improve the performance of iterative solvers by the use of an algebraic multigrid preconditioner. Numerical results of the augmented Lagrangian method combined with a preconditioned iterative solver for a cardiac mechanics benchmark suite are presented to show its improved performance. Copyright © 2017 John Wiley & Sons, Ltd.
Parameter investigation with line-implicit lower-upper symmetric Gauss-Seidel on 3D stretched grids
NASA Astrophysics Data System (ADS)
Otero, Evelyn; Eliasson, Peter
2015-03-01
An implicit lower-upper symmetric Gauss-Seidel (LU-SGS) solver has been implemented as a multigrid smoother combined with a line-implicit method as an acceleration technique for Reynolds-averaged Navier-Stokes (RANS) simulation on stretched meshes. The computational fluid dynamics code concerned is Edge, an edge-based finite volume Navier-Stokes flow solver for structured and unstructured grids. The paper focuses on the investigation of the parameters related to our novel line-implicit LU-SGS solver for convergence acceleration on 3D RANS meshes. The LU-SGS parameters are defined as the Courant-Friedrichs-Lewy number, the left-hand side dissipation, and the convergence of iterative solution of the linear problem arising from the linearisation of the implicit scheme. The influence of these parameters on the overall convergence is presented and default values are defined for maximum convergence acceleration. The optimised settings are applied to 3D RANS computations for comparison with explicit and line-implicit Runge-Kutta smoothing. For most of the cases, a computing time acceleration of the order of 2 is found depending on the mesh type, namely the boundary layer and the magnitude of residual reduction.
NASA Technical Reports Server (NTRS)
Agrawal, Gagan; Sussman, Alan; Saltz, Joel
1993-01-01
Scientific and engineering applications often involve structured meshes. These meshes may be nested (for multigrid codes) and/or irregularly coupled (called multiblock or irregularly coupled regular mesh problems). A combined runtime and compile-time approach for parallelizing these applications on distributed memory parallel machines in an efficient and machine-independent fashion was described. A runtime library which can be used to port these applications on distributed memory machines was designed and implemented. The library is currently implemented on several different systems. To further ease the task of application programmers, methods were developed for integrating this runtime library with compilers for HPK-like parallel programming languages. How this runtime library was integrated with the Fortran 90D compiler being developed at Syracuse University is discussed. Experimental results to demonstrate the efficacy of our approach are presented. A multiblock Navier-Stokes solver template and a multigrid code were experimented with. Our experimental results show that our primitives have low runtime communication overheads. Further, the compiler parallelized codes perform within 20 percent of the code parallelized by manually inserting calls to the runtime library.
High Resolution Aerospace Applications using the NASA Columbia Supercomputer
NASA Technical Reports Server (NTRS)
Mavriplis, Dimitri J.; Aftosmis, Michael J.; Berger, Marsha
2005-01-01
This paper focuses on the parallel performance of two high-performance aerodynamic simulation packages on the newly installed NASA Columbia supercomputer. These packages include both a high-fidelity, unstructured, Reynolds-averaged Navier-Stokes solver, and a fully-automated inviscid flow package for cut-cell Cartesian grids. The complementary combination of these two simulation codes enables high-fidelity characterization of aerospace vehicle design performance over the entire flight envelope through extensive parametric analysis and detailed simulation of critical regions of the flight envelope. Both packages. are industrial-level codes designed for complex geometry and incorpor.ats. CuStomized multigrid solution algorithms. The performance of these codes on Columbia is examined using both MPI and OpenMP and using both the NUMAlink and InfiniBand interconnect fabrics. Numerical results demonstrate good scalability on up to 2016 CPUs using the NUMAIink4 interconnect, with measured computational rates in the vicinity of 3 TFLOP/s, while InfiniBand showed some performance degradation at high CPU counts, particularly with multigrid. Nonetheless, the results are encouraging enough to indicate that larger test cases using combined MPI/OpenMP communication should scale well on even more processors.
Development of an explicit multiblock/multigrid flow solver for viscous flows in complex geometries
NASA Technical Reports Server (NTRS)
Steinthorsson, E.; Liou, M. S.; Povinelli, L. A.
1993-01-01
A new computer program is being developed for doing accurate simulations of compressible viscous flows in complex geometries. The code employs the full compressible Navier-Stokes equations. The eddy viscosity model of Baldwin and Lomax is used to model the effects of turbulence on the flow. A cell centered finite volume discretization is used for all terms in the governing equations. The Advection Upwind Splitting Method (AUSM) is used to compute the inviscid fluxes, while central differencing is used for the diffusive fluxes. A four-stage Runge-Kutta time integration scheme is used to march solutions to steady state, while convergence is enhanced by a multigrid scheme, local time-stepping, and implicit residual smoothing. To enable simulations of flows in complex geometries, the code uses composite structured grid systems where all grid lines are continuous at block boundaries (multiblock grids). Example results shown are a flow in a linear cascade, a flow around a circular pin extending between the main walls in a high aspect-ratio channel, and a flow of air in a radial turbine coolant passage.
Development of an explicit multiblock/multigrid flow solver for viscous flows in complex geometries
NASA Technical Reports Server (NTRS)
Steinthorsson, E.; Liou, M.-S.; Povinelli, L. A.
1993-01-01
A new computer program is being developed for doing accurate simulations of compressible viscous flows in complex geometries. The code employs the full compressible Navier-Stokes equations. The eddy viscosity model of Baldwin and Lomax is used to model the effects of turbulence on the flow. A cell centered finite volume discretization is used for all terms in the governing equations. The Advection Upwind Splitting Method (AUSM) is used to compute the inviscid fluxes, while central differencing is used for the diffusive fluxes. A four-stage Runge-Kutta time integration scheme is used to march solutions to steady state, while convergence is enhanced by a multigrid scheme, local time-stepping and implicit residual smoothing. To enable simulations of flows in complex geometries, the code uses composite structured grid systems where all grid lines are continuous at block boundaries (multiblock grids). Example results are shown a flow in a linear cascade, a flow around a circular pin extending between the main walls in a high aspect-ratio channel, and a flow of air in a radial turbine coolant passage.
Distributed Relaxation for Conservative Discretizations
NASA Technical Reports Server (NTRS)
Diskin, Boris; Thomas, James L.
2001-01-01
A multigrid method is defined as having textbook multigrid efficiency (TME) if the solutions to the governing system of equations are attained in a computational work that is a small (less than 10) multiple of the operation count in one target-grid residual evaluation. The way to achieve this efficiency is the distributed relaxation approach. TME solvers employing distributed relaxation have already been demonstrated for nonconservative formulations of high-Reynolds-number viscous incompressible and subsonic compressible flow regimes. The purpose of this paper is to provide foundations for applications of distributed relaxation to conservative discretizations. A direct correspondence between the primitive variable interpolations for calculating fluxes in conservative finite-volume discretizations and stencils of the discretized derivatives in the nonconservative formulation has been established. Based on this correspondence, one can arrive at a conservative discretization which is very efficiently solved with a nonconservative relaxation scheme and this is demonstrated for conservative discretization of the quasi one-dimensional Euler equations. Formulations for both staggered and collocated grid arrangements are considered and extensions of the general procedure to multiple dimensions are discussed.
NASA Technical Reports Server (NTRS)
Aftosmis, M. J.; Berger, M. J.; Murman, S. M.; Kwak, Dochan (Technical Monitor)
2002-01-01
The proposed paper will present recent extensions in the development of an efficient Euler solver for adaptively-refined Cartesian meshes with embedded boundaries. The paper will focus on extensions of the basic method to include solution adaptation, time-dependent flow simulation, and arbitrary rigid domain motion. The parallel multilevel method makes use of on-the-fly parallel domain decomposition to achieve extremely good scalability on large numbers of processors, and is coupled with an automatic coarse mesh generation algorithm for efficient processing by a multigrid smoother. Numerical results are presented demonstrating parallel speed-ups of up to 435 on 512 processors. Solution-based adaptation may be keyed off truncation error estimates using tau-extrapolation or a variety of feature detection based refinement parameters. The multigrid method is extended to for time-dependent flows through the use of a dual-time approach. The extension to rigid domain motion uses an Arbitrary Lagrangian-Eulerlarian (ALE) formulation, and results will be presented for a variety of two- and three-dimensional example problems with both simple and complex geometry.
A fast Poisson solver for unsteady incompressible Navier-Stokes equations on the half-staggered grid
NASA Technical Reports Server (NTRS)
Golub, G. H.; Huang, L. C.; Simon, H.; Tang, W. -P.
1995-01-01
In this paper, a fast Poisson solver for unsteady, incompressible Navier-Stokes equations with finite difference methods on the non-uniform, half-staggered grid is presented. To achieve this, new algorithms for diagonalizing a semi-definite pair are developed. Our fast solver can also be extended to the three dimensional case. The motivation and related issues in using this second kind of staggered grid are also discussed. Numerical testing has indicated the effectiveness of this algorithm.
Performance Models for the Spike Banded Linear System Solver
Manguoglu, Murat; Saied, Faisal; Sameh, Ahmed; ...
2011-01-01
With availability of large-scale parallel platforms comprised of tens-of-thousands of processors and beyond, there is significant impetus for the development of scalable parallel sparse linear system solvers and preconditioners. An integral part of this design process is the development of performance models capable of predicting performance and providing accurate cost models for the solvers and preconditioners. There has been some work in the past on characterizing performance of the iterative solvers themselves. In this paper, we investigate the problem of characterizing performance and scalability of banded preconditioners. Recent work has demonstrated the superior convergence properties and robustness of banded preconditioners,more » compared to state-of-the-art ILU family of preconditioners as well as algebraic multigrid preconditioners. Furthermore, when used in conjunction with efficient banded solvers, banded preconditioners are capable of significantly faster time-to-solution. Our banded solver, the Truncated Spike algorithm is specifically designed for parallel performance and tolerance to deep memory hierarchies. Its regular structure is also highly amenable to accurate performance characterization. Using these characteristics, we derive the following results in this paper: (i) we develop parallel formulations of the Truncated Spike solver, (ii) we develop a highly accurate pseudo-analytical parallel performance model for our solver, (iii) we show excellent predication capabilities of our model – based on which we argue the high scalability of our solver. Our pseudo-analytical performance model is based on analytical performance characterization of each phase of our solver. These analytical models are then parameterized using actual runtime information on target platforms. An important consequence of our performance models is that they reveal underlying performance bottlenecks in both serial and parallel formulations. All of our results are validated on diverse heterogeneous multiclusters – platforms for which performance prediction is particularly challenging. Finally, we provide predict the scalability of the Spike algorithm using up to 65,536 cores with our model. In this paper we extend the results presented in the Ninth International Symposium on Parallel and Distributed Computing.« less
Mathematical and Numerical Aspects of the Adaptive Fast Multipole Poisson-Boltzmann Solver
Zhang, Bo; Lu, Benzhuo; Cheng, Xiaolin; ...
2013-01-01
This paper summarizes the mathematical and numerical theories and computational elements of the adaptive fast multipole Poisson-Boltzmann (AFMPB) solver. We introduce and discuss the following components in order: the Poisson-Boltzmann model, boundary integral equation reformulation, surface mesh generation, the nodepatch discretization approach, Krylov iterative methods, the new version of fast multipole methods (FMMs), and a dynamic prioritization technique for scheduling parallel operations. For each component, we also remark on feasible approaches for further improvements in efficiency, accuracy and applicability of the AFMPB solver to large-scale long-time molecular dynamics simulations. Lastly, the potential of the solver is demonstrated with preliminary numericalmore » results.« less
A Spectral Element Discretisation on Unstructured Triangle / Tetrahedral Meshes for Elastodynamics
NASA Astrophysics Data System (ADS)
May, Dave A.; Gabriel, Alice-A.
2017-04-01
The spectral element method (SEM) defined over quadrilateral and hexahedral element geometries has proven to be a fast, accurate and scalable approach to study wave propagation phenomena. In the context of regional scale seismology and or simulations incorporating finite earthquake sources, the geometric restrictions associated with hexahedral elements can limit the applicability of the classical quad./hex. SEM. Here we describe a continuous Galerkin spectral element discretisation defined over unstructured meshes composed of triangles (2D), or tetrahedra (3D). The method uses a stable, nodal basis constructed from PKD polynomials and thus retains the spectral accuracy and low dispersive properties of the classical SEM, in addition to the geometric versatility provided by unstructured simplex meshes. For the particular basis and quadrature rule we have adopted, the discretisation results in a mass matrix which is not diagonal, thereby mandating linear solvers be utilised. To that end, we have developed efficient solvers and preconditioners which are robust with respect to the polynomial order (p), and possess high arithmetic intensity. Furthermore, we also consider using implicit time integrators, together with a p-multigrid preconditioner to circumvent the CFL condition. Implicit time integrators become particularly relevant when considering solving problems on poor quality meshes, or meshes containing elements with a widely varying range of length scales - both of which frequently arise when meshing non-trivial geometries. We demonstrate the applicability of the new method by examining a number of two- and three-dimensional wave propagation scenarios. These scenarios serve to characterise the accuracy and cost of the new method. Lastly, we will assess the potential benefits of using implicit time integrators for regional scale wave propagation simulations.
NASA Technical Reports Server (NTRS)
Taylor, C. (Editor); Chin, J. H. (Editor); Homsy, G. M. (Editor)
1991-01-01
Consideration is given to the impulse response of a laminar boundary layer and receptivity; numerical transition to turbulence in plane Poiseuille flow; large eddy simulation of turbulent wake flow; a viscous model and loss calculation of a multisplitter cascade; vortex initiation during dynamic stall of an airfoil; a numerical analysis of isothermal flow in a combustion chamber; and compressible flow calculations with a two-equation turbulence model and unstructured grids. Attention is also given to a 2D calculation of a buoyant flow around a burning sphere, a fast multigrid method for 3D turbulent incompressible flows, a streaming flow induced by an oscillating cascade of circular cylinders, an algebraic multigrid scheme for solving the Navier-Stokes equations on unstructured meshes; and nonlinear coupled multigrid solutions to thermal problems employing different nodal grid arrangements and convective transport approximations.
Multigrid preconditioned conjugate-gradient method for large-scale wave-front reconstruction.
Gilles, Luc; Vogel, Curtis R; Ellerbroek, Brent L
2002-09-01
We introduce a multigrid preconditioned conjugate-gradient (MGCG) iterative scheme for computing open-loop wave-front reconstructors for extreme adaptive optics systems. We present numerical simulations for a 17-m class telescope with n = 48756 sensor measurement grid points within the aperture, which indicate that our MGCG method has a rapid convergence rate for a wide range of subaperture average slope measurement signal-to-noise ratios. The total computational cost is of order n log n. Hence our scheme provides for fast wave-front simulation and control in large-scale adaptive optics systems.
Large Scale, High Resolution, Mantle Dynamics Modeling
NASA Astrophysics Data System (ADS)
Geenen, T.; Berg, A. V.; Spakman, W.
2007-12-01
To model the geodynamic evolution of plate convergence, subduction and collision and to allow for a connection to various types of observational data, geophysical, geodetical and geological, we developed a 4D (space-time) numerical mantle convection code. The model is based on a spherical 3D Eulerian fem model, with quadratic elements, on top of which we constructed a 3D Lagrangian particle in cell(PIC) method. We use the PIC method to transport material properties and to incorporate a viscoelastic rheology. Since capturing small scale processes associated with localization phenomena require a high resolution, we spend a considerable effort on implementing solvers suitable to solve for models with over 100 million degrees of freedom. We implemented Additive Schwartz type ILU based methods in combination with a Krylov solver, GMRES. However we found that for problems with over 500 thousend degrees of freedom the convergence of the solver degraded severely. This observation is known from the literature [Saad, 2003] and results from the local character of the ILU preconditioner resulting in a poor approximation of the inverse of A for large A. The size of A for which ILU is no longer usable depends on the condition of A and on the amount of fill in allowed for the ILU preconditioner. We found that for our problems with over 5×105 degrees of freedom convergence became to slow to solve the system within an acceptable amount of walltime, one minute, even when allowing for considerable amount of fill in. We also implemented MUMPS and found good scaling results for problems up to 107 degrees of freedom for up to 32 CPU¡¯s. For problems with over 100 million degrees of freedom we implemented Algebraic Multigrid type methods (AMG) from the ML library [Sala, 2006]. Since multigrid methods are most effective for single parameter problems, we rebuild our model to use the SIMPLE method in the Stokes solver [Patankar, 1980]. We present scaling results from these solvers for 3D spherical models. We also applied the above mentioned method to a high resolution (~ 1 km) 2D mantle convection model with temperature, pressure and phase dependent rheology including several phase transitions. We focus on a model of a subducting lithospheric slab which is subject to strong folding at the bottom of the mantle's D" region which includes the postperovskite phase boundary. For a detailed description of this model we refer to poster [Mantel convection models of the D" region, U17] [Saad, 2003] Saad, Y. (2003). Iterative methods for sparse linear systems. [Sala, 2006] Sala. M (2006) An Object-Oriented Framework for the Development of Scalable Parallel Multilevel Preconditioners. ACM Transactions on Mathematical Software, 32 (3), 2006 [Patankar, 1980] Patankar, S. V.(1980) Numerical Heat Transfer and Fluid Flow, Hemisphere, Washington.
NASA Technical Reports Server (NTRS)
Jothiprasad, Giridhar; Mavriplis, Dimitri J.; Caughey, David A.; Bushnell, Dennis M. (Technical Monitor)
2002-01-01
The efficiency gains obtained using higher-order implicit Runge-Kutta schemes as compared with the second-order accurate backward difference schemes for the unsteady Navier-Stokes equations are investigated. Three different algorithms for solving the nonlinear system of equations arising at each timestep are presented. The first algorithm (NMG) is a pseudo-time-stepping scheme which employs a non-linear full approximation storage (FAS) agglomeration multigrid method to accelerate convergence. The other two algorithms are based on Inexact Newton's methods. The linear system arising at each Newton step is solved using iterative/Krylov techniques and left preconditioning is used to accelerate convergence of the linear solvers. One of the methods (LMG) uses Richardson's iterative scheme for solving the linear system at each Newton step while the other (PGMRES) uses the Generalized Minimal Residual method. Results demonstrating the relative superiority of these Newton's methods based schemes are presented. Efficiency gains as high as 10 are obtained by combining the higher-order time integration schemes with the more efficient nonlinear solvers.
The design and implementation of a parallel unstructured Euler solver using software primitives
NASA Technical Reports Server (NTRS)
Das, R.; Mavriplis, D. J.; Saltz, J.; Gupta, S.; Ponnusamy, R.
1992-01-01
This paper is concerned with the implementation of a three-dimensional unstructured grid Euler-solver on massively parallel distributed-memory computer architectures. The goal is to minimize solution time by achieving high computational rates with a numerically efficient algorithm. An unstructured multigrid algorithm with an edge-based data structure has been adopted, and a number of optimizations have been devised and implemented in order to accelerate the parallel communication rates. The implementation is carried out by creating a set of software tools, which provide an interface between the parallelization issues and the sequential code, while providing a basis for future automatic run-time compilation support. Large practical unstructured grid problems are solved on the Intel iPSC/860 hypercube and Intel Touchstone Delta machine. The quantitative effect of the various optimizations are demonstrated, and we show that the combined effect of these optimizations leads to roughly a factor of three performance improvement. The overall solution efficiency is compared with that obtained on the CRAY-YMP vector supercomputer.
CPIC: a curvilinear Particle-In-Cell code for plasma-material interaction studies
NASA Astrophysics Data System (ADS)
Delzanno, G.; Camporeale, E.; Moulton, J. D.; Borovsky, J. E.; MacDonald, E.; Thomsen, M. F.
2012-12-01
We present a recently developed Particle-In-Cell (PIC) code in curvilinear geometry called CPIC (Curvilinear PIC) [1], where the standard PIC algorithm is coupled with a grid generation/adaptation strategy. Through the grid generator, which maps the physical domain to a logical domain where the grid is uniform and Cartesian, the code can simulate domains of arbitrary complexity, including the interaction of complex objects with a plasma. At present the code is electrostatic. Poisson's equation (in logical space) can be solved with either an iterative method based on the Conjugate Gradient (CG) or the Generalized Minimal Residual (GMRES) coupled with a multigrid solver used as a preconditioner, or directly with multigrid. The multigrid strategy is critical for the solver to perform optimally or nearly optimally as the dimension of the problem increases. CPIC also features a hybrid particle mover, where the computational particles are characterized by position in logical space and velocity in physical space. The advantage of a hybrid mover, as opposed to more conventional movers that move particles directly in the physical space, is that the interpolation of the particles in logical space is straightforward and computationally inexpensive, since one does not have to track the position of the particle. We will present our latest progress on the development of the code and document the code performance on standard plasma-physics tests. Then we will present the (preliminary) application of the code to a basic dynamic-charging problem, namely the charging and shielding of a spherical spacecraft in a magnetized plasma for various level of magnetization and including the pulsed emission of an electron beam from the spacecraft. The dynamical evolution of the sheath and the time-dependent current collection will be described. This study is in support of the ConnEx mission concept to use an electron beam from a magnetospheric spacecraft to trace magnetic field lines from the magnetosphere to the ionosphere [2]. [1] G.L. Delzanno, E. Camporeale, "CPIC: a new Particle-in-Cell code for plasma-material interaction studies", in preparation (2012). [2] J.E. Borovsky, D.J. McComas, M.F. Thomsen, J.L. Burch, J. Cravens, C.J. Pollock, T.E. Moore, and S.B. Mende, "Magnetosphere-Ionosphere Observatory (MIO): A multisatellite mission designed to solve the problem of what generates auroral arcs," Eos. Trans. Amer. Geophys. Union 79 (45), F744 (2000).
A stopping criterion for the iterative solution of partial differential equations
NASA Astrophysics Data System (ADS)
Rao, Kaustubh; Malan, Paul; Perot, J. Blair
2018-01-01
A stopping criterion for iterative solution methods is presented that accurately estimates the solution error using low computational overhead. The proposed criterion uses information from prior solution changes to estimate the error. When the solution changes are noisy or stagnating it reverts to a less accurate but more robust, low-cost singular value estimate to approximate the error given the residual. This estimator can also be applied to iterative linear matrix solvers such as Krylov subspace or multigrid methods. Examples of the stopping criterion's ability to accurately estimate the non-linear and linear solution error are provided for a number of different test cases in incompressible fluid dynamics.
MUSIC: MUlti-Scale Initial Conditions
NASA Astrophysics Data System (ADS)
Hahn, Oliver; Abel, Tom
2013-11-01
MUSIC generates multi-scale initial conditions with multiple levels of refinements for cosmological ‘zoom-in’ simulations. The code uses an adaptive convolution of Gaussian white noise with a real-space transfer function kernel together with an adaptive multi-grid Poisson solver to generate displacements and velocities following first- (1LPT) or second-order Lagrangian perturbation theory (2LPT). MUSIC achieves rms relative errors of the order of 10-4 for displacements and velocities in the refinement region and thus improves in terms of errors by about two orders of magnitude over previous approaches. In addition, errors are localized at coarse-fine boundaries and do not suffer from Fourier space-induced interference ringing.
Efficient solution of the simplified P N equations
Hamilton, Steven P.; Evans, Thomas M.
2014-12-23
We show new solver strategies for the multigroup SPN equations for nuclear reactor analysis. By forming the complete matrix over space, moments, and energy a robust set of solution strategies may be applied. Moreover, power iteration, shifted power iteration, Rayleigh quotient iteration, Arnoldi's method, and a generalized Davidson method, each using algebraic and physics-based multigrid preconditioners, have been compared on C5G7 MOX test problem as well as an operational PWR model. These results show that the most ecient approach is the generalized Davidson method, that is 30-40 times faster than traditional power iteration and 6-10 times faster than Arnoldi's method.
A fast direct solver for boundary value problems on locally perturbed geometries
NASA Astrophysics Data System (ADS)
Zhang, Yabin; Gillman, Adrianna
2018-03-01
Many applications including optimal design and adaptive discretization techniques involve solving several boundary value problems on geometries that are local perturbations of an original geometry. This manuscript presents a fast direct solver for boundary value problems that are recast as boundary integral equations. The idea is to write the discretized boundary integral equation on a new geometry as a low rank update to the discretized problem on the original geometry. Using the Sherman-Morrison formula, the inverse can be expressed in terms of the inverse of the original system applied to the low rank factors and the right hand side. Numerical results illustrate for problems where perturbation is localized the fast direct solver is three times faster than building a new solver from scratch.
Multiphase three-dimensional direct numerical simulation of a rotating impeller with code Blue
NASA Astrophysics Data System (ADS)
Kahouadji, Lyes; Shin, Seungwon; Chergui, Jalel; Juric, Damir; Craster, Richard V.; Matar, Omar K.
2017-11-01
The flow driven by a rotating impeller inside an open fixed cylindrical cavity is simulated using code Blue, a solver for massively-parallel simulations of fully three-dimensional multiphase flows. The impeller is composed of four blades at a 45° inclination all attached to a central hub and tube stem. In Blue, solid forms are constructed through the definition of immersed objects via a distance function that accounts for the object's interaction with the flow for both single and two-phase flows. We use a moving frame technique for imposing translation and/or rotation. The variation of the Reynolds number, the clearance, and the tank aspect ratio are considered, and we highlight the importance of the confinement ratio (blade radius versus the tank radius) in the mixing process. Blue uses a domain decomposition strategy for parallelization with MPI. The fluid interface solver is based on a parallel implementation of a hybrid front-tracking/level-set method designed complex interfacial topological changes. Parallel GMRES and multigrid iterative solvers are applied to the linear systems arising from the implicit solution for the fluid velocities and pressure in the presence of strong density and viscosity discontinuities across fluid phases. EPSRC, UK, MEMPHIS program Grant (EP/K003976/1), RAEng Research Chair (OKM).
A High-Order Direct Solver for Helmholtz Equations with Neumann Boundary Conditions
NASA Technical Reports Server (NTRS)
Sun, Xian-He; Zhuang, Yu
1997-01-01
In this study, a compact finite-difference discretization is first developed for Helmholtz equations on rectangular domains. Special treatments are then introduced for Neumann and Neumann-Dirichlet boundary conditions to achieve accuracy and separability. Finally, a Fast Fourier Transform (FFT) based technique is used to yield a fast direct solver. Analytical and experimental results show this newly proposed solver is comparable to the conventional second-order elliptic solver when accuracy is not a primary concern, and is significantly faster than that of the conventional solver if a highly accurate solution is required. In addition, this newly proposed fourth order Helmholtz solver is parallel in nature. It is readily available for parallel and distributed computers. The compact scheme introduced in this study is likely extendible for sixth-order accurate algorithms and for more general elliptic equations.
Fully Implicit, Nonlinear 3D Extended Magnetohydrodynamics
NASA Astrophysics Data System (ADS)
Chacon, Luis; Knoll, Dana
2003-10-01
Extended magnetohydrodynamics (XMHD) includes nonideal effects such as nonlinear, anisotropic transport and two-fluid (Hall) effects. XMHD supports multiple, separate time scales that make explicit time differencing approaches extremely inefficient. While a fully implicit implementation promises efficiency without sacrificing numerical accuracy,(D. A. Knoll et al., phJ. Comput. Phys.) 185 (2), 583-611 (2003) the nonlinear nature of the XMHD system and the numerical stiffness associated with the fast waves make this endeavor difficult. Newton-Krylov methods are, however, ideally suited for such a task. These synergistically combine Newton's method for nonlinear convergence, and Krylov techniques to solve the associated Jacobian (linear) systems. Krylov methods can be implemented Jacobian-free and can be preconditioned for efficiency. Successful preconditioning strategies have been developed for 2D incompressible resistive(L. Chacón et al., phJ. Comput. Phys). 178 (1), 15- 36 (2002) and Hall(L. Chacón and D. A. Knoll, phJ. Comput. Phys.), 188 (2), 573-592 (2003) MHD models. These are based on ``physics-based'' ideas, in which knowledge of the physics is exploited to derive well-conditioned (diagonally-dominant) approximations to the original system that are amenable to optimal solver technologies (multigrid). In this work, we will describe the status of the extension of the 2D preconditioning ideas for a 3D compressible, single-fluid XMHD model.
Basu, Protonu; Williams, Samuel; Van Straalen, Brian; ...
2017-04-05
GPUs, with their high bandwidths and computational capabilities are an increasingly popular target for scientific computing. Unfortunately, to date, harnessing the power of the GPU has required use of a GPU-specific programming model like CUDA, OpenCL, or OpenACC. Thus, in order to deliver portability across CPU-based and GPU-accelerated supercomputers, programmers are forced to write and maintain two versions of their applications or frameworks. In this paper, we explore the use of a compiler-based autotuning framework based on CUDA-CHiLL to deliver not only portability, but also performance portability across CPU- and GPU-accelerated platforms for the geometric multigrid linear solvers found inmore » many scientific applications. We also show that with autotuning we can attain near Roofline (a performance bound for a computation and target architecture) performance across the key operations in the miniGMG benchmark for both CPU- and GPU-based architectures as well as for a multiple stencil discretizations and smoothers. We show that our technology is readily interoperable with MPI resulting in performance at scale equal to that obtained via hand-optimized MPI+CUDA implementation.« less
NASA Astrophysics Data System (ADS)
Cavaglieri, Daniele; Bewley, Thomas; Mashayek, Ali
2015-11-01
We present a new code, Diablo 2.0, for the simulation of the incompressible NSE in channel and duct flows with strong grid stretching near walls. The code leverages the fractional step approach with a few twists. New low-storage IMEX (implicit-explicit) Runge-Kutta time-marching schemes are tested which are superior to the traditional and widely-used CN/RKW3 (Crank-Nicolson/Runge-Kutta-Wray) approach; the new schemes tested are L-stable in their implicit component, and offer improved overall order of accuracy and stability with, remarkably, similar computational cost and storage requirements. For duct flow simulations, our new code also introduces a new smoother for the multigrid solver for the pressure Poisson equation. The classic approach, involving alternating-direction zebra relaxation, is replaced by a new scheme, dubbed tweed relaxation, which achieves the same convergence rate with roughly half the computational cost. The code is then tested on the simulation of a shear flow instability in a duct, a classic problem in fluid mechanics which has been the object of extensive numerical modelling for its role as a canonical pathway to energetic turbulence in several fields of science and engineering.
Performance of fully-coupled algebraic multigrid preconditioners for large-scale VMS resistive MHD
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lin, P. T.; Shadid, J. N.; Hu, J. J.
Here, we explore the current performance and scaling of a fully-implicit stabilized unstructured finite element (FE) variational multiscale (VMS) capability for large-scale simulations of 3D incompressible resistive magnetohydrodynamics (MHD). The large-scale linear systems that are generated by a Newton nonlinear solver approach are iteratively solved by preconditioned Krylov subspace methods. The efficiency of this approach is critically dependent on the scalability and performance of the algebraic multigrid preconditioner. Our study considers the performance of the numerical methods as recently implemented in the second-generation Trilinos implementation that is 64-bit compliant and is not limited by the 32-bit global identifiers of themore » original Epetra-based Trilinos. The study presents representative results for a Poisson problem on 1.6 million cores of an IBM Blue Gene/Q platform to demonstrate very large-scale parallel execution. Additionally, results for a more challenging steady-state MHD generator and a transient solution of a benchmark MHD turbulence calculation for the full resistive MHD system are also presented. These results are obtained on up to 131,000 cores of a Cray XC40 and one million cores of a BG/Q system.« less
Performance of fully-coupled algebraic multigrid preconditioners for large-scale VMS resistive MHD
Lin, P. T.; Shadid, J. N.; Hu, J. J.; ...
2017-11-06
Here, we explore the current performance and scaling of a fully-implicit stabilized unstructured finite element (FE) variational multiscale (VMS) capability for large-scale simulations of 3D incompressible resistive magnetohydrodynamics (MHD). The large-scale linear systems that are generated by a Newton nonlinear solver approach are iteratively solved by preconditioned Krylov subspace methods. The efficiency of this approach is critically dependent on the scalability and performance of the algebraic multigrid preconditioner. Our study considers the performance of the numerical methods as recently implemented in the second-generation Trilinos implementation that is 64-bit compliant and is not limited by the 32-bit global identifiers of themore » original Epetra-based Trilinos. The study presents representative results for a Poisson problem on 1.6 million cores of an IBM Blue Gene/Q platform to demonstrate very large-scale parallel execution. Additionally, results for a more challenging steady-state MHD generator and a transient solution of a benchmark MHD turbulence calculation for the full resistive MHD system are also presented. These results are obtained on up to 131,000 cores of a Cray XC40 and one million cores of a BG/Q system.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Basu, Protonu; Williams, Samuel; Van Straalen, Brian
GPUs, with their high bandwidths and computational capabilities are an increasingly popular target for scientific computing. Unfortunately, to date, harnessing the power of the GPU has required use of a GPU-specific programming model like CUDA, OpenCL, or OpenACC. Thus, in order to deliver portability across CPU-based and GPU-accelerated supercomputers, programmers are forced to write and maintain two versions of their applications or frameworks. In this paper, we explore the use of a compiler-based autotuning framework based on CUDA-CHiLL to deliver not only portability, but also performance portability across CPU- and GPU-accelerated platforms for the geometric multigrid linear solvers found inmore » many scientific applications. We also show that with autotuning we can attain near Roofline (a performance bound for a computation and target architecture) performance across the key operations in the miniGMG benchmark for both CPU- and GPU-based architectures as well as for a multiple stencil discretizations and smoothers. We show that our technology is readily interoperable with MPI resulting in performance at scale equal to that obtained via hand-optimized MPI+CUDA implementation.« less
NASA Astrophysics Data System (ADS)
Furuichi, Mikito; Nishiura, Daisuke
2017-10-01
We developed dynamic load-balancing algorithms for Particle Simulation Methods (PSM) involving short-range interactions, such as Smoothed Particle Hydrodynamics (SPH), Moving Particle Semi-implicit method (MPS), and Discrete Element method (DEM). These are needed to handle billions of particles modeled in large distributed-memory computer systems. Our method utilizes flexible orthogonal domain decomposition, allowing the sub-domain boundaries in the column to be different for each row. The imbalances in the execution time between parallel logical processes are treated as a nonlinear residual. Load-balancing is achieved by minimizing the residual within the framework of an iterative nonlinear solver, combined with a multigrid technique in the local smoother. Our iterative method is suitable for adjusting the sub-domain frequently by monitoring the performance of each computational process because it is computationally cheaper in terms of communication and memory costs than non-iterative methods. Numerical tests demonstrated the ability of our approach to handle workload imbalances arising from a non-uniform particle distribution, differences in particle types, or heterogeneous computer architecture which was difficult with previously proposed methods. We analyzed the parallel efficiency and scalability of our method using Earth simulator and K-computer supercomputer systems.
AQUASOL: An efficient solver for the dipolar Poisson–Boltzmann–Langevin equation
Koehl, Patrice; Delarue, Marc
2010-01-01
The Poisson–Boltzmann (PB) formalism is among the most popular approaches to modeling the solvation of molecules. It assumes a continuum model for water, leading to a dielectric permittivity that only depends on position in space. In contrast, the dipolar Poisson–Boltzmann–Langevin (DPBL) formalism represents the solvent as a collection of orientable dipoles with nonuniform concentration; this leads to a nonlinear permittivity function that depends both on the position and on the local electric field at that position. The differences in the assumptions underlying these two models lead to significant differences in the equations they generate. The PB equation is a second order, elliptic, nonlinear partial differential equation (PDE). Its response coefficients correspond to the dielectric permittivity and are therefore constant within each subdomain of the system considered (i.e., inside and outside of the molecules considered). While the DPBL equation is also a second order, elliptic, nonlinear PDE, its response coefficients are nonlinear functions of the electrostatic potential. Many solvers have been developed for the PB equation; to our knowledge, none of these can be directly applied to the DPBL equation. The methods they use may adapt to the difference; their implementations however are PBE specific. We adapted the PBE solver originally developed by Holst and Saied [J. Comput. Chem. 16, 337 (1995)] to the problem of solving the DPBL equation. This solver uses a truncated Newton method with a multigrid preconditioner. Numerical evidences suggest that it converges for the DPBL equation and that the convergence is superlinear. It is found however to be slow and greedy in memory requirement for problems commonly encountered in computational biology and computational chemistry. To circumvent these problems, we propose two variants, a quasi-Newton solver based on a simplified, inexact Jacobian and an iterative self-consistent solver that is based directly on the PBE solver. While both methods are not guaranteed to converge, numerical evidences suggest that they do and that their convergence is also superlinear. Both variants are significantly faster than the solver based on the exact Jacobian, with a much smaller memory footprint. All three methods have been implemented in a new code named AQUASOL, which is freely available. PMID:20151727
AQUASOL: An efficient solver for the dipolar Poisson-Boltzmann-Langevin equation.
Koehl, Patrice; Delarue, Marc
2010-02-14
The Poisson-Boltzmann (PB) formalism is among the most popular approaches to modeling the solvation of molecules. It assumes a continuum model for water, leading to a dielectric permittivity that only depends on position in space. In contrast, the dipolar Poisson-Boltzmann-Langevin (DPBL) formalism represents the solvent as a collection of orientable dipoles with nonuniform concentration; this leads to a nonlinear permittivity function that depends both on the position and on the local electric field at that position. The differences in the assumptions underlying these two models lead to significant differences in the equations they generate. The PB equation is a second order, elliptic, nonlinear partial differential equation (PDE). Its response coefficients correspond to the dielectric permittivity and are therefore constant within each subdomain of the system considered (i.e., inside and outside of the molecules considered). While the DPBL equation is also a second order, elliptic, nonlinear PDE, its response coefficients are nonlinear functions of the electrostatic potential. Many solvers have been developed for the PB equation; to our knowledge, none of these can be directly applied to the DPBL equation. The methods they use may adapt to the difference; their implementations however are PBE specific. We adapted the PBE solver originally developed by Holst and Saied [J. Comput. Chem. 16, 337 (1995)] to the problem of solving the DPBL equation. This solver uses a truncated Newton method with a multigrid preconditioner. Numerical evidences suggest that it converges for the DPBL equation and that the convergence is superlinear. It is found however to be slow and greedy in memory requirement for problems commonly encountered in computational biology and computational chemistry. To circumvent these problems, we propose two variants, a quasi-Newton solver based on a simplified, inexact Jacobian and an iterative self-consistent solver that is based directly on the PBE solver. While both methods are not guaranteed to converge, numerical evidences suggest that they do and that their convergence is also superlinear. Both variants are significantly faster than the solver based on the exact Jacobian, with a much smaller memory footprint. All three methods have been implemented in a new code named AQUASOL, which is freely available.
Fast Euler solver for transonic airfoils. I - Theory. II - Applications
NASA Technical Reports Server (NTRS)
Dadone, Andrea; Moretti, Gino
1988-01-01
Equations written in terms of generalized Riemann variables are presently integrated by inverting six bidiagonal matrices and two tridiagonal matrices, using an implicit Euler solver that is based on the lambda-formulation. The solution is found on a C-grid whose boundaries are very close to the airfoil. The fast solver is then applied to the computation of several flowfields on a NACA 0012 airfoil at various Mach number and alpha values, yielding results that are primarily concerned with transonic flows. The effects of grid fineness and boundary distances are analyzed; the code is found to be robust and accurate, as well as fast.
NASA Technical Reports Server (NTRS)
Jameson, A.
1975-01-01
The use of a fast elliptic solver in combination with relaxation is presented as an effective way to accelerate the convergence of transonic flow calculations, particularly when a marching scheme can be used to treat the supersonic zone in the relaxation process.
3D inversion based on multi-grid approach of magnetotelluric data from Northern Scandinavia
NASA Astrophysics Data System (ADS)
Cherevatova, M.; Smirnov, M.; Korja, T. J.; Egbert, G. D.
2012-12-01
In this work we investigate the geoelectrical structure of the cratonic margin of Fennoscandian Shield by means of magnetotelluric (MT) measurements carried out in Northern Norway and Sweden during summer 2011-2012. The project Magnetotellurics in the Scandes (MaSca) focuses on the investigation of the crust, upper mantle and lithospheric structure in a transition zone from a stable Precambrian cratonic interior to a passive continental margin beneath the Caledonian Orogen and the Scandes Mountains in western Fennoscandia. Recent MT profiles in the central and southern Scandes indicated a large contrast in resistivity between Caledonides and Precambrian basement. The alum shales as a highly conductive layers between the resistive Precambrian basement and the overlying Caledonian nappes are revealed from this profiles. Additional measurements in the Northern Scandes were required. All together data from 60 synchronous long period (LMT) and about 200 broad band (BMT) sites were acquired. The array stretches from Lofoten and Bodo (Norway) in the west to Kiruna and Skeleftea (Sweden) in the east covering an area of 500x500 square kilometers. LMT sites were occupied for about two months, while most of the BMT sites were measured during one day. We have used new multi-grid approach for 3D electromagnetic (EM) inversion and modelling. Our approach is based on the OcTree discretization where the spatial domain is represented by rectangular cells, each of which might be subdivided (recursively) into eight sub-cells. In this simplified implementation the grid is refined only in the horizontal direction, uniformly in each vertical layer. Using multi-grid we manage to have a high grid resolution near the surface (for instance, to tackle with galvanic distortions) and lower resolution at greater depth as the EM fields decay in the Earth according to the diffusion equation. We also have a benefit in computational costs as number of unknowns decrease. The multi-grid forward solver is implemented within the framework of the modular system for EM inversion (ModEM by G. Egbert, A. Kelbert, N. Meqbel), using the ModEM 3D finite difference staggered grid forward solver (second order PDE in the electric field, with divergence correction) as a starting point for our development. The first 3D inversion model for the crust and upper mantle shows the highly conducting bodies in the crust which can be interpreted as alum shales. The eastern and central parts are presented by resistive Precambrian rocks of the Svecofennian and Archaean domains. The upper mantle is resistive and relates to the Baltica basement. We also compare 3D inversion model with the results of 2D inversion along several profiles. We are able to explain some of the features in the data (out of quadrant phase) with 3D model, thus providing more reliable results compared to routine 2D approach.
Sabouni, Abas; Pouliot, Philippe; Shmuel, Amir; Lesage, Frederic
2014-01-01
This paper introduce a fast and efficient solver for simulating the induced (eddy) current distribution in the brain during transcranial magnetic stimulation procedure. This solver has been integrated with MRI and neuronavigation software to accurately model the electromagnetic field and show eddy current in the head almost in real-time. To examine the performance of the proposed technique, we used a 3D anatomically accurate MRI model of the 25 year old female subject.
Computational Challenges of 3D Radiative Transfer in Atmospheric Models
NASA Astrophysics Data System (ADS)
Jakub, Fabian; Bernhard, Mayer
2017-04-01
The computation of radiative heating and cooling rates is one of the most expensive components in todays atmospheric models. The high computational cost stems not only from the laborious integration over a wide range of the electromagnetic spectrum but also from the fact that solving the integro-differential radiative transfer equation for monochromatic light is already rather involved. This lead to the advent of numerous approximations and parameterizations to reduce the cost of the solver. One of the most prominent one is the so called independent pixel approximations (IPA) where horizontal energy transfer is neglected whatsoever and radiation may only propagate in the vertical direction (1D). Recent studies implicate that the IPA introduces significant errors in high resolution simulations and affects the evolution and development of convective systems. However, using fully 3D solvers such as for example MonteCarlo methods is not even on state of the art supercomputers feasible. The parallelization of atmospheric models is often realized by a horizontal domain decomposition, and hence, horizontal transfer of energy necessitates communication. E.g. a cloud's shadow at a low zenith angle will cast a long shadow and potentially needs to communication through a multitude of processors. Especially light in the solar spectral range may travel long distances through the atmosphere. Concerning highly parallel simulations, it is vital that 3D radiative transfer solvers put a special emphasis on parallel scalability. We will present an introduction to intricacies computing 3D radiative heating and cooling rates as well as report on the parallel performance of the TenStream solver. The TenStream is a 3D radiative transfer solver using the PETSc framework to iteratively solve a set of partial differential equation. We investigate two matrix preconditioners, (a) geometric algebraic multigrid preconditioning(MG+GAMG) and (b) block Jacobi incomplete LU (ILU) factorization. The TenStream solver is tested for up to 4096 cores and shows a parallel scaling efficiency of 80-90% on various supercomputers.
Multi-Grid detector for neutron spectroscopy: results obtained on time-of-flight spectrometer CNCS
NASA Astrophysics Data System (ADS)
Anastasopoulos, M.; Bebb, R.; Berry, K.; Birch, J.; Bryś, T.; Buffet, J.-C.; Clergeau, J.-F.; Deen, P. P.; Ehlers, G.; van Esch, P.; Everett, S. M.; Guerard, B.; Hall-Wilton, R.; Herwig, K.; Hultman, L.; Höglund, C.; Iruretagoiena, I.; Issa, F.; Jensen, J.; Khaplanov, A.; Kirstein, O.; Lopez Higuera, I.; Piscitelli, F.; Robinson, L.; Schmidt, S.; Stefanescu, I.
2017-04-01
The Multi-Grid detector technology has evolved from the proof-of-principle and characterisation stages. Here we report on the performance of the Multi-Grid detector, the MG.CNCS prototype, which has been installed and tested at the Cold Neutron Chopper Spectrometer, CNCS at SNS. This has allowed a side-by-side comparison to the performance of 3He detectors on an operational instrument. The demonstrator has an active area of 0.2 m2. It is specifically tailored to the specifications of CNCS. The detector was installed in June 2016 and has operated since then, collecting neutron scattering data in parallel to the He-3 detectors of CNCS. In this paper, we present a comprehensive analysis of this data, in particular on instrument energy resolution, rate capability, background and relative efficiency. Stability, gamma-ray and fast neutron sensitivity have also been investigated. The effect of scattering in the detector components has been measured and provides input to comparison for Monte Carlo simulations. All data is presented in comparison to that measured by the 3He detectors simultaneously, showing that all features recorded by one detector are also recorded by the other. The energy resolution matches closely. We find that the Multi-Grid is able to match the data collected by 3He, and see an indication of a considerable advantage in the count rate capability. Based on these results, we are confident that the Multi-Grid detector will be capable of producing high quality scientific data on chopper spectrometers utilising the unprecedented neutron flux of the ESS.
CFD code evaluation for internal flow modeling
NASA Technical Reports Server (NTRS)
Chung, T. J.
1990-01-01
Research on the computational fluid dynamics (CFD) code evaluation with emphasis on supercomputing in reacting flows is discussed. Advantages of unstructured grids, multigrids, adaptive methods, improved flow solvers, vector processing, parallel processing, and reduction of memory requirements are discussed. As examples, researchers include applications of supercomputing to reacting flow Navier-Stokes equations including shock waves and turbulence and combustion instability problems associated with solid and liquid propellants. Evaluation of codes developed by other organizations are not included. Instead, the basic criteria for accuracy and efficiency have been established, and some applications on rocket combustion have been made. Research toward an ultimate goal, the most accurate and efficient CFD code, is in progress and will continue for years to come.
Factorizable Schemes for the Equations of Fluid Flow
NASA Technical Reports Server (NTRS)
Sidilkover, David
1999-01-01
We present an upwind high-resolution factorizable (UHF) discrete scheme for the compressible Euler equations that allows to distinguish between full-potential and advection factors at the discrete level. The scheme approximates equations in their general conservative form and is related to the family of genuinely multidimensional upwind schemes developed previously and demonstrated to have good shock-capturing capabilities. A unique property of this scheme is that in addition to the aforementioned features it is also factorizable, i.e., it allows to distinguish between full-potential and advection factors at the discrete level. The latter property facilitates the construction of optimally efficient multigrid solvers. This is done through a relaxation procedure that utilizes the factorizability property.
Multitasking domain decomposition fast Poisson solvers on the Cray Y-MP
NASA Technical Reports Server (NTRS)
Chan, Tony F.; Fatoohi, Rod A.
1990-01-01
The results of multitasking implementation of a domain decomposition fast Poisson solver on eight processors of the Cray Y-MP are presented. The object of this research is to study the performance of domain decomposition methods on a Cray supercomputer and to analyze the performance of different multitasking techniques using highly parallel algorithms. Two implementations of multitasking are considered: macrotasking (parallelism at the subroutine level) and microtasking (parallelism at the do-loop level). A conventional FFT-based fast Poisson solver is also multitasked. The results of different implementations are compared and analyzed. A speedup of over 7.4 on the Cray Y-MP running in a dedicated environment is achieved for all cases.
Final Report - Subcontract B623760
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bank, R.
2017-11-17
During my visit to LLNL during July 17{27, 2017, I worked on linear system solvers. The two level hierarchical solver that initiated our study was developed to solve linear systems arising from hp adaptive finite element calculations, and is implemented in the PLTMG software package, version 12. This preconditioner typically requires 3-20% of the space used by the stiffness matrix for higher order elements. It has multigrid like convergence rates for a wide variety of PDEs (self-adjoint positive de nite elliptic equations, convection dominated convection-diffusion equations, and highly indefinite Helmholtz equations, among others). The convergence rate is not independent ofmore » the polynomial degree p as p ! 1, but but remains strong for p 9, which is the highest polynomial degree allowed in PLTMG, due to limitations of the numerical quadrature rules implemented in the software package. A more complete description of the method and some numerical experiments illustrating its effectiveness appear in. Like traditional geometric multilevel methods, this scheme relies on knowledge of the underlying finite element space in order to construct the smoother and the coarse grid correction.« less
Treatment of charge singularities in implicit solvent models.
Geng, Weihua; Yu, Sining; Wei, Guowei
2007-09-21
This paper presents a novel method for solving the Poisson-Boltzmann (PB) equation based on a rigorous treatment of geometric singularities of the dielectric interface and a Green's function formulation of charge singularities. Geometric singularities, such as cusps and self-intersecting surfaces, in the dielectric interfaces are bottleneck in developing highly accurate PB solvers. Based on an advanced mathematical technique, the matched interface and boundary (MIB) method, we have recently developed a PB solver by rigorously enforcing the flux continuity conditions at the solvent-molecule interface where geometric singularities may occur. The resulting PB solver, denoted as MIBPB-II, is able to deliver second order accuracy for the molecular surfaces of proteins. However, when the mesh size approaches half of the van der Waals radius, the MIBPB-II cannot maintain its accuracy because the grid points that carry the interface information overlap with those that carry distributed singular charges. In the present Green's function formalism, the charge singularities are transformed into interface flux jump conditions, which are treated on an equal footing as the geometric singularities in our MIB framework. The resulting method, denoted as MIBPB-III, is able to provide highly accurate electrostatic potentials at a mesh as coarse as 1.2 A for proteins. Consequently, at a given level of accuracy, the MIBPB-III is about three times faster than the APBS, a recent multigrid PB solver. The MIBPB-III has been extensively validated by using analytically solvable problems, molecular surfaces of polyatomic systems, and 24 proteins. It provides reliable benchmark numerical solutions for the PB equation.
Treatment of charge singularities in implicit solvent models
NASA Astrophysics Data System (ADS)
Geng, Weihua; Yu, Sining; Wei, Guowei
2007-09-01
This paper presents a novel method for solving the Poisson-Boltzmann (PB) equation based on a rigorous treatment of geometric singularities of the dielectric interface and a Green's function formulation of charge singularities. Geometric singularities, such as cusps and self-intersecting surfaces, in the dielectric interfaces are bottleneck in developing highly accurate PB solvers. Based on an advanced mathematical technique, the matched interface and boundary (MIB) method, we have recently developed a PB solver by rigorously enforcing the flux continuity conditions at the solvent-molecule interface where geometric singularities may occur. The resulting PB solver, denoted as MIBPB-II, is able to deliver second order accuracy for the molecular surfaces of proteins. However, when the mesh size approaches half of the van der Waals radius, the MIBPB-II cannot maintain its accuracy because the grid points that carry the interface information overlap with those that carry distributed singular charges. In the present Green's function formalism, the charge singularities are transformed into interface flux jump conditions, which are treated on an equal footing as the geometric singularities in our MIB framework. The resulting method, denoted as MIBPB-III, is able to provide highly accurate electrostatic potentials at a mesh as coarse as 1.2Å for proteins. Consequently, at a given level of accuracy, the MIBPB-III is about three times faster than the APBS, a recent multigrid PB solver. The MIBPB-III has been extensively validated by using analytically solvable problems, molecular surfaces of polyatomic systems, and 24 proteins. It provides reliable benchmark numerical solutions for the PB equation.
An adaptive grid algorithm for one-dimensional nonlinear equations
NASA Technical Reports Server (NTRS)
Gutierrez, William E.; Hills, Richard G.
1990-01-01
Richards' equation, which models the flow of liquid through unsaturated porous media, is highly nonlinear and difficult to solve. Step gradients in the field variables require the use of fine grids and small time step sizes. The numerical instabilities caused by the nonlinearities often require the use of iterative methods such as Picard or Newton interation. These difficulties result in large CPU requirements in solving Richards equation. With this in mind, adaptive and multigrid methods are investigated for use with nonlinear equations such as Richards' equation. Attention is focused on one-dimensional transient problems. To investigate the use of multigrid and adaptive grid methods, a series of problems are studied. First, a multigrid program is developed and used to solve an ordinary differential equation, demonstrating the efficiency with which low and high frequency errors are smoothed out. The multigrid algorithm and an adaptive grid algorithm is used to solve one-dimensional transient partial differential equations, such as the diffusive and convective-diffusion equations. The performance of these programs are compared to that of the Gauss-Seidel and tridiagonal methods. The adaptive and multigrid schemes outperformed the Gauss-Seidel algorithm, but were not as fast as the tridiagonal method. The adaptive grid scheme solved the problems slightly faster than the multigrid method. To solve nonlinear problems, Picard iterations are introduced into the adaptive grid and tridiagonal methods. Burgers' equation is used as a test problem for the two algorithms. Both methods obtain solutions of comparable accuracy for similar time increments. For the Burgers' equation, the adaptive grid method finds the solution approximately three times faster than the tridiagonal method. Finally, both schemes are used to solve the water content formulation of the Richards' equation. For this problem, the adaptive grid method obtains a more accurate solution in fewer work units and less computation time than required by the tridiagonal method. The performance of the adaptive grid method tends to degrade as the solution process proceeds in time, but still remains faster than the tridiagonal scheme.
On the implementation of an accurate and efficient solver for convection-diffusion equations
NASA Astrophysics Data System (ADS)
Wu, Chin-Tien
In this dissertation, we examine several different aspects of computing the numerical solution of the convection-diffusion equation. The solution of this equation often exhibits sharp gradients due to Dirichlet outflow boundaries or discontinuities in boundary conditions. Because of the singular-perturbed nature of the equation, numerical solutions often have severe oscillations when grid sizes are not small enough to resolve sharp gradients. To overcome such difficulties, the streamline diffusion discretization method can be used to obtain an accurate approximate solution in regions where the solution is smooth. To increase accuracy of the solution in the regions containing layers, adaptive mesh refinement and mesh movement based on a posteriori error estimations can be employed. An error-adapted mesh refinement strategy based on a posteriori error estimations is also proposed to resolve layers. For solving the sparse linear systems that arise from discretization, goemetric multigrid (MG) and algebraic multigrid (AMG) are compared. In addition, both methods are also used as preconditioners for Krylov subspace methods. We derive some convergence results for MG with line Gauss-Seidel smoothers and bilinear interpolation. Finally, while considering adaptive mesh refinement as an integral part of the solution process, it is natural to set a stopping tolerance for the iterative linear solvers on each mesh stage so that the difference between the approximate solution obtained from iterative methods and the finite element solution is bounded by an a posteriori error bound. Here, we present two stopping criteria. The first is based on a residual-type a posteriori error estimator developed by Verfurth. The second is based on an a posteriori error estimator, using local solutions, developed by Kay and Silvester. Our numerical results show the refined mesh obtained from the iterative solution which satisfies the second criteria is similar to the refined mesh obtained from the finite element solution.
NASA Technical Reports Server (NTRS)
Costiner, Sorin; Taasan, Shlomo
1994-01-01
This paper presents multigrid (MG) techniques for nonlinear eigenvalue problems (EP) and emphasizes an MG algorithm for a nonlinear Schrodinger EP. The algorithm overcomes the mentioned difficulties combining the following techniques: an MG projection coupled with backrotations for separation of solutions and treatment of difficulties related to clusters of close and equal eigenvalues; MG subspace continuation techniques for treatment of the nonlinearity; an MG simultaneous treatment of the eigenvectors at the same time with the nonlinearity and with the global constraints. The simultaneous MG techniques reduce the large number of self consistent iterations to only a few or one MG simultaneous iteration and keep the solutions in a right neighborhood where the algorithm converges fast.
A second order discontinuous Galerkin fast sweeping method for Eikonal equations
NASA Astrophysics Data System (ADS)
Li, Fengyan; Shu, Chi-Wang; Zhang, Yong-Tao; Zhao, Hongkai
2008-09-01
In this paper, we construct a second order fast sweeping method with a discontinuous Galerkin (DG) local solver for computing viscosity solutions of a class of static Hamilton-Jacobi equations, namely the Eikonal equations. Our piecewise linear DG local solver is built on a DG method developed recently [Y. Cheng, C.-W. Shu, A discontinuous Galerkin finite element method for directly solving the Hamilton-Jacobi equations, Journal of Computational Physics 223 (2007) 398-415] for the time-dependent Hamilton-Jacobi equations. The causality property of Eikonal equations is incorporated into the design of this solver. The resulting local nonlinear system in the Gauss-Seidel iterations is a simple quadratic system and can be solved explicitly. The compactness of the DG method and the fast sweeping strategy lead to fast convergence of the new scheme for Eikonal equations. Extensive numerical examples verify efficiency, convergence and second order accuracy of the proposed method.
A New Approximate Chimera Donor Cell Search Algorithm
NASA Technical Reports Server (NTRS)
Holst, Terry L.; Nixon, David (Technical Monitor)
1998-01-01
The objectives of this study were to develop chimera-based full potential methodology which is compatible with overflow (Euler/Navier-Stokes) chimera flow solver and to develop a fast donor cell search algorithm that is compatible with the chimera full potential approach. Results of this work included presenting a new donor cell search algorithm suitable for use with a chimera-based full potential solver. This algorithm was found to be extremely fast and simple producing donor cells as fast as 60,000 per second.
Solvers for the Cardiac Bidomain Equations
Vigmond, E.J.; Weber dos Santos, R.; Prassl, A.J.; Deo, M.; Plank, G.
2010-01-01
The bidomain equations are widely used for the simulation of electrical activity in cardiac tissue. They are especially important for accurately modelling extracellular stimulation, as evidenced by their prediction of virtual electrode polarization before experimental verification. However, solution of the equations is computationally expensive due to the fine spatial and temporal discretization needed. This limits the size and duration of the problem which can be modeled. Regardless of the specific form into which they are cast, the computational bottleneck becomes the repeated solution of a large, linear system. The purpose of this review is to give an overview of the equations, and the methods by which they have been solved. Of particular note are recent developments in multigrid methods, which have proven to be the most efficient. PMID:17900668
NASA Astrophysics Data System (ADS)
Wagenhoffer, Nathan; Moored, Keith; Jaworski, Justin
2016-11-01
The design of quiet and efficient bio-inspired propulsive concepts requires a rapid, unified computational framework that integrates the coupled fluid dynamics with the noise generation. Such a framework is developed where the fluid motion is modeled with a two-dimensional unsteady boundary element method that includes a vortex-particle wake. The unsteady surface forces from the potential flow solver are then passed to an acoustic boundary element solver to predict the radiated sound in low-Mach-number flows. The use of the boundary element method for both the hydrodynamic and acoustic solvers permits dramatic computational acceleration by application of the fast multiple method. The reduced order of calculations due to the fast multipole method allows for greater spatial resolution of the vortical wake per unit of computational time. The coupled flow-acoustic solver is validated against canonical vortex-sound problems. The capability of the coupled solver is demonstrated by analyzing the performance and noise production of an isolated bio-inspired swimmer and of tandem swimmers.
NASA Astrophysics Data System (ADS)
Chen, Hui; Deng, Ju-Zhi; Yin, Min; Yin, Chang-Chun; Tang, Wen-Wu
2017-03-01
To speed up three-dimensional (3D) DC resistivity modeling, we present a new multigrid method, the aggregation-based algebraic multigrid method (AGMG). We first discretize the differential equation of the secondary potential field with mixed boundary conditions by using a seven-point finite-difference method to obtain a large sparse system of linear equations. Then, we introduce the theory behind the pairwise aggregation algorithms for AGMG and use the conjugate-gradient method with the V-cycle AGMG preconditioner (AGMG-CG) to solve the linear equations. We use typical geoelectrical models to test the proposed AGMG-CG method and compare the results with analytical solutions and the 3DDCXH algorithm for 3D DC modeling (3DDCXH). In addition, we apply the AGMG-CG method to different grid sizes and geoelectrical models and compare it to different iterative methods, such as ILU-BICGSTAB, ILU-GCR, and SSOR-CG. The AGMG-CG method yields nearly linearly decreasing errors, whereas the number of iterations increases slowly with increasing grid size. The AGMG-CG method is precise and converges fast, and thus can improve the computational efficiency in forward modeling of three-dimensional DC resistivity.
An algebraic multigrid method for Q2-Q1 mixed discretizations of the Navier-Stokes equations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Prokopenko, Andrey; Tuminaro, Raymond S.
Algebraic multigrid (AMG) preconditioners are considered for discretized systems of partial differential equations (PDEs) where unknowns associated with different physical quantities are not necessarily co-located at mesh points. Speci cally, we investigate a Q 2-Q 1 mixed finite element discretization of the incompressible Navier-Stokes equations where the number of velocity nodes is much greater than the number of pressure nodes. Consequently, some velocity degrees-of-freedom (dofs) are defined at spatial locations where there are no corresponding pressure dofs. Thus, AMG approaches lever- aging this co-located structure are not applicable. This paper instead proposes an automatic AMG coarsening that mimics certain pressure/velocitymore » dof relationships of the Q 2-Q 1 discretization. The main idea is to first automatically define coarse pressures in a somewhat standard AMG fashion and then to carefully (but automatically) choose coarse velocity unknowns so that the spatial location relationship between pressure and velocity dofs resembles that on the nest grid. To define coefficients within the inter-grid transfers, an energy minimization AMG (EMIN-AMG) is utilized. EMIN-AMG is not tied to specific coarsening schemes and grid transfer sparsity patterns, and so it is applicable to the proposed coarsening. Numerical results highlighting solver performance are given on Stokes and incompressible Navier-Stokes problems.« less
Semi-automatic sparse preconditioners for high-order finite element methods on non-uniform meshes
NASA Astrophysics Data System (ADS)
Austin, Travis M.; Brezina, Marian; Jamroz, Ben; Jhurani, Chetan; Manteuffel, Thomas A.; Ruge, John
2012-05-01
High-order finite elements often have a higher accuracy per degree of freedom than the classical low-order finite elements. However, in the context of implicit time-stepping methods, high-order finite elements present challenges to the construction of efficient simulations due to the high cost of inverting the denser finite element matrix. There are many cases where simulations are limited by the memory required to store the matrix and/or the algorithmic components of the linear solver. We are particularly interested in preconditioned Krylov methods for linear systems generated by discretization of elliptic partial differential equations with high-order finite elements. Using a preconditioner like Algebraic Multigrid can be costly in terms of memory due to the need to store matrix information at the various levels. We present a novel method for defining a preconditioner for systems generated by high-order finite elements that is based on a much sparser system than the original high-order finite element system. We investigate the performance for non-uniform meshes on a cube and a cubed sphere mesh, showing that the sparser preconditioner is more efficient and uses significantly less memory. Finally, we explore new methods to construct the sparse preconditioner and examine their effectiveness for non-uniform meshes. We compare results to a direct use of Algebraic Multigrid as a preconditioner and to a two-level additive Schwarz method.
An algebraic multigrid method for Q2-Q1 mixed discretizations of the Navier-Stokes equations
Prokopenko, Andrey; Tuminaro, Raymond S.
2016-07-01
Algebraic multigrid (AMG) preconditioners are considered for discretized systems of partial differential equations (PDEs) where unknowns associated with different physical quantities are not necessarily co-located at mesh points. Speci cally, we investigate a Q 2-Q 1 mixed finite element discretization of the incompressible Navier-Stokes equations where the number of velocity nodes is much greater than the number of pressure nodes. Consequently, some velocity degrees-of-freedom (dofs) are defined at spatial locations where there are no corresponding pressure dofs. Thus, AMG approaches lever- aging this co-located structure are not applicable. This paper instead proposes an automatic AMG coarsening that mimics certain pressure/velocitymore » dof relationships of the Q 2-Q 1 discretization. The main idea is to first automatically define coarse pressures in a somewhat standard AMG fashion and then to carefully (but automatically) choose coarse velocity unknowns so that the spatial location relationship between pressure and velocity dofs resembles that on the nest grid. To define coefficients within the inter-grid transfers, an energy minimization AMG (EMIN-AMG) is utilized. EMIN-AMG is not tied to specific coarsening schemes and grid transfer sparsity patterns, and so it is applicable to the proposed coarsening. Numerical results highlighting solver performance are given on Stokes and incompressible Navier-Stokes problems.« less
Advanced Fast 3-D Electromagnetic Solver for Microwave Tomography Imaging.
Simonov, Nikolai; Kim, Bo-Ra; Lee, Kwang-Jae; Jeon, Soon-Ik; Son, Seong-Ho
2017-10-01
This paper describes a fast-forward electromagnetic solver (FFS) for the image reconstruction algorithm of our microwave tomography system. Our apparatus is a preclinical prototype of a biomedical imaging system, designed for the purpose of early breast cancer detection. It operates in the 3-6-GHz frequency band using a circular array of probe antennas immersed in a matching liquid; it produces image reconstructions of the permittivity and conductivity profiles of the breast under examination. Our reconstruction algorithm solves the electromagnetic (EM) inverse problem and takes into account the real EM properties of the probe antenna array as well as the influence of the patient's body and that of the upper metal screen sheet. This FFS algorithm is much faster than conventional EM simulation solvers. In comparison, in the same PC, the CST solver takes ~45 min, while the FFS takes ~1 s of effective simulation time for the same EM model of a numerical breast phantom.
A Newton-Krylov solver for fast spin-up of online ocean tracers
NASA Astrophysics Data System (ADS)
Lindsay, Keith
2017-01-01
We present a Newton-Krylov based solver to efficiently spin up tracers in an online ocean model. We demonstrate that the solver converges, that tracer simulations initialized with the solution from the solver have small drift, and that the solver takes orders of magnitude less computational time than the brute force spin-up approach. To demonstrate the application of the solver, we use it to efficiently spin up the tracer ideal age with respect to the circulation from different time intervals in a long physics run. We then evaluate how the spun-up ideal age tracer depends on the duration of the physics run, i.e., on how equilibrated the circulation is.
Fast immersed interface Poisson solver for 3D unbounded problems around arbitrary geometries
NASA Astrophysics Data System (ADS)
Gillis, T.; Winckelmans, G.; Chatelain, P.
2018-02-01
We present a fast and efficient Fourier-based solver for the Poisson problem around an arbitrary geometry in an unbounded 3D domain. This solver merges two rewarding approaches, the lattice Green's function method and the immersed interface method, using the Sherman-Morrison-Woodbury decomposition formula. The method is intended to be second order up to the boundary. This is verified on two potential flow benchmarks. We also further analyse the iterative process and the convergence behavior of the proposed algorithm. The method is applicable to a wide range of problems involving a Poisson equation around inner bodies, which goes well beyond the present validation on potential flows.
Tree-based solvers for adaptive mesh refinement code FLASH - I: gravity and optical depths
NASA Astrophysics Data System (ADS)
Wünsch, R.; Walch, S.; Dinnbier, F.; Whitworth, A.
2018-04-01
We describe an OctTree algorithm for the MPI parallel, adaptive mesh refinement code FLASH, which can be used to calculate the gas self-gravity, and also the angle-averaged local optical depth, for treating ambient diffuse radiation. The algorithm communicates to the different processors only those parts of the tree that are needed to perform the tree-walk locally. The advantage of this approach is a relatively low memory requirement, important in particular for the optical depth calculation, which needs to process information from many different directions. This feature also enables a general tree-based radiation transport algorithm that will be described in a subsequent paper, and delivers excellent scaling up to at least 1500 cores. Boundary conditions for gravity can be either isolated or periodic, and they can be specified in each direction independently, using a newly developed generalization of the Ewald method. The gravity calculation can be accelerated with the adaptive block update technique by partially re-using the solution from the previous time-step. Comparison with the FLASH internal multigrid gravity solver shows that tree-based methods provide a competitive alternative, particularly for problems with isolated or mixed boundary conditions. We evaluate several multipole acceptance criteria (MACs) and identify a relatively simple approximate partial error MAC which provides high accuracy at low computational cost. The optical depth estimates are found to agree very well with those of the RADMC-3D radiation transport code, with the tree-solver being much faster. Our algorithm is available in the standard release of the FLASH code in version 4.0 and later.
Weighted least squares phase unwrapping based on the wavelet transform
NASA Astrophysics Data System (ADS)
Chen, Jiafeng; Chen, Haiqin; Yang, Zhengang; Ren, Haixia
2007-01-01
The weighted least squares phase unwrapping algorithm is a robust and accurate method to solve phase unwrapping problem. This method usually leads to a large sparse linear equation system. Gauss-Seidel relaxation iterative method is usually used to solve this large linear equation. However, this method is not practical due to its extremely slow convergence. The multigrid method is an efficient algorithm to improve convergence rate. However, this method needs an additional weight restriction operator which is very complicated. For this reason, the multiresolution analysis method based on the wavelet transform is proposed. By applying the wavelet transform, the original system is decomposed into its coarse and fine resolution levels and an equivalent equation system with better convergence condition can be obtained. Fast convergence in separate coarse resolution levels speeds up the overall system convergence rate. The simulated experiment shows that the proposed method converges faster and provides better result than the multigrid method.
Application of a fast Newton-Krylov solver for equilibrium simulations of phosphorus and oxygen
NASA Astrophysics Data System (ADS)
Fu, Weiwei; Primeau, François
2017-11-01
Model drift due to inadequate spinup is a serious problem that complicates the interpretation of climate change simulations. Even after a 300 year spinup we show that solutions are not only still drifting but often drifting away from their eventual equilibrium over large parts of the ocean. Here we present a Newton-Krylov solver for computing cyclostationary equilibrium solutions of a biogeochemical model for the cycling of phosphorus and oxygen. In addition to using previously developed preconditioning strategies - time-averaging and coarse-graining the Jacobian matrix - we also introduce a new strategy: the adiabatic elimination of a fast variable (particulate organic phosphorus) by slaving it to a slow variable (dissolved inorganic phosphorus). We use transport matrices derived from the Community Earth System Model (CESM) with a nominal horizontal resolution of 1° × 1° and 60 vertical levels to implement and test the solver. We find that the new solver obtains seasonally-varying equilibrium solutions with no visible drift using no more than 80 simulation years.
Li, Zhilin; Xiao, Li; Cai, Qin; Zhao, Hongkai; Luo, Ray
2016-01-01
In this paper, a new Navier–Stokes solver based on a finite difference approximation is proposed to solve incompressible flows on irregular domains with open, traction, and free boundary conditions, which can be applied to simulations of fluid structure interaction, implicit solvent model for biomolecular applications and other free boundary or interface problems. For some problems of this type, the projection method and the augmented immersed interface method (IIM) do not work well or does not work at all. The proposed new Navier–Stokes solver is based on the local pressure boundary method, and a semi-implicit augmented IIM. A fast Poisson solver can be used in our algorithm which gives us the potential for developing fast overall solvers in the future. The time discretization is based on a second order multi-step method. Numerical tests with exact solutions are presented to validate the accuracy of the method. Application to fluid structure interaction between an incompressible fluid and a compressible gas bubble is also presented. PMID:27087702
Li, Zhilin; Xiao, Li; Cai, Qin; Zhao, Hongkai; Luo, Ray
2015-08-15
In this paper, a new Navier-Stokes solver based on a finite difference approximation is proposed to solve incompressible flows on irregular domains with open, traction, and free boundary conditions, which can be applied to simulations of fluid structure interaction, implicit solvent model for biomolecular applications and other free boundary or interface problems. For some problems of this type, the projection method and the augmented immersed interface method (IIM) do not work well or does not work at all. The proposed new Navier-Stokes solver is based on the local pressure boundary method, and a semi-implicit augmented IIM. A fast Poisson solver can be used in our algorithm which gives us the potential for developing fast overall solvers in the future. The time discretization is based on a second order multi-step method. Numerical tests with exact solutions are presented to validate the accuracy of the method. Application to fluid structure interaction between an incompressible fluid and a compressible gas bubble is also presented.
Lattice QCD Calculations in Nuclear Physics towards the Exascale
NASA Astrophysics Data System (ADS)
Joo, Balint
2017-01-01
The combination of algorithmic advances and new highly parallel computing architectures are enabling lattice QCD calculations to tackle ever more complex problems in nuclear physics. In this talk I will review some computational challenges that are encountered in large scale cold nuclear physics campaigns such as those in hadron spectroscopy calculations. I will discuss progress in addressing these with algorithmic improvements such as multi-grid solvers and software for recent hardware architectures such as GPUs and Intel Xeon Phi, Knights Landing. Finally, I will highlight some current topics for research and development as we head towards the Exascale era This material is funded by the U.S. Department of Energy, Office Of Science, Offices of Nuclear Physics, High Energy Physics and Advanced Scientific Computing Research, as well as the Office of Nuclear Physics under contract DE-AC05-06OR23177.
NASA Technical Reports Server (NTRS)
Maliassov, Serguei
1996-01-01
In this paper an algebraic substructuring preconditioner is considered for nonconforming finite element approximations of second order elliptic problems in 3D domains with a piecewise constant diffusion coefficient. Using a substructuring idea and a block Gauss elimination, part of the unknowns is eliminated and the Schur complement obtained is preconditioned by a spectrally equivalent very sparse matrix. In the case of quasiuniform tetrahedral mesh an appropriate algebraic multigrid solver can be used to solve the problem with this matrix. Explicit estimates of condition numbers and implementation algorithms are established for the constructed preconditioner. It is shown that the condition number of the preconditioned matrix does not depend on either the mesh step size or the jump of the coefficient. Finally, numerical experiments are presented to illustrate the theory being developed.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lin, Paul T.; Shadid, John N.; Sala, Marzio
In this study results are presented for the large-scale parallel performance of an algebraic multilevel preconditioner for solution of the drift-diffusion model for semiconductor devices. The preconditioner is the key numerical procedure determining the robustness, efficiency and scalability of the fully-coupled Newton-Krylov based, nonlinear solution method that is employed for this system of equations. The coupled system is comprised of a source term dominated Poisson equation for the electric potential, and two convection-diffusion-reaction type equations for the electron and hole concentration. The governing PDEs are discretized in space by a stabilized finite element method. Solution of the discrete system ismore » obtained through a fully-implicit time integrator, a fully-coupled Newton-based nonlinear solver, and a restarted GMRES Krylov linear system solver. The algebraic multilevel preconditioner is based on an aggressive coarsening graph partitioning of the nonzero block structure of the Jacobian matrix. Representative performance results are presented for various choices of multigrid V-cycles and W-cycles and parameter variations for smoothers based on incomplete factorizations. Parallel scalability results are presented for solution of up to 10{sup 8} unknowns on 4096 processors of a Cray XT3/4 and an IBM POWER eServer system.« less
SOME NEW FINITE DIFFERENCE METHODS FOR HELMHOLTZ EQUATIONS ON IRREGULAR DOMAINS OR WITH INTERFACES
Wan, Xiaohai; Li, Zhilin
2012-01-01
Solving a Helmholtz equation Δu + λu = f efficiently is a challenge for many applications. For example, the core part of many efficient solvers for the incompressible Navier-Stokes equations is to solve one or several Helmholtz equations. In this paper, two new finite difference methods are proposed for solving Helmholtz equations on irregular domains, or with interfaces. For Helmholtz equations on irregular domains, the accuracy of the numerical solution obtained using the existing augmented immersed interface method (AIIM) may deteriorate when the magnitude of λ is large. In our new method, we use a level set function to extend the source term and the PDE to a larger domain before we apply the AIIM. For Helmholtz equations with interfaces, a new maximum principle preserving finite difference method is developed. The new method still uses the standard five-point stencil with modifications of the finite difference scheme at irregular grid points. The resulting coefficient matrix of the linear system of finite difference equations satisfies the sign property of the discrete maximum principle and can be solved efficiently using a multigrid solver. The finite difference method is also extended to handle temporal discretized equations where the solution coefficient λ is inversely proportional to the mesh size. PMID:22701346
SOME NEW FINITE DIFFERENCE METHODS FOR HELMHOLTZ EQUATIONS ON IRREGULAR DOMAINS OR WITH INTERFACES.
Wan, Xiaohai; Li, Zhilin
2012-06-01
Solving a Helmholtz equation Δu + λu = f efficiently is a challenge for many applications. For example, the core part of many efficient solvers for the incompressible Navier-Stokes equations is to solve one or several Helmholtz equations. In this paper, two new finite difference methods are proposed for solving Helmholtz equations on irregular domains, or with interfaces. For Helmholtz equations on irregular domains, the accuracy of the numerical solution obtained using the existing augmented immersed interface method (AIIM) may deteriorate when the magnitude of λ is large. In our new method, we use a level set function to extend the source term and the PDE to a larger domain before we apply the AIIM. For Helmholtz equations with interfaces, a new maximum principle preserving finite difference method is developed. The new method still uses the standard five-point stencil with modifications of the finite difference scheme at irregular grid points. The resulting coefficient matrix of the linear system of finite difference equations satisfies the sign property of the discrete maximum principle and can be solved efficiently using a multigrid solver. The finite difference method is also extended to handle temporal discretized equations where the solution coefficient λ is inversely proportional to the mesh size.
An iterative solver for the 3D Helmholtz equation
NASA Astrophysics Data System (ADS)
Belonosov, Mikhail; Dmitriev, Maxim; Kostin, Victor; Neklyudov, Dmitry; Tcheverda, Vladimir
2017-09-01
We develop a frequency-domain iterative solver for numerical simulation of acoustic waves in 3D heterogeneous media. It is based on the application of a unique preconditioner to the Helmholtz equation that ensures convergence for Krylov subspace iteration methods. Effective inversion of the preconditioner involves the Fast Fourier Transform (FFT) and numerical solution of a series of boundary value problems for ordinary differential equations. Matrix-by-vector multiplication for iterative inversion of the preconditioned matrix involves inversion of the preconditioner and pointwise multiplication of grid functions. Our solver has been verified by benchmarking against exact solutions and a time-domain solver.
CUDA GPU based full-Stokes finite difference modelling of glaciers
NASA Astrophysics Data System (ADS)
Brædstrup, C. F.; Egholm, D. L.
2012-04-01
Many have stressed the limitations of using the shallow shelf and shallow ice approximations when modelling ice streams or surging glaciers. Using a full-stokes approach requires either large amounts of computer power or time and is therefore seldom an option for most glaciologists. Recent advances in graphics card (GPU) technology for high performance computing have proven extremely efficient in accelerating many large scale scientific computations. The general purpose GPU (GPGPU) technology is cheap, has a low power consumption and fits into a normal desktop computer. It could therefore provide a powerful tool for many glaciologists. Our full-stokes ice sheet model implements a Red-Black Gauss-Seidel iterative linear solver to solve the full stokes equations. This technique has proven very effective when applied to the stokes equation in geodynamics problems, and should therefore also preform well in glaciological flow probems. The Gauss-Seidel iterator is known to be robust but several other linear solvers have a much faster convergence. To aid convergence, the solver uses a multigrid approach where values are interpolated and extrapolated between different grid resolutions to minimize the short wavelength errors efficiently. This reduces the iteration count by several orders of magnitude. The run-time is further reduced by using the GPGPU technology where each card has up to 448 cores. Researchers utilizing the GPGPU technique in other areas have reported between 2 - 11 times speedup compared to multicore CPU implementations on similar problems. The goal of these initial investigations into the possible usage of GPGPU technology in glacial modelling is to apply the enhanced resolution of a full-stokes solver to ice streams and surging glaciers. This is a area of growing interest because ice streams are the main drainage conjugates for large ice sheets. It is therefore crucial to understand this streaming behavior and it's impact up-ice.
Development of an Unstructured Mesh Code for Flows About Complete Vehicles
NASA Technical Reports Server (NTRS)
Peraire, Jaime; Gupta, K. K. (Technical Monitor)
2001-01-01
This report describes the research work undertaken at the Massachusetts Institute of Technology, under NASA Research Grant NAG4-157. The aim of this research is to identify effective algorithms and methodologies for the efficient and routine solution of flow simulations about complete vehicle configurations. For over ten years we have received support from NASA to develop unstructured mesh methods for Computational Fluid Dynamics. As a result of this effort a methodology based on the use of unstructured adapted meshes of tetrahedra and finite volume flow solvers has been developed. A number of gridding algorithms, flow solvers, and adaptive strategies have been proposed. The most successful algorithms developed from the basis of the unstructured mesh system FELISA. The FELISA system has been extensively for the analysis of transonic and hypersonic flows about complete vehicle configurations. The system is highly automatic and allows for the routine aerodynamic analysis of complex configurations starting from CAD data. The code has been parallelized and utilizes efficient solution algorithms. For hypersonic flows, a version of the code which incorporates real gas effects, has been produced. The FELISA system is also a component of the STARS aeroservoelastic system developed at NASA Dryden. One of the latest developments before the start of this grant was to extend the system to include viscous effects. This required the development of viscous generators, capable of generating the anisotropic grids required to represent boundary layers, and viscous flow solvers. We show some sample hypersonic viscous computations using the developed viscous generators and solvers. Although this initial results were encouraging it became apparent that in order to develop a fully functional capability for viscous flows, several advances in solution accuracy, robustness and efficiency were required. In this grant we set out to investigate some novel methodologies that could lead to the required improvements. In particular we focused on two fronts: (1) finite element methods and (2) iterative algebraic multigrid solution techniques.
Multigrid techniques for unstructured meshes
NASA Technical Reports Server (NTRS)
Mavriplis, D. J.
1995-01-01
An overview of current multigrid techniques for unstructured meshes is given. The basic principles of the multigrid approach are first outlined. Application of these principles to unstructured mesh problems is then described, illustrating various different approaches, and giving examples of practical applications. Advanced multigrid topics, such as the use of algebraic multigrid methods, and the combination of multigrid techniques with adaptive meshing strategies are dealt with in subsequent sections. These represent current areas of research, and the unresolved issues are discussed. The presentation is organized in an educational manner, for readers familiar with computational fluid dynamics, wishing to learn more about current unstructured mesh techniques.
Multigrid calculation of three-dimensional turbomachinery flows
NASA Technical Reports Server (NTRS)
Caughey, David A.
1989-01-01
Research was performed in the general area of computational aerodynamics, with particular emphasis on the development of efficient techniques for the solution of the Euler and Navier-Stokes equations for transonic flows through the complex blade passages associated with turbomachines. In particular, multigrid methods were developed, using both explicit and implicit time-stepping schemes as smoothing algorithms. The specific accomplishments of the research have included: (1) the development of an explicit multigrid method to solve the Euler equations for three-dimensional turbomachinery flows based upon the multigrid implementation of Jameson's explicit Runge-Kutta scheme (Jameson 1983); (2) the development of an implicit multigrid scheme for the three-dimensional Euler equations based upon lower-upper factorization; (3) the development of a multigrid scheme using a diagonalized alternating direction implicit (ADI) algorithm; (4) the extension of the diagonalized ADI multigrid method to solve the Euler equations of inviscid flow for three-dimensional turbomachinery flows; and also (5) the extension of the diagonalized ADI multigrid scheme to solve the Reynolds-averaged Navier-Stokes equations for two-dimensional turbomachinery flows.
NASA Technical Reports Server (NTRS)
Dendy, J. E., Jr.
1981-01-01
The black box multigrid (BOXMG) code, which only needs specification of the matrix problem for application in the multigrid method was investigated. It is contended that a major problem with the multigrid method is that each new grid configuration requires a major programming effort to develop a code that specifically handles that grid configuration. The SOR and ICCG methods only specify the matrix problem, no matter what the grid configuration. It is concluded that the BOXMG does everything else necessary to set up the auxiliary coarser problems to achieve a multigrid solution.
NASA Technical Reports Server (NTRS)
Dinar, N.
1978-01-01
Several aspects of multigrid methods are briefly described. The main subjects include the development of very efficient multigrid algorithms for systems of elliptic equations (Cauchy-Riemann, Stokes, Navier-Stokes), as well as the development of control and prediction tools (based on local mode Fourier analysis), used to analyze, check and improve these algorithms. Preliminary research on multigrid algorithms for time dependent parabolic equations is also described. Improvements in existing multigrid processes and algorithms for elliptic equations were studied.
Application of multi-grid methods for solving the Navier-Stokes equations
NASA Technical Reports Server (NTRS)
Demuren, A. O.
1989-01-01
The application of a class of multi-grid methods to the solution of the Navier-Stokes equations for two-dimensional laminar flow problems is discussed. The methods consist of combining the full approximation scheme-full multi-grid technique (FAS-FMG) with point-, line-, or plane-relaxation routines for solving the Navier-Stokes equations in primitive variables. The performance of the multi-grid methods is compared to that of several single-grid methods. The results show that much faster convergence can be procured through the use of the multi-grid approach than through the various suggestions for improving single-grid methods. The importance of the choice of relaxation scheme for the multi-grid method is illustrated.
Application of multi-grid methods for solving the Navier-Stokes equations
NASA Technical Reports Server (NTRS)
Demuren, A. O.
1989-01-01
This paper presents the application of a class of multi-grid methods to the solution of the Navier-Stokes equations for two-dimensional laminar flow problems. The methods consists of combining the full approximation scheme-full multi-grid technique (FAS-FMG) with point-, line- or plane-relaxation routines for solving the Navier-Stokes equations in primitive variables. The performance of the multi-grid methods is compared to those of several single-grid methods. The results show that much faster convergence can be procured through the use of the multi-grid approach than through the various suggestions for improving single-grid methods. The importance of the choice of relaxation scheme for the multi-grid method is illustrated.
A fast optimization algorithm for multicriteria intensity modulated proton therapy planning.
Chen, Wei; Craft, David; Madden, Thomas M; Zhang, Kewu; Kooy, Hanne M; Herman, Gabor T
2010-09-01
To describe a fast projection algorithm for optimizing intensity modulated proton therapy (IMPT) plans and to describe and demonstrate the use of this algorithm in multicriteria IMPT planning. The authors develop a projection-based solver for a class of convex optimization problems and apply it to IMPT treatment planning. The speed of the solver permits its use in multicriteria optimization, where several optimizations are performed which span the space of possible treatment plans. The authors describe a plan database generation procedure which is customized to the requirements of the solver. The optimality precision of the solver can be specified by the user. The authors apply the algorithm to three clinical cases: A pancreas case, an esophagus case, and a tumor along the rib cage case. Detailed analysis of the pancreas case shows that the algorithm is orders of magnitude faster than industry-standard general purpose algorithms (MOSEK'S interior point optimizer, primal simplex optimizer, and dual simplex optimizer). Additionally, the projection solver has almost no memory overhead. The speed and guaranteed accuracy of the algorithm make it suitable for use in multicriteria treatment planning, which requires the computation of several diverse treatment plans. Additionally, given the low memory overhead of the algorithm, the method can be extended to include multiple geometric instances and proton range possibilities, for robust optimization.
Novel Scalable 3-D MT Inverse Solver
NASA Astrophysics Data System (ADS)
Kuvshinov, A. V.; Kruglyakov, M.; Geraskin, A.
2016-12-01
We present a new, robust and fast, three-dimensional (3-D) magnetotelluric (MT) inverse solver. As a forward modelling engine a highly-scalable solver extrEMe [1] is used. The (regularized) inversion is based on an iterative gradient-type optimization (quasi-Newton method) and exploits adjoint sources approach for fast calculation of the gradient of the misfit. The inverse solver is able to deal with highly detailed and contrasting models, allows for working (separately or jointly) with any type of MT (single-site and/or inter-site) responses, and supports massive parallelization. Different parallelization strategies implemented in the code allow for optimal usage of available computational resources for a given problem set up. To parameterize an inverse domain a mask approach is implemented, which means that one can merge any subset of forward modelling cells in order to account for (usually) irregular distribution of observation sites. We report results of 3-D numerical experiments aimed at analysing the robustness, performance and scalability of the code. In particular, our computational experiments carried out at different platforms ranging from modern laptops to high-performance clusters demonstrate practically linear scalability of the code up to thousands of nodes. 1. Kruglyakov, M., A. Geraskin, A. Kuvshinov, 2016. Novel accurate and scalable 3-D MT forward solver based on a contracting integral equation method, Computers and Geosciences, in press.
Large-Scale Parallel Viscous Flow Computations using an Unstructured Multigrid Algorithm
NASA Technical Reports Server (NTRS)
Mavriplis, Dimitri J.
1999-01-01
The development and testing of a parallel unstructured agglomeration multigrid algorithm for steady-state aerodynamic flows is discussed. The agglomeration multigrid strategy uses a graph algorithm to construct the coarse multigrid levels from the given fine grid, similar to an algebraic multigrid approach, but operates directly on the non-linear system using the FAS (Full Approximation Scheme) approach. The scalability and convergence rate of the multigrid algorithm are examined on the SGI Origin 2000 and the Cray T3E. An argument is given which indicates that the asymptotic scalability of the multigrid algorithm should be similar to that of its underlying single grid smoothing scheme. For medium size problems involving several million grid points, near perfect scalability is obtained for the single grid algorithm, while only a slight drop-off in parallel efficiency is observed for the multigrid V- and W-cycles, using up to 128 processors on the SGI Origin 2000, and up to 512 processors on the Cray T3E. For a large problem using 25 million grid points, good scalability is observed for the multigrid algorithm using up to 1450 processors on a Cray T3E, even when the coarsest grid level contains fewer points than the total number of processors.
NASA Astrophysics Data System (ADS)
Han, Song; Zhang, Wei; Zhang, Jie
2017-09-01
A fast sweeping method (FSM) determines the first arrival traveltimes of seismic waves by sweeping the velocity model in different directions meanwhile applying a local solver. It is an efficient way to numerically solve Hamilton-Jacobi equations for traveltime calculations. In this study, we develop an improved FSM to calculate the first arrival traveltimes of quasi-P (qP) waves in 2-D tilted transversely isotropic (TTI) media. A local solver utilizes the coupled slowness surface of qP and quasi-SV (qSV) waves to form a quartic equation, and solve it numerically to obtain possible traveltimes of qP-wave. The proposed quartic solver utilizes Fermat's principle to limit the range of the possible solution, then uses the bisection procedure to efficiently determine the real roots. With causality enforced during sweepings, our FSM converges fast in a few iterations, and the exact number depending on the complexity of the velocity model. To improve the accuracy, we employ high-order finite difference schemes and derive the second-order formulae. There is no weak anisotropy assumption, and no approximation is made to the complex slowness surface of qP-wave. In comparison to the traveltimes calculated by a horizontal slowness shooting method, the validity and accuracy of our FSM is demonstrated.
Large-scale Parallel Unstructured Mesh Computations for 3D High-lift Analysis
NASA Technical Reports Server (NTRS)
Mavriplis, Dimitri J.; Pirzadeh, S.
1999-01-01
A complete "geometry to drag-polar" analysis capability for the three-dimensional high-lift configurations is described. The approach is based on the use of unstructured meshes in order to enable rapid turnaround for complicated geometries that arise in high-lift configurations. Special attention is devoted to creating a capability for enabling analyses on highly resolved grids. Unstructured meshes of several million vertices are initially generated on a work-station, and subsequently refined on a supercomputer. The flow is solved on these refined meshes on large parallel computers using an unstructured agglomeration multigrid algorithm. Good prediction of lift and drag throughout the range of incidences is demonstrated on a transport take-off configuration using up to 24.7 million grid points. The feasibility of using this approach in a production environment on existing parallel machines is demonstrated, as well as the scalability of the solver on machines using up to 1450 processors.
Exploiting Multiple Levels of Parallelism in Sparse Matrix-Matrix Multiplication
Azad, Ariful; Ballard, Grey; Buluc, Aydin; ...
2016-11-08
Sparse matrix-matrix multiplication (or SpGEMM) is a key primitive for many high-performance graph algorithms as well as for some linear solvers, such as algebraic multigrid. The scaling of existing parallel implementations of SpGEMM is heavily bound by communication. Even though 3D (or 2.5D) algorithms have been proposed and theoretically analyzed in the flat MPI model on Erdös-Rényi matrices, those algorithms had not been implemented in practice and their complexities had not been analyzed for the general case. In this work, we present the first implementation of the 3D SpGEMM formulation that exploits multiple (intranode and internode) levels of parallelism, achievingmore » significant speedups over the state-of-the-art publicly available codes at all levels of concurrencies. We extensively evaluate our implementation and identify bottlenecks that should be subject to further research.« less
Computer simulations of phase field drops on super-hydrophobic surfaces
NASA Astrophysics Data System (ADS)
Fedeli, Livio
2017-09-01
We present a novel quasi-Newton continuation procedure that efficiently solves the system of nonlinear equations arising from the discretization of a phase field model for wetting phenomena. We perform a comparative numerical analysis that shows the improved speed of convergence gained with respect to other numerical schemes. Moreover, we discuss the conditions that, on a theoretical level, guarantee the convergence of this method. At each iterative step, a suitable continuation procedure develops and passes to the nonlinear solver an accurate initial guess. Discretization performs through cell-centered finite differences. The resulting system of equations is solved on a composite grid that uses dynamic mesh refinement and multi-grid techniques. The final code achieves three-dimensional, realistic computer experiments comparable to those produced in laboratory settings. This code offers not only new insights into the phenomenology of super-hydrophobicity, but also serves as a reliable predictive tool for the study of hydrophobic surfaces.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Müller, Florian, E-mail: florian.mueller@sam.math.ethz.ch; Jenny, Patrick, E-mail: jenny@ifd.mavt.ethz.ch; Meyer, Daniel W., E-mail: meyerda@ethz.ch
2013-10-01
Monte Carlo (MC) is a well known method for quantifying uncertainty arising for example in subsurface flow problems. Although robust and easy to implement, MC suffers from slow convergence. Extending MC by means of multigrid techniques yields the multilevel Monte Carlo (MLMC) method. MLMC has proven to greatly accelerate MC for several applications including stochastic ordinary differential equations in finance, elliptic stochastic partial differential equations and also hyperbolic problems. In this study, MLMC is combined with a streamline-based solver to assess uncertain two phase flow and Buckley–Leverett transport in random heterogeneous porous media. The performance of MLMC is compared tomore » MC for a two dimensional reservoir with a multi-point Gaussian logarithmic permeability field. The influence of the variance and the correlation length of the logarithmic permeability on the MLMC performance is studied.« less
Final report for “Extreme-scale Algorithms and Solver Resilience”
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gropp, William Douglas
2017-06-30
This is a joint project with principal investigators at Oak Ridge National Laboratory, Sandia National Laboratories, the University of California at Berkeley, and the University of Tennessee. Our part of the project involves developing performance models for highly scalable algorithms and the development of latency tolerant iterative methods. During this project, we extended our performance models for the Multigrid method for solving large systems of linear equations and conducted experiments with highly scalable variants of conjugate gradient methods that avoid blocking synchronization. In addition, we worked with the other members of the project on alternative techniques for resilience and reproducibility.more » We also presented an alternative approach for reproducible dot-products in parallel computations that performs almost as well as the conventional approach by separating the order of computation from the details of the decomposition of vectors across the processes.« less
Progress on the Development of the hPIC Particle-in-Cell Code
NASA Astrophysics Data System (ADS)
Dart, Cameron; Hayes, Alyssa; Khaziev, Rinat; Marcinko, Stephen; Curreli, Davide; Laboratory of Computational Plasma Physics Team
2017-10-01
Advancements were made in the development of the kinetic-kinetic electrostatic Particle-in-Cell code, hPIC, designed for large-scale simulation of the Plasma-Material Interface. hPIC achieved a weak scaling efficiency of 87% using the Algebraic Multigrid Solver BoomerAMG from the PETSc library on more than 64,000 cores of the Blue Waters supercomputer at the University of Illinois at Urbana-Champaign. The code successfully simulates two-stream instability and a volume of plasma over several square centimeters of surface extending out to the presheath in kinetic-kinetic mode. Results from a parametric study of the plasma sheath in strongly magnetized conditions will be presented, as well as a detailed analysis of the plasma sheath structure at grazing magnetic angles. The distribution function and its moments will be reported for plasma species in the simulation domain and at the material surface for plasma sheath simulations. Membership Pending.
NASA Technical Reports Server (NTRS)
Green, Lawrence L.; Newman, Perry A.; Haigler, Kara J.
1993-01-01
The computational technique of automatic differentiation (AD) is applied to a three-dimensional thin-layer Navier-Stokes multigrid flow solver to assess the feasibility and computational impact of obtaining exact sensitivity derivatives typical of those needed for sensitivity analyses. Calculations are performed for an ONERA M6 wing in transonic flow with both the Baldwin-Lomax and Johnson-King turbulence models. The wing lift, drag, and pitching moment coefficients are differentiated with respect to two different groups of input parameters. The first group consists of the second- and fourth-order damping coefficients of the computational algorithm, whereas the second group consists of two parameters in the viscous turbulent flow physics modelling. Results obtained via AD are compared, for both accuracy and computational efficiency with the results obtained with divided differences (DD). The AD results are accurate, extremely simple to obtain, and show significant computational advantage over those obtained by DD for some cases.
Navier-Stokes turbine heat transfer predictions using two-equation turbulence closures
NASA Technical Reports Server (NTRS)
Ameri, Ali A.; Arnone, Andrea
1992-01-01
Navier-Stokes calculations were carried out in order to predict the heat-transfer rates on turbine blades. The calculations were performed using TRAF2D which is a k-epsilon, explicit, finite volume mass-averaged Navier-Stokes solver. Turbulence was modeled using Coakley's q-omega and Chien's k-epsilon two-equation models and the Baldwin-Lomax algebraic model. The model equations along with the flow equations were solved explicitly on a nonperiodic C grid. Implicit residual smoothing (IRS) or a combination of multigrid technique and IRS was applied to enhance convergence rates. Calculations were performed to predict the Stanton number distributions on the first stage vane and blade row as well as the second stage vane row of the SSME high-pressure fuel turbine. The comparison serves to highlight the weaknesses of the turbulence models for use in turbomachinery heat-transfer calculations.
Textbook Multigrid Efficiency for Computational Fluid Dynamics Simulations
NASA Technical Reports Server (NTRS)
Brandt, Achi; Thomas, James L.; Diskin, Boris
2001-01-01
Considerable progress over the past thirty years has been made in the development of large-scale computational fluid dynamics (CFD) solvers for the Euler and Navier-Stokes equations. Computations are used routinely to design the cruise shapes of transport aircraft through complex-geometry simulations involving the solution of 25-100 million equations; in this arena the number of wind-tunnel tests for a new design has been substantially reduced. However, simulations of the entire flight envelope of the vehicle, including maximum lift, buffet onset, flutter, and control effectiveness have not been as successful in eliminating the reliance on wind-tunnel testing. These simulations involve unsteady flows with more separation and stronger shock waves than at cruise. The main reasons limiting further inroads of CFD into the design process are: (1) the reliability of turbulence models; and (2) the time and expense of the numerical simulation. Because of the prohibitive resolution requirements of direct simulations at high Reynolds numbers, transition and turbulence modeling is expected to remain an issue for the near term. The focus of this paper addresses the latter problem by attempting to attain optimal efficiencies in solving the governing equations. Typically current CFD codes based on the use of multigrid acceleration techniques and multistage Runge-Kutta time-stepping schemes are able to converge lift and drag values for cruise configurations within approximately 1000 residual evaluations. An optimally convergent method is defined as having textbook multigrid efficiency (TME), meaning the solutions to the governing system of equations are attained in a computational work which is a small (less than 10) multiple of the operation count in the discretized system of equations (residual equations). In this paper, a distributed relaxation approach to achieving TME for Reynolds-averaged Navier-Stokes (RNAS) equations are discussed along with the foundations that form the basis of this approach. Because the governing equations are a set of coupled nonlinear conservation equations with discontinuities (shocks, slip lines, etc.) and singularities (flow- or grid-induced), the difficulties are many. This paper summarizes recent progress towards the attainment of TME in basic CFD simulations.
NASA Astrophysics Data System (ADS)
Qiang, Ji
2017-10-01
A three-dimensional (3D) Poisson solver with longitudinal periodic and transverse open boundary conditions can have important applications in beam physics of particle accelerators. In this paper, we present a fast efficient method to solve the Poisson equation using a spectral finite-difference method. This method uses a computational domain that contains the charged particle beam only and has a computational complexity of O(Nu(logNmode)) , where Nu is the total number of unknowns and Nmode is the maximum number of longitudinal or azimuthal modes. This saves both the computational time and the memory usage of using an artificial boundary condition in a large extended computational domain. The new 3D Poisson solver is parallelized using a message passing interface (MPI) on multi-processor computers and shows a reasonable parallel performance up to hundreds of processor cores.
A fast multigrid-based electromagnetic eigensolver for curved metal boundaries on the Yee mesh
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bauer, Carl A., E-mail: carl.bauer@colorado.edu; Werner, Gregory R.; Cary, John R.
For embedded boundary electromagnetics using the Dey–Mittra (Dey and Mittra, 1997) [1] algorithm, a special grad–div matrix constructed in this work allows use of multigrid methods for efficient inversion of Maxwell’s curl–curl matrix. Efficient curl–curl inversions are demonstrated within a shift-and-invert Krylov-subspace eigensolver (open-sourced at ([ofortt]https://github.com/bauerca/maxwell[cfortt])) on the spherical cavity and the 9-cell TESLA superconducting accelerator cavity. The accuracy of the Dey–Mittra algorithm is also examined: frequencies converge with second-order error, and surface fields are found to converge with nearly second-order error. In agreement with previous work (Nieter et al., 2009) [2], neglecting some boundary-cut cell faces (as is requiredmore » in the time domain for numerical stability) reduces frequency convergence to first-order and surface-field convergence to zeroth-order (i.e. surface fields do not converge). Additionally and importantly, neglecting faces can reduce accuracy by an order of magnitude at low resolutions.« less
LAPACKrc: Fast linear algebra kernels/solvers for FPGA accelerators
NASA Astrophysics Data System (ADS)
Gonzalez, Juan; Núñez, Rafael C.
2009-07-01
We present LAPACKrc, a family of FPGA-based linear algebra solvers able to achieve more than 100x speedup per commodity processor on certain problems. LAPACKrc subsumes some of the LAPACK and ScaLAPACK functionalities, and it also incorporates sparse direct and iterative matrix solvers. Current LAPACKrc prototypes demonstrate between 40x-150x speedup compared against top-of-the-line hardware/software systems. A technology roadmap is in place to validate current performance of LAPACKrc in HPC applications, and to increase the computational throughput by factors of hundreds within the next few years.
Upscaling of Mixed Finite Element Discretization Problems by the Spectral AMGe Method
Kalchev, Delyan Z.; Lee, C. S.; Villa, U.; ...
2016-09-22
Here, we propose two multilevel spectral techniques for constructing coarse discretization spaces for saddle-point problems corresponding to PDEs involving a divergence constraint, with a focus on mixed finite element discretizations of scalar self-adjoint second order elliptic equations on general unstructured grids. We use element agglomeration algebraic multigrid (AMGe), which employs coarse elements that can have nonstandard shape since they are agglomerates of fine-grid elements. The coarse basis associated with each agglomerated coarse element is constructed by solving local eigenvalue problems and local mixed finite element problems. This construction leads to stable upscaled coarse spaces and guarantees the inf-sup compatibility ofmore » the upscaled discretization. Also, the approximation properties of these upscaled spaces improve by adding more local eigenfunctions to the coarse spaces. The higher accuracy comes at the cost of additional computational effort, as the sparsity of the resulting upscaled coarse discretization (referred to as operator complexity) deteriorates when we introduce additional functions in the coarse space. We also provide an efficient solver for the coarse (upscaled) saddle-point system by employing hybridization, which leads to a symmetric positive definite (s.p.d.) reduced system for the Lagrange multipliers, and to solve the latter s.p.d. system, we use our previously developed spectral AMGe solver. Numerical experiments, in both two and three dimensions, are provided to illustrate the efficiency of the proposed upscaling technique.« less
Upscaling of Mixed Finite Element Discretization Problems by the Spectral AMGe Method
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kalchev, Delyan Z.; Lee, C. S.; Villa, U.
Here, we propose two multilevel spectral techniques for constructing coarse discretization spaces for saddle-point problems corresponding to PDEs involving a divergence constraint, with a focus on mixed finite element discretizations of scalar self-adjoint second order elliptic equations on general unstructured grids. We use element agglomeration algebraic multigrid (AMGe), which employs coarse elements that can have nonstandard shape since they are agglomerates of fine-grid elements. The coarse basis associated with each agglomerated coarse element is constructed by solving local eigenvalue problems and local mixed finite element problems. This construction leads to stable upscaled coarse spaces and guarantees the inf-sup compatibility ofmore » the upscaled discretization. Also, the approximation properties of these upscaled spaces improve by adding more local eigenfunctions to the coarse spaces. The higher accuracy comes at the cost of additional computational effort, as the sparsity of the resulting upscaled coarse discretization (referred to as operator complexity) deteriorates when we introduce additional functions in the coarse space. We also provide an efficient solver for the coarse (upscaled) saddle-point system by employing hybridization, which leads to a symmetric positive definite (s.p.d.) reduced system for the Lagrange multipliers, and to solve the latter s.p.d. system, we use our previously developed spectral AMGe solver. Numerical experiments, in both two and three dimensions, are provided to illustrate the efficiency of the proposed upscaling technique.« less
NASA Tech Briefs, November 2005
NASA Technical Reports Server (NTRS)
2005-01-01
Topics covered include: Laser System for Precise, Unambiguous Range Measurements; Flexible Cryogenic Temperature and Liquid-Level Probes; Precision Cryogenic Dilatometer; Stroboscopic Interferometer for Measuring Mirror Vibrations; Some Improvements in H-PDLCs; Multiple-Bit Differential Detection of OQPSK; Absolute Position Encoders With Vertical Image Binning; Flexible, Carbon-Based Ohmic Contacts for Organic Transistors; GaAs QWIP Array Containing More Than a Million Pixels; AutoChem; Virtual Machine Language; Two-Dimensional Ffowcs Williams/Hawkings Equation Solver; Full Multigrid Flow Solver; Doclet To Synthesize UML; Computing Thermal Effects of Cavitation in Cryogenic Liquids; GUI for Computational Simulation of a Propellant Mixer; Control Program for an Optical-Calibration Robot; SQL-RAMS; Distributing Data from Desktop to Hand-Held Computers; Best-Fit Conic Approximation of Spacecraft Trajectory; Improved Charge-Transfer Fluorescent Dyes; Stability-Augmentation Devices for Miniature Aircraft; Tool Measures Depths of Defects on a Case Tang Joint; Two Heat-Transfer Improvements for Gas Liquefiers; Controlling Force and Depth in Friction Stir Welding; Spill-Resistant Alkali-Metal-Vapor Dispenser; A Methodology for Quantifying Certain Design Requirements During the Design Phase; Measuring Two Key Parameters of H3 Color Centers in Diamond; Improved Compression of Wavelet-Transformed Images; NASA Interactive Forms Type Interface - NIFTI; Predicting Numbers of Problems in Development of Software; Hot-Electron Photon Counters for Detecting Terahertz Photons; Magnetic Variations Associated With Solar Flares; and Artificial Intelligence for Controlling Robotic Aircraft.
NASA Technical Reports Server (NTRS)
Swanson, R. C.; Rossow, C.-C.
2008-01-01
A three-stage Runge-Kutta (RK) scheme with multigrid and an implicit preconditioner has been shown to be an effective solver for the fluid dynamic equations. This scheme has been applied to both the compressible and essentially incompressible Reynolds-averaged Navier-Stokes (RANS) equations using the algebraic turbulence model of Baldwin and Lomax (BL). In this paper we focus on the convergence of the RK/implicit scheme when the effects of turbulence are represented by either the Spalart-Allmaras model or the Wilcox k-! model, which are frequently used models in practical fluid dynamic applications. Convergence behavior of the scheme with these turbulence models and the BL model are directly compared. For this initial investigation we solve the flow equations and the partial differential equations of the turbulence models indirectly coupled. With this approach we examine the convergence behavior of each system. Both point and line symmetric Gauss-Seidel are considered for approximating the inverse of the implicit operator of the flow solver. To solve the turbulence equations we use a diagonally dominant alternating direction implicit (DDADI) scheme. Computational results are presented for three airfoil flow cases and comparisons are made with experimental data. We demonstrate that the two-dimensional RANS equations and transport-type equations for turbulence modeling can be efficiently solved with an indirectly coupled algorithm that uses the RK/implicit scheme for the flow equations.
Fast Solvers for Moving Material Interfaces
2008-01-01
interface method—with the semi-Lagrangian contouring method developed in References [16–20]. We are now finalizing portable C / C ++ codes for fast adaptive ...stepping scheme couples a CIR predictor with a trapezoidal corrector using the velocity evaluated from the CIR approximation. It combines the...formula with efficient geometric algorithms and fast accurate contouring techniques. A modular adaptive implementation with fast new geometry modules
Toward textbook multigrid efficiency for fully implicit resistive magnetohydrodynamics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Adams, Mark F.; Samtaney, Ravi, E-mail: samtaney@pppl.go; Brandt, Achi
2010-09-01
Multigrid methods can solve some classes of elliptic and parabolic equations to accuracy below the truncation error with a work-cost equivalent to a few residual calculations - so-called 'textbook' multigrid efficiency. We investigate methods to solve the system of equations that arise in time dependent magnetohydrodynamics (MHD) simulations with textbook multigrid efficiency. We apply multigrid techniques such as geometric interpolation, full approximate storage, Gauss-Seidel smoothers, and defect correction for fully implicit, nonlinear, second-order finite volume discretizations of MHD. We apply these methods to a standard resistive MHD benchmark problem, the GEM reconnection problem, and add a strong magnetic guide field,more » which is a critical characteristic of magnetically confined fusion plasmas. We show that our multigrid methods can achieve near textbook efficiency on fully implicit resistive MHD simulations.« less
Toward textbook multigrid efficiency for fully implicit resistive magnetohydrodynamics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Adams, Mark F.; Samtaney, Ravi; Brandt, Achi
2010-09-01
Multigrid methods can solve some classes of elliptic and parabolic equations to accuracy below the truncation error with a work-cost equivalent to a few residual calculations – so-called ‘‘textbook” multigrid efficiency. We investigate methods to solve the system of equations that arise in time dependent magnetohydrodynamics (MHD) simulations with textbook multigrid efficiency. We apply multigrid techniques such as geometric interpolation, full approximate storage, Gauss–Seidel smoothers, and defect correction for fully implicit, nonlinear, second-order finite volume discretizations of MHD. We apply these methods to a standard resistive MHD benchmark problem, the GEM reconnection problem, and add a strong magnetic guide field,more » which is a critical characteristic of magnetically confined fusion plasmas. We show that our multigrid methods can achieve near textbook efficiency on fully implicit resistive MHD simulations.« less
Toward textbook multigrid efficiency for fully implicit resistive magnetohydrodynamics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Adams, Mark F.; Samtaney, Ravi; Brandt, Achi
2013-12-14
Multigrid methods can solve some classes of elliptic and parabolic equations to accuracy below the truncation error with a work-cost equivalent to a few residual calculations – so-called “textbook” multigrid efficiency. We investigate methods to solve the system of equations that arise in time dependent magnetohydrodynamics (MHD) simulations with textbook multigrid efficiency. We apply multigrid techniques such as geometric interpolation, full approximate storage, Gauss-Seidel smoothers, and defect correction for fully implicit, nonlinear, second-order finite volume discretizations of MHD. We apply these methods to a standard resistive MHD benchmark problem, the GEM reconnection problem, and add a strong magnetic guide field,more » which is a critical characteristic of magnetically confined fusion plasmas. We show that our multigrid methods can achieve near textbook efficiency on fully implicit resistive MHD simulations.« less
Unified gas-kinetic scheme with multigrid convergence for rarefied flow study
NASA Astrophysics Data System (ADS)
Zhu, Yajun; Zhong, Chengwen; Xu, Kun
2017-09-01
The unified gas kinetic scheme (UGKS) is based on direct modeling of gas dynamics on the mesh size and time step scales. With the modeling of particle transport and collision in a time-dependent flux function in a finite volume framework, the UGKS can connect the flow physics smoothly from the kinetic particle transport to the hydrodynamic wave propagation. In comparison with the direct simulation Monte Carlo (DSMC) method, the current equation-based UGKS can implement implicit techniques in the updates of macroscopic conservative variables and microscopic distribution functions. The implicit UGKS significantly increases the convergence speed for steady flow computations, especially in the highly rarefied and near continuum regimes. In order to further improve the computational efficiency, for the first time, a geometric multigrid technique is introduced into the implicit UGKS, where the prediction step for the equilibrium state and the evolution step for the distribution function are both treated with multigrid acceleration. More specifically, a full approximate nonlinear system is employed in the prediction step for fast evaluation of the equilibrium state, and a correction linear equation is solved in the evolution step for the update of the gas distribution function. As a result, convergent speed has been greatly improved in all flow regimes from rarefied to the continuum ones. The multigrid implicit UGKS (MIUGKS) is used in the non-equilibrium flow study, which includes microflow, such as lid-driven cavity flow and the flow passing through a finite-length flat plate, and high speed one, such as supersonic flow over a square cylinder. The MIUGKS shows 5-9 times efficiency increase over the previous implicit scheme. For the low speed microflow, the efficiency of MIUGKS is several orders of magnitude higher than the DSMC. Even for the hypersonic flow at Mach number 5 and Knudsen number 0.1, the MIUGKS is still more than 100 times faster than the DSMC method for obtaining a convergent steady state solution.
On unstructured grids and solvers
NASA Technical Reports Server (NTRS)
Barth, T. J.
1990-01-01
The fundamentals and the state-of-the-art technology for unstructured grids and solvers are highlighted. Algorithms and techniques pertinent to mesh generation are discussed. It is shown that grid generation and grid manipulation schemes rely on fast multidimensional searching. Flow solution techniques for the Euler equations, which can be derived from the integral form of the equations are discussed. Sample calculations are also provided.
NASA Technical Reports Server (NTRS)
Jentink, Thomas Neil; Usab, William J., Jr.
1990-01-01
An explicit, Multigrid algorithm was written to solve the Euler and Navier-Stokes equations with special consideration given to the coarse mesh boundary conditions. These are formulated in a manner consistent with the interior solution, utilizing forcing terms to prevent coarse-mesh truncation error from affecting the fine-mesh solution. A 4-Stage Hybrid Runge-Kutta Scheme is used to advance the solution in time, and Multigrid convergence is further enhanced by using local time-stepping and implicit residual smoothing. Details of the algorithm are presented along with a description of Jameson's standard Multigrid method and a new approach to formulating the Multigrid equations.
Introduction to multigrid methods
NASA Technical Reports Server (NTRS)
Wesseling, P.
1995-01-01
These notes were written for an introductory course on the application of multigrid methods to elliptic and hyperbolic partial differential equations for engineers, physicists and applied mathematicians. The use of more advanced mathematical tools, such as functional analysis, is avoided. The course is intended to be accessible to a wide audience of users of computational methods. We restrict ourselves to finite volume and finite difference discretization. The basic principles are given. Smoothing methods and Fourier smoothing analysis are reviewed. The fundamental multigrid algorithm is studied. The smoothing and coarse grid approximation properties are discussed. Multigrid schedules and structured programming of multigrid algorithms are treated. Robustness and efficiency are considered.
NASA Technical Reports Server (NTRS)
Chang, S. C.
1984-01-01
Generally, fast direct solvers are not directly applicable to a nonseparable elliptic partial differential equation. This limitation, however, is circumvented by a semi-direct procedure, i.e., an iterative procedure using fast direct solvers. An efficient semi-direct procedure which is easy to implement and applicable to a variety of boundary conditions is presented. The current procedure also possesses other highly desirable properties, i.e.: (1) the convergence rate does not decrease with an increase of grid cell aspect ratio, and (2) the convergence rate is estimated using the coefficients of the partial differential equation being solved.
Some fast elliptic solvers on parallel architectures and their complexities
NASA Technical Reports Server (NTRS)
Gallopoulos, E.; Saad, Y.
1989-01-01
The discretization of separable elliptic partial differential equations leads to linear systems with special block tridiagonal matrices. Several methods are known to solve these systems, the most general of which is the Block Cyclic Reduction (BCR) algorithm which handles equations with nonconstant coefficients. A method was recently proposed to parallelize and vectorize BCR. In this paper, the mapping of BCR on distributed memory architectures is discussed, and its complexity is compared with that of other approaches including the Alternating-Direction method. A fast parallel solver is also described, based on an explicit formula for the solution, which has parallel computational compelxity lower than that of parallel BCR.
Some fast elliptic solvers on parallel architectures and their complexities
NASA Technical Reports Server (NTRS)
Gallopoulos, E.; Saad, Youcef
1989-01-01
The discretization of separable elliptic partial differential equations leads to linear systems with special block triangular matrices. Several methods are known to solve these systems, the most general of which is the Block Cyclic Reduction (BCR) algorithm which handles equations with nonconsistant coefficients. A method was recently proposed to parallelize and vectorize BCR. Here, the mapping of BCR on distributed memory architectures is discussed, and its complexity is compared with that of other approaches, including the Alternating-Direction method. A fast parallel solver is also described, based on an explicit formula for the solution, which has parallel computational complexity lower than that of parallel BCR.
A Critical Study of Agglomerated Multigrid Methods for Diffusion
NASA Technical Reports Server (NTRS)
Nishikawa, Hiroaki; Diskin, Boris; Thomas, James L.
2011-01-01
Agglomerated multigrid techniques used in unstructured-grid methods are studied critically for a model problem representative of laminar diffusion in the incompressible limit. The studied target-grid discretizations and discretizations used on agglomerated grids are typical of current node-centered formulations. Agglomerated multigrid convergence rates are presented using a range of two- and three-dimensional randomly perturbed unstructured grids for simple geometries with isotropic and stretched grids. Two agglomeration techniques are used within an overall topology-preserving agglomeration framework. The results show that multigrid with an inconsistent coarse-grid scheme using only the edge terms (also referred to in the literature as a thin-layer formulation) provides considerable speedup over single-grid methods but its convergence deteriorates on finer grids. Multigrid with a Galerkin coarse-grid discretization using piecewise-constant prolongation and a heuristic correction factor is slower and also grid-dependent. In contrast, grid-independent convergence rates are demonstrated for multigrid with consistent coarse-grid discretizations. Convergence rates of multigrid cycles are verified with quantitative analysis methods in which parts of the two-grid cycle are replaced by their idealized counterparts.
Fast multigrid-based computation of the induced electric field for transcranial magnetic stimulation
NASA Astrophysics Data System (ADS)
Laakso, Ilkka; Hirata, Akimasa
2012-12-01
In transcranial magnetic stimulation (TMS), the distribution of the induced electric field, and the affected brain areas, depends on the position of the stimulation coil and the individual geometry of the head and brain. The distribution of the induced electric field in realistic anatomies can be modelled using computational methods. However, existing computational methods for accurately determining the induced electric field in realistic anatomical models have suffered from long computation times, typically in the range of tens of minutes or longer. This paper presents a matrix-free implementation of the finite-element method with a geometric multigrid method that can potentially reduce the computation time to several seconds or less even when using an ordinary computer. The performance of the method is studied by computing the induced electric field in two anatomically realistic models. An idealized two-loop coil is used as the stimulating coil. Multiple computational grid resolutions ranging from 2 to 0.25 mm are used. The results show that, for macroscopic modelling of the electric field in an anatomically realistic model, computational grid resolutions of 1 mm or 2 mm appear to provide good numerical accuracy compared to higher resolutions. The multigrid iteration typically converges in less than ten iterations independent of the grid resolution. Even without parallelization, each iteration takes about 1.0 s or 0.1 s for the 1 and 2 mm resolutions, respectively. This suggests that calculating the electric field with sufficient accuracy in real time is feasible.
New Nonlinear Multigrid Analysis
NASA Technical Reports Server (NTRS)
Xie, Dexuan
1996-01-01
The nonlinear multigrid is an efficient algorithm for solving the system of nonlinear equations arising from the numerical discretization of nonlinear elliptic boundary problems. In this paper, we present a new nonlinear multigrid analysis as an extension of the linear multigrid theory presented by Bramble. In particular, we prove the convergence of the nonlinear V-cycle method for a class of mildly nonlinear second order elliptic boundary value problems which do not have full elliptic regularity.
From 2D to 3D modelling in long term tectonics: Modelling challenges and HPC solutions (Invited)
NASA Astrophysics Data System (ADS)
Le Pourhiet, L.; May, D.
2013-12-01
Over the last decades, 3D thermo-mechanical codes have been made available to the long term tectonics community either as open source (Underworld, Gale) or more limited access (Fantom, Elvis3D, Douar, LaMem etc ...). However, to date, few published results using these methods have included the coupling between crustal and lithospheric dynamics at large strain. The fact that these computations are computational expensive is not the primary reason for the relatively slow development of 3D modeling in the long term tectonics community, as compare to the rapid development observed within the mantle dynamic community, or in the short-term tectonics field. Long term tectonics problems have specific issues not found in either of these two field, including; large strain (not an issue for short-term), the inclusion of free surface and the occurence of large viscosity contrasts. The first issue is typically eliminated using a combined marker-ALE method instead of fully lagrangian method, however, the marker-ALE approach can pose some algorithmic challenges in a massively parallel environment. The two last issues are more problematic because they affect the convergence of the linear/non-linear solver and the memory cost. Two options have been tested so far, using low order element and solving with a sparse direct solver, or using higher order stable elements together with a multi-grid solver. The first options, is simpler to code and to use but reaches its limit at around 80^3 low order elements. The second option requires more operations but allows using iterative solver on extremely large computers. In this presentation, I will describe the design philosophy and highlight results obtained using a code from the second-class method. The presentation will be oriented from an end-user point of view, using an application from 3D continental break up to illustrate key concepts. The description will proceed point by point from implementing physics into the code, to dealing with specific issues related to solving the discrete system of non linear equations.
Unstructured Mesh Methods for the Simulation of Hypersonic Flows
NASA Technical Reports Server (NTRS)
Peraire, Jaime; Bibb, K. L. (Technical Monitor)
2001-01-01
This report describes the research work undertaken at the Massachusetts Institute of Technology. The aim of this research is to identify effective algorithms and methodologies for the efficient and routine solution of hypersonic viscous flows about re-entry vehicles. For over ten years we have received support from NASA to develop unstructured mesh methods for Computational Fluid Dynamics. As a result of this effort a methodology based on the use, of unstructured adapted meshes of tetrahedra and finite volume flow solvers has been developed. A number of gridding algorithms flow solvers, and adaptive strategies have been proposed. The most successful algorithms developed from the basis of the unstructured mesh system FELISA. The FELISA system has been extensively for the analysis of transonic and hypersonic flows about complete vehicle configurations. The system is highly automatic and allows for the routine aerodynamic analysis of complex configurations starting from CAD data. The code has been parallelized and utilizes efficient solution algorithms. For hypersonic flows, a version of the, code which incorporates real gas effects, has been produced. One of the latest developments before the start of this grant was to extend the system to include viscous effects. This required the development of viscous generators, capable of generating the anisotropic grids required to represent boundary layers, and viscous flow solvers. In figures I and 2, we show some sample hypersonic viscous computations using the developed viscous generators and solvers. Although these initial results were encouraging, it became apparent that in order to develop a fully functional capability for viscous flows, several advances in gridding, solution accuracy, robustness and efficiency were required. As part of this research we have developed: 1) automatic meshing techniques and the corresponding computer codes have been delivered to NASA and implemented into the GridEx system, 2) a finite element algorithm for the solution of the viscous compressible flow equations which can solve flows all the way down to the incompressible limit and that can use higher order (quadratic) approximations leading to highly accurate answers, and 3) and iterative algebraic multigrid solution techniques.
NASA Astrophysics Data System (ADS)
Al-Chalabi, Rifat M. Khalil
1997-09-01
Development of an improvement to the computational efficiency of the existing nested iterative solution strategy of the Nodal Exapansion Method (NEM) nodal based neutron diffusion code NESTLE is presented. The improvement in the solution strategy is the result of developing a multilevel acceleration scheme that does not suffer from the numerical stalling associated with a number of iterative solution methods. The acceleration scheme is based on the multigrid method, which is specifically adapted for incorporation into the NEM nonlinear iterative strategy. This scheme optimizes the computational interplay between the spatial discretization and the NEM nonlinear iterative solution process through the use of the multigrid method. The combination of the NEM nodal method, calculation of the homogenized, neutron nodal balance coefficients (i.e. restriction operator), efficient underlying smoothing algorithm (power method of NESTLE), and the finer mesh reconstruction algorithm (i.e. prolongation operator), all operating on a sequence of coarser spatial nodes, constitutes the multilevel acceleration scheme employed in this research. Two implementations of the multigrid method into the NESTLE code were examined; the Imbedded NEM Strategy and the Imbedded CMFD Strategy. The main difference in implementation between the two methods is that in the Imbedded NEM Strategy, the NEM solution is required at every MG level. Numerical tests have shown that the Imbedded NEM Strategy suffers from divergence at coarse- grid levels, hence all the results for the different benchmarks presented here were obtained using the Imbedded CMFD Strategy. The novelties in the developed MG method are as follows: the formulation of the restriction and prolongation operators, and the selection of the relaxation method. The restriction operator utilizes a variation of the reactor physics, consistent homogenization technique. The prolongation operator is based upon a variant of the pin power reconstruction methodology. The relaxation method, which is the power method, utilizes a constant coefficient matrix within the NEM non-linear iterative strategy. The choice of the MG nesting within the nested iterative strategy enables the incorporation of other non-linear effects with no additional coding effort. In addition, if an eigenvalue problem is being solved, it remains an eigenvalue problem at all grid levels, simplifying coding implementation. The merit of the developed MG method was tested by incorporating it into the NESTLE iterative solver, and employing it to solve four different benchmark problems. In addition to the base cases, three different sensitivity studies are performed, examining the effects of number of MG levels, homogenized coupling coefficients correction (i.e. restriction operator), and fine-mesh reconstruction algorithm (i.e. prolongation operator). The multilevel acceleration scheme developed in this research provides the foundation for developing adaptive multilevel acceleration methods for steady-state and transient NEM nodal neutron diffusion equations. (Abstract shortened by UMI.)
Multigrid schemes for viscous hypersonic flows
NASA Technical Reports Server (NTRS)
Swanson, R. C.; Radespiel, R.
1993-01-01
Several multigrid schemes are considered for the numerical computation of viscous hypersonic flows. For each scheme, the basic solution algorithm employs upwind spatial discretization with explicit multistage time stepping. Two-level versions of the various multigrid algorithms are applied to the two-dimensional advection equation, and Fourier analysis is used to determine their damping properties. The capabilities of the multigrid methods are assessed by solving two different hypersonic flow problems. Some new multigrid schemes, based on semicoarsening strategies, are shown to be quite effective in relieving the stiffness caused by the high-aspect-ratio cells required to resolve high Reynolds number flows. These schemes exhibit good convergence rates for Reynolds numbers up to 200 x 10(exp 6).
Convergence acceleration of the Proteus computer code with multigrid methods
NASA Technical Reports Server (NTRS)
Demuren, A. O.; Ibraheem, S. O.
1995-01-01
This report presents the results of a study to implement convergence acceleration techniques based on the multigrid concept in the two-dimensional and three-dimensional versions of the Proteus computer code. The first section presents a review of the relevant literature on the implementation of the multigrid methods in computer codes for compressible flow analysis. The next two sections present detailed stability analysis of numerical schemes for solving the Euler and Navier-Stokes equations, based on conventional von Neumann analysis and the bi-grid analysis, respectively. The next section presents details of the computational method used in the Proteus computer code. Finally, the multigrid implementation and applications to several two-dimensional and three-dimensional test problems are presented. The results of the present study show that the multigrid method always leads to a reduction in the number of iterations (or time steps) required for convergence. However, there is an overhead associated with the use of multigrid acceleration. The overhead is higher in 2-D problems than in 3-D problems, thus overall multigrid savings in CPU time are in general better in the latter. Savings of about 40-50 percent are typical in 3-D problems, but they are about 20-30 percent in large 2-D problems. The present multigrid method is applicable to steady-state problems and is therefore ineffective in problems with inherently unstable solutions.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hu, Jonathan Joseph; Wiesner, Tobias A.; Prokopenko, Andrey
2014-10-01
The MueLu tutorial is written as a hands-on tutorial for MueLu, the next generation multigrid framework in Trilinos. It covers the whole spectrum from absolute beginners’ topics to expert level. Since the focus of this tutorial is on practical and technical aspects of multigrid methods in general and MueLu in particular, the reader is expected to have a basic understanding of multigrid methods and its general underlying concepts. Please refer to multigrid textbooks (e.g. [1]) for the theoretical background.
CubiCal: Suite for fast radio interferometric calibration
NASA Astrophysics Data System (ADS)
Kenyon, J. S.; Smirnov, O. M.; Grobler, T. L.; Perkins, S. J.
2018-05-01
CubiCal implements several accelerated gain solvers which exploit complex optimization for fast radio interferometric gain calibration. The code can be used for both direction-independent and direction-dependent self-calibration. CubiCal is implemented in Python and Cython, and multiprocessing is fully supported.
Progress with multigrid schemes for hypersonic flow problems
NASA Technical Reports Server (NTRS)
Radespiel, R.; Swanson, R. C.
1991-01-01
Several multigrid schemes are considered for the numerical computation of viscous hypersonic flows. For each scheme, the basic solution algorithm uses upwind spatial discretization with explicit multistage time stepping. Two level versions of the various multigrid algorithms are applied to the two dimensional advection equation, and Fourier analysis is used to determine their damping properties. The capabilities of the multigrid methods are assessed by solving three different hypersonic flow problems. Some new multigrid schemes based on semicoarsening strategies are shown to be quite effective in relieving the stiffness caused by the high aspect ratio cells required to resolve high Reynolds number flows. These schemes exhibit good convergence rates for Reynolds numbers up to 200 x 10(exp 6) and Mach numbers up to 25.
NASA Astrophysics Data System (ADS)
S, Kyriacou; E, Kontoleontos; S, Weissenberger; L, Mangani; E, Casartelli; I, Skouteropoulou; M, Gattringer; A, Gehrer; M, Buchmayr
2014-03-01
An efficient hydraulic optimization procedure, suitable for industrial use, requires an advanced optimization tool (EASY software), a fast solver (block coupled CFD) and a flexible geometry generation tool. EASY optimization software is a PCA-driven metamodel-assisted Evolutionary Algorithm (MAEA (PCA)) that can be used in both single- (SOO) and multiobjective optimization (MOO) problems. In MAEAs, low cost surrogate evaluation models are used to screen out non-promising individuals during the evolution and exclude them from the expensive, problem specific evaluation, here the solution of Navier-Stokes equations. For additional reduction of the optimization CPU cost, the PCA technique is used to identify dependences among the design variables and to exploit them in order to efficiently drive the application of the evolution operators. To further enhance the hydraulic optimization procedure, a very robust and fast Navier-Stokes solver has been developed. This incompressible CFD solver employs a pressure-based block-coupled approach, solving the governing equations simultaneously. This method, apart from being robust and fast, also provides a big gain in terms of computational cost. In order to optimize the geometry of hydraulic machines, an automatic geometry and mesh generation tool is necessary. The geometry generation tool used in this work is entirely based on b-spline curves and surfaces. In what follows, the components of the tool chain are outlined in some detail and the optimization results of hydraulic machine components are shown in order to demonstrate the performance of the presented optimization procedure.
Multigrid direct numerical simulation of the whole process of flow transition in 3-D boundary layers
NASA Technical Reports Server (NTRS)
Liu, Chaoqun; Liu, Zhining
1993-01-01
A new technology was developed in this study which provides a successful numerical simulation of the whole process of flow transition in 3-D boundary layers, including linear growth, secondary instability, breakdown, and transition at relatively low CPU cost. Most other spatial numerical simulations require high CPU cost and blow up at the stage of flow breakdown. A fourth-order finite difference scheme on stretched and staggered grids, a fully implicit time marching technique, a semi-coarsening multigrid based on the so-called approximate line-box relaxation, and a buffer domain for the outflow boundary conditions were all used for high-order accuracy, good stability, and fast convergence. A new fine-coarse-fine grid mapping technique was developed to keep the code running after the laminar flow breaks down. The computational results are in good agreement with linear stability theory, secondary instability theory, and some experiments. The cost for a typical case with 162 x 34 x 34 grid is around 2 CRAY-YMP CPU hours for 10 T-S periods.
NASA Astrophysics Data System (ADS)
Codd, A. L.; Gross, L.
2018-03-01
We present a new inversion method for Electrical Resistivity Tomography which, in contrast to established approaches, minimizes the cost function prior to finite element discretization for the unknown electric conductivity and electric potential. Minimization is performed with the Broyden-Fletcher-Goldfarb-Shanno method (BFGS) in an appropriate function space. BFGS is self-preconditioning and avoids construction of the dense Hessian which is the major obstacle to solving large 3-D problems using parallel computers. In addition to the forward problem predicting the measurement from the injected current, the so-called adjoint problem also needs to be solved. For this problem a virtual current is injected through the measurement electrodes and an adjoint electric potential is obtained. The magnitude of the injected virtual current is equal to the misfit at the measurement electrodes. This new approach has the advantage that the solution process of the optimization problem remains independent to the meshes used for discretization and allows for mesh adaptation during inversion. Computation time is reduced by using superposition of pole loads for the forward and adjoint problems. A smoothed aggregation algebraic multigrid (AMG) preconditioned conjugate gradient is applied to construct the potentials for a given electric conductivity estimate and for constructing a first level BFGS preconditioner. Through the additional reuse of AMG operators and coarse grid solvers inversion time for large 3-D problems can be reduced further. We apply our new inversion method to synthetic survey data created by the resistivity profile representing the characteristics of subsurface fluid injection. We further test it on data obtained from a 2-D surface electrode survey on Heron Island, a small tropical island off the east coast of central Queensland, Australia.
Annual Copper Mountain Conferences on Multigrid and Iterative Methods, Copper Mountain, Colorado
DOE Office of Scientific and Technical Information (OSTI.GOV)
McCormick, Stephen F.
This project supported the Copper Mountain Conference on Multigrid and Iterative Methods, held from 2007 to 2015, at Copper Mountain, Colorado. The subject of the Copper Mountain Conference Series alternated between Multigrid Methods in odd-numbered years and Iterative Methods in even-numbered years. Begun in 1983, the Series represents an important forum for the exchange of ideas in these two closely related fields. This report describes the Copper Mountain Conference on Multigrid and Iterative Methods, 2007-2015. Information on the conference series is available at http://grandmaster.colorado.edu/~copper/.
Fast Quantitative Susceptibility Mapping with L1-Regularization and Automatic Parameter Selection
Bilgic, Berkin; Fan, Audrey P.; Polimeni, Jonathan R.; Cauley, Stephen F.; Bianciardi, Marta; Adalsteinsson, Elfar; Wald, Lawrence L.; Setsompop, Kawin
2014-01-01
Purpose To enable fast reconstruction of quantitative susceptibility maps with Total Variation penalty and automatic regularization parameter selection. Methods ℓ1-regularized susceptibility mapping is accelerated by variable-splitting, which allows closed-form evaluation of each iteration of the algorithm by soft thresholding and FFTs. This fast algorithm also renders automatic regularization parameter estimation practical. A weighting mask derived from the magnitude signal can be incorporated to allow edge-aware regularization. Results Compared to the nonlinear Conjugate Gradient (CG) solver, the proposed method offers 20× speed-up in reconstruction time. A complete pipeline including Laplacian phase unwrapping, background phase removal with SHARP filtering and ℓ1-regularized dipole inversion at 0.6 mm isotropic resolution is completed in 1.2 minutes using Matlab on a standard workstation compared to 22 minutes using the Conjugate Gradient solver. This fast reconstruction allows estimation of regularization parameters with the L-curve method in 13 minutes, which would have taken 4 hours with the CG algorithm. Proposed method also permits magnitude-weighted regularization, which prevents smoothing across edges identified on the magnitude signal. This more complicated optimization problem is solved 5× faster than the nonlinear CG approach. Utility of the proposed method is also demonstrated in functional BOLD susceptibility mapping, where processing of the massive time-series dataset would otherwise be prohibitive with the CG solver. Conclusion Online reconstruction of regularized susceptibility maps may become feasible with the proposed dipole inversion. PMID:24259479
An accurate, fast, and scalable solver for high-frequency wave propagation
NASA Astrophysics Data System (ADS)
Zepeda-Núñez, L.; Taus, M.; Hewett, R.; Demanet, L.
2017-12-01
In many science and engineering applications, solving time-harmonic high-frequency wave propagation problems quickly and accurately is of paramount importance. For example, in geophysics, particularly in oil exploration, such problems can be the forward problem in an iterative process for solving the inverse problem of subsurface inversion. It is important to solve these wave propagation problems accurately in order to efficiently obtain meaningful solutions of the inverse problems: low order forward modeling can hinder convergence. Additionally, due to the volume of data and the iterative nature of most optimization algorithms, the forward problem must be solved many times. Therefore, a fast solver is necessary to make solving the inverse problem feasible. For time-harmonic high-frequency wave propagation, obtaining both speed and accuracy is historically challenging. Recently, there have been many advances in the development of fast solvers for such problems, including methods which have linear complexity with respect to the number of degrees of freedom. While most methods scale optimally only in the context of low-order discretizations and smooth wave speed distributions, the method of polarized traces has been shown to retain optimal scaling for high-order discretizations, such as hybridizable discontinuous Galerkin methods and for highly heterogeneous (and even discontinuous) wave speeds. The resulting fast and accurate solver is consequently highly attractive for geophysical applications. To date, this method relies on a layered domain decomposition together with a preconditioner applied in a sweeping fashion, which has limited straight-forward parallelization. In this work, we introduce a new version of the method of polarized traces which reveals more parallel structure than previous versions while preserving all of its other advantages. We achieve this by further decomposing each layer and applying the preconditioner to these new components separately and in parallel. We demonstrate that this produces an even more effective and parallelizable preconditioner for a single right-hand side. As before, additional speed can be gained by pipelining several right-hand-sides.
NASA Astrophysics Data System (ADS)
Usman, K.; Walayat, K.; Mahmood, R.; Kousar, N.
2018-06-01
We have examined the behavior of solid particles in particulate flows. The interaction of particles with each other and with the fluid is analyzed. Solid particles can move freely through a fixed computational mesh using an Eulerian approach. Fictitious boundary method (FBM) is used for treating the interaction between particles and the fluid. Hydrodynamic forces acting on the particle's surface are calculated using an explicit volume integral approach. A collision model proposed by Glowinski, Singh, Joseph and coauthors is used to handle particle-wall and particle-particle interactions. The particulate flow is computed using multigrid finite element solver FEATFLOW. Numerical experiments are performed considering two particles falling and colliding and sedimentation of many particles while interacting with each other. Results for these experiments are presented and compared with the reference values. Effects of the particle-particle interaction on the motion of the particles and on the physical behavior of the fluid-particle system has been analyzed.
Development of advanced Navier-Stokes solver
NASA Technical Reports Server (NTRS)
Yoon, Seokkwan
1994-01-01
The objective of research was to develop and validate new computational algorithms for solving the steady and unsteady Euler and Navier-Stokes equations. The end-products are new three-dimensional Euler and Navier-Stokes codes that are faster, more reliable, more accurate, and easier to use. The three-dimensional Euler and full/thin-layer Reynolds-averaged Navier-Stokes equations for compressible/incompressible flows are solved on structured hexahedral grids. The Baldwin-Lomax algebraic turbulence model is used for closure. The space discretization is based on a cell-centered finite-volume method augmented by a variety of numerical dissipation models with optional total variation diminishing limiters. The governing equations are integrated in time by an implicit method based on lower-upper factorization and symmetric Gauss-Seidel relaxation. The algorithm is vectorized on diagonal planes of sweep using two-dimensional indices in three dimensions. Convergence rates and the robustness of the codes are enhanced by the use of an implicit full approximation storage multigrid method.
NASA Technical Reports Server (NTRS)
Swisshelm, Julie M.
1989-01-01
An explicit flow solver, applicable to the hierarchy of model equations ranging from Euler to full Navier-Stokes, is combined with several techniques designed to reduce computational expense. The computational domain consists of local grid refinements embedded in a global coarse mesh, where the locations of these refinements are defined by the physics of the flow. Flow characteristics are also used to determine which set of model equations is appropriate for solution in each region, thereby reducing not only the number of grid points at which the solution must be obtained, but also the computational effort required to get that solution. Acceleration to steady-state is achieved by applying multigrid on each of the subgrids, regardless of the particular model equations being solved. Since each of these components is explicit, advantage can readily be taken of the vector- and parallel-processing capabilities of machines such as the Cray X-MP and Cray-2.
Real-time adaptive finite element solution of time-dependent Kohn-Sham equation
NASA Astrophysics Data System (ADS)
Bao, Gang; Hu, Guanghui; Liu, Di
2015-01-01
In our previous paper (Bao et al., 2012 [1]), a general framework of using adaptive finite element methods to solve the Kohn-Sham equation has been presented. This work is concerned with solving the time-dependent Kohn-Sham equations. The numerical methods are studied in the time domain, which can be employed to explain both the linear and the nonlinear effects. A Crank-Nicolson scheme and linear finite element space are employed for the temporal and spatial discretizations, respectively. To resolve the trouble regions in the time-dependent simulations, a heuristic error indicator is introduced for the mesh adaptive methods. An algebraic multigrid solver is developed to efficiently solve the complex-valued system derived from the semi-implicit scheme. A mask function is employed to remove or reduce the boundary reflection of the wavefunction. The effectiveness of our method is verified by numerical simulations for both linear and nonlinear phenomena, in which the effectiveness of the mesh adaptive methods is clearly demonstrated.
The GBS code for tokamak scrape-off layer simulations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Halpern, F.D., E-mail: federico.halpern@epfl.ch; Ricci, P.; Jolliet, S.
2016-06-15
We describe a new version of GBS, a 3D global, flux-driven plasma turbulence code to simulate the turbulent dynamics in the tokamak scrape-off layer (SOL), superseding the code presented by Ricci et al. (2012) [14]. The present work is driven by the objective of studying SOL turbulent dynamics in medium size tokamaks and beyond with a high-fidelity physics model. We emphasize an intertwining framework of improved physics models and the computational improvements that allow them. The model extensions include neutral atom physics, finite ion temperature, the addition of a closed field line region, and a non-Boussinesq treatment of the polarizationmore » drift. GBS has been completely refactored with the introduction of a 3-D Cartesian communicator and a scalable parallel multigrid solver. We report dramatically enhanced parallel scalability, with the possibility of treating electromagnetic fluctuations very efficiently. The method of manufactured solutions as a verification process has been carried out for this new code version, demonstrating the correct implementation of the physical model.« less
Applications of automatic differentiation in computational fluid dynamics
NASA Technical Reports Server (NTRS)
Green, Lawrence L.; Carle, A.; Bischof, C.; Haigler, Kara J.; Newman, Perry A.
1994-01-01
Automatic differentiation (AD) is a powerful computational method that provides for computing exact sensitivity derivatives (SD) from existing computer programs for multidisciplinary design optimization (MDO) or in sensitivity analysis. A pre-compiler AD tool for FORTRAN programs called ADIFOR has been developed. The ADIFOR tool has been easily and quickly applied by NASA Langley researchers to assess the feasibility and computational impact of AD in MDO with several different FORTRAN programs. These include a state-of-the-art three dimensional multigrid Navier-Stokes flow solver for wings or aircraft configurations in transonic turbulent flow. With ADIFOR the user specifies sets of independent and dependent variables with an existing computer code. ADIFOR then traces the dependency path throughout the code, applies the chain rule to formulate derivative expressions, and generates new code to compute the required SD matrix. The resulting codes have been verified to compute exact non-geometric and geometric SD for a variety of cases. in less time than is required to compute the SD matrix using centered divided differences.
Multitasking a three-dimensional Navier-Stokes algorithm on the Cray-2
NASA Technical Reports Server (NTRS)
Swisshelm, Julie M.
1989-01-01
A three-dimensional computational aerodynamics algorithm has been multitasked for efficient parallel execution on the Cray-2. It provides a means for examining the multitasking performance of a complete CFD application code. An embedded zonal multigrid scheme is used to solve the Reynolds-averaged Navier-Stokes equations for an internal flow model problem. The explicit nature of each component of the method allows a spatial partitioning of the computational domain to achieve a well-balanced task load for MIMD computers with vector-processing capability. Experiments have been conducted with both two- and three-dimensional multitasked cases. The best speedup attained by an individual task group was 3.54 on four processors of the Cray-2, while the entire solver yielded a speedup of 2.67 on four processors for the three-dimensional case. The multiprocessing efficiency of various types of computational tasks is examined, performance on two Cray-2s with different memory access speeds is compared, and extrapolation to larger problems is discussed.
Multilevel acceleration of scattering-source iterations with application to electron transport
Drumm, Clif; Fan, Wesley
2017-08-18
Acceleration/preconditioning strategies available in the SCEPTRE radiation transport code are described. A flexible transport synthetic acceleration (TSA) algorithm that uses a low-order discrete-ordinates (S N) or spherical-harmonics (P N) solve to accelerate convergence of a high-order S N source-iteration (SI) solve is described. Convergence of the low-order solves can be further accelerated by applying off-the-shelf incomplete-factorization or algebraic-multigrid methods. Also available is an algorithm that uses a generalized minimum residual (GMRES) iterative method rather than SI for convergence, using a parallel sweep-based solver to build up a Krylov subspace. TSA has been applied as a preconditioner to accelerate the convergencemore » of the GMRES iterations. The methods are applied to several problems involving electron transport and problems with artificial cross sections with large scattering ratios. These methods were compared and evaluated by considering material discontinuities and scattering anisotropy. Observed accelerations obtained are highly problem dependent, but speedup factors around 10 have been observed in typical applications.« less
Structured grid technology to enable flow simulation in an integrated system environment
NASA Astrophysics Data System (ADS)
Remotigue, Michael Gerard
An application-driven Computational Fluid Dynamics (CFD) environment needs flexible and general tools to effectively solve complex problems in a timely manner. In addition, reusable, portable, and maintainable specialized libraries will aid in rapidly developing integrated systems or procedures. The presented structured grid technology enables the flow simulation for complex geometries by addressing grid generation, grid decomposition/solver setup, solution, and interpretation. Grid generation is accomplished with the graphical, arbitrarily-connected, multi-block structured grid generation software system (GUM-B) developed and presented here. GUM-B is an integrated system comprised of specialized libraries for the graphical user interface and graphical display coupled with a solid-modeling data structure that utilizes a structured grid generation library and a geometric library based on Non-Uniform Rational B-Splines (NURBS). A presented modification of the solid-modeling data structure provides the capability for arbitrarily-connected regions between the grid blocks. The presented grid generation library provides algorithms that are reliable and accurate. GUM-B has been utilized to generate numerous structured grids for complex geometries in hydrodynamics, propulsors, and aerodynamics. The versatility of the libraries that compose GUM-B is also displayed in a prototype to automatically regenerate a grid for a free-surface solution. Grid decomposition and solver setup is accomplished with the graphical grid manipulation and repartition software system (GUMBO) developed and presented here. GUMBO is an integrated system comprised of specialized libraries for the graphical user interface and graphical display coupled with a structured grid-tools library. The described functions within the grid-tools library reduce the possibility of human error during decomposition and setup for the numerical solver by accounting for boundary conditions and connectivity. GUMBO is linked with a flow solver interface, to the parallel UNCLE code, to provide load balancing tools and solver setup. Weeks of boundary condition and connectivity specification and validation has been reduced to hours. The UNCLE flow solver is utilized for the solution of the flow field. To accelerate convergence toward a quick engineering answer, a full multigrid (FMG) approach coupled with UNCLE, which is a full approximation scheme (FAS), is presented. The prolongation operators used in the FMG-FAS method are compared. The procedure is demonstrated on a marine propeller in incompressible flow. Interpretation of the solution is accomplished by vortex feature detection. Regions of "Intrinsic Swirl" are located by interrogating the velocity gradient tensor for complex eigenvalues. The "Intrinsic Swirl" parameter is visualized on a solution of a marine propeller to determine if any vortical features are captured. The libraries and the structured grid technology presented herein are flexible and general enough to tackle a variety of complex applications. This technology has significantly enabled the capability of the ERC personnel to effectively calculate solutions for complex geometries.
An efficient spectral crystal plasticity solver for GPU architectures
NASA Astrophysics Data System (ADS)
Malahe, Michael
2018-03-01
We present a spectral crystal plasticity (CP) solver for graphics processing unit (GPU) architectures that achieves a tenfold increase in efficiency over prior GPU solvers. The approach makes use of a database containing a spectral decomposition of CP simulations performed using a conventional iterative solver over a parameter space of crystal orientations and applied velocity gradients. The key improvements in efficiency come from reducing global memory transactions, exposing more instruction-level parallelism, reducing integer instructions and performing fast range reductions on trigonometric arguments. The scheme also makes more efficient use of memory than prior work, allowing for larger problems to be solved on a single GPU. We illustrate these improvements with a simulation of 390 million crystal grains on a consumer-grade GPU, which executes at a rate of 2.72 s per strain step.
High order multi-grid methods to solve the Poisson equation
NASA Technical Reports Server (NTRS)
Schaffer, S.
1981-01-01
High order multigrid methods based on finite difference discretization of the model problem are examined. The following methods are described: (1) a fixed high order FMG-FAS multigrid algorithm; (2) the high order methods; and (3) results are presented on four problems using each method with the same underlying fixed FMG-FAS algorithm.
A Critical Study of Agglomerated Multigrid Methods for Diffusion
NASA Technical Reports Server (NTRS)
Thomas, James L.; Nishikawa, Hiroaki; Diskin, Boris
2009-01-01
Agglomerated multigrid techniques used in unstructured-grid methods are studied critically for a model problem representative of laminar diffusion in the incompressible limit. The studied target-grid discretizations and discretizations used on agglomerated grids are typical of current node-centered formulations. Agglomerated multigrid convergence rates are presented using a range of two- and three-dimensional randomly perturbed unstructured grids for simple geometries with isotropic and highly stretched grids. Two agglomeration techniques are used within an overall topology-preserving agglomeration framework. The results show that multigrid with an inconsistent coarse-grid scheme using only the edge terms (also referred to in the literature as a thin-layer formulation) provides considerable speedup over single-grid methods but its convergence deteriorates on finer grids. Multigrid with a Galerkin coarse-grid discretization using piecewise-constant prolongation and a heuristic correction factor is slower and also grid-dependent. In contrast, grid-independent convergence rates are demonstrated for multigrid with consistent coarse-grid discretizations. Actual cycle results are verified using quantitative analysis methods in which parts of the cycle are replaced by their idealized counterparts.
NASA Astrophysics Data System (ADS)
Shi, Yu; Liang, Long; Ge, Hai-Wen; Reitz, Rolf D.
2010-03-01
Acceleration of the chemistry solver for engine combustion is of much interest due to the fact that in practical engine simulations extensive computational time is spent solving the fuel oxidation and emission formation chemistry. A dynamic adaptive chemistry (DAC) scheme based on a directed relation graph error propagation (DRGEP) method has been applied to study homogeneous charge compression ignition (HCCI) engine combustion with detailed chemistry (over 500 species) previously using an R-value-based breadth-first search (RBFS) algorithm, which significantly reduced computational times (by as much as 30-fold). The present paper extends the use of this on-the-fly kinetic mechanism reduction scheme to model combustion in direct-injection (DI) engines. It was found that the DAC scheme becomes less efficient when applied to DI engine simulations using a kinetic mechanism of relatively small size and the accuracy of the original DAC scheme decreases for conventional non-premixed combustion engine. The present study also focuses on determination of search-initiating species, involvement of the NOx chemistry, selection of a proper error tolerance, as well as treatment of the interaction of chemical heat release and the fuel spray. Both the DAC schemes were integrated into the ERC KIVA-3v2 code, and simulations were conducted to compare the two schemes. In general, the present DAC scheme has better efficiency and similar accuracy compared to the previous DAC scheme. The efficiency depends on the size of the chemical kinetics mechanism used and the engine operating conditions. For cases using a small n-heptane kinetic mechanism of 34 species, 30% of the computational time is saved, and 50% for a larger n-heptane kinetic mechanism of 61 species. The paper also demonstrates that by combining the present DAC scheme with an adaptive multi-grid chemistry (AMC) solver, it is feasible to simulate a direct-injection engine using a detailed n-heptane mechanism with 543 species with practical computer time.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Weston, Brian T.
This dissertation focuses on the development of a fully-implicit, high-order compressible ow solver with phase change. The work is motivated by laser-induced phase change applications, particularly by the need to develop large-scale multi-physics simulations of the selective laser melting (SLM) process in metal additive manufacturing (3D printing). Simulations of the SLM process require precise tracking of multi-material solid-liquid-gas interfaces, due to laser-induced melting/ solidi cation and evaporation/condensation of metal powder in an ambient gas. These rapid density variations and phase change processes tightly couple the governing equations, requiring a fully compressible framework to robustly capture the rapid density variations ofmore » the ambient gas and the melting/evaporation of the metal powder. For non-isothermal phase change, the velocity is gradually suppressed through the mushy region by a variable viscosity and Darcy source term model. The governing equations are discretized up to 4th-order accuracy with our reconstructed Discontinuous Galerkin spatial discretization scheme and up to 5th-order accuracy with L-stable fully implicit time discretization schemes (BDF2 and ESDIRK3-5). The resulting set of non-linear equations is solved using a robust Newton-Krylov method, with the Jacobian-free version of the GMRES solver for linear iterations. Due to the sti nes associated with the acoustic waves and thermal and viscous/material strength e ects, preconditioning the GMRES solver is essential. A robust and scalable approximate block factorization preconditioner was developed, which utilizes the velocity-pressure (vP) and velocity-temperature (vT) Schur complement systems. This multigrid block reduction preconditioning technique converges for high CFL/Fourier numbers and exhibits excellent parallel and algorithmic scalability on classic benchmark problems in uid dynamics (lid-driven cavity ow and natural convection heat transfer) as well as for laser-induced phase change problems in 2D and 3D.« less
On the connection between multigrid and cyclic reduction
NASA Technical Reports Server (NTRS)
Merriam, M. L.
1984-01-01
A technique is shown whereby it is possible to relate a particular multigrid process to cyclic reduction using purely mathematical arguments. This technique suggest methods for solving Poisson's equation in 1-, 2-, or 3-dimensions with Dirichlet or Neumann boundary conditions. In one dimension the method is exact and, in fact, reduces to cyclic reduction. This provides a valuable reference point for understanding multigrid techniques. The particular multigrid process analyzed is referred to here as Approximate Cyclic Reduction (ACR) and is one of a class known as Multigrid Reduction methods in the literature. It involves one approximation with a known error term. It is possible to relate the error term in this approximation with certain eigenvector components of the error. These are sharply reduced in amplitude by classical relaxation techniques. The approximation can thus be made a very good one.
Convergence acceleration of the Proteus computer code with multigrid methods
NASA Technical Reports Server (NTRS)
Demuren, A. O.; Ibraheem, S. O.
1992-01-01
Presented here is the first part of a study to implement convergence acceleration techniques based on the multigrid concept in the Proteus computer code. A review is given of previous studies on the implementation of multigrid methods in computer codes for compressible flow analysis. Also presented is a detailed stability analysis of upwind and central-difference based numerical schemes for solving the Euler and Navier-Stokes equations. Results are given of a convergence study of the Proteus code on computational grids of different sizes. The results presented here form the foundation for the implementation of multigrid methods in the Proteus code.
BeamDyn: A High-Fidelity Wind Turbine Blade Solver in the FAST Modular Framework: Preprint
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Q.; Sprague, M.; Jonkman, J.
2015-01-01
BeamDyn, a Legendre-spectral-finite-element implementation of geometrically exact beam theory (GEBT), was developed to meet the design challenges associated with highly flexible composite wind turbine blades. In this paper, the governing equations of GEBT are reformulated into a nonlinear state-space form to support its coupling within the modular framework of the FAST wind turbine computer-aided engineering (CAE) tool. Different time integration schemes (implicit and explicit) were implemented and examined for wind turbine analysis. Numerical examples are presented to demonstrate the capability of this new beam solver. An example analysis of a realistic wind turbine blade, the CX-100, is also presented asmore » validation.« less
Multigrid method for the equilibrium equations of elasticity using a compact scheme
NASA Technical Reports Server (NTRS)
Taasan, S.
1986-01-01
A compact difference scheme is derived for treating the equilibrium equations of elasticity. The scheme is inconsistent and unstable. A multigrid method which takes into account these properties is described. The solution of the discrete equations, up to the level of discretization errors, is obtained by this method in just two multigrid cycles.
MILAMIN 2 - Fast MATLAB FEM solver
NASA Astrophysics Data System (ADS)
Dabrowski, Marcin; Krotkiewski, Marcin; Schmid, Daniel W.
2013-04-01
MILAMIN is a free and efficient MATLAB-based two-dimensional FEM solver utilizing unstructured meshes [Dabrowski et al., G-cubed (2008)]. The code consists of steady-state thermal diffusion and incompressible Stokes flow solvers implemented in approximately 200 lines of native MATLAB code. The brevity makes the code easily customizable. An important quality of MILAMIN is speed - it can handle millions of nodes within minutes on one CPU core of a standard desktop computer, and is faster than many commercial solutions. The new MILAMIN 2 allows three-dimensional modeling. It is designed as a set of functional modules that can be used as building blocks for efficient FEM simulations using MATLAB. The utilities are largely implemented as native MATLAB functions. For performance critical parts we use MUTILS - a suite of compiled MEX functions optimized for shared memory multi-core computers. The most important features of MILAMIN 2 are: 1. Modular approach to defining, tracking, and discretizing the geometry of the model 2. Interfaces to external mesh generators (e.g., Triangle, Fade2d, T3D) and mesh utilities (e.g., element type conversion, fast point location, boundary extraction) 3. Efficient computation of the stiffness matrix for a wide range of element types, anisotropic materials and three-dimensional problems 4. Fast global matrix assembly using a dedicated MEX function 5. Automatic integration rules 6. Flexible prescription (spatial, temporal, and field functions) and efficient application of Dirichlet, Neuman, and periodic boundary conditions 7. Treatment of transient and non-linear problems 8. Various iterative and multi-level solution strategies 9. Post-processing tools (e.g., numerical integration) 10. Visualization primitives using MATLAB, and VTK export functions We provide a large number of examples that show how to implement a custom FEM solver using the MILAMIN 2 framework. The examples are MATLAB scripts of increasing complexity that address a given technical topic (e.g., creating meshes, reordering nodes, applying boundary conditions), a given numerical topic (e.g., using various solution strategies, non-linear iterations), or that present a fully-developed solver designed to address a scientific topic (e.g., performing Stokes flow simulations in synthetic porous medium). References: Dabrowski, M., M. Krotkiewski, and D. W. Schmid MILAMIN: MATLAB-based finite element method solver for large problems, Geochem. Geophys. Geosyst., 9, Q04030, 2008
NASA Technical Reports Server (NTRS)
Cain, Michael D.
1999-01-01
The goal of this thesis is to develop an efficient and robust locally preconditioned semi-coarsening multigrid algorithm for the two-dimensional Navier-Stokes equations. This thesis examines the performance of the multigrid algorithm with local preconditioning for an upwind-discretization of the Navier-Stokes equations. A block Jacobi iterative scheme is used because of its high frequency error mode damping ability. At low Mach numbers, the performance of a flux preconditioner is investigated. The flux preconditioner utilizes a new limiting technique based on local information that was developed by Siu. Full-coarsening and-semi-coarsening are examined as well as the multigrid V-cycle and full multigrid. The numerical tests were performed on a NACA 0012 airfoil at a range of Mach numbers. The tests show that semi-coarsening with flux preconditioning is the most efficient and robust combination of coarsening strategy, and iterative scheme - especially at low Mach numbers.
On a multigrid method for the coupled Stokes and porous media flow problem
NASA Astrophysics Data System (ADS)
Luo, P.; Rodrigo, C.; Gaspar, F. J.; Oosterlee, C. W.
2017-07-01
The multigrid solution of coupled porous media and Stokes flow problems is considered. The Darcy equation as the saturated porous medium model is coupled to the Stokes equations by means of appropriate interface conditions. We focus on an efficient multigrid solution technique for the coupled problem, which is discretized by finite volumes on staggered grids, giving rise to a saddle point linear system. Special treatment is required regarding the discretization at the interface. An Uzawa smoother is employed in multigrid, which is a decoupled procedure based on symmetric Gauss-Seidel smoothing for velocity components and a simple Richardson iteration for the pressure field. Since a relaxation parameter is part of a Richardson iteration, Local Fourier Analysis (LFA) is applied to determine the optimal parameters. Highly satisfactory multigrid convergence is reported, and, moreover, the algorithm performs well for small values of the hydraulic conductivity and fluid viscosity, that are relevant for applications.
Uniform convergence of multigrid V-cycle iterations for indefinite and nonsymmetric problems
NASA Technical Reports Server (NTRS)
Bramble, James H.; Kwak, Do Y.; Pasciak, Joseph E.
1993-01-01
In this paper, we present an analysis of a multigrid method for nonsymmetric and/or indefinite elliptic problems. In this multigrid method various types of smoothers may be used. One type of smoother which we consider is defined in terms of an associated symmetric problem and includes point and line, Jacobi, and Gauss-Seidel iterations. We also study smoothers based entirely on the original operator. One is based on the normal form, that is, the product of the operator and its transpose. Other smoothers studied include point and line, Jacobi, and Gauss-Seidel. We show that the uniform estimates for symmetric positive definite problems carry over to these algorithms. More precisely, the multigrid iteration for the nonsymmetric and/or indefinite problem is shown to converge at a uniform rate provided that the coarsest grid in the multilevel iteration is sufficiently fine (but not depending on the number of multigrid levels).
Fast Laplace solver approach to pore-scale permeability
NASA Astrophysics Data System (ADS)
Arns, C. H.; Adler, P. M.
2018-02-01
We introduce a powerful and easily implemented method to calculate the permeability of porous media at the pore scale using an approximation based on the Poiseulle equation to calculate permeability to fluid flow with a Laplace solver. The method consists of calculating the Euclidean distance map of the fluid phase to assign local conductivities and lends itself naturally to the treatment of multiscale problems. We compare with analytical solutions as well as experimental measurements and lattice Boltzmann calculations of permeability for Fontainebleau sandstone. The solver is significantly more stable than the lattice Boltzmann approach, uses less memory, and is significantly faster. Permeabilities are in excellent agreement over a wide range of porosities.
solveME: fast and reliable solution of nonlinear ME models.
Yang, Laurence; Ma, Ding; Ebrahim, Ali; Lloyd, Colton J; Saunders, Michael A; Palsson, Bernhard O
2016-09-22
Genome-scale models of metabolism and macromolecular expression (ME) significantly expand the scope and predictive capabilities of constraint-based modeling. ME models present considerable computational challenges: they are much (>30 times) larger than corresponding metabolic reconstructions (M models), are multiscale, and growth maximization is a nonlinear programming (NLP) problem, mainly due to macromolecule dilution constraints. Here, we address these computational challenges. We develop a fast and numerically reliable solution method for growth maximization in ME models using a quad-precision NLP solver (Quad MINOS). Our method was up to 45 % faster than binary search for six significant digits in growth rate. We also develop a fast, quad-precision flux variability analysis that is accelerated (up to 60× speedup) via solver warm-starts. Finally, we employ the tools developed to investigate growth-coupled succinate overproduction, accounting for proteome constraints. Just as genome-scale metabolic reconstructions have become an invaluable tool for computational and systems biologists, we anticipate that these fast and numerically reliable ME solution methods will accelerate the wide-spread adoption of ME models for researchers in these fields.
NASA Astrophysics Data System (ADS)
Lu, Benzhuo; Cheng, Xiaolin; Huang, Jingfang; McCammon, J. Andrew
2010-06-01
A Fortran program package is introduced for rapid evaluation of the electrostatic potentials and forces in biomolecular systems modeled by the linearized Poisson-Boltzmann equation. The numerical solver utilizes a well-conditioned boundary integral equation (BIE) formulation, a node-patch discretization scheme, a Krylov subspace iterative solver package with reverse communication protocols, and an adaptive new version of fast multipole method in which the exponential expansions are used to diagonalize the multipole-to-local translations. The program and its full description, as well as several closely related libraries and utility tools are available at http://lsec.cc.ac.cn/~lubz/afmpb.html and a mirror site at http://mccammon.ucsd.edu/. This paper is a brief summary of the program: the algorithms, the implementation and the usage. Program summaryProgram title: AFMPB: Adaptive fast multipole Poisson-Boltzmann solver Catalogue identifier: AEGB_v1_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/AEGB_v1_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: GPL 2.0 No. of lines in distributed program, including test data, etc.: 453 649 No. of bytes in distributed program, including test data, etc.: 8 764 754 Distribution format: tar.gz Programming language: Fortran Computer: Any Operating system: Any RAM: Depends on the size of the discretized biomolecular system Classification: 3 External routines: Pre- and post-processing tools are required for generating the boundary elements and for visualization. Users can use MSMS ( http://www.scripps.edu/~sanner/html/msms_home.html) for pre-processing, and VMD ( http://www.ks.uiuc.edu/Research/vmd/) for visualization. Sub-programs included: An iterative Krylov subspace solvers package from SPARSKIT by Yousef Saad ( http://www-users.cs.umn.edu/~saad/software/SPARSKIT/sparskit.html), and the fast multipole methods subroutines from FMMSuite ( http://www.fastmultipole.org/). Nature of problem: Numerical solution of the linearized Poisson-Boltzmann equation that describes electrostatic interactions of molecular systems in ionic solutions. Solution method: A novel node-patch scheme is used to discretize the well-conditioned boundary integral equation formulation of the linearized Poisson-Boltzmann equation. Various Krylov subspace solvers can be subsequently applied to solve the resulting linear system, with a bounded number of iterations independent of the number of discretized unknowns. The matrix-vector multiplication at each iteration is accelerated by the adaptive new versions of fast multipole methods. The AFMPB solver requires other stand-alone pre-processing tools for boundary mesh generation, post-processing tools for data analysis and visualization, and can be conveniently coupled with different time stepping methods for dynamics simulation. Restrictions: Only three or six significant digits options are provided in this version. Unusual features: Most of the codes are in Fortran77 style. Memory allocation functions from Fortran90 and above are used in a few subroutines. Additional comments: The current version of the codes is designed and written for single core/processor desktop machines. Check http://lsec.cc.ac.cn/~lubz/afmpb.html and http://mccammon.ucsd.edu/ for updates and changes. Running time: The running time varies with the number of discretized elements ( N) in the system and their distributions. In most cases, it scales linearly as a function of N.
NASA Technical Reports Server (NTRS)
Thomas, J. L.; Diskin, B.; Brandt, A.
1999-01-01
The distributed-relaxation multigrid and defect- correction methods are applied to the two- dimensional compressible Navier-Stokes equations. The formulation is intended for high Reynolds number applications and several applications are made at a laminar Reynolds number of 10,000. A staggered- grid arrangement of variables is used; the coupled pressure and internal energy equations are solved together with multigrid, requiring a block 2x2 matrix solution. Textbook multigrid efficiencies are attained for incompressible and slightly compressible simulations of the boundary layer on a flat plate. Textbook efficiencies are obtained for compressible simulations up to Mach numbers of 0.7 for a viscous wake simulation.
3D Parallel Multigrid Methods for Real-Time Fluid Simulation
NASA Astrophysics Data System (ADS)
Wan, Feifei; Yin, Yong; Zhang, Suiyu
2018-03-01
The multigrid method is widely used in fluid simulation because of its strong convergence. In addition to operating accuracy, operational efficiency is also an important factor to consider in order to enable real-time fluid simulation in computer graphics. For this problem, we compared the performance of the Algebraic Multigrid and the Geometric Multigrid in the V-Cycle and Full-Cycle schemes respectively, and analyze the convergence and speed of different methods. All the calculations are done on the parallel computing of GPU in this paper. Finally, we experiment with the 3D-grid for each scale, and give the exact experimental results.
Multigrid Methods for Fully Implicit Oil Reservoir Simulation
NASA Technical Reports Server (NTRS)
Molenaar, J.
1996-01-01
In this paper we consider the simultaneous flow of oil and water in reservoir rock. This displacement process is modeled by two basic equations: the material balance or continuity equations and the equation of motion (Darcy's law). For the numerical solution of this system of nonlinear partial differential equations there are two approaches: the fully implicit or simultaneous solution method and the sequential solution method. In the sequential solution method the system of partial differential equations is manipulated to give an elliptic pressure equation and a hyperbolic (or parabolic) saturation equation. In the IMPES approach the pressure equation is first solved, using values for the saturation from the previous time level. Next the saturations are updated by some explicit time stepping method; this implies that the method is only conditionally stable. For the numerical solution of the linear, elliptic pressure equation multigrid methods have become an accepted technique. On the other hand, the fully implicit method is unconditionally stable, but it has the disadvantage that in every time step a large system of nonlinear algebraic equations has to be solved. The most time-consuming part of any fully implicit reservoir simulator is the solution of this large system of equations. Usually this is done by Newton's method. The resulting systems of linear equations are then either solved by a direct method or by some conjugate gradient type method. In this paper we consider the possibility of applying multigrid methods for the iterative solution of the systems of nonlinear equations. There are two ways of using multigrid for this job: either we use a nonlinear multigrid method or we use a linear multigrid method to deal with the linear systems that arise in Newton's method. So far only a few authors have reported on the use of multigrid methods for fully implicit simulations. Two-level FAS algorithm is presented for the black-oil equations, and linear multigrid for two-phase flow problems with strong heterogeneities and anisotropies is studied. Here we consider both possibilities. Moreover we present a novel way for constructing the coarse grid correction operator in linear multigrid algorithms. This approach has the advantage in that it preserves the sparsity pattern of the fine grid matrix and it can be extended to systems of equations in a straightforward manner. We compare the linear and nonlinear multigrid algorithms by means of a numerical experiment.
Incompressible SPH (ISPH) with fast Poisson solver on a GPU
NASA Astrophysics Data System (ADS)
Chow, Alex D.; Rogers, Benedict D.; Lind, Steven J.; Stansby, Peter K.
2018-05-01
This paper presents a fast incompressible SPH (ISPH) solver implemented to run entirely on a graphics processing unit (GPU) capable of simulating several millions of particles in three dimensions on a single GPU. The ISPH algorithm is implemented by converting the highly optimised open-source weakly-compressible SPH (WCSPH) code DualSPHysics to run ISPH on the GPU, combining it with the open-source linear algebra library ViennaCL for fast solutions of the pressure Poisson equation (PPE). Several challenges are addressed with this research: constructing a PPE matrix every timestep on the GPU for moving particles, optimising the limited GPU memory, and exploiting fast matrix solvers. The ISPH pressure projection algorithm is implemented as 4 separate stages, each with a particle sweep, including an algorithm for the population of the PPE matrix suitable for the GPU, and mixed precision storage methods. An accurate and robust ISPH boundary condition ideal for parallel processing is also established by adapting an existing WCSPH boundary condition for ISPH. A variety of validation cases are presented: an impulsively started plate, incompressible flow around a moving square in a box, and dambreaks (2-D and 3-D) which demonstrate the accuracy, flexibility, and speed of the methodology. Fragmentation of the free surface is shown to influence the performance of matrix preconditioners and therefore the PPE matrix solution time. The Jacobi preconditioner demonstrates robustness and reliability in the presence of fragmented flows. For a dambreak simulation, GPU speed ups demonstrate up to 10-18 times and 1.1-4.5 times compared to single-threaded and 16-threaded CPU run times respectively.
Applications of multigrid software in the atmospheric sciences
NASA Technical Reports Server (NTRS)
Adams, J.; Garcia, R.; Gross, B.; Hack, J.; Haidvogel, D.; Pizzo, V.
1992-01-01
Elliptic partial differential equations from different areas in the atmospheric sciences are efficiently and easily solved utilizing the multigrid software package named MUDPACK. It is demonstrated that the multigrid method is more efficient than other commonly employed techniques, such as Gaussian elimination and fixed-grid relaxation. The efficiency relative to other techniques, both in terms of storage requirement and computational time, increases quickly with grid size.
Shieh, Bernard; Sabra, Karim G; Degertekin, F Levent
2016-11-01
A boundary element model provides great flexibility for the simulation of membrane-type micromachined ultrasonic transducers (MUTs) in terms of membrane shape, actuating mechanism, and array layout. Acoustic crosstalk is accounted for through a mutual impedance matrix that captures the primary crosstalk mechanism of dispersive-guided modes generated at the fluid-solid interface. However, finding the solution to the fully populated boundary element matrix equation using standard techniques requires computation time and memory usage that scales by the cube and by the square of the number of nodes, respectively, limiting simulation to a small number of membranes. We implement a solver with improved speed and efficiency through the application of a multilevel fast multipole algorithm (FMA). By approximating the fields of collections of nodes using multipole expansions of the free-space Green's function, an FMA solver can enable the simulation of hundreds of thousands of nodes while incurring an approximation error that is controllable. Convergence is drastically improved using a problem-specific block-diagonal preconditioner. We demonstrate the solver's capabilities by simulating a 32-element 7-MHz 1-D capacitive MUT (CMUT) phased array with 2880 membranes. The array is simulated using 233280 nodes for a very wide frequency band up to 50 MHz. For a simulation with 15210 nodes, the FMA solver performed ten times faster and used 32 times less memory than a standard solver based on LU decomposition. We investigate the effects of mesh density and phasing on the predicted array response and find that it is necessary to use about seven nodes over the width of the membrane to observe convergence of the solution-even below the first membrane resonance frequency-due to the influence of higher order membrane modes.
NASA Technical Reports Server (NTRS)
Ghil, M.; Balgovind, R.
1979-01-01
The inhomogeneous Cauchy-Riemann equations in a rectangle are discretized by a finite difference approximation. Several different boundary conditions are treated explicitly, leading to algorithms which have overall second-order accuracy. All boundary conditions with either u or v prescribed along a side of the rectangle can be treated by similar methods. The algorithms presented here have nearly minimal time and storage requirements and seem suitable for development into a general-purpose direct Cauchy-Riemann solver for arbitrary boundary conditions.
Fast Electromagnetic Solvers for Large-Scale Naval Scattering Problems
2008-09-27
IEEE Trans. Antennas Propag., vol. 52, no. 8, pp. 2141–2146, 2004. [12] R. J. Burkholder and J. F. Lee, “Fast dual-MGS block-factorization algorithm...Golub and C. F. V. Loan, Matrix Computations. Baltimore: The Johns Hopkins University Press, 1996. [20] W. D. Li, W. Hong, and H. X. Zhou, “Integral
A Pseudo-Temporal Multi-Grid Relaxation Scheme for Solving the Parabolized Navier-Stokes Equations
NASA Technical Reports Server (NTRS)
White, J. A.; Morrison, J. H.
1999-01-01
A multi-grid, flux-difference-split, finite-volume code, VULCAN, is presented for solving the elliptic and parabolized form of the equations governing three-dimensional, turbulent, calorically perfect and non-equilibrium chemically reacting flows. The space marching algorithms developed to improve convergence rate and or reduce computational cost are emphasized. The algorithms presented are extensions to the class of implicit pseudo-time iterative, upwind space-marching schemes. A full approximate storage, full multi-grid scheme is also described which is used to accelerate the convergence of a Gauss-Seidel relaxation method. The multi-grid algorithm is shown to significantly improve convergence on high aspect ratio grids.
NASA Technical Reports Server (NTRS)
Atkins, H. L.; Helenbrook, B. T.
2005-01-01
This paper describes numerical experiments with P-multigrid to corroborate analysis, validate the present implementation, and to examine issues that arise in the implementations of the various combinations of relaxation schemes, discretizations and P-multigrid methods. The two approaches to implement P-multigrid presented here are equivalent for most high-order discretization methods such as spectral element, SUPG, and discontinuous Galerkin applied to advection; however it is discovered that the approach that mimics the common geometric multigrid implementation is less robust, and frequently unstable when applied to discontinuous Galerkin discretizations of di usion. Gauss-Seidel relaxation converges 40% faster than block Jacobi, as predicted by analysis; however, the implementation of Gauss-Seidel is considerably more expensive that one would expect because gradients in most neighboring elements must be updated. A compromise quasi Gauss-Seidel relaxation method that evaluates the gradient in each element twice per iteration converges at rates similar to those predicted for true Gauss-Seidel.
A multigrid nonoscillatory method for computing high speed flows
NASA Technical Reports Server (NTRS)
Li, C. P.; Shieh, T. H.
1993-01-01
A multigrid method using different smoothers has been developed to solve the Euler equations discretized by a nonoscillatory scheme up to fourth order accuracy. The best smoothing property is provided by a five-stage Runge-Kutta technique with optimized coefficients, yet the most efficient smoother is a backward Euler technique in factored and diagonalized form. The singlegrid solution for a hypersonic, viscous conic flow is in excellent agreement with the solution obtained by the third order MUSCL and Roe's method. Mach 8 inviscid flow computations for a complete entry probe have shown that the accuracy is at least as good as the symmetric TVD scheme of Yee and Harten. The implicit multigrid method is four times more efficient than the explicit multigrid technique and 3.5 times faster than the single-grid implicit technique. For a Mach 8.7 inviscid flow over a blunt delta wing at 30 deg incidence, the CPU reduction factor from the three-level multigrid computation is 2.2 on a grid of 37 x 41 x 73 nodes.
Phillips, Edward Geoffrey; Shadid, John N.; Cyr, Eric C.
2018-05-01
Here, we report multiple physical time-scales can arise in electromagnetic simulations when dissipative effects are introduced through boundary conditions, when currents follow external time-scales, and when material parameters vary spatially. In such scenarios, the time-scales of interest may be much slower than the fastest time-scales supported by the Maxwell equations, therefore making implicit time integration an efficient approach. The use of implicit temporal discretizations results in linear systems in which fast time-scales, which severely constrain the stability of an explicit method, can manifest as so-called stiff modes. This study proposes a new block preconditioner for structure preserving (also termed physicsmore » compatible) discretizations of the Maxwell equations in first order form. The intent of the preconditioner is to enable the efficient solution of multiple-time-scale Maxwell type systems. An additional benefit of the developed preconditioner is that it requires only a traditional multigrid method for its subsolves and compares well against alternative approaches that rely on specialized edge-based multigrid routines that may not be readily available. Lastly, results demonstrate parallel scalability at large electromagnetic wave CFL numbers on a variety of test problems.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Phillips, Edward Geoffrey; Shadid, John N.; Cyr, Eric C.
Here, we report multiple physical time-scales can arise in electromagnetic simulations when dissipative effects are introduced through boundary conditions, when currents follow external time-scales, and when material parameters vary spatially. In such scenarios, the time-scales of interest may be much slower than the fastest time-scales supported by the Maxwell equations, therefore making implicit time integration an efficient approach. The use of implicit temporal discretizations results in linear systems in which fast time-scales, which severely constrain the stability of an explicit method, can manifest as so-called stiff modes. This study proposes a new block preconditioner for structure preserving (also termed physicsmore » compatible) discretizations of the Maxwell equations in first order form. The intent of the preconditioner is to enable the efficient solution of multiple-time-scale Maxwell type systems. An additional benefit of the developed preconditioner is that it requires only a traditional multigrid method for its subsolves and compares well against alternative approaches that rely on specialized edge-based multigrid routines that may not be readily available. Lastly, results demonstrate parallel scalability at large electromagnetic wave CFL numbers on a variety of test problems.« less
NASA Astrophysics Data System (ADS)
Frickenhaus, Stephan; Hiller, Wolfgang; Best, Meike
The portable software FoSSI is introduced that—in combination with additional free solver software packages—allows for an efficient and scalable parallel solution of large sparse linear equations systems arising in finite element model codes. FoSSI is intended to support rapid model code development, completely hiding the complexity of the underlying solver packages. In particular, the model developer need not be an expert in parallelization and is yet free to switch between different solver packages by simple modifications of the interface call. FoSSI offers an efficient and easy, yet flexible interface to several parallel solvers, most of them available on the web, such as PETSC, AZTEC, MUMPS, PILUT and HYPRE. FoSSI makes use of the concept of handles for vectors, matrices, preconditioners and solvers, that is frequently used in solver libraries. Hence, FoSSI allows for a flexible treatment of several linear equations systems and associated preconditioners at the same time, even in parallel on separate MPI-communicators. The second special feature in FoSSI is the task specifier, being a combination of keywords, each configuring a certain phase in the solver setup. This enables the user to control a solver over one unique subroutine. Furthermore, FoSSI has rather similar features for all solvers, making a fast solver intercomparison or exchange an easy task. FoSSI is a community software, proven in an adaptive 2D-atmosphere model and a 3D-primitive equation ocean model, both formulated in finite elements. The present paper discusses perspectives of an OpenMP-implementation of parallel iterative solvers based on domain decomposition methods. This approach to OpenMP solvers is rather attractive, as the code for domain-local operations of factorization, preconditioning and matrix-vector product can be readily taken from a sequential implementation that is also suitable to be used in an MPI-variant. Code development in this direction is in an advanced state under the name ScOPES: the Scalable Open Parallel sparse linear Equations Solver.
Solving Upwind-Biased Discretizations: Defect-Correction Iterations
NASA Technical Reports Server (NTRS)
Diskin, Boris; Thomas, James L.
1999-01-01
This paper considers defect-correction solvers for a second order upwind-biased discretization of the 2D convection equation. The following important features are reported: (1) The asymptotic convergence rate is about 0.5 per defect-correction iteration. (2) If the operators involved in defect-correction iterations have different approximation order, then the initial convergence rates may be very slow. The number of iterations required to get into the asymptotic convergence regime might grow on fine grids as a negative power of h. In the case of a second order target operator and a first order driver operator, this number of iterations is roughly proportional to h-1/3. (3) If both the operators have the second approximation order, the defect-correction solver demonstrates the asymptotic convergence rate after three iterations at most. The same three iterations are required to converge algebraic error below the truncation error level. A novel comprehensive half-space Fourier mode analysis (which, by the way, can take into account the influence of discretized outflow boundary conditions as well) for the defect-correction method is developed. This analysis explains many phenomena observed in solving non-elliptic equations and provides a close prediction of the actual solution behavior. It predicts the convergence rate for each iteration and the asymptotic convergence rate. As a result of this analysis, a new very efficient adaptive multigrid algorithm solving the discrete problem to within a given accuracy is proposed. Numerical simulations confirm the accuracy of the analysis and the efficiency of the proposed algorithm. The results of the numerical tests are reported.
Fluid Simulation in the Movies: Navier and Stokes Must Be Circulating in Their Graves
NASA Astrophysics Data System (ADS)
Tessendorf, Jerry
2010-11-01
Fluid simulations based on the Incompressible Navier-Stokes equations are commonplace computer graphics tools in the visual effects industry. These simulations mostly come from custom C++ code written by the visual effects companies. Their significant impact in films was recognized in 2008 with Academy Awards to four visual effects companies for their technical achievement. However artists are not fluid dynamicists, and fluid dynamics simulations are expensive to use in a deadline-driven production environment. As a result, the simulation algorithms are modified to limit the computational resources, adapt them to production workflow, and to respect the client's vision of the film plot. Eulerian solvers on fixed rectangular grids use a mix of momentum solvers, including Semi-Lagrangian, FLIP, and QUICK. Incompressibility is enforced with FFT, Conjugate Gradient, and Multigrid methods. For liquids, a levelset field tracks the free surface. Smooth Particle Hydrodynamics is also used, and is part of a hybrid Eulerian-SPH liquid simulator. Artists use all of them in a mix and match fashion to control the appearance of the simulation. Specially designed forces and boundary conditions control the flow. The simulation can be an input to artistically driven procedural particle simulations that enhance the flow with more detail and drama. Post-simulation processing increases the visual detail beyond the grid resolution. Ultimately, iterative simulation methods that fit naturally in the production workflow are extremely desirable but not yet successful. Results from some efforts for iterative methods are shown, and other approaches motivated by the history of production are proposed.
A Conforming Multigrid Method for the Pure Traction Problem of Linear Elasticity: Mixed Formulation
NASA Technical Reports Server (NTRS)
Lee, Chang-Ock
1996-01-01
A multigrid method using conforming P-1 finite element is developed for the two-dimensional pure traction boundary value problem of linear elasticity. The convergence is uniform even as the material becomes nearly incompressible. A heuristic argument for acceleration of the multigrid method is discussed as well. Numerical results with and without this acceleration as well as performance estimates on a parallel computer are included.
Multigrid methods in structural mechanics
NASA Technical Reports Server (NTRS)
Raju, I. S.; Bigelow, C. A.; Taasan, S.; Hussaini, M. Y.
1986-01-01
Although the application of multigrid methods to the equations of elasticity has been suggested, few such applications have been reported in the literature. In the present work, multigrid techniques are applied to the finite element analysis of a simply supported Bernoulli-Euler beam, and various aspects of the multigrid algorithm are studied and explained in detail. In this study, six grid levels were used to model half the beam. With linear prolongation and sequential ordering, the multigrid algorithm yielded results which were of machine accuracy with work equivalent to 200 standard Gauss-Seidel iterations on the fine grid. Also with linear prolongation and sequential ordering, the V(1,n) cycle with n greater than 2 yielded better convergence rates than the V(n,1) cycle. The restriction and prolongation operators were derived based on energy principles. Conserving energy during the inter-grid transfers required that the prolongation operator be the transpose of the restriction operator, and led to improved convergence rates. With energy-conserving prolongation and sequential ordering, the multigrid algorithm yielded results of machine accuracy with a work equivalent to 45 Gauss-Seidel iterations on the fine grid. The red-black ordering of relaxations yielded solutions of machine accuracy in a single V(1,1) cycle, which required work equivalent to about 4 iterations on the finest grid level.
Numerical Methods for Forward and Inverse Problems in Discontinuous Media
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chartier, Timothy P.
The research emphasis under this grant's funding is in the area of algebraic multigrid methods. The research has two main branches: 1) exploring interdisciplinary applications in which algebraic multigrid can make an impact and 2) extending the scope of algebraic multigrid methods with algorithmic improvements that are based in strong analysis.The work in interdisciplinary applications falls primarily in the field of biomedical imaging. Work under this grant demonstrated the effectiveness and robustness of multigrid for solving linear systems that result from highly heterogeneous finite element method models of the human head. The results in this work also give promise tomore » medical advances possible with software that may be developed. Research to extend the scope of algebraic multigrid has been focused in several areas. In collaboration with researchers at the University of Colorado, Lawrence Livermore National Laboratory, and Los Alamos National Laboratory, the PI developed an adaptive multigrid with subcycling via complementary grids. This method has very cheap computing costs per iterate and is showing promise as a preconditioner for conjugate gradient. Recent work with Los Alamos National Laboratory concentrates on developing algorithms that take advantage of the recent advances in adaptive multigrid research. The results of the various efforts in this research could ultimately have direct use and impact to researchers for a wide variety of applications, including, astrophysics, neuroscience, contaminant transport in porous media, bi-domain heart modeling, modeling of tumor growth, and flow in heterogeneous porous media. This work has already led to basic advances in computational mathematics and numerical linear algebra and will continue to do so into the future.« less
A fast numerical scheme for causal relativistic hydrodynamics with dissipation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Takamoto, Makoto, E-mail: takamoto@tap.scphys.kyoto-u.ac.jp; Inutsuka, Shu-ichiro
2011-08-01
Highlights: {yields} We have developed a new multi-dimensional numerical scheme for causal relativistic hydrodynamics with dissipation. {yields} Our new scheme can calculate the evolution of dissipative relativistic hydrodynamics faster and more effectively than existing schemes. {yields} Since we use the Riemann solver for solving the advection steps, our method can capture shocks very accurately. - Abstract: In this paper, we develop a stable and fast numerical scheme for relativistic dissipative hydrodynamics based on Israel-Stewart theory. Israel-Stewart theory is a stable and causal description of dissipation in relativistic hydrodynamics although it includes relaxation process with the timescale for collision of constituentmore » particles, which introduces stiff equations and makes practical numerical calculation difficult. In our new scheme, we use Strang's splitting method, and use the piecewise exact solutions for solving the extremely short timescale problem. In addition, since we split the calculations into inviscid step and dissipative step, Riemann solver can be used for obtaining numerical flux for the inviscid step. The use of Riemann solver enables us to capture shocks very accurately. Simple numerical examples are shown. The present scheme can be applied to various high energy phenomena of astrophysics and nuclear physics.« less
NASA Technical Reports Server (NTRS)
Chen, Zhangxin; Ewing, Richard E.
1996-01-01
Multigrid algorithms for nonconforming and mixed finite element methods for second order elliptic problems on triangular and rectangular finite elements are considered. The construction of several coarse-to-fine intergrid transfer operators for nonconforming multigrid algorithms is discussed. The equivalence between the nonconforming and mixed finite element methods with and without projection of the coefficient of the differential problems into finite element spaces is described.
NASA Technical Reports Server (NTRS)
Fay, John F.
1990-01-01
A calculation is made of the stability of various relaxation schemes for the numerical solution of partial differential equations. A multigrid acceleration method is introduced, and its effects on stability are explored. A detailed stability analysis of a simple case is carried out and verified by numerical experiment. It is shown that the use of multigrids can speed convergence by several orders of magnitude without adversely affecting stability.
Segmental Refinement: A Multigrid Technique for Data Locality
Adams, Mark F.; Brown, Jed; Knepley, Matt; ...
2016-08-04
In this paper, we investigate a domain decomposed multigrid technique, termed segmental refinement, for solving general nonlinear elliptic boundary value problems. We extend the method first proposed in 1994 by analytically and experimentally investigating its complexity. We confirm that communication of traditional parallel multigrid is eliminated on fine grids, with modest amounts of extra work and storage, while maintaining the asymptotic exactness of full multigrid. We observe an accuracy dependence on the segmental refinement subdomain size, which was not considered in the original analysis. Finally, we present a communication complexity analysis that quantifies the communication costs ameliorated by segmental refinementmore » and report performance results with up to 64K cores on a Cray XC30.« less
NASA Astrophysics Data System (ADS)
Hano, Mitsuo; Hotta, Masashi
A new multigrid method based on high-order vector finite elements is proposed in this paper. Low level discretizations in this method are obtained by using low-order vector finite elements for the same mesh. Gauss-Seidel method is used as a smoother, and a linear equation of lowest level is solved by ICCG method. But it is often found that multigrid solutions do not converge into ICCG solutions. An elimination algolithm of constant term using a null space of the coefficient matrix is also described. In three dimensional magnetostatic field analysis, convergence time and number of iteration of this multigrid method are discussed with the convectional ICCG method.
Schwenke, M; Hennemuth, A; Fischer, B; Friman, O
2012-01-01
Phase-contrast MRI (PC MRI) can be used to assess blood flow dynamics noninvasively inside the human body. The acquired images can be reconstructed into flow vector fields. Traditionally, streamlines can be computed based on the vector fields to visualize flow patterns and particle trajectories. The traditional methods may give a false impression of precision, as they do not consider the measurement uncertainty in the PC MRI images. In our prior work, we incorporated the uncertainty of the measurement into the computation of particle trajectories. As a major part of the contribution, a novel numerical scheme for solving the anisotropic Fast Marching problem is presented. A computing time comparison to state-of-the-art methods is conducted on artificial tensor fields. A visual comparison of healthy to pathological blood flow patterns is given. The comparison shows that the novel anisotropic Fast Marching solver outperforms previous schemes in terms of computing time. The visual comparison of flow patterns directly visualizes large deviations of pathological flow from healthy flow. The novel anisotropic Fast Marching solver efficiently resolves even strongly anisotropic path costs. The visualization method enables the user to assess the uncertainty of particle trajectories derived from PC MRI images.
An adaptive discontinuous Galerkin solver for aerodynamic flows
NASA Astrophysics Data System (ADS)
Burgess, Nicholas K.
This work considers the accuracy, efficiency, and robustness of an unstructured high-order accurate discontinuous Galerkin (DG) solver for computational fluid dynamics (CFD). Recently, there has been a drive to reduce the discretization error of CFD simulations using high-order methods on unstructured grids. However, high-order methods are often criticized for lacking robustness and having high computational cost. The goal of this work is to investigate methods that enhance the robustness of high-order discontinuous Galerkin (DG) methods on unstructured meshes, while maintaining low computational cost and high accuracy of the numerical solutions. This work investigates robustness enhancement of high-order methods by examining effective non-linear solvers, shock capturing methods, turbulence model discretizations and adaptive refinement techniques. The goal is to develop an all encompassing solver that can simulate a large range of physical phenomena, where all aspects of the solver work together to achieve a robust, efficient and accurate solution strategy. The components and framework for a robust high-order accurate solver that is capable of solving viscous, Reynolds Averaged Navier-Stokes (RANS) and shocked flows is presented. In particular, this work discusses robust discretizations of the turbulence model equation used to close the RANS equations, as well as stable shock capturing strategies that are applicable across a wide range of discretization orders and applicable to very strong shock waves. Furthermore, refinement techniques are considered as both efficiency and robustness enhancement strategies. Additionally, efficient non-linear solvers based on multigrid and Krylov subspace methods are presented. The accuracy, efficiency, and robustness of the solver is demonstrated using a variety of challenging aerodynamic test problems, which include turbulent high-lift and viscous hypersonic flows. Adaptive mesh refinement was found to play a critical role in obtaining a robust and efficient high-order accurate flow solver. A goal-oriented error estimation technique has been developed to estimate the discretization error of simulation outputs. For high-order discretizations, it is shown that functional output error super-convergence can be obtained, provided the discretization satisfies a property known as dual consistency. The dual consistency of the DG methods developed in this work is shown via mathematical analysis and numerical experimentation. Goal-oriented error estimation is also used to drive an hp-adaptive mesh refinement strategy, where a combination of mesh or h-refinement, and order or p-enrichment, is employed based on the smoothness of the solution. The results demonstrate that the combination of goal-oriented error estimation and hp-adaptation yield superior accuracy, as well as enhanced robustness and efficiency for a variety of aerodynamic flows including flows with strong shock waves. This work demonstrates that DG discretizations can be the basis of an accurate, efficient, and robust CFD solver. Furthermore, enhancing the robustness of DG methods does not adversely impact the accuracy or efficiency of the solver for challenging and complex flow problems. In particular, when considering the computation of shocked flows, this work demonstrates that the available shock capturing techniques are sufficiently accurate and robust, particularly when used in conjunction with adaptive mesh refinement . This work also demonstrates that robust solutions of the Reynolds Averaged Navier-Stokes (RANS) and turbulence model equations can be obtained for complex and challenging aerodynamic flows. In this context, the most robust strategy was determined to be a low-order turbulence model discretization coupled to a high-order discretization of the RANS equations. Although RANS solutions using high-order accurate discretizations of the turbulence model were obtained, the behavior of current-day RANS turbulence models discretized to high-order was found to be problematic, leading to solver robustness issues. This suggests that future work is warranted in the area of turbulence model formulation for use with high-order discretizations. Alternately, the use of Large-Eddy Simulation (LES) subgrid scale models with high-order DG methods offers the potential to leverage the high accuracy of these methods for very high fidelity turbulent simulations. This thesis has developed the algorithmic improvements that will lay the foundation for the development of a three-dimensional high-order flow solution strategy that can be used as the basis for future LES simulations.
A Parallel Fast Sweeping Method for the Eikonal Equation
NASA Astrophysics Data System (ADS)
Baker, B.
2017-12-01
Recently, there has been an exciting emergence of probabilistic methods for travel time tomography. Unlike gradient-based optimization strategies, probabilistic tomographic methods are resistant to becoming trapped in a local minimum and provide a much better quantification of parameter resolution than, say, appealing to ray density or performing checkerboard reconstruction tests. The benefits associated with random sampling methods however are only realized by successive computation of predicted travel times in, potentially, strongly heterogeneous media. To this end this abstract is concerned with expediting the solution of the Eikonal equation. While many Eikonal solvers use a fast marching method, the proposed solver will use the iterative fast sweeping method because the eight fixed sweep orderings in each iteration are natural targets for parallelization. To reduce the number of iterations and grid points required the high-accuracy finite difference stencil of Nobel et al., 2014 is implemented. A directed acyclic graph (DAG) is created with a priori knowledge of the sweep ordering and finite different stencil. By performing a topological sort of the DAG sets of independent nodes are identified as candidates for concurrent updating. Additionally, the proposed solver will also address scalability during earthquake relocation, a necessary step in local and regional earthquake tomography and a barrier to extending probabilistic methods from active source to passive source applications, by introducing an asynchronous parallel forward solve phase for all receivers in the network. Synthetic examples using the SEG over-thrust model will be presented.
A comparison of optimization algorithms for localized in vivo B0 shimming.
Nassirpour, Sahar; Chang, Paul; Fillmer, Ariane; Henning, Anke
2018-02-01
To compare several different optimization algorithms currently used for localized in vivo B 0 shimming, and to introduce a novel, fast, and robust constrained regularized algorithm (ConsTru) for this purpose. Ten different optimization algorithms (including samples from both generic and dedicated least-squares solvers, and a novel constrained regularized inversion method) were implemented and compared for shimming in five different shimming volumes on 66 in vivo data sets from both 7 T and 9.4 T. The best algorithm was chosen to perform single-voxel spectroscopy at 9.4 T in the frontal cortex of the brain on 10 volunteers. The results of the performance tests proved that the shimming algorithm is prone to unstable solutions if it depends on the value of a starting point, and is not regularized to handle ill-conditioned problems. The ConsTru algorithm proved to be the most robust, fast, and efficient algorithm among all of the chosen algorithms. It enabled acquisition of spectra of reproducible high quality in the frontal cortex at 9.4 T. For localized in vivo B 0 shimming, the use of a dedicated linear least-squares solver instead of a generic nonlinear one is highly recommended. Among all of the linear solvers, the constrained regularized method (ConsTru) was found to be both fast and most robust. Magn Reson Med 79:1145-1156, 2018. © 2017 International Society for Magnetic Resonance in Medicine. © 2017 International Society for Magnetic Resonance in Medicine.
IGA-ADS: Isogeometric analysis FEM using ADS solver
NASA Astrophysics Data System (ADS)
Łoś, Marcin M.; Woźniak, Maciej; Paszyński, Maciej; Lenharth, Andrew; Hassaan, Muhamm Amber; Pingali, Keshav
2017-08-01
In this paper we present a fast explicit solver for solution of non-stationary problems using L2 projections with isogeometric finite element method. The solver has been implemented within GALOIS framework. It enables parallel multi-core simulations of different time-dependent problems, in 1D, 2D, or 3D. We have prepared the solver framework in a way that enables direct implementation of the selected PDE and corresponding boundary conditions. In this paper we describe the installation, implementation of exemplary three PDEs, and execution of the simulations on multi-core Linux cluster nodes. We consider three case studies, including heat transfer, linear elasticity, as well as non-linear flow in heterogeneous media. The presented package generates output suitable for interfacing with Gnuplot and ParaView visualization software. The exemplary simulations show near perfect scalability on Gilbert shared-memory node with four Intel® Xeon® CPU E7-4860 processors, each possessing 10 physical cores (for a total of 40 cores).
A highly parallel multigrid-like method for the solution of the Euler equations
NASA Technical Reports Server (NTRS)
Tuminaro, Ray S.
1989-01-01
We consider a highly parallel multigrid-like method for the solution of the two dimensional steady Euler equations. The new method, introduced as filtering multigrid, is similar to a standard multigrid scheme in that convergence on the finest grid is accelerated by iterations on coarser grids. In the filtering method, however, additional fine grid subproblems are processed concurrently with coarse grid computations to further accelerate convergence. These additional problems are obtained by splitting the residual into a smooth and an oscillatory component. The smooth component is then used to form a coarse grid problem (similar to standard multigrid) while the oscillatory component is used for a fine grid subproblem. The primary advantage in the filtering approach is that fewer iterations are required and that most of the additional work per iteration can be performed in parallel with the standard coarse grid computations. We generalize the filtering algorithm to a version suitable for nonlinear problems. We emphasize that this generalization is conceptually straight-forward and relatively easy to implement. In particular, no explicit linearization (e.g., formation of Jacobians) needs to be performed (similar to the FAS multigrid approach). We illustrate the nonlinear version by applying it to the Euler equations, and presenting numerical results. Finally, a performance evaluation is made based on execution time models and convergence information obtained from numerical experiments.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lin, Paul T.; Shadid, John N.; Tsuji, Paul H.
Here, this study explores the performance and scaling of a GMRES Krylov method employed as a smoother for an algebraic multigrid (AMG) preconditioned Newton- Krylov solution approach applied to a fully-implicit variational multiscale (VMS) nite element (FE) resistive magnetohydrodynamics (MHD) formulation. In this context a Newton iteration is used for the nonlinear system and a Krylov (GMRES) method is employed for the linear subsystems. The efficiency of this approach is critically dependent on the scalability and performance of the AMG preconditioner for the linear solutions and the performance of the smoothers play a critical role. Krylov smoothers are considered inmore » an attempt to reduce the time and memory requirements of existing robust smoothers based on additive Schwarz domain decomposition (DD) with incomplete LU factorization solves on each subdomain. Three time dependent resistive MHD test cases are considered to evaluate the method. The results demonstrate that the GMRES smoother can be faster due to a decrease in the preconditioner setup time and a reduction in outer GMRESR solver iterations, and requires less memory (typically 35% less memory for global GMRES smoother) than the DD ILU smoother.« less
The implementation of an aeronautical CFD flow code onto distributed memory parallel systems
NASA Astrophysics Data System (ADS)
Ierotheou, C. S.; Forsey, C. R.; Leatham, M.
2000-04-01
The parallelization of an industrially important in-house computational fluid dynamics (CFD) code for calculating the airflow over complex aircraft configurations using the Euler or Navier-Stokes equations is presented. The code discussed is the flow solver module of the SAUNA CFD suite. This suite uses a novel grid system that may include block-structured hexahedral or pyramidal grids, unstructured tetrahedral grids or a hybrid combination of both. To assist in the rapid convergence to a solution, a number of convergence acceleration techniques are employed including implicit residual smoothing and a multigrid full approximation storage scheme (FAS). Key features of the parallelization approach are the use of domain decomposition and encapsulated message passing to enable the execution in parallel using a single programme multiple data (SPMD) paradigm. In the case where a hybrid grid is used, a unified grid partitioning scheme is employed to define the decomposition of the mesh. The parallel code has been tested using both structured and hybrid grids on a number of different distributed memory parallel systems and is now routinely used to perform industrial scale aeronautical simulations. Copyright
Large calculation of the flow over a hypersonic vehicle using a GPU
NASA Astrophysics Data System (ADS)
Elsen, Erich; LeGresley, Patrick; Darve, Eric
2008-12-01
Graphics processing units are capable of impressive computing performance up to 518 Gflops peak performance. Various groups have been using these processors for general purpose computing; most efforts have focussed on demonstrating relatively basic calculations, e.g. numerical linear algebra, or physical simulations for visualization purposes with limited accuracy. This paper describes the simulation of a hypersonic vehicle configuration with detailed geometry and accurate boundary conditions using the compressible Euler equations. To the authors' knowledge, this is the most sophisticated calculation of this kind in terms of complexity of the geometry, the physical model, the numerical methods employed, and the accuracy of the solution. The Navier-Stokes Stanford University Solver (NSSUS) was used for this purpose. NSSUS is a multi-block structured code with a provably stable and accurate numerical discretization which uses a vertex-based finite-difference method. A multi-grid scheme is used to accelerate the solution of the system. Based on a comparison of the Intel Core 2 Duo and NVIDIA 8800GTX, speed-ups of over 40× were demonstrated for simple test geometries and 20× for complex geometries.
LBMD : a layer-based mesh data structure tailored for generic API infrastructures.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ebeida, Mohamed S.; Knupp, Patrick Michael
2010-11-01
A new mesh data structure is introduced for the purpose of mesh processing in Application Programming Interface (API) infrastructures. This data structure utilizes a reduced mesh representation to increase its ability to handle significantly larger meshes compared to full mesh representation. In spite of the reduced representation, each mesh entity (vertex, edge, face, and region) is represented using a unique handle, with no extra storage cost, which is a crucial requirement in most API libraries. The concept of mesh layers makes the data structure more flexible for mesh generation and mesh modification operations. This flexibility can have a favorable impactmore » in solver based queries of finite volume and multigrid methods. The capabilities of LBMD make it even more attractive for parallel implementations using Message Passing Interface (MPI) or Graphics Processing Units (GPUs). The data structure is associated with a new classification method to relate mesh entities to their corresponding geometrical entities. The classification technique stores the related information at the node level without introducing any ambiguities. Several examples are presented to illustrate the strength of this new data structure.« less
Lagrangian Approach to Jet Mixing and Optimization of the Reactor for Production of Carbon Nanotubes
NASA Technical Reports Server (NTRS)
Povitsky, Alex; Salas, Manuel D.
2001-01-01
This study was motivated by an attempt to optimize the High Pressure carbon oxide (HiPco) process for the production of carbon nanotubes from gaseous carbon oxide, The goal is to achieve rapid and uniform heating of catalyst particles by an optimal arrangement of jets. A mixed Eulerian and Lagrangian approach is implemented to track the temperature of catalyst particles along their trajectories as a function of time. The FLUENT CFD software with second-order upwind approximation of convective terms and an algebraic multigrid-based solver is used. The poor performance of the original reactor configuration is explained in terms of features of particle trajectories. The trajectories most exposed to the hot jets appear to be the most problematic for heating because they either bend towards the cold jet interior or rotate upwind of the mixing zone. To reduce undesirable slow and/or oscillatory heating of catalyst particles, a reactor configuration with three central jets is proposed and the optimal location of the central and peripheral nozzles is determined.
Multigrid methods for bifurcation problems: The self adjoint case
NASA Technical Reports Server (NTRS)
Taasan, Shlomo
1987-01-01
This paper deals with multigrid methods for computational problems that arise in the theory of bifurcation and is restricted to the self adjoint case. The basic problem is to solve for arcs of solutions, a task that is done successfully with an arc length continuation method. Other important issues are, for example, detecting and locating singular points as part of the continuation process, switching branches at bifurcation points, etc. Multigrid methods have been applied to continuation problems. These methods work well at regular points and at limit points, while they may encounter difficulties in the vicinity of bifurcation points. A new continuation method that is very efficient also near bifurcation points is presented here. The other issues mentioned above are also treated very efficiently with appropriate multigrid algorithms. For example, it is shown that limit points and bifurcation points can be solved for directly by a multigrid algorithm. Moreover, the algorithms presented here solve the corresponding problems in just a few work units (about 10 or less), where a work unit is the work involved in one local relaxation on the finest grid.
Conduct of the International Multigrid Conference
NASA Technical Reports Server (NTRS)
Mccormick, S.
1984-01-01
The 1983 International Multigrid Conference was held at Colorado's Copper Mountain Ski Resort, April 5-8. It was organized jointly by the Institute for Computational Studies at Colorado State University, U.S.A., and the Gasellschaft fur Mathematik und Datenverarbeitung Bonn, F.R. Germany, and was sponsored by the Air Force Office of Sponsored Research and National Aeronautics and Space Administration Headquarters. The conference was attended by 80 scientists, divided by institution almost equally into private industry, research laboratories, and academia. Fifteen attendees came from countries other than the U.S.A. In addition to the fruitful discussions, the most significant factor of the conference was of course the lectures. The lecturers include most of the leaders in the field of multigrid research. The program offered a nice integrated blend of theory, numerical studies, basic research, and applications. Some of the new areas of research that have surfaced since the Koln-Porz conference include: the algebraic multigrid approach; multigrid treatment of Euler equations for inviscid fluid flow problems; 3-D problems; and the application of MG methods on vector and parallel computers.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Seal, Sudip K; Perumalla, Kalyan S; Hirshman, Steven Paul
2013-01-01
Simulations that require solutions of block tridiagonal systems of equations rely on fast parallel solvers for runtime efficiency. Leading parallel solvers that are highly effective for general systems of equations, dense or sparse, are limited in scalability when applied to block tridiagonal systems. This paper presents scalability results as well as detailed analyses of two parallel solvers that exploit the special structure of block tridiagonal matrices to deliver superior performance, often by orders of magnitude. A rigorous analysis of their relative parallel runtimes is shown to reveal the existence of a critical block size that separates the parameter space spannedmore » by the number of block rows, the block size and the processor count, into distinct regions that favor one or the other of the two solvers. Dependence of this critical block size on the above parameters as well as on machine-specific constants is established. These formal insights are supported by empirical results on up to 2,048 cores of a Cray XT4 system. To the best of our knowledge, this is the highest reported scalability for parallel block tridiagonal solvers to date.« less
Multigrid-based reconstruction algorithm for quantitative photoacoustic tomography
Li, Shengfu; Montcel, Bruno; Yuan, Zhen; Liu, Wanyu; Vray, Didier
2015-01-01
This paper proposes a multigrid inversion framework for quantitative photoacoustic tomography reconstruction. The forward model of optical fluence distribution and the inverse problem are solved at multiple resolutions. A fixed-point iteration scheme is formulated for each resolution and used as a cost function. The simulated and experimental results for quantitative photoacoustic tomography reconstruction show that the proposed multigrid inversion can dramatically reduce the required number of iterations for the optimization process without loss of reliability in the results. PMID:26203371
Multigrid solution of internal flows using unstructured solution adaptive meshes
NASA Technical Reports Server (NTRS)
Smith, Wayne A.; Blake, Kenneth R.
1992-01-01
This is the final report of the NASA Lewis SBIR Phase 2 Contract Number NAS3-25785, Multigrid Solution of Internal Flows Using Unstructured Solution Adaptive Meshes. The objective of this project, as described in the Statement of Work, is to develop and deliver to NASA a general three-dimensional Navier-Stokes code using unstructured solution-adaptive meshes for accuracy and multigrid techniques for convergence acceleration. The code will primarily be applied, but not necessarily limited, to high speed internal flows in turbomachinery.
Fast animation of lightning using an adaptive mesh.
Kim, Theodore; Lin, Ming C
2007-01-01
We present a fast method for simulating, animating, and rendering lightning using adaptive grids. The "dielectric breakdown model" is an elegant algorithm for electrical pattern formation that we extend to enable animation of lightning. The simulation can be slow, particularly in 3D, because it involves solving a large Poisson problem. Losasso et al. recently proposed an octree data structure for simulating water and smoke, and we show that this discretization can be applied to the problem of lightning simulation as well. However, implementing the incomplete Cholesky conjugate gradient (ICCG) solver for this problem can be daunting, so we provide an extensive discussion of implementation issues. ICCG solvers can usually be accelerated using "Eisenstat's trick," but the trick cannot be directly applied to the adaptive case. Fortunately, we show that an "almost incomplete Cholesky" factorization can be computed so that Eisenstat's trick can still be used. We then present a fast rendering method based on convolution that is competitive with Monte Carlo ray tracing but orders of magnitude faster, and we also show how to further improve the visual results using jittering.
The novel high-performance 3-D MT inverse solver
NASA Astrophysics Data System (ADS)
Kruglyakov, Mikhail; Geraskin, Alexey; Kuvshinov, Alexey
2016-04-01
We present novel, robust, scalable, and fast 3-D magnetotelluric (MT) inverse solver. The solver is written in multi-language paradigm to make it as efficient, readable and maintainable as possible. Separation of concerns and single responsibility concepts go through implementation of the solver. As a forward modelling engine a modern scalable solver extrEMe, based on contracting integral equation approach, is used. Iterative gradient-type (quasi-Newton) optimization scheme is invoked to search for (regularized) inverse problem solution, and adjoint source approach is used to calculate efficiently the gradient of the misfit. The inverse solver is able to deal with highly detailed and contrasting models, allows for working (separately or jointly) with any type of MT responses, and supports massive parallelization. Moreover, different parallelization strategies implemented in the code allow optimal usage of available computational resources for a given problem statement. To parameterize an inverse domain the so-called mask parameterization is implemented, which means that one can merge any subset of forward modelling cells in order to account for (usually) irregular distribution of observation sites. We report results of 3-D numerical experiments aimed at analysing the robustness, performance and scalability of the code. In particular, our computational experiments carried out at different platforms ranging from modern laptops to HPC Piz Daint (6th supercomputer in the world) demonstrate practically linear scalability of the code up to thousands of nodes.
Removal of the Gibbs phenomenon and its application to fast-Fourier-transform-based mode solvers.
Wangüemert-Pérez, J G; Godoy-Rubio, R; Ortega-Moñux, A; Molina-Fernández, I
2007-12-01
A simple strategy for accurately recovering discontinuous functions from their Fourier series coefficients is presented. The aim of the proposed approach, named spectrum splitting (SS), is to remove the Gibbs phenomenon by making use of signal-filtering-based concepts and some properties of the Fourier series. While the technique can be used in a vast range of situations, it is particularly suitable for being incorporated into fast-Fourier-transform-based electromagnetic mode solvers (FFT-MSs), which are known to suffer from very poor convergence rates when applied to situations where the field distributions are highly discontinuous (e.g., silicon-on-insulator photonic wires). The resultant method, SS-FFT-MS, is exhaustively tested under the assumption of a simplified one-dimensional model, clearly showing a dramatic improvement of the convergence rates with respect to the original FFT-based methods.
A stochastic-dynamic model for global atmospheric mass field statistics
NASA Technical Reports Server (NTRS)
Ghil, M.; Balgovind, R.; Kalnay-Rivas, E.
1981-01-01
A model that yields the spatial correlation structure of atmospheric mass field forecast errors was developed. The model is governed by the potential vorticity equation forced by random noise. Expansion in spherical harmonics and correlation function was computed analytically using the expansion coefficients. The finite difference equivalent was solved using a fast Poisson solver and the correlation function was computed using stratified sampling of the individual realization of F(omega) and hence of phi(omega). A higher order equation for gamma was derived and solved directly in finite differences by two successive applications of the fast Poisson solver. The methods were compared for accuracy and efficiency and the third method was chosen as clearly superior. The results agree well with the latitude dependence of observed atmospheric correlation data. The value of the parameter c sub o which gives the best fit to the data is close to the value expected from dynamical considerations.
A GPU accelerated and error-controlled solver for the unbounded Poisson equation in three dimensions
NASA Astrophysics Data System (ADS)
Exl, Lukas
2017-12-01
An efficient solver for the three dimensional free-space Poisson equation is presented. The underlying numerical method is based on finite Fourier series approximation. While the error of all involved approximations can be fully controlled, the overall computation error is driven by the convergence of the finite Fourier series of the density. For smooth and fast-decaying densities the proposed method will be spectrally accurate. The method scales with O(N log N) operations, where N is the total number of discretization points in the Cartesian grid. The majority of the computational costs come from fast Fourier transforms (FFT), which makes it ideal for GPU computation. Several numerical computations on CPU and GPU validate the method and show efficiency and convergence behavior. Tests are performed using the Vienna Scientific Cluster 3 (VSC3). A free MATLAB implementation for CPU and GPU is provided to the interested community.
Multistage Schemes with Multigrid for Euler and Navier-Strokes Equations: Components and Analysis
NASA Technical Reports Server (NTRS)
Swanson, R. C.; Turkel, Eli
1997-01-01
A class of explicit multistage time-stepping schemes with centered spatial differencing and multigrids are considered for the compressible Euler and Navier-Stokes equations. These schemes are the basis for a family of computer programs (flow codes with multigrid (FLOMG) series) currently used to solve a wide range of fluid dynamics problems, including internal and external flows. In this paper, the components of these multistage time-stepping schemes are defined, discussed, and in many cases analyzed to provide additional insight into their behavior. Special emphasis is given to numerical dissipation, stability of Runge-Kutta schemes, and the convergence acceleration techniques of multigrid and implicit residual smoothing. Both the Baldwin and Lomax algebraic equilibrium model and the Johnson and King one-half equation nonequilibrium model are used to establish turbulence closure. Implementation of these models is described.
Application of PDSLin to the magnetic reconnection problem
NASA Astrophysics Data System (ADS)
Yuan, Xuefei; Li, Xiaoye S.; Yamazaki, Ichitaro; Jardin, Stephen C.; Koniges, Alice E.; Keyes, David E.
2013-01-01
Magnetic reconnection is a fundamental process in a magnetized plasma at both low and high magnetic Lundquist numbers (the ratio of the resistive diffusion time to the Alfvén wave transit time), which occurs in a wide variety of laboratory and space plasmas, e.g. magnetic fusion experiments, the solar corona and the Earth's magnetotail. An implicit time advance for the two-fluid magnetic reconnection problem is known to be difficult because of the large condition number of the associated matrix. This is especially troublesome when the collisionless ion skin depth is large so that the Whistler waves, which cause the fast reconnection, dominate the physics (Yuan et al 2012 J. Comput. Phys. 231 5822-53). For small system sizes, a direct solver such as SuperLU can be employed to obtain an accurate solution as long as the condition number is bounded by the reciprocal of the floating-point machine precision. However, SuperLU scales effectively only to hundreds of processors or less. For larger system sizes, it has been shown that physics-based (Chacón and Knoll 2003 J. Comput. Phys. 188 573-92) or other preconditioners can be applied to provide adequate solver performance. In recent years, we have been developing a new algebraic hybrid linear solver, PDSLin (Parallel Domain decomposition Schur complement-based Linear solver) (Yamazaki and Li 2010 Proc. VECPAR pp 421-34 and Yamazaki et al 2011 Technical Report). In this work, we compare numerical results from a direct solver and the proposed hybrid solver for the magnetic reconnection problem and demonstrate that the new hybrid solver is scalable to thousands of processors while maintaining the same robustness as a direct solver.
Etude des performances de solveurs deterministes sur un coeur rapide a caloporteur sodium
NASA Astrophysics Data System (ADS)
Bay, Charlotte
The reactors of next generation, in particular SFR model, represent a true challenge for current codes and solvers, used mainly for thermic cores. There is no guarantee that their competences could be straight adapted to fast neutron spectrum, or to major design differences. Thus it is necessary to assess the validity of solvers and their potential shortfall in the case of fast neutron reactors. As part of an internship with CEA (France), and at the instigation of EPM Nuclear Institute, this study concerns the following codes : DRAGON/DONJON, ERANOS, PARIS and APOLLO3. The precision assessment has been performed using Monte Carlo code TRIPOLI4. Only core calculation was of interest, namely numerical methods competences in precision and rapidity. Lattice code was not part of the study, that is to say nuclear data, self-shielding, or isotopic compositions. Nor was tackled burnup or time evolution effects. The study consists in two main steps : first evaluating the sensitivity of each solver to calculation parameters, and obtain its optimal calculation set ; then compare their competences in terms of precision and rapidity, by collecting usual quantities (effective multiplication factor, reaction rates map), but also more specific quantities which are crucial to the SFR design, namely control rod worth and sodium void effect. The calculation time is also a key factor. Whatever conclusion or recommendation that could be drawn from this study, they must first of all be applied within similar frameworks, that is to say small fast neutron cores with hexagonal geometry. Eventual adjustments for big cores will have to be demonstrated in developments of this study.
Efficient dense blur map estimation for automatic 2D-to-3D conversion
NASA Astrophysics Data System (ADS)
Vosters, L. P. J.; de Haan, G.
2012-03-01
Focus is an important depth cue for 2D-to-3D conversion of low depth-of-field images and video. However, focus can be only reliably estimated on edges. Therefore, Bea et al. [1] first proposed an optimization based approach to propagate focus to non-edge image portions, for single image focus editing. While their approach produces accurate dense blur maps, the computational complexity and memory requirements for solving the resulting sparse linear system with standard multigrid or (multilevel) preconditioning techniques, are infeasible within the stringent requirements of the consumer electronics and broadcast industry. In this paper we propose fast, efficient, low latency, line scanning based focus propagation, which mitigates the need for complex multigrid or (multilevel) preconditioning techniques. In addition we propose facial blur compensation to compensate for false shading edges that cause incorrect blur estimates in people's faces. In general shading leads to incorrect focus estimates, which may lead to unnatural 3D and visual discomfort. Since visual attention mostly tends to faces, our solution solves the most distracting errors. A subjective assessment by paired comparison on a set of challenging low-depth-of-field images shows that the proposed approach achieves equal 3D image quality as optimization based approaches, and that facial blur compensation results in a significant improvement.
Ge, Liang; Sotiropoulos, Fotis
2007-08-01
A novel numerical method is developed that integrates boundary-conforming grids with a sharp interface, immersed boundary methodology. The method is intended for simulating internal flows containing complex, moving immersed boundaries such as those encountered in several cardiovascular applications. The background domain (e.g the empty aorta) is discretized efficiently with a curvilinear boundary-fitted mesh while the complex moving immersed boundary (say a prosthetic heart valve) is treated with the sharp-interface, hybrid Cartesian/immersed-boundary approach of Gilmanov and Sotiropoulos [1]. To facilitate the implementation of this novel modeling paradigm in complex flow simulations, an accurate and efficient numerical method is developed for solving the unsteady, incompressible Navier-Stokes equations in generalized curvilinear coordinates. The method employs a novel, fully-curvilinear staggered grid discretization approach, which does not require either the explicit evaluation of the Christoffel symbols or the discretization of all three momentum equations at cell interfaces as done in previous formulations. The equations are integrated in time using an efficient, second-order accurate fractional step methodology coupled with a Jacobian-free, Newton-Krylov solver for the momentum equations and a GMRES solver enhanced with multigrid as preconditioner for the Poisson equation. Several numerical experiments are carried out on fine computational meshes to demonstrate the accuracy and efficiency of the proposed method for standard benchmark problems as well as for unsteady, pulsatile flow through a curved, pipe bend. To demonstrate the ability of the method to simulate flows with complex, moving immersed boundaries we apply it to calculate pulsatile, physiological flow through a mechanical, bileaflet heart valve mounted in a model straight aorta with an anatomical-like triple sinus.
Ge, Liang; Sotiropoulos, Fotis
2008-01-01
A novel numerical method is developed that integrates boundary-conforming grids with a sharp interface, immersed boundary methodology. The method is intended for simulating internal flows containing complex, moving immersed boundaries such as those encountered in several cardiovascular applications. The background domain (e.g the empty aorta) is discretized efficiently with a curvilinear boundary-fitted mesh while the complex moving immersed boundary (say a prosthetic heart valve) is treated with the sharp-interface, hybrid Cartesian/immersed-boundary approach of Gilmanov and Sotiropoulos [1]. To facilitate the implementation of this novel modeling paradigm in complex flow simulations, an accurate and efficient numerical method is developed for solving the unsteady, incompressible Navier-Stokes equations in generalized curvilinear coordinates. The method employs a novel, fully-curvilinear staggered grid discretization approach, which does not require either the explicit evaluation of the Christoffel symbols or the discretization of all three momentum equations at cell interfaces as done in previous formulations. The equations are integrated in time using an efficient, second-order accurate fractional step methodology coupled with a Jacobian-free, Newton-Krylov solver for the momentum equations and a GMRES solver enhanced with multigrid as preconditioner for the Poisson equation. Several numerical experiments are carried out on fine computational meshes to demonstrate the accuracy and efficiency of the proposed method for standard benchmark problems as well as for unsteady, pulsatile flow through a curved, pipe bend. To demonstrate the ability of the method to simulate flows with complex, moving immersed boundaries we apply it to calculate pulsatile, physiological flow through a mechanical, bileaflet heart valve mounted in a model straight aorta with an anatomical-like triple sinus. PMID:19194533
A robust multilevel simultaneous eigenvalue solver
NASA Technical Reports Server (NTRS)
Costiner, Sorin; Taasan, Shlomo
1993-01-01
Multilevel (ML) algorithms for eigenvalue problems are often faced with several types of difficulties such as: the mixing of approximated eigenvectors by the solution process, the approximation of incomplete clusters of eigenvectors, the poor representation of solution on coarse levels, and the existence of close or equal eigenvalues. Algorithms that do not treat appropriately these difficulties usually fail, or their performance degrades when facing them. These issues motivated the development of a robust adaptive ML algorithm which treats these difficulties, for the calculation of a few eigenvectors and their corresponding eigenvalues. The main techniques used in the new algorithm include: the adaptive completion and separation of the relevant clusters on different levels, the simultaneous treatment of solutions within each cluster, and the robustness tests which monitor the algorithm's efficiency and convergence. The eigenvectors' separation efficiency is based on a new ML projection technique generalizing the Rayleigh Ritz projection, combined with a technique, the backrotations. These separation techniques, when combined with an FMG formulation, in many cases lead to algorithms of O(qN) complexity, for q eigenvectors of size N on the finest level. Previously developed ML algorithms are less focused on the mentioned difficulties. Moreover, algorithms which employ fine level separation techniques are of O(q(sub 2)N) complexity and usually do not overcome all these difficulties. Computational examples are presented where Schrodinger type eigenvalue problems in 2-D and 3-D, having equal and closely clustered eigenvalues, are solved with the efficiency of the Poisson multigrid solver. A second order approximation is obtained in O(qN) work, where the total computational work is equivalent to only a few fine level relaxations per eigenvector.
NASA Astrophysics Data System (ADS)
Becker, Matthew Rand
I present a new algorithm, CALCLENS, for efficiently computing weak gravitational lensing shear signals from large N-body light cone simulations over a curved sky. This new algorithm properly accounts for the sky curvature and boundary conditions, is able to produce redshift- dependent shear signals including corrections to the Born approximation by using multiple- plane ray tracing, and properly computes the lensed images of source galaxies in the light cone. The key feature of this algorithm is a new, computationally efficient Poisson solver for the sphere that combines spherical harmonic transform and multigrid methods. As a result, large areas of sky (~10,000 square degrees) can be ray traced efficiently at high-resolution using only a few hundred cores. Using this new algorithm and curved-sky calculations that only use a slower but more accurate spherical harmonic transform Poisson solver, I study the convergence, shear E-mode, shear B-mode and rotation mode power spectra. Employing full-sky E/B-mode decompositions, I confirm that the numerically computed shear B-mode and rotation mode power spectra are equal at high accuracy ( ≲ 1%) as expected from perturbation theory up to second order. Coupled with realistic galaxy populations placed in large N-body light cone simulations, this new algorithm is ideally suited for the construction of synthetic weak lensing shear catalogs to be used to test for systematic effects in data analysis procedures for upcoming large-area sky surveys. The implementation presented in this work, written in C and employing widely available software libraries to maintain portability, is publicly available at http://code.google.com/p/calclens.
NASA Astrophysics Data System (ADS)
Becker, Matthew R.
2013-10-01
I present a new algorithm, Curved-sky grAvitational Lensing for Cosmological Light conE simulatioNS (CALCLENS), for efficiently computing weak gravitational lensing shear signals from large N-body light cone simulations over a curved sky. This new algorithm properly accounts for the sky curvature and boundary conditions, is able to produce redshift-dependent shear signals including corrections to the Born approximation by using multiple-plane ray tracing and properly computes the lensed images of source galaxies in the light cone. The key feature of this algorithm is a new, computationally efficient Poisson solver for the sphere that combines spherical harmonic transform and multigrid methods. As a result, large areas of sky (˜10 000 square degrees) can be ray traced efficiently at high resolution using only a few hundred cores. Using this new algorithm and curved-sky calculations that only use a slower but more accurate spherical harmonic transform Poisson solver, I study the convergence, shear E-mode, shear B-mode and rotation mode power spectra. Employing full-sky E/B-mode decompositions, I confirm that the numerically computed shear B-mode and rotation mode power spectra are equal at high accuracy (≲1 per cent) as expected from perturbation theory up to second order. Coupled with realistic galaxy populations placed in large N-body light cone simulations, this new algorithm is ideally suited for the construction of synthetic weak lensing shear catalogues to be used to test for systematic effects in data analysis procedures for upcoming large-area sky surveys. The implementation presented in this work, written in C and employing widely available software libraries to maintain portability, is publicly available at http://code.google.com/p/calclens.
Multigrid Approach to Incompressible Viscous Cavity Flows
NASA Technical Reports Server (NTRS)
Wood, William A.
1996-01-01
Two-dimensional incompressible viscous driven-cavity flows are computed for Reynolds numbers on the range 100-20,000 using a loosely coupled, implicit, second-order centrally-different scheme. Mesh sequencing and three-level V-cycle multigrid error smoothing are incorporated into the symmetric Gauss-Seidel time-integration algorithm. Parametrics on the numerical parameters are performed, achieving reductions in solution times by more than 60 percent with the full multigrid approach. Details of the circulation patterns are investigated in cavities of 2-to-1, 1-to-1, and 1-to-2 depth to width ratios.
1988-08-01
Time Series 53. J. Barros-Neto and R. A. Artino, Hypoelliptic Boundary-Value Problems 54. R. L. Sternberg, A. J. Kalinowski, and J. S. Papadakis... Systems 95. C E. AuL Rings of Continuous Functions 96. R. Chuaqui, Analysis , Geometry, and Probability 97. L. Fuchs and L. Sace, Modules Over...Local Refinements for a Class of Nonshared Memory Systems 449 Hermann Mierendorif Analysis of a Multigrid Method for the Euler Equations of Gas Dynamics
A positivity-preserving, implicit defect-correction multigrid method for turbulent combustion
NASA Astrophysics Data System (ADS)
Wasserman, M.; Mor-Yossef, Y.; Greenberg, J. B.
2016-07-01
A novel, robust multigrid method for the simulation of turbulent and chemically reacting flows is developed. A survey of previous attempts at implementing multigrid for the problems at hand indicated extensive use of artificial stabilization to overcome numerical instability arising from non-linearity of turbulence and chemistry model source-terms, small-scale physics of combustion, and loss of positivity. These issues are addressed in the current work. The highly stiff Reynolds-averaged Navier-Stokes (RANS) equations, coupled with turbulence and finite-rate chemical kinetics models, are integrated in time using the unconditionally positive-convergent (UPC) implicit method. The scheme is successfully extended in this work for use with chemical kinetics models, in a fully-coupled multigrid (FC-MG) framework. To tackle the degraded performance of multigrid methods for chemically reacting flows, two major modifications are introduced with respect to the basic, Full Approximation Storage (FAS) approach. First, a novel prolongation operator that is based on logarithmic variables is proposed to prevent loss of positivity due to coarse-grid corrections. Together with the extended UPC implicit scheme, the positivity-preserving prolongation operator guarantees unconditional positivity of turbulence quantities and species mass fractions throughout the multigrid cycle. Second, to improve the coarse-grid-correction obtained in localized regions of high chemical activity, a modified defect correction procedure is devised, and successfully applied for the first time to simulate turbulent, combusting flows. The proposed modifications to the standard multigrid algorithm create a well-rounded and robust numerical method that provides accelerated convergence, while unconditionally preserving the positivity of model equation variables. Numerical simulations of various flows involving premixed combustion demonstrate that the proposed MG method increases the efficiency by a factor of up to eight times with respect to an equivalent single-grid method, and by two times with respect to an artificially-stabilized MG method.
Monolithic multigrid methods for two-dimensional resistive magnetohydrodynamics
Adler, James H.; Benson, Thomas R.; Cyr, Eric C.; ...
2016-01-06
Magnetohydrodynamic (MHD) representations are used to model a wide range of plasma physics applications and are characterized by a nonlinear system of partial differential equations that strongly couples a charged fluid with the evolution of electromagnetic fields. The resulting linear systems that arise from discretization and linearization of the nonlinear problem are generally difficult to solve. In this paper, we investigate multigrid preconditioners for this system. We consider two well-known multigrid relaxation methods for incompressible fluid dynamics: Braess--Sarazin relaxation and Vanka relaxation. We first extend these to the context of steady-state one-fluid viscoresistive MHD. Then we compare the two relaxationmore » procedures within a multigrid-preconditioned GMRES method employed within Newton's method. To isolate the effects of the different relaxation methods, we use structured grids, inf-sup stable finite elements, and geometric interpolation. Furthermore, we present convergence and timing results for a two-dimensional, steady-state test problem.« less
Parallel multigrid smoothing: polynomial versus Gauss-Seidel
NASA Astrophysics Data System (ADS)
Adams, Mark; Brezina, Marian; Hu, Jonathan; Tuminaro, Ray
2003-07-01
Gauss-Seidel is often the smoother of choice within multigrid applications. In the context of unstructured meshes, however, maintaining good parallel efficiency is difficult with multiplicative iterative methods such as Gauss-Seidel. This leads us to consider alternative smoothers. We discuss the computational advantages of polynomial smoothers within parallel multigrid algorithms for positive definite symmetric systems. Two particular polynomials are considered: Chebyshev and a multilevel specific polynomial. The advantages of polynomial smoothing over traditional smoothers such as Gauss-Seidel are illustrated on several applications: Poisson's equation, thin-body elasticity, and eddy current approximations to Maxwell's equations. While parallelizing the Gauss-Seidel method typically involves a compromise between a scalable convergence rate and maintaining high flop rates, polynomial smoothers achieve parallel scalable multigrid convergence rates without sacrificing flop rates. We show that, although parallel computers are the main motivation, polynomial smoothers are often surprisingly competitive with Gauss-Seidel smoothers on serial machines.
Multigrid methods for isogeometric discretization
Gahalaut, K.P.S.; Kraus, J.K.; Tomar, S.K.
2013-01-01
We present (geometric) multigrid methods for isogeometric discretization of scalar second order elliptic problems. The smoothing property of the relaxation method, and the approximation property of the intergrid transfer operators are analyzed. These properties, when used in the framework of classical multigrid theory, imply uniform convergence of two-grid and multigrid methods. Supporting numerical results are provided for the smoothing property, the approximation property, convergence factor and iterations count for V-, W- and F-cycles, and the linear dependence of V-cycle convergence on the smoothing steps. For two dimensions, numerical results include the problems with variable coefficients, simple multi-patch geometry, a quarter annulus, and the dependence of convergence behavior on refinement levels ℓ, whereas for three dimensions, only the constant coefficient problem in a unit cube is considered. The numerical results are complete up to polynomial order p=4, and for C0 and Cp-1 smoothness. PMID:24511168
A multi-level solution algorithm for steady-state Markov chains
NASA Technical Reports Server (NTRS)
Horton, Graham; Leutenegger, Scott T.
1993-01-01
A new iterative algorithm, the multi-level algorithm, for the numerical solution of steady state Markov chains is presented. The method utilizes a set of recursively coarsened representations of the original system to achieve accelerated convergence. It is motivated by multigrid methods, which are widely used for fast solution of partial differential equations. Initial results of numerical experiments are reported, showing significant reductions in computation time, often an order of magnitude or more, relative to the Gauss-Seidel and optimal SOR algorithms for a variety of test problems. The multi-level method is compared and contrasted with the iterative aggregation-disaggregation algorithm of Takahashi.
Robust parallel iterative solvers for linear and least-squares problems, Final Technical Report
DOE Office of Scientific and Technical Information (OSTI.GOV)
Saad, Yousef
2014-01-16
The primary goal of this project is to study and develop robust iterative methods for solving linear systems of equations and least squares systems. The focus of the Minnesota team is on algorithms development, robustness issues, and on tests and validation of the methods on realistic problems. 1. The project begun with an investigation on how to practically update a preconditioner obtained from an ILU-type factorization, when the coefficient matrix changes. 2. We investigated strategies to improve robustness in parallel preconditioners in a specific case of a PDE with discontinuous coefficients. 3. We explored ways to adapt standard preconditioners formore » solving linear systems arising from the Helmholtz equation. These are often difficult linear systems to solve by iterative methods. 4. We have also worked on purely theoretical issues related to the analysis of Krylov subspace methods for linear systems. 5. We developed an effective strategy for performing ILU factorizations for the case when the matrix is highly indefinite. The strategy uses shifting in some optimal way. The method was extended to the solution of Helmholtz equations by using complex shifts, yielding very good results in many cases. 6. We addressed the difficult problem of preconditioning sparse systems of equations on GPUs. 7. A by-product of the above work is a software package consisting of an iterative solver library for GPUs based on CUDA. This was made publicly available. It was the first such library that offers complete iterative solvers for GPUs. 8. We considered another form of ILU which blends coarsening techniques from Multigrid with algebraic multilevel methods. 9. We have released a new version on our parallel solver - called pARMS [new version is version 3]. As part of this we have tested the code in complex settings - including the solution of Maxwell and Helmholtz equations and for a problem of crystal growth.10. As an application of polynomial preconditioning we considered the problem of evaluating f(A)v which arises in statistical sampling. 11. As an application to the methods we developed, we tackled the problem of computing the diagonal of the inverse of a matrix. This arises in statistical applications as well as in many applications in physics. We explored probing methods as well as domain-decomposition type methods. 12. A collaboration with researchers from Toulouse, France, considered the important problem of computing the Schur complement in a domain-decomposition approach. 13. We explored new ways of preconditioning linear systems, based on low-rank approximations.« less
Ghysels, Pieter; Li, Xiaoye S.; Rouet, Francois -Henry; ...
2016-10-27
Here, we present a sparse linear system solver that is based on a multifrontal variant of Gaussian elimination and exploits low-rank approximation of the resulting dense frontal matrices. We use hierarchically semiseparable (HSS) matrices, which have low-rank off-diagonal blocks, to approximate the frontal matrices. For HSS matrix construction, a randomized sampling algorithm is used together with interpolative decompositions. The combination of the randomized compression with a fast ULV HSS factoriz ation leads to a solver with lower computational complexity than the standard multifrontal method for many applications, resulting in speedups up to 7 fold for problems in our test suite.more » The implementation targets many-core systems by using task parallelism with dynamic runtime scheduling. Numerical experiments show performance improvements over state-of-the-art sparse direct solvers. The implementation achieves high performance and good scalability on a range of modern shared memory parallel systems, including the Intel Xeon Phi (MIC). The code is part of a software package called STRUMPACK - STRUctured Matrices PACKage, which also has a distributed memory component for dense rank-structured matrices.« less
Implicit integration methods for dislocation dynamics
Gardner, D. J.; Woodward, C. S.; Reynolds, D. R.; ...
2015-01-20
In dislocation dynamics simulations, strain hardening simulations require integrating stiff systems of ordinary differential equations in time with expensive force calculations, discontinuous topological events, and rapidly changing problem size. Current solvers in use often result in small time steps and long simulation times. Faster solvers may help dislocation dynamics simulations accumulate plastic strains at strain rates comparable to experimental observations. Here, this paper investigates the viability of high order implicit time integrators and robust nonlinear solvers to reduce simulation run times while maintaining the accuracy of the computed solution. In particular, implicit Runge-Kutta time integrators are explored as a waymore » of providing greater accuracy over a larger time step than is typically done with the standard second-order trapezoidal method. In addition, both accelerated fixed point and Newton's method are investigated to provide fast and effective solves for the nonlinear systems that must be resolved within each time step. Results show that integrators of third order are the most effective, while accelerated fixed point and Newton's method both improve solver performance over the standard fixed point method used for the solution of the nonlinear systems.« less
Hollaus, K; Weiss, B; Magele, Ch; Hutten, H
2004-02-01
The acceleration of the solution of the quasi-static electric field problem considering anisotropic complex conductivity simulated by tetrahedral finite elements of first order is investigated by geometric multigrid.
Fast GPU-based computation of spatial multigrid multiframe LMEM for PET.
Nassiri, Moulay Ali; Carrier, Jean-François; Després, Philippe
2015-09-01
Significant efforts were invested during the last decade to accelerate PET list-mode reconstructions, notably with GPU devices. However, the computation time per event is still relatively long, and the list-mode efficiency on the GPU is well below the histogram-mode efficiency. Since list-mode data are not arranged in any regular pattern, costly accesses to the GPU global memory can hardly be optimized and geometrical symmetries cannot be used. To overcome obstacles that limit the acceleration of reconstruction from list-mode on the GPU, a multigrid and multiframe approach of an expectation-maximization algorithm was developed. The reconstruction process is started during data acquisition, and calculations are executed concurrently on the GPU and the CPU, while the system matrix is computed on-the-fly. A new convergence criterion also was introduced, which is computationally more efficient on the GPU. The implementation was tested on a Tesla C2050 GPU device for a Gemini GXL PET system geometry. The results show that the proposed algorithm (multigrid and multiframe list-mode expectation-maximization, MGMF-LMEM) converges to the same solution as the LMEM algorithm more than three times faster. The execution time of the MGMF-LMEM algorithm was 1.1 s per million of events on the Tesla C2050 hardware used, for a reconstructed space of 188 x 188 x 57 voxels of 2 x 2 x 3.15 mm3. For 17- and 22-mm simulated hot lesions, the MGMF-LMEM algorithm led on the first iteration to contrast recovery coefficients (CRC) of more than 75 % of the maximum CRC while achieving a minimum in the relative mean square error. Therefore, the MGMF-LMEM algorithm can be used as a one-pass method to perform real-time reconstructions for low-count acquisitions, as in list-mode gated studies. The computation time for one iteration and 60 millions of events was approximately 66 s.
A fast collocation method for a variable-coefficient nonlocal diffusion model
NASA Astrophysics Data System (ADS)
Wang, Che; Wang, Hong
2017-02-01
We develop a fast collocation scheme for a variable-coefficient nonlocal diffusion model, for which a numerical discretization would yield a dense stiffness matrix. The development of the fast method is achieved by carefully handling the variable coefficients appearing inside the singular integral operator and exploiting the structure of the dense stiffness matrix. The resulting fast method reduces the computational work from O (N3) required by a commonly used direct solver to O (Nlog N) per iteration and the memory requirement from O (N2) to O (N). Furthermore, the fast method reduces the computational work of assembling the stiffness matrix from O (N2) to O (N). Numerical results are presented to show the utility of the fast method.
Integrated multidisciplinary CAD/CAE environment for micro-electro-mechanical systems (MEMS)
NASA Astrophysics Data System (ADS)
Przekwas, Andrzej J.
1999-03-01
Computational design of MEMS involves several strongly coupled physical disciplines, including fluid mechanics, heat transfer, stress/deformation dynamics, electronics, electro/magneto statics, calorics, biochemistry and others. CFDRC is developing a new generation multi-disciplinary CAD systems for MEMS using high-fidelity field solvers on unstructured, solution-adaptive grids for a full range of disciplines. The software system, ACE + MEMS, includes all essential CAD tools; geometry/grid generation for multi- discipline, multi-equation solvers, GUI, tightly coupled configurable 3D field solvers for FVM, FEM and BEM and a 3D visualization/animation tool. The flow/heat transfer/calorics/chemistry equations are solved with unstructured adaptive FVM solver, stress/deformation are computed with a FEM STRESS solver and a FAST BEM solver is used to solve linear heat transfer, electro/magnetostatics and elastostatics equations on adaptive polygonal surface grids. Tight multidisciplinary coupling and automatic interoperability between the tools was achieved by designing a comprehensive database structure and APIs for complete model definition. The virtual model definition is implemented in data transfer facility, a publicly available tool described in this paper. The paper presents overall description of the software architecture and MEMS design flow in ACE + MEMS. It describes current status, ongoing effort and future plans for the software. The paper also discusses new concepts of mixed-level and mixed- dimensionality capability in which 1D microfluidic networks are simulated concurrently with 3D high-fidelity models of discrete components.
Multigrid Methods for Aerodynamic Problems in Complex Geometries
NASA Technical Reports Server (NTRS)
Caughey, David A.
1995-01-01
Work has been directed at the development of efficient multigrid methods for the solution of aerodynamic problems involving complex geometries, including the development of computational methods for the solution of both inviscid and viscous transonic flow problems. The emphasis is on problems of complex, three-dimensional geometry. The methods developed are based upon finite-volume approximations to both the Euler and the Reynolds-Averaged Navier-Stokes equations. The methods are developed for use on multi-block grids using diagonalized implicit multigrid methods to achieve computational efficiency. The work is focused upon aerodynamic problems involving complex geometries, including advanced engine inlets.
A multigrid method for steady Euler equations on unstructured adaptive grids
NASA Technical Reports Server (NTRS)
Riemslagh, Kris; Dick, Erik
1993-01-01
A flux-difference splitting type algorithm is formulated for the steady Euler equations on unstructured grids. The polynomial flux-difference splitting technique is used. A vertex-centered finite volume method is employed on a triangular mesh. The multigrid method is in defect-correction form. A relaxation procedure with a first order accurate inner iteration and a second-order correction performed only on the finest grid, is used. A multi-stage Jacobi relaxation method is employed as a smoother. Since the grid is unstructured a Jacobi type is chosen. The multi-staging is necessary to provide sufficient smoothing properties. The domain is discretized using a Delaunay triangular mesh generator. Three grids with more or less uniform distribution of nodes but with different resolution are generated by successive refinement of the coarsest grid. Nodes of coarser grids appear in the finer grids. The multigrid method is started on these grids. As soon as the residual drops below a threshold value, an adaptive refinement is started. The solution on the adaptively refined grid is accelerated by a multigrid procedure. The coarser multigrid grids are generated by successive coarsening through point removement. The adaption cycle is repeated a few times. Results are given for the transonic flow over a NACA-0012 airfoil.
Compressed Scattering Matrices and Fast Direct Solvers
2007-10-18
50, vol. 3B, Washington, DC, USA, 2005. [19] M. Jun, L. Mingyu , and E. Michielssen, "A fast space-adaptive algorithm to evaluate transient wave fields...the 2005 IEEE Antennas and Propagation Society International Symposium, pp 163-166, vol. 3A, Washington, DC, USA, 2005. [22] C. Qin, L. Mingyu , L...Society International Symposium, 2975-2978, Albuquerque, NM, USA, 2006. [30] M. Jun, L. Mingyu , and E. Michielssen, "Towards efficient and stable low
LSRN: A PARALLEL ITERATIVE SOLVER FOR STRONGLY OVER- OR UNDERDETERMINED SYSTEMS*
Meng, Xiangrui; Saunders, Michael A.; Mahoney, Michael W.
2014-01-01
We describe a parallel iterative least squares solver named LSRN that is based on random normal projection. LSRN computes the min-length solution to minx∈ℝn ‖Ax − b‖2, where A ∈ ℝm × n with m ≫ n or m ≪ n, and where A may be rank-deficient. Tikhonov regularization may also be included. Since A is involved only in matrix-matrix and matrix-vector multiplications, it can be a dense or sparse matrix or a linear operator, and LSRN automatically speeds up when A is sparse or a fast linear operator. The preconditioning phase consists of a random normal projection, which is embarrassingly parallel, and a singular value decomposition of size ⌈γ min(m, n)⌉ × min(m, n), where γ is moderately larger than 1, e.g., γ = 2. We prove that the preconditioned system is well-conditioned, with a strong concentration result on the extreme singular values, and hence that the number of iterations is fully predictable when we apply LSQR or the Chebyshev semi-iterative method. As we demonstrate, the Chebyshev method is particularly efficient for solving large problems on clusters with high communication cost. Numerical results show that on a shared-memory machine, LSRN is very competitive with LAPACK’s DGELSD and a fast randomized least squares solver called Blendenpik on large dense problems, and it outperforms the least squares solver from SuiteSparseQR on sparse problems without sparsity patterns that can be exploited to reduce fill-in. Further experiments show that LSRN scales well on an Amazon Elastic Compute Cloud cluster. PMID:25419094
Strategies for global optimization in photonics design.
Vukovic, Ana; Sewell, Phillip; Benson, Trevor M
2010-10-01
This paper reports on two important issues that arise in the context of the global optimization of photonic components where large problem spaces must be investigated. The first is the implementation of a fast simulation method and associated matrix solver for assessing particular designs and the second, the strategies that a designer can adopt to control the size of the problem design space to reduce runtimes without compromising the convergence of the global optimization tool. For this study an analytical simulation method based on Mie scattering and a fast matrix solver exploiting the fast multipole method are combined with genetic algorithms (GAs). The impact of the approximations of the simulation method on the accuracy and runtime of individual design assessments and the consequent effects on the GA are also examined. An investigation of optimization strategies for controlling the design space size is conducted on two illustrative examples, namely, 60° and 90° waveguide bends based on photonic microstructures, and their effectiveness is analyzed in terms of a GA's ability to converge to the best solution within an acceptable timeframe. Finally, the paper describes some particular optimized solutions found in the course of this work.
Hesford, Andrew J.; Chew, Weng C.
2010-01-01
The distorted Born iterative method (DBIM) computes iterative solutions to nonlinear inverse scattering problems through successive linear approximations. By decomposing the scattered field into a superposition of scattering by an inhomogeneous background and by a material perturbation, large or high-contrast variations in medium properties can be imaged through iterations that are each subject to the distorted Born approximation. However, the need to repeatedly compute forward solutions still imposes a very heavy computational burden. To ameliorate this problem, the multilevel fast multipole algorithm (MLFMA) has been applied as a forward solver within the DBIM. The MLFMA computes forward solutions in linear time for volumetric scatterers. The typically regular distribution and shape of scattering elements in the inverse scattering problem allow the method to take advantage of data redundancy and reduce the computational demands of the normally expensive MLFMA setup. Additional benefits are gained by employing Kaczmarz-like iterations, where partial measurements are used to accelerate convergence. Numerical results demonstrate both the efficiency of the forward solver and the successful application of the inverse method to imaging problems with dimensions in the neighborhood of ten wavelengths. PMID:20707438
High-speed extended-term time-domain simulation for online cascading analysis of power system
NASA Astrophysics Data System (ADS)
Fu, Chuan
A high-speed extended-term (HSET) time domain simulator (TDS), intended to become a part of an energy management system (EMS), has been newly developed for use in online extended-term dynamic cascading analysis of power systems. HSET-TDS includes the following attributes for providing situational awareness of high-consequence events: (i) online analysis, including n-1 and n-k events, (ii) ability to simulate both fast and slow dynamics for 1-3 hours in advance, (iii) inclusion of rigorous protection-system modeling, (iv) intelligence for corrective action ID, storage, and fast retrieval, and (v) high-speed execution. Very fast on-line computational capability is the most desired attribute of this simulator. Based on the process of solving algebraic differential equations describing the dynamics of power system, HSET-TDS seeks to develop computational efficiency at each of the following hierarchical levels, (i) hardware, (ii) strategies, (iii) integration methods, (iv) nonlinear solvers, and (v) linear solver libraries. This thesis first describes the Hammer-Hollingsworth 4 (HH4) implicit integration method. Like the trapezoidal rule, HH4 is symmetrically A-Stable but it possesses greater high-order precision (h4 ) than the trapezoidal rule. Such precision enables larger integration steps and therefore improves simulation efficiency for variable step size implementations. This thesis provides the underlying theory on which we advocate use of HH4 over other numerical integration methods for power system time-domain simulation. Second, motivated by the need to perform high speed extended-term time domain simulation (HSET-TDS) for on-line purposes, this thesis presents principles for designing numerical solvers of differential algebraic systems associated with power system time-domain simulation, including DAE construction strategies (Direct Solution Method), integration methods(HH4), nonlinear solvers(Very Dishonest Newton), and linear solvers(SuperLU). We have implemented a design appropriate for HSET-TDS, and we compare it to various solvers, including the commercial grade PSSE program, with respect to computational efficiency and accuracy, using as examples the New England 39 bus system, the expanded 8775 bus system, and PJM 13029 buses system. Third, we have explored a stiffness-decoupling method, intended to be part of parallel design of time domain simulation software for super computers. The stiffness-decoupling method is able to combine the advantages of implicit methods (A-stability) and explicit method(less computation). With the new stiffness detection method proposed herein, the stiffness can be captured. The expanded 975 buses system is used to test simulation efficiency. Finally, several parallel strategies for super computer deployment to simulate power system dynamics are proposed and compared. Design A partitions the task via scale with the stiffness decoupling method, waveform relaxation, and parallel linear solver. Design B partitions the task via the time axis using a highly precise integration method, the Kuntzmann-Butcher Method - order 8 (KB8). The strategy of partitioning events is designed to partition the whole simulation via the time axis through a simulated sequence of cascading events. For all strategies proposed, a strategy of partitioning cascading events is recommended, since the sub-tasks for each processor are totally independent, and therefore minimum communication time is needed.
Multigrid Methods for the Computation of Propagators in Gauge Fields
NASA Astrophysics Data System (ADS)
Kalkreuter, Thomas
Multigrid methods were invented for the solution of discretized partial differential equations in order to overcome the slowness of traditional algorithms by updates on various length scales. In the present work generalizations of multigrid methods for propagators in gauge fields are investigated. Gauge fields are incorporated in algorithms in a covariant way. The kernel C of the restriction operator which averages from one grid to the next coarser grid is defined by projection on the ground-state of a local Hamiltonian. The idea behind this definition is that the appropriate notion of smoothness depends on the dynamics. The ground-state projection choice of C can be used in arbitrary dimension and for arbitrary gauge group. We discuss proper averaging operations for bosons and for staggered fermions. The kernels C can also be used in multigrid Monte Carlo simulations, and for the definition of block spins and blocked gauge fields in Monte Carlo renormalization group studies. Actual numerical computations are performed in four-dimensional SU(2) gauge fields. We prove that our proposals for block spins are “good”, using renormalization group arguments. A central result is that the multigrid method works in arbitrarily disordered gauge fields, in principle. It is proved that computations of propagators in gauge fields without critical slowing down are possible when one uses an ideal interpolation kernel. Unfortunately, the idealized algorithm is not practical, but it was important to answer questions of principle. Practical methods are able to outperform the conjugate gradient algorithm in case of bosons. The case of staggered fermions is harder. Multigrid methods give considerable speed-ups compared to conventional relaxation algorithms, but on lattices up to 184 conjugate gradient is superior.
Multigrid Methods in Electronic Structure Calculations
NASA Astrophysics Data System (ADS)
Briggs, Emil
1996-03-01
Multigrid techniques have become the method of choice for a broad range of computational problems. Their use in electronic structure calculations introduces a new set of issues when compared to traditional plane wave approaches. We have developed a set of techniques that address these issues and permit multigrid algorithms to be applied to the electronic structure problem in an efficient manner. In our approach the Kohn-Sham equations are discretized on a real-space mesh using a compact representation of the Hamiltonian. The resulting equations are solved directly on the mesh using multigrid iterations. This produces rapid convergence rates even for ill-conditioned systems with large length and/or energy scales. The method has been applied to both periodic and non-periodic systems containing over 400 atoms and the results are in very good agreement with both theory and experiment. Example applications include a vacancy in diamond, an isolated C60 molecule, and a 64-atom cell of GaN with the Ga d-electrons in valence which required a 250 Ry cutoff. A particular strength of a real-space multigrid approach is its ready adaptability to massively parallel computer architectures. The compact representation of the Hamiltonian is especially well suited to such machines. Tests on the Cray-T3D have shown nearly linear scaling of the execution time up to the maximum number of processors (512). The MPP implementation has been used for studies of a large Amyloid Beta Peptide (C_146O_45N_42H_210) found in the brains of Alzheimers disease patients. Further applications of the multigrid method will also be described. (in collaboration D. J. Sullivan and J. Bernholc)
Multigrid Acceleration of Time-Accurate DNS of Compressible Turbulent Flow
NASA Technical Reports Server (NTRS)
Broeze, Jan; Geurts, Bernard; Kuerten, Hans; Streng, Martin
1996-01-01
An efficient scheme for the direct numerical simulation of 3D transitional and developed turbulent flow is presented. Explicit and implicit time integration schemes for the compressible Navier-Stokes equations are compared. The nonlinear system resulting from the implicit time discretization is solved with an iterative method and accelerated by the application of a multigrid technique. Since we use central spatial discretizations and no artificial dissipation is added to the equations, the smoothing method is less effective than in the more traditional use of multigrid in steady-state calculations. Therefore, a special prolongation method is needed in order to obtain an effective multigrid method. This simulation scheme was studied in detail for compressible flow over a flat plate. In the laminar regime and in the first stages of turbulent flow the implicit method provides a speed-up of a factor 2 relative to the explicit method on a relatively coarse grid. At increased resolution this speed-up is enhanced correspondingly.
NASA Astrophysics Data System (ADS)
Melazzi, D.; Curreli, D.; Manente, M.; Carlsson, J.; Pavarin, D.
2012-06-01
We present SPIREs (plaSma Padova Inhomogeneous Radial Electromagnetic solver), a Finite-Difference Frequency-Domain (FDFD) electromagnetic solver in one dimension for the rapid calculation of the electromagnetic fields and the deposited power of a large variety of cylindrical plasma problems. The two Maxwell wave equations have been discretized using a staggered Yee mesh along the radial direction of the cylinder, and Fourier transformed along the other two dimensions and in time. By means of this kind of discretization, we have found that mode-coupling of fast and slow branches can be fully resolved without singularity issues that flawed other well-established methods in the past. Fields are forced by an antenna placed at a given distance from the plasma. The plasma can be inhomogeneous, finite-temperature, collisional, magnetized and multi-species. Finite-temperature Maxwellian effects, comprising Landau and cyclotron damping, have been included by means of the plasma Z dispersion function. Finite Larmor radius effects have been neglected. Radial variations of the plasma parameters are taken into account, thus extending the range of applications to a large variety of inhomogeneous plasma systems. The method proved to be fast and reliable, with accuracy depending on the spatial grid size. Two physical examples are reported: fields in a forced vacuum waveguide with the antenna inside, and forced plasma oscillations in the helicon radiofrequency range.
Multigrid Methods for EHL Problems
NASA Technical Reports Server (NTRS)
Nurgat, Elyas; Berzins, Martin
1996-01-01
In many bearings and contacts, forces are transmitted through thin continuous fluid films which separate two contacting elements. Objects in contact are normally subjected to friction and wear which can be reduced effectively by using lubricants. If the lubricant film is sufficiently thin to prevent the opposing solids from coming into contact and carries the entire load, then we have hydrodynamic lubrication, where the lubricant film is determined by the motion and geometry of the solids. However, for loaded contacts of low geometrical conformity, such as gears, rolling contact bearings and cams, this is not the case due to high pressures and this is referred to as Elasto-Hydrodynamic Lubrication (EHL) In EHL, elastic deformation of the contacting elements and the increase in fluid viscosity with pressure are very significant and cannot be ignored. Since the deformation results in changing the geometry of the lubricating film, which in turn determines the pressure distribution, an EHL mathematical model must simultaneously satisfy the complex elasticity (integral) and the Reynolds lubrication (differential) equations. The nonlinear and coupled nature of the two equations makes numerical calculations computationally intensive. This is especially true for highly loaded problems found in practice. One novel feature of these problems is that the solution may exhibit sharp pressure spikes in the outlet region. To this date both finite element and finite difference methods have been used to solve EHL problems with perhaps greater emphasis on the use of the finite difference approach. In both cases, a major computational difficulty is ensuring convergence of the nonlinear equations solver to a steady state solution. Two successful methods for achieving this are direct iteration and multigrid methods. Direct iteration methods (e.g Gauss Seidel) have long been used in conjunction with finite difference discretizations on regular meshes. Perhaps one of the best examples of the application of such methods is the recent Effective Influence Method of Dowson and Wang. Multigrid methods have also been used with great success by Venner and Venner and Lubrecht with a good summary being given by Venner. As both these finite difference discretization based approaches appear to provide an efficient way of solving EHL problems, it is important to understand their relative merits. This paper is a first attempt at providing such an understanding in the context of EHL point contact problem, (contact of two spheres), in which the contact zone is a point and an ellipse or circle for unloaded and loaded dry contacts respectively. Since the film thickness and the contact width are generally small compared to the local radius of curvature of the two surfaces, the reduced geometry of the surfaces in the contact area can be accurately approximated to the contact between a paraboloid and a flat surface. The layout of the remainder of this paper is as follows. In section 2 we introduce the form of the equations to be solved. The Effective Influence Newton Method is described in Section 3 while Section 4 describes the Multigrid method to be used. Sections 5 and 6 describe the test problems to be used in the comparison between the two methods and compare the performance of the two methods. Section 7 concludes the paper with an argument of the two methods and suggests some future research directions.
Massively Parallel Solution of Poisson Equation on Coarse Grain MIMD Architectures
NASA Technical Reports Server (NTRS)
Fijany, A.; Weinberger, D.; Roosta, R.; Gulati, S.
1998-01-01
In this paper a new algorithm, designated as Fast Invariant Imbedding algorithm, for solution of Poisson equation on vector and massively parallel MIMD architectures is presented. This algorithm achieves the same optimal computational efficiency as other Fast Poisson solvers while offering a much better structure for vector and parallel implementation. Our implementation on the Intel Delta and Paragon shows that a speedup of over two orders of magnitude can be achieved even for moderate size problems.
Development and Application of Agglomerated Multigrid Methods for Complex Geometries
NASA Technical Reports Server (NTRS)
Nishikawa, Hiroaki; Diskin, Boris; Thomas, James L.
2010-01-01
We report progress in the development of agglomerated multigrid techniques for fully un- structured grids in three dimensions, building upon two previous studies focused on efficiently solving a model diffusion equation. We demonstrate a robust fully-coarsened agglomerated multigrid technique for 3D complex geometries, incorporating the following key developments: consistent and stable coarse-grid discretizations, a hierarchical agglomeration scheme, and line-agglomeration/relaxation using prismatic-cell discretizations in the highly-stretched grid regions. A signi cant speed-up in computer time is demonstrated for a model diffusion problem, the Euler equations, and the Reynolds-averaged Navier-Stokes equations for 3D realistic complex geometries.
Multigrid for Staggered Lattice Fermions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brower, Richard C.; Clark, M. A.; Strelchenko, Alexei
Critical slowing down in Krylov methods for the Dirac operator presents a major obstacle to further advances in lattice field theory as it approaches the continuum solution. Here we formulate a multi-grid algorithm for the Kogut-Susskind (or staggered) fermion discretization which has proven difficult relative to Wilson multigrid due to its first-order anti-Hermitian structure. The solution is to introduce a novel spectral transformation by the K\\"ahler-Dirac spin structure prior to the Galerkin projection. We present numerical results for the two-dimensional, two-flavor Schwinger model, however, the general formalism is agnostic to dimension and is directly applicable to four-dimensional lattice QCD.
NASA Technical Reports Server (NTRS)
Thompson, C. P.; Leaf, G. K.; Vanrosendale, J.
1991-01-01
An algorithm is described for the solution of the laminar, incompressible Navier-Stokes equations. The basic algorithm is a multigrid based on a robust, box-based smoothing step. Its most important feature is the incorporation of automatic, dynamic mesh refinement. This algorithm supports generalized simple domains. The program is based on a standard staggered-grid formulation of the Navier-Stokes equations for robustness and efficiency. Special grid transfer operators were introduced at grid interfaces in the multigrid algorithm to ensure discrete mass conservation. Results are presented for three models: the driven-cavity, a backward-facing step, and a sudden expansion/contraction.
Seventh Copper Mountain Conference on Multigrid Methods. Part 2
NASA Technical Reports Server (NTRS)
Melson, N. Duane (Editor); Manteuffel, Tom A. (Editor); McCormick, Steve F. (Editor); Douglas, Craig C. (Editor)
1996-01-01
The Seventh Copper Mountain Conference on Multigrid Methods was held on April 2-7, 1995 at Copper Mountain, Colorado. This book is a collection of many of the papers presented at the conference and so represents the conference proceedings. NASA Langley graciously provided printing of this document so that all of the papers could be presented in a single forum. Each paper was reviewed by a member of the conference organizing committee under the coordination of the editors. The vibrancy and diversity in this field are amply expressed in these important papers, and the collection clearly shows the continuing rapid growth of the use of multigrid acceleration techniques.
Recovery Discontinuous Galerkin Jacobian-Free Newton-Krylov Method for All-Speed Flows
DOE Office of Scientific and Technical Information (OSTI.GOV)
HyeongKae Park; Robert Nourgaliev; Vincent Mousseau
2008-07-01
A novel numerical algorithm (rDG-JFNK) for all-speed fluid flows with heat conduction and viscosity is introduced. The rDG-JFNK combines the Discontinuous Galerkin spatial discretization with the implicit Runge-Kutta time integration under the Jacobian-free Newton-Krylov framework. We solve fully-compressible Navier-Stokes equations without operator-splitting of hyperbolic, diffusion and reaction terms, which enables fully-coupled high-order temporal discretization. The stability constraint is removed due to the L-stable Explicit, Singly Diagonal Implicit Runge-Kutta (ESDIRK) scheme. The governing equations are solved in the conservative form, which allows one to accurately compute shock dynamics, as well as low-speed flows. For spatial discretization, we develop a “recovery” familymore » of DG, exhibiting nearly-spectral accuracy. To precondition the Krylov-based linear solver (GMRES), we developed an “Operator-Split”-(OS) Physics Based Preconditioner (PBP), in which we transform/simplify the fully-coupled system to a sequence of segregated scalar problems, each can be solved efficiently with Multigrid method. Each scalar problem is designed to target/cluster eigenvalues of the Jacobian matrix associated with a specific physics.« less
Numerical study of the flow in a three-dimensional thermally driven cavity
NASA Astrophysics Data System (ADS)
Rauwoens, Pieter; Vierendeels, Jan; Merci, Bart
2008-06-01
Solutions for the fully compressible Navier-Stokes equations are presented for the flow and temperature fields in a cubic cavity with large horizontal temperature differences. The ideal-gas approximation for air is assumed and viscosity is computed using Sutherland's law. The three-dimensional case forms an extension of previous studies performed on a two-dimensional square cavity. The influence of imposed boundary conditions in the third dimension is investigated as a numerical experiment. Comparison is made between convergence rates in case of periodic and free-slip boundary conditions. Results with no-slip boundary conditions are presented as well. The effect of the Rayleigh number is studied. Results are computed using a finite volume method on a structured, collocated grid. An explicit third-order discretization for the convective part and an implicit central discretization for the acoustic part and for the diffusive part are used. To stabilize the scheme an artificial dissipation term for the pressure and the temperature is introduced. The discrete equations are solved using a time-marching method with restrictions on the timestep corresponding to the explicit parts of the solver. Multigrid is used as acceleration technique.
NASA Astrophysics Data System (ADS)
Gorpas, Dimitris; Politopoulos, Kostas; Yova, Dido; Andersson-Engels, Stefan
2008-02-01
One of the most challenging problems in medical imaging is to "see" a tumour embedded into tissue, which is a turbid medium, by using fluorescent probes for tumour labeling. This problem, despite the efforts made during the last years, has not been fully encountered yet, due to the non-linear nature of the inverse problem and the convergence failures of many optimization techniques. This paper describes a robust solution of the inverse problem, based on data fitting and image fine-tuning techniques. As a forward solver the coupled radiative transfer equation and diffusion approximation model is proposed and compromised via a finite element method, enhanced with adaptive multi-grids for faster and more accurate convergence. A database is constructed by application of the forward model on virtual tumours with known geometry, and thus fluorophore distribution, embedded into simulated tissues. The fitting procedure produces the best matching between the real and virtual data, and thus provides the initial estimation of the fluorophore distribution. Using this information, the coupled radiative transfer equation and diffusion approximation model has the required initial values for a computational reasonable and successful convergence during the image fine-tuning application.
NASA Astrophysics Data System (ADS)
D'Ambra, Pasqua; Tartaglione, Gaetano
2015-04-01
Image segmentation addresses the problem to partition a given image into its constituent objects and then to identify the boundaries of the objects. This problem can be formulated in terms of a variational model aimed to find optimal approximations of a bounded function by piecewise-smooth functions, minimizing a given functional. The corresponding Euler-Lagrange equations are a set of two coupled elliptic partial differential equations with varying coefficients. Numerical solution of the above system often relies on alternating minimization techniques involving descent methods coupled with explicit or semi-implicit finite-difference discretization schemes, which are slowly convergent and poorly scalable with respect to image size. In this work we focus on generalized relaxation methods also coupled with multigrid linear solvers, when a finite-difference discretization is applied to the Euler-Lagrange equations of Ambrosio-Tortorelli model. We show that non-linear Gauss-Seidel, accelerated by inner linear iterations, is an effective method for large-scale image analysis as those arising from high-throughput screening platforms for stem cells targeted differentiation, where one of the main goal is segmentation of thousand of images to analyze cell colonies morphology.
Solution of Ambrosio-Tortorelli model for image segmentation by generalized relaxation method
NASA Astrophysics Data System (ADS)
D'Ambra, Pasqua; Tartaglione, Gaetano
2015-03-01
Image segmentation addresses the problem to partition a given image into its constituent objects and then to identify the boundaries of the objects. This problem can be formulated in terms of a variational model aimed to find optimal approximations of a bounded function by piecewise-smooth functions, minimizing a given functional. The corresponding Euler-Lagrange equations are a set of two coupled elliptic partial differential equations with varying coefficients. Numerical solution of the above system often relies on alternating minimization techniques involving descent methods coupled with explicit or semi-implicit finite-difference discretization schemes, which are slowly convergent and poorly scalable with respect to image size. In this work we focus on generalized relaxation methods also coupled with multigrid linear solvers, when a finite-difference discretization is applied to the Euler-Lagrange equations of Ambrosio-Tortorelli model. We show that non-linear Gauss-Seidel, accelerated by inner linear iterations, is an effective method for large-scale image analysis as those arising from high-throughput screening platforms for stem cells targeted differentiation, where one of the main goal is segmentation of thousand of images to analyze cell colonies morphology.
Pseudo-time methods for constrained optimization problems governed by PDE
NASA Technical Reports Server (NTRS)
Taasan, Shlomo
1995-01-01
In this paper we present a novel method for solving optimization problems governed by partial differential equations. Existing methods are gradient information in marching toward the minimum, where the constrained PDE is solved once (sometimes only approximately) per each optimization step. Such methods can be viewed as a marching techniques on the intersection of the state and costate hypersurfaces while improving the residuals of the design equations per each iteration. In contrast, the method presented here march on the design hypersurface and at each iteration improve the residuals of the state and costate equations. The new method is usually much less expensive per iteration step since, in most problems of practical interest, the design equation involves much less unknowns that that of either the state or costate equations. Convergence is shown using energy estimates for the evolution equations governing the iterative process. Numerical tests show that the new method allows the solution of the optimization problem in a cost of solving the analysis problems just a few times, independent of the number of design parameters. The method can be applied using single grid iterations as well as with multigrid solvers.
Eigensystem analysis of classical relaxation techniques with applications to multigrid analysis
NASA Technical Reports Server (NTRS)
Lomax, Harvard; Maksymiuk, Catherine
1987-01-01
Classical relaxation techniques are related to numerical methods for solution of ordinary differential equations. Eigensystems for Point-Jacobi, Gauss-Seidel, and SOR methods are presented. Solution techniques such as eigenvector annihilation, eigensystem mixing, and multigrid methods are examined with regard to the eigenstructure.
Multidimensional radiative transfer with multilevel atoms. II. The non-linear multigrid method.
NASA Astrophysics Data System (ADS)
Fabiani Bendicho, P.; Trujillo Bueno, J.; Auer, L.
1997-08-01
A new iterative method for solving non-LTE multilevel radiative transfer (RT) problems in 1D, 2D or 3D geometries is presented. The scheme obtains the self-consistent solution of the kinetic and RT equations at the cost of only a few (<10) formal solutions of the RT equation. It combines, for the first time, non-linear multigrid iteration (Brandt, 1977, Math. Comp. 31, 333; Hackbush, 1985, Multi-Grid Methods and Applications, springer-Verlag, Berlin), an efficient multilevel RT scheme based on Gauss-Seidel iterations (cf. Trujillo Bueno & Fabiani Bendicho, 1995ApJ...455..646T), and accurate short-characteristics formal solution techniques. By combining a valid stopping criterion with a nested-grid strategy a converged solution with the desired true error is automatically guaranteed. Contrary to the current operator splitting methods the very high convergence speed of the new RT method does not deteriorate when the grid spatial resolution is increased. With this non-linear multigrid method non-LTE problems discretized on N grid points are solved in O(N) operations. The nested multigrid RT method presented here is, thus, particularly attractive in complicated multilevel transfer problems where small grid-sizes are required. The properties of the method are analyzed both analytically and with illustrative multilevel calculations for Ca II in 1D and 2D schematic model atmospheres.
A Fast Solver for Implicit Integration of the Vlasov--Poisson System in the Eulerian Framework
DOE Office of Scientific and Technical Information (OSTI.GOV)
Garrett, C. Kristopher; Hauck, Cory D.
In this paper, we present a domain decomposition algorithm to accelerate the solution of Eulerian-type discretizations of the linear, steady-state Vlasov equation. The steady-state solver then forms a key component in the implementation of fully implicit or nearly fully implicit temporal integrators for the nonlinear Vlasov--Poisson system. The solver relies on a particular decomposition of phase space that enables the use of sweeping techniques commonly used in radiation transport applications. The original linear system for the phase space unknowns is then replaced by a smaller linear system involving only unknowns on the boundary between subdomains, which can then be solvedmore » efficiently with Krylov methods such as GMRES. Steady-state solves are combined to form an implicit Runge--Kutta time integrator, and the Vlasov equation is coupled self-consistently to the Poisson equation via a linearized procedure or a nonlinear fixed-point method for the electric field. Finally, numerical results for standard test problems demonstrate the efficiency of the domain decomposition approach when compared to the direct application of an iterative solver to the original linear system.« less
Xie, Yang; Ying, Jinyong; Xie, Dexuan
2017-03-30
SMPBS (Size Modified Poisson-Boltzmann Solvers) is a web server for computing biomolecular electrostatics using finite element solvers of the size modified Poisson-Boltzmann equation (SMPBE). SMPBE not only reflects ionic size effects but also includes the classic Poisson-Boltzmann equation (PBE) as a special case. Thus, its web server is expected to have a broader range of applications than a PBE web server. SMPBS is designed with a dynamic, mobile-friendly user interface, and features easily accessible help text, asynchronous data submission, and an interactive, hardware-accelerated molecular visualization viewer based on the 3Dmol.js library. In particular, the viewer allows computed electrostatics to be directly mapped onto an irregular triangular mesh of a molecular surface. Due to this functionality and the fast SMPBE finite element solvers, the web server is very efficient in the calculation and visualization of electrostatics. In addition, SMPBE is reconstructed using a new objective electrostatic free energy, clearly showing that the electrostatics and ionic concentrations predicted by SMPBE are optimal in the sense of minimizing the objective electrostatic free energy. SMPBS is available at the URL: smpbs.math.uwm.edu © 2017 Wiley Periodicals, Inc. © 2017 Wiley Periodicals, Inc.
NASA Astrophysics Data System (ADS)
Sides, Scott; Jamroz, Ben; Crockett, Robert; Pletzer, Alexander
2012-02-01
Self-consistent field theory (SCFT) for dense polymer melts has been highly successful in describing complex morphologies in block copolymers. Field-theoretic simulations such as these are able to access large length and time scales that are difficult or impossible for particle-based simulations such as molecular dynamics. The modified diffusion equations that arise as a consequence of the coarse-graining procedure in the SCF theory can be efficiently solved with a pseudo-spectral (PS) method that uses fast-Fourier transforms on uniform Cartesian grids. However, PS methods can be difficult to apply in many block copolymer SCFT simulations (eg. confinement, interface adsorption) in which small spatial regions might require finer resolution than most of the simulation grid. Progress on using new solver algorithms to address these problems will be presented. The Tech-X Chompst project aims at marrying the best of adaptive mesh refinement with linear matrix solver algorithms. The Tech-X code PolySwift++ is an SCFT simulation platform that leverages ongoing development in coupling Chombo, a package for solving PDEs via block-structured AMR calculations and embedded boundaries, with PETSc, a toolkit that includes a large assortment of sparse linear solvers.
A Fast Solver for Implicit Integration of the Vlasov--Poisson System in the Eulerian Framework
Garrett, C. Kristopher; Hauck, Cory D.
2018-04-05
In this paper, we present a domain decomposition algorithm to accelerate the solution of Eulerian-type discretizations of the linear, steady-state Vlasov equation. The steady-state solver then forms a key component in the implementation of fully implicit or nearly fully implicit temporal integrators for the nonlinear Vlasov--Poisson system. The solver relies on a particular decomposition of phase space that enables the use of sweeping techniques commonly used in radiation transport applications. The original linear system for the phase space unknowns is then replaced by a smaller linear system involving only unknowns on the boundary between subdomains, which can then be solvedmore » efficiently with Krylov methods such as GMRES. Steady-state solves are combined to form an implicit Runge--Kutta time integrator, and the Vlasov equation is coupled self-consistently to the Poisson equation via a linearized procedure or a nonlinear fixed-point method for the electric field. Finally, numerical results for standard test problems demonstrate the efficiency of the domain decomposition approach when compared to the direct application of an iterative solver to the original linear system.« less
Multigrid for hypersonic viscous two- and three-dimensional flows
NASA Technical Reports Server (NTRS)
Turkel, E.; Swanson, R. C.; Vatsa, V. N.; White, J. A.
1991-01-01
The use of a multigrid method with central differencing to solve the Navier-Stokes equations for hypersonic flows is considered. The time dependent form of the equations is integrated with an explicit Runge-Kutta scheme accelerated by local time stepping and implicit residual smoothing. Variable coefficients are developed for the implicit process that removes the diffusion limit on the time step, producing significant improvement in convergence. A numerical dissipation formulation that provides good shock capturing capability for hypersonic flows is presented. This formulation is shown to be a crucial aspect of the multigrid method. Solutions are given for two-dimensional viscous flow over a NACA 0012 airfoil and three-dimensional flow over a blunt biconic.
Seventh Copper Mountain Conference on Multigrid Methods. Part 1
NASA Technical Reports Server (NTRS)
Melson, N. Duane; Manteuffel, Tom A.; McCormick, Steve F.; Douglas, Craig C.
1996-01-01
The Seventh Copper Mountain Conference on Multigrid Methods was held on 2-7 Apr. 1995 at Copper Mountain, Colorado. This book is a collection of many of the papers presented at the conference and so represents the conference proceedings. NASA Langley graciously provided printing of this document so that all of the papers could be presented in a single forum. Each paper was reviewed by a member of the conference organizing committee under the coordination of the editors. The multigrid discipline continues to expand and mature, as is evident from these proceedings. The vibrancy in this field is amply expressed in these important papers, and the collection shows its rapid trend to further diversity and depth.
The Sixth Copper Mountain Conference on Multigrid Methods, part 2
NASA Technical Reports Server (NTRS)
Melson, N. Duane (Editor); Mccormick, Steve F. (Editor); Manteuffel, Thomas A. (Editor)
1993-01-01
The Sixth Copper Mountain Conference on Multigrid Methods was held on April 4-9, 1993, at Copper Mountain, Colorado. This book is a collection of many of the papers presented at the conference and so represents the conference proceedings. NASA Langley graciously provided printing of this document so that all of the papers could be presented in a single forum. Each paper was reviewed by a member of the conference organizing committee under the coordination of the editors. The multigrid discipline continues to expand and mature, as is evident from these proceedings. The vibrancy in this field is amply expressed in these important papers, and the collection clearly shows its rapid trend to further diversity and depth.
Mapping implicit spectral methods to distributed memory architectures
NASA Technical Reports Server (NTRS)
Overman, Andrea L.; Vanrosendale, John
1991-01-01
Spectral methods were proven invaluable in numerical simulation of PDEs (Partial Differential Equations), but the frequent global communication required raises a fundamental barrier to their use on highly parallel architectures. To explore this issue, a 3-D implicit spectral method was implemented on an Intel hypercube. Utilization of about 50 percent was achieved on a 32 node iPSC/860 hypercube, for a 64 x 64 x 64 Fourier-spectral grid; finer grids yield higher utilizations. Chebyshev-spectral grids are more problematic, since plane-relaxation based multigrid is required. However, by using a semicoarsening multigrid algorithm, and by relaxing all multigrid levels concurrently, relatively high utilizations were also achieved in this harder case.
Multigrid techniques for the solution of the passive scalar advection-diffusion equation
NASA Technical Reports Server (NTRS)
Phillips, R. E.; Schmidt, F. W.
1985-01-01
The solution of elliptic passive scalar advection-diffusion equations is required in the analysis of many turbulent flow and convective heat transfer problems. The accuracy of the solution may be affected by the presence of regions containing large gradients of the dependent variables. The multigrid concept of local grid refinement is a method for improving the accuracy of the calculations in these problems. In combination with the multilevel acceleration techniques, an accurate and efficient computational procedure is developed. In addition, a robust implementation of the QUICK finite-difference scheme is described. Calculations of a test problem are presented to quantitatively demonstrate the advantages of the multilevel-multigrid method.
Spectral element multigrid. Part 2: Theoretical justification
NASA Technical Reports Server (NTRS)
Maday, Yvon; Munoz, Rafael
1988-01-01
A multigrid algorithm is analyzed which is used for solving iteratively the algebraic system resulting from tha approximation of a second order problem by spectral or spectral element methods. The analysis, performed here in the one dimensional case, justifies the good smoothing properties of the Jacobi preconditioner that was presented in Part 1 of this paper.
Investigation of fast ion pressure effects in ASDEX Upgrade by spectral MSE measurements
NASA Astrophysics Data System (ADS)
Reimer, René; Dinklage, Andreas; Wolf, Robert; Dunne, Mike; Geiger, Benedikt; Hobirk, Jörg; Reich, Matthias; ASDEX Upgrade Team; McCarthy, Patrick J.
2017-04-01
High precision measurements of fast ion effects on the magnetic equilibrium in the ASDEX Upgrade tokamak have been conducted in a high-power (10 MW) neutral-beam injection discharge. An improved analysis of the spectral motional Stark effect data based on forward-modeling, including the Zeeman effect, fine-structure and non-statistical sub-level distribution, revealed changes in the order of 1% in |B| . The results were found to be consistent with results from the equilibrium solver CLISTE. The measurements allowed us to derive the fast ion pressure fraction to be Δ {{p}\\text{FI}}/{{p}\\text{mhd}}≈ 10 % and variations of the fast ion pressure are consistent with calculations of the transport code TRANSP. The results advance the understanding of fast ion confinement and magneto-hydrodynamic stability in the presence of fast ions.
A Multigrid NLS-4DVar Data Assimilation Scheme with Advanced Research WRF (ARW)
NASA Astrophysics Data System (ADS)
Zhang, H.; Tian, X.
2017-12-01
The motions of the atmosphere have multiscale properties in space and/or time, and the background error covariance matrix (Β) should thus contain error information at different correlation scales. To obtain an optimal analysis, the multigrid three-dimensional variational data assimilation scheme is used widely when sequentially correcting errors from large to small scales. However, introduction of the multigrid technique into four-dimensional variational data assimilation is not easy, due to its strong dependence on the adjoint model, which has extremely high computational costs in data coding, maintenance, and updating. In this study, the multigrid technique was introduced into the nonlinear least-squares four-dimensional variational assimilation (NLS-4DVar) method, which is an advanced four-dimensional ensemble-variational method that can be applied without invoking the adjoint models. The multigrid NLS-4DVar (MG-NLS-4DVar) scheme uses the number of grid points to control the scale, with doubling of this number when moving from a coarse to a finer grid. Furthermore, the MG-NLS-4DVar scheme not only retains the advantages of NLS-4DVar, but also sufficiently corrects multiscale errors to achieve a highly accurate analysis. The effectiveness and efficiency of the proposed MG-NLS-4DVar scheme were evaluated by several groups of observing system simulation experiments using the Advanced Research Weather Research and Forecasting Model. MG-NLS-4DVar outperformed NLS-4DVar, with a lower computational cost.
NASA Astrophysics Data System (ADS)
Ge, Yongbin; Cao, Fujun
2011-05-01
In this paper, a multigrid method based on the high order compact (HOC) difference scheme on nonuniform grids, which has been proposed by Kalita et al. [J.C. Kalita, A.K. Dass, D.C. Dalal, A transformation-free HOC scheme for steady convection-diffusion on non-uniform grids, Int. J. Numer. Methods Fluids 44 (2004) 33-53], is proposed to solve the two-dimensional (2D) convection diffusion equation. The HOC scheme is not involved in any grid transformation to map the nonuniform grids to uniform grids, consequently, the multigrid method is brand-new for solving the discrete system arising from the difference equation on nonuniform grids. The corresponding multigrid projection and interpolation operators are constructed by the area ratio. Some boundary layer and local singularity problems are used to demonstrate the superiority of the present method. Numerical results show that the multigrid method with the HOC scheme on nonuniform grids almost gets as equally efficient convergence rate as on uniform grids and the computed solution on nonuniform grids retains fourth order accuracy while on uniform grids just gets very poor solution for very steep boundary layer or high local singularity problems. The present method is also applied to solve the 2D incompressible Navier-Stokes equations using the stream function-vorticity formulation and the numerical solutions of the lid-driven cavity flow problem are obtained and compared with solutions available in the literature.
Time-marching multi-grid seismic tomography
NASA Astrophysics Data System (ADS)
Tong, P.; Yang, D.; Liu, Q.
2016-12-01
From the classic ray-based traveltime tomography to the state-of-the-art full waveform inversion, because of the nonlinearity of seismic inverse problems, a good starting model is essential for preventing the convergence of the objective function toward local minima. With a focus on building high-accuracy starting models, we propose the so-called time-marching multi-grid seismic tomography method in this study. The new seismic tomography scheme consists of a temporal time-marching approach and a spatial multi-grid strategy. We first divide the recording period of seismic data into a series of time windows. Sequentially, the subsurface properties in each time window are iteratively updated starting from the final model of the previous time window. There are at least two advantages of the time-marching approach: (1) the information included in the seismic data of previous time windows has been explored to build the starting models of later time windows; (2) seismic data of later time windows could provide extra information to refine the subsurface images. Within each time window, we use a multi-grid method to decompose the scale of the inverse problem. Specifically, the unknowns of the inverse problem are sampled on a coarse mesh to capture the macro-scale structure of the subsurface at the beginning. Because of the low dimensionality, it is much easier to reach the global minimum on a coarse mesh. After that, finer meshes are introduced to recover the micro-scale properties. That is to say, the subsurface model is iteratively updated on multi-grid in every time window. We expect that high-accuracy starting models should be generated for the second and later time windows. We will test this time-marching multi-grid method by using our newly developed eikonal-based traveltime tomography software package tomoQuake. Real application results in the 2016 Kumamoto earthquake (Mw 7.0) region in Japan will be demonstrated.
NASA Technical Reports Server (NTRS)
Nguyen, D. T.; Al-Nasra, M.; Zhang, Y.; Baddourah, M. A.; Agarwal, T. K.; Storaasli, O. O.; Carmona, E. A.
1991-01-01
Several parallel-vector computational improvements to the unconstrained optimization procedure are described which speed up the structural analysis-synthesis process. A fast parallel-vector Choleski-based equation solver, pvsolve, is incorporated into the well-known SAP-4 general-purpose finite-element code. The new code, denoted PV-SAP, is tested for static structural analysis. Initial results on a four processor CRAY 2 show that using pvsolve reduces the equation solution time by a factor of 14-16 over the original SAP-4 code. In addition, parallel-vector procedures for the Golden Block Search technique and the BFGS method are developed and tested for nonlinear unconstrained optimization. A parallel version of an iterative solver and the pvsolve direct solver are incorporated into the BFGS method. Preliminary results on nonlinear unconstrained optimization test problems, using pvsolve in the analysis, show excellent parallel-vector performance indicating that these parallel-vector algorithms can be used in a new generation of finite-element based structural design/analysis-synthesis codes.
Fast Numerical Methods for the Design of Layered Photonic Structures with Rough Interfaces
NASA Technical Reports Server (NTRS)
Komarevskiy, Nikolay; Braginsky, Leonid; Shklover, Valery; Hafner, Christian; Lawson, John
2011-01-01
Modified boundary conditions (MBC) and a multilayer approach (MA) are proposed as fast and efficient numerical methods for the design of 1D photonic structures with rough interfaces. These methods are applicable for the structures, composed of materials with arbitrary permittivity tensor. MBC and MA are numerically validated on different types of interface roughness and permittivities of the constituent materials. The proposed methods can be combined with the 4x4 scattering matrix method as a field solver and an evolutionary strategy as an optimizer. The resulted optimization procedure is fast, accurate, numerically stable and can be used to design structures for various applications.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ruge, J W; Dean, D
2000-11-20
The goal of this subcontract was to modify the FOSPACK code, developed by John Ruge, to call the BoomerAMG solver developed at LLNL through the HYPRE interface. FOSPACK is a package developed for the automatic discretization and solution of First-Order System Least-Squares (FOSLS) formulations of 2D partial differential equations (c.f [3-9]). FOSPACK takes a user-specified mesh (which can be an unstructured combination of triangular and quadrilateral elements) and specification of the first-order system, and produces the discretizations needed for solution. Generally, all specifications are contained in data files, so no re-compilation is necessary when changing domains, mesh sizes, problems, etc.more » Much of the work in FOSPACK has gone into an interpreter that allows for simple, intuitive specification of the equations. The interpreter reads the equations, processes them, and stores them as instruction lists needed to apply the operators involved to finite element basis functions, allowing assembly of the discrete system. Quite complex equations may be specified, including variable coefficients, user defined functions, and vector notation. The first-order systems may be nonlinear, with linearizations either performed automatically, or specified in a convenient way by the user. The program also includes global/local refinement capability. FOSLS formulations are very well suited for solution by algebraic multigrid (AMG) (c.f. [10-13]). The original version uses a version of algebraic multigrid written by John Ruge in FORTRAN 77, and modified somewhat for use with FOSPACK. BoomerAMG, a version of AMG developed at CASC, has a number of advantages over the FORTRAN version, including dynamic memory allocation and parallel capability. This project was to benefit both FRSC and CASC, giving FOSPACK the advantages of BoomerAMG, while giving CASC a tool for testing FOSLS as a discretization method for problems of interest there. The major parts of this work were implementation and testing of the HYPRE package on our computers, writing a C wrapper/driver for the FOSPACK code, and modifying the wrapper to call BoomerAMG through the HYPRE interface.« less
NASA Astrophysics Data System (ADS)
Saxena, Nishank; Hofmann, Ronny; Alpak, Faruk O.; Berg, Steffen; Dietderich, Jesse; Agarwal, Umang; Tandon, Kunj; Hunter, Sander; Freeman, Justin; Wilson, Ove Bjorn
2017-11-01
We generate a novel reference dataset to quantify the impact of numerical solvers, boundary conditions, and simulation platforms. We consider a variety of microstructures ranging from idealized pipes to digital rocks. Pore throats of the digital rocks considered are large enough to be well resolved with state-of-the-art micro-computerized tomography technology. Permeability is computed using multiple numerical engines, 12 in total, including, Lattice-Boltzmann, computational fluid dynamics, voxel based, fast semi-analytical, and known empirical models. Thus, we provide a measure of uncertainty associated with flow computations of digital media. Moreover, the reference and standards dataset generated is the first of its kind and can be used to test and improve new fluid flow algorithms. We find that there is an overall good agreement between solvers for idealized cross-section shape pipes. As expected, the disagreement increases with increase in complexity of the pore space. Numerical solutions for pipes with sinusoidal variation of cross section show larger variability compared to pipes of constant cross-section shapes. We notice relatively larger variability in computed permeability of digital rocks with coefficient of variation (of up to 25%) in computed values between various solvers. Still, these differences are small given other subsurface uncertainties. The observed differences between solvers can be attributed to several causes including, differences in boundary conditions, numerical convergence criteria, and parameterization of fundamental physics equations. Solvers that perform additional meshing of irregular pore shapes require an additional step in practical workflows which involves skill and can introduce further uncertainty. Computation times for digital rocks vary from minutes to several days depending on the algorithm and available computational resources. We find that more stringent convergence criteria can improve solver accuracy but at the expense of longer computation time.
A Tensor-Train accelerated solver for integral equations in complex geometries
NASA Astrophysics Data System (ADS)
Corona, Eduardo; Rahimian, Abtin; Zorin, Denis
2017-04-01
We present a framework using the Quantized Tensor Train (QTT) decomposition to accurately and efficiently solve volume and boundary integral equations in three dimensions. We describe how the QTT decomposition can be used as a hierarchical compression and inversion scheme for matrices arising from the discretization of integral equations. For a broad range of problems, computational and storage costs of the inversion scheme are extremely modest O (log N) and once the inverse is computed, it can be applied in O (Nlog N) . We analyze the QTT ranks for hierarchically low rank matrices and discuss its relationship to commonly used hierarchical compression techniques such as FMM and HSS. We prove that the QTT ranks are bounded for translation-invariant systems and argue that this behavior extends to non-translation invariant volume and boundary integrals. For volume integrals, the QTT decomposition provides an efficient direct solver requiring significantly less memory compared to other fast direct solvers. We present results demonstrating the remarkable performance of the QTT-based solver when applied to both translation and non-translation invariant volume integrals in 3D. For boundary integral equations, we demonstrate that using a QTT decomposition to construct preconditioners for a Krylov subspace method leads to an efficient and robust solver with a small memory footprint. We test the QTT preconditioners in the iterative solution of an exterior elliptic boundary value problem (Laplace) formulated as a boundary integral equation in complex, multiply connected geometries.
Spectral multigrid methods for elliptic equations 2
NASA Technical Reports Server (NTRS)
Zang, T. A.; Wong, Y. S.; Hussaini, M. Y.
1983-01-01
A detailed description of spectral multigrid methods is provided. This includes the interpolation and coarse-grid operators for both periodic and Dirichlet problems. The spectral methods for periodic problems use Fourier series and those for Dirichlet problems are based upon Chebyshev polynomials. An improved preconditioning for Dirichlet problems is given. Numerical examples and practical advice are included.
Multigrid and Krylov Subspace Methods for the Discrete Stokes Equations
NASA Technical Reports Server (NTRS)
Elman, Howard C.
1996-01-01
Discretization of the Stokes equations produces a symmetric indefinite system of linear equations. For stable discretizations, a variety of numerical methods have been proposed that have rates of convergence independent of the mesh size used in the discretization. In this paper, we compare the performance of four such methods: variants of the Uzawa, preconditioned conjugate gradient, preconditioned conjugate residual, and multigrid methods, for solving several two-dimensional model problems. The results indicate that where it is applicable, multigrid with smoothing based on incomplete factorization is more efficient than the other methods, but typically by no more than a factor of two. The conjugate residual method has the advantage of being both independent of iteration parameters and widely applicable.
Multigrid Solution of the Navier-Stokes Equations at Low Speeds with Large Temperature Variations
NASA Technical Reports Server (NTRS)
Sockol, Peter M.
2002-01-01
Multigrid methods for the Navier-Stokes equations at low speeds and large temperature variations are investigated. The compressible equations with time-derivative preconditioning and preconditioned flux-difference splitting of the inviscid terms are used. Three implicit smoothers have been incorporated into a common multigrid procedure. Both full coarsening and semi-coarsening with directional fine-grid defect correction have been studied. The resulting methods have been tested on four 2D laminar problems over a range of Reynolds numbers on both uniform and highly stretched grids. Two of the three methods show efficient and robust performance over the entire range of conditions. In addition none of the methods have any difficulty with the large temperature variations.
The Sixth Copper Mountain Conference on Multigrid Methods, part 1
NASA Technical Reports Server (NTRS)
Melson, N. Duane (Editor); Manteuffel, T. A. (Editor); Mccormick, S. F. (Editor)
1993-01-01
The Sixth Copper Mountain Conference on Multigrid Methods was held on 4-9 Apr. 1993, at Copper Mountain, CO. This book is a collection of many of the papers presented at the conference and as such represents the conference proceedings. NASA LaRC graciously provided printing of this document so that all of the papers could be presented in a single forum. Each paper was reviewed by a member of the conference organizing committee under the coordination of the editors. The multigrid discipline continues to expand and mature, as is evident from these proceedings. The vibrancy in this field is amply expressed in these important papers, and the collection clearly shows its rapid trend to further diversity and depth.
Updated users' guide for TAWFIVE with multigrid
NASA Technical Reports Server (NTRS)
Melson, N. Duane; Streett, Craig L.
1989-01-01
A program for the Transonic Analysis of a Wing and Fuselage with Interacted Viscous Effects (TAWFIVE) was improved by the incorporation of multigrid and a method to specify lift coefficient rather than angle-of-attack. A finite volume full potential multigrid method is used to model the outer inviscid flow field. First order viscous effects are modeled by a 3-D integral boundary layer method. Both turbulent and laminar boundary layers are treated. Wake thickness effects are modeled using a 2-D strip method. A brief discussion of the engineering aspects of the program is given. The input, output, and use of the program are covered in detail. Sample results are given showing the effects of boundary layer corrections and the capability of the lift specification method.
LDRD final report on massively-parallel linear programming : the parPCx system.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Parekh, Ojas; Phillips, Cynthia Ann; Boman, Erik Gunnar
2005-02-01
This report summarizes the research and development performed from October 2002 to September 2004 at Sandia National Laboratories under the Laboratory-Directed Research and Development (LDRD) project ''Massively-Parallel Linear Programming''. We developed a linear programming (LP) solver designed to use a large number of processors. LP is the optimization of a linear objective function subject to linear constraints. Companies and universities have expended huge efforts over decades to produce fast, stable serial LP solvers. Previous parallel codes run on shared-memory systems and have little or no distribution of the constraint matrix. We have seen no reports of general LP solver runsmore » on large numbers of processors. Our parallel LP code is based on an efficient serial implementation of Mehrotra's interior-point predictor-corrector algorithm (PCx). The computational core of this algorithm is the assembly and solution of a sparse linear system. We have substantially rewritten the PCx code and based it on Trilinos, the parallel linear algebra library developed at Sandia. Our interior-point method can use either direct or iterative solvers for the linear system. To achieve a good parallel data distribution of the constraint matrix, we use a (pre-release) version of a hypergraph partitioner from the Zoltan partitioning library. We describe the design and implementation of our new LP solver called parPCx and give preliminary computational results. We summarize a number of issues related to efficient parallel solution of LPs with interior-point methods including data distribution, numerical stability, and solving the core linear system using both direct and iterative methods. We describe a number of applications of LP specific to US Department of Energy mission areas and we summarize our efforts to integrate parPCx (and parallel LP solvers in general) into Sandia's massively-parallel integer programming solver PICO (Parallel Interger and Combinatorial Optimizer). We conclude with directions for long-term future algorithmic research and for near-term development that could improve the performance of parPCx.« less
Fast and Efficient Discrimination of Traveling Salesperson Problem Stimulus Difficulty
ERIC Educational Resources Information Center
Dry, Matthew J.; Fontaine, Elizabeth L.
2014-01-01
The Traveling Salesperson Problem (TSP) is a computationally difficult combinatorial optimization problem. In spite of its relative difficulty, human solvers are able to generate close-to-optimal solutions in a close-to-linear time frame, and it has been suggested that this is due to the visual system's inherent sensitivity to certain geometric…
Fast Time Domain Integral Equation Solvers for Large-Scale Electromagnetic Analysis
2004-10-01
this topic coauthored by Mingyu Lu and Eric Michielssen received the Best Student Paper award at the 2001 IEEE Antennas and Propagation International...Yu Zhong, current Ph.D. student at UIUC. 18. Yujia Li, current Ph.D. student at UIUC. 19. Mingyu Lu, current Postdoctoral Fellow at UIUC. 20
Kantardjiev, Alexander A
2015-04-05
A cluster of strongly interacting ionization groups in protein molecules with irregular ionization behavior is suggestive for specific structure-function relationship. However, their computational treatment is unconventional (e.g., lack of convergence in naive self-consistent iterative algorithm). The stringent evaluation requires evaluation of Boltzmann averaged statistical mechanics sums and electrostatic energy estimation for each microstate. irGPU: Irregular strong interactions in proteins--a GPU solver is novel solution to a versatile problem in protein biophysics--atypical protonation behavior of coupled groups. The computational severity of the problem is alleviated by parallelization (via GPU kernels) which is applied for the electrostatic interaction evaluation (including explicit electrostatics via the fast multipole method) as well as statistical mechanics sums (partition function) estimation. Special attention is given to the ease of the service and encapsulation of theoretical details without sacrificing rigor of computational procedures. irGPU is not just a solution-in-principle but a promising practical application with potential to entice community into deeper understanding of principles governing biomolecule mechanisms. © 2015 Wiley Periodicals, Inc.
NASA Astrophysics Data System (ADS)
Agata, Ryoichiro; Ichimura, Tsuyoshi; Hori, Takane; Hirahara, Kazuro; Hashimoto, Chihiro; Hori, Muneo
2018-04-01
The simultaneous estimation of the asthenosphere's viscosity and coseismic slip/afterslip is expected to improve largely the consistency of the estimation results to observation data of crustal deformation collected in widely spread observation points, compared to estimations of slips only. Such an estimate can be formulated as a non-linear inverse problem of material properties of viscosity and input force that is equivalent to fault slips based on large-scale finite-element (FE) modeling of crustal deformation, in which the degree of freedom is in the order of 109. We formulated and developed a computationally efficient adjoint-based estimation method for this inverse problem, together with a fast and scalable FE solver for the associated forward and adjoint problems. In a numerical experiment that imitates the 2011 Tohoku-Oki earthquake, the advantage of the proposed method is confirmed by comparing the estimated results with those obtained using simplified estimation methods. The computational cost required for the optimization shows that the proposed method enabled the targeted estimation to be completed with moderate amount of computational resources.
Inverse problems in eddy current testing using neural network
NASA Astrophysics Data System (ADS)
Yusa, N.; Cheng, W.; Miya, K.
2000-05-01
Reconstruction of crack in conductive material is one of the most important issues in the field of eddy current testing. Although many attempts to reconstruct cracks have been made, most of them deal with only artificial cracks machined with electro-discharge. However, in the case of natural cracks like stress corrosion cracking or inter-granular attack, there must be contact region and therefore their conductivity is not necessarily zero. In this study, an attempt to reconstruct natural cracks using neural network is presented. The neural network was trained through numerical simulated data obtained by the fast forward solver that calculated unflawed potential data a priori to save computational time. The solver is based on A-φ method discretized by using FEM-BEM A natural crack was modeled as an area whose conductivity was less than that of a specimen. The distribution of conductivity in that area was reconstructed as well. It took much time to train the network, but the speed of reconstruction was extremely fast after once it was trained. Well-trained network gave good reconstruction result.
A multigrid LU-SSOR scheme for approximate Newton iteration applied to the Euler equations
NASA Technical Reports Server (NTRS)
Yoon, Seokkwan; Jameson, Antony
1986-01-01
A new efficient relaxation scheme in conjunction with a multigrid method is developed for the Euler equations. The LU SSOR scheme is based on a central difference scheme and does not need flux splitting for Newton iteration. Application to transonic flow shows that the new method surpasses the performance of the LU implicit scheme.
NASA Technical Reports Server (NTRS)
Nicolaides, R. A.
1979-01-01
A description and explanation of a simple multigrid algorithm for solving finite element systems is given. Numerical results for an implementation are reported for a number of elliptic equations, including cases with singular coefficients and indefinite equations. The method shows the high efficiency, essentially independent of the grid spacing, predicted by the theory.
Semiannual Report October 1, 1999 through March 31, 2000
2000-04-01
Mark Carpenter (NASA Langley). Textbook Multigrid Efficiency for the Navier-Stokes Equations Boris Diskin A typical modern Reynolds-Averaged...defined as textbook multigrid efficiency (TME), meaning the solutions to the governing system of equations are attained in a computational work...basic elements of the barriers to be overcome in extending textbook efficiencies to the compressible RANS equations, namely entering flows, far wake
On Bi-Grid Local Mode Analysis of Solution Techniques for 3-D Euler and Navier-Stokes Equations
NASA Technical Reports Server (NTRS)
Ibraheem, S. O.; Demuren, A. O.
1994-01-01
A procedure is presented for utilizing a bi-grid stability analysis as a practical tool for predicting multigrid performance in a range of numerical methods for solving Euler and Navier-Stokes equations. Model problems based on the convection, diffusion and Burger's equation are used to illustrate the superiority of the bi-grid analysis as a predictive tool for multigrid performance in comparison to the smoothing factor derived from conventional von Neumann analysis. For the Euler equations, bi-grid analysis is presented for three upwind difference based factorizations, namely Spatial, Eigenvalue and Combination splits, and two central difference based factorizations, namely LU and ADI methods. In the former, both the Steger-Warming and van Leer flux-vector splitting methods are considered. For the Navier-Stokes equations, only the Beam-Warming (ADI) central difference scheme is considered. In each case, estimates of multigrid convergence rates from the bi-grid analysis are compared to smoothing factors obtained from single-grid stability analysis. Effects of grid aspect ratio and flow skewness are examined. Both predictions are compared with practical multigrid convergence rates for 2-D Euler and Navier-Stokes solutions based on the Beam-Warming central scheme.
Inverse Problems, Control and Modeling in the Presence of Uncertainty
2007-10-30
using a Kelvin model, CRSC- TR07-08, March, 2007; IEEE Transactions on Biomedical Engineering, submitted. [P18] K. Ito, Q. Huynh and J . Toivanen, A fast...Science and Engineering, Springer (2006), 595 602 . [P19] K.Ito and J . Toivanen, A fast iterative solver for scattering by elastic objects in layered...and N.G. Medhin, " A stick-slip/Rouse hybrid model", CRSC-TR05-28, August, 2005. [P23] H.T. Banks, A . F. Karr, H. K. Nguyen, and J . R. Samuels, Jr
Fully coupled simulation of cosmic reionization. I. numerical methods and tests
Norman, Michael L.; Reynolds, Daniel R.; So, Geoffrey C.; ...
2015-01-09
Here, we describe an extension of the Enzo code to enable fully coupled radiation hydrodynamical simulation of inhomogeneous reionization in large similar to(100 Mpc)(3) cosmological volumes with thousands to millions of point sources. We solve all dynamical, radiative transfer, thermal, and ionization processes self-consistently on the same mesh, as opposed to a postprocessing approach which coarse-grains the radiative transfer. But, we employ a simple subgrid model for star formation which we calibrate to observations. The numerical method presented is a modification of an earlier method presented in Reynolds et al. differing principally in the operator splitting algorithm we use tomore » advance the system of equations. Radiation transport is done in the gray flux-limited diffusion (FLD) approximation, which is solved by implicit time integration split off from the gas energy and ionization equations, which are solved separately. This results in a faster and more robust scheme for cosmological applications compared to the earlier method. The FLD equation is solved using the hypre optimally scalable geometric multigrid solver from LLNL. By treating the ionizing radiation as a grid field as opposed to rays, our method is scalable with respect to the number of ionizing sources, limited only by the parallel scaling properties of the radiation solver. We test the speed and accuracy of our approach on a number of standard verification and validation tests. We show by direct comparison with Enzo's adaptive ray tracing method Moray that the well-known inability of FLD to cast a shadow behind opaque clouds has a minor effect on the evolution of ionized volume and mass fractions in a reionization simulation validation test. Finally, we illustrate an application of our method to the problem of inhomogeneous reionization in a 80 Mpc comoving box resolved with 3200(3) Eulerian grid cells and dark matter particles.« less
NASA Astrophysics Data System (ADS)
Sanan, P.; Schnepp, S. M.; May, D.; Schenk, O.
2014-12-01
Geophysical applications require efficient forward models for non-linear Stokes flow on high resolution spatio-temporal domains. The bottleneck in applying the forward model is solving the linearized, discretized Stokes problem which takes the form of a large, indefinite (saddle point) linear system. Due to the heterogeniety of the effective viscosity in the elliptic operator, devising effective preconditioners for saddle point problems has proven challenging and highly problem-dependent. Nevertheless, at least three approaches show promise for preconditioning these difficult systems in an algorithmically scalable way using multigrid and/or domain decomposition techniques. The first is to work with a hierarchy of coarser or smaller saddle point problems. The second is to use the Schur complement method to decouple and sequentially solve for the pressure and velocity. The third is to use the Schur decomposition to devise preconditioners for the full operator. These involve sub-solves resembling inexact versions of the sequential solve. The choice of approach and sub-methods depends crucially on the motivating physics, the discretization, and available computational resources. Here we examine the performance trade-offs for preconditioning strategies applied to idealized models of mantle convection and lithospheric dynamics, characterized by large viscosity gradients. Due to the arbitrary topological structure of the viscosity field in geodynamical simulations, we utilize low order, inf-sup stable mixed finite element spatial discretizations which are suitable when sharp viscosity variations occur in element interiors. Particular attention is paid to possibilities within the decoupled and approximate Schur complement factorization-based monolithic approaches to leverage recently-developed flexible, communication-avoiding, and communication-hiding Krylov subspace methods in combination with `heavy' smoothers, which require solutions of large per-node sub-problems, well-suited to solution on hybrid computational clusters. To manage the combinatorial explosion of solver options (which include hybridizations of all the approaches mentioned above), we leverage the modularity of the PETSc library.
NASA Astrophysics Data System (ADS)
Allen, Jeffery M.
This research involves a few First-Order System Least Squares (FOSLS) formulations of a nonlinear-Stokes flow model for ice sheets. In Glen's flow law, a commonly used constitutive equation for ice rheology, the viscosity becomes infinite as the velocity gradients approach zero. This typically occurs near the ice surface or where there is basal sliding. The computational difficulties associated with the infinite viscosity are often overcome by an arbitrary modification of Glen's law that bounds the maximum viscosity. The FOSLS formulations developed in this thesis are designed to overcome this difficulty. The first FOSLS formulation is just the first-order representation of the standard nonlinear, full-Stokes and is known as the viscosity formulation and suffers from the problem above. To overcome the problem of infinite viscosity, two new formulation exploit the fact that the deviatoric stress, the product of viscosity and strain-rate, approaches zero as the viscosity goes to infinity. Using the deviatoric stress as the basis for a first-order system results in the the basic fluidity system. Augmenting the basic fluidity system with a curl-type equation results in the augmented fluidity system, which is more amenable to the iterative solver, Algebraic MultiGrid (AMG). A Nested Iteration (NI) Newton-FOSLS-AMG approach is used to solve the nonlinear-Stokes problems. Several test problems from the ISMIP set of benchmarks is examined to test the effectiveness of the various formulations. These test show that the viscosity based method is more expensive and less accurate. The basic fluidity system shows optimal finite-element convergence. However, there is not yet an efficient iterative solver for this type of system and this is the topic of future research. Alternatively, AMG performs better on the augmented fluidity system when using specific scaling. Unfortunately, this scaling results in reduced finite-element convergence.
NASA Astrophysics Data System (ADS)
Ge, Liang; Sotiropoulos, Fotis
2007-08-01
A novel numerical method is developed that integrates boundary-conforming grids with a sharp interface, immersed boundary methodology. The method is intended for simulating internal flows containing complex, moving immersed boundaries such as those encountered in several cardiovascular applications. The background domain (e.g. the empty aorta) is discretized efficiently with a curvilinear boundary-fitted mesh while the complex moving immersed boundary (say a prosthetic heart valve) is treated with the sharp-interface, hybrid Cartesian/immersed-boundary approach of Gilmanov and Sotiropoulos [A. Gilmanov, F. Sotiropoulos, A hybrid cartesian/immersed boundary method for simulating flows with 3d, geometrically complex, moving bodies, Journal of Computational Physics 207 (2005) 457-492.]. To facilitate the implementation of this novel modeling paradigm in complex flow simulations, an accurate and efficient numerical method is developed for solving the unsteady, incompressible Navier-Stokes equations in generalized curvilinear coordinates. The method employs a novel, fully-curvilinear staggered grid discretization approach, which does not require either the explicit evaluation of the Christoffel symbols or the discretization of all three momentum equations at cell interfaces as done in previous formulations. The equations are integrated in time using an efficient, second-order accurate fractional step methodology coupled with a Jacobian-free, Newton-Krylov solver for the momentum equations and a GMRES solver enhanced with multigrid as preconditioner for the Poisson equation. Several numerical experiments are carried out on fine computational meshes to demonstrate the accuracy and efficiency of the proposed method for standard benchmark problems as well as for unsteady, pulsatile flow through a curved, pipe bend. To demonstrate the ability of the method to simulate flows with complex, moving immersed boundaries we apply it to calculate pulsatile, physiological flow through a mechanical, bileaflet heart valve mounted in a model straight aorta with an anatomical-like triple sinus.
An upwind multigrid method for solving viscous flows on unstructured triangular meshes. M.S. Thesis
NASA Technical Reports Server (NTRS)
Bonhaus, Daryl Lawrence
1993-01-01
A multigrid algorithm is combined with an upwind scheme for solving the two dimensional Reynolds averaged Navier-Stokes equations on triangular meshes resulting in an efficient, accurate code for solving complex flows around multiple bodies. The relaxation scheme uses a backward-Euler time difference and relaxes the resulting linear system using a red-black procedure. Roe's flux-splitting scheme is used to discretize convective and pressure terms, while a central difference is used for the diffusive terms. The multigrid scheme is demonstrated for several flows around single and multi-element airfoils, including inviscid, laminar, and turbulent flows. The results show an appreciable speed up of the scheme for inviscid and laminar flows, and dramatic increases in efficiency for turbulent cases, especially those on increasingly refined grids.
On multigrid methods for the Navier-Stokes Computer
NASA Technical Reports Server (NTRS)
Nosenchuck, D. M.; Krist, S. E.; Zang, T. A.
1988-01-01
The overall architecture of the multipurpose parallel-processing Navier-Stokes Computer (NSC) being developed by Princeton and NASA Langley (Nosenchuck et al., 1986) is described and illustrated with extensive diagrams, and the NSC implementation of an elementary multigrid algorithm for simulating isotropic turbulence (based on solution of the incompressible time-dependent Navier-Stokes equations with constant viscosity) is characterized in detail. The present NSC design concept calls for 64 nodes, each with the performance of a class VI supercomputer, linked together by a fiber-optic hypercube network and joined to a front-end computer by a global bus. In this configuration, the NSC would have a storage capacity of over 32 Gword and a peak speed of over 40 Gflops. The multigrid Navier-Stokes code discussed would give sustained operation rates of about 25 Gflops.
PCTDSE: A parallel Cartesian-grid-based TDSE solver for modeling laser-atom interactions
NASA Astrophysics Data System (ADS)
Fu, Yongsheng; Zeng, Jiaolong; Yuan, Jianmin
2017-01-01
We present a parallel Cartesian-grid-based time-dependent Schrödinger equation (TDSE) solver for modeling laser-atom interactions. It can simulate the single-electron dynamics of atoms in arbitrary time-dependent vector potentials. We use a split-operator method combined with fast Fourier transforms (FFT), on a three-dimensional (3D) Cartesian grid. Parallelization is realized using a 2D decomposition strategy based on the Message Passing Interface (MPI) library, which results in a good parallel scaling on modern supercomputers. We give simple applications for the hydrogen atom using the benchmark problems coming from the references and obtain repeatable results. The extensions to other laser-atom systems are straightforward with minimal modifications of the source code.
A dispersion minimizing scheme for the 3-D Helmholtz equation based on ray theory
NASA Astrophysics Data System (ADS)
Stolk, Christiaan C.
2016-06-01
We develop a new dispersion minimizing compact finite difference scheme for the Helmholtz equation in 2 and 3 dimensions. The scheme is based on a newly developed ray theory for difference equations. A discrete Helmholtz operator and a discrete operator to be applied to the source and the wavefields are constructed. Their coefficients are piecewise polynomial functions of hk, chosen such that phase and amplitude errors are minimal. The phase errors of the scheme are very small, approximately as small as those of the 2-D quasi-stabilized FEM method and substantially smaller than those of alternatives in 3-D, assuming the same number of gridpoints per wavelength is used. In numerical experiments, accurate solutions are obtained in constant and smoothly varying media using meshes with only five to six points per wavelength and wave propagation over hundreds of wavelengths. When used as a coarse level discretization in a multigrid method the scheme can even be used with down to three points per wavelength. Tests on 3-D examples with up to 108 degrees of freedom show that with a recently developed hybrid solver, the use of coarser meshes can lead to corresponding savings in computation time, resulting in good simulation times compared to the literature.
Patient-specific blood flow simulation to improve intracranial aneurysm diagnosis
NASA Astrophysics Data System (ADS)
Fenz, Wolfgang; Dirnberger, Johannes
2011-03-01
We present a novel simulation system of blood flow through intracranial aneurysms including the interaction between blood lumen and vessel tissue. It provides the means to estimate rupture risks by calculating the distribution of pressure and shear stresses in the aneurysm, in order to support the planning of clinical interventions. So far, this has only been possible with commercial simulation packages originally targeted at industrial applications, whereas our implementation focuses on the intuitive integration into clinical workflow. Due to the time-critical nature of the application, we exploit most efficient state-of-the-art numerical methods and technologies together with high performance computing infrastructures (Austrian Grid). Our system builds a three-dimensional virtual replica of the patient's cerebrovascular system from X-ray angiography, CT or MR images. The physician can then select a region of interest which is automatically transformed into a tetrahedral mesh. The differential equations for the blood flow and the wall elasticity are discretized via the finite element method (FEM), and the resulting linear equation systems are handled by an algebraic multigrid (AMG) solver. The wall displacement caused by the blood pressure is calculated using an iterative fluid-structure interaction (FSI) algorithm, and the fluid mesh is deformed accordingly. First simulation results on measured patient geometries show good medical relevance for diagnostic decision support.
3D electromagnetic modelling of a TTI medium and TTI effects in inversion
NASA Astrophysics Data System (ADS)
Jaysaval, Piyoosh; Shantsev, Daniil; de la Kethulle de Ryhove, Sébastien
2016-04-01
We present a numerical algorithm for 3D electromagnetic (EM) forward modelling in conducting media with general electric anisotropy. The algorithm is based on the finite-difference discretization of frequency-domain Maxwell's equations on a Lebedev grid, in which all components of the electric field are collocated but half a spatial step staggered with respect to the magnetic field components, which also are collocated. This leads to a system of linear equations that is solved using a stabilized biconjugate gradient method with a multigrid preconditioner. We validate the accuracy of the numerical results for layered and 3D tilted transverse isotropic (TTI) earth models representing typical scenarios used in the marine controlled-source EM method. It is then demonstrated that not taking into account the full anisotropy of the conductivity tensor can lead to misleading inversion results. For simulation data corresponding to a 3D model with a TTI anticlinal structure, a standard vertical transverse isotropic inversion is not able to image a resistor, while for a 3D model with a TTI synclinal structure the inversion produces a false resistive anomaly. If inversion uses the proposed forward solver that can handle TTI anisotropy, it produces resistivity images consistent with the true models.
Mang, Andreas; Ruthotto, Lars
2017-01-01
We present an efficient solver for diffeomorphic image registration problems in the framework of Large Deformations Diffeomorphic Metric Mappings (LDDMM). We use an optimal control formulation, in which the velocity field of a hyperbolic PDE needs to be found such that the distance between the final state of the system (the transformed/transported template image) and the observation (the reference image) is minimized. Our solver supports both stationary and non-stationary (i.e., transient or time-dependent) velocity fields. As transformation models, we consider both the transport equation (assuming intensities are preserved during the deformation) and the continuity equation (assuming mass-preservation). We consider the reduced form of the optimal control problem and solve the resulting unconstrained optimization problem using a discretize-then-optimize approach. A key contribution is the elimination of the PDE constraint using a Lagrangian hyperbolic PDE solver. Lagrangian methods rely on the concept of characteristic curves. We approximate these curves using a fourth-order Runge-Kutta method. We also present an efficient algorithm for computing the derivatives of the final state of the system with respect to the velocity field. This allows us to use fast Gauss-Newton based methods. We present quickly converging iterative linear solvers using spectral preconditioners that render the overall optimization efficient and scalable. Our method is embedded into the image registration framework FAIR and, thus, supports the most commonly used similarity measures and regularization functionals. We demonstrate the potential of our new approach using several synthetic and real world test problems with up to 14.7 million degrees of freedom.
Using the Intel Math Kernel Library on Peregrine | High-Performance
Computing | NREL the Intel Math Kernel Library on Peregrine Using the Intel Math Kernel Library on Peregrine Learn how to use the Intel Math Kernel Library (MKL) with Peregrine system software. MKL architectures. Core math functions in MKL include BLAS, LAPACK, ScaLAPACK, sparse solvers, fast Fourier
Novel Methods for Electromagnetic Simulation and Design
2016-08-03
The resulting discretized integral equations are compatible with fast multipoleaccelerated solvers and will form the basis for high fidelity...expansion”) which are high-order, efficient and easy to use on arbitrarily triangulated surfaces. The resulting discretized integral equations are...created a user interface compatible with both low and high order discretizations , and implemented the generalized Debye approach of [4]. The
A Fast MoM Solver (GIFFT) for Large Arrays of Microstrip and Cavity-Backed Antennas
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fasenfest, B J; Capolino, F; Wilton, D
2005-02-02
A straightforward numerical analysis of large arrays of arbitrary contour (and possibly missing elements) requires large memory storage and long computation times. Several techniques are currently under development to reduce this cost. One such technique is the GIFFT (Green's function interpolation and FFT) method discussed here that belongs to the class of fast solvers for large structures. This method uses a modification of the standard AIM approach [1] that takes into account the reusability properties of matrices that arise from identical array elements. If the array consists of planar conducting bodies, the array elements are meshed using standard subdomain basismore » functions, such as the RWG basis. The Green's function is then projected onto a sparse regular grid of separable interpolating polynomials. This grid can then be used in a 2D or 3D FFT to accelerate the matrix-vector product used in an iterative solver [2]. The method has been proven to greatly reduce solve time by speeding up the matrix-vector product computation. The GIFFT approach also reduces fill time and memory requirements, since only the near element interactions need to be calculated exactly. The present work extends GIFFT to layered material Green's functions and multiregion interactions via slots in ground planes. In addition, a preconditioner is implemented to greatly reduce the number of iterations required for a solution. The general scheme of the GIFFT method is reported in [2]; this contribution is limited to presenting new results for array antennas made of slot-excited patches and cavity-backed patch antennas.« less
Solving Coupled Gross--Pitaevskii Equations on a Cluster of PlayStation 3 Computers
NASA Astrophysics Data System (ADS)
Edwards, Mark; Heward, Jeffrey; Clark, C. W.
2009-05-01
At Georgia Southern University we have constructed an 8+1--node cluster of Sony PlayStation 3 (PS3) computers with the intention of using this computing resource to solve problems related to the behavior of ultra--cold atoms in general with a particular emphasis on studying bose--bose and bose--fermi mixtures confined in optical lattices. As a first project that uses this computing resource, we have implemented a parallel solver of the coupled time--dependent, one--dimensional Gross--Pitaevskii (TDGP) equations. These equations govern the behavior of dual-- species bosonic mixtures. We chose the split--operator/FFT to solve the coupled 1D TDGP equations. The fast Fourier transform component of this solver can be readily parallelized on the PS3 cpu known as the Cell Broadband Engine (CellBE). Each CellBE chip contains a single 64--bit PowerPC Processor Element known as the PPE and eight ``Synergistic Processor Element'' identified as the SPE's. We report on this algorithm and compare its performance to a non--parallel solver as applied to modeling evaporative cooling in dual--species bosonic mixtures.
NASA Astrophysics Data System (ADS)
Umezu, Yasuyoshi; Watanabe, Yuko; Ma, Ninshu
2005-08-01
Since 1996, Japan Research Institute Limited (JRI) has been providing a sheet metal forming simulation system called JSTAMP-Works packaged the FEM solvers of LS-DYNA and JOH/NIKE, which might be the first multistage system at that time and has been enjoying good reputation among users in Japan. To match the recent needs, "faster, more accurate and easier", of process designers and CAE engineers, a new metal forming simulation system JSTAMP-Works/NV is developed. The JSTAMP-Works/NV packaged the automatic healing function of CAD and had much more new capabilities such as prediction of 3D trimming lines for flanging or hemming, remote control of solver execution for multi-stage forming processes and shape evaluation between FEM and CAD. On the other way, a multi-stage multi-purpose inverse FEM solver HYSTAMP is developed and will be soon put into market, which is approved to be very fast, quite accurate and robust. Lastly, authors will give some application examples of user defined ductile damage subroutine in LS-DYNA for the estimation of material failure and springback in metal forming simulation.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ghysels, Pieter; Li, Xiaoye S.; Rouet, Francois -Henry
Here, we present a sparse linear system solver that is based on a multifrontal variant of Gaussian elimination and exploits low-rank approximation of the resulting dense frontal matrices. We use hierarchically semiseparable (HSS) matrices, which have low-rank off-diagonal blocks, to approximate the frontal matrices. For HSS matrix construction, a randomized sampling algorithm is used together with interpolative decompositions. The combination of the randomized compression with a fast ULV HSS factoriz ation leads to a solver with lower computational complexity than the standard multifrontal method for many applications, resulting in speedups up to 7 fold for problems in our test suite.more » The implementation targets many-core systems by using task parallelism with dynamic runtime scheduling. Numerical experiments show performance improvements over state-of-the-art sparse direct solvers. The implementation achieves high performance and good scalability on a range of modern shared memory parallel systems, including the Intel Xeon Phi (MIC). The code is part of a software package called STRUMPACK - STRUctured Matrices PACKage, which also has a distributed memory component for dense rank-structured matrices.« less
Overcoming Challenges in Kinetic Modeling of Magnetized Plasmas and Vacuum Electronic Devices
NASA Astrophysics Data System (ADS)
Omelchenko, Yuri; Na, Dong-Yeop; Teixeira, Fernando
2017-10-01
We transform the state-of-the art of plasma modeling by taking advantage of novel computational techniques for fast and robust integration of multiscale hybrid (full particle ions, fluid electrons, no displacement current) and full-PIC models. These models are implemented in 3D HYPERS and axisymmetric full-PIC CONPIC codes. HYPERS is a massively parallel, asynchronous code. The HYPERS solver does not step fields and particles synchronously in time but instead executes local variable updates (events) at their self-adaptive rates while preserving fundamental conservation laws. The charge-conserving CONPIC code has a matrix-free explicit finite-element (FE) solver based on a sparse-approximate inverse (SPAI) algorithm. This explicit solver approximates the inverse FE system matrix (``mass'' matrix) using successive sparsity pattern orders of the original matrix. It does not reduce the set of Maxwell's equations to a vector-wave (curl-curl) equation of second order but instead utilizes the standard coupled first-order Maxwell's system. We discuss the ability of our codes to accurately and efficiently account for multiscale physical phenomena in 3D magnetized space and laboratory plasmas and axisymmetric vacuum electronic devices.
CubiCal - Fast radio interferometric calibration suite exploiting complex optimisation
NASA Astrophysics Data System (ADS)
Kenyon, J. S.; Smirnov, O. M.; Grobler, T. L.; Perkins, S. J.
2018-05-01
It has recently been shown that radio interferometric gain calibration can be expressed succinctly in the language of complex optimisation. In addition to providing an elegant framework for further development, it exposes properties of the calibration problem which can be exploited to accelerate traditional non-linear least squares solvers such as Gauss-Newton and Levenberg-Marquardt. We extend existing derivations to chains of Jones terms: products of several gains which model different aberrant effects. In doing so, we find that the useful properties found in the single term case still hold. We also develop several specialised solvers which deal with complex gains parameterised by real values. The newly developed solvers have been implemented in a Python package called CubiCal, which uses a combination of Cython, multiprocessing and shared memory to leverage the power of modern hardware. We apply CubiCal to both simulated and real data, and perform both direction-independent and direction-dependent self-calibration. Finally, we present the results of some rudimentary profiling to show that CubiCal is competitive with respect to existing calibration tools such as MeqTrees.
Three-dimensional multigrid algorithms for the flux-split Euler equations
NASA Technical Reports Server (NTRS)
Anderson, W. Kyle; Thomas, James L.; Whitfield, David L.
1988-01-01
The Full Approximation Scheme (FAS) multigrid method is applied to several implicit flux-split algorithms for solving the three-dimensional Euler equations in a body fitted coordinate system. Each of the splitting algorithms uses a variation of approximate factorization and is implemented in a finite volume formulation. The algorithms are all vectorizable with little or no scalar computation required. The flux vectors are split into upwind components using both the splittings of Steger-Warming and Van Leer. The stability and smoothing rate of each of the schemes are examined using a Fourier analysis of the complete system of equations. Results are presented for three-dimensional subsonic, transonic, and supersonic flows which demonstrate substantially improved convergence rates with the multigrid algorithm. The influence of using both a V-cycle and a W-cycle on the convergence is examined.
On the solution of evolution equations based on multigrid and explicit iterative methods
NASA Astrophysics Data System (ADS)
Zhukov, V. T.; Novikova, N. D.; Feodoritova, O. B.
2015-08-01
Two schemes for solving initial-boundary value problems for three-dimensional parabolic equations are studied. One is implicit and is solved using the multigrid method, while the other is explicit iterative and is based on optimal properties of the Chebyshev polynomials. In the explicit iterative scheme, the number of iteration steps and the iteration parameters are chosen as based on the approximation and stability conditions, rather than on the optimization of iteration convergence to the solution of the implicit scheme. The features of the multigrid scheme include the implementation of the intergrid transfer operators for the case of discontinuous coefficients in the equation and the adaptation of the smoothing procedure to the spectrum of the difference operators. The results produced by these schemes as applied to model problems with anisotropic discontinuous coefficients are compared.
Multigrid method for stability problems
NASA Technical Reports Server (NTRS)
Ta'asan, Shlomo
1988-01-01
The problem of calculating the stability of steady state solutions of differential equations is addressed. Leading eigenvalues of large matrices that arise from discretization are calculated, and an efficient multigrid method for solving these problems is presented. The resulting grid functions are used as initial approximations for appropriate eigenvalue problems. The method employs local relaxation on all levels together with a global change on the coarsest level only, which is designed to separate the different eigenfunctions as well as to update their corresponding eigenvalues. Coarsening is done using the FAS formulation in a nonstandard way in which the right-hand side of the coarse grid equations involves unknown parameters to be solved on the coarse grid. This leads to a new multigrid method for calculating the eigenvalues of symmetric problems. Numerical experiments with a model problem are presented which demonstrate the effectiveness of the method.
NASA Astrophysics Data System (ADS)
Guo, Hongbo; He, Xiaowei; Liu, Muhan; Zhang, Zeyu; Hu, Zhenhua; Tian, Jie
2017-03-01
Cerenkov luminescence tomography (CLT), as a promising optical molecular imaging modality, can be applied to cancer diagnostic and therapeutic. Most researches about CLT reconstruction are based on the finite element method (FEM) framework. However, the quality of FEM mesh grid is still a vital factor to restrict the accuracy of the CLT reconstruction result. In this paper, we proposed a multi-grid finite element method framework, which was able to improve the accuracy of reconstruction. Meanwhile, the multilevel scheme adaptive algebraic reconstruction technique (MLS-AART) based on a modified iterative algorithm was applied to improve the reconstruction accuracy. In numerical simulation experiments, the feasibility of our proposed method were evaluated. Results showed that the multi-grid strategy could obtain 3D spatial information of Cerenkov source more accurately compared with the traditional single-grid FEM.
A NetCDF version of the two-dimensional energy balance model based on the full multigrid algorithm
NASA Astrophysics Data System (ADS)
Zhuang, Kelin; North, Gerald R.; Stevens, Mark J.
A NetCDF version of the two-dimensional energy balance model based on the full multigrid method in Fortran is introduced for both pedagogical and research purposes. Based on the land-sea-ice distribution, orbital elements, greenhouse gases concentration, and albedo, the code calculates the global seasonal surface temperature. A step-by-step guide with examples is provided for practice.
NASA Astrophysics Data System (ADS)
Turcksin, Bruno; Ragusa, Jean C.; Morel, Jim E.
2012-01-01
It is well known that the diffusion synthetic acceleration (DSA) methods for the Sn equations become ineffective in the Fokker-Planck forward-peaked scattering limit. In response to this deficiency, Morel and Manteuffel (1991) developed an angular multigrid method for the 1-D Sn equations. This method is very effective, costing roughly twice as much as DSA per source iteration, and yielding a maximum spectral radius of approximately 0.6 in the Fokker-Planck limit. Pautz, Adams, and Morel (PAM) (1999) later generalized the angular multigrid to 2-D, but it was found that the method was unstable with sufficiently forward-peaked mappings between the angular grids. The method was stabilized via a filtering technique based on diffusion operators, but this filtering also degraded the effectiveness of the overall scheme. The spectral radius was not bounded away from unity in the Fokker-Planck limit, although the method remained more effective than DSA. The purpose of this article is to recast the multidimensional PAM angular multigrid method without the filtering as an Sn preconditioner and use it in conjunction with the Generalized Minimal RESidual (GMRES) Krylov method. The approach ensures stability and our computational results demonstrate that it is also significantly more efficient than an analogous DSA-preconditioned Krylov method.
Application of p-Multigrid to Discontinuous Galerkin Formulations of the Poisson Equation
NASA Technical Reports Server (NTRS)
Helenbrook, B. T.; Atkins, H. L.
2006-01-01
We investigate p-multigrid as a solution method for several different discontinuous Galerkin (DG) formulations of the Poisson equation. Different combinations of relaxation schemes and basis sets have been combined with the DG formulations to find the best performing combination. The damping factors of the schemes have been determined using Fourier analysis for both one and two-dimensional problems. One important finding is that when using DG formulations, the standard approach of forming the coarse p matrices separately for each level of multigrid is often unstable. To ensure stability the coarse p matrices must be constructed from the fine grid matrices using algebraic multigrid techniques. Of the relaxation schemes, we find that the combination of Jacobi relaxation with the spectral element basis is fairly effective. The results using this combination are p sensitive in both one and two dimensions, but reasonable convergence rates can still be achieved for moderate values of p and isotropic meshes. A competitive alternative is a block Gauss-Seidel relaxation. This actually out performs a more expensive line relaxation when the mesh is isotropic. When the mesh becomes highly anisotropic, the implicit line method and the Gauss-Seidel implicit line method are the only effective schemes. Adding the Gauss-Seidel terms to the implicit line method gives a significant improvement over the line relaxation method.
Fast algorithms for chiral fermions in 2 dimensions
NASA Astrophysics Data System (ADS)
Hyka (Xhako), Dafina; Osmanaj (Zeqirllari), Rudina
2018-03-01
In lattice QCD simulations the formulation of the theory in lattice should be chiral in order that symmetry breaking happens dynamically from interactions. In order to guarantee this symmetry on the lattice one uses overlap and domain wall fermions. On the other hand high computational cost of lattice QCD simulations with overlap or domain wall fermions remains a major obstacle of research in the field of elementary particles. We have developed the preconditioned GMRESR algorithm as fast inverting algorithm for chiral fermions in U(1) lattice gauge theory. In this algorithm we used the geometric multigrid idea along the extra dimension.The main result of this work is that the preconditioned GMRESR is capable to accelerate the convergence 2 to 12 times faster than the other optimal algorithms (SHUMR) for different coupling constant and lattice 32x32. Also, in this paper we tested it for larger lattice size 64x64. From the results of simulations we can see that our algorithm is faster than SHUMR. This is a very promising result that this algorithm can be adapted also in 4 dimension.
NASA Technical Reports Server (NTRS)
Young, D. P.; Woo, A. C.; Bussoletti, J. E.; Johnson, F. T.
1986-01-01
A general method is developed combining fast direct methods and boundary integral equation methods to solve Poisson's equation on irregular exterior regions. The method requires O(N log N) operations where N is the number of grid points. Error estimates are given that hold for regions with corners and other boundary irregularities. Computational results are given in the context of computational aerodynamics for a two-dimensional lifting airfoil. Solutions of boundary integral equations for lifting and nonlifting aerodynamic configurations using preconditioned conjugate gradient are examined for varying degrees of thinness.
QCAD simulation and optimization of semiconductor double quantum dots
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nielsen, Erik; Gao, Xujiao; Kalashnikova, Irina
2013-12-01
We present the Quantum Computer Aided Design (QCAD) simulator that targets modeling quantum devices, particularly silicon double quantum dots (DQDs) developed for quantum qubits. The simulator has three di erentiating features: (i) its core contains nonlinear Poisson, e ective mass Schrodinger, and Con guration Interaction solvers that have massively parallel capability for high simulation throughput, and can be run individually or combined self-consistently for 1D/2D/3D quantum devices; (ii) the core solvers show superior convergence even at near-zero-Kelvin temperatures, which is critical for modeling quantum computing devices; (iii) it couples with an optimization engine Dakota that enables optimization of gate voltagesmore » in DQDs for multiple desired targets. The Poisson solver includes Maxwell- Boltzmann and Fermi-Dirac statistics, supports Dirichlet, Neumann, interface charge, and Robin boundary conditions, and includes the e ect of dopant incomplete ionization. The solver has shown robust nonlinear convergence even in the milli-Kelvin temperature range, and has been extensively used to quickly obtain the semiclassical electrostatic potential in DQD devices. The self-consistent Schrodinger-Poisson solver has achieved robust and monotonic convergence behavior for 1D/2D/3D quantum devices at very low temperatures by using a predictor-correct iteration scheme. The QCAD simulator enables the calculation of dot-to-gate capacitances, and comparison with experiment and between solvers. It is observed that computed capacitances are in the right ballpark when compared to experiment, and quantum con nement increases capacitance when the number of electrons is xed in a quantum dot. In addition, the coupling of QCAD with Dakota allows to rapidly identify which device layouts are more likely leading to few-electron quantum dots. Very efficient QCAD simulations on a large number of fabricated and proposed Si DQDs have made it possible to provide fast feedback for design comparison and optimization.« less
Simulation Methods for Design of Networked Power Electronics and Information Systems
2014-07-01
Insertion of latency in every branch and at every node permits the system model to be efficiently distributed across many separate computing cores. An... the system . We demonstrated extensibility and generality of the Virtual Test Bed (VTB) framework to support multiple solvers and their associated...Information Systems Objectives The overarching objective of this program is to develop methods for fast
Simulations in support of the T4B experiment
NASA Astrophysics Data System (ADS)
Qerushi, Artan; Ross, Patrick; Lohff, Chriss; Raymond, Anthony; Montecalvo, Niccolo
2017-10-01
Simulations in support of the T4B experiment are presented. These include a Grad-Shafranov equilibrium solver and equilibrium reconstruction from flux-loop measurements, collision radiative models for plasma spectroscopy (determination of electron density and temperature from line ratios) and fast ion test particle codes for neutral beam - plasma coupling. ©2017 Lockheed Martin Corporation. All Rights Reserved.
FleCSPH - a parallel and distributed SPH implementation based on the FleCSI framework
DOE Office of Scientific and Technical Information (OSTI.GOV)
Junghans, Christoph; Loiseau, Julien
2017-06-20
FleCSPH is a multi-physics compact application that exercises FleCSI parallel data structures for tree-based particle methods. In particular, FleCSPH implements a smoothed-particle hydrodynamics (SPH) solver for the solution of Lagrangian problems in astrophysics and cosmology. FleCSPH includes support for gravitational forces using the fast multipole method (FMM).
A fast Chebyshev method for simulating flexible-wing propulsion
NASA Astrophysics Data System (ADS)
Moore, M. Nicholas J.
2017-09-01
We develop a highly efficient numerical method to simulate small-amplitude flapping propulsion by a flexible wing in a nearly inviscid fluid. We allow the wing's elastic modulus and mass density to vary arbitrarily, with an eye towards optimizing these distributions for propulsive performance. The method to determine the wing kinematics is based on Chebyshev collocation of the 1D beam equation as coupled to the surrounding 2D fluid flow. Through small-amplitude analysis of the Euler equations (with trailing-edge vortex shedding), the complete hydrodynamics can be represented by a nonlocal operator that acts on the 1D wing kinematics. A class of semi-analytical solutions permits fast evaluation of this operator with O (Nlog N) operations, where N is the number of collocation points on the wing. This is in contrast to the minimum O (N2) cost of a direct 2D fluid solver. The coupled wing-fluid problem is thus recast as a PDE with nonlocal operator, which we solve using a preconditioned iterative method. These techniques yield a solver of near-optimal complexity, O (Nlog N) , allowing one to rapidly search the infinite-dimensional parameter space of all possible material distributions and even perform optimization over this space.
NASA Astrophysics Data System (ADS)
Zheng, Chang-Jun; Gao, Hai-Feng; Du, Lei; Chen, Hai-Bo; Zhang, Chuanzeng
2016-01-01
An accurate numerical solver is developed in this paper for eigenproblems governed by the Helmholtz equation and formulated through the boundary element method. A contour integral method is used to convert the nonlinear eigenproblem into an ordinary eigenproblem, so that eigenvalues can be extracted accurately by solving a set of standard boundary element systems of equations. In order to accelerate the solution procedure, the parameters affecting the accuracy and efficiency of the method are studied and two contour paths are compared. Moreover, a wideband fast multipole method is implemented with a block IDR (s) solver to reduce the overall solution cost of the boundary element systems of equations with multiple right-hand sides. The Burton-Miller formulation is employed to identify the fictitious eigenfrequencies of the interior acoustic problems with multiply connected domains. The actual effect of the Burton-Miller formulation on tackling the fictitious eigenfrequency problem is investigated and the optimal choice of the coupling parameter as α = i / k is confirmed through exterior sphere examples. Furthermore, the numerical eigenvalues obtained by the developed method are compared with the results obtained by the finite element method to show the accuracy and efficiency of the developed method.
Helmholtz and parabolic equation solutions to a benchmark problem in ocean acoustics.
Larsson, Elisabeth; Abrahamsson, Leif
2003-05-01
The Helmholtz equation (HE) describes wave propagation in applications such as acoustics and electromagnetics. For realistic problems, solving the HE is often too expensive. Instead, approximations like the parabolic wave equation (PE) are used. For low-frequency shallow-water environments, one persistent problem is to assess the accuracy of the PE model. In this work, a recently developed HE solver that can handle a smoothly varying bathymetry, variable material properties, and layered materials, is used for an investigation of the errors in PE solutions. In the HE solver, a preconditioned Krylov subspace method is applied to the discretized equations. The preconditioner combines domain decomposition and fast transform techniques. A benchmark problem with upslope-downslope propagation over a penetrable lossy seamount is solved. The numerical experiments show that, for the same bathymetry, a soft and slow bottom gives very similar HE and PE solutions, whereas the PE model is far from accurate for a hard and fast bottom. A first attempt to estimate the error is made by computing the relative deviation from the energy balance for the PE solution. This measure gives an indication of the magnitude of the error, but cannot be used as a strict error bound.
An Optimal Order Nonnested Mixed Multigrid Method for Generalized Stokes Problems
NASA Technical Reports Server (NTRS)
Deng, Qingping
1996-01-01
A multigrid algorithm is developed and analyzed for generalized Stokes problems discretized by various nonnested mixed finite elements within a unified framework. It is abstractly proved by an element-independent analysis that the multigrid algorithm converges with an optimal order if there exists a 'good' prolongation operator. A technique to construct a 'good' prolongation operator for nonnested multilevel finite element spaces is proposed. Its basic idea is to introduce a sequence of auxiliary nested multilevel finite element spaces and define a prolongation operator as a composite operator of two single grid level operators. This makes not only the construction of a prolongation operator much easier (the final explicit forms of such prolongation operators are fairly simple), but the verification of the approximate properties for prolongation operators is also simplified. Finally, as an application, the framework and technique is applied to seven typical nonnested mixed finite elements.
On Efficient Multigrid Methods for Materials Processing Flows with Small Particles
NASA Technical Reports Server (NTRS)
Thomas, James (Technical Monitor); Diskin, Boris; Harik, VasylMichael
2004-01-01
Multiscale modeling of materials requires simulations of multiple levels of structural hierarchy. The computational efficiency of numerical methods becomes a critical factor for simulating large physical systems with highly desperate length scales. Multigrid methods are known for their superior efficiency in representing/resolving different levels of physical details. The efficiency is achieved by employing interactively different discretizations on different scales (grids). To assist optimization of manufacturing conditions for materials processing with numerous particles (e.g., dispersion of particles, controlling flow viscosity and clusters), a new multigrid algorithm has been developed for a case of multiscale modeling of flows with small particles that have various length scales. The optimal efficiency of the algorithm is crucial for accurate predictions of the effect of processing conditions (e.g., pressure and velocity gradients) on the local flow fields that control the formation of various microstructures or clusters.
Segmented Domain Decomposition Multigrid For 3-D Turbomachinery Flows
NASA Technical Reports Server (NTRS)
Celestina, M. L.; Adamczyk, J. J.; Rubin, S. G.
2001-01-01
A Segmented Domain Decomposition Multigrid (SDDMG) procedure was developed for three-dimensional viscous flow problems as they apply to turbomachinery flows. The procedure divides the computational domain into a coarse mesh comprised of uniformly spaced cells. To resolve smaller length scales such as the viscous layer near a surface, segments of the coarse mesh are subdivided into a finer mesh. This is repeated until adequate resolution of the smallest relevant length scale is obtained. Multigrid is used to communicate information between the different grid levels. To test the procedure, simulation results will be presented for a compressor and turbine cascade. These simulations are intended to show the ability of the present method to generate grid independent solutions. Comparisons with data will also be presented. These comparisons will further demonstrate the usefulness of the present work for they allow an estimate of the accuracy of the flow modeling equations independent of error attributed to numerical discretization.
GPU-based RFA simulation for minimally invasive cancer treatment of liver tumours.
Mariappan, Panchatcharam; Weir, Phil; Flanagan, Ronan; Voglreiter, Philip; Alhonnoro, Tuomas; Pollari, Mika; Moche, Michael; Busse, Harald; Futterer, Jurgen; Portugaller, Horst Rupert; Sequeiros, Roberto Blanco; Kolesnik, Marina
2017-01-01
Radiofrequency ablation (RFA) is one of the most popular and well-standardized minimally invasive cancer treatments (MICT) for liver tumours, employed where surgical resection has been contraindicated. Less-experienced interventional radiologists (IRs) require an appropriate planning tool for the treatment to help avoid incomplete treatment and so reduce the tumour recurrence risk. Although a few tools are available to predict the ablation lesion geometry, the process is computationally expensive. Also, in our implementation, a few patient-specific parameters are used to improve the accuracy of the lesion prediction. Advanced heterogeneous computing using personal computers, incorporating the graphics processing unit (GPU) and the central processing unit (CPU), is proposed to predict the ablation lesion geometry. The most recent GPU technology is used to accelerate the finite element approximation of Penne's bioheat equation and a three state cell model. Patient-specific input parameters are used in the bioheat model to improve accuracy of the predicted lesion. A fast GPU-based RFA solver is developed to predict the lesion by doing most of the computational tasks in the GPU, while reserving the CPU for concurrent tasks such as lesion extraction based on the heat deposition at each finite element node. The solver takes less than 3 min for a treatment duration of 26 min. When the model receives patient-specific input parameters, the deviation between real and predicted lesion is below 3 mm. A multi-centre retrospective study indicates that the fast RFA solver is capable of providing the IR with the predicted lesion in the short time period before the intervention begins when the patient has been clinically prepared for the treatment.
Multigrid Techniques for Highly Indefinite Equations
NASA Technical Reports Server (NTRS)
Shapira, Yair
1996-01-01
A multigrid method for the solution of finite difference approximations of elliptic PDE's is introduced. A parallelizable version of it, suitable for two and multi level analysis, is also defined, and serves as a theoretical tool for deriving a suitable implementation for the main version. For indefinite Helmholtz equations, this analysis provides a suitable mesh size for the coarsest grid used. Numerical experiments show that the method is applicable to diffusion equations with discontinuous coefficients and highly indefinite Helmholtz equations.
Final Report, DE-FG01-06ER25718 Domain Decomposition and Parallel Computing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Widlund, Olof B.
2015-06-09
The goal of this project is to develop and improve domain decomposition algorithms for a variety of partial differential equations such as those of linear elasticity and electro-magnetics.These iterative methods are designed for massively parallel computing systems and allow the fast solution of the very large systems of algebraic equations that arise in large scale and complicated simulations. A special emphasis is placed on problems arising from Maxwell's equation. The approximate solvers, the preconditioners, are combined with the conjugate gradient method and must always include a solver of a coarse model in order to have a performance which is independentmore » of the number of processors used in the computer simulation. A recent development allows for an adaptive construction of this coarse component of the preconditioner.« less
Fuel Optimal, Finite Thrust Guidance Methods to Circumnavigate with Lighting Constraints
NASA Astrophysics Data System (ADS)
Prince, E. R.; Carr, R. W.; Cobb, R. G.
This paper details improvements made to the authors' most recent work to find fuel optimal, finite-thrust guidance to inject an inspector satellite into a prescribed natural motion circumnavigation (NMC) orbit about a resident space object (RSO) in geosynchronous orbit (GEO). Better initial guess methodologies are developed for the low-fidelity model nonlinear programming problem (NLP) solver to include using Clohessy- Wiltshire (CW) targeting, a modified particle swarm optimization (PSO), and MATLAB's genetic algorithm (GA). These initial guess solutions may then be fed into the NLP solver as an initial guess, where a different NLP solver, IPOPT, is used. Celestial lighting constraints are taken into account in addition to the sunlight constraint, ensuring that the resulting NMC also adheres to Moon and Earth lighting constraints. The guidance is initially calculated given a fixed final time, and then solutions are also calculated for fixed final times before and after the original fixed final time, allowing mission planners to choose the lowest-cost solution in the resulting range which satisfies all constraints. The developed algorithms provide computationally fast and highly reliable methods for determining fuel optimal guidance for NMC injections while also adhering to multiple lighting constraints.
Numerical solution of a coupled pair of elliptic equations from solid state electronics
NASA Technical Reports Server (NTRS)
Phillips, T. N.
1983-01-01
Iterative methods are considered for the solution of a coupled pair of second order elliptic partial differential equations which arise in the field of solid state electronics. A finite difference scheme is used which retains the conservative form of the differential equations. Numerical solutions are obtained in two ways, by multigrid and dynamic alternating direction implicit methods. Numerical results are presented which show the multigrid method to be an efficient way of solving this problem.
Spectral multigrid methods for the solution of homogeneous turbulence problems
NASA Technical Reports Server (NTRS)
Erlebacher, G.; Zang, T. A.; Hussaini, M. Y.
1987-01-01
New three-dimensional spectral multigrid algorithms are analyzed and implemented to solve the variable coefficient Helmholtz equation. Periodicity is assumed in all three directions which leads to a Fourier collocation representation. Convergence rates are theoretically predicted and confirmed through numerical tests. Residual averaging results in a spectral radius of 0.2 for the variable coefficient Poisson equation. In general, non-stationary Richardson must be used for the Helmholtz equation. The algorithms developed are applied to the large-eddy simulation of incompressible isotropic turbulence.
Multigrid solution strategies for adaptive meshing problems
NASA Technical Reports Server (NTRS)
Mavriplis, Dimitri J.
1995-01-01
This paper discusses the issues which arise when combining multigrid strategies with adaptive meshing techniques for solving steady-state problems on unstructured meshes. A basic strategy is described, and demonstrated by solving several inviscid and viscous flow cases. Potential inefficiencies in this basic strategy are exposed, and various alternate approaches are discussed, some of which are demonstrated with an example. Although each particular approach exhibits certain advantages, all methods have particular drawbacks, and the formulation of a completely optimal strategy is considered to be an open problem.
A new multigrid formulation for high order finite difference methods on summation-by-parts form
NASA Astrophysics Data System (ADS)
Ruggiu, Andrea A.; Weinerfelt, Per; Nordström, Jan
2018-04-01
Multigrid schemes for high order finite difference methods on summation-by-parts form are studied by comparing the effect of different interpolation operators. By using the standard linear prolongation and restriction operators, the Galerkin condition leads to inaccurate coarse grid discretizations. In this paper, an alternative class of interpolation operators that bypass this issue and preserve the summation-by-parts property on each grid level is considered. Clear improvements of the convergence rate for relevant model problems are achieved.
Parameter estimation problems for distributed systems using a multigrid method
NASA Technical Reports Server (NTRS)
Ta'asan, Shlomo; Dutt, Pravir
1990-01-01
The problem of estimating spatially varying coefficients of partial differential equations is considered from observation of the solution and of the right hand side of the equation. It is assumed that the observations are distributed in the domain and that enough observations are given. A method of discretization and an efficient multigrid method for solving the resulting discrete systems are described. Numerical results are presented for estimation of coefficients in an elliptic and a parabolic partial differential equation.
Efficient solvers for coupled models in respiratory mechanics.
Verdugo, Francesc; Roth, Christian J; Yoshihara, Lena; Wall, Wolfgang A
2017-02-01
We present efficient preconditioners for one of the most physiologically relevant pulmonary models currently available. Our underlying motivation is to enable the efficient simulation of such a lung model on high-performance computing platforms in order to assess mechanical ventilation strategies and contributing to design more protective patient-specific ventilation treatments. The system of linear equations to be solved using the proposed preconditioners is essentially the monolithic system arising in fluid-structure interaction (FSI) extended by additional algebraic constraints. The introduction of these constraints leads to a saddle point problem that cannot be solved with usual FSI preconditioners available in the literature. The key ingredient in this work is to use the idea of the semi-implicit method for pressure-linked equations (SIMPLE) for getting rid of the saddle point structure, resulting in a standard FSI problem that can be treated with available techniques. The numerical examples show that the resulting preconditioners approach the optimal performance of multigrid methods, even though the lung model is a complex multiphysics problem. Moreover, the preconditioners are robust enough to deal with physiologically relevant simulations involving complex real-world patient-specific lung geometries. The same approach is applicable to other challenging biomedical applications where coupling between flow and tissue deformations is modeled with additional algebraic constraints. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.
Tezaur, Irina K.; Tuminaro, Raymond S.; Perego, Mauro; ...
2015-01-01
We examine the scalability of the recently developed Albany/FELIX finite-element based code for the first-order Stokes momentum balance equations for ice flow. We focus our analysis on the performance of two possible preconditioners for the iterative solution of the sparse linear systems that arise from the discretization of the governing equations: (1) a preconditioner based on the incomplete LU (ILU) factorization, and (2) a recently-developed algebraic multigrid (AMG) preconditioner, constructed using the idea of semi-coarsening. A strong scalability study on a realistic, high resolution Greenland ice sheet problem reveals that, for a given number of processor cores, the AMG preconditionermore » results in faster linear solve times but the ILU preconditioner exhibits better scalability. In addition, a weak scalability study is performed on a realistic, moderate resolution Antarctic ice sheet problem, a substantial fraction of which contains floating ice shelves, making it fundamentally different from the Greenland ice sheet problem. We show that as the problem size increases, the performance of the ILU preconditioner deteriorates whereas the AMG preconditioner maintains scalability. This is because the linear systems are extremely ill-conditioned in the presence of floating ice shelves, and the ill-conditioning has a greater negative effect on the ILU preconditioner than on the AMG preconditioner.« less
Prediction of fire growth on furniture using CFD
NASA Astrophysics Data System (ADS)
Pehrson, Richard David
A fire growth calculation method has been developed that couples a computational fluid dynamics (CFD) model with bench scale cone calorimeter test data for predicting the rate of flame spread on compartment contents such as furniture. The commercial CFD code TASCflow has been applied to solve time averaged conservation equations using an algebraic multigrid solver with mass weighted skewed upstream differencing for advection. Closure models include k-e for turbulence, eddy breakup for combustion following a single step irreversible reaction with Arrhenius rate constant, finite difference radiation transfer, and conjugate heat transfer. Radiation properties are determined from concentrations of soot, CO2 and H2O using the narrow band model of Grosshandler and exponential wide band curve fit model of Modak. The growth in pyrolyzing area is predicted by treating flame spread as a series of piloted ignitions based on coupled gas-fluid boundary conditions. The mass loss rate from a given surface element follows the bench scale test data for input to the combustion prediction. The fire growth model has been tested against foam-fabric mattresses and chairs burned in the furniture calorimeter. In general, agreement between model and experiment for peak heat release rate (HRR), time to peak HRR, and total energy lost is within +/-20%. Used as a proxy for the flame spread velocity, the slope of the HRR curve predicted by model agreed with experiment within +/-20% for all but one case.
NASA Astrophysics Data System (ADS)
Laakso, Ilkka
2009-06-01
This paper presents finite-difference time-domain (FDTD) calculations of specific absorption rate (SAR) values in the head under plane-wave exposure from 1 to 10 GHz using a resolution of 0.5 mm in adult male and female voxel models. Temperature rise due to the power absorption is calculated by the bioheat equation using a multigrid method solver. The computational accuracy is investigated by repeating the calculations with resolutions of 1 mm and 2 mm and comparing the results. Cubically averaged 10 g SAR in the eyes and brain and eye-averaged SAR are calculated and compared to the corresponding temperature rise as well as the recommended limits for exposure. The results suggest that 2 mm resolution should only be used for frequencies smaller than 2.5 GHz, and 1 mm resolution only under 5 GHz. Morphological differences in models seemed to be an important cause of variation: differences in results between the two different models were usually larger than the computational error due to the grid resolution, and larger than the difference between the results for open and closed eyes. Limiting the incident plane-wave power density to smaller than 100 W m-2 was sufficient for ensuring that the temperature rise in the eyes and brain were less than 1 °C in the whole frequency range.
Glenn-ht/bem Conjugate Heat Transfer Solver for Large-scale Turbomachinery Models
NASA Technical Reports Server (NTRS)
Divo, E.; Steinthorsson, E.; Rodriquez, F.; Kassab, A. J.; Kapat, J. S.; Heidmann, James D. (Technical Monitor)
2003-01-01
A coupled Boundary Element/Finite Volume Method temperature-forward/flux-hack algorithm is developed for conjugate heat transfer (CHT) applications. A loosely coupled strategy is adopted with each field solution providing boundary conditions for the other in an iteration seeking continuity of temperature and heat flux at the fluid-solid interface. The NASA Glenn Navier-Stokes code Glenn-HT is coupled to a 3-D BEM steady state heat conduction code developed at the University of Central Florida. Results from CHT simulation of a 3-D film-cooled blade section are presented and compared with those computed by a two-temperature approach. Also presented are current developments of an iterative domain decomposition strategy accommodating large numbers of unknowns in the BEM. The blade is artificially sub-sectioned in the span-wise direction, 3-D BEM solutions are obtained in the subdomains, and interface temperatures are averaged symmetrically when the flux is updated while the fluxes are averaged anti-symmetrically to maintain continuity of heat flux when the temperatures are updated. An initial guess for interface temperatures uses a physically-based 1-D conduction argument to provide an effective starting point and significantly reduce iteration. 2-D and 3-D results show the process converges efficiently and offers substantial computational and storage savings. Future developments include a parallel multi-grid implementation of the approach under MPI for computation on PC clusters.
LI, ZHILIN; JI, HAIFENG; CHEN, XIAOHONG
2016-01-01
A new augmented method is proposed for elliptic interface problems with a piecewise variable coefficient that has a finite jump across a smooth interface. The main motivation is not only to get a second order accurate solution but also a second order accurate gradient from each side of the interface. The key of the new method is to introduce the jump in the normal derivative of the solution as an augmented variable and re-write the interface problem as a new PDE that consists of a leading Laplacian operator plus lower order derivative terms near the interface. In this way, the leading second order derivatives jump relations are independent of the jump in the coefficient that appears only in the lower order terms after the scaling. An upwind type discretization is used for the finite difference discretization at the irregular grid points near or on the interface so that the resulting coefficient matrix is an M-matrix. A multi-grid solver is used to solve the linear system of equations and the GMRES iterative method is used to solve the augmented variable. Second order convergence for the solution and the gradient from each side of the interface has also been proved in this paper. Numerical examples for general elliptic interface problems have confirmed the theoretical analysis and efficiency of the new method. PMID:28983130
Multigrid methods with space–time concurrency
Falgout, R. D.; Friedhoff, S.; Kolev, Tz. V.; ...
2017-10-06
Here, we consider the comparison of multigrid methods for parabolic partial differential equations that allow space–time concurrency. With current trends in computer architectures leading towards systems with more, but not faster, processors, space–time concurrency is crucial for speeding up time-integration simulations. In contrast, traditional time-integration techniques impose serious limitations on parallel performance due to the sequential nature of the time-stepping approach, allowing spatial concurrency only. This paper considers the three basic options of multigrid algorithms on space–time grids that allow parallelism in space and time: coarsening in space and time, semicoarsening in the spatial dimensions, and semicoarsening in the temporalmore » dimension. We develop parallel software and performance models to study the three methods at scales of up to 16K cores and introduce an extension of one of them for handling multistep time integration. We then discuss advantages and disadvantages of the different approaches and their benefit compared to traditional space-parallel algorithms with sequential time stepping on modern architectures.« less
Application of multi-grid method on the simulation of incremental forging processes
NASA Astrophysics Data System (ADS)
Ramadan, Mohamad; Khaled, Mahmoud; Fourment, Lionel
2016-10-01
Numerical simulation becomes essential in manufacturing large part by incremental forging processes. It is a splendid tool allowing to show physical phenomena however behind the scenes, an expensive bill should be paid, that is the computational time. That is why many techniques are developed to decrease the computational time of numerical simulation. Multi-Grid method is a numerical procedure that permits to reduce computational time of numerical calculation by performing the resolution of the system of equations on several mesh of decreasing size which allows to smooth faster the low frequency of the solution as well as its high frequency. In this paper a Multi-Grid method is applied to cogging process in the software Forge 3. The study is carried out using increasing number of degrees of freedom. The results shows that calculation time is divide by two for a mesh of 39,000 nodes. The method is promising especially if coupled with Multi-Mesh method.
Multigrid methods with space–time concurrency
DOE Office of Scientific and Technical Information (OSTI.GOV)
Falgout, R. D.; Friedhoff, S.; Kolev, Tz. V.
Here, we consider the comparison of multigrid methods for parabolic partial differential equations that allow space–time concurrency. With current trends in computer architectures leading towards systems with more, but not faster, processors, space–time concurrency is crucial for speeding up time-integration simulations. In contrast, traditional time-integration techniques impose serious limitations on parallel performance due to the sequential nature of the time-stepping approach, allowing spatial concurrency only. This paper considers the three basic options of multigrid algorithms on space–time grids that allow parallelism in space and time: coarsening in space and time, semicoarsening in the spatial dimensions, and semicoarsening in the temporalmore » dimension. We develop parallel software and performance models to study the three methods at scales of up to 16K cores and introduce an extension of one of them for handling multistep time integration. We then discuss advantages and disadvantages of the different approaches and their benefit compared to traditional space-parallel algorithms with sequential time stepping on modern architectures.« less