parallel unstructured solver: Topics by Science.gov

Sample records for parallel unstructured solver

Implementation of a parallel unstructured Euler solver on the CM-5

NASA Technical Reports Server (NTRS)

Morano, Eric; Mavriplis, D. J.

1995-01-01

An efficient unstructured 3D Euler solver is parallelized on a Thinking Machines Corporation Connection Machine 5 (CM-5), a distributed memory computer with vector capability. In this paper, the single instruction multiple data (SIMD) strategy is employed through the use of the CM Fortran language and the CMSSL scientific library. The performance of the CMSSL mesh partitioner is evaluated and the overall efficiency of the parallel flow solver is discussed.
Implementation of a parallel unstructured Euler solver on shared and distributed memory architectures

NASA Technical Reports Server (NTRS)

Mavriplis, D. J.; Das, Raja; Saltz, Joel; Vermeland, R. E.

1992-01-01

An efficient three dimensional unstructured Euler solver is parallelized on a Cray Y-MP C90 shared memory computer and on an Intel Touchstone Delta distributed memory computer. This paper relates the experiences gained and describes the software tools and hardware used in this study. Performance comparisons between two differing architectures are made.
Numerical aspects and implementation of a two-layer zonal wall model for LES of compressible turbulent flows on unstructured meshes

NASA Astrophysics Data System (ADS)

Park, George Ilhwan; Moin, Parviz

2016-01-01

This paper focuses on numerical and practical aspects associated with a parallel implementation of a two-layer zonal wall model for large-eddy simulation (LES) of compressible wall-bounded turbulent flows on unstructured meshes. A zonal wall model based on the solution of unsteady three-dimensional Reynolds-averaged Navier-Stokes (RANS) equations on a separate near-wall grid is implemented in an unstructured, cell-centered finite-volume LES solver. The main challenge in its implementation is to couple two parallel, unstructured flow solvers for efficient boundary data communication and simultaneous time integrations. A coupling strategy with good load balancing and low processor underutilization is identified. Face mapping and interpolation procedures at the coupling interface are explained in detail. The method of manufactured solutions is used for verifying the correct implementation of solver coupling, and parallel performance of the combined wall-modeled LES (WMLES) solver is investigated. The method has successfully been applied to several attached and separated flows, including a transitional flow over a flat plate and a separated flow over an airfoil at an angle of attack.
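
The method of manufactured solutions mentioned in this abstract can be illustrated on a much simpler problem. The sketch below is not the paper's coupled LES/RANS verification; it assumes a 1D Poisson equation -u'' = f on (0, 1) with homogeneous Dirichlet boundaries, picks u(x) = sin(pi x) as the manufactured solution, derives f from it, and checks that a second-order central-difference solver recovers the expected order of accuracy. All function names are hypothetical.

```python
import math

def solve_poisson_1d(n, source):
    """Solve -u'' = source(x) on (0, 1), u(0) = u(1) = 0, with n interior points."""
    h = 1.0 / (n + 1)
    x = [(i + 1) * h for i in range(n)]
    rhs = [source(xi) * h * h for xi in x]
    # Thomas algorithm for the tridiagonal matrix with 2 on the diagonal and -1 off it.
    c = [0.0] * n  # modified upper diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = -0.5
    d[0] = rhs[0] / 2.0
    for i in range(1, n):
        denom = 2.0 + c[i - 1]
        c[i] = -1.0 / denom
        d[i] = (rhs[i] + d[i - 1]) / denom
    u = [0.0] * n
    u[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        u[i] = d[i] - c[i] * u[i + 1]
    return x, u

def manufactured_error(n):
    """Max-norm error against the manufactured solution u(x) = sin(pi x)."""
    exact = lambda x: math.sin(math.pi * x)
    source = lambda x: math.pi ** 2 * math.sin(math.pi * x)  # f = -u''
    x, u = solve_poisson_1d(n, source)
    return max(abs(ui - exact(xi)) for xi, ui in zip(x, u))

def observed_order(n_coarse=15, n_fine=31):
    """Observed convergence order when the mesh spacing is halved (1/16 to 1/32)."""
    return math.log2(manufactured_error(n_coarse) / manufactured_error(n_fine))

print(observed_order())
```

An observed order close to 2 indicates that the discretisation is implemented consistently with its design accuracy; a coding error would typically show up as a lower order or no convergence.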
Application of a Scalable, Parallel, Unstructured-Grid-Based Navier-Stokes Solver

NASA Technical Reports Server (NTRS)

Parikh, Paresh

2001-01-01

A parallel version of an unstructured-grid based Navier-Stokes solver, USM3Dns, previously developed for efficient operation on a variety of parallel computers, has been enhanced to incorporate upgrades made to the serial version. The resultant parallel code has been extensively tested on a variety of problems of aerospace interest and on two sets of parallel computers to understand and document its characteristics. An innovative grid renumbering construct and use of non-blocking communication are shown to produce superlinear computing performance. Preliminary results from parallelization of a recently introduced "porous surface" boundary condition are also presented.
Parallel performance investigations of an unstructured mesh Navier-Stokes solver

NASA Technical Reports Server (NTRS)

Mavriplis, Dimitri J.

2000-01-01

A Reynolds-averaged Navier-Stokes solver based on unstructured mesh techniques for analysis of high-lift configurations is described. The method makes use of an agglomeration multigrid solver for convergence acceleration. Implicit line-smoothing is employed to relieve the stiffness associated with highly stretched meshes. A GMRES technique is also implemented to speed convergence at the expense of additional memory usage. The solver is cache efficient and fully vectorizable, and is parallelized using a two-level hybrid MPI-OpenMP implementation suitable for shared and/or distributed memory architectures, as well as clusters of shared memory machines. Convergence and scalability results are illustrated for various high-lift cases.
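
The idea behind agglomeration multigrid, building coarse levels by fusing neighbouring fine-grid control volumes, can be sketched as follows. This is a deliberately simple greedy variant and is an assumption for illustration only; the seed ordering and merging rules of the solver described above are not given in the abstract. The `neighbours` adjacency list and the function name are hypothetical.

```python
def agglomerate(neighbours):
    """Greedily group fine cells into coarse cells.

    neighbours[i] lists the fine cells adjacent to fine cell i.
    Returns (coarse_of, n_coarse), where coarse_of[i] is the coarse cell of i.
    """
    coarse_of = [-1] * len(neighbours)
    n_coarse = 0
    for seed in range(len(neighbours)):
        if coarse_of[seed] != -1:
            continue  # already absorbed by an earlier seed
        coarse_of[seed] = n_coarse
        # Absorb every neighbour that has not been claimed yet.
        for nb in neighbours[seed]:
            if coarse_of[nb] == -1:
                coarse_of[nb] = n_coarse
        n_coarse += 1
    return coarse_of, n_coarse

# Six cells in a row: 0-1-2-3-4-5
chain = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4]]
print(agglomerate(chain))
```

Applying the same routine to the coarse-level adjacency would give the next level of the hierarchy.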
Three-Dimensional High-Lift Analysis Using a Parallel Unstructured Multigrid Solver

NASA Technical Reports Server (NTRS)

Mavriplis, Dimitri J.

1998-01-01

A directional implicit unstructured agglomeration multigrid solver is ported to shared and distributed memory massively parallel machines using the explicit domain-decomposition and message-passing approach. Because the algorithm operates on local implicit lines in the unstructured mesh, special care is required in partitioning the problem for parallel computing. A weighted partitioning strategy is described which avoids breaking the implicit lines across processor boundaries, while incurring minimal additional communication overhead. Good scalability is demonstrated on a 128 processor SGI Origin 2000 machine and on a 512 processor CRAY T3E machine for reasonably fine grids. The feasibility of performing large-scale unstructured grid calculations with the parallel multigrid algorithm is demonstrated by computing the flow over a partial-span flap wing high-lift geometry on a highly resolved grid of 13.5 million points in approximately 4 hours of wall clock time on the CRAY T3E.
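
One way to keep implicit lines intact during partitioning is to contract each line into a single weighted vertex before distributing the work. The sketch below assumes that approach and then uses a simple greedy load balancer in place of a real weighted graph partitioner, so it ignores communication cost entirely; it is not the weighted strategy of the paper. All names are hypothetical.

```python
def partition_preserving_lines(num_vertices, lines, num_parts):
    """Assign vertices to partitions without splitting any implicit line.

    lines is a list of vertex-index lists; vertices not on a line are free.
    Returns (part, load): part[v] is the partition of vertex v,
    load[p] is the number of vertices in partition p.
    """
    groups = [list(line) for line in lines]
    on_line = {v for line in lines for v in line}
    groups += [[v] for v in range(num_vertices) if v not in on_line]

    load = [0] * num_parts
    part = [None] * num_vertices
    # Heaviest group first, always into the currently lightest partition.
    for group in sorted(groups, key=len, reverse=True):
        p = load.index(min(load))
        for v in group:
            part[v] = p
        load[p] += len(group)
    return part, load

lines = [[0, 1, 2, 3], [4, 5, 6]]
part, load = partition_preserving_lines(10, lines, 2)
print(part, load)
```

Because a whole line moves as one unit, the line solves never need data from another processor, at the price of slightly coarser granularity for load balancing.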
EUPDF: An Eulerian-Based Monte Carlo Probability Density Function (PDF) Solver. User's Manual

NASA Technical Reports Server (NTRS)

Raju, M. S.

1998-01-01

EUPDF is an Eulerian-based Monte Carlo PDF solver developed for application with sprays, combustion, parallel computing and unstructured grids. It is designed to be massively parallel and could easily be coupled with any existing gas-phase flow and spray solvers. The solver accommodates the use of an unstructured mesh with mixed elements of either triangular, quadrilateral, and/or tetrahedral type. The manual provides the user with the coding required to couple the PDF code to any given flow code and a basic understanding of the EUPDF code structure as well as the models involved in the PDF formulation. The source code of EUPDF will be available with the release of the National Combustion Code (NCC) as a complete package.
The design and implementation of a parallel unstructured Euler solver using software primitives

NASA Technical Reports Server (NTRS)

Das, R.; Mavriplis, D. J.; Saltz, J.; Gupta, S.; Ponnusamy, R.

1992-01-01

This paper is concerned with the implementation of a three-dimensional unstructured grid Euler-solver on massively parallel distributed-memory computer architectures. The goal is to minimize solution time by achieving high computational rates with a numerically efficient algorithm. An unstructured multigrid algorithm with an edge-based data structure has been adopted, and a number of optimizations have been devised and implemented in order to accelerate the parallel communication rates. The implementation is carried out by creating a set of software tools, which provide an interface between the parallelization issues and the sequential code, while providing a basis for future automatic run-time compilation support. Large practical unstructured grid problems are solved on the Intel iPSC/860 hypercube and Intel Touchstone Delta machine. The quantitative effects of the various optimizations are demonstrated, and we show that the combined effect of these optimizations leads to roughly a factor of three performance improvement. The overall solution efficiency is compared with that obtained on the CRAY-YMP vector supercomputer.
LSPRAY-III: A Lagrangian Spray Module

NASA Technical Reports Server (NTRS)

Raju, M. S.

2008-01-01

LSPRAY-III is a Lagrangian spray solver developed for application with parallel computing and unstructured grids. It is designed to be massively parallel and could easily be coupled with any existing gas-phase flow and/or Monte Carlo Probability Density Function (PDF) solvers. The solver accommodates the use of an unstructured mesh with mixed elements of either triangular, quadrilateral, and/or tetrahedral type for the gas flow grid representation. It is mainly designed to predict the flow, thermal and transport properties of a rapidly vaporizing spray because of its importance in aerospace application. The manual provides the user with an understanding of various models involved in the spray formulation, its code structure and solution algorithm, and various other issues related to parallelization and its coupling with other solvers. With the development of LSPRAY-III, we have advanced the state-of-the-art in spray computations in several important ways.
LSPRAY-II: A Lagrangian Spray Module

NASA Technical Reports Server (NTRS)

Raju, M. S.

2004-01-01

LSPRAY-II is a Lagrangian spray solver developed for application with parallel computing and unstructured grids. It is designed to be massively parallel and could easily be coupled with any existing gas-phase flow and/or Monte Carlo Probability Density Function (PDF) solvers. The solver accommodates the use of an unstructured mesh with mixed elements of either triangular, quadrilateral, and/or tetrahedral type for the gas flow grid representation. It is mainly designed to predict the flow, thermal and transport properties of a rapidly vaporizing spray because of its importance in aerospace application. The manual provides the user with an understanding of various models involved in the spray formulation, its code structure and solution algorithm, and various other issues related to parallelization and its coupling with other solvers. With the development of LSPRAY-II, we have advanced the state-of-the-art in spray computations in several important ways.
EUPDF-II: An Eulerian Joint Scalar Monte Carlo PDF Module: User's Manual

NASA Technical Reports Server (NTRS)

Raju, M. S.; Liu, Nan-Suey (Technical Monitor)

2004-01-01

EUPDF-II provides the solution for the species and temperature fields based on an evolution equation for PDF (Probability Density Function) and it is developed mainly for application with sprays, combustion, parallel computing, and unstructured grids. It is designed to be massively parallel and could easily be coupled with any existing gas-phase CFD and spray solvers. The solver accommodates the use of an unstructured mesh with mixed elements of either triangular, quadrilateral, and/or tetrahedral type. The manual provides the user with an understanding of the various models involved in the PDF formulation, its code structure and solution algorithm, and various other issues related to parallelization and its coupling with other solvers. The source code of EUPDF-II will be available with National Combustion Code (NCC) as a complete package.
Implicit schemes and parallel computing in unstructured grid CFD

NASA Technical Reports Server (NTRS)

Venkatakrishnan, V.

1995-01-01

The development of implicit schemes for obtaining steady state solutions to the Euler and Navier-Stokes equations on unstructured grids is outlined. Applications are presented that compare the convergence characteristics of various implicit methods. Next, the development of explicit and implicit schemes to compute unsteady flows on unstructured grids is discussed. Next, the issues involved in parallelizing finite volume schemes on unstructured meshes in an MIMD (multiple instruction/multiple data stream) fashion are outlined. Techniques for partitioning unstructured grids among processors and for extracting parallelism in explicit and implicit solvers are discussed. Finally, some dynamic load balancing ideas, which are useful in adaptive transient computations, are presented.
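
The abstract does not say which partitioning techniques are covered. As one commonly used geometric example, chosen here as an assumption, recursive coordinate bisection splits the point set along its longest coordinate direction and recurses on each half. The sketch assumes there are at least as many points as partitions; the function name is hypothetical.

```python
def recursive_coordinate_bisection(points, num_parts):
    """Partition points (tuples of coordinates) into num_parts groups.

    Returns part, where part[i] is the partition index of points[i].
    """
    part = [0] * len(points)
    dims = len(points[0])

    def split(ids, first, count):
        if count == 1:
            for i in ids:
                part[i] = first
            return
        # Cut across the axis with the largest extent.
        axis = max(
            range(dims),
            key=lambda d: max(points[i][d] for i in ids)
            - min(points[i][d] for i in ids),
        )
        ids = sorted(ids, key=lambda i: points[i][axis])
        left_count = count // 2
        cut = len(ids) * left_count // count  # proportional share of points
        split(ids[:cut], first, left_count)
        split(ids[cut:], first + left_count, count - left_count)

    split(list(range(len(points))), 0, num_parts)
    return part

grid = [(x, y) for x in range(4) for y in range(2)]
print(recursive_coordinate_bisection(grid, 4))
```

The method needs only coordinates, not connectivity, which makes it cheap to rerun when the load changes, although it does not directly minimise the number of cut edges.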
Parallel Element Agglomeration Algebraic Multigrid and Upscaling Library

DOE Office of Scientific and Technical Information (OSTI.GOV)

Barker, Andrew T.; Benson, Thomas R.; Lee, Chak Shing

ParELAG is a parallel C++ library for numerical upscaling of finite element discretizations and element-based algebraic multigrid solvers. It provides optimal complexity algorithms to build multilevel hierarchies and solvers that can be used for solving a wide class of partial differential equations (elliptic, hyperbolic, saddle point problems) on general unstructured meshes. Additionally, a novel multilevel solver for saddle point problems with divergence constraint is implemented.
Progress Toward Overset-Grid Moving Body Capability for USM3D Unstructured Flow Solver

NASA Technical Reports Server (NTRS)

Pandya, Mohagna J.; Frink, Neal T.; Noack, Ralph W.

2005-01-01

A static and dynamic Chimera overset-grid capability is added to an established NASA tetrahedral unstructured parallel Navier-Stokes flow solver, USM3D. Modifications to the solver primarily consist of a few strategic calls to the Donor interpolation Receptor Transaction library (DiRTlib) to facilitate communication of solution information between various grids. The assembly of multiple overlapping grids into a single-zone composite grid is performed by the Structured, Unstructured and Generalized Grid AssembleR (SUGGAR) code. Several test cases are presented to verify the implementation, assess overset-grid solution accuracy and convergence relative to single-grid solutions, and demonstrate the prescribed relative grid motion capability.
A perspective on unstructured grid flow solvers

NASA Technical Reports Server (NTRS)

Venkatakrishnan, V.

1995-01-01

This survey paper assesses the status of compressible Euler and Navier-Stokes solvers on unstructured grids. Different spatial and temporal discretization options for steady and unsteady flows are discussed. The integration of these components into an overall framework to solve practical problems is addressed. Issues such as grid adaptation, higher order methods, hybrid discretizations and parallel computing are briefly discussed. Finally, some outstanding issues and future research directions are presented.
A matrix-free implicit unstructured multigrid finite volume method for simulating structural dynamics and fluid structure interaction

NASA Astrophysics Data System (ADS)

Lv, X.; Zhao, Y.; Huang, X. Y.; Xia, G. H.; Su, X. H.

2007-07-01

A new three-dimensional (3D) matrix-free implicit unstructured multigrid finite volume (FV) solver for structural dynamics is presented in this paper. The solver is first validated using classical 2D and 3D cantilever problems. It is shown that very accurate predictions of the fundamental natural frequencies of the problems can be obtained by the solver with fast convergence rates. This method has been integrated into our existing FV compressible solver [X. Lv, Y. Zhao, et al., An efficient parallel/unstructured-multigrid preconditioned implicit method for simulating 3d unsteady compressible flows with moving objects, Journal of Computational Physics 215(2) (2006) 661-690] based on the immersed membrane method (IMM) [X. Lv, Y. Zhao, et al., as mentioned above]. Results for the interaction between the fluid and an immersed fixed-free cantilever are also presented to demonstrate the potential of this integrated fluid-structure interaction approach.
LSPRAY-IV: A Lagrangian Spray Module

NASA Technical Reports Server (NTRS)

Raju, M. S.

2012-01-01

LSPRAY-IV is a Lagrangian spray solver developed for application with parallel computing and unstructured grids. It is designed to be massively parallel and could easily be coupled with any existing gas-phase flow and/or Monte Carlo Probability Density Function (PDF) solvers. The solver accommodates the use of an unstructured mesh with mixed elements of either triangular, quadrilateral, and/or tetrahedral type for the gas flow grid representation. It is mainly designed to predict the flow, thermal and transport properties of a rapidly vaporizing spray. Some important research areas covered as a part of the code development are: (1) the extension of combined CFD/scalar-Monte-Carlo-PDF method to spray modeling, (2) the multi-component liquid spray modeling, and (3) the assessment of various atomization models used in spray calculations. The current version contains the extension to the modeling of superheated sprays. The manual provides the user with an understanding of various models involved in the spray formulation, its code structure and solution algorithm, and various other issues related to parallelization and its coupling with other solvers.
Parallel SOR methods with a parabolic-diffusion acceleration technique for solving an unstructured-grid Poisson equation on 3D arbitrary geometries

NASA Astrophysics Data System (ADS)

Zapata, M. A. Uh; Van Bang, D. Pham; Nguyen, K. D.

2016-05-01

This paper presents a parallel algorithm for the finite-volume discretisation of the Poisson equation on three-dimensional arbitrary geometries. The proposed method is formulated by using a 2D horizontal block domain decomposition and interprocessor data communication techniques with the Message Passing Interface (MPI). The horizontal unstructured-grid cells are reordered according to the neighbouring relations and decomposed into blocks using a load-balanced distribution to give all processors an equal number of elements. In this algorithm, two parallel successive over-relaxation (SOR) methods are presented: a multi-colour ordering technique for unstructured grids based on distributed memory and a block method using a reordering index following similar ideas to the partitioning for structured grids. In all cases, the parallel algorithms are combined with an accelerating iterative solver. This solver is based on a parabolic-diffusion equation introduced to obtain faster solutions of the linear systems arising from the discretisation. Numerical results are given to evaluate the performance of the methods, showing speedups better than linear.
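
A multi-colour ordering for SOR can be sketched as below. The example assumes a greedy colouring of the cell adjacency graph and a toy linear system whose off-diagonal entries are all -1; neither is taken from the paper, and the parabolic-diffusion acceleration is not included. Cells that share a colour are never neighbours, so their updates are independent and could be executed concurrently. All names are hypothetical.

```python
def greedy_colouring(neighbours):
    """Give each cell the smallest colour not used by its already coloured neighbours."""
    colour = [-1] * len(neighbours)
    for cell in range(len(neighbours)):
        used = {colour[nb] for nb in neighbours[cell] if colour[nb] >= 0}
        c = 0
        while c in used:
            c += 1
        colour[cell] = c
    return colour

def multicolour_sor(neighbours, diag, rhs, omega=1.2, sweeps=200):
    """Solve diag[i]*x[i] - sum(x[j] for j in neighbours[i]) = rhs[i] by coloured SOR."""
    colour = greedy_colouring(neighbours)
    classes = [[] for _ in range(max(colour) + 1)]
    for cell, c in enumerate(colour):
        classes[c].append(cell)
    x = [0.0] * len(rhs)
    for _ in range(sweeps):
        for cells in classes:
            # Updates within one colour class do not depend on each other.
            for i in cells:
                gauss_seidel = (rhs[i] + sum(x[j] for j in neighbours[i])) / diag[i]
                x[i] += omega * (gauss_seidel - x[i])
    return x, colour

# Small unstructured adjacency; diagonal = degree + 1 keeps the matrix diagonally dominant.
mesh = [[1, 2], [0, 2, 3], [0, 1, 4], [1, 4, 5], [2, 3, 5], [3, 4]]
diag = [len(nb) + 1 for nb in mesh]
x, colour = multicolour_sor(mesh, diag, [1.0] * len(mesh))
print(colour, x)
```

For this toy system the exact solution is a vector of ones, which gives a quick check that the coloured sweep converges to the same answer as a sequential one.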
Array-based, parallel hierarchical mesh refinement algorithms for unstructured meshes

DOE PAGES

Ray, Navamita; Grindeanu, Iulian; Zhao, Xinglin; ...

2016-08-18

In this paper, we describe an array-based hierarchical mesh refinement capability through uniform refinement of unstructured meshes for efficient solution of PDEs using finite element methods and multigrid solvers. A multi-degree, multi-dimensional and multi-level framework is designed to generate the nested hierarchies from an initial coarse mesh that can be used for a variety of purposes such as in multigrid solvers/preconditioners, to do solution convergence and verification studies and to improve overall parallel efficiency by decreasing I/O bandwidth requirements (by loading smaller meshes and in memory refinement). We also describe a high-order boundary reconstruction capability that can be used to project the new points after refinement using high-order approximations instead of linear projection in order to minimize and provide more control on geometrical errors introduced by curved boundaries. The capability is developed under the parallel unstructured mesh framework "Mesh Oriented dAtaBase" (MOAB; Tautges et al. (2004)). We describe the underlying data structures and algorithms to generate such hierarchies in parallel and present numerical results for computational efficiency and effect on mesh quality. Furthermore, we also present results to demonstrate the applicability of the developed capability to study convergence properties of different point projection schemes for various mesh hierarchies and to a multigrid finite-element solver for elliptic problems.
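
Uniform refinement of an unstructured mesh can be illustrated for the simplest case of linear triangles in 2D, where each triangle is split into four by inserting edge midpoints. The sketch below is a serial illustration under that assumption and does not use the MOAB API or its data structures; the function name is hypothetical. A dictionary keyed by edge ensures that a midpoint shared by two triangles is created only once, which keeps the refined mesh conforming.

```python
def refine_triangles(vertices, triangles):
    """Split every triangle into four using edge midpoints.

    vertices: list of (x, y); triangles: list of (a, b, c) vertex indices.
    Returns (new_vertices, new_triangles).
    """
    vertices = list(vertices)
    midpoint_id = {}  # (low index, high index) -> index of the midpoint vertex

    def midpoint(a, b):
        key = (min(a, b), max(a, b))
        if key not in midpoint_id:
            (xa, ya), (xb, yb) = vertices[a], vertices[b]
            midpoint_id[key] = len(vertices)
            vertices.append(((xa + xb) / 2, (ya + yb) / 2))
        return midpoint_id[key]

    refined = []
    for a, b, c in triangles:
        ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
        # Three corner triangles plus the central one, same orientation as the parent.
        refined += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
    return vertices, refined

square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
verts, tris = refine_triangles(square, [(0, 1, 2), (0, 2, 3)])
print(len(verts), len(tris))
```

Calling the function repeatedly on its own output produces a nested hierarchy of meshes of the kind a multigrid solver can use as levels.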
LSPRAY: Lagrangian Spray Solver for Applications With Parallel Computing and Unstructured Gas-Phase Flow Solvers

NASA Technical Reports Server (NTRS)

Raju, Manthena S.

1998-01-01

Sprays occur in a wide variety of industrial and power applications and in the processing of materials. A liquid spray is a two-phase flow with a gas as the continuous phase and a liquid as the dispersed phase (in the form of droplets or ligaments). Interactions between the two phases, which are coupled through exchanges of mass, momentum, and energy, can occur in different ways at different times and locations involving various thermal, mass, and fluid dynamic factors. An understanding of the flow, combustion, and thermal properties of a rapidly vaporizing spray requires careful modeling of the rate-controlling processes associated with the spray's turbulent transport, mixing, chemical kinetics, evaporation, and spreading rates, as well as other phenomena. In an attempt to advance the state-of-the-art in multidimensional numerical methods, we at the NASA Lewis Research Center extended our previous work on sprays to unstructured grids and parallel computing. LSPRAY, which was developed by M.S. Raju of Nyma, Inc., is designed to be massively parallel and could easily be coupled with any existing gas-phase flow and/or Monte Carlo probability density function (PDF) solver. The LSPRAY solver accommodates the use of an unstructured mesh with mixed triangular, quadrilateral, and/or tetrahedral elements in the gas-phase solvers. It is used specifically for fuel sprays within gas turbine combustors, but it has many other uses. The spray model used in LSPRAY provided favorable results when applied to stratified-charge rotary combustion (Wankel) engines and several other confined and unconfined spray flames. The source code will be available with the National Combustion Code (NCC) as a complete package.

Parallel Solver for H(div) Problems Using Hybridization and AMG

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lee, Chak S.; Vassilevski, Panayot S.

2016-01-15

In this paper, a scalable parallel solver is proposed for H(div) problems discretized by arbitrary order finite elements on general unstructured meshes. The solver is based on hybridization and algebraic multigrid (AMG). Unlike some previously studied H(div) solvers, the hybridization solver does not require discrete curl and gradient operators as additional input from the user. Instead, only some element information is needed in the construction of the solver. The hybridization results in an H1-equivalent symmetric positive definite system, which is then rescaled and solved by AMG solvers designed for H1 problems. Weak and strong scaling of the method are examined through several numerical tests. Our numerical results show that the proposed solver provides a promising alternative to ADS, a state-of-the-art solver [12], for H(div) problems. In fact, it outperforms ADS for higher order elements.
OpenACC acceleration of an unstructured CFD solver based on a reconstructed discontinuous Galerkin method for compressible flows

DOE PAGES

Xia, Yidong; Lou, Jialin; Luo, Hong; ...

2015-02-09

Here, an OpenACC directive-based graphics processing unit (GPU) parallel scheme is presented for solving the compressible Navier–Stokes equations on 3D hybrid unstructured grids with a third-order reconstructed discontinuous Galerkin method. The developed scheme requires the minimum code intrusion and algorithm alteration for upgrading a legacy solver with the GPU computing capability at very little extra effort in programming, which leads to a unified and portable code development strategy. A face coloring algorithm is adopted to eliminate the memory contention because of the threading of internal and boundary face integrals. A number of flow problems are presented to verify the implementation of the developed scheme. Timing measurements were obtained by running the resulting GPU code on one Nvidia Tesla K20c GPU card (Nvidia Corporation, Santa Clara, CA, USA) and compared with those obtained by running the equivalent Message Passing Interface (MPI) parallel CPU code on a compute node (consisting of two AMD Opteron 6128 eight-core CPUs (Advanced Micro Devices, Inc., Sunnyvale, CA, USA)). Speedup factors of up to 24× and 1.6× for the GPU code were achieved with respect to one and 16 CPU cores, respectively. The numerical results indicate that this OpenACC-based parallel scheme is an effective and extensible approach to port unstructured high-order CFD solvers to GPU computing.
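
The purpose of face colouring is to group faces so that no two faces in a group write to the same cell; each group can then be processed by many threads without conflicting memory updates. The sketch below assumes a simple greedy assignment and a hypothetical `face_cells` connectivity array (boundary faces have `None` as their second cell). The abstract does not describe the paper's colouring algorithm, so this is an illustration only.

```python
def colour_faces(face_cells):
    """Colour faces so that faces of equal colour never touch the same cell.

    face_cells[f] = (left_cell, right_cell); right_cell is None on a boundary.
    Returns a list with one colour index per face.
    """
    colours_at_cell = {}  # cell -> colours already taken by its faces
    face_colour = []
    for cells in face_cells:
        touched = [c for c in cells if c is not None]
        used = set()
        for c in touched:
            used |= colours_at_cell.get(c, set())
        colour = 0
        while colour in used:
            colour += 1
        face_colour.append(colour)
        for c in touched:
            colours_at_cell.setdefault(c, set()).add(colour)
    return face_colour

# Three cells sharing interior faces pairwise, plus one boundary face per cell.
faces = [(0, 1), (1, 2), (2, 0), (0, None), (1, None), (2, None)]
print(colour_faces(faces))
```

The face loop is then run one colour at a time, with all faces of a colour executed in parallel; fewer colours means fewer synchronisation points.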
Large-scale Parallel Unstructured Mesh Computations for 3D High-lift Analysis

NASA Technical Reports Server (NTRS)

Mavriplis, Dimitri J.; Pirzadeh, S.

1999-01-01

A complete "geometry to drag-polar" analysis capability for the three-dimensional high-lift configurations is described. The approach is based on the use of unstructured meshes in order to enable rapid turnaround for complicated geometries that arise in high-lift configurations. Special attention is devoted to creating a capability for enabling analyses on highly resolved grids. Unstructured meshes of several million vertices are initially generated on a work-station, and subsequently refined on a supercomputer. The flow is solved on these refined meshes on large parallel computers using an unstructured agglomeration multigrid algorithm. Good prediction of lift and drag throughout the range of incidences is demonstrated on a transport take-off configuration using up to 24.7 million grid points. The feasibility of using this approach in a production environment on existing parallel machines is demonstrated, as well as the scalability of the solver on machines using up to 1450 processors.
National Combustion Code: Parallel Implementation and Performance

NASA Technical Reports Server (NTRS)

Quealy, A.; Ryder, R.; Norris, A.; Liu, N.-S.

2000-01-01

The National Combustion Code (NCC) is being developed by an industry-government team for the design and analysis of combustion systems. CORSAIR-CCD is the current baseline reacting flow solver for NCC. This is a parallel, unstructured grid code which uses a distributed memory, message passing model for its parallel implementation. The focus of the present effort has been to improve the performance of the NCC flow solver to meet combustor designer requirements for model accuracy and analysis turnaround time. Improving the performance of this code contributes significantly to the overall reduction in time and cost of the combustor design cycle. This paper describes the parallel implementation of the NCC flow solver and summarizes its current parallel performance on an SGI Origin 2000. Earlier parallel performance results on an IBM SP-2 are also included. The performance improvements which have enabled a turnaround of less than 15 hours for a 1.3 million element fully reacting combustion simulation are described.
GPU Implementation of a Viscous Flow Solver on Unstructured Grids

NASA Astrophysics Data System (ADS)

Xu, Tianhao; Chen, Long

2016-06-01

Graphics processing units (GPUs) have gained popularity in scientific computing over the past several years due to their outstanding parallel computing capability. Computational fluid dynamics applications involve large amounts of calculation, so a recent GPU card, whose peak computing performance and memory bandwidth are much better than those of a contemporary high-end CPU, is preferable. We herein focus on the detailed implementation of our GPU-targeting Reynolds-averaged Navier-Stokes equations solver based on the finite-volume method. The solver employs a vertex-centered scheme on unstructured grids in order to handle complex topologies. Multiple optimizations are carried out to improve memory access performance and kernel utilization. Both steady and unsteady flow simulation cases are carried out using an explicit Runge-Kutta scheme. The GPU-accelerated solver in this paper is demonstrated to have competitive advantages over the CPU-targeting one.
Transonic Drag Prediction Using an Unstructured Multigrid Solver

NASA Technical Reports Server (NTRS)

Mavriplis, D. J.; Levy, David W.

2001-01-01

This paper summarizes the results obtained with the NSU-3D unstructured multigrid solver for the AIAA Drag Prediction Workshop held in Anaheim, CA, June 2001. The test case for the workshop consists of a wing-body configuration at transonic flow conditions. Flow analyses for a complete test matrix of lift coefficient values and Mach numbers at a constant Reynolds number are performed, thus producing a set of drag polars and drag rise curves which are compared with experimental data. Results were obtained independently by both authors using an identical baseline grid and different refined grids. Most cases were run in parallel on commodity cluster-type machines while the largest cases were run on an SGI Origin machine using 128 processors. The objective of this paper is to study the accuracy of the subject unstructured grid solver for predicting drag in the transonic cruise regime, to assess the efficiency of the method in terms of convergence, cpu time, and memory, and to determine the effects of grid resolution on this predictive ability and its computational efficiency. A good predictive ability is demonstrated over a wide range of conditions, although accuracy was found to degrade for cases at higher Mach numbers and lift values where increasing amounts of flow separation occur. The ability to rapidly compute large numbers of cases at varying flow conditions using an unstructured solver on inexpensive clusters of commodity computers is also demonstrated.
SediFoam: A general-purpose, open-source CFD-DEM solver for particle-laden flow with emphasis on sediment transport

NASA Astrophysics Data System (ADS)

Sun, Rui; Xiao, Heng

2016-04-01

With the growth of available computational resources, CFD-DEM (computational fluid dynamics-discrete element method) is becoming an increasingly promising and feasible approach for the study of sediment transport. Several existing CFD-DEM solvers are applied in chemical engineering and the mining industry. However, a robust CFD-DEM solver for the simulation of sediment transport is still desirable. In this work, the development of a three-dimensional, massively parallel, and open-source CFD-DEM solver SediFoam is detailed. This solver is built based on open-source solvers OpenFOAM and LAMMPS. OpenFOAM is a CFD toolbox that can perform three-dimensional fluid flow simulations on unstructured meshes; LAMMPS is a massively parallel DEM solver for molecular dynamics. Several validation tests of SediFoam are performed using cases of a wide range of complexities. The results obtained in the present simulations are consistent with those in the literature, which demonstrates the capability of SediFoam for sediment transport applications. In addition to the validation test, the parallel efficiency of SediFoam is studied to test the performance of the code for large-scale and complex simulations. The parallel efficiency tests show that the scalability of SediFoam is satisfactory in the simulations using up to O(10^7) particles.
Execution of a parallel edge-based Navier-Stokes solver on commodity graphics processor units

NASA Astrophysics Data System (ADS)

Corral, Roque; Gisbert, Fernando; Pueblas, Jesus

2017-02-01

The implementation of an edge-based three-dimensional Reynolds-averaged Navier-Stokes solver for unstructured grids able to run on multiple graphics processing units (GPUs) is presented. Loops over edges, which are the most time-consuming part of the solver, have been written to exploit the massively parallel capabilities of GPUs. Non-blocking communications between parallel processes and between the GPU and the central processing unit (CPU) have been used to enhance code scalability. The code is written using a mixture of C++ and OpenCL, to allow the execution of the source code on GPUs. The Message Passing Interface (MPI) library is used to allow the parallel execution of the solver on multiple GPUs. A comparative study of the solver's parallel performance is carried out using a cluster of CPUs and another of GPUs. It is shown that a single GPU is up to 64 times faster than a single CPU core. The parallel scalability of the solver is mainly degraded due to the loss of computing efficiency of the GPU when the size of the case decreases. However, for large enough grid sizes, the scalability is strongly improved. A cluster featuring commodity GPUs and a high bandwidth network is ten times less costly and consumes 33% less energy than a CPU-based cluster with an equivalent computational power.
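
The structure of an edge-based loop can be sketched in a few lines. The example below assumes a scalar unknown and a simple diffusive flux, not the RANS fluxes of the solver described above, and it is written as a serial Python loop rather than an OpenCL kernel. Each edge computes one flux and scatters it with opposite signs to its two end nodes, which is the access pattern that has to be made conflict-free on a GPU. All names are hypothetical.

```python
def edge_residual(edges, weights, u):
    """Accumulate nodal residuals from a single pass over the edges.

    edges: list of (i, j) node pairs; weights: one coefficient per edge;
    u: nodal values. Returns the residual at every node.
    """
    residual = [0.0] * len(u)
    for (i, j), w in zip(edges, weights):
        flux = w * (u[j] - u[i])  # one flux evaluation per edge
        residual[i] += flux       # what flows into node i ...
        residual[j] -= flux       # ... flows out of node j
    return residual

edges = [(0, 1), (1, 2), (2, 0)]
res = edge_residual(edges, [1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
print(res)  # [10.0, 3.0, -13.0]
```

Because every flux is added once and subtracted once, the residuals sum to zero, which is the discrete conservation property that makes the edge-based form attractive.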
LSPRAY-V: A Lagrangian Spray Module

NASA Technical Reports Server (NTRS)

Raju, M. S.

2015-01-01

LSPRAY-V is a Lagrangian spray solver developed for application with unstructured grids and massively parallel computers. It is mainly designed to predict the flow, thermal and transport properties of a rapidly vaporizing spray encountered over a wide range of operating conditions in modern aircraft engine development. It could easily be coupled with any existing gas-phase flow and/or Monte Carlo Probability Density Function (PDF) solvers. The manual provides the user with an understanding of various models involved in the spray formulation, its code structure and solution algorithm, and various other issues related to parallelization and its coupling with other solvers. With the development of LSPRAY-V, we have advanced the state-of-the-art in spray computations in several important ways.
Detailed Aerodynamic Analysis of a Shrouded Tail Rotor Using an Unstructured Mesh Flow Solver

NASA Astrophysics Data System (ADS)

Lee, Hee Dong; Kwon, Oh Joon

The detailed aerodynamics of a shrouded tail rotor in hover has been numerically studied using a parallel inviscid flow solver on unstructured meshes. The numerical method is based on a cell-centered finite-volume discretization and an implicit Gauss-Seidel time integration. The calculation was made for a single blade by imposing a periodic boundary condition between adjacent rotor blades. The grid periodicity was also imposed at the periodic boundary planes to avoid numerical inaccuracy resulting from solution interpolation. The results were compared with available experimental data and those from a disk vortex theory for validation. It was found that realistic three-dimensional modeling is important for the prediction of detailed aerodynamics of shrouded rotors including the tip clearance gap flow.
Improvements to the Unstructured Mesh Generator MESH3D

NASA Technical Reports Server (NTRS)

Thomas, Scott D.; Baker, Timothy J.; Cliff, Susan E.

1999-01-01

The AIRPLANE process starts with an aircraft geometry stored in a CAD system. The surface is modeled with a mesh of triangles and then the flow solver produces pressures at surface points which may be integrated to find forces and moments. The biggest advantage is that the grid generation bottleneck of the CFD process is eliminated when an unstructured tetrahedral mesh is used. MESH3D is the key to turning around the first analysis of a CAD geometry in days instead of weeks. The flow solver part of AIRPLANE has proven to be robust and accurate over a decade of use at NASA. It has been extensively validated with experimental data and compares well with other Euler flow solvers. AIRPLANE has been applied to all the HSR geometries treated at Ames over the course of the HSR program in order to verify the accuracy of other flow solvers. The unstructured approach makes handling complete and complex geometries very simple because only the surface of the aircraft needs to be discretized, i.e. covered with triangles. The volume mesh is created automatically by MESH3D. AIRPLANE runs well on multiple platforms. Vectorization on the Cray Y-MP is reasonable for a code that uses indirect addressing. Massively parallel computers such as the IBM SP2, SGI Origin 2000, and the Cray T3E have been used with an MPI version of the flow solver and the code scales very well on these systems. AIRPLANE can run on a desktop computer as well. AIRPLANE has a future. The unstructured technologies developed as part of the HSR program are now targeting high Reynolds number viscous flow simulation. The pacing item in this effort is Navier-Stokes mesh generation.
Parallelization of Unsteady Adaptive Mesh Refinement for Unstructured Navier-Stokes Solvers

NASA Technical Reports Server (NTRS)

Schwing, Alan M.; Nompelis, Ioannis; Candler, Graham V.

2014-01-01

This paper explores the implementation of the MPI parallelization in a Navier-Stokes solver using adaptive mesh refinement. Viscous and inviscid test problems are considered for the purpose of benchmarking, as are implicit and explicit time advancement methods. The main test problem for comparison includes effects from boundary layers and other viscous features and requires a large number of grid points for accurate computation. Experimental validation against double cone experiments in hypersonic flow is shown. The adaptive mesh refinement shows promise for a staple test problem in the hypersonic community. Extension to more advanced techniques for more complicated flows is described.
Adaptation of a Multi-Block Structured Solver for Effective Use in a Hybrid CPU/GPU Massively Parallel Environment

NASA Astrophysics Data System (ADS)

Gutzwiller, David; Gontier, Mathieu; Demeulenaere, Alain

2014-11-01

Multi-block structured solvers hold many advantages over their unstructured counterparts, such as a smaller memory footprint and efficient serial performance. Historically, multi-block structured solvers have not been easily adapted for use in a High Performance Computing (HPC) environment, and the recent trend towards hybrid GPU/CPU architectures has further complicated the situation. This paper will elaborate on developments and innovations applied to the NUMECA FINE/Turbo solver that have allowed near-linear scalability with real-world problems on over 250 hybrid GPU/CPU cluster nodes. Discussion will focus on the implementation of virtual partitioning and load balancing algorithms using a novel meta-block concept. This implementation is transparent to the user, allowing all pre- and post-processing steps to be performed using a simple, unpartitioned grid topology. Additional discussion will elaborate on developments that have improved parallel performance, including fully parallel I/O with the ADIOS API and the GPU porting of the computationally heavy CPUBooster convergence acceleration module. Head of HPC and Release Management, Numeca International.
A software platform for continuum modeling of ion channels based on unstructured mesh

NASA Astrophysics Data System (ADS)

Tu, B.; Bai, S. Y.; Chen, M. X.; Xie, Y.; Zhang, L. B.; Lu, B. Z.

2014-01-01

Most traditional continuum molecular modeling adopted finite difference or finite volume methods which were based on a structured mesh (grid). Unstructured meshes were only occasionally used, but an increasing number of applications are emerging in molecular simulations. To facilitate the continuum modeling of biomolecular systems based on unstructured meshes, we are developing a software platform with tools which are particularly beneficial to those approaches. This work describes the software system specifically for the simulation of a typical, complex molecular procedure: ion transport through a three-dimensional channel system that consists of a protein and a membrane. The platform contains three parts: a meshing tool chain for ion channel systems, a parallel finite element solver for the Poisson-Nernst-Planck equations describing the electrodiffusion process of ion transport, and a visualization program for continuum molecular modeling. The meshing tool chain in the platform, which consists of a set of mesh generation tools, is able to generate high-quality surface and volume meshes for ion channel systems. The parallel finite element solver in our platform is based on the parallel adaptive finite element package PHG, which was developed by one of the authors [1]. As a featured component of the platform, a new visualization program, VCMM, has specifically been developed for continuum molecular modeling with an emphasis on providing useful facilities for unstructured mesh-based methods and for their output analysis and visualization. VCMM provides a graphic user interface and consists of three modules: a molecular module, a meshing module and a numerical module. A demonstration of the platform is provided with a study of two real proteins, the connexin 26 and hemolysin ion channels.
Wakefield Simulation of CLIC PETS Structure Using Parallel 3D Finite Element Time-Domain Solver T3P

DOE Office of Scientific and Technical Information (OSTI.GOV)

Candel, A.; Kabel, A.; Lee, L.

In recent years, SLAC's Advanced Computations Department (ACD) has developed the parallel 3D Finite Element electromagnetic time-domain code T3P. Higher-order Finite Element methods on conformal unstructured meshes and massively parallel processing allow unprecedented simulation accuracy for wakefield computations and simulations of transient effects in realistic accelerator structures. Applications include simulation of wakefield damping in the Compact Linear Collider (CLIC) power extraction and transfer structure (PETS).
Parallel Cartesian grid refinement for 3D complex flow simulations

NASA Astrophysics Data System (ADS)

Angelidis, Dionysios; Sotiropoulos, Fotis

2013-11-01

A second order accurate method for discretizing the Navier-Stokes equations on 3D unstructured Cartesian grids is presented. Although the grid generator is based on the oct-tree hierarchical method, a fully unstructured data structure is adopted, enabling robust calculations for incompressible flows and avoiding both the need for synchronization of the solution between different levels of refinement and the use of prolongation/restriction operators. The current solver implements a hybrid staggered/non-staggered grid layout, employing the implicit fractional step method to satisfy the continuity equation. The pressure-Poisson equation is discretized by using a novel second order fully implicit scheme for unstructured Cartesian grids and solved using an efficient Krylov subspace solver. The momentum equations are also discretized with second order accuracy and the high performance Newton-Krylov method is used for integrating them in time. Neumann and Dirichlet conditions are used to validate the Poisson solver against analytical functions, and grid refinement results in a significant reduction of the solution error. The effectiveness of the fractional step method results in the stability of the overall algorithm and enables the performance of accurate multi-resolution real life simulations. This material is based upon work supported by the Department of Energy under Award Number DE-EE0005482.
Use of general purpose graphics processing units with MODFLOW

USGS Publications Warehouse

Hughes, Joseph D.; White, Jeremy T.

2013-01-01

To evaluate the use of general-purpose graphics processing units (GPGPUs) to improve the performance of MODFLOW, an unstructured preconditioned conjugate gradient (UPCG) solver has been developed. The UPCG solver uses a compressed sparse row storage scheme and includes Jacobi, zero fill-in incomplete, and modified-incomplete lower-upper (LU) factorization, and generalized least-squares polynomial preconditioners. The UPCG solver also includes options for sequential and parallel solution on the central processing unit (CPU) using OpenMP. For simulations utilizing the GPGPU, all basic linear algebra operations are performed on the GPGPU; memory copies between the CPU and GPGPU occur prior to the first iteration of the UPCG solver and after satisfying head and flow criteria or exceeding a maximum number of iterations. The efficiency of the UPCG solver for GPGPU and CPU solutions is benchmarked using simulations of a synthetic, heterogeneous unconfined aquifer with tens of thousands to millions of active grid cells. Testing indicates GPGPU speedups on the order of 2 to 8, relative to the standard MODFLOW preconditioned conjugate gradient (PCG) solver, can be achieved when (1) memory copies between the CPU and GPGPU are optimized, (2) the percentage of time performing memory copies between the CPU and GPGPU is small relative to the calculation time, (3) high-performance GPGPU cards are utilized, and (4) CPU-GPGPU combinations are used to execute sequential operations that are difficult to parallelize. Furthermore, UPCG solver testing indicates GPGPU speedups exceed parallel CPU speedups achieved using OpenMP on multicore CPUs for preconditioners that can be easily parallelized.
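As an illustration of the ingredients named above (compressed sparse row storage, a Jacobi preconditioner, and conjugate gradient iteration), the following is a minimal serial sketch. It is not the UPCG source: the function names, the tolerance, and the small test matrix are assumptions made for the example, and the matrix is assumed symmetric positive definite with a nonzero diagonal.

```python
import numpy as np

def csr_matvec(data, indices, indptr, x):
    """y = A @ x for a matrix stored in compressed sparse row (CSR) form."""
    y = np.zeros(len(indptr) - 1)
    for row in range(len(y)):
        start, end = indptr[row], indptr[row + 1]
        y[row] = np.dot(data[start:end], x[indices[start:end]])
    return y

def csr_diagonal(data, indices, indptr):
    """Extract the main diagonal of a CSR matrix."""
    n = len(indptr) - 1
    diag = np.zeros(n)
    for row in range(n):
        for k in range(indptr[row], indptr[row + 1]):
            if indices[k] == row:
                diag[row] = data[k]
    return diag

def jacobi_pcg(data, indices, indptr, b, tol=1e-10, max_iter=200):
    """Conjugate gradient with a Jacobi (inverse-diagonal) preconditioner."""
    x = np.zeros_like(b)
    inv_diag = 1.0 / csr_diagonal(data, indices, indptr)
    r = b - csr_matvec(data, indices, indptr, x)
    z = inv_diag * r                 # apply the preconditioner
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        if np.linalg.norm(r) < tol:
            break
        Ap = csr_matvec(data, indices, indptr, p)
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        z = inv_diag * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

# 3x3 tridiagonal test matrix [[4,-1,0],[-1,4,-1],[0,-1,4]] in CSR form.
data = np.array([4.0, -1.0, -1.0, 4.0, -1.0, -1.0, 4.0])
indices = np.array([0, 1, 0, 1, 2, 1, 2])
indptr = np.array([0, 2, 5, 7])
b = np.array([2.0, 4.0, 10.0])      # equals A @ [1, 2, 3]
x = jacobi_pcg(data, indices, indptr, b)
```

The Jacobi preconditioner is an element-wise multiply, so it parallelizes trivially; incomplete-factorization preconditioners require triangular solves, which are harder to parallelize.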
Scalable direct Vlasov solver with discontinuous Galerkin method on unstructured mesh.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Xu, J.; Ostroumov, P. N.; Mustapha, B.

2010-12-01

This paper presents the development of parallel direct Vlasov solvers with discontinuous Galerkin (DG) method for beam and plasma simulations in four dimensions. Both physical and velocity spaces are in two dimensions (2P2V) with unstructured mesh. Contrary to the standard particle-in-cell (PIC) approach for kinetic space plasma simulations, i.e., solving Vlasov-Maxwell equations, a direct method has been used in this paper. There are several benefits to solving a Vlasov equation directly, such as avoiding noise associated with a finite number of particles and the capability to capture fine structure in the plasma. The most challenging part of a direct Vlasov solver comes from higher dimensions, as the computational cost increases as N^{2d}, where d is the dimension of the physical space. Recently, due to the fast development of supercomputers, the possibility has become more realistic. Many efforts have been made to solve Vlasov equations in low dimensions before; now more interest has focused on higher dimensions. Different numerical methods have been tried so far, such as the finite difference method, Fourier spectral method, finite volume method, and spectral element method. This paper is based on our previous efforts to use the DG method. The DG method has been proven to be very successful in solving Maxwell equations, and this paper is our first effort in applying the DG method to Vlasov equations. DG has shown several advantages, such as local mass matrix, strong stability, and easy parallelization. These are particularly suitable for Vlasov equations. Domain decomposition in high dimensions has been used for parallelization; these include a highly scalable parallel two-dimensional Poisson solver. Benchmark results have been shown and simulation results will be reported.
Best Practices for Unstructured Grid Shock Fitting

NASA Technical Reports Server (NTRS)

McCloud, Peter L.

2017-01-01

Unstructured grid solvers have well-known issues predicting surface heat fluxes when strong shocks are present. Various efforts have been made to address the underlying numerical issues that cause the erroneous predictions. The present work addresses some of the shortcomings of unstructured grid solvers, not by addressing the numerics, but by applying structured grid best practices to unstructured grids. A methodology for robust shock detection and shock fitting is outlined and applied to production relevant cases. Results achieved by using the Loci-CHEM Computational Fluid Dynamics solver are provided.
Application of an unstructured grid flow solver to planes, trains and automobiles

NASA Technical Reports Server (NTRS)

Spragle, Gregory S.; Smith, Wayne A.; Yadlin, Yoram

1993-01-01

Rampant, an unstructured flow solver developed at Fluent Inc., is used to compute three-dimensional, viscous, turbulent, compressible flow fields within complex solution domains. Rampant is an explicit, finite-volume flow solver capable of computing flow fields using either triangular (2d) or tetrahedral (3d) unstructured grids. Local time stepping, implicit residual smoothing, and multigrid techniques are used to accelerate the convergence of the explicit scheme. The paper describes the Rampant flow solver and presents flow field solutions about a plane, train, and automobile.

A FAST ITERATIVE METHOD FOR SOLVING THE EIKONAL EQUATION ON TETRAHEDRAL DOMAINS

PubMed Central

Fu, Zhisong; Kirby, Robert M.; Whitaker, Ross T.

2014-01-01

Generating numerical solutions to the eikonal equation and its many variations has a broad range of applications in both the natural and computational sciences. Efficient solvers on cutting-edge, parallel architectures require new algorithms that may not be theoretically optimal, but that are designed to allow asynchronous solution updates and have limited memory access patterns. This paper presents a parallel algorithm for solving the eikonal equation on fully unstructured tetrahedral meshes. The method is appropriate for the type of fine-grained parallelism found on modern massively-SIMD architectures such as graphics processors and takes into account the particular constraints and capabilities of these computing platforms. This work builds on previous work for solving these equations on triangle meshes; in this paper we adapt and extend previous two-dimensional strategies to accommodate three-dimensional, unstructured, tetrahedralized domains. These new developments include a local update strategy with data compaction for tetrahedral meshes that provides solutions on both serial and parallel architectures, with a generalization to inhomogeneous, anisotropic speed functions. We also propose two new update schemes, specialized to mitigate the natural data increase observed when moving to three dimensions, and the data structures necessary for efficiently mapping data to parallel SIMD processors in a way that maintains computational density. Finally, we present descriptions of the implementations for a single CPU, as well as multicore CPUs with shared memory and SIMD architectures, with comparative results against state-of-the-art eikonal solvers. PMID:25221418
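The active-list idea behind a fast iterative method can be sketched on a much simpler setting than the paper's. The example below swaps the tetrahedral mesh for a regular 2D grid with a first-order Godunov upwind update, uses a constant isotropic speed, and runs serially in Python rather than on SIMD hardware. The function name `fim_eikonal` and all parameters are assumptions for illustration. Nodes stay on the active list until their value stops changing, then hand activity on to their neighbors; no global ordering (as in fast marching) is needed, which is what allows asynchronous updates.

```python
import numpy as np

def fim_eikonal(shape, sources, h=1.0, speed=1.0, eps=1e-9):
    """Solve |grad u| = 1/speed on a regular grid with an active-list iteration."""
    ny, nx = shape
    u = np.full(shape, np.inf)
    src = set(sources)
    for s in src:
        u[s] = 0.0

    def neighbors(i, j):
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            k, l = i + di, j + dj
            if 0 <= k < ny and 0 <= l < nx:
                yield k, l

    def update(i, j):
        # Smallest neighbor value along each grid axis (upwind directions).
        a = min((u[k, j] for k in (i - 1, i + 1) if 0 <= k < ny), default=np.inf)
        b = min((u[i, l] for l in (j - 1, j + 1) if 0 <= l < nx), default=np.inf)
        f = h / speed
        lo, hi = min(a, b), max(a, b)
        if not np.isfinite(hi) or hi - lo >= f:
            return lo + f                      # one-sided update
        return 0.5 * (a + b + np.sqrt(2.0 * f * f - (a - b) ** 2))

    active = {n for s in src for n in neighbors(*s) if n not in src}
    while active:
        for node in list(active):
            old = u[node]
            new = min(old, update(*node))
            u[node] = new
            if old - new < eps:                # node has converged
                active.discard(node)
                for nb in neighbors(*node):
                    if nb in src or nb in active:
                        continue
                    cand = update(*nb)
                    if cand < u[nb] - eps:     # neighbor can still improve
                        u[nb] = cand
                        active.add(nb)
    return u

# On a single row of nodes the solution is the distance from the source.
u_line = fim_eikonal((1, 5), [(0, 0)])
u_grid = fim_eikonal((4, 4), [(0, 0)])
```

In a parallel setting the loop over `active` is the part that would be distributed, since each node update only reads its immediate neighbors.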
Parallelization of an Object-Oriented Unstructured Aeroacoustics Solver

NASA Technical Reports Server (NTRS)

Baggag, Abdelkader; Atkins, Harold; Oezturan, Can; Keyes, David

1999-01-01

A computational aeroacoustics code based on the discontinuous Galerkin method is ported to several parallel platforms using MPI. The discontinuous Galerkin method is a compact high-order method that retains its accuracy and robustness on non-smooth unstructured meshes. In its semi-discrete form, the discontinuous Galerkin method can be combined with explicit time marching methods, making it well suited to time accurate computations. The compact nature of the discontinuous Galerkin method also makes it well suited for distributed memory parallel platforms. The original serial code was written using an object-oriented approach and was previously optimized for cache-based machines. The port to parallel platforms was achieved simply by treating partition boundaries as a type of boundary condition. Code modifications were minimal because boundary conditions were abstractions in the original program. Scalability results are presented for the SGI Origin, IBM SP2, and clusters of SGI and Sun workstations. Slightly superlinear speedup is achieved on a fixed-size problem on the Origin, due to cache effects.
The DANTE Boltzmann transport solver: An unstructured mesh, 3-D, spherical harmonics algorithm compatible with parallel computer architectures

DOE Office of Scientific and Technical Information (OSTI.GOV)

McGhee, J.M.; Roberts, R.M.; Morel, J.E.

1997-06-01

A spherical harmonics research code (DANTE) has been developed which is compatible with parallel computer architectures. DANTE provides 3-D, multi-material, deterministic, transport capabilities using an arbitrary finite element mesh. The linearized Boltzmann transport equation is solved in a second order self-adjoint form utilizing a Galerkin finite element spatial differencing scheme. The core solver utilizes a preconditioned conjugate gradient algorithm. Other distinguishing features of the code include options for discrete-ordinates and simplified spherical harmonics angular differencing, an exact Marshak boundary treatment for arbitrarily oriented boundary faces, in-line matrix construction techniques to minimize memory consumption, and an effective diffusion based preconditioner for scattering dominated problems. Algorithm efficiency is demonstrated for a massively parallel SIMD architecture (CM-5), and compatibility with MPP multiprocessor platforms or workstation clusters is anticipated.
Array-based Hierarchical Mesh Generation in Parallel

DOE PAGES

Ray, Navamita; Grindeanu, Iulian; Zhao, Xinglin; ...

2015-11-03

In this paper, we describe an array-based hierarchical mesh generation capability through uniform refinement of unstructured meshes for efficient solution of PDEs using finite element methods and multigrid solvers. A multi-degree, multi-dimensional and multi-level framework is designed to generate the nested hierarchies from an initial mesh that can be used for a number of purposes, from multi-level methods to generating large meshes. The capability is developed under the parallel mesh framework “Mesh Oriented dAtaBase”, a.k.a. MOAB. We describe the underlying data structures and algorithms to generate such hierarchies and present numerical results for computational efficiency and mesh quality. In conclusion, we also present results to demonstrate the applicability of the developed capability to a multigrid finite-element solver.
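Uniform refinement of an array-based mesh can be sketched for the simplest case, a 2D triangle mesh held in a point array and a connectivity array. This is not the MOAB API: the function name `refine_uniform` and the dictionary used to share edge midpoints are assumptions for illustration. Each triangle is split into four children by its edge midpoints, and a midpoint on an edge shared by two triangles is created only once so the refined mesh stays conforming.

```python
import numpy as np

def refine_uniform(points, triangles):
    """Split every triangle into four by inserting edge midpoints."""
    points = [tuple(p) for p in points]
    midpoint_id = {}                 # (low node, high node) -> new point index

    def midpoint(a, b):
        key = (min(a, b), max(a, b))
        if key not in midpoint_id:
            pa, pb = points[a], points[b]
            points.append(tuple((x + y) / 2.0 for x, y in zip(pa, pb)))
            midpoint_id[key] = len(points) - 1
        return midpoint_id[key]

    fine = []
    for a, b, c in triangles:
        ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
        # Three corner children plus the central child, same orientation
        # as the parent triangle.
        fine += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
    return np.array(points), np.array(fine)

# Unit square split into two triangles along the diagonal 0-2.
coarse_pts = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
coarse_tris = [(0, 1, 2), (0, 2, 3)]
fine_pts, fine_tris = refine_uniform(coarse_pts, coarse_tris)
```

Applying the function repeatedly yields a nested hierarchy of the kind a multigrid solver needs, with each level having four times as many triangles as the one before.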
Development of an Unstructured Mesh Code for Flows About Complete Vehicles

NASA Technical Reports Server (NTRS)

Peraire, Jaime; Gupta, K. K. (Technical Monitor)

2001-01-01

This report describes the research work undertaken at the Massachusetts Institute of Technology, under NASA Research Grant NAG4-157. The aim of this research is to identify effective algorithms and methodologies for the efficient and routine solution of flow simulations about complete vehicle configurations. For over ten years we have received support from NASA to develop unstructured mesh methods for Computational Fluid Dynamics. As a result of this effort a methodology based on the use of unstructured adapted meshes of tetrahedra and finite volume flow solvers has been developed. A number of gridding algorithms, flow solvers, and adaptive strategies have been proposed. The most successful algorithms developed form the basis of the unstructured mesh system FELISA. The FELISA system has been used extensively for the analysis of transonic and hypersonic flows about complete vehicle configurations. The system is highly automatic and allows for the routine aerodynamic analysis of complex configurations starting from CAD data. The code has been parallelized and utilizes efficient solution algorithms. For hypersonic flows, a version of the code which incorporates real gas effects has been produced. The FELISA system is also a component of the STARS aeroservoelastic system developed at NASA Dryden. One of the latest developments before the start of this grant was to extend the system to include viscous effects. This required the development of viscous generators, capable of generating the anisotropic grids required to represent boundary layers, and viscous flow solvers. We show some sample hypersonic viscous computations using the developed viscous generators and solvers. Although these initial results were encouraging, it became apparent that in order to develop a fully functional capability for viscous flows, several advances in solution accuracy, robustness and efficiency were required. 
In this grant we set out to investigate some novel methodologies that could lead to the required improvements. In particular we focused on two fronts: (1) finite element methods and (2) iterative algebraic multigrid solution techniques.
Oasis: A high-level/high-performance open source Navier-Stokes solver

NASA Astrophysics Data System (ADS)

Mortensen, Mikael; Valen-Sendstad, Kristian

2015-03-01

Oasis is a high-level/high-performance finite element Navier-Stokes solver written from scratch in Python using building blocks from the FEniCS project (fenicsproject.org). The solver is unstructured and targets large-scale applications in complex geometries on massively parallel clusters. Oasis utilizes MPI and interfaces, through FEniCS, to the linear algebra backend PETSc. Oasis advocates a high-level, programmable user interface through the creation of highly flexible Python modules for new problems. Through the high-level Python interface the user is placed in complete control of every aspect of the solver. A version of the solver, that is using piecewise linear elements for both velocity and pressure, is shown to reproduce very well the classical, spectral, turbulent channel simulations of Moser et al. (1999). The computational speed is strongly dominated by the iterative solvers provided by the linear algebra backend, which is arguably the best performance any similar implicit solver using PETSc may hope for. Higher order accuracy is also demonstrated and new solvers may be easily added within the same framework.
Application of an Unstructured Grid Navier-Stokes Solver to a Generic Helicopter Body: Comparison of Unstructured Grid Results with Structured Grid Results and Experimental Results

NASA Technical Reports Server (NTRS)

Mineck, Raymond E.

1999-01-01

An unstructured-grid Navier-Stokes solver was used to predict the surface pressure distribution, the off-body flow field, the surface flow pattern, and integrated lift and drag coefficients on the ROBIN configuration (a generic helicopter) without a rotor at four angles of attack. The results are compared to those predicted by two structured-grid Navier-Stokes solvers and to experimental surface pressure distributions. The surface pressure distributions from the unstructured-grid Navier-Stokes solver are in good agreement with the results from the structured-grid Navier-Stokes solvers. Agreement with the experimental pressure coefficients is good over the forward portion of the body. However, agreement is poor on the lower portion of the mid-section of the body. Comparison of the predicted surface flow patterns showed similar regions of separated flow. Predicted lift and drag coefficients were in fair agreement with each other.
GPU accelerated cell-based adaptive mesh refinement on unstructured quadrilateral grid

NASA Astrophysics Data System (ADS)

Luo, Xisheng; Wang, Luying; Ran, Wei; Qin, Fenghua

2016-10-01

A GPU accelerated inviscid flow solver is developed on an unstructured quadrilateral grid in the present work. For the first time, the cell-based adaptive mesh refinement (AMR) is fully implemented on GPU for the unstructured quadrilateral grid, which greatly reduces the frequency of data exchange between GPU and CPU. Specifically, the AMR is processed with atomic operations to parallelize list operations, and null memory recycling is realized to improve the efficiency of memory utilization. It is found that results obtained by GPUs agree very well with the exact or experimental results in literature. An acceleration ratio of 4 is obtained between the parallel code running on the old GPU GT9800 and the serial code running on E3-1230 V2. With the optimization of configuring a larger L1 cache and adopting Shared Memory based atomic operations on the newer GPU C2050, an acceleration ratio of 20 is achieved. The parallelized cell-based AMR processes have achieved 2x speedup on GT9800 and 18x on Tesla C2050, which demonstrates that parallel running of the cell-based AMR method on GPU is feasible and efficient. Our results also indicate that the new development of GPU architecture benefits the fluid dynamics computing significantly.
Best Practices for Unstructured Grid Shock-Fitting

NASA Technical Reports Server (NTRS)

McCloud, Peter L.

2017-01-01

Unstructured grid solvers have well-known issues predicting surface heat fluxes when strong shocks are present. Various efforts have been made to address the underlying numerical issues that cause the erroneous predictions. The present work addresses some of the shortcomings of unstructured grid solvers, not by addressing the numerics, but by applying structured grid best practices to unstructured grids. A methodology for robust shock detection and shock-fitting is outlined and applied to production-relevant cases. Results
Parallel Climate Data Assimilation PSAS Package

NASA Technical Reports Server (NTRS)

Ding, Hong Q.; Chan, Clara; Gennery, Donald B.; Ferraro, Robert D.

1996-01-01

We have designed and implemented a set of highly efficient and highly scalable algorithms for an unstructured computational package, the PSAS data assimilation package, as demonstrated by detailed performance analysis of systematic runs on up to a 512-node Intel Paragon. The equation solver achieves a sustained 18 Gflops performance. As a result, we achieved an unprecedented 100-fold solution time reduction on the Intel Paragon parallel platform over the Cray C90. This not only meets and exceeds the DAO time requirements, but also significantly enlarges the window of exploration in climate data assimilations.
An assessment of unstructured grid technology for timely CFD analysis

NASA Technical Reports Server (NTRS)

Kinard, Tom A.; Schabowski, Deanne M.

1995-01-01

An assessment of two unstructured methods is presented in this paper. A tetrahedral unstructured method, USM3D, developed at NASA Langley Research Center, is compared to a Cartesian unstructured method, SPLITFLOW, developed at Lockheed Fort Worth Company. USM3D is an upwind finite volume solver that accepts grids generated primarily from the Vgrid grid generator. SPLITFLOW combines an unstructured grid generator with an implicit flow solver in one package. Both methods are exercised on three test cases: a wing, a wing body, and a fully expanded nozzle. The results for the first two runs are included here and compared to the structured grid method TEAM and to available test data. For each test case, the setup procedures are described, including any difficulties that were encountered. Detailed descriptions of the solvers are not included in this paper.
Wind-US Unstructured Flow Solutions for a Transonic Diffuser

NASA Technical Reports Server (NTRS)

Mohler, Stanley R., Jr.

2005-01-01

The Wind-US Computational Fluid Dynamics flow solver computed flow solutions for a transonic diffusing duct. The calculations used an unstructured (hexahedral) grid. The Spalart-Allmaras turbulence model was used. Static pressures along the upper and lower wall agreed well with experiment, as did velocity profiles. The effect of the smoothing input parameters on convergence and solution accuracy was investigated. The meaning and proper use of these parameters are discussed for the benefit of Wind-US users. Finally, the unstructured solver is compared to the structured solver in terms of run times and solution accuracy.
Generating unstructured nuclear reactor core meshes in parallel

DOE PAGES

Jain, Rajeev; Tautges, Timothy J.

2014-10-24

Recent advances in supercomputers and parallel solver techniques have enabled users to run large simulation problems using millions of processors. Techniques for multiphysics nuclear reactor core simulations are under active development in several countries. Most of these techniques require large unstructured meshes that can be hard to generate on standalone desktop computers because of high memory requirements, limited processing power, and other complexities. We have previously reported on a hierarchical lattice-based approach for generating reactor core meshes. Here, we describe efforts to exploit coarse-grained parallelism during reactor assembly and reactor core mesh generation processes. We highlight several reactor core examples including a very high temperature reactor, a full-core model of the Korean MONJU reactor, a ¼ pressurized water reactor core, the fast reactor Experimental Breeder Reactor-II core with a XX09 assembly, and an advanced breeder test reactor core. The times required to generate large mesh models, along with speedups obtained from running these problems in parallel, are reported. A graphical user interface to the tools described here has also been developed.
Unstructured Adaptive Grid Computations on an Array of SMPs

NASA Technical Reports Server (NTRS)

Biswas, Rupak; Pramanick, Ira; Sohn, Andrew; Simon, Horst D.

1996-01-01

Dynamic load balancing is necessary for parallel adaptive methods to solve unsteady CFD problems on unstructured grids. This paper presents such a dynamic load balancing framework, called JOVE. Results on a four-POWERnode POWER CHALLENGEarray demonstrated that load balancing gives significant performance improvements over no load balancing for such adaptive computations. The parallel speedup of JOVE, implemented using MPI on the POWER CHALLENGEarray, was significant, being as high as 31 for 32 processors. An implementation of JOVE that exploits an 'array of SMPs' architecture was also studied; this hybrid JOVE outperformed flat JOVE by up to 28% on the meshes and adaption models tested. With large, realistic meshes and actual flow-solver and adaption phases incorporated into JOVE, hybrid JOVE can be expected to yield a significant advantage over flat JOVE, especially as the number of processors is increased, thus demonstrating the scalability of an array of SMPs architecture.
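The speedup figure quoted above can be turned into a parallel efficiency with one division. The short Python sketch below is only an illustration of that arithmetic; the function name is hypothetical and does not come from the JOVE paper.

```python
def parallel_efficiency(speedup: float, processors: int) -> float:
    """Return parallel efficiency E = S / P for a speedup S on P processors."""
    if processors <= 0:
        raise ValueError("processors must be positive")
    return speedup / processors


# Figure quoted in the abstract: a speedup of 31 on 32 processors.
jove_efficiency = parallel_efficiency(31.0, 32)
print(f"{jove_efficiency:.1%}")  # 96.9%
```

An efficiency close to 1 means that almost none of the added processor time is lost to communication or load imbalance.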
Unstructured Mesh Methods for the Simulation of Hypersonic Flows

NASA Technical Reports Server (NTRS)

Peraire, Jaime; Bibb, K. L. (Technical Monitor)

2001-01-01

This report describes the research work undertaken at the Massachusetts Institute of Technology. The aim of this research is to identify effective algorithms and methodologies for the efficient and routine solution of hypersonic viscous flows about re-entry vehicles. For over ten years we have received support from NASA to develop unstructured mesh methods for Computational Fluid Dynamics. As a result of this effort, a methodology based on the use of unstructured adapted meshes of tetrahedra and finite volume flow solvers has been developed. A number of gridding algorithms, flow solvers, and adaptive strategies have been proposed. The most successful algorithms developed form the basis of the unstructured mesh system FELISA. The FELISA system has been used extensively for the analysis of transonic and hypersonic flows about complete vehicle configurations. The system is highly automatic and allows for the routine aerodynamic analysis of complex configurations starting from CAD data. The code has been parallelized and utilizes efficient solution algorithms. For hypersonic flows, a version of the code which incorporates real gas effects has been produced. One of the latest developments before the start of this grant was to extend the system to include viscous effects. This required the development of viscous generators, capable of generating the anisotropic grids required to represent boundary layers, and viscous flow solvers. In Figures 1 and 2, we show some sample hypersonic viscous computations using the developed viscous generators and solvers. Although these initial results were encouraging, it became apparent that in order to develop a fully functional capability for viscous flows, several advances in gridding, solution accuracy, robustness and efficiency were required.
As part of this research we have developed: 1) automatic meshing techniques, with the corresponding computer codes delivered to NASA and implemented into the GridEx system, 2) a finite element algorithm for the solution of the viscous compressible flow equations which can solve flows all the way down to the incompressible limit and that can use higher order (quadratic) approximations leading to highly accurate answers, and 3) iterative algebraic multigrid solution techniques.
An Immersed Boundary - Adaptive Mesh Refinement solver (IB-AMR) for high fidelity fully resolved wind turbine simulations

NASA Astrophysics Data System (ADS)

Angelidis, Dionysios; Sotiropoulos, Fotis

2015-11-01

The geometrical details of wind turbines determine the structure of the turbulence in the near and far wake and should be taken into account when performing high fidelity calculations. Multi-resolution simulations coupled with an immersed boundary method constitute a powerful framework for high-fidelity calculations past wind farms located over complex terrains. We develop a 3D Immersed-Boundary Adaptive Mesh Refinement flow solver (IB-AMR) which enables turbine-resolving LES of wind turbines. The idea of using a hybrid staggered/non-staggered grid layout adopted in the Curvilinear Immersed Boundary Method (CURVIB) has been successfully incorporated on unstructured meshes and the fractional step method has been employed. The overall performance and robustness of the second order accurate, parallel, unstructured solver is evaluated by comparing the numerical simulations against conforming grid calculations and experimental measurements of laminar and turbulent flows over complex geometries. We also present turbine-resolving multi-scale LES considering all the details affecting the induced flow field, including the geometry of the tower, the nacelle and especially the rotor blades of a wind tunnel scale turbine. This material is based upon work supported by the Department of Energy under Award Number DE-EE0005482 and the Sandia National Laboratories.
EUPDF: Eulerian Monte Carlo Probability Density Function Solver for Applications With Parallel Computing, Unstructured Grids, and Sprays

NASA Technical Reports Server (NTRS)

Raju, M. S.

1998-01-01

The success of any solution methodology used in the study of gas-turbine combustor flows depends a great deal on how well it can model the various complex and rate controlling processes associated with the spray's turbulent transport, mixing, chemical kinetics, evaporation, and spreading rates, as well as convective and radiative heat transfer and other phenomena. The phenomena to be modeled, which are controlled by these processes, often strongly interact with each other at different times and locations. In particular, turbulence plays an important role in determining the rates of mass and heat transfer, chemical reactions, and evaporation in many practical combustion devices. The influence of turbulence in a diffusion flame manifests itself in several forms, ranging from the so-called wrinkled, or stretched, flamelets regime to the distributed combustion regime, depending upon how turbulence interacts with various flame scales. Conventional turbulence models have difficulty treating highly nonlinear reaction rates. A solution procedure based on the composition joint probability density function (PDF) approach holds the promise of modeling various important combustion phenomena relevant to practical combustion devices (such as extinction, blowoff limits, and emissions predictions) because it can account for nonlinear chemical reaction rates without making approximations. In an attempt to advance the state-of-the-art in multidimensional numerical methods, we at the NASA Lewis Research Center extended our previous work on the PDF method to unstructured grids, parallel computing, and sprays. EUPDF, which was developed by M.S. Raju of Nyma, Inc., was designed to be massively parallel and could easily be coupled with any existing gas-phase and/or spray solvers. EUPDF can use an unstructured mesh with mixed triangular, quadrilateral, and/or tetrahedral elements. 
The application of the PDF method showed favorable results when applied to several supersonic-diffusion flames and spray flames. The EUPDF source code will be available with the National Combustion Code (NCC) as a complete package.
Development of a Regional Structured and Unstructured Grid Methodology for Chemically Reactive Turbulent Flows

NASA Astrophysics Data System (ADS)

Stefanski, Douglas Lawrence

A finite volume method for solving the Reynolds-averaged Navier-Stokes (RANS) equations on unstructured hybrid grids is presented. Capabilities for handling arbitrary mixtures of reactive gas species within the unstructured framework are developed. The modeling of turbulent effects is carried out via the 1998 Wilcox k-ω model. This unstructured solver is incorporated within VULCAN, a multi-block structured grid code, as part of a novel patching procedure in which non-matching interfaces between structured blocks are replaced by transitional unstructured grids. This approach provides a fully conservative alternative to VULCAN's non-conservative patching methods for handling such interfaces. In addition, the further development of the standalone unstructured solver toward large-eddy simulation (LES) applications is also carried out. Dual time-stepping using a Crank-Nicolson formulation is added to recover time-accuracy, and modeling of sub-grid scale effects is incorporated to provide higher fidelity LES solutions for turbulent flows. A switch based on the work of Ducros et al. is implemented to transition from a monotonicity-preserving flux scheme near shocks to a central-difference method in vorticity-dominated regions in order to better resolve small-scale turbulent structures. The updated unstructured solver is used to carry out large-eddy simulations of a supersonic constrained mixing layer.
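The abstract does not give the form of the shock/vorticity switch. As a rough illustration, the Python sketch below uses the commonly cited form of the Ducros sensor, which compares squared dilatation with squared vorticity magnitude. The function names, the small constant `eps`, and the linear blending of the two fluxes are assumptions made for this example, not details taken from the thesis.

```python
import numpy as np


def ducros_sensor(div_u, vort_mag, eps=1e-30):
    """Commonly cited Ducros-type sensor: close to 1 where dilatation dominates
    (shocks), close to 0 where vorticity dominates (turbulent structures)."""
    div2 = np.asarray(div_u, dtype=float) ** 2
    vort2 = np.asarray(vort_mag, dtype=float) ** 2
    return div2 / (div2 + vort2 + eps)  # eps (assumed) avoids 0/0 in uniform flow


def blend_flux(flux_dissipative, flux_central, sensor):
    """Assumed linear blend: dissipative flux near shocks, central flux elsewhere."""
    return sensor * flux_dissipative + (1.0 - sensor) * flux_central


# Hypothetical cell values: a strongly compressed cell and a purely rotational one.
sensor = ducros_sensor(div_u=[-10.0, 0.0], vort_mag=[0.0, 5.0])
flux = blend_flux(np.array([1.0, 1.0]), np.array([3.0, 3.0]), sensor)
```

In the first cell the sensor is essentially 1, so the dissipative flux is used; in the second it is 0, so the central flux is used unchanged.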
Application of the FUN3D Unstructured-Grid Navier-Stokes Solver to the 4th AIAA Drag Prediction Workshop Cases

NASA Technical Reports Server (NTRS)

Lee-Rausch, Elizabeth M.; Hammond, Dana P.; Nielsen, Eric J.; Pirzadeh, S. Z.; Rumsey, Christopher L.

2010-01-01

FUN3D Navier-Stokes solutions were computed for the 4th AIAA Drag Prediction Workshop grid convergence study, downwash study, and Reynolds number study on a set of node-based mixed-element grids. All of the baseline tetrahedral grids were generated with the VGRID (developmental) advancing-layer and advancing-front grid generation software package following the gridding guidelines developed for the workshop. With maximum grid sizes exceeding 100 million nodes, the grid convergence study was particularly challenging for the node-based unstructured grid generators and flow solvers. At the time of the workshop, the super-fine grid with 105 million nodes and 600 million elements was the largest grid known to have been generated using VGRID. FUN3D Version 11.0 has a completely new pre- and post-processing paradigm that has been incorporated directly into the solver and functions entirely in a parallel, distributed memory environment. This feature allowed for practical pre-processing and solution times on the largest unstructured-grid size requested for the workshop. For the constant-lift grid convergence case, the convergence of total drag is approximately second-order on the finest three grids. The variation in total drag between the finest two grids is only 2 counts. At the finest grid levels, only small variations in wing and tail pressure distributions are seen with grid refinement. Similarly, a small wing side-of-body separation also shows little variation at the finest grid levels. Overall, the FUN3D results compare well with the structured-grid code CFL3D. The FUN3D downwash study and Reynolds number study results compare well with the range of results shown in the workshop presentations.
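A statement such as "approximately second-order on the finest three grids" is usually checked by estimating an observed order of convergence from three solutions. The Python sketch below shows the standard three-grid estimate under the assumption of a constant refinement ratio and monotone convergence. The drag-like values are synthetic, generated from an assumed error model, and are not workshop data.

```python
import math


def effective_spacing(n_nodes: int, dim: int = 3) -> float:
    """Characteristic spacing h ~ N^(-1/dim), a common proxy on unstructured grids."""
    return n_nodes ** (-1.0 / dim)


def observed_order(f_fine: float, f_medium: float, f_coarse: float, r: float) -> float:
    """Observed order p from three grids with constant refinement ratio r > 1.

    Assumes monotone convergence, so both differences have the same sign.
    """
    ratio = (f_coarse - f_medium) / (f_medium - f_fine)
    if ratio <= 0.0:
        raise ValueError("non-monotone convergence; estimate is not defined")
    return math.log(ratio) / math.log(r)


# Synthetic values from the assumed model f(h) = 0.027 + 0.5 * h**2 at h = 1, 2, 4.
values = [0.027 + 0.5 * h**2 for h in (1.0, 2.0, 4.0)]
p = observed_order(values[0], values[1], values[2], r=2.0)
print(round(p, 6))  # 2.0
```

On real unstructured grid families the refinement ratio is rarely exactly constant, so `effective_spacing` would normally be used to check how close the ratios are before trusting the estimate.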
Implementation of Flow Tripping Capability in the USM3D Unstructured Flow Solver

NASA Technical Reports Server (NTRS)

Pandya, Mohagna J.; Abdol-Hamid, Khaled S.; Campbell, Richard L.; Frink, Neal T.

2006-01-01

A flow tripping capability is added to an established NASA tetrahedral unstructured parallel Navier-Stokes flow solver, USM3D. The capability is based on prescribing an appropriate profile of turbulence model variables to energize the boundary layer in a plane normal to a specified trip region on the body surface. We demonstrate this approach using the k-epsilon two-equation turbulence model of USM3D. Modification to the solution procedure primarily consists of developing a data structure to identify all unstructured tetrahedral grid cells located in the plane normal to a specified surface trip region and computing a function based on the mean flow solution to specify the modified profile of the turbulence model variables. We leverage this data structure and also show an adjunct approach that is based on enforcing a laminar flow condition on the otherwise fully turbulent flow solution in a user-specified region. The latter approach is applied for the solutions obtained using other one- and two-equation turbulence models of USM3D. A key ingredient of the present capability is the use of a graphical user-interface tool, PREDISC, to define a trip region on the body surface in an existing grid. Verification of the present modifications is demonstrated on three cases, namely, a flat plate, the RAE2822 airfoil, and the DLR F6 wing-fuselage configuration.

Algorithms and Application of Sparse Matrix Assembly and Equation Solvers for Aeroacoustics

NASA Technical Reports Server (NTRS)

Watson, W. R.; Nguyen, D. T.; Reddy, C. J.; Vatsa, V. N.; Tang, W. H.

2001-01-01

An algorithm for symmetric sparse equation solutions on an unstructured grid is described. Efficient, sequential sparse algorithms for degree-of-freedom reordering, supernodes, symbolic/numerical factorization, and forward/backward solution phases are reviewed. Three sparse algorithms for the generation and assembly of symmetric systems of matrix equations are presented. The accuracy and numerical performance of the sequential version of the sparse algorithms are evaluated over the frequency range of interest in a three-dimensional aeroacoustics application. Results show that the solver solutions are accurate using a discretization of 12 points per wavelength. Results also show that the first assembly algorithm is impractical for high-frequency noise calculations. The second and third assembly algorithms have nearly equal performance at low source frequencies, but at higher source frequencies the third algorithm saves CPU time and RAM. The CPU time and the RAM required by the second and third assembly algorithms are two orders of magnitude smaller than those required by the sparse equation solver. A sequential version of these sparse algorithms can, therefore, be conveniently incorporated into a substructuring for domain decomposition formulation to achieve parallel computation, where different substructures are handled by different parallel processors.
A three-dimensional structured/unstructured hybrid Navier-Stokes method for turbine blade rows

NASA Technical Reports Server (NTRS)

Tsung, F.-L.; Loellbach, J.; Kwon, O.; Hah, C.

1994-01-01

A three-dimensional viscous structured/unstructured hybrid scheme has been developed for numerical computation of high Reynolds number turbomachinery flows. The procedure allows an efficient structured solver to be employed in the densely clustered, high aspect-ratio grid around the viscous regions near solid surfaces, while employing an unstructured solver elsewhere in the flow domain to add flexibility in mesh generation. Test results for an inviscid flow over an external transonic wing and a Navier-Stokes flow for an internal annular cascade are presented.
On unstructured grids and solvers

NASA Technical Reports Server (NTRS)

Barth, T. J.

1990-01-01

The fundamentals and the state-of-the-art technology for unstructured grids and solvers are highlighted. Algorithms and techniques pertinent to mesh generation are discussed. It is shown that grid generation and grid manipulation schemes rely on fast multidimensional searching. Flow solution techniques for the Euler equations, which can be derived from the integral form of the equations, are discussed. Sample calculations are also provided.
Simulation of Unsteady Flows Using an Unstructured Navier-Stokes Solver on Moving and Stationary Grids

NASA Technical Reports Server (NTRS)

Biedron, Robert T.; Vatsa, Veer N.; Atkins, Harold L.

2005-01-01

We apply an unsteady Reynolds-averaged Navier-Stokes (URANS) solver for unstructured grids to unsteady flows on moving and stationary grids. Example problems considered are relevant to active flow control and stability and control. Computational results are presented using the Spalart-Allmaras turbulence model and are compared to experimental data. The effects of grid and time-step refinement are examined.
Final Report - High-Order Spectral Volume Method for the Navier-Stokes Equations On Unstructured Tetrahedral Grids

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wang, Z J

2012-12-06

The overriding objective for this project is to develop an efficient and accurate method for capturing strong discontinuities and fine smooth flow structures of disparate length scales with unstructured grids, and demonstrate its potential for problems relevant to DOE. More specifically, we plan to achieve the following objectives: 1. Extend the SV method to three dimensions, and develop a fourth-order accurate SV scheme for tetrahedral grids. Optimize the SV partition by minimizing a form of the Lebesgue constant. Verify the order of accuracy using the scalar conservation laws with an analytical solution; 2. Extend the SV method to Navier-Stokes equations for the simulation of viscous flow problems. Two promising approaches to compute the viscous fluxes will be tested and analyzed; 3. Parallelize the 3D viscous SV flow solver using domain decomposition and message passing. Optimize the cache performance of the flow solver by designing data structures minimizing data access times; 4. Demonstrate the SV method with a wide range of flow problems including both discontinuities and complex smooth structures. The objectives remain the same as those outlined in the original proposal. We anticipate no technical obstacles in meeting these objectives.
Assessing uncertainty in the turbulent upper-ocean mixed layer using an unstructured finite-element solver

NASA Astrophysics Data System (ADS)

Pacheco, Luz; Smith, Katherine; Hamlington, Peter; Niemeyer, Kyle

2017-11-01

Vertical transport flux in the ocean upper mixed layer has recently been attributed to submesoscale currents, which occur at scales on the order of kilometers in the horizontal direction. These phenomena, which include fronts and mixed-layer instabilities, have been of particular interest due to the effect of turbulent mixing on nutrient transport, facilitating phytoplankton blooms. We study these phenomena using a non-hydrostatic, large eddy simulation for submesoscale currents in the ocean, developed using the extensible, open-source finite element platform FEniCS. Our model solves the standard Boussinesq Euler equations in variational form using the finite element method. FEniCS enables the use of parallel computing on modern systems for efficient computing time, and is suitable for unstructured grids where irregular topography can be considered in the future. The solver will be verified against the well-established NCAR-LES model and validated against observational data. For the verification with NCAR-LES, the velocity, pressure, and buoyancy fields are compared through a surface-wind-driven, open-ocean case. We use this model to study the impacts of uncertainties in the model parameters, such as near-surface buoyancy flux and secondary circulation, and discuss implications.
Generation of unstructured grids and Euler solutions for complex geometries

NASA Technical Reports Server (NTRS)

Loehner, Rainald; Parikh, Paresh; Salas, Manuel D.

1989-01-01

Algorithms are described for the generation and adaptation of unstructured grids in two and three dimensions, as well as Euler solvers for unstructured grids. The main purpose is to demonstrate how unstructured grids may be employed advantageously for the economic simulation of both geometrically and physically complex flow fields.
Scalable hierarchical PDE sampler for generating spatially correlated random fields using nonmatching meshes: Scalable hierarchical PDE sampler using nonmatching meshes

DOE PAGES

Osborn, Sarah; Zulian, Patrick; Benson, Thomas; ...

2018-01-30

This work describes a domain embedding technique between two nonmatching meshes used for generating realizations of spatially correlated random fields with applications to large-scale sampling-based uncertainty quantification. The goal is to apply the multilevel Monte Carlo (MLMC) method for the quantification of output uncertainties of PDEs with random input coefficients on general and unstructured computational domains. We propose a highly scalable, hierarchical sampling method to generate realizations of a Gaussian random field on a given unstructured mesh by solving a reaction–diffusion PDE with a stochastic right-hand side. The stochastic PDE is discretized using the mixed finite element method on an embedded domain with a structured mesh, and then, the solution is projected onto the unstructured mesh. This work describes implementation details on how to efficiently transfer data from the structured and unstructured meshes at coarse levels, assuming that this can be done efficiently on the finest level. We investigate the efficiency and parallel scalability of the technique for the scalable generation of Gaussian random fields in three dimensions. An application of the MLMC method is presented for quantifying uncertainties of subsurface flow problems. Here, we demonstrate the scalability of the sampling method with nonmatching mesh embedding, coupled with a parallel forward model problem solver, for large-scale 3D MLMC simulations with up to 1.9·10^9 unknowns.
Efficiency Analysis of the Parallel Implementation of the SIMPLE Algorithm on Multiprocessor Computers

NASA Astrophysics Data System (ADS)

Lashkin, S. V.; Kozelkov, A. S.; Yalozo, A. V.; Gerasimov, V. Yu.; Zelensky, D. K.

2017-12-01

This paper describes the details of the parallel implementation of the SIMPLE algorithm for numerical solution of the Navier-Stokes system of equations on arbitrary unstructured grids. The iteration schemes for the serial and parallel versions of the SIMPLE algorithm are implemented. In the description of the parallel implementation, special attention is paid to computational data exchange among processors under the condition of the grid model decomposition using fictitious cells. We discuss the specific features for the storage of distributed matrices and implementation of vector-matrix operations in parallel mode. It is shown that the proposed way of matrix storage reduces the number of interprocessor exchanges. A series of numerical experiments illustrates the effect of the multigrid SLAE solver tuning on the general efficiency of the algorithm; the tuning involves the types of the cycles used (V, W, and F), the number of iterations of a smoothing operator, and the number of cells for coarsening. Two ways (direct and indirect) of efficiency evaluation for parallelization of the numerical algorithm are demonstrated. The paper presents the results of solving some internal and external flow problems with the evaluation of parallelization efficiency by two algorithms. It is shown that the proposed parallel implementation enables efficient computations for the problems on a thousand processors. Based on the results obtained, some general recommendations are made for the optimal tuning of the multigrid solver, as well as for selecting the optimal number of cells per processor.
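The data exchange through fictitious (ghost) cells described above can be illustrated without MPI. The Python sketch below is an assumed, much simplified 1D stand-in: a vector is split into contiguous subdomains, each padded with one fictitious cell per side, and neighbouring edge values are copied into those cells before a purely local matrix-vector product. All function names are hypothetical, and the simple (-1, 2, -1) stencil replaces the paper's unstructured-grid matrices.

```python
import numpy as np


def split_with_ghosts(x, parts):
    """Split x into contiguous chunks, each padded with one ghost cell per side."""
    return [np.concatenate(([0.0], c, [0.0])) for c in np.array_split(x, parts)]


def exchange_ghosts(local):
    """Copy each neighbour's edge value into the adjacent ghost cell.

    In a message-passing code this loop would be one send/receive pair per interface.
    """
    for k in range(len(local) - 1):
        local[k][-1] = local[k + 1][1]   # right ghost <- neighbour's first owned cell
        local[k + 1][0] = local[k][-2]   # left ghost  <- neighbour's last owned cell


def local_matvec(v):
    """Apply the stencil (-1, 2, -1) to owned cells only; v includes both ghosts."""
    return 2.0 * v[1:-1] - v[:-2] - v[2:]


def distributed_matvec(x, parts):
    """Ghost exchange followed by independent local products, then a gather."""
    local = split_with_ghosts(np.asarray(x, dtype=float), parts)
    exchange_ghosts(local)
    return np.concatenate([local_matvec(v) for v in local])


# Check against the assembled global matrix (zero values outside the domain).
n = 10
x = np.arange(1.0, n + 1.0)
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
y = distributed_matvec(x, parts=3)
```

Only one value per interface crosses a subdomain boundary for each product, which is why the storage layout and the number of exchanges matter for parallel efficiency.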
Modularization and Validation of FUN3D as a CREATE-AV Helios Near-Body Solver

NASA Technical Reports Server (NTRS)

Jain, Rohit; Biedron, Robert T.; Jones, William T.; Lee-Rausch, Elizabeth M.

2016-01-01

Under a recent collaborative effort between the US Army Aeroflightdynamics Directorate (AFDD) and NASA Langley, NASA's general unstructured CFD solver, FUN3D, was modularized as a CREATE-AV Helios near-body unstructured grid solver. The strategies adopted in the Helios/FUN3D integration effort are described. A validation study of the new capability is performed for rotorcraft cases spanning hover prediction, airloads prediction, coupling with computational structural dynamics, counter-rotating dual-rotor configurations, and free-flight trim. The integration of FUN3D, along with the previously integrated NASA OVERFLOW solver, lays the groundwork for future interaction opportunities where capabilities of one component could be leveraged with those of others in a relatively seamless fashion within CREATE-AV Helios.
Improving Fidelity of Launch Vehicle Liftoff Acoustic Simulations

NASA Technical Reports Server (NTRS)

Liever, Peter; West, Jeff

2016-01-01

Launch vehicles experience high acoustic loads during ignition and liftoff, affected by the interaction of rocket plume generated acoustic waves with launch pad structures. Application of highly parallelized Computational Fluid Dynamics (CFD) analysis tools optimized for the NAS computer systems, such as the Loci/CHEM program, now enables simulation of time-accurate, turbulent, multi-species plume formation and interaction with launch pad geometry, and captures the generation of acoustic noise at the source regions in the plume shear layers and impingement regions. These CFD solvers are robust in capturing the acoustic fluctuations, but they are too dissipative to accurately resolve the propagation of the acoustic waves throughout the launch environment domain along the vehicle. A hybrid Computational Fluid Dynamics and Computational Aero-Acoustics (CFD/CAA) modeling framework has been developed to improve such liftoff acoustic environment predictions. The framework combines the existing highly scalable NASA production CFD code, Loci/CHEM, with a high-order accurate discontinuous Galerkin (DG) solver, Loci/THRUST, developed in the same computational framework. Loci/THRUST employs a low dissipation, high-order, unstructured DG method to accurately propagate acoustic waves away from the source regions across large distances. The DG solver is currently capable of solving up to 4th order solutions for non-linear, conservative acoustic field propagation. Higher order boundary conditions are implemented to accurately model the reflection and refraction of acoustic waves on launch pad components. The DG solver accepts generalized unstructured meshes, enabling efficient application of common mesh generation tools for CHEM and THRUST simulations. The DG solution is coupled with the CFD solution at interface boundaries placed near the CFD acoustic source regions. Both simulations are executed simultaneously with coordinated boundary condition data exchange.
A 3D Unstructured Mesh Euler Solver Based on the Fourth-Order CESE Method

DTIC Science & Technology

2013-06-01

David L. Bilyeu ... Similarly, the fluxes, $f_i^{x,y,z}$, and their derivatives inside a SE are also discretized by the Taylor series expansion: $\partial^{C} f_i^{x,y,z} / (\partial x^{I} \partial y^{J} \partial z^{K} \partial t^{L}) = A$ ...
A 3-D CE/SE Navier-Stokes Solver With Unstructured Hexahedral Grid for Computation of Near Field Jet Screech Noise

NASA Technical Reports Server (NTRS)

Loh, Ching Y.; Himansu, Ananda; Hultgren, Lennart S.

2003-01-01

A 3-D space-time CE/SE Navier-Stokes solver using an unstructured hexahedral grid is described and applied to a circular jet screech noise computation. The present numerical results for an underexpanded jet, corresponding to a fully expanded Mach number of 1.42, capture the dominant and nonaxisymmetric 'B' screech mode and are generally in good agreement with existing experiments.
Parallel Computation of Flow in Heterogeneous Media Modelled by Mixed Finite Elements

NASA Astrophysics Data System (ADS)

Cliffe, K. A.; Graham, I. G.; Scheichl, R.; Stals, L.

2000-11-01

In this paper we describe a fast parallel method for solving highly ill-conditioned saddle-point systems arising from mixed finite element simulations of stochastic partial differential equations (PDEs) modelling flow in heterogeneous media. Each realisation of these stochastic PDEs requires the solution of the linear first-order velocity-pressure system comprising Darcy's law coupled with an incompressibility constraint. The chief difficulty is that the permeability may be highly variable, especially when the statistical model has a large variance and a small correlation length. For reasonable accuracy, the discretisation has to be extremely fine. We solve these problems by first reducing the saddle-point formulation to a symmetric positive definite (SPD) problem using a suitable basis for the space of divergence-free velocities. The reduced problem is solved using parallel conjugate gradients preconditioned with an algebraically determined additive Schwarz domain decomposition preconditioner. The result is a solver which exhibits a good degree of robustness with respect to the mesh size as well as to the variance and to physically relevant values of the correlation length of the underlying permeability field. Numerical experiments exhibit almost optimal levels of parallel efficiency. The domain decomposition solver (DOUG, http://www.maths.bath.ac.uk/~parsoft) used here not only is applicable to this problem but can be used to solve general unstructured finite element systems on a wide range of parallel architectures.
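The solver structure described above, conjugate gradients preconditioned by additive Schwarz, can be sketched in a few lines of Python. The example below is a one-level, serial illustration on a small dense SPD matrix with hand-picked overlapping index sets. It is not the DOUG implementation, it has no coarse space, and every name in it is an assumption made for the example.

```python
import numpy as np


def additive_schwarz(A, subdomains):
    """Return r -> sum_i R_i^T (R_i A R_i^T)^{-1} R_i r for the given index sets."""
    local = [(idx, np.linalg.inv(A[np.ix_(idx, idx)])) for idx in subdomains]

    def apply(r):
        z = np.zeros_like(r)
        for idx, inv in local:
            z[idx] += inv @ r[idx]   # independent local solves, summed in the overlap
        return z

    return apply


def pcg(A, b, precond, tol=1e-10, max_iter=500):
    """Preconditioned conjugate gradients for a symmetric positive definite A."""
    x = np.zeros_like(b)
    r = b.copy()                     # residual for the zero initial guess
    z = precond(r)
    p = z.copy()
    rz = r @ z
    for it in range(1, max_iter + 1):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            return x, it
        z = precond(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, max_iter


# Small SPD test matrix (1D Laplacian) and three overlapping subdomains.
n = 40
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
subdomains = [np.arange(0, 16), np.arange(12, 28), np.arange(24, 40)]
x, iterations = pcg(A, b, additive_schwarz(A, subdomains))
```

Each local solve touches only its own index set, so in a parallel setting the subdomain solves can run on separate processors, with communication needed only in the overlap regions.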
Global Load Balancing with Parallel Mesh Adaption on Distributed-Memory Systems

NASA Technical Reports Server (NTRS)

Biswas, Rupak; Oliker, Leonid; Sohn, Andrew

1996-01-01

Dynamic mesh adaption on unstructured grids is a powerful tool for efficiently computing unsteady problems to resolve solution features of interest. Unfortunately, this causes load imbalance among processors on a parallel machine. This paper describes the parallel implementation of a tetrahedral mesh adaption scheme and a new global load balancing method. A heuristic remapping algorithm is presented that assigns partitions to processors such that the redistribution cost is minimized. Results indicate that the parallel performance of the mesh adaption code depends on the nature of the adaption region and show a 35.5X speedup on 64 processors of an SP2 when 35% of the mesh is randomly adapted. For large-scale scientific computations, our load balancing strategy gives almost a sixfold reduction in solver execution times over non-balanced loads. Furthermore, our heuristic remapper yields processor assignments that are less than 3% off the optimal solutions but requires only 1% of the computational time.
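The abstract does not spell out the remapping heuristic, so the following is only one plausible greedy scheme under stated assumptions: a hypothetical similarity matrix records how much data of each new partition already resides on each processor, and the largest entries are matched first so that as little data as possible has to move.

```python
def greedy_remap(similarity):
    """similarity[p][q] = data volume of new partition q already stored on processor p.

    Returns a dict mapping each partition q to exactly one processor p.
    """
    n = len(similarity)
    entries = sorted(
        ((similarity[p][q], p, q) for p in range(n) for q in range(n)),
        key=lambda e: (-e[0], e[1], e[2]),   # largest volume first, ties by index
    )
    proc_of = {}
    used_procs = set()
    for volume, p, q in entries:
        if q not in proc_of and p not in used_procs:
            proc_of[q] = p
            used_procs.add(p)
    return proc_of

def moved_volume(similarity, proc_of):
    """Data that must be redistributed: everything not already in place."""
    total = sum(map(sum, similarity))
    kept = sum(similarity[p][q] for q, p in proc_of.items())
    return total - kept

# Made-up volumes for three processors and three new partitions.
similarity = [[10, 2, 0],
              [1, 8, 3],
              [4, 0, 9]]
proc_of = greedy_remap(similarity)
cost = moved_volume(similarity, proc_of)
```

A greedy pass like this costs only a sort of the matrix entries, whereas an optimal assignment requires solving a matching problem; that trade-off is the kind the abstract quantifies.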
Extending HPF for advanced data parallel applications

NASA Technical Reports Server (NTRS)

Chapman, Barbara; Mehrotra, Piyush; Zima, Hans

1994-01-01

The stated goal of High Performance Fortran (HPF) was to 'address the problems of writing data parallel programs where the distribution of data affects performance'. After examining the current version of the language we are led to the conclusion that HPF has not fully achieved this goal. While the basic distribution functions offered by the language - regular block, cyclic, and block cyclic distributions - can support regular numerical algorithms, advanced applications such as particle-in-cell codes or unstructured mesh solvers cannot be expressed adequately. We believe that this is a major weakness of HPF, significantly reducing its chances of becoming accepted in the numeric community. The paper discusses the data distribution and alignment issues in detail, points out some flaws in the basic language, and outlines possible future paths of development. Furthermore, we briefly deal with the issue of task parallelism and its integration with the data parallel paradigm of HPF.
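The three regular distributions named above reduce to simple index arithmetic, which is why they suit regular algorithms and not unstructured meshes. The sketch below uses 0-based indices (Fortran arrays are 1-based by default) and hypothetical function names; the explicit `irregular_map` array is an illustrative assumption showing the kind of user-defined mapping an unstructured solver would need instead.

```python
def block_owner(i, n, p):
    """BLOCK: contiguous chunks of ceil(n / p) elements per processor."""
    size = -(-n // p)          # integer ceiling of n / p
    return i // size

def cyclic_owner(i, p):
    """CYCLIC: elements dealt out round-robin."""
    return i % p

def block_cyclic_owner(i, k, p):
    """CYCLIC(k): blocks of k elements dealt out round-robin."""
    return (i // k) % p

n, p = 8, 3
block = [block_owner(i, n, p) for i in range(n)]
cyclic = [cyclic_owner(i, p) for i in range(n)]
block_cyclic = [block_cyclic_owner(i, 2, p) for i in range(n)]

# An unstructured mesh partition has no closed-form owner function:
# ownership comes from a partitioner and must be stored explicitly.
irregular_map = [2, 0, 0, 1, 2, 2, 1, 0]   # made-up partitioner output
irregular = [irregular_map[i] for i in range(n)]
```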
Research in Parallel Algorithms and Software for Computational Aerosciences

NASA Technical Reports Server (NTRS)

Domel, Neal D.

1996-01-01

Phase I is complete for the development of a Computational Fluid Dynamics parallel code with automatic grid generation and adaptation for the Euler analysis of flow over complex geometries. SPLITFLOW, an unstructured Cartesian grid code developed at Lockheed Martin Tactical Aircraft Systems, has been modified for a distributed memory/massively parallel computing environment. The parallel code is operational on an SGI network, Cray J90 and C90 vector machines, SGI Power Challenge, and Cray T3D and IBM SP2 massively parallel machines. Parallel Virtual Machine (PVM) is the message passing protocol for portability to various architectures. A domain decomposition technique was developed which enforces dynamic load balancing to improve solution speed and memory requirements. A host/node algorithm distributes the tasks. The solver parallelizes very well, and scales with the number of processors. Partially parallelized and non-parallelized tasks consume most of the wall clock time in a very fine grain environment. Timing comparisons on a Cray C90 demonstrate that Parallel SPLITFLOW runs 2.4 times faster on 8 processors than its non-parallel counterpart autotasked over 8 processors.
Research in Parallel Algorithms and Software for Computational Aerosciences

NASA Technical Reports Server (NTRS)

Domel, Neal D.

1996-01-01

Phase 1 is complete for the development of a computational fluid dynamics (CFD) parallel code with automatic grid generation and adaptation for the Euler analysis of flow over complex geometries. SPLITFLOW, an unstructured Cartesian grid code developed at Lockheed Martin Tactical Aircraft Systems, has been modified for a distributed memory/massively parallel computing environment. The parallel code is operational on an SGI network, Cray J90 and C90 vector machines, SGI Power Challenge, and Cray T3D and IBM SP2 massively parallel machines. Parallel Virtual Machine (PVM) is the message passing protocol for portability to various architectures. A domain decomposition technique was developed which enforces dynamic load balancing to improve solution speed and memory requirements. A host/node algorithm distributes the tasks. The solver parallelizes very well, and scales with the number of processors. Partially parallelized and non-parallelized tasks consume most of the wall clock time in a very fine grain environment. Timing comparisons on a Cray C90 demonstrate that Parallel SPLITFLOW runs 2.4 times faster on 8 processors than its non-parallel counterpart autotasked over 8 processors.

Recent Advances in Agglomerated Multigrid

NASA Technical Reports Server (NTRS)

Nishikawa, Hiroaki; Diskin, Boris; Thomas, James L.; Hammond, Dana P.

2013-01-01

We report recent advancements of the agglomerated multigrid methodology for complex flow simulations on fully unstructured grids. An agglomerated multigrid solver is applied to a wide range of test problems from simple two-dimensional geometries to realistic three-dimensional configurations. The solver is evaluated against a single-grid solver and, in some cases, against a structured-grid multigrid solver. Grid and solver issues are identified and overcome, leading to significant improvements over single-grid solvers.
elsA-Hybrid: an all-in-one structured/unstructured solver for the simulation of internal and external flows. Application to turbomachinery

NASA Astrophysics Data System (ADS)

de la Llave Plata, M.; Couaillier, V.; Le Pape, M.-C.; Marmignon, C.; Gazaix, M.

2013-03-01

This paper reports recent work on the extension of the multiblock structured solver elsA to deal with hybrid grids. The new hybrid-grid solver, called elsA-H (elsA-Hybrid), is based on a new unstructured-grid module that has been built within the original elsA CFD (computational fluid dynamics) system. The implementation benefits from the flexibility of the object-oriented design. The aim of elsA-H is to take advantage of the full potential of structured solvers and unstructured mesh generation by allowing any type of grid to be used within the same simulation process. The main challenge lies in the numerical treatment of the hybrid-grid interfaces where blocks of different types meet. In particular, one must pay attention to the transfer of information across these boundaries, so that the accuracy of the numerical scheme is preserved and flux conservation is guaranteed. In this paper, the numerical approach used to achieve this is presented. A comparison between the hybrid and the structured-grid methods is also carried out by considering a fully hexahedral multiblock mesh in which a few blocks have been converted to unstructured blocks. The performance of elsA-H for the simulation of internal flows is demonstrated on a number of turbomachinery configurations.
Three dimensional modelling of earthquake rupture cycles on frictional faults

NASA Astrophysics Data System (ADS)

Simpson, Guy; May, Dave

2017-04-01

We are developing an efficient MPI-parallel numerical method to simulate earthquake sequences on preexisting faults embedded within a three-dimensional viscoelastic half-space. We solve the velocity form of the elasto(visco)dynamic equations using a continuous Galerkin Finite Element Method on an unstructured pentahedral mesh, which permits local spatial refinement in the vicinity of the fault. Frictional sliding is coupled to the viscoelastic solid via rate- and state-dependent friction laws using the split-node technique. Our coupled formulation employs a Picard-type non-linear solver with a fully implicit, first-order accurate time integrator that uses an adaptive time step to evolve the system efficiently through multiple seismic cycles. The implementation leverages advanced parallel solvers, preconditioners and linear algebra from the Portable Extensible Toolkit for Scientific Computing (PETSc) library. The model can treat heterogeneous frictional properties and stress states on the fault and surrounding solid as well as non-planar fault geometries. Preliminary tests show that the model successfully reproduces dynamic rupture on a vertical strike-slip fault in a half-space governed by rate-state friction with the ageing law.
Rotor Airloads Prediction Using Unstructured Meshes and Loose CFD/CSD Coupling

NASA Technical Reports Server (NTRS)

Biedron, Robert T.; Lee-Rausch, Elizabeth M.

2008-01-01

The FUN3D unsteady Reynolds-averaged Navier-Stokes solver for unstructured grids has been modified to allow prediction of trimmed rotorcraft airloads. The trim of the rotorcraft and the aeroelastic deformation of the rotor blades are accounted for via loose coupling with the CAMRAD II rotorcraft computational structural dynamics code. The set of codes is used to analyze the HART-II Baseline, Minimum Noise and Minimum Vibration test conditions. The loose coupling approach is found to be stable and convergent for the cases considered. Comparison of the resulting airloads and structural deformations with experimentally measured data is presented. The effect of grid resolution and temporal accuracy is examined. Rotorcraft airloads prediction presents a very substantial challenge for Computational Fluid Dynamics (CFD). Not only must the unsteady nature of the flow be accurately modeled, but since most rotorcraft blades are not structurally stiff, an accurate simulation must account for the blade structural dynamics. In addition, trim of the rotorcraft to desired thrust and moment targets depends on both aerodynamic loads and structural deformation, and vice versa. Further, interaction of the fuselage with the rotor flow field can be important, so that relative motion between the blades and the fuselage must be accommodated. Thus a complete simulation requires coupled aerodynamics, structures and trim, with the ability to model geometrically complex configurations. NASA has recently initiated a Subsonic Rotary Wing (SRW) Project under the overall Fundamental Aeronautics Program. Within the context of SRW are efforts aimed at furthering the state of the art of high-fidelity rotorcraft flow simulations, using both structured and unstructured meshes. 
Structured-mesh solvers have an advantage in computation speed, but even though remarkably complex configurations may be accommodated using the overset grid approach, generation of complex structured-mesh systems can require months to set up. As a result, many rotorcraft simulations using structured-grid CFD neglect the fuselage. On the other hand, unstructured-mesh solvers are easily able to handle complex geometries, but suffer from slower execution speed. However, advances in both computer hardware and CFD algorithms have made previously state-of-the-art computations routine for unstructured-mesh solvers, so that rotorcraft simulations using unstructured grids are now viable. The aim of the present work is to develop a first principles rotorcraft simulation tool based on an unstructured CFD solver.
2nd-Order CESE Results For C1.4: Vortex Transport by Uniform Flow

NASA Technical Reports Server (NTRS)

Friedlander, David J.

2015-01-01

The Conservation Element and Solution Element (CESE) method was used as implemented in the NASA research code ez4d. The CESE method is a time-accurate formulation with flux conservation in both space and time. The method treats the discretized derivatives of space and time identically; while high-order versions exist, the 2nd-order accurate version was used. The ez4d code is an unstructured Navier-Stokes solver coded in C++ with serial and parallel versions available. As part of its architecture, ez4d can use multi-threading and the Message Passing Interface (MPI) for parallel runs.
An Optimized Multicolor Point-Implicit Solver for Unstructured Grid Applications on Graphics Processing Units

NASA Technical Reports Server (NTRS)

Zubair, Mohammad; Nielsen, Eric; Luitjens, Justin; Hammond, Dana

2016-01-01

In the field of computational fluid dynamics, the Navier-Stokes equations are often solved using an unstructured-grid approach to accommodate geometric complexity. Implicit solution methodologies for such spatial discretizations generally require frequent solution of large tightly-coupled systems of block-sparse linear equations. The multicolor point-implicit solver used in the current work typically requires a significant fraction of the overall application run time. In this work, an efficient implementation of the solver for graphics processing units is proposed. Several factors present unique challenges to achieving an efficient implementation in this environment. These include the variable amount of parallelism available in different kernel calls, indirect memory access patterns, low arithmetic intensity, and the requirement to support variable block sizes. In this work, the solver is reformulated to use standard sparse and dense Basic Linear Algebra Subprograms (BLAS) functions. However, numerical experiments show that the performance of the BLAS functions available in existing CUDA libraries is suboptimal for matrices representative of those encountered in actual simulations. Instead, optimized versions of these functions are developed. Depending on block size, the new implementations show performance gains of up to 7x over the existing CUDA library functions.
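The idea behind a multicolor point-implicit sweep can be shown with scalar unknowns: rows are colored so that no two coupled rows share a color, and all rows of one color are then updated together from the current values of the other colors. This is a simplified serial sketch, not the GPU solver of the paper; it assumes a dense matrix with a symmetric sparsity pattern and scalar rather than block entries, and the function names are hypothetical.

```python
def greedy_coloring(A):
    """Color rows so that no two coupled rows share a color (symmetric pattern assumed)."""
    n = len(A)
    color = [-1] * n
    for i in range(n):
        # Colors already used by neighbours of row i (-1 means "not yet colored").
        taken = {color[j] for j in range(n) if j != i and A[i][j] != 0.0}
        c = 0
        while c in taken:
            c += 1
        color[i] = c
    return color

def multicolor_sweep(A, b, x, color):
    """One Gauss-Seidel sweep performed color by color, updating x in place."""
    n = len(b)
    for c in range(max(color) + 1):
        rows = [i for i in range(n) if color[i] == c]
        # Rows of one color do not couple to each other, so these updates are
        # independent and could be computed concurrently.
        new_vals = [
            (b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
            for i in rows
        ]
        for i, v in zip(rows, new_vals):
            x[i] = v
    return x

# Toy system: 1-D Poisson matrix, which needs only two colors (red-black).
n = 6
A = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
b = [1.0] * n
color = greedy_coloring(A)
x = [0.0] * n
for _ in range(200):
    multicolor_sweep(A, b, x, color)
residual = max(abs(b[i] - sum(A[i][j] * x[j] for j in range(n))) for i in range(n))
```

The number of rows per color varies from color to color, which corresponds to the variable amount of parallelism per kernel call mentioned in the abstract.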
Anisotropic three-dimensional inversion of CSEM data using finite-element techniques on unstructured grids

NASA Astrophysics Data System (ADS)

Wang, Feiyan; Morten, Jan Petter; Spitzer, Klaus

2018-05-01

In this paper, we present a recently developed anisotropic 3-D inversion framework for interpreting controlled-source electromagnetic (CSEM) data in the frequency domain. The framework integrates a high-order finite-element forward operator and a Gauss-Newton inversion algorithm. Conductivity constraints are applied using a parameter transformation. We discretize the continuous forward and inverse problems on unstructured grids for a flexible treatment of arbitrarily complex geometries. Moreover, an unstructured mesh is more desirable in comparison to a single rectilinear mesh for multisource problems because local grid refinement will not significantly influence the mesh density outside the region of interest. The non-uniform spatial discretization facilitates parametrization of the inversion domain at a suitable scale. For a rapid simulation of multisource EM data, we opt to use a parallel direct solver. We further accelerate the inversion process by decomposing the entire data set into subsets with respect to frequencies (and transmitters if memory requirement is affordable). The computational tasks associated with each data subset are distributed to different processes and run in parallel. We validate the scheme using a synthetic marine CSEM model with rough bathymetry, and finally, apply it to an industrial-size 3-D data set from the Troll field oil province in the North Sea acquired in 2008 to examine its robustness and practical applicability.
Computation of UH-60A Airloads Using CFD/CSD Coupling on Unstructured Meshes

NASA Technical Reports Server (NTRS)

Biedron, Robert T.; Lee-Rausch, Elizabeth M.

2011-01-01

An unsteady Reynolds-averaged Navier-Stokes solver for unstructured grids is used to compute the rotor airloads on the UH-60A helicopter at high-speed and high thrust conditions. The flow solver is coupled to a rotorcraft comprehensive code in order to account for trim and aeroelastic deflections. Simulations are performed both with and without the fuselage, and the effects of grid resolution, temporal resolution and turbulence model are examined. Computed airloads are compared to flight data.
An unstructured shock-fitting solver for hypersonic plasma flows in chemical non-equilibrium

NASA Astrophysics Data System (ADS)

Pepe, R.; Bonfiglioli, A.; D'Angola, A.; Colonna, G.; Paciorri, R.

2015-11-01

A CFD solver, using Residual Distribution Schemes on unstructured grids, has been extended to deal with inviscid chemical non-equilibrium flows. The conservative equations have been coupled with a kinetic model for argon plasma which includes the argon metastable state as an independent species, taking into account electron-atom and atom-atom processes. Results for a hypersonic flow around an infinite cylinder, obtained by using both shock-capturing and shock-fitting approaches, show the higher accuracy of the shock-fitting approach.
Time-Accurate Local Time Stepping and High-Order Time CESE Methods for Multi-Dimensional Flows Using Unstructured Meshes

NASA Technical Reports Server (NTRS)

Chang, Chau-Lyan; Venkatachari, Balaji Shankar; Cheng, Gary

2013-01-01

With the wide availability of affordable multiple-core parallel supercomputers, next generation numerical simulations of flow physics are being focused on unsteady computations for problems involving multiple time scales and multiple physics. These simulations require higher solution accuracy than most algorithms and computational fluid dynamics codes currently available. This paper focuses on the developmental effort for high-fidelity multi-dimensional, unstructured-mesh flow solvers using the space-time conservation element, solution element (CESE) framework. Two approaches have been investigated in this research in order to provide high-accuracy, cross-cutting numerical simulations for a variety of flow regimes: 1) time-accurate local time stepping and 2) high-order CESE method. The first approach utilizes consistent numerical formulations in the space-time flux integration to preserve temporal conservation across the cells with different marching time steps. Such an approach relieves the stringent time step constraint associated with the smallest time step in the computational domain while preserving temporal accuracy for all the cells. For flows involving multiple scales, both numerical accuracy and efficiency can be significantly enhanced. The second approach extends the current CESE solver to higher-order accuracy. Unlike other existing explicit high-order methods for unstructured meshes, the CESE framework maintains a CFL condition of one for arbitrarily high-order formulations while retaining the same compact stencil as its second-order counterpart. For large-scale unsteady computations, this feature substantially enhances numerical efficiency. Numerical formulations and validations using benchmark problems are discussed in this paper along with realistic examples.
Prospects and expectations for unstructured methods

NASA Technical Reports Server (NTRS)

Baker, Timothy J.

1995-01-01

The last decade has witnessed a vigorous and sustained research effort on unstructured methods for computational fluid dynamics. Unstructured mesh generators and flow solvers have evolved to the point where they are now in use for design purposes throughout the aerospace industry. In this paper we survey the various mesh types, structured as well as unstructured, and examine their relative strengths and weaknesses. We argue that unstructured methodology does offer the best prospect for the next generation of computational fluid dynamics algorithms.
Numerical Simulations of Aero-Optical Distortions Around Various Turret Geometries

DTIC Science & Technology

2013-06-12

arbitrary cell topologies. The spatial operator uses the exact Riemann Solver of Gottlieb and Groth, least squares gradient calculations using QR...Unstructured Euler/Navier-Stokes Flow Solver," in AIAA Paper 1999-0786, 1999. [9] J. J. Gottlieb and C. P. T. Groth, "Assessment of Riemann Solvers
USM3D Unstructured Grid Solutions for CAWAPI at NASA LaRC

NASA Technical Reports Server (NTRS)

Lamar, John E.; Abdol-Hamid, Khaled S.

2007-01-01

In support of the Cranked Arrow Wing Aerodynamic Project International (CAWAPI) to improve the Technology Readiness Level of flow solvers by comparing results with measured F-16XL-1 flight data, NASA Langley employed the TetrUSS unstructured grid solver, USM3D, to obtain solutions for all seven flight conditions of interest. A newly available solver version that incorporates a number of turbulence models, including the two-equation linear and non-linear k-epsilon, was used in this study. As a first test, a choice was made to utilize only a single grid resolution with the solver for the simulation of the different flight conditions. Comparisons are presented with three turbulence models in USM3D, flight data for surface pressure, boundary-layer profiles, and skin-friction results, as well as limited predictions from other solvers. A result of these comparisons is that the USM3D solver can be used in an engineering environment to predict flow physics on a complex configuration at flight Reynolds numbers with a two-equation linear k-epsilon turbulence model.
Large-scale 3D geoelectromagnetic modeling using parallel adaptive high-order finite element method

DOE PAGES

Grayver, Alexander V.; Kolev, Tzanio V.

2015-11-01

Here, we have investigated the use of the adaptive high-order finite-element method (FEM) for geoelectromagnetic modeling. Because high-order FEM is challenging from the numerical and computational points of view, most published finite-element studies in geoelectromagnetics use the lowest order formulation. Solution of the resulting large system of linear equations poses the main practical challenge. We have developed a fully parallel and distributed robust and scalable linear solver based on the optimal block-diagonal and auxiliary space preconditioners. The solver was found to be efficient for high finite element orders, unstructured and nonconforming locally refined meshes, a wide range of frequencies, large conductivity contrasts, and number of degrees of freedom (DoFs). Furthermore, the presented linear solver is in essence algebraic; i.e., it acts on the matrix-vector level and thus requires no information about the discretization, boundary conditions, or physical source used, making it readily efficient for a wide range of electromagnetic modeling problems. To get accurate solutions at reduced computational cost, we have also implemented goal-oriented adaptive mesh refinement. The numerical tests indicated that if highly accurate modeling results were required, the high-order FEM in combination with the goal-oriented local mesh refinement required less computational time and DoFs than the lowest order adaptive FEM.
Large-scale 3D geoelectromagnetic modeling using parallel adaptive high-order finite element method

DOE Office of Scientific and Technical Information (OSTI.GOV)

Grayver, Alexander V.; Kolev, Tzanio V.

Here, we have investigated the use of the adaptive high-order finite-element method (FEM) for geoelectromagnetic modeling. Because high-order FEM is challenging from the numerical and computational points of view, most published finite-element studies in geoelectromagnetics use the lowest order formulation. Solution of the resulting large system of linear equations poses the main practical challenge. We have developed a fully parallel and distributed robust and scalable linear solver based on the optimal block-diagonal and auxiliary space preconditioners. The solver was found to be efficient for high finite element orders, unstructured and nonconforming locally refined meshes, a wide range of frequencies, large conductivity contrasts, and number of degrees of freedom (DoFs). Furthermore, the presented linear solver is in essence algebraic; i.e., it acts on the matrix-vector level and thus requires no information about the discretization, boundary conditions, or physical source used, making it readily efficient for a wide range of electromagnetic modeling problems. To get accurate solutions at reduced computational cost, we have also implemented goal-oriented adaptive mesh refinement. The numerical tests indicated that if highly accurate modeling results were required, the high-order FEM in combination with the goal-oriented local mesh refinement required less computational time and DoFs than the lowest order adaptive FEM.
Geometrically Flexible and Efficient Flow Analysis of High Speed Vehicles Via Domain Decomposition, Part 1: Unstructured-Grid Solver for High Speed Flows

NASA Technical Reports Server (NTRS)

White, Jeffery A.; Baurle, Robert A.; Passe, Bradley J.; Spiegel, Seth C.; Nishikawa, Hiroaki

2017-01-01

The ability to solve the equations governing the hypersonic turbulent flow of a real gas on unstructured grids using a spatially-elliptic, 2nd-order accurate, cell-centered, finite-volume method has been recently implemented in the VULCAN-CFD code. This paper describes the key numerical methods and techniques that were found to be required to robustly obtain accurate solutions to hypersonic flows on non-hex-dominant unstructured grids. The methods and techniques described include: an augmented stencil, weighted linear least squares, cell-average gradient method, a robust multidimensional cell-average gradient-limiter process that is consistent with the augmented stencil of the cell-average gradient method and a cell-face gradient method that contains a cell skewness sensitive damping term derived using hyperbolic diffusion based concepts. A data-parallel matrix-based symmetric Gauss-Seidel point-implicit scheme, used to solve the governing equations, is described and shown to be more robust and efficient than a matrix-free alternative. In addition, a y+ adaptive turbulent wall boundary condition methodology is presented. This boundary condition methodology is designed to automatically switch between a solve-to-the-wall and a wall-matching-function boundary condition based on the local y+ of the 1st cell center off the wall. The aforementioned methods and techniques are then applied to a series of hypersonic and supersonic turbulent flat plate unit tests to examine the efficiency, robustness and convergence behavior of the implicit scheme and to determine the ability of the solve-to-the-wall and y+ adaptive turbulent wall boundary conditions to reproduce the turbulent law-of-the-wall.
Finally, the thermally perfect, chemically frozen, Mach 7.8 turbulent flow of air through a scramjet flow-path is computed and compared with experimental data to demonstrate the robustness, accuracy and convergence behavior of the unstructured-grid solver for a realistic 3-D geometry on a non-hex-dominant grid.
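A weighted linear least-squares cell-average gradient, one of the ingredients listed above, can be sketched in two dimensions as follows. This is a generic textbook form, not the VULCAN-CFD implementation: the inverse-distance weighting, the `power` parameter, the function name, and the neighbor stencil are all assumptions made for illustration.

```python
def wls_gradient(xc, phi_c, neighbors, power=1.0):
    """Weighted least-squares gradient of phi at cell center xc.

    neighbors: iterable of (x, y, phi) for the stencil cells.
    Each equation is weighted by w = 1 / d**power, with d the distance to xc.
    """
    sxx = sxy = syy = bx = by = 0.0
    for x, y, phi in neighbors:
        dx, dy, dphi = x - xc[0], y - xc[1], phi - phi_c
        w2 = 1.0 / (dx * dx + dy * dy) ** power   # squared weight, w**2
        sxx += w2 * dx * dx
        sxy += w2 * dx * dy
        syy += w2 * dy * dy
        bx += w2 * dx * dphi
        by += w2 * dy * dphi
    # Solve the 2x2 normal equations by Cramer's rule.
    det = sxx * syy - sxy * sxy
    return ((syy * bx - sxy * by) / det, (sxx * by - sxy * bx) / det)

# Check on a linear field, for which the reconstruction should be exact.
def field(x, y):
    return 1.0 + 2.0 * x - 3.0 * y

center = (0.2, 0.1)
points = [(1.0, 0.3), (-0.4, 0.9), (0.1, -0.8), (-0.7, -0.2), (0.9, 1.1)]
stencil = [(x, y, field(x, y)) for x, y in points]
gx, gy = wls_gradient(center, field(*center), stencil)
```

Enlarging the stencil beyond face neighbors, as the augmented stencil in the abstract does, adds more rows to the same least-squares system and tends to keep it well conditioned on skewed cells.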
Global Load Balancing with Parallel Mesh Adaption on Distributed-Memory Systems

NASA Technical Reports Server (NTRS)

Biswas, Rupak; Oliker, Leonid; Sohn, Andrew

1996-01-01

Dynamic mesh adaptation on unstructured grids is a powerful tool for efficiently computing unsteady problems to resolve solution features of interest. Unfortunately, this causes load imbalances among processors on a parallel machine. This paper describes the parallel implementation of a tetrahedral mesh adaption scheme and a new global load balancing method. A heuristic remapping algorithm is presented that assigns partitions to processors such that the redistribution cost is minimized. Results indicate that the parallel performance of the mesh adaption code depends on the nature of the adaption region and show a 35.5X speedup on 64 processors of an SP2 when 35 percent of the mesh is randomly adapted. For large scale scientific computations, our load balancing strategy gives an almost sixfold reduction in solver execution times over non-balanced loads. Furthermore, our heuristic remapper yields processor assignments that are less than 3 percent off the optimal solutions, but requires only 1 percent of the computational time.
An implicit higher-order spatially accurate scheme for solving time dependent flows on unstructured meshes

NASA Astrophysics Data System (ADS)

Tomaro, Robert F.

1998-07-01

The present research is aimed at developing a higher-order, spatially accurate scheme for both steady and unsteady flow simulations using unstructured meshes. The resulting scheme must work on a variety of general problems to ensure the creation of a flexible, reliable and accurate aerodynamic analysis tool. To calculate the flow around complex configurations, unstructured grids and the associated flow solvers have been developed. Efficient simulations require the minimum use of computer memory and computational times. Unstructured flow solvers typically require more computer memory than a structured flow solver due to the indirect addressing of the cells. The approach taken in the present research was to modify an existing three-dimensional unstructured flow solver to first decrease the computational time required for a solution and then to increase the spatial accuracy. The terms required to simulate flow involving non-stationary grids were also implemented. First, an implicit solution algorithm was implemented to replace the existing explicit procedure. Several test cases, including internal and external, inviscid and viscous, two-dimensional, three-dimensional and axi-symmetric problems, were simulated for comparison between the explicit and implicit solution procedures. The increased efficiency and robustness of modified code due to the implicit algorithm was demonstrated. Two unsteady test cases, a plunging airfoil and a wing undergoing bending and torsion, were simulated using the implicit algorithm modified to include the terms required for a moving and/or deforming grid. Secondly, a higher than second-order spatially accurate scheme was developed and implemented into the baseline code. Third- and fourth-order spatially accurate schemes were implemented and tested. The original dissipation was modified to include higher-order terms and modified near shock waves to limit pre- and post-shock oscillations. 
The unsteady cases were repeated using the higher-order spatially accurate code. The new solutions were compared with those obtained using the second-order spatially accurate scheme. Finally, the increased efficiency of using an implicit solution algorithm in a production Computational Fluid Dynamics flow solver was demonstrated for steady and unsteady flows. A third- and fourth-order spatially accurate scheme has been implemented creating a basis for a state-of-the-art aerodynamic analysis tool.
The implementation of an aeronautical CFD flow code onto distributed memory parallel systems

NASA Astrophysics Data System (ADS)

Ierotheou, C. S.; Forsey, C. R.; Leatham, M.

2000-04-01

The parallelization of an industrially important in-house computational fluid dynamics (CFD) code for calculating the airflow over complex aircraft configurations using the Euler or Navier-Stokes equations is presented. The code discussed is the flow solver module of the SAUNA CFD suite. This suite uses a novel grid system that may include block-structured hexahedral or pyramidal grids, unstructured tetrahedral grids or a hybrid combination of both. To assist in the rapid convergence to a solution, a number of convergence acceleration techniques are employed including implicit residual smoothing and a multigrid full approximation storage scheme (FAS). Key features of the parallelization approach are the use of domain decomposition and encapsulated message passing to enable the execution in parallel using a single programme multiple data (SPMD) paradigm. In the case where a hybrid grid is used, a unified grid partitioning scheme is employed to define the decomposition of the mesh. The parallel code has been tested using both structured and hybrid grids on a number of different distributed memory parallel systems and is now routinely used to perform industrial scale aeronautical simulations.
Development of iterative techniques for the solution of unsteady compressible viscous flows

NASA Technical Reports Server (NTRS)

Hixon, Duane; Sankar, L. N.

1993-01-01

During the past two decades, there has been significant progress in the field of numerical simulation of unsteady compressible viscous flows. At present, a variety of solution techniques exist such as the transonic small disturbance analyses (TSD), transonic full potential equation-based methods, unsteady Euler solvers, and unsteady Navier-Stokes solvers. These advances have been made possible by developments in three areas: (1) improved numerical algorithms; (2) automation of body-fitted grid generation schemes; and (3) advanced computer architectures with vector processing and massively parallel processing features. In this work, the GMRES scheme has been considered as a candidate for acceleration of a Newton iteration time marching scheme for unsteady 2-D and 3-D compressible viscous flow calculations; preliminary calculations indicate that this will provide up to a 65 percent reduction in the computer time requirements over the existing class of explicit and implicit time marching schemes. The proposed method has been tested on structured grids, but is flexible enough for extension to unstructured grids. The described scheme has been tested only on the current generation of vector processor architecture of the Cray Y/MP class, but should be suitable for adaptation to massively parallel machines.
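The GMRES scheme mentioned in the abstract above can be illustrated with a small, dense, unrestarted sketch. This is not the authors' implementation: the function names (`dot`, `matvec`, `gmres`) and the tiny test matrix are assumptions made for illustration, and a production flow solver would use a preconditioned, restarted, matrix-free variant. In a Newton time-marching context, `A` would play the role of the Jacobian and `b` the negative residual.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, x):
    return [dot(row, x) for row in A]

def gmres(A, b, tol=1e-12):
    """Solve A x = b from a zero initial guess with unrestarted GMRES."""
    n = len(b)
    beta = math.sqrt(dot(b, b))
    if beta == 0.0:
        return [0.0] * n
    V = [[bi / beta for bi in b]]   # orthonormal Krylov basis vectors
    R = []                          # columns of the rotated (upper triangular) Hessenberg matrix
    cs, sn = [], []                 # Givens rotation coefficients
    g = [beta]                      # rotated right-hand side; |g[-1]| is the residual norm
    for j in range(n):
        w = matvec(A, V[j])
        h = []
        for v in V:                 # modified Gram-Schmidt orthogonalization
            hij = dot(w, v)
            h.append(hij)
            w = [wk - hij * vk for wk, vk in zip(w, v)]
        h_next = math.sqrt(dot(w, w))
        h.append(h_next)
        for i in range(j):          # apply earlier rotations to the new column
            h[i], h[i + 1] = (cs[i] * h[i] + sn[i] * h[i + 1],
                              -sn[i] * h[i] + cs[i] * h[i + 1])
        r = math.hypot(h[j], h[j + 1])
        c, s = h[j] / r, h[j + 1] / r
        cs.append(c)
        sn.append(s)
        h[j], h[j + 1] = r, 0.0
        g.append(-s * g[j])
        g[j] = c * g[j]
        R.append(h)
        if abs(g[j + 1]) <= tol * beta or h_next == 0.0:
            break
        V.append([wk / h_next for wk in w])
    k = len(R)
    y = [0.0] * k
    for i in reversed(range(k)):    # back substitution for the least-squares solution
        acc = g[i] - sum(R[col][i] * y[col] for col in range(i + 1, k))
        y[i] = acc / R[i][i]
    return [sum(y[col] * V[col][i] for col in range(k)) for i in range(n)]

# Illustrative 3x3 system (hypothetical stand-in for a Jacobian and a residual).
A_demo = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
b_demo = [1.0, 2.0, 3.0]
x_demo = gmres(A_demo, b_demo)
```

The Givens rotations keep the least-squares problem triangular, so the residual norm is available at every iteration without forming the solution.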

Unsteady flow simulations around complex geometries using stationary or rotating unstructured grids

NASA Astrophysics Data System (ADS)

Sezer-Uzol, Nilay

In this research, the computational analysis of three-dimensional, unsteady, separated, vortical flows around complex geometries is studied by using stationary or moving unstructured grids. Two main engineering problems are investigated. The first problem is the unsteady simulation of a ship airwake, where helicopter operations become even more challenging, by using stationary unstructured grids. The second problem is the unsteady simulation of wind turbine rotor flow fields by using moving unstructured grids which are rotating with the whole three-dimensional rigid rotor geometry. The three dimensional, unsteady, parallel, unstructured, finite volume flow solver, PUMA2, is used for the computational fluid dynamics (CFD) simulations considered in this research. The code is modified to have a moving grid capability to perform three-dimensional, time-dependent rotor simulations. An instantaneous log-law wall model for Large Eddy Simulations is also implemented in PUMA2 to investigate the very large Reynolds number flow fields of rotating blades. To verify the code modifications, several sample test cases are also considered. In addition, interdisciplinary studies, which are aiming to provide new tools and insights to the aerospace and wind energy scientific communities, are done during this research by focusing on the coupling of ship airwake CFD simulations with the helicopter flight dynamics and control analysis, the coupling of wind turbine rotor CFD simulations with the aeroacoustic analysis, and the analysis of these time-dependent and large-scale CFD simulations with the help of a computational monitoring, steering and visualization tool, POSSE.
An assessment of the adaptive unstructured tetrahedral grid, Euler Flow Solver Code FELISA

NASA Technical Reports Server (NTRS)

Djomehri, M. Jahed; Erickson, Larry L.

1994-01-01

A three-dimensional solution-adaptive Euler flow solver for unstructured tetrahedral meshes is assessed, and the accuracy and efficiency of the method for predicting sonic boom pressure signatures about simple generic models are demonstrated. Comparison of computational and wind tunnel data and enhancement of numerical solutions by means of grid adaptivity are discussed. The mesh generation is based on the advancing front technique. The FELISA code consists of two solvers, the Taylor-Galerkin and the Runge-Kutta-Galerkin schemes, both of which are spatially discretized by the usual Galerkin weighted residual finite-element methods but with different explicit time-marching schemes to steady state. The solution-adaptive grid procedure is based on either remeshing or mesh refinement techniques. An alternative geometry adaptive procedure is also incorporated.
LAVA Simulations for the 3rd AIAA CFD High Lift Prediction Workshop with Body Fitted Grids

NASA Technical Reports Server (NTRS)

Jensen, James C.; Stich, Gerrit-Daniel; Housman, Jeffrey A.; Denison, Marie; Kiris, Cetin C.

2018-01-01

In response to the 3rd AIAA CFD High Lift Prediction Workshop, the workshop cases were analyzed using Reynolds-averaged Navier-Stokes flow solvers within the Launch Ascent and Vehicle Aerodynamics (LAVA) solver framework. For the workshop cases the advantages and limitations of both overset-structured and unstructured polyhedral meshes were assessed. The workshop included 3 cases: a 2D airfoil validation case, a mesh convergence study using the High Lift Common Research Model, and a nacelle/pylon integration study using the JAXA (Japan Aerospace Exploration Agency) Standard Model. The 2D airfoil case from the workshop is used to verify the implementation of the Spalart-Allmaras turbulence model along with some of its variants within the solver. The High Lift Common Research Model case is used to assess solver performance and accuracy at varying mesh resolutions, as well as identify the minimum mesh fidelity required for LAVA on this class of problem. The JAXA Standard Model case is used to assess the solver's sensitivity to the turbulence model and to compare the structured and unstructured mesh paradigms. These workshop cases have helped establish best practices for high lift flow configurations for the LAVA solver.
Divergence-free MHD on unstructured meshes using high order finite volume schemes based on multidimensional Riemann solvers

NASA Astrophysics Data System (ADS)

Balsara, Dinshaw S.; Dumbser, Michael

2015-10-01

Several advances have been reported in the recent literature on divergence-free finite volume schemes for Magnetohydrodynamics (MHD). Almost all of these advances are restricted to structured meshes. To retain full geometric versatility, however, it is also very important to make analogous advances in divergence-free schemes for MHD on unstructured meshes. Such schemes utilize a staggered Yee-type mesh, where all hydrodynamic quantities (mass, momentum and energy density) are cell-centered, while the magnetic fields are face-centered and the electric fields, which are so useful for the time update of the magnetic field, are centered at the edges. Three important advances are brought together in this paper in order to make it possible to have high order accurate finite volume schemes for the MHD equations on unstructured meshes. First, it is shown that a divergence-free WENO reconstruction of the magnetic field can be developed for unstructured meshes in two and three space dimensions using a classical cell-centered WENO algorithm, without the need to do a WENO reconstruction for the magnetic field on the faces. This is achieved via a novel constrained L2-projection operator that is used in each time step as a postprocessor of the cell-centered WENO reconstruction so that the magnetic field becomes locally and globally divergence free. Second, it is shown that recently-developed genuinely multidimensional Riemann solvers (called MuSIC Riemann solvers) can be used on unstructured meshes to obtain a multidimensionally upwinded representation of the electric field at each edge. Third, the above two innovations work well together with a high order accurate one-step ADER time stepping strategy, which requires the divergence-free nonlinear WENO reconstruction procedure to be carried out only once per time step. 
The resulting divergence-free ADER-WENO schemes with MuSIC Riemann solvers give us an efficient and easily-implemented strategy for divergence-free MHD on unstructured meshes. Several stringent two- and three-dimensional problems are shown to work well with the methods presented here.
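The staggered arrangement described in the abstract above (face-centered magnetic field, edge-centered electric field) is what makes a divergence-free update possible. The sketch below shows the idea on a deliberately simplified two-dimensional structured mesh, not on the unstructured meshes or with the WENO/ADER machinery of the paper; the function names `divergence` and `ct_update` and all grid parameters are assumptions for illustration only.

```python
import math

def divergence(bx, by, dx, dy):
    """Cell-centered div B from face-centered components.
    bx has shape (nx+1) x ny (x-faces); by has shape nx x (ny+1) (y-faces)."""
    nx, ny = len(by), len(bx[0])
    return [[(bx[i + 1][j] - bx[i][j]) / dx + (by[i][j + 1] - by[i][j]) / dy
             for j in range(ny)] for i in range(nx)]

def ct_update(bx, by, ez, dx, dy, dt):
    """One forward-Euler step of Faraday's law with Ez stored at cell corners
    (the 2D analogue of edges): dBx/dt = -dEz/dy, dBy/dt = +dEz/dx."""
    nx, ny = len(by), len(bx[0])
    for i in range(nx + 1):
        for j in range(ny):
            bx[i][j] -= dt * (ez[i][j + 1] - ez[i][j]) / dy
    for i in range(nx):
        for j in range(ny + 1):
            by[i][j] += dt * (ez[i + 1][j] - ez[i][j]) / dx

# Hypothetical demo: start from B = 0 and apply an arbitrary corner field Ez.
nx, ny, dx, dy, dt = 6, 5, 0.1, 0.2, 0.01
bx = [[0.0] * ny for _ in range(nx + 1)]
by = [[0.0] * (ny + 1) for _ in range(nx)]
ez = [[math.sin(0.7 * i + 1.3 * j) for j in range(ny + 1)] for i in range(nx + 1)]
ct_update(bx, by, ez, dx, dy, dt)
max_div = max(abs(d) for row in divergence(bx, by, dx, dy) for d in row)
max_b = max(abs(v) for row in bx for v in row)
```

Each corner value of Ez enters the divergence of a cell twice with opposite signs, so the discrete divergence stays at round-off level no matter what Ez is.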
Computation of an Underexpanded 3-D Rectangular Jet by the CE/SE Method

NASA Technical Reports Server (NTRS)

Loh, Ching Y.; Himansu, Ananda; Wang, Xiao Y.; Jorgenson, Philip C. E.

2000-01-01

Recently, an unstructured three-dimensional space-time conservation element and solution element (CE/SE) Euler solver was developed. It has now also been extended to parallel computation using METIS for domain decomposition and MPI (message passing interface). The method is employed here to numerically study the near-field of a typical 3-D rectangular under-expanded jet. For the computed case, a jet with Mach number Mj = 1.6 and a very modest grid of 1.7 million tetrahedrons, the flow features, such as the shock-cell structures and the axis switching, are in good qualitative agreement with experimental results.
Modeling dam-break flows using finite volume method on unstructured grid

USDA-ARS's Scientific Manuscript database

Two-dimensional shallow water models based on unstructured finite volume method and approximate Riemann solvers for computing the intercell fluxes have drawn growing attention because of their robustness, high adaptivity to complicated geometry and ability to simulate flows with mixed regimes and di...
Highly efficient spatial data filtering in parallel using the opensource library CPPPO

NASA Astrophysics Data System (ADS)

Municchi, Federico; Goniva, Christoph; Radl, Stefan

2016-10-01

CPPPO is a compilation of parallel data processing routines developed with the aim to create a library for "scale bridging" (i.e. connecting different scales by means of closure models) in a multi-scale approach. CPPPO features a number of parallel filtering algorithms designed for use with structured and unstructured Eulerian meshes, as well as Lagrangian data sets. In addition, data can be processed on the fly, allowing the collection of relevant statistics without saving individual snapshots of the simulation state. Our library is provided with an interface to the widely-used CFD solver OpenFOAM®, and can be easily connected to any other software package via interface modules. Also, we introduce a novel, extremely efficient approach to parallel data filtering, and show that our algorithms scale super-linearly on multi-core clusters. Furthermore, we provide a guideline for choosing the optimal Eulerian cell selection algorithm depending on the number of CPU cores used. Finally, we demonstrate the accuracy and the parallel scalability of CPPPO in a showcase focusing on heat and mass transfer from a dense bed of particles.
Multidimensional upwind hydrodynamics on unstructured meshes using graphics processing units - I. Two-dimensional uniform meshes

NASA Astrophysics Data System (ADS)

Paardekooper, S.-J.

2017-08-01

We present a new method for numerical hydrodynamics which uses a multidimensional generalization of the Roe solver and operates on an unstructured triangular mesh. The main advantage over traditional methods based on Riemann solvers, which commonly use one-dimensional flux estimates as building blocks for a multidimensional integration, is its inherently multidimensional nature, and as a consequence its ability to recognize multidimensional stationary states that are not hydrostatic. A second novelty is the focus on graphics processing units (GPUs). By tailoring the algorithms specifically to GPUs, we are able to get speedups of 100-250 compared to a desktop machine. We compare the multidimensional upwind scheme to a traditional, dimensionally split implementation of the Roe solver on several test problems, and we find that the new method significantly outperforms the Roe solver in almost all cases. This comes with increased computational costs per time-step, which makes the new method approximately a factor of 2 slower than a dimensionally split scheme acting on a structured grid.
plasmaFoam: An OpenFOAM framework for computational plasma physics and chemistry

NASA Astrophysics Data System (ADS)

Venkattraman, Ayyaswamy; Verma, Abhishek Kumar

2016-09-01

As emphasized in the 2012 Roadmap for low temperature plasmas (LTP), scientific computing has emerged as an essential tool for the investigation and prediction of the fundamental physical and chemical processes associated with these systems. While several in-house and commercial codes exist, with each having its own advantages and disadvantages, a common framework that can be developed by researchers from all over the world will likely accelerate the impact of computational studies on advances in low-temperature plasma physics and chemistry. In this regard, we present a finite volume computational toolbox to perform high-fidelity simulations of LTP systems. This framework, primarily based on the OpenFOAM solver suite, allows us to enhance our understanding of multiscale plasma phenomena by performing massively parallel, three-dimensional simulations on unstructured meshes using well-established high performance computing tools that are widely used in the computational fluid dynamics community. In this talk, we will present preliminary results obtained using the OpenFOAM-based solver suite with benchmark three-dimensional simulations of microplasma devices including both dielectric and plasma regions. We will also discuss the future outlook for the solver suite.
Parallel, Gradient-Based Anisotropic Mesh Adaptation for Re-entry Vehicle Configurations

NASA Technical Reports Server (NTRS)

Bibb, Karen L.; Gnoffo, Peter A.; Park, Michael A.; Jones, William T.

2006-01-01

Two gradient-based adaptation methodologies have been implemented into the Fun3d refine GridEx infrastructure. A spring-analogy adaptation which provides for nodal movement to cluster mesh nodes in the vicinity of strong shocks has been extended for general use within Fun3d, and is demonstrated for a 70-degree sphere cone at Mach 2. A more general feature-based adaptation metric has been developed for use with the adaptation mechanics available in Fun3d, and is applicable to any unstructured, tetrahedral flow solver. The basic functionality of general adaptation is explored through a case of flow over the forebody of a 70-degree sphere cone at Mach 6. A practical application of Mach 10 flow over an Apollo capsule, computed with the Felisa flow solver, is given to compare the adaptive mesh refinement with uniform mesh refinement. The examples of the paper demonstrate that the gradient-based adaptation capability as implemented can give an improvement in solution quality.
Implicit flux-split Euler schemes for unsteady aerodynamic analysis involving unstructured dynamic meshes

NASA Technical Reports Server (NTRS)

Batina, John T.

1990-01-01

Improved algorithms for the solution of the time-dependent Euler equations are presented for unsteady aerodynamic analysis involving unstructured dynamic meshes. The improvements have recently been made to the spatial and temporal discretizations used by unstructured grid flow solvers. The spatial discretization involves a flux-split approach which is naturally dissipative and captures shock waves sharply with at most one grid point within the shock structure. The temporal discretization involves an implicit time-integration scheme using a Gauss-Seidel relaxation procedure which is computationally efficient for either steady or unsteady flow problems. For example, very large time steps may be used for rapid convergence to steady state, and the step size for unsteady cases may be selected for temporal accuracy rather than for numerical stability. Steady and unsteady flow results are presented for the NACA 0012 airfoil to demonstrate applications of the new Euler solvers. The unsteady results were obtained for the airfoil pitching harmonically about the quarter chord. The resulting instantaneous pressure distributions and lift and moment coefficients during a cycle of motion compare well with experimental data. The paper presents a description of the Euler solvers along with results and comparisons which assess the capability.
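The combination named in the abstract above, an implicit time step whose linear system is relaxed with Gauss-Seidel sweeps, can be sketched on a much simpler model problem. The example below uses one-dimensional diffusion with backward Euler rather than the Euler equations on an unstructured mesh; `gauss_seidel`, `implicit_step`, and every parameter value are illustrative assumptions, not the solver described in the paper.

```python
def gauss_seidel(a, b, x, sweeps):
    """In-place Gauss-Seidel relaxation for a x = b; new values are used as soon as they exist."""
    n = len(b)
    for _ in range(sweeps):
        for i in range(n):
            s = sum(a[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - s) / a[i][i]
    return x

def implicit_step(u, dt, nu, dx, sweeps=500):
    """One backward-Euler step of u_t = nu * u_xx with u = 0 at both ends.
    Solves (I + dt * A) u_new = u_old, where A is the standard 3-point Laplacian."""
    n = len(u)
    r = nu * dt / dx ** 2
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        a[i][i] = 1.0 + 2.0 * r
        if i > 0:
            a[i][i - 1] = -r
        if i < n - 1:
            a[i][i + 1] = -r
    return gauss_seidel(a, u, list(u), sweeps)

# Hypothetical demo: r = nu*dt/dx^2 = 441, far beyond the explicit limit of 0.5.
n_cells = 20
u_old = [1.0] * n_cells
u_new = implicit_step(u_old, dt=1.0, nu=1.0, dx=1.0 / (n_cells + 1))
```

An explicit scheme with this step size would blow up, whereas the relaxed implicit update stays bounded, which is the sense in which the step can be chosen for accuracy rather than stability.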
Accuracy of an unstructured-grid upwind-Euler algorithm for the ONERA M6 wing

NASA Technical Reports Server (NTRS)

Batina, John T.

1991-01-01

Improved algorithms for the solution of the three-dimensional, time-dependent Euler equations are presented for aerodynamic analysis involving unstructured dynamic meshes. The improvements have been developed recently to the spatial and temporal discretizations used by unstructured-grid flow solvers. The spatial discretization involves a flux-split approach that is naturally dissipative and captures shock waves sharply with at most one grid point within the shock structure. The temporal discretization involves either an explicit time-integration scheme using a multistage Runge-Kutta procedure or an implicit time-integration scheme using a Gauss-Seidel relaxation procedure, which is computationally efficient for either steady or unsteady flow problems. With the implicit Gauss-Seidel procedure, very large time steps may be used for rapid convergence to steady state, and the step size for unsteady cases may be selected for temporal accuracy rather than for numerical stability. Steady flow results are presented for both the NACA 0012 airfoil and the Office National d'Etudes et de Recherches Aerospatiales M6 wing to demonstrate applications of the new Euler solvers. The paper presents a description of the Euler solvers along with results and comparisons that assess the capability.
Implicit flux-split Euler schemes for unsteady aerodynamic analysis involving unstructured dynamic meshes

NASA Technical Reports Server (NTRS)

Batina, John T.

1990-01-01

Improved algorithms for the solution of the time-dependent Euler equations are presented for unsteady aerodynamic analysis involving unstructured dynamic meshes. The improvements were recently made to the spatial and temporal discretizations used by unstructured grid flow solvers. The spatial discretization involves a flux-split approach which is naturally dissipative and captures shock waves sharply with at most one grid point within the shock structure. The temporal discretization involves an implicit time-integration scheme using a Gauss-Seidel relaxation procedure which is computationally efficient for either steady or unsteady flow problems. For example, very large time steps may be used for rapid convergence to steady state, and the step size for unsteady cases may be selected for temporal accuracy rather than for numerical stability. Steady and unsteady flow results are presented for the NACA 0012 airfoil to demonstrate applications of the new Euler solvers. The unsteady results were obtained for the airfoil pitching harmonically about the quarter chord. The resulting instantaneous pressure distributions and lift and moment coefficients during a cycle of motion compare well with experimental data. A description of the Euler solvers is presented along with results and comparisons which assess the capability.
Overview of the NCC

NASA Technical Reports Server (NTRS)

Liu, Nan-Suey

2001-01-01

A multi-disciplinary design/analysis tool for combustion systems is critical for optimizing the low-emission, high-performance combustor design process. Based on discussions between the then NASA Lewis Research Center and the jet engine companies, an industry-government team was formed in early 1995 to develop the National Combustion Code (NCC), which is an integrated system of computer codes for the design and analysis of combustion systems. NCC has advanced features that address the need to meet designer's requirements such as "assured accuracy", "fast turnaround", and "acceptable cost". The NCC development team is comprised of Allison Engine Company (Allison), CFD Research Corporation (CFDRC), GE Aircraft Engines (GEAE), NASA Glenn Research Center (LeRC), and Pratt & Whitney (P&W). The "unstructured mesh" capability and "parallel computing" have been fundamental features of NCC from its inception. The NCC system is composed of a set of "elements" which includes grid generator, main flow solver, turbulence module, turbulence and chemistry interaction module, chemistry module, spray module, radiation heat transfer module, data visualization module, and a post-processor for evaluating engine performance parameters. Each element may have contributions from several team members. Such a multi-source multi-element system needs to be integrated in a way that facilitates inter-module data communication, flexibility in module selection, and ease of integration. The development of the NCC beta version was essentially completed in June 1998. Technical details of the NCC elements are given in the Reference List. Elements such as the baseline flow solver, turbulence module, and the chemistry module, have been extensively validated; and their parallel performance on large-scale parallel systems has been evaluated and optimized. 
However the scalar PDF module and the Spray module, as well as their coupling with the baseline flow solver, were developed in a small-scale distributed computing environment. As a result, the validation of the NCC beta version as a whole was quite limited. Current effort has been focused on the validation of the integrated code and the evaluation/optimization of its overall performance on large-scale parallel systems.
An effective lattice Boltzmann flux solver on arbitrarily unstructured meshes

NASA Astrophysics Data System (ADS)

Wu, Qi-Feng; Shu, Chang; Wang, Yan; Yang, Li-Ming

2018-05-01

The recently proposed lattice Boltzmann flux solver (LBFS) is a new approach for the simulation of incompressible flow problems. It applies the finite volume method (FVM) to discretize the governing equations, and the flux at the cell interface is evaluated by local reconstruction of lattice Boltzmann solution from macroscopic flow variables at cell centers. In the previous application of the LBFS, the structured meshes have been commonly employed, which may cause inconvenience for problems with complex geometries. In this paper, the LBFS is extended to arbitrarily unstructured meshes for effective simulation of incompressible flows. Two test cases, the lid-driven flow in a triangular cavity and flow around a circular cylinder, are carried out for validation. The obtained results are compared with the data available in the literature. Good agreement has been achieved, which demonstrates the effectiveness and reliability of the LBFS in simulating flows on arbitrarily unstructured meshes.
Spectral-Element Seismic Wave Propagation Codes for both Forward Modeling in Complex Media and Adjoint Tomography

NASA Astrophysics Data System (ADS)

Smith, J. A.; Peter, D. B.; Tromp, J.; Komatitsch, D.; Lefebvre, M. P.

2015-12-01

We present both SPECFEM3D_Cartesian and SPECFEM3D_GLOBE open-source codes, representing high-performance numerical wave solvers simulating seismic wave propagation for local-, regional-, and global-scale application. These codes are suitable for both forward propagation in complex media and tomographic imaging. Both solvers compute highly accurate seismic wave fields using the continuous Galerkin spectral-element method on unstructured meshes. Lateral variations in compressional- and shear-wave speeds, density, as well as 3D attenuation Q models, topography and fluid-solid coupling are all readily included in both codes. For global simulations, effects due to rotation, ellipticity, the oceans, 3D crustal models, and self-gravitation are additionally included. Both packages provide forward and adjoint functionality suitable for adjoint tomography on high-performance computing architectures. We highlight the most recent release of the global version which includes improved performance, simultaneous MPI runs, OpenCL and CUDA support via an automatic source-to-source transformation library (BOAST), parallel I/O readers and writers for databases using ADIOS and seismograms using the recently developed Adaptable Seismic Data Format (ASDF) with built-in provenance. This makes our spectral-element solvers current state-of-the-art, open-source community codes for high-performance seismic wave propagation on arbitrarily complex 3D models. Together with these solvers, we provide full-waveform inversion tools to image the Earth's interior at unprecedented resolution.
A higher-order conservation element solution element method for solving hyperbolic differential equations on unstructured meshes

NASA Astrophysics Data System (ADS)

Bilyeu, David

This dissertation presents an extension of the Conservation Element Solution Element (CESE) method from second- to higher-order accuracy. The new method retains the favorable characteristics of the original second-order CESE scheme, including (i) the use of the space-time integral equation for conservation laws, (ii) a compact mesh stencil, (iii) stability up to a CFL number of unity, (iv) a fully explicit, time-marching integration scheme, (v) true multidimensionality without using directional splitting, and (vi) the ability to handle two- and three-dimensional geometries by using unstructured meshes. This algorithm has been thoroughly tested in one, two and three spatial dimensions and has been shown to obtain the desired order of accuracy for solving both linear and non-linear hyperbolic partial differential equations. The scheme has also shown its ability to accurately resolve discontinuities in the solutions. Higher order unstructured methods such as the Discontinuous Galerkin (DG) method and the Spectral Volume (SV) methods have been developed for one-, two- and three-dimensional application. Although these schemes have seen extensive development and use, certain drawbacks of these methods have been well documented. For example, the explicit versions of these two methods have very stringent stability criteria. These stability criteria require that the time step be reduced as the order of the solver increases, for a given simulation on a given mesh. The research presented in this dissertation builds upon the work of Chang, who developed a fourth-order CESE scheme to solve a scalar one-dimensional hyperbolic partial differential equation. The completed research has resulted in two key deliverables. The first is a detailed derivation of high-order CESE methods on unstructured meshes for solving the conservation laws in two- and three-dimensional spaces. The second is the implementation of these numerical methods in a computer code. 
For code development, a one-dimensional solver for the Euler equations was developed. This work is an extension of Chang's work on the fourth-order CESE method for solving a one-dimensional scalar convection equation. A generic formulation for the nth-order CESE method, where n ≥ 4, was derived. Numerical implementation of the scheme confirmed that the order of convergence was consistent with the order of the scheme. For the two- and three-dimensional solvers, SOLVCON was used as the basic framework for code implementation. A new solver kernel for the fourth-order CESE method has been developed and integrated into the framework provided by SOLVCON. The main part of SOLVCON, which deals with unstructured meshes, parallel computing, and data transmission between computer nodes for High Performance Computing (HPC), remains intact. To validate and verify the newly developed high-order CESE algorithms, several one-, two- and three-dimensional simulations were conducted. For the arbitrary-order, one-dimensional CESE solver, three sets of governing equations were selected for simulation: (i) the linear convection equation, (ii) the linear acoustic equations, (iii) the nonlinear Euler equations. All three systems of equations were used to verify the order of convergence through mesh refinement. In addition, the Euler equations were used to solve the Shu-Osher and Blastwave problems. These two simulations demonstrated that the new high-order CESE methods can accurately resolve discontinuities in the flow field. For the two-dimensional, fourth-order CESE solver, the Euler equations were employed in four different test cases. The first case was used to verify the order of convergence through mesh refinement. The next three cases demonstrated the ability of the new solver to accurately resolve discontinuities in the flows. 
This was demonstrated through: (i) the interaction between acoustic waves and an entropy pulse, (ii) supersonic flow over a circular blunt body, (iii) supersonic flow over a guttered wedge. To validate and verify the three-dimensional, fourth-order CESE solver, two different simulations were selected. The first used the linear convection equations to demonstrate fourth-order convergence. The second used the Euler equations to simulate supersonic flow over a spherical body to demonstrate the scheme's ability to accurately resolve shocks. All test cases used are well-known benchmark problems and, as such, there are multiple sources available to validate the numerical results. Furthermore, the simulations showed that the high-order CESE solver was stable at a CFL number near unity.
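The stability argument in the abstract above comes down to how large a time step a given CFL limit allows on a given mesh. A minimal sketch of that arithmetic follows. The helper names are hypothetical, and the 1/(2p+1) estimate for explicit Runge-Kutta DG is a commonly quoted rule of thumb brought in here as an assumption; it is not a figure taken from the dissertation.

```python
def cfl_time_step(cfl, cell_sizes, wave_speeds):
    """Largest global time step satisfying dt * speed / size <= cfl in every cell."""
    return cfl * min(h / s for h, s in zip(cell_sizes, wave_speeds))

def rkdg_cfl_rule_of_thumb(degree):
    """Assumed estimate of the CFL limit for explicit Runge-Kutta DG of polynomial degree p."""
    return 1.0 / (2 * degree + 1)

# Hypothetical two-cell mesh: sizes in metres, wave speeds in metres per second.
sizes = [0.1, 0.05]
speeds = [2.0, 1.0]
dt_near_unity = cfl_time_step(0.9, sizes, speeds)
dt_dg_degree3 = cfl_time_step(rkdg_cfl_rule_of_thumb(3), sizes, speeds)
```

Under these assumptions the scheme that stays stable near a CFL number of unity takes a time step several times larger on the same mesh than the degree-3 DG estimate permits.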
CFD code evaluation for internal flow modeling

NASA Technical Reports Server (NTRS)

Chung, T. J.

1990-01-01

Research on computational fluid dynamics (CFD) code evaluation with emphasis on supercomputing in reacting flows is discussed. Advantages of unstructured grids, multigrids, adaptive methods, improved flow solvers, vector processing, parallel processing, and reduction of memory requirements are discussed. As examples, researchers include applications of supercomputing to reacting flow Navier-Stokes equations including shock waves and turbulence and combustion instability problems associated with solid and liquid propellants. Evaluation of codes developed by other organizations is not included. Instead, the basic criteria for accuracy and efficiency have been established, and some applications on rocket combustion have been made. Research toward an ultimate goal, the most accurate and efficient CFD code, is in progress and will continue for years to come.
Progress in the Simulation of Steady and Time-Dependent Flows with 3D Parallel Unstructured Cartesian Methods

NASA Technical Reports Server (NTRS)

Aftosmis, M. J.; Berger, M. J.; Murman, S. M.; Kwak, Dochan (Technical Monitor)

2002-01-01

The proposed paper will present recent extensions in the development of an efficient Euler solver for adaptively-refined Cartesian meshes with embedded boundaries. The paper will focus on extensions of the basic method to include solution adaptation, time-dependent flow simulation, and arbitrary rigid domain motion. The parallel multilevel method makes use of on-the-fly parallel domain decomposition to achieve extremely good scalability on large numbers of processors, and is coupled with an automatic coarse mesh generation algorithm for efficient processing by a multigrid smoother. Numerical results are presented demonstrating parallel speed-ups of up to 435 on 512 processors. Solution-based adaptation may be keyed off truncation error estimates using tau-extrapolation or a variety of feature detection based refinement parameters. The multigrid method is extended to time-dependent flows through the use of a dual-time approach. The extension to rigid domain motion uses an Arbitrary Lagrangian-Eulerian (ALE) formulation, and results will be presented for a variety of two- and three-dimensional example problems with both simple and complex geometry.
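The quoted speed-up of 435 on 512 processors can be turned into a parallel efficiency with one division. The sketch below also backs out the serial fraction that Amdahl's law would imply; applying Amdahl's law here is an assumption made for illustration and not an analysis reported in the abstract, and the function names are hypothetical.

```python
def parallel_efficiency(speedup, processors):
    """Fraction of ideal (linear) speed-up actually achieved."""
    return speedup / processors

def amdahl_serial_fraction(speedup, processors):
    """Serial fraction f implied by Amdahl's law S = 1 / (f + (1 - f) / p)."""
    return (processors / speedup - 1.0) / (processors - 1.0)

def amdahl_speedup(serial_fraction, processors):
    """Speed-up predicted by Amdahl's law for a given serial fraction."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)

efficiency = parallel_efficiency(435.0, 512)          # about 0.85
serial_fraction = amdahl_serial_fraction(435.0, 512)  # well below 0.1 percent
```

An efficiency of roughly 85 percent at 512 processors corresponds, under the Amdahl assumption, to a non-parallel share of the work of only a few hundredths of a percent.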
Efficient relaxed-Jacobi smoothers for multigrid on parallel computers

NASA Astrophysics Data System (ADS)

Yang, Xiang; Mittal, Rajat

2017-03-01

In this Technical Note, we present a family of Jacobi-based multigrid smoothers suitable for the solution of discretized elliptic equations. These smoothers are based on the idea of scheduled-relaxation Jacobi proposed recently by Yang & Mittal (2014) [18] and employ two or three successive relaxed Jacobi iterations with relaxation factors derived so as to maximize the smoothing property of these iterations. The performance of these new smoothers, measured in terms of convergence acceleration and computational workload, is assessed for multi-domain implementations typical of parallelized solvers, and compared to the lexicographic point Gauss-Seidel smoother. The tests include the geometric multigrid method on structured grids as well as the algebraic multigrid method on unstructured grids. The tests demonstrate that unlike Gauss-Seidel, the convergence of these Jacobi-based smoothers is unaffected by domain decomposition, and furthermore, they outperform the lexicographic Gauss-Seidel by factors that increase with domain partition count.
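A relaxed Jacobi sweep, the building block of the smoothers described above, can be sketched for the one-dimensional Poisson problem. The relaxation factor 2/3 used in the demo is the textbook damped-Jacobi value, not one of the optimized factors derived in the Technical Note, and the function name `relaxed_jacobi` and the grid size are assumptions for illustration.

```python
import math

def relaxed_jacobi(u, f, h, omegas):
    """Apply one relaxed Jacobi sweep per entry of omegas to -u'' = f, with u = 0 at both ends.
    Every update reads only values from the previous sweep, so the result does not
    depend on the order in which points (or subdomains) are visited."""
    n = len(u)
    for omega in omegas:
        old = u
        u = []
        for i in range(n):
            left = old[i - 1] if i > 0 else 0.0
            right = old[i + 1] if i < n - 1 else 0.0
            jacobi_value = 0.5 * (left + right + h * h * f[i])
            u.append((1.0 - omega) * old[i] + omega * jacobi_value)
    return u

# Hypothetical demo: damp a high-frequency error mode (f = 0, so the exact solution is 0).
n = 31
h = 1.0 / (n + 1)
k = 24
mode = [math.sin(k * math.pi * (j + 1) * h) for j in range(n)]
smoothed = relaxed_jacobi(mode, [0.0] * n, h, [2.0 / 3.0])
damping = max(abs(v) for v in smoothed) / max(abs(v) for v in mode)
```

Passing two or three factors in `omegas` gives a scheduled sequence of sweeps in the spirit of the abstract; the independence from visiting order is what makes the method insensitive to domain decomposition.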

Unstructured Euler flow solutions using hexahedral cell refinement

NASA Technical Reports Server (NTRS)

Melton, John E.; Cappuccio, Gelsomina; Thomas, Scott D.

1991-01-01

An attempt is made to extend grid refinement into three dimensions by using unstructured hexahedral grids. The flow solver is developed using TIGER (Topologically Independent Grid, Euler Refinement) as the starting point. The program uses an unstructured hexahedral mesh and a modified version of the Jameson four-stage, finite-volume Runge-Kutta algorithm for integration of the Euler equations. The unstructured mesh allows for local refinement appropriate for each freestream condition, thereby concentrating mesh cells in the regions of greatest interest. This increases the computational efficiency because the refinement is not required to extend throughout the entire flow field.
Practical Aerodynamic Design Optimization Based on the Navier-Stokes Equations and a Discrete Adjoint Method

NASA Technical Reports Server (NTRS)

Grossman, Bernard

1999-01-01

Compressible and incompressible versions of a three-dimensional unstructured mesh Reynolds-averaged Navier-Stokes flow solver have been differentiated and the resulting derivatives have been verified by comparisons with finite differences and a complex-variable approach. In this implementation, the turbulence model is fully coupled with the flow equations in order to achieve this consistency. The accuracy demonstrated in the current work represents the first time that such an approach has been successfully implemented. The accuracy of a number of simplifying approximations to the linearizations of the residual has been examined. A first-order approximation to the dependent variables in both the adjoint and design equations has been investigated. The effects of a "frozen" eddy viscosity and the ramifications of neglecting some mesh sensitivity terms were also examined. It has been found that none of the approximations yielded derivatives of acceptable accuracy, and the derivatives were often of incorrect sign. However, numerical experiments indicate that an incomplete convergence of the adjoint system often yields sufficiently accurate derivatives, thereby significantly lowering the time required for computing sensitivity information. The convergence rate of the adjoint solver relative to the flow solver has been examined. Inviscid adjoint solutions typically require one to four times the cost of a flow solution, while for turbulent adjoint computations, this ratio can reach as high as eight to ten. Numerical experiments have shown that the adjoint solver can stall before converging the solution to machine accuracy, particularly for viscous cases. A possible remedy for this phenomenon would be to include the complete higher-order linearization in the preconditioning step, or to employ a simple form of mesh sequencing to obtain better approximations to the solution through the use of coarser meshes.
An efficient surface parameterization based on a free-form deformation technique has been utilized and the resulting codes have been integrated with an optimization package. Lastly, sample optimizations have been shown for inviscid and turbulent flow over an ONERA M6 wing. Drag reductions have been demonstrated by reducing shock strengths across the span of the wing. In order for large scale optimization to become routine, the benefits of parallel architectures should be exploited. Although the flow solver has been parallelized using compiler directives, the parallel efficiency is under 50 percent. Clearly, parallel versions of the codes will have an immediate impact on the ability to design realistic configurations on fine meshes, and this effort is currently underway.
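The abstract mentions verifying derivatives against finite differences and a complex-variable approach. A minimal sketch of that kind of check, on a hypothetical scalar function standing in for a solver output, could look like this (the function and step sizes are assumptions chosen for illustration):

```python
import cmath
import math


def objective(x):
    """Hypothetical smooth test function; accepts real or complex input."""
    return cmath.exp(x) * cmath.sin(x)


def complex_step_derivative(func, x, h=1e-30):
    """Complex-variable (complex-step) derivative: Im[f(x + ih)] / h."""
    return func(complex(x, h)).imag / h


def central_difference(func, x, h=1e-6):
    """Second-order central finite difference, subject to cancellation error."""
    return (func(x + h).real - func(x - h).real) / (2.0 * h)


x0 = 1.0
exact = math.exp(x0) * (math.sin(x0) + math.cos(x0))
d_complex = complex_step_derivative(objective, x0)
d_finite = central_difference(objective, x0)
print(abs(d_complex - exact), abs(d_finite - exact))
```

The complex-step result involves no subtraction of nearly equal numbers, so the step can be made extremely small, whereas the finite-difference step is a trade-off between truncation and round-off error.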
An unstructured mesh arbitrary Lagrangian-Eulerian unsteady incompressible flow solver and its application to insect flight aerodynamics

NASA Astrophysics Data System (ADS)

Su, Xiaohui; Cao, Yuanwei; Zhao, Yong

2016-06-01

In this paper, an unstructured mesh Arbitrary Lagrangian-Eulerian (ALE) incompressible flow solver is developed to investigate the aerodynamics of insect hovering flight. The proposed finite-volume ALE Navier-Stokes solver is based on the artificial compressibility method (ACM) with a high-resolution method of characteristics-based scheme on unstructured grids. The present ALE model is validated and assessed through flow past an oscillating cylinder. Good agreement with experimental results and other numerical solutions is obtained, which demonstrates the accuracy and the capability of the present model. The lift generation mechanisms of a 2D wing in hovering motion, including wake capture, delayed stall, rapid pitch, as well as clap and fling, are then studied and illustrated using the current ALE model. Moreover, the optimized angular amplitude in the symmetry model, 45°, is reported in detail for the first time using averaged lift and the energy power method. In addition, the lift generation of the complete cyclic clap and fling motion, which has been simulated by few researchers using the ALE method because of the large deformation involved, is studied and clarified for the first time. The present ALE model is found to be a useful tool to investigate the lift force generation mechanism for insect wing flight.
Aerodynamics Simulations for the D8 "Double Bubble" Aircraft Using the LAVA Unstructured Solver

NASA Astrophysics Data System (ADS)

Ballinger, Sean

2013-11-01

The D8 "double bubble" is a proposed design for quieter and more efficient domestic passenger aircraft of the Boeing 737 class. It features boundary layer-ingesting engines located under a non-load-bearing π-tail and a lightweight low-sweep wing for flight around Mach 0.7. The D8's wide lifting body is expected to supply 15% of its total lift, while a Boeing 737's fuselage contributes only 8%. The tapering rear of the fuselage is also predicted to experience a negative moment resulting in positive pitch, produce a thicker boundary layer for ingestion by distortion-tolerant engines, and act as a noise shield. To investigate these predictions, unstructured grids generated over a fine surface triangulation using Star-CCM+ are used to model the unpowered D8 with flow conditions mimicking those in the MIT Wright Brothers Wind Tunnel at angles of attack from -2 to 14 degrees. LAVA, the recently developed Launch Ascent and Vehicle Aerodynamics solver, is used to carry out simulations on an unstructured grid. The results are compared to wind tunnel data, and to data from structured grid simulations using the LAVA, Overflow, and Cart3D solvers. Applied Modeling and Simulation Branch, NASA Advanced Supercomputing Division, funded by New York Space Grant.
A package for 3-D unstructured grid generation, finite-element flow solution and flow field visualization

NASA Technical Reports Server (NTRS)

Parikh, Paresh; Pirzadeh, Shahyar; Loehner, Rainald

1990-01-01

A set of computer programs for 3-D unstructured grid generation, fluid flow calculations, and flow field visualization was developed. The grid generation program, called VGRID3D, generates grids over complex configurations using the advancing front method. In this method, the point and element generation is accomplished simultaneously. VPLOT3D is an interactive, menu-driven pre- and post-processor graphics program for interpolation and display of unstructured grid data. The flow solver, VFLOW3D, is an Euler equation solver based on an explicit, two-step, Taylor-Galerkin algorithm which uses the Flux Corrected Transport (FCT) concept for a wriggle-free solution. Using these programs, increasingly complex 3-D configurations of interest to the aerospace community were gridded, including a complete Space Transportation System comprised of the space-shuttle orbiter, the solid-rocket boosters, and the external tank. Flow solutions were obtained on various configurations in subsonic, transonic, and supersonic flow regimes.
Transonic Drag Prediction on a DLR-F6 Transport Configuration Using Unstructured Grid Solvers

NASA Technical Reports Server (NTRS)

Lee-Rausch, E. M.; Frink, N. T.; Mavriplis, D. J.; Rausch, R. D.; Milholen, W. E.

2004-01-01

A second international AIAA Drag Prediction Workshop (DPW-II) was organized and held in Orlando, Florida on June 21-22, 2003. The primary purpose was to investigate the code-to-code uncertainty, address the sensitivity of the drag prediction to grid size, and quantify the uncertainty in predicting nacelle/pylon drag increments at a transonic cruise condition. This paper presents an in-depth analysis of the DPW-II computational results from three state-of-the-art unstructured grid Navier-Stokes flow solvers exercised on similar families of tetrahedral grids. The flow solvers are USM3D - a tetrahedral cell-centered upwind solver, FUN3D - a tetrahedral node-centered upwind solver, and NSU3D - a general element node-centered central-differenced solver. For the wingbody, the total drag predicted for a constant-lift transonic cruise condition showed a decrease in code-to-code variation with grid refinement as expected. For the same flight condition, the wing/body/nacelle/pylon total drag and the nacelle/pylon drag increment predicted showed an increase in code-to-code variation with grid refinement. Although the range in total drag for the wingbody fine grids was only 5 counts, a code-to-code comparison of surface pressures and surface restricted streamlines indicated that the three solvers were not all converging to the same flow solutions: different shock locations and separation patterns were evident. Similarly, the wing/body/nacelle/pylon solutions did not appear to be converging to the same flow solutions. Overall, grid refinement did not consistently improve the correlation with experimental data for either the wingbody or the wing/body/nacelle/pylon configuration. Although the absolute values of total drag predicted by two of the solvers for the medium and fine grids did not compare well with the experiment, the incremental drag predictions were within plus or minus 3 counts of the experimental data.
The correlation with experimental incremental drag was not significantly changed by specifying transition. Although the sources of code-to-code variation in force and moment predictions for the three unstructured grid codes have not yet been identified, the current study reinforces the necessity of applying multiple codes to the same application to assess uncertainty.
Climate Data Assimilation on a Massively Parallel Supercomputer

NASA Technical Reports Server (NTRS)

Ding, Hong Q.; Ferraro, Robert D.

1996-01-01

We have designed and implemented a set of highly efficient and highly scalable algorithms for an unstructured computational package, the PSAS data assimilation package, as demonstrated by detailed performance analysis of systematic runs on up to 512 nodes of an Intel Paragon. The preconditioned Conjugate Gradient solver achieves a sustained 18 Gflops performance. Consequently, we achieve an unprecedented 100-fold reduction in time to solution on the Intel Paragon over a single head of a Cray C90. This not only exceeds the daily performance requirement of the Data Assimilation Office at NASA's Goddard Space Flight Center, but also makes it possible to explore much larger and more challenging data assimilation problems which are unthinkable on a traditional computer platform such as the Cray C90.
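For readers unfamiliar with the preconditioned Conjugate Gradient iteration named above, a minimal serial sketch follows. The abstract does not say which preconditioner the package uses; the diagonal (Jacobi) preconditioner, the small test matrix, and all names here are assumptions for illustration.

```python
def matvec(A, x):
    """Dense matrix-vector product."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def pcg(A, b, tol=1e-10, max_iter=200):
    """Conjugate gradients with an assumed diagonal (Jacobi) preconditioner."""
    n = len(b)
    inv_diag = [1.0 / A[i][i] for i in range(n)]
    x = [0.0] * n
    r = b[:]                                   # residual for the zero initial guess
    z = [d * ri for d, ri in zip(inv_diag, r)]  # preconditioned residual
    p = z[:]
    rz = dot(r, z)
    for it in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            return x, it
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


# Small symmetric positive definite test system with known solution of all ones.
n = 10
A = [[0.0] * n for _ in range(n)]
for i in range(n):
    A[i][i] = 2.0 + i
    if i > 0:
        A[i][i - 1] = A[i - 1][i] = -1.0
b = matvec(A, [1.0] * n)
x_pcg, iters = pcg(A, b)
print(iters, x_pcg[:3])
```

In a distributed setting the matrix-vector product and the dot products are the steps that require communication; the vector updates are purely local.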
Implementation of Advanced Two Equation Turbulence Models in the USM3D Unstructured Flow Solver

NASA Technical Reports Server (NTRS)

Wang, Qun-Zhen; Massey, Steven J.; Abdol-Hamid, Khaled S.

2000-01-01

USM3D is a widely-used unstructured flow solver for simulating inviscid and viscous flows over complex geometries. The current version (version 5.0) of USM3D, however, does not have advanced turbulence models to accurately simulate complicated flow. We have implemented two modified versions of the original Jones and Launder k-epsilon "two-equation" turbulence model and the Girimaji algebraic Reynolds stress model in USM3D. Tests have been conducted for three flat plate boundary layer cases, a RAE2822 airfoil and an ONERA M6 wing. The results are compared with those from direct numerical simulation, empirical formulae, theoretical results, and the existing Spalart-Allmaras one-equation model.
Three-dimensional unstructured grid Euler computations using a fully-implicit, upwind method

NASA Technical Reports Server (NTRS)

Whitaker, David L.

1993-01-01

A method has been developed to solve the Euler equations on a three-dimensional unstructured grid composed of tetrahedra. The method uses an upwind flow solver with a linearized, backward-Euler time integration scheme. Each time step results in a sparse linear system of equations which is solved by an iterative, sparse matrix solver. Local time stepping, switched evolution relaxation (SER), preconditioning and reuse of the Jacobian are employed to accelerate the convergence rate. Implicit boundary conditions were found to be extremely important for fast convergence. Numerical experiments have shown that convergence rates comparable to those of a multigrid, central-difference scheme are achievable on the same mesh. Results are presented for several grids about an ONERA M6 wing.
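As a rough illustration of how switched evolution relaxation interacts with a linearized backward-Euler step, the sketch below uses one commonly quoted form of SER, in which the CFL number grows in inverse proportion to the residual norm. Whether the paper uses exactly this form is not stated in the abstract; the scalar residual, the unit time scale, and all names are assumptions.

```python
def ser_cfl(cfl0, r0_norm, r_norm, cfl_max=1.0e6):
    """Assumed SER rule: CFL = CFL0 * |R0| / |R|, capped at cfl_max."""
    return min(cfl0 * r0_norm / r_norm, cfl_max)


def residual(u):
    """Hypothetical scalar steady-state residual with root u = 1."""
    return u ** 3 + u - 2.0


def residual_jacobian(u):
    return 3.0 * u * u + 1.0


def solve_with_ser(u, cfl0=1.0, tol=1.0e-12, max_steps=50):
    """Linearized backward Euler: (1/dt + dR/du) du = -R, with dt set by SER."""
    r0_norm = abs(residual(u))
    cfl_history = []
    for _ in range(max_steps):
        r = residual(u)
        if abs(r) < tol:
            break
        cfl = ser_cfl(cfl0, r0_norm, abs(r))
        cfl_history.append(cfl)
        dt = cfl  # unit local time scale assumed for this scalar example
        u += -r / (1.0 / dt + residual_jacobian(u))
    return u, cfl_history


u_final, cfl_history = solve_with_ser(3.0)
print(u_final, cfl_history)
```

As the residual drops, the time step grows and the update approaches a Newton step, which is the mechanism by which SER accelerates convergence late in the iteration.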
Parallel Preconditioning for CFD Problems on the CM-5

NASA Technical Reports Server (NTRS)

Simon, Horst D.; Kremenetsky, Mark D.; Richardson, John; Lasinski, T. A. (Technical Monitor)

1994-01-01

To date, preconditioning methods on massively parallel systems have faced a major difficulty. The most successful preconditioning methods in terms of accelerating the convergence of the iterative solver, such as incomplete LU factorizations, are notoriously difficult to implement on parallel machines for two reasons: (1) the actual computation of the preconditioner is not very floating-point intensive, but requires a large amount of unstructured communication, and (2) the application of the preconditioning matrix in the iteration phase (i.e. triangular solves) is difficult to parallelize because of the recursive nature of the computation. Here we present a new approach to preconditioning for very large, sparse, unsymmetric, linear systems, which avoids both difficulties. We explicitly compute an approximate inverse to our original matrix. This new preconditioning matrix can be applied most efficiently for iterative methods on massively parallel machines, since the preconditioning phase involves only a matrix-vector multiplication, with possibly a dense matrix. Furthermore, the actual computation of the preconditioning matrix has natural parallelism. For a problem of size n, the preconditioning matrix can be computed by solving n independent small least squares problems. The algorithm and its implementation on the Connection Machine CM-5 are discussed in detail and supported by extensive timings obtained from real problem data.
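The statement that the approximate inverse follows from n independent small least squares problems can be sketched as below. Two simplifications are assumed and are not taken from the paper: each column of the approximate inverse is restricted to the sparsity pattern of the matching column of A, and each small problem is solved through its normal equations. All names are hypothetical.

```python
def solve_dense(G, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    M = [row[:] + [rhs[i]] for i, row in enumerate(G)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = s / M[r][r]
    return x


def approximate_inverse(A):
    """Column j minimizes ||A m_j - e_j||_2 over an assumed sparsity pattern."""
    n = len(A)
    M = [[0.0] * n for _ in range(n)]
    for j in range(n):  # every column is an independent small problem
        J = [i for i in range(n) if A[i][j] != 0.0]
        G = [[sum(A[r][a] * A[r][b] for r in range(n)) for b in J] for a in J]
        rhs = [A[j][a] for a in J]  # A_J^T e_j
        for a, value in zip(J, solve_dense(G, rhs)):
            M[a][j] = value
    return M


def residual_norm(A, M):
    """Frobenius norm of A M - I."""
    n = len(A)
    total = 0.0
    for i in range(n):
        for j in range(n):
            entry = sum(A[i][k] * M[k][j] for k in range(n)) - (1.0 if i == j else 0.0)
            total += entry * entry
    return total ** 0.5


n = 6
A = [[0.0] * n for _ in range(n)]
for i in range(n):
    A[i][i] = 4.0
    if i > 0:
        A[i][i - 1] = A[i - 1][i] = -1.0

M_spai = approximate_inverse(A)
M_diag = [[1.0 / A[i][i] if i == j else 0.0 for j in range(n)] for i in range(n)]
print(residual_norm(A, M_spai), residual_norm(A, M_diag))
```

Because the loop over columns has no data dependence between iterations, the columns can be distributed across processors, and applying the result is a single matrix-vector product.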
The Feasibility of Adaptive Unstructured Computations On Petaflops Systems

NASA Technical Reports Server (NTRS)

Biswas, Rupak; Oliker, Leonid; Heber, Gerd; Gao, Guang; Saini, Subhash (Technical Monitor)

1999-01-01

This viewgraph presentation covers the advantages of mesh adaptation, unstructured grids, and dynamic load balancing. It illustrates parallel adaptive communications, and explains PLUM (Parallel dynamic load balancing for adaptive unstructured meshes), and PSAW (Proper Self Avoiding Walks).
First Applications of the New Parallel Krylov Solver for MODFLOW on a National and Global Scale

NASA Astrophysics Data System (ADS)

Verkaik, J.; Hughes, J. D.; Sutanudjaja, E.; van Walsum, P.

2016-12-01

Integrated high-resolution hydrologic models are increasingly being used for evaluating water management measures at field scale. Their drawbacks are large memory requirements and long run times. Examples of such models are The Netherlands Hydrological Instrument (NHI) model and the PCRaster Global Water Balance (PCR-GLOBWB) model. Typical simulation periods are 30-100 years with daily timesteps. The NHI model predicts water demands in periods of drought, supporting operational and long-term water-supply decisions. The NHI is a state-of-the-art coupling of several models: a 7-layer MODFLOW groundwater model (6.5M 250m cells), a MetaSWAP model for the unsaturated zone (Richards emulator of 0.5M cells), and a surface water model (MOZART-DM). The PCR-GLOBWB model provides a grid-based representation of global terrestrial hydrology and this work uses the version that includes a 2-layer MODFLOW groundwater model (4.5M 10km cells). The Parallel Krylov Solver (PKS) speeds up computation by both distributed memory parallelization (Message Passing Interface) and shared memory parallelization (Open Multi-Processing). PKS includes conjugate gradient, bi-conjugate gradient stabilized, and generalized minimal residual linear accelerators that use an overlapping additive Schwarz domain decomposition preconditioner. PKS can be used for both structured and unstructured grids and has been fully integrated in MODFLOW-USG using METIS partitioning and in iMODFLOW using RCB partitioning. iMODFLOW is an accelerated version of MODFLOW-2005 that is implicitly and online coupled to MetaSWAP. Results for benchmarks carried out on the Cartesius Dutch supercomputer (https://userinfo.surfsara.nl/systems/cartesius) for the PCR-GLOBWB model and on a 2x16 core Windows machine for the NHI model show speedups up to 10-20 and 5-10, respectively.
Integrated multidisciplinary CAD/CAE environment for micro-electro-mechanical systems (MEMS)

NASA Astrophysics Data System (ADS)

Przekwas, Andrzej J.

1999-03-01

Computational design of MEMS involves several strongly coupled physical disciplines, including fluid mechanics, heat transfer, stress/deformation dynamics, electronics, electro/magneto statics, calorics, biochemistry and others. CFDRC is developing a new generation of multi-disciplinary CAD systems for MEMS using high-fidelity field solvers on unstructured, solution-adaptive grids for a full range of disciplines. The software system, ACE + MEMS, includes all essential CAD tools: geometry/grid generation for multi-discipline, multi-equation solvers, a GUI, tightly coupled configurable 3D field solvers for FVM, FEM and BEM, and a 3D visualization/animation tool. The flow/heat transfer/calorics/chemistry equations are solved with an unstructured adaptive FVM solver, stress/deformation are computed with a FEM STRESS solver, and a FAST BEM solver is used to solve linear heat transfer, electro/magnetostatics and elastostatics equations on adaptive polygonal surface grids. Tight multidisciplinary coupling and automatic interoperability between the tools was achieved by designing a comprehensive database structure and APIs for complete model definition. The virtual model definition is implemented in data transfer facility, a publicly available tool described in this paper. The paper presents an overall description of the software architecture and MEMS design flow in ACE + MEMS. It describes current status, ongoing effort and future plans for the software. The paper also discusses new concepts of mixed-level and mixed-dimensionality capability in which 1D microfluidic networks are simulated concurrently with 3D high-fidelity models of discrete components.
A robust and contact resolving Riemann solver on unstructured mesh, Part I, Euler method

NASA Astrophysics Data System (ADS)

Shen, Zhijun; Yan, Wei; Yuan, Guangwei

2014-07-01

This article presents a new cell-centered numerical method for compressible flows on arbitrary unstructured meshes. A multi-dimensional Riemann solver based on the HLLC method (denoted by HLLC-2D solver) is established. The work is an extension from the cell-centered Lagrangian scheme of Maire et al. [27] to the Eulerian framework. Similarly to the work in [27], a two-dimensional contact velocity defined on a grid node is introduced, and the motivation is to keep an edge flux consistency with the node velocity connected to the edge intrinsically. The main new feature of the algorithm is to relax the condition that the contact pressures must be the same in the traditional HLLC solver. The discontinuous fluxes are constructed across each wave sampling direction rather than only along the contact wave direction. The two-dimensional contact velocity of the grid node is determined via enforcing conservation of mass, momentum and total energy, and thus the new method satisfies these conservation properties at nodes rather than on grid edges. Other good properties of the HLLC-2D solver, such as positivity and contact preservation, are described, and the two-dimensional high-order extension is constructed employing a MUSCL-type reconstruction procedure. Numerical results based on both quadrilateral and triangular grids are presented to demonstrate the robustness and the accuracy of this new solver, which shows better performance than the existing HLLC method.
3-D modeling of ductile tearing using finite elements: Computational aspects and techniques

NASA Astrophysics Data System (ADS)

Gullerud, Arne Stewart

This research focuses on the development and application of computational tools to perform large-scale, 3-D modeling of ductile tearing in engineering components under quasi-static to mild loading rates. Two standard models for ductile tearing (the computational cell methodology and crack growth controlled by the crack tip opening angle, CTOA) are described and their 3-D implementations are explored. For the computational cell methodology, quantification of the effects of several numerical issues (computational load step size, procedures for force release after cell deletion, and the porosity for cell deletion) enables construction of computational algorithms to remove the dependence of predicted crack growth on these issues. This work also describes two extensions of the CTOA approach into 3-D: a general 3-D method and a constant front technique. Analyses compare the characteristics of the extensions, and a validation study explores the ability of the constant front extension to predict crack growth in thin aluminum test specimens over a range of specimen geometries, absolute sizes, and levels of out-of-plane constraint. To provide a computational framework suitable for the solution of these problems, this work also describes the parallel implementation of a nonlinear, implicit finite element code. The implementation employs an explicit message-passing approach using the MPI standard to maintain portability, a domain decomposition of element data to provide parallel execution, and a master-worker organization of the computational processes to enhance future extensibility. A linear preconditioned conjugate gradient (LPCG) solver serves as the core of the solution process.
The parallel LPCG solver utilizes an element-by-element (EBE) structure of the computations to permit a dual-level decomposition of the element data: domain decomposition of the mesh provides efficient coarse-grain parallel execution, while decomposition of the domains into blocks of similar elements (same type, constitutive model, etc.) provides fine-grain parallel computation on each processor. A major focus of the LPCG solver is a new implementation of the Hughes-Winget element-by-element (HW) preconditioner. The implementation employs a weighted dependency graph combined with a new coloring algorithm to provide load-balanced scheduling for the preconditioner and overlapped communication/computation. This approach enables efficient parallel application of the HW preconditioner for arbitrary unstructured meshes.
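The role of coloring in element-by-element processing can be sketched as follows: elements that share a node must not be updated at the same time, so they receive different colors, and all elements of one color can then be processed concurrently. The thesis describes a weighted dependency graph with a new coloring algorithm; the plain greedy coloring below is a simpler stand-in, and the function name and tiny meshes are hypothetical.

```python
from collections import defaultdict


def color_elements(elements):
    """Greedy coloring: elements sharing any node receive different colors."""
    node_to_elements = defaultdict(list)
    for e, nodes in enumerate(elements):
        for node in nodes:
            node_to_elements[node].append(e)

    colors = [-1] * len(elements)
    for e, nodes in enumerate(elements):
        used = {
            colors[other]
            for node in nodes
            for other in node_to_elements[node]
            if colors[other] >= 0
        }
        color = 0
        while color in used:
            color += 1
        colors[e] = color
    return colors


# Chain of four two-node bar elements: neighbours alternate colors.
bars = [(0, 1), (1, 2), (2, 3), (3, 4)]
bar_colors = color_elements(bars)

# 2 x 2 patch of quadrilaterals that all touch the centre node 4.
quads = [(0, 1, 4, 3), (1, 2, 5, 4), (3, 4, 7, 6), (4, 5, 8, 7)]
quad_colors = color_elements(quads)
print(bar_colors, quad_colors)
```

Greedy coloring does not balance the number of elements per color; the load-balanced scheduling mentioned in the abstract is a refinement beyond this sketch.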
Aspects of Unstructured Grids and Finite-Volume Solvers for the Euler and Navier-Stokes Equations

NASA Technical Reports Server (NTRS)

Barth, Timothy J.

1992-01-01

One of the major achievements in engineering science has been the development of computer algorithms for solving nonlinear differential equations such as the Navier-Stokes equations. In the past, limited computer resources have motivated the development of efficient numerical schemes in computational fluid dynamics (CFD) utilizing structured meshes. The use of structured meshes greatly simplifies the implementation of CFD algorithms on conventional computers. Unstructured grids, on the other hand, offer an alternative for modeling complex geometries. Unstructured meshes have irregular connectivity and usually contain combinations of triangles, quadrilaterals, tetrahedra, and hexahedra. The generation and use of unstructured grids pose new challenges in CFD. The purpose of this note is to present recent developments in unstructured grid generation and flow solution technology.
Towards Real-Time Pilot-in-the-Loop Simulation of Rotorcraft With Fully-Coupled CFD Solutions of Rotor / Terrain Interactions

NASA Astrophysics Data System (ADS)

Oruc, Ilker

This thesis presents the development of computationally efficient coupling of Navier-Stokes CFD with a helicopter flight dynamics model, with the ultimate goal of real-time simulation of fully coupled aerodynamic interactions between rotor flow and the surrounding terrain. A particular focus of the research is on coupled airwake effects in the helicopter / ship dynamic interface. A computationally efficient coupling interface was developed between the helicopter flight dynamics model, GENHEL-PSU, and the Navier-Stokes solvers, CRUNCH/CRAFT-CFD, using both FORTRAN and C/C++ programming languages. In order to achieve real-time execution speeds, the main rotor was modeled with a simplified actuator disk using unsteady momentum sources, instead of resolving the full blade geometry in the CFD. All the airframe components, including the fuselage, are represented by single aerodynamic control points in the CFD calculations. The rotor downwash influence on the fuselage and empennage is calculated by using the CFD predicted local flow velocities at these aerodynamic control points defined on the helicopter airframe. In the coupled simulations, the flight dynamics model is free to move within a computational domain, where the main rotor forces are translated into source terms in the momentum equations of the Navier-Stokes equations. Simultaneously, the CFD calculates induced velocities that are fed back to the simulation and affect the aerodynamic loads in the flight dynamics. The CFD solver models the inflow, ground effect, and interactional aerodynamics in the flight dynamics simulation, and these calculations can be coupled with solution of the external flow (e.g. ship airwake effects).
The developed framework was utilized for various investigations of hovering, forward flight and helicopter/terrain interaction simulations including standard ground effect, partial ground effect, sloped terrain, and acceleration in ground effect, and the results were compared with different flight and experimental data. In near ground cases, the fully-coupled flight dynamics and CFD simulations predicted roll oscillations due to interactions of the rotor downwash, ground plane, and the feedback controller, which are not predicted by the conventional simulation models. Fully coupled simulations of a helicopter accelerating near ground predicted flow formations similar to the recirculation and ground vortex flow regimes observed in experiments. The predictions of hover power reductions due to ground effect compared well to recent experimental data, and the results showed a 22% power reduction for a hover flight z/R=0.55 above ground level. Fully coupled simulations were performed for a helicopter hovering over and approaching a ship flight deck, and the results were compared with the standalone GENHEL-PSU simulations without ship airwake and with one-way coupled simulations. The fully-coupled simulations showed higher pilot workload compared to the other two cases. In order to increase the execution speeds of the CFD calculations, several improvements were made to the CFD solver. First, the initial coupling approach, file I/O, was replaced with a more efficient method, a Multiple Program Multiple Data (MPMD) MPI framework, where the two executables communicate with each other by MPI calls. Next, the unstructured solver (CRUNCH CFD), which is 2nd-order accurate in space, was replaced with the faster running structured solver (CRAFT CFD) that is 5th-order accurate in space. Other improvements, including a more efficient k-d tree search algorithm and the bounding of the source term search space within a small region of the grid surrounding the rotor, were made to the CFD solver.
The final improvement was to parallelize the search task with the CFD solver tasks within the solver. To quantify the speedup of the improvements to the coupling interface described above, a study was performed to demonstrate the speedup achieved from each of the interface improvements. The improvements made to the CFD solver showed more than 40 times speedup from the baseline file I/O and unstructured solver CRUNCH CFD. Using a structured CFD solver with 5th-order spatial accuracy provided the largest reductions in execution times. Disregarding the solver numerics, the total speedup of all of the interface improvements, including the MPMD rotor point exchange, k-d tree search algorithm, bounded search space, and parallelized search task, was approximately 231%, more than a factor of 2. All these improvements provided the necessary speedup for approaching real-time CFD. (Abstract shortened by ProQuest.)
Hybrid mesh finite volume CFD code for studying heat transfer in a forward-facing step

NASA Astrophysics Data System (ADS)

Jayakumar, J. S.; Kumar, Inder; Eswaran, V.

2010-12-01

Computational fluid dynamics (CFD) methods employ two types of grid: structured and unstructured. Developing the solver and data structures for a finite-volume solver is easier for structured grids than for unstructured grids. But real-life problems are too complicated to be fitted flexibly by structured grids. Therefore, unstructured grids are widely used for solving real-life problems. However, using only one type of unstructured element consumes a lot of computational time because the number of elements cannot be controlled. Hence, a hybrid grid that contains mixed elements, such as hexahedral elements along with tetrahedral and pyramidal elements, gives the user control over the number of elements in the domain, and thus only the part of the domain that requires a finer grid is meshed finer and not the entire domain. This work aims to develop such a finite-volume hybrid grid solver capable of handling turbulent flows and conjugate heat transfer. It has been extended to solving flow involving separation and subsequent reattachment occurring due to sudden expansion or contraction. A significant effect of mixing high- and low-enthalpy fluid occurs in the reattached regions of these devices. This makes the study of the backward-facing and forward-facing step with heat transfer an important field of research. The problem of the forward-facing step with conjugate heat transfer was taken up and solved for turbulent flow using the two-equation k-ω model. The variation in the flow profile and heat transfer behavior has been studied with the variation in Re and solid to fluid thermal conductivity ratios. The results for the variation in local Nusselt number, interface temperature and skin friction factor are presented.
Predictive Flow Control to Minimize Convective Time Delays

DTIC Science & Technology

2013-08-19

simulation. The CFD solver used is Cobalt, an unstructured finite-volume code developed for the solution of the compressible Navier-Stokes...cell-centered finite volume approach applicable to arbitrary cell topologies (e.g., hexahedra, prisms, tetrahedra). The spatial operator uses a Riemann ... solver, least squares gradient calculations using QR factorization to provide second order accuracy in space. A point implicit method using
Hydrodynamic Drag Reduction

DTIC Science & Technology

2015-04-01

Computational Engineering unstructured RANS/LES/DES solver, Tenasi, was used to predict drag and simulate the free surface flow around the ACV over a...using a second-order accurate Roe approximate Riemann scheme, while viscous fluxes are evaluated using a second-order directional derivative approach...Predictions of rigid body ship motions for the S175 container ship in incident waves and methodology for a one-way coupling of the Tenasi flow solver

Computational Fluid Dynamics (CFD) Design of a Blended Wing Body (BWB) with Boundary Layer Ingestion (BLI) Nacelles

NASA Technical Reports Server (NTRS)

Morehouse, Melissa B.

2001-01-01

A study is being conducted to improve the propulsion/airframe integration for the Blended Wing-Body (BWB) configuration with boundary layer ingestion nacelles. Two unstructured grid flow solvers, USM3D and FUN3D, have been coupled with different design methods and are being used to redesign the aft wing region and the nacelles to reduce drag and flow separation. An initial study comparing analyses from these two flow solvers against data from a wind tunnel test as well as predictions from the OVERFLOW structured grid code for a BWB without nacelles has been completed. Results indicate that the unstructured grid codes are sufficiently accurate for use in design. Results from the BWB design study will be presented.
Directional Agglomeration Multigrid Techniques for High Reynolds Number Viscous Flow Solvers

NASA Technical Reports Server (NTRS)

1998-01-01

A preconditioned directional-implicit agglomeration algorithm is developed for solving two- and three-dimensional viscous flows on highly anisotropic unstructured meshes of mixed-element types. The multigrid smoother consists of a pre-conditioned point- or line-implicit solver which operates on lines constructed in the unstructured mesh using a weighted graph algorithm. Directional coarsening or agglomeration is achieved using a similar weighted graph algorithm. A tight coupling of the line construction and directional agglomeration algorithms enables the use of aggressive coarsening ratios in the multigrid algorithm, which in turn reduces the cost of a multigrid cycle. Convergence rates which are independent of the degree of grid stretching are demonstrated in both two and three dimensions. Further improvement of the three-dimensional convergence rates through a GMRES technique is also demonstrated.
LES of Swirling Reacting Flows via the Unstructured scalar-FDF Solver

NASA Astrophysics Data System (ADS)

Ansari, Naseem; Pisciuneri, Patrick; Strakey, Peter; Givi, Peyman

2011-11-01

Swirling flames pose a significant challenge for computational modeling due to the presence of recirculation regions and vortex shedding. In this work, results are presented of LES of two swirl stabilized non-premixed flames (SM1 and SM2) via the FDF methodology. These flames are part of the database for validation of turbulent-combustion models. The scalar-FDF is simulated on a domain discretized by unstructured meshes, and is coupled with a finite volume flow solver. In the SM1 flame (with a low swirl number) chemistry is described by the flamelet model based on the full GRI 2.11 mechanism. The SM2 flame (with a high swirl number) is simulated via a 46-step 17-species mechanism. The simulated results are assessed via comparison with experimental data.
Implementation of Implicit Adaptive Mesh Refinement in an Unstructured Finite-Volume Flow Solver

NASA Technical Reports Server (NTRS)

Schwing, Alan M.; Nompelis, Ioannis; Candler, Graham V.

2013-01-01

This paper explores the implementation of adaptive mesh refinement in an unstructured, finite-volume solver. Unsteady and steady problems are considered. The effect on the recovery of high-order numerics is explored and the results are favorable. Important to this work is the ability to provide a path for efficient, implicit time advancement. A method using a simple refinement sensor based on undivided differences is discussed and applied to a practical problem: a shock-shock interaction on a hypersonic, inviscid double-wedge. Cases are compared to uniform grids without the use of adapted meshes in order to assess error and computational expense. Discussion of difficulties, advances, and future work prepare this method for additional research. The potential for this method in more complicated flows is described.
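A refinement sensor of the kind mentioned in the abstract above can be illustrated in one dimension. The following is a minimal sketch only: the second-difference form, the function name `undivided_difference_sensor`, and the threshold value are assumptions for illustration, since the abstract does not give the authors' actual sensor.

```python
def undivided_difference_sensor(u, threshold):
    """Flag interior cells whose undivided second difference exceeds threshold.

    Hypothetical 1D illustration; not the sensor from the paper.
    """
    flags = [False] * len(u)
    for i in range(1, len(u) - 1):
        # "Undivided" means no division by the cell size, so in smooth
        # regions the value shrinks as the mesh is refined.
        second_diff = abs(u[i + 1] - 2.0 * u[i] + u[i - 1])
        flags[i] = second_diff > threshold
    return flags


# A step profile standing in for a captured shock.
cells = [1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]
flagged = [i for i, f in enumerate(undivided_difference_sensor(cells, 0.5)) if f]
print(flagged)  # indices of the cells on either side of the jump
```

Only the two cells adjacent to the discontinuity are flagged, which is the behavior one would want before refining around a shock.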
A GPU-based incompressible Navier-Stokes solver on moving overset grids

NASA Astrophysics Data System (ADS)

Chandar, Dominic D. J.; Sitaraman, Jayanarayanan; Mavriplis, Dimitri J.

2013-07-01

In pursuit of obtaining high fidelity solutions to the fluid flow equations in a short span of time, graphics processing units (GPUs) which were originally intended for gaming applications are currently being used to accelerate computational fluid dynamics (CFD) codes. With a high peak throughput of about 1 TFLOPS on a PC, GPUs seem to be favourable for many high-resolution computations. One such computation that involves a lot of number crunching is computing time accurate flow solutions past moving bodies. The aim of the present paper is thus to discuss the development of a flow solver on unstructured and overset grids and its implementation on GPUs. In its present form, the flow solver solves the incompressible fluid flow equations on unstructured/hybrid/overset grids using a fully implicit projection method. The resulting discretised equations are solved using a matrix-free Krylov solver using several GPU kernels such as gradient, Laplacian and reduction. Some of the simple arithmetic vector calculations are implemented using the CU++ approach (CU++: An Object Oriented Framework for Computational Fluid Dynamics Applications using Graphics Processing Units, Journal of Supercomputing, 2013, doi:10.1007/s11227-013-0985-9), where GPU kernels are automatically generated at compile time. Results are presented for two- and three-dimensional computations on static and moving grids.
Time integration algorithms for the two-dimensional Euler equations on unstructured meshes

NASA Technical Reports Server (NTRS)

Slack, David C.; Whitaker, D. L.; Walters, Robert W.

1994-01-01

Explicit and implicit time integration algorithms for the two-dimensional Euler equations on unstructured grids are presented. Both cell-centered and cell-vertex finite volume upwind schemes utilizing Roe's approximate Riemann solver are developed. For the cell-vertex scheme, a four-stage Runge-Kutta time integration, a four-stage Runge-Kutta time integration with implicit residual averaging, a point Jacobi method, a symmetric point Gauss-Seidel method and two methods utilizing preconditioned sparse matrix solvers are presented. For the cell-centered scheme, a Runge-Kutta scheme, an implicit tridiagonal relaxation scheme modeled after line Gauss-Seidel, a fully implicit lower-upper (LU) decomposition, and a hybrid scheme utilizing both Runge-Kutta and LU methods are presented. A reverse Cuthill-McKee renumbering scheme is employed for the direct solver to decrease CPU time by reducing the fill of the Jacobian matrix. A comparison of the various time integration schemes is made for both first-order and higher order accurate solutions using several mesh sizes. Higher order accuracy is achieved by using multidimensional monotone linear reconstruction procedures. The results obtained for a transonic flow over a circular arc suggest that the preconditioned sparse matrix solvers perform better than the other methods as the number of elements in the mesh increases.
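The four-stage Runge-Kutta time integration named in the abstract above can be sketched as a multistage update in which every stage restarts from the solution at the beginning of the step. This is an illustrative sketch under assumptions: the stage coefficients 1/4, 1/3, 1/2, 1 are a commonly used choice and are not stated in the abstract, and `multistage_step` and `decay` are hypothetical names.

```python
import math

# Assumed stage coefficients; the abstract does not list the ones used.
STAGE_COEFFS = (0.25, 1.0 / 3.0, 0.5, 1.0)


def multistage_step(u, dt, residual):
    """Advance du/dt = residual(u) by one time step; u is a list of floats."""
    u_stage = list(u)
    for alpha in STAGE_COEFFS:
        r = residual(u_stage)
        # Each stage restarts from the solution at the start of the step,
        # so only the initial state and the current stage are stored.
        u_stage = [u0 + alpha * dt * ri for u0, ri in zip(u, r)]
    return u_stage


def decay(u):
    """Model residual for du/dt = -u, whose exact solution is exp(-t)."""
    return [-x for x in u]


u_new = multistage_step([1.0], 0.1, decay)
error = abs(u_new[0] - math.exp(-0.1))
print(u_new[0], error)
```

In a flow solver the `residual` callable would be the flux balance over each control volume; the scalar decay problem is used here only because its exact solution is known.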
A scalable nonlinear fluid-structure interaction solver based on a Schwarz preconditioner with isogeometric unstructured coarse spaces in 3D

NASA Astrophysics Data System (ADS)

Kong, Fande; Cai, Xiao-Chuan

2017-07-01

Nonlinear fluid-structure interaction (FSI) problems on unstructured meshes in 3D appear in many applications in science and engineering, such as vibration analysis of aircraft and patient-specific diagnosis of cardiovascular diseases. In this work, we develop a highly scalable, parallel algorithmic and software framework for FSI problems consisting of a nonlinear fluid system and a nonlinear solid system that are coupled monolithically. The FSI system is discretized by a stabilized finite element method in space and a fully implicit backward difference scheme in time. To solve the large, sparse system of nonlinear algebraic equations at each time step, we propose an inexact Newton-Krylov method together with a multilevel, smoothed Schwarz preconditioner with isogeometric coarse meshes generated by a geometry preserving coarsening algorithm. Here "geometry" includes the boundary of the computational domain and the wet interface between the fluid and the solid. We show numerically that the proposed algorithm and implementation are highly scalable in terms of the number of linear and nonlinear iterations and the total compute time on a supercomputer with more than 10,000 processor cores for several problems with hundreds of millions of unknowns.
A scalable nonlinear fluid–structure interaction solver based on a Schwarz preconditioner with isogeometric unstructured coarse spaces in 3D

DOE PAGES

Kong, Fande; Cai, Xiao-Chuan

2017-03-24

Nonlinear fluid-structure interaction (FSI) problems on unstructured meshes in 3D appear in many applications in science and engineering, such as vibration analysis of aircraft and patient-specific diagnosis of cardiovascular diseases. In this work, we develop a highly scalable, parallel algorithmic and software framework for FSI problems consisting of a nonlinear fluid system and a nonlinear solid system that are coupled monolithically. The FSI system is discretized by a stabilized finite element method in space and a fully implicit backward difference scheme in time. To solve the large, sparse system of nonlinear algebraic equations at each time step, we propose an inexact Newton-Krylov method together with a multilevel, smoothed Schwarz preconditioner with isogeometric coarse meshes generated by a geometry preserving coarsening algorithm. Here "geometry" includes the boundary of the computational domain and the wet interface between the fluid and the solid. We show numerically that the proposed algorithm and implementation are highly scalable in terms of the number of linear and nonlinear iterations and the total compute time on a supercomputer with more than 10,000 processor cores for several problems with hundreds of millions of unknowns.
An Adaptive Flow Solver for Air-Borne Vehicles Undergoing Time-Dependent Motions/Deformations

NASA Technical Reports Server (NTRS)

Singh, Jatinder; Taylor, Stephen

1997-01-01

This report describes a concurrent Euler flow solver for flows around complex 3-D bodies. The solver is based on a cell-centered finite volume methodology on 3-D unstructured tetrahedral grids. In this algorithm, spatial discretization for the inviscid convective term is accomplished using an upwind scheme. A localized, second order accurate reconstruction is done for the flow variables. Evolution in time is accomplished using an explicit three-stage Runge-Kutta method which has second order temporal accuracy. This is adapted for concurrent execution using another proven methodology based on concurrent graph abstraction. This solver operates on heterogeneous network architectures. These architectures may include a broad variety of UNIX workstations and PCs running Windows NT, symmetric multiprocessors and distributed-memory multi-computers. The unstructured grid is generated using commercial grid generation tools. The grid is automatically partitioned using a concurrent algorithm based on heat diffusion. This results in memory requirements that are inversely proportional to the number of processors. The solver uses automatic granularity control and resource management techniques both to balance load and communication requirements and to deal with differing memory constraints. These ideas are again based on heat diffusion. Results are subsequently combined for visualization and analysis using commercial CFD tools. Flow simulation results are demonstrated for a constant section wing at subsonic, transonic, and supersonic conditions. These results are compared with experimental data and numerical results of other researchers. Performance results are under way for a variety of network topologies.
FoSSI: the family of simplified solver interfaces for the rapid development of parallel numerical atmosphere and ocean models

NASA Astrophysics Data System (ADS)

Frickenhaus, Stephan; Hiller, Wolfgang; Best, Meike

The portable software FoSSI is introduced that, in combination with additional free solver software packages, allows for an efficient and scalable parallel solution of large sparse linear equation systems arising in finite element model codes. FoSSI is intended to support rapid model code development, completely hiding the complexity of the underlying solver packages. In particular, the model developer need not be an expert in parallelization and is yet free to switch between different solver packages by simple modifications of the interface call. FoSSI offers an efficient and easy, yet flexible interface to several parallel solvers, most of them available on the web, such as PETSC, AZTEC, MUMPS, PILUT and HYPRE. FoSSI makes use of the concept of handles for vectors, matrices, preconditioners and solvers, which is frequently used in solver libraries. Hence, FoSSI allows for a flexible treatment of several linear equation systems and associated preconditioners at the same time, even in parallel on separate MPI-communicators. The second special feature in FoSSI is the task specifier, a combination of keywords, each configuring a certain phase in the solver setup. This enables the user to control a solver through one unique subroutine. Furthermore, FoSSI has rather similar features for all solvers, making a fast solver intercomparison or exchange an easy task. FoSSI is a community software, proven in an adaptive 2D-atmosphere model and a 3D-primitive equation ocean model, both formulated in finite elements. The present paper discusses perspectives of an OpenMP-implementation of parallel iterative solvers based on domain decomposition methods. This approach to OpenMP solvers is rather attractive, as the code for domain-local operations of factorization, preconditioning and matrix-vector product can be readily taken from a sequential implementation that is also suitable to be used in an MPI-variant. Code development in this direction is in an advanced state under the name ScOPES: the Scalable Open Parallel sparse linear Equations Solver.
A Solution Adaptive Structured/Unstructured Overset Grid Flow Solver with Applications to Helicopter Rotor Flows

NASA Technical Reports Server (NTRS)

Duque, Earl P. N.; Biswas, Rupak; Strawn, Roger C.

1995-01-01

This paper summarizes a method that solves both the three dimensional thin-layer Navier-Stokes equations and the Euler equations using overset structured and solution adaptive unstructured grids with applications to helicopter rotor flowfields. The overset structured grids use an implicit finite-difference method to solve the thin-layer Navier-Stokes/Euler equations while the unstructured grid uses an explicit finite-volume method to solve the Euler equations. Solutions on a helicopter rotor in hover show the ability to accurately convect the rotor wake. However, isotropic subdivision of the tetrahedral mesh rapidly increases the overall problem size.
A CFD Heterogeneous Parallel Solver Based on Collaborating CPU and GPU

NASA Astrophysics Data System (ADS)

Lai, Jianqi; Tian, Zhengyu; Li, Hua; Pan, Sha

2018-03-01

Because graphics processing units (GPUs) offer strong floating-point performance and high memory bandwidth for data parallelism, they have been widely used in general-purpose computing areas such as molecular dynamics (MD) and computational fluid dynamics (CFD). The emergence of the compute unified device architecture (CUDA), which reduces the complexity of compiling programs, brings great opportunities to CFD. There are three different modes for parallel solution of the NS equations: a parallel solver based on CPUs, a parallel solver based on GPUs, and a heterogeneous parallel solver based on collaborating CPUs and GPUs. GPUs are relatively rich in compute capacity but poor in memory capacity, and CPUs are the opposite. To make full use of both, a CFD heterogeneous parallel solver based on collaborating CPU and GPU has been established. Three cases are presented to analyse the solver's computational accuracy and heterogeneous parallel efficiency. The numerical results agree well with experimental results, which demonstrates that the heterogeneous parallel solver has high computational precision. The speedup on a single GPU is more than 40 for laminar flow; it decreases for turbulent flow but can still reach more than 20. Moreover, the speedup increases as the grid size becomes larger.
A parallel electrostatic Particle-in-Cell method on unstructured tetrahedral grids for large-scale bounded collisionless plasma simulations

NASA Astrophysics Data System (ADS)

Averkin, Sergey N.; Gatsonis, Nikolaos A.

2018-06-01

An unstructured electrostatic Particle-In-Cell (EUPIC) method is developed on arbitrary tetrahedral grids for simulation of plasmas bounded by arbitrary geometries. The electric potential in EUPIC is obtained on cell vertices from a finite volume Multi-Point Flux Approximation of Gauss' law using the indirect dual cell with Dirichlet, Neumann and external circuit boundary conditions. The resulting matrix equation for the nodal potential is solved with a restarted generalized minimal residual method (GMRES) and an ILU(0) preconditioner algorithm, parallelized using a combination of node coloring and level scheduling approaches. The electric field on vertices is obtained using the gradient theorem applied to the indirect dual cell. The algorithms for injection, particle loading, particle motion, and particle tracking are parallelized for unstructured tetrahedral grids. The algorithms for the potential solver, electric field evaluation, loading, and scatter-gather are verified using analytic solutions for test cases subject to Laplace and Poisson equations. Grid sensitivity analysis examines the L2 and L∞ norms of the relative error in potential, field, and charge density as a function of edge-averaged and volume-averaged cell size. Analysis shows second order of convergence for the potential and first order of convergence for the electric field and charge density. Temporal sensitivity analysis is performed and the momentum and energy conservation properties of the particle integrators in EUPIC are examined. The effects of cell size and timestep on heating, slowing-down and the deflection times are quantified. The heating, slowing-down and the deflection times are found to be almost linearly dependent on the number of particles per cell. EUPIC simulations of current collection by cylindrical Langmuir probes in collisionless plasmas compare well with previous experimentally validated numerical results. These simulations were also used in a parallelization efficiency investigation. Results show that EUPIC has an efficiency of more than 80% when the simulation is performed on a single CPU from a non-uniform memory access node, and that the efficiency decreases as the number of threads further increases. EUPIC is applied to the simulation of the multi-species plasma flow over a geometrically complex CubeSat in Low Earth Orbit. The EUPIC potential and flowfield distribution around the CubeSat exhibit features that are consistent with previous simulations over simpler geometrical bodies.
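The level scheduling approach mentioned in the abstract above, used to parallelize the triangular solves of an ILU(0) preconditioner, can be sketched as follows. This is a generic illustration under assumptions, not the EUPIC implementation: the sparsity-pattern representation and the name `level_schedule` are hypothetical.

```python
def level_schedule(lower_pattern):
    """Group the rows of a lower-triangular solve into dependency levels.

    lower_pattern[i] lists the columns j < i that hold a nonzero in row i.
    Rows placed in the same level do not depend on each other, so they
    could be processed concurrently (for example, one thread per row).
    """
    n = len(lower_pattern)
    level = [0] * n
    for i in range(n):
        for j in lower_pattern[i]:
            # Row i can start only after every row it references is done.
            level[i] = max(level[i], level[j] + 1)
    levels = [[] for _ in range(max(level) + 1)] if n else []
    for i, lev in enumerate(level):
        levels[lev].append(i)
    return levels


# Hypothetical 6-row pattern: rows 0 and 1 have no dependencies.
pattern = [[], [], [0], [0, 1], [2], [3, 4]]
schedule = level_schedule(pattern)
print(schedule)
```

The number of levels bounds the number of sequential steps in the solve, so a mesh numbering that produces few, wide levels exposes more parallelism.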
Unstructured Polyhedral Mesh Thermal Radiation Diffusion

DOE Office of Scientific and Technical Information (OSTI.GOV)

Palmer, T.S.; Zika, M.R.; Madsen, N.K.

2000-07-27

Unstructured mesh particle transport and diffusion methods are gaining wider acceptance as mesh generation, scientific visualization and linear solvers improve. This paper describes an algorithm that is currently being used in the KULL code at Lawrence Livermore National Laboratory to solve the radiative transfer equations. The algorithm employs a point-centered diffusion discretization on arbitrary polyhedral meshes in 3D. We present the results of a few test problems to illustrate the capabilities of the radiation diffusion module.
Solution algorithms for the two-dimensional Euler equations on unstructured meshes

NASA Technical Reports Server (NTRS)

Whitaker, D. L.; Slack, David C.; Walters, Robert W.

1990-01-01

The objective of the study was to analyze implicit techniques employed in structured grid algorithms for solving two-dimensional Euler equations and extend them to unstructured solvers in order to accelerate convergence rates. A comparison is made between nine different algorithms for both first-order and second-order accurate solutions. Higher-order accuracy is achieved by using multidimensional monotone linear reconstruction procedures. The discussion is illustrated by results for flow over a transonic circular arc.
MPSalsa Version 1.5: A Finite Element Computer Program for Reacting Flow Problems: Part 1 - Theoretical Development

DOE Office of Scientific and Technical Information (OSTI.GOV)

Devine, K.D.; Hennigan, G.L.; Hutchinson, S.A.

1999-01-01

The theoretical background for the finite element computer program, MPSalsa Version 1.5, is presented in detail. MPSalsa is designed to solve laminar or turbulent low Mach number, two- or three-dimensional incompressible and variable density reacting fluid flows on massively parallel computers, using a Petrov-Galerkin finite element formulation. The code has the capability to solve coupled fluid flow (with auxiliary turbulence equations), heat transport, multicomponent species transport, and finite-rate chemical reactions, and to solve coupled multiple Poisson or advection-diffusion-reaction equations. The program employs the CHEMKIN library to provide a rigorous treatment of multicomponent ideal gas kinetics and transport. Chemical reactions occurring in the gas phase and on surfaces are treated by calls to CHEMKIN and SURFACE CHEMKIN, respectively. The code employs unstructured meshes, using the EXODUS II finite element database suite of programs for its input and output files. MPSalsa solves both transient and steady flows by using fully implicit time integration, an inexact Newton method and iterative solvers based on preconditioned Krylov methods as implemented in the Aztec solver library.
Assessment of Hybrid RANS/LES Turbulence Models for Aeroacoustics Applications

NASA Technical Reports Server (NTRS)

Vatsa, Veer N.; Lockard, David P.

2010-01-01

Predicting the noise from aircraft with exposed landing gear remains a challenging problem for the aeroacoustics community. Although computational fluid dynamics (CFD) has shown promise as a technique that could produce high-fidelity flow solutions, generating grids that can resolve the pertinent physics around complex configurations can be very challenging. Structured grids are often impractical for such configurations. Unstructured grids offer a path forward for simulating complex configurations. However, few unstructured grid codes have been thoroughly tested for unsteady flow problems in the manner needed for aeroacoustic prediction. A widely used unstructured grid code, FUN3D, is examined for resolving the near field in unsteady flow problems. Although the ultimate goal is to compute the flow around complex geometries such as the landing gear, simpler problems that include some of the relevant physics and are easily amenable to structured grid approaches are used for testing the unstructured grid approach. The test cases chosen for this study correspond to the experimental work on single and tandem cylinders conducted in the Basic Aerodynamic Research Tunnel (BART) and the Quiet Flow Facility (QFF) at NASA Langley Research Center. These configurations offer an excellent opportunity to assess the performance of hybrid RANS/LES turbulence models that transition from RANS in unresolved regions near solid bodies to LES in the outer flow field. Several of these models have been implemented and tested in both structured and unstructured grid codes to evaluate their dependence on the solver and mesh type. Comparisons of FUN3D solutions with experimental data and with numerical solutions from a structured grid flow solver are found to be encouraging.
A Robust and Scalable Software Library for Parallel Adaptive Refinement on Unstructured Meshes

NASA Technical Reports Server (NTRS)

Lou, John Z.; Norton, Charles D.; Cwik, Thomas A.

1999-01-01

The design and implementation of Pyramid, a software library for performing parallel adaptive mesh refinement (PAMR) on unstructured meshes, is described. This software library can be easily used in a variety of unstructured parallel computational applications, including parallel finite element, parallel finite volume, and parallel visualization applications using triangular or tetrahedral meshes. The library contains a suite of well-designed and efficiently implemented modules that perform operations in a typical PAMR process. Among these are mesh quality control during successive parallel adaptive refinement (typically guided by a local-error estimator), parallel load-balancing, and parallel mesh partitioning using the ParMeTiS partitioner. The Pyramid library is implemented in Fortran 90 with an interface to the Message-Passing Interface (MPI) library, supporting code efficiency, modularity, and portability. An EM waveguide filter application, adaptively refined using the Pyramid library, is illustrated.
An implementation of a chemical and thermal nonequilibrium flow solver on unstructured meshes and application to blunt bodies

NASA Technical Reports Server (NTRS)

Prabhu, Ramadas K.

1994-01-01

This paper presents a nonequilibrium flow solver, implementation of the algorithm on unstructured meshes, and application to hypersonic flow past blunt bodies. Air is modeled as a mixture of five chemical species, namely O2, N2, O, NO, and N, having two temperatures, namely translational and vibrational. The solution algorithm is a cell centered, point implicit upwind scheme that employs Roe's flux difference splitting technique. Implementation of this algorithm on unstructured meshes is described. The computer code is applied to solve Mach 15 flow with and without a Type IV shock interference on a cylindrical body of 2.5 mm radius representing a cowl lip. Adaptively generated meshes are employed, and the meshes are refined several times until the solution exhibits detailed flow features and surface pressure and heat flux distributions. Effects of a catalytic wall on surface heat flux distribution are studied. For the Mach 15 Type IV shock interference flow, present results showed a peak heat flux of 544 MW/m² for a fully catalytic wall and 431 MW/m² for a noncatalytic wall. Some of the results are compared with available computational data.
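The idea behind Roe's flux difference splitting, named in the abstract above, can be shown on a much simpler model problem. The sketch below applies it to the scalar inviscid Burgers equation with flux f(u) = u²/2; this substitution is an assumption made for brevity and is far removed from the five-species, two-temperature system in the paper. The function names are hypothetical, and no entropy fix is included.

```python
def burgers_flux(u):
    """Physical flux of the inviscid Burgers equation."""
    return 0.5 * u * u


def roe_flux(u_left, u_right):
    """Roe flux-difference-splitting interface flux for Burgers' equation."""
    # Roe-averaged wave speed: (f(uR) - f(uL)) / (uR - uL) = (uL + uR) / 2.
    a_roe = 0.5 * (u_left + u_right)
    central = 0.5 * (burgers_flux(u_left) + burgers_flux(u_right))
    # The dissipation term selects the upwind state based on the sign of a_roe.
    return central - 0.5 * abs(a_roe) * (u_right - u_left)


# Right-moving wave: the flux is taken from the left (upwind) state.
print(roe_flux(2.0, 1.0), burgers_flux(2.0))
# Left-moving wave: the flux is taken from the right (upwind) state.
print(roe_flux(-2.0, -1.0), burgers_flux(-1.0))
```

For a system of equations the scalar `a_roe` is replaced by the eigenvalues of a Roe-averaged Jacobian matrix, and the flux difference is split across the corresponding characteristic waves.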
Sound-turbulence interaction in transonic boundary layers

NASA Astrophysics Data System (ADS)

Lelostec, Ludovic; Scalo, Carlo; Lele, Sanjiva

2014-11-01

Acoustic wave scattering in a transonic boundary layer is investigated through a novel approach. Instead of simulating directly the interaction of an incoming oblique acoustic wave with a turbulent boundary layer, suitable Dirichlet conditions are imposed at the wall to reproduce only the reflected wave resulting from the interaction of the incident wave with the boundary layer. The method is first validated using the laminar boundary layer profiles in a parallel flow approximation. For this scattering problem an exact inviscid solution can be found in the frequency domain which requires numerical solution of an ODE. The Dirichlet conditions are imposed in a high-fidelity unstructured compressible flow solver for Large Eddy Simulation (LES), CharLESx. The acoustic field of the reflected wave is then solved and the interaction between the boundary layer and sound scattering can be studied.

Multigrid approaches to non-linear diffusion problems on unstructured meshes

NASA Technical Reports Server (NTRS)

Mavriplis, Dimitri J.; Bushnell, Dennis M. (Technical Monitor)

2001-01-01

The efficiency of three multigrid methods for solving highly non-linear diffusion problems on two-dimensional unstructured meshes is examined. The three multigrid methods differ mainly in the manner in which the nonlinearities of the governing equations are handled. These comprise a non-linear full approximation storage (FAS) multigrid method which is used to solve the non-linear equations directly, a linear multigrid method which is used to solve the linear system arising from a Newton linearization of the non-linear system, and a hybrid scheme which is based on a non-linear FAS multigrid scheme, but employs a linear solver on each level as a smoother. Results indicate that all methods are equally effective at converging the non-linear residual in a given number of grid sweeps, but that the linear solver is more efficient in cpu time due to the lower cost of linear versus non-linear grid sweeps.
Static Aeroelastic Predictions for a Transonic Transport Model Using an Unstructured-Grid Flow Solver Coupled With a Structural Plate Technique

NASA Technical Reports Server (NTRS)

Allison, Dennis O.; Cavallo, Peter A.

2003-01-01

An equivalent-plate structural deformation technique was coupled with a steady-state unstructured-grid three-dimensional Euler flow solver and a two-dimensional strip interactive boundary-layer technique. The objective of the research was to assess the extent to which a simple accounting for static model deformations could improve correlations with measured wing pressure distributions and lift coefficients at transonic speeds. Results were computed and compared to test data for a wing-fuselage model of a generic low-wing transonic transport at a transonic cruise condition over a range of Reynolds numbers and dynamic pressures. The deformations significantly improved correlations with measured wing pressure distributions and lift coefficients. This method provided a means of quantifying the role of dynamic pressure in wind-tunnel studies of Reynolds number effects for transonic transport models.
Computational Aerothermodynamic Simulation Issues on Unstructured Grids

NASA Technical Reports Server (NTRS)

Gnoffo, Peter A.; White, Jeffery A.

2004-01-01

The synthesis of physical models for gas chemistry and turbulence from the structured grid codes LAURA and VULCAN into the unstructured grid code FUN3D is described. A directionally Symmetric, Total Variation Diminishing (STVD) algorithm and an entropy fix (eigenvalue limiter) keyed to local cell Reynolds number are introduced to improve solution quality for hypersonic aeroheating applications. A simple grid-adaptation procedure is incorporated within the flow solver. Simulations of flow over an ellipsoid (perfect gas, inviscid), Shuttle Orbiter (viscous, chemical nonequilibrium) and comparisons to the structured grid solvers LAURA (cylinder, Shuttle Orbiter) and VULCAN (flat plate) are presented to show current capabilities. The quality of heating in 3D stagnation regions is very sensitive to algorithm options. In general, high aspect ratio tetrahedral elements complicate the simulation of high Reynolds number, viscous flow as compared to locally structured meshes aligned with the flow.
BCYCLIC: A parallel block tridiagonal matrix cyclic solver

NASA Astrophysics Data System (ADS)

Hirshman, S. P.; Perumalla, K. S.; Lynch, V. E.; Sanchez, R.

2010-09-01

A block tridiagonal matrix is factored with minimal fill-in using a cyclic reduction algorithm that is easily parallelized. Storage of the factored blocks allows the application of the inverse to multiple right-hand sides which may not be known at factorization time. Scalability with the number of block rows is achieved with cyclic reduction, while scalability with the block size is achieved using multithreaded routines (OpenMP, GotoBLAS) for block matrix manipulation. This dual scalability is a noteworthy feature of this new solver, as is its ability to efficiently handle arbitrary (non-powers-of-2) block row and processor numbers. Comparison with a state-of-the-art parallel sparse solver is presented. It is expected that this new solver will allow many physical applications to optimally use the parallel resources on current supercomputers. Example usage of the solver in magneto-hydrodynamic (MHD), three-dimensional equilibrium solvers for high-temperature fusion plasmas is cited.
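Cyclic reduction, the algorithm at the core of the abstract above, can be illustrated on a scalar tridiagonal system. The sketch below is a serial, recursive version written for clarity and is not the BCYCLIC code: in the block case each diagonal entry would be a dense block and each division a block solve, and the factored blocks would be stored for reuse. The function name and argument layout are assumptions.

```python
def cyclic_reduction(a, b, c, d):
    """Solve a tridiagonal system by recursive cyclic reduction.

    a: sub-diagonal (a[0] unused), b: diagonal, c: super-diagonal
    (c[-1] unused), d: right-hand side. Works for any n >= 1,
    including sizes that are not powers of two.
    """
    n = len(b)
    if n == 1:
        return [d[0] / b[0]]
    ra, rb, rc, rd = [], [], [], []
    for i in range(1, n, 2):
        # Eliminate the even-indexed neighbours of odd row i.
        alpha = a[i] / b[i - 1]
        gamma = c[i] / b[i + 1] if i + 1 < n else 0.0
        ra.append(-alpha * a[i - 1] if i - 1 > 0 else 0.0)
        rc.append(-gamma * c[i + 1] if i + 2 < n else 0.0)
        diag = b[i] - alpha * c[i - 1]
        rhs = d[i] - alpha * d[i - 1]
        if i + 1 < n:
            diag -= gamma * a[i + 1]
            rhs -= gamma * d[i + 1]
        rb.append(diag)
        rd.append(rhs)
    x = [0.0] * n
    # The odd-indexed unknowns form a half-size tridiagonal system.
    x[1::2] = cyclic_reduction(ra, rb, rc, rd)
    # Back-substitute the even-indexed unknowns; these are independent.
    for i in range(0, n, 2):
        left = a[i] * x[i - 1] if i > 0 else 0.0
        right = c[i] * x[i + 1] if i + 1 < n else 0.0
        x[i] = (d[i] - left - right) / b[i]
    return x


# 5x5 system (not a power of two) with diagonal 2 and off-diagonals -1.
sub = [0.0, -1.0, -1.0, -1.0, -1.0]
diag = [2.0, 2.0, 2.0, 2.0, 2.0]
sup = [-1.0, -1.0, -1.0, -1.0, 0.0]
rhs = [0.0, 0.0, 0.0, 0.0, 6.0]
solution = cyclic_reduction(sub, diag, sup, rhs)
print(solution)
```

Within each reduction level the loop iterations over odd rows are independent of one another, as are the back-substitutions over even rows, which is what makes the method easy to parallelize.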
Inlet Spillage Drag Predictions Using the AIRPLANE Code

NASA Technical Reports Server (NTRS)

Thomas, Scott D.; Won, Mark A.; Cliff, Susan E.

1999-01-01

AIRPLANE (Jameson/Baker) is a steady inviscid unstructured Euler flow solver. It has been validated on many HSR geometries. It is implemented as MESHPLANE, an unstructured mesh generator, and FLOPLANE, an iterative flow solver. The surface description from an Intergraph CAD system goes into MESHPLANE as collections of polygonal curves to generate the 3D mesh. The flow solver uses a multistage time stepping scheme with residual averaging to approach steady state, but it is not time accurate. The flow solver was ported from Cray to IBM SP2 by Wu-Sun Cheng (IBM); it could only be run on 4 CPUs at a time because of memory limitations. Meshes for the four cases had about 655,000 points in the flow field, about 3.9 million tetrahedra, and about 77,500 points on the surface. The flow solver took about 23 wall seconds per iteration when using 4 CPUs. It took about eight and a half wall hours to run 1,300 iterations at a time (the queue limit is 10 hours). A revised version of FLOPLANE (Thomas) was used on up to 64 CPUs to finish up some calculations at the end. We had to turn on more communication when using more processors to eliminate noise that was contaminating the flow field; this added about 50% to the elapsed wall time per iteration when using 64 CPUs. This study involved computing lift and drag for a wing/body/nacelle configuration at Mach 0.9 and 4 degrees pitch. Four cases were considered, corresponding to four nacelle mass flow conditions.
Preconditioned implicit solvers for the Navier-Stokes equations on distributed-memory machines

NASA Technical Reports Server (NTRS)

Ajmani, Kumud; Liou, Meng-Sing; Dyson, Rodger W.

1994-01-01

The GMRES method is parallelized, and combined with local preconditioning to construct an implicit parallel solver to obtain steady-state solutions for the Navier-Stokes equations of fluid flow on distributed-memory machines. The new implicit parallel solver is designed to preserve the convergence rate of the equivalent 'serial' solver. A static domain-decomposition is used to partition the computational domain amongst the available processing nodes of the parallel machine. The SPMD (Single-Program Multiple-Data) programming model is combined with message-passing tools to develop the parallel code on a 32-node Intel Hypercube and a 512-node Intel Delta machine. The implicit parallel solver is validated for internal and external flow problems, and is found to compare identically with flow solutions obtained on a Cray Y-MP/8. A peak computational speed of 2300 MFlops/sec has been achieved on 512 nodes of the Intel Delta machine, for a problem size of 1024 K equations (256 K grid points).
GPU surface extraction using the closest point embedding

NASA Astrophysics Data System (ADS)

Kim, Mark; Hansen, Charles

2015-01-01

Isosurface extraction is a fundamental technique used for both surface reconstruction and mesh generation. One method to extract well-formed isosurfaces is a particle system; unfortunately, particle systems can be slow. In this paper, we introduce an enhanced parallel particle system that uses the closest point embedding as the surface representation to speed up the particle system for isosurface extraction. The closest point embedding is used in the Closest Point Method (CPM), a technique that uses a standard three dimensional numerical PDE solver on two dimensional embedded surfaces. To fully take advantage of the closest point embedding, it is coupled with a Barnes-Hut tree code on the GPU. This new technique produces well-formed, conformal unstructured triangular and tetrahedral meshes from labeled multi-material volume datasets. Further, this new parallel implementation of the particle system is faster than any known methods for conformal multi-material mesh extraction. The resulting speed-ups gained in this implementation can reduce the time from labeled data to mesh from hours to minutes and benefit users, such as bioengineers, who employ triangular and tetrahedral meshes
DOE Office of Scientific and Technical Information (OSTI.GOV)

Burke, Timothy P.; Martz, Roger L.; Kiedrowski, Brian C.

New unstructured mesh capabilities in MCNP6 (developmental version during summer 2012) show potential for conducting multi-physics analyses by coupling MCNP to a finite element solver such as Abaqus/CAE [2]. Before these new capabilities can be utilized, the ability of MCNP to accurately estimate eigenvalues and pin powers using an unstructured mesh must first be verified. Previous work to verify the unstructured mesh capabilities in MCNP was accomplished using the Godiva sphere [1], and this work attempts to build on that. To accomplish this, a criticality benchmark and a fuel assembly benchmark were used for calculations in MCNP using both the Constructive Solid Geometry (CSG) native to MCNP and the unstructured mesh geometry generated using Abaqus/CAE. The Big Ten criticality benchmark [3] was modeled due to its geometry being similar to that of a reactor fuel pin. The C5G7 3-D Mixed Oxide (MOX) Fuel Assembly Benchmark [4] was modeled to test the unstructured mesh capabilities on a reactor-type problem.
GSRP/David Marshall: Fully Automated Cartesian Grid CFD Application for MDO in High Speed Flows

NASA Technical Reports Server (NTRS)

2003-01-01

With the renewed interest in Cartesian gridding methodologies for the ease and speed of gridding complex geometries in addition to the simplicity of the control volumes used in the computations, it has become important to investigate ways of extending the existing Cartesian grid solver functionalities. This includes developing methods of modeling the viscous effects in order to utilize Cartesian grid solvers for accurate drag predictions and addressing the issues related to the distributed memory parallelization of Cartesian solvers. This research presents advances in two areas of interest in Cartesian grid solvers: viscous effects modeling and MPI parallelization. The development of viscous effects modeling using solely Cartesian grids has been hampered by the widely varying control volume sizes associated with the mesh refinement and the cut cells associated with the solid surface. This problem is being addressed by using physically based modeling techniques to update the state vectors of the cut cells and removing them from the finite volume integration scheme. This work is performed on a new Cartesian grid solver, NASCART-GT, with modifications to its cut cell functionality. The development of MPI parallelization addresses issues associated with utilizing Cartesian solvers on distributed memory parallel environments. This work is performed on an existing Cartesian grid solver, CART3D, with modifications to its parallelization methodology.
Revisiting Parallel Cyclic Reduction and Parallel Prefix-Based Algorithms for Block Tridiagonal System of Equations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Seal, Sudip K; Perumalla, Kalyan S; Hirshman, Steven Paul

2013-01-01

Simulations that require solutions of block tridiagonal systems of equations rely on fast parallel solvers for runtime efficiency. Leading parallel solvers that are highly effective for general systems of equations, dense or sparse, are limited in scalability when applied to block tridiagonal systems. This paper presents scalability results as well as detailed analyses of two parallel solvers that exploit the special structure of block tridiagonal matrices to deliver superior performance, often by orders of magnitude. A rigorous analysis of their relative parallel runtimes is shown to reveal the existence of a critical block size that separates the parameter space spanned by the number of block rows, the block size and the processor count, into distinct regions that favor one or the other of the two solvers. Dependence of this critical block size on the above parameters as well as on machine-specific constants is established. These formal insights are supported by empirical results on up to 2,048 cores of a Cray XT4 system. To the best of our knowledge, this is the highest reported scalability for parallel block tridiagonal solvers to date.
LDRD final report on massively-parallel linear programming : the parPCx system.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Parekh, Ojas; Phillips, Cynthia Ann; Boman, Erik Gunnar

2005-02-01

This report summarizes the research and development performed from October 2002 to September 2004 at Sandia National Laboratories under the Laboratory-Directed Research and Development (LDRD) project "Massively-Parallel Linear Programming". We developed a linear programming (LP) solver designed to use a large number of processors. LP is the optimization of a linear objective function subject to linear constraints. Companies and universities have expended huge efforts over decades to produce fast, stable serial LP solvers. Previous parallel codes run on shared-memory systems and have little or no distribution of the constraint matrix. We have seen no reports of general LP solver runs on large numbers of processors. Our parallel LP code is based on an efficient serial implementation of Mehrotra's interior-point predictor-corrector algorithm (PCx). The computational core of this algorithm is the assembly and solution of a sparse linear system. We have substantially rewritten the PCx code and based it on Trilinos, the parallel linear algebra library developed at Sandia. Our interior-point method can use either direct or iterative solvers for the linear system. To achieve a good parallel data distribution of the constraint matrix, we use a (pre-release) version of a hypergraph partitioner from the Zoltan partitioning library. We describe the design and implementation of our new LP solver called parPCx and give preliminary computational results. We summarize a number of issues related to efficient parallel solution of LPs with interior-point methods including data distribution, numerical stability, and solving the core linear system using both direct and iterative methods. We describe a number of applications of LP specific to US Department of Energy mission areas and we summarize our efforts to integrate parPCx (and parallel LP solvers in general) into Sandia's massively-parallel integer programming solver PICO (Parallel Integer and Combinatorial Optimizer).
We conclude with directions for long-term future algorithmic research and for near-term development that could improve the performance of parPCx.
Quality assessment of two- and three-dimensional unstructured meshes and validation of an upwind Euler flow solver

NASA Technical Reports Server (NTRS)

Woodard, Paul R.; Batina, John T.; Yang, Henry T. Y.

1992-01-01

Quality assessment procedures are described for two-dimensional unstructured meshes. The procedures include measurement of minimum angles, element aspect ratios, stretching, and element skewness. Meshes about the ONERA M6 wing and the Boeing 747 transport configuration are generated using an advancing front method grid generation package of programs. Solutions of Euler's equations for these meshes are obtained at low angle-of-attack, transonic conditions. Results for these cases, obtained as part of a validation study, demonstrate the accuracy of an implicit upwind Euler solution algorithm.
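Two of the quality measures named in the abstract above, minimum angle and element aspect ratio, can be sketched for a single 2D triangle as follows. This is an illustrative example only: the function name `triangle_quality` is hypothetical, and the aspect ratio is taken here to be longest edge over shortest edge, which is one common convention and not necessarily the definition used in the report. Stretching and skewness are not covered.

```python
import math


def triangle_quality(p0, p1, p2):
    """Return (minimum interior angle in degrees, aspect ratio) of a triangle.

    Aspect ratio is assumed to mean longest edge / shortest edge.
    Points are (x, y) tuples; the triangle must not be degenerate.
    """
    pts = (p0, p1, p2)
    # edges[k] is the length of the edge opposite vertex k.
    edges = [math.dist(pts[(k + 1) % 3], pts[(k + 2) % 3]) for k in range(3)]

    angles = []
    for k in range(3):
        opp = edges[k]
        s1, s2 = edges[(k + 1) % 3], edges[(k + 2) % 3]
        # Law of cosines, clamped to guard against round-off.
        cos_angle = (s1 * s1 + s2 * s2 - opp * opp) / (2.0 * s1 * s2)
        angles.append(math.degrees(math.acos(max(-1.0, min(1.0, cos_angle)))))

    return min(angles), max(edges) / min(edges)
```

An equilateral triangle gives the ideal values (60 degrees, ratio 1); thin sliver elements show up as a small minimum angle and a large ratio, so a mesh-wide report would typically histogram these two numbers over all elements.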
Development of a solution adaptive unstructured scheme for quasi-3D inviscid flows through advanced turbomachinery cascades

NASA Technical Reports Server (NTRS)

Usab, William J., Jr.; Jiang, Yi-Tsann

1991-01-01

The objective of the present research is to develop a general solution adaptive scheme for the accurate prediction of inviscid quasi-three-dimensional flow in advanced compressor and turbine designs. The adaptive solution scheme combines an explicit finite-volume time-marching scheme for unstructured triangular meshes and an advancing front triangular mesh scheme with a remeshing procedure for adapting the mesh as the solution evolves. The unstructured flow solver has been tested on a series of two-dimensional airfoil configurations including a three-element analytic test case presented here. Mesh adapted quasi-three-dimensional Euler solutions are presented for three spanwise stations of the NASA rotor 67 transonic fan. Computed solutions are compared with available experimental data.
Parallel Adaptive Mesh Refinement for High-Order Finite-Volume Schemes in Computational Fluid Dynamics

NASA Astrophysics Data System (ADS)

Schwing, Alan Michael

For computational fluid dynamics, the governing equations are solved on a discretized domain of nodes, faces, and cells. The quality of the grid or mesh can be a driving source for error in the results. While refinement studies can help guide the creation of a mesh, grid quality is largely determined by user expertise and understanding of the flow physics. Adaptive mesh refinement is a technique for enriching the mesh during a simulation based on metrics for error, impact on important parameters, or location of important flow features. This can offload from the user some of the difficult and ambiguous decisions necessary when discretizing the domain. This work explores the implementation of adaptive mesh refinement in an implicit, unstructured, finite-volume solver. Consideration is made for applying modern computational techniques in the presence of hanging nodes and refined cells. The approach is developed to be independent of the flow solver in order to provide a path for augmenting existing codes. It is designed to be applicable for unsteady simulations, and refinement and coarsening of the grid do not impact the conservatism of the underlying numerics. The effect on high-order numerical fluxes of fourth and sixth order is explored. Provided the criteria for refinement are appropriately selected, solutions obtained using adapted meshes have no additional error when compared to results obtained on traditional, unadapted meshes. In order to leverage large-scale computational resources common today, the methods are parallelized using MPI. Parallel performance is considered for several test problems in order to assess scalability of both adapted and unadapted grids. Dynamic repartitioning of the mesh during refinement is crucial for load balancing an evolving grid. Development of the methods outlined here depends on a dual-memory approach that is described in detail.
Validation of the solver developed here against a number of motivating problems shows favorable comparisons across a range of regimes. Unsteady and steady applications are considered in both subsonic and supersonic flows. Inviscid and viscous simulations achieve similar results at a much reduced cost when employing dynamic mesh adaptation. Several techniques for guiding adaptation are compared. Detailed analysis of statistics from the instrumented solver enables understanding of the costs associated with adaptation. Adaptive mesh refinement shows promise for the test cases presented here. It can be considerably faster than using conventional grids and provides accurate results. The procedures for adapting the grid are light-weight enough to not require significant computational time and yield significant reductions in grid size.
Performance Models for the Spike Banded Linear System Solver

DOE PAGES

Manguoglu, Murat; Saied, Faisal; Sameh, Ahmed; ...

2011-01-01

With availability of large-scale parallel platforms comprised of tens-of-thousands of processors and beyond, there is significant impetus for the development of scalable parallel sparse linear system solvers and preconditioners. An integral part of this design process is the development of performance models capable of predicting performance and providing accurate cost models for the solvers and preconditioners. There has been some work in the past on characterizing performance of the iterative solvers themselves. In this paper, we investigate the problem of characterizing performance and scalability of banded preconditioners. Recent work has demonstrated the superior convergence properties and robustness of banded preconditioners, compared to state-of-the-art ILU family of preconditioners as well as algebraic multigrid preconditioners. Furthermore, when used in conjunction with efficient banded solvers, banded preconditioners are capable of significantly faster time-to-solution. Our banded solver, the Truncated Spike algorithm, is specifically designed for parallel performance and tolerance to deep memory hierarchies. Its regular structure is also highly amenable to accurate performance characterization. Using these characteristics, we derive the following results in this paper: (i) we develop parallel formulations of the Truncated Spike solver, (ii) we develop a highly accurate pseudo-analytical parallel performance model for our solver, (iii) we show excellent prediction capabilities of our model – based on which we argue the high scalability of our solver. Our pseudo-analytical performance model is based on analytical performance characterization of each phase of our solver. These analytical models are then parameterized using actual runtime information on target platforms. An important consequence of our performance models is that they reveal underlying performance bottlenecks in both serial and parallel formulations.
All of our results are validated on diverse heterogeneous multiclusters – platforms for which performance prediction is particularly challenging. Finally, we predict the scalability of the Spike algorithm using up to 65,536 cores with our model. In this paper we extend the results presented in the Ninth International Symposium on Parallel and Distributed Computing.
Parallelizing alternating direction implicit solver on GPUs

USDA-ARS's Scientific Manuscript database

We present a parallel Alternating Direction Implicit (ADI) solver on GPUs. Our implementation significantly improves existing implementations in two aspects. First, we address the scalability issue of existing Parallel Cyclic Reduction (PCR) implementations by eliminating their hardware resource con...
DOE Office of Scientific and Technical Information (OSTI.GOV)

Chen, Chao; Pouransari, Hadi; Rajamanickam, Sivasankaran

We present a parallel hierarchical solver for general sparse linear systems on distributed-memory machines. For large-scale problems, this fully algebraic algorithm is faster and more memory-efficient than sparse direct solvers because it exploits the low-rank structure of fill-in blocks. Depending on the accuracy of low-rank approximations, the hierarchical solver can be used either as a direct solver or as a preconditioner. The parallel algorithm is based on data decomposition and requires only local communication for updating boundary data on every processor. Moreover, the computation-to-communication ratio of the parallel algorithm is approximately the volume-to-surface-area ratio of the subdomain owned by every processor. We also provide various numerical results to demonstrate the versatility and scalability of the parallel algorithm.
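The volume-to-surface-area statement in the abstract above can be made concrete with a small worked example. The sketch below is an idealization, not taken from the paper: it assumes a regular 3D grid split into equal cubic subdomains, and the function name `compute_to_comm_ratio` is hypothetical. For a cube with side length s (in cells), the volume is s^3 and the surface area is 6 s^2, so the ratio is s / 6 and shrinks as more processors share a fixed problem.

```python
def compute_to_comm_ratio(total_cells, nprocs):
    """Volume-to-surface-area ratio of one cubic subdomain.

    Idealized assumption: total_cells are split evenly into nprocs cubes,
    so each cube has side (total_cells / nprocs) ** (1/3) cells.
    """
    side = (total_cells / nprocs) ** (1.0 / 3.0)
    volume = side ** 3           # proportional to local computation
    surface = 6.0 * side ** 2    # proportional to boundary data exchanged
    return volume / surface


# Fixed problem of 64^3 cells: the ratio falls as processors are added.
for p in (8, 64, 512):
    print(p, round(compute_to_comm_ratio(64 ** 3, p), 3))
```

Under these assumptions, each eightfold increase in processor count halves the subdomain side length and therefore halves the ratio, which is the usual surface-to-volume argument for why strong scaling eventually becomes communication-bound.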
"A Parallel Adaptive Simulation Tool for Two Phase Steady State Reacting Flows in Industrial Boilers and Furnaces"

DOE Office of Scientific and Technical Information (OSTI.GOV)

Michael J. Bockelie

2002-01-04

This DOE SBIR Phase II final report summarizes research that has been performed to develop a parallel adaptive tool for modeling steady, two phase turbulent reacting flow. The target applications for the new tool are full scale, fossil-fuel fired boilers and furnaces such as those used in the electric utility industry, chemical process industry and mineral/metal process industry. The analyses to be performed on these systems are engineering calculations to evaluate the impact on overall furnace performance due to operational, process or equipment changes. To develop a Computational Fluid Dynamics (CFD) model of an industrial scale furnace requires a carefully designed grid that will capture all of the large and small scale features of the flowfield. Industrial systems are quite large, usually measured in tens of feet, but contain numerous burners, air injection ports, flames and localized behavior with dimensions that are measured in inches or fractions of inches. To create an accurate computational model of such systems requires capturing length scales within the flow field that span several orders of magnitude. In addition, to create an industrially useful model, the grid cannot contain too many grid points: the model must be able to execute on an inexpensive desktop PC in a matter of days. An adaptive mesh provides a convenient means to create a grid that can capture fine flow field detail within a very large domain with a "reasonable" number of grid points. However, the use of an adaptive mesh requires the development of a new flow solver. To create the new simulation tool, we have combined existing reacting CFD modeling software with new software based on emerging block structured Adaptive Mesh Refinement (AMR) technologies developed at Lawrence Berkeley National Laboratory (LBNL).
Specifically, we combined: (1) physical models, modeling expertise, and software from existing combustion simulation codes used by Reaction Engineering International; (2) mesh adaption, data management, and parallelization software and technology being developed by users of the BoxLib library at LBNL; and (3) solution methods for problems formulated on block structured grids that were being developed in collaboration with technical staff members at the University of Utah Center for High Performance Computing (CHPC) and at LBNL. The combustion modeling software used by Reaction Engineering International represents an investment of over fifty man-years of development, conducted over a period of twenty years. Thus, it was impractical to achieve our objective by starting from scratch. The research program resulted in an adaptive grid, reacting CFD flow solver that can be used only on limited problems. In current form the code is appropriate for use on academic problems with simplified geometries. The new solver is not sufficiently robust or sufficiently general to be used in a "production mode" for industrial applications. The principal difficulty lies with the multi-level solver technology. The use of multi-level solvers on adaptive grids with embedded boundaries is not yet a mature field and there are many issues that remain to be resolved. From the lessons learned in this SBIR program, we have started work on a new flow solver with an AMR capability. The new code is based on a conventional cell-by-cell mesh refinement strategy used in unstructured grid solvers that employ hexahedral cells. The new solver employs several of the concepts and solution strategies developed within this research program. The formulation of the composite grid problem for the new solver has been designed to avoid the embedded boundary complications encountered in this SBIR project.
This follow-on effort will result in a reacting flow CFD solver with localized mesh capability that can be used to perform engineering calculations on industrial problems in a production mode.
A high performance linear equation solver on the VPP500 parallel supercomputer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nakanishi, Makoto; Ina, Hiroshi; Miura, Kenichi

1994-12-31

This paper describes the implementation of two high performance linear equation solvers developed for the Fujitsu VPP500, a distributed memory parallel supercomputer system. The solvers take advantage of the key architectural features of VPP500: (1) scalability for an arbitrary number of processors up to 222 processors, (2) flexible data transfer among processors provided by a crossbar interconnection network, (3) vector processing capability on each processor, and (4) overlapped computation and transfer. The general linear equation solver based on the blocked LU decomposition method achieves 120.0 GFLOPS performance with 100 processors in the LINPACK Highly Parallel Computing benchmark.
An Overview of the NCC Spray/Monte-Carlo-PDF Computations

NASA Technical Reports Server (NTRS)

Raju, M. S.; Liu, Nan-Suey (Technical Monitor)

2000-01-01

This paper advances the state-of-the-art in spray computations with some of our recent contributions involving scalar Monte Carlo PDF (Probability Density Function), unstructured grids and parallel computing. It provides a complete overview of the scalar Monte Carlo PDF and Lagrangian spray computer codes developed for application with unstructured grids and parallel computing. Detailed comparisons for the case of a reacting non-swirling spray clearly highlight the important role that chemistry/turbulence interactions play in the modeling of reacting sprays. The results from the PDF and non-PDF methods were found to be markedly different and the PDF solution is closer to the reported experimental data. The PDF computations predict that some of the combustion occurs in a predominantly premixed-flame environment and the rest in a predominantly diffusion-flame environment. However, the non-PDF solution wrongly predicts that the combustion occurs in a vaporization-controlled regime. Near the premixed flame, the Monte Carlo particle temperature distribution shows two distinct peaks: one centered around the flame temperature and the other around the surrounding-gas temperature. Near the diffusion flame, the Monte Carlo particle temperature distribution shows a single peak. In both cases, the computed PDF's shape and strength are found to vary substantially depending upon the proximity to the flame surface. The results bring to the fore some of the deficiencies associated with the use of assumed-shape PDF methods in spray computations. Finally, we end the paper by demonstrating the computational viability of the present solution procedure for its use in 3D combustor calculations by summarizing the results of a 3D test case with periodic boundary conditions. For the 3D case, the parallel performance of all three solvers (CFD, PDF, and spray) has been found to be good when the computations were performed on a 24-processor SGI Origin workstation.

An Upwind Solver for the National Combustion Code

NASA Technical Reports Server (NTRS)

Sockol, Peter M.

2011-01-01

An upwind solver is presented for the unstructured grid National Combustion Code (NCC). The compressible Navier-Stokes equations with time-derivative preconditioning and preconditioned flux-difference splitting of the inviscid terms are used. First order derivatives are computed on cell faces and used to evaluate the shear stresses and heat fluxes. A new flux limiter uses these same first order derivatives in the evaluation of left and right states used in the flux-difference splitting. The k-epsilon turbulence equations are solved with the same second-order method. The new solver has been installed in a recent version of NCC and the resulting code has been tested successfully in 2D on two laminar cases with known solutions and one turbulent case with experimental data.
Parallel-vector out-of-core equation solver for computational mechanics

NASA Technical Reports Server (NTRS)

Qin, J.; Agarwal, T. K.; Storaasli, O. O.; Nguyen, D. T.; Baddourah, M. A.

1993-01-01

A parallel/vector out-of-core equation solver is developed for shared-memory computers, such as the Cray Y-MP machine. The input/output (I/O) time is reduced by using the asynchronous BUFFER IN and BUFFER OUT, which can be executed simultaneously with the CPU instructions. The parallel and vector capability provided by the supercomputers is also exploited to enhance the performance. Numerical applications in large-scale structural analysis are given to demonstrate the efficiency of the present out-of-core solver.
The Tera Multithreaded Architecture and Unstructured Meshes

NASA Technical Reports Server (NTRS)

Bokhari, Shahid H.; Mavriplis, Dimitri J.

1998-01-01

The Tera Multithreaded Architecture (MTA) is a new parallel supercomputer currently being installed at San Diego Supercomputing Center (SDSC). This machine has an architecture quite different from contemporary parallel machines. The computational processor is a custom design and the machine uses hardware to support very fine grained multithreading. The main memory is shared, hardware randomized and flat. These features make the machine highly suited to the execution of unstructured mesh problems, which are difficult to parallelize on other architectures. We report the results of a study carried out during July-August 1998 to evaluate the execution of EUL3D, a code that solves the Euler equations on an unstructured mesh, on the 2 processor Tera MTA at SDSC. Our investigation shows that parallelization of an unstructured code is extremely easy on the Tera. We were able to get an existing parallel code (designed for a shared memory machine) running on the Tera by changing only the compiler directives. Furthermore, a serial version of this code was compiled to run in parallel on the Tera by judicious use of directives to invoke the "full/empty" tag bits of the machine to obtain synchronization. This version achieves 212 and 406 Mflop/s on one and two processors respectively, and requires no attention to partitioning or placement of data, issues that would be of paramount importance in other parallel architectures.
Efficient Parallelization of a Dynamic Unstructured Application on the Tera MTA

NASA Technical Reports Server (NTRS)

Oliker, Leonid; Biswas, Rupak

1999-01-01

The success of parallel computing in solving real-life computationally-intensive problems relies on their efficient mapping and execution on large-scale multiprocessor architectures. Many important applications are both unstructured and dynamic in nature, making their efficient parallel implementation a daunting task. This paper presents the parallelization of a dynamic unstructured mesh adaptation algorithm using three popular programming paradigms on three leading supercomputers. We examine an MPI message-passing implementation on the Cray T3E and the SGI Origin2000, a shared-memory implementation using cache coherent nonuniform memory access (CC-NUMA) of the Origin2000, and a multi-threaded version on the newly-released Tera Multi-threaded Architecture (MTA). We compare several critical factors of this parallel code development, including runtime, scalability, programmability, and memory overhead. Our overall results demonstrate that multi-threaded systems offer tremendous potential for quickly and efficiently solving some of the most challenging real-life problems on parallel computers.
Quality assessment of two- and three-dimensional unstructured meshes and validation of an upwind Euler flow solver

NASA Technical Reports Server (NTRS)

Woodard, Paul R.; Yang, Henry T. Y.; Batina, John T.

1992-01-01

Quality assessment procedures are described for two-dimensional and three-dimensional unstructured meshes. The procedures include measurement of minimum angles, element aspect ratios, stretching, and element skewness. Meshes about the ONERA M6 wing and the Boeing 747 transport configuration are generated using an advancing front method grid generation package of programs. Solutions of Euler's equations for these meshes are obtained at low angle-of-attack, transonic conditions. Results for these cases, obtained as part of a validation study, demonstrate the accuracy of an implicit upwind Euler solution algorithm.
Efficient Implementation of Multigrid Solvers on Message-Passing Parallel Systems

NASA Technical Reports Server (NTRS)

Lou, John

1994-01-01

We discuss our implementation strategies for finite difference multigrid partial differential equation (PDE) solvers on message-passing systems. Our target parallel architecture is Intel parallel computers: the Delta and Paragon system.
A New Approach to Parallel Dynamic Partitioning for Adaptive Unstructured Meshes

NASA Technical Reports Server (NTRS)

Heber, Gerd; Biswas, Rupak; Gao, Guang R.

1999-01-01

Classical mesh partitioning algorithms were designed for rather static situations, and their straightforward application in a dynamical framework may lead to unsatisfactory results, e.g., excessive data migration among processors. Furthermore, special attention should be paid to their amenability to parallelization. In this paper, a novel parallel method for the dynamic partitioning of adaptive unstructured meshes is described. It is based on a linear representation of the mesh using self-avoiding walks.
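To show why a linear representation of the mesh simplifies partitioning, the sketch below splits an already-ordered sequence of cells into contiguous parts of similar total weight. It does not construct a self-avoiding walk and is not the authors' algorithm; the function name `partition_walk` and the midpoint assignment rule are assumptions for illustration only, and the total weight is assumed to be positive.

```python
def partition_walk(weights, nparts):
    """Assign each cell, given in walk order, to one of nparts contiguous parts.

    weights[i] is the computational weight of the i-th cell along the walk.
    Returns a list of part indices, non-decreasing along the walk.
    """
    total = float(sum(weights))
    owner = []
    running = 0.0
    for w in weights:
        # Place the cell according to the midpoint of its weight interval.
        mid = running + 0.5 * w
        owner.append(min(nparts - 1, int(mid * nparts / total)))
        running += w
    return owner
```

Because each part is a contiguous segment of the ordering, a change in the weights after adaptation only shifts segment boundaries, which is the intuition behind limiting data migration.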
Unstructured grids on SIMD torus machines

NASA Technical Reports Server (NTRS)

Bjorstad, Petter E.; Schreiber, Robert

1994-01-01

Unstructured grids lead to unstructured communication on distributed memory parallel computers, a problem that has been considered difficult. Here, we consider adaptive, offline communication routing for a SIMD processor grid. Our approach is empirical. We use large data sets drawn from supercomputing applications instead of an analytic model of communication load. The chief contribution of this paper is an experimental demonstration of the effectiveness of certain routing heuristics. Our routing algorithm is adaptive, nonminimal, and is generally designed to exploit locality. We have a parallel implementation of the router, and we report on its performance.
MODFLOW–USG version 1: An unstructured grid version of MODFLOW for simulating groundwater flow and tightly coupled processes using a control volume finite-difference formulation

USGS Publications Warehouse

Panday, Sorab; Langevin, Christian D.; Niswonger, Richard G.; Ibaraki, Motomu; Hughes, Joseph D.

2013-01-01

A new version of MODFLOW, called MODFLOW–USG (for UnStructured Grid), was developed to support a wide variety of structured and unstructured grid types, including nested grids and grids based on prismatic triangles, rectangles, hexagons, and other cell shapes. Flexibility in grid design can be used to focus resolution along rivers and around wells, for example, or to subdiscretize individual layers to better represent hydrostratigraphic units. MODFLOW–USG is based on an underlying control volume finite difference (CVFD) formulation in which a cell can be connected to an arbitrary number of adjacent cells. To improve accuracy of the CVFD formulation for irregular grid-cell geometries or nested grids, a generalized Ghost Node Correction (GNC) Package was developed, which uses interpolated heads in the flow calculation between adjacent connected cells. MODFLOW–USG includes a Groundwater Flow (GWF) Process, based on the GWF Process in MODFLOW–2005, as well as a new Connected Linear Network (CLN) Process to simulate the effects of multi-node wells, karst conduits, and tile drains, for example. The CLN Process is tightly coupled with the GWF Process in that the equations from both processes are formulated into one matrix equation and solved simultaneously. This robustness results from using an unstructured grid with unstructured matrix storage and solution schemes. MODFLOW–USG also contains an optional Newton-Raphson formulation, based on the formulation in MODFLOW–NWT, for improving solution convergence and avoiding problems with the drying and rewetting of cells. Because the existing MODFLOW solvers were developed for structured and symmetric matrices, they were replaced with a new Sparse Matrix Solver (SMS) Package developed specifically for MODFLOW–USG. 
The SMS Package provides several methods for resolving nonlinearities and multiple symmetric and asymmetric linear solution schemes to solve the matrix arising from the flow equations and the Newton-Raphson formulation, respectively.
Assessing a mini-application as a performance proxy for a finite element method engineering application

DOE PAGES

Lin, Paul T.; Heroux, Michael A.; Barrett, Richard F.; ...

2015-07-30

The performance of a large-scale, production-quality science and engineering application (‘app’) is often dominated by a small subset of the code. Even within that subset, computational and data access patterns are often repeated, so that an even smaller portion can represent the performance-impacting features. If application developers, parallel computing experts, and computer architects can together identify this representative subset and then develop a small mini-application (‘miniapp’) that can capture these primary performance characteristics, then this miniapp can be used to both improve the performance of the app as well as provide a tool for co-design for the high-performance computing community. However, a critical question is whether a miniapp can effectively capture key performance behavior of an app. This study provides a comparison of an implicit finite element semiconductor device modeling app on unstructured meshes with an implicit finite element miniapp on unstructured meshes. The goal is to assess whether the miniapp is predictive of the performance of the app. Finally, single compute node performance will be compared, as well as scaling up to 16,000 cores. Results indicate that the miniapp can be reasonably predictive of the performance characteristics of the app for a single iteration of the solver on a single compute node.
Whole-annulus aeroelasticity analysis of a 17-bladerow WRF compressor using an unstructured Navier Stokes solver

NASA Astrophysics Data System (ADS)

Wu, X.; Vahdati, M.; Sayma, A.; Imregun, M.

2005-03-01

This paper describes a large-scale aeroelasticity computation for an aero-engine core compressor. The computational domain includes all 17 bladerows, resulting in a mesh with over 68 million points. The Favre-averaged Navier Stokes equations are used to represent the flow in a non-linear time-accurate fashion on unstructured meshes of mixed elements. The structural model of the first two rotor bladerows is based on a standard finite element representation. The fluid mesh is moved at each time step according to the structural motion so that changes in blade aerodynamic damping and flow unsteadiness can be accommodated automatically. An efficient domain decomposition technique, where special care was taken to balance the memory requirement across processors, was developed as part of the work. The calculation was conducted in parallel mode on 128 CPUs of an SGI Origin 3000. Ten vibration cycles were obtained using over 2.2 CPU years, though the elapsed time was a week only. Steady-state flow measurements and predictions were found to be in good agreement. A comparison of the averaged unsteady flow and the steady-state flow revealed some discrepancies. It was concluded that, in due course, the methodology would be adopted by industry to perform routine numerical simulations of the unsteady flow through entire compressor assemblies with vibrating blades not only to minimise engine and rig tests but also to improve performance predictions.
Unstructured Adaptive (UA) NAS Parallel Benchmark. Version 1.0

NASA Technical Reports Server (NTRS)

Feng, Huiyu; VanderWijngaart, Rob; Biswas, Rupak; Mavriplis, Catherine

2004-01-01

We present a complete specification of a new benchmark for measuring the performance of modern computer systems when solving scientific problems featuring irregular, dynamic memory accesses. It complements the existing NAS Parallel Benchmark suite. The benchmark involves the solution of a stylized heat transfer problem in a cubic domain, discretized on an adaptively refined, unstructured mesh.
Parallelization of the preconditioned IDR solver for modern multicore computer systems

NASA Astrophysics Data System (ADS)

Bessonov, O. A.; Fedoseyev, A. I.

2012-10-01

This paper presents the analysis, parallelization and optimization approach for the large sparse matrix solver CNSPACK for modern multicore microprocessors. CNSPACK is an advanced solver successfully used for coupled solution of stiff problems arising in multiphysics applications such as CFD, semiconductor transport, kinetic and quantum problems. It employs the iterative IDR algorithm with ILU preconditioning (user-chosen ILU preconditioning order). CNSPACK has been successfully used during the last decade for solving problems in several application areas, including fluid dynamics and semiconductor device simulation. However, there has been a dramatic change in processor architectures and computer system organization in recent years. Due to this, performance criteria and methods have been revisited, together with the parallelization of the solver and preconditioner using the OpenMP environment. Results of the successful implementation for efficient parallelization are presented for the most advanced computer systems (Intel Core i7-9xx or two-processor Xeon 55xx/56xx).
Parallel Computation of the Jacobian Matrix for Nonlinear Equation Solvers Using MATLAB

NASA Technical Reports Server (NTRS)

Rose, Geoffrey K.; Nguyen, Duc T.; Newman, Brett A.

2017-01-01

Demonstrating speedup for parallel code on a multicore shared memory PC can be challenging in MATLAB due to underlying parallel operations that are often opaque to the user. This can limit potential for improvement of serial code even for the so-called embarrassingly parallel applications. One such application is the computation of the Jacobian matrix inherent to most nonlinear equation solvers. Computation of this matrix represents the primary bottleneck in nonlinear solver speed such that commercial finite element (FE) and multi-body-dynamic (MBD) codes attempt to minimize computations. A timing study using MATLAB's Parallel Computing Toolbox was performed for numerical computation of the Jacobian. Several approaches for implementing parallel code were investigated while only the single program multiple data (spmd) method using composite objects provided positive results. Parallel code speedup is demonstrated but the goal of linear speedup through the addition of processors was not achieved due to PC architecture.
Discrete Adjoint-Based Design Optimization of Unsteady Turbulent Flows on Dynamic Unstructured Grids

NASA Technical Reports Server (NTRS)

Nielsen, Eric J.; Diskin, Boris; Yamaleev, Nail K.

2009-01-01

An adjoint-based methodology for design optimization of unsteady turbulent flows on dynamic unstructured grids is described. The implementation relies on an existing unsteady three-dimensional unstructured grid solver capable of dynamic mesh simulations and discrete adjoint capabilities previously developed for steady flows. The discrete equations for the primal and adjoint systems are presented for the backward-difference family of time-integration schemes on both static and dynamic grids. The consistency of sensitivity derivatives is established via comparisons with complex-variable computations. The current work is believed to be the first verified implementation of an adjoint-based optimization methodology for the true time-dependent formulation of the Navier-Stokes equations in a practical computational code. Large-scale shape optimizations are demonstrated for turbulent flows over a tiltrotor geometry and a simulated aeroelastic motion of a fighter jet.
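The complex-variable check mentioned in this abstract can be illustrated on a scalar function. The sketch below is a generic complex-step derivative, not the authors' code; the test function `f` and the step size `h` are hypothetical choices made here. Because the method involves no subtraction of nearly equal numbers, the step can be made extremely small.

```python
import cmath
import math

def complex_step_derivative(func, x, h=1e-30):
    """Approximate d(func)/dx at real x as Im(func(x + i*h)) / h.

    func must accept complex arguments and be analytic near x.
    """
    return func(complex(x, h)).imag / h

def f(x):
    # Hypothetical smooth test function: f(x) = x^3 * exp(x).
    return x * x * x * cmath.exp(x)

x0 = 1.5
approx = complex_step_derivative(f, x0)
# Analytic derivative for comparison: (3 x^2 + x^3) * exp(x).
exact = (3.0 * x0 ** 2 + x0 ** 3) * math.exp(x0)
```

In a verification setting the value `approx` plays the role of the reference against which an adjoint-computed sensitivity would be compared.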
Parallel volume ray-casting for unstructured-grid data on distributed-memory architectures

NASA Technical Reports Server (NTRS)

Ma, Kwan-Liu

1995-01-01

As computing technology continues to advance, computational modeling of scientific and engineering problems produces data of increasing complexity: large in size and unstructured in shape. Volume visualization of such data is a challenging problem. This paper proposes a distributed parallel solution that makes ray-casting volume rendering of unstructured-grid data practical. Both the data and the rendering process are distributed among processors. At each processor, ray-casting of local data is performed independent of the other processors. The global image composing processes, which require inter-processor communication, are overlapped with the local ray-casting processes to achieve maximum parallel efficiency. This algorithm differs from previous ones in four ways: it is completely distributed, less view-dependent, reasonably scalable, and flexible. Without using dynamic load balancing, test results on the Intel Paragon using from two to 128 processors show, on average, about 60% parallel efficiency.
Moving and adaptive grid methods for compressible flows

NASA Technical Reports Server (NTRS)

Trepanier, Jean-Yves; Camarero, Ricardo

1995-01-01

This paper describes adaptive grid methods developed specifically for compressible flow computations. The basic flow solver is a finite-volume implementation of Roe's flux difference splitting scheme on arbitrarily moving unstructured triangular meshes. The grid adaptation is performed according to geometric and flow requirements. Some results are included to illustrate the potential of the methodology.
Wind-US Users Guide Version 3.0

NASA Technical Reports Server (NTRS)

Yoder, Dennis A.

2016-01-01

Wind-US is a computational platform which may be used to numerically solve various sets of equations governing physical phenomena. Currently, the code supports the solution of the Euler and Navier-Stokes equations of fluid mechanics, along with supporting equation sets governing turbulent and chemically reacting flows. Wind-US is a product of the NPARC Alliance, a partnership between the NASA Glenn Research Center (GRC) and the Arnold Engineering Development Complex (AEDC) dedicated to the establishment of a national, applications-oriented flow simulation capability. The Boeing Company has also been closely associated with the Alliance since its inception, and represents the interests of the NPARC User's Association. The "Wind-US User's Guide" describes the operation and use of Wind-US, including: a basic tutorial; the physical and numerical models that are used; the boundary conditions; monitoring convergence; the files that are read and/or written; parallel execution; and a complete list of input keywords and test options. For current information about Wind-US and the NPARC Alliance, please see the Wind-US home page at http://www.grc.nasa.gov/WWW/winddocs/ and the NPARC Alliance home page at http://www.grc.nasa.gov/WWW/wind/. This manual describes the operation and use of Wind-US, a computational platform which may be used to numerically solve various sets of equations governing physical phenomena. Wind-US represents a merger of the capabilities of four CFD codes - NASTD (a structured grid flow solver developed at McDonnell Douglas, now part of Boeing), NPARC (the original NPARC Alliance structured grid flow solver), NXAIR (an AEDC structured grid code used primarily for store separation analysis), and ICAT (an unstructured grid flow solver developed at the Rockwell Science Center and Boeing).
New preconditioning strategy for Jacobian-free solvers for variably saturated flows with Richards’ equation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lipnikov, Konstantin; Moulton, David; Svyatskiy, Daniil

2016-04-29

We develop a new approach for solving the nonlinear Richards’ equation arising in variably saturated flow modeling. The growing complexity of geometric models for simulation of subsurface flows leads to the necessity of using unstructured meshes and advanced discretization methods. Typically, a numerical solution is obtained by first discretizing PDEs and then solving the resulting system of nonlinear discrete equations with a Newton-Raphson-type method. Efficiency and robustness of the existing solvers rely on many factors, including an empiric quality control of intermediate iterates, complexity of the employed discretization method and a customized preconditioner. We propose and analyze a new preconditioning strategy that is based on a stable discretization of the continuum Jacobian. We will show with numerical experiments for challenging problems in subsurface hydrology that this new preconditioner improves convergence of the existing Jacobian-free solvers 3-20 times. Furthermore, we show that the Picard method with this preconditioner becomes a more efficient nonlinear solver than a few widely used Jacobian-free solvers.
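For readers unfamiliar with the term, "Jacobian-free" solvers avoid assembling the Jacobian by approximating its action on a vector with a finite difference of the nonlinear residual. The sketch below shows only that building block; it does not reproduce the preconditioner proposed in the paper, and the residual `F`, the function name, and the perturbation size `eps` are hypothetical choices.

```python
def jacobian_vector_product(residual, u, v, eps=1e-7):
    """Approximate J(u) v by (residual(u + eps*v) - residual(u)) / eps.

    Only residual evaluations are needed; the Jacobian J is never formed.
    """
    base = residual(u)
    perturbed = residual([ui + eps * vi for ui, vi in zip(u, v)])
    return [(p - b) / eps for p, b in zip(perturbed, base)]

def F(u):
    # Hypothetical two-equation nonlinear residual used only for illustration.
    x, y = u
    return [x * x + y - 3.0, x + y * y - 5.0]

# The exact Jacobian at (1, 2) is [[2, 1], [1, 4]], so J v = [3, 5] for v = (1, 1).
jv = jacobian_vector_product(F, [1.0, 2.0], [1.0, 1.0])
```

A Krylov method such as GMRES needs only products of this kind, which is why the quality of the preconditioner, rather than the Jacobian assembly, tends to dominate convergence.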

3-D minimum-structure inversion of magnetotelluric data using the finite-element method and tetrahedral grids

NASA Astrophysics Data System (ADS)

Jahandari, H.; Farquharson, C. G.

2017-11-01

Unstructured grids enable representing arbitrary structures more accurately and with fewer cells compared to regular structured grids. These grids also allow more efficient refinements compared to rectilinear meshes. In this study, tetrahedral grids are used for the inversion of magnetotelluric (MT) data, which allows for the direct inclusion of topography in the model, for constraining an inversion using a wireframe-based geological model and for local refinement at the observation stations. A minimum-structure method with an iterative model-space Gauss-Newton algorithm for optimization is used. An iterative solver is employed for solving the normal system of equations at each Gauss-Newton step and the sensitivity matrix-vector products that are required by this solver are calculated using pseudo-forward problems. This method alleviates the need to explicitly form the Hessian or Jacobian matrices which significantly reduces the required computation memory. Forward problems are formulated using an edge-based finite-element approach and a sparse direct solver is used for the solutions. This solver allows saving and re-using the factorization of matrices for similar pseudo-forward problems within a Gauss-Newton iteration which greatly minimizes the computation time. Two examples are presented to show the capability of the algorithm: the first example uses a benchmark model while the second example represents a realistic geological setting with topography and a sulphide deposit. The data that are inverted are the full-tensor impedance and the magnetic transfer function vector. The inversions sufficiently recovered the models and reproduced the data, which shows the effectiveness of unstructured grids for complex and realistic MT inversion scenarios. The first example is also used to demonstrate the computational efficiency of the presented model-space method by comparison with its data-space counterpart.
Summer Proceedings 2016: The Center for Computing Research at Sandia National Laboratories

DOE Office of Scientific and Technical Information (OSTI.GOV)

Carleton, James Brian; Parks, Michael L.

Solving sparse linear systems from the discretization of elliptic partial differential equations (PDEs) is an important building block in many engineering applications. Sparse direct solvers can solve general linear systems, but are usually slower and use much more memory than effective iterative solvers. To overcome these two disadvantages, a hierarchical solver (LoRaSp) based on H2-matrices was introduced in [22]. Here, we have developed a parallel version of the algorithm in LoRaSp to solve large sparse matrices on distributed memory machines. On a single processor, the factorization time of our parallel solver scales almost linearly with the problem size for three-dimensional problems, as opposed to the quadratic scalability of many existing sparse direct solvers. Moreover, our solver leads to almost constant numbers of iterations, when used as a preconditioner for Poisson problems. On more than one processor, our algorithm has significant speedups compared to sequential runs. With this parallel algorithm, we are able to solve large problems much faster than many existing packages as demonstrated by the numerical experiments.
A Solution Adaptive Technique Using Tetrahedral Unstructured Grids

NASA Technical Reports Server (NTRS)

Pirzadeh, Shahyar Z.

2000-01-01

An adaptive unstructured grid refinement technique has been developed and successfully applied to several three dimensional inviscid flow test cases. The method is based on a combination of surface mesh subdivision and local remeshing of the volume grid. Simple functions of flow quantities are employed to detect dominant features of the flowfield. The method is designed for modular coupling with various error/feature analyzers and flow solvers. Several steady-state, inviscid flow test cases are presented to demonstrate the applicability of the method for solving practical three-dimensional problems. In all cases, accurate solutions featuring complex, nonlinear flow phenomena such as shock waves and vortices have been generated automatically and efficiently.
Euler Flow Computations on Non-Matching Unstructured Meshes

NASA Technical Reports Server (NTRS)

Gumaste, Udayan

1999-01-01

Advanced fluid solvers to predict aerodynamic performance-coupled treatment of multiple fields are described. The interaction between the fluid and structural components in the bladed regions of the engine is investigated with respect to known blade failures caused by either flutter or forced vibrations. Methods are developed to describe aeroelastic phenomena for internal flows in turbomachinery by accounting for the increased geometric complexity, mutual interaction between adjacent structural components and presence of thermal and geometric loading. The computer code developed solves the full three dimensional aeroelastic problem of-stage. The results obtained show that flow computations can be performed on non-matching finite-volume unstructured meshes with second order spatial accuracy.
Implicit solvers for unstructured meshes

NASA Technical Reports Server (NTRS)

Venkatakrishnan, V.; Mavriplis, Dimitri J.

1991-01-01

Implicit methods for unstructured mesh computations are developed and tested. The approximate system which arises from the Newton-linearization of the nonlinear evolution operator is solved by using the preconditioned generalized minimum residual technique. Three different preconditioners are investigated: the incomplete LU factorization (ILU), block diagonal factorization, and the symmetric successive over-relaxation (SSOR). The preconditioners have been optimized to have good vectorization properties. The various methods are compared over a wide range of problems. Ordering of the unknowns, which affects the convergence of these sparse matrix iterative methods, is also investigated. Results are presented for inviscid and turbulent viscous calculations on single and multielement airfoil configurations using globally and adaptively generated meshes.
Aerodynamic Design Optimization on Unstructured Meshes Using the Navier-Stokes Equations

NASA Technical Reports Server (NTRS)

Nielsen, Eric J.; Anderson, W. Kyle

1998-01-01

A discrete adjoint method is developed and demonstrated for aerodynamic design optimization on unstructured grids. The governing equations are the three-dimensional Reynolds-averaged Navier-Stokes equations coupled with a one-equation turbulence model. A discussion of the numerical implementation of the flow and adjoint equations is presented. Both compressible and incompressible solvers are differentiated and the accuracy of the sensitivity derivatives is verified by comparing with gradients obtained using finite differences. Several simplifying approximations to the complete linearization of the residual are also presented, and the resulting accuracy of the derivatives is examined. Demonstration optimizations for both compressible and incompressible flows are given.
A Computational Framework for Efficient Low Temperature Plasma Simulations

NASA Astrophysics Data System (ADS)

Verma, Abhishek Kumar; Venkattraman, Ayyaswamy

2016-10-01

Over the past years, scientific computing has emerged as an essential tool for the investigation and prediction of low temperature plasma (LTP) applications, which include electronics, nanomaterial synthesis, metamaterials, etc. To further explore LTP behavior with greater fidelity, we present a computational toolbox developed to perform LTP simulations. This framework will allow us to enhance our understanding of multiscale plasma phenomena using high performance computing tools mainly based on the OpenFOAM FVM distribution. Although aimed at microplasma simulations, the modular framework is able to perform multiscale, multiphysics simulations of physical systems comprising LTP. Salient introductory features include the capability to perform parallel, 3D simulations of LTP applications on unstructured meshes. Performance of the solver is tested based on numerical results assessing accuracy and efficiency of benchmarks for problems in microdischarge devices. Numerical simulation of a microplasma reactor at atmospheric pressure with hemispherical dielectric coated electrodes will be discussed and hence provide an overview of the applicability and future scope of this framework.
Multidisciplinary design optimization for sonic boom mitigation

NASA Astrophysics Data System (ADS)

Ozcer, Isik A.

Automated, parallelized, time-efficient surface definition and grid generation and flow simulation methods are developed for sharp and accurate sonic boom signal computation in three dimensions in the near and mid-field of an aircraft using Euler/Full-Potential unstructured/structured computational fluid dynamics. The full-potential mid-field sonic boom prediction code is an accurate and efficient solver featuring automated grid generation, grid adaptation and shock fitting, and parallel processing. This program quickly marches the solution using a single nonlinear equation for large distances that cannot be covered with Euler solvers due to large memory and long computational time requirements. The solver takes into account variations in temperature and pressure with altitude. The far-field signal prediction is handled using the classical linear Thomas Waveform Parameter Method where the switching altitude from the nonlinear to linear prediction is determined by convergence of the ground signal pressure impulse value. This altitude is determined as r/L ≈ 10 from the source for a simple lifting wing, and r/L ≈ 40 for a real complex aircraft. Unstructured grid adaptation and shock fitting methodology developed for the near-field analysis employs an Hessian based anisotropic grid adaptation based on error equidistribution. A special field scalar is formulated to be used in the computation of the Hessian based error metric which enhances significantly the adaptation scheme for shocks. The entire cross-flow of a complex aircraft is resolved with high fidelity using only 500,000 grid nodes after only about 10 solution/adaptation cycles. Shock fitting is accomplished using Roe's Flux-Difference Splitting scheme which is an approximate Riemann type solver and by proper alignment of the cell faces with respect to shock surfaces. 
Simple to complex real aircraft geometries are handled with no user-interference required, making the simulation methods suitable tools for product design. The simulation tools are used to optimize three geometries for sonic boom mitigation. The first is a simple axisymmetric shape to be used as a generic nose component, the second is a delta wing with lift, and the third is a real aircraft with nose and wing optimization. The objectives are to minimize the pressure impulse or the peak pressure in the sonic boom signal, while keeping the drag penalty under feasible limits. The design parameters for the meridian profile of the nose shape are the lengths and the half-cone angles of the linear segments that make up the profile. The design parameters for the lifting wing are the dihedral angle, angle of attack, non-linear span-wise twist and camber distribution. The test-bed aircraft is the modified F-5E aircraft built by Northrop Grumman, designated the Shaped Sonic Boom Demonstrator. This aircraft is fitted with an optimized axisymmetric nose, and the wings are optimized to demonstrate optimization for sonic boom mitigation for a real aircraft. The final results predict 42% reduction in bow shock strength, 17% reduction in peak Δp, 22% reduction in pressure impulse, 10% reduction in footprint size, 24% reduction in inviscid drag, and no loss in lift for the optimized aircraft. Optimization is carried out using response surface methodology, and the design matrices are determined using standard DoE techniques for quadratic response modeling.
Performance Comparison of a Set of Periodic and Non-Periodic Tridiagonal Solvers on SP2 and Paragon Parallel Computers

NASA Technical Reports Server (NTRS)

Sun, Xian-He; Moitra, Stuti

1996-01-01

Various tridiagonal solvers have been proposed in recent years for different parallel platforms. In this paper, the performance of three tridiagonal solvers, namely, the parallel partition LU algorithm, the parallel diagonal dominant algorithm, and the reduced diagonal dominant algorithm, is studied. These algorithms are designed for distributed-memory machines and are tested on an Intel Paragon and an IBM SP2 machine. Measured results are reported in terms of execution time and speedup. Analytical studies are conducted for different communication topologies and for different tridiagonal systems. The measured results match the analytical results closely. In addition to addressing implementation issues, performance considerations such as problem sizes and models of speedup are also discussed.
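For context, the sequential baseline against which parallel tridiagonal solvers are usually measured is the Thomas algorithm. The sketch below is that serial method only, not any of the three parallel algorithms studied in the paper; the function name and the array convention (all four inputs of length n, with `lower[0]` and `upper[n-1]` unused) are assumptions of this example, and no pivoting is performed, so a diagonally dominant system is assumed.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system by forward elimination and back substitution.

    Row i reads: lower[i]*x[i-1] + diag[i]*x[i] + upper[i]*x[i+1] = rhs[i].
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    # Back substitution.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x
```

The forward and backward sweeps are inherently sequential along the system, which is the dependency that partition-based parallel algorithms are designed to break.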
Visualization and Tracking of Parallel CFD Simulations

NASA Technical Reports Server (NTRS)

Vaziri, Arsi; Kremenetsky, Mark

1995-01-01

We describe a system for interactive visualization and tracking of a 3-D unsteady computational fluid dynamics (CFD) simulation on a parallel computer. CM/AVS, a distributed, parallel implementation of a visualization environment (AVS), runs on the CM-5 parallel supercomputer. A CFD solver is run as a CM/AVS module on the CM-5. Data communication between the solver, other parallel visualization modules, and a graphics workstation, which is running AVS, is handled by CM/AVS. Partitioning of the visualization task between the CM-5 and the workstation can be done interactively in the visual programming environment provided by AVS. Flow solver parameters can also be altered by programmable interactive widgets. This system partially removes the requirement of storing large solution files at frequent time steps, a characteristic of the traditional 'simulate → store → visualize' post-processing approach.
A High-Order Direct Solver for Helmholtz Equations with Neumann Boundary Conditions

NASA Technical Reports Server (NTRS)

Sun, Xian-He; Zhuang, Yu

1997-01-01

In this study, a compact finite-difference discretization is first developed for Helmholtz equations on rectangular domains. Special treatments are then introduced for Neumann and Neumann-Dirichlet boundary conditions to achieve accuracy and separability. Finally, a Fast Fourier Transform (FFT) based technique is used to yield a fast direct solver. Analytical and experimental results show this newly proposed solver is comparable to the conventional second-order elliptic solver when accuracy is not a primary concern, and is significantly faster than the conventional solver if a highly accurate solution is required. In addition, this newly proposed fourth-order Helmholtz solver is parallel in nature. It is readily available for parallel and distributed computers. The compact scheme introduced in this study is likely extendible to sixth-order accurate algorithms and to more general elliptic equations.
Assessment of the Unstructured Grid Software TetrUSS for Drag Prediction of the DLR-F4 Configuration

NASA Technical Reports Server (NTRS)

Pirzadeh, Shahyar Z.; Frink, Neal T.

2002-01-01

An application of the NASA unstructured grid software system TetrUSS is presented for the prediction of aerodynamic drag on a transport configuration. The paper briefly describes the underlying methodology and summarizes the results obtained on the DLR-F4 transport configuration recently presented in the first AIAA computational fluid dynamics (CFD) Drag Prediction Workshop. TetrUSS is a suite of loosely coupled unstructured grid CFD codes developed at the NASA Langley Research Center. The meshing approach is based on the advancing-front and the advancing-layers procedures. The flow solver employs a cell-centered, finite volume scheme for solving the Reynolds Averaged Navier-Stokes equations on tetrahedral grids. For the present computations, flow in the viscous sublayer has been modeled with an analytical wall function. The emphasis of the paper is placed on the practicality of the methodology for accurately predicting aerodynamic drag data.
Global magnetosphere simulations using constrained-transport Hall-MHD with CWENO reconstruction

NASA Astrophysics Data System (ADS)

Lin, L.; Germaschewski, K.; Maynard, K. M.; Abbott, S.; Bhattacharjee, A.; Raeder, J.

2013-12-01

We present a new CWENO (Centrally-Weighted Essentially Non-Oscillatory) reconstruction based MHD solver for the OpenGGCM global magnetosphere code. The solver was built using libMRC, a library for creating efficient parallel PDE solvers on structured grids. The use of libMRC gives us access to its core functionality of providing an automated code generation framework which takes a user provided PDE right hand side in symbolic form to generate an efficient, computer architecture specific, parallel code. libMRC also supports block-structured adaptive mesh refinement and implicit-time stepping through integration with the PETSc library. We validate the new CWENO Hall-MHD solver against existing solvers both in standard test problems as well as in global magnetosphere simulations.
User's Manual for PCSMS (Parallel Complex Sparse Matrix Solver). Version 1.

NASA Technical Reports Server (NTRS)

Reddy, C. J.

2000-01-01

PCSMS (Parallel Complex Sparse Matrix Solver) is a computer code written to make use of existing real sparse direct solvers to solve complex, sparse matrix linear equations. PCSMS converts complex matrices into real matrices and uses real, sparse direct matrix solvers to factor and solve the real matrices. The solution vector is then reconverted to complex numbers. Though this utility is written for Silicon Graphics (SGI) real sparse matrix solution routines, it is general in nature and can be easily modified to work with any real sparse matrix solver. The User's Manual is written to acquaint the user with the installation and operation of the code. Driver routines are given to help users integrate PCSMS routines into their own codes.
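The complex-to-real conversion described above can be illustrated with the standard real-equivalent block form, in which A x = b with A = Ar + i Ai becomes [[Ar, -Ai], [Ai, Ar]] [xr; xi] = [br; bi]. The abstract does not state which real-equivalent formulation PCSMS uses, so this choice is an assumption. The function names are hypothetical, and a small dense elimination stands in for the real sparse direct solver; none of this is the PCSMS interface.

```python
def solve_real(a, b):
    """Dense Gaussian elimination with partial pivoting (stand-in for a real solver)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def solve_complex_via_real(a, b):
    """Solve the complex system a x = b using only real arithmetic."""
    n = len(b)
    # Real-equivalent block form:
    #   [[Re A, -Im A], [Im A, Re A]] [Re x; Im x] = [Re b; Im b]
    top = [[a[i][j].real for j in range(n)] + [-a[i][j].imag for j in range(n)]
           for i in range(n)]
    bottom = [[a[i][j].imag for j in range(n)] + [a[i][j].real for j in range(n)]
              for i in range(n)]
    rhs = [v.real for v in b] + [v.imag for v in b]
    y = solve_real(top + bottom, rhs)
    # Reconvert the real solution vector to complex numbers.
    return [complex(y[i], y[n + i]) for i in range(n)]


# Build a right-hand side from a known solution, then recover that solution.
a = [[2 + 1j, 1 - 1j], [0 + 1j, 3 + 0j]]
x_expected = [1 + 2j, -1 + 1j]
b = [sum(a[i][j] * x_expected[j] for j in range(2)) for i in range(2)]
x = solve_complex_via_real(a, b)
```

The real system has twice the dimension of the complex one, which is the price paid for reusing an existing real solver unchanged.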
Overview of the CHarring Ablator Response (CHAR) Code

NASA Technical Reports Server (NTRS)

Amar, Adam J.; Oliver, A. Brandon; Kirk, Benjamin S.; Salazar, Giovanni; Droba, Justin

2016-01-01

An overview of the capabilities of the CHarring Ablator Response (CHAR) code is presented. CHAR is a one-, two-, and three-dimensional unstructured continuous Galerkin finite-element heat conduction and ablation solver with both direct and inverse modes. Additionally, CHAR includes a coupled linear thermoelastic solver for determination of internal stresses induced from the temperature field and surface loading. Background on the development process, governing equations, material models, discretization techniques, and numerical methods is provided. Special focus is put on the available boundary conditions including thermochemical ablation and contact interfaces, and example simulations are included. Finally, a discussion of ongoing development efforts is presented.
Overview of the CHarring Ablator Response (CHAR) Code

NASA Technical Reports Server (NTRS)

Amar, Adam J.; Oliver, A. Brandon; Kirk, Benjamin S.; Salazar, Giovanni; Droba, Justin

2016-01-01

An overview of the capabilities of the CHarring Ablator Response (CHAR) code is presented. CHAR is a one-, two-, and three-dimensional unstructured continuous Galerkin finite-element heat conduction and ablation solver with both direct and inverse modes. Additionally, CHAR includes a coupled linear thermoelastic solver for determination of internal stresses induced from the temperature field and surface loading. Background on the development process, governing equations, material models, discretization techniques, and numerical methods is provided. Special focus is put on the available boundary conditions including thermochemical ablation, surface-to-surface radiation exchange, and flowfield coupling. Finally, a discussion of ongoing development efforts is presented.
Parallel-vector computation for linear structural analysis and non-linear unconstrained optimization problems

NASA Technical Reports Server (NTRS)

Nguyen, D. T.; Al-Nasra, M.; Zhang, Y.; Baddourah, M. A.; Agarwal, T. K.; Storaasli, O. O.; Carmona, E. A.

1991-01-01

Several parallel-vector computational improvements to the unconstrained optimization procedure are described which speed up the structural analysis-synthesis process. A fast parallel-vector Choleski-based equation solver, pvsolve, is incorporated into the well-known SAP-4 general-purpose finite-element code. The new code, denoted PV-SAP, is tested for static structural analysis. Initial results on a four processor CRAY 2 show that using pvsolve reduces the equation solution time by a factor of 14-16 over the original SAP-4 code. In addition, parallel-vector procedures for the Golden Block Search technique and the BFGS method are developed and tested for nonlinear unconstrained optimization. A parallel version of an iterative solver and the pvsolve direct solver are incorporated into the BFGS method. Preliminary results on nonlinear unconstrained optimization test problems, using pvsolve in the analysis, show excellent parallel-vector performance indicating that these parallel-vector algorithms can be used in a new generation of finite-element based structural design/analysis-synthesis codes.
Conservative multizonal interface algorithm for the 3-D Navier-Stokes equations

NASA Technical Reports Server (NTRS)

Klopfer, G. H.; Molvik, G. A.

1991-01-01

A conservative zonal interface algorithm using features of both structured and unstructured mesh CFD technology is presented. The flow solver within each of the zones is based on structured mesh CFD technology. The interface algorithm was implemented into two three-dimensional Navier-Stokes finite volume codes and was found to yield good results.
A manual for PARTI runtime primitives

NASA Technical Reports Server (NTRS)

Berryman, Harry; Saltz, Joel

1990-01-01

Primitives are presented that are designed to help users efficiently program irregular problems (e.g., unstructured mesh sweeps, sparse matrix codes, adaptive mesh partial differential equations solvers) on distributed memory machines. These primitives are also designed for use in compilers for distributed memory multiprocessors. Communications patterns are captured at runtime, and the appropriate send and receive messages are automatically generated.
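Capturing communication patterns at runtime is commonly described as an inspector/executor split: an inspector scans the indirection array once to build a communication schedule, and an executor reuses that schedule to move data. The following is a toy, single-process sketch of that idea under a block distribution. The function names are hypothetical and are not the PARTI primitives, and reading another rank's list stands in for real message passing.

```python
def owner(global_index, block_size):
    """Rank that owns a global index under a block distribution."""
    return global_index // block_size


def inspector(indirection, my_rank, block_size):
    """Scan the indirection array once and record which off-processor
    elements must be fetched, grouped by the rank that owns them."""
    schedule = {}
    for g in indirection:
        rank = owner(g, block_size)
        if rank != my_rank:
            schedule.setdefault(rank, set()).add(g)
    return {rank: sorted(needed) for rank, needed in schedule.items()}


def executor(schedule, distributed_data, block_size):
    """Carry out the gather described by the schedule. Reading another
    rank's list here stands in for a matched send/receive pair."""
    ghost = {}
    for rank, needed in schedule.items():
        for g in needed:
            ghost[g] = distributed_data[rank][g - rank * block_size]
    return ghost


# Twelve values block-distributed over three simulated ranks.
block_size = 4
distributed_data = [
    [10.0 * g for g in range(r * block_size, (r + 1) * block_size)]
    for r in range(3)
]
my_rank = 1
indirection = [4, 3, 7, 8, 3, 11]  # irregular references made by rank 1

schedule = inspector(indirection, my_rank, block_size)
ghost = executor(schedule, distributed_data, block_size)

# The irregular loop now reads either local storage or the ghost copies.
values = [
    distributed_data[my_rank][g - my_rank * block_size]
    if owner(g, block_size) == my_rank else ghost[g]
    for g in indirection
]
```

Duplicate references (index 3 appears twice) are fetched only once, and the schedule can be reused for every sweep as long as the indirection array does not change.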
Parallel deterministic transport sweeps of structured and unstructured meshes with overloaded mesh decompositions

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pautz, Shawn D.; Bailey, Teresa S.

Here, the efficiency of discrete ordinates transport sweeps depends on the scheduling algorithm, the domain decomposition, the problem to be solved, and the computational platform. Sweep scheduling algorithms may be categorized by their approach to several issues. In this paper we examine the strategy of domain overloading for mesh partitioning as one of the components of such algorithms. In particular, we extend the domain overloading strategy, previously defined and analyzed for structured meshes, to the general case of unstructured meshes. We also present computational results for both the structured and unstructured domain overloading cases. We find that an appropriate amount of domain overloading can greatly improve the efficiency of parallel sweeps for both structured and unstructured partitionings of the test problems examined on up to 10^5 processor cores.
Parallel deterministic transport sweeps of structured and unstructured meshes with overloaded mesh decompositions

DOE PAGES

Pautz, Shawn D.; Bailey, Teresa S.

2016-11-29

Here, the efficiency of discrete ordinates transport sweeps depends on the scheduling algorithm, the domain decomposition, the problem to be solved, and the computational platform. Sweep scheduling algorithms may be categorized by their approach to several issues. In this paper we examine the strategy of domain overloading for mesh partitioning as one of the components of such algorithms. In particular, we extend the domain overloading strategy, previously defined and analyzed for structured meshes, to the general case of unstructured meshes. We also present computational results for both the structured and unstructured domain overloading cases. We find that an appropriate amount of domain overloading can greatly improve the efficiency of parallel sweeps for both structured and unstructured partitionings of the test problems examined on up to 10^5 processor cores.
Learning and Parallelization Boost Constraint Search

ERIC Educational Resources Information Center

Yun, Xi

2013-01-01

Constraint satisfaction problems are a powerful way to abstract and represent academic and real-world problems from both artificial intelligence and operations research. A constraint satisfaction problem is typically addressed by a sequential constraint solver running on a single processor. Rather than construct a new, parallel solver, this work…
Investigation of advancing front method for generating unstructured grid

NASA Technical Reports Server (NTRS)

Thomas, A. M.; Tiwari, S. N.

1992-01-01

The advancing front technique is used to generate an unstructured grid about simple aerodynamic geometries. Unstructured grids are generated using VGRID2D and VGRID3D software. Specific problems considered are a NACA 0012 airfoil, a bi-plane consisting of two NACA 0012 airfoils, a four-element airfoil in its landing configuration, and an ONERA M6 wing. Inviscid time-dependent solutions are computed on these geometries using USM3D and the results are compared with standard test results obtained by other investigators. A grid convergence study is conducted for the NACA 0012 airfoil and compared with a structured grid. The structured grid is generated using GRIDGEN software and inviscid solutions are computed using the CFL3D flow solver. The results obtained on the unstructured grid for the NACA 0012 airfoil showed an asymmetric distribution of flow quantities, and a fine distribution of grid points was required to remove this asymmetry. On the other hand, the structured grid predicted a very symmetric distribution, but when the total numbers of points needed to obtain the same results were compared, the structured grid required more grid points.
An accurate discontinuous Galerkin method for solving point-source Eikonal equation in 2-D heterogeneous anisotropic media

NASA Astrophysics Data System (ADS)

Le Bouteiller, P.; Benjemaa, M.; Métivier, L.; Virieux, J.

2018-03-01

Accurate numerical computation of wave traveltimes in heterogeneous media is of major interest for a large range of applications in seismics, such as phase identification, data windowing, traveltime tomography and seismic imaging. A high level of precision is needed for traveltimes and their derivatives in applications which require quantities such as amplitude or take-off angle. Even more challenging is the anisotropic case, where the general Eikonal equation is a quartic in the derivatives of traveltimes. Despite their efficiency on Cartesian meshes, finite-difference solvers are inappropriate when dealing with unstructured meshes and irregular topographies. Moreover, reaching high orders of accuracy generally requires wide stencils and high additional computational load. To go beyond these limitations, we propose a discontinuous-finite-element-based strategy which has the following advantages: (1) the Hamiltonian formalism is general enough for handling the full anisotropic Eikonal equations; (2) the scheme is suitable for any desired high-order formulation or mixing of orders (p-adaptivity); (3) the solver is explicit whatever Hamiltonian is used (no need to find the roots of the quartic); (4) the use of unstructured meshes provides the flexibility for handling complex boundary geometries such as topographies (h-adaptivity) and radiation boundary conditions for mimicking an infinite medium. The point-source factorization principles are extended to this discontinuous Galerkin formulation. Extensive tests in smooth analytical media demonstrate the high accuracy of the method. Simulations in strongly heterogeneous media illustrate the solver robustness to realistic Earth-sciences-oriented applications.
Sensitivity Analysis of Multidisciplinary Rotorcraft Simulations

NASA Technical Reports Server (NTRS)

Wang, Li; Diskin, Boris; Biedron, Robert T.; Nielsen, Eric J.; Bauchau, Olivier A.

2017-01-01

A multidisciplinary sensitivity analysis of rotorcraft simulations involving tightly coupled high-fidelity computational fluid dynamics and comprehensive analysis solvers is presented and evaluated. An unstructured sensitivity-enabled Navier-Stokes solver, FUN3D, and a nonlinear flexible multibody dynamics solver, DYMORE, are coupled to predict the aerodynamic loads and structural responses of helicopter rotor blades. A discretely-consistent adjoint-based sensitivity analysis available in FUN3D provides sensitivities arising from unsteady turbulent flows and unstructured dynamic overset meshes, while a complex-variable approach is used to compute DYMORE structural sensitivities with respect to aerodynamic loads. The multidisciplinary sensitivity analysis is conducted through integrating the sensitivity components from each discipline of the coupled system. Numerical results verify accuracy of the FUN3D/DYMORE system by conducting simulations for a benchmark rotorcraft test model and comparing solutions with established analyses and experimental data. Complex-variable implementation of sensitivity analysis of DYMORE and the coupled FUN3D/DYMORE system is verified by comparing with real-valued analysis and sensitivities. Correctness of adjoint formulations for FUN3D/DYMORE interfaces is verified by comparing adjoint-based and complex-variable sensitivities. Finally, sensitivities of the lift and drag functions obtained by complex-variable FUN3D/DYMORE simulations are compared with sensitivities computed by the multidisciplinary sensitivity analysis, which couples adjoint-based flow and grid sensitivities of FUN3D and FUN3D/DYMORE interfaces with complex-variable sensitivities of DYMORE structural responses.
A narrow-band k-distribution model with single mixture gas assumption for radiative flows

NASA Astrophysics Data System (ADS)

Jo, Sung Min; Kim, Jae Won; Kwon, Oh Joon

2018-06-01

In the present study, the narrow-band k-distribution (NBK) model parameters for mixtures of H2O, CO2, and CO are proposed by utilizing the line-by-line (LBL) calculations with a single mixture gas assumption. For the application of the NBK model to radiative flows, a radiative transfer equation (RTE) solver based on a finite-volume method on unstructured meshes was developed. The NBK model and the RTE solver were verified by solving two benchmark problems including the spectral radiance distribution emitted from one-dimensional slabs and the radiative heat transfer in a truncated conical enclosure. It was shown that the results are accurate and physically reliable by comparing with available data. To examine the applicability of the methods to realistic multi-dimensional problems in non-isothermal and non-homogeneous conditions, radiation in an axisymmetric combustion chamber was analyzed, and then the infrared signature emitted from an aircraft exhaust plume was predicted. For modeling the plume flow involving radiative cooling, a flow-radiation coupled procedure was devised in a loosely coupled manner by adopting a Navier-Stokes flow solver based on unstructured meshes. It was shown that the predicted radiative cooling for the combustion chamber is physically more accurate than other predictions, and is as accurate as that by the LBL calculations. It was found that the infrared signature of aircraft exhaust plume can also be obtained accurately, equivalent to the LBL calculations, by using the present narrow-band approach with a much improved numerical efficiency.
Simulation of all-scale atmospheric dynamics on unstructured meshes

NASA Astrophysics Data System (ADS)

Smolarkiewicz, Piotr K.; Szmelter, Joanna; Xiao, Feng

2016-10-01

The advance of massively parallel computing in the nineteen nineties and beyond encouraged finer grid intervals in numerical weather-prediction models. This has improved resolution of weather systems and enhanced the accuracy of forecasts, while setting the trend for development of unified all-scale atmospheric models. This paper first outlines the historical background to a wide range of numerical methods advanced in the process. Next, the trend is illustrated with a technical review of a versatile nonoscillatory forward-in-time finite-volume (NFTFV) approach, proven effective in simulations of atmospheric flows from small-scale dynamics to global circulations and climate. The outlined approach exploits the synergy of two specific ingredients: the MPDATA methods for the simulation of fluid flows based on the sign-preserving properties of upstream differencing; and the flexible finite-volume median-dual unstructured-mesh discretisation of the spatial differential operators comprising PDEs of atmospheric dynamics. The paper consolidates the concepts leading to a family of generalised nonhydrostatic NFTFV flow solvers that include soundproof PDEs of incompressible Boussinesq, anelastic and pseudo-incompressible systems, common in large-eddy simulation of small- and meso-scale dynamics, as well as all-scale compressible Euler equations. Such a framework naturally extends predictive skills of large-eddy simulation to the global atmosphere, providing a bottom-up alternative to the reverse approach pursued in the weather-prediction models. Theoretical considerations are substantiated by calculations attesting to the versatility and efficacy of the NFTFV approach. Some prospective developments are also discussed.
Unstructured Cartesian refinement with sharp interface immersed boundary method for 3D unsteady incompressible flows

NASA Astrophysics Data System (ADS)

Angelidis, Dionysios; Chawdhary, Saurabh; Sotiropoulos, Fotis

2016-11-01

A novel numerical method is developed for solving the 3D, unsteady, incompressible Navier-Stokes equations on locally refined fully unstructured Cartesian grids in domains with arbitrarily complex immersed boundaries. Owing to the utilization of the fractional step method on an unstructured Cartesian hybrid staggered/non-staggered grid layout, flux mismatch and pressure discontinuity issues are avoided and the divergence free constraint is inherently satisfied to machine zero. Auxiliary/hanging nodes are used to facilitate the discretization of the governing equations. The second-order accuracy of the solver is ensured by using multi-dimension Lagrange interpolation operators and appropriate differencing schemes at the interface of regions with different levels of refinement. The sharp interface immersed boundary method is augmented with local near-boundary refinement to handle arbitrarily complex boundaries. The discrete momentum equation is solved with the matrix free Newton-Krylov method and the Krylov-subspace method is employed to solve the Poisson equation. The second-order accuracy of the proposed method on unstructured Cartesian grids is demonstrated by solving the Poisson equation with a known analytical solution. A number of three-dimensional laminar flow simulations of increasing complexity illustrate the ability of the method to handle flows across a range of Reynolds numbers and flow regimes. Laminar steady and unsteady flows past a sphere and the oblique vortex shedding from a circular cylinder mounted between two end walls demonstrate the accuracy, the efficiency and the smooth transition of scales and coherent structures across refinement levels. Large-eddy simulation (LES) past a miniature wind turbine rotor, parameterized using the actuator line approach, indicates the ability of the fully unstructured solver to simulate complex turbulent flows. Finally, a geometry resolving LES of turbulent flow past a complete hydrokinetic turbine illustrates the potential of the method to simulate turbulent flows past geometrically complex bodies on locally refined meshes. In all the cases, the results are found to be in very good agreement with published data and savings in computational resources are achieved.
A taxonomy and comparison of parallel block multi-level preconditioners for the incompressible Navier-Stokes equations.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shadid, John Nicolas; Elman, Howard; Shuttleworth, Robert R.

2007-04-01

In recent years, considerable effort has been placed on developing efficient and robust solution algorithms for the incompressible Navier-Stokes equations based on preconditioned Krylov methods. These include physics-based methods, such as SIMPLE, and purely algebraic preconditioners based on the approximation of the Schur complement. All these techniques can be represented as approximate block factorization (ABF) type preconditioners. The goal is to decompose the application of the preconditioner into simplified sub-systems in which scalable multi-level type solvers can be applied. In this paper we develop a taxonomy of these ideas based on an adaptation of a generalized approximate factorization of the Navier-Stokes system first presented in [25]. This taxonomy illuminates the similarities and differences among these preconditioners and the central role played by efficient approximation of certain Schur complement operators. We then present a parallel computational study that examines the performance of these methods and compares them to an additive Schwarz domain decomposition (DD) algorithm. Results are presented for two and three-dimensional steady state problems for enclosed domains and inflow/outflow systems on both structured and unstructured meshes. The numerical experiments are performed using MPSalsa, a stabilized finite element code.
A User's Guide to AMR1D: An Instructional Adaptive Mesh Refinement Code for Unstructured Grids

NASA Technical Reports Server (NTRS)

deFainchtein, Rosalinda

1996-01-01

This report documents the code AMR1D, which is currently posted on the World Wide Web (http://sdcd.gsfc.nasa.gov/ESS/exchange/contrib/de-fainchtein/adaptive_mesh_refinement.html). AMR1D is a one-dimensional finite element fluid-dynamics solver, capable of adaptive mesh refinement (AMR). It was written as an instructional tool for AMR on unstructured mesh codes. It is meant to illustrate the minimum requirements for AMR in more than one dimension. For that purpose, it uses the same type of data structure that would be necessary in a two-dimensional AMR code (loosely following the algorithm described by Lohner).
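To make the idea of one-dimensional AMR on an unstructured-style data structure concrete, here is a minimal sketch of a single refinement pass. It stores the mesh as unordered node coordinates plus element connectivity pairs, as a multi-dimensional unstructured code would, and bisects elements where a simple jump indicator is large. The indicator, threshold, and all names are illustrative assumptions and are not taken from AMR1D or from Lohner's algorithm.

```python
import math


def refine(coords, elements, solution, threshold):
    """Bisect every element whose end-to-end solution jump exceeds threshold.

    coords   -- node coordinates, in no particular order
    elements -- (node_a, node_b) index pairs, as in an unstructured mesh
    solution -- callable giving the solution value at a coordinate
    """
    new_coords = list(coords)
    new_elements = []
    for a, b in elements:
        if abs(solution(coords[b]) - solution(coords[a])) > threshold:
            mid = len(new_coords)  # the new node is appended, not inserted in order
            new_coords.append(0.5 * (coords[a] + coords[b]))
            new_elements += [(a, mid), (mid, b)]
        else:
            new_elements.append((a, b))
    return new_coords, new_elements


def front(x):
    """Steep front centered at x = 0.5, used as a stand-in solution."""
    return math.tanh(20.0 * (x - 0.5))


coords = [0.0, 0.25, 0.5, 0.75, 1.0]
elements = [(0, 1), (1, 2), (2, 3), (3, 4)]
coords, elements = refine(coords, elements, front, threshold=0.5)
```

Only the two elements adjacent to the front are split. Because neighbors are found through connectivity rather than array position, the same bookkeeping carries over to triangles or tetrahedra.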
Solving Navier-Stokes Equations with Advanced Turbulence Models on Three-Dimensional Unstructured Grids

NASA Technical Reports Server (NTRS)

Wang, Qun-Zhen; Massey, Steven J.; Abdol-Hamid, Khaled S.; Frink, Neal T.

1999-01-01

USM3D is a widely-used unstructured flow solver for simulating inviscid and viscous flows over complex geometries. The current version (version 5.0) of USM3D, however, does not have advanced turbulence models to accurately simulate complicated flows. We have implemented two modified versions of the original Jones and Launder k-epsilon two-equation turbulence model and the Girimaji algebraic Reynolds stress model in USM3D. Tests have been conducted for two flat plate boundary layer cases, a RAE2822 airfoil and an ONERA M6 wing. The results are compared with those of empirical formulae, theoretical results and the existing Spalart-Allmaras one-equation model.
Wind-US Flow Calculations for the M2129 S-Duct Using Structured and Unstructured Grids

NASA Technical Reports Server (NTRS)

Mohler, Stanley R., Jr.

2003-01-01

Computational Fluid Dynamics (CFD) flow solutions for the M2129 diffusing S-duct with and without vane effectors were computed by the Wind-US flow solver. Both structured and unstructured 3-D grids were used. Without vane effectors, the duct exhibited massive flow separation in both experiment and CFD. With vane effectors installed, the flow remained attached and aerodynamic losses were reduced. Total pressure recovery and distortion near the duct outlet were computed from the solutions and compared favorably to experimental values. These calculations are part of a validation effort for the Wind-US code. They also provide an example case to aid engineers in learning to use the Wind-US software.
Optimized and parallelized implementation of the electronegativity equalization method and the atom-bond electronegativity equalization method.

PubMed

Vareková, R Svobodová; Koca, J

2006-02-01

The most common way to calculate charge distribution in a molecule is ab initio quantum mechanics (QM). Some faster alternatives to QM have also been developed, the so-called "equalization methods" EEM and ABEEM, which are based on DFT. We have implemented and optimized the EEM and ABEEM methods and created the EEM SOLVER and ABEEM SOLVER programs. It has been found that the most time-consuming part of equalization methods is the reduction of the matrix belonging to the equation system generated by the method. Therefore, for both methods this part was replaced by the parallel algorithm WIRS and implemented within the PVM environment. The parallelized versions of the programs EEM SOLVER and ABEEM SOLVER showed promising results, especially on a single computer with several processors (compact PVM). The implemented programs are available through the Web page http://ncbr.chemi.muni.cz/~n19n/eem_abeem.
Multitasking domain decomposition fast Poisson solvers on the Cray Y-MP

NASA Technical Reports Server (NTRS)

Chan, Tony F.; Fatoohi, Rod A.

1990-01-01

The results of multitasking implementation of a domain decomposition fast Poisson solver on eight processors of the Cray Y-MP are presented. The object of this research is to study the performance of domain decomposition methods on a Cray supercomputer and to analyze the performance of different multitasking techniques using highly parallel algorithms. Two implementations of multitasking are considered: macrotasking (parallelism at the subroutine level) and microtasking (parallelism at the do-loop level). A conventional FFT-based fast Poisson solver is also multitasked. The results of different implementations are compared and analyzed. A speedup of over 7.4 on the Cray Y-MP running in a dedicated environment is achieved for all cases.
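As a worked reading of the reported figure, the speedup can be converted to a parallel efficiency and, if Amdahl's law is assumed as the model (an assumption made here, not a claim of the paper), to an implied serial fraction. With S the speedup, p = 8 the processor count, and f the serial fraction:

```latex
E = \frac{S}{p} = \frac{7.4}{8} = 0.925,
\qquad
S = \frac{1}{f + \dfrac{1 - f}{p}}
\;\Longrightarrow\;
f = \frac{p/S - 1}{p - 1} = \frac{8/7.4 - 1}{7} \approx 0.0116 .
```

Since the reported speedup is over 7.4, the efficiency is above 92.5% and the implied serial fraction under this model is below about 1.2%.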
The Overgrid Interface for Computational Simulations on Overset Grids

NASA Technical Reports Server (NTRS)

Chan, William M.; Kwak, Dochan (Technical Monitor)

2002-01-01

Computational simulations using overset grids typically involve multiple steps and a variety of software modules. A graphical interface called OVERGRID has been specially designed for such purposes. Data required and created by the different steps include geometry, grids, domain connectivity information and flow solver input parameters. The interface provides a unified environment for the visualization, processing, generation and diagnosis of such data. General modules are available for the manipulation of structured grids and unstructured surface triangulations. Modules more specific to the overset approach include surface curve generators, hyperbolic and algebraic surface grid generators, a hyperbolic volume grid generator, Cartesian box grid generators, and domain connectivity pre-processing tools. An interface provides automatic selection and viewing of flow solver boundary conditions, and various other flow solver inputs. For problems involving multiple components in relative motion, a module is available to build the component/grid relationships and to prescribe and animate the dynamics of the different components.
Verification and Validation Studies for the LAVA CFD Solver

NASA Technical Reports Server (NTRS)

Moini-Yekta, Shayan; Barad, Michael F; Sozer, Emre; Brehm, Christoph; Housman, Jeffrey A.; Kiris, Cetin C.

2013-01-01

The verification and validation of the Launch Ascent and Vehicle Aerodynamics (LAVA) computational fluid dynamics (CFD) solver is presented. A modern strategy for verification and validation is described incorporating verification tests, validation benchmarks, continuous integration and version control methods for automated testing in a collaborative development environment. The purpose of the approach is to integrate the verification and validation process into the development of the solver and improve productivity. This paper uses the Method of Manufactured Solutions (MMS) for the verification of 2D Euler equations, 3D Navier-Stokes equations as well as turbulence models. A method for systematic refinement of unstructured grids is also presented. Verification using inviscid vortex propagation and flow over a flat plate is highlighted. Simulation results using laminar and turbulent flow past a NACA 0012 airfoil and ONERA M6 wing are validated against experimental and numerical data.
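The Method of Manufactured Solutions can be illustrated on a much simpler problem than those in the paper. The sketch below picks u_m(x) = sin(pi x), derives the source term f = -u_m'' for the 1D Poisson problem, solves with second-order central differences on successively refined grids, and computes the observed order of accuracy from the error ratio. The model problem, grid sizes, and function name are assumptions chosen for brevity; this is not the LAVA verification suite.

```python
import math


def solve_poisson(n):
    """Central-difference solve of -u'' = f on (0, 1) with u(0) = u(1) = 0,
    where f is manufactured from the chosen solution u_m(x) = sin(pi x).
    Returns the maximum nodal error against u_m."""
    h = 1.0 / n
    x = [i * h for i in range(1, n)]  # interior nodes
    f = [math.pi ** 2 * math.sin(math.pi * xi) for xi in x]  # f = -u_m''
    # Thomas algorithm for the tridiagonal system -u[i-1] + 2u[i] - u[i+1] = h^2 f[i].
    m = n - 1
    c = [0.0] * m
    d = [0.0] * m
    c[0] = -1.0 / 2.0
    d[0] = f[0] * h * h / 2.0
    for i in range(1, m):
        denom = 2.0 + c[i - 1]
        c[i] = -1.0 / denom
        d[i] = (f[i] * h * h + d[i - 1]) / denom
    u = [0.0] * m
    u[-1] = d[-1]
    for i in range(m - 2, -1, -1):
        u[i] = d[i] - c[i] * u[i + 1]
    return max(abs(ui - math.sin(math.pi * xi)) for ui, xi in zip(u, x))


# Halve the grid spacing twice and measure how fast the error falls.
errors = [solve_poisson(n) for n in (16, 32, 64)]
orders = [math.log(errors[i] / errors[i + 1]) / math.log(2.0) for i in range(2)]
```

An observed order close to the formal order of the scheme (2 here) is the evidence that the discretization is implemented correctly; a lower value typically points to a coding or boundary-treatment error.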
Domain Decomposition By the Advancing-Partition Method for Parallel Unstructured Grid Generation

NASA Technical Reports Server (NTRS)

Pirzadeh, Shahyar Z.; Zagaris, George

2009-01-01

A new method of domain decomposition has been developed for generating unstructured grids in subdomains either sequentially or using multiple computers in parallel. Domain decomposition is a crucial and challenging step for parallel grid generation. Prior methods are generally based on auxiliary, complex, and computationally intensive operations for defining partition interfaces and usually produce grids of lower quality than those generated in single domains. The new technique, referred to as "Advancing Partition," is based on the Advancing-Front method, which partitions a domain as part of the volume mesh generation in a consistent and "natural" way. The benefits of this approach are: 1) the process of domain decomposition is highly automated, 2) partitioning of domain does not compromise the quality of the generated grids, and 3) the computational overhead for domain decomposition is minimal. The new method has been implemented in NASA's unstructured grid generation code VGRID.
An adaptive discontinuous Galerkin solver for aerodynamic flows

NASA Astrophysics Data System (ADS)

Burgess, Nicholas K.

This work considers the accuracy, efficiency, and robustness of an unstructured high-order accurate discontinuous Galerkin (DG) solver for computational fluid dynamics (CFD). Recently, there has been a drive to reduce the discretization error of CFD simulations using high-order methods on unstructured grids. However, high-order methods are often criticized for lacking robustness and having high computational cost. The goal of this work is to investigate methods that enhance the robustness of high-order discontinuous Galerkin (DG) methods on unstructured meshes, while maintaining low computational cost and high accuracy of the numerical solutions. This work investigates robustness enhancement of high-order methods by examining effective non-linear solvers, shock capturing methods, turbulence model discretizations and adaptive refinement techniques. The goal is to develop an all-encompassing solver that can simulate a large range of physical phenomena, where all aspects of the solver work together to achieve a robust, efficient and accurate solution strategy. The components and framework for a robust high-order accurate solver that is capable of solving viscous, Reynolds Averaged Navier-Stokes (RANS) and shocked flows are presented. In particular, this work discusses robust discretizations of the turbulence model equation used to close the RANS equations, as well as stable shock capturing strategies that are applicable across a wide range of discretization orders and applicable to very strong shock waves. Furthermore, refinement techniques are considered as both efficiency and robustness enhancement strategies. Additionally, efficient non-linear solvers based on multigrid and Krylov subspace methods are presented. The accuracy, efficiency, and robustness of the solver are demonstrated using a variety of challenging aerodynamic test problems, which include turbulent high-lift and viscous hypersonic flows.
Adaptive mesh refinement was found to play a critical role in obtaining a robust and efficient high-order accurate flow solver. A goal-oriented error estimation technique has been developed to estimate the discretization error of simulation outputs. For high-order discretizations, it is shown that functional output error super-convergence can be obtained, provided the discretization satisfies a property known as dual consistency. The dual consistency of the DG methods developed in this work is shown via mathematical analysis and numerical experimentation. Goal-oriented error estimation is also used to drive an hp-adaptive mesh refinement strategy, where a combination of mesh or h-refinement, and order or p-enrichment, is employed based on the smoothness of the solution. The results demonstrate that the combination of goal-oriented error estimation and hp-adaptation yields superior accuracy, as well as enhanced robustness and efficiency for a variety of aerodynamic flows including flows with strong shock waves. This work demonstrates that DG discretizations can be the basis of an accurate, efficient, and robust CFD solver. Furthermore, enhancing the robustness of DG methods does not adversely impact the accuracy or efficiency of the solver for challenging and complex flow problems. In particular, when considering the computation of shocked flows, this work demonstrates that the available shock capturing techniques are sufficiently accurate and robust, particularly when used in conjunction with adaptive mesh refinement. This work also demonstrates that robust solutions of the Reynolds Averaged Navier-Stokes (RANS) and turbulence model equations can be obtained for complex and challenging aerodynamic flows. In this context, the most robust strategy was determined to be a low-order turbulence model discretization coupled to a high-order discretization of the RANS equations.
Although RANS solutions using high-order accurate discretizations of the turbulence model were obtained, the behavior of current-day RANS turbulence models discretized to high-order was found to be problematic, leading to solver robustness issues. This suggests that future work is warranted in the area of turbulence model formulation for use with high-order discretizations. Alternately, the use of Large-Eddy Simulation (LES) subgrid scale models with high-order DG methods offers the potential to leverage the high accuracy of these methods for very high fidelity turbulent simulations. This thesis has developed the algorithmic improvements that will lay the foundation for the development of a three-dimensional high-order flow solution strategy that can be used as the basis for future LES simulations.
An Efficient Algorithm for Mapping Imaging Data to 3D Unstructured Grids in Computational Biomechanics

DOE Office of Scientific and Technical Information (OSTI.GOV)

Einstein, Daniel R.; Kuprat, Andrew P.; Jiao, Xiangmin

2013-01-01

Geometries for organ scale and multiscale simulations of organ function are now routinely derived from imaging data. However, medical images may also contain spatially heterogeneous information other than geometry that is relevant to such simulations either as initial conditions or in the form of model parameters. In this manuscript, we present an algorithm for the efficient and robust mapping of such data to imaging based unstructured polyhedral grids in parallel. We then illustrate the application of our mapping algorithm to three different mapping problems: 1) the mapping of MRI diffusion tensor data to an unstructured ventricular grid; 2) the mapping of serial cryo-section histology data to an unstructured mouse brain grid; and 3) the mapping of CT-derived volumetric strain data to an unstructured multiscale lung grid. Execution times and parallel performance are reported for each case.
Parallel performance optimizations on unstructured mesh-based simulations

DOE PAGES

Sarje, Abhinav; Song, Sukhyun; Jacobsen, Douglas; ...

2015-06-01

This paper addresses two key parallelization challenges in the unstructured mesh-based ocean modeling code, MPAS-Ocean, which uses a mesh based on Voronoi tessellations: (1) load imbalance across processes, and (2) unstructured data access patterns that inhibit intra- and inter-node performance. Our work analyzes the load imbalance due to naive partitioning of the mesh, and develops methods to generate mesh partitioning with better load balance and reduced communication. Furthermore, we present methods that minimize both inter- and intra-node data movement and maximize data reuse. Our techniques include predictive ordering of data elements for higher cache efficiency, as well as communication reduction approaches. We present detailed performance data when running on thousands of cores using the Cray XC30 supercomputer and show that our optimization strategies can exceed the original performance by over 2×. Additionally, many of these solutions can be broadly applied to a wide variety of unstructured grid-based computations.
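The load imbalance discussed in this abstract can be quantified with a simple ratio. The sketch below is an assumption for illustration, not the metric used in the MPAS-Ocean work: it takes a hypothetical per-cell partition assignment and per-cell work weights and reports the maximum process load divided by the mean load, so that 1.0 means perfect balance.

```python
import numpy as np

def load_imbalance(part, weights, n_parts):
    """Return max load / mean load for a cell-to-process assignment.

    part[i] is the (hypothetical) process that owns cell i and
    weights[i] is the work associated with that cell.
    """
    loads = np.bincount(part, weights=weights, minlength=n_parts)
    return loads.max() / loads.mean()

weights = np.ones(8)                                  # eight equal-cost cells
naive_part = np.array([0, 0, 0, 0, 0, 1, 1, 2])       # loads 5, 2, 1
better_part = np.array([0, 0, 0, 1, 1, 1, 2, 2])      # loads 3, 3, 2

naive = load_imbalance(naive_part, weights, 3)
better = load_imbalance(better_part, weights, 3)
print(naive, better)
```

Because the slowest process sets the pace of a bulk-synchronous step, a ratio well above 1.0 translates directly into idle time on the other processes.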

2D Seismic Imaging of Elastic Parameters by Frequency Domain Full Waveform Inversion

NASA Astrophysics Data System (ADS)

Brossier, R.; Virieux, J.; Operto, S.

2008-12-01

Thanks to recent advances in parallel computing, full waveform inversion is today a tractable seismic imaging method to reconstruct physical parameters of the earth interior at different scales ranging from the near-surface to the deep crust. We present a massively parallel 2D frequency-domain full-waveform algorithm for imaging visco-elastic media from multi-component seismic data. The forward problem (i.e. the resolution of the frequency-domain 2D PSV elastodynamics equations) is based on a low-order Discontinuous Galerkin (DG) method (P0 and/or P1 interpolations). Thanks to triangular unstructured meshes, the DG method allows accurate modeling of both body waves and surface waves in case of complex topography for a discretization of 10 to 15 cells per shear wavelength. The frequency-domain DG system is solved efficiently for multiple sources with the parallel direct solver MUMPS. The local inversion procedure (i.e. minimization of residuals between observed and computed data) is based on the adjoint-state method, which allows efficient computation of the gradient of the objective function. Applying the inversion hierarchically from the low frequencies to the higher ones defines a multiresolution imaging strategy which helps convergence towards the global minimum. In place of an expensive Newton algorithm, the combined use of the diagonal terms of the approximate Hessian matrix and optimization algorithms based on quasi-Newton methods (Conjugate Gradient, LBFGS, ...) improves the convergence of the iterative inversion. The distribution of forward problem solutions over processors, driven by a mesh partitioning performed by METIS, allows most of the inversion to be applied in parallel. We shall present the main features of the parallel modeling/inversion algorithm, assess its scalability and illustrate its performance with realistic synthetic case studies.
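The adjoint-state gradient named in this abstract can be sketched on a toy problem. Everything below is an assumption for illustration and not the authors' formulation: the forward operator is a small real matrix A(m) = K + diag(m) rather than a complex frequency-domain DG system, and `K`, `s`, `R`, and `d` are hypothetical names for the fixed operator part, the source, the receiver sampling matrix, and the observed data. One forward solve and one adjoint solve give the full gradient, which is then compared with finite differences.

```python
import numpy as np

def forward_operator(m, K):
    """Hypothetical discrete operator: A(m) = K + diag(m)."""
    return K + np.diag(m)

def misfit_and_gradient(m, K, s, R, d):
    """Return J(m) = 0.5 * ||R u - d||^2 and dJ/dm via the adjoint state."""
    A = forward_operator(m, K)
    u = np.linalg.solve(A, s)              # forward problem: A(m) u = s
    r = R @ u - d                          # residual at the receivers
    lam = np.linalg.solve(A.T, R.T @ r)    # adjoint problem: A^T lam = R^T r
    # dA/dm_i = e_i e_i^T, hence dJ/dm_i = -lam_i * u_i.
    return 0.5 * r @ r, -lam * u

n = 6
K = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
s = np.zeros(n)
s[0] = 1.0                                 # single source at node 0
R = np.eye(n)[[2, 5]]                      # receivers at nodes 2 and 5
m_true = np.linspace(1.0, 2.0, n)
d = R @ np.linalg.solve(forward_operator(m_true, K), s)

m0 = np.ones(n)                            # starting model
J0, grad = misfit_and_gradient(m0, K, s, R, d)

# Central finite differences need 2n forward solves; the adjoint needs 2.
eps = 1e-6
grad_fd = np.array([
    (misfit_and_gradient(m0 + eps * e, K, s, R, d)[0]
     - misfit_and_gradient(m0 - eps * e, K, s, R, d)[0]) / (2.0 * eps)
    for e in np.eye(n)
])
print(np.max(np.abs(grad - grad_fd)))
```

The cost of the adjoint gradient is independent of the number of model parameters, which is why it is attractive when the model has one unknown per mesh cell.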
Deconstructing Hub Drag. Part 2. Computational Development and Analysis

DTIC Science & Technology

2013-09-30

leveraged a Vertical Lift Consortium (VLC)-funded hub drag scaling research effort. To confirm this objective, correlations are performed with the...Technology™ Demonstrator aircraft using an unstructured computational solver. These simpler faired elliptical geometries can prove to be challenging ...possible. However, additional funding was obtained from the Vertical Lift Consortium (VLC) to perform this study. This analysis is documented in
A manual for PARTI runtime primitives, revision 1

NASA Technical Reports Server (NTRS)

Das, Raja; Saltz, Joel; Berryman, Harry

1991-01-01

Primitives are presented that are designed to help users efficiently program irregular problems (e.g., unstructured mesh sweeps, sparse matrix codes, adaptive mesh partial differential equations solvers) on distributed memory machines. These primitives are also designed for use in compilers for distributed memory multiprocessors. Communications patterns are captured at runtime, and the appropriate send and receive messages are automatically generated.
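The idea of capturing communication patterns at runtime is often described as an inspector/executor split. The sketch below is a single-process illustration under stated assumptions and does not reproduce the PARTI interface: the function names `inspector` and `executor`, the block distribution, and the use of array copies in place of real messages are all hypothetical.

```python
import numpy as np

def owner_of(global_idx, block):
    """Block distribution: rank r owns indices [r*block, (r+1)*block)."""
    return global_idx // block

def inspector(needed, rank, block):
    """Scan an indirection array once and build a communication schedule.

    Returns {owner rank: sorted list of off-process global indices}.
    """
    schedule = {}
    for g in sorted(set(needed)):
        r = owner_of(g, block)
        if r != rank:
            schedule.setdefault(r, []).append(g)
    return schedule

def executor(schedule, local_parts, block):
    """Carry out the schedule; a 'message' here is just an array copy."""
    ghosts = {}
    for owner, indices in schedule.items():
        for g in indices:
            ghosts[g] = local_parts[owner][g - owner * block]
    return ghosts

block = 4
x = np.arange(12, dtype=float) * 10.0             # global data 0, 10, ..., 110
local_parts = [x[r * block:(r + 1) * block] for r in range(3)]
needed_by_rank1 = [3, 4, 5, 8, 3, 11]             # irregular references on rank 1

schedule = inspector(needed_by_rank1, rank=1, block=block)
ghosts = executor(schedule, local_parts, block)
print(schedule, ghosts)
```

The schedule is built once and can be reused for every sweep over the same mesh, so the cost of the inspection is amortized over many executor calls.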
Dynamic mesh adaption for triangular and tetrahedral grids

NASA Technical Reports Server (NTRS)

Biswas, Rupak; Strawn, Roger

1993-01-01

The following topics are discussed: requirements for dynamic mesh adaption; linked-list data structure; edge-based data structure; adaptive-grid data structure; three types of element subdivision; mesh refinement; mesh coarsening; additional constraints for coarsening; anisotropic error indicator for edges; unstructured-grid Euler solver; inviscid 3-D wing; and mesh quality for solution-adaptive grids. The discussion is presented in viewgraph form.
Unstructured CFD Aerodynamic Analysis of a Generic UCAV Configuration

NASA Technical Reports Server (NTRS)

Frink, Neal T.; Tormalm, Magnus; Schmidt, Stefan

2011-01-01

Three independent studies from the United States (NASA), Sweden (FOI), and Australia (DSTO) are analyzed to assess the state of current unstructured-grid computational fluid dynamic tools and practices for predicting the complex static and dynamic aerodynamic and stability characteristics of a generic 53-degree swept, round-leading-edge uninhabited combat air vehicle configuration, called SACCON. NASA exercised the USM3D tetrahedral cell-centered flow solver, while FOI and DSTO applied the FOI/EDGE general-cell vertex-based solver. The authors primarily employ the Reynolds Averaged Navier-Stokes (RANS) assumption, with a limited assessment of the EDGE Detached Eddy Simulation (DES) extension, to explore sensitivities to grids and turbulence models. Correlations with experimental data are provided for force and moments, surface pressure, and off-body flow measurements. The vortical flow field over SACCON proved extremely difficult to model adequately. As a general rule, the prospect of obtaining reasonable correlations of SACCON pitching moment characteristics with the RANS formulation is not promising, even for static cases. Yet, dynamic pitch oscillation results seem to produce a promising characterization of shapes for the lift and pitching moment hysteresis curves. Future studies of this configuration should include more investigation with higher-fidelity turbulence models, such as DES.
Numerical Simulations For the F-16XL Aircraft Configuration

NASA Technical Reports Server (NTRS)

Elmiligui, Alaa A.; Abdol-Hamid, Khaled; Cavallo, Peter A.; Parlette, Edward B.

2014-01-01

Numerical simulations of flow around the F-16XL are presented as a contribution to the Cranked Arrow Wing Aerodynamic Project International II (CAWAPI-II). The NASA Tetrahedral Unstructured Software System (TetrUSS) is used to perform numerical simulations. This CFD suite, developed and maintained by NASA Langley Research Center, includes an unstructured grid generation program called VGRID, a postprocessor named POSTGRID, and the flow solver USM3D. The CRISP CFD package is utilized to provide error estimates and grid adaption for verification of USM3D results. A subsonic high angle-of-attack case, flight condition (FC) 25, is computed and analyzed. Three turbulence models are used in the calculations: the one-equation Spalart-Allmaras (SA), the two-equation shear stress transport (SST) and the k-ε turbulence models. Computational results and surface static pressure profiles are presented and compared with flight data. Solution verification is performed using formal grid refinement studies, the solution of Error Transport Equations, and adaptive mesh refinement. The current study shows that the USM3D solver coupled with CRISP CFD can be used in an engineering environment in predicting vortex-flow physics on a complex configuration at flight Reynolds numbers.
Parallel O(N) Stokes’ solver towards scalable Brownian dynamics of hydrodynamically interacting objects in general geometries

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhao, Xujun; Li, Jiyuan; Jiang, Xikai

An efficient parallel Stokes’ solver is developed towards the complete inclusion of hydrodynamic interactions of Brownian particles in any geometry. A Langevin description of the particle dynamics is adopted, where the long-range interactions are included using a Green’s function formalism. We present a scalable parallel computational approach, where the general geometry Stokeslet is calculated following a matrix-free algorithm using the General geometry Ewald-like method. Our approach employs a highly-efficient iterative finite element Stokes’ solver for the accurate treatment of long-range hydrodynamic interactions within arbitrary confined geometries. A combination of mid-point time integration of the Brownian stochastic differential equation, the parallel Stokes’ solver, and a Chebyshev polynomial approximation for the fluctuation-dissipation theorem results in an O(N) parallel algorithm. We also illustrate the new algorithm in the context of the dynamics of confined polymer solutions in equilibrium and non-equilibrium conditions. Our method is extended to treat suspended finite size particles of arbitrary shape in any geometry using an Immersed Boundary approach.
Parallel O(N) Stokes’ solver towards scalable Brownian dynamics of hydrodynamically interacting objects in general geometries

DOE PAGES

Zhao, Xujun; Li, Jiyuan; Jiang, Xikai; ...

2017-06-29

An efficient parallel Stokes’ solver is developed towards the complete inclusion of hydrodynamic interactions of Brownian particles in any geometry. A Langevin description of the particle dynamics is adopted, where the long-range interactions are included using a Green’s function formalism. We present a scalable parallel computational approach, where the general geometry Stokeslet is calculated following a matrix-free algorithm using the General geometry Ewald-like method. Our approach employs a highly-efficient iterative finite element Stokes’ solver for the accurate treatment of long-range hydrodynamic interactions within arbitrary confined geometries. A combination of mid-point time integration of the Brownian stochastic differential equation, the parallel Stokes’ solver, and a Chebyshev polynomial approximation for the fluctuation-dissipation theorem results in an O(N) parallel algorithm. We also illustrate the new algorithm in the context of the dynamics of confined polymer solutions in equilibrium and non-equilibrium conditions. Our method is extended to treat suspended finite size particles of arbitrary shape in any geometry using an Immersed Boundary approach.
Performance of a parallel thermal-hydraulics code TEMPEST

DOE Office of Scientific and Technical Information (OSTI.GOV)

Fann, G.I.; Trent, D.S.

The authors describe the parallelization of the TEMPEST thermal-hydraulics code. The serial version of this code is used for production quality 3-D thermal-hydraulics simulations. Good speedup was obtained with a parallel diagonally preconditioned BiCGStab non-symmetric linear solver, using a spatial domain decomposition approach for the semi-iterative pressure-based and mass-conserved algorithm. The test case used here to illustrate the performance of the BiCGStab solver is a 3-D natural convection problem modeled using finite volume discretization in cylindrical coordinates. The BiCGStab solver replaced the LSOR-ADI method for solving the pressure equation in TEMPEST. BiCGStab also solves the coupled thermal energy equation. Scaling performance for 3 problem sizes (221220 nodes, 358120 nodes, and 701220 nodes) is presented. These problems were run on 2 different parallel machines: IBM-SP and SGI PowerChallenge. The largest problem attains a speedup of 68 on a 128-processor IBM-SP. In real terms, this is over 34 times faster than the fastest serial production time using the LSOR-ADI solver.
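A diagonally (Jacobi) preconditioned BiCGStab solve of the kind named in this abstract can be sketched with SciPy. This is a serial illustration under stated assumptions, not the TEMPEST implementation: the non-symmetric tridiagonal matrix is a hypothetical stand-in for a pressure equation, and no domain decomposition is performed.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import LinearOperator, bicgstab

n = 200
# Hypothetical non-symmetric, diagonally dominant test matrix.
main = 4.0 + np.linspace(0.0, 4.0, n)
lower = -1.5 * np.ones(n - 1)
upper = -0.5 * np.ones(n - 1)
A = sp.diags([lower, main, upper], offsets=[-1, 0, 1], format="csr")
b = np.ones(n)

# Diagonal preconditioner: applying M means dividing by diag(A).
inv_diag = 1.0 / A.diagonal()
M = LinearOperator((n, n), matvec=lambda v: inv_diag * np.ravel(v), dtype=float)

x, info = bicgstab(A, b, M=M)              # info == 0 signals convergence
residual = np.linalg.norm(b - A @ x) / np.linalg.norm(b)
print(info, residual)
```

A diagonal preconditioner needs no communication between subdomains, which is one reason it is a common first choice in parallel codes even though stronger preconditioners converge in fewer iterations.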
Numerical and experimental investigation of the 3D free surface flow in a model Pelton turbine

NASA Astrophysics Data System (ADS)

Fiereder, R.; Riemann, S.; Schilling, R.

2010-08-01

This investigation focuses on the numerical and experimental analysis of the 3D free surface flow in a Pelton turbine. In particular, two typical flow conditions occurring in a full scale Pelton turbine - a configuration with a straight inlet as well as a configuration with a 90 degree elbow upstream of the nozzle - are considered. Thereby, the effect of secondary flow due to the 90 degree bending of the upstream pipe on the characteristics of the jet is explored. The hybrid flow field consists of pure liquid flow within the conduit and free surface two component flow of the liquid jet emerging out of the nozzle into air. The numerical results are validated against experimental investigations performed in the laboratory of the Institute of Fluid Mechanics (FLM). For the numerical simulation of the flow the in-house unstructured fully parallelized finite volume solver solver3D is utilized. An advanced interface capturing model based on the classic Volume of Fluid method is applied. In order to ensure sharp interface resolution an additional convection term is added to the transport equation of the volume fraction. A collocated variable arrangement is used and the set of non-linear equations, containing fluid conservation equations and model equations for turbulence and volume fraction, are solved in a segregated manner. For pressure-velocity coupling the SIMPLE and PISO algorithms are implemented. Detailed analysis of the observed flow patterns in the jet and of the jet geometry are presented.
Factorizable Upwind Schemes: The Triangular Unstructured Grid Formulation

NASA Technical Reports Server (NTRS)

Sidilkover, David; Nielsen, Eric J.

2001-01-01

The upwind factorizable schemes for the equations of fluid were introduced recently. They facilitate achieving the Textbook Multigrid Efficiency (TME) and are also expected to result in solvers of unparalleled robustness. The approach itself is very general. Therefore, it may well become a general framework for large-scale Computational Fluid Dynamics. In this paper we outline the triangular grid formulation of the factorizable schemes. The derivation is based on the fact that the factorizable schemes can be expressed entirely using vector notation, without explicitly mentioning a particular coordinate frame. We describe the resulting discrete scheme in detail and present some computational results verifying the basic properties of the scheme/solver.
Auto-adaptive finite element meshes

NASA Technical Reports Server (NTRS)

Richter, Roland; Leyland, Penelope

1995-01-01

Accurate capturing of discontinuities within compressible flow computations is achieved by coupling a suitable solver with an automatic adaptive mesh algorithm for unstructured triangular meshes. The mesh adaptation procedures developed rely on non-hierarchical dynamical local refinement/derefinement techniques, which hence enable structural optimization as well as geometrical optimization. The methods described are applied to a number of the ICASE test cases and are particularly interesting for unsteady flow simulations.
Some fast elliptic solvers on parallel architectures and their complexities

NASA Technical Reports Server (NTRS)

Gallopoulos, E.; Saad, Y.

1989-01-01

The discretization of separable elliptic partial differential equations leads to linear systems with special block tridiagonal matrices. Several methods are known to solve these systems, the most general of which is the Block Cyclic Reduction (BCR) algorithm which handles equations with nonconstant coefficients. A method was recently proposed to parallelize and vectorize BCR. In this paper, the mapping of BCR on distributed memory architectures is discussed, and its complexity is compared with that of other approaches including the Alternating-Direction method. A fast parallel solver is also described, based on an explicit formula for the solution, which has parallel computational complexity lower than that of parallel BCR.
Some fast elliptic solvers on parallel architectures and their complexities

NASA Technical Reports Server (NTRS)

Gallopoulos, E.; Saad, Youcef

1989-01-01

The discretization of separable elliptic partial differential equations leads to linear systems with special block tridiagonal matrices. Several methods are known to solve these systems, the most general of which is the Block Cyclic Reduction (BCR) algorithm which handles equations with nonconstant coefficients. A method was recently proposed to parallelize and vectorize BCR. Here, the mapping of BCR on distributed memory architectures is discussed, and its complexity is compared with that of other approaches, including the Alternating-Direction method. A fast parallel solver is also described, based on an explicit formula for the solution, which has parallel computational complexity lower than that of parallel BCR.
Application of FUN3D Solver for Aeroacoustics Simulation of a Nose Landing Gear Configuration

NASA Technical Reports Server (NTRS)

Vatsa, Veer N.; Lockard, David P.; Khorrami, Mehdi R.

2011-01-01

Numerical simulations have been performed for a nose landing gear configuration corresponding to the experimental tests conducted in the Basic Aerodynamic Research Tunnel at NASA Langley Research Center. A widely used unstructured grid code, FUN3D, is examined for solving the unsteady flow field associated with this configuration. A series of successively finer unstructured grids has been generated to assess the effect of grid refinement. Solutions have been obtained on purely tetrahedral grids as well as mixed element grids using hybrid RANS/LES turbulence models. The agreement of FUN3D solutions with experimental data on the same size mesh is better on mixed element grids compared to pure tetrahedral grids, and in general improves with grid refinement.
Implicit solvers for unstructured meshes

NASA Technical Reports Server (NTRS)

Venkatakrishnan, V.; Mavriplis, Dimitri J.

1991-01-01

Implicit methods were developed and tested for unstructured mesh computations. The approximate system which arises from the Newton linearization of the nonlinear evolution operator is solved by using the preconditioned GMRES (Generalized Minimum Residual) technique. Three different preconditioners were studied, namely, the incomplete LU factorization (ILU), block diagonal factorization, and the symmetric successive over relaxation (SSOR). The preconditioners were optimized to have good vectorization properties. SSOR and ILU were also studied as iterative schemes. The various methods are compared over a wide range of problems. Ordering of the unknowns, which affects the convergence of these sparse matrix iterative methods, is also studied. Results are presented for inviscid and turbulent viscous calculations on single and multielement airfoil configurations using globally and adaptively generated meshes.
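The combination of GMRES with an incomplete LU preconditioner described in this abstract can be sketched with SciPy. The example is an assumption for illustration, not the authors' code: a 2D Poisson matrix stands in for the linear system produced by one Newton linearization, and the `drop_tol` and `fill_factor` values are arbitrary choices.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import LinearOperator, gmres, spilu

def poisson2d(m):
    """5-point Laplacian on an m x m grid (stand-in for a Newton Jacobian)."""
    T = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(m, m))
    I = sp.identity(m)
    return (sp.kron(I, T) + sp.kron(T, I)).tocsc()

A = poisson2d(20)                          # 400 unknowns
b = np.ones(A.shape[0])

# Incomplete LU factorization, applied as a preconditioner via its solve().
ilu = spilu(A, drop_tol=1e-4, fill_factor=10)
M = LinearOperator(A.shape, matvec=ilu.solve, dtype=float)

x, info = gmres(A, b, M=M)                 # info == 0 signals convergence
residual = np.linalg.norm(b - A @ x) / np.linalg.norm(b)
print(info, residual)
```

In an implicit flow solver this linear solve would sit inside an outer Newton loop, and the ordering of the unknowns would affect both the fill-in of the factorization and the convergence rate, as the abstract notes.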
Parallel Solver for Diffuse Optical Tomography on Realistic Head Models With Scattering and Clear Regions.

PubMed

Placati, Silvio; Guermandi, Marco; Samore, Andrea; Scarselli, Eleonora Franchi; Guerrieri, Roberto

2016-09-01

Diffuse optical tomography is an imaging technique, based on evaluation of how light propagates within the human head to obtain the functional information about the brain. Precision in reconstructing such an optical properties map is highly affected by the accuracy of the light propagation model implemented, which needs to take into account the presence of clear and scattering tissues. We present a numerical solver based on the radiosity-diffusion model, integrating the anatomical information provided by a structural MRI. The solver is designed to run on parallel heterogeneous platforms based on multiple GPUs and CPUs. We demonstrate how the solver provides a 7 times speed-up over an isotropic-scattered parallel Monte Carlo engine based on a radiative transport equation for a domain composed of 2 million voxels, along with a significant improvement in accuracy. The speed-up greatly increases for larger domains, allowing us to compute the light distribution of a full human head (≈ 3 million voxels) in 116 s for the platform used.
Scalable domain decomposition solvers for stochastic PDEs in high performance computing

DOE PAGES

Desai, Ajit; Khalil, Mohammad; Pettit, Chris; ...

2017-09-21

Stochastic spectral finite element models of practical engineering systems may involve solutions of linear systems or linearized systems for non-linear problems with billions of unknowns. For stochastic modeling, it is therefore essential to design robust, parallel and scalable algorithms that can efficiently utilize high-performance computing to tackle such large-scale systems. Domain decomposition based iterative solvers can handle such systems. And though these algorithms exhibit excellent scalabilities, significant algorithmic and implementational challenges exist to extend them to solve extreme-scale stochastic systems using emerging computing platforms. Intrusive polynomial chaos expansion based domain decomposition algorithms are extended here to concurrently handle high resolution in both spatial and stochastic domains using an in-house implementation. Sparse iterative solvers with efficient preconditioners are employed to solve the resulting global and subdomain level local systems through multi-level iterative solvers. We also use parallel sparse matrix–vector operations to reduce the floating-point operations and memory requirements. Numerical and parallel scalabilities of these algorithms are presented for the diffusion equation having spatially varying diffusion coefficient modeled by a non-Gaussian stochastic process. Scalability of the solvers with respect to the number of random variables is also investigated.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Vay, Jean-Luc, E-mail: jlvay@lbl.gov; Haber, Irving; Godfrey, Brendan B.

Pseudo-spectral electromagnetic solvers (i.e. representing the fields in Fourier space) have extraordinary precision. In particular, Haber et al. presented in 1973 a pseudo-spectral solver that integrates analytically the solution over a finite time step, under the usual assumption that the source is constant over that time step. Yet, pseudo-spectral solvers have not been widely used, due in part to the difficulty for efficient parallelization owing to global communications associated with global FFTs on the entire computational domains. A method for the parallelization of electromagnetic pseudo-spectral solvers is proposed and tested on single electromagnetic pulses, and on Particle-In-Cell simulations of the wakefield formation in a laser plasma accelerator. The method takes advantage of the properties of the Discrete Fourier Transform, the linearity of Maxwell’s equations and the finite speed of light for limiting the communications of data within guard regions between neighboring computational domains. Although this requires a small approximation, test results show that no significant error is made on the test cases that have been presented. The proposed method opens the way to solvers combining the favorable parallel scaling of standard finite-difference methods with the accuracy advantages of pseudo-spectral methods.
Scalable domain decomposition solvers for stochastic PDEs in high performance computing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Desai, Ajit; Khalil, Mohammad; Pettit, Chris

Stochastic spectral finite element models of practical engineering systems may involve solutions of linear systems or linearized systems for non-linear problems with billions of unknowns. For stochastic modeling, it is therefore essential to design robust, parallel and scalable algorithms that can efficiently utilize high-performance computing to tackle such large-scale systems. Domain decomposition based iterative solvers can handle such systems. And though these algorithms exhibit excellent scalabilities, significant algorithmic and implementational challenges exist to extend them to solve extreme-scale stochastic systems using emerging computing platforms. Intrusive polynomial chaos expansion based domain decomposition algorithms are extended here to concurrently handle high resolution in both spatial and stochastic domains using an in-house implementation. Sparse iterative solvers with efficient preconditioners are employed to solve the resulting global and subdomain level local systems through multi-level iterative solvers. We also use parallel sparse matrix–vector operations to reduce the floating-point operations and memory requirements. Numerical and parallel scalabilities of these algorithms are presented for the diffusion equation having spatially varying diffusion coefficient modeled by a non-Gaussian stochastic process. Scalability of the solvers with respect to the number of random variables is also investigated.

Equation solvers for distributed-memory computers

NASA Technical Reports Server (NTRS)

Storaasli, Olaf O.

1994-01-01

A large number of scientific and engineering problems require the rapid solution of large systems of simultaneous equations. The performance of parallel computers in this area now dwarfs traditional vector computers by nearly an order of magnitude. This talk describes the major issues involved in parallel equation solvers with particular emphasis on the Intel Paragon, IBM SP-1 and SP-2 processors.
Parallel Performance Optimizations on Unstructured Mesh-based Simulations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sarje, Abhinav; Song, Sukhyun; Jacobsen, Douglas

2015-01-01

This paper addresses two key parallelization challenges in the unstructured mesh-based ocean modeling code, MPAS-Ocean, which uses a mesh based on Voronoi tessellations: (1) load imbalance across processes, and (2) unstructured data access patterns that inhibit intra- and inter-node performance. Our work analyzes the load imbalance due to naive partitioning of the mesh, and develops methods to generate mesh partitioning with better load balance and reduced communication. Furthermore, we present methods that minimize both inter- and intra-node data movement and maximize data reuse. Our techniques include predictive ordering of data elements for higher cache efficiency, as well as communication reduction approaches. We present detailed performance data when running on thousands of cores using the Cray XC30 supercomputer and show that our optimization strategies can exceed the original performance by over 2×. Additionally, many of these solutions can be broadly applied to a wide variety of unstructured grid-based computations.
Jali - Unstructured Mesh Infrastructure for Multi-Physics Applications

DOE Office of Scientific and Technical Information (OSTI.GOV)

Garimella, Rao V; Berndt, Markus; Coon, Ethan

2017-04-13

Jali is a parallel unstructured mesh infrastructure library designed for use by multi-physics simulations. It supports 2D and 3D arbitrary polyhedral meshes distributed over hundreds to thousands of nodes. Jali can read and write Exodus II meshes along with fields and sets on the mesh and support for other formats is partially implemented or is (https://github.com/MeshToolkit/MSTK), an open source general purpose unstructured mesh infrastructure library from Los Alamos National Laboratory. While it has been made to work with other mesh frameworks such as MOAB and STKmesh in the past, support for maintaining the interface to these frameworks has been suspended for now. Jali supports distributed as well as on-node parallelism. Support of on-node parallelism is through direct use of the mesh in multi-threaded constructs or through the use of "tiles" which are submeshes or sub-partitions of a partition destined for a compute node.
Aerodynamic simulation on massively parallel systems

NASA Technical Reports Server (NTRS)

Haeuser, Jochem; Simon, Horst D.

1992-01-01

This paper briefly addresses the computational requirements for the analysis of complete configurations of aircraft and spacecraft currently under design to be used for advanced transportation in commercial applications as well as in space flight. The discussion clearly shows that massively parallel systems are the only alternative that is both cost effective and able to provide the TeraFlops needed to satisfy the narrow design margins of modern vehicles. It is assumed that the solution of the governing physical equations, i.e., the Navier-Stokes equations which may be complemented by chemistry and turbulence models, is done on multiblock grids. This technique is situated between the fully structured approach of classical boundary fitted grids and the fully unstructured tetrahedra grids. A fully structured grid best represents the flow physics, while the unstructured grid gives best geometrical flexibility. The multiblock grid employed is structured within a block, but completely unstructured on the block level. While a completely unstructured grid is not straightforward to parallelize, the above mentioned multiblock grid is inherently parallel, in particular for multiple instruction multiple datastream (MIMD) machines. In this paper guidelines are provided for setting up or modifying an existing sequential code so that a direct parallelization on a massively parallel system is possible. Results are presented for three parallel systems, namely the Intel hypercube, the Ncube hypercube, and the FPS 500 system. Some preliminary results for an 8K CM2 machine will also be mentioned. The code run is the two dimensional grid generation module of Grid, which is a general two dimensional and three dimensional grid generation code for complex geometries. A system of nonlinear Poisson equations is solved. This code is also a good test case for complex fluid dynamics codes, since the same data structures are used.
All systems provided good speedups, but message passing MIMD systems seem to be best suited for large multiblock applications.
A multidimensional unified gas-kinetic scheme for radiative transfer equations on unstructured mesh

NASA Astrophysics Data System (ADS)

Sun, Wenjun; Jiang, Song; Xu, Kun

2017-12-01

In order to extend the unified gas kinetic scheme (UGKS) to solve radiative transfer equations in a complex geometry, a multidimensional asymptotic preserving implicit method on unstructured mesh is constructed in this paper. With an implicit formulation, the CFL condition for the determination of the time step in UGKS can be much relaxed, and a large time step is used in simulations. Unlike the previous direction-by-direction UGKS on orthogonal structured mesh, on unstructured mesh the interface flux transport takes into account multi-dimensional effects, where gradients of radiation intensity and material temperature in both normal and tangential directions of a cell interface are included in the flux evaluation. The multiple scale nature enables the UGKS to capture the solutions in both optically thin and thick regions seamlessly. In the optically thick region the condition of cell size being less than the photon's mean free path is fully removed, and the UGKS recovers a solver for the diffusion equation in such a limit on unstructured mesh. For a distorted quadrilateral mesh, the UGKS goes to a nine-point scheme for the diffusion equation, and it naturally reduces to the standard five-point scheme for an orthogonal quadrilateral mesh. Numerical computations covering a wide range of transport regimes on unstructured and distorted quadrilateral meshes will be presented to validate the current approach.
High Resolution Aerospace Applications using the NASA Columbia Supercomputer

NASA Technical Reports Server (NTRS)

Mavriplis, Dimitri J.; Aftosmis, Michael J.; Berger, Marsha

2005-01-01

This paper focuses on the parallel performance of two high-performance aerodynamic simulation packages on the newly installed NASA Columbia supercomputer. These packages include both a high-fidelity, unstructured, Reynolds-averaged Navier-Stokes solver, and a fully-automated inviscid flow package for cut-cell Cartesian grids. The complementary combination of these two simulation codes enables high-fidelity characterization of aerospace vehicle design performance over the entire flight envelope through extensive parametric analysis and detailed simulation of critical regions of the flight envelope. Both packages are industrial-level codes designed for complex geometry and incorporate customized multigrid solution algorithms. The performance of these codes on Columbia is examined using both MPI and OpenMP and using both the NUMAlink and InfiniBand interconnect fabrics. Numerical results demonstrate good scalability on up to 2016 CPUs using the NUMAlink4 interconnect, with measured computational rates in the vicinity of 3 TFLOP/s, while InfiniBand showed some performance degradation at high CPU counts, particularly with multigrid. Nonetheless, the results are encouraging enough to indicate that larger test cases using combined MPI/OpenMP communication should scale well on even more processors.
Level set methods for detonation shock dynamics using high-order finite elements

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dobrev, V. A.; Grogan, F. C.; Kolev, T. V.

Level set methods are a popular approach to modeling evolving interfaces. We present a level set advection solver in two and three dimensions using the discontinuous Galerkin method with high-order finite elements. During evolution, the level set function is reinitialized to a signed distance function to maintain accuracy. Our approach leads to stable front propagation and convergence on high-order, curved, unstructured meshes. The ability of the solver to implicitly track moving fronts lends itself to a number of applications; in particular, we highlight applications to high-explosive (HE) burn and detonation shock dynamics (DSD). We provide results for two- and three-dimensional benchmark problems as well as applications to DSD.
Textbook Multigrid Efficiency for the Steady Euler Equations

NASA Technical Reports Server (NTRS)

Roberts, Thomas W.; Sidilkover, David; Swanson, R. C.

2004-01-01

A fast multigrid solver for the steady incompressible Euler equations is presented. Unlike time-marching schemes, this approach uses relaxation of the steady equations. Application of this method results in a discretization that correctly distinguishes between the advection and elliptic parts of the operator, allowing efficient smoothers to be constructed. Solvers for both unstructured triangular grids and structured quadrilateral grids have been written. Computations for channel flow and flow over a nonlifting airfoil have been performed. Using Gauss-Seidel relaxation ordered in the flow direction, textbook multigrid convergence rates of nearly one order-of-magnitude residual reduction per multigrid cycle are achieved, independent of the grid spacing. This approach also may be applied to the compressible Euler equations and the incompressible Navier-Stokes equations.
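The benefit of ordering Gauss-Seidel relaxation in the flow direction can be seen on a much simpler model than the one in the abstract. The sketch below is only an illustration under stated assumptions: it uses a hypothetical 1D steady advection equation with first-order upwind differencing, not the authors' discretization, and the function names are invented for this example. Sweeping downstream solves the upwind system in a single pass, while sweeping upstream leaves a large residual.

```python
def gauss_seidel_sweep(u, f, h, a, order):
    """One Gauss-Seidel sweep for the upwind system a*(u[i]-u[i-1])/h = f[i], a > 0.

    u[0] holds the inflow boundary value and is never updated.
    `order` is the sequence in which the interior nodes are visited.
    """
    for i in order:
        u[i] = u[i - 1] + h * f[i] / a
    return u


def residual_norm(u, f, h, a):
    """Maximum absolute residual of the upwind system over the interior nodes."""
    return max(abs(f[i] - a * (u[i] - u[i - 1]) / h) for i in range(1, len(u)))


# Assumed model problem: a*u' = 1 on [0, 1] with u(0) = 0, 16 cells.
n, h, a = 16, 1.0 / 16, 1.0
f = [1.0] * (n + 1)
downstream = list(range(1, n + 1))   # visit nodes in the flow direction
upstream = downstream[::-1]          # visit nodes against the flow

u_down = gauss_seidel_sweep([0.0] * (n + 1), f, h, a, downstream)
u_up = gauss_seidel_sweep([0.0] * (n + 1), f, h, a, upstream)

print("residual after one downstream sweep:", residual_norm(u_down, f, h, a))
print("residual after one upstream sweep:  ", residual_norm(u_up, f, h, a))
```

With downstream ordering each node sees an already updated upwind neighbour, so information crosses the whole domain in one sweep; against the flow it advances only one cell per sweep.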
Robust and efficient overset grid assembly for partitioned unstructured meshes

NASA Astrophysics Data System (ADS)

Roget, Beatrice; Sitaraman, Jayanarayanan

2014-03-01

This paper presents a method to perform efficient and automated Overset Grid Assembly (OGA) on a system of overlapping unstructured meshes in a parallel computing environment where all meshes are partitioned into multiple mesh-blocks and processed on multiple cores. The main task of the overset grid assembler is to identify, in parallel, among all points in the overlapping mesh system, at which points the flow solution should be computed (field points), interpolated (receptor points), or ignored (hole points). Point containment search or donor search, an algorithm to efficiently determine the cell that contains a given point, is the core procedure necessary for accomplishing this task. Donor search is particularly challenging for partitioned unstructured meshes because of the complex irregular boundaries that are often created during partitioning.
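The point containment (donor) search described above can be illustrated with a deliberately naive 2D version. This is a sketch under stated assumptions, not the paper's algorithm: it tests every triangle in turn using barycentric coordinates, whereas the abstract concerns an efficient search on partitioned meshes. The names `barycentric` and `find_donor` and the two-triangle mesh are hypothetical.

```python
def barycentric(p, a, b, c):
    """Barycentric coordinates of point p with respect to triangle (a, b, c)."""
    (x, y), (x1, y1), (x2, y2), (x3, y3) = p, a, b, c
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    l1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    l2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    return l1, l2, 1.0 - l1 - l2


def find_donor(p, nodes, cells, tol=1e-12):
    """Return the index of the first triangle containing p, or None.

    Brute-force loop over all cells, for illustration only.
    """
    for cell_id, (i, j, k) in enumerate(cells):
        weights = barycentric(p, nodes[i], nodes[j], nodes[k])
        if all(w >= -tol for w in weights):
            return cell_id
    return None


# Assumed toy mesh: the unit square split into two triangles.
nodes = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
cells = [(0, 1, 2), (0, 2, 3)]

print(find_donor((0.75, 0.25), nodes, cells))
print(find_donor((0.25, 0.75), nodes, cells))
print(find_donor((2.0, 2.0), nodes, cells))
```

A point outside every cell returns `None`. The barycentric weights of a containing cell can double as linear interpolation weights for a receptor point.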
Computational Challenges of 3D Radiative Transfer in Atmospheric Models

NASA Astrophysics Data System (ADS)

Jakub, Fabian; Mayer, Bernhard

2017-04-01

The computation of radiative heating and cooling rates is one of the most expensive components in today's atmospheric models. The high computational cost stems not only from the laborious integration over a wide range of the electromagnetic spectrum but also from the fact that solving the integro-differential radiative transfer equation for monochromatic light is already rather involved. This led to the advent of numerous approximations and parameterizations to reduce the cost of the solver. One of the most prominent is the so-called independent pixel approximation (IPA), where horizontal energy transfer is neglected altogether and radiation may only propagate in the vertical direction (1D). Recent studies indicate that the IPA introduces significant errors in high resolution simulations and affects the evolution and development of convective systems. However, using fully 3D solvers, such as Monte Carlo methods, is not feasible even on state-of-the-art supercomputers. The parallelization of atmospheric models is often realized by a horizontal domain decomposition, and hence, horizontal transfer of energy necessitates communication. For example, at a low zenith angle a cloud will cast a long shadow, which potentially requires communication across a multitude of processors. Especially light in the solar spectral range may travel long distances through the atmosphere. Concerning highly parallel simulations, it is vital that 3D radiative transfer solvers put a special emphasis on parallel scalability. We will present an introduction to the intricacies of computing 3D radiative heating and cooling rates as well as report on the parallel performance of the TenStream solver. The TenStream is a 3D radiative transfer solver using the PETSc framework to iteratively solve a set of partial differential equations. We investigate two matrix preconditioners, (a) geometric algebraic multigrid preconditioning (MG+GAMG) and (b) block Jacobi incomplete LU (ILU) factorization.
The TenStream solver is tested for up to 4096 cores and shows a parallel scaling efficiency of 80-90% on various supercomputers.
A Multi-Level Parallelization Concept for High-Fidelity Multi-Block Solvers

NASA Technical Reports Server (NTRS)

Hatay, Ferhat F.; Jespersen, Dennis C.; Guruswamy, Guru P.; Rizk, Yehia M.; Byun, Chansup; Gee, Ken; VanDalsem, William R. (Technical Monitor)

1997-01-01

The integration of high-fidelity Computational Fluid Dynamics (CFD) analysis tools with the industrial design process benefits greatly from robust implementations that are transportable across a wide range of computer architectures. In the present work, a hybrid domain-decomposition and parallelization concept was developed and implemented into the widely-used NASA multi-block Computational Fluid Dynamics (CFD) packages implemented in ENSAERO and OVERFLOW. The new parallel solver concept, PENS (Parallel Euler Navier-Stokes Solver), employs both fine and coarse granularity in data partitioning as well as data coalescing to obtain the desired load-balance characteristics on the available computer platforms. This multi-level parallelism implementation itself introduces no changes to the numerical results, hence the original fidelity of the packages is identically preserved. The present implementation uses the Message Passing Interface (MPI) library for interprocessor message passing and memory accessing. By choosing an appropriate combination of the available partitioning and coalescing capabilities only during the execution stage, the PENS solver becomes adaptable to different computer architectures from shared-memory to distributed-memory platforms with varying degrees of parallelism. The PENS implementation on the IBM SP2 distributed memory environment at the NASA Ames Research Center obtains 85 percent scalable parallel performance using fine-grain partitioning of single-block CFD domains using up to 128 wide computational nodes. Multi-block CFD simulations of complete aircraft achieve 75 percent perfect load-balanced executions using data coalescing and the two levels of parallelism. SGI PowerChallenge, SGI Origin 2000, and a cluster of workstations are the other platforms where the robustness of the implementation is tested.
The performance behavior on the other computer platforms with a variety of realistic problems will be included as this on-going study progresses.
Performance Benchmark for a Prismatic Flow Solver

DTIC Science & Technology

2007-03-26

Gauss-Seidel (LU-SGS) implicit method is used for time integration to reduce the computational time. A one-equation turbulence model by Goldberg and...numerical flux computations. The Lower-Upper-Symmetric Gauss-Seidel (LU-SGS) implicit method [1] is used for time integration to reduce the...Sharov, D. and Nakahashi, K., “Reordering of Hybrid Unstructured Grids for Lower-Upper Symmetric Gauss-Seidel Computations,” AIAA Journal, Vol. 36
Development and acceleration of unstructured mesh-based CFD solver

NASA Astrophysics Data System (ADS)

Emelyanov, V.; Karpenko, A.; Volkov, K.

2017-06-01

The study was undertaken as part of a larger effort to establish a common computational fluid dynamics (CFD) code for simulation of internal and external flows and involves some basic validation studies. The governing equations are solved with a finite volume code on unstructured meshes. The computational procedure involves reconstruction of the solution in each control volume and extrapolation of the unknowns to find the flow variables on the faces of the control volume, solution of the Riemann problem for each face of the control volume, and evolution of the time step. The nonlinear CFD solver works in an explicit time-marching fashion, based on a three-step Runge-Kutta stepping procedure. Convergence to a steady state is accelerated by the use of geometric technique and by the application of Jacobi preconditioning for high-speed flows, with a separate low Mach number preconditioning method for use with low-speed flows. The CFD code is implemented on graphics processing units (GPUs). Speedup of the solution on GPUs with respect to the solution on central processing units (CPUs) is compared for different meshes and different methods of distributing input data into blocks. The results obtained are promising for the design of a GPU-based software framework for applications in CFD.
Simulation of an Isolated Tiltrotor in Hover with an Unstructured Overset-Grid RANS Solver

NASA Technical Reports Server (NTRS)

Lee-Rausch, Elizabeth M.; Biedron, Robert T.

2009-01-01

An unstructured overset-grid Reynolds Averaged Navier-Stokes (RANS) solver, FUN3D, is used to simulate an isolated tiltrotor in hover. An overview of the computational method is presented as well as the details of the overset-grid systems. Steady-state computations within a noninertial reference frame define the performance trends of the rotor across a range of the experimental collective settings. Results are presented to show the effects of off-body grid refinement and blade grid refinement. The computed performance and blade loading trends show good agreement with experimental results and previously published structured overset-grid computations. Off-body flow features indicate a significant improvement in the resolution of the first perpendicular blade vortex interaction with background grid refinement across the collective range. Considering experimental data uncertainty and effects of transition, the prediction of figure of merit on the baseline and refined grid is reasonable at the higher collective range, within 3 percent of the measured values. At the lower collective settings, the computed figure of merit is approximately 6 percent lower than the experimental data. A comparison of steady and unsteady results shows that with temporal refinement, the dynamic results closely match the steady-state noninertial results, which gives confidence in the accuracy of the dynamic overset-grid approach.
A Spectral Element Discretisation on Unstructured Triangle / Tetrahedral Meshes for Elastodynamics

NASA Astrophysics Data System (ADS)

May, Dave A.; Gabriel, Alice-A.

2017-04-01

The spectral element method (SEM) defined over quadrilateral and hexahedral element geometries has proven to be a fast, accurate and scalable approach to study wave propagation phenomena. In the context of regional scale seismology and/or simulations incorporating finite earthquake sources, the geometric restrictions associated with hexahedral elements can limit the applicability of the classical quad./hex. SEM. Here we describe a continuous Galerkin spectral element discretisation defined over unstructured meshes composed of triangles (2D), or tetrahedra (3D). The method uses a stable, nodal basis constructed from PKD polynomials and thus retains the spectral accuracy and low dispersive properties of the classical SEM, in addition to the geometric versatility provided by unstructured simplex meshes. For the particular basis and quadrature rule we have adopted, the discretisation results in a mass matrix which is not diagonal, thereby mandating that linear solvers be utilised. To that end, we have developed efficient solvers and preconditioners which are robust with respect to the polynomial order (p), and possess high arithmetic intensity. Furthermore, we also consider using implicit time integrators, together with a p-multigrid preconditioner to circumvent the CFL condition. Implicit time integrators become particularly relevant when solving problems on poor quality meshes, or meshes containing elements with a widely varying range of length scales, both of which frequently arise when meshing non-trivial geometries. We demonstrate the applicability of the new method by examining a number of two- and three-dimensional wave propagation scenarios. These scenarios serve to characterise the accuracy and cost of the new method. Lastly, we will assess the potential benefits of using implicit time integrators for regional scale wave propagation simulations.
A Grid Sourcing and Adaptation Study Using Unstructured Grids for Supersonic Boom Prediction

NASA Technical Reports Server (NTRS)

Carter, Melissa B.; Deere, Karen A.

2008-01-01

NASA created the Supersonics Project as part of the NASA Fundamental Aeronautics Program to advance technology that will make supersonic flight over land viable. Computational flow solvers have lacked the ability to accurately predict sonic boom from the near to far field. The focus of this investigation was to establish gridding and adaptation techniques to predict near-to-mid-field (<10 body lengths below the aircraft) boom signatures at supersonic speeds using the USM3D unstructured grid flow solver. The study began by examining sources along the body of the aircraft, far field sourcing and far field boundaries. The study then examined several techniques for grid adaptation. During the course of the study, volume sourcing was introduced as a new way to source grids using the grid generation code VGRID. Two different methods of using the volume sources were examined. The first method, based on manual insertion of the numerous volume sources, made great improvements in the prediction capability of USM3D for boom signatures. The second method (SSGRID), which uses an a priori adaptation approach to stretch and shear the original unstructured grid to align the grid and pressure waves, showed similar results with a more automated approach. Due to SSGRID's results and ease of use, the rest of the study focused on developing a best practice using SSGRID. The best practice created by this study for boom predictions using the CFD code USM3D involved: 1) creating a small cylindrical outer boundary either 1 or 2 body lengths in diameter (depending on how far below the aircraft the boom prediction is required), 2) using a single volume source under the aircraft, and 3) using SSGRID to stretch and shear the grid to the desired length.
Three-Dimensional High-Order Spectral Volume Method for Solving Maxwell's Equations on Unstructured Grids

NASA Technical Reports Server (NTRS)

Liu, Yen; Vinokur, Marcel; Wang, Z. J.

2004-01-01

A three-dimensional, high-order, conservative, and efficient discontinuous spectral volume (SV) method for the solutions of Maxwell's equations on unstructured grids is presented. The concept of discontinuous and high-order local representations to achieve conservation and high accuracy is utilized in a manner similar to the Discontinuous Galerkin (DG) method, but instead of using a Galerkin finite-element formulation, the SV method is based on a finite-volume approach to attain a simpler formulation. Conventional unstructured finite-volume methods require data reconstruction based on the least-squares formulation using neighboring cell data. Since each unknown employs a different stencil, one must repeat the least-squares inversion for every cell at each time step, or store the inversion coefficients. In a high-order, three-dimensional computation, the former would involve impractically large CPU time, while for the latter the memory requirement becomes prohibitive. In the SV method, one starts with a relatively coarse grid of triangles or tetrahedra, called spectral volumes (SVs), and partitions each SV into a number of structured subcells, called control volumes (CVs), that support a polynomial expansion of a desired degree of precision. The unknowns are cell averages over CVs. If all the SVs are partitioned in a geometrically similar manner, the reconstruction becomes universal as a weighted sum of unknowns, and only a few universal coefficients need to be stored for the surface integrals over CV faces. Since the solution is discontinuous across the SV boundaries, a Riemann solver is thus necessary to maintain conservation. In the paper, multi-parameter and symmetric SV partitions, up to quartic for triangle and cubic for tetrahedron, are first presented. The corresponding weight coefficients for CV face integrals in terms of CV cell averages for each partition are analytically determined.
These discretization formulas are then applied to the integral form of the Maxwell equations. All numerical procedures for outer boundary, material interface, zonal interface, and interior SV face are unified with a single characteristic formulation. The load balancing in a massively parallel computing environment is therefore easier to achieve. A parameter is introduced in the Riemann solver to control the strength of the smoothing term. Important aspects of the data structure and its effects on communication and the optimum use of cache memory are discussed. Results will be presented for plane TE and TM waves incident on a perfectly conducting cylinder for up to fifth order of accuracy, and a plane wave incident on a perfectly conducting sphere for up to fourth order of accuracy. Comparisons are made with exact solutions for these cases.
Numerical Simulations for Landing Gear Noise Generation and Radiation

NASA Technical Reports Server (NTRS)

Morris, Philip J.; Long, Lyle N.

2002-01-01

Aerodynamic noise from a landing gear in a uniform flow is computed using the Ffowcs Williams-Hawkings (FW-H) equation. The time accurate flow data on the surface is obtained using a finite volume flow solver on an unstructured grid. The Ffowcs Williams-Hawkings equation is solved using surface integrals over the landing gear surface and over a permeable surface away from the landing gear. Two geometric configurations are tested in order to assess the impact of two lateral struts on the sound level and directivity in the far-field. Predictions from the Ffowcs Williams-Hawkings code are compared with direct calculations by the flow solver at several observer locations inside the computational domain. The permeable Ffowcs Williams-Hawkings surface predictions match those of the flow solver in the near-field. Far-field noise calculations coincide for both integration surfaces. The increase in drag observed between the two landing gear configurations is reflected in the sound pressure level and directivity mainly in the streamwise direction.
A Framework for Parallel Unstructured Grid Generation for Complex Aerodynamic Simulations

NASA Technical Reports Server (NTRS)

Zagaris, George; Pirzadeh, Shahyar Z.; Chrisochoides, Nikos

2009-01-01

A framework for parallel unstructured grid generation targeting both shared memory multi-processors and distributed memory architectures is presented. The two fundamental building-blocks of the framework consist of: (1) the Advancing-Partition (AP) method used for domain decomposition and (2) the Advancing Front (AF) method used for mesh generation. Starting from the surface mesh of the computational domain, the AP method is applied recursively to generate a set of sub-domains. Next, the sub-domains are meshed in parallel using the AF method. The recursive nature of domain decomposition naturally maps to a divide-and-conquer algorithm which exhibits inherent parallelism. For the parallel implementation, the Master/Worker pattern is employed to dynamically balance the varying workloads of each task on the set of available CPUs. Performance results by this approach are presented and discussed in detail as well as future work and improvements.
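The two-stage structure described above (recursive decomposition, then parallel meshing of the sub-domains under a Master/Worker pattern) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the AP or AF method: the "domain" is a 1D range of cell indices, `decompose` is plain recursive bisection, and `mesh_subdomain` is a hypothetical stand-in for a real mesh generator. Python's `concurrent.futures.ThreadPoolExecutor` plays the master, handing queued sub-domains to whichever worker is idle.

```python
from concurrent.futures import ThreadPoolExecutor


def decompose(lo, hi, max_cells):
    """Recursively bisect the index range [lo, hi) until each piece
    holds at most max_cells cells (divide and conquer)."""
    if hi - lo <= max_cells:
        return [(lo, hi)]
    mid = (lo + hi) // 2
    return decompose(lo, mid, max_cells) + decompose(mid, hi, max_cells)


def mesh_subdomain(bounds):
    """Hypothetical stand-in for a mesh generator: one segment per cell."""
    lo, hi = bounds
    return [(i, i + 1) for i in range(lo, hi)]


def generate_mesh(n_cells, max_cells=8, workers=4):
    """Decompose the domain, mesh the pieces concurrently, and merge them."""
    subdomains = decompose(0, n_cells, max_cells)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves the order of the sub-domains in its results,
        # while idle workers pull the next queued task as they finish.
        pieces = list(pool.map(mesh_subdomain, subdomains))
    return [cell for piece in pieces for cell in piece]


mesh = generate_mesh(100)
print(len(mesh), "cells from", len(decompose(0, 100, 8)), "sub-domains")
```

Threads keep the sketch short. For CPU-bound meshing in CPython a process pool or MPI ranks would be the more realistic choice, because threads share the interpreter lock.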
Directional Agglomeration Multigrid Techniques for High-Reynolds Number Viscous Flows

NASA Technical Reports Server (NTRS)

Mavriplis, Dimitri J.

1998-01-01

A preconditioned directional-implicit agglomeration algorithm is developed for solving two- and three-dimensional viscous flows on highly anisotropic unstructured meshes of mixed-element types. The multigrid smoother consists of a pre-conditioned point- or line-implicit solver which operates on lines constructed in the unstructured mesh using a weighted graph algorithm. Directional coarsening or agglomeration is achieved using a similar weighted graph algorithm. A tight coupling of the line construction and directional agglomeration algorithms enables the use of aggressive coarsening ratios in the multigrid algorithm, which in turn reduces the cost of a multigrid cycle. Convergence rates which are independent of the degree of grid stretching are demonstrated in both two and three dimensions. Further improvement of the three-dimensional convergence rates through a GMRES technique is also demonstrated.

Comments regarding two upwind methods for solving two-dimensional external flows using unstructured grids

NASA Technical Reports Server (NTRS)

Kleb, W. L.

1994-01-01

Steady flow over the leading portion of a multicomponent airfoil section is studied using computational fluid dynamics (CFD) employing an unstructured grid. To simplify the problem, only the inviscid terms are retained from the Reynolds-averaged Navier-Stokes equations - leaving the Euler equations. The algorithm is derived using the finite-volume approach, incorporating explicit time-marching of the unsteady Euler equations to a time-asymptotic, steady-state solution. The inviscid fluxes are obtained through either of two approximate Riemann solvers: Roe's flux difference splitting or van Leer's flux vector splitting. Results are presented which contrast the solutions given by the two flux functions as a function of Mach number and grid resolution. Additional information is presented concerning code verification techniques, flow recirculation regions, convergence histories, and computational resources.
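Of the two flux functions named above, van Leer's flux vector splitting is compact enough to sketch for the 1D Euler equations. The code below uses the standard textbook form of the split fluxes for a perfect gas with an assumed ratio of specific heats of 1.4; it is not taken from the report, and the function names are hypothetical. The interface flux is the forward split flux of the left state plus the backward split flux of the right state.

```python
import math

GAMMA = 1.4  # assumed ratio of specific heats (perfect gas)


def euler_flux(rho, u, p):
    """Physical 1D Euler flux (mass, momentum, energy) for a primitive state."""
    energy = p / (GAMMA - 1.0) + 0.5 * rho * u * u
    return (rho * u, rho * u * u + p, u * (energy + p))


def van_leer_split(rho, u, p, sign):
    """van Leer split flux F+ (sign=+1) or F- (sign=-1) for one state."""
    a = math.sqrt(GAMMA * p / rho)   # speed of sound
    mach = u / a
    if sign * mach >= 1.0:           # supersonic along this direction: full flux
        return euler_flux(rho, u, p)
    if sign * mach <= -1.0:          # supersonic against this direction: no flux
        return (0.0, 0.0, 0.0)
    f_mass = sign * 0.25 * rho * a * (mach + sign) ** 2
    w = (GAMMA - 1.0) * u + sign * 2.0 * a
    return (f_mass,
            f_mass * w / GAMMA,
            f_mass * w * w / (2.0 * (GAMMA * GAMMA - 1.0)))


def van_leer_flux(left, right):
    """Interface flux: F+ of the left state plus F- of the right state."""
    f_plus = van_leer_split(*left, +1)
    f_minus = van_leer_split(*right, -1)
    return tuple(x + y for x, y in zip(f_plus, f_minus))


state = (1.0, 100.0, 101325.0)       # rho, u, p: an arbitrary subsonic state
print(van_leer_flux(state, state))
print(euler_flux(*state))
```

The two printed tuples should agree to rounding error, because for identical left and right states the split fluxes sum to the physical flux (the consistency property of the splitting).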
NONLINEAR MULTIGRID SOLVER EXPLOITING AMGe COARSE SPACES WITH APPROXIMATION PROPERTIES

DOE Office of Scientific and Technical Information (OSTI.GOV)

Christensen, Max La Cour; Villa, Umberto E.; Engsig-Karup, Allan P.

The paper introduces a nonlinear multigrid solver for mixed finite element discretizations based on the Full Approximation Scheme (FAS) and element-based Algebraic Multigrid (AMGe). The main motivation to use FAS for unstructured problems is the guaranteed approximation property of the AMGe coarse spaces that were developed recently at Lawrence Livermore National Laboratory. These give the ability to derive stable and accurate coarse nonlinear discretization problems. The previous attempts (including ones with the original AMGe method, [5, 11]), were less successful due to lack of such good approximation properties of the coarse spaces. With coarse spaces with approximation properties, our FAS approach on unstructured meshes should be as powerful/successful as FAS on geometrically refined meshes. For comparison, Newton's method and Picard iterations with an inner state-of-the-art linear solver are compared to FAS on a nonlinear saddle point problem with applications to porous media flow. It is demonstrated that FAS is faster than Newton's method and Picard iterations for the experiments considered here. Due to the guaranteed approximation properties of our AMGe, the coarse spaces are very accurate, providing a solver with the potential for mesh-independent convergence on general unstructured meshes.
Implementation of a 3D version of ponderomotive guiding center solver in particle-in-cell code OSIRIS

NASA Astrophysics Data System (ADS)

Helm, Anton; Vieira, Jorge; Silva, Luis; Fonseca, Ricardo

2016-10-01

Laser-driven accelerators have gained increased attention over the past decades. Typical modeling techniques for laser wakefield acceleration (LWFA) are based on particle-in-cell (PIC) simulations. PIC simulations, however, are very computationally expensive due to the disparity of the relevant scales ranging from the laser wavelength, in the micrometer range, to the acceleration length, currently beyond the ten centimeter range. To minimize the gap between these disparate scales the ponderomotive guiding center (PGC) algorithm is a promising approach. By describing the evolution of the laser pulse envelope separately, only the scales larger than the plasma wavelength are required to be resolved in the PGC algorithm, leading to speedups of several orders of magnitude. Previous work was limited to two dimensions. Here we present the implementation of the 3D version of a PGC solver into the massively parallel, fully relativistic PIC code OSIRIS. We extended the solver to include periodic boundary conditions and parallelization in all spatial dimensions. We present benchmarks for distributed and shared memory parallelization. We also discuss the stability of the PGC solver.
An Implicit Solver on A Parallel Block-Structured Adaptive Mesh Grid for FLASH

NASA Astrophysics Data System (ADS)

Lee, D.; Gopal, S.; Mohapatra, P.

2012-07-01

We introduce a fully implicit solver for FLASH based on a Jacobian-Free Newton-Krylov (JFNK) approach with an appropriate preconditioner. The main goal of developing this JFNK-type implicit solver is to provide efficient high-order numerical algorithms and methodology for simulating stiff systems of differential equations on large-scale parallel computer architectures. A large number of natural problems in nonlinear physics involve a wide range of spatial and time scales of interest. A system that encompasses such a wide magnitude of scales is described as "stiff." A stiff system can arise in many different fields of physics, including fluid dynamics/aerodynamics, laboratory/space plasma physics, low Mach number flows, reactive flows, radiation hydrodynamics, and geophysical flows. One of the big challenges in solving such a stiff system using current-day computational resources lies in resolving time and length scales varying by several orders of magnitude. We introduce FLASH's preliminary implementation of a time-accurate JFNK-based implicit solver in the framework of FLASH's unsplit hydro solver.
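The core idea behind a Jacobian-free Newton-Krylov method can be sketched in a few lines: the Krylov solver only needs products of the Jacobian with a vector, and these can be approximated by differencing the residual. The sketch below is illustrative only and is not FLASH code; the toy residual, the function names, and the perturbation size `eps` are assumptions chosen for the example.

```python
def residual(u):
    """Hypothetical nonlinear residual F(u) for a toy two-unknown system."""
    x, y = u
    return [x * x + y - 3.0, x + y * y - 5.0]


def jacobian_free_matvec(F, u, v, eps=1e-7):
    """Approximate J(u) @ v with a forward difference of F; J is never formed."""
    u_pert = [ui + eps * vi for ui, vi in zip(u, v)]
    f0, f1 = F(u), F(u_pert)
    return [(b - a) / eps for a, b in zip(f0, f1)]


def exact_matvec(u, v):
    """Analytic J(u) @ v for the toy residual, used only as a check."""
    x, y = u
    return [2.0 * x * v[0] + v[1], v[0] + 2.0 * y * v[1]]


u = [1.0, 2.0]
v = [0.5, -1.0]
approx = jacobian_free_matvec(residual, u, v)
exact = exact_matvec(u, v)
max_err = max(abs(a - e) for a, e in zip(approx, exact))
```

In a full solver this product would be handed to a preconditioned Krylov method such as GMRES inside each Newton step; the choice of `eps` trades truncation error against round-off and is usually scaled with the norms of `u` and `v`.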
Fluid-structure interaction of a pulsatile flow with an aortic valve model: A combined experimental and numerical study.

PubMed

Sigüenza, Julien; Pott, Desiree; Mendez, Simon; Sonntag, Simon J; Kaufmann, Tim A S; Steinseifer, Ulrich; Nicoud, Franck

2018-04-01

The complex fluid-structure interaction problem associated with the flow of blood through a heart valve with flexible leaflets is investigated both experimentally and numerically. In the experimental test rig, a pulse duplicator generates a pulsatile flow through a biomimetic rigid aortic root where a model of an aortic valve with flexible polymer leaflets is implanted. High-speed recordings of the leaflet motion and particle image velocimetry measurements were performed together to investigate the valve kinematics and the dynamics of the flow. Large eddy simulations of the same configuration, based on a variant of the immersed boundary method, are also presented. A massively parallel unstructured finite-volume flow solver is coupled with a finite-element solid mechanics solver to predict the fluid-structure interaction between the unsteady flow and the valve. A detailed analysis of the dynamics of opening and closure of the valve is conducted, showing good quantitative agreement between the experiment and the simulation regarding the global behavior, in spite of some differences regarding the individual dynamics of the valve leaflets. A multicycle analysis (over more than 20 cycles) makes it possible to characterize the generation of turbulence downstream of the valve, showing similar flow features between the experiment and the simulation. The flow transitions to turbulence after peak systole, when the flow starts to decelerate. Fluctuations are observed in the wake of the valve, with maximum amplitude observed at the commissure side of the aorta. Overall, a very promising experiment-vs-simulation comparison is shown, demonstrating the potential of the numerical method. Copyright © 2017 John Wiley & Sons, Ltd.
TOUGH3: A new efficient version of the TOUGH suite of multiphase flow and transport simulators

NASA Astrophysics Data System (ADS)

Jung, Yoojin; Pau, George Shu Heng; Finsterle, Stefan; Pollyea, Ryan M.

2017-11-01

The TOUGH suite of nonisothermal multiphase flow and transport simulators has been updated by various developers over many years to address a vast range of challenging subsurface problems. The increasing complexity of the simulated processes as well as the growing size of model domains that need to be handled call for an improvement in the simulator's computational robustness and efficiency. Moreover, modifications have been frequently introduced independently, resulting in multiple versions of TOUGH that (1) led to inconsistencies in feature implementation and usage, (2) made code maintenance and development inefficient, and (3) caused confusion to users and developers. TOUGH3, a new base version of TOUGH, addresses these issues. It consolidates both the serial (TOUGH2 V2.1) and parallel (TOUGH2-MP V2.0) implementations, enabling simulations to be performed on desktop computers and supercomputers using a single code. New PETSc parallel linear solvers are added to the existing serial solvers of TOUGH2 and the Aztec solver used in TOUGH2-MP. The PETSc solvers generally perform better than the Aztec solvers in parallel and the internal TOUGH3 linear solver in serial. TOUGH3 also incorporates many new features, addresses bugs, and improves the flexibility of data handling. Due to the improved capabilities and usability, TOUGH3 is more robust and efficient for solving tough and computationally demanding problems in diverse scientific and practical applications related to subsurface flow modeling.
Unstructured mesh algorithms for aerodynamic calculations

NASA Technical Reports Server (NTRS)

Mavriplis, D. J.

1992-01-01

The use of unstructured mesh techniques for solving complex aerodynamic flows is discussed. The principal advantages of unstructured mesh strategies, as they relate to complex geometries, adaptive meshing capabilities, and parallel processing, are emphasized. The various aspects required for the efficient and accurate solution of aerodynamic flows are addressed. These include mesh generation, mesh adaptivity, solution algorithms, convergence acceleration, and turbulence modeling. Computations of viscous turbulent two-dimensional flows and inviscid three-dimensional flows about complex configurations are demonstrated. Remaining obstacles and directions for future research are also outlined.
Implementation of a fully-balanced periodic tridiagonal solver on a parallel distributed memory architecture

NASA Technical Reports Server (NTRS)

Eidson, T. M.; Erlebacher, G.

1994-01-01

While parallel computers offer significant computational performance, it is generally necessary to evaluate several programming strategies. Two programming strategies for a fairly common problem - a periodic tridiagonal solver - are developed and evaluated. Simple model calculations as well as timing results are presented to evaluate the various strategies. The particular tridiagonal solver evaluated is used in many computational fluid dynamic simulation codes. The feature that makes this algorithm unique is that these simulation codes usually require simultaneous solutions for multiple right-hand sides (RHS) of the system of equations. Each RHS solution is independent and thus can be computed in parallel. Thus a Gaussian elimination type algorithm can be used in a parallel computation, and more complicated approaches such as cyclic reduction are not required. The two strategies are a transpose strategy and a distributed solver strategy. For the transpose strategy, the data is moved so that a subset of all the RHS problems is solved on each of the several processors. This usually requires significant data movement between processor memories across a network. The second strategy has the algorithm pass the data across processor boundaries in a chained manner. This usually requires significantly less data movement. An approach to accomplish this second strategy in a near-perfect load-balanced manner is developed. In addition, an algorithm will be shown to directly transform a sequential Gaussian elimination type algorithm into the parallel chained, load-balanced algorithm.
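To make the multiple right-hand-side observation concrete, here is a minimal serial sketch of a periodic tridiagonal solve in which the elimination is performed once and then reused for every RHS. It is not the transpose or chained strategy from the report: the periodic corners are handled with the Sherman-Morrison formula, which is a substitution chosen for brevity, and all function names are hypothetical. No pivoting is done, so diagonal dominance is assumed.

```python
def thomas_multi(a, b, c, rhs_list):
    """Solve T x = r for several right-hand sides with one elimination.

    a, b, c are the sub-, main and super-diagonals; a[0] and c[-1] are ignored.
    """
    n = len(b)
    cp = [0.0] * n      # modified super-diagonal
    piv = [0.0] * n     # pivots, shared by all right-hand sides
    piv[0] = b[0]
    cp[0] = c[0] / piv[0]
    for i in range(1, n):
        piv[i] = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / piv[i]
    solutions = []
    for r in rhs_list:  # each RHS is independent of the others
        d = [0.0] * n
        d[0] = r[0] / piv[0]
        for i in range(1, n):
            d[i] = (r[i] - a[i] * d[i - 1]) / piv[i]
        x = [0.0] * n
        x[n - 1] = d[n - 1]
        for i in range(n - 2, -1, -1):
            x[i] = d[i] - cp[i] * x[i + 1]
        solutions.append(x)
    return solutions


def periodic_tridiagonal_multi(a, b, c, rhs_list):
    """Periodic system: a[0] couples row 0 to column n-1, c[-1] row n-1 to column 0."""
    n = len(b)
    gamma = -b[0]
    bb = list(b)
    bb[0] = b[0] - gamma
    bb[n - 1] = b[n - 1] - c[n - 1] * a[0] / gamma
    u = [0.0] * n
    u[0], u[n - 1] = gamma, c[n - 1]
    sols = thomas_multi(a, bb, c, [u] + list(rhs_list))
    z, ys = sols[0], sols[1:]
    w_last = a[0] / gamma
    denom = 1.0 + z[0] + w_last * z[n - 1]
    out = []
    for y in ys:
        factor = (y[0] + w_last * y[n - 1]) / denom
        out.append([yi - factor * zi for yi, zi in zip(y, z)])
    return out


def periodic_matvec(a, b, c, x):
    """Multiply the periodic tridiagonal matrix by x (used only for checking)."""
    n = len(b)
    return [a[i] * x[i - 1] + b[i] * x[i] + c[i] * x[(i + 1) % n] for i in range(n)]


n = 5
a, b, c = [-1.0] * n, [4.0] * n, [-1.0] * n
x_true = [[1.0, 2.0, 3.0, 4.0, 5.0], [0.5, -1.0, 0.0, 2.0, -3.0]]
rhs = [periodic_matvec(a, b, c, x) for x in x_true]
x_num = periodic_tridiagonal_multi(a, b, c, rhs)
max_err = max(abs(p - q) for xs, xt in zip(x_num, x_true) for p, q in zip(xs, xt))
```

Because the loop over `rhs_list` carries no dependence between iterations, it is the natural place to distribute work across processors, which is the property the abstract relies on.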
Hybrid Grid Techniques for Propulsion Applications

NASA Technical Reports Server (NTRS)

Koomullil, Roy P.; Soni, Bharat K.; Thornburg, Hugh J.

1996-01-01

During the past decade, computational simulation of fluid flow for propulsion activities has progressed significantly, and many notable successes have been reported in the literature. However, the generation of a high quality mesh for such problems has often been reported as a pacing item. Hence, much effort has been expended to speed this portion of the simulation process. Several approaches have evolved for grid generation. Two of the most common are structured multi-block and unstructured based procedures. Structured grids tend to be computationally efficient, and have high aspect ratio cells necessary for efficiently resolving viscous layers. Structured multi-block grids may or may not exhibit grid line continuity across the block interface. This relaxation of the continuity constraint at the interface is intended to ease the grid generation process, which is still time consuming. Flow solvers supporting non-contiguous interfaces require specialized interpolation procedures which may not ensure conservation at the interface. Unstructured or generalized indexing data structures offer greater flexibility, but require explicit connectivity information and are not easy to generate for three dimensional configurations. In addition, unstructured mesh based schemes tend to be less efficient and it is difficult to resolve viscous layers. Recently hybrid or generalized element solution and grid generation techniques have been developed with the objective of combining the attractive features of both structured and unstructured techniques. In the present work, recently developed procedures for hybrid grid generation and flow simulation are critically evaluated, and compared to existing structured and unstructured procedures in terms of accuracy and computational requirements.
Efficient Unstructured Grid Adaptation Methods for Sonic Boom Prediction

NASA Technical Reports Server (NTRS)

Campbell, Richard L.; Carter, Melissa B.; Deere, Karen A.; Waithe, Kenrick A.

2008-01-01

This paper examines the use of two grid adaptation methods to improve the accuracy of the near-to-mid field pressure signature prediction of supersonic aircraft computed using the USM3D unstructured grid flow solver. The first method (ADV) is an interactive adaptation process that uses grid movement rather than enrichment to more accurately resolve the expansion and compression waves. The second method (SSGRID) uses an a priori adaptation approach to stretch and shear the original unstructured grid to align the grid with the pressure waves and reduce the cell count required to achieve an accurate signature prediction at a given distance from the vehicle. Both methods initially create negative volume cells that are repaired in a module in the ADV code. While both approaches provide significant improvements in the near field signature (< 3 body lengths) relative to a baseline grid without increasing the number of grid points, only the SSGRID approach allows the details of the signature to be accurately computed at mid-field distances (3-10 body lengths) for direct use with mid-field-to-ground boom propagation codes.
A fast parallel 3D Poisson solver with longitudinal periodic and transverse open boundary conditions for space-charge simulations

NASA Astrophysics Data System (ADS)

Qiang, Ji

2017-10-01

A three-dimensional (3D) Poisson solver with longitudinal periodic and transverse open boundary conditions can have important applications in beam physics of particle accelerators. In this paper, we present a fast, efficient method to solve the Poisson equation using a spectral finite-difference method. This method uses a computational domain that contains the charged particle beam only and has a computational complexity of O(N_u log N_mode), where N_u is the total number of unknowns and N_mode is the maximum number of longitudinal or azimuthal modes. This saves both computational time and memory relative to using an artificial boundary condition in a large extended computational domain. The new 3D Poisson solver is parallelized using a message passing interface (MPI) on multi-processor computers and shows a reasonable parallel performance up to hundreds of processor cores.
Parallel Gaussian elimination of a block tridiagonal matrix using multiple microcomputers

NASA Technical Reports Server (NTRS)

Blech, Richard A.

1989-01-01

The solution of a block tridiagonal matrix using parallel processing is demonstrated. The multiprocessor system on which results were obtained and the software environment used to program that system are described. Theoretical partitioning and resource allocation for the Gaussian elimination method used to solve the matrix are discussed. The results obtained from running 1, 2 and 3 processor versions of the block tridiagonal solver are presented. The PASCAL source code for these solvers is given in the appendix, and may be transportable to other shared memory parallel processors provided that the synchronization routines are reproduced on the target system.
Nuclide Depletion Capabilities in the Shift Monte Carlo Code

DOE PAGES

Davidson, Gregory G.; Pandya, Tara M.; Johnson, Seth R.; ...

2017-12-21

A new depletion capability has been developed in the Exnihilo radiation transport code suite. This capability enables massively parallel domain-decomposed coupling between the Shift continuous-energy Monte Carlo solver and the nuclide depletion solvers in ORIGEN to perform high-performance Monte Carlo depletion calculations. This paper describes this new depletion capability and discusses its various features, including a multi-level parallel decomposition, high-order transport-depletion coupling, and energy-integrated power renormalization. Several test problems are presented to validate the new capability against other Monte Carlo depletion codes, and the parallel performance of the new capability is analyzed.
A Parallel Multigrid Solver for Viscous Flows on Anisotropic Structured Grids

NASA Technical Reports Server (NTRS)

Prieto, Manuel; Montero, Ruben S.; Llorente, Ignacio M.; Bushnell, Dennis M. (Technical Monitor)

2001-01-01

This paper presents an efficient parallel multigrid solver for speeding up the computation of a 3-D model that treats the flow of a viscous fluid over a flat plate. The main interest of this simulation lies in exhibiting some basic difficulties that prevent optimal multigrid efficiencies from being achieved. As the computing platform, we have used Coral, a Beowulf-class system based on Intel Pentium processors and equipped with GigaNet cLAN and switched Fast Ethernet networks. Our study not only examines the scalability of the solver but also includes a performance evaluation of Coral where the investigated solver has been used to compare several of its design choices, namely, the interconnection network (GigaNet versus switched Fast-Ethernet) and the node configuration (dual nodes versus single nodes). As a reference, the performance results have been compared with those obtained with the NAS-MG benchmark.
In-memory integration of existing software components for parallel adaptive unstructured mesh workflows

DOE Office of Scientific and Technical Information (OSTI.GOV)

Smith, Cameron W.; Granzow, Brian; Diamond, Gerrett

Unstructured mesh methods, like finite elements and finite volumes, support the effective analysis of complex physical behaviors modeled by partial differential equations over general three-dimensional domains. The most reliable and efficient methods apply adaptive procedures with a-posteriori error estimators that indicate where and how the mesh is to be modified. Although adaptive meshes can have two to three orders of magnitude fewer elements than a more uniform mesh for the same level of accuracy, there are many complex simulations where the meshes required are so large that they can only be solved on massively parallel systems.
In-memory integration of existing software components for parallel adaptive unstructured mesh workflows

DOE PAGES

Smith, Cameron W.; Granzow, Brian; Diamond, Gerrett; ...

2017-01-01

Unstructured mesh methods, like finite elements and finite volumes, support the effective analysis of complex physical behaviors modeled by partial differential equations over general three-dimensional domains. The most reliable and efficient methods apply adaptive procedures with a-posteriori error estimators that indicate where and how the mesh is to be modified. Although adaptive meshes can have two to three orders of magnitude fewer elements than a more uniform mesh for the same level of accuracy, there are many complex simulations where the meshes required are so large that they can only be solved on massively parallel systems.
An Element-Based Concurrent Partitioner for Unstructured Finite Element Meshes

NASA Technical Reports Server (NTRS)

Ding, Hong Q.; Ferraro, Robert D.

1996-01-01

A concurrent partitioner for partitioning unstructured finite element meshes on distributed memory architectures is developed. The partitioner uses an element-based partitioning strategy. Its main advantage over the more conventional node-based partitioning strategy is its modular programming approach to the development of parallel applications. The partitioner first partitions element centroids using a recursive inertial bisection algorithm. Elements and nodes then migrate according to the partitioned centroids, using a data request communication template for unpredictable incoming messages. Our scalable implementation is contrasted to a non-scalable implementation which is a straightforward parallelization of a sequential partitioner.
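The centroid-partitioning step can be illustrated with a small serial sketch of recursive inertial bisection in two dimensions: each subset is split at the median of its projection onto the principal axis of its inertia (covariance) matrix. This is only an illustration of the general technique, not the concurrent implementation described above; the function names and the sample grid of centroids are assumptions.

```python
import math


def inertial_bisect(points, ids):
    """Split ids into two halves along the principal axis of their centroids."""
    n = len(ids)
    cx = sum(points[i][0] for i in ids) / n
    cy = sum(points[i][1] for i in ids) / n
    sxx = sum((points[i][0] - cx) ** 2 for i in ids)
    syy = sum((points[i][1] - cy) ** 2 for i in ids)
    sxy = sum((points[i][0] - cx) * (points[i][1] - cy) for i in ids)
    # Orientation of the major principal axis of the 2x2 inertia matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    dx, dy = math.cos(theta), math.sin(theta)
    ordered = sorted(
        ids, key=lambda i: (points[i][0] - cx) * dx + (points[i][1] - cy) * dy
    )
    half = n // 2
    return ordered[:half], ordered[half:]


def recursive_inertial_bisection(points, levels):
    """Return 2**levels lists of point indices."""
    parts = [list(range(len(points)))]
    for _ in range(levels):
        next_parts = []
        for ids in parts:
            left, right = inertial_bisect(points, ids)
            next_parts.extend([left, right])
        parts = next_parts
    return parts


# Hypothetical element centroids: an 8 x 2 strip of unit cells.
points = [(i + 0.5, j + 0.5) for i in range(8) for j in range(2)]
parts = recursive_inertial_bisection(points, levels=2)
```

For the elongated strip used here, both bisection levels cut across the long direction, giving four compact blocks of four centroids each. In a distributed setting the elements and nodes would then migrate to the processor that owns their centroid.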
Parallel 3D Multi-Stage Simulation of a Turbofan Engine

NASA Technical Reports Server (NTRS)

Turner, Mark G.; Topp, David A.

1998-01-01

A 3D multistage simulation of each component of a modern GE Turbofan engine has been made. An axisymmetric view of this engine is presented in the document. This includes a fan, booster rig, high pressure compressor rig, high pressure turbine rig and a low pressure turbine rig. In the near future, all components will be run in a single calculation for a solution of 49 blade rows. The simulation exploits the use of parallel computations by using two levels of parallelism. Each blade row is run in parallel and each blade row grid is decomposed into several domains and run in parallel. 20 processors are used for the 4 blade row analysis. The average passage approach developed by John Adamczyk at NASA Lewis Research Center has been further developed and parallelized. This is APNASA Version A. It is a Navier-Stokes solver using a 4-stage explicit Runge-Kutta time marching scheme with variable time steps and residual smoothing for convergence acceleration. It has an implicit k-epsilon turbulence model which uses an ADI solver to factor the matrix. Between 50 and 100 explicit time steps are solved before a blade row body force is calculated and exchanged with the other blade rows. This outer iteration has been coined a "flip." Efforts have been made to make the solver linearly scalable with the number of blade rows. Enough flips are run (between 50 and 200) so the solution in the entire machine is not changing. The k-epsilon equations are generally solved every other explicit time step. One of the key requirements in the development of the parallel code was to make the parallel solution exactly (bit for bit) match the serial solution. This has helped isolate many small parallel bugs and guarantee the parallelization was done correctly. The domain decomposition is done only in the axial direction since the number of points axially is much larger than the other two directions. This code uses MPI for message passing. The parallel speed up of the solver portion (no I/O or body force calculation) for a grid which has 227 points axially.
Laplace-domain waveform modeling and inversion for the 3D acoustic-elastic coupled media

NASA Astrophysics Data System (ADS)

Shin, Jungkyun; Shin, Changsoo; Calandra, Henri

2016-06-01

Laplace-domain waveform inversion reconstructs long-wavelength subsurface models by using the zero-frequency component of damped seismic signals. Despite the computational advantages of Laplace-domain waveform inversion over conventional frequency-domain waveform inversion, an acoustic assumption and an iterative matrix solver have been used to invert 3D marine datasets to mitigate the intensive computing cost. In this study, we develop a Laplace-domain waveform modeling and inversion algorithm for 3D acoustic-elastic coupled media by using a parallel sparse direct solver library (MUltifrontal Massively Parallel Solver, MUMPS). We precisely simulate a real marine environment by coupling the 3D acoustic and elastic wave equations with the proper boundary condition at the fluid-solid interface. In addition, we can extract the elastic properties of the Earth below the sea bottom from the recorded acoustic pressure datasets. As a matrix solver, the parallel sparse direct solver is used to factorize the non-symmetric impedance matrix in a distributed memory architecture and rapidly solve the wave field for a number of shots by using the lower and upper matrix factors. Using both synthetic datasets and real datasets obtained by a 3D wide azimuth survey, the long-wavelength component of the P-wave and S-wave velocity models is reconstructed and the proposed modeling and inversion algorithm are verified. A cluster of 80 CPU cores is used for this study.
PLUM: Parallel Load Balancing for Unstructured Adaptive Meshes. Degree awarded by Colorado Univ.

NASA Technical Reports Server (NTRS)

Oliker, Leonid

1998-01-01

Dynamic mesh adaption on unstructured grids is a powerful tool for computing large-scale problems that require grid modifications to efficiently resolve solution features. By locally refining and coarsening the mesh to capture physical phenomena of interest, such procedures make standard computational methods more cost effective. Unfortunately, an efficient parallel implementation of these adaptive methods is rather difficult to achieve, primarily due to the load imbalance created by the dynamically-changing nonuniform grid. This requires significant communication at runtime, leading to idle processors and adversely affecting the total execution time. Nonetheless, it is generally thought that unstructured adaptive-grid techniques will constitute a significant fraction of future high-performance supercomputing. Various dynamic load balancing methods have been reported to date; however, most of them either lack a global view of loads across processors or do not apply their techniques to realistic large-scale applications.

Gradient Calculation Methods on Arbitrary Polyhedral Unstructured Meshes for Cell-Centered CFD Solvers

NASA Technical Reports Server (NTRS)

Sozer, Emre; Brehm, Christoph; Kiris, Cetin C.

2014-01-01

A survey of gradient reconstruction methods for cell-centered data on unstructured meshes is conducted within the scope of accuracy assessment. Formal order of accuracy, as well as error magnitudes for each of the studied methods, are evaluated on a complex mesh of various cell types through consecutive local scaling of an analytical test function. The tests highlighted several gradient operator choices that can consistently achieve 1st order accuracy regardless of cell type and shape. The tests further offered error comparisons for given cell types, leading to the observation that the "ideal" gradient operator choice is not universal. Practical implications of the results are explored via CFD solutions of a 2D inviscid standing vortex, portraying the discretization error properties. A relatively naive, yet largely unexplored, approach of local curvilinear stencil transformation exhibited surprisingly favorable properties.
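As one concrete instance of a gradient operator for cell-centered data, the sketch below shows an unweighted least-squares reconstruction in 2D. It is a common textbook choice and is not claimed to be the operator favored by the survey; the function name, the neighbor layout, and the linear test field are assumptions made for illustration.

```python
def least_squares_gradient(xc, phi_c, neighbors):
    """Unweighted least-squares gradient at a cell centroid.

    xc        -- (x, y) of the cell centroid
    phi_c     -- cell-centered value in that cell
    neighbors -- list of ((x, y), phi) for the neighboring cell centroids
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (xn, yn), phi_n in neighbors:
        dx, dy = xn - xc[0], yn - xc[1]
        dphi = phi_n - phi_c
        a11 += dx * dx
        a12 += dx * dy
        a22 += dy * dy
        b1 += dx * dphi
        b2 += dy * dphi
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-14:
        raise ValueError("neighbor stencil is degenerate (collinear centroids)")
    # Solve the 2x2 normal equations by Cramer's rule.
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)


def phi(x, y):
    """Linear test field with gradient (2, -3)."""
    return 2.0 * x - 3.0 * y + 1.0


center = (0.3, 0.4)
# Irregularly placed neighbor centroids, as on an unstructured mesh.
nbr_xy = [(1.1, 0.5), (0.2, 1.3), (-0.6, 0.1), (0.4, -0.7)]
neighbors = [(p, phi(*p)) for p in nbr_xy]
grad = least_squares_gradient(center, phi(*center), neighbors)
```

The reconstruction is exact for linear fields on any non-degenerate stencil, which makes a linear test function a convenient sanity check before moving on to the kind of accuracy study the abstract describes.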
An unstructured-grid software system for solving complex aerodynamic problems

NASA Technical Reports Server (NTRS)

Frink, Neal T.; Pirzadeh, Shahyar; Parikh, Paresh

1995-01-01

A coordinated effort has been underway over the past four years to elevate unstructured-grid methodology to a mature level. The goal of this endeavor is to provide a validated capability to non-expert users for performing rapid aerodynamic analysis and design of complex configurations. The Euler component of the system is well developed, and is impacting a broad spectrum of engineering needs with capabilities such as rapid grid generation and inviscid flow analysis, inverse design, interactive boundary layers, and propulsion effects. Progress is also being made in the more tenuous Navier-Stokes component of the system. A robust grid generator is under development for constructing quality thin-layer tetrahedral grids, along with a companion Navier-Stokes flow solver. This paper presents an overview of this effort, along with a perspective on the present and future status of the methodology.
MUTILS - a set of efficient modeling tools for multi-core CPUs implemented in MEX

NASA Astrophysics Data System (ADS)

Krotkiewski, Marcin; Dabrowski, Marcin

2013-04-01

The need for computational performance is common in scientific applications, and in particular in numerical simulations, where high resolution models require efficient processing of large amounts of data. Especially in the context of geological problems the need to increase the model resolution to resolve physical and geometrical complexities seems to have no limits. Alas, the performance of new generations of CPUs no longer improves by simply increasing clock speeds. Current industrial trends are to increase the number of computational cores. As a result, parallel implementations are required in order to fully utilize the potential of new processors, and to study more complex models. We target simulations on small to medium scale shared memory computers: laptops and desktop PCs with ~8 CPU cores and up to tens of GB of memory to high-end servers with ~50 CPU cores and hundreds of GB of memory. In this setting MATLAB is often the environment of choice for scientists who want to implement their own models with little effort. It is a useful general purpose mathematical software package, but due to its versatility some of its functionality is not as efficient as it could be. In particular, the challenges of modern multi-core architectures are not fully addressed. We have developed MILAMIN 2, an efficient FEM modeling environment written in native MATLAB. Amongst others, MILAMIN provides functions to define model geometry, generate and convert structured and unstructured meshes (also through interfaces to external mesh generators), compute element and system matrices, apply boundary conditions, solve the system of linear equations, address non-linear and transient problems, and perform post-processing. MILAMIN strives to combine ease of code development with computational efficiency. Where possible, the code is optimized and/or parallelized within the MATLAB framework. Native MATLAB is augmented with the MUTILS library, a set of MEX functions that implement the computationally intensive, performance critical parts of the code, which we have identified to be bottlenecks. Here, we discuss the functionality and performance of the MUTILS library. Currently, it includes: (1) time- and memory-efficient assembly of sparse matrices for FEM simulations; (2) parallel sparse matrix-vector product with optimizations specific to symmetric matrices and multiple degrees of freedom per node; (3) parallel point-in-triangle and point-in-tetrahedron location for unstructured, adaptive 2D and 3D meshes (useful for 'marker in cell' type methods); (4) parallel FEM interpolation for 2D and 3D meshes of elements of different types and orders, and for different numbers of degrees of freedom per node; (5) a stand-alone MEX implementation of the Conjugate Gradients iterative solver; (6) an interface to METIS graph partitioning and a fast implementation of RCM reordering.
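Two of the listed building blocks, the sparse matrix-vector product and the Conjugate Gradients solver, fit together as in the following sketch. It is a plain serial Python illustration and does not reproduce the MUTILS interface, which is implemented as MEX functions; the function names, the CSR arrays, and the small test matrix are assumptions.

```python
def csr_matvec(indptr, indices, data, x):
    """y = A @ x for a matrix stored in compressed sparse row (CSR) form."""
    return [
        sum(data[k] * x[indices[k]] for k in range(indptr[i], indptr[i + 1]))
        for i in range(len(indptr) - 1)
    ]


def conjugate_gradient(matvec, b, tol=1e-10, max_iter=1000):
    """Unpreconditioned CG for a symmetric positive definite operator."""
    x = [0.0] * len(b)
    r = list(b)                      # residual for the zero initial guess
    p = list(r)
    rs_old = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        Ap = matvec(p)
        alpha = rs_old / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = sum(ri * ri for ri in r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# 4 x 4 tridiagonal SPD matrix with stencil (-1, 2, -1), stored in CSR.
indptr = [0, 2, 5, 8, 10]
indices = [0, 1, 0, 1, 2, 1, 2, 3, 2, 3]
data = [2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 2.0]
b = [1.0, 0.0, 0.0, 1.0]
x = conjugate_gradient(lambda v: csr_matvec(indptr, indices, data, v), b)
```

The row loop in `csr_matvec` has no dependence between rows, so it is the part that a multi-core implementation would split across threads; the dot products in CG are the remaining synchronization points.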
3-Dimensional Marine CSEM Modeling by Employing TDFEM with Parallel Solvers

NASA Astrophysics Data System (ADS)

Wu, X.; Yang, T.

2013-12-01

In this paper, a parallel implementation is developed for forward modeling of three-dimensional controlled source electromagnetic (CSEM) problems using the time-domain finite element method (TDFEM). Recently, research on mechanisms for detecting hydrocarbon (HC) reservoirs in the seabed has attracted growing attention. Since China has vast ocean resources, the search for hydrocarbon reservoirs has become significant for the national economy. However, traditional seismic exploration methods face a crucial obstacle in detecting hydrocarbon reservoirs in seabeds with complex structure, owing to relatively high acquisition costs and high exploration risk. In addition, the development of EM simulations typically requires both a deep knowledge of computational electromagnetics (CEM) and a proper use of sophisticated techniques and tools from computer science. Moreover, large-scale EM simulations often require large amounts of memory because of the large amount of data, or long solution times for matrix solvers, function transforms, optimization, etc. The objective of this paper is to present a parallelized implementation of the time-domain finite element method for the analysis of three-dimensional (3D) marine controlled source electromagnetic problems. First, we established a three-dimensional basic background model according to the seismic data; then electromagnetic simulation of marine CSEM was carried out using the time-domain finite element method, which works on an MPI (Message Passing Interface) platform with exact orientation to allow fast detection of hydrocarbon targets in the ocean environment. To speed up the calculation, SuperLU_DIST, the MPI version of SuperLU, is employed in this approach. To represent three-dimensional seabed terrain realistically, the region is discretized into an unstructured mesh rather than a uniform one in order to reduce the number of unknowns. Moreover, high-order Whitney vector basis functions are used for spatial discretization within the finite element approach to approximate the electric field. A horizontal electric dipole was used as the source, and an array of receivers was located at the seabed. To capture the presence of the hydrocarbon layer, the forward responses at water depths from 100 m to 3000 m are calculated. The normalized Magnitude Versus Offset (N-MVO) and Phase Versus Offset (PVO) curves can reflect the resistive characteristics of hydrocarbon layers. In future work, Graphics Processing Unit (GPU) acceleration will be implemented to greatly improve computational efficiency.
A two-dimensional Riemann solver with self-similar sub-structure - Alternative formulation based on least squares projection

NASA Astrophysics Data System (ADS)

Balsara, Dinshaw S.; Vides, Jeaniffer; Gurski, Katharine; Nkonga, Boniface; Dumbser, Michael; Garain, Sudip; Audit, Edouard

2016-01-01

Just as the quality of a one-dimensional approximate Riemann solver is improved by the inclusion of internal sub-structure, the quality of a multidimensional Riemann solver is also similarly improved. Such multidimensional Riemann problems arise when multiple states come together at the vertex of a mesh. The interaction of the resulting one-dimensional Riemann problems gives rise to a strongly-interacting state. We wish to endow this strongly-interacting state with physically-motivated sub-structure. The self-similar formulation of Balsara [16] proves especially useful for this purpose. While that work is based on a Galerkin projection, in this paper we present an analogous self-similar formulation that is based on a different interpretation. In the present formulation, we interpret the shock jumps at the boundary of the strongly-interacting state quite literally. The enforcement of the shock jump conditions is done with a least squares projection (Vides, Nkonga and Audit [67]). With that interpretation, we again show that the multidimensional Riemann solver can be endowed with sub-structure. However, we find that the most efficient implementation arises when we use a flux vector splitting and a least squares projection. An alternative formulation that is based on the full characteristic matrices is also presented. The multidimensional Riemann solvers that are demonstrated here use one-dimensional HLLC Riemann solvers as building blocks. Several stringent test problems drawn from hydrodynamics and MHD are presented to show that the method works. Results from structured and unstructured meshes demonstrate the versatility of our method. The reader is also invited to watch a video introduction to multidimensional Riemann solvers on http://www.nd.edu/~dbalsara/Numerical-PDE-Course.
Computationally efficient simulation of unsteady aerodynamics using POD on the fly

NASA Astrophysics Data System (ADS)

Moreno-Ramos, Ruben; Vega, José M.; Varas, Fernando

2016-12-01

Modern industrial aircraft design requires a large amount of sufficiently accurate aerodynamic and aeroelastic simulations. Current computational fluid dynamics (CFD) solvers with aeroelastic capabilities, such as the NASA URANS unstructured solver FUN3D, require very large computational resources. Since a very large amount of simulation is necessary, the CFD cost is just unaffordable in an industrial production environment and must be significantly reduced. Thus, a more inexpensive, yet sufficiently precise solver is strongly needed. An opportunity to approach this goal could follow some recent results (Terragni and Vega 2014 SIAM J. Appl. Dyn. Syst. 13 330-65 Rapun et al 2015 Int. J. Numer. Meth. Eng. 104 844-68) on an adaptive reduced order model that combines ‘on the fly’ a standard numerical solver (to compute some representative snapshots), proper orthogonal decomposition (POD) (to extract modes from the snapshots), Galerkin projection (onto the set of POD modes), and several additional ingredients such as projecting the equations using a limited amount of points and fairly generic mode libraries. When applied to the complex Ginzburg-Landau equation, the method produces acceleration factors (comparing with standard numerical solvers) of the order of 20 and 300 in one and two space dimensions, respectively. Unfortunately, the extension of the method to unsteady, compressible flows around deformable geometries requires new approaches to deal with deformable meshes, high-Reynolds numbers, and compressibility. A first step in this direction is presented considering the unsteady compressible, two-dimensional flow around an oscillating airfoil using a CFD solver in a rigidly moving mesh. POD on the Fly gives results whose accuracy is comparable to that of the CFD solver used to compute the snapshots.
Parallel computational fluid dynamics '91; Conference Proceedings, Stuttgart, Germany, Jun. 10-12, 1991

NASA Technical Reports Server (NTRS)

Reinsch, K. G. (Editor); Schmidt, W. (Editor); Ecer, A. (Editor); Haeuser, Jochem (Editor); Periaux, J. (Editor)

1992-01-01

A conference was held on parallel computational fluid dynamics and produced related papers. Topics discussed in these papers include: parallel implicit and explicit solvers for compressible flow, parallel computational techniques for Euler and Navier-Stokes equations, grid generation techniques for parallel computers, and aerodynamic simulation on massively parallel systems.
A Flow Solver for Three-Dimensional DRAGON Grids

NASA Technical Reports Server (NTRS)

Liou, Meng-Sing; Zheng, Yao

2002-01-01

The DRAGONFLOW code has been developed to solve the three-dimensional Navier-Stokes equations over a complex geometry whose flow domain is discretized with the DRAGON grid, a combination of a Chimera grid and a collection of unstructured grids. In the DRAGONFLOW suite, both OVERFLOW and USM3D are presented in the form of module libraries, and a master module controls the invocation of these individual modules. This report includes essential aspects, programming structures, benchmark tests and numerical simulations.
PCTDSE: A parallel Cartesian-grid-based TDSE solver for modeling laser-atom interactions

NASA Astrophysics Data System (ADS)

Fu, Yongsheng; Zeng, Jiaolong; Yuan, Jianmin

2017-01-01

We present a parallel Cartesian-grid-based time-dependent Schrödinger equation (TDSE) solver for modeling laser-atom interactions. It can simulate the single-electron dynamics of atoms in arbitrary time-dependent vector potentials. We use a split-operator method combined with fast Fourier transforms (FFT), on a three-dimensional (3D) Cartesian grid. Parallelization is realized using a 2D decomposition strategy based on the Message Passing Interface (MPI) library, which results in a good parallel scaling on modern supercomputers. We give simple applications for the hydrogen atom using the benchmark problems coming from the references and obtain repeatable results. The extensions to other laser-atom systems are straightforward with minimal modifications of the source code.
A Parallel Implementation of Multilevel Recursive Spectral Bisection for Application to Adaptive Unstructured Meshes. Chapter 1

NASA Technical Reports Server (NTRS)

Barnard, Stephen T.; Simon, Horst; Lasinski, T. A. (Technical Monitor)

1994-01-01

The design of a parallel implementation of multilevel recursive spectral bisection is described. The goal is to implement a code that is fast enough to enable dynamic repartitioning of adaptive meshes.
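As a rough illustration of the bisection step that this record builds on, the sketch below splits a small graph in two using an approximation of the Fiedler vector (the eigenvector of the second-smallest eigenvalue of the graph Laplacian). It is a serial, single-level toy, not the multilevel parallel code described in the report; the function names `fiedler_vector` and `spectral_bisect`, the shifted power iteration, and the adjacency-list input format are all assumptions made for this example.

```python
def fiedler_vector(adj, iters=200):
    """Approximate the Fiedler vector of a connected graph.

    adj is a hypothetical adjacency-list format: adj[i] lists the neighbours
    of node i. Power iteration is run on (shift*I - L), with the constant
    vector projected out, so the iterate converges towards the eigenvector
    of the second-smallest eigenvalue of the Laplacian L.
    """
    n = len(adj)
    # 2 * (max degree) is an upper bound on the largest Laplacian eigenvalue.
    shift = 2.0 * max(len(nbrs) for nbrs in adj)
    x = [float(i) for i in range(n)]  # deterministic starting guess
    for _ in range(iters):
        mean = sum(x) / n
        x = [xi - mean for xi in x]  # remove the constant component
        # y = (shift*I - L) x, where (L x)_i = deg_i * x_i - sum of neighbour values
        y = [shift * x[i] - (len(adj[i]) * x[i] - sum(x[j] for j in adj[i]))
             for i in range(n)]
        norm = sum(v * v for v in y) ** 0.5
        x = [v / norm for v in y]
    return x


def spectral_bisect(adj):
    """Split the nodes into two equal-sized halves at the median Fiedler value."""
    x = fiedler_vector(adj)
    order = sorted(range(len(adj)), key=lambda i: x[i])
    half = len(adj) // 2
    return set(order[:half]), set(order[half:])
```

Applied recursively to each half, this yields a recursive spectral bisection; a production code would typically use a Lanczos-type eigensolver and a multilevel coarsening hierarchy rather than plain power iteration.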
Parallel Three-Dimensional Computation of Fluid Dynamics and Fluid-Structure Interactions of Ram-Air Parachutes

NASA Technical Reports Server (NTRS)

Tezduyar, Tayfun E.

1998-01-01

This is a final report as far as our work at University of Minnesota is concerned. The report describes our research progress and accomplishments in development of high performance computing methods and tools for 3D finite element computation of aerodynamic characteristics and fluid-structure interactions (FSI) arising in airdrop systems, namely ram-air parachutes and round parachutes. This class of simulations involves complex geometries, flexible structural components, deforming fluid domains, and unsteady flow patterns. The key components of our simulation toolkit are a stabilized finite element flow solver, a nonlinear structural dynamics solver, an automatic mesh moving scheme, and an interface between the fluid and structural solvers; all of these have been developed within a parallel message-passing paradigm.
Agglomeration Multigrid for an Unstructured-Grid Flow Solver

NASA Technical Reports Server (NTRS)

Frink, Neal; Pandya, Mohagna J.

2004-01-01

An agglomeration multigrid scheme has been implemented into the sequential version of the NASA code USM3Dns, a tetrahedral cell-centered finite volume Euler/Navier-Stokes flow solver. Efficiency and robustness of the multigrid-enhanced flow solver have been assessed for three configurations assuming an inviscid flow and one configuration assuming a viscous fully turbulent flow. The inviscid studies include a transonic flow over the ONERA M6 wing and a generic business jet with flow-through nacelles and a low subsonic flow over a high-lift trapezoidal wing. The viscous case includes a fully turbulent flow over the RAE 2822 rectangular wing. The multigrid solutions converged with 12%-33% of the Central Processing Unit (CPU) time required by the solutions obtained without multigrid. For all of the inviscid cases, multigrid in conjunction with an explicit time-stepping scheme performed the best with regard to the run time memory and CPU time requirements. However, for the viscous case multigrid had to be used with an implicit backward Euler time-stepping scheme that increased the run time memory requirement by 22% as compared to the run made without multigrid.
Hierarchically Parallelized Constrained Nonlinear Solvers with Automated Substructuring

NASA Technical Reports Server (NTRS)

Padovan, Joe; Kwang, Abel

1994-01-01

This paper develops a parallelizable multilevel multiple constrained nonlinear equation solver. The substructuring process is automated to yield appropriately balanced partitioning of each succeeding level. Due to the generality of the procedure, sequential, as well as partially and fully parallel environments can be handled. This includes both single and multiprocessor assignment per individual partition. Several benchmark examples are presented. These illustrate the robustness of the procedure as well as its capability to yield significant reductions in memory utilization and calculational effort due both to updating and inversion.
Parallel methods for the computation of unsteady separated flows around complex geometries

NASA Astrophysics Data System (ADS)

Souliez, Frederic Jean

A numerical investigation of separated flows is made using unstructured meshes around complex geometries. The flow data in the wake of a 60-degree vertex angle cone are analyzed for various versions of our finite volume solver, including a generic version without turbulence model, and a Large Eddy Simulation model with different sub-grid scale constant values. While the primary emphasis is on the comparison of the results against experimental data, the solution is also used as a benchmark tool for an aeroacoustic post-processing utility combined with the Ffowcs Williams-Hawkings (FW-H) equation. A concurrent study is performed of the flow around two 4-wheel landing gear models, with the difference residing in the addition of two additional support struts. These unsteady calculations are used to provide aerodynamic and aeroacoustic data. The impact of the two configurations on the forces as well as on the acoustic near- and far-field is evaluated with the help of the above-mentioned aeroacoustic program. For both the cone and landing gear runs, parallel versions of the flow solver and of the FW-H utility are used via the implementation of the Message Passing Interface (MPI) library, resulting in very good scaling performance. The speed-up results for these cases are described for different platforms including inexpensive Beowulf-class clusters, which are the computing workhorse for the present numerical investigation. Furthermore, the analysis of the flow around a Bell 214 Super Transport (ST) fuselage is presented. A mesh sensitivity analysis is compared against experimental and numerical results collected by the helicopter manufacturer. Parameters such as surface pressure coefficient, lift and drag are evaluated resulting from both steady-state and time-accurate simulations. Various flight conditions are tested, with a slightly negative angle of attack, a large positive angle of attack and a positive yaw angle, all of which resulting in massive flow separation. 
The impact of the shedding of flow behind the rotor hub on the unsteady tail loading is also assessed. Finally, a parametric study of the solver's ability to simulate the propagation of a Gaussian pulse using Roe's flux integration scheme versus central differencing is performed, measuring the impact on the artificial dissipation scheme as well as that of the values of the artificial viscosity coefficients. The combination of a central differencing scheme with fourth-order artificial dissipation is tested on the previously described cone flow case, and the effects on averaged and turbulent quantities are measured.
Algorithms for parallel flow solvers on message passing architectures

NASA Technical Reports Server (NTRS)

Vanderwijngaart, Rob F.

1995-01-01

The purpose of this project has been to identify and test suitable technologies for implementation of fluid flow solvers -- possibly coupled with structures and heat equation solvers -- on MIMD parallel computers. In the course of this investigation much attention has been paid to efficient domain decomposition strategies for ADI-type algorithms. Multi-partitioning derives its efficiency from the assignment of several blocks of grid points to each processor in the parallel computer. A coarse-grain parallelism is obtained, and a near-perfect load balance results. In uni-partitioning every processor receives responsibility for exactly one block of grid points instead of several. This necessitates fine-grain pipelined program execution in order to obtain a reasonable load balance. Although fine-grain parallelism is less desirable on many systems, especially high-latency networks of workstations, uni-partition methods are still in wide use in production codes for flow problems. Consequently, it remains important to achieve good efficiency with this technique that has essentially been superseded by multi-partitioning for parallel ADI-type algorithms. Another reason for the concentration on improving the performance of pipeline methods is their applicability in other types of flow solver kernels with stronger implied data dependence. Analytical expressions can be derived for the size of the dynamic load imbalance incurred in traditional pipelines. From these it can be determined what is the optimal first-processor retardation that leads to the shortest total completion time for the pipeline process. Theoretical predictions of pipeline performance with and without optimization match experimental observations on the iPSC/860 very well. Analysis of pipeline performance also highlights the effect of uncareful grid partitioning in flow solvers that employ pipeline algorithms. 
If grid blocks at boundaries are not at least as large in the wall-normal direction as those immediately adjacent to them, then the first processor in the pipeline will receive a computational load that is less than that of subsequent processors, magnifying the pipeline slowdown effect. Extra compensation is needed for grid boundary effects, even if all grid blocks are equally sized.
A Boundary Condition for Simulation of Flow Over Porous Surfaces

NASA Technical Reports Server (NTRS)

Frink, Neal T.; Bonhaus, Daryl L.; Vatsa, Veer N.; Bauer, Steven X. S.; Tinetti, Ana F.

2001-01-01

A new boundary condition is presented for simulating the flow over passively porous surfaces. The model builds on the prior work of R.H. Bush to eliminate the need for constructing grid within an underlying plenum, thereby simplifying the numerical modeling of passively porous flow control systems and reducing computation cost. Code experts for two structured-grid flow solvers, TLNS3D and CFL3D, and one unstructured solver, USM3Dns, collaborated with an experimental porosity expert to develop the model and implement it into their respective codes. Results presented for the three codes on a slender forebody with circumferential porosity and a wing with leading-edge porosity demonstrate a good agreement with experimental data and a remarkable ability to predict the aggregate aerodynamic effects of surface porosity with a simple boundary condition.
Recent Enhancements To The FUN3D Flow Solver For Moving-Mesh Applications

NASA Technical Reports Server (NTRS)

Biedron, Robert T.; Thomas, James L.

2009-01-01

An unsteady Reynolds-averaged Navier-Stokes solver for unstructured grids has been extended to handle general mesh movement involving rigid, deforming, and overset meshes. Mesh deformation is achieved through analogy to elastic media by solving the linear elasticity equations. A general method for specifying the motion of moving bodies within the mesh has been implemented that allows for inherited motion through parent-child relationships, enabling simulations involving multiple moving bodies. Several example calculations are shown to illustrate the range of potential applications. For problems in which an isolated body is rotating with a fixed rate, a noninertial reference-frame formulation is available. An example calculation for a tilt-wing rotor is used to demonstrate that the time-dependent moving grid and noninertial formulations produce the same results in the limit of zero time-step size.
Performance of fully-coupled algebraic multigrid preconditioners for large-scale VMS resistive MHD

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lin, P. T.; Shadid, J. N.; Hu, J. J.

Here, we explore the current performance and scaling of a fully-implicit stabilized unstructured finite element (FE) variational multiscale (VMS) capability for large-scale simulations of 3D incompressible resistive magnetohydrodynamics (MHD). The large-scale linear systems that are generated by a Newton nonlinear solver approach are iteratively solved by preconditioned Krylov subspace methods. The efficiency of this approach is critically dependent on the scalability and performance of the algebraic multigrid preconditioner. Our study considers the performance of the numerical methods as recently implemented in the second-generation Trilinos implementation that is 64-bit compliant and is not limited by the 32-bit global identifiers of the original Epetra-based Trilinos. The study presents representative results for a Poisson problem on 1.6 million cores of an IBM Blue Gene/Q platform to demonstrate very large-scale parallel execution. Additionally, results for a more challenging steady-state MHD generator and a transient solution of a benchmark MHD turbulence calculation for the full resistive MHD system are also presented. These results are obtained on up to 131,000 cores of a Cray XC40 and one million cores of a BG/Q system.
Performance of fully-coupled algebraic multigrid preconditioners for large-scale VMS resistive MHD

DOE PAGES

Lin, P. T.; Shadid, J. N.; Hu, J. J.; ...

2017-11-06

Here, we explore the current performance and scaling of a fully-implicit stabilized unstructured finite element (FE) variational multiscale (VMS) capability for large-scale simulations of 3D incompressible resistive magnetohydrodynamics (MHD). The large-scale linear systems that are generated by a Newton nonlinear solver approach are iteratively solved by preconditioned Krylov subspace methods. The efficiency of this approach is critically dependent on the scalability and performance of the algebraic multigrid preconditioner. Our study considers the performance of the numerical methods as recently implemented in the second-generation Trilinos implementation that is 64-bit compliant and is not limited by the 32-bit global identifiers of the original Epetra-based Trilinos. The study presents representative results for a Poisson problem on 1.6 million cores of an IBM Blue Gene/Q platform to demonstrate very large-scale parallel execution. Additionally, results for a more challenging steady-state MHD generator and a transient solution of a benchmark MHD turbulence calculation for the full resistive MHD system are also presented. These results are obtained on up to 131,000 cores of a Cray XC40 and one million cores of a BG/Q system.
Computational simulations of supersonic magnetohydrodynamic flow control, power and propulsion systems

NASA Astrophysics Data System (ADS)

Wan, Tian

This work is motivated by the lack of a fully coupled computational tool that successfully solves the turbulent chemically reacting Navier-Stokes equations, the electron energy conservation equation and the electric current Poisson equation. In the present work, the abovementioned equations are solved in a fully coupled manner using fully implicit parallel GMRES methods. The system of Navier-Stokes equations is solved using a GMRES method with combined Schwarz and ILU(0) preconditioners. The electron energy equation and the electric current Poisson equation are solved using a GMRES method with combined SOR and Jacobi preconditioners. The fully coupled method has also been implemented successfully in an unstructured solver, US3D, and convergence test results were presented. This new method is shown to be two to five times faster than the original DPLR method. The Poisson solver is validated with analytic test problems. Then, four problems are selected; two of them are computed to explore the possibility of onboard MHD control and power generation, and the other two are simulations of experiments. First, the possibility of onboard reentry shock control by a magnetic field is explored. As part of a previous project, MHD power generation onboard a re-entry vehicle is also simulated. Then, the MHD acceleration experiments conducted at NASA Ames Research Center are simulated. Lastly, the MHD power generation experiments known as the HVEPS project are simulated. For code validation, the scramjet experiments at University of Queensland are simulated first. The generator section of the HVEPS test facility is then computed. The main conclusion is that the computational tool is accurate for different types of problems and flow conditions, and its accuracy and efficiency are necessary when the flow complexity increases.

Distributed Memory Parallel Computing with SEAWAT

NASA Astrophysics Data System (ADS)

Verkaik, J.; Huizer, S.; van Engelen, J.; Oude Essink, G.; Ram, R.; Vuik, K.

2017-12-01

Fresh groundwater reserves in coastal aquifers are threatened by sea-level rise, extreme weather conditions, increasing urbanization and associated groundwater extraction rates. To counteract these threats, accurate high-resolution numerical models are required to optimize the management of these precious reserves. The major model drawbacks are long run times and large memory requirements, limiting the predictive power of these models. Distributed memory parallel computing is an efficient technique for reducing run times and memory requirements, where the problem is divided over multiple processor cores. A new Parallel Krylov Solver (PKS) for SEAWAT is presented. PKS has recently been applied to MODFLOW and includes Conjugate Gradient (CG) and Biconjugate Gradient Stabilized (BiCGSTAB) linear accelerators. Both accelerators are preconditioned by an overlapping additive Schwarz preconditioner in a way that: a) subdomains are partitioned using Recursive Coordinate Bisection (RCB) load balancing, b) each subdomain uses local memory only and communicates with other subdomains by Message Passing Interface (MPI) within the linear accelerator, c) it is fully integrated in SEAWAT. Within SEAWAT, the PKS-CG solver replaces the Preconditioned Conjugate Gradient (PCG) solver for solving the variable-density groundwater flow equation and the PKS-BiCGSTAB solver replaces the Generalized Conjugate Gradient (GCG) solver for solving the advection-diffusion equation. PKS supports the third-order Total Variation Diminishing (TVD) scheme for computing advection. Benchmarks were performed on the Dutch national supercomputer (https://userinfo.surfsara.nl/systems/cartesius) using up to 128 cores, for a synthetic 3D Henry model (100 million cells) and the real-life Sand Engine model ( 10 million cells). 
The Sand Engine model was used to investigate the potential effect of the long-term morphological evolution of a large sand replenishment and climate change on fresh groundwater resources. Speed-ups up to 40 were obtained with the new PKS solver.
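The Recursive Coordinate Bisection (RCB) partitioning named in this abstract can be sketched in a few lines. The version below is a minimal illustration under stated assumptions, not the PKS implementation: the function name `rcb`, the choice of cutting along the axis of largest extent, and the median-by-count split are assumptions for this example, and cell weights are ignored.

```python
def rcb(points, levels):
    """Partition point indices into up to 2**levels parts by recursive
    coordinate bisection.

    points is a list of equal-length coordinate tuples (hypothetical input
    format). At each level the current set is cut at the median along the
    axis with the largest extent, so the two halves hold (nearly) the same
    number of points.
    """
    dim = len(points[0])

    def extent(idx, axis):
        values = [points[i][axis] for i in idx]
        return max(values) - min(values)

    def split(idx, depth):
        if depth == 0 or len(idx) <= 1:
            return [idx]
        axis = max(range(dim), key=lambda d: extent(idx, d))
        ordered = sorted(idx, key=lambda i: points[i][axis])
        mid = len(ordered) // 2
        return split(ordered[:mid], depth - 1) + split(ordered[mid:], depth - 1)

    return split(list(range(len(points))), levels)
```

For example, calling `rcb` on the cell centres of an 8 x 4 grid with `levels=2` gives four subdomains of eight cells each, which could then be assigned one per MPI rank.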
Turbulent Bubbly Flow in a Vertical Pipe Computed By an Eddy-Resolving Reynolds Stress Model

DTIC Science & Technology

2014-09-19

the numerical code OpenFOAM®. 1 Introduction Turbulent bubbly flows are encountered in many industrially relevant applications, such as chemical in...performed using the OpenFOAM-2.2.2 computational code utilizing a cell-center-based finite volume method on an unstructured numerical grid. The...the mean Courant number is always below 0.4. The utilized turbulence models were implemented into the so-called twoPhaseEulerFoam solver in OpenFOAM, to
Development of an Aero-Optics Software Library and Integration into Structured Overset and Unstructured Computational Fluid Dynamics (CFD) Flow Solvers

DTIC Science & Technology

2011-04-01

some similarities to the far-field (i.e. atmospheric) propagation, but due to the interactions between turbulence length scales, beam wavelengths...equivalently, phase differences, have been used to characterize the beam distortion caused by the unsteady turbulent flow field. A Partially-Averaged Navier...A., Wang, M., and Moin, P., “Computational Study of Aero-Optical Distortion by Turbulent Wake,” AIAA Paper 2005-4655. [11] Mani, A., Wang, M., and
Final Report for ALCC Allocation: Predictive Simulation of Complex Flow in Wind Farms

DOE Office of Scientific and Technical Information (OSTI.GOV)

Barone, Matthew F.; Ananthan, Shreyas; Churchfield, Matt

This report documents work performed using ALCC computing resources granted under a proposal submitted in February 2016, with the resource allocation period spanning the period July 2016 through June 2017. The award allocation was 10.7 million processor-hours at the National Energy Research Scientific Computing Center. The simulations performed were in support of two projects: the Atmosphere to Electrons (A2e) project, supported by the DOE EERE office; and the Exascale Computing Project (ECP), supported by the DOE Office of Science. The project team for both efforts consists of staff scientists and postdocs from Sandia National Laboratories and the National Renewable Energy Laboratory. At the heart of these projects is the open-source computational-fluid-dynamics (CFD) code, Nalu. Nalu solves the low-Mach-number Navier-Stokes equations using an unstructured-grid discretization. Nalu leverages the open-source Trilinos solver library and the Sierra Toolkit (STK) for parallelization and I/O. This report documents baseline computational performance of the Nalu code on problems of direct relevance to the wind plant physics application - namely, Large Eddy Simulation (LES) of an atmospheric boundary layer (ABL) flow and wall-modeled LES of a flow past a static wind turbine rotor blade. Parallel performance of Nalu and its constituent solver routines residing in the Trilinos library has been assessed previously under various campaigns. However, both Nalu and Trilinos have been, and remain, in active development and resources have not been available previously to rigorously track code performance over time. With the initiation of the ECP, it is important to establish and document baseline code performance on the problems of interest. This will allow the project team to identify and target any deficiencies in performance, as well as highlight any performance bottlenecks as we exercise the code on a greater variety of platforms and at larger scales. 
The current study is rather modest in scale, examining performance on problem sizes of O(100 million) elements and core counts up to 8k cores. This will be expanded as more computational resources become available to the projects.
Parallelized CCHE2D flow model with CUDA Fortran on Graphics Process Units

USDA-ARS's Scientific Manuscript database

This paper presents the CCHE2D implicit flow model parallelized using CUDA Fortran programming technique on Graphics Processing Units (GPUs). A parallelized implicit Alternating Direction Implicit (ADI) solver using Parallel Cyclic Reduction (PCR) algorithm on GPU is developed and tested. This solve...
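For readers unfamiliar with Parallel Cyclic Reduction (PCR), the sketch below shows the textbook algorithm for one tridiagonal system of the kind that arises in each ADI sweep. It is written as serial Python for clarity and is not the CCHE2D CUDA Fortran code; the function name `pcr_solve` and the array layout are assumptions for this example. The inner loop over `i` has no dependencies between iterations within a step, which is the part a GPU implementation would typically map to one thread per equation.

```python
def pcr_solve(a, b, c, d):
    """Solve a tridiagonal system with parallel cyclic reduction.

    Equation i reads a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i],
    with a[0] == 0 and c[-1] == 0. No pivoting is done, so the matrix is
    assumed to be diagonally dominant.
    """
    n = len(b)
    a, b, c, d = list(a), list(b), list(c), list(d)
    stride = 1
    while stride < n:
        na, nb, nc, nd = [0.0] * n, [0.0] * n, [0.0] * n, [0.0] * n
        # Every i is independent of the others within this step.
        for i in range(n):
            lo, hi = i - stride, i + stride
            nb[i], nd[i] = b[i], d[i]
            if lo >= 0:
                # Eliminate x[lo] using equation lo.
                alpha = -a[i] / b[lo]
                na[i] = alpha * a[lo]
                nb[i] += alpha * c[lo]
                nd[i] += alpha * d[lo]
            if hi < n:
                # Eliminate x[hi] using equation hi.
                gamma = -c[i] / b[hi]
                nc[i] = gamma * c[hi]
                nb[i] += gamma * a[hi]
                nd[i] += gamma * d[hi]
        a, b, c, d = na, nb, nc, nd
        stride *= 2  # equation i now couples x[i - stride], x[i], x[i + stride]
    # All off-diagonal couplings now fall outside the system.
    return [d[i] / b[i] for i in range(n)]
```

After about log2(n) steps every equation is decoupled, at the cost of more total arithmetic than the serial Thomas algorithm; that trade-off is usually what makes PCR attractive on massively parallel hardware.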
Distributed memory compiler methods for irregular problems: Data copy reuse and runtime partitioning

NASA Technical Reports Server (NTRS)

Das, Raja; Ponnusamy, Ravi; Saltz, Joel; Mavriplis, Dimitri

1991-01-01

Outlined here are two methods which we believe will play an important role in any distributed memory compiler able to handle sparse and unstructured problems. We describe how to link runtime partitioners to distributed memory compilers. In our scheme, programmers can implicitly specify how data and loop iterations are to be distributed between processors. This insulates users from having to deal explicitly with potentially complex algorithms that carry out work and data partitioning. We also describe a viable mechanism for tracking and reusing copies of off-processor data. In many programs, several loops access the same off-processor memory locations. As long as it can be verified that the values assigned to off-processor memory locations remain unmodified, we show that we can effectively reuse stored off-processor data. We present experimental data from a 3-D unstructured Euler solver run on iPSC/860 to demonstrate the usefulness of our methods.
An Adaptive Unstructured Grid Method by Grid Subdivision, Local Remeshing, and Grid Movement

NASA Technical Reports Server (NTRS)

Pirzadeh, Shahyar Z.

1999-01-01

An unstructured grid adaptation technique has been developed and successfully applied to several three dimensional inviscid flow test cases. The approach is based on a combination of grid subdivision, local remeshing, and grid movement. For solution adaptive grids, the surface triangulation is locally refined by grid subdivision, and the tetrahedral grid in the field is partially remeshed at locations of dominant flow features. A grid redistribution strategy is employed for geometric adaptation of volume grids to moving or deforming surfaces. The method is automatic and fast and is designed for modular coupling with different solvers. Several steady state test cases with different inviscid flow features were tested for grid/solution adaptation. In all cases, the dominant flow features, such as shocks and vortices, were accurately and efficiently predicted with the present approach. A new and robust method of moving tetrahedral "viscous" grids is also presented and demonstrated on a three-dimensional example.
Wall modeled LES of wind turbine wakes with geometrical effects

NASA Astrophysics Data System (ADS)

Bricteux, Laurent; Benard, Pierre; Zeoli, Stephanie; Moureau, Vincent; Lartigue, Ghislain; Vire, Axelle

2017-11-01

This study focuses on the prediction of wind turbine wakes when geometrical effects such as the nacelle, tower, and built environment are taken into account. The aim is to demonstrate the ability of a high order unstructured solver called YALES2 to perform wall modeled LES of wind turbine wake turbulence. The wind turbine rotor is modeled using an Actuator Line Model (ALM) while the geometrical details are explicitly meshed thanks to the use of an unstructured grid. As high Reynolds number flows are considered, sub-grid scale models as well as wall modeling are required. The first test case investigated concerns a wind turbine located in a wind tunnel, which allows the proposed methodology to be validated against experimental data. The second test case concerns the simulation of a wind turbine wake in a complex environment (e.g. a building) using realistic turbulent inflow conditions.
Numerical analysis of a high-order unstructured overset grid method for compressible LES of turbomachinery

NASA Astrophysics Data System (ADS)

de Laborderie, J.; Duchaine, F.; Gicquel, L.; Vermorel, O.; Wang, G.; Moreau, S.

2018-06-01

Large-Eddy Simulation (LES) is recognized as a promising method for high-fidelity flow predictions in turbomachinery applications. The presented approach consists of the coupling of several instances of the same LES unstructured solver through an overset grid method. A high-order interpolation, implemented within this coupling method, is introduced and evaluated on several test cases. It is shown to be third order accurate, to preserve the accuracy of various second and third order convective schemes and to ensure the continuity of diffusive fluxes and subgrid scale tensors even in detrimental interface configurations. In this analysis, three types of spurious waves generated at the interface are identified. They are significantly reduced by the high-order interpolation at the interface. The latter having the same cost as the original lower order method, the high-order overset grid method appears as a promising alternative to be used in all the applications.
Final Report: Subcontract B623868 Algebraic Multigrid solvers for coupled PDE systems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brannick, J.

The Pennsylvania State University (“Subcontractor”) continued to work on the design of algebraic multigrid solvers for coupled systems of partial differential equations (PDEs) arising in numerical modeling of various applications, with a main focus on solving the Dirac equation arising in Quantum Chromodynamics (QCD). The goal of the proposed work was to develop combined geometric and algebraic multilevel solvers that are robust and lend themselves to efficient implementation on massively parallel heterogeneous computers for these QCD systems. The research in these areas built on previous works, focusing on the following three topics: (1) the development of parallel full-multigrid (PFMG) and non-Galerkin coarsening techniques in this framework for solving the Wilson Dirac system; (2) the use of these same Wilson MG solvers for preconditioning the Overlap and Domain Wall formulations of the Dirac equation; and (3) the design and analysis of algebraic coarsening algorithms for coupled PDE systems including Stokes equation, Maxwell equation and linear elasticity.
Parallel discontinuous Galerkin FEM for computing hyperbolic conservation law on unstructured grids

NASA Astrophysics Data System (ADS)

Ma, Xinrong; Duan, Zhijian

2018-04-01

High-order discontinuous Galerkin finite element methods (DGFEM) are known to be effective for solving the Euler and Navier-Stokes equations on unstructured grids, but they demand substantial computational resources. An efficient parallel algorithm is presented for solving the compressible Euler equations. Moreover, a multigrid strategy based on a three-stage, third-order TVD Runge-Kutta scheme is used to improve the computational efficiency of DGFEM and to accelerate convergence of the solution of the unsteady compressible Euler equations. To keep the load balanced across processors, the domain decomposition method is employed. Numerical experiments are performed for inviscid transonic flow around the NACA0012 airfoil and the M6 wing. The results indicate that the parallel algorithm significantly improves speedup and efficiency, making it suitable for computing complex flows.
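The three-stage, third-order TVD Runge-Kutta time stepping named in this abstract can be illustrated with a minimal sketch. It assumes the standard Shu-Osher coefficients, which the abstract does not spell out, and it uses a first-order periodic upwind operator (the hypothetical `upwind_rhs`) as a stand-in for the paper's DG spatial residual.

```python
def tvd_rk3_step(u, dt, rhs):
    """One step of the three-stage, third-order TVD Runge-Kutta scheme
    (Shu-Osher form, assumed here) for du/dt = rhs(u), with u a list of floats."""
    def combine(a, x, b, y):
        return [a * xi + b * yi for xi, yi in zip(x, y)]

    def euler(v):
        # Forward-Euler building block: v + dt * rhs(v)
        return combine(1.0, v, dt, rhs(v))

    u1 = euler(u)
    u2 = combine(0.75, u, 0.25, euler(u1))
    return combine(1.0 / 3.0, u, 2.0 / 3.0, euler(u2))


def upwind_rhs(u, a=1.0, dx=0.1):
    """Illustrative stand-in residual: first-order upwind for u_t + a u_x = 0
    (a > 0) on a periodic grid. Not the DG operator of the paper."""
    return [-a * (u[i] - u[i - 1]) / dx for i in range(len(u))]


# Scalar decay du/dt = -u: one step reproduces the cubic Taylor polynomial.
decay = tvd_rk3_step([1.0], 0.1, lambda v: [-x for x in v])  # ≈ [0.904833]

# Step profile advected with CFL number a*dt/dx = 0.5.
profile = tvd_rk3_step([1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0], 0.05, upwind_rhs)
```

Each stage is a convex combination of forward-Euler updates, which is why the scheme inherits the total-variation bound of the Euler step under the same CFL restriction.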
A Numerical Study of Scalable Cardiac Electro-Mechanical Solvers on HPC Architectures

PubMed Central

Colli Franzone, Piero; Pavarino, Luca F.; Scacchi, Simone

2018-01-01

We introduce and study some scalable domain decomposition preconditioners for cardiac electro-mechanical 3D simulations on parallel HPC (High Performance Computing) architectures. The electro-mechanical model of the cardiac tissue is composed of four coupled sub-models: (1) the static finite elasticity equations for the transversely isotropic deformation of the cardiac tissue; (2) the active tension model describing the dynamics of the intracellular calcium, cross-bridge binding and myofilament tension; (3) the anisotropic Bidomain model describing the evolution of the intra- and extra-cellular potentials in the deforming cardiac tissue; and (4) the ionic membrane model describing the dynamics of ionic currents, gating variables, ionic concentrations and stretch-activated channels. This strongly coupled electro-mechanical model is discretized in time with a splitting semi-implicit technique and in space with isoparametric finite elements. The resulting scalable parallel solver is based on Multilevel Additive Schwarz preconditioners for the solution of the Bidomain system and on BDDC preconditioned Newton-Krylov solvers for the non-linear finite elasticity system. The results of several 3D parallel simulations show the scalability of both linear and non-linear solvers and their application to the study of both physiological excitation-contraction cardiac dynamics and re-entrant waves in the presence of different mechano-electrical feedbacks. PMID:29674971
Higher Order Time Integration Schemes for the Unsteady Navier-Stokes Equations on Unstructured Meshes

NASA Technical Reports Server (NTRS)

Jothiprasad, Giridhar; Mavriplis, Dimitri J.; Caughey, David A.

2002-01-01

The rapid increase in available computational power over the last decade has enabled higher resolution flow simulations and more widespread use of unstructured grid methods for complex geometries. While much of this effort has been focused on steady-state calculations in the aerodynamics community, the need to accurately predict off-design conditions, which may involve substantial amounts of flow separation, points to the need to efficiently simulate unsteady flow fields. Accurate unsteady flow simulations can easily require several orders of magnitude more computational effort than a corresponding steady-state simulation. For this reason, techniques for improving the efficiency of unsteady flow simulations are required in order to make such calculations feasible in the foreseeable future. The purpose of this work is to investigate possible reductions in computer time due to the choice of an efficient time-integration scheme from a series of schemes differing in the order of time-accuracy, and by the use of more efficient techniques to solve the nonlinear equations which arise while using implicit time-integration schemes. This investigation is carried out in the context of a two-dimensional unstructured mesh laminar Navier-Stokes solver.
Multigrid Strategies for Viscous Flow Solvers on Anisotropic Unstructured Meshes

NASA Technical Reports Server (NTRS)

Mavriplis, Dimitri J.

1998-01-01

Unstructured multigrid techniques for relieving the stiffness associated with high-Reynolds number viscous flow simulations on extremely stretched grids are investigated. One approach consists of employing a semi-coarsening or directional-coarsening technique, based on the directions of strong coupling within the mesh, in order to construct more optimal coarse grid levels. An alternate approach is developed which employs directional implicit smoothing with regular fully coarsened multigrid levels. The directional implicit smoothing is obtained by constructing implicit lines in the unstructured mesh based on the directions of strong coupling. Both approaches yield large increases in convergence rates over the traditional explicit full-coarsening multigrid algorithm. However, maximum benefits are achieved by combining the two approaches in a coupled manner into a single algorithm. An order of magnitude increase in convergence rate over the traditional explicit full-coarsening algorithm is demonstrated, and convergence rates for high-Reynolds number viscous flows which are independent of the grid aspect ratio are obtained. Further acceleration is provided by incorporating low-Mach-number preconditioning techniques, and a Newton-GMRES strategy which employs the multigrid scheme as a preconditioner. The compounding effects of these various techniques on speed of convergence are documented through several example test cases.
Computing Normal Shock-Isotropic Turbulence Interaction With Tetrahedral Meshes and the Space-Time CESE Method

NASA Astrophysics Data System (ADS)

Venkatachari, Balaji Shankar; Chang, Chau-Lyan

2016-11-01

The focus of this study is scale-resolving simulations of the canonical normal shock-isotropic turbulence interaction using unstructured tetrahedral meshes and the space-time conservation element solution element (CESE) method. Despite decades of development in unstructured mesh methods and their potential benefits of ease of mesh generation around complex geometries and mesh adaptation, direct numerical or large-eddy simulations of turbulent flows are predominantly carried out using structured hexahedral meshes. This is due to the lack of consistent multi-dimensional numerical formulations in conventional schemes for unstructured meshes that can resolve multiple physical scales and flow discontinuities simultaneously. The CESE method - due to its Riemann-solver-free shock capturing capabilities, non-dissipative baseline schemes, and flux conservation in time as well as space - has the potential to accurately simulate turbulent flows using tetrahedral meshes. As part of the study, various regimes of the shock-turbulence interaction (wrinkled and broken shock regimes) will be investigated along with a study on how adaptive refinement of tetrahedral meshes benefits this problem. The research funding for this paper has been provided by Revolutionary Computational Aerosciences (RCA) subproject under the NASA Transformative Aeronautics Concepts Program (TACP).
On Multi-Dimensional Unstructured Mesh Adaption

NASA Technical Reports Server (NTRS)

Wood, William A.; Kleb, William L.

1999-01-01

Anisotropic unstructured mesh adaption is developed for a truly multi-dimensional upwind fluctuation splitting scheme, as applied to scalar advection-diffusion. The adaption is performed locally using edge swapping, point insertion/deletion, and nodal displacements. Comparisons are made versus the current state of the art for aggressive anisotropic unstructured adaption, which is based on a posteriori error estimates. Demonstration of both schemes to model problems, with features representative of compressible gas dynamics, show the present method to be superior to the a posteriori adaption for linear advection. The performance of the two methods is more similar when applied to nonlinear advection, with a difference in the treatment of shocks. The a posteriori adaption can excessively cluster points to a shock, while the present multi-dimensional scheme tends to merely align with a shock, using fewer nodes. As a consequence of this alignment tendency, an implementation of eigenvalue limiting for the suppression of expansion shocks is developed for the multi-dimensional distribution scheme. The differences in the treatment of shocks by the adaption schemes, along with the inherently low levels of artificial dissipation in the fluctuation splitting solver, suggest the present method is a strong candidate for applications to compressible gas dynamics.
Ramses-GPU: Second order MUSCL-Hancock finite volume fluid solver

NASA Astrophysics Data System (ADS)

Kestener, Pierre

2017-10-01

RamsesGPU is a reimplementation of RAMSES (ascl:1011.007) which drops the adaptive mesh refinement (AMR) features to optimize 3D uniform grid algorithms for modern graphics processor units (GPU) to provide an efficient software package for astrophysics applications that do not need AMR features but do require a very large number of integration time steps. RamsesGPU provides a very efficient C++/CUDA/MPI software implementation of a second order MUSCL-Hancock finite volume fluid solver for compressible hydrodynamics as well as a magnetohydrodynamics solver based on the constraint transport technique. Other useful modules include static gravity, dissipative terms (viscosity, resistivity), and a forcing source term for turbulence studies; special care was taken to enhance parallel input/output performance by using state-of-the-art libraries such as HDF5 and parallel-netcdf.
Accurate modeling of plasma acceleration with arbitrary order pseudo-spectral particle-in-cell methods

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jalas, S.; Dornmair, I.; Lehe, R.

Particle in Cell (PIC) simulations are a widely used tool for the investigation of both laser- and beam-driven plasma acceleration. It is a known issue that the beam quality can be artificially degraded by numerical Cherenkov radiation (NCR) resulting primarily from an incorrectly modeled dispersion relation. Pseudo-spectral solvers featuring infinite order stencils can strongly reduce NCR - or even suppress it - and are therefore well suited to correctly model the beam properties. For efficient parallelization of the PIC algorithm, however, localized solvers are inevitable. Arbitrary order pseudo-spectral methods provide this needed locality. Yet, these methods can again be prone to NCR. Here in this paper, we show that acceptably low solver orders are sufficient to correctly model the physics of interest, while allowing for parallel computation by domain decomposition.
On some Aitken-like acceleration of the Schwarz method

NASA Astrophysics Data System (ADS)

Garbey, M.; Tromeur-Dervout, D.

2002-12-01

In this paper we present a family of domain decomposition methods based on Aitken-like acceleration of the Schwarz method, seen as an iterative procedure with a linear rate of convergence. We first present the so-called Aitken-Schwarz procedure for linear differential operators. The solver can be a direct solver when applied to the Helmholtz problem with a five-point finite difference scheme on regular grids. We then introduce the Steffensen-Schwarz variant, which is an iterative domain decomposition solver that can be applied to linear and nonlinear problems. We show that these solvers have reasonable numerical efficiency compared to classical fast solvers for the Poisson problem or multigrid methods for more general linear and nonlinear elliptic problems. However, the salient feature of our method is that the algorithm has high tolerance to slow networks in the context of distributed parallel computing and is attractive, generally speaking, for computer architectures whose performance is limited by memory bandwidth rather than by the flop performance of the CPU. This is nowadays the case for most parallel computers using the RISC processor architecture. We illustrate this highly desirable property of our algorithm with large-scale computing experiments.
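The idea behind accelerating a linearly convergent iteration can be shown on a scalar model. The sketch below applies the classical Aitken delta-squared formula to three successive iterates; the function `schwarz_sweep` is a hypothetical stand-in (a linear contraction with factor 0.5), not the Schwarz iteration of the paper, which acts on interface traces rather than on a single number.

```python
def aitken(x0, x1, x2):
    """Aitken delta-squared extrapolation from three successive iterates.
    For an exactly linear iteration x_{n+1} = rho * x_n + c it returns the limit."""
    denom = x2 - 2.0 * x1 + x0
    if denom == 0.0:
        return x2  # iterates already stationary; nothing to extrapolate
    return x2 - (x2 - x1) ** 2 / denom


def schwarz_sweep(x):
    """Hypothetical stand-in for one Schwarz sweep on an interface value:
    a linear contraction with rate 0.5 and fixed point 2.0."""
    return 0.5 * x + 1.0


x0 = 0.0
x1 = schwarz_sweep(x0)
x2 = schwarz_sweep(x1)
limit = aitken(x0, x1, x2)  # 2.0, the fixed point, after only two sweeps
```

Because the model iteration is exactly linear, two sweeps plus one extrapolation recover the fixed point, which is the sense in which such a procedure can behave like a direct solver.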
A parallel finite-difference method for computational aerodynamics

NASA Technical Reports Server (NTRS)

Swisshelm, Julie M.

1989-01-01

A finite-difference scheme for solving complex three-dimensional aerodynamic flow on parallel-processing supercomputers is presented. The method consists of a basic flow solver with multigrid convergence acceleration, embedded grid refinements, and a zonal equation scheme. Multitasking and vectorization have been incorporated into the algorithm. Results obtained include multiprocessed flow simulations from the Cray X-MP and Cray-2. Speedups as high as 3.3 for the two-dimensional case and 3.5 for segments of the three-dimensional case have been achieved on the Cray-2. The entire solver attained a factor of 2.7 improvement over its unitasked version on the Cray-2. The performance of the parallel algorithm on each machine is analyzed.

Data Parallel Line Relaxation (DPLR) Code User Manual: Acadia - Version 4.01.1

NASA Technical Reports Server (NTRS)

Wright, Michael J.; White, Todd; Mangini, Nancy

2009-01-01

The Data-Parallel Line Relaxation (DPLR) code is a computational fluid dynamics (CFD) solver that was developed at NASA Ames Research Center to help mission support teams generate high-value predictive solutions for hypersonic flow field problems. The DPLR Code Package is an MPI-based, parallel, full three-dimensional Navier-Stokes CFD solver with generalized models for finite-rate reaction kinetics, thermal and chemical non-equilibrium, accurate high-temperature transport coefficients, and ionized flow physics incorporated into the code. DPLR also includes a large selection of generalized realistic surface boundary conditions and links to enable loose coupling with external thermal protection system (TPS) material response and shock layer radiation codes.
Load Balancing Unstructured Adaptive Grids for CFD Problems

NASA Technical Reports Server (NTRS)

Biswas, Rupak; Oliker, Leonid

1996-01-01

Mesh adaption is a powerful tool for efficient unstructured-grid computations but causes load imbalance among processors on a parallel machine. A dynamic load balancing method is presented that balances the workload across all processors with a global view. After each parallel tetrahedral mesh adaption, the method first determines if the new mesh is sufficiently unbalanced to warrant a repartitioning. If so, the adapted mesh is repartitioned, with new partitions assigned to processors so that the redistribution cost is minimized. The new partitions are accepted only if the remapping cost is compensated by the improved load balance. Results indicate that this strategy is effective for large-scale scientific computations on distributed-memory multiprocessors.
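The two decisions described in this abstract (whether to repartition, and whether to accept the new partitions) can be sketched as follows. The imbalance threshold, the cost model, and all function and parameter names are illustrative assumptions; the abstract does not give the actual criteria.

```python
def imbalance(loads):
    """Load imbalance factor: heaviest processor load over the average load."""
    return max(loads) / (sum(loads) / len(loads))


def should_repartition(loads, threshold=1.1):
    """Repartition only if the adapted mesh is sufficiently unbalanced.
    The threshold value is an assumption made for illustration."""
    return imbalance(loads) > threshold


def accept_new_partition(old_loads, new_loads, remap_cost,
                         solver_iters, cost_per_unit=1.0):
    """Accept the new partitions only if the expected solver-time saving
    outweighs the one-off cost of moving data (hypothetical cost model:
    iteration time is proportional to the heaviest processor load)."""
    gain = (max(old_loads) - max(new_loads)) * cost_per_unit * solver_iters
    return gain > remap_cost


old = [160, 80, 80, 80]     # element counts per processor after adaption
new = [100, 100, 100, 100]  # element counts after repartitioning
if should_repartition(old):
    cheap = accept_new_partition(old, new, remap_cost=500, solver_iters=10)
    costly = accept_new_partition(old, new, remap_cost=700, solver_iters=10)
```

With these numbers the estimated saving is 600 work units over ten solver iterations, so the repartitioning is accepted when remapping costs 500 and rejected when it costs 700.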
NASA Workshop on Computational Structural Mechanics 1987, part 1

NASA Technical Reports Server (NTRS)

Sykes, Nancy P. (Editor)

1989-01-01

Topics in Computational Structural Mechanics (CSM) are reviewed. CSM parallel structural methods, a transputer finite element solver, architectures for multiprocessor computers, and parallel eigenvalue extraction are among the topics discussed.
Partitioned coupling of advection-diffusion-reaction systems and Brinkman flows

NASA Astrophysics Data System (ADS)

Lenarda, Pietro; Paggi, Marco; Ruiz Baier, Ricardo

2017-09-01

We present a partitioned algorithm aimed at extending the capabilities of existing solvers for the simulation of coupled advection-diffusion-reaction systems and incompressible, viscous flow. The space discretisation of the governing equations is based on mixed finite element methods defined on unstructured meshes, whereas the time integration hinges on an operator splitting strategy that exploits the differences in scales between the reaction, advection, and diffusion processes, considering the global system as a number of sequentially linked sets of partial differential and algebraic equations. The flow solver has the advantage that all unknowns in the system (here vorticity, velocity, and pressure) can be fully decoupled, which makes the overall scheme very attractive from the computational perspective. The robustness of the proposed method is illustrated with a series of numerical tests in 2D and 3D, relevant in the modelling of bacterial bioconvection and Boussinesq systems.
CFD Assessment of Aerodynamic Degradation of a Subsonic Transport Due to Airframe Damage

NASA Technical Reports Server (NTRS)

Frink, Neal T.; Pirzadeh, Shahyar Z.; Atkins, Harold L.; Viken, Sally A.; Morrison, Joseph H.

2010-01-01

A computational study is presented to assess the utility of two NASA unstructured Navier-Stokes flow solvers for capturing the degradation in static stability and aerodynamic performance of a NASA General Transport Model (GTM) due to airframe damage. The approach is to correlate computational results with a substantial subset of experimental data for the GTM undergoing progressive losses to the wing, vertical tail, and horizontal tail components. The ultimate goal is to advance the probability of inserting computational data into the creation of advanced flight simulation models of damaged subsonic aircraft in order to improve pilot training. Results presented in this paper demonstrate good correlations with slope-derived quantities, such as pitch static margin and static directional stability, and incremental rolling moment due to wing damage. This study further demonstrates that high fidelity Navier-Stokes flow solvers could augment flight simulation models with additional aerodynamic data for various airframe damage scenarios.
Toward Verification of USM3D Extensions for Mixed Element Grids

NASA Technical Reports Server (NTRS)

Pandya, Mohagna J.; Frink, Neal T.; Ding, Ejiang; Parlette, Edward B.

2013-01-01

The unstructured tetrahedral grid cell-centered finite volume flow solver USM3D has been recently extended to handle mixed element grids composed of hexahedral, prismatic, pyramidal, and tetrahedral cells. Presently, two turbulence models, namely, baseline Spalart-Allmaras (SA) and Menter Shear Stress Transport (SST), support mixed element grids. This paper provides an overview of the various numerical discretization options available in the newly enhanced USM3D. Using the SA model, the flow solver extensions are verified on three two-dimensional test cases available on the Turbulence Modeling Resource website at the NASA Langley Research Center. The test cases are zero pressure gradient flat plate, planar shear, and bump-in-channel. The effect of cell topologies on the flow solution is also investigated using the planar shear case. Finally, the assessment of various cell and face gradient options is performed on the zero pressure gradient flat plate case.
Nonlinear Aeroacoustics Computations by the Space-Time CE/SE Method

NASA Technical Reports Server (NTRS)

Loh, Ching Y.

2003-01-01

The Space-Time Conservation Element and Solution Element Method, or CE/SE Method for short, is a recently developed numerical method for conservation laws. Despite its second order accuracy in space and time, it possesses low dispersion errors and low dissipation. The method is robust enough to cover a wide range of compressible flows: from weak linear acoustic waves to strong discontinuous waves (shocks). An outstanding feature of the CE/SE scheme is its truly multi-dimensional, simple but effective non-reflecting boundary condition (NRBC), which is particularly valuable for computational aeroacoustics (CAA). In nature, the method may be categorized as a finite volume method, where the conservation element (CE) is equivalent to a finite control volume (or cell) and the solution element (SE) can be understood as the cell interface. However, due to its careful treatment of the surface fluxes and geometry, it is different from the existing schemes. Currently, the CE/SE scheme has been developed to a matured stage that a 3-D unstructured CE/SE Navier-Stokes solver is already available. However, in the present review paper, as a general introduction to the CE/SE method, only the 2-D unstructured Euler CE/SE solver is chosen and sketched in section 2. Then applications of the 2-D and 3-D CE/SE schemes to linear, and in particular, nonlinear aeroacoustics are depicted in sections 3, 4, and 5 to demonstrate its robustness and capability.
The novel high-performance 3-D MT inverse solver

NASA Astrophysics Data System (ADS)

Kruglyakov, Mikhail; Geraskin, Alexey; Kuvshinov, Alexey

2016-04-01

We present a novel, robust, scalable, and fast 3-D magnetotelluric (MT) inverse solver. The solver is written in a multi-language paradigm to make it as efficient, readable and maintainable as possible. The separation-of-concerns and single-responsibility principles are applied throughout the implementation. The forward modelling engine is extrEMe, a modern scalable solver based on the contracting integral equation approach. An iterative gradient-type (quasi-Newton) optimization scheme is invoked to search for the (regularized) inverse problem solution, and the adjoint source approach is used to calculate the gradient of the misfit efficiently. The inverse solver is able to deal with highly detailed and contrasting models, works (separately or jointly) with any type of MT responses, and supports massive parallelization. Moreover, the different parallelization strategies implemented in the code allow optimal use of the available computational resources for a given problem statement. To parameterize the inverse domain, the so-called mask parameterization is implemented, which means that one can merge any subset of forward modelling cells in order to account for the (usually) irregular distribution of observation sites. We report results of 3-D numerical experiments aimed at analysing the robustness, performance and scalability of the code. In particular, our computational experiments, carried out on platforms ranging from modern laptops to the HPC system Piz Daint (the 6th-ranked supercomputer in the world), demonstrate practically linear scalability of the code up to thousands of nodes.
Multithreaded Model for Dynamic Load Balancing Parallel Adaptive PDE Computations

NASA Technical Reports Server (NTRS)

Chrisochoides, Nikos

1995-01-01

We present a multithreaded model for the dynamic load-balancing of numerical, adaptive computations required for the solution of Partial Differential Equations (PDEs) on multiprocessors. Multithreading is used as a means of exploring concurrency at the processor level in order to tolerate synchronization costs inherent to traditional (non-threaded) parallel adaptive PDE solvers. Our preliminary analysis for parallel, adaptive PDE solvers indicates that multithreading can be used as a mechanism to mask overheads required for the dynamic balancing of processor workloads with computations required for the actual numerical solution of the PDEs. Also, multithreading can simplify the implementation of dynamic load-balancing algorithms, a task that is very difficult for traditional data parallel adaptive PDE computations. Unfortunately, multithreading does not always simplify program complexity, often makes code reuse difficult, and increases software complexity.
Accurate modeling of plasma acceleration with arbitrary order pseudo-spectral particle-in-cell methods

DOE PAGES

Jalas, S.; Dornmair, I.; Lehe, R.; ...

2017-03-20

Particle in Cell (PIC) simulations are a widely used tool for the investigation of both laser- and beam-driven plasma acceleration. It is a known issue that the beam quality can be artificially degraded by numerical Cherenkov radiation (NCR) resulting primarily from an incorrectly modeled dispersion relation. Pseudo-spectral solvers featuring infinite order stencils can strongly reduce NCR - or even suppress it - and are therefore well suited to correctly model the beam properties. For efficient parallelization of the PIC algorithm, however, localized solvers are inevitable. Arbitrary order pseudo-spectral methods provide this needed locality. Yet, these methods can again be prone to NCR. Here in this paper, we show that acceptably low solver orders are sufficient to correctly model the physics of interest, while allowing for parallel computation by domain decomposition.
Hybrid Optimization Parallel Search PACKage

DOE Office of Scientific and Technical Information (OSTI.GOV)

2009-11-10

HOPSPACK is open source software for solving optimization problems without derivatives. Application problems may have a fully nonlinear objective function, bound constraints, and linear and nonlinear constraints. Problem variables may be continuous, integer-valued, or a mixture of both. The software provides a framework that supports any derivative-free type of solver algorithm. Through the framework, solvers request parallel function evaluation, which may use MPI (multiple machines) or multithreading (multiple processors/cores on one machine). The framework provides a Cache and Pending Cache of saved evaluations that reduces execution time and facilitates restarts. Solvers can dynamically create other algorithms to solve subproblems, a useful technique for handling multiple start points and integer-valued variables. HOPSPACK ships with the Generating Set Search (GSS) algorithm, developed at Sandia as part of the APPSPACK open source software project.
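A derivative-free search with a cache of saved evaluations can be sketched in a few lines. The example below is a sequential compass search over the coordinate directions, one simple member of the generating set search family; it is not HOPSPACK's API or its GSS implementation, and the name `compass_search` and all of its parameters are illustrative. In a framework like the one described, the trial points would be dispatched for parallel evaluation instead of being evaluated in a loop.

```python
def compass_search(f, x0, step=1.0, tol=1e-6, max_iter=1000):
    """Minimize f without derivatives by polling +/- step along each coordinate.
    Evaluations are cached so that revisited points cost nothing."""
    cache = {}

    def evaluate(x):
        key = tuple(round(v, 12) for v in x)  # tolerate floating-point noise
        if key not in cache:
            cache[key] = f(x)
        return cache[key]

    x = list(x0)
    fx = evaluate(x)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in range(len(x)):
            for sign in (1.0, -1.0):
                trial = list(x)
                trial[i] += sign * step
                ft = evaluate(trial)
                if ft < fx:  # accept the first improving poll point
                    x, fx, improved = trial, ft, True
                    break
            if improved:
                break
        if not improved:
            step *= 0.5  # no poll point improved: contract the step
    return x, fx, len(cache)


calls = []

def objective(p):
    calls.append(tuple(p))  # record true (uncached) evaluations
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2

best_x, best_f, n_cached = compass_search(objective, [0.0, 0.0])
```

Here `len(calls)` equals `n_cached`, so every distinct point is evaluated exactly once even though the poll pattern revisits earlier points.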
Development of a High-Order Navier-Stokes Solver Using Flux Reconstruction to Simulate Three-Dimensional Vortex Structures in a Curved Artery Model

NASA Astrophysics Data System (ADS)

Cox, Christopher

Low-order numerical methods are widespread in academic solvers and ubiquitous in industrial solvers due to their robustness and usability. High-order methods are less robust and more complicated to implement; however, they exhibit low numerical dissipation and have the potential to improve the accuracy of flow simulations at a lower computational cost when compared to low-order methods. This motivates our development of a high-order compact method using Huynh's flux reconstruction scheme for solving unsteady incompressible flow on unstructured grids. We use Chorin's classic artificial compressibility formulation with dual time stepping to solve unsteady flow problems. In 2D, an implicit non-linear lower-upper symmetric Gauss-Seidel scheme with backward Euler discretization is used to efficiently march the solution in pseudo time, while a second-order backward Euler discretization is used to march in physical time. We verify and validate implementation of the high-order method coupled with our implicit time stepping scheme using both steady and unsteady incompressible flow problems. The current implicit time stepping scheme is proven effective in satisfying the divergence-free constraint on the velocity field in the artificial compressibility formulation. The high-order solver is extended to 3D and parallelized using MPI. Due to its simplicity, time marching for 3D problems is done explicitly. The feasibility of using the current implicit time stepping scheme for large scale three-dimensional problems with high-order polynomial basis still remains to be seen. We directly use the aforementioned numerical solver to simulate pulsatile flow of a Newtonian blood-analog fluid through a rigid 180-degree curved artery model. One of the most physiologically relevant forces within the cardiovascular system is the wall shear stress. 
This force is important because atherosclerotic regions are strongly correlated with curvature and branching in the human vasculature, where the shear stress is both oscillatory and multidirectional. Also, the combined effect of curvature and pulsatility in cardiovascular flows produces unsteady vortices. The aim of this research as it relates to cardiovascular fluid dynamics is to predict the spatial and temporal evolution of vortical structures generated by secondary flows, as well as to assess the correlation between multiple vortex pairs and wall shear stress. We use a physiologically relevant (pulsatile) flow rate and generate results using both fully developed and uniform entrance conditions, the latter being motivated by the fact that flow upstream of a curved artery may not have sufficient straight entrance length to become fully developed. Under the two pulsatile inflow conditions, we characterize the morphology and evolution of various vortex pairs and their subsequent effect on relevant haemodynamic wall shear stress metrics.
Extending substructure based iterative solvers to multiple load and repeated analyses

NASA Technical Reports Server (NTRS)

Farhat, Charbel

1993-01-01

Direct solvers currently dominate commercial finite element structural software, but do not scale well in the fine granularity regime targeted by emerging parallel processors. Substructure based iterative solvers--often called also domain decomposition algorithms--lend themselves better to parallel processing, but must overcome several obstacles before earning their place in general purpose structural analysis programs. One such obstacle is the solution of systems with many or repeated right hand sides. Such systems arise, for example, in multiple load static analyses and in implicit linear dynamics computations. Direct solvers are well-suited for these problems because after the system matrix has been factored, the multiple or repeated solutions can be obtained through relatively inexpensive forward and backward substitutions. On the other hand, iterative solvers in general are ill-suited for these problems because they often must restart from scratch for every different right hand side. In this paper, we present a methodology for extending the range of applications of domain decomposition methods to problems with multiple or repeated right hand sides. Basically, we formulate the overall problem as a series of minimization problems over K-orthogonal and supplementary subspaces, and tailor the preconditioned conjugate gradient algorithm to solve them efficiently. The resulting solution method is scalable, whereas direct factorization schemes and forward and backward substitution algorithms are not. We illustrate the proposed methodology with the solution of static and dynamic structural problems, and highlight its potential to outperform forward and backward substitutions on parallel computers. 
As an example, we show that for a linear structural dynamics problem with 11640 degrees of freedom, every time-step beyond time-step 15 is solved in a single iteration and consumes 1.0 second on a 32 processor iPSC-860 system; for the same problem and the same parallel processor, a pair of forward/backward substitutions at each step consumes 15.0 seconds.
Gust Acoustics Computation with a Space-Time CE/SE Parallel 3D Solver

NASA Technical Reports Server (NTRS)

Wang, X. Y.; Himansu, A.; Chang, S. C.; Jorgenson, P. C. E.; Reddy, D. R. (Technical Monitor)

2002-01-01

The benchmark Problem 2 in Category 3 of the Third Computational Aero-Acoustics (CAA) Workshop is solved using the space-time conservation element and solution element (CE/SE) method. This problem concerns the unsteady response of an isolated finite-span swept flat-plate airfoil bounded by two parallel walls to an incident gust. The acoustic field generated by the interaction of the gust with the flat-plate airfoil is computed by solving the 3D (three-dimensional) Euler equations in the time domain using a parallel version of a 3D CE/SE solver. The effect of the gust orientation on the far-field directivity is studied. Numerical solutions are presented and compared with analytical solutions, showing a reasonable agreement.
Solving Coupled Gross-Pitaevskii Equations on a Cluster of PlayStation 3 Computers

NASA Astrophysics Data System (ADS)

Edwards, Mark; Heward, Jeffrey; Clark, C. W.

2009-05-01

At Georgia Southern University we have constructed an 8+1-node cluster of Sony PlayStation 3 (PS3) computers with the intention of using this computing resource to solve problems related to the behavior of ultra-cold atoms in general, with a particular emphasis on studying Bose-Bose and Bose-Fermi mixtures confined in optical lattices. As a first project that uses this computing resource, we have implemented a parallel solver of the coupled time-dependent, one-dimensional Gross-Pitaevskii (TDGP) equations. These equations govern the behavior of dual-species bosonic mixtures. We chose the split-operator/FFT method to solve the coupled 1D TDGP equations. The fast Fourier transform component of this solver can be readily parallelized on the PS3 CPU, known as the Cell Broadband Engine (CellBE). Each CellBE chip contains a single 64-bit PowerPC Processor Element, known as the PPE, and eight "Synergistic Processor Elements", known as SPEs. We report on this algorithm and compare its performance to a non-parallel solver as applied to modeling evaporative cooling in dual-species bosonic mixtures.
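A split-operator/FFT time step for two coupled 1D Gross-Pitaevskii equations can be sketched as follows. This is a serial, pure-Python illustration and not the authors' CellBE code. It assumes dimensionless units (hbar = m = 1), a harmonic trap, Strang splitting, and hypothetical coupling constants `g11`, `g22`, `g12`. The small recursive FFT stands in for the component that the abstract says can be parallelized.

```python
import cmath
import math


def fft(a, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], inverse)
    odd = fft(a[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def kinetic_step(psi, k, dt):
    """Apply exp(-i k^2 dt / 2) in Fourier space."""
    n = len(psi)
    psi_k = fft(psi)
    psi_k = [c * cmath.exp(-0.5j * dt * kk * kk) for c, kk in zip(psi_k, k)]
    return [v / n for v in fft(psi_k, inverse=True)]


def potential_half_step(psi1, psi2, V, g11, g22, g12, dt):
    """Apply the trap plus mean-field phases for half a time step."""
    out1, out2 = [], []
    for a, b, v in zip(psi1, psi2, V):
        n1, n2 = abs(a) ** 2, abs(b) ** 2
        out1.append(a * cmath.exp(-0.5j * dt * (v + g11 * n1 + g12 * n2)))
        out2.append(b * cmath.exp(-0.5j * dt * (v + g12 * n1 + g22 * n2)))
    return out1, out2


def split_step(psi1, psi2, V, k, g11, g22, g12, dt):
    """One Strang step: half potential, full kinetic, half potential."""
    psi1, psi2 = potential_half_step(psi1, psi2, V, g11, g22, g12, dt)
    psi1, psi2 = kinetic_step(psi1, k, dt), kinetic_step(psi2, k, dt)
    return potential_half_step(psi1, psi2, V, g11, g22, g12, dt)


def norm(psi, dx):
    return sum(abs(c) ** 2 for c in psi) * dx


# Hypothetical grid, trap, and initial states (two displaced Gaussians).
n, length = 64, 16.0
dx = length / n
x = [-length / 2 + j * dx for j in range(n)]
k = [2 * math.pi * (j if j < n // 2 else j - n) / length for j in range(n)]
V = [0.5 * xi * xi for xi in x]
psi1 = [complex(math.exp(-(xi - 1.0) ** 2)) for xi in x]
psi2 = [complex(math.exp(-(xi + 1.0) ** 2)) for xi in x]
norm1_start, norm2_start = norm(psi1, dx), norm(psi2, dx)
for _ in range(20):
    psi1, psi2 = split_step(psi1, psi2, V, k, 1.0, 1.0, 0.5, 0.01)
norm1_end, norm2_end = norm(psi1, dx), norm(psi2, dx)
```

Every sub-step multiplies by a pure phase, either in real space or in Fourier space, so the norm of each species is conserved up to rounding. That makes the norm a convenient sanity check for an implementation.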
Impact of the implementation of MPI point-to-point communications on the performance of two general sparse solvers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Amestoy, Patrick R.; Duff, Iain S.; L'Excellent, Jean-Yves

2001-10-10

We examine the mechanics of the send and receive mechanism of MPI and in particular how we can implement message passing in a robust way so that our performance is not significantly affected by changes to the MPI system. This leads us to using the Isend/Irecv protocol, which will sometimes entail significant algorithmic changes. We discuss this within the context of two different algorithms for sparse Gaussian elimination that we have parallelized. One is a multifrontal solver called MUMPS, the other is a supernodal solver called SuperLU. Both algorithms are difficult to parallelize on distributed memory machines. Our initial strategies were based on simple MPI point-to-point communication primitives. With such approaches, the parallel performance of both codes is very sensitive to the MPI implementation, in particular to the way MPI internal buffers are used. We then modified our codes to use more sophisticated nonblocking versions of MPI communication. This significantly improved the performance robustness (independent of the MPI buffering mechanism) and scalability, but at the cost of increased code complexity.
Three-dimensional Finite Element Formulation and Scalable Domain Decomposition for High Fidelity Rotor Dynamic Analysis

NASA Technical Reports Server (NTRS)

Datta, Anubhav; Johnson, Wayne R.

2009-01-01

This paper has two objectives. The first objective is to formulate a 3-dimensional Finite Element Model for the dynamic analysis of helicopter rotor blades. The second objective is to implement and analyze a dual-primal iterative substructuring based Krylov solver, that is parallel and scalable, for the solution of the 3-D FEM analysis. The numerical and parallel scalability of the solver is studied using two prototype problems - one for ideal hover (symmetric) and one for a transient forward flight (non-symmetric) - both carried out on up to 48 processors. In both hover and forward flight conditions, a perfect linear speed-up is observed, for a given problem size, up to the point of substructure optimality. Substructure optimality and the linear parallel speed-up range are both shown to depend on the problem size as well as on the selection of the coarse problem. With a larger problem size, linear speed-up is restored up to the new substructure optimality. The solver also scales with problem size - even though this conclusion is premature given the small prototype grids considered in this study.
Progress report on PIXIE3D, a fully implicit 3D extended MHD solver

NASA Astrophysics Data System (ADS)

Chacon, Luis

2008-11-01

Recently, in an invited talk at DPP07, an optimal, massively parallel implicit algorithm for 3D resistive magnetohydrodynamics (PIXIE3D) was demonstrated. Excellent algorithmic and parallel results were obtained with up to 4096 processors and 138 million unknowns. While this is a remarkable result, further developments are still needed for PIXIE3D to become a 3D extended MHD production code in general geometries. In this poster, we present an update on the status of PIXIE3D on several fronts. On the physics side, we will describe our progress towards the full Braginskii model, including electron Hall terms, anisotropic heat conduction, and gyroviscous corrections. Algorithmically, we will discuss progress towards a robust, optimal, nonlinear solver for arbitrary geometries, including preconditioning for the new physical effects described, the implementation of a coarse processor-grid solver (to maintain optimal algorithmic performance for an arbitrarily large number of processors in massively parallel computations), and of a multiblock capability to deal with complicated geometries. L. Chacón, Phys. Plasmas 15, 056103 (2008).
BRAIN initiative: fast and parallel solver for real-time monitoring of the eddy current in the brain for TMS applications.

PubMed

Sabouni, Abas; Pouliot, Philippe; Shmuel, Amir; Lesage, Frederic

2014-01-01

This paper introduces a fast and efficient solver for simulating the induced (eddy) current distribution in the brain during a transcranial magnetic stimulation (TMS) procedure. This solver has been integrated with MRI and neuronavigation software to accurately model the electromagnetic field and show the eddy current in the head almost in real time. To examine the performance of the proposed technique, we used a 3D anatomically accurate MRI model of a 25-year-old female subject.
Parameter investigation with line-implicit lower-upper symmetric Gauss-Seidel on 3D stretched grids

NASA Astrophysics Data System (ADS)

Otero, Evelyn; Eliasson, Peter

2015-03-01

An implicit lower-upper symmetric Gauss-Seidel (LU-SGS) solver has been implemented as a multigrid smoother combined with a line-implicit method as an acceleration technique for Reynolds-averaged Navier-Stokes (RANS) simulation on stretched meshes. The computational fluid dynamics code concerned is Edge, an edge-based finite volume Navier-Stokes flow solver for structured and unstructured grids. The paper focuses on the investigation of the parameters related to our novel line-implicit LU-SGS solver for convergence acceleration on 3D RANS meshes. The LU-SGS parameters are defined as the Courant-Friedrichs-Lewy number, the left-hand side dissipation, and the convergence of iterative solution of the linear problem arising from the linearisation of the implicit scheme. The influence of these parameters on the overall convergence is presented and default values are defined for maximum convergence acceleration. The optimised settings are applied to 3D RANS computations for comparison with explicit and line-implicit Runge-Kutta smoothing. For most of the cases, a computing time acceleration of the order of 2 is found depending on the mesh type, namely the boundary layer and the magnitude of residual reduction.

An Overview of Ares-I CFD Ascent Aerodynamic Data Development And Analysis Based on USM3D

NASA Technical Reports Server (NTRS)

Abdol-Hamid, Khaled S.; Ghaffari, Farhad; Parlette, Edward B.

2011-01-01

An overview of the computational results obtained from the NASA Langley developed unstructured grid, Reynolds-averaged Navier-Stokes flow solver USM3D, in support of the Ares-I project within NASA's Constellation program, is presented. The numerical data are obtained for representative flow conditions pertinent to the ascent phase of the trajectory at both wind tunnel and flight Reynolds numbers without including any propulsion effects. The USM3D flow solver has been designated to have the primary role within the Ares-I project in developing the computational aerodynamic data for the vehicle, while other flow solvers, namely OVERFLOW and FUN3D, have supporting roles to provide complementary results for fewer cases as part of the verification process to ensure code-to-code solution consistency. Similarly, as part of the solution validation efforts, the predicted numerical results are correlated with the aerodynamic wind tunnel data that have been generated within the project in the past few years. Sample aerodynamic results and the processes established for the computational solution/data development for the evolving Ares-I design cycles are presented.
An implicit numerical scheme for the simulation of internal viscous flows on unstructured grids

NASA Technical Reports Server (NTRS)

Jorgenson, Philip C. E.; Pletcher, Richard H.

1994-01-01

The Navier-Stokes equations are solved numerically for two-dimensional steady viscous laminar flows. The grids are generated based on the method of Delaunay triangulation. A finite-volume approach is used to discretize the conservation law form of the compressible flow equations written in terms of primitive variables. A preconditioning matrix is added to the equations so that low Mach number flows can be solved economically. The equations are time marched using either an implicit Gauss-Seidel iterative procedure or a solver based on a conjugate gradient like method. A four color scheme is employed to vectorize the block Gauss-Seidel relaxation procedure. This increases the memory requirements minimally and decreases the computer time spent solving the resulting system of equations substantially. A factor of 7.6 speed up in the matrix solver is typical for the viscous equations. Numerical results are obtained for inviscid flow over a bump in a channel at subsonic and transonic conditions for validation with structured solvers. Viscous results are computed for developing flow in a channel, a symmetric sudden expansion, periodic tandem cylinders in a cross-flow, and a four-port valve. Comparisons are made with available results obtained by other investigators.
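The coloring idea behind a vectorizable Gauss-Seidel sweep can be sketched in a few lines. This is an illustration under stated assumptions, not the scheme of the paper: it uses scalar rather than block unknowns, a simple greedy coloring (which need not produce exactly four colors), a dense matrix with a symmetric sparsity pattern, and a hypothetical tridiagonal test system. Unknowns of one color do not couple to each other, so all updates within a color are independent and could be executed as one vector operation.

```python
def greedy_coloring(A):
    """Give each unknown the smallest color not used by its neighbors."""
    n = len(A)
    color = [-1] * n
    for i in range(n):
        used = {color[j] for j in range(n)
                if j != i and A[i][j] != 0.0 and color[j] >= 0}
        c = 0
        while c in used:
            c += 1
        color[i] = c
    return color


def multicolor_gauss_seidel(A, b, x, color, sweeps):
    """Gauss-Seidel relaxation that visits the unknowns color by color."""
    n = len(A)
    groups = {}
    for i, c in enumerate(color):
        groups.setdefault(c, []).append(i)
    for _ in range(sweeps):
        for c in sorted(groups):
            # Same-colored unknowns are not coupled, so every update in
            # this list reads only values belonging to other colors.
            updates = [
                (b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
                for i in groups[c]
            ]
            for i, value in zip(groups[c], updates):
                x[i] = value
    return x


# Hypothetical diagonally dominant tridiagonal system; exact solution is all ones.
n = 6
A = [[4.0 if i == j else (-1.0 if abs(i - j) == 1 else 0.0) for j in range(n)]
     for i in range(n)]
b = [sum(row) for row in A]
color = greedy_coloring(A)
x = multicolor_gauss_seidel(A, b, [0.0] * n, color, sweeps=40)
```

For this chain-like test matrix the greedy pass yields two colors (a red-black ordering). On a triangular mesh more colors are needed, which is consistent with the four-color scheme mentioned in the abstract.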
A Fast and Robust Poisson-Boltzmann Solver Based on Adaptive Cartesian Grids

PubMed Central

Boschitsch, Alexander H.; Fenley, Marcia O.

2011-01-01

An adaptive Cartesian grid (ACG) concept is presented for the fast and robust numerical solution of the 3D Poisson-Boltzmann Equation (PBE) governing the electrostatic interactions of large-scale biomolecules and highly charged multi-biomolecular assemblies such as ribosomes and viruses. The ACG offers numerous advantages over competing grid topologies such as regular 3D lattices and unstructured grids. For very large biological molecules and multi-biomolecule assemblies, the total number of grid-points is several orders of magnitude less than that required in a conventional lattice grid used in the current PBE solvers thus allowing the end user to obtain accurate and stable nonlinear PBE solutions on a desktop computer. Compared to tetrahedral-based unstructured grids, ACG offers a simpler hierarchical grid structure, which is naturally suited to multigrid, relieves indirect addressing requirements and uses fewer neighboring nodes in the finite difference stencils. Construction of the ACG and determination of the dielectric/ionic maps are straightforward, fast and require minimal user intervention. Charge singularities are eliminated by reformulating the problem to produce the reaction field potential in the molecular interior and the total electrostatic potential in the exterior ionic solvent region. This approach minimizes grid-dependency and alleviates the need for fine grid spacing near atomic charge sites. The technical portion of this paper contains three parts. First, the ACG and its construction for general biomolecular geometries are described. Next, a discrete approximation to the PBE upon this mesh is derived. Finally, the overall solution procedure and multigrid implementation are summarized. 
Results obtained with the ACG-based PBE solver are presented for: (i) a low dielectric spherical cavity, containing interior point charges, embedded in a high dielectric ionic solvent – analytical solutions are available for this case, thus allowing rigorous assessment of the solution accuracy; (ii) a pair of low dielectric charged spheres embedded in an ionic solvent to compute electrostatic interaction free energies as a function of the distance between sphere centers; (iii) surface potentials of proteins, nucleic acids and their larger-scale assemblies such as ribosomes; and (iv) electrostatic solvation free energies and their salt sensitivities – obtained with both linear and nonlinear Poisson-Boltzmann equations – for a large set of proteins. These latter results along with timings can serve as benchmarks for comparing the performance of different PBE solvers. PMID:21984876
DOE Office of Scientific and Technical Information (OSTI.GOV)

Kargupta, H.; Stafford, B.; Hamzaoglu, I.

This paper describes an experimental parallel/distributed data mining system PADMA (PArallel Data Mining Agents) that uses software agents for local data accessing and analysis and a web based interface for interactive data visualization. It also presents the results of applying PADMA for detecting patterns in unstructured texts of postmortem reports and laboratory test data for Hepatitis C patients.
CPU/GPU Computing for an Implicit Multi-Block Compressible Navier-Stokes Solver on Heterogeneous Platform

NASA Astrophysics Data System (ADS)

Deng, Liang; Bai, Hanli; Wang, Fang; Xu, Qingxin

2016-06-01

CPU/GPU computing allows scientists to tremendously accelerate their numerical codes. In this paper, we port and optimize a double precision alternating direction implicit (ADI) solver for three-dimensional compressible Navier-Stokes equations from our in-house Computational Fluid Dynamics (CFD) software on a heterogeneous platform. First, we implement a full GPU version of the ADI solver to remove redundant data transfers between CPU and GPU, and then design two fine-grain schemes, namely “one-thread-one-point” and “one-thread-one-line”, to maximize the performance. Second, we present a dual-level parallelization scheme using the CPU/GPU collaborative model to exploit the computational resources of both multi-core CPUs and many-core GPUs within the heterogeneous platform. Finally, considering the fact that memory on a single node becomes inadequate when the simulation size grows, we present a tri-level hybrid programming pattern MPI-OpenMP-CUDA that merges fine-grain parallelism using OpenMP and CUDA threads with coarse-grain parallelism using MPI for inter-node communication. We also propose a strategy to overlap the computation with communication using the advanced features of CUDA and MPI programming. We obtain a speedup of 6.0 for the ADI solver on one Tesla M2050 GPU compared with two Xeon X5670 CPUs. Scalability tests show that our implementation can offer significant performance improvement on a heterogeneous platform.
Accelerating finite-rate chemical kinetics with coprocessors: Comparing vectorization methods on GPUs, MICs, and CPUs

NASA Astrophysics Data System (ADS)

Stone, Christopher P.; Alferman, Andrew T.; Niemeyer, Kyle E.

2018-05-01

Accurate and efficient methods for solving stiff ordinary differential equations (ODEs) are a critical component of turbulent combustion simulations with finite-rate chemistry. The ODEs governing the chemical kinetics at each mesh point are decoupled by operator-splitting allowing each to be solved concurrently. An efficient ODE solver must then take into account the available thread and instruction-level parallelism of the underlying hardware, especially on many-core coprocessors, as well as the numerical efficiency. A stiff Rosenbrock and a nonstiff Runge-Kutta ODE solver are both implemented using the single instruction, multiple thread (SIMT) and single instruction, multiple data (SIMD) paradigms within OpenCL. Both methods solve multiple ODEs concurrently within the same instruction stream. The performance of these parallel implementations was measured on three chemical kinetic models of increasing size across several multicore and many-core platforms. Two separate benchmarks were conducted to clearly determine any performance advantage offered by either method. The first benchmark measured the run-time of evaluating the right-hand-side source terms in parallel and the second benchmark integrated a series of constant-pressure, homogeneous reactors using the Rosenbrock and Runge-Kutta solvers. The right-hand-side evaluations with SIMD parallelism on the host multicore Xeon CPU and many-core Xeon Phi co-processor performed approximately three times faster than the baseline multithreaded C++ code. The SIMT parallel model on the host and Phi was 13%-35% slower than the baseline while the SIMT model on the NVIDIA Kepler GPU provided approximately the same performance as the SIMD model on the Phi. The runtimes for both ODE solvers decreased significantly with the SIMD implementations on the host CPU (2.5-2.7 ×) and Xeon Phi coprocessor (4.7-4.9 ×) compared to the baseline parallel code. 
The SIMT implementations on the GPU ran 1.5-1.6 times faster than the baseline multithreaded CPU code; however, this was significantly slower than the SIMD versions on the host CPU or the Xeon Phi. The performance difference between the three platforms was attributed to thread divergence caused by the adaptive step-sizes within the ODE integrators. Analysis showed that the wider vector width of the GPU incurs a higher level of divergence than the narrower Sandy Bridge or Xeon Phi. The significant performance improvement provided by the SIMD parallel strategy motivates further research into more ODE solver methods that are both SIMD-friendly and computationally efficient.
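The link between adaptive step sizes, divergence, and vector width can be illustrated with a toy cost model. This is not the OpenCL code or the kinetic models of the paper: it assumes scalar test problems y' = -k*y with hypothetical rate constants, an embedded Euler/Heun pair as the adaptive integrator, and a lockstep model in which every lane of a vector group must wait for the slowest lane.

```python
def adaptive_step_count(k, t_end=1.0, tol=1e-4):
    """Count attempted steps of an embedded Euler/Heun pair on y' = -k*y."""
    t, y, h, steps = 0.0, 1.0, 0.1, 0
    while t_end - t > 1e-12:
        h = min(h, t_end - t)
        f0 = -k * y
        y_euler = y + h * f0
        y_heun = y + 0.5 * h * (f0 - k * y_euler)
        err = abs(y_heun - y_euler)
        steps += 1
        if err <= tol:  # accept the step
            t, y = t + h, y_heun
        # Standard step-size controller for a first-order error estimate.
        h *= min(2.0, max(0.2, 0.9 * (tol / max(err, 1e-16)) ** 0.5))
    return steps


def lockstep_efficiency(step_counts, width):
    """Useful steps divided by issued lane-steps when `width` ODEs run in lockstep."""
    useful = sum(step_counts)
    issued = 0
    for start in range(0, len(step_counts), width):
        group = step_counts[start:start + width]
        issued += width * max(group)  # every lane waits for the slowest one
    return useful / issued


# Hypothetical rate constants spanning mild to fairly stiff decay.
rates = [1.0, 50.0, 2.0, 200.0, 5.0, 10.0, 100.0, 20.0]
counts = [adaptive_step_count(k) for k in rates]
efficiency = {w: lockstep_efficiency(counts, w) for w in (1, 4, 8)}
for w in (1, 4, 8):
    print("vector width", w, "efficiency", round(efficiency[w], 3))
```

With a width of 1 nothing is wasted. When two groups are merged into one wider group, all lanes run for the maximum step count of the merged group, so the efficiency in this model cannot increase. This mirrors the abstract's observation that a wider vector width incurs more divergence.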
Lattice Boltzmann Model of 3D Multiphase Flow in Artery Bifurcation Aneurysm Problem

PubMed Central

Abas, Aizat; Mokhtar, N. Hafizah; Ishak, M. H. H.; Abdullah, M. Z.; Ho Tian, Ang

2016-01-01

This paper simulates and predicts the laminar flow inside the 3D aneurysm geometry, since the hemodynamic situation in the blood vessels is difficult to determine and visualize using standard imaging techniques, for example, magnetic resonance imaging (MRI). Three different types of Lattice Boltzmann (LB) models are computed, namely, single relaxation time (SRT), multiple relaxation time (MRT), and regularized BGK models. The results obtained using these different versions of the LB-based code will then be validated with ANSYS FLUENT, a commercially available finite volume- (FV-) based CFD solver. The simulated flow profiles that include velocity, pressure, and wall shear stress (WSS) are then compared between the two solvers. The predicted outcomes show that all the LB models are comparable and in good agreement with the FVM solver for complex blood flow simulation. The findings also show minor differences in their WSS profiles. The performance of the parallel implementation for each solver is also included and discussed in this paper. In terms of parallelization, it was shown that LBM-based code performed better in terms of the computation time required. PMID:27239221
An Efficient Multicore Implementation of a Novel HSS-Structured Multifrontal Solver Using Randomized Sampling

DOE PAGES

Ghysels, Pieter; Li, Xiaoye S.; Rouet, Francois -Henry; ...

2016-10-27

Here, we present a sparse linear system solver that is based on a multifrontal variant of Gaussian elimination and exploits low-rank approximation of the resulting dense frontal matrices. We use hierarchically semiseparable (HSS) matrices, which have low-rank off-diagonal blocks, to approximate the frontal matrices. For HSS matrix construction, a randomized sampling algorithm is used together with interpolative decompositions. The combination of the randomized compression with a fast ULV HSS factorization leads to a solver with lower computational complexity than the standard multifrontal method for many applications, resulting in speedups of up to 7-fold for problems in our test suite. The implementation targets many-core systems by using task parallelism with dynamic runtime scheduling. Numerical experiments show performance improvements over state-of-the-art sparse direct solvers. The implementation achieves high performance and good scalability on a range of modern shared memory parallel systems, including the Intel Xeon Phi (MIC). The code is part of a software package called STRUMPACK - STRUctured Matrices PACKage, which also has a distributed memory component for dense rank-structured matrices.
Novel Scalable 3-D MT Inverse Solver

NASA Astrophysics Data System (ADS)

Kuvshinov, A. V.; Kruglyakov, M.; Geraskin, A.

2016-12-01

We present a new, robust and fast, three-dimensional (3-D) magnetotelluric (MT) inverse solver. As a forward modelling engine a highly-scalable solver extrEMe [1] is used. The (regularized) inversion is based on an iterative gradient-type optimization (quasi-Newton method) and exploits adjoint sources approach for fast calculation of the gradient of the misfit. The inverse solver is able to deal with highly detailed and contrasting models, allows for working (separately or jointly) with any type of MT (single-site and/or inter-site) responses, and supports massive parallelization. Different parallelization strategies implemented in the code allow for optimal usage of available computational resources for a given problem set up. To parameterize an inverse domain a mask approach is implemented, which means that one can merge any subset of forward modelling cells in order to account for (usually) irregular distribution of observation sites. We report results of 3-D numerical experiments aimed at analysing the robustness, performance and scalability of the code. In particular, our computational experiments carried out at different platforms ranging from modern laptops to high-performance clusters demonstrate practically linear scalability of the code up to thousands of nodes. 1. Kruglyakov, M., A. Geraskin, A. Kuvshinov, 2016. Novel accurate and scalable 3-D MT forward solver based on a contracting integral equation method, Computers and Geosciences, in press.
Scalable High Performance Computing: Direct and Large-Eddy Turbulent Flow Simulations Using Massively Parallel Computers

NASA Technical Reports Server (NTRS)

Morgan, Philip E.

2004-01-01

This final report contains reports of research related to the tasks "Scalable High Performance Computing: Direct and Large-Eddy Turbulent Flow Simulations Using Massively Parallel Computers" and "Develop High-Performance Time-Domain Computational Electromagnetics Capability for RCS Prediction, Wave Propagation in Dispersive Media, and Dual-Use Applications". The discussion of Scalable High Performance Computing reports on three objectives: validate, assess scalability of, and apply two parallel flow solvers for three-dimensional Navier-Stokes flows; develop and validate a high-order parallel solver for Direct Numerical Simulation (DNS) and Large Eddy Simulation (LES) problems; and investigate and develop a high-order Reynolds-averaged Navier-Stokes turbulence model. The discussion of High-Performance Time-Domain Computational Electromagnetics reports on five objectives: enhance an electromagnetics code (CHARGE) to be able to effectively model antenna problems; apply lessons learned in the high-order/spectral solution of swirling 3D jets to the electromagnetics project; transition a high-order fluids code, FDL3DI, to be able to solve Maxwell's equations using compact differencing; develop and demonstrate improved radiation absorbing boundary conditions for high-order CEM; and extend the high-order CEM solver to address variable material properties. The report also contains a review of work done by the systems engineer.
Multiphase three-dimensional direct numerical simulation of a rotating impeller with code Blue

NASA Astrophysics Data System (ADS)

Kahouadji, Lyes; Shin, Seungwon; Chergui, Jalel; Juric, Damir; Craster, Richard V.; Matar, Omar K.

2017-11-01

The flow driven by a rotating impeller inside an open fixed cylindrical cavity is simulated using code Blue, a solver for massively-parallel simulations of fully three-dimensional multiphase flows. The impeller is composed of four blades at a 45° inclination all attached to a central hub and tube stem. In Blue, solid forms are constructed through the definition of immersed objects via a distance function that accounts for the object's interaction with the flow for both single and two-phase flows. We use a moving frame technique for imposing translation and/or rotation. The variation of the Reynolds number, the clearance, and the tank aspect ratio are considered, and we highlight the importance of the confinement ratio (blade radius versus the tank radius) in the mixing process. Blue uses a domain decomposition strategy for parallelization with MPI. The fluid interface solver is based on a parallel implementation of a hybrid front-tracking/level-set method designed for complex interfacial topological changes. Parallel GMRES and multigrid iterative solvers are applied to the linear systems arising from the implicit solution for the fluid velocities and pressure in the presence of strong density and viscosity discontinuities across fluid phases. EPSRC, UK, MEMPHIS program Grant (EP/K003976/1), RAEng Research Chair (OKM).
Adaptive Mesh Refinement in Curvilinear Body-Fitted Grid Systems

NASA Technical Reports Server (NTRS)

Steinthorsson, Erlendur; Modiano, David; Colella, Phillip

1995-01-01

To be truly compatible with structured grids, an AMR algorithm should employ a block structure for the refined grids to allow flow solvers to take advantage of the strengths of structured grid systems, such as efficient solution algorithms for implicit discretizations and multigrid schemes. One such algorithm, the AMR algorithm of Berger and Colella, has been applied to and adapted for use with body-fitted structured grid systems. Results are presented for a transonic flow over a NACA0012 airfoil (AGARD-03 test case) and a reflection of a shock over a double wedge.
An Upwind Multigrid Algorithm for Calculating Flows on Unstructured Grids

NASA Technical Reports Server (NTRS)

Bonhaus, Daryl L.

1993-01-01

An algorithm is described that calculates inviscid, laminar, and turbulent flows on triangular meshes with an upwind discretization. A brief description of the base solver and the multigrid implementation is given, followed by results that consist mainly of convergence rates for inviscid and viscous flows over a NACA four-digit airfoil section. The results show that multigrid does accelerate convergence when the same relaxation parameters that yield good single-grid performance are used; however, larger gains in performance can be realized by doing less work in the relaxation scheme.
Adaptive Meshing Techniques for Viscous Flow Calculations on Mixed Element Unstructured Meshes

NASA Technical Reports Server (NTRS)

Mavriplis, D. J.

1997-01-01

An adaptive refinement strategy based on hierarchical element subdivision is formulated and implemented for meshes containing arbitrary mixtures of tetrahedra, hexahedra, prisms and pyramids. Special attention is given to keeping memory overheads as low as possible. This procedure is coupled with an algebraic multigrid flow solver which operates on mixed-element meshes. Inviscid flows as well as viscous flows are computed on adaptively refined tetrahedral, hexahedral, and hybrid meshes. The efficiency of the method is demonstrated by generating an adapted hexahedral mesh containing 3 million vertices on a relatively inexpensive workstation.
Proteus-MOC: A 3D deterministic solver incorporating 2D method of characteristics

DOE Office of Scientific and Technical Information (OSTI.GOV)

Marin-Lafleche, A.; Smith, M. A.; Lee, C.

2013-07-01

A new transport solution methodology was developed by combining the two-dimensional method of characteristics with the discontinuous Galerkin method for the treatment of the axial variable. The method, which can be applied to arbitrary extruded geometries, was implemented in PROTEUS-MOC and includes parallelization in group, angle, plane, and space using a top level GMRES linear algebra solver. Verification tests were performed to show accuracy and stability of the method with the increased number of angular directions and mesh elements. Good scalability with parallelism in angle and axial planes is displayed. (authors)
Design of Unstructured Adaptive (UA) NAS Parallel Benchmark Featuring Irregular, Dynamic Memory Accesses

NASA Technical Reports Server (NTRS)

Feng, Hui-Yu; VanderWijngaart, Rob; Biswas, Rupak; Biegel, Bryan (Technical Monitor)

2001-01-01

We describe the design of a new method for the measurement of the performance of modern computer systems when solving scientific problems featuring irregular, dynamic memory accesses. The method involves the solution of a stylized heat transfer problem on an unstructured, adaptive grid. A Spectral Element Method (SEM) with an adaptive, nonconforming mesh is selected to discretize the transport equation. The relatively high order of the SEM lowers the fraction of wall clock time spent on inter-processor communication, which eases the load balancing task and allows us to concentrate on the memory accesses. The benchmark is designed to be three-dimensional. Parallelization and load balance issues of a reference implementation will be described in detail in future reports.
Discrete sensitivity derivatives of the Navier-Stokes equations with a parallel Krylov solver

NASA Technical Reports Server (NTRS)

Ajmani, Kumud; Taylor, Arthur C., III

1994-01-01

This paper solves an 'incremental' form of the sensitivity equations derived by differentiating the discretized thin-layer Navier Stokes equations with respect to certain design variables of interest. The equations are solved with a parallel, preconditioned Generalized Minimal RESidual (GMRES) solver on a distributed-memory architecture. The 'serial' sensitivity analysis code is parallelized by using the Single Program Multiple Data (SPMD) programming model, domain decomposition techniques, and message-passing tools. Sensitivity derivatives are computed for low and high Reynolds number flows over a NACA 1406 airfoil on a 32-processor Intel Hypercube, and found to be identical to those computed on a single-processor Cray Y-MP. It is estimated that the parallel sensitivity analysis code has to be run on 40-50 processors of the Intel Hypercube in order to match the single-processor processing time of a Cray Y-MP.
Computational strategies for three-dimensional flow simulations on distributed computer systems. Ph.D. Thesis Semiannual Status Report, 15 Aug. 1993 - 15 Feb. 1994

NASA Technical Reports Server (NTRS)

Weed, Richard Allen; Sankar, L. N.

1994-01-01

An increasing amount of research activity in computational fluid dynamics has been devoted to the development of efficient algorithms for parallel computing systems. The increasing performance to price ratio of engineering workstations has led to research to develop procedures for implementing a parallel computing system composed of distributed workstations. This thesis proposal outlines an ongoing research program to develop efficient strategies for performing three-dimensional flow analysis on distributed computing systems. The PVM parallel programming interface was used to modify an existing three-dimensional flow solver, the TEAM code developed by Lockheed for the Air Force, to function as a parallel flow solver on clusters of workstations. Steady flow solutions were generated for three different wing and body geometries to validate the code and evaluate code performance. The proposed research will extend the parallel code development to determine the most efficient strategies for unsteady flow simulations.
A matrix dependent/algebraic multigrid approach for extruded meshes with applications to ice sheet modeling

DOE PAGES

Tuminaro, Raymond S.; Perego, Mauro; Tezaur, Irina Kalashnikova; ...

2016-10-06

A multigrid method is proposed that combines ideas from matrix dependent multigrid for structured grids and algebraic multigrid for unstructured grids. It targets problems where a three-dimensional mesh can be viewed as an extrusion of a two-dimensional, unstructured mesh in a third dimension. Our motivation comes from the modeling of thin structures via finite elements and, more specifically, the modeling of ice sheets. Extruded meshes are relatively common for thin structures and often give rise to anisotropic problems when the thin direction mesh spacing is much smaller than the broad direction mesh spacing. Within our approach, the first few multigrid hierarchy levels are obtained by applying matrix dependent multigrid to semicoarsen in a structured thin direction fashion. After sufficient structured coarsening, the resulting mesh contains only a single layer corresponding to a two-dimensional, unstructured mesh. Algebraic multigrid can then be employed in a standard manner to create further coarse levels, as the anisotropic phenomena are no longer present in the single layer problem. The overall approach remains fully algebraic, with the minor exception that some additional information is needed to determine the extruded direction. Furthermore, this facilitates integration of the solver with a variety of different extruded mesh applications.
Xyce

DOE Office of Scientific and Technical Information (OSTI.GOV)

Thornquist, Heidi K.; Fixel, Deborah A.; Fett, David Brian

The Xyce Parallel Electronic Simulator simulates electronic circuit behavior in DC, AC, HB, MPDE and transient mode using standard analog (DAE) and/or device (PDE) models, including several age- and radiation-aware devices. It supports a variety of computing platforms, both serial and parallel. Lastly, it uses a variety of modern solution algorithms, such as dynamic parallel load balancing and iterative solvers.

Parallelization of elliptic solver for solving 1D Boussinesq model

NASA Astrophysics Data System (ADS)

Tarwidi, D.; Adytia, D.

2018-03-01

In this paper, a parallel implementation of an elliptic solver for the 1D Boussinesq model is presented. The numerical solution of the Boussinesq model is obtained by applying a staggered grid scheme to the continuity, momentum, and elliptic equations of the model. The tridiagonal system arising from the numerical scheme for the elliptic equation is solved by the cyclic reduction algorithm. The parallel implementation of cyclic reduction is executed on multicore processors with shared memory architectures using OpenMP. To measure the performance of the parallel program, the number of grid points is varied from 2^8 to 2^14. Two numerical test cases, i.e. propagation of a solitary wave and of a standing wave, are proposed to evaluate the parallel program. The numerical results are verified against the analytical solutions for the solitary and standing waves. The best speedup for the solitary and standing wave test cases is about 2.07 with 2^14 grid points and 1.86 with 2^13 grid points, respectively, both obtained using 8 threads. Moreover, the best efficiency of the parallel program is 76.2% and 73.5% for the solitary and standing wave test cases, respectively.
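
The cyclic reduction step named in this abstract can be sketched as follows. This is a minimal serial illustration in plain Python, not the authors' OpenMP code: the function name `cyclic_reduction` and the array layout are assumptions made here, and the sketch assumes a system (for example a diagonally dominant one) that needs no pivoting. Each pass of the forward-reduction loop is independent of the others, which is the property a shared-memory implementation would exploit.

```python
def cyclic_reduction(a, b, c, d):
    """Solve a tridiagonal system by recursive cyclic reduction.

    Row i reads a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i],
    with a[0] and c[-1] assumed to be zero (hypothetical layout).
    """
    n = len(b)
    if n == 1:
        return [d[0] / b[0]]

    # Forward reduction: eliminate the even-indexed unknowns from every
    # odd-indexed row. The loop iterations do not depend on each other.
    ra, rb, rc, rd = [], [], [], []
    for i in range(1, n, 2):
        has_next = i + 1 < n
        alpha = a[i] / b[i - 1]
        gamma = c[i] / b[i + 1] if has_next else 0.0
        ra.append(-alpha * a[i - 1])
        rb.append(b[i] - alpha * c[i - 1] - (gamma * a[i + 1] if has_next else 0.0))
        rc.append(-gamma * c[i + 1] if has_next else 0.0)
        rd.append(d[i] - alpha * d[i - 1] - (gamma * d[i + 1] if has_next else 0.0))

    # The odd-indexed unknowns satisfy a tridiagonal system of half the size.
    x = [0.0] * n
    x[1::2] = cyclic_reduction(ra, rb, rc, rd)

    # Back substitution: recover each even-indexed unknown from its own row.
    for j in range(0, n, 2):
        left = a[j] * x[j - 1] if j > 0 else 0.0
        right = c[j] * x[j + 1] if j + 1 < n else 0.0
        x[j] = (d[j] - left - right) / b[j]
    return x


# Usage: the (-1, 2, -1) stencil on 7 unknowns.
n = 7
a = [0.0] + [-1.0] * (n - 1)
b = [2.0] * n
c = [-1.0] * (n - 1) + [0.0]
d = [0.0] * (n - 1) + [8.0]
solution = cyclic_reduction(a, b, c, d)
```

The recursion halves the system at every level, so about log2(n) levels are needed; the limited work per level at the coarsest stages is one plausible reason speedup stays modest on small grids.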
An Overview of Spray Modeling With OpenNCC and its Application to Emissions Predictions of a LDI Combustor at High Pressure

NASA Technical Reports Server (NTRS)

Raju, M. S.

2016-01-01

The open national combustion code (OpenNCC) is developed with the aim of advancing the current multi-dimensional computational tools used in the design of advanced technology combustors. In this paper we provide an overview of the spray module, LSPRAY-V, developed as a part of this effort. The spray solver is mainly designed to predict the flow, thermal, and transport properties of a rapidly evaporating multi-component liquid spray. The modeling approach is applicable over a wide range of evaporating conditions (normal, superheat, and supercritical). The modeling approach is based on several well-established atomization, vaporization, and wall/droplet impingement models. It facilitates large-scale combustor computations through the use of massively parallel computers with the ability to perform the computations on either structured or unstructured grids. The spray module has a multi-liquid and multi-injector capability, and can be used in both steady and unsteady computations. We conclude the paper by providing the results for a reacting spray generated by a single injector element with 60° axially swept swirler vanes. It is a configuration based on the next-generation lean-direct injection (LDI) combustor concept. The results include comparisons for both combustor exit temperature and EINOX at three different fuel/air ratios.
Mathematical and Numerical Aspects of the Adaptive Fast Multipole Poisson-Boltzmann Solver

DOE PAGES

Zhang, Bo; Lu, Benzhuo; Cheng, Xiaolin; ...

2013-01-01

This paper summarizes the mathematical and numerical theories and computational elements of the adaptive fast multipole Poisson-Boltzmann (AFMPB) solver. We introduce and discuss the following components in order: the Poisson-Boltzmann model, boundary integral equation reformulation, surface mesh generation, the nodepatch discretization approach, Krylov iterative methods, the new version of fast multipole methods (FMMs), and a dynamic prioritization technique for scheduling parallel operations. For each component, we also remark on feasible approaches for further improvements in efficiency, accuracy and applicability of the AFMPB solver to large-scale long-time molecular dynamics simulations. Lastly, the potential of the solver is demonstrated with preliminary numerical results.
Computational Aeroacoustics by the Space-time CE/SE Method

NASA Technical Reports Server (NTRS)

Loh, Ching Y.

2001-01-01

In recent years, a new numerical methodology for conservation laws, the Space-Time Conservation Element and Solution Element Method (CE/SE), was developed by Dr. Chang of NASA Glenn Research Center and collaborators. In nature, the new method may be categorized as a finite volume method, where the conservation element (CE) is equivalent to a finite control volume (or cell) and the solution element (SE) can be understood as the cell interface. However, due to its rigorous treatment of the fluxes and geometry, it is different from the existing schemes. The CE/SE scheme features: (1) space and time treated on the same footing, the integral equations of conservation laws are solved for with second order accuracy, (2) high resolution, low dispersion and low dissipation, (3) novel, truly multi-dimensional, simple but effective non-reflecting boundary condition, (4) effortless implementation of computation, no numerical fix or parameter choice is needed, and (5) robust enough to cover a wide spectrum of compressible flow: from weak linear acoustic waves to strong, discontinuous waves (shocks) appropriate for linear and nonlinear aeroacoustics. Currently, the CE/SE scheme has been developed to such a stage that a 3-D unstructured CE/SE Navier-Stokes solver is already available. However, in the present paper, as a general introduction to the CE/SE method, only the 2-D unstructured Euler CE/SE solver is chosen as a prototype and is sketched in Section 2. Then applications of the CE/SE scheme to linear, nonlinear aeroacoustics and airframe noise are depicted in Sections 3, 4, and 5 respectively to demonstrate its robustness and capability.
Parallel language constructs for tensor product computations on loosely coupled architectures

NASA Technical Reports Server (NTRS)

Mehrotra, Piyush; Van Rosendale, John

1989-01-01

A set of language primitives designed to allow the specification of parallel numerical algorithms at a higher level is described. The authors focus on tensor product array computations, a simple but important class of numerical algorithms. They consider first the problem of programming one-dimensional kernel routines, such as parallel tridiagonal solvers, and then look at how such parallel kernels can be combined to form parallel tensor product algorithms.
Solvers for $\mathcal{O}(N)$ Electronic Structure in the Strong Scaling Limit

DOE PAGES

Bock, Nicolas; Challacombe, William M.; Kale, Laxmikant

2016-01-26

Here we present a hybrid OpenMP/Charm++ framework for solving the $\mathcal{O}(N)$ self-consistent-field eigenvalue problem with parallelism in the strong scaling regime, $P \gg N$, where $P$ is the number of cores, and $N$ is a measure of system size, i.e., the number of matrix rows/columns, basis functions, atoms, molecules, etc. This result is achieved with a nested approach to spectral projection and the sparse approximate matrix multiply [Bock and Challacombe, SIAM J. Sci. Comput., 35 (2013), pp. C72--C98], and involves a recursive, task-parallel algorithm, often employed by generalized $N$-Body solvers, to occlusion and culling of negligible products in the case of matrices with decay. Lastly, employing classic technologies associated with generalized $N$-Body solvers, including overdecomposition, recursive task parallelism, orderings that preserve locality, and persistence-based load balancing, we obtain scaling beyond hundreds of cores per molecule for small water clusters ([H$_2$O]$_N$, $N \in \{ 30, 90, 150 \}$, $P/N \approx \{ 819, 273, 164 \}$) and find support for an increasingly strong scalability with increasing system size $N$.
A performance study of sparse Cholesky factorization on INTEL iPSC/860

NASA Technical Reports Server (NTRS)

Zubair, M.; Ghose, M.

1992-01-01

The problem of Cholesky factorization of a sparse matrix has been very well investigated on sequential machines. A number of efficient codes exist for factorizing large unstructured sparse matrices. However, there is a lack of such efficient codes on parallel machines in general, and distributed machines in particular. Some of the issues that are critical to the implementation of sparse Cholesky factorization on a distributed memory parallel machine are ordering, partitioning and mapping, load balancing, and ordering of various tasks within a processor. Here, we focus on the effect of various partitioning schemes on the performance of sparse Cholesky factorization on the Intel iPSC/860. Also, a new partitioning heuristic for structured as well as unstructured sparse matrices is proposed, and its performance is compared with other schemes.
Parallel Nonnegative Least Squares Solvers for Model Order Reduction

DTIC Science & Technology

2016-03-01

NNLS problems that arise when the Energy Conserving Sampling and Weighting hyper-reduction procedure is used when constructing a reduced-order model...ScaLAPACK and performance results are presented. nonnegative least squares, model order reduction, hyper-reduction, Energy Conserving Sampling and...optimal solution. ........................................ 20 Table 6 Reduced mesh sizes produced for each solver in the ECSW hyper-reduction step
Parallel-vector solution of large-scale structural analysis problems on supercomputers

NASA Technical Reports Server (NTRS)

Storaasli, Olaf O.; Nguyen, Duc T.; Agarwal, Tarun K.

1989-01-01

A direct linear equation solution method based on the Choleski factorization procedure is presented which exploits both parallel and vector features of supercomputers. The new equation solver is described, and its performance is evaluated by solving structural analysis problems on three high-performance computers. The method has been implemented using Force, a generic parallel FORTRAN language.
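
For readers unfamiliar with the factorization this record builds on, a textbook serial Cholesky solve can be sketched as below. It is only an illustration in plain Python with dense list-of-lists storage; the function names `cholesky` and `cholesky_solve` are assumptions made here, and the sketch does not reproduce the parallel-vector organization or the Force language implementation described in the abstract.

```python
import math


def cholesky(A):
    """Return lower-triangular L with A = L * L^T for a symmetric positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        pivot = A[j][j] - sum(L[j][k] ** 2 for k in range(j))
        if pivot <= 0.0:
            raise ValueError("matrix is not positive definite")
        L[j][j] = math.sqrt(pivot)
        # Entries below the diagonal in column j are independent of each other.
        for i in range(j + 1, n):
            L[i][j] = (A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))) / L[j][j]
    return L


def cholesky_solve(A, rhs):
    """Solve A x = rhs by factorization, forward substitution, and back substitution."""
    n = len(A)
    L = cholesky(A)
    # Forward substitution: L y = rhs.
    y = [0.0] * n
    for i in range(n):
        y[i] = (rhs[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    # Back substitution: L^T x = y.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


# Usage: a small symmetric positive definite system.
A = [[4.0, 2.0, 0.0],
     [2.0, 5.0, 2.0],
     [0.0, 2.0, 5.0]]
x = cholesky_solve(A, [8.0, 18.0, 19.0])
```

The inner loop over rows below the diagonal is the natural place for vectorization or parallel work sharing, since those updates within one column do not depend on each other.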
Albany/FELIX: A parallel, scalable and robust, finite element, first-order Stokes approximation ice sheet solver built for advanced analysis

DOE PAGES

Tezaur, I. K.; Perego, M.; Salinger, A. G.; ...

2015-04-27

This paper describes a new parallel, scalable and robust finite element based solver for the first-order Stokes momentum balance equations for ice flow. The solver, known as Albany/FELIX, is constructed using the component-based approach to building application codes, in which mature, modular libraries developed as a part of the Trilinos project are combined using abstract interfaces and template-based generic programming, resulting in a final code with access to dozens of algorithmic and advanced analysis capabilities. Following an overview of the relevant partial differential equations and boundary conditions, the numerical methods chosen to discretize the ice flow equations are described, along with their implementation. The results of several verification studies of the model accuracy are presented using (1) new test cases for simplified two-dimensional (2-D) versions of the governing equations derived using the method of manufactured solutions, and (2) canonical ice sheet modeling benchmarks. Model accuracy and convergence with respect to mesh resolution are then studied on problems involving a realistic Greenland ice sheet geometry discretized using hexahedral and tetrahedral meshes. Also explored as a part of this study is the effect of vertical mesh resolution on the solution accuracy and solver performance. The robustness and scalability of our solver on these problems is demonstrated. Lastly, we show that good scalability can be achieved by preconditioning the iterative linear solver using a new algebraic multilevel preconditioner, constructed based on the idea of semi-coarsening.
Using parallel banded linear system solvers in generalized eigenvalue problems

NASA Technical Reports Server (NTRS)

Zhang, Hong; Moss, William F.

1993-01-01

Subspace iteration is a reliable and cost effective method for solving positive definite banded symmetric generalized eigenproblems, especially in the case of large scale problems. This paper discusses an algorithm that makes use of two parallel banded solvers in subspace iteration. A shift is introduced to decompose the banded linear systems into relatively independent subsystems and to accelerate the iterations. With this shift, an eigenproblem is mapped efficiently into the memories of a multiprocessor and a high speed-up is obtained for parallel implementations. An optimal shift is a shift that balances total computation and communication costs. Under certain conditions, we show how to estimate an optimal shift analytically using the decay rate for the inverse of a banded matrix, and how to improve this estimate. Computational results on iPSC/2 and iPSC/860 multiprocessors are presented.
Completing the Physical Representation of Quantum Algorithms Provides a Quantitative Explanation of Their Computational Speedup

NASA Astrophysics Data System (ADS)

Castagnoli, Giuseppe

2018-03-01

The usual representation of quantum algorithms, limited to the process of solving the problem, is physically incomplete. We complete it in three steps: (i) extending the representation to the process of setting the problem, (ii) relativizing the extended representation to the problem solver to whom the problem setting must be concealed, and (iii) symmetrizing the relativized representation for time reversal to represent the reversibility of the underlying physical process. The third step projects the input state of the representation, where the problem solver is completely ignorant of the setting and thus the solution of the problem, on one where she knows half solution (half of the information specifying it when the solution is an unstructured bit string). Completing the physical representation shows that the number of computation steps (oracle queries) required to solve any oracle problem in an optimal quantum way should be that of a classical algorithm endowed with the advanced knowledge of half solution.
Advanced Multigrid Solvers for Fluid Dynamics

NASA Technical Reports Server (NTRS)

Brandt, Achi

1999-01-01

The main objective of this project has been to support the development of multigrid techniques in computational fluid dynamics that can achieve "textbook multigrid efficiency" (TME), which is several orders of magnitude faster than current industrial CFD solvers. Toward that goal we have assembled a detailed table which lists every foreseen kind of computational difficulty for achieving it, together with the possible ways for resolving the difficulty, their current state of development, and references. We have developed several codes to test and demonstrate, in the framework of simple model problems, several approaches for overcoming the most important of the listed difficulties that had not been resolved before. In particular, TME has been demonstrated for incompressible flows on one hand, and for near-sonic flows on the other hand. General approaches were advanced for the relaxation of stagnation points and boundary conditions under various situations. Also, new algebraic multigrid techniques were formed for treating unstructured grid formulations. More details on all these are given below.
Numerical Investigations of the Benchmark Supercritical Wing in Transonic Flow

NASA Technical Reports Server (NTRS)

Chwalowski, Pawel; Heeg, Jennifer; Biedron, Robert T.

2017-01-01

This paper builds on the computational aeroelastic results published previously and generated in support of the second Aeroelastic Prediction Workshop for the NASA Benchmark Supercritical Wing (BSCW) configuration. The computational results are obtained using FUN3D, an unstructured grid Reynolds-Averaged Navier-Stokes solver developed at the NASA Langley Research Center. The analysis results show the effects of the temporal and spatial resolution, the coupling scheme between the flow and the structural solvers, and the initial excitation conditions on the numerical flutter onset. Depending on the free stream condition and the angle of attack, the above parameters do affect the flutter onset. Two conditions are analyzed: Mach 0.74 with angle of attack 0° and Mach 0.85 with angle of attack 5°. The results are presented in the form of the damping values computed from the wing pitch angle response as a function of the dynamic pressure or in the form of dynamic pressure as a function of the Mach number.
A VERSATILE SHARP INTERFACE IMMERSED BOUNDARY METHOD FOR INCOMPRESSIBLE FLOWS WITH COMPLEX BOUNDARIES

PubMed Central

Mittal, R.; Dong, H.; Bozkurttas, M.; Najjar, F.M.; Vargas, A.; von Loebbecke, A.

2010-01-01

A sharp interface immersed boundary method for simulating incompressible viscous flow past three-dimensional immersed bodies is described. The method employs a multi-dimensional ghost-cell methodology to satisfy the boundary conditions on the immersed boundary and the method is designed to handle highly complex three-dimensional, stationary, moving and/or deforming bodies. The complex immersed surfaces are represented by grids consisting of unstructured triangular elements, while the flow is computed on non-uniform Cartesian grids. The paper describes the salient features of the methodology with special emphasis on the immersed boundary treatment for stationary and moving boundaries. Simulations of a number of canonical two- and three-dimensional flows are used to verify the accuracy and fidelity of the solver over a range of Reynolds numbers. Flows past suddenly accelerated bodies are used to validate the solver for moving boundary problems. Finally, two cases inspired from biology with highly complex three-dimensional bodies are simulated in order to demonstrate the versatility of the method. PMID:20216919
Higher Order Time Integration Schemes for the Unsteady Navier-Stokes Equations on Unstructured Meshes

NASA Technical Reports Server (NTRS)

Jothiprasad, Giridhar; Mavriplis, Dimitri J.; Caughey, David A.; Bushnell, Dennis M. (Technical Monitor)

2002-01-01

The efficiency gains obtained using higher-order implicit Runge-Kutta schemes as compared with the second-order accurate backward difference schemes for the unsteady Navier-Stokes equations are investigated. Three different algorithms for solving the nonlinear system of equations arising at each timestep are presented. The first algorithm (NMG) is a pseudo-time-stepping scheme which employs a non-linear full approximation storage (FAS) agglomeration multigrid method to accelerate convergence. The other two algorithms are based on Inexact Newton's methods. The linear system arising at each Newton step is solved using iterative/Krylov techniques and left preconditioning is used to accelerate convergence of the linear solvers. One of the methods (LMG) uses Richardson's iterative scheme for solving the linear system at each Newton step while the other (PGMRES) uses the Generalized Minimal Residual method. Results demonstrating the relative superiority of these Newton's methods based schemes are presented. Efficiency gains as high as 10 are obtained by combining the higher-order time integration schemes with the more efficient nonlinear solvers.
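
The inexact Newton structure described in this abstract (an outer Newton loop whose linear system is solved only approximately by a preconditioned Richardson iteration) can be sketched on a toy problem. Everything below is an assumption made for illustration: the two-equation nonlinear system, the function names, the fixed number of inner sweeps, and the use of a simple diagonal (Jacobi) left preconditioner as a stand-in for whatever preconditioner the authors actually employ.

```python
def residual(u):
    """Toy nonlinear system F(u) = 0 with a root at u = (1, 2)."""
    return [u[0] ** 2 + u[1] - 3.0, u[0] + u[1] ** 2 - 5.0]


def jacobian(u):
    """Analytic Jacobian dF/du of the toy system."""
    return [[2.0 * u[0], 1.0], [1.0, 2.0 * u[1]]]


def richardson(J, rhs, sweeps=10):
    """Approximately solve J * delta = rhs.

    Richardson iteration with a diagonal (Jacobi) left preconditioner:
    delta <- delta + D^{-1} (rhs - J * delta).
    """
    n = len(rhs)
    delta = [0.0] * n
    for _ in range(sweeps):
        r = [rhs[i] - sum(J[i][k] * delta[k] for k in range(n)) for i in range(n)]
        delta = [delta[i] + r[i] / J[i][i] for i in range(n)]
    return delta


def inexact_newton(u, tol=1e-10, max_steps=50):
    """Outer Newton loop; each linear solve is deliberately left inexact."""
    for _ in range(max_steps):
        F = residual(u)
        if max(abs(f) for f in F) < tol:
            break
        delta = richardson(jacobian(u), [-f for f in F])
        u = [ui + di for ui, di in zip(u, delta)]
    return u


root = inexact_newton([1.5, 1.5])
```

Swapping `richardson` for a Krylov method such as GMRES changes only the inner solve, which mirrors the difference between the two Newton-based variants the abstract compares. The inner iteration here converges because the toy Jacobian is diagonally dominant near the root; that is a property of this example, not a general guarantee.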
RIACS

NASA Technical Reports Server (NTRS)

Oliger, Joseph

1997-01-01

Topics considered include: high-performance computing; cognitive and perceptual prostheses (computational aids designed to leverage human abilities); autonomous systems. Also included: development of a 3D unstructured grid code based on a finite volume formulation and applied to the Navier-Stokes equations; Cartesian grid methods for complex geometry; multigrid methods for solving elliptic problems on unstructured grids; algebraic non-overlapping domain decomposition methods for compressible fluid flow problems on unstructured meshes; numerical methods for the compressible Navier-Stokes equations with application to aerodynamic flows; research in aerodynamic shape optimization; S-HARP: a parallel dynamic spectral partitioner; numerical schemes for the Hamilton-Jacobi and level set equations on triangulated domains; application of high-order shock capturing schemes to direct simulation of turbulence; multicast technology; network testbeds; supercomputer consolidation project.
A high-order 3D spectral difference solver for simulating flows about rotating geometries

NASA Astrophysics Data System (ADS)

Zhang, Bin; Liang, Chunlei

2017-11-01

Fluid flows around rotating geometries are ubiquitous. For example, a spinning ping pong ball can quickly change its trajectory in an air flow; a marine propeller can provide an enormous amount of thrust to a ship. It has been a long-time challenge to accurately simulate these flows. In this work, we present a high-order and efficient 3D flow solver based on the unstructured spectral difference (SD) method and a novel sliding-mesh method. In the SD method, solution and fluxes are reconstructed using tensor products of 1D polynomials and the equations are solved in differential form, which leads to high-order accuracy and high efficiency. In the sliding-mesh method, a computational domain is decomposed into non-overlapping subdomains. Each subdomain can enclose a geometry and can rotate relative to its neighbor, resulting in nonconforming sliding interfaces. A curved dynamic mortar approach is designed for communication on these interfaces. In this approach, solutions and fluxes are projected from cell faces to mortars to compute common values, which are then projected back to ensure continuity and conservation. Through theoretical analysis and numerical tests, it is shown that this solver is conservative, free-stream preservative, and high-order accurate in both space and time.
Validation Process for LEWICE by Use of a Navier-Stokes Solver

NASA Technical Reports Server (NTRS)

Wright, William B.; Porter, Christopher E.

2017-01-01

A research project is underway at NASA Glenn to produce computer software that can accurately predict ice growth under any meteorological conditions for any aircraft surface. This report will present results from the latest LEWICE release, version 3.5. This program differs from previous releases in its ability to model mixed phase and ice crystal conditions such as those encountered inside an engine. It also has expanded capability to use structured grids and a new capability to use results from unstructured grid flow solvers. A quantitative comparison of the results against a database of ice shapes that have been generated in the NASA Glenn Icing Research Tunnel (IRT) has also been performed. This paper will extend the comparison of ice shapes between LEWICE 3.5 and experimental data from a previous paper. Comparisons of lift and drag are made between experimentally collected data from experimentally obtained ice shapes and simulated (CFD) data on simulated (LEWICE) ice shapes. Comparisons are also made between experimentally collected and simulated performance data on select experimental ice shapes to ensure the CFD solver, FUN3D, is valid within the flight regime. The results show that the predicted results are within the accuracy limits of the experimental data for the majority of cases.
FUN3D Manual: 12.9

NASA Technical Reports Server (NTRS)

Biedron, Robert T.; Carlson, Jan-Renee; Derlaga, Joseph M.; Gnoffo, Peter A.; Hammond, Dana P.; Jones, William T.; Kleb, Bil; Lee-Rausch, Elizabeth M.; Nielsen, Eric J.; Park, Michael A.;

2016-01-01

This manual describes the installation and execution of FUN3D version 12.9, including optional dependent packages. FUN3D is a suite of computational fluid dynamics simulation and design tools that uses mixed-element unstructured grids in a large number of formats, including structured multiblock and overset grid systems. A discretely-exact adjoint solver enables efficient gradient-based design and grid adaptation to reduce estimated discretization error. FUN3D is available with and without a reacting, real-gas capability. This generic gas option is available only for those persons that qualify for its beta release status.

FUN3D Manual: 13.2

NASA Technical Reports Server (NTRS)

Biedron, Robert T.; Carlson, Jan-Renee; Derlaga, Joseph M.; Gnoffo, Peter A.; Hammond, Dana P.; Jones, William T.; Kleb, William L.; Lee-Rausch, Elizabeth M.; Nielsen, Eric J.; Park, Michael A.;

2017-01-01

This manual describes the installation and execution of FUN3D version 13.2, including optional dependent packages. FUN3D is a suite of computational fluid dynamics simulation and design tools that uses mixed-element unstructured grids in a large number of formats, including structured multiblock and overset grid systems. A discretely-exact adjoint solver enables efficient gradient-based design and grid adaptation to reduce estimated discretization error. FUN3D is available with and without a reacting, real-gas capability. This generic gas option is available only for those persons that qualify for its beta release status.

FUN3D Manual: 12.6

NASA Technical Reports Server (NTRS)

Biedron, Robert T.; Derlaga, Joseph M.; Gnoffo, Peter A.; Hammond, Dana P.; Jones, William T.; Kleb, William L.; Lee-Rausch, Elizabeth M.; Nielsen, Eric J.; Park, Michael A.; Rumsey, Christopher L.;

2015-01-01

This manual describes the installation and execution of FUN3D version 12.6, including optional dependent packages. FUN3D is a suite of computational fluid dynamics simulation and design tools that uses mixed-element unstructured grids in a large number of formats, including structured multiblock and overset grid systems. A discretely-exact adjoint solver enables efficient gradient-based design and grid adaptation to reduce estimated discretization error. FUN3D is available with and without a reacting, real-gas capability. This generic gas option is available only for those persons that qualify for its beta release status.

FUN3D Manual: 12.7

NASA Technical Reports Server (NTRS)

Biedron, Robert T.; Carlson, Jan-Renee; Derlaga, Joseph M.; Gnoffo, Peter A.; Hammond, Dana P.; Jones, William T.; Kleb, Bil; Lee-Rausch, Elizabeth M.; Nielsen, Eric J.; Park, Michael A.;

2015-01-01

This manual describes the installation and execution of FUN3D version 12.7, including optional dependent packages. FUN3D is a suite of computational fluid dynamics simulation and design tools that uses mixed-element unstructured grids in a large number of formats, including structured multiblock and overset grid systems. A discretely-exact adjoint solver enables efficient gradient-based design and grid adaptation to reduce estimated discretization error. FUN3D is available with and without a reacting, real-gas capability. This generic gas option is available only for those persons that qualify for its beta release status.

FUN3D Manual: 12.5

NASA Technical Reports Server (NTRS)

Biedron, Robert T.; Derlaga, Joseph M.; Gnoffo, Peter A.; Hammond, Dana P.; Jones, William T.; Kleb, William L.; Lee-Rausch, Elizabeth M.; Nielsen, Eric J.; Park, Michael A.; Rumsey, Christopher L.;

2014-01-01

This manual describes the installation and execution of FUN3D version 12.5, including optional dependent packages. FUN3D is a suite of computational fluid dynamics simulation and design tools that uses mixed-element unstructured grids in a large number of formats, including structured multiblock and overset grid systems. A discretely-exact adjoint solver enables efficient gradient-based design and grid adaptation to reduce estimated discretization error. FUN3D is available with and without a reacting, real-gas capability. This generic gas option is available only for those persons that qualify for its beta release status.

FUN3D Manual: 12.8

NASA Technical Reports Server (NTRS)

Biedron, Robert T.; Carlson, Jan-Renee; Derlaga, Joseph M.; Gnoffo, Peter A.; Hammond, Dana P.; Jones, William T.; Kleb, Bil; Lee-Rausch, Elizabeth M.; Nielsen, Eric J.; Park, Michael A.;

2015-01-01

This manual describes the installation and execution of FUN3D version 12.8, including optional dependent packages. FUN3D is a suite of computational fluid dynamics simulation and design tools that uses mixed-element unstructured grids in a large number of formats, including structured multiblock and overset grid systems. A discretely-exact adjoint solver enables efficient gradient-based design and grid adaptation to reduce estimated discretization error. FUN3D is available with and without a reacting, real-gas capability. This generic gas option is available only for those persons that qualify for its beta release status.

FUN3D Manual: 12.4

NASA Technical Reports Server (NTRS)

Biedron, Robert T.; Derlaga, Joseph M.; Gnoffo, Peter A.; Hammond, Dana P.; Jones, William T.; Kleb, Bil; Lee-Rausch, Elizabeth M.; Nielsen, Eric J.; Park, Michael A.; Rumsey, Christopher L.;

2014-01-01

This manual describes the installation and execution of FUN3D version 12.4, including optional dependent packages. FUN3D is a suite of computational fluid dynamics simulation and design tools that uses mixed-element unstructured grids in a large number of formats, including structured multiblock and overset grid systems. A discretely-exact adjoint solver enables efficient gradient-based design and grid adaptation to reduce estimated discretization error. FUN3D is available with and without a reacting, real-gas capability. This generic gas option is available only for those persons that qualify for its beta release status.

FUN3D Manual: 13.1

NASA Technical Reports Server (NTRS)

Biedron, Robert T.; Carlson, Jan-Renee; Derlaga, Joseph M.; Gnoffo, Peter A.; Hammond, Dana P.; Jones, William T.; Kleb, Bil; Lee-Rausch, Elizabeth M.; Nielsen, Eric J.; Park, Michael A.;

2017-01-01

This manual describes the installation and execution of FUN3D version 13.1, including optional dependent packages. FUN3D is a suite of computational fluid dynamics simulation and design tools that uses mixed-element unstructured grids in a large number of formats, including structured multiblock and overset grid systems. A discretely-exact adjoint solver enables efficient gradient-based design and grid adaptation to reduce estimated discretization error. FUN3D is available with and without a reacting, real-gas capability. This generic gas option is available only for those persons that qualify for its beta release status.

FUN3D Manual: 13.0

NASA Technical Reports Server (NTRS)

Biedron, Robert T.; Carlson, Jan-Renee; Derlaga, Joseph M.; Gnoffo, Peter A.; Hammond, Dana P.; Jones, William T.; Kleb, Bil; Lee-Rausch, Elizabeth M.; Nielsen, Eric J.; Park, Michael A.;

2016-01-01

This manual describes the installation and execution of FUN3D version 13.0, including optional dependent packages. FUN3D is a suite of computational fluid dynamics simulation and design tools that uses mixed-element unstructured grids in a large number of formats, including structured multiblock and overset grid systems. A discretely-exact adjoint solver enables efficient gradient-based design and grid adaptation to reduce estimated discretization error. FUN3D is available with and without a reacting, real-gas capability. This generic gas option is available only for those persons that qualify for its beta release status.

FUN3D Manual: 13.3

NASA Technical Reports Server (NTRS)

Biedron, Robert T.; Carlson, Jan-Renee; Derlaga, Joseph M.; Gnoffo, Peter A.; Hammond, Dana P.; Jones, William T.; Kleb, Bil; Lee-Rausch, Elizabeth M.; Nielsen, Eric J.; Park, Michael A.;

2018-01-01

This manual describes the installation and execution of FUN3D version 13.3, including optional dependent packages. FUN3D is a suite of computational fluid dynamics simulation and design tools that uses mixed-element unstructured grids in a large number of formats, including structured multiblock and overset grid systems. A discretely-exact adjoint solver enables efficient gradient-based design and grid adaptation to reduce estimated discretization error. FUN3D is available with and without a reacting, real-gas capability. This generic gas option is available only for those persons that qualify for its beta release status.

Accurate evaluation of exchange fields in finite element micromagnetic solvers

NASA Astrophysics Data System (ADS)

Chang, R.; Escobar, M. A.; Li, S.; Lubarda, M. V.; Lomakin, V.

2012-04-01

Quadratic basis functions (QBFs) are implemented for solving the Landau-Lifshitz-Gilbert equation via the finite element method. This involves the introduction of a set of special testing functions compatible with the QBFs for evaluating the Laplacian operator. The QBF approach leads to significantly more accurate results than conventionally used approaches based on linear basis functions. Importantly, QBFs allow the error in the computed exchange field to be reduced by increasing the mesh density, for both structured and unstructured meshes. Numerical examples demonstrate the feasibility of the method.
Parallel Directionally Split Solver Based on Reformulation of Pipelined Thomas Algorithm

NASA Technical Reports Server (NTRS)

Povitsky, A.

1998-01-01

In this research an efficient parallel algorithm for 3-D directionally split problems is developed. The proposed algorithm is based on a reformulated version of the pipelined Thomas algorithm that starts the backward step computations immediately after the completion of the forward step computations for the first portion of lines. This algorithm has data available for other computational tasks while processors are idle from the Thomas algorithm. The proposed 3-D directionally split solver is based on the static scheduling of processors where local and non-local, data-dependent and data-independent computations are scheduled while processors are idle. A theoretical model of parallelization efficiency is used to define optimal parameters of the algorithm, to show an asymptotic parallelization penalty and to obtain an optimal cover of a global domain with subdomains. It is shown by computational experiments and by the theoretical model that the proposed algorithm reduces the parallelization penalty by about a factor of two relative to the basic algorithm for the range of the number of processors (subdomains) considered and the number of grid nodes per subdomain.
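For orientation, the forward and backward steps that the abstract refers to can be seen in the plain serial Thomas algorithm. The sketch below is only that serial baseline for a single tridiagonal line, not the pipelined or reformulated parallel version described in the record; the function name `thomas_solve` and the argument layout are illustrative assumptions.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve one tridiagonal system with the serial Thomas algorithm.

    lower[i] multiplies x[i-1] (lower[0] is unused) and
    upper[i] multiplies x[i+1] (upper[-1] is unused).
    Assumes no pivoting is needed (e.g. a diagonally dominant matrix).
    """
    n = len(diag)
    c_prime = [0.0] * n
    d_prime = [0.0] * n

    # Forward step: eliminate the sub-diagonal, sweeping from first to last row.
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c_prime[i - 1]
        c_prime[i] = upper[i] / denom if i < n - 1 else 0.0
        d_prime[i] = (rhs[i] - lower[i] * d_prime[i - 1]) / denom

    # Backward step: back-substitution, sweeping from last to first row.
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x


# Example: -x[i-1] + 2 x[i] - x[i+1] = rhs[i], with exact solution 1, 2, 3, 4.
solution = thomas_solve([0.0, -1.0, -1.0, -1.0],
                        [2.0, 2.0, 2.0, 2.0],
                        [-1.0, -1.0, -1.0, 0.0],
                        [0.0, 0.0, 0.0, 5.0])
```

Because each sweep depends on the previous row, a line split across subdomains forces processors to wait on their neighbours; this is the idle time that pipelining over many lines is meant to hide.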
Acceleration of FDTD mode solver by high-performance computing techniques.

PubMed

Han, Lin; Xi, Yanping; Huang, Wei-Ping

2010-06-21

A two-dimensional (2D) compact finite-difference time-domain (FDTD) mode solver is developed based on wave equation formalism in combination with the matrix pencil method (MPM). The method is validated for calculation of both real guided and complex leaky modes of typical optical waveguides against the benchmark finite-difference (FD) eigenmode solver. By taking advantage of the inherent parallel nature of the FDTD algorithm, the mode solver is implemented on graphics processing units (GPUs) using the compute unified device architecture (CUDA). It is demonstrated that the high-performance computing technique leads to significant acceleration of the FDTD mode solver, with more than 30 times improvement in computational efficiency in comparison with the conventional FDTD mode solver running on the CPU of a standard desktop computer. The computational efficiency of the accelerated FDTD method is of the same order of magnitude as that of the standard finite-difference eigenmode solver, and yet it requires much less memory (e.g., less than 10%). Therefore, the new method may serve as an efficient, accurate and robust tool for mode calculation of optical waveguides even when the conventional eigenvalue mode solvers are no longer applicable due to memory limitations.
Solving Partial Differential Equations in a data-driven multiprocessor environment

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gaudiot, J.L.; Lin, C.M.; Hosseiniyar, M.

1988-12-31

Partial differential equations can be found in a host of engineering and scientific problems. The emergence of new parallel architectures has spurred research in the definition of parallel PDE solvers. Concurrently, highly programmable systems such as data-flow architectures have been proposed for the exploitation of large scale parallelism. The implementation of some partial differential equation solvers (such as the Jacobi method) on a tagged token data-flow graph is demonstrated here. Asynchronous methods (chaotic relaxation) are studied and new scheduling approaches (the Token No-Labeling scheme) are introduced in order to support the implementation of the asynchronous methods in a data-driven environment. New high-level data-flow language program constructs are introduced in order to handle chaotic operations. Finally, the performance of the program graphs is demonstrated by a deterministic simulation of a message passing data-flow multiprocessor. An analysis of the overhead in the data-flow graphs is undertaken to demonstrate the limits of parallel operations in data-flow PDE program graphs.
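As a point of reference for the Jacobi method named in this record, here is a minimal sequential sketch for the 1-D Poisson problem -u'' = f with fixed boundary values. The model problem, the function name `jacobi_poisson_1d` and the stopping rule are assumptions chosen for illustration; the record does not state which equations or grids were used.

```python
def jacobi_poisson_1d(f, h, left, right, tol=1e-10, max_iter=100_000):
    """Jacobi iteration for -u'' = f on a uniform grid with spacing h.

    f holds the source term at the interior points; left and right are
    the Dirichlet boundary values. Returns (u, iterations), where u
    includes both boundary points.
    """
    n = len(f)
    u = [0.0] * (n + 2)
    u[0], u[-1] = left, right
    for iteration in range(1, max_iter + 1):
        new = u[:]
        # Every update reads only values from the previous sweep, so the
        # interior points can be updated in any order or all at once.
        for i in range(1, n + 1):
            new[i] = 0.5 * (u[i - 1] + u[i + 1] + h * h * f[i - 1])
        change = max(abs(a - b) for a, b in zip(new, u))
        u = new
        if change < tol:
            return u, iteration
    return u, max_iter


# Example: zero source term, so the converged solution is a straight line.
u, iterations = jacobi_poisson_1d([0.0, 0.0, 0.0], 0.25, 0.0, 1.0)
```

The independence of the point updates within a sweep is what makes the method a natural fit for a data-driven machine. A chaotic (asynchronous) relaxation would drop the requirement that all reads come from the same sweep.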
Parallelization of the Physical-Space Statistical Analysis System (PSAS)

NASA Technical Reports Server (NTRS)

Larson, J. W.; Guo, J.; Lyster, P. M.

1999-01-01

Atmospheric data assimilation is a method of combining observations with model forecasts to produce a more accurate description of the atmosphere than the observations or forecast alone can provide. Data assimilation plays an increasingly important role in the study of climate and atmospheric chemistry. The NASA Data Assimilation Office (DAO) has developed the Goddard Earth Observing System Data Assimilation System (GEOS DAS) to create assimilated datasets. The core computational components of the GEOS DAS include the GEOS General Circulation Model (GCM) and the Physical-space Statistical Analysis System (PSAS). The need for timely validation of scientific enhancements to the data assimilation system poses computational demands that are best met by distributed parallel software. PSAS is implemented in Fortran 90 using object-based design principles. The analysis portions of the code solve two equations. The first of these is the "innovation" equation, which is solved on the unstructured observation grid using a preconditioned conjugate gradient (CG) method. The "analysis" equation is a transformation from the observation grid back to a structured grid, and is solved by a direct matrix-vector multiplication. Use of a factored-operator formulation reduces the computational complexity of both the CG solver and the matrix-vector multiplication, rendering the matrix-vector multiplications as a successive product of operators on a vector. Sparsity is introduced to these operators by partitioning the observations using an icosahedral decomposition scheme. PSAS builds a large (approx. 128MB) run-time database of parameters used in the calculation of these operators. Implementing a message passing parallel computing paradigm into an existing yet developing computational system as complex as PSAS is nontrivial. One of the technical challenges is balancing the requirements for computational reproducibility with the need for high performance. 
The problem of computational reproducibility is well known in the parallel computing community. It is a requirement that the parallel code perform calculations in a fashion that will yield identical results on different configurations of processing elements on the same platform. In some cases this problem can be solved by sacrificing performance. Meeting this requirement and still achieving high performance is very difficult. Topics to be discussed include: current PSAS design and parallelization strategy; reproducibility issues; load balance vs. database memory demands, possible solutions to these problems.
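The preconditioned conjugate gradient (CG) iteration mentioned for the innovation equation can be sketched as follows. This is a generic textbook form with a diagonal preconditioner, not the PSAS implementation: the names `pcg` and `diagonal_preconditioner` are hypothetical, and the choice of preconditioner is an assumption, since the record does not specify one. The operator is passed as a callable so that a factored operator could be applied as a succession of products on a vector, as the abstract describes.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def diagonal_preconditioner(diagonal):
    """Return a function applying the inverse of a diagonal matrix."""
    return lambda r: [ri / di for ri, di in zip(r, diagonal)]


def pcg(matvec, b, precond, tol=1e-10, max_iter=1000):
    """Preconditioned CG for a symmetric positive definite operator.

    matvec(v) applies the operator, precond(r) applies the preconditioner.
    Returns (x, iterations).
    """
    x = [0.0] * len(b)
    r = b[:]                      # residual for the zero initial guess
    z = precond(r)
    p = z[:]
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5 or 1.0
    for k in range(max_iter):
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, k
        ap = matvec(p)
        alpha = rz / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        z = precond(r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


# Example: a small symmetric positive definite system.
A = [[4.0, 1.0], [1.0, 3.0]]
apply_A = lambda v: [dot(row, v) for row in A]
x, iterations = pcg(apply_A, [1.0, 2.0], diagonal_preconditioner([4.0, 3.0]))
```

The inner products are global sums, and in floating point their value depends on the order of summation. That is one place where the reproducibility concern raised in the abstract shows up in a parallel CG solver.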
jInv: A Modular and Scalable Framework for Electromagnetic Inverse Problems

NASA Astrophysics Data System (ADS)

Belliveau, P. T.; Haber, E.

2016-12-01

Inversion is a key tool in the interpretation of geophysical electromagnetic (EM) data. Three-dimensional (3D) EM inversion is very computationally expensive and practical software for inverting large 3D EM surveys must be able to take advantage of high performance computing (HPC) resources. It has traditionally been difficult to achieve those goals in a high level dynamic programming environment that allows rapid development and testing of new algorithms, which is important in a research setting. With those goals in mind, we have developed jInv, a framework for PDE constrained parameter estimation problems. jInv provides optimization and regularization routines, a framework for user defined forward problems, and interfaces to several direct and iterative solvers for sparse linear systems. The forward modeling framework provides finite volume discretizations of differential operators on rectangular tensor product meshes and tetrahedral unstructured meshes that can be used to easily construct forward modeling and sensitivity routines for forward problems described by partial differential equations. jInv is written in the emerging programming language Julia. Julia is a dynamic language targeted at the computational science community with a focus on high performance and native support for parallel programming. We have developed frequency and time-domain EM forward modeling and sensitivity routines for jInv. We will illustrate its capabilities and performance with two synthetic time-domain EM inversion examples. First, in airborne surveys, which use many sources, we achieve distributed memory parallelism by decoupling the forward and inverse meshes and performing forward modeling for each source on small, locally refined meshes. Secondly, we invert grounded source time-domain data from a gradient array style induced polarization survey using a novel time-stepping technique that allows us to compute data from different time-steps in parallel. 
These examples both show that it is possible to invert large scale 3D time-domain EM datasets within a modular, extensible framework written in a high-level, easy to use programming language.
Petascale computation of multi-physics seismic simulations

NASA Astrophysics Data System (ADS)

Gabriel, Alice-Agnes; Madden, Elizabeth H.; Ulrich, Thomas; Wollherr, Stephanie; Duru, Kenneth C.

2017-04-01

Capturing the observed complexity of earthquake sources in concurrence with seismic wave propagation simulations is an inherently multi-scale, multi-physics problem. In this presentation, we present simulations of earthquake scenarios resolving high-detail dynamic rupture evolution and high frequency ground motion. The simulations combine a multitude of representations of model complexity, such as non-linear fault friction, thermal and fluid effects, heterogeneous fault stress and fault strength initial conditions, fault curvature and roughness, and on- and off-fault non-elastic failure, to capture dynamic rupture behavior at the source; and seismic wave attenuation, 3D subsurface structure and bathymetry impacting seismic wave propagation. Performing such scenarios at the necessary spatio-temporal resolution requires highly optimized and massively parallel simulation tools which can efficiently exploit HPC facilities. Our up to multi-PetaFLOP simulations are performed with SeisSol (www.seissol.org), an open-source software package based on an ADER-Discontinuous Galerkin (DG) scheme solving the seismic wave equations in velocity-stress formulation in elastic, viscoelastic, and viscoplastic media with high-order accuracy in time and space. Our flux-based implementation of frictional failure remains free of spurious oscillations. Tetrahedral unstructured meshes allow for complicated model geometry. SeisSol has been optimized on all software levels, including: assembler-level DG kernels which obtain 50% peak performance on some of the largest supercomputers worldwide; an overlapping MPI-OpenMP parallelization shadowing the multiphysics computations; usage of local time stepping; parallel input and output schemes and direct interfaces to community standard data formats. All these factors aim to minimise the time-to-solution.
The results presented highlight the fact that modern numerical methods and hardware-aware optimization for modern supercomputers are essential to further our understanding of earthquake source physics and complement both physics-based ground motion research and empirical approaches in seismic hazard analysis. Lastly, we will conclude with an outlook on future exascale ADER-DG solvers for seismological applications.
3D CSEM inversion based on goal-oriented adaptive finite element method

NASA Astrophysics Data System (ADS)

Zhang, Y.; Key, K.

2016-12-01

We present a parallel 3D frequency domain controlled-source electromagnetic inversion code named MARE3DEM. Non-linear inversion of observed data is performed with the Occam variant of regularized Gauss-Newton optimization. The forward operator is based on the goal-oriented finite element method that efficiently calculates the responses and sensitivity kernels in parallel using a data decomposition scheme where independent modeling tasks contain different frequencies and subsets of the transmitters and receivers. To accommodate complex 3D conductivity variation with high flexibility and precision, we adopt the dual-grid approach where the forward mesh conforms to the inversion parameter grid and is adaptively refined until the forward solution converges to the desired accuracy. This dual-grid approach is memory efficient, since the inverse parameter grid remains independent from fine meshing generated around the transmitter and receivers by the adaptive finite element method. In addition, the unstructured inverse mesh efficiently handles multiple scale structures and allows for fine-scale model parameters within the region of interest. Our mesh generation engine keeps track of the refinement hierarchy so that the map of conductivity and sensitivity kernel between the forward and inverse mesh is retained. We employ the adjoint-reciprocity method to calculate the sensitivity kernels which establish a linear relationship between changes in the conductivity model and changes in the modeled responses. Our code uses a direct solver for the linear systems, so the adjoint problem is efficiently computed by re-using the factorization from the primary problem. Further computational efficiency and scalability are obtained in the regularized Gauss-Newton portion of the inversion using parallel dense matrix-matrix multiplication and matrix factorization routines implemented with the ScaLAPACK library.
We show the scalability, reliability and the potential of the algorithm to deal with complex geological scenarios by applying it to the inversion of synthetic marine controlled source EM data generated for a complex 3D offshore model with significant seafloor topography.
Quinoa - Adaptive Computational Fluid Dynamics, 0.2

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bakosi, Jozsef; Gonzalez, Francisco; Rogers, Brandon

Quinoa is a set of computational tools that enables research and numerical analysis in fluid dynamics. At this time it remains a test-bed to experiment with various algorithms using fully asynchronous runtime systems. Currently, Quinoa consists of the following tools: (1) Walker, a numerical integrator for systems of stochastic differential equations in time. It is a mathematical tool to analyze and design the behavior of stochastic differential equations. It allows the estimation of arbitrary coupled statistics and probability density functions and is currently used for the design of statistical moment approximations for multiple mixing materials in variable-density turbulence. (2) Inciter, an overdecomposition-aware finite element field solver for partial differential equations using 3D unstructured grids. Inciter is used to research asynchronous mesh-based algorithms and to experiment with coupling asynchronous to bulk-synchronous parallel code. Two planned new features of Inciter, compared to the previous release (LA-CC-16-015), to be implemented in 2017, are (a) a simple Navier-Stokes solver for ideal single-material compressible gases, and (b) solution-adaptive mesh refinement (AMR), which enables dynamically concentrating compute resources to regions with interesting physics. Using the NS-AMR problem we plan to explore how to scale such high-load-imbalance simulations, representative of large production multiphysics codes, to very large problems on very large computers using an asynchronous runtime system. (3) RNGTest, a test harness to subject random number generators to stringent statistical tests enabling quantitative ranking with respect to their quality and computational cost. (4) UnitTest, a unit test harness, running hundreds of tests per second, capable of testing serial, synchronous, and asynchronous functions.
(5) MeshConv, a mesh file converter that can be used to convert 3D tetrahedron meshes from and to any of the following formats: Gmsh (http://www.geuz.org/gmsh), Netgen (http://sourceforge.net/apps/mediawiki/netgen-mesher), ExodusII (http://sourceforge.net/projects/exodusii), HyperMesh (http://www.altairhyperworks.com/product/HyperMesh).
National Combustion Code: Parallel Performance

NASA Technical Reports Server (NTRS)

Babrauckas, Theresa

2001-01-01

This report discusses the National Combustion Code (NCC). The NCC is an integrated system of codes for the design and analysis of combustion systems. The advanced features of the NCC meet designers' requirements for model accuracy and turn-around time. The fundamental features at the inception of the NCC were parallel processing and unstructured mesh. The design and performance of the NCC are discussed.
A Time-Accurate Upwind Unstructured Finite Volume Method for Compressible Flow with Cure of Pathological Behaviors

NASA Technical Reports Server (NTRS)

Loh, Ching Y.; Jorgenson, Philip C. E.

2007-01-01

A time-accurate, upwind, finite volume method for computing compressible flows on unstructured grids is presented. The method is second order accurate in space and time and yields high resolution in the presence of discontinuities. For efficiency, the Roe approximate Riemann solver with an entropy correction is employed. In the basic Euler/Navier-Stokes scheme, many concepts of high order upwind schemes are adopted: the surface flux integrals are carefully treated, a Cauchy-Kowalewski time-stepping scheme is used in the time-marching stage, and a multidimensional limiter is applied in the reconstruction stage. However even with these up-to-date improvements, the basic upwind scheme is still plagued by the so-called "pathological behaviors," e.g., the carbuncle phenomenon, the expansion shock, etc. A solution to these limitations is presented which uses a very simple dissipation model while still preserving second order accuracy. This scheme is referred to as the enhanced time-accurate upwind (ETAU) scheme in this paper. The unstructured grid capability renders flexibility for use in complex geometry; and the present ETAU Euler/Navier-Stokes scheme is capable of handling a broad spectrum of flow regimes from high supersonic to subsonic at very low Mach number, appropriate for both CFD (computational fluid dynamics) and CAA (computational aeroacoustics). Numerous examples are included to demonstrate the robustness of the methods.
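The record does not say which entropy correction is applied to the Roe solver. One widely used choice is a Harten-type fix, which replaces the absolute value of a wave speed by a smooth quadratic when the speed is close to zero, so that the numerical dissipation does not vanish at sonic points. The sketch below shows only that scalar correction; the function name and the threshold parameter `delta` are assumptions for illustration and may differ from the authors' scheme.

```python
def harten_entropy_fix(wave_speed, delta):
    """Return a corrected |wave_speed| for use in Roe-type dissipation.

    For |wave_speed| >= delta the plain absolute value is returned.
    Below the threshold a quadratic is used that stays positive and
    matches the absolute value at |wave_speed| == delta.
    delta must be positive.
    """
    magnitude = abs(wave_speed)
    if magnitude >= delta:
        return magnitude
    return (wave_speed * wave_speed + delta * delta) / (2.0 * delta)


# Example: a vanishing wave speed still produces nonzero dissipation.
corrected = harten_entropy_fix(0.0, 0.2)
```

Without such a correction, the dissipation attached to a wave disappears where its speed changes sign, which is how a Roe-type scheme can admit the expansion shocks mentioned in the abstract.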

FUEL-FLEXIBLE GASIFICATION-COMBUSTION TECHNOLOGY FOR PRODUCTION OF H2 AND SEQUESTRATION-READY CO2

DOE Office of Scientific and Technical Information (OSTI.GOV)

George Rizeq; Janice West; Arnaldo Frydman

Further development of a combustion Large Eddy Simulation (LES) code for the design of advanced gaseous combustion systems is described in this sixth quarterly report. CFD Research Corporation (CFDRC) is developing the LES module within the parallel, unstructured solver included in the commercial CFD-ACE+ software. In this quarter, in-situ adaptive tabulation (ISAT) for efficient chemical rate storage and retrieval was implemented and tested within the Linear Eddy Model (LEM). ISAT type 3 is being tested so that extrapolation can be performed to further improve the retrieval rate. Further testing of the LEM for subgrid chemistry was performed for parallel applications and for multi-step chemistry. Validation of the software on backstep and bluff-body reacting cases was performed. Initial calculations of the SimVal experiment at Georgia Tech using their LES code were performed. Georgia Tech continues the effort to parameterize the LEM over composition space so that a neural net can be used efficiently in the combustion LES code. A new and improved Artificial Neural Network (ANN), with log-transformed output, for the 1-step chemistry was implemented in CFDRC's LES code and gave reasonable results. This quarter, the 2nd consortium meeting was held at CFDRC. Next quarter, LES software development and testing will continue. Alpha testing of the code will continue to be performed on cases of interest to the industrial consortium. Optimization of subgrid models will be pursued, particularly with the ISAT approach. Also next quarter, the demonstration of the neural net approach, for multi-step chemical kinetics speed-up in CFD-ACE+, will be accomplished.
Parallel computation of three-dimensional aeroelastic fluid-structure interaction

NASA Astrophysics Data System (ADS)

Sadeghi, Mani

This dissertation presents a numerical method for the parallel computation of aeroelasticity (ParCAE). A flow solver is coupled to a structural solver by use of a fluid-structure interface method. The integration of the three-dimensional unsteady Navier-Stokes equations is performed in the time domain, simultaneously with the integration of a modal three-dimensional structural model. The flow solution is accelerated by using a multigrid method and a parallel multiblock approach. Fluid-structure coupling is achieved by subiteration. A grid-deformation algorithm is developed to interpolate the deformation of the structural boundaries onto the flow grid. The code is formulated to allow application to general, three-dimensional, complex configurations with multiple independent structures. Computational results are presented for various configurations, such as turbomachinery blade rows and aircraft wings. Investigations are performed on vortex-induced vibrations, effects of cascade mistuning on flutter, and cases of nonlinear cascade and wing flutter.
Development of a Robust and Efficient Parallel Solver for Unsteady Turbomachinery Flows

NASA Technical Reports Server (NTRS)

West, Jeff; Wright, Jeffrey; Thakur, Siddharth; Luke, Ed; Grinstead, Nathan

2012-01-01

The traditional design and analysis practice for advanced propulsion systems relies heavily on expensive full-scale prototype development and testing. Over the past decade, use of high-fidelity analysis and design tools such as CFD early in the product development cycle has been identified as one way to alleviate testing costs and to develop these devices better, faster and cheaper. In the design of advanced propulsion systems, CFD plays a major role in defining the required performance over the entire flight regime, as well as in testing the sensitivity of the design to the different modes of operation. Increased emphasis is being placed on developing and applying CFD models to simulate the flow field environments and performance of advanced propulsion systems. This necessitates the development of next generation computational tools which can be used effectively and reliably in a design environment. The turbomachinery simulation capability presented here is being developed in a computational tool called Loci-STREAM [1]. It integrates proven numerical methods for generalized grids and state-of-the-art physical models in a novel rule-based programming framework called Loci [2] which allows: (a) seamless integration of multidisciplinary physics in a unified manner, and (b) automatic handling of massively parallel computing. The objective is to be able to routinely simulate problems involving complex geometries requiring large unstructured grids and complex multidisciplinary physics. An immediate application of interest is simulation of unsteady flows in rocket turbopumps, particularly in cryogenic liquid rocket engines. 
The key components of the overall methodology presented in this paper are the following: (a) high fidelity unsteady simulation capability based on Detached Eddy Simulation (DES) in conjunction with second-order temporal discretization, (b) compliance with Geometric Conservation Law (GCL) in order to maintain conservative property on moving meshes for second-order time-stepping scheme, (c) a novel cloud-of-points interpolation method (based on a fast parallel kd-tree search algorithm) for interfaces between turbomachinery components in relative motion which is demonstrated to be highly scalable, and (d) demonstrated accuracy and parallel scalability on large grids (approx 250 million cells) in full turbomachinery geometries.
Scalable Evaluation of Polarization Energy and Associated Forces in Polarizable Molecular Dynamics: II. Towards Massively Parallel Computations using Smooth Particle Mesh Ewald.

PubMed

Lagardère, Louis; Lipparini, Filippo; Polack, Étienne; Stamm, Benjamin; Cancès, Éric; Schnieders, Michael; Ren, Pengyu; Maday, Yvon; Piquemal, Jean-Philip

2014-02-28

In this paper, we present a scalable and efficient implementation of point dipole-based polarizable force fields for molecular dynamics (MD) simulations with periodic boundary conditions (PBC). The Smooth Particle-Mesh Ewald technique is combined with two optimal iterative strategies, namely, a preconditioned conjugate gradient solver and a Jacobi solver in conjunction with the Direct Inversion in the Iterative Subspace for convergence acceleration, to solve the polarization equations. We show that both solvers exhibit very good parallel performances and overall very competitive timings in an energy-force computation needed to perform a MD step. Various tests on large systems are provided in the context of the polarizable AMOEBA force field as implemented in the newly developed Tinker-HP package which is the first implementation for a polarizable model making large scale experiments for massively parallel PBC point dipole models possible. We show that using a large number of cores offers a significant acceleration of the overall process involving the iterative methods within the context of spme and a noticeable improvement of the memory management giving access to very large systems (hundreds of thousands of atoms) as the algorithm naturally distributes the data on different cores. Coupled with advanced MD techniques, gains ranging from 2 to 3 orders of magnitude in time are now possible compared to non-optimized, sequential implementations giving new directions for polarizable molecular dynamics in periodic boundary conditions using massively parallel implementations.
Scalable Evaluation of Polarization Energy and Associated Forces in Polarizable Molecular Dynamics: II. Towards Massively Parallel Computations using Smooth Particle Mesh Ewald

PubMed Central

Lagardère, Louis; Lipparini, Filippo; Polack, Étienne; Stamm, Benjamin; Cancès, Éric; Schnieders, Michael; Ren, Pengyu; Maday, Yvon; Piquemal, Jean-Philip

2015-01-01

In this paper, we present a scalable and efficient implementation of point dipole-based polarizable force fields for molecular dynamics (MD) simulations with periodic boundary conditions (PBC). The Smooth Particle-Mesh Ewald technique is combined with two optimal iterative strategies, namely, a preconditioned conjugate gradient solver and a Jacobi solver in conjunction with the Direct Inversion in the Iterative Subspace for convergence acceleration, to solve the polarization equations. We show that both solvers exhibit very good parallel performances and overall very competitive timings in an energy-force computation needed to perform a MD step. Various tests on large systems are provided in the context of the polarizable AMOEBA force field as implemented in the newly developed Tinker-HP package which is the first implementation for a polarizable model making large scale experiments for massively parallel PBC point dipole models possible. We show that using a large number of cores offers a significant acceleration of the overall process involving the iterative methods within the context of spme and a noticeable improvement of the memory management giving access to very large systems (hundreds of thousands of atoms) as the algorithm naturally distributes the data on different cores. Coupled with advanced MD techniques, gains ranging from 2 to 3 orders of magnitude in time are now possible compared to non-optimized, sequential implementations giving new directions for polarizable molecular dynamics in periodic boundary conditions using massively parallel implementations. PMID:26512230
A Parallel Cartesian Approach for External Aerodynamics of Vehicles with Complex Geometry

NASA Technical Reports Server (NTRS)

Aftosmis, M. J.; Berger, M. J.; Adomavicius, G.

2001-01-01

This workshop paper presents the current status in the development of a new approach for the solution of the Euler equations on Cartesian meshes with embedded boundaries in three dimensions on distributed and shared memory architectures. The approach uses adaptively refined Cartesian hexahedra to fill the computational domain. Where these cells intersect the geometry, they are cut by the boundary into arbitrarily shaped polyhedra which receive special treatment by the solver. The presentation documents a newly developed multilevel upwind solver based on a flexible domain-decomposition strategy. One novel aspect of the work is its use of space-filling curves (SFC) for memory efficient on-the-fly parallelization, dynamic re-partitioning and automatic coarse mesh generation. Within each subdomain the approach employs a variety of reordering techniques so that relevant data are on the same page in memory, permitting high performance on cache-based processors. Details of the on-the-fly SFC based partitioning are presented, as are construction rules for the automatic coarse mesh generation. After describing the approach, the paper uses model problems and 3-D configurations to both verify and validate the solver. The model problems demonstrate that second-order accuracy is maintained despite the presence of the irregular cut-cells in the mesh. In addition, it examines both parallel efficiency and convergence behavior. These investigations demonstrate a parallel speed-up in excess of 28 on 32 processors of an SGI Origin 2000 system and confirm that mesh partitioning has no effect on convergence behavior.
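The idea behind space-filling-curve partitioning can be sketched in a few lines: order the cells along the curve, then cut the ordered list into contiguous pieces of roughly equal work. The sketch below uses a Morton (Z-order) key and per-cell work weights. Both are assumptions for illustration, as are the function names; the record does not state which curve or weighting the authors use.

```python
def morton_key(ix, iy, iz, bits=10):
    """Interleave the bits of three integer cell coordinates (Z-order)."""
    key = 0
    for b in range(bits):
        key |= ((ix >> b) & 1) << (3 * b)
        key |= ((iy >> b) & 1) << (3 * b + 1)
        key |= ((iz >> b) & 1) << (3 * b + 2)
    return key


def partition_by_sfc(cells, weights, num_parts):
    """Split cells into contiguous curve segments of similar total weight.

    cells is a list of (ix, iy, iz) integer coordinates and weights the
    estimated work per cell. Returns a list of lists of cell indices.
    """
    order = sorted(range(len(cells)), key=lambda c: morton_key(*cells[c]))
    total = sum(weights)
    parts = [[] for _ in range(num_parts)]
    running = 0.0
    for c in order:
        # The midpoint of the cell's weight interval selects its part.
        midpoint = running + 0.5 * weights[c]
        part = min(int(midpoint * num_parts / total), num_parts - 1)
        parts[part].append(c)
        running += weights[c]
    return parts


# Example: a 2 x 2 x 2 block of unit-weight cells split into two parts.
cells = [(x, y, z) for x in range(2) for y in range(2) for z in range(2)]
parts = partition_by_sfc(cells, [1.0] * len(cells), 2)
```

Because the partition is just a set of cut points along a one-dimensional ordering, re-partitioning after mesh adaptation only requires recomputing the cuts. Cells that cost more, such as cut cells, could be given larger weights under this scheme.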
Research in computer science

NASA Technical Reports Server (NTRS)

Ortega, J. M.

1986-01-01

Various graduate research activities in the field of computer science are reported. Among the topics discussed are: (1) failure probabilities in multi-version software; (2) Gaussian elimination on parallel computers; (3) three-dimensional Poisson solvers on parallel/vector computers; (4) automated task decomposition for multiple robot arms; (5) multi-color incomplete Cholesky conjugate gradient methods on the Cyber 205; and (6) parallel implementation of iterative methods for solving linear equations.
Algebraic multigrid preconditioning within parallel finite-element solvers for 3-D electromagnetic modelling problems in geophysics

NASA Astrophysics Data System (ADS)

Koldan, Jelena; Puzyrev, Vladimir; de la Puente, Josep; Houzeaux, Guillaume; Cela, José María

2014-06-01

We present an elaborate preconditioning scheme for Krylov subspace methods which has been developed to improve the performance and reduce the execution time of parallel node-based finite-element (FE) solvers for 3-D electromagnetic (EM) numerical modelling in exploration geophysics. This new preconditioner is based on algebraic multigrid (AMG) that uses different basic relaxation methods, such as Jacobi, symmetric successive over-relaxation (SSOR) and Gauss-Seidel, as smoothers and the wave front algorithm to create groups, which are used for a coarse-level generation. We have implemented and tested this new preconditioner within our parallel nodal FE solver for 3-D forward problems in EM induction geophysics. We have performed a series of experiments for several models with different conductivity structures and characteristics to test the performance of our AMG preconditioning technique when combined with the biconjugate gradient stabilized method. The results have shown that the more challenging the problem is in terms of conductivity contrasts, ratio between the sizes of grid elements and/or frequency, the more benefit is obtained by using this preconditioner. Compared to other preconditioning schemes, such as diagonal, SSOR and truncated approximate inverse, the AMG preconditioner greatly improves the convergence of the iterative solver for all tested models. Also, in cases in which other preconditioners succeed in converging to a desired precision, AMG is able to considerably reduce the total execution time of the forward-problem code, by up to an order of magnitude. Furthermore, the tests have confirmed that our AMG scheme ensures a grid-independent rate of convergence, as well as improvement in convergence regardless of how big local mesh refinements are. In addition, AMG is designed to be a black-box preconditioner, which makes it easy to use and combine with different iterative methods. Finally, it has proved to be very practical and efficient in the parallel context.
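The relaxation methods listed as smoothers can be illustrated with a single SSOR sweep, that is, a forward successive over-relaxation pass followed by a backward pass. This is a generic textbook sketch on a small real-valued matrix, not the authors' code, and the names `ssor_sweep` and `residual_norm` are hypothetical. With `omega = 1.0` the sweep reduces to symmetric Gauss-Seidel.

```python
def ssor_sweep(A, b, x, omega=1.0):
    """Apply one forward and one backward SOR pass to x, in place."""
    n = len(b)
    for order in (range(n), range(n - 1, -1, -1)):
        for i in order:
            off_diag = sum(A[i][j] * x[j] for j in range(n) if j != i)
            gauss_seidel_value = (b[i] - off_diag) / A[i][i]
            x[i] = (1.0 - omega) * x[i] + omega * gauss_seidel_value
    return x


def residual_norm(A, b, x):
    """Maximum absolute entry of the residual b - A x."""
    n = len(b)
    return max(abs(b[i] - sum(A[i][j] * x[j] for j in range(n)))
               for i in range(n))


# Hypothetical diagonally dominant test system.
A = [[4.0, -1.0, 0.0],
     [-1.0, 4.0, -1.0],
     [0.0, -1.0, 4.0]]
b = [1.0, 2.0, 3.0]
x = [0.0, 0.0, 0.0]

before = residual_norm(A, b, x)
ssor_sweep(A, b, x)
after = residual_norm(A, b, x)
```

In a multigrid setting only one or two such sweeps are typically applied per level, since the smoother's job is to damp the oscillatory error components while the coarse levels handle the smooth ones.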
Evaluation of grid generation technologies from an applied perspective

NASA Technical Reports Server (NTRS)

Hufford, Gary S.; Harrand, Vincent J.; Patel, Bhavin C.; Mitchell, Curtis R.

1995-01-01

An analysis of the grid generation process from the point of view of an applied CFD engineer is given. Issues addressed include geometric modeling, structured grid generation, unstructured grid generation, hybrid grid generation and use of virtual parts libraries in large parametric analysis projects. The analysis is geared towards comparing the effective turn around time for specific grid generation and CFD projects. The conclusion was made that a single grid generation methodology is not universally suited for all CFD applications due to both limitations in grid generation and flow solver technology. A new geometric modeling and grid generation tool, CFD-GEOM, is introduced to effectively integrate the geometric modeling process to the various grid generation methodologies including structured, unstructured, and hybrid procedures. The full integration of the geometric modeling and grid generation allows implementation of extremely efficient updating procedures, a necessary requirement for large parametric analysis projects. The concept of using virtual parts libraries in conjunction with hybrid grids for large parametric analysis projects is also introduced to improve the efficiency of the applied CFD engineer.
Radiation Coupling with the FUN3D Unstructured-Grid CFD Code

NASA Technical Reports Server (NTRS)

Wood, William A.

2012-01-01

The HARA radiation code is fully-coupled to the FUN3D unstructured-grid CFD code for the purpose of simulating high-energy hypersonic flows. The radiation energy source terms and surface heat transfer, under the tangent slab approximation, are included within the fluid dynamic flow solver. The Fire II flight test, at the Mach-31 1643-second trajectory point, is used as a demonstration case. Comparisons are made with an existing structured-grid capability, the LAURA/HARA coupling. The radiative surface heat transfer rates from the present approach match the benchmark values within 6%. Although radiation coupling is the focus of the present work, convective surface heat transfer rates are also reported, and are seen to vary depending upon the choice of mesh connectivity and FUN3D flux reconstruction algorithm. On a tetrahedral-element mesh the convective heating matches the benchmark at the stagnation point, but under-predicts by 15% on the Fire II shoulder. Conversely, on a mixed-element mesh the convective heating over-predicts at the stagnation point by 20%, but matches the benchmark away from the stagnation region.
An Unstructured Finite Volume Approach for Structural Dynamics in Response to Fluid Motions.

PubMed

Xia, Guohua; Lin, Ching-Long

2008-04-01

A new cell-vortex unstructured finite volume method for structural dynamics is assessed for simulations of structural dynamics in response to fluid motions. A robust implicit dual-time stepping method is employed to obtain time accurate solutions. The resulting system of algebraic equations is matrix-free and allows solid elements to include structure thickness, inertia, and structural stresses for accurate predictions of structural responses and stress distributions. The method is coupled with a fluid dynamics solver for fluid-structure interaction, providing a viable alternative to the finite element method for structural dynamics calculations. A mesh sensitivity test indicates that the finite volume method is at least of second-order accuracy. The method is validated by the problem of vortex-induced vibration of an elastic plate with different initial conditions and material properties. The results are in good agreement with existing numerical data and analytical solutions. The method is then applied to simulate a channel flow with an elastic wall. The effects of wall inertia and structural stresses on the fluid flow are investigated.
Failure of Anisotropic Unstructured Mesh Adaption Based on Multidimensional Residual Minimization

NASA Technical Reports Server (NTRS)

Wood, William A.; Kleb, William L.

2003-01-01

An automated anisotropic unstructured mesh adaptation strategy is proposed, implemented, and assessed for the discretization of viscous flows. The adaption criterion is based upon the minimization of the residual fluctuations of a multidimensional upwind viscous flow solver. For scalar advection, this adaption strategy has been shown to use fewer grid points than gradient-based adaption, naturally aligning mesh edges with discontinuities and characteristic lines. The adaption utilizes a compact stencil and is local in scope, with four fundamental operations: point insertion, point deletion, edge swapping, and nodal displacement. Evaluation of the solution-adaptive strategy is performed for a two-dimensional blunt body laminar wind tunnel case at Mach 10. The results demonstrate that the strategy suffers from a lack of robustness, particularly with regard to alignment of the bow shock in the vicinity of the stagnation streamline. In general, constraining the adaption to such a degree as to maintain robustness results in negligible improvement to the solution. Because the present method fails to consistently or significantly improve the flow solution, it is rejected in favor of simple uniform mesh refinement.
Research in Parallel Algorithms and Software for Computational Aerosciences

DOT National Transportation Integrated Search

1996-04-01

Phase I is complete for the development of a Computational Fluid Dynamics with automatic grid generation and adaptation for the Euler analysis of flow over complex geometries. SPLITFLOW, an unstructured Cartesian grid code developed at Lockheed...
NCC: A Multidisciplinary Design/Analysis Tool for Combustion Systems

NASA Technical Reports Server (NTRS)

Liu, Nan-Suey; Quealy, Angela

1999-01-01

A multi-disciplinary design/analysis tool for combustion systems is critical for optimizing the low-emission, high-performance combustor design process. Based on discussions between NASA Lewis Research Center and the jet engine companies, an industry-government team was formed in early 1995 to develop the National Combustion Code (NCC), which is an integrated system of computer codes for the design and analysis of combustion systems. NCC has advanced features that address the need to meet designer's requirements such as "assured accuracy", "fast turnaround", and "acceptable cost". The NCC development team is comprised of Allison Engine Company (Allison), CFD Research Corporation (CFDRC), GE Aircraft Engines (GEAE), NASA Lewis Research Center (LeRC), and Pratt & Whitney (P&W). This development team operates under the guidance of the NCC steering committee. The "unstructured mesh" capability and "parallel computing" are fundamental features of NCC from its inception. The NCC system is composed of a set of "elements" which includes grid generator, main flow solver, turbulence module, turbulence and chemistry interaction module, chemistry module, spray module, radiation heat transfer module, data visualization module, and a post-processor for evaluating engine performance parameters. Each element may have contributions from several team members. Such a multi-source multi-element system needs to be integrated in a way that facilitates inter-module data communication, flexibility in module selection, and ease of integration.
A Space-Time Conservation Element and Solution Element Method for Solving the Two- and Three-Dimensional Unsteady Euler Equations Using Quadrilateral and Hexahedral Meshes

NASA Technical Reports Server (NTRS)

Zhang, Zeng-Chan; Yu, S. T. John; Chang, Sin-Chung; Jorgenson, Philip (Technical Monitor)

2001-01-01

In this paper, we report a version of the Space-Time Conservation Element and Solution Element (CE/SE) Method in which the 2D and 3D unsteady Euler equations are simulated using structured or unstructured quadrilateral and hexahedral meshes, respectively. In the present method, mesh values of flow variables and their spatial derivatives are treated as independent unknowns to be solved for. At each mesh point, the value of a flow variable is obtained by imposing a flux conservation condition. On the other hand, the spatial derivatives are evaluated using a finite-difference/weighted-average procedure. Note that the present extension retains many key advantages of the original CE/SE method which uses triangular and tetrahedral meshes, respectively, for its 2D and 3D applications. These advantages include efficient parallel computing, ease of implementing non-reflecting boundary conditions, high-fidelity resolution of shocks and waves, and a genuinely multidimensional formulation without using a dimensional-splitting approach. In particular, because Riemann solvers, the cornerstones of the Godunov-type upwind schemes, are not needed to capture shocks, the computational logic of the present method is considerably simpler. To demonstrate the capability of the present method, numerical results are presented for several benchmark problems including oblique shock reflection, supersonic flow over a wedge, and a 3D detonation flow.
A Review of High-Performance Computational Strategies for Modeling and Imaging of Electromagnetic Induction Data

NASA Astrophysics Data System (ADS)

Newman, Gregory A.

2014-01-01

Many geoscientific applications exploit electrostatic and electromagnetic fields to interrogate and map subsurface electrical resistivity—an important geophysical attribute for characterizing mineral, energy, and water resources. In complex three-dimensional geologies, where many of these resources remain to be found, resistivity mapping requires large-scale modeling and imaging capabilities, as well as the ability to treat significant data volumes, which can easily overwhelm single-core and modest multicore computing hardware. To treat such problems requires large-scale parallel computational resources, necessary for reducing the time to solution to a time frame acceptable to the exploration process. The recognition that significant parallel computing processes must be brought to bear on these problems gives rise to choices that must be made in parallel computing hardware and software. In this review, some of these choices are presented, along with the resulting trade-offs. We also discuss future trends in high-performance computing and the anticipated impact on electromagnetic (EM) geophysics. Topics discussed in this review article include a survey of parallel computing platforms, graphics processing units to multicore CPUs with a fast interconnect, along with effective parallel solvers and associated solver libraries effective for inductive EM modeling and imaging.
Parallel implementation of an adaptive scheme for 3D unstructured grids on the SP2

NASA Technical Reports Server (NTRS)

Strawn, Roger C.; Oliker, Leonid; Biswas, Rupak

1996-01-01

Dynamic mesh adaption on unstructured grids is a powerful tool for computing unsteady flows that require local grid modifications to efficiently resolve solution features. For this work, we consider an edge-based adaption scheme that has shown good single-processor performance on the C90. We report on our experience parallelizing this code for the SP2. Results show a 47.0X speedup on 64 processors when 10 percent of the mesh is randomly refined. Performance deteriorates to 7.7X when the same number of edges are refined in a highly-localized region. This is because almost all the mesh adaption is confined to a single processor. However, this problem can be remedied by repartitioning the mesh immediately after targeting edges for refinement but before the actual adaption takes place. With this change, the speedup improves dramatically to 43.6X.
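The three speedups quoted in this abstract are easier to compare as parallel efficiencies, that is, speedup divided by processor count. The short calculation below uses only the figures stated above (64 processors; speedups of 47.0, 7.7 and 43.6); the dictionary labels and function name are ours.

```python
def parallel_efficiency(speedup, n_procs):
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / n_procs


N_PROCS = 64
speedups = {
    "random refinement": 47.0,
    "localized refinement": 7.7,
    "localized refinement, repartitioned first": 43.6,
}
efficiency = {case: parallel_efficiency(s, N_PROCS)
              for case, s in speedups.items()}

for case, value in efficiency.items():
    print(f"{case}: {value:.1%}")
```

This gives roughly 73%, 12% and 68% respectively, so repartitioning before adaption recovers most of the efficiency lost in the localized case.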
Parallel Implementation of an Adaptive Scheme for 3D Unstructured Grids on the SP2

NASA Technical Reports Server (NTRS)

Oliker, Leonid; Biswas, Rupak; Strawn, Roger C.

1996-01-01

Dynamic mesh adaption on unstructured grids is a powerful tool for computing unsteady flows that require local grid modifications to efficiently resolve solution features. For this work, we consider an edge-based adaption scheme that has shown good single-processor performance on the C90. We report on our experience parallelizing this code for the SP2. Results show a 47.0X speedup on 64 processors when 10% of the mesh is randomly refined. Performance deteriorates to 7.7X when the same number of edges are refined in a highly-localized region. This is because almost all mesh adaption is confined to a single processor. However, this problem can be remedied by repartitioning the mesh immediately after targeting edges for refinement but before the actual adaption takes place. With this change, the speedup improves dramatically to 43.6X.
PLUM: Parallel Load Balancing for Adaptive Unstructured Meshes

NASA Technical Reports Server (NTRS)

Oliker, Leonid; Biswas, Rupak; Saini, Subhash (Technical Monitor)

1998-01-01

Mesh adaption is a powerful tool for efficient unstructured-grid computations but causes load imbalance among processors on a parallel machine. We present a novel method called PLUM to dynamically balance the processor workloads with a global view. This paper presents the implementation and integration of all major components within our dynamic load balancing strategy for adaptive grid calculations. Mesh adaption, repartitioning, processor assignment, and remapping are critical components of the framework that must be accomplished rapidly and efficiently so as not to cause a significant overhead to the numerical simulation. A data redistribution model is also presented that predicts the remapping cost on the SP2. This model is required to determine whether the gain from a balanced workload distribution offsets the cost of data movement. Results presented in this paper demonstrate that PLUM is an effective dynamic load balancing strategy which remains viable on a large number of processors.
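The decision this abstract describes, whether the gain from a balanced workload offsets the cost of moving data, can be sketched as a simple comparison. The cost model below (a linear function of elements moved and messages sent) and every coefficient in it are assumptions made for illustration; the abstract does not give the form of PLUM's redistribution model, and the function names are hypothetical.

```python
def predicted_remap_cost(elements_moved, n_messages,
                         per_element=2.0e-6, per_message=5.0e-5):
    """Hypothetical linear cost model (seconds) for redistributing data."""
    return per_element * elements_moved + per_message * n_messages


def should_remap(max_load_old, max_load_new, time_per_element,
                 n_iterations, remap_cost):
    """Accept the new partitioning only if the expected saving exceeds the cost.

    The solver's time per iteration is taken to be set by the most heavily
    loaded processor, so the saving is the drop in maximum load, times the
    cost per element, times the number of iterations until the next adaption.
    """
    gain = (max_load_old - max_load_new) * time_per_element * n_iterations
    return gain > remap_cost


cost = predicted_remap_cost(elements_moved=400_000, n_messages=64)

# Large imbalance and many solver iterations ahead: remapping pays off.
worth_it = should_remap(50_000, 30_000, 1.0e-5, 100, cost)

# Small imbalance and a single iteration ahead: it does not.
not_worth_it = should_remap(31_000, 30_000, 1.0e-5, 1, cost)
```

The point of the sketch is only the structure of the test: a load balancer with a cost model can decline to move data when the imbalance is small or the next adaption is imminent.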
Portable Parallel Programming for the Dynamic Load Balancing of Unstructured Grid Applications

NASA Technical Reports Server (NTRS)

Biswas, Rupak; Das, Sajal K.; Harvey, Daniel; Oliker, Leonid

1999-01-01

The ability to dynamically adapt an unstructured grid (or mesh) is a powerful tool for solving computational problems with evolving physical features; however, an efficient parallel implementation is rather difficult, particularly from the viewpoint of portability on various multiprocessor platforms. We address this problem by developing PLUM, an automatic and architecture-independent framework for adaptive numerical computations in a message-passing environment. Portability is demonstrated by comparing performance on an SP2, an Origin2000, and a T3E, without any code modifications. We also present a general-purpose load balancer that utilizes symmetric broadcast networks (SBN) as the underlying communication pattern, with the goal of providing a global view of system loads across processors. Experiments on an SP2 and an Origin2000 demonstrate the portability of our approach, which achieves superb load balance at the cost of minimal extra overhead.

Parallel Processing of Adaptive Meshes with Load Balancing

NASA Technical Reports Server (NTRS)

Das, Sajal K.; Harvey, Daniel J.; Biswas, Rupak; Biegel, Bryan (Technical Monitor)

2001-01-01

Many scientific applications involve grids that lack a uniform underlying structure. These applications are often also dynamic in nature in that the grid structure significantly changes between successive phases of execution. In parallel computing environments, mesh adaptation of unstructured grids through selective refinement/coarsening has proven to be an effective approach. However, achieving load balance while minimizing interprocessor communication and redistribution costs is a difficult problem. Traditional dynamic load balancers are mostly inadequate because they lack a global view of system loads across processors. In this paper, we propose a novel and general-purpose load balancer that utilizes symmetric broadcast networks (SBN) as the underlying communication topology, and compare its performance with a successful global load balancing environment, called PLUM, specifically created to handle adaptive unstructured applications. Our experimental results on an IBM SP2 demonstrate that the SBN-based load balancer achieves lower redistribution costs than that under PLUM by overlapping processing and data migration.
Domain Decomposition By the Advancing-Partition Method

NASA Technical Reports Server (NTRS)

Pirzadeh, Shahyar Z.

2008-01-01

A new method of domain decomposition has been developed for generating unstructured grids in subdomains either sequentially or using multiple computers in parallel. Domain decomposition is a crucial and challenging step for parallel grid generation. Prior methods are generally based on auxiliary, complex, and computationally intensive operations for defining partition interfaces and usually produce grids of lower quality than those generated in single domains. The new technique, referred to as "Advancing Partition," is based on the Advancing-Front method, which partitions a domain as part of the volume mesh generation in a consistent and "natural" way. The benefits of this approach are: 1) the process of domain decomposition is highly automated, 2) partitioning of domain does not compromise the quality of the generated grids, and 3) the computational overhead for domain decomposition is minimal. The new method has been implemented in NASA's unstructured grid generation code VGRID.
Algorithms and analyses for stochastic optimization for turbofan noise reduction using parallel reduced-order modeling

NASA Astrophysics Data System (ADS)

Yang, Huanhuan; Gunzburger, Max

2017-06-01

Simulation-based optimization of acoustic liner design in a turbofan engine nacelle for noise reduction purposes can dramatically reduce the cost and time needed for experimental designs. Because uncertainties are inevitable in the design process, a stochastic optimization algorithm is posed based on the conditional value-at-risk measure so that an ideal acoustic liner impedance is determined that is robust in the presence of uncertainties. A parallel reduced-order modeling framework is developed that dramatically improves the computational efficiency of the stochastic optimization solver for a realistic nacelle geometry. The reduced stochastic optimization solver takes less than 500 seconds to execute. In addition, well-posedness and finite element error analyses of the state system and optimization problem are provided.
A low-complexity Reed-Solomon decoder using new key equation solver

NASA Astrophysics Data System (ADS)

Xie, Jun; Yuan, Songxin; Tu, Xiaodong; Zhang, Chongfu

2006-09-01

This paper presents a low-complexity parallel Reed-Solomon (RS) (255,239) decoder architecture using a novel pipelined variable-stage recursive Modified Euclidean (ME) algorithm for optical communication. A pipelined four-parallel syndrome generator is proposed. Time multiplexing and resource sharing schemes are used in the novel recursive ME algorithm to reduce the logic gate count. The new key equation solver can be shared by two decoder macros. A new Chien search cell which does not need initialization is also proposed. The proposed decoder can be used for devices with 2.5 Gb/s data rates. The decoder is implemented in Altera's Stratix II device. The resource utilization is reduced by about 40% compared to the conventional method.
Nemesis I: Parallel Enhancements to ExodusII

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hennigan, Gary L.; John, Matthew S.; Shadid, John N.

2006-03-28

NEMESIS I is an enhancement to the EXODUS II finite element database model used to store and retrieve data for unstructured parallel finite element analyses. NEMESIS I adds data structures which facilitate the partitioning of a scalar (standard serial) EXODUS II file onto parallel disk systems found on many parallel computers. Since the NEMESIS I application programming interface (API) can be used to append information to an existing EXODUS II database, any existing software that reads EXODUS II files can be used on files which contain NEMESIS I information. The NEMESIS I information is written and read via C or C++ callable functions which comprise the NEMESIS I API.
Unified Lambert Tool for Massively Parallel Applications in Space Situational Awareness

NASA Astrophysics Data System (ADS)

Woollands, Robyn M.; Read, Julie; Hernandez, Kevin; Probe, Austin; Junkins, John L.

2018-03-01

This paper introduces a parallel-compiled tool that combines several of our recently developed methods for solving the perturbed Lambert problem using modified Chebyshev-Picard iteration. This tool (unified Lambert tool) consists of four individual algorithms, each of which is unique and better suited for solving a particular type of orbit transfer. The first is a Keplerian Lambert solver, which is used to provide a good initial guess (warm start) for solving the perturbed problem. It is also used to determine the appropriate algorithm to call for solving the perturbed problem. The arc length or true anomaly angle spanned by the transfer trajectory is the parameter that governs the automated selection of the appropriate perturbed algorithm, and is based on the respective algorithm convergence characteristics. The second algorithm solves the perturbed Lambert problem using the modified Chebyshev-Picard iteration two-point boundary value solver. This algorithm does not require a Newton-like shooting method and is the most efficient of the perturbed solvers presented herein; however, the domain of convergence is limited to about a third of an orbit and is dependent on eccentricity. The third algorithm extends the domain of convergence of the modified Chebyshev-Picard iteration two-point boundary value solver to about 90% of an orbit, through regularization with the Kustaanheimo-Stiefel transformation. This is the second most efficient of the perturbed set of algorithms. The fourth algorithm uses the method of particular solutions and the modified Chebyshev-Picard iteration initial value solver for solving multiple revolution perturbed transfers. This method does require "shooting" but differs from Newton-like shooting methods in that it does not require propagation of a state transition matrix. The unified Lambert tool makes use of the General Mission Analysis Tool and we use it to compute thousands of perturbed Lambert trajectories in parallel on the Space Situational Awareness computer cluster at the LASR Lab, Texas A&M University. We demonstrate the power of our tool by solving a highly parallel example problem, that is, the generation of extremal field maps for optimal spacecraft rendezvous (and eventual orbit debris removal). In addition, we demonstrate the need for including perturbative effects in simulations for satellite tracking or data association. The unified Lambert tool is ideal for, but not limited to, space situational awareness applications.
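The automated selection among the three perturbed solvers can be pictured as a dispatch on the fraction of an orbit spanned by the transfer. The sketch below uses the approximate convergence limits quoted in the abstract (about a third of an orbit, about 90% of an orbit) as fixed thresholds. That is a simplification: the abstract notes the first limit depends on eccentricity, and sending every arc beyond 90% of an orbit to the fourth algorithm is our assumption. The function name and return strings are hypothetical.

```python
def select_perturbed_solver(orbit_fraction):
    """Pick a perturbed Lambert algorithm from the transfer arc length.

    orbit_fraction is the arc spanned by the transfer, expressed as a
    fraction of one full orbit (for example, from the Keplerian warm start).
    Thresholds are the approximate figures from the abstract, not values
    taken from the tool itself.
    """
    if orbit_fraction <= 0.0:
        raise ValueError("transfer arc must be positive")
    if orbit_fraction <= 1.0 / 3.0:
        return "MCPI two-point boundary value solver"
    if orbit_fraction <= 0.9:
        return "KS-regularized MCPI two-point boundary value solver"
    return "method of particular solutions with MCPI initial value solver"


short_arc = select_perturbed_solver(0.2)
medium_arc = select_perturbed_solver(0.6)
multi_rev = select_perturbed_solver(2.5)
```

Because each trajectory is dispatched and solved independently, thousands of such calls can be distributed across a cluster with no communication between them.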
Multiscale Modeling of Hall Thrusters. Chapter 7: Plume Modeling

DTIC Science & Technology

2012-03-06

Quasineutral Potential Fix Finally, a "quasi-neutral" switch has been implemented in the Draco Gauss-Seidel Solver. Implementation in the PCG solver is...unlimited. 4 (a) (b) Figure 7.2: Potential solution obtained for a single and multiple (16) zones Hence, part of the development effort went into...be seen from this plot, the two solutions are identical. The division of mesh into multiple zones had another benefit for parallel computations
Hybrid MPI+OpenMP Programming of an Overset CFD Solver and Performance Investigations

NASA Technical Reports Server (NTRS)

Djomehri, M. Jahed; Jin, Haoqiang H.; Biegel, Bryan (Technical Monitor)

2002-01-01

This report describes a two-level parallelization of a Computational Fluid Dynamics (CFD) solver with multi-zone overset structured grids. The approach is based on a hybrid MPI+OpenMP programming model suitable for shared memory and clusters of shared memory machines. The performance investigations of the hybrid application on an SGI Origin2000 (O2K) machine are reported using medium and large scale test problems.
LSRN: A PARALLEL ITERATIVE SOLVER FOR STRONGLY OVER- OR UNDERDETERMINED SYSTEMS*

PubMed Central

Meng, Xiangrui; Saunders, Michael A.; Mahoney, Michael W.

2014-01-01

We describe a parallel iterative least squares solver named LSRN that is based on random normal projection. LSRN computes the min-length solution to $\min_{x \in \mathbb{R}^n} \|Ax - b\|_2$, where $A \in \mathbb{R}^{m \times n}$ with $m \gg n$ or $m \ll n$, and where A may be rank-deficient. Tikhonov regularization may also be included. Since A is involved only in matrix-matrix and matrix-vector multiplications, it can be a dense or sparse matrix or a linear operator, and LSRN automatically speeds up when A is sparse or a fast linear operator. The preconditioning phase consists of a random normal projection, which is embarrassingly parallel, and a singular value decomposition of size $\lceil \gamma \min(m, n) \rceil \times \min(m, n)$, where $\gamma$ is moderately larger than 1, e.g., $\gamma = 2$. We prove that the preconditioned system is well-conditioned, with a strong concentration result on the extreme singular values, and hence that the number of iterations is fully predictable when we apply LSQR or the Chebyshev semi-iterative method. As we demonstrate, the Chebyshev method is particularly efficient for solving large problems on clusters with high communication cost. Numerical results show that on a shared-memory machine, LSRN is very competitive with LAPACK's DGELSD and a fast randomized least squares solver called Blendenpik on large dense problems, and it outperforms the least squares solver from SuiteSparseQR on sparse problems without sparsity patterns that can be exploited to reduce fill-in. Further experiments show that LSRN scales well on an Amazon Elastic Compute Cloud cluster. PMID:25419094
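For the strongly overdetermined case ($m \gg n$), the preconditioning phase described above can be written out as follows. This is our reading of the abstract, filled in with the usual randomized-preconditioning construction; the symbols $G$, $s$, $\tilde{A}$, $N$ and $y$ are introduced here for illustration, and the exact steps should be checked against the paper.

```latex
\begin{align*}
  s &= \lceil \gamma n \rceil, \qquad
  G \in \mathbb{R}^{s \times m}, \quad G_{ij} \sim \mathcal{N}(0, 1)
    \ \text{i.i.d.}
    && \text{(random normal projection)} \\
  \tilde{A} &= G A \in \mathbb{R}^{s \times n}
    && \text{(embarrassingly parallel over blocks of rows of } A\text{)} \\
  \tilde{A} &= \tilde{U} \tilde{\Sigma} \tilde{V}^{T}
    && \text{(compact SVD of size } \lceil \gamma n \rceil \times n\text{)} \\
  N &= \tilde{V} \tilde{\Sigma}^{-1}
    && \text{(right preconditioner)} \\
  \hat{y} &= \arg\min_{y} \| A N y - b \|_2, \qquad \hat{x} = N \hat{y}
    && \text{(solved by LSQR or Chebyshev iteration)}
\end{align*}
```

The matrix $A N$ is the well-conditioned system referred to in the abstract; since its extreme singular values concentrate around values that depend only on $\gamma$, the iteration count can be predicted in advance.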
Verification of the Icarus Material Response Tool

NASA Technical Reports Server (NTRS)

Schroeder, Olivia; Palmer, Grant; Stern, Eric; Schulz, Joseph; Muppidi, Suman; Martin, Alexandre

2017-01-01

Due to the complex physics encountered during reentry, material response solvers are used for two main purposes: to improve the understanding of the physical phenomena, and to design and size thermal protection systems (TPS). Icarus is a three-dimensional, unstructured material response tool that is intended to be used for design while maintaining the flexibility to easily implement physical models as needed. Because TPS selection and sizing is critical, it is of the utmost importance that the design tools be extensively verified and validated before their use. Verification tests aim at ensuring that the numerical schemes and equations are implemented correctly by comparison to analytical solutions and grid convergence tests.
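A grid convergence test of the kind mentioned here usually reduces to computing an observed order of accuracy from errors on two successively refined grids. The sketch below is generic and unrelated to the Icarus code itself: it uses a central-difference derivative of sin(x) as a stand-in for a solver with a known analytical solution, and the function names are hypothetical.

```python
import math


def observed_order(error_coarse, error_fine, refinement_ratio=2.0):
    """Observed order of accuracy from errors on two grids.

    Assumes the error behaves like C * h**p, so that
    p = log(error_coarse / error_fine) / log(refinement_ratio).
    """
    return math.log(error_coarse / error_fine) / math.log(refinement_ratio)


def central_difference(f, x, h):
    """Second-order accurate approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


# Stand-in "solver": differentiate sin at x = 1, where the exact answer is cos(1).
x0 = 1.0
exact = math.cos(x0)
error_coarse = abs(central_difference(math.sin, x0, 0.1) - exact)
error_fine = abs(central_difference(math.sin, x0, 0.05) - exact)

order = observed_order(error_coarse, error_fine)
```

If the observed order matches the formal order of the scheme (here, close to 2), the implementation passes this verification test; a lower value usually points to a coding error or to a grid that is not yet in the asymptotic range.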
Numerical Simulation of the Aircraft Wake Vortex Flowfield

NASA Technical Reports Server (NTRS)

Ahmad, Nashat N.; Proctor, Fred H.; Perry, R. Brad

2013-01-01

The near wake vortex flowfield from a NACA0012 half-wing was simulated using a fully unstructured Navier-Stokes flow solver in three dimensions at a chord Reynolds number of 4.6 million and a Mach number of approximately 0.15. Several simulations were performed to examine the effect of boundary conditions, mesh resolution and turbulence scheme on the formation of the wingtip vortex and its downstream propagation. The standard Spalart-Allmaras turbulence model was compared with the Dacles-Mariani and Spalart-Shur corrections for rotation and curvature effects. The simulation results were evaluated using the data from an experiment performed at NASA Ames' 32in x 48in low speed wind tunnel.
Implicit filtered P_N for high-energy density thermal radiation transport using discontinuous Galerkin finite elements

DOE Office of Scientific and Technical Information (OSTI.GOV)

Laboure, Vincent M., E-mail: vincent.laboure@tamu.edu; McClarren, Ryan G., E-mail: rgm@tamu.edu; Hauck, Cory D., E-mail: hauckc@ornl.gov

2016-09-15

In this work, we provide a fully-implicit implementation of the time-dependent, filtered spherical harmonics (FP_N) equations for non-linear, thermal radiative transfer. We investigate local filtering strategies and analyze the effect of the filter on the conditioning of the system, showing in particular that the filter improves the convergence properties of the iterative solver. We also investigate numerically the rigorous error estimates derived in the linear setting, to determine whether they hold also for the non-linear case. Finally, we simulate a standard test problem on an unstructured mesh and make comparisons with implicit Monte Carlo (IMC) calculations.
Hierarchial parallel computer architecture defined by computational multidisciplinary mechanics

NASA Technical Reports Server (NTRS)

Padovan, Joe; Gute, Doug; Johnson, Keith

1989-01-01

The goal is to develop an architecture for parallel processors enabling optimal handling of multi-disciplinary computation of fluid-solid simulations employing finite element and difference schemes. The goals, philosophical and modeling directions, static and dynamic poly trees, example problems, interpolative reduction, and the impact on solvers are shown in viewgraph form.
DICE/ColDICE: 6D collisionless phase space hydrodynamics using a lagrangian tesselation

NASA Astrophysics Data System (ADS)

Sousbie, Thierry

2018-01-01

DICE is a C++ template library designed to solve collisionless fluid dynamics in 6D phase space using massively parallel supercomputers via a hybrid OpenMP/MPI parallelization. ColDICE, based on DICE, implements a cosmological and physical Vlasov-Poisson solver for cold systems such as dark matter (CDM) dynamics.
Domain decomposition methods for the parallel computation of reacting flows

NASA Technical Reports Server (NTRS)

Keyes, David E.

1988-01-01

Domain decomposition is a natural route to parallel computing for partial differential equation solvers. Subdomains of which the original domain of definition is comprised are assigned to independent processors at the price of periodic coordination between processors to compute global parameters and maintain the requisite degree of continuity of the solution at the subdomain interfaces. In the domain-decomposed solution of steady multidimensional systems of PDEs by finite difference methods using a pseudo-transient version of Newton iteration, the only portion of the computation which generally stands in the way of efficient parallelization is the solution of the large, sparse linear systems arising at each Newton step. For some Jacobian matrices drawn from an actual two-dimensional reacting flow problem, comparisons are made between relaxation-based linear solvers and also preconditioned iterative methods of Conjugate Gradient and Chebyshev type, focusing attention on both iteration count and global inner product count. The generalized minimum residual method with block-ILU preconditioning is judged the best serial method among those considered, and parallel numerical experiments on the Encore Multimax demonstrate for it approximately 10-fold speedup on 16 processors.
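The speedup figure quoted above can be turned into a parallel efficiency with the usual definition E = S / p. The following is a minimal sketch of that arithmetic only; the function name is illustrative and does not come from the paper.

```python
def parallel_efficiency(speedup: float, processors: int) -> float:
    """Return parallel efficiency E = S / p for speedup S on p processors."""
    if processors <= 0:
        raise ValueError("processors must be positive")
    return speedup / processors


# Figures quoted in the abstract: roughly 10-fold speedup on 16 processors.
efficiency = parallel_efficiency(10.0, 16)
print(f"efficiency = {efficiency:.3f}")  # efficiency = 0.625
```

On these numbers, roughly five eighths of the ideal 16-fold speedup is realized; the abstract itself describes the speedup as approximate.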
Parallel Finite Element Domain Decomposition for Structural/Acoustic Analysis

NASA Technical Reports Server (NTRS)

Nguyen, Duc T.; Tungkahotara, Siroj; Watson, Willie R.; Rajan, Subramaniam D.

2005-01-01

A domain decomposition (DD) formulation for solving sparse linear systems of equations resulting from finite element analysis is presented. The formulation incorporates mixed direct and iterative equation solving strategies and other novel algorithmic ideas that are optimized to take advantage of sparsity and exploit modern computer architecture, such as memory and parallel computing. The most time consuming part of the formulation is identified and the critical roles of direct sparse and iterative solvers within the framework of the formulation are discussed. Experiments on several computer platforms using several complex test matrices are conducted using software based on the formulation. Small-scale structural examples are used to validate the steps in the formulation and large-scale (1,000,000+ unknowns) duct acoustic examples are used to evaluate the ORIGIN 2000 processors, and a cluster of 6 PCs (running under the Windows environment). Statistics show that the formulation is efficient in both sequential and parallel computing environments and that the formulation is significantly faster and consumes less memory than that based on one of the best available commercialized parallel sparse solvers.
An efficient spectral crystal plasticity solver for GPU architectures

NASA Astrophysics Data System (ADS)

Malahe, Michael

2018-03-01

We present a spectral crystal plasticity (CP) solver for graphics processing unit (GPU) architectures that achieves a tenfold increase in efficiency over prior GPU solvers. The approach makes use of a database containing a spectral decomposition of CP simulations performed using a conventional iterative solver over a parameter space of crystal orientations and applied velocity gradients. The key improvements in efficiency come from reducing global memory transactions, exposing more instruction-level parallelism, reducing integer instructions and performing fast range reductions on trigonometric arguments. The scheme also makes more efficient use of memory than prior work, allowing for larger problems to be solved on a single GPU. We illustrate these improvements with a simulation of 390 million crystal grains on a consumer-grade GPU, which executes at a rate of 2.72 s per strain step.
Development of a parallel FE simulator for modeling the whole trans-scale failure process of rock from meso- to engineering-scale

NASA Astrophysics Data System (ADS)

Li, Gen; Tang, Chun-An; Liang, Zheng-Zhao

2017-01-01

Multi-scale high-resolution modeling of rock failure process is a powerful means in modern rock mechanics studies to reveal the complex failure mechanism and to evaluate engineering risks. However, multi-scale continuous modeling of rock, from deformation, damage to failure, has raised high requirements on the design, implementation scheme and computation capacity of the numerical software system. This study is aimed at developing the parallel finite element procedure, a parallel rock failure process analysis (RFPA) simulator that is capable of modeling the whole trans-scale failure process of rock. Based on the statistical meso-damage mechanical method, the RFPA simulator is able to construct heterogeneous rock models with multiple mechanical properties, deal with and represent the trans-scale propagation of cracks, in which the stress and strain fields are solved for the damage evolution analysis of representative volume element by the parallel finite element method (FEM) solver. This paper describes the theoretical basis of the approach and provides the details of the parallel implementation on a Windows - Linux interactive platform. A numerical model is built to test the parallel performance of FEM solver. Numerical simulations are then carried out on a laboratory-scale uniaxial compression test, and field-scale net fracture spacing and engineering-scale rock slope examples, respectively. The simulation results indicate that relatively high speedup and computation efficiency can be achieved by the parallel FEM solver with a reasonable boot process. In laboratory-scale simulation, the well-known physical phenomena, such as the macroscopic fracture pattern and stress-strain responses, can be reproduced. In field-scale simulation, the formation process of net fracture spacing from initiation, propagation to saturation can be revealed completely. In engineering-scale simulation, the whole progressive failure process of the rock slope can be well modeled. 
It is shown that the parallel FE simulator developed in this study is an efficient tool for modeling the whole trans-scale failure process of rock from meso- to engineering-scale.
An adjoint view on flux consistency and strong wall boundary conditions to the Navier–Stokes equations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Stück, Arthur, E-mail: arthur.stueck@dlr.de

2015-11-15

Inconsistent discrete expressions in the boundary treatment of Navier–Stokes solvers and in the definition of force objective functionals can lead to discrete-adjoint boundary treatments that are not a valid representation of the boundary conditions to the corresponding adjoint partial differential equations. The underlying problem is studied for an elementary 1D advection–diffusion problem first using a node-centred finite-volume discretisation. The defect of the boundary operators in the inconsistently defined discrete-adjoint problem leads to oscillations and becomes evident with the additional insight of the continuous-adjoint approach. A homogenisation of the discretisations for the primal boundary treatment and the force objective functional yields second-order functional accuracy and eliminates the defect in the discrete-adjoint boundary treatment. Subsequently, the issue is studied for aerodynamic Reynolds-averaged Navier–Stokes problems in conjunction with a standard finite-volume discretisation on median-dual grids and a strong implementation of no-slip walls, found in many unstructured general-purpose flow solvers. Starting from a baseline discretisation of force objective functionals which is independent of the boundary treatment in the flow solver, two improved flux-consistent schemes are presented; based on either body wall-defined or farfield-defined control-volumes they resolve the dual inconsistency. The behaviour of the schemes is investigated on a sequence of grids in 2D and 3D.
Introducing a distributed unstructured mesh into gyrokinetic particle-in-cell code, XGC

NASA Astrophysics Data System (ADS)

Yoon, Eisung; Shephard, Mark; Seol, E. Seegyoung; Kalyanaraman, Kaushik

2017-10-01

XGC has shown good scalability for large leadership supercomputers. The current production version uses a copy of the entire unstructured finite element mesh on every MPI rank. Although an obvious scalability issue if the mesh sizes are to be dramatically increased, the current approach is also not optimal with respect to data locality of particles and mesh information. To address these issues we have initiated the development of a distributed mesh PIC method. This approach directly addresses the base scalability issue with respect to mesh size and, through the use of a mesh entity centric view of the particle mesh relationship, provides opportunities to address data locality needs of many core and GPU supported heterogeneous systems. The parallel mesh PIC capabilities are being built on the Parallel Unstructured Mesh Infrastructure (PUMI). The presentation will first overview the form of mesh distribution used and indicate the structures and functions used to support the mesh, the particles and their interaction. Attention will then focus on the node-level optimizations being carried out to ensure performant operation of all PIC operations on the distributed mesh. Partnership for Edge Physics Simulation (EPSI) Grant No. DE-SC0008449 and Center for Extended Magnetohydrodynamic Modeling (CEMM) Grant No. DE-SC0006618.

Advanced Computational Methods for Security Constrained Financial Transmission Rights: Structure and Parallelism

DOE Office of Scientific and Technical Information (OSTI.GOV)

Elbert, Stephen T.; Kalsi, Karanjit; Vlachopoulou, Maria

Financial Transmission Rights (FTRs) help power market participants reduce price risks associated with transmission congestion. FTRs are issued based on a process of solving a constrained optimization problem with the objective to maximize the FTR social welfare under power flow security constraints. Security constraints for different FTR categories (monthly, seasonal or annual) are usually coupled and the number of constraints increases exponentially with the number of categories. Commercial software for FTR calculation can only provide limited categories of FTRs due to the inherent computational challenges mentioned above. In this paper, a novel non-linear dynamical system (NDS) approach is proposed to solve the optimization problem. The new formulation and performance of the NDS solver is benchmarked against widely used linear programming (LP) solvers like CPLEX™ and tested on large-scale systems using data from the Western Electricity Coordinating Council (WECC). The NDS is demonstrated to outperform the widely used CPLEX algorithms while exhibiting superior scalability. Furthermore, the NDS-based solver can be easily parallelized, which results in significant computational improvement.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Ghysels, Pieter; Li, Xiaoye S.; Rouet, Francois -Henry

Here, we present a sparse linear system solver that is based on a multifrontal variant of Gaussian elimination and exploits low-rank approximation of the resulting dense frontal matrices. We use hierarchically semiseparable (HSS) matrices, which have low-rank off-diagonal blocks, to approximate the frontal matrices. For HSS matrix construction, a randomized sampling algorithm is used together with interpolative decompositions. The combination of the randomized compression with a fast ULV HSS factorization leads to a solver with lower computational complexity than the standard multifrontal method for many applications, resulting in speedups of up to 7-fold for problems in our test suite. The implementation targets many-core systems by using task parallelism with dynamic runtime scheduling. Numerical experiments show performance improvements over state-of-the-art sparse direct solvers. The implementation achieves high performance and good scalability on a range of modern shared memory parallel systems, including the Intel Xeon Phi (MIC). The code is part of a software package called STRUMPACK - STRUctured Matrices PACKage, which also has a distributed memory component for dense rank-structured matrices.
Computational Study of a McDonnell Douglas Single-Stage-to-Orbit Vehicle Concept for Aerodynamic Analysis

NASA Technical Reports Server (NTRS)

Prabhu, Ramadas K.

1996-01-01

This paper presents the results of a computational flow analysis of the McDonnell Douglas single-stage-to-orbit vehicle concept designated as the 24U. This study was made to determine the aerodynamic characteristics of the vehicle with and without body flaps over an angle of attack range of 20-40 deg. Computations were made at a flight Mach number of 20 at 200,000 ft. altitude with equilibrium air, and a Mach number of 6 with CF4 gas. The software package FELISA (Finite Element Langley Imperial College Swansea Ames) was used for all the computations. The FELISA software consists of unstructured surface and volume grid generators, and inviscid flow solvers with (1) perfect gas option for subsonic, transonic, and low supersonic speeds, and (2) perfect gas, equilibrium air, and CF4 options for hypersonic speeds. The hypersonic flow solvers with equilibrium air and CF4 options were used in the present studies. Results are compared with other computational results and hypersonic CF4 tunnel test data.
Domain modeling and grid generation for multi-block structured grids with application to aerodynamic and hydrodynamic configurations

NASA Technical Reports Server (NTRS)

Spekreijse, S. P.; Boerstoel, J. W.; Vitagliano, P. L.; Kuyvenhoven, J. L.

1992-01-01

About five years ago, a joint development was started of a flow simulation system for engine-airframe integration studies on propeller as well as jet aircraft. The initial system was based on the Euler equations and made operational for industrial aerodynamic design work. The system consists of three major components: a domain modeller, for the graphical interactive subdivision of flow domains into an unstructured collection of blocks; a grid generator, for the graphical interactive computation of structured grids in blocks; and a flow solver, for the computation of flows on multi-block grids. The industrial partners of the collaboration and NLR have demonstrated that the domain modeller, grid generator and flow solver can be applied to simulate Euler flows around complete aircraft, including propulsion system simulation. Extension to Navier-Stokes flows is in progress. Delft Hydraulics has shown that both the domain modeller and grid generator can also be applied successfully to hydrodynamic configurations. An overview is given of the main aspects of both domain modelling and grid generation.
Scalable smoothing strategies for a geometric multigrid method for the immersed boundary equations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bhalla, Amneet Pal Singh; Knepley, Matthew G.; Adams, Mark F.

2016-12-20

The immersed boundary (IB) method is a widely used approach to simulating fluid-structure interaction (FSI). Although explicit versions of the IB method can suffer from severe time step size restrictions, these methods remain popular because of their simplicity and generality. In prior work (Guy et al., Adv Comput Math, 2015), some of us developed a geometric multigrid preconditioner for a stable semi-implicit IB method under Stokes flow conditions; however, this solver methodology used a Vanka-type smoother that presented limited opportunities for parallelization. This work extends this Stokes-IB solver methodology by developing smoothing techniques that are suitable for parallel implementation. Specifically, we demonstrate that an additive version of the Vanka smoother can yield an effective multigrid preconditioner for the Stokes-IB equations, and we introduce an efficient Schur complement-based smoother that is also shown to be effective for the Stokes-IB equations. We investigate the performance of these solvers for a broad range of material stiffnesses, both for Stokes flows and flows at nonzero Reynolds numbers, and for thick and thin structural models. We show here that linear solver performance degrades with increasing Reynolds number and material stiffness, especially for thin interface cases. Nonetheless, the proposed approaches promise to yield effective solution algorithms, especially at lower Reynolds numbers and at modest-to-high elastic stiffnesses.
Parallel satellite orbital situational problems solver for space missions design and control

NASA Astrophysics Data System (ADS)

Atanassov, Atanas Marinov

2016-11-01

Solving different scientific problems for space applications demands the implementation of observations, measurements or active experiments during time intervals in which specific geometric and physical conditions are fulfilled. Solving situational problems to determine the time intervals in which the satellite instruments work optimally is a very important part of all activities at every stage of the preparation and realization of space missions. The elaboration of a universal, flexible and robust approach to situation analysis, which is easily portable to new satellite missions, is significant for reducing mission preparation times and costs. Every situation problem can be based on one or more situation conditions. Simultaneously solving different kinds of situation problems based on different numbers and types of situational conditions, each of them satisfied on different segments of the satellite orbit, requires irregular calculations. Three formal approaches are presented. The first is related to a description of situation problems that allows flexibility in assembling situation problems and representing them in computer memory. The second is connected with the development of a situation problem solver organized as a processor that executes specific code for every particular situational condition. The third is related to solver parallelization utilizing threads and dynamic scheduling based on a "pool of threads" abstraction, which ensures a good load balance. The developed situation problems solver is intended for incorporation into multi-physics, multi-satellite space mission design and simulation tools.
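The "pool of threads" idea with dynamic scheduling can be sketched with Python's standard `concurrent.futures` module: irregular tasks are submitted to a fixed pool and each idle worker picks up the next one. This is only an illustration under stated assumptions. The helper names (`satisfied_intervals`, `solve_all`), the sampled values and the threshold conditions are hypothetical and are not taken from the paper, whose solver is not written this way as far as the abstract says.

```python
from concurrent.futures import ThreadPoolExecutor


def satisfied_intervals(samples, condition):
    """Return (first, last) index pairs of consecutive samples meeting condition."""
    intervals, start = [], None
    for i, value in enumerate(samples):
        ok = condition(value)
        if ok and start is None:
            start = i                      # an interval opens here
        elif not ok and start is not None:
            intervals.append((start, i - 1))  # the interval closed at i - 1
            start = None
    if start is not None:                  # interval still open at the end
        intervals.append((start, len(samples) - 1))
    return intervals


def solve_all(problems, workers=4):
    """Solve independent situation problems on a shared pool of threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(satisfied_intervals, s, c) for s, c in problems]
        return [f.result() for f in futures]  # results in submission order


# Hypothetical problems: (sampled quantity along the orbit, condition on it).
problems = [
    ([0, 5, 12, 15, 3, 11], lambda x: x > 10),
    ([1, 2, 3], lambda x: x < 0),
    ([7, 8, 9, 1], lambda x: x >= 7),
]
results = solve_all(problems)
print(results)  # [[(2, 3), (5, 5)], [], [(0, 2)]]
```

Because tasks are handed out as workers become free, problems of uneven cost do not leave threads idle for long. In CPython, threads do not run pure-Python computation in parallel, so this shows the scheduling pattern rather than a speedup.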
Implementation of density-based solver for all speeds in the framework of OpenFOAM

NASA Astrophysics Data System (ADS)

Shen, Chun; Sun, Fengxian; Xia, Xinlin

2014-10-01

In the framework of the open source CFD code OpenFOAM, a density-based solver for flow fields at all speeds is developed. In this solver the preconditioned all speeds AUSM+(P) scheme is adopted and the dual time scheme is implemented to complete the unsteady process. Parallel computation could be implemented to accelerate the solving process. Different interface reconstruction algorithms are implemented, and their accuracy with respect to convection is compared. Three benchmark tests of lid-driven cavity flow, flow crossing over a bump, and flow over a forward-facing step are presented to show the accuracy of the AUSM+(P) solver for low-speed incompressible flow, transonic flow, and supersonic/hypersonic flow. Firstly, for the lid driven cavity flow, the computational results obtained by different interface reconstruction algorithms are compared. It is indicated that the one dimensional reconstruction scheme adopted in this solver possesses high accuracy and the solver developed in this paper can effectively catch the features of low-speed incompressible flow. Then via the test cases regarding the flow crossing over bump and over forward step, the ability to capture characteristics of the transonic and supersonic/hypersonic flows is confirmed. The forward-facing step proves to be the most challenging for the preconditioned solvers with and without the dual time scheme. Nonetheless, the solvers described in this paper reproduce the main features of this flow, including the evolution of the initial transient.
Parallel Implementation of the Discontinuous Galerkin Method

NASA Technical Reports Server (NTRS)

Baggag, Abdalkader; Atkins, Harold; Keyes, David

1999-01-01

This paper describes a parallel implementation of the discontinuous Galerkin method. Discontinuous Galerkin is a spatially compact method that retains its accuracy and robustness on non-smooth unstructured grids and is well suited for time dependent simulations. Several parallelization approaches are studied and evaluated. The most natural and symmetric of the approaches has been implemented in an object-oriented code used to simulate aeroacoustic scattering. The parallel implementation is MPI-based and has been tested on various parallel platforms such as the SGI Origin, IBM SP2, and clusters of SGI and Sun workstations. The scalability results presented for the SGI Origin show slightly superlinear speedup on a fixed-size problem due to cache effects.
AZTEC. Parallel Iterative method Software for Solving Linear Systems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hutchinson, S.; Shadid, J.; Tuminaro, R.

1995-07-01

AZTEC is an iterative library that greatly simplifies the parallelization process when solving the linear system of equations Ax=b, where A is a user supplied n X n sparse matrix, b is a user supplied vector of length n and x is a vector of length n to be computed. AZTEC is intended as a software tool for users who want to avoid cumbersome parallel programming details but who have large sparse linear systems which require an efficiently utilized parallel processing system. A collection of data transformation tools is provided that allows for easy creation of distributed sparse unstructured matrices for parallel solutions.
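To make the problem statement Ax=b concrete, here is a minimal serial sketch of one classic iterative method, conjugate gradients, applied to a small sparse symmetric positive definite matrix. It is not AZTEC's API and involves no parallelism or data distribution; the dictionary-of-rows storage, the function names and the 5 x 5 test matrix are assumptions made for illustration.

```python
def matvec(rows, x):
    """Multiply a sparse matrix stored as {row: {col: value}} by vector x."""
    return [sum(v * x[j] for j, v in rows[i].items()) for i in range(len(x))]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(rows, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A, starting from x = 0."""
    x = [0.0] * len(b)
    r = list(b)            # residual b - A x for the zero initial guess
    p = list(r)            # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        Ap = matvec(rows, p)
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Assumed test matrix: tridiagonal (-1, 2, -1), a 1D Laplacian with n = 5.
n = 5
rows = {i: {i: 2.0} for i in range(n)}
for i in range(n - 1):
    rows[i][i + 1] = -1.0
    rows[i + 1][i] = -1.0
b = [1.0] * n

x = conjugate_gradient(rows, b)
residual = max(abs(ax - bi) for ax, bi in zip(matvec(rows, x), b))
print(residual < 1e-8)  # True
```

For this right-hand side the exact solution is (2.5, 4, 4.5, 4, 2.5), which the iteration reaches within a handful of steps. A parallel library additionally has to distribute the rows and vectors across processes and communicate during the matrix-vector and inner products, which is the part the abstract says AZTEC hides from the user.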
StagBL : A Scalable, Portable, High-Performance Discretization and Solver Layer for Geodynamic Simulation

NASA Astrophysics Data System (ADS)

Sanan, P.; Tackley, P. J.; Gerya, T.; Kaus, B. J. P.; May, D.

2017-12-01

StagBL is an open-source parallel solver and discretization library for geodynamic simulation, encapsulating and optimizing operations essential to staggered-grid finite volume Stokes flow solvers. It provides a parallel staggered-grid abstraction with a high-level interface in C and Fortran. On top of this abstraction, tools are available to define boundary conditions and interact with particle systems. Tools and examples to efficiently solve Stokes systems defined on the grid are provided in small (direct solver), medium (simple preconditioners), and large (block factorization and multigrid) model regimes. By working directly with leading application codes (StagYY, I3ELVIS, and LaMEM) and providing an API and examples to integrate with others, StagBL aims to become a community tool supplying scalable, portable, reproducible performance toward novel science in regional- and planet-scale geodynamics and planetary science. By implementing kernels used by many research groups beneath a uniform abstraction layer, the library will enable optimization for modern hardware, thus reducing community barriers to large- or extreme-scale parallel simulation on modern architectures. In particular, the library will include CPU-, Manycore-, and GPU-optimized variants of matrix-free operators and multigrid components. The common layer provides a framework upon which to introduce innovative new tools. StagBL will leverage p4est to provide distributed adaptive meshes, and incorporate a multigrid convergence analysis tool. These options, in addition to a wealth of solver options provided by an interface to PETSc, will make the most modern solution techniques available from a common interface. StagBL in turn provides a PETSc interface, DMStag, to its central staggered grid abstraction. We present public version 0.5 of StagBL, including preliminary integration with application codes and demonstrations with its own demonstration application, StagBLDemo.
Central to StagBL is the notion of an uninterrupted pipeline from toy/teaching codes to high-performance, extreme-scale solves. StagBLDemo replicates the functionality of an advanced MATLAB-style regional geodynamics code, thus providing users with a concrete procedure to exceed the performance and scalability limitations of smaller-scale tools.
Visualization of Unsteady Computational Fluid Dynamics

NASA Technical Reports Server (NTRS)

Haimes, Robert

1997-01-01

The current compute environment that most researchers are using for the calculation of 3D unsteady Computational Fluid Dynamic (CFD) results is a super-computer class machine. The Massively Parallel Processors (MPP's) such as the 160 node IBM SP2 at NAS and clusters of workstations acting as a single MPP (like NAS's SGI Power-Challenge array and the J90 cluster) provide the required computation bandwidth for CFD calculations of transient problems. If we follow the traditional computational analysis steps for CFD (and we wish to construct an interactive visualizer) we need to be aware of the following: (1) Disk space requirements. A single snap-shot must contain at least the values (primitive variables) stored at the appropriate locations within the mesh. For most simple 3D Euler solvers that means 5 floating point words. Navier-Stokes solutions with turbulence models may contain 7 state-variables. (2) Disk speed vs. Computational speeds. The time required to read the complete solution of a saved time frame from disk is now longer than the compute time for a set number of iterations from an explicit solver. Depending on the hardware and solver, an iteration of an implicit code may also take less time than reading the solution from disk. If one examines the performance improvements in the last decade or two, it is easy to see that depending on disk performance (vs. CPU improvement) may not be the best method for enhancing interactivity. (3) Cluster and Parallel Machine I/O problems. Disk access time is much worse within current parallel machines and clusters of workstations that are acting in concert to solve a single problem. In this case we are not trying to read the volume of data, but are running the solver and the solver outputs the solution. These traditional network interfaces must be used for the file system. (4) Numerics of particle traces.
Most visualization tools can work upon a single snap shot of the data but some visualization tools for transient problems require dealing with time.
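The disk-space point (1) above is simple arithmetic: snapshot size is the number of mesh locations times the words stored per location times the word size. The sketch below assumes 4-byte (single precision) words and a hypothetical mesh of one million nodes; neither figure comes from the abstract, which only gives the 5 and 7 words per location.

```python
def snapshot_bytes(nodes: int, words_per_node: int, bytes_per_word: int = 4) -> int:
    """Minimum size of one solution snapshot, ignoring mesh and file headers."""
    return nodes * words_per_node * bytes_per_word


nodes = 1_000_000                       # hypothetical mesh size
euler = snapshot_bytes(nodes, 5)        # 5 words per node (3D Euler)
navier_stokes = snapshot_bytes(nodes, 7)  # 7 state variables per node

print(euler)          # 20000000
print(navier_stokes)  # 28000000
```

An unsteady run that saves many time frames multiplies these sizes by the number of frames, which is why the storage and read-time concerns in points (1) and (2) grow quickly for transient problems.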
Discrete Adjoint-Based Design for Unsteady Turbulent Flows On Dynamic Overset Unstructured Grids

NASA Technical Reports Server (NTRS)

Nielsen, Eric J.; Diskin, Boris

2012-01-01

A discrete adjoint-based design methodology for unsteady turbulent flows on three-dimensional dynamic overset unstructured grids is formulated, implemented, and verified. The methodology supports both compressible and incompressible flows and is amenable to massively parallel computing environments. The approach provides a general framework for performing highly efficient and discretely consistent sensitivity analysis for problems involving arbitrary combinations of overset unstructured grids which may be static, undergoing rigid or deforming motions, or any combination thereof. General parent-child motions are also accommodated, and the accuracy of the implementation is established using an independent verification based on a complex-variable approach. The methodology is used to demonstrate aerodynamic optimizations of a wind turbine geometry, a biologically-inspired flapping wing, and a complex helicopter configuration subject to trimming constraints. The objective function for each problem is successfully reduced and all specified constraints are satisfied.
Efficient Parallel Formulations of Hierarchical Methods and Their Applications

NASA Astrophysics Data System (ADS)

Grama, Ananth Y.

1996-01-01

Hierarchical methods such as the Fast Multipole Method (FMM) and Barnes-Hut (BH) are used for rapid evaluation of potential (gravitational, electrostatic) fields in particle systems. They are also used for solving integral equations using boundary element methods. The linear systems arising from these methods are dense and are solved iteratively. Hierarchical methods reduce the complexity of the core matrix-vector product from O(n^2) to O(n log n) and the memory requirement from O(n^2) to O(n). We have developed highly scalable parallel formulations of a hybrid FMM/BH method that are capable of handling arbitrarily irregular distributions. We apply these formulations to astrophysical simulations of Plummer and Gaussian galaxies. We have used our parallel formulations to solve the integral form of the Laplace equation. We show that our parallel hierarchical mat-vecs yield high efficiency and overall performance even on relatively small problems. A problem containing approximately 200K nodes takes under a second to compute on 256 processors and yet yields over 85% efficiency. The efficiency and raw performance are expected to increase for bigger problems. For the 200K node problem, our code delivers about 5 GFLOPS of performance on a 256 processor T3D. This is impressive considering the fact that the problem has floating point divides and roots, and very little locality, resulting in poor cache performance. A dense matrix-vector product of the same dimensions would require about 0.5 TeraBytes of memory and about 770 TeraFLOPS of computing speed. Clearly, if the loss in accuracy resulting from the use of hierarchical methods is acceptable, our code yields significant savings in time and memory. We also study the convergence of a GMRES solver built around this mat-vec. We accelerate the convergence of the solver using three preconditioning techniques: diagonal scaling, block-diagonal preconditioning, and inner-outer preconditioning.
We study the performance and parallel efficiency of these preconditioned solvers. Using this solver, we solve dense linear systems with hundreds of thousands of unknowns. Solving a 105K unknown problem takes about 10 minutes on a 64 processor T3D. Until very recently, boundary element problems of this magnitude could not even be generated, let alone solved.
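The complexity reduction quoted above, from O(n^2) to O(n log n) per matrix-vector product, can be given a rough sense of scale for the 200K-node problem. The sketch below compares the two growth terms only. It deliberately ignores constant factors, which are large for hierarchical methods, and the base-2 logarithm is an assumption, so the ratio is an order-of-magnitude indication rather than a measured saving.

```python
import math


def dense_matvec_terms(n: int) -> float:
    """Growth term of a dense matrix-vector product: n^2."""
    return float(n) * n


def hierarchical_matvec_terms(n: int) -> float:
    """Growth term of a hierarchical mat-vec: n log2(n), constants ignored."""
    return n * math.log2(n)


n = 200_000  # problem size quoted in the abstract
ratio = dense_matvec_terms(n) / hierarchical_matvec_terms(n)
print(f"n^2 / (n log2 n) is about {ratio:.0f}")
```

For n = 200,000 the ratio n / log2(n) is on the order of ten thousand, which is consistent with the abstract's remark that hierarchical methods pay off when their loss in accuracy is acceptable.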
BOOK REVIEW: Advanced Topics in Computational Partial Differential Equations: Numerical Methods and Diffpack Programming

NASA Astrophysics Data System (ADS)

Katsaounis, T. D.

2005-02-01

The scope of this book is to present well known simple and advanced numerical methods for solving partial differential equations (PDEs) and how to implement these methods using the programming environment of the software package Diffpack. A basic background in PDEs and numerical methods is required by the potential reader. Further, a basic knowledge of the finite element method and its implementation in one and two space dimensions is required. The authors claim that no prior knowledge of the package Diffpack is required, which is true, but the reader should be at least familiar with an object oriented programming language like C++ in order to better comprehend the programming environment of Diffpack. Certainly, a prior knowledge or usage of Diffpack would be a great advantage to the reader. The book consists of 15 chapters, each one written by one or more authors. Each chapter is basically divided into two parts: the first part is about mathematical models described by PDEs and numerical methods to solve these models and the second part describes how to implement the numerical methods using the programming environment of Diffpack. Each chapter closes with a list of references on its subject. The first nine chapters cover well known numerical methods for solving the basic types of PDEs. Further, programming techniques on the serial as well as on the parallel implementation of numerical methods are also included in these chapters. The last five chapters are dedicated to applications, modelled by PDEs, in a variety of fields. The first chapter is an introduction to parallel processing. It covers fundamentals of parallel processing in a simple and concrete way and no prior knowledge of the subject is required. Examples of parallel implementation of basic linear algebra operations are presented using the Message Passing Interface (MPI) programming environment. Here, some knowledge of MPI routines is required by the reader. 
Examples solving in parallel simple PDEs using Diffpack and MPI are also presented. Chapter 2 presents the overlapping domain decomposition method for solving PDEs. It is well known that these methods are suitable for parallel processing. The first part of the chapter covers the mathematical formulation of the method as well as algorithmic and implementational issues. The second part presents a serial and a parallel implementational framework within the programming environment of Diffpack. The chapter closes by showing how to solve two application examples with the overlapping domain decomposition method using Diffpack. Chapter 3 is a tutorial about how to incorporate the multigrid solver in Diffpack. The method is illustrated by examples such as a Poisson solver, a general elliptic problem with various types of boundary conditions and a nonlinear Poisson type problem. In chapter 4 the mixed finite element is introduced. Technical issues concerning the practical implementation of the method are also presented. The main difficulties of the efficient implementation of the method, especially in two and three space dimensions on unstructured grids, are presented and addressed in the framework of Diffpack. The implementational process is illustrated by two examples, namely the system formulation of the Poisson problem and the Stokes problem. Chapter 5 is closely related to chapter 4 and addresses the problem of how to solve efficiently the linear systems arising by the application of the mixed finite element method. The proposed method is block preconditioning. Efficient techniques for implementing the method within Diffpack are presented. Optimal block preconditioners are used to solve the system formulation of the Poisson problem, the Stokes problem and the bidomain model for the electrical activity in the heart. The subject of chapter 6 is systems of PDEs. Linear and nonlinear systems are discussed. Fully implicit and operator splitting methods are presented. 
Special attention is paid to how existing solvers for scalar equations in Diffpack can be used to derive fully implicit solvers for systems. The proposed techniques are illustrated in terms of two applications, namely a system of PDEs modelling pipeflow and a two-phase porous media flow. Stochastic PDEs is the topic of chapter 7. The first part of the chapter is a simple introduction to stochastic PDEs; basic analytical properties are presented for simple models like transport phenomena and viscous drag forces. The second part considers the numerical solution of stochastic PDEs. Two basic techniques are presented, namely Monte Carlo and perturbation methods. The last part explains how to implement and incorporate these solvers into Diffpack. Chapter 8 describes how to operate Diffpack from Python scripts. The main goal here is to provide all the programming and technical details in order to glue the programming environment of Diffpack with visualization packages through Python and in general take advantage of the Python interfaces. Chapter 9 attempts to show how to use numerical experiments to measure the performance of various PDE solvers. The authors gathered a rather impressive list, a total of 14 PDE solvers. Solvers for problems like Poisson, Navier--Stokes, elasticity, two-phase flows and methods such as finite difference, finite element, multigrid, and gradient type methods are presented. The authors provide a series of numerical results combining various solvers with various methods in order to gain insight into their computational performance and efficiency. In Chapter 10 the authors consider a computationally challenging problem, namely the computation of the electrical activity of the human heart. After a brief introduction on the biology of the problem the authors present the mathematical models involved and a numerical method for solving them within the framework of Diffpack. 
Chapters 11 and 12 are closely related; actually they could have been combined in a single chapter. Chapter 11 introduces several mathematical models used in finance, based on the Black--Scholes equation. Chapter 12 considers several numerical methods like Monte Carlo, lattice methods, finite difference and finite element methods. Implementation of these methods within Diffpack is presented in the last part of the chapter. Chapter 13 presents how the finite element method is used for the modelling and analysis of elastic structures. The authors describe the structural elements of Diffpack which include popular elements such as beams and plates and examples are presented on how to use them to simulate elastic structures. Chapter 14 describes an application problem, namely the extrusion of aluminum. This is a rather complicated process which involves non-Newtonian flow, heat transfer and elasticity. The authors describe the systems of PDEs modelling the underlying process and use a finite element method to obtain a numerical solution. The implementation of the numerical method in Diffpack is presented along with some applications. The last chapter, chapter 15, focuses on mathematical and numerical models of systems of PDEs governing geological processes in sedimentary basins. The underlying mathematical model is solved using the finite element method within a fully implicit scheme. The authors discuss the implementational issues involved within Diffpack and they present results from several examples. In summary, the book focuses on the computational and implementational issues involved in solving partial differential equations. The potential reader should have a basic knowledge of PDEs and the finite difference and finite element methods. The examples presented are solved within the programming framework of Diffpack and the reader should have prior experience with the particular software in order to take full advantage of the book.
Overall the book is well written, the subject of each chapter is well presented and can serve as a reference for graduate students, researchers and engineers who are interested in the numerical solution of partial differential equations modelling various applications.
Visualization Co-Processing of a CFD Simulation

NASA Technical Reports Server (NTRS)

Vaziri, Arsi

1999-01-01

OVERFLOW, a widely used CFD simulation code, is combined with a visualization system, pV3, to experiment with an environment for simulation/visualization co-processing on an SGI Origin 2000 (O2K) computer system. The shared memory version of the solver is used with the O2K 'pfa' preprocessor invoked to automatically discover parallelism in the source code. No other explicit parallelism is enabled. In order to study the scaling and performance of the visualization co-processing system, sample runs are made with different processor groups in the range of 1 to 254 processors. The data exchange between the visualization system and the simulation system is rapid enough for user interactivity when the problem size is small. This shared memory version of OVERFLOW, with minimal parallelization, does not scale well to an increasing number of available processors. The visualization task takes about 18 to 30% of the total processing time and does not appear to be a major contributor to the poor scaling. Improper load balancing and inter-processor communication overhead are contributors to this poor performance. Work is in progress which is aimed at obtaining improved parallel performance of the solver and removing the limitations of serial data transfer to pV3 by examining various parallelization/communication strategies, including the use of explicit message passing.
Implementation of a Pseudo-Bending Seismic Travel-Time Calculator in a Distributed Parallel Computing Environment

DTIC Science & Technology

2008-09-01

algorithms that have been proposed to accomplish it fall into three broad categories. Eikonal solvers (e.g., Vidale, 1988, 1990; Podvin and Lecomte, 1991...difference eikonal solvers, the FMM algorithm works by following a wavefront as it moves across a volume of grid points, updating the travel times in...the grid according to the eikonal differential equation, using a second-order finite-difference scheme. We chose to use FMM for our comparison because
Massively Parallel Solution of Poisson Equation on Coarse Grain MIMD Architectures

NASA Technical Reports Server (NTRS)

Fijany, A.; Weinberger, D.; Roosta, R.; Gulati, S.

1998-01-01

In this paper a new algorithm, designated as Fast Invariant Imbedding algorithm, for solution of Poisson equation on vector and massively parallel MIMD architectures is presented. This algorithm achieves the same optimal computational efficiency as other Fast Poisson solvers while offering a much better structure for vector and parallel implementation. Our implementation on the Intel Delta and Paragon shows that a speedup of over two orders of magnitude can be achieved even for moderate size problems.
An object-oriented approach for parallel self adaptive mesh refinement on block structured grids

NASA Technical Reports Server (NTRS)

Lemke, Max; Witsch, Kristian; Quinlan, Daniel

1993-01-01

Self-adaptive mesh refinement dynamically matches the computational demands of a solver for partial differential equations to the activity in the application's domain. In this paper we present two C++ class libraries, P++ and AMR++, which significantly simplify the development of sophisticated adaptive mesh refinement codes on (massively) parallel distributed memory architectures. The development is based on our previous research in this area. The C++ class libraries provide abstractions to separate the issues of developing parallel adaptive mesh refinement applications into those of parallelism, abstracted by P++, and adaptive mesh refinement, abstracted by AMR++. P++ is a parallel array class library to permit efficient development of architecture independent codes for structured grid applications, and AMR++ provides support for self-adaptive mesh refinement on block-structured grids of rectangular non-overlapping blocks. Using these libraries, the application programmers' work is greatly simplified to primarily specifying the serial single grid application and obtaining the parallel and self-adaptive mesh refinement code with minimal effort. Initial results for simple singular perturbation problems solved by self-adaptive multilevel techniques (FAC, AFAC), being implemented on the basis of prototypes of the P++/AMR++ environment, are presented. Singular perturbation problems frequently arise in large applications, e.g. in the area of computational fluid dynamics. They usually have solutions with layers which require adaptive mesh refinement and fast basic solvers in order to be resolved efficiently.
FleCSPH - a parallel and distributed SPH implementation based on the FleCSI framework

DOE Office of Scientific and Technical Information (OSTI.GOV)

Junghans, Christoph; Loiseau, Julien

2017-06-20

FleCSPH is a multi-physics compact application that exercises FleCSI parallel data structures for tree-based particle methods. In particular, FleCSPH implements a smoothed-particle hydrodynamics (SPH) solver for the solution of Lagrangian problems in astrophysics and cosmology. FleCSPH includes support for gravitational forces using the fast multipole method (FMM).
Shared Memory Parallelism for 3D Cartesian Discrete Ordinates Solver

NASA Astrophysics Data System (ADS)

Moustafa, Salli; Dutka-Malen, Ivan; Plagne, Laurent; Ponçot, Angélique; Ramet, Pierre

2014-06-01

This paper describes the design and the performance of DOMINO, a 3D Cartesian SN solver that implements two nested levels of parallelism (multicore+SIMD) on shared memory computation nodes. DOMINO is written in C++, a multi-paradigm programming language that enables the use of powerful and generic parallel programming tools such as Intel TBB and Eigen. These two libraries allow us to combine multi-thread parallelism with vector operations in an efficient and yet portable way. As a result, DOMINO can exploit the full power of modern multi-core processors and is able to tackle very large simulations, that usually require large HPC clusters, using a single computing node. For example, DOMINO solves a 3D full core PWR eigenvalue problem involving 26 energy groups, 288 angular directions (S16), 46 × 10^6 spatial cells and 1 × 10^12 DoFs within 11 hours on a single 32-core SMP node. This represents a sustained performance of 235 GFlops and 40.74% of the SMP node peak performance for the DOMINO sweep implementation. The very high Flops/Watt ratio of DOMINO makes it a very interesting building block for a future many-nodes nuclear simulation tool.

High-performance computational fluid dynamics: a custom-code approach

NASA Astrophysics Data System (ADS)

Fannon, James; Loiseau, Jean-Christophe; Valluri, Prashant; Bethune, Iain; Náraigh, Lennon Ó.

2016-07-01

We introduce a modified and simplified version of the pre-existing fully parallelized three-dimensional Navier-Stokes flow solver known as TPLS. We demonstrate how the simplified version can be used as a pedagogical tool for the study of computational fluid dynamics (CFDs) and parallel computing. TPLS is at its heart a two-phase flow solver, and uses calls to a range of external libraries to accelerate its performance. However, in the present context we narrow the focus of the study to basic hydrodynamics and parallel computing techniques, and the code is therefore simplified and modified to simulate pressure-driven single-phase flow in a channel, using only relatively simple Fortran 90 code with MPI parallelization, but no calls to any other external libraries. The modified code is analysed in order to both validate its accuracy and investigate its scalability up to 1000 CPU cores. Simulations are performed for several benchmark cases in pressure-driven channel flow, including a turbulent simulation, wherein the turbulence is incorporated via the large-eddy simulation technique. The work may be of use to advanced undergraduate and graduate students as an introductory study in CFDs, while also providing insight for those interested in more general aspects of high-performance computing.
Recent Progress on the Parallel Implementation of Moving-Body Overset Grid Schemes

NASA Technical Reports Server (NTRS)

Wissink, Andrew; Allen, Edwin (Technical Monitor)

1998-01-01

Viscous calculations about geometrically complex bodies in which there is relative motion between component parts is one of the most computationally demanding problems facing CFD researchers today. This presentation documents results from the first two years of a CHSSI-funded effort within the U.S. Army AFDD to develop scalable dynamic overset grid methods for unsteady viscous calculations with moving-body problems. The first part of the presentation will focus on results from OVERFLOW-D1, a parallelized moving-body overset grid scheme that employs traditional Chimera methodology. The two processes that dominate the cost of such problems are the flow solution on each component and the intergrid connectivity solution. Parallel implementations of the OVERFLOW flow solver and DCF3D connectivity software are coupled with a proposed two-part static-dynamic load balancing scheme and tested on the IBM SP and Cray T3E multi-processors. The second part of the presentation will cover some recent results from OVERFLOW-D2, a new flow solver that employs Cartesian grids with various levels of refinement, facilitating solution adaption. A study of the parallel performance of the scheme on large distributed-memory multiprocessor computer architectures will be reported.
A fast mass spring model solver for high-resolution elastic objects

NASA Astrophysics Data System (ADS)

Zheng, Mianlun; Yuan, Zhiyong; Zhu, Weixu; Zhang, Guian

2017-03-01

Real-time simulation of elastic objects is of great importance for computer graphics and virtual reality applications. The fast mass spring model solver can achieve visually realistic simulation in an efficient way. Unfortunately, this method suffers from resolution limitations and lack of mechanical realism for a surface geometry model, which greatly restricts its application. To tackle these problems, in this paper we propose a fast mass spring model solver for high-resolution elastic objects. First, we project the complex surface geometry model into a set of uniform grid cells as cages through *cages mean value coordinate method to reflect its internal structure and mechanics properties. Then, we replace the original Cholesky decomposition method in the fast mass spring model solver with a conjugate gradient method, which can make the fast mass spring model solver more efficient for detailed surface geometry models. Finally, we propose a graphics processing unit accelerated parallel algorithm for the conjugate gradient method. Experimental results show that our method can realize efficient deformation simulation of 3D elastic objects with visual reality and physical fidelity, which has a great potential for applications in computer animation.
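The replacement of a Cholesky factorization by a conjugate gradient (CG) iteration described in this abstract can be illustrated with a minimal CPU sketch. This is not the authors' GPU algorithm: the function names, the 1D chain of masses, and the values of mass, stiffness and time step are all assumptions chosen only to produce a small symmetric positive-definite system of the form M + dt^2 K.

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive-definite matrix A."""
    x = np.zeros_like(b, dtype=float)
    r = b - A @ x            # initial residual
    p = r.copy()             # initial search direction
    rs_old = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs_old) < tol:
            break
        Ap = A @ p
        alpha = rs_old / (p @ Ap)        # step length along p
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs_old) * p    # next A-conjugate direction
        rs_old = rs_new
    return x

def chain_system_matrix(n, mass=1.0, stiffness=100.0, dt=0.01):
    """Hypothetical test matrix M + dt^2 * K for a 1D chain of n masses."""
    K = np.zeros((n, n))
    for i in range(n - 1):               # one spring between masses i and i+1
        K[i, i] += stiffness
        K[i + 1, i + 1] += stiffness
        K[i, i + 1] -= stiffness
        K[i + 1, i] -= stiffness
    return mass * np.eye(n) + dt**2 * K

A = chain_system_matrix(50)
b = np.arange(50, dtype=float)
x = conjugate_gradient(A, b)
```

Each CG iteration needs only matrix-vector products and dot products, which is one reason the method maps more naturally onto parallel hardware than a triangular factorization.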
Investigating the Transonic Flutter Boundary of the Benchmark Supercritical Wing

NASA Technical Reports Server (NTRS)

Heeg, Jennifer; Chwalowski, Pawel

2017-01-01

This paper builds on the computational aeroelastic results published previously and generated in support of the second Aeroelastic Prediction Workshop for the NASA Benchmark Supercritical Wing configuration. The computational results are obtained using FUN3D, an unstructured grid Reynolds-Averaged Navier-Stokes solver developed at the NASA Langley Research Center. The analysis results focus on understanding the dip in the transonic flutter boundary at a single Mach number (0.74), exploring an angle of attack range of -1 to 8 degrees and dynamic pressures from wind off to beyond flutter onset. The rigid analysis results are examined for insights into the behavior of the aeroelastic system. Both static and dynamic aeroelastic simulation results are also examined.
Development of Unsteady Aerodynamic and Aeroelastic Reduced-Order Models Using the FUN3D Code

NASA Technical Reports Server (NTRS)

Silva, Walter A.; Vatsa, Veer N.; Biedron, Robert T.

2009-01-01

Recent significant improvements to the development of CFD-based unsteady aerodynamic reduced-order models (ROMs) are implemented into the FUN3D unstructured flow solver. These improvements include the simultaneous excitation of the structural modes of the CFD-based unsteady aerodynamic system via a single CFD solution, minimization of the error between the full CFD and the ROM unsteady aerodynamic solution, and computation of a root locus plot of the aeroelastic ROM. Results are presented for a viscous version of the two-dimensional Benchmark Active Controls Technology (BACT) model and an inviscid version of the AGARD 445.6 aeroelastic wing using the FUN3D code.
FUN3D Analyses in Support of the Second Aeroelastic Prediction Workshop

NASA Technical Reports Server (NTRS)

Chwalowski, Pawel; Heeg, Jennifer

2016-01-01

This paper presents the computational aeroelastic results generated in support of the second Aeroelastic Prediction Workshop for the Benchmark Supercritical Wing (BSCW) configurations and compares them to the experimental data. The computational results are obtained using FUN3D, an unstructured grid Reynolds-Averaged Navier-Stokes solver developed at NASA Langley Research Center. The analysis results include aerodynamic coefficients and surface pressures obtained for steady-state, static aeroelastic equilibrium, and unsteady flow due to a pitching wing or flutter prediction. Frequency response functions of the pressure coefficients with respect to the angular displacement are computed and compared with the experimental data. The effects of spatial and temporal convergence on the computational results are examined.
Flow Simulation of N2B Hybrid Wing Body Configuration

NASA Technical Reports Server (NTRS)

Kim, Hyoungjin; Liou, Meng-Sing

2012-01-01

The N2B hybrid wing body aircraft was conceptually designed to meet environmental and performance goals for the N+2 generation transport set by the subsonic fixed wing project. In this study, flow fields around the N2B configuration are simulated using a Reynolds-averaged Navier-Stokes flow solver using unstructured meshes. Boundary conditions at engine fan face and nozzle exhaust planes are provided by response surfaces of the NPSS thermodynamic engine cycle model. The present flow simulations reveal challenging design issues arising from boundary layer ingestion offset inlet and nacelle-airframe interference. The N2B configuration can be a good test bed for application of multidisciplinary design optimization technology.
Numerical prediction of the interference drag of a streamlined strut intersecting a surface in transonic flow

NASA Astrophysics Data System (ADS)

Tetrault, Philippe-Andre

2000-10-01

In transonic flow, the aerodynamic interference that occurs on a strut-braced wing airplane, pylons, and other applications is significant. The purpose of this work is to provide relationships to estimate the interference drag of wing-strut, wing-pylon, and wing-body arrangements. Those equations are obtained by fitting a curve to the results obtained from numerous Computational Fluid Dynamics (CFD) calculations using state-of-the-art codes that employ the Spalart-Allmaras turbulence model. In order to estimate the effect of the strut thickness, the Reynolds number of the flow, and the angle made by the strut with an adjacent surface, inviscid and viscous calculations are performed on a symmetrical strut at an angle between parallel walls. The computations are conducted at a Mach number of 0.85 and Reynolds numbers of 5.3 and 10.6 million based on the strut chord. The interference drag is calculated as the drag increment of the arrangement compared to an equivalent two-dimensional strut of the same cross-section. The results show a rapid increase of the interference drag as the angle of the strut deviates from a position perpendicular to the wall. Separation regions appear for low intersection angles, but the viscosity generally provides a positive effect in alleviating the strength of the shock near the junction and thus the drag penalty. When the thickness-to-chord ratio of the strut is reduced, the flowfield is disturbed only locally at the intersection of the strut with the wall. This study provides an equation to estimate the interference drag of simple intersections in transonic flow. In the course of performing the calculations associated with this work, an unstructured flow solver was utilized. Accurate drag prediction requires a very fine grid and this leads to problems associated with the grid generator. 
Several challenges facing the unstructured grid methodology are discussed: slivers, grid refinement near the leading edge and at the trailing edge, grid convergence studies, volume grid generation, and other practical matters concerning such calculations.
The upwind control volume scheme for unstructured triangular grids

NASA Technical Reports Server (NTRS)

Giles, Michael; Anderson, W. Kyle; Roberts, Thomas W.

1989-01-01

A new algorithm for the numerical solution of the Euler equations is presented. This algorithm is particularly suited to the use of unstructured triangular meshes, allowing geometric flexibility. Solutions are second-order accurate in the steady state. Implementation of the algorithm requires minimal grid connectivity information, resulting in modest storage requirements, and should enhance the implementation of the scheme on massively parallel computers. A novel form of upwind differencing is developed, and is shown to yield sharp resolution of shocks. Two new artificial viscosity models are introduced that enhance the performance of the new scheme. Numerical results for transonic airfoil flows are presented, which demonstrate the performance of the algorithm.
Final Report, DE-FG01-06ER25718 Domain Decomposition and Parallel Computing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Widlund, Olof B.

2015-06-09

The goal of this project is to develop and improve domain decomposition algorithms for a variety of partial differential equations such as those of linear elasticity and electro-magnetics. These iterative methods are designed for massively parallel computing systems and allow the fast solution of the very large systems of algebraic equations that arise in large scale and complicated simulations. A special emphasis is placed on problems arising from Maxwell's equation. The approximate solvers, the preconditioners, are combined with the conjugate gradient method and must always include a solver of a coarse model in order to have a performance which is independent of the number of processors used in the computer simulation. A recent development allows for an adaptive construction of this coarse component of the preconditioner.
A Parallel Fast Sweeping Method for the Eikonal Equation

NASA Astrophysics Data System (ADS)

Baker, B.

2017-12-01

Recently, there has been an exciting emergence of probabilistic methods for travel time tomography. Unlike gradient-based optimization strategies, probabilistic tomographic methods are resistant to becoming trapped in a local minimum and provide a much better quantification of parameter resolution than, say, appealing to ray density or performing checkerboard reconstruction tests. The benefits associated with random sampling methods however are only realized by successive computation of predicted travel times in, potentially, strongly heterogeneous media. To this end this abstract is concerned with expediting the solution of the Eikonal equation. While many Eikonal solvers use a fast marching method, the proposed solver will use the iterative fast sweeping method because the eight fixed sweep orderings in each iteration are natural targets for parallelization. To reduce the number of iterations and grid points required the high-accuracy finite difference stencil of Nobel et al., 2014 is implemented. A directed acyclic graph (DAG) is created with a priori knowledge of the sweep ordering and finite difference stencil. By performing a topological sort of the DAG, sets of independent nodes are identified as candidates for concurrent updating. Additionally, the proposed solver will also address scalability during earthquake relocation, a necessary step in local and regional earthquake tomography and a barrier to extending probabilistic methods from active source to passive source applications, by introducing an asynchronous parallel forward solve phase for all receivers in the network. Synthetic examples using the SEG over-thrust model will be presented.
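The fixed sweep orderings mentioned in this abstract can be seen in a minimal serial sketch of the fast sweeping method. The sketch is a simplification under stated assumptions: it works in 2D, where there are four orderings instead of the eight of the 3D case, and it uses the standard first-order Godunov update, not the high-accuracy stencil cited in the abstract. The function name and parameters are hypothetical.

```python
import numpy as np

def fast_sweep_2d(slowness, h, source, n_iter=4):
    """Solve |grad T| = slowness on a uniform 2D grid with spacing h."""
    ny, nx = slowness.shape
    T = np.full((ny, nx), np.inf)
    T[source] = 0.0
    # The four alternating sweep orderings of the 2D method.
    orders = [
        (range(ny), range(nx)),
        (range(ny), range(nx - 1, -1, -1)),
        (range(ny - 1, -1, -1), range(nx)),
        (range(ny - 1, -1, -1), range(nx - 1, -1, -1)),
    ]
    for _ in range(n_iter):
        for rows, cols in orders:
            for i in rows:
                for j in cols:
                    if (i, j) == source:
                        continue
                    # Smallest neighbour travel time along each axis.
                    a = min(T[i - 1, j] if i > 0 else np.inf,
                            T[i + 1, j] if i < ny - 1 else np.inf)
                    b = min(T[i, j - 1] if j > 0 else np.inf,
                            T[i, j + 1] if j < nx - 1 else np.inf)
                    if np.isinf(a) and np.isinf(b):
                        continue  # no upwind information has arrived yet
                    f = slowness[i, j] * h
                    if abs(a - b) >= f:
                        t_new = min(a, b) + f  # one-sided update
                    else:
                        # Two-sided update: root of (t-a)^2 + (t-b)^2 = f^2.
                        t_new = 0.5 * (a + b + np.sqrt(2.0 * f * f - (a - b) ** 2))
                    T[i, j] = min(T[i, j], t_new)
    return T

T = fast_sweep_2d(np.ones((21, 21)), h=1.0, source=(0, 0))
```

Within a single sweep, each node depends only on neighbours that precede it in the ordering, so nodes lying on the same anti-diagonal do not depend on each other. That dependency structure is what a topological sort of the update graph exposes for concurrent updating.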
Aerothermodynamic Analyses of Towed Ballutes

NASA Technical Reports Server (NTRS)

Gnoffo, Peter A.; Buck, Greg; Moss, James N.; Nielsen, Eric; Berger, Karen; Jones, William T.; Rudavsky, Rena

2006-01-01

A ballute (balloon-parachute) is an inflatable, aerodynamic drag device for application to planetary entry vehicles. Two challenging aspects of aerothermal simulation of towed ballutes are considered. The first challenge, simulation of a complete system including inflatable tethers and a trailing toroidal ballute, is addressed using the unstructured-grid, Navier-Stokes solver FUN3D. Auxiliary simulations of a semi-infinite cylinder using the rarefied flow, Direct Simulation Monte Carlo solver, DSV2, provide additional insight into limiting behavior of the aerothermal environment around tethers directly exposed to the free stream. Simulations reveal pressures higher than stagnation and corresponding large heating rates on the tether as it emerges from the spacecraft base flow and passes through the spacecraft bow shock. The footprint of the tether shock on the toroidal ballute is also subject to heating amplification. Design options to accommodate or reduce these environments are discussed. The second challenge addresses time-accurate simulation to detect the onset of unsteady flow interactions as a function of geometry and Reynolds number. Video of unsteady interactions measured in the Langley Aerothermodynamic Laboratory 20-Inch Mach 6 Air Tunnel and CFD simulations using the structured grid, Navier-Stokes solver LAURA are compared for flow over a rigid spacecraft-sting-toroid system. The experimental data provides qualitative information on the amplitude and onset of unsteady motion which is captured in the numerical simulations. The presence of severe unsteady fluid-structure interactions is undesirable and numerical simulation must be able to predict the onset of such motion.
Extending fullwave core ICRF simulation to SOL and antenna regions using FEM solver

NASA Astrophysics Data System (ADS)

Shiraiwa, S.; Wright, J. C.

2016-10-01

A full wave simulation approach to solve a driven RF waves problem including hot core, SOL plasmas and possibly antenna is presented. This approach allows for exploiting advantages of two different ways of representing the wave field, namely treating spatially dispersive hot conductivity in a spectral solver and handling complicated geometry in the SOL/antenna region using an unstructured mesh. Here, we compute a mode set in each region with the RF electric field excitation on the connecting boundary between core and edge regions. A mode corresponding to antenna excitation is also computed. By requiring the continuity of tangential RF electric and magnetic fields, the solution is obtained as a unique superposition of these modes. In this work, the TORIC core spectral solver is modified to allow for mode excitation, and the edge region of diverted Alcator C-Mod plasma is modeled using the COMSOL FEM package. The reconstructed RF field is similar in the core region to the TORIC stand-alone simulation. However, it contains higher poloidal modes near the edge and captures a wave bounced and propagating in the poloidal direction near the vacuum-plasma boundary. These features could play an important role when the single power pass absorption is modest. This new capability will enable antenna coupling calculations with a realistic load plasma, including collisional damping in realistic SOL plasma and other loss mechanisms such as RF sheath rectification. USDoE Awards DE-FC02-99ER54512, DE-FC02-01ER54648.
Adaptive unstructured triangular mesh generation and flow solvers for the Navier-Stokes equations at high Reynolds number

NASA Technical Reports Server (NTRS)

Ashford, Gregory A.; Powell, Kenneth G.

1995-01-01

A method for generating high quality unstructured triangular grids for high Reynolds number Navier-Stokes calculations about complex geometries is described. Careful attention is paid in the mesh generation process to resolving efficiently the disparate length scales which arise in these flows. First the surface mesh is constructed in a way which ensures that the geometry is faithfully represented. The volume mesh generation then proceeds in two phases thus allowing the viscous and inviscid regions of the flow to be meshed optimally. A solution-adaptive remeshing procedure which allows the mesh to adapt itself to flow features is also described. The procedure for tracking wakes and refinement criteria appropriate for shock detection are described. Although at present it has only been implemented in two dimensions, the grid generation process has been designed with the extension to three dimensions in mind. An implicit, higher-order, upwind method is also presented for computing compressible turbulent flows on these meshes. Two recently developed one-equation turbulence models have been implemented to simulate the effects of the fluid turbulence. Results for flow about a RAE 2822 airfoil and a Douglas three-element airfoil are presented which clearly show the improved resolution obtainable.
New multigrid approach for three-dimensional unstructured, adaptive grids

NASA Technical Reports Server (NTRS)

Parthasarathy, Vijayan; Kallinderis, Y.

1994-01-01

A new multigrid method with adaptive unstructured grids is presented. The three-dimensional Euler equations are solved on tetrahedral grids that are adaptively refined or coarsened locally. The multigrid method is employed to propagate the fine grid corrections more rapidly by redistributing the changes-in-time of the solution from the fine grid to the coarser grids to accelerate convergence. A new approach is employed that uses the parent cells of the fine grid cells in an adapted mesh to generate successively coarser levels of multigrid. This obviates the need for the generation of a sequence of independent, nonoverlapping grids as well as the relatively complicated operations that need to be performed to interpolate the solution and the residuals between the independent grids. The solver is an explicit, vertex-based, finite volume scheme that employs edge-based data structures and operations. Spatial discretization is of central-differencing type combined with special upwind-like smoothing operators. Application cases include adaptive solutions obtained with multigrid acceleration for supersonic and subsonic flow over a bump in a channel, as well as transonic flow around the ONERA M6 wing. Two levels of multigrid resulted in reduction in the number of iterations by a factor of 5.
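The parent-cell idea in this abstract can be illustrated with a minimal Python sketch. It assumes a hypothetical `parent_of` map from each refined cell to the cell it was split from, and it transfers cell-wise values by volume weighting. The paper's solver is vertex-based, so the function names, the data layout, and the weighting are illustrative assumptions, not the authors' operators.

```python
from collections import defaultdict


def coarsen_level(cells, parent_of):
    """Build the next-coarser level by replacing each cell with its parent.

    Cells with no entry in parent_of (never refined) are kept unchanged.
    """
    coarse, seen = [], set()
    for cell in cells:
        parent = parent_of.get(cell, cell)
        if parent not in seen:
            seen.add(parent)
            coarse.append(parent)
    return coarse


def restrict(values, volumes, parent_of):
    """Volume-weighted average of fine-cell values onto their parent cells."""
    weighted_sum = defaultdict(float)
    total_volume = defaultdict(float)
    for cell, value in values.items():
        parent = parent_of.get(cell, cell)
        weighted_sum[parent] += volumes[cell] * value
        total_volume[parent] += volumes[cell]
    return {p: weighted_sum[p] / total_volume[p] for p in weighted_sum}


# Hypothetical adapted mesh: cell "A" was split into "a1" and "a2"; "b" was not.
parent_of = {"a1": "A", "a2": "A"}
fine_cells = ["a1", "a2", "b"]
coarse_cells = coarse = coarsen_level(fine_cells, parent_of)
coarse_values = restrict(
    {"a1": 1.0, "a2": 3.0, "b": 5.0},
    {"a1": 1.0, "a2": 3.0, "b": 2.0},
    parent_of,
)
```

Because the coarse cells are the parents already stored by the adaptation step, no independent coarse grid has to be generated or searched, which is the saving the abstract points to.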
Fuego/Scefire MPMD Coupling L2 Milestone Executive Summary

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pierce, Flint; Tencer, John; Pautz, Shawn D.

2017-09-01

This milestone campaign was focused on coupling Sandia physics codes SIERRA low Mach module Fuego and RAMSES Boltzmann transport code Sceptre (Scefire). Fuego enables simulation of low Mach, turbulent, reacting, particle laden flows on unstructured meshes using CVFEM for abnormal thermal environments throughout SNL and the larger national security community. Sceptre provides simulation for photon, neutron, and charged particle transport on unstructured meshes using Discontinuous Galerkin for radiation effects calculations at SNL and elsewhere. Coupling these "best of breed" codes enables efficient modeling of thermal/fluid environments with radiation transport, including fires (pool, propellant, composite) as well as those with directed radiant fluxes. We seek to improve the experience of Fuego users who require radiation transport capabilities in two ways. The first is performance. We achieve this through leveraging additional computational resources for Scefire, reducing calculation times while leaving unaffected resources for fluid physics. This approach is new to Fuego, which previously utilized the same resources for both fluid and radiation solutions. The second improvement enables new radiation capabilities, including spectral (banded) radiation, beam boundary sources, and alternate radiation solvers (i.e. Pn). This summary provides an overview of these achievements.
Aeroacoustic Simulations of a Nose Landing Gear Using FUN3D on Pointwise Unstructured Grids

NASA Technical Reports Server (NTRS)

Vatsa, Veer N.; Khorrami, Mehdi R.; Rhoads, John; Lockard, David P.

2015-01-01

Numerical simulations have been performed for a partially-dressed, cavity-closed (PDCC) nose landing gear configuration that was tested in the University of Florida's open-jet acoustic facility known as the UFAFF. The unstructured-grid flow solver FUN3D is used to compute the unsteady flow field for this configuration. Mixed-element grids generated using the Pointwise™ grid generation software are used for these simulations. Particular care is taken to ensure quality cells and proper resolution in critical areas of interest in an effort to minimize errors introduced by numerical artifacts. A hybrid Reynolds-averaged Navier-Stokes/large eddy simulation (RANS/LES) turbulence model is used for these simulations. Solutions are also presented for a wall function model coupled to the standard turbulence model. Time-averaged and instantaneous solutions obtained on these Pointwise grids are compared with the measured data and previous numerical solutions. The resulting CFD solutions are used as input to a Ffowcs Williams-Hawkings noise propagation code to compute the farfield noise levels in the flyover and sideline directions. The computed noise levels compare well with previous CFD solutions and experimental data.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Chrisochoides, N.; Sukup, F.

In this paper we present a parallel implementation of the Bowyer-Watson (BW) algorithm using the task-parallel programming model. The BW algorithm constitutes an ideal mesh refinement strategy for implementing a large class of unstructured mesh generation techniques on both sequential and parallel computers, by preventing the need for global mesh refinement. Its implementation on distributed memory multicomputers using the traditional data-parallel model has proven very inefficient due to the excessive synchronization needed among processors. In this paper we demonstrate that with the task-parallel model we can tolerate synchronization costs inherent to data-parallel methods by exploring concurrency at the processor level. Our preliminary performance data indicate that the task-parallel approach: (i) is almost four times faster than the existing data-parallel methods, (ii) scales linearly, and (iii) introduces minimum overheads compared to the "best" sequential implementation of the BW algorithm.
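For readers unfamiliar with the Bowyer-Watson kernel that this abstract parallelizes, the following is a minimal sequential 2D sketch in Python. It covers only the local cavity-and-retriangulate step; the task-parallel scheduling from the paper is not shown. The function names, the set-of-index-triples mesh representation, and the size of the enclosing super-triangle are all assumptions made for illustration.

```python
def in_circumcircle(a, b, c, p):
    """True if p lies strictly inside the circumcircle of triangle (a, b, c)."""
    ax, ay = a[0] - p[0], a[1] - p[1]
    bx, by = b[0] - p[0], b[1] - p[1]
    cx, cy = c[0] - p[0], c[1] - p[1]
    det = ((ax * ax + ay * ay) * (bx * cy - by * cx)
           - (bx * bx + by * by) * (ax * cy - ay * cx)
           + (cx * cx + cy * cy) * (ax * by - ay * bx))
    # Multiply by the orientation so the test works for either vertex order.
    orient = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return det * orient > 0


def insert_point(points, triangles, p):
    """Insert p into a Delaunay triangulation stored as a set of index triples."""
    points.append(p)
    ip = len(points) - 1
    # Cavity: every triangle whose circumcircle contains the new point.
    bad = [t for t in triangles
           if in_circumcircle(points[t[0]], points[t[1]], points[t[2]], p)]
    # Cavity boundary: edges that belong to exactly one cavity triangle.
    edge_count = {}
    for t in bad:
        for e in ((t[0], t[1]), (t[1], t[2]), (t[2], t[0])):
            key = tuple(sorted(e))
            edge_count[key] = edge_count.get(key, 0) + 1
    for t in bad:
        triangles.remove(t)
    # Retriangulate the cavity by joining each boundary edge to the new point.
    for (i, j), count in edge_count.items():
        if count == 1:
            triangles.add(tuple(sorted((i, j, ip))))
    return ip


def triangulate(sites):
    """Delaunay-triangulate sites by repeated insertion into a super-triangle."""
    xs = [x for x, _ in sites]
    ys = [y for _, y in sites]
    span = max(max(xs) - min(xs), max(ys) - min(ys)) or 1.0
    cx, cy = (max(xs) + min(xs)) / 2, (max(ys) + min(ys)) / 2
    points = [(cx - 20 * span, cy - span),
              (cx + 20 * span, cy - span),
              (cx, cy + 20 * span)]
    triangles = {(0, 1, 2)}
    for s in sites:
        insert_point(points, triangles, s)
    # Discard triangles that still touch a super-triangle vertex (indices 0-2).
    return points, {t for t in triangles if min(t) > 2}
```

Each insertion touches only the triangles in its cavity, which is why the abstract describes the algorithm as avoiding global mesh refinement; two insertions with disjoint cavities are independent, and that independence is what a parallel version has to detect or enforce.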
An accurate, fast, and scalable solver for high-frequency wave propagation

NASA Astrophysics Data System (ADS)

Zepeda-Núñez, L.; Taus, M.; Hewett, R.; Demanet, L.

2017-12-01

In many science and engineering applications, solving time-harmonic high-frequency wave propagation problems quickly and accurately is of paramount importance. For example, in geophysics, particularly in oil exploration, such problems can be the forward problem in an iterative process for solving the inverse problem of subsurface inversion. It is important to solve these wave propagation problems accurately in order to efficiently obtain meaningful solutions of the inverse problems: low order forward modeling can hinder convergence. Additionally, due to the volume of data and the iterative nature of most optimization algorithms, the forward problem must be solved many times. Therefore, a fast solver is necessary to make solving the inverse problem feasible. For time-harmonic high-frequency wave propagation, obtaining both speed and accuracy is historically challenging. Recently, there have been many advances in the development of fast solvers for such problems, including methods which have linear complexity with respect to the number of degrees of freedom. While most methods scale optimally only in the context of low-order discretizations and smooth wave speed distributions, the method of polarized traces has been shown to retain optimal scaling for high-order discretizations, such as hybridizable discontinuous Galerkin methods and for highly heterogeneous (and even discontinuous) wave speeds. The resulting fast and accurate solver is consequently highly attractive for geophysical applications. To date, this method relies on a layered domain decomposition together with a preconditioner applied in a sweeping fashion, which has limited straight-forward parallelization. In this work, we introduce a new version of the method of polarized traces which reveals more parallel structure than previous versions while preserving all of its other advantages. 
We achieve this by further decomposing each layer and applying the preconditioner to these new components separately and in parallel. We demonstrate that this produces an even more effective and parallelizable preconditioner for a single right-hand side. As before, additional speed can be gained by pipelining several right-hand sides.
Automated Euler and Navier-Stokes Database Generation for a Glide-Back Booster

NASA Technical Reports Server (NTRS)

Chaderjian, Neal M.; Rogers, Stuart E.; Aftosmis, Mike J.; Pandya, Shishir A.; Ahmad, Jasim U.; Tejnil, Edward

2004-01-01

The past two decades have seen a sustained increase in the use of high fidelity Computational Fluid Dynamics (CFD) in basic research, aircraft design, and the analysis of post-design issues. As the fidelity of a CFD method increases, the number of cases that can be readily and affordably computed greatly diminishes. However, computer speeds now exceed 2 GHz, hundreds of processors are currently available and more affordable, and advances in parallel CFD algorithms scale more readily with large numbers of processors. All of these factors make it feasible to compute thousands of high fidelity cases. However, there still remains the overwhelming task of monitoring the solution process. This paper presents an approach to automate the CFD solution process. A new software tool, AeroDB, is used to compute thousands of Euler and Navier-Stokes solutions for a 2nd generation glide-back booster in one week. The solution process exploits a common job-submission grid environment, the NASA Information Power Grid (IPG), using 13 computers located at 4 different geographical sites. Process automation and web-based access to a MySQL database greatly reduce the user workload, removing much of the tedium and tendency for user input errors. The AeroDB framework is shown. The user submits/deletes jobs, monitors AeroDB's progress, and retrieves data and plots via a web portal. Once a job is in the database, a job launcher uses an IPG resource broker to decide which computers are best suited to run the job. Job/code requirements, the number of CPUs free on a remote system, and queue lengths are some of the parameters the broker takes into account. The Globus software provides secure services for user authentication, remote shell execution, and secure file transfers over an open network. AeroDB automatically decides when a job is completed.
Currently, the Cart3D unstructured flow solver is used for the Euler equations, and the Overflow structured overset flow solver is used for the Navier-Stokes equations. Other codes can be readily included into the AeroDB framework.
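The broker decision described in this abstract (job/code requirements, free CPUs, queue lengths) can be illustrated with a small Python sketch. Everything here is hypothetical: the `Host` record, the `pick_host` function, and the ranking rule (shortest queue first, then most free CPUs) are assumptions for illustration and do not describe the actual IPG resource broker or the Globus API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    """Hypothetical snapshot of one remote computer as seen by a broker."""
    name: str
    free_cpus: int
    queue_length: int
    codes: frozenset  # flow solvers installed on this host


def pick_host(hosts, code, cpus_needed):
    """Return the eligible host with the shortest queue, or None if none fit.

    A host is eligible if it has the requested code installed and enough
    free CPUs. Ties on queue length are broken in favor of more free CPUs.
    """
    eligible = [h for h in hosts
                if code in h.codes and h.free_cpus >= cpus_needed]
    if not eligible:
        return None
    return min(eligible, key=lambda h: (h.queue_length, -h.free_cpus))


# Made-up hosts for illustration only.
hosts = [
    Host("site-a", free_cpus=8, queue_length=5, codes=frozenset({"cart3d"})),
    Host("site-b", free_cpus=16, queue_length=2,
         codes=frozenset({"cart3d", "overflow"})),
    Host("site-c", free_cpus=64, queue_length=0, codes=frozenset({"overflow"})),
]
chosen = pick_host(hosts, code="cart3d", cpus_needed=8)
```

A real broker would presumably weigh more parameters than these three; the abstract only says that these are some of the parameters it takes into account.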
