linear system solver: Topics by Science.gov

Sample records for linear system solver

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gearhart, Jared Lee; Adair, Kristin Lynn; Durfee, Justin David.

When developing linear programming models, issues such as budget limitations, customer requirements, or licensing may preclude the use of commercial linear programming solvers. In such cases, one option is to use an open-source linear programming solver. A survey of linear programming tools was conducted to identify potential open-source solvers. From this survey, four open-source solvers were tested using a collection of linear programming test problems and the results were compared to IBM ILOG CPLEX Optimizer (CPLEX) [1], an industry standard. The solvers considered were: COIN-OR Linear Programming (CLP) [2], [3], GNU Linear Programming Kit (GLPK) [4], lp_solve [5] and Modular In-core Nonlinear Optimization System (MINOS) [6]. As no open-source solver outperforms CPLEX, this study demonstrates the power of commercial linear programming software. CLP was found to be the top performing open-source solver considered in terms of capability and speed. GLPK also performed well but cannot match the speed of CLP or CPLEX. lp_solve and MINOS were considerably slower and encountered issues when solving several test problems.
Amesos2 and Belos: Direct and Iterative Solvers for Large Sparse Linear Systems

DOE PAGES

Bavier, Eric; Hoemmen, Mark; Rajamanickam, Sivasankaran; ...

2012-01-01

Solvers for large sparse linear systems come in two categories: direct and iterative. Amesos2, a package in the Trilinos software project, provides direct methods, and Belos, another Trilinos package, provides iterative methods. Amesos2 offers a common interface to many different sparse matrix factorization codes, and can handle any implementation of sparse matrices and vectors, via an easy-to-extend C++ traits interface. It can also factor matrices whose entries have arbitrary “Scalar” type, enabling extended-precision and mixed-precision algorithms. Belos includes many different iterative methods for solving large sparse linear systems and least-squares problems. Unlike competing iterative solver libraries, Belos completely decouples the algorithms from the implementations of the underlying linear algebra objects. This lets Belos exploit the latest hardware without changes to the code. Belos favors algorithms that solve higher-level problems, such as multiple simultaneous linear systems and sequences of related linear systems, faster than standard algorithms. The package also supports extended-precision and mixed-precision algorithms. Together, Amesos2 and Belos form a complete suite of sparse linear solvers.
LDRD final report on massively-parallel linear programming : the parPCx system.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Parekh, Ojas; Phillips, Cynthia Ann; Boman, Erik Gunnar

2005-02-01

This report summarizes the research and development performed from October 2002 to September 2004 at Sandia National Laboratories under the Laboratory-Directed Research and Development (LDRD) project "Massively-Parallel Linear Programming". We developed a linear programming (LP) solver designed to use a large number of processors. LP is the optimization of a linear objective function subject to linear constraints. Companies and universities have expended huge efforts over decades to produce fast, stable serial LP solvers. Previous parallel codes run on shared-memory systems and have little or no distribution of the constraint matrix. We have seen no reports of general LP solver runs on large numbers of processors. Our parallel LP code is based on an efficient serial implementation of Mehrotra's interior-point predictor-corrector algorithm (PCx). The computational core of this algorithm is the assembly and solution of a sparse linear system. We have substantially rewritten the PCx code and based it on Trilinos, the parallel linear algebra library developed at Sandia. Our interior-point method can use either direct or iterative solvers for the linear system. To achieve a good parallel data distribution of the constraint matrix, we use a (pre-release) version of a hypergraph partitioner from the Zoltan partitioning library. We describe the design and implementation of our new LP solver called parPCx and give preliminary computational results. We summarize a number of issues related to efficient parallel solution of LPs with interior-point methods including data distribution, numerical stability, and solving the core linear system using both direct and iterative methods. We describe a number of applications of LP specific to US Department of Energy mission areas and we summarize our efforts to integrate parPCx (and parallel LP solvers in general) into Sandia's massively-parallel integer programming solver PICO (Parallel Integer and Combinatorial Optimizer). We conclude with directions for long-term future algorithmic research and for near-term development that could improve the performance of parPCx.
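For orientation, the problem class named in this abstract ("optimization of a linear objective function subject to linear constraints") is usually written in the standard form below. The symbols $c$, $A$, $b$, and $x$ are notation assumed here for illustration; they do not come from the report.

```latex
\[
\begin{aligned}
\min_{x \in \mathbb{R}^n} \quad & c^{\mathsf T} x \\
\text{subject to} \quad & A x = b, \\
                        & x \ge 0 .
\end{aligned}
\]
```

In interior-point methods of this kind, each iteration typically reduces to a sparse symmetric system, for example normal equations of the form $A D A^{\mathsf T} \Delta y = r$ with a positive diagonal matrix $D$. The abstract does not say which form parPCx assembles, so this should be read as a common textbook formulation rather than a description of the code.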
A Flexible CUDA LU-based Solver for Small, Batched Linear Systems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tumeo, Antonino; Gawande, Nitin A.; Villa, Oreste

This chapter presents the implementation of a batched CUDA solver based on LU factorization for small linear systems. This solver may be used in applications such as reactive flow transport models, which apply the Newton-Raphson technique to linearize and iteratively solve the sets of nonlinear equations that represent the reactions for tens of thousands to millions of physical locations. The implementation exploits somewhat counterintuitive GPGPU programming techniques: it assigns the solution of a matrix (representing a system) to a single CUDA thread, does not exploit shared memory and employs dynamic memory allocation on the GPUs. These techniques enable our implementation to simultaneously solve sets of systems with over 100 equations and to employ LU decomposition with complete pivoting, providing the higher numerical accuracy required by certain applications. Other currently available solutions for batched linear solvers are limited by size and only support partial pivoting, although they may be faster under certain conditions. We discuss the code of our implementation and present a comparison with the other implementations, discussing the various tradeoffs in terms of performance and flexibility. This work will enable developers who need batched linear solvers to choose whichever implementation is more appropriate to the features and the requirements of their applications, and even to implement dynamic switching approaches that can choose the best implementation depending on the input data.
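The batching idea described in this abstract can be illustrated with a minimal CPU-only sketch in Python. It is not the authors' CUDA code: the function names `solve_complete_pivoting` and `solve_batch` are hypothetical, the elimination is applied to an augmented matrix rather than stored as explicit L and U factors, and the sequential loop over systems merely stands in for the "one system per thread" assignment.

```python
def solve_complete_pivoting(A, b):
    """Solve A x = b by Gaussian elimination with complete pivoting.

    A is a list of n rows (lists of n numbers), b a list of n numbers.
    Illustrative sketch only; names and layout are assumptions.
    """
    n = len(b)
    # Augmented matrix [A | b], copied so the inputs are not modified.
    M = [[float(v) for v in row] + [float(b[i])] for i, row in enumerate(A)]
    col_perm = list(range(n))  # col_perm[k] = original index of column k
    for k in range(n):
        # Complete pivoting: largest |entry| in the trailing submatrix.
        pr, pc = max(
            ((i, j) for i in range(k, n) for j in range(k, n)),
            key=lambda ij: abs(M[ij[0]][ij[1]]),
        )
        if M[pr][pc] == 0.0:
            raise ValueError("matrix is singular")
        M[k], M[pr] = M[pr], M[k]            # row swap
        if pc != k:                          # column swap (unknowns reorder)
            for row in M:
                row[k], row[pc] = row[pc], row[k]
            col_perm[k], col_perm[pc] = col_perm[pc], col_perm[k]
        for i in range(k + 1, n):            # eliminate below the pivot
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    # Back substitution in the permuted ordering.
    y = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = M[i][n] - sum(M[i][j] * y[j] for j in range(i + 1, n))
        y[i] = s / M[i][i]
    # Undo the column permutation.
    x = [0.0] * n
    for k in range(n):
        x[col_perm[k]] = y[k]
    return x


def solve_batch(systems):
    """Solve many independent small systems; each iteration of this loop
    plays the role that a single GPU thread would play in a batched solver."""
    return [solve_complete_pivoting(A, b) for A, b in systems]
```

A call such as `solve_batch([(A1, b1), (A2, b2)])` returns one solution vector per system. Complete pivoting searches the whole trailing submatrix at every step, which costs more than partial pivoting but is the variant the abstract associates with higher numerical accuracy.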
Application of Nearly Linear Solvers to Electric Power System Computation

NASA Astrophysics Data System (ADS)

Grant, Lisa L.

To meet the future needs of the electric power system, improvements need to be made in the areas of power system algorithms, simulation, and modeling, specifically to achieve a time frame that is useful to industry. If power system time-domain simulations could run in real-time, then system operators would have situational awareness to implement online control and avoid cascading failures, significantly improving power system reliability. Several power system applications rely on the solution of a very large linear system. As the demands on power systems continue to grow, there is a greater computational complexity involved in solving these large linear systems within reasonable time. This project expands on the current work in fast linear solvers, developed for solving symmetric and diagonally dominant linear systems, in order to produce power system specific methods that can be solved in nearly-linear run times. The work explores a new theoretical method that is based on ideas in graph theory and combinatorics. The technique builds a chain of progressively smaller approximate systems with preconditioners based on the system's low stretch spanning tree. The method is compared to traditional linear solvers and shown to reduce the time and iterations required for an accurate solution, especially as the system size increases. A simulation validation is performed, comparing the solution capabilities of the chain method to LU factorization, which is the standard linear solver for power flow. The chain method was successfully demonstrated to produce accurate solutions for power flow simulation on a number of IEEE test cases, and a discussion on how to further improve the method's speed and accuracy is included.
A computational study of the use of an optimization-based method for simulating large multibody systems.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Petra, C.; Gavrea, B.; Anitescu, M.

2009-01-01

The present work aims at comparing the performance of several quadratic programming (QP) solvers for simulating large-scale frictional rigid-body systems. Traditional time-stepping schemes for simulation of multibody systems are formulated as linear complementarity problems (LCPs) with copositive matrices. Such LCPs are generally solved by means of Lemke-type algorithms, and solvers such as the PATH solver have proved to be robust. However, for large systems, the PATH solver or any other pivotal algorithm becomes impractical from a computational point of view. The convex relaxation proposed by one of the authors allows the formulation of the integration step as a QPD, for which a wide variety of state-of-the-art solvers are available. In what follows we report the results obtained solving that subproblem when using the QP solvers MOSEK, OOQP, TRON, and BLMVM. OOQP is presented with both the symmetric indefinite solver MA27 and our Cholesky reformulation using the CHOLMOD package. We investigate computational performance and address the correctness of the results from a modeling point of view. We conclude that the OOQP solver, particularly with the CHOLMOD linear algebra solver, has predictable performance and memory use patterns and is far more competitive for these problems than are the other solvers.
Scalable domain decomposition solvers for stochastic PDEs in high performance computing

DOE PAGES

Desai, Ajit; Khalil, Mohammad; Pettit, Chris; ...

2017-09-21

Stochastic spectral finite element models of practical engineering systems may involve solutions of linear systems or linearized systems for non-linear problems with billions of unknowns. For stochastic modeling, it is therefore essential to design robust, parallel and scalable algorithms that can efficiently utilize high-performance computing to tackle such large-scale systems. Domain decomposition based iterative solvers can handle such systems. Although these algorithms exhibit excellent scalability, significant algorithmic and implementational challenges exist to extend them to solve extreme-scale stochastic systems using emerging computing platforms. Intrusive polynomial chaos expansion based domain decomposition algorithms are extended here to concurrently handle high resolution in both spatial and stochastic domains using an in-house implementation. Sparse iterative solvers with efficient preconditioners are employed to solve the resulting global and subdomain level local systems through multi-level iterative solvers. We also use parallel sparse matrix–vector operations to reduce the floating-point operations and memory requirements. Numerical and parallel scalabilities of these algorithms are presented for the diffusion equation having spatially varying diffusion coefficient modeled by a non-Gaussian stochastic process. Scalability of the solvers with respect to the number of random variables is also investigated.
A Fast Solver for Implicit Integration of the Vlasov--Poisson System in the Eulerian Framework

DOE PAGES

Garrett, C. Kristopher; Hauck, Cory D.

2018-04-05

In this paper, we present a domain decomposition algorithm to accelerate the solution of Eulerian-type discretizations of the linear, steady-state Vlasov equation. The steady-state solver then forms a key component in the implementation of fully implicit or nearly fully implicit temporal integrators for the nonlinear Vlasov--Poisson system. The solver relies on a particular decomposition of phase space that enables the use of sweeping techniques commonly used in radiation transport applications. The original linear system for the phase space unknowns is then replaced by a smaller linear system involving only unknowns on the boundary between subdomains, which can then be solved efficiently with Krylov methods such as GMRES. Steady-state solves are combined to form an implicit Runge--Kutta time integrator, and the Vlasov equation is coupled self-consistently to the Poisson equation via a linearized procedure or a nonlinear fixed-point method for the electric field. Finally, numerical results for standard test problems demonstrate the efficiency of the domain decomposition approach when compared to the direct application of an iterative solver to the original linear system.
Acceleration of Linear Finite-Difference Poisson-Boltzmann Methods on Graphics Processing Units.

PubMed

Qi, Ruxi; Botello-Smith, Wesley M; Luo, Ray

2017-07-11

Electrostatic interactions play crucial roles in biophysical processes such as protein folding and molecular recognition. Poisson-Boltzmann equation (PBE)-based models have emerged as widely used in modeling these important processes. Though great efforts have been put into developing efficient PBE numerical models, challenges still remain due to the high dimensionality of typical biomolecular systems. In this study, we implemented and analyzed commonly used linear PBE solvers for the ever-improving graphics processing units (GPU) for biomolecular simulations, including both standard and preconditioned conjugate gradient (CG) solvers with several alternative preconditioners. Our implementation utilizes the standard Nvidia CUDA libraries cuSPARSE, cuBLAS, and CUSP. Extensive tests show that good numerical accuracy can be achieved given that the single precision is often used for numerical applications on GPU platforms. The optimal GPU performance was observed with the Jacobi-preconditioned CG solver, with a significant speedup over standard CG solver on CPU in our diversified test cases. Our analysis further shows that different matrix storage formats also considerably affect the efficiency of different linear PBE solvers on GPU, with the diagonal format best suited for our standard finite-difference linear systems. Further efficiency may be possible with matrix-free operations and integrated grid stencil setup specifically tailored for the banded matrices in PBE-specific linear systems.
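As a rough illustration of the Jacobi-preconditioned conjugate gradient method named in this abstract, here is a minimal pure-Python sketch on a dense list-of-lists matrix. It is not the authors' GPU implementation and does not use cuSPARSE, cuBLAS, or CUSP; the function name `jacobi_pcg`, its parameters, and the dense storage are assumptions made for readability.

```python
def jacobi_pcg(A, b, tol=1e-10, max_iter=1000):
    """Jacobi-preconditioned CG for a symmetric positive-definite A.

    Returns (x, iterations). Illustrative sketch only; a production
    solver would use sparse storage and vectorized kernels.
    """
    n = len(b)
    inv_diag = [1.0 / A[i][i] for i in range(n)]   # Jacobi: M^{-1} = diag(A)^{-1}
    x = [0.0] * n
    r = [float(v) for v in b]                      # residual b - A x for x = 0
    z = [inv_diag[i] * r[i] for i in range(n)]     # preconditioned residual
    p = list(z)                                    # first search direction
    rz = sum(r[i] * z[i] for i in range(n))
    b_norm = sum(v * v for v in b) ** 0.5 or 1.0   # avoid dividing by zero
    for k in range(max_iter):
        # Stop when the relative residual norm is small enough.
        if sum(v * v for v in r) ** 0.5 <= tol * b_norm:
            return x, k
        Ap = [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rz / sum(p[i] * Ap[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        z = [inv_diag[i] * r[i] for i in range(n)]
        rz_new = sum(r[i] * z[i] for i in range(n))
        beta = rz_new / rz
        p = [z[i] + beta * p[i] for i in range(n)]
        rz = rz_new
    return x, max_iter
```

For a finite-difference matrix with a constant diagonal, the Jacobi preconditioner only rescales the system, so its benefit comes mainly from problems whose diagonal entries vary. The sketch stores the matrix densely for brevity, whereas the abstract points to the diagonal storage format as the better fit for banded finite-difference systems.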
A high performance linear equation solver on the VPP500 parallel supercomputer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nakanishi, Makoto; Ina, Hiroshi; Miura, Kenichi

1994-12-31

This paper describes the implementation of two high performance linear equation solvers developed for the Fujitsu VPP500, a distributed memory parallel supercomputer system. The solvers take advantage of the key architectural features of the VPP500: (1) scalability for an arbitrary number of processors up to 222 processors, (2) flexible data transfer among processors provided by a crossbar interconnection network, (3) vector processing capability on each processor, and (4) overlapped computation and transfer. The general linear equation solver based on the blocked LU decomposition method achieves 120.0 GFLOPS performance with 100 processors in the LINPACK Highly Parallel Computing benchmark.
Summer Proceedings 2016: The Center for Computing Research at Sandia National Laboratories

DOE Office of Scientific and Technical Information (OSTI.GOV)

Carleton, James Brian; Parks, Michael L.

Solving sparse linear systems from the discretization of elliptic partial differential equations (PDEs) is an important building block in many engineering applications. Sparse direct solvers can solve general linear systems, but are usually slower and use much more memory than effective iterative solvers. To overcome these two disadvantages, a hierarchical solver (LoRaSp) based on H2-matrices was introduced in [22]. Here, we have developed a parallel version of the algorithm in LoRaSp to solve large sparse matrices on distributed memory machines. On a single processor, the factorization time of our parallel solver scales almost linearly with the problem size for three-dimensional problems, as opposed to the quadratic scalability of many existing sparse direct solvers. Moreover, our solver leads to almost constant numbers of iterations, when used as a preconditioner for Poisson problems. On more than one processor, our algorithm has significant speedups compared to sequential runs. With this parallel algorithm, we are able to solve large problems much faster than many existing packages as demonstrated by the numerical experiments.
A Lagrangian meshfree method applied to linear and nonlinear elasticity.

PubMed

Walker, Wade A

2017-01-01

The repeated replacement method (RRM) is a Lagrangian meshfree method which we have previously applied to the Euler equations for compressible fluid flow. In this paper we present new enhancements to RRM, and we apply the enhanced method to both linear and nonlinear elasticity. We compare the results of ten test problems to those of analytic solvers, to demonstrate that RRM can successfully simulate these elastic systems without many of the requirements of traditional numerical methods such as numerical derivatives, equation system solvers, or Riemann solvers. We also show the relationship between error and computational effort for RRM on these systems, and compare RRM to other methods to highlight its strengths and weaknesses. And to further explain the two elastic equations used in the paper, we demonstrate the mathematical procedure used to create Riemann and Sedov-Taylor solvers for them, and detail the numerical techniques needed to embody those solvers in code. PMID:29045443
On improving linear solver performance: a block variant of GMRES

DOE Office of Scientific and Technical Information (OSTI.GOV)

Baker, A H; Dennis, J M; Jessup, E R

2004-05-10

The increasing gap between processor performance and memory access time warrants the re-examination of data movement in iterative linear solver algorithms. For this reason, we explore and establish the feasibility of modifying a standard iterative linear solver algorithm in a manner that reduces the movement of data through memory. In particular, we present an alternative to the restarted GMRES algorithm for solving a single right-hand side linear system Ax = b based on solving the block linear system AX = B. Algorithm performance, i.e. time to solution, is improved by using the matrix A in operations on groups of vectors. Experimental results demonstrate the importance of implementation choices on data movement as well as the effectiveness of the new method on a variety of problems from different application areas.
Linear solver performance in elastoplastic problem solution on GPU cluster

NASA Astrophysics Data System (ADS)

Khalevitsky, Yu. V.; Konovalov, A. V.; Burmasheva, N. V.; Partin, A. S.

2017-12-01

Applying the finite element method to severe plastic deformation problems involves solving linear equation systems. While the solution procedure is relatively hard to parallelize and computationally intensive by itself, a long series of large scale systems needs to be solved for each problem. When dealing with fine computational meshes, such as in the simulations of three-dimensional metal matrix composite microvolume deformation, tens and hundreds of hours may be needed to complete the whole solution procedure, even using modern supercomputers. In general, one of the preconditioned Krylov subspace methods is used in a linear solver for such problems. The convergence of the method depends strongly on the operator spectrum of the problem's stiffness matrix. In order to choose the appropriate method, a series of computational experiments is used. Different methods may be preferable for different computational systems for the same problem. In this paper we present experimental data obtained by solving linear equation systems from an elastoplastic problem on a GPU cluster. The data can be used to substantiate the choice of the appropriate method for a linear solver to use in severe plastic deformation simulations.
FoSSI: the family of simplified solver interfaces for the rapid development of parallel numerical atmosphere and ocean models

NASA Astrophysics Data System (ADS)

Frickenhaus, Stephan; Hiller, Wolfgang; Best, Meike

The portable software FoSSI is introduced that, in combination with additional free solver software packages, allows for an efficient and scalable parallel solution of large sparse linear equation systems arising in finite element model codes. FoSSI is intended to support rapid model code development, completely hiding the complexity of the underlying solver packages. In particular, the model developer need not be an expert in parallelization and is yet free to switch between different solver packages by simple modifications of the interface call. FoSSI offers an efficient and easy, yet flexible interface to several parallel solvers, most of them available on the web, such as PETSC, AZTEC, MUMPS, PILUT and HYPRE. FoSSI makes use of the concept of handles for vectors, matrices, preconditioners and solvers, which is frequently used in solver libraries. Hence, FoSSI allows for a flexible treatment of several linear equation systems and associated preconditioners at the same time, even in parallel on separate MPI-communicators. The second special feature in FoSSI is the task specifier, being a combination of keywords, each configuring a certain phase in the solver setup. This enables the user to control a solver over one unique subroutine. Furthermore, FoSSI has rather similar features for all solvers, making a fast solver intercomparison or exchange an easy task. FoSSI is a community software, proven in an adaptive 2D-atmosphere model and a 3D-primitive equation ocean model, both formulated in finite elements. The present paper discusses perspectives of an OpenMP-implementation of parallel iterative solvers based on domain decomposition methods. This approach to OpenMP solvers is rather attractive, as the code for domain-local operations of factorization, preconditioning and matrix-vector product can be readily taken from a sequential implementation that is also suitable to be used in an MPI-variant. Code development in this direction is in an advanced state under the name ScOPES: the Scalable Open Parallel sparse linear Equations Solver.
ASTROP2-LE: A Mistuned Aeroelastic Analysis System Based on a Two Dimensional Linearized Euler Solver

NASA Technical Reports Server (NTRS)

Reddy, T. S. R.; Srivastava, R.; Mehmed, Oral

2002-01-01

An aeroelastic analysis system for flutter and forced response analysis of turbomachines based on a two-dimensional linearized unsteady Euler solver has been developed. The ASTROP2 code, an aeroelastic stability analysis program for turbomachinery, was used as a basis for this development. The ASTROP2 code uses strip theory to couple a two-dimensional aerodynamic model with a three-dimensional structural model. The code was modified to include forced response capability. The formulation was also modified to include aeroelastic analysis with mistuning. A linearized unsteady Euler solver, LINFLX2D, is added to model the unsteady aerodynamics in ASTROP2. By calculating the unsteady aerodynamic loads using LINFLX2D, it is possible to include the effects of transonic flow on flutter and forced response in the analysis. The stability is inferred from an eigenvalue analysis. The revised code, ASTROP2-LE (ASTROP2 using Linearized Euler aerodynamics), is validated by comparing the predictions with those obtained using linear unsteady aerodynamic solutions.
LAPACKrc: Fast linear algebra kernels/solvers for FPGA accelerators

NASA Astrophysics Data System (ADS)

Gonzalez, Juan; Núñez, Rafael C.

2009-07-01

We present LAPACKrc, a family of FPGA-based linear algebra solvers able to achieve more than 100x speedup per commodity processor on certain problems. LAPACKrc subsumes some of the LAPACK and ScaLAPACK functionalities, and it also incorporates sparse direct and iterative matrix solvers. Current LAPACKrc prototypes demonstrate between 40x-150x speedup compared against top-of-the-line hardware/software systems. A technology roadmap is in place to validate current performance of LAPACKrc in HPC applications, and to increase the computational throughput by factors of hundreds within the next few years.

Multigrid approaches to non-linear diffusion problems on unstructured meshes

NASA Technical Reports Server (NTRS)

Mavriplis, Dimitri J.; Bushnell, Dennis M. (Technical Monitor)

2001-01-01

The efficiency of three multigrid methods for solving highly non-linear diffusion problems on two-dimensional unstructured meshes is examined. The three multigrid methods differ mainly in the manner in which the nonlinearities of the governing equations are handled. These comprise a non-linear full approximation storage (FAS) multigrid method which is used to solve the non-linear equations directly, a linear multigrid method which is used to solve the linear system arising from a Newton linearization of the non-linear system, and a hybrid scheme which is based on a non-linear FAS multigrid scheme, but employs a linear solver on each level as a smoother. Results indicate that all methods are equally effective at converging the non-linear residual in a given number of grid sweeps, but that the linear solver is more efficient in cpu time due to the lower cost of linear versus non-linear grid sweeps.
MODFLOW-2000, The U.S. Geological Survey Modular Ground-Water Model -- GMG Linear Equation Solver Package Documentation

USGS Publications Warehouse

Wilson, John D.; Naff, Richard L.

2004-01-01

A geometric multigrid solver (GMG), based on the preconditioned conjugate gradient algorithm, has been developed for solving systems of equations resulting from applying the cell-centered finite difference algorithm to flow in porous media. This solver has been adapted to the U.S. Geological Survey ground-water flow model MODFLOW-2000. The documentation herein is a description of the solver and the adaptation to MODFLOW-2000.
Fault tolerance in an inner-outer solver: A GVR-enabled case study

DOE PAGES

Zhang, Ziming; Chien, Andrew A.; Teranishi, Keita

2015-04-18

Resilience is a major challenge for large-scale systems. It is particularly important for iterative linear solvers, since they take much of the time of many scientific applications. We show that single bit flip errors in the Flexible GMRES iterative linear solver can lead to high computational overhead or even failure to converge to the right answer. Informed by these results, we design and evaluate several strategies for fault tolerance in both inner and outer solvers appropriate across a range of error rates. We implement them, extending Trilinos’ solver library with the Global View Resilience (GVR) programming model, which provides multi-stream snapshots, multi-version data structures with portable and rich error checking/recovery. Lastly, experimental results validate correct execution with low performance overhead under varied error conditions.
Compact tunable silicon photonic differential-equation solver for general linear time-invariant systems.

PubMed

Wu, Jiayang; Cao, Pan; Hu, Xiaofeng; Jiang, Xinhong; Pan, Ting; Yang, Yuxing; Qiu, Ciyuan; Tremblay, Christine; Su, Yikai

2014-10-20

We propose and experimentally demonstrate an all-optical temporal differential-equation solver that can be used to solve ordinary differential equations (ODEs) characterizing general linear time-invariant (LTI) systems. The photonic device, implemented by an add-drop microring resonator (MRR) with two tunable interferometric couplers, is monolithically integrated on a silicon-on-insulator (SOI) wafer with a compact footprint of ~60 μm × 120 μm. By thermally tuning the phase shifts along the bus arms of the two interferometric couplers, the proposed device is capable of solving first-order ODEs with two variable coefficients. The operation principle is theoretically analyzed, and system testing for solving ODEs with tunable coefficients is carried out for 10-Gb/s optical Gaussian-like pulses. The experimental results verify the effectiveness of the fabricated device as a tunable photonic ODE solver.
Assessment of Linear Finite-Difference Poisson-Boltzmann Solvers

PubMed Central

Wang, Jun; Luo, Ray

2009-01-01

CPU time and memory usage are two vital issues that any numerical solver for the Poisson-Boltzmann equation has to face in biomolecular applications. In this study we systematically analyzed the CPU time and memory usage of five commonly used finite-difference solvers with a large and diversified set of biomolecular structures. Our comparative analysis shows that modified incomplete Cholesky conjugate gradient and geometric multigrid are the most efficient in the diversified test set. For the two efficient solvers, our test shows that their CPU times increase approximately linearly with the numbers of grids. Their CPU times also increase almost linearly with the negative logarithm of the convergence criterion at a very similar rate. Our comparison further shows that geometric multigrid performs better in the large set of tested biomolecules. However, modified incomplete Cholesky conjugate gradient is superior to geometric multigrid in molecular dynamics simulations of tested molecules. We also investigated other significant components in numerical solutions of the Poisson-Boltzmann equation. It turns out that the time-limiting step is the free boundary condition setup for the linear systems for the selected proteins if electrostatic focusing is not used. Thus, development of future numerical solvers for the Poisson-Boltzmann equation should balance all aspects of the numerical procedures in realistic biomolecular applications. PMID:20063271
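The near-linear growth of CPU time with the negative logarithm of the convergence criterion is what one would expect from a linearly convergent iteration. As a rough sketch, assume (this is an assumption for illustration, not a result of the study) that the residual norm shrinks by a roughly constant factor ρ per iteration, with 0 < ρ < 1, and that ε denotes the relative convergence criterion:

```latex
\|r_k\| \approx \rho^{k}\,\|r_0\|,
\qquad
\frac{\|r_k\|}{\|r_0\|} \le \varepsilon
\;\Longrightarrow\;
k \ge \frac{-\log \varepsilon}{-\log \rho}.
```

If each iteration costs about the same, the total CPU time is then proportional to −log ε with slope 1/(−log ρ), so two solvers with similar contraction factors and per-iteration costs would show similar slopes.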
SSE-based Thomas algorithm for quasi-block-tridiagonal linear equation systems, optimized for small dense blocks

NASA Astrophysics Data System (ADS)

Barnaś, Dawid; Bieniasz, Lesław K.

2017-07-01

We have recently developed a vectorized Thomas solver for quasi-block tridiagonal linear algebraic equation systems using Streaming SIMD Extensions (SSE) and Advanced Vector Extensions (AVX) in operations on dense blocks [D. Barnaś and L. K. Bieniasz, Int. J. Comput. Meth., accepted]. The acceleration caused by vectorization was observed for large block sizes, but was less satisfactory for small blocks. In this communication we report on another version of the solver, optimized for small blocks of size up to four rows and/or columns.
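For reference, the scalar Thomas algorithm that such block solvers generalize can be sketched as below. This is a plain, unvectorized illustration for an ordinary tridiagonal system, not the authors' SSE/AVX quasi-block solver; the name `thomas` and the argument layout are assumptions.

```python
def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system by forward elimination and back substitution.

    lower[i] multiplies x[i-1] in row i (lower[0] is unused) and
    upper[i] multiplies x[i+1] in row i (upper[-1] is unused).
    No pivoting is done, so the matrix should be, e.g., diagonally dominant.
    """
    n = len(diag)
    c = [0.0] * n   # modified super-diagonal
    d = [0.0] * n   # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


# 4x4 system with matrix tridiag(-1, 2, -1) and exact solution [1, 2, 3, 4].
x = thomas([0.0, -1.0, -1.0, -1.0], [2.0] * 4, [-1.0, -1.0, -1.0, 0.0],
           [0.0, 0.0, 0.0, 5.0])
```

In a block version, the scalar divisions become solves with small dense blocks, which is where vectorization of the block operations comes into play.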
The Use of Sparse Direct Solver in Vector Finite Element Modeling for Calculating Two Dimensional (2-D) Magnetotelluric Responses in Transverse Electric (TE) Mode

NASA Astrophysics Data System (ADS)

Yihaa Roodhiyah, Lisa’; Tjong, Tiffany; Nurhasan; Sutarno, D.

2018-04-01

In earlier research, the linear systems arising from vector finite element modeling of two-dimensional (2-D) magnetotelluric (MT) responses in TE mode were solved with a non-sparse direct solver. That approach has weaknesses that need to be addressed, in particular accuracy at low frequencies (10^-3 Hz to 10^-5 Hz), which has not yet been achieved, and high computational cost on dense meshes. In this work, a sparse direct solver is used instead of the non-sparse direct solver to overcome these weaknesses. A sparse direct solver is advantageous for the linear systems of the vector finite element method because the matrix is symmetric and sparse. The sparse direct solver has been validated against analytical solutions for a homogeneous half-space model and a vertical contact model. The validation results show that the sparse direct solver is more stable than the non-sparse direct solver for the linear problems of the vector finite element method, especially at low frequencies. As a result, accurate 2-D MT response modeling at low frequencies (10^-3 Hz to 10^-5 Hz) has been achieved with efficient array memory allocation and less computation time.
The solution of linear systems of equations with a structural analysis code on the NAS CRAY-2

NASA Technical Reports Server (NTRS)

Poole, Eugene L.; Overman, Andrea L.

1988-01-01

Two methods for solving linear systems of equations on the NAS Cray-2 are described. One is a direct method; the other is an iterative method. Both methods exploit the architecture of the Cray-2, particularly the vectorization, and are aimed at structural analysis applications. To demonstrate and evaluate the methods, they were installed in a finite element structural analysis code denoted the Computational Structural Mechanics (CSM) Testbed. A description of the techniques used to integrate the two solvers into the Testbed is given. Storage schemes, memory requirements, operation counts, and reformatting procedures are discussed. Finally, results from the new methods are compared with results from the initial Testbed sparse Choleski equation solver for three structural analysis problems. The new direct solvers described achieve the highest computational rates of the methods compared. The new iterative methods are not able to achieve as high computation rates as the vectorized direct solvers but are best for well conditioned problems which require fewer iterations to converge to the solution.
Multidimensional Riemann problem with self-similar internal structure - part III - a multidimensional analogue of the HLLI Riemann solver for conservative hyperbolic systems

NASA Astrophysics Data System (ADS)

Balsara, Dinshaw S.; Nkonga, Boniface

2017-10-01

Just as the quality of a one-dimensional approximate Riemann solver is improved by the inclusion of internal sub-structure, the quality of a multidimensional Riemann solver is also similarly improved. Such multidimensional Riemann problems arise when multiple states come together at the vertex of a mesh. The interaction of the resulting one-dimensional Riemann problems gives rise to a strongly-interacting state. We wish to endow this strongly-interacting state with physically-motivated sub-structure. The fastest way of endowing such sub-structure consists of making a multidimensional extension of the HLLI Riemann solver for hyperbolic conservation laws. Presenting such a multidimensional analogue of the HLLI Riemann solver with linear sub-structure for use on structured meshes is the goal of this work. The multidimensional MuSIC Riemann solver documented here is universal in the sense that it can be applied to any hyperbolic conservation law. The multidimensional Riemann solver is made to be consistent with constraints that emerge naturally from the Galerkin projection of the self-similar states within the wave model. When the full eigenstructure in both directions is used in the present Riemann solver, it becomes a complete Riemann solver in a multidimensional sense. I.e., all the intermediate waves are represented in the multidimensional wave model. The work also presents, for the very first time, an important analysis of the dissipation characteristics of multidimensional Riemann solvers. The present Riemann solver results in the most efficient implementation of a multidimensional Riemann solver with sub-structure. Because it preserves stationary linearly degenerate waves, it might also help with well-balancing. Implementation-related details are presented in pointwise fashion for the one-dimensional HLLI Riemann solver as well as the multidimensional MuSIC Riemann solver.
Architecting the Finite Element Method Pipeline for the GPU.

PubMed

Fu, Zhisong; Lewis, T James; Kirby, Robert M; Whitaker, Ross T

2014-02-01

The finite element method (FEM) is a widely employed numerical technique for approximating the solution of partial differential equations (PDEs) in various science and engineering applications. Many of these applications benefit from fast execution of the FEM pipeline. One way to accelerate the FEM pipeline is by exploiting advances in modern computational hardware, such as the many-core streaming processors like the graphical processing unit (GPU). In this paper, we present the algorithms and data-structures necessary to move the entire FEM pipeline to the GPU. First we propose an efficient GPU-based algorithm to generate local element information and to assemble the global linear system associated with the FEM discretization of an elliptic PDE. To solve the corresponding linear system efficiently on the GPU, we implement a conjugate gradient method preconditioned with a geometry-informed algebraic multi-grid (AMG) method preconditioner. We propose a new fine-grained parallelism strategy, a corresponding multigrid cycling stage and efficient data mapping to the many-core architecture of GPU. Comparison of our on-GPU assembly versus a traditional serial implementation on the CPU achieves up to an 87 × speedup. Focusing on the linear system solver alone, we achieve a speedup of up to 51 × versus use of a comparable state-of-the-art serial CPU linear system solver. Furthermore, the method compares favorably with other GPU-based, sparse, linear solvers.
Improving the energy efficiency of sparse linear system solvers on multicore and manycore systems.

PubMed

Anzt, H; Quintana-Ortí, E S

2014-06-28

While most recent breakthroughs in scientific research rely on complex simulations carried out in large-scale supercomputers, the power draft and energy spent for this purpose is increasingly becoming a limiting factor to this trend. In this paper, we provide an overview of the current status in energy-efficient scientific computing by reviewing different technologies used to monitor power draft as well as power- and energy-saving mechanisms available in commodity hardware. For the particular domain of sparse linear algebra, we analyse the energy efficiency of a broad collection of hardware architectures and investigate how algorithmic and implementation modifications can improve the energy performance of sparse linear system solvers, without negatively impacting their performance. © 2014 The Author(s) Published by the Royal Society. All rights reserved.
A Numerical Study of Scalable Cardiac Electro-Mechanical Solvers on HPC Architectures

PubMed Central

Colli Franzone, Piero; Pavarino, Luca F.; Scacchi, Simone

2018-01-01

We introduce and study some scalable domain decomposition preconditioners for cardiac electro-mechanical 3D simulations on parallel HPC (High Performance Computing) architectures. The electro-mechanical model of the cardiac tissue is composed of four coupled sub-models: (1) the static finite elasticity equations for the transversely isotropic deformation of the cardiac tissue; (2) the active tension model describing the dynamics of the intracellular calcium, cross-bridge binding and myofilament tension; (3) the anisotropic Bidomain model describing the evolution of the intra- and extra-cellular potentials in the deforming cardiac tissue; and (4) the ionic membrane model describing the dynamics of ionic currents, gating variables, ionic concentrations and stretch-activated channels. This strongly coupled electro-mechanical model is discretized in time with a splitting semi-implicit technique and in space with isoparametric finite elements. The resulting scalable parallel solver is based on Multilevel Additive Schwarz preconditioners for the solution of the Bidomain system and on BDDC preconditioned Newton-Krylov solvers for the non-linear finite elasticity system. The results of several 3D parallel simulations show the scalability of both linear and non-linear solvers and their application to the study of both physiological excitation-contraction cardiac dynamics and re-entrant waves in the presence of different mechano-electrical feedbacks. PMID:29674971
Ensemble Grouping Strategies for Embedded Stochastic Collocation Methods Applied to Anisotropic Diffusion Problems

DOE PAGES

D'Elia, M.; Edwards, H. C.; Hu, J.; ...

2018-01-18

Previous work has demonstrated that propagating groups of samples, called ensembles, together through forward simulations can dramatically reduce the aggregate cost of sampling-based uncertainty propagation methods [E. Phipps, M. D'Elia, H. C. Edwards, M. Hoemmen, J. Hu, and S. Rajamanickam, SIAM J. Sci. Comput., 39 (2017), pp. C162--C193]. However, critical to the success of this approach when applied to challenging problems of scientific interest is the grouping of samples into ensembles to minimize the total computational work. For example, the total number of linear solver iterations for ensemble systems may be strongly influenced by which samples form the ensemble when applying iterative linear solvers to parameterized and stochastic linear systems. In this paper we explore sample grouping strategies for local adaptive stochastic collocation methods applied to PDEs with uncertain input data, in particular canonical anisotropic diffusion problems where the diffusion coefficient is modeled by truncated Karhunen--Loève expansions. Finally, we demonstrate that a measure of the total anisotropy of the diffusion coefficient is a good surrogate for the number of linear solver iterations for each sample and therefore provides a simple and effective metric for grouping samples.
A Block Preconditioned Conjugate Gradient-type Iterative Solver for Linear Systems in Thermal Reservoir Simulation

NASA Astrophysics Data System (ADS)

Betté, Srinivas; Diaz, Julio C.; Jines, William R.; Steihaug, Trond

1986-11-01

A preconditioned residual-norm-reducing iterative solver is described. Based on a truncated form of the generalized-conjugate-gradient method for nonsymmetric systems of linear equations, the iterative scheme is very effective for linear systems generated in reservoir simulation of thermal oil recovery processes. As a consequence of employing an adaptive implicit finite-difference scheme to solve the model equations, the number of variables per cell-block varies dynamically over the grid. The data structure allows for 5- and 9-point operators in the areal model, 5-point in the cross-sectional model, and 7- and 11-point operators in the three-dimensional model. Block-diagonal-scaling of the linear system, done prior to iteration, is found to have a significant effect on the rate of convergence. Block-incomplete-LU-decomposition (BILU) and block-symmetric-Gauss-Seidel (BSGS) methods, which result in no fill-in, are used as preconditioning procedures. A full factorization is done on the well terms, and the cells are ordered in a manner which minimizes the fill-in in the well-column due to this factorization. The convergence criterion for the linear (inner) iteration is linked to that of the nonlinear (Newton) iteration, thereby enhancing the efficiency of the computation. The algorithm, with both BILU and BSGS preconditioners, is evaluated in the context of a variety of thermal simulation problems. The solver is robust and can be used with little or no user intervention.
Application of PDSLin to the magnetic reconnection problem

NASA Astrophysics Data System (ADS)

Yuan, Xuefei; Li, Xiaoye S.; Yamazaki, Ichitaro; Jardin, Stephen C.; Koniges, Alice E.; Keyes, David E.

2013-01-01

Magnetic reconnection is a fundamental process in a magnetized plasma at both low and high magnetic Lundquist numbers (the ratio of the resistive diffusion time to the Alfvén wave transit time), which occurs in a wide variety of laboratory and space plasmas, e.g. magnetic fusion experiments, the solar corona and the Earth's magnetotail. An implicit time advance for the two-fluid magnetic reconnection problem is known to be difficult because of the large condition number of the associated matrix. This is especially troublesome when the collisionless ion skin depth is large so that the Whistler waves, which cause the fast reconnection, dominate the physics (Yuan et al 2012 J. Comput. Phys. 231 5822-53). For small system sizes, a direct solver such as SuperLU can be employed to obtain an accurate solution as long as the condition number is bounded by the reciprocal of the floating-point machine precision. However, SuperLU scales effectively only to hundreds of processors or less. For larger system sizes, it has been shown that physics-based (Chacón and Knoll 2003 J. Comput. Phys. 188 573-92) or other preconditioners can be applied to provide adequate solver performance. In recent years, we have been developing a new algebraic hybrid linear solver, PDSLin (Parallel Domain decomposition Schur complement-based Linear solver) (Yamazaki and Li 2010 Proc. VECPAR pp 421-34 and Yamazaki et al 2011 Technical Report). In this work, we compare numerical results from a direct solver and the proposed hybrid solver for the magnetic reconnection problem and demonstrate that the new hybrid solver is scalable to thousands of processors while maintaining the same robustness as a direct solver.
Evaluation of out-of-core computer programs for the solution of symmetric banded linear equations. [simultaneous equations

NASA Technical Reports Server (NTRS)

Dunham, R. S.

1976-01-01

FORTRAN-coded out-of-core equation solvers that use direct methods to solve symmetric banded systems of simultaneous algebraic equations were evaluated. Banded, frontal and column (skyline) solvers were studied as well as solvers that can partition the working area and thus could fit into any available core. Comparison timings are presented for several typical two dimensional and three dimensional continuum type grids of elements with and without midside nodes. Extensive conclusions are also given.
Decreasing the temporal complexity for nonlinear, implicit reduced-order models by forecasting

DOE PAGES

Carlberg, Kevin; Ray, Jaideep; van Bloemen Waanders, Bart

2015-02-14

Implicit numerical integration of nonlinear ODEs requires solving a system of nonlinear algebraic equations at each time step. Each of these systems is often solved by a Newton-like method, which incurs a sequence of linear-system solves. Most model-reduction techniques for nonlinear ODEs exploit knowledge of the system's spatial behavior to reduce the computational complexity of each linear-system solve. However, the number of linear-system solves for the reduced-order simulation often remains roughly the same as that for the full-order simulation. We propose exploiting knowledge of the model's temporal behavior to (1) forecast the unknown variable of the reduced-order system of nonlinear equations at future time steps, and (2) use this forecast as an initial guess for the Newton-like solver during the reduced-order-model simulation. To compute the forecast, we propose using the Gappy POD technique. As a result, the goal is to generate an accurate initial guess so that the Newton solver requires many fewer iterations to converge, thereby decreasing the number of linear-system solves in the reduced-order-model simulation.
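The effect of a better initial guess on the Newton iteration count can be illustrated on a scalar problem. The sketch below is not the Gappy POD forecast of the paper: it swaps in plain linear extrapolation from the two previous time steps, and the test equation y' = -y^3, the step size, and the function names are all assumptions chosen for the example.

```python
def newton(residual, jacobian, guess, tol=1e-8, max_iter=50):
    """Scalar Newton iteration; returns the root and the number of updates used."""
    y = guess
    for iterations in range(max_iter):
        r = residual(y)
        if abs(r) <= tol:
            return y, iterations
        y -= r / jacobian(y)   # for a system, this is one linear-system solve
    return y, max_iter


def backward_euler(use_forecast, y0=1.0, h=0.1, steps=10):
    """Integrate y' = -y**3 with backward Euler.

    Returns the final value and the total number of Newton updates.
    """
    history = [y0]
    total = 0
    for _ in range(steps):
        y_old = history[-1]
        if use_forecast and len(history) >= 2:
            guess = 2.0 * history[-1] - history[-2]   # linear extrapolation in time
        else:
            guess = y_old                             # usual choice: previous step
        y_new, its = newton(lambda y: y + h * y ** 3 - y_old,
                            lambda y: 1.0 + 3.0 * h * y ** 2,
                            guess)
        history.append(y_new)
        total += its
    return history[-1], total


y_base, iters_base = backward_euler(use_forecast=False)
y_fore, iters_fore = backward_euler(use_forecast=True)
```

Both runs reach the same solution to within the Newton tolerance, but the forecast run needs fewer Newton updates in total, each of which would correspond to a linear-system solve in the multidimensional case.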
Modeling of frequency-domain scalar wave equation with the average-derivative optimal scheme based on a multigrid-preconditioned iterative solver

NASA Astrophysics Data System (ADS)

Cao, Jian; Chen, Jing-Bo; Dai, Meng-Xue

2018-01-01

Efficient finite-difference frequency-domain modeling of seismic wave propagation relies on the discrete schemes and appropriate solving methods. The average-derivative optimal scheme for the scalar wave modeling is advantageous in terms of the storage saving for the system of linear equations and the flexibility for arbitrary directional sampling intervals. However, using an LU-decomposition-based direct solver to solve its resulting system of linear equations is very costly for both memory and computational requirements. To address this issue, we consider establishing a multigrid-preconditioned BI-CGSTAB iterative solver fit for the average-derivative optimal scheme. The choice of preconditioning matrix and its corresponding multigrid components is made with the help of Fourier spectral analysis and local mode analysis, respectively, which is important for the convergence. Furthermore, we find that for the computation with unequal directional sampling interval, the anisotropic smoothing in the multigrid preconditioner may affect the convergence rate of this iterative solver. Successful numerical applications of this iterative solver for the homogeneous and heterogeneous models in 2D and 3D are presented where the significant reduction of computer memory and the improvement of computational efficiency are demonstrated by comparison with the direct solver. In the numerical experiments, we also show that the unequal directional sampling interval will weaken the advantage of this multigrid-preconditioned iterative solver in the computing speed or, even worse, could reduce its accuracy in some cases, which implies the need for a reasonable control of directional sampling interval in the discretization.

Using the Multiplicative Schwarz Alternating Algorithm (MSAA) for Solving the Large Linear System of Equations Related to Global Gravity Field Recovery up to Degree and Order 120

NASA Astrophysics Data System (ADS)

Safari, A.; Sharifi, M. A.; Amjadiparvar, B.

2010-05-01

The GRACE mission has substantiated the low-low satellite-to-satellite tracking (LL-SST) concept. The LL-SST configuration can be combined with the previously realized high-low SST concept in the CHAMP mission to provide a much higher accuracy. The line of sight (LOS) acceleration difference between the GRACE satellite pair is the most commonly used observable for mapping the global gravity field of the Earth in terms of spherical harmonic coefficients. In this paper, mathematical formulae for LOS acceleration difference observations have been derived and the corresponding linear system of equations has been set up for spherical harmonics up to degree and order 120. The total number of unknowns is 14641. Such a linear equation system can be solved with iterative solvers or direct solvers. However, the runtime of direct methods or that of iterative solvers without a suitable preconditioner increases tremendously. This is the reason why we need a more sophisticated method to solve linear systems with a large number of unknowns. The multiplicative variant of the Schwarz alternating algorithm is a domain decomposition method that splits the normal matrix of the system into several smaller overlapping submatrices. In each iteration step, the multiplicative variant of the Schwarz alternating algorithm successively solves linear systems with the matrices obtained from the splitting. It reduces both runtime and memory requirements drastically. In this paper we propose the Multiplicative Schwarz Alternating Algorithm (MSAA) for solving the large linear system of gravity field recovery. The proposed algorithm has been tested on the International Association of Geodesy (IAG)-simulated data of the GRACE mission. The achieved results indicate the validity and efficiency of the proposed algorithm in solving the linear system of equations from accuracy and runtime points of view.
Keywords: Gravity field recovery, Multiplicative Schwarz Alternating Algorithm, Low-Low Satellite-to-Satellite Tracking
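The multiplicative Schwarz idea described in this abstract, solving with overlapping submatrices one after another and always using the latest iterate, can be sketched on a toy problem. This is a minimal illustration, not the authors' MSAA implementation: a small tridiagonal symmetric positive definite matrix stands in for the gravity-field normal matrix, and the function names and block layout are assumptions.

```python
def solve_dense(a, b):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]   # augmented matrix
    for k in range(n):
        pivot = max(range(k, n), key=lambda i: abs(m[i][k]))
        m[k], m[pivot] = m[pivot], m[k]
        for i in range(k + 1, n):
            factor = m[i][k] / m[k][k]
            for j in range(k, n + 1):
                m[i][j] -= factor * m[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def multiplicative_schwarz(a, b, blocks, sweeps):
    """Sweep over overlapping index blocks, correcting x one block at a time."""
    n = len(b)
    x = [0.0] * n
    for _ in range(sweeps):
        for idx in blocks:
            # Residual restricted to this block, computed with the latest x.
            r = [b[i] - sum(a[i][j] * x[j] for j in range(n)) for i in idx]
            sub = [[a[i][j] for j in idx] for i in idx]   # overlapping submatrix
            for i, di in zip(idx, solve_dense(sub, r)):
                x[i] += di
    return x


n = 12
a = [[4.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
b = [1.0] * n
blocks = [list(range(0, 8)), list(range(4, 12))]   # two blocks sharing indices 4..7
x = multiplicative_schwarz(a, b, blocks, sweeps=10)
residual = max(abs(b[i] - sum(a[i][j] * x[j] for j in range(n))) for i in range(n))
```

Only the submatrices are ever factored, which is the source of the memory savings; the price is that several sweeps are needed, and the number of sweeps depends on the overlap and on the matrix.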
A Unified Approach to Optimization

DTIC Science & Technology

2014-10-02

employee scheduling, ad placement, latin squares, disjunctions of linear systems, temporal modeling with interval variables, and traveling salesman problems ...integrating technologies. A key to integrated modeling is to formulate a problem with high-level metaconstraints, which are inspired by the “global... problem substructure to the solver. This contrasts with the atomistic modeling style of mixed integer programming (MIP) and satisfiability (SAT) solvers
Oasis: A high-level/high-performance open source Navier-Stokes solver

NASA Astrophysics Data System (ADS)

Mortensen, Mikael; Valen-Sendstad, Kristian

2015-03-01

Oasis is a high-level/high-performance finite element Navier-Stokes solver written from scratch in Python using building blocks from the FEniCS project (fenicsproject.org). The solver is unstructured and targets large-scale applications in complex geometries on massively parallel clusters. Oasis utilizes MPI and interfaces, through FEniCS, to the linear algebra backend PETSc. Oasis advocates a high-level, programmable user interface through the creation of highly flexible Python modules for new problems. Through the high-level Python interface the user is placed in complete control of every aspect of the solver. A version of the solver that uses piecewise linear elements for both velocity and pressure is shown to reproduce very well the classical, spectral, turbulent channel simulations of Moser et al. (1999). The computational speed is strongly dominated by the iterative solvers provided by the linear algebra backend, which is arguably the best performance any similar implicit solver using PETSc may hope for. Higher order accuracy is also demonstrated and new solvers may be easily added within the same framework.
A comparison of SuperLU solvers on the Intel MIC architecture

NASA Astrophysics Data System (ADS)

Tuncel, Mehmet; Duran, Ahmet; Celebi, M. Serdar; Akaydin, Bora; Topkaya, Figen O.

2016-10-01

In many science and engineering applications, problems may result in solving a sparse linear system AX=B. For example, SuperLU_MCDT, a linear solver, was used for the large penta-diagonal matrices for 2D problems and hepta-diagonal matrices for 3D problems, coming from the incompressible blood flow simulation (see [1]). It is important to test the status and potential improvements of state-of-the-art solvers on new technologies. In this work, sequential, multithreaded and distributed versions of SuperLU solvers (see [2]) are examined on the Intel Xeon Phi coprocessors using the offload programming model at the EURORA cluster of CINECA in Italy. We consider a portfolio of test matrices containing patterned matrices from UFMM ([3]) and randomly located matrices. This architecture can benefit from high parallelism and large vectors. We find that the sequential SuperLU gained up to a 45% performance improvement from offload programming, depending on the sparse matrix type and the size of transferred and processed data.
Higher Order Time Integration Schemes for the Unsteady Navier-Stokes Equations on Unstructured Meshes

NASA Technical Reports Server (NTRS)

Jothiprasad, Giridhar; Mavriplis, Dimitri J.; Caughey, David A.; Bushnell, Dennis M. (Technical Monitor)

2002-01-01

The efficiency gains obtained using higher-order implicit Runge-Kutta schemes as compared with the second-order accurate backward difference schemes for the unsteady Navier-Stokes equations are investigated. Three different algorithms for solving the nonlinear system of equations arising at each timestep are presented. The first algorithm (NMG) is a pseudo-time-stepping scheme which employs a non-linear full approximation storage (FAS) agglomeration multigrid method to accelerate convergence. The other two algorithms are based on Inexact Newton's methods. The linear system arising at each Newton step is solved using iterative/Krylov techniques and left preconditioning is used to accelerate convergence of the linear solvers. One of the methods (LMG) uses Richardson's iterative scheme for solving the linear system at each Newton step while the other (PGMRES) uses the Generalized Minimal Residual method. Results demonstrating the relative superiority of these Newton's methods based schemes are presented. Efficiency gains as high as 10 are obtained by combining the higher-order time integration schemes with the more efficient nonlinear solvers.
Final Report: Subcontract B623868 Algebraic Multigrid solvers for coupled PDE systems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brannick, J.

The Pennsylvania State University (“Subcontractor”) continued to work on the design of algebraic multigrid solvers for coupled systems of partial differential equations (PDEs) arising in numerical modeling of various applications, with a main focus on solving the Dirac equation arising in Quantum Chromodynamics (QCD). The goal of the proposed work was to develop combined geometric and algebraic multilevel solvers that are robust and lend themselves to efficient implementation on massively parallel heterogeneous computers for these QCD systems. The research in these areas built on previous works, focusing on the following three topics: (1) the development of parallel full-multigrid (PFMG) and non-Galerkin coarsening techniques in this framework for solving the Wilson Dirac system; (2) the use of these same Wilson MG solvers for preconditioning the Overlap and Domain Wall formulations of the Dirac equation; and (3) the design and analysis of algebraic coarsening algorithms for coupled PDE systems including Stokes equation, Maxwell equation and linear elasticity.
Algorithmically scalable block preconditioner for fully implicit shallow-water equations in CAM-SE

DOE PAGES

Lott, P. Aaron; Woodward, Carol S.; Evans, Katherine J.

2014-10-19

Performing accurate and efficient numerical simulation of global atmospheric climate models is challenging due to the disparate length and time scales over which physical processes interact. Implicit solvers enable the physical system to be integrated with a time step commensurate with processes being studied. The dominant cost of an implicit time step is the ancillary linear system solves, so we have developed a preconditioner aimed at improving the efficiency of these linear system solves. Our preconditioner is based on an approximate block factorization of the linearized shallow-water equations and has been implemented within the spectral element dynamical core within the Community Atmospheric Model (CAM-SE). Furthermore, in this paper we discuss the development and scalability of the preconditioner for a suite of test cases with the implicit shallow-water solver within CAM-SE.
On the use of finite difference matrix-vector products in Newton-Krylov solvers for implicit climate dynamics with spectral elements

DOE PAGES

Woodward, Carol S.; Gardner, David J.; Evans, Katherine J.

2015-01-01

Efficient solutions of global climate models require effectively handling disparate length and time scales. Implicit solution approaches allow time integration of the physical system with a step size governed by accuracy of the processes of interest rather than by stability of the fastest time scales present. Implicit approaches, however, require the solution of nonlinear systems within each time step. Usually, a Newton's method is applied to solve these systems. Each iteration of the Newton's method, in turn, requires the solution of a linear model of the nonlinear system. This model employs the Jacobian of the problem-defining nonlinear residual, but this Jacobian can be costly to form. If a Krylov linear solver is used for the solution of the linear system, the action of the Jacobian matrix on a given vector is required. In the case of spectral element methods, the Jacobian is not calculated but only implemented through matrix-vector products. The matrix-vector multiply can also be approximated by a finite difference approximation which may introduce inaccuracy in the overall nonlinear solver. In this paper, we review the advantages and disadvantages of finite difference approximations of these matrix-vector products for climate dynamics within the spectral element shallow water dynamical core of the Community Atmosphere Model.
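The finite difference approximation mentioned in this abstract can be sketched as J(u) v ≈ (F(u + eps v) - F(u)) / eps, which needs only residual evaluations. The two-equation residual `F`, the helper names, and the step-size heuristic eps = sqrt(machine epsilon) (1 + ||u||) / ||v|| are assumptions chosen for illustration, not the choices made in the paper.

```python
import math
import sys


def F(u):
    """Hypothetical nonlinear residual with two unknowns."""
    x, y = u
    return [x * x + y - 3.0, math.sin(x) + y * y - 1.0]


def exact_jvp(u, v):
    """Analytic Jacobian-vector product of F, used only as a reference."""
    x, y = u
    return [2.0 * x * v[0] + v[1], math.cos(x) * v[0] + 2.0 * y * v[1]]


def fd_jvp(residual, u, v):
    """Forward-difference approximation of J(u) v using two residual evaluations."""
    norm_v = math.sqrt(sum(vi * vi for vi in v))
    if norm_v == 0.0:
        return [0.0] * len(u)
    norm_u = math.sqrt(sum(ui * ui for ui in u))
    # Assumed heuristic balancing truncation error against round-off error.
    eps = math.sqrt(sys.float_info.epsilon) * (1.0 + norm_u) / norm_v
    base = residual(u)
    shifted = residual([ui + eps * vi for ui, vi in zip(u, v)])
    return [(s - b) / eps for s, b in zip(shifted, base)]


u = [1.0, 2.0]
v = [0.5, -1.0]
approx = fd_jvp(F, u, v)
exact = exact_jvp(u, v)
print(approx, exact)
```

In a Jacobian-free Newton-Krylov solver, a routine like `fd_jvp` would be handed to the Krylov method in place of an assembled matrix. The gap between `approx` and `exact` is the inaccuracy the abstract refers to.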
Performance Models for the Spike Banded Linear System Solver

DOE PAGES

Manguoglu, Murat; Saied, Faisal; Sameh, Ahmed; ...

2011-01-01

With availability of large-scale parallel platforms comprised of tens-of-thousands of processors and beyond, there is significant impetus for the development of scalable parallel sparse linear system solvers and preconditioners. An integral part of this design process is the development of performance models capable of predicting performance and providing accurate cost models for the solvers and preconditioners. There has been some work in the past on characterizing performance of the iterative solvers themselves. In this paper, we investigate the problem of characterizing performance and scalability of banded preconditioners. Recent work has demonstrated the superior convergence properties and robustness of banded preconditioners, compared to state-of-the-art ILU family of preconditioners as well as algebraic multigrid preconditioners. Furthermore, when used in conjunction with efficient banded solvers, banded preconditioners are capable of significantly faster time-to-solution. Our banded solver, the Truncated Spike algorithm, is specifically designed for parallel performance and tolerance to deep memory hierarchies. Its regular structure is also highly amenable to accurate performance characterization. Using these characteristics, we derive the following results in this paper: (i) we develop parallel formulations of the Truncated Spike solver, (ii) we develop a highly accurate pseudo-analytical parallel performance model for our solver, (iii) we show excellent prediction capabilities of our model – based on which we argue the high scalability of our solver. Our pseudo-analytical performance model is based on analytical performance characterization of each phase of our solver. These analytical models are then parameterized using actual runtime information on target platforms. An important consequence of our performance models is that they reveal underlying performance bottlenecks in both serial and parallel formulations. All of our results are validated on diverse heterogeneous multiclusters – platforms for which performance prediction is particularly challenging. Finally, we predict the scalability of the Spike algorithm using up to 65,536 cores with our model. In this paper we extend the results presented in the Ninth International Symposium on Parallel and Distributed Computing.
Simulation of violent free surface flow by AMR method

NASA Astrophysics Data System (ADS)

Hu, Changhong; Liu, Cheng

2018-05-01

A novel CFD approach based on adaptive mesh refinement (AMR) technique is being developed for numerical simulation of violent free surface flows. CIP method is applied to the flow solver and tangent of hyperbola for interface capturing with slope weighting (THINC/SW) scheme is implemented as the free surface capturing scheme. The PETSc library is adopted to solve the linear system. The linear solver is redesigned and modified to satisfy the requirement of the AMR mesh topology. In this paper, our CFD method is outlined and newly obtained results on numerical simulation of violent free surface flows are presented.
AFMPB: An adaptive fast multipole Poisson-Boltzmann solver for calculating electrostatics in biomolecular systems

NASA Astrophysics Data System (ADS)

Lu, Benzhuo; Cheng, Xiaolin; Huang, Jingfang; McCammon, J. Andrew

2010-06-01

A Fortran program package is introduced for rapid evaluation of the electrostatic potentials and forces in biomolecular systems modeled by the linearized Poisson-Boltzmann equation. The numerical solver utilizes a well-conditioned boundary integral equation (BIE) formulation, a node-patch discretization scheme, a Krylov subspace iterative solver package with reverse communication protocols, and an adaptive new version of fast multipole method in which the exponential expansions are used to diagonalize the multipole-to-local translations. The program and its full description, as well as several closely related libraries and utility tools are available at http://lsec.cc.ac.cn/~lubz/afmpb.html and a mirror site at http://mccammon.ucsd.edu/. This paper is a brief summary of the program: the algorithms, the implementation and the usage.
Program summary
Program title: AFMPB: Adaptive fast multipole Poisson-Boltzmann solver
Catalogue identifier: AEGB_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEGB_v1_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: GPL 2.0
No. of lines in distributed program, including test data, etc.: 453 649
No. of bytes in distributed program, including test data, etc.: 8 764 754
Distribution format: tar.gz
Programming language: Fortran
Computer: Any
Operating system: Any
RAM: Depends on the size of the discretized biomolecular system
Classification: 3
External routines: Pre- and post-processing tools are required for generating the boundary elements and for visualization. Users can use MSMS (http://www.scripps.edu/~sanner/html/msms_home.html) for pre-processing, and VMD (http://www.ks.uiuc.edu/Research/vmd/) for visualization.
Sub-programs included: An iterative Krylov subspace solvers package from SPARSKIT by Yousef Saad (http://www-users.cs.umn.edu/~saad/software/SPARSKIT/sparskit.html), and the fast multipole methods subroutines from FMMSuite (http://www.fastmultipole.org/).
Nature of problem: Numerical solution of the linearized Poisson-Boltzmann equation that describes electrostatic interactions of molecular systems in ionic solutions.
Solution method: A novel node-patch scheme is used to discretize the well-conditioned boundary integral equation formulation of the linearized Poisson-Boltzmann equation. Various Krylov subspace solvers can be subsequently applied to solve the resulting linear system, with a bounded number of iterations independent of the number of discretized unknowns. The matrix-vector multiplication at each iteration is accelerated by the adaptive new versions of fast multipole methods. The AFMPB solver requires other stand-alone pre-processing tools for boundary mesh generation, post-processing tools for data analysis and visualization, and can be conveniently coupled with different time stepping methods for dynamics simulation.
Restrictions: Only three or six significant digits options are provided in this version.
Unusual features: Most of the codes are in Fortran77 style. Memory allocation functions from Fortran90 and above are used in a few subroutines.
Additional comments: The current version of the codes is designed and written for single core/processor desktop machines. Check http://lsec.cc.ac.cn/~lubz/afmpb.html and http://mccammon.ucsd.edu/ for updates and changes.
Running time: The running time varies with the number of discretized elements (N) in the system and their distributions. In most cases, it scales linearly as a function of N.
A high-order semi-explicit discontinuous Galerkin solver for 3D incompressible flow with application to DNS and LES of turbulent channel flow

NASA Astrophysics Data System (ADS)

Krank, Benjamin; Fehn, Niklas; Wall, Wolfgang A.; Kronbichler, Martin

2017-11-01

We present an efficient discontinuous Galerkin scheme for simulation of the incompressible Navier-Stokes equations including laminar and turbulent flow. We consider a semi-explicit high-order velocity-correction method for time integration as well as nodal equal-order discretizations for velocity and pressure. The non-linear convective term is treated explicitly while a linear system is solved for the pressure Poisson equation and the viscous term. The key feature of our solver is a consistent penalty term reducing the local divergence error in order to overcome recently reported instabilities in spatially under-resolved high-Reynolds-number flows as well as small time steps. This penalty method is similar to the grad-div stabilization widely used in continuous finite elements. We further review and compare our method to several other techniques recently proposed in literature to stabilize the method for such flow configurations. The solver is specifically designed for large-scale computations through matrix-free linear solvers including efficient preconditioning strategies and tensor-product elements, which have allowed us to scale this code up to 34.4 billion degrees of freedom and 147,456 CPU cores. We validate our code and demonstrate optimal convergence rates with laminar flows present in a vortex problem and flow past a cylinder and show applicability of our solver to direct numerical simulation as well as implicit large-eddy simulation of turbulent channel flow at Reτ = 180 as well as 590.
Solving systems of linear equations by GPU-based matrix factorization in a Science Ground Segment

NASA Astrophysics Data System (ADS)

Legendre, Maxime; Schmidt, Albrecht; Moussaoui, Saïd; Lammers, Uwe

2013-11-01

Recently, Graphics Cards have been used to offload scientific computations from traditional CPUs for greater efficiency. This paper investigates the adaptation of a real-world linear system solver, which plays a central role in the data processing of the Science Ground Segment of ESA's astrometric Gaia mission. The paper quantifies the resource trade-offs between traditional CPU implementations and modern CUDA based GPU implementations. It also analyses the impact on the pipeline architecture and system development. The investigation starts from both a selected baseline algorithm with a reference implementation and a traditional linear system solver and then explores various modifications to control flow and data layout to achieve higher resource efficiency. It turns out that with the current state of the art, the modifications impact non-technical system attributes. For example, the control flow of the original modified Cholesky transform is modified so that locality of the code and verifiability deteriorate. The maintainability of the system is affected as well. On the system level, users will have to deal with more complex configuration control and testing procedures.
Performance of Nonlinear Finite-Difference Poisson-Boltzmann Solvers

PubMed Central

Cai, Qin; Hsieh, Meng-Juei; Wang, Jun; Luo, Ray

2014-01-01

We implemented and optimized seven finite-difference solvers for the full nonlinear Poisson-Boltzmann equation in biomolecular applications, including four relaxation methods, one conjugate gradient method, and two inexact Newton methods. The performance of the seven solvers was extensively evaluated with a large number of nucleic acids and proteins. Worth noting is the inexact Newton method in our analysis. We investigated the role of linear solvers in its performance by incorporating the incomplete Cholesky conjugate gradient and the geometric multigrid into its inner linear loop. We tailored and optimized both linear solvers for faster convergence rate. In addition, we explored strategies to optimize the successive over-relaxation method to reduce its convergence failures without too much sacrifice in its convergence rate. Specifically we attempted to adaptively change the relaxation parameter and to utilize the damping strategy from the inexact Newton method to improve the successive over-relaxation method. Our analysis shows that the nonlinear methods accompanied with a functional-assisted strategy, such as the conjugate gradient method and the inexact Newton method, can guarantee convergence in the tested molecules. Especially the inexact Newton method exhibits impressive performance when it is combined with highly efficient linear solvers that are tailored for its special requirement. PMID:24723843
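To make the role of the relaxation parameter concrete, here is a minimal sketch of successive over-relaxation (SOR) on a linear model problem. It is a simplification under stated assumptions: the matrix is a 1D Poisson stencil instead of the nonlinear Poisson-Boltzmann operator, the relaxation parameter `omega` is fixed instead of adapted, and the function name `sor` and all values are illustrative only.

```python
def sor(A, b, omega, tol=1e-10, max_iter=10_000):
    """Successive over-relaxation; omega = 1.0 reduces to Gauss-Seidel."""
    n = len(b)
    x = [0.0] * n
    for sweep in range(1, max_iter + 1):
        max_change = 0.0
        for i in range(n):
            # Gauss-Seidel value for unknown i, using the latest neighbours.
            sigma = sum(A[i][j] * x[j] for j in range(n) if j != i)
            gs_value = (b[i] - sigma) / A[i][i]
            # Blend the old value and the Gauss-Seidel value with weight omega.
            new_value = (1.0 - omega) * x[i] + omega * gs_value
            max_change = max(max_change, abs(new_value - x[i]))
            x[i] = new_value
        if max_change < tol:
            return x, sweep
    return x, max_iter


# Assumed model problem: tridiagonal (-1, 2, -1) matrix whose solution is all ones.
n = 10
A = [[0.0] * n for _ in range(n)]
for i in range(n):
    A[i][i] = 2.0
    if i > 0:
        A[i][i - 1] = -1.0
    if i < n - 1:
        A[i][i + 1] = -1.0
b = [0.0] * n
b[0] = b[-1] = 1.0

x_gs, it_gs = sor(A, b, omega=1.0)
x_sor, it_sor = sor(A, b, omega=1.5)
print(it_gs, it_sor)
```

On this symmetric positive definite problem any omega in (0, 2) converges, and omega = 1.5 needs noticeably fewer sweeps than Gauss-Seidel. For the nonlinear equation a fixed omega can fail, which is the motivation for the adaptive and damped variants described in the abstract.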
DOE Office of Scientific and Technical Information (OSTI.GOV)

Chen, Chao; Pouransari, Hadi; Rajamanickam, Sivasankaran

We present a parallel hierarchical solver for general sparse linear systems on distributed-memory machines. For large-scale problems, this fully algebraic algorithm is faster and more memory-efficient than sparse direct solvers because it exploits the low-rank structure of fill-in blocks. Depending on the accuracy of low-rank approximations, the hierarchical solver can be used either as a direct solver or as a preconditioner. The parallel algorithm is based on data decomposition and requires only local communication for updating boundary data on every processor. Moreover, the computation-to-communication ratio of the parallel algorithm is approximately the volume-to-surface-area ratio of the subdomain owned by every processor. We also provide various numerical results to demonstrate the versatility and scalability of the parallel algorithm.
Robust parallel iterative solvers for linear and least-squares problems, Final Technical Report

DOE Office of Scientific and Technical Information (OSTI.GOV)

Saad, Yousef

2014-01-16

The primary goal of this project is to study and develop robust iterative methods for solving linear systems of equations and least squares systems. The focus of the Minnesota team is on algorithms development, robustness issues, and on tests and validation of the methods on realistic problems. 1. The project began with an investigation on how to practically update a preconditioner obtained from an ILU-type factorization, when the coefficient matrix changes. 2. We investigated strategies to improve robustness in parallel preconditioners in a specific case of a PDE with discontinuous coefficients. 3. We explored ways to adapt standard preconditioners for solving linear systems arising from the Helmholtz equation. These are often difficult linear systems to solve by iterative methods. 4. We have also worked on purely theoretical issues related to the analysis of Krylov subspace methods for linear systems. 5. We developed an effective strategy for performing ILU factorizations for the case when the matrix is highly indefinite. The strategy uses shifting in some optimal way. The method was extended to the solution of Helmholtz equations by using complex shifts, yielding very good results in many cases. 6. We addressed the difficult problem of preconditioning sparse systems of equations on GPUs. 7. A by-product of the above work is a software package consisting of an iterative solver library for GPUs based on CUDA. This was made publicly available. It was the first such library that offers complete iterative solvers for GPUs. 8. We considered another form of ILU which blends coarsening techniques from Multigrid with algebraic multilevel methods. 9. We have released a new version of our parallel solver, called pARMS [new version is version 3]. As part of this we have tested the code in complex settings, including the solution of Maxwell and Helmholtz equations and for a problem of crystal growth. 10. As an application of polynomial preconditioning we considered the problem of evaluating f(A)v which arises in statistical sampling. 11. As an application of the methods we developed, we tackled the problem of computing the diagonal of the inverse of a matrix. This arises in statistical applications as well as in many applications in physics. We explored probing methods as well as domain-decomposition type methods. 12. A collaboration with researchers from Toulouse, France, considered the important problem of computing the Schur complement in a domain-decomposition approach. 13. We explored new ways of preconditioning linear systems, based on low-rank approximations.
Final Report for "Implimentation and Evaluation of Multigrid Linear Solvers into Extended Magnetohydrodynamic Codes for Petascale Computing"

DOE Office of Scientific and Technical Information (OSTI.GOV)

Srinath Vadlamani; Scott Kruger; Travis Austin

Extended magnetohydrodynamic (MHD) codes are used to model the large, slow-growing instabilities that are projected to limit the performance of International Thermonuclear Experimental Reactor (ITER). The multiscale nature of the extended MHD equations requires an implicit approach. The current linear solvers needed for the implicit algorithm scale poorly because the resultant matrices are so ill-conditioned. A new solver is needed, especially one that scales to the petascale. The most successful scalable parallel processor solvers to date are multigrid solvers. Applying multigrid techniques to a set of equations whose fundamental modes are dispersive waves is a promising solution to CEMM problems. For Phase 1, we implemented multigrid preconditioners from the HYPRE project of the Center for Applied Scientific Computing at LLNL via PETSc of the DOE SciDAC TOPS for the real matrix systems of the extended MHD code NIMROD, which is one of the primary modeling codes of the OFES-funded Center for Extended Magnetohydrodynamic Modeling (CEMM) SciDAC. We implemented the multigrid solvers on the fusion test problem that allows for real matrix systems with success, and in the process learned about the details of NIMROD data structures and the difficulties of inverting NIMROD operators. The further success of this project will allow for efficient usage of future petascale computers at the National Leadership Facilities: Oak Ridge National Laboratory, Argonne National Laboratory, and National Energy Research Scientific Computing Center. The project will be a collaborative effort between computational plasma physicists and applied mathematicians at Tech-X Corporation, applied mathematicians at Front Range Scientific Computations, Inc. (who are collaborators on the HYPRE project), and other computational plasma physicists involved with the CEMM project.
Efficient numerical calculation of MHD equilibria with magnetic islands, with particular application to saturated neoclassical tearing modes

NASA Astrophysics Data System (ADS)

Raburn, Daniel Louis

We have developed a preconditioned, globalized Jacobian-free Newton-Krylov (JFNK) solver for calculating equilibria with magnetic islands. The solver has been developed in conjunction with the Princeton Iterative Equilibrium Solver (PIES) and includes two notable enhancements over a traditional JFNK scheme: (1) globalization of the algorithm by a sophisticated backtracking scheme, which optimizes between the Newton and steepest-descent directions; and, (2) adaptive preconditioning, wherein information regarding the system Jacobian is reused between Newton iterations to form a preconditioner for our GMRES-like linear solver. We have developed a formulation for calculating saturated neoclassical tearing modes (NTMs) which accounts for the incomplete loss of a bootstrap current due to gradients of multiple physical quantities. We have applied the coupled PIES-JFNK solver to calculate saturated island widths on several shots from the Tokamak Fusion Test Reactor (TFTR) and have found reasonable agreement with experimental measurement.
Flutter and Forced Response Analyses of Cascades using a Two-Dimensional Linearized Euler Solver

NASA Technical Reports Server (NTRS)

Reddy, T. S. R.; Srivastava, R.; Mehmed, O.

1999-01-01

Flutter and forced response analyses for a cascade of blades in subsonic and transonic flow are presented. The structural model for each blade is a typical section with bending and torsion degrees of freedom. The unsteady aerodynamic forces due to bending and torsion motions and due to a vortical gust disturbance are obtained by solving unsteady linearized Euler equations. The unsteady linearized equations are obtained by linearizing the unsteady nonlinear equations about the steady flow. The predicted unsteady aerodynamic forces include the effect of steady aerodynamic loading due to airfoil shape, thickness and angle of attack. The aeroelastic equations are solved in the frequency domain by coupling the unsteady aerodynamic forces to the aeroelastic solver MISER. The present unsteady aerodynamic solver showed good correlation with published results for both flutter and forced response predictions. Further improvements are required to use the unsteady aerodynamic solver in a design cycle.
USM3D Unstructured Grid Solutions for CAWAPI at NASA LaRC

NASA Technical Reports Server (NTRS)

Lamar, John E.; Abdol-Hamid, Khaled S.

2007-01-01

In support of the Cranked Arrow Wing Aerodynamic Project International (CAWAPI) to improve the Technology Readiness Level of flow solvers by comparing results with measured F-16XL-1 flight data, NASA Langley employed the TetrUSS unstructured grid solver, USM3D, to obtain solutions for all seven flight conditions of interest. A newly available solver version that incorporates a number of turbulence models, including the two-equation linear and non-linear k-epsilon, was used in this study. As a first test, a choice was made to utilize only a single grid resolution with the solver for the simulation of the different flight conditions. Comparisons are presented with three turbulence models in USM3D, flight data for surface pressure, boundary-layer profiles, and skin-friction results, as well as limited predictions from other solvers. A result of these comparisons is that the USM3D solver can be used in an engineering environment to predict flow physics on a complex configuration at flight Reynolds numbers with a two-equation linear k-epsilon turbulence model.

High-speed extended-term time-domain simulation for online cascading analysis of power system

NASA Astrophysics Data System (ADS)

Fu, Chuan

A high-speed extended-term (HSET) time domain simulator (TDS), intended to become a part of an energy management system (EMS), has been newly developed for use in online extended-term dynamic cascading analysis of power systems. HSET-TDS includes the following attributes for providing situational awareness of high-consequence events: (i) online analysis, including n-1 and n-k events, (ii) ability to simulate both fast and slow dynamics for 1-3 hours in advance, (iii) inclusion of rigorous protection-system modeling, (iv) intelligence for corrective action ID, storage, and fast retrieval, and (v) high-speed execution. Very fast on-line computational capability is the most desired attribute of this simulator. Based on the process of solving algebraic differential equations describing the dynamics of power system, HSET-TDS seeks to develop computational efficiency at each of the following hierarchical levels, (i) hardware, (ii) strategies, (iii) integration methods, (iv) nonlinear solvers, and (v) linear solver libraries. This thesis first describes the Hammer-Hollingsworth 4 (HH4) implicit integration method. Like the trapezoidal rule, HH4 is symmetrically A-Stable but it possesses greater high-order precision (h^4) than the trapezoidal rule. Such precision enables larger integration steps and therefore improves simulation efficiency for variable step size implementations. This thesis provides the underlying theory on which we advocate use of HH4 over other numerical integration methods for power system time-domain simulation. Second, motivated by the need to perform high speed extended-term time domain simulation (HSET-TDS) for on-line purposes, this thesis presents principles for designing numerical solvers of differential algebraic systems associated with power system time-domain simulation, including DAE construction strategies (Direct Solution Method), integration methods (HH4), nonlinear solvers (Very Dishonest Newton), and linear solvers (SuperLU). 
We have implemented a design appropriate for HSET-TDS, and we compare it to various solvers, including the commercial grade PSSE program, with respect to computational efficiency and accuracy, using as examples the New England 39-bus system, the expanded 8775-bus system, and the PJM 13029-bus system. Third, we have explored a stiffness-decoupling method, intended to be part of parallel design of time domain simulation software for supercomputers. The stiffness-decoupling method is able to combine the advantages of implicit methods (A-stability) and explicit methods (less computation). With the new stiffness detection method proposed herein, the stiffness can be captured. The expanded 975-bus system is used to test simulation efficiency. Finally, several parallel strategies for supercomputer deployment to simulate power system dynamics are proposed and compared. Design A partitions the task via scale with the stiffness decoupling method, waveform relaxation, and parallel linear solver. Design B partitions the task via the time axis using a highly precise integration method, the Kuntzmann-Butcher Method - order 8 (KB8). The strategy of partitioning events is designed to partition the whole simulation via the time axis through a simulated sequence of cascading events. Of all strategies proposed, the strategy of partitioning cascading events is recommended, since the sub-tasks for each processor are totally independent, and therefore minimum communication time is needed.
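The accuracy claim for HH4 versus the trapezoidal rule can be checked on a scalar test equation. The sketch below assumes that HH4 denotes the two-stage Gauss-Legendre implicit Runge-Kutta method (the usual reading of "Hammer-Hollingsworth"); it is restricted to the linear problem y' = lam * y, where the implicit stage equations reduce to a 2x2 linear solve. The function names, lam = -2 and the step counts are illustrative choices, not taken from the thesis.

```python
import math

SQRT3 = math.sqrt(3.0)
# Butcher matrix of the two-stage Gauss-Legendre method (both weights are 1/2).
A11, A12 = 0.25, 0.25 - SQRT3 / 6.0
A21, A22 = 0.25 + SQRT3 / 6.0, 0.25


def trapezoidal_step(y, lam, h):
    """One trapezoidal-rule step for y' = lam * y."""
    z = lam * h
    return y * (1.0 + 0.5 * z) / (1.0 - 0.5 * z)


def hh4_step(y, lam, h):
    """One two-stage Gauss step for y' = lam * y."""
    z = lam * h
    # Stage slopes k solve (I - z * A) k = lam * y * [1, 1]; use Cramer's rule.
    m11, m12 = 1.0 - z * A11, -z * A12
    m21, m22 = -z * A21, 1.0 - z * A22
    det = m11 * m22 - m12 * m21
    rhs = lam * y
    k1 = (rhs * m22 - m12 * rhs) / det
    k2 = (m11 * rhs - m21 * rhs) / det
    return y + 0.5 * h * (k1 + k2)


def global_error(step, lam, n_steps, t_end=1.0):
    """Error at t_end against the exact solution exp(lam * t), starting from y(0) = 1."""
    h = t_end / n_steps
    y = 1.0
    for _ in range(n_steps):
        y = step(y, lam, h)
    return abs(y - math.exp(lam * t_end))


lam = -2.0
err_trap = global_error(trapezoidal_step, lam, 10)
err_hh4 = global_error(hh4_step, lam, 10)
err_hh4_half = global_error(hh4_step, lam, 20)
print(err_trap, err_hh4, err_hh4 / err_hh4_half)
```

Halving the step should cut the HH4 error by roughly 2^4 = 16, consistent with fourth-order accuracy, while the trapezoidal rule at the same step size is several orders of magnitude less accurate. For a nonlinear power system model the stage equations would need a Newton-type solve in place of the closed-form 2x2 solution.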
Advanced Computational Methods for Security Constrained Financial Transmission Rights: Structure and Parallelism

DOE Office of Scientific and Technical Information (OSTI.GOV)

Elbert, Stephen T.; Kalsi, Karanjit; Vlachopoulou, Maria

Financial Transmission Rights (FTRs) help power market participants reduce price risks associated with transmission congestion. FTRs are issued based on a process of solving a constrained optimization problem with the objective to maximize the FTR social welfare under power flow security constraints. Security constraints for different FTR categories (monthly, seasonal or annual) are usually coupled and the number of constraints increases exponentially with the number of categories. Commercial software for FTR calculation can only provide limited categories of FTRs due to the inherent computational challenges mentioned above. In this paper, a novel non-linear dynamical system (NDS) approach is proposed to solve the optimization problem. The new formulation and performance of the NDS solver is benchmarked against widely used linear programming (LP) solvers like CPLEX™ and tested on large-scale systems using data from the Western Electricity Coordinating Council (WECC). The NDS is demonstrated to outperform the widely used CPLEX algorithms while exhibiting superior scalability. Furthermore, the NDS based solver can be easily parallelized which results in significant computational improvement.
Menu-Driven Solver Of Linear-Programming Problems

NASA Technical Reports Server (NTRS)

Viterna, L. A.; Ferencz, D.

1992-01-01

Program assists inexperienced user in formulating linear-programming problems. A Linear Program Solver (ALPS) computer program is full-featured LP analysis program. Solves plain linear-programming problems as well as more-complicated mixed-integer and pure-integer programs. Also contains efficient technique for solution of purely binary linear-programming problems. Written entirely in IBM's APL2/PC software, Version 1.01. Packed program contains licensed material, property of IBM (copyright 1988, all rights reserved).
Seeking Space Aliens and the Strong Approximation Property: A (disjoint) Study in Dust Plumes on Planetary Satellites and Nonsymmetric Algebraic Multigrid

NASA Astrophysics Data System (ADS)

Southworth, Benjamin Scott

PART I: One of the most fascinating questions to humans has long been whether life exists outside of our planet. To our knowledge, water is a fundamental building block of life, which makes liquid water on other bodies in the universe a topic of great interest. In fact, there are large bodies of water right here in our solar system, underneath the icy crust of moons around Saturn and Jupiter. The NASA-ESA Cassini Mission spent two decades studying the Saturnian system. One of the many exciting discoveries was a "plume" on the south pole of Enceladus, emitting hundreds of kg/s of water vapor and frozen water-ice particles from Enceladus' subsurface ocean. It has since been determined that Enceladus likely has a global liquid water ocean separating its rocky core from icy surface, with conditions that are relatively favorable to support life. The plume is of particular interest because it gives direct access to ocean particles from space, by flying through the plume. Recently, evidence has been found for similar geological activity occurring on Jupiter's moon Europa, long considered one of the most likely candidate bodies to support life in our solar system. Here, a model for plume-particle dynamics is developed based on studies of the Enceladus plume and data from the Cassini Cosmic Dust Analyzer. A C++, OpenMP/MPI parallel software package is then built to run large scale simulations of dust plumes on planetary satellites. In the case of Enceladus, data from simulations and the Cassini mission provide insight into the structure of emissions on the surface, the total mass production of the plume, and the distribution of particles being emitted. Each of these are fundamental to understanding the plume and, for Europa and Enceladus, simulation data provide important results for the planning of future missions to these icy moons. In particular, this work has contributed to the Europa Clipper mission and proposed Enceladus Life Finder. 
PART II: The need to solve large, sparse linear systems arises often in the modeling of biological and physical phenomena, data analysis through graphs and networks, and other scientific applications. This work focuses primarily on linear systems resulting from the discretization of partial differential equations (PDEs). Because solving linear systems is the bottleneck of many large simulation codes, there is a rich field of research in developing "fast" solvers, with the ultimate goal being a method that solves an n x n linear system in O(n) operations. One of the most effective classes of solvers is algebraic multigrid (AMG), which is a multilevel iterative method based on projecting the problem into progressively smaller spaces, and scales like O(n) or O(n log n) for certain classes of problems. The field of AMG is well-developed for symmetric positive definite matrices, and is typically most effective on linear systems resulting from the discretization of scalar elliptic PDEs, such as the heat equation. Systems of PDEs can add additional difficulties, but the underlying linear algebraic theory is consistent and, in many cases, an elliptic system of PDEs can be handled well by AMG with appropriate modifications of the solver. Solving general, nonsymmetric linear systems remains the wild west of AMG (and other fast solvers), lacking significant results in convergence theory as well as robust methods. Here, we develop new theoretical motivation and practical variations of AMG to solve nonsymmetric linear systems, often resulting from the discretization of hyperbolic PDEs. In particular, multilevel convergence of AMG for nonsymmetric systems is proven for the first time. A new nonsymmetric AMG solver is also developed based on an approximate ideal restriction, referred to as AIR, which is able to solve advection-dominated, hyperbolic-type problems that are outside the scope of existing AMG solvers and other fast iterative methods. 
AIR demonstrates scalable convergence on unstructured meshes, in multiple dimensions, and with high-order finite elements, expanding the applicability of AMG to a new class of problems.
Performance of uncertainty quantification methodologies and linear solvers in cardiovascular simulations

NASA Astrophysics Data System (ADS)

Seo, Jongmin; Schiavazzi, Daniele; Marsden, Alison

2017-11-01

Cardiovascular simulations are increasingly used in clinical decision making, surgical planning, and disease diagnostics. Patient-specific modeling and simulation typically proceeds through a pipeline from anatomic model construction using medical image data to blood flow simulation and analysis. To provide confidence intervals on simulation predictions, we use an uncertainty quantification (UQ) framework to analyze the effects of numerous uncertainties that stem from clinical data acquisition, modeling, material properties, and boundary condition selection. However, UQ poses a computational challenge requiring multiple evaluations of the Navier-Stokes equations in complex 3-D models. To achieve efficiency in UQ problems with many function evaluations, we implement and compare a range of iterative linear solver and preconditioning techniques in our flow solver. We then discuss applications to patient-specific cardiovascular simulation and how the problem/boundary condition formulation in the solver affects the selection of the most efficient linear solver. Finally, we discuss performance improvements in the context of uncertainty propagation. Support from National Institute of Health (R01 EB018302) is greatly appreciated.
On some Aitken-like acceleration of the Schwarz method

NASA Astrophysics Data System (ADS)

Garbey, M.; Tromeur-Dervout, D.

2002-12-01

In this paper we present a family of domain decomposition methods based on Aitken-like acceleration of the Schwarz method, seen as an iterative procedure with a linear rate of convergence. We first present the so-called Aitken-Schwarz procedure for linear differential operators. The solver can be a direct solver when applied to the Helmholtz problem with a five-point finite difference scheme on regular grids. We then introduce the Steffensen-Schwarz variant, which is an iterative domain decomposition solver that can be applied to linear and nonlinear problems. We show that these solvers have reasonable numerical efficiency compared to classical fast solvers for the Poisson problem or multigrids for more general linear and nonlinear elliptic problems. However, the salient feature of our method is that our algorithm has high tolerance to slow networks in the context of distributed parallel computing and is attractive, generally speaking, to use with computer architectures for which performance is limited by the memory bandwidth rather than the flop performance of the CPU. This is nowadays the case for most parallel computers using the RISC processor architecture. We will illustrate this highly desirable property of our algorithm with large-scale computing experiments.
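The abstract relies on the fact that a linearly convergent iteration can be extrapolated to its limit. A minimal scalar sketch of the classical Aitken delta-squared formula follows; it only illustrates the underlying idea and is not the Aitken-Schwarz procedure of the paper, which applies the acceleration to the interface values of the Schwarz iteration. The function name `aitken` and the model sequence are assumptions made for illustration.

```python
def aitken(x0, x1, x2):
    """Aitken delta-squared extrapolation from three consecutive iterates."""
    denom = x2 - 2.0 * x1 + x0
    if denom == 0.0:
        # Sequence already stationary: nothing to extrapolate.
        return x2
    return x2 - (x2 - x1) ** 2 / denom


# Model iteration whose error contracts by exactly rho at every step:
#   x_{k+1} - x_star = rho * (x_k - x_star)
x_star, rho = 2.0, 0.9
iterates = [0.0]
for _ in range(2):
    iterates.append(x_star + rho * (iterates[-1] - x_star))

# Three slowly converging iterates are enough to recover the limit
# (up to rounding) when the convergence is purely linear.
accelerated = aitken(*iterates)
print(iterates[-1], accelerated)
```

For a purely linear error model the extrapolation is exact up to rounding, which is why the third plain iterate is still far from the limit while the accelerated value is not.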
An Optimized Multicolor Point-Implicit Solver for Unstructured Grid Applications on Graphics Processing Units

NASA Technical Reports Server (NTRS)

Zubair, Mohammad; Nielsen, Eric; Luitjens, Justin; Hammond, Dana

2016-01-01

In the field of computational fluid dynamics, the Navier-Stokes equations are often solved using an unstructured-grid approach to accommodate geometric complexity. Implicit solution methodologies for such spatial discretizations generally require frequent solution of large tightly-coupled systems of block-sparse linear equations. The multicolor point-implicit solver used in the current work typically requires a significant fraction of the overall application run time. In this work, an efficient implementation of the solver for graphics processing units is proposed. Several factors present unique challenges to achieving an efficient implementation in this environment. These include the variable amount of parallelism available in different kernel calls, indirect memory access patterns, low arithmetic intensity, and the requirement to support variable block sizes. In this work, the solver is reformulated to use standard sparse and dense Basic Linear Algebra Subprograms (BLAS) functions. However, numerical experiments show that the performance of the BLAS functions available in existing CUDA libraries is suboptimal for matrices representative of those encountered in actual simulations. Instead, optimized versions of these functions are developed. Depending on block size, the new implementations show performance gains of up to 7x over the existing CUDA library functions.
An Efficient Multicore Implementation of a Novel HSS-Structured Multifrontal Solver Using Randomized Sampling

DOE PAGES

Ghysels, Pieter; Li, Xiaoye S.; Rouet, Francois -Henry; ...

2016-10-27

Here, we present a sparse linear system solver that is based on a multifrontal variant of Gaussian elimination and exploits low-rank approximation of the resulting dense frontal matrices. We use hierarchically semiseparable (HSS) matrices, which have low-rank off-diagonal blocks, to approximate the frontal matrices. For HSS matrix construction, a randomized sampling algorithm is used together with interpolative decompositions. The combination of the randomized compression with a fast ULV HSS factorization leads to a solver with lower computational complexity than the standard multifrontal method for many applications, resulting in speedups of up to 7-fold for problems in our test suite. The implementation targets many-core systems by using task parallelism with dynamic runtime scheduling. Numerical experiments show performance improvements over state-of-the-art sparse direct solvers. The implementation achieves high performance and good scalability on a range of modern shared memory parallel systems, including the Intel Xeon Phi (MIC). The code is part of a software package called STRUMPACK - STRUctured Matrices PACKage, which also has a distributed memory component for dense rank-structured matrices.
The Mixed Finite Element Multigrid Method for Stokes Equations

PubMed Central

Muzhinji, K.; Shateyi, S.; Motsa, S. S.

2015-01-01

The stable finite element discretization of the Stokes problem produces a symmetric indefinite system of linear algebraic equations. A variety of iterative solvers have been proposed for such systems in an attempt to construct efficient, fast, and robust solution techniques. This paper investigates one such iterative solver, the geometric multigrid solver, to find the approximate solution of the indefinite systems. The main ingredient of the multigrid method is the choice of an appropriate smoothing strategy. This study considers the application of different smoothers and compares their effects on the overall performance of the multigrid solver. We study the multigrid method with the following smoothers: distributed Gauss-Seidel, inexact Uzawa, preconditioned MINRES, and Braess-Sarazin type smoothers. A comparative study of the smoothers shows that the Braess-Sarazin smoothers improve the performance of the multigrid method. We study the problem in a two-dimensional domain using the stable Hood-Taylor Q2-Q1 pair of rectangular finite elements. We also give the main theoretical convergence results. We present the numerical results to demonstrate the efficiency and robustness of the multigrid method and confirm the theoretical results. PMID:25945361
ML 3.0 smoothed aggregation user's guide.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sala, Marzio; Hu, Jonathan Joseph; Tuminaro, Raymond Stephen

2004-05-01

ML is a multigrid preconditioning package intended to solve linear systems of equations Ax = b where A is a user supplied n x n sparse matrix, b is a user supplied vector of length n and x is a vector of length n to be computed. ML should be used on large sparse linear systems arising from partial differential equation (PDE) discretizations. While technically any linear system can be considered, ML should be used on linear systems that correspond to things that work well with multigrid methods (e.g. elliptic PDEs). ML can be used as a stand-alone package or to generate preconditioners for a traditional iterative solver package (e.g. Krylov methods). We have supplied support for working with the AZTEC 2.1 and AZTECOO iterative package [15]. However, other solvers can be used by supplying a few functions. This document describes one specific algebraic multigrid approach: smoothed aggregation. This approach is used within several specialized multigrid methods: one for the eddy current formulation for Maxwell's equations, and a multilevel and domain decomposition method for symmetric and non-symmetric systems of equations (like elliptic equations, or compressible and incompressible fluid dynamics problems). Other methods exist within ML but are not described in this document. Examples are given illustrating the problem definition and exercising multigrid options.
ML 3.1 smoothed aggregation user's guide.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sala, Marzio; Hu, Jonathan Joseph; Tuminaro, Raymond Stephen

2004-10-01

ML is a multigrid preconditioning package intended to solve linear systems of equations Ax = b where A is a user supplied n x n sparse matrix, b is a user supplied vector of length n and x is a vector of length n to be computed. ML should be used on large sparse linear systems arising from partial differential equation (PDE) discretizations. While technically any linear system can be considered, ML should be used on linear systems that correspond to things that work well with multigrid methods (e.g. elliptic PDEs). ML can be used as a stand-alone package or to generate preconditioners for a traditional iterative solver package (e.g. Krylov methods). We have supplied support for working with the Aztec 2.1 and AztecOO iterative package [16]. However, other solvers can be used by supplying a few functions. This document describes one specific algebraic multigrid approach: smoothed aggregation. This approach is used within several specialized multigrid methods: one for the eddy current formulation for Maxwell's equations, and a multilevel and domain decomposition method for symmetric and nonsymmetric systems of equations (like elliptic equations, or compressible and incompressible fluid dynamics problems). Other methods exist within ML but are not described in this document. Examples are given illustrating the problem definition and exercising multigrid options.
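The division of labor described above, where a Krylov method calls a separately supplied preconditioner, can be sketched in a few lines. The example below is a plain-Python preconditioned conjugate gradient loop, assuming a symmetric positive definite tridiagonal test matrix; a simple Jacobi (diagonal) preconditioner stands in where a multigrid preconditioner would be plugged in. It does not use the ML or Aztec APIs, and the names `pcg`, `matvec`, and `jacobi` are hypothetical.

```python
import math


def pcg(matvec, b, precond, tol=1e-10, max_iter=200):
    """Preconditioned conjugate gradients for a symmetric positive definite A.

    matvec(v) returns A v; precond(r) returns an approximation of A^{-1} r.
    Returns the approximate solution and the number of iterations used.
    """
    x = [0.0] * len(b)
    r = list(b)                      # residual b - A x for x = 0
    z = precond(r)
    p = list(z)
    rz = sum(ri * zi for ri, zi in zip(r, z))
    for it in range(1, max_iter + 1):
        ap = matvec(p)
        alpha = rz / sum(pi * qi for pi, qi in zip(p, ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, ap)]
        if math.sqrt(sum(ri * ri for ri in r)) < tol:
            return x, it
        z = precond(r)
        rz_new = sum(ri * zi for ri, zi in zip(r, z))
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


# Assumed test problem: tridiagonal, diagonally dominant, varying diagonal.
n = 50
diag = [2.0 + i for i in range(n)]


def matvec(v):
    out = []
    for i in range(n):
        s = diag[i] * v[i]
        if i > 0:
            s -= v[i - 1]
        if i < n - 1:
            s -= v[i + 1]
        out.append(s)
    return out


def jacobi(r):
    # Stand-in preconditioner: divide by the diagonal of A.
    return [ri / di for ri, di in zip(r, diag)]


b = [1.0] * n
x, iters = pcg(matvec, b, jacobi)
print("iterations:", iters)
```

Because the solver only sees `precond` as a callable, swapping in a stronger preconditioner would not require touching the Krylov loop, which is the design point the abstract makes.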
Using parallel banded linear system solvers in generalized eigenvalue problems

NASA Technical Reports Server (NTRS)

Zhang, Hong; Moss, William F.

1993-01-01

Subspace iteration is a reliable and cost effective method for solving positive definite banded symmetric generalized eigenproblems, especially in the case of large scale problems. This paper discusses an algorithm that makes use of two parallel banded solvers in subspace iteration. A shift is introduced to decompose the banded linear systems into relatively independent subsystems and to accelerate the iterations. With this shift, an eigenproblem is mapped efficiently into the memories of a multiprocessor and a high speed-up is obtained for parallel implementations. An optimal shift is a shift that balances total computation and communication costs. Under certain conditions, we show how to estimate an optimal shift analytically using the decay rate for the inverse of a banded matrix, and how to improve this estimate. Computational results on iPSC/2 and iPSC/860 multiprocessors are presented.
Extending substructure based iterative solvers to multiple load and repeated analyses

NASA Technical Reports Server (NTRS)

Farhat, Charbel

1993-01-01

Direct solvers currently dominate commercial finite element structural software, but do not scale well in the fine granularity regime targeted by emerging parallel processors. Substructure based iterative solvers--often called also domain decomposition algorithms--lend themselves better to parallel processing, but must overcome several obstacles before earning their place in general purpose structural analysis programs. One such obstacle is the solution of systems with many or repeated right hand sides. Such systems arise, for example, in multiple load static analyses and in implicit linear dynamics computations. Direct solvers are well-suited for these problems because after the system matrix has been factored, the multiple or repeated solutions can be obtained through relatively inexpensive forward and backward substitutions. On the other hand, iterative solvers in general are ill-suited for these problems because they often must restart from scratch for every different right hand side. In this paper, we present a methodology for extending the range of applications of domain decomposition methods to problems with multiple or repeated right hand sides. Basically, we formulate the overall problem as a series of minimization problems over K-orthogonal and supplementary subspaces, and tailor the preconditioned conjugate gradient algorithm to solve them efficiently. The resulting solution method is scalable, whereas direct factorization schemes and forward and backward substitution algorithms are not. We illustrate the proposed methodology with the solution of static and dynamic structural problems, and highlight its potential to outperform forward and backward substitutions on parallel computers. 
As an example, we show that for a linear structural dynamics problem with 11640 degrees of freedom, every time-step beyond time-step 15 is solved in a single iteration and consumes 1.0 second on a 32 processor iPSC-860 system; for the same problem and the same parallel processor, a pair of forward/backward substitutions at each step consumes 15.0 seconds.
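The remark that direct solvers handle repeated right-hand sides cheaply can be made concrete: the factorization is computed once, and each additional load vector costs only a forward and a backward substitution. The sketch below uses a dense Cholesky factorization in plain Python on a small assumed stiffness-like matrix `K`; it illustrates the direct-solver baseline discussed in the abstract, not the paper's domain decomposition method, and all names are hypothetical.

```python
import math


def cholesky(a):
    """Return lower-triangular L with L L^T = a (a symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def solve_with_factor(low, rhs):
    """Solve (L L^T) x = rhs by one forward and one backward substitution."""
    n = len(rhs)
    y = [0.0] * n
    for i in range(n):                      # forward: L y = rhs
        y[i] = (rhs[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):            # backward: L^T x = y
        x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


K = [[4.0, -1.0, 0.0],
     [-1.0, 4.0, -1.0],
     [0.0, -1.0, 4.0]]
factor = cholesky(K)                        # expensive step, done once

loads = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [3.0, 2.0, 3.0]]
solutions = [solve_with_factor(factor, f) for f in loads]   # cheap per load
print(solutions[-1])
```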
ALPS - A LINEAR PROGRAM SOLVER

NASA Technical Reports Server (NTRS)

Viterna, L. A.

1994-01-01

Linear programming is a widely-used engineering and management tool. Scheduling, resource allocation, and production planning are all well-known applications of linear programs (LP's). Most LP's are too large to be solved by hand, so over the decades many computer codes for solving LP's have been developed. ALPS, A Linear Program Solver, is a full-featured LP analysis program. ALPS can solve plain linear programs as well as more complicated mixed integer and pure integer programs. ALPS also contains an efficient solution technique for pure binary (0-1 integer) programs. One of the many weaknesses of LP solvers is the lack of interaction with the user. ALPS is a menu-driven program with no special commands or keywords to learn. In addition, ALPS contains a full-screen editor to enter and maintain the LP formulation. These formulations can be written to and read from plain ASCII files for portability. For those less experienced in LP formulation, ALPS contains a problem "parser" which checks the formulation for errors. ALPS creates fully formatted, readable reports that can be sent to a printer or output file. ALPS is written entirely in IBM's APL2/PC product, Version 1.01. The APL2 workspace containing all the ALPS code can be run on any APL2/PC system (AT or 386). On a 32-bit system, this configuration can take advantage of all extended memory. The user can also examine and modify the ALPS code. The APL2 workspace has also been "packed" to be run on any DOS system (without APL2) as a stand-alone "EXE" file, but has limited memory capacity on a 640K system. A numeric coprocessor (80X87) is optional but recommended. The standard distribution medium for ALPS is a 5.25 inch 360K MS-DOS format diskette. IBM, IBM PC and IBM APL2 are registered trademarks of International Business Machines Corporation. MS-DOS is a registered trademark of Microsoft Corporation.
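For readers unfamiliar with what an LP formulation looks like, here is a tiny two-variable instance solved by brute-force vertex enumeration in plain Python. This is only an illustration of the problem class under assumed data; it is not the algorithm used by ALPS, which the abstract does not describe, and the function name `solve_lp_2d` is hypothetical.

```python
from itertools import combinations


def solve_lp_2d(c, constraints, tol=1e-9):
    """Maximize c[0]*x + c[1]*y subject to a*x + b*y <= r for each (a, b, r).

    Enumerates intersections of constraint pairs and keeps the best feasible
    one; a bounded, feasible LP attains its optimum at such a vertex.
    """
    best = None
    for (a1, b1, r1), (a2, b2, r2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < tol:
            continue                      # parallel constraints, no vertex
        x = (r1 * b2 - r2 * b1) / det     # Cramer's rule
        y = (a1 * r2 - a2 * r1) / det
        if all(a * x + b * y <= r + tol for a, b, r in constraints):
            value = c[0] * x + c[1] * y
            if best is None or value > best[0]:
                best = (value, x, y)
    return best


# maximize 3x + 2y  s.t.  x + y <= 4,  x + 3y <= 6,  x <= 3,  x >= 0,  y >= 0
constraints = [(1.0, 1.0, 4.0), (1.0, 3.0, 6.0), (1.0, 0.0, 3.0),
               (-1.0, 0.0, 0.0), (0.0, -1.0, 0.0)]
best = solve_lp_2d((3.0, 2.0), constraints)
print(best)
```

Vertex enumeration grows combinatorially with problem size, which is one reason practical LP codes use simplex or interior-point methods instead.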
MIBPB: a software package for electrostatic analysis.

PubMed

Chen, Duan; Chen, Zhan; Chen, Changjun; Geng, Weihua; Wei, Guo-Wei

2011-03-01

The Poisson-Boltzmann equation (PBE) is an established model for the electrostatic analysis of biomolecules. The development of advanced computational techniques for the solution of the PBE has been an important topic in the past two decades. This article presents a matched interface and boundary (MIB)-based PBE software package, the MIBPB solver, for electrostatic analysis. The MIBPB has a unique feature that it is the first interface technique-based PBE solver that rigorously enforces the solution and flux continuity conditions at the dielectric interface between the biomolecule and the solvent. For protein molecular surfaces, which may possess troublesome geometrical singularities, the MIB scheme makes the MIBPB by far the only existing PBE solver that is able to deliver the second-order convergence, that is, the accuracy increases four times when the mesh size is halved. The MIBPB method is also equipped with a Dirichlet-to-Neumann mapping technique that builds a Green's function approach to analytically resolve the singular charge distribution in biomolecules in order to obtain reliable solutions at meshes as coarse as 1 Å--whereas it usually takes other traditional PB solvers 0.25 Å to reach similar level of reliability. This work further accelerates the rate of convergence of linear equation systems resulting from the MIBPB by using the Krylov subspace (KS) techniques. Condition numbers of the MIBPB matrices are significantly reduced by using appropriate KS solver and preconditioner combinations. Both linear and nonlinear PBE solvers in the MIBPB package are tested by protein-solvent solvation energy calculations and analysis of salt effects on protein-protein binding energies, respectively. Copyright © 2010 Wiley Periodicals, Inc.
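The statement that second-order convergence means a fourfold accuracy gain per mesh halving follows from a standard error model. As a short worked relation, assuming the error behaves like $e(h) \approx C h^{p}$ for mesh size $h$ with a constant $C$ independent of $h$ (the symbols $e$, $C$, and $p$ are introduced here for illustration):

```latex
\frac{e(h)}{e(h/2)} \approx \frac{C\,h^{p}}{C\,(h/2)^{p}} = 2^{p},
\qquad
p \approx \log_{2}\!\frac{e(h)}{e(h/2)},
\qquad
\frac{e(h)}{e(h/2)} = 4 \;\Longrightarrow\; p = 2 .
```

The same relation gives a practical way to estimate the observed order from errors measured on two successive meshes.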
MIBPB: A software package for electrostatic analysis

PubMed Central

Chen, Duan; Chen, Zhan; Chen, Changjun; Geng, Weihua; Wei, Guo-Wei

2010-01-01

The Poisson-Boltzmann equation (PBE) is an established model for the electrostatic analysis of biomolecules. The development of advanced computational techniques for the solution of the PBE has been an important topic in the past two decades. This paper presents a matched interface and boundary (MIB) based PBE software package, the MIBPB solver, for electrostatic analysis. The MIBPB has a unique feature that it is the first interface technique based PBE solver that rigorously enforces the solution and flux continuity conditions at the dielectric interface between the biomolecule and the solvent. For protein molecular surfaces which may possess troublesome geometrical singularities, the MIB scheme makes the MIBPB by far the only existing PBE solver that is able to deliver the second order convergence, i.e., the accuracy increases four times when the mesh size is halved. The MIBPB method is also equipped with a Dirichlet-to-Neumann mapping (DNM) technique, that builds a Green's function approach to analytically resolve the singular charge distribution in biomolecules in order to obtain reliable solutions at meshes as coarse as 1Å — while it usually takes other traditional PB solvers 0.25Å to reach similar level of reliability. The present work further accelerates the rate of convergence of linear equation systems resulting from the MIBPB by utilizing the Krylov subspace (KS) techniques. Condition numbers of the MIBPB matrices are significantly reduced by using appropriate Krylov subspace solver and preconditioner combinations. Both linear and nonlinear PBE solvers in the MIBPB package are tested by protein-solvent solvation energy calculations and analysis of salt effects on protein-protein binding energies, respectively. PMID:20845420
Advanced Computational Methods for Security Constrained Financial Transmission Rights

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kalsi, Karanjit; Elbert, Stephen T.; Vlachopoulou, Maria

Financial Transmission Rights (FTRs) are financial insurance tools to help power market participants reduce price risks associated with transmission congestion. FTRs are issued based on a process of solving a constrained optimization problem with the objective to maximize the FTR social welfare under power flow security constraints. Security constraints for different FTR categories (monthly, seasonal or annual) are usually coupled and the number of constraints increases exponentially with the number of categories. Commercial software for FTR calculation can only provide limited categories of FTRs due to the inherent computational challenges mentioned above. In this paper, first an innovative mathematical reformulation of the FTR problem is presented which dramatically improves the computational efficiency of the optimization problem. After having re-formulated the problem, a novel non-linear dynamic system (NDS) approach is proposed to solve the optimization problem. The new formulation and the performance of the NDS solver are benchmarked against widely used linear programming (LP) solvers like CPLEX™ and tested on both standard IEEE test systems and large-scale systems using data from the Western Electricity Coordinating Council (WECC). The performance of the NDS is demonstrated to be comparable to, and in some cases to exceed, that of the widely used CPLEX algorithms. The proposed formulation and NDS based solver are also easily parallelizable, enabling further computational improvement.
Energy consumption optimization of the total-FETI solver by changing the CPU frequency

NASA Astrophysics Data System (ADS)

Horak, David; Riha, Lubomir; Sojka, Radim; Kruzik, Jakub; Beseda, Martin; Cermak, Martin; Schuchart, Joseph

2017-07-01

The energy consumption of supercomputers is one of the critical problems for the upcoming Exascale supercomputing era. Awareness of power and energy consumption is required on both the software and hardware sides. This paper deals with the energy consumption evaluation of the Finite Element Tearing and Interconnect (FETI) based solvers of linear systems, which is an established method for solving real-world engineering problems. We have evaluated the effect of the CPU frequency on the energy consumption of the FETI solver using a linear elasticity 3D cube synthetic benchmark. In this problem, we have evaluated the effect of frequency tuning on the energy consumption of the essential processing kernels of the FETI method. The paper provides results for two types of frequency tuning: (1) static tuning and (2) dynamic tuning. For static tuning experiments, the frequency is set before execution and kept constant during the runtime. For dynamic tuning, the frequency is changed during the program execution to adapt the system to the actual needs of the application. The paper shows that static tuning brings up to 12% energy savings when compared to default CPU settings (the highest clock rate). The dynamic tuning improves this further by up to 3%.
Integrated multidisciplinary CAD/CAE environment for micro-electro-mechanical systems (MEMS)

NASA Astrophysics Data System (ADS)

Przekwas, Andrzej J.

1999-03-01

Computational design of MEMS involves several strongly coupled physical disciplines, including fluid mechanics, heat transfer, stress/deformation dynamics, electronics, electro/magneto statics, calorics, biochemistry and others. CFDRC is developing a new generation of multidisciplinary CAD systems for MEMS using high-fidelity field solvers on unstructured, solution-adaptive grids for a full range of disciplines. The software system, ACE + MEMS, includes all essential CAD tools: geometry/grid generation for multi-discipline, multi-equation solvers, a GUI, tightly coupled configurable 3D field solvers for FVM, FEM and BEM, and a 3D visualization/animation tool. The flow/heat transfer/calorics/chemistry equations are solved with an unstructured adaptive FVM solver, stress/deformation are computed with a FEM STRESS solver, and a FAST BEM solver is used to solve linear heat transfer, electro/magnetostatics and elastostatics equations on adaptive polygonal surface grids. Tight multidisciplinary coupling and automatic interoperability between the tools were achieved by designing a comprehensive database structure and APIs for complete model definition. The virtual model definition is implemented in the data transfer facility, a publicly available tool described in this paper. The paper presents an overall description of the software architecture and MEMS design flow in ACE + MEMS. It describes current status, ongoing effort and future plans for the software. The paper also discusses new concepts of mixed-level and mixed-dimensionality capability in which 1D microfluidic networks are simulated concurrently with 3D high-fidelity models of discrete components.
Galaxy Redshifts from Discrete Optimization of Correlation Functions

NASA Astrophysics Data System (ADS)

Lee, Benjamin C. G.; Budavári, Tamás; Basu, Amitabh; Rahman, Mubdi

2016-12-01

We propose a new method of constraining the redshifts of individual extragalactic sources based on celestial coordinates and their ensemble statistics. Techniques from integer linear programming (ILP) are utilized to optimize simultaneously for the angular two-point cross- and autocorrelation functions. Our novel formalism introduced here not only transforms the otherwise hopelessly expensive, brute-force combinatorial search into a linear system with integer constraints but also is readily implementable in off-the-shelf solvers. We adopt Gurobi, a commercial optimization solver, and use Python to build the cost function dynamically. The preliminary results on simulated data show potential for future applications to sky surveys by complementing and enhancing photometric redshift estimators. Our approach is the first application of ILP to astronomical analysis.

Relaxation approximations to second-order traffic flow models by high-resolution schemes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nikolos, I.K.; Delis, A.I.; Papageorgiou, M.

2015-03-10

A relaxation-type approximation of second-order non-equilibrium traffic models, written in conservation or balance law form, is considered. Using the relaxation approximation, the nonlinear equations are transformed to a semi-linear diagonalizable problem with linear characteristic variables and stiff source terms, with the attractive feature that neither Riemann solvers nor characteristic decompositions are needed. In particular, it is only necessary to provide the flux and source term functions and an estimate of the characteristic speeds. To discretize the resulting relaxation system, high-resolution reconstructions in space are considered. Emphasis is placed on a fifth-order WENO scheme and its performance. The computations reported demonstrate the simplicity and versatility of relaxation schemes as numerical solvers.
Three-dimensional unstructured grid Euler computations using a fully-implicit, upwind method

NASA Technical Reports Server (NTRS)

Whitaker, David L.

1993-01-01

A method has been developed to solve the Euler equations on a three-dimensional unstructured grid composed of tetrahedra. The method uses an upwind flow solver with a linearized, backward-Euler time integration scheme. Each time step results in a sparse linear system of equations which is solved by an iterative, sparse matrix solver. Local-time stepping, switched evolution relaxation (SER), preconditioning and reuse of the Jacobian are employed to accelerate the convergence rate. Implicit boundary conditions were found to be extremely important for fast convergence. Numerical experiments have shown that convergence rates comparable to that of a multigrid, central-difference scheme are achievable on the same mesh. Results are presented for several grids about an ONERA M6 wing.
Three-Dimensional Inverse Transport Solver Based on Compressive Sensing Technique

NASA Astrophysics Data System (ADS)

Cheng, Yuxiong; Wu, Hongchun; Cao, Liangzhi; Zheng, Youqi

2013-09-01

Based on direct exposure measurements from flash radiographic images, a compressive sensing-based method for the three-dimensional inverse transport problem is presented. The linear absorption coefficients and interface locations of objects are reconstructed directly at the same time. It is always very expensive to obtain enough measurements. With limited measurements, the compressive sensing sparse reconstruction technique of orthogonal matching pursuit is applied to obtain the sparse coefficients by solving an optimization problem. A three-dimensional inverse transport solver is developed based on a compressive sensing-based technique. There are three features in this solver: (1) AutoCAD is employed as a geometry preprocessor because of its strong graphics capabilities. (2) The forward projection matrix rather than Gauss matrix is constructed by the visualization tool generator. (3) Fourier transform and Daubechies wavelet transform are adopted to convert an underdetermined system to a well-posed system in the algorithm. Simulations are performed, and the numerical results obtained with the compressive sensing-based solver for the pseudo-sine absorption problem, the two-cube problem and the two-cylinder problem agree well with the reference values.
A second order discontinuous Galerkin fast sweeping method for Eikonal equations

NASA Astrophysics Data System (ADS)

Li, Fengyan; Shu, Chi-Wang; Zhang, Yong-Tao; Zhao, Hongkai

2008-09-01

In this paper, we construct a second order fast sweeping method with a discontinuous Galerkin (DG) local solver for computing viscosity solutions of a class of static Hamilton-Jacobi equations, namely the Eikonal equations. Our piecewise linear DG local solver is built on a DG method developed recently [Y. Cheng, C.-W. Shu, A discontinuous Galerkin finite element method for directly solving the Hamilton-Jacobi equations, Journal of Computational Physics 223 (2007) 398-415] for the time-dependent Hamilton-Jacobi equations. The causality property of Eikonal equations is incorporated into the design of this solver. The resulting local nonlinear system in the Gauss-Seidel iterations is a simple quadratic system and can be solved explicitly. The compactness of the DG method and the fast sweeping strategy lead to fast convergence of the new scheme for Eikonal equations. Extensive numerical examples verify efficiency, convergence and second order accuracy of the proposed method.
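To make the sweeping idea concrete, the sketch below implements the classical first-order fast sweeping method for the Eikonal equation |grad u| = 1 on a uniform grid with a point source. It is deliberately simpler than the second-order DG local solver of the paper: it uses the standard first-order upwind update, whose local quadratic equation is likewise solved explicitly. The function name `fast_sweep_eikonal`, the unit speed, and the fixed number of passes are assumptions for illustration.

```python
import math


def fast_sweep_eikonal(n, h, source):
    """First-order fast sweeping for |grad u| = 1 on an n x n grid, spacing h.

    source is an (i, j) tuple where u = 0. Gauss-Seidel sweeps are run in the
    four alternating grid orderings so that every characteristic direction is
    followed by at least one sweep.
    """
    u = [[math.inf] * n for _ in range(n)]
    u[source[0]][source[1]] = 0.0
    up, down = range(n), range(n - 1, -1, -1)
    orderings = [(up, up), (down, up), (down, down), (up, down)]
    for _ in range(2):                       # two passes of the four sweeps
        for rows, cols in orderings:
            for i in rows:
                for j in cols:
                    if (i, j) == source:
                        continue
                    # Smallest neighbor value in each coordinate direction.
                    a = min(u[i - 1][j] if i > 0 else math.inf,
                            u[i + 1][j] if i < n - 1 else math.inf)
                    b = min(u[i][j - 1] if j > 0 else math.inf,
                            u[i][j + 1] if j < n - 1 else math.inf)
                    lo, hi = min(a, b), max(a, b)
                    if lo == math.inf:
                        continue             # no information has arrived yet
                    if hi - lo >= h:
                        cand = lo + h        # one-sided (causal) update
                    else:
                        # Explicit root of (u - a)^2 + (u - b)^2 = h^2.
                        cand = 0.5 * (a + b + math.sqrt(2.0 * h * h - (a - b) ** 2))
                    u[i][j] = min(u[i][j], cand)
    return u


h = 0.5
u = fast_sweep_eikonal(6, h, (0, 0))
print(u[5][0], u[1][1])
```

Along the grid axes through the source this first-order scheme reproduces the distance exactly, while off-axis values are overestimated at first order, which is the accuracy gap a second-order local solver is meant to close.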
Preconditioned conjugate-gradient methods for low-speed flow calculations

NASA Technical Reports Server (NTRS)

Ajmani, Kumud; Ng, Wing-Fai; Liou, Meng-Sing

1993-01-01

An investigation is conducted into the viability of using a generalized Conjugate Gradient-like method as an iterative solver to obtain steady-state solutions of very low-speed fluid flow problems. Low-speed flow at Mach 0.1 over a backward-facing step is chosen as a representative test problem. The unsteady form of the two dimensional, compressible Navier-Stokes equations is integrated in time using discrete time-steps. The Navier-Stokes equations are cast in an implicit, upwind finite-volume, flux split formulation. The new iterative solver is used to solve a linear system of equations at each step of the time-integration. Preconditioning techniques are used with the new solver to enhance the stability and convergence rate of the solver and are found to be critical to the overall success of the solver. A study of various preconditioners reveals that a preconditioner based on the Lower-Upper Successive Symmetric Over-Relaxation iterative scheme is more efficient than a preconditioner based on Incomplete L-U factorizations of the iteration matrix. The performance of the new preconditioned solver is compared with a conventional Line Gauss-Seidel Relaxation (LGSR) solver. Overall speed-up factors of 28 (in terms of global time-steps required to converge to a steady-state solution) and 20 (in terms of total CPU time on one processor of a CRAY-YMP) are found in favor of the new preconditioned solver, when compared with the LGSR solver.
Preconditioned Conjugate Gradient methods for low speed flow calculations

NASA Technical Reports Server (NTRS)

Ajmani, Kumud; Ng, Wing-Fai; Liou, Meng-Sing

1993-01-01

An investigation is conducted into the viability of using a generalized Conjugate Gradient-like method as an iterative solver to obtain steady-state solutions of very low-speed fluid flow problems. Low-speed flow at Mach 0.1 over a backward-facing step is chosen as a representative test problem. The unsteady form of the two dimensional, compressible Navier-Stokes equations is integrated in time using discrete time-steps. The Navier-Stokes equations are cast in an implicit, upwind finite-volume, flux split formulation. The new iterative solver is used to solve a linear system of equations at each step of the time-integration. Preconditioning techniques are used with the new solver to enhance the stability and the convergence rate of the solver and are found to be critical to the overall success of the solver. A study of various preconditioners reveals that a preconditioner based on the lower-upper (L-U)-successive symmetric over-relaxation iterative scheme is more efficient than a preconditioner based on incomplete L-U factorizations of the iteration matrix. The performance of the new preconditioned solver is compared with a conventional line Gauss-Seidel relaxation (LGSR) solver. Overall speed-up factors of 28 (in terms of global time-steps required to converge to a steady-state solution) and 20 (in terms of total CPU time on one processor of a CRAY-YMP) are found in favor of the new preconditioned solver, when compared with the LGSR solver.
Some fast elliptic solvers on parallel architectures and their complexities

NASA Technical Reports Server (NTRS)

Gallopoulos, E.; Saad, Y.

1989-01-01

The discretization of separable elliptic partial differential equations leads to linear systems with special block tridiagonal matrices. Several methods are known to solve these systems, the most general of which is the Block Cyclic Reduction (BCR) algorithm, which handles equations with nonconstant coefficients. A method was recently proposed to parallelize and vectorize BCR. In this paper, the mapping of BCR on distributed memory architectures is discussed, and its complexity is compared with that of other approaches including the Alternating-Direction method. A fast parallel solver is also described, based on an explicit formula for the solution, which has parallel computational complexity lower than that of parallel BCR.
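To make the block tridiagonal structure concrete, the sketch below assembles the standard 5-point finite-difference Laplacian on an n-by-n grid (an assumed model problem, not one taken from the paper) and checks that all nonzeros lie in the diagonal blocks and their immediate neighbours when the unknowns are grouped by grid row. The helper names are hypothetical.

```python
def poisson_2d_matrix(n):
    """Dense 5-point Laplacian on an n x n grid with row-by-row numbering."""
    size = n * n
    A = [[0.0] * size for _ in range(size)]
    for i in range(n):          # grid row
        for j in range(n):      # grid column
            k = i * n + j
            A[k][k] = 4.0
            if j > 0:
                A[k][k - 1] = -1.0   # west neighbour (same block)
            if j < n - 1:
                A[k][k + 1] = -1.0   # east neighbour (same block)
            if i > 0:
                A[k][k - n] = -1.0   # south neighbour (previous block)
            if i < n - 1:
                A[k][k + n] = -1.0   # north neighbour (next block)
    return A


def is_block_tridiagonal(A, block):
    """True if every nonzero lies in block (I, J) with |I - J| <= 1."""
    size = len(A)
    for r in range(size):
        for c in range(size):
            if A[r][c] != 0.0 and abs(r // block - c // block) > 1:
                return False
    return True
```

With blocks of size n the check succeeds, which is the structure that cyclic-reduction-type methods exploit; with a block size of 1 it fails, because the matrix is not scalar tridiagonal.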
Some fast elliptic solvers on parallel architectures and their complexities

NASA Technical Reports Server (NTRS)

Gallopoulos, E.; Saad, Youcef

1989-01-01

The discretization of separable elliptic partial differential equations leads to linear systems with special block tridiagonal matrices. Several methods are known to solve these systems, the most general of which is the Block Cyclic Reduction (BCR) algorithm, which handles equations with nonconstant coefficients. A method was recently proposed to parallelize and vectorize BCR. Here, the mapping of BCR on distributed memory architectures is discussed, and its complexity is compared with that of other approaches, including the Alternating-Direction method. A fast parallel solver is also described, based on an explicit formula for the solution, which has parallel computational complexity lower than that of parallel BCR.
Distributed Memory Parallel Computing with SEAWAT

NASA Astrophysics Data System (ADS)

Verkaik, J.; Huizer, S.; van Engelen, J.; Oude Essink, G.; Ram, R.; Vuik, K.

2017-12-01

Fresh groundwater reserves in coastal aquifers are threatened by sea-level rise, extreme weather conditions, increasing urbanization and associated groundwater extraction rates. To counteract these threats, accurate high-resolution numerical models are required to optimize the management of these precious reserves. The major model drawbacks are long run times and large memory requirements, limiting the predictive power of these models. Distributed memory parallel computing is an efficient technique for reducing run times and memory requirements, where the problem is divided over multiple processor cores. A new Parallel Krylov Solver (PKS) for SEAWAT is presented. PKS has recently been applied to MODFLOW and includes Conjugate Gradient (CG) and Biconjugate Gradient Stabilized (BiCGSTAB) linear accelerators. Both accelerators are preconditioned by an overlapping additive Schwarz preconditioner in a way that: a) subdomains are partitioned using Recursive Coordinate Bisection (RCB) load balancing, b) each subdomain uses local memory only and communicates with other subdomains by Message Passing Interface (MPI) within the linear accelerator, c) it is fully integrated in SEAWAT. Within SEAWAT, the PKS-CG solver replaces the Preconditioned Conjugate Gradient (PCG) solver for solving the variable-density groundwater flow equation and the PKS-BiCGSTAB solver replaces the Generalized Conjugate Gradient (GCG) solver for solving the advection-diffusion equation. PKS supports the third-order Total Variation Diminishing (TVD) scheme for computing advection. Benchmarks were performed on the Dutch national supercomputer (https://userinfo.surfsara.nl/systems/cartesius) using up to 128 cores, for a synthetic 3D Henry model (100 million cells) and the real-life Sand Engine model (10 million cells). The Sand Engine model was used to investigate the potential effect of the long-term morphological evolution of a large sand replenishment and climate change on fresh groundwater resources. Speed-ups up to 40 were obtained with the new PKS solver.
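The Recursive Coordinate Bisection partitioning mentioned above can be sketched in a few lines. This is a generic textbook version, not the PKS implementation: it assumes equally weighted cells given as coordinate tuples and a number of parts that is a power of two, and the function name is hypothetical.

```python
def recursive_coordinate_bisection(points, n_parts):
    """Split points into n_parts (a power of two) groups of near-equal size.

    At each level the point set is cut at the median of the coordinate
    direction with the largest extent.
    """
    if n_parts == 1:
        return [list(points)]
    dim = len(points[0])
    extents = [
        max(p[d] for p in points) - min(p[d] for p in points)
        for d in range(dim)
    ]
    axis = extents.index(max(extents))            # longest direction
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2                       # median cut balances the load
    return (
        recursive_coordinate_bisection(ordered[:mid], n_parts // 2)
        + recursive_coordinate_bisection(ordered[mid:], n_parts // 2)
    )
```

Each resulting group would become one subdomain owned by one MPI process; a production code would also weight cells by their cost and add the overlap layers needed by the Schwarz preconditioner.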
A Comparative Study of Randomized Constraint Solvers for Random-Symbolic Testing

NASA Technical Reports Server (NTRS)

Takaki, Mitsuo; Cavalcanti, Diego; Gheyi, Rohit; Iyoda, Juliano; d'Amorim, Marcelo; Prudencio, Ricardo

2009-01-01

The complexity of constraints is a major obstacle for constraint-based software verification. Automatic constraint solvers are fundamentally incomplete: input constraints often build on some undecidable theory or some theory the solver does not support. This paper proposes and evaluates several randomized solvers to address this issue. We compare the effectiveness of a symbolic solver (CVC3), a random solver, three hybrid solvers (i.e., a mix of random and symbolic), and two heuristic search solvers. We evaluate the solvers on two benchmarks: one consisting of manually generated constraints and another generated with a concolic execution of 8 subjects. In addition to fully decidable constraints, the benchmarks include constraints with non-linear integer arithmetic, integer modulo and division, bitwise arithmetic, and floating-point arithmetic. As expected, symbolic solving (in particular, CVC3) subsumes the other solvers for the concolic execution of subjects that only generate decidable constraints. For the remaining subjects the solvers are complementary.
DataView: a computational visualisation system for multidisciplinary design and analysis

NASA Astrophysics Data System (ADS)

Wang, Chengen

2016-01-01

Rapidly processing raw data and effectively extracting underlying information from huge volumes of multivariate data have become essential to all decision-making processes in sectors like finance, government, medical care, climate analysis, industries, science, etc. Remarkably, visualisation is recognised as a fundamental technology that props up human comprehension, cognition and utilisation of burgeoning amounts of heterogeneous data. This paper presents a computational visualisation system, named DataView, which has been developed for graphically displaying and capturing outcomes of multiphysics problem-solvers widely used in engineering fields. DataView is functionally composed of techniques for table/diagram representation, and graphical illustration of scalar, vector and tensor fields. The field visualisation techniques are implemented on the basis of a range of linear and non-linear meshes, which flexibly adapt to disparate data representation schemas adopted by a variety of disciplinary problem-solvers. The visualisation system has been successfully applied to a number of engineering problems, of which some illustrations are presented to demonstrate the effectiveness of the visualisation techniques.
Techniques and Software for Monolithic Preconditioning of Moderately-sized Geodynamic Stokes Flow Problems

NASA Astrophysics Data System (ADS)

Sanan, Patrick; May, Dave A.; Schenk, Olaf; Bollhöffer, Matthias

2017-04-01

Geodynamics simulations typically involve the repeated solution of saddle-point systems arising from the Stokes equations. These computations often dominate the time to solution. Direct solvers are known for their robustness and "black box" properties, yet exhibit superlinear memory requirements and time to solution. More complex multilevel-preconditioned iterative solvers have been very successful for large problems, yet their use can require more effort from the practitioner in terms of setting up a solver and choosing its parameters. We champion an intermediate approach, based on leveraging the power of modern incomplete factorization techniques for indefinite symmetric matrices. These provide an interesting alternative in situations in between the regimes where direct solvers are an obvious choice and those where complex, scalable, iterative solvers are an obvious choice. That is, much like their relatives for definite systems, ILU/ICC-preconditioned Krylov methods and ILU/ICC-smoothed multigrid methods, the approaches demonstrated here provide a useful addition to the solver toolkit. We present results with a simple, PETSc-based, open-source Q2-Q1 (Taylor-Hood) finite element discretization, in 2 and 3 dimensions, with the Stokes and Lamé (linear elasticity) saddle point systems. Attention is paid to cases in which full-operator incomplete factorization gives an improvement in time to solution over direct solution methods (which may not even be feasible due to memory limitations), without the complication of more complex (or at least, less-automatic) preconditioners or smoothers. As an important factor in the relevance of these tools is their availability in portable software, we also describe open-source PETSc interfaces to the factorization routines.
Cutting-edge Kinetic Physics with Parker Solar Probe and Solar Orbiter: The Arbitrary Linear Plasma Solver (ALPS)

NASA Astrophysics Data System (ADS)

Verscharen, D.; Klein, K. G.; Chandran, B. D. G.; Stevens, M. L.; Salem, C. S.; Bale, S. D.

2017-12-01

The Arbitrary Linear Plasma Solver (ALPS) is a parallelized numerical code that solves the dispersion relation in a hot (even relativistic) magnetized plasma with an arbitrary number of particle species with arbitrary gyrotropic equilibrium distribution functions for any direction of wave propagation with respect to the background field. In this way, ALPS retains generality and overcomes the shortcomings of previous (bi-)Maxwellian solvers for the plasma dispersion relations. The unprecedented high-resolution particle and field data products from Parker Solar Probe (PSP) and Solar Orbiter (SO) will require novel theoretical tools. ALPS is one such tool, and its use will make possible new investigations into the role of non-Maxwellian distributions in the near-Sun solar wind. It can be applied to numerous high-velocity-resolution systems, ranging from current space missions to numerical simulations. We will briefly discuss the ALPS algorithm and demonstrate its functionality based on previous solar-wind measurements. We will then highlight our plans for future applications of ALPS to PSP and SO observations.
Development and Verification of the Charring Ablating Thermal Protection Implicit System Solver

NASA Technical Reports Server (NTRS)

Amar, Adam J.; Calvert, Nathan D.; Kirk, Benjamin S.

2010-01-01

The development and verification of the Charring Ablating Thermal Protection Implicit System Solver is presented. This work concentrates on the derivation and verification of the stationary grid terms in the equations that govern three-dimensional heat and mass transfer for charring thermal protection systems including pyrolysis gas flow through the porous char layer. The governing equations are discretized according to the Galerkin finite element method with first and second order implicit time integrators. The governing equations are fully coupled and are solved in parallel via Newton's method, while the fully implicit linear system is solved with the Generalized Minimal Residual method. Verification results from exact solutions and the Method of Manufactured Solutions are presented to show spatial and temporal orders of accuracy as well as nonlinear convergence rates.
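The order-of-accuracy verification described above rests on a simple calculation: if the error behaves like C h^p, two runs at step sizes h and h/r give p ≈ log(e_coarse / e_fine) / log(r). The sketch below applies this to a stand-in problem with a known exact answer (a central difference of sin at x = 1), not to the ablation solver itself; the function names are hypothetical.

```python
import math


def central_difference_error(h, x=1.0):
    """Error of the central-difference derivative of sin at x."""
    approx = (math.sin(x + h) - math.sin(x - h)) / (2.0 * h)
    return abs(approx - math.cos(x))


def observed_order(error_coarse, error_fine, refinement_ratio):
    """Observed order p from errors on two grids related by the given ratio."""
    return math.log(error_coarse / error_fine) / math.log(refinement_ratio)


# Halving the step should cut the error by about 4 for a second-order scheme.
p = observed_order(
    central_difference_error(0.1),
    central_difference_error(0.05),
    2.0,
)
```

In a manufactured-solution study the same formula would be applied to a norm of the difference between the discrete solution and the chosen analytic field, separately for spatial and temporal refinement.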
Three-dimensional forward modeling and inversion of marine CSEM data in anisotropic conductivity structures

NASA Astrophysics Data System (ADS)

Han, B.; Li, Y.

2016-12-01

We present a three-dimensional (3D) forward and inverse modeling code for marine controlled-source electromagnetic (CSEM) surveys in anisotropic media. The forward solution is based on a primary/secondary field approach, in which secondary fields are solved using a staggered finite-volume (FV) method and primary fields are solved for 1D isotropic background models analytically. It is shown that it is rather straightforward to extend the isotropic 3D FV algorithm to a triaxial anisotropic one, while additional coefficients are required to account for full tensor conductivity. To solve the linear system resulting from FV discretization of Maxwell's equations, both iterative Krylov solvers (e.g. BiCGSTAB) and direct solvers (e.g. MUMPS) have been implemented, which makes the code flexible for different computing platforms and different problems. For iterative solutions, the linear system in terms of electromagnetic potentials (A-Phi) is used to precondition the original linear system, transforming the discretized Curl-Curl equations to discretized Laplace-like equations, so that much more favorable numerical properties can be obtained. Numerical experiments suggest that this A-Phi preconditioner can dramatically improve the convergence rate of an iterative solver and that high accuracy can be achieved without divergence correction even for low frequencies. To efficiently calculate the sensitivities, i.e. the derivatives of CSEM data with respect to tensor conductivity, the adjoint method is employed. For inverse modeling, triaxial anisotropy is taken into account. Since the number of model parameters to be resolved for triaxial anisotropic media is twice or thrice that of isotropic media, the data-space version of the Gauss-Newton (GN) minimization method is preferred due to its lower computational cost compared with the traditional model-space GN method. We demonstrate the effectiveness of the code with synthetic examples.
Solving delay differential equations in S-ADAPT by method of steps.

PubMed

Bauer, Robert J; Mo, Gary; Krzyzanski, Wojciech

2013-09-01

S-ADAPT is a version of the ADAPT program that contains additional simulation and optimization abilities such as parametric population analysis. S-ADAPT utilizes LSODA, an algorithm designed for large-dimension non-stiff and stiff problems, to solve ordinary differential equations (ODEs). However, S-ADAPT does not have a solver for delay differential equations (DDEs). Our objective was to implement in S-ADAPT a DDE solver using the method of steps. The method of steps allows one to solve virtually any DDE system by transforming it to an ODE system. The solver was validated for scalar linear DDEs with one delay and bolus and infusion inputs for which explicit analytic solutions were derived. Solutions of nonlinear DDE problems coded in S-ADAPT were validated by comparing them with ones obtained by the MATLAB DDE solver dde23. The estimation of parameters was tested on MATLAB-simulated population pharmacodynamics data. The S-ADAPT generated solutions for DDE problems agreed with the explicit solutions, as well as with the MATLAB produced solutions, to at least 7 significant digits. The population parameter estimates from using importance sampling expectation-maximization in S-ADAPT agreed with the ones used to generate the data. Published by Elsevier Ireland Ltd.
Reliable and efficient solution of genome-scale models of Metabolism and macromolecular Expression

DOE PAGES

Ma, Ding; Yang, Laurence; Fleming, Ronan M. T.; ...

2017-01-18

Currently, Constraint-Based Reconstruction and Analysis (COBRA) is the only methodology that permits integrated modeling of Metabolism and macromolecular Expression (ME) at genome-scale. Linear optimization computes steady-state flux solutions to ME models, but flux values are spread over many orders of magnitude. Data values also have greatly varying magnitudes. Furthermore, standard double-precision solvers may return inaccurate solutions or report that no solution exists. Exact simplex solvers based on rational arithmetic require a near-optimal warm start to be practical on large problems (current ME models have 70,000 constraints and variables and will grow larger). We also developed a quadruple-precision version of our linear and nonlinear optimizer MINOS, and a solution procedure (DQQ) involving Double and Quad MINOS that achieves reliability and efficiency for ME models and other challenging problems tested here. DQQ will enable extensive use of large linear and nonlinear models in systems biology and other applications involving multiscale data.
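Why values spread over many orders of magnitude trouble double-precision arithmetic can be seen in a three-term sum. The sketch below is only an illustration with made-up numbers, not part of DQQ or MINOS: it compares ordinary floating-point accumulation with exact rational arithmetic from Python's standard `fractions` module.

```python
from fractions import Fraction

# Hypothetical balance terms of very different magnitude.
terms = [1e16, 1.0, -1e16]

# Double precision: the unit term is lost when added to 1e16.
float_sum = 0.0
for value in terms:
    float_sum += value

# Rational arithmetic: every float is converted exactly, so nothing is lost.
exact_sum = sum(Fraction(value) for value in terms)
```

Here `float_sum` is 0.0 while `exact_sum` equals 1. Exact rational arithmetic avoids the loss at a high cost per operation, which is the trade-off that motivates intermediate options such as quadruple precision.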
Reliable and efficient solution of genome-scale models of Metabolism and macromolecular Expression

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ma, Ding; Yang, Laurence; Fleming, Ronan M. T.

Currently, Constraint-Based Reconstruction and Analysis (COBRA) is the only methodology that permits integrated modeling of Metabolism and macromolecular Expression (ME) at genome-scale. Linear optimization computes steady-state flux solutions to ME models, but flux values are spread over many orders of magnitude. Data values also have greatly varying magnitudes. Furthermore, standard double-precision solvers may return inaccurate solutions or report that no solution exists. Exact simplex solvers based on rational arithmetic require a near-optimal warm start to be practical on large problems (current ME models have 70,000 constraints and variables and will grow larger). We also developed a quadruple-precision version of our linear and nonlinear optimizer MINOS, and a solution procedure (DQQ) involving Double and Quad MINOS that achieves reliability and efficiency for ME models and other challenging problems tested here. DQQ will enable extensive use of large linear and nonlinear models in systems biology and other applications involving multiscale data.
Performance issues for iterative solvers in device simulation

NASA Technical Reports Server (NTRS)

Fan, Qing; Forsyth, P. A.; Mcmacken, J. R. F.; Tang, Wei-Pai

1994-01-01

Due to memory limitations, iterative methods have become the method of choice for large scale semiconductor device simulation. However, it is well known that these methods still suffer from reliability problems. The linear systems which appear in numerical simulation of semiconductor devices are notoriously ill-conditioned. In order to produce robust algorithms for practical problems, careful attention must be given to many implementation issues. This paper concentrates on strategies for developing robust preconditioners. In addition, effective data structures and convergence check issues are also discussed. These algorithms are compared with a standard direct sparse matrix solver on a variety of problems.
High Performance Radiation Transport Simulations on TITAN

DOE Office of Scientific and Technical Information (OSTI.GOV)

Baker, Christopher G; Davidson, Gregory G; Evans, Thomas M

2012-01-01

In this paper we describe the Denovo code system. Denovo solves the six-dimensional, steady-state, linear Boltzmann transport equation, of central importance to nuclear technology applications such as reactor core analysis (neutronics), radiation shielding, nuclear forensics and radiation detection. The code features multiple spatial differencing schemes, state-of-the-art linear solvers, the Koch-Baker-Alcouffe (KBA) parallel-wavefront sweep algorithm for inverting the transport operator, a new multilevel energy decomposition method scaling to hundreds of thousands of processing cores, and a modern, novel code architecture that supports straightforward integration of new features. In this paper we discuss the performance of Denovo on the 10-20 petaflop ORNL GPU-based system, Titan. We describe algorithms and techniques used to exploit the capabilities of Titan's heterogeneous compute node architecture and the challenges of obtaining good parallel performance for this sparse hyperbolic PDE solver containing inherently sequential computations. Numerical results demonstrating Denovo performance on early Titan hardware are presented.

MUSTA fluxes for systems of conservation laws

NASA Astrophysics Data System (ADS)

Toro, E. F.; Titarev, V. A.

2006-08-01

This paper is about numerical fluxes for hyperbolic systems and we first present a numerical flux, called GFORCE, that is a weighted average of the Lax-Friedrichs and Lax-Wendroff fluxes. For the linear advection equation with constant coefficient, the new flux reduces identically to that of the Godunov first-order upwind method. Then we incorporate GFORCE in the framework of the MUSTA approach [E.F. Toro, Multi-Stage Predictor-Corrector Fluxes for Hyperbolic Equations. Technical Report NI03037-NPA, Isaac Newton Institute for Mathematical Sciences, University of Cambridge, UK, 17th June, 2003], resulting in a version that we call GMUSTA. For non-linear systems this gives results that are comparable to those of the Godunov method in conjunction with the exact Riemann solver or complete approximate Riemann solvers, noting however that in our approach, the solution of the Riemann problem in the conventional sense is avoided. Both the GFORCE and GMUSTA fluxes are extended to multi-dimensional non-linear systems in a straightforward unsplit manner, resulting in linearly stable schemes that have the same stability regions as the straightforward multi-dimensional extension of Godunov's method. The methods are applicable to general meshes. The schemes of this paper share with the family of centred methods the common properties of being simple and applicable to a large class of hyperbolic systems, but the schemes of this paper are distinctly more accurate. Finally, we proceed to the practical implementation of our numerical fluxes in the framework of high-order finite volume WENO methods for multi-dimensional non-linear hyperbolic systems. Numerical results are presented for the Euler equations and for the equations of magnetohydrodynamics.
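The claim that the weighted flux reduces to the Godunov upwind flux for linear advection can be checked numerically. The sketch below assumes the scalar equation u_t + a u_x = 0 and the weight ω = 1 / (1 + c), where c is the CFL number; this weight is the commonly cited GFORCE choice and is not stated in the abstract. Function names are hypothetical.

```python
def lax_friedrichs_flux(a, u_left, u_right, dx, dt):
    return 0.5 * a * (u_left + u_right) - 0.5 * (dx / dt) * (u_right - u_left)


def lax_wendroff_flux(a, u_left, u_right, dx, dt):
    return 0.5 * a * (u_left + u_right) - 0.5 * (dt / dx) * a * a * (u_right - u_left)


def gforce_flux(a, u_left, u_right, dx, dt):
    """Weighted average of Lax-Wendroff and Lax-Friedrichs fluxes."""
    cfl = abs(a) * dt / dx
    omega = 1.0 / (1.0 + cfl)  # assumed GFORCE weight
    return (omega * lax_wendroff_flux(a, u_left, u_right, dx, dt)
            + (1.0 - omega) * lax_friedrichs_flux(a, u_left, u_right, dx, dt))


def godunov_upwind_flux(a, u_left, u_right):
    """First-order upwind flux: take the state the wave is coming from."""
    return a * u_left if a >= 0.0 else a * u_right
```

Writing both centred fluxes as 0.5 a (u_L + u_R) minus a dissipation term, the weighted dissipation coefficient is 0.5 |a| (ω c + (1 - ω) / c) = 0.5 |a|, which is exactly the upwind value, so the two fluxes agree for either sign of a.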
DOE Office of Scientific and Technical Information (OSTI.GOV)

Thompson, S.

This report describes the use of several subroutines from the CORLIB core mathematical subroutine library for the solution of a model fluid flow problem. The model consists of the Euler partial differential equations. The equations are spatially discretized using the method of pseudo-characteristics. The resulting system of ordinary differential equations is then integrated using the method of lines. The stiff ordinary differential equation solver LSODE (2) from CORLIB is used to perform the time integration. The non-stiff solver ODE (4) is used to perform a related integration. The linear equation solver subroutines DECOMP and SOLVE are used to solve linear systems whose solutions are required in the calculation of the time derivatives. The monotone cubic spline interpolation subroutines PCHIM and PCHFE are used to approximate water properties. The report describes the use of each of these subroutines in detail. It illustrates the manner in which modules from a standard mathematical software library such as CORLIB can be used as building blocks in the solution of complex problems of practical interest. 9 refs., 2 figs., 4 tabs.
Accelerating Subsurface Transport Simulation on Heterogeneous Clusters

DOE Office of Scientific and Technical Information (OSTI.GOV)

Villa, Oreste; Gawande, Nitin A.; Tumeo, Antonino

Reactive transport numerical models simulate chemical and microbiological reactions that occur along a flowpath. These models have to compute reactions for a large number of locations. They solve the set of ordinary differential equations (ODEs) that describes the reaction for each location through the Newton-Raphson technique. This technique involves computing a Jacobian matrix and a residual vector for each set of equations, and then solving iteratively the linearized system by performing Gaussian Elimination and LU decomposition until convergence. STOMP, a well-known subsurface flow simulation tool, employs matrices with sizes in the order of 100x100 elements and, for numerical accuracy, LU factorization with full pivoting instead of the faster partial pivoting. Modern high performance computing systems are heterogeneous machines whose nodes integrate both CPUs and GPUs, exposing unprecedented amounts of parallelism. To exploit all their computational power, applications must use both types of processing elements. For the case of subsurface flow simulation, this mainly requires implementing efficient batched LU-based solvers and identifying efficient solutions for enabling load balancing among the different processors of the system. In this paper we discuss two approaches that allow scaling STOMP's performance on heterogeneous clusters. We initially identify the challenges in implementing batched LU-based solvers for small matrices on GPUs, and propose an implementation that fulfills STOMP's requirements. We compare this implementation to other existing solutions. Then, we combine the batched GPU solver with an OpenMP-based CPU solver, and present an adaptive load balancer that dynamically distributes the linear systems to solve between the two components inside a node. We show how these approaches, integrated into the full application, provide speed-ups of 6 to 7 times on large problems, executed on up to 16 nodes of a cluster with two AMD Opteron 6272 and a Tesla M2090 per node.
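Full pivoting, as opposed to partial pivoting, searches the whole remaining submatrix for the largest entry and swaps both a row and a column at every elimination step. The sketch below is a plain single-system Gaussian elimination with full pivoting in pure Python, written only to show the idea; it is not STOMP's solver or the batched GPU implementation, and the function name is hypothetical.

```python
def solve_full_pivoting(A, b):
    """Solve A x = b by Gaussian elimination with full (complete) pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented working copy
    col_perm = list(range(n))  # col_perm[k] = original index of column k
    for k in range(n):
        # Largest entry in the trailing (n - k) x (n - k) submatrix.
        p, q = max(
            ((i, j) for i in range(k, n) for j in range(k, n)),
            key=lambda ij: abs(M[ij[0]][ij[1]]),
        )
        if M[p][q] == 0.0:
            raise ValueError("matrix is singular")
        M[k], M[p] = M[p], M[k]                        # row swap
        for row in M:                                  # column swap
            row[k], row[q] = row[q], row[k]
        col_perm[k], col_perm[q] = col_perm[q], col_perm[k]
        for i in range(k + 1, n):                      # eliminate below pivot
            factor = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= factor * M[k][j]
    y = [0.0] * n
    for i in range(n - 1, -1, -1):                     # back substitution
        s = M[i][n] - sum(M[i][j] * y[j] for j in range(i + 1, n))
        y[i] = s / M[i][i]
    x = [0.0] * n
    for k in range(n):                                 # undo column permutation
        x[col_perm[k]] = y[k]
    return x
```

The pivot search costs on the order of n^2 comparisons per step instead of n, which is why partial pivoting is usually faster; for many small systems of this kind, the batched approach in the abstract runs many such eliminations concurrently.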
Large-scale 3D geoelectromagnetic modeling using parallel adaptive high-order finite element method

DOE PAGES

Grayver, Alexander V.; Kolev, Tzanio V.

2015-11-01

Here, we have investigated the use of the adaptive high-order finite-element method (FEM) for geoelectromagnetic modeling. Because high-order FEM is challenging from the numerical and computational points of view, most published finite-element studies in geoelectromagnetics use the lowest order formulation. Solution of the resulting large system of linear equations poses the main practical challenge. We have developed a fully parallel and distributed robust and scalable linear solver based on the optimal block-diagonal and auxiliary space preconditioners. The solver was found to be efficient for high finite element orders, unstructured and nonconforming locally refined meshes, a wide range of frequencies, large conductivity contrasts, and number of degrees of freedom (DoFs). Furthermore, the presented linear solver is in essence algebraic; i.e., it acts on the matrix-vector level and thus requires no information about the discretization, boundary conditions, or physical source used, making it readily efficient for a wide range of electromagnetic modeling problems. To get accurate solutions at reduced computational cost, we have also implemented goal-oriented adaptive mesh refinement. The numerical tests indicated that if highly accurate modeling results were required, the high-order FEM in combination with the goal-oriented local mesh refinement required less computational time and DoFs than the lowest order adaptive FEM.
Large-scale 3D geoelectromagnetic modeling using parallel adaptive high-order finite element method

DOE Office of Scientific and Technical Information (OSTI.GOV)

Grayver, Alexander V.; Kolev, Tzanio V.

Here, we have investigated the use of the adaptive high-order finite-element method (FEM) for geoelectromagnetic modeling. Because high-order FEM is challenging from the numerical and computational points of view, most published finite-element studies in geoelectromagnetics use the lowest order formulation. Solution of the resulting large system of linear equations poses the main practical challenge. We have developed a fully parallel and distributed robust and scalable linear solver based on the optimal block-diagonal and auxiliary space preconditioners. The solver was found to be efficient for high finite element orders, unstructured and nonconforming locally refined meshes, a wide range of frequencies, large conductivity contrasts, and number of degrees of freedom (DoFs). Furthermore, the presented linear solver is in essence algebraic; i.e., it acts on the matrix-vector level and thus requires no information about the discretization, boundary conditions, or physical source used, making it readily efficient for a wide range of electromagnetic modeling problems. To get accurate solutions at reduced computational cost, we have also implemented goal-oriented adaptive mesh refinement. The numerical tests indicated that if highly accurate modeling results were required, the high-order FEM in combination with the goal-oriented local mesh refinement required less computational time and DoFs than the lowest order adaptive FEM.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Ghysels, Pieter; Li, Xiaoye S.; Rouet, Francois -Henry

Here, we present a sparse linear system solver that is based on a multifrontal variant of Gaussian elimination and exploits low-rank approximation of the resulting dense frontal matrices. We use hierarchically semiseparable (HSS) matrices, which have low-rank off-diagonal blocks, to approximate the frontal matrices. For HSS matrix construction, a randomized sampling algorithm is used together with interpolative decompositions. The combination of the randomized compression with a fast ULV HSS factorization leads to a solver with lower computational complexity than the standard multifrontal method for many applications, resulting in speedups of up to 7-fold for problems in our test suite. The implementation targets many-core systems by using task parallelism with dynamic runtime scheduling. Numerical experiments show performance improvements over state-of-the-art sparse direct solvers. The implementation achieves high performance and good scalability on a range of modern shared memory parallel systems, including the Intel Xeon Phi (MIC). The code is part of a software package called STRUMPACK - STRUctured Matrices PACKage, which also has a distributed memory component for dense rank-structured matrices.
Efficient and Robust Optimization for Building Energy Simulation

PubMed Central

Pourarian, Shokouh; Kearsley, Anthony; Wen, Jin; Pertzborn, Amanda

2016-01-01

Efficiently, robustly and accurately solving large sets of structured, non-linear algebraic and differential equations is one of the most computationally expensive steps in the dynamic simulation of building energy systems. Here, the efficiency, robustness and accuracy of two commonly employed solution methods are compared. The comparison is conducted using the HVACSIM+ software package, a component based building system simulation tool. The HVACSIM+ software presently employs Powell’s Hybrid method to solve systems of nonlinear algebraic equations that model the dynamics of energy states and interactions within buildings. It is shown here that Powell’s method does not always converge to a solution. Since a myriad of other numerical methods are available, the question arises as to which method is most appropriate for building energy simulation. This paper finds that considerable computational benefits result from replacing the Powell’s Hybrid method solver in HVACSIM+ with a solver more appropriate for the challenges particular to numerical simulations of buildings. Evidence is provided that a variant of the Levenberg-Marquardt solver has superior accuracy and robustness compared to the Powell’s Hybrid method presently used in HVACSIM+. PMID: 27325907
Efficient and Robust Optimization for Building Energy Simulation.

PubMed

Pourarian, Shokouh; Kearsley, Anthony; Wen, Jin; Pertzborn, Amanda

2016-06-15

Efficiently, robustly and accurately solving large sets of structured, non-linear algebraic and differential equations is one of the most computationally expensive steps in the dynamic simulation of building energy systems. Here, the efficiency, robustness and accuracy of two commonly employed solution methods are compared. The comparison is conducted using the HVACSIM+ software package, a component-based building system simulation tool. The HVACSIM+ software presently employs Powell's Hybrid method to solve systems of nonlinear algebraic equations that model the dynamics of energy states and interactions within buildings. It is shown here that Powell's method does not always converge to a solution. Since a myriad of other numerical methods are available, the question arises as to which method is most appropriate for building energy simulation. This paper finds that considerable computational benefits result from replacing the Powell's Hybrid method solver in HVACSIM+ with a solver more appropriate for the challenges particular to numerical simulations of buildings. Evidence is provided that a variant of the Levenberg-Marquardt solver has superior accuracy and robustness compared to the Powell's Hybrid method presently used in HVACSIM+.
LSRN: A PARALLEL ITERATIVE SOLVER FOR STRONGLY OVER- OR UNDERDETERMINED SYSTEMS

PubMed Central

Meng, Xiangrui; Saunders, Michael A.; Mahoney, Michael W.

2014-01-01

We describe a parallel iterative least squares solver named LSRN that is based on random normal projection. LSRN computes the min-length solution to min_{x ∈ ℝ^n} ‖Ax − b‖_2, where A ∈ ℝ^{m×n} with m ≫ n or m ≪ n, and where A may be rank-deficient. Tikhonov regularization may also be included. Since A is involved only in matrix-matrix and matrix-vector multiplications, it can be a dense or sparse matrix or a linear operator, and LSRN automatically speeds up when A is sparse or a fast linear operator. The preconditioning phase consists of a random normal projection, which is embarrassingly parallel, and a singular value decomposition of size ⌈γ min(m, n)⌉ × min(m, n), where γ is moderately larger than 1, e.g., γ = 2. We prove that the preconditioned system is well-conditioned, with a strong concentration result on the extreme singular values, and hence that the number of iterations is fully predictable when we apply LSQR or the Chebyshev semi-iterative method. As we demonstrate, the Chebyshev method is particularly efficient for solving large problems on clusters with high communication cost. Numerical results show that on a shared-memory machine, LSRN is very competitive with LAPACK’s DGELSD and a fast randomized least squares solver called Blendenpik on large dense problems, and it outperforms the least squares solver from SuiteSparseQR on sparse problems without sparsity patterns that can be exploited to reduce fill-in. Further experiments show that LSRN scales well on an Amazon Elastic Compute Cloud cluster. PMID:25419094
LEOPARD: A grid-based dispersion relation solver for arbitrary gyrotropic distributions

NASA Astrophysics Data System (ADS)

Astfalk, Patrick; Jenko, Frank

2017-01-01

Particle velocity distributions measured in collisionless space plasmas often show strong deviations from idealized model distributions. Despite this observational evidence, linear wave analysis in space plasma environments such as the solar wind or Earth's magnetosphere is still mainly carried out using dispersion relation solvers based on Maxwellians or other parametric models. To enable a more realistic analysis, we present the new grid-based kinetic dispersion relation solver LEOPARD (Linear Electromagnetic Oscillations in Plasmas with Arbitrary Rotationally-symmetric Distributions) which no longer requires prescribed model distributions but allows for arbitrary gyrotropic distribution functions. In this work, we discuss the underlying numerical scheme of the code and we show a few exemplary benchmarks. Furthermore, we demonstrate a first application of LEOPARD to ion distribution data obtained from hybrid simulations. In particular, we show that in the saturation stage of the parallel fire hose instability, the deformation of the initial bi-Maxwellian distribution invalidates the use of standard dispersion relation solvers. A linear solver based on bi-Maxwellians predicts further growth even after saturation, while LEOPARD correctly indicates vanishing growth rates. We also discuss how this complies with former studies on the validity of quasilinear theory for the resonant fire hose. In the end, we briefly comment on the role of LEOPARD in directly analyzing spacecraft data, and we refer to an upcoming paper which demonstrates a first application of that kind.
Mixed-Integer Conic Linear Programming: Challenges and Perspectives

DTIC Science & Technology

2013-10-01

The novel DCCs for MISOCO may be used in branch-and-cut algorithms when solving MISOCO problems. The experimental software CICLO was developed to...perform limited, but rigorous computational experiments. The CICLO solver utilizes continuous SOCO solvers, MOSEK, CPLEX or SeDuMi, builds on the open...submitted Fall 2013. Software: 1. CICLO: Integer conic linear optimization package. Authors: J.C. Góez, T.K. Ralphs, Y. Fu, and T. Terlaky
FEAST fundamental framework for electronic structure calculations: Reformulation and solution of the muffin-tin problem

NASA Astrophysics Data System (ADS)

Levin, Alan R.; Zhang, Deyin; Polizzi, Eric

2012-11-01

In a recent article Polizzi (2009) [15], the FEAST algorithm has been presented as a general purpose eigenvalue solver which is ideally suited for addressing the numerical challenges in electronic structure calculations. Here, FEAST is presented beyond the “black-box” solver as a fundamental modeling framework which can naturally address the original numerical complexity of the electronic structure problem as formulated by Slater in 1937 [3]. The non-linear eigenvalue problem arising from the muffin-tin decomposition of the real-space domain is first derived and then reformulated to be solved exactly within the FEAST framework. This new framework is presented as a fundamental and practical solution for performing both accurate and scalable electronic structure calculations, bypassing the various issues of using traditional approaches such as linearization and pseudopotential techniques. A finite element implementation of this FEAST framework along with simulation results for various molecular systems is also presented and discussed.
Development of a steady potential solver for use with linearized, unsteady aerodynamic analyses

NASA Technical Reports Server (NTRS)

Hoyniak, Daniel; Verdon, Joseph M.

1991-01-01

A full potential steady flow solver (SFLOW) developed explicitly for use with an inviscid unsteady aerodynamic analysis (LINFLO) is described. The steady solver uses the nonconservative form of the nonlinear potential flow equations together with an implicit, least squares, finite difference approximation to solve for the steady flow field. The difference equations were developed on a composite mesh which consists of a C grid embedded in a rectilinear (H grid) cascade mesh. The composite mesh is capable of resolving blade to blade and far field phenomena on the H grid, while accurately resolving local phenomena on the C grid. The resulting system of algebraic equations is arranged in matrix form using a sparse matrix package and solved by Newton's method. Steady and unsteady results are presented for two cascade configurations: a high speed compressor and a turbine with high exit Mach number.
Parallel Dynamics Simulation Using a Krylov-Schwarz Linear Solution Scheme

DOE PAGES

Abhyankar, Shrirang; Constantinescu, Emil M.; Smith, Barry F.; ...

2016-11-07

Fast dynamics simulation of large-scale power systems is a computational challenge because of the need to solve a large set of stiff, nonlinear differential-algebraic equations at every time step. The main bottleneck in dynamic simulations is the solution of a linear system during each nonlinear iteration of Newton’s method. In this paper, we present a parallel Krylov-Schwarz linear solution scheme that uses the Krylov subspace-based iterative linear solver GMRES with an overlapping restricted additive Schwarz preconditioner. As a result, performance tests of the proposed Krylov-Schwarz scheme for several large test cases ranging from 2,000 to 20,000 buses, including a real utility network, show good scalability on different computing architectures.
Parallel Dynamics Simulation Using a Krylov-Schwarz Linear Solution Scheme

DOE Office of Scientific and Technical Information (OSTI.GOV)

Abhyankar, Shrirang; Constantinescu, Emil M.; Smith, Barry F.

Fast dynamics simulation of large-scale power systems is a computational challenge because of the need to solve a large set of stiff, nonlinear differential-algebraic equations at every time step. The main bottleneck in dynamic simulations is the solution of a linear system during each nonlinear iteration of Newton’s method. In this paper, we present a parallel Krylov-Schwarz linear solution scheme that uses the Krylov subspace-based iterative linear solver GMRES with an overlapping restricted additive Schwarz preconditioner. As a result, performance tests of the proposed Krylov-Schwarz scheme for several large test cases ranging from 2,000 to 20,000 buses, including a real utility network, show good scalability on different computing architectures.
Code Samples Used for Complexity and Control

NASA Astrophysics Data System (ADS)

Ivancevic, Vladimir G.; Reid, Darryn J.

2015-11-01

The following sections are included: * MathematicaⓇ Code * Generic Chaotic Simulator * Vector Differential Operators * NLS Explorer * 2C++ Code * C++ Lambda Functions for Real Calculus * Accelerometer Data Processor * Simple Predictor-Corrector Integrator * Solving the BVP with the Shooting Method * Linear Hyperbolic PDE Solver * Linear Elliptic PDE Solver * Method of Lines for a Set of the NLS Equations * C# Code * Iterative Equation Solver * Simulated Annealing: A Function Minimum * Simple Nonlinear Dynamics * Nonlinear Pendulum Simulator * Lagrangian Dynamics Simulator * Complex-Valued Crowd Attractor Dynamics * Freeform Fortran Code * Lorenz Attractor Simulator * Complex Lorenz Attractor * Simple SGE Soliton * Complex Signal Presentation * Gaussian Wave Packet * Hermitian Matrices * Euclidean L2-Norm * Vector/Matrix Operations * Plain C-Code: Levenberg-Marquardt Optimizer * Free Basic Code: 2D Crowd Dynamics with 3000 Agents
Implementation of the Jacobian-free Newton-Krylov method for solving the first-order ice sheet momentum balance

DOE Office of Scientific and Technical Information (OSTI.GOV)

Salinger, Andy; Evans, Katherine J; Lemieux, Jean-Francois

2011-01-01

We have implemented the Jacobian-free Newton-Krylov (JFNK) method for solving the first-order ice sheet momentum equation in order to improve the numerical performance of the Community Ice Sheet Model (CISM), the land ice component of the Community Earth System Model (CESM). Our JFNK implementation is based on significant re-use of existing code. For example, our physics-based preconditioner uses the original Picard linear solver in CISM. For several test cases spanning a range of geometries and boundary conditions, our JFNK implementation is 1.84-3.62 times more efficient than the standard Picard solver in CISM. Importantly, this computational gain of JFNK over the Picard solver increases when refining the grid. Global convergence of the JFNK solver has been significantly improved by rescaling the equation for the basal boundary condition and through the use of an inexact Newton method. While a diverse set of test cases show that our JFNK implementation is usually robust, for some problems it may fail to converge with increasing resolution (as does the Picard solver). Globalization through parameter continuation did not remedy this problem and future work to improve robustness will explore a combination of Picard and JFNK and the use of homotopy methods.
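The "Jacobian-free" part of JFNK rests on one idea: a Krylov solver only needs Jacobian-vector products, and these can be approximated by a finite difference of the residual function, J(u) v ≈ (F(u + εv) − F(u)) / ε. The sketch below shows only that ingredient on a toy residual; the residual function, the choice of ε and the function names are assumptions for illustration and are unrelated to the CISM code.

```python
import math
import sys

def F(u):
    # Toy nonlinear residual (an assumption for the example).
    x, y = u
    return [x * x + y - 3.0, x + math.sin(y)]

def jacobian_vector_fd(func, u, v):
    """Approximate J(u) @ v without forming J, via a forward difference."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(a * a for a in v))
    if norm_v == 0.0:
        return [0.0] * len(u)
    # A common heuristic step size; other choices are possible.
    eps = math.sqrt(sys.float_info.epsilon) * (1.0 + norm_u) / norm_v
    f0 = func(u)
    f1 = func([a + eps * b for a, b in zip(u, v)])
    return [(p - q) / eps for p, q in zip(f1, f0)]

u = [1.0, 2.0]
v = [0.5, -1.0]
approx = jacobian_vector_fd(F, u, v)

# Analytic Jacobian of the toy residual, used only to check the approximation.
J = [[2.0 * u[0], 1.0], [1.0, math.cos(u[1])]]
exact = [J[0][0] * v[0] + J[0][1] * v[1], J[1][0] * v[0] + J[1][1] * v[1]]
print(approx, exact)
```

In a full JFNK solver this product would be handed to a Krylov method such as GMRES inside each Newton step, so the Jacobian matrix is never assembled.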
Hydrodynamics of suspensions of passive and active rigid particles: a rigid multiblob approach

DOE PAGES

Usabiaga, Florencio Balboa; Kallemov, Bakytzhan; Delmotte, Blaise; ...

2016-01-12

We develop a rigid multiblob method for numerically solving the mobility problem for suspensions of passive and active rigid particles of complex shape in Stokes flow in unconfined, partially confined, and fully confined geometries. As in a number of existing methods, we discretize rigid bodies using a collection of minimally resolved spherical blobs constrained to move as a rigid body, to arrive at a potentially large linear system of equations for the unknown Lagrange multipliers and rigid-body motions. Here we develop a block-diagonal preconditioner for this linear system and show that a standard Krylov solver converges in a modest number of iterations that is essentially independent of the number of particles. Key to the efficiency of the method is a technique for fast computation of the product of the blob-blob mobility matrix and a vector. For unbounded suspensions, we rely on existing analytical expressions for the Rotne-Prager-Yamakawa tensor combined with a fast multipole method (FMM) to obtain linear scaling in the number of particles. For suspensions sedimented against a single no-slip boundary, we use a direct summation on a graphical processing unit (GPU), which gives quadratic asymptotic scaling with the number of particles. For fully confined domains, such as periodic suspensions or suspensions confined in slit and square channels, we extend a recently developed rigid-body immersed boundary method by B. Kallemov, A. P. S. Bhalla, B. E. Griffith, and A. Donev (Commun. Appl. Math. Comput. Sci. 11 (2016), no. 1, 79-141) to suspensions of freely moving passive or active rigid particles at zero Reynolds number. We demonstrate that the iterative solver for the coupled fluid and rigid-body equations converges in a bounded number of iterations regardless of the system size. In our approach, each iteration only requires a few cycles of a geometric multigrid solver for the Poisson equation, and an application of the block-diagonal preconditioner, leading to linear scaling with the number of particles. We optimize a number of parameters in the iterative solvers and apply our method to a variety of benchmark problems to carefully assess the accuracy of the rigid multiblob approach as a function of the resolution. We also model the dynamics of colloidal particles studied in recent experiments, such as passive boomerangs in a slit channel, as well as a pair of non-Brownian active nanorods sedimented against a wall.
Fast and Efficient Discrimination of Traveling Salesperson Problem Stimulus Difficulty

ERIC Educational Resources Information Center

Dry, Matthew J.; Fontaine, Elizabeth L.

2014-01-01

The Traveling Salesperson Problem (TSP) is a computationally difficult combinatorial optimization problem. In spite of its relative difficulty, human solvers are able to generate close-to-optimal solutions in a close-to-linear time frame, and it has been suggested that this is due to the visual system's inherent sensitivity to certain geometric…
Domain decomposition methods for the parallel computation of reacting flows

NASA Technical Reports Server (NTRS)

Keyes, David E.

1988-01-01

Domain decomposition is a natural route to parallel computing for partial differential equation solvers. Subdomains of which the original domain of definition is comprised are assigned to independent processors at the price of periodic coordination between processors to compute global parameters and maintain the requisite degree of continuity of the solution at the subdomain interfaces. In the domain-decomposed solution of steady multidimensional systems of PDEs by finite difference methods using a pseudo-transient version of Newton iteration, the only portion of the computation which generally stands in the way of efficient parallelization is the solution of the large, sparse linear systems arising at each Newton step. For some Jacobian matrices drawn from an actual two-dimensional reacting flow problem, comparisons are made between relaxation-based linear solvers and also preconditioned iterative methods of Conjugate Gradient and Chebyshev type, focusing attention on both iteration count and global inner product count. The generalized minimum residual method with block-ILU preconditioning is judged the best serial method among those considered, and parallel numerical experiments on the Encore Multimax demonstrate for it approximately 10-fold speedup on 16 processors.

Final Report - Subcontract B623760

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bank, R.

2017-11-17

During my visit to LLNL during July 17-27, 2017, I worked on linear system solvers. The two-level hierarchical solver that initiated our study was developed to solve linear systems arising from hp-adaptive finite element calculations, and is implemented in the PLTMG software package, version 12. This preconditioner typically requires 3-20% of the space used by the stiffness matrix for higher order elements. It has multigrid-like convergence rates for a wide variety of PDEs (self-adjoint positive definite elliptic equations, convection dominated convection-diffusion equations, and highly indefinite Helmholtz equations, among others). The convergence rate is not independent of the polynomial degree p as p → ∞, but remains strong for p ≤ 9, which is the highest polynomial degree allowed in PLTMG, due to limitations of the numerical quadrature rules implemented in the software package. A more complete description of the method and some numerical experiments illustrating its effectiveness appear in. Like traditional geometric multilevel methods, this scheme relies on knowledge of the underlying finite element space in order to construct the smoother and the coarse grid correction.
IGA-ADS: Isogeometric analysis FEM using ADS solver

NASA Astrophysics Data System (ADS)

Łoś, Marcin M.; Woźniak, Maciej; Paszyński, Maciej; Lenharth, Andrew; Hassaan, Muhamm Amber; Pingali, Keshav

2017-08-01

In this paper we present a fast explicit solver for the solution of non-stationary problems using L2 projections with the isogeometric finite element method. The solver has been implemented within the GALOIS framework. It enables parallel multi-core simulations of different time-dependent problems, in 1D, 2D, or 3D. We have prepared the solver framework in a way that enables direct implementation of the selected PDE and corresponding boundary conditions. In this paper we describe the installation, the implementation of three exemplary PDEs, and the execution of the simulations on multi-core Linux cluster nodes. We consider three case studies, including heat transfer, linear elasticity, as well as non-linear flow in heterogeneous media. The presented package generates output suitable for interfacing with Gnuplot and ParaView visualization software. The exemplary simulations show near perfect scalability on the Gilbert shared-memory node with four Intel® Xeon® CPU E7-4860 processors, each possessing 10 physical cores (for a total of 40 cores).
Pushing Memory Bandwidth Limitations Through Efficient Implementations of Block-Krylov Space Solvers on GPUs

DOE Office of Scientific and Technical Information (OSTI.GOV)

Clark, M. A.; Strelchenko, Alexei; Vaquero, Alejandro

Lattice quantum chromodynamics simulations in nuclear physics have benefited from a tremendous number of algorithmic advances such as multigrid and eigenvector deflation. These improve the time to solution but do not alleviate the intrinsic memory-bandwidth constraints of the matrix-vector operation dominating iterative solvers. Batching this operation for multiple vectors and exploiting cache and register blocking can yield a super-linear speed up. Block-Krylov solvers can naturally take advantage of such batched matrix-vector operations, further reducing the iterations to solution by sharing the Krylov space between solves. However, practical implementations typically suffer from the quadratic scaling in the number of vector-vector operations. Using the QUDA library, we present an implementation of a block-CG solver on NVIDIA GPUs which reduces the memory-bandwidth complexity of vector-vector operations from quadratic to linear. We present results for the HISQ discretization, showing a 5x speedup compared to highly-optimized independent Krylov solves on NVIDIA's SaturnV cluster.
A generalized Poisson and Poisson-Boltzmann solver for electrostatic environments.

PubMed

Fisicaro, G; Genovese, L; Andreussi, O; Marzari, N; Goedecker, S

2016-01-07

The computational study of chemical reactions in complex, wet environments is critical for applications in many fields. It is often essential to study chemical reactions in the presence of applied electrochemical potentials, taking into account the non-trivial electrostatic screening coming from the solvent and the electrolytes. As a consequence, the electrostatic potential has to be found by solving the generalized Poisson and the Poisson-Boltzmann equations for neutral and ionic solutions, respectively. In the present work, solvers for both problems have been developed. A preconditioned conjugate gradient method has been implemented for the solution of the generalized Poisson equation and the linear regime of the Poisson-Boltzmann, allowing to solve iteratively the minimization problem with some ten iterations of the ordinary Poisson equation solver. In addition, a self-consistent procedure enables us to solve the non-linear Poisson-Boltzmann problem. Both solvers exhibit very high accuracy and parallel efficiency and allow for the treatment of periodic, free, and slab boundary conditions. The solver has been integrated into the BigDFT and Quantum-ESPRESSO electronic-structure packages and will be released as an independent program, suitable for integration in other codes.
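To make the "preconditioned conjugate gradient" step concrete, the sketch below runs a textbook PCG iteration on a small symmetric positive definite system obtained from a 1D finite-difference discretization of −d/dx(ε(x) du/dx) = f with a varying coefficient. It uses a simple Jacobi (diagonal) preconditioner purely as a stand-in: the abstract indicates the authors' preconditioning relies on an ordinary Poisson solver, which is not reproduced here. The grid size, coefficient values and function names are assumptions.

```python
# Textbook preconditioned conjugate gradient on a small dense SPD matrix.
# Jacobi preconditioning is an illustrative stand-in, not the paper's choice.

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def pcg(A, b, tol=1e-10, max_iter=100):
    n = len(b)
    inv_diag = [1.0 / A[i][i] for i in range(n)]   # Jacobi preconditioner
    x = [0.0] * n
    r = list(b)                                     # residual for x = 0
    z = [d * ri for d, ri in zip(inv_diag, r)]
    p = list(z)
    rz = dot(r, z)
    for k in range(max_iter):
        if dot(r, r) ** 0.5 <= tol:
            return x, k
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter

# 1D variable-coefficient operator with Dirichlet ends (unit grid spacing).
n = 8
eps_half = [1.0 + 0.5 * i for i in range(n + 1)]    # coefficient at cell faces
A = [[0.0] * n for _ in range(n)]
for i in range(n):
    A[i][i] = eps_half[i] + eps_half[i + 1]
    if i + 1 < n:
        A[i][i + 1] = A[i + 1][i] = -eps_half[i + 1]
b = [1.0] * n

solution, iterations = pcg(A, b)
residual_norm = max(abs(ri - bi) for ri, bi in zip(matvec(A, solution), b))
print(iterations, residual_norm)
```

A better preconditioner changes only the line that computes `z`; the rest of the iteration is unchanged, which is why the quality of the preconditioner largely determines the iteration count.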
A generalized Poisson and Poisson-Boltzmann solver for electrostatic environments

DOE Office of Scientific and Technical Information (OSTI.GOV)

Fisicaro, G., E-mail: giuseppe.fisicaro@unibas.ch; Goedecker, S.; Genovese, L.

2016-01-07

The computational study of chemical reactions in complex, wet environments is critical for applications in many fields. It is often essential to study chemical reactions in the presence of applied electrochemical potentials, taking into account the non-trivial electrostatic screening coming from the solvent and the electrolytes. As a consequence, the electrostatic potential has to be found by solving the generalized Poisson and the Poisson-Boltzmann equations for neutral and ionic solutions, respectively. In the present work, solvers for both problems have been developed. A preconditioned conjugate gradient method has been implemented for the solution of the generalized Poisson equation and the linear regime of the Poisson-Boltzmann, allowing to solve iteratively the minimization problem with some ten iterations of the ordinary Poisson equation solver. In addition, a self-consistent procedure enables us to solve the non-linear Poisson-Boltzmann problem. Both solvers exhibit very high accuracy and parallel efficiency and allow for the treatment of periodic, free, and slab boundary conditions. The solver has been integrated into the BigDFT and Quantum-ESPRESSO electronic-structure packages and will be released as an independent program, suitable for integration in other codes.
An immersed-boundary method for flow–structure interaction in biological systems with application to phonation

PubMed Central

Luo, Haoxiang; Mittal, Rajat; Zheng, Xudong; Bielamowicz, Steven A.; Walsh, Raymond J.; Hahn, James K.

2008-01-01

A new numerical approach for modeling a class of flow–structure interaction problems typically encountered in biological systems is presented. In this approach, a previously developed, sharp-interface, immersed-boundary method for incompressible flows is used to model the fluid flow and a new, sharp-interface Cartesian grid, immersed boundary method is devised to solve the equations of linear viscoelasticity that governs the solid. The two solvers are coupled to model flow–structure interaction. This coupled solver has the advantage of simple grid generation and efficient computation on simple, single-block structured grids. The accuracy of the solid-mechanics solver is examined by applying it to a canonical problem. The solution methodology is then applied to the problem of laryngeal aerodynamics and vocal fold vibration during human phonation. This includes a three-dimensional eigen analysis for a multi-layered vocal fold prototype as well as two-dimensional, flow-induced vocal fold vibration in a modeled larynx. Several salient features of the aerodynamics as well as vocal-fold dynamics are presented. PMID:19936017
The value of continuity: Refined isogeometric analysis and fast direct solvers

DOE PAGES

Garcia, Daniel; Pardo, David; Dalcin, Lisandro; ...

2016-08-24

Here, we propose the use of highly continuous finite element spaces interconnected with low continuity hyperplanes to maximize the performance of direct solvers. Starting from a highly continuous Isogeometric Analysis (IGA) discretization, we introduce C^0-separators to reduce the interconnection between degrees of freedom in the mesh. By doing so, both the solution time and best approximation errors are simultaneously improved. We call the resulting method “refined Isogeometric Analysis (rIGA)”. To illustrate the impact of the continuity reduction, we analyze the number of Floating Point Operations (FLOPs), computational times, and memory required to solve the linear system obtained by discretizing the Laplace problem with structured meshes and uniform polynomial orders. Theoretical estimates demonstrate that an optimal continuity reduction may decrease the total computational time by a factor between p^2 and p^3, with p being the polynomial order of the discretization. Numerical results indicate that our proposed refined isogeometric analysis delivers a speed-up factor proportional to p^2. In a 2D mesh with four million elements and p=5, the linear system resulting from rIGA is solved 22 times faster than the one from highly continuous IGA. In a 3D mesh with one million elements and p=3, the linear system is solved 15 times faster for the refined than the maximum continuity isogeometric analysis.
Large-scale 3-D EM modelling with a Block Low-Rank multifrontal direct solver

NASA Astrophysics Data System (ADS)

Shantsev, Daniil V.; Jaysaval, Piyoosh; de la Kethulle de Ryhove, Sébastien; Amestoy, Patrick R.; Buttari, Alfredo; L'Excellent, Jean-Yves; Mary, Theo

2017-06-01

We put forward the idea of using a Block Low-Rank (BLR) multifrontal direct solver to efficiently solve the linear systems of equations arising from a finite-difference discretization of the frequency-domain Maxwell equations for 3-D electromagnetic (EM) problems. The solver uses a low-rank representation for the off-diagonal blocks of the intermediate dense matrices arising in the multifrontal method to reduce the computational load. A numerical threshold, the so-called BLR threshold, controlling the accuracy of low-rank representations was optimized by balancing errors in the computed EM fields against savings in floating point operations (flops). Simulations were carried out over large-scale 3-D resistivity models representing typical scenarios for marine controlled-source EM surveys, and in particular the SEG SEAM model which contains an irregular salt body. The flop count, size of factor matrices and elapsed run time for matrix factorization are reduced dramatically by using BLR representations and can go down to, respectively, 10, 30 and 40 per cent of their full-rank values for our largest system with N = 20.6 million unknowns. The reductions are almost independent of the number of MPI tasks and threads at least up to 90 × 10 = 900 cores. The BLR savings increase for larger systems, which reduces the factorization flop complexity from O(N^2) for the full-rank solver to O(N^m) with m = 1.4-1.6. The BLR savings are significantly larger for deep-water environments that exclude the highly resistive air layer from the computational domain. A study in a scenario where simulations are required at multiple source locations shows that the BLR solver can become competitive in comparison to iterative solvers as an engine for 3-D controlled-source electromagnetic Gauss-Newton inversion that requires forward modelling for a few thousand right-hand sides.
User's Manual for PCSMS (Parallel Complex Sparse Matrix Solver). Version 1.

NASA Technical Reports Server (NTRS)

Reddy, C. J.

2000-01-01

PCSMS (Parallel Complex Sparse Matrix Solver) is a computer code written to make use of existing real sparse direct solvers to solve complex, sparse matrix linear equations. PCSMS converts complex matrices into real matrices and uses real, sparse direct matrix solvers to factor and solve the real matrices. The solution vector is then converted back to complex numbers. Though this utility is written for Silicon Graphics (SGI) real sparse matrix solution routines, it is general in nature and can be easily modified to work with any real sparse matrix solver. The User's Manual is written to acquaint the user with the installation and operation of the code. Driver routines are given to help users integrate PCSMS routines into their own codes.
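One standard way to convert a complex system A x = b into a real one is the block form [[Re A, −Im A], [Im A, Re A]] [Re x; Im x] = [Re b; Im b], which doubles the dimension. The abstract does not say which real formulation PCSMS uses, so the sketch below should be read as an illustration of the general idea only. It uses small dense matrices and a plain Gaussian elimination in place of a sparse direct solver; all function names are assumptions.

```python
def gaussian_solve(M, rhs):
    """Dense real solve with partial pivoting (stand-in for a sparse solver)."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / aug[i][i]
    return x

def solve_complex_via_real(A, b):
    """Solve complex A x = b through the equivalent 2n x 2n real system."""
    n = len(b)
    M = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            M[i][j] = A[i][j].real
            M[i][j + n] = -A[i][j].imag
            M[i + n][j] = A[i][j].imag
            M[i + n][j + n] = A[i][j].real
    rhs = [bi.real for bi in b] + [bi.imag for bi in b]
    y = gaussian_solve(M, rhs)
    # Reassemble the complex solution from its real and imaginary halves.
    return [complex(y[i], y[i + n]) for i in range(n)]

A = [[2 + 1j, 1 - 1j], [0.5j, 3 + 0j]]
x_true = [1 - 2j, -1 + 0.5j]
b = [sum(A[i][j] * x_true[j] for j in range(2)) for i in range(2)]
x = solve_complex_via_real(A, b)
print(x)
```

The real system has four times as many entries as the complex one, which is the price paid for reusing a real-only solver.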
libmpdata++ 1.0: a library of parallel MPDATA solvers for systems of generalised transport equations

NASA Astrophysics Data System (ADS)

Jaruga, A.; Arabas, S.; Jarecka, D.; Pawlowska, H.; Smolarkiewicz, P. K.; Waruszewski, M.

2015-04-01

This paper accompanies the first release of libmpdata++, a C++ library implementing the multi-dimensional positive-definite advection transport algorithm (MPDATA) on regular structured grid. The library offers basic numerical solvers for systems of generalised transport equations. The solvers are forward-in-time, conservative and non-linearly stable. The libmpdata++ library covers the basic second-order-accurate formulation of MPDATA, its third-order variant, the infinite-gauge option for variable-sign fields and a flux-corrected transport extension to guarantee non-oscillatory solutions. The library is equipped with a non-symmetric variational elliptic solver for implicit evaluation of pressure gradient terms. All solvers offer parallelisation through domain decomposition using shared-memory parallelisation. The paper describes the library programming interface, and serves as a user guide. Supported options are illustrated with benchmarks discussed in the MPDATA literature. Benchmark descriptions include code snippets as well as quantitative representations of simulation results. Examples of applications include homogeneous transport in one, two and three dimensions in Cartesian and spherical domains; a shallow-water system compared with analytical solution (originally derived for a 2-D case); and a buoyant convection problem in an incompressible Boussinesq fluid with interfacial instability. All the examples are implemented out of the library tree. Regardless of the differences in the problem dimensionality, right-hand-side terms, boundary conditions and parallelisation approach, all the examples use the same unmodified library, which is a key goal of libmpdata++ design. The design, based on the principle of separation of concerns, prioritises the user and developer productivity. The libmpdata++ library is implemented in C++, making use of the Blitz++ multi-dimensional array containers, and is released as free/libre and open-source software.
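For readers unfamiliar with MPDATA, the basic second-order scheme can be sketched in one dimension as two upwind (donor-cell) passes: a first pass with the physical Courant number, then a corrective pass that uses an "antidiffusive" pseudo-velocity computed from the first-pass result. The Python sketch below assumes a constant Courant number, a periodic domain and a non-negative field. It does not use or reproduce the libmpdata++ C++ interface, and every name in it is an assumption made for the example.

```python
def upwind_flux(psi_left, psi_right, c):
    # Donor-cell flux through a face with Courant number c.
    return max(c, 0.0) * psi_left + min(c, 0.0) * psi_right

def upwind_step(psi, c_half):
    # c_half[i] is the Courant number at the face between cells i and i+1
    # (periodic wrap-around); flux[i - 1] is the face to the left of cell i.
    n = len(psi)
    flux = [upwind_flux(psi[i], psi[(i + 1) % n], c_half[i]) for i in range(n)]
    return [psi[i] - (flux[i] - flux[i - 1]) for i in range(n)]

def antidiffusive_courant(psi, c_half, eps=1e-15):
    # Pseudo-velocity that compensates the leading upwind truncation error
    # (1D form for a non-divergent flow and a non-negative field).
    n = len(psi)
    out = []
    for i in range(n):
        left, right = psi[i], psi[(i + 1) % n]
        c = c_half[i]
        out.append((abs(c) - c * c) * (right - left) / (right + left + eps))
    return out

def mpdata_step(psi, c_half):
    first_pass = upwind_step(psi, c_half)
    return upwind_step(first_pass, antidiffusive_courant(first_pass, c_half))

n_cells, courant, n_steps = 50, 0.5, 40
initial = [1.0 if 10 <= i < 20 else 0.0 for i in range(n_cells)]
c_half = [courant] * n_cells
psi_upwind = list(initial)
psi_mpdata = list(initial)
for _ in range(n_steps):
    psi_upwind = upwind_step(psi_upwind, c_half)
    psi_mpdata = mpdata_step(psi_mpdata, c_half)
print(max(psi_upwind), max(psi_mpdata), sum(psi_mpdata))
```

Because both passes are written in flux form, the total of the advected field is conserved, and because each pass is an upwind step with a bounded Courant number, a non-negative field stays non-negative up to round-off.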
libmpdata++ 0.1: a library of parallel MPDATA solvers for systems of generalised transport equations

NASA Astrophysics Data System (ADS)

Jaruga, A.; Arabas, S.; Jarecka, D.; Pawlowska, H.; Smolarkiewicz, P. K.; Waruszewski, M.

2014-11-01

This paper accompanies the first release of libmpdata++, a C++ library implementing the Multidimensional Positive-Definite Advection Transport Algorithm (MPDATA). The library offers basic numerical solvers for systems of generalised transport equations. The solvers are forward-in-time, conservative and non-linearly stable. The libmpdata++ library covers the basic second-order-accurate formulation of MPDATA, its third-order variant, the infinite-gauge option for variable-sign fields and a flux-corrected transport extension to guarantee non-oscillatory solutions. The library is equipped with a non-symmetric variational elliptic solver for implicit evaluation of pressure gradient terms. All solvers offer parallelisation through domain decomposition using shared-memory parallelisation. The paper describes the library programming interface, and serves as a user guide. Supported options are illustrated with benchmarks discussed in the MPDATA literature. Benchmark descriptions include code snippets as well as quantitative representations of simulation results. Examples of applications include: homogeneous transport in one, two and three dimensions in Cartesian and spherical domains; a shallow-water system compared with an analytical solution (originally derived for a 2-D case); and a buoyant convection problem in an incompressible Boussinesq fluid with interfacial instability. All the examples are implemented out of the library tree. Regardless of the differences in the problem dimensionality, right-hand-side terms, boundary conditions and parallelisation approach, all the examples use the same unmodified library, which is a key goal of libmpdata++ design. The design, based on the principle of separation of concerns, prioritises the user and developer productivity. The libmpdata++ library is implemented in C++, making use of the Blitz++ multi-dimensional array containers, and is released as free/libre and open-source software.
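The basic second-order MPDATA formulation mentioned in the abstract can be illustrated with a minimal 1-D sketch in Python. It is not the libmpdata++ interface (which is C++ and built on Blitz++); the function names, the periodic boundary and the constant Courant number are assumptions made for illustration. The scheme applies one upwind (donor-cell) pass, then a second upwind pass driven by an antidiffusive pseudo-velocity.

```python
def donor_cell_flux(psi_left, psi_right, courant):
    """Upwind (donor-cell) flux through the face between two cells."""
    return max(courant, 0.0) * psi_left + min(courant, 0.0) * psi_right


def upwind_step(psi, courant_faces):
    """One conservative upwind step on a periodic 1-D grid.

    courant_faces[i] is the Courant number at the face between cells i and i+1.
    """
    n = len(psi)
    flux = [donor_cell_flux(psi[i], psi[(i + 1) % n], courant_faces[i])
            for i in range(n)]
    # flux[i - 1] wraps around for i == 0, which gives the periodic boundary.
    return [psi[i] - (flux[i] - flux[i - 1]) for i in range(n)]


def mpdata_step(psi, courant, eps=1e-15):
    """One basic MPDATA step (one corrective pass) for a constant Courant number."""
    n = len(psi)
    # Pass 1: plain upwind advection with the physical Courant number.
    first = upwind_step(psi, [courant] * n)
    # Pass 2: upwind advection with the antidiffusive pseudo-velocity,
    # which compensates the leading numerical diffusion of pass 1.
    anti = []
    for i in range(n):
        left, right = first[i], first[(i + 1) % n]
        anti.append((abs(courant) - courant ** 2)
                    * (right - left) / (right + left + eps))
    return upwind_step(first, anti)
```

For a non-negative field and a Courant number of magnitude at most one, this sketch conserves the sum of the field and keeps it non-negative, while spreading a narrow pulse less than the upwind pass alone.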
A parallel solver for huge dense linear systems

NASA Astrophysics Data System (ADS)

Badia, J. M.; Movilla, J. L.; Climente, J. I.; Castillo, M.; Marqués, M.; Mayo, R.; Quintana-Ortí, E. S.; Planelles, J.

2011-11-01

HDSS (Huge Dense Linear System Solver) is a Fortran Application Programming Interface (API) that facilitates the parallel solution of very large dense systems for scientists and engineers. The API makes use of parallelism to yield an efficient solution of the systems on a wide range of parallel platforms, from clusters of processors to massively parallel multiprocessors. It exploits out-of-core strategies to leverage the secondary memory in order to solve huge linear systems of size O(100 000). The API is based on the parallel linear algebra library PLAPACK, and on its Out-Of-Core (OOC) extension POOCLAPACK. Both PLAPACK and POOCLAPACK use the Message Passing Interface (MPI) as the communication layer and BLAS to perform the local matrix operations. The API provides a friendly interface to the users, hiding almost all the technical aspects related to the parallel execution of the code and the use of the secondary memory to solve the systems. In particular, the API can automatically select the best way to store and solve the systems, depending on the dimension of the system, the number of processes and the main memory of the platform. Experimental results on several parallel platforms report high performance, reaching more than 1 TFLOP with 64 cores to solve a system with more than 200 000 equations and more than 10 000 right-hand side vectors. New version program summary. Program title: Huge Dense System Solver (HDSS) Catalogue identifier: AEHU_v1_1 Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEHU_v1_1.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html No. of lines in distributed program, including test data, etc.: 87 062 No. of bytes in distributed program, including test data, etc.: 1 069 110 Distribution format: tar.gz Programming language: Fortran90, C Computer: Parallel architectures: multiprocessors, computer clusters Operating system: Linux/Unix Has the code been vectorized or parallelized?: Yes, includes MPI primitives. RAM: Tested for up to 190 GB Classification: 6.5 External routines: MPI (http://www.mpi-forum.org/), BLAS (http://www.netlib.org/blas/), PLAPACK (http://www.cs.utexas.edu/~plapack/), POOCLAPACK (ftp://ftp.cs.utexas.edu/pub/rvdg/PLAPACK/pooclapack.ps) (code for PLAPACK and POOCLAPACK is included in the distribution). Catalogue identifier of previous version: AEHU_v1_0 Journal reference of previous version: Comput. Phys. Comm. 182 (2011) 533 Does the new version supersede the previous version?: Yes Nature of problem: Huge scale dense systems of linear equations, Ax=B, beyond standard LAPACK capabilities. Solution method: The linear systems are solved by means of parallelized routines based on the LU factorization, using efficient secondary storage algorithms when the available main memory is insufficient. Reasons for new version: In many applications we need to guarantee a high accuracy in the solution of very large linear systems and we can do it by using double-precision arithmetic. Summary of revisions: Version 1.1 can be used to solve linear systems using double-precision arithmetic. New version of the initialization routine. The user can choose the kind of arithmetic and the values of several parameters of the environment. Running time: About 5 hours to solve a system with more than 200 000 equations and more than 10 000 right-hand side vectors using double-precision arithmetic on an eight-node commodity cluster with a total of 64 Intel cores.
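The solution method described above, an LU factorization reused for many right-hand sides in Ax=B, can be sketched serially in Python. This is only an in-memory illustration of the idea; it says nothing about the HDSS Fortran API, PLAPACK or the out-of-core storage scheme, and the function names `lu_factor` and `lu_solve` are hypothetical.

```python
def lu_factor(matrix):
    """LU factorization with partial pivoting.

    Returns (lu, perm): lu holds L (unit diagonal, below) and U (on and above
    the diagonal); perm[i] is the original row now stored in row i.
    """
    n = len(matrix)
    lu = [row[:] for row in matrix]
    perm = list(range(n))
    for k in range(n):
        pivot = max(range(k, n), key=lambda r: abs(lu[r][k]))
        if lu[pivot][k] == 0.0:
            raise ValueError("matrix is singular")
        if pivot != k:
            lu[k], lu[pivot] = lu[pivot], lu[k]
            perm[k], perm[pivot] = perm[pivot], perm[k]
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            factor = lu[i][k]
            for j in range(k + 1, n):
                lu[i][j] -= factor * lu[k][j]
    return lu, perm


def lu_solve(lu, perm, rhs):
    """Solve A x = rhs using a factorization produced by lu_factor."""
    n = len(lu)
    y = [rhs[perm[i]] for i in range(n)]   # apply the row permutation
    for i in range(n):                     # forward substitution (L y = P b)
        for j in range(i):
            y[i] -= lu[i][j] * y[j]
    for i in reversed(range(n)):           # back substitution (U x = y)
        for j in range(i + 1, n):
            y[i] -= lu[i][j] * y[j]
        y[i] /= lu[i][i]
    return y
```

The O(n^3) factorization is paid once, and each additional right-hand side costs only O(n^2), which is why a workload with thousands of right-hand sides benefits from this approach.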
Toward an optimal solver for time-spectral fluid-dynamic and aeroelastic solutions on unstructured meshes

NASA Astrophysics Data System (ADS)

Mundis, Nathan L.; Mavriplis, Dimitri J.

2017-09-01

The time-spectral method applied to the Euler and coupled aeroelastic equations theoretically offers significant computational savings for purely periodic problems when compared to standard time-implicit methods. However, attaining superior efficiency with time-spectral methods over traditional time-implicit methods hinges on the ability to rapidly solve the large non-linear system resulting from time-spectral discretizations, which become larger and stiffer as more time instances are employed or the period of the flow becomes especially short (i.e. the maximum resolvable wave-number increases). In order to increase the efficiency of these solvers, and to improve robustness, particularly for large numbers of time instances, the Generalized Minimal Residual Method (GMRES) is used to solve the implicit linear system over all coupled time instances. The use of GMRES as the linear solver makes time-spectral methods more robust, allows them to be applied to a far greater subset of time-accurate problems, including those with a broad range of harmonic content, and vastly improves the efficiency of time-spectral methods. In previous work, a wave-number independent preconditioner that mitigates the increased stiffness of the time-spectral method when applied to problems with large resolvable wave numbers has been developed. This preconditioner, however, directly inverts a large matrix whose size increases in proportion to the number of time instances. As a result, the computational time of this method scales as the cube of the number of time instances. In the present work, this preconditioner has been reworked to take advantage of an approximate-factorization approach that effectively decouples the spatial and temporal systems. Once decoupled, the time-spectral matrix can be inverted in frequency space, where it has entries only on the main diagonal and therefore can be inverted quite efficiently. This new GMRES/preconditioner combination is shown to be over an order of magnitude more efficient than the previous wave-number independent preconditioner for problems with large numbers of time instances and/or large reduced frequencies.
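The remark that the time-spectral matrix becomes diagonal in frequency space can be illustrated on a scalar model problem. The Python sketch below is not the authors' preconditioner; it assumes an odd number of equally spaced time instances and a hypothetical scalar system (sigma + d/dt) x = b, and uses a naive discrete Fourier transform so that it needs only the standard library.

```python
import cmath
import math


def dft(values):
    """Naive forward discrete Fourier transform."""
    n = len(values)
    return [sum(values[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n))
            for k in range(n)]


def idft(coeffs):
    """Naive inverse discrete Fourier transform."""
    n = len(coeffs)
    return [sum(coeffs[k] * cmath.exp(2j * math.pi * k * j / n) for k in range(n)) / n
            for j in range(n)]


def signed_wavenumber(k, n):
    """Map DFT index k to its signed harmonic number (n assumed odd)."""
    return k if k <= (n - 1) // 2 else k - n


def solve_time_spectral(rhs, sigma, period):
    """Solve (sigma + d/dt) x = rhs over one period at n time instances.

    In frequency space the operator is diagonal with entries
    sigma + i*k*omega, so the solve is an element-wise division.
    """
    n = len(rhs)
    omega = 2.0 * math.pi / period
    rhs_hat = dft(rhs)
    sol_hat = [rhs_hat[k] / (sigma + 1j * omega * signed_wavenumber(k, n))
               for k in range(n)]
    return [z.real for z in idft(sol_hat)]
```

With a fast Fourier transform in place of the naive one, the cost per solve grows as N log N in the number of time instances N, compared with the cubic cost of directly inverting the dense time-spectral matrix.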
A LAGRANGIAN GAUSS-NEWTON-KRYLOV SOLVER FOR MASS- AND INTENSITY-PRESERVING DIFFEOMORPHIC IMAGE REGISTRATION.

PubMed

Mang, Andreas; Ruthotto, Lars

2017-01-01

We present an efficient solver for diffeomorphic image registration problems in the framework of Large Deformations Diffeomorphic Metric Mappings (LDDMM). We use an optimal control formulation, in which the velocity field of a hyperbolic PDE needs to be found such that the distance between the final state of the system (the transformed/transported template image) and the observation (the reference image) is minimized. Our solver supports both stationary and non-stationary (i.e., transient or time-dependent) velocity fields. As transformation models, we consider both the transport equation (assuming intensities are preserved during the deformation) and the continuity equation (assuming mass-preservation). We consider the reduced form of the optimal control problem and solve the resulting unconstrained optimization problem using a discretize-then-optimize approach. A key contribution is the elimination of the PDE constraint using a Lagrangian hyperbolic PDE solver. Lagrangian methods rely on the concept of characteristic curves. We approximate these curves using a fourth-order Runge-Kutta method. We also present an efficient algorithm for computing the derivatives of the final state of the system with respect to the velocity field. This allows us to use fast Gauss-Newton based methods. We present quickly converging iterative linear solvers using spectral preconditioners that render the overall optimization efficient and scalable. Our method is embedded into the image registration framework FAIR and, thus, supports the most commonly used similarity measures and regularization functionals. We demonstrate the potential of our new approach using several synthetic and real world test problems with up to 14.7 million degrees of freedom.
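The Lagrangian step described above, approximating characteristic curves with a fourth-order Runge-Kutta method, can be sketched as follows. This Python fragment is illustrative only and is not taken from FAIR; the 2-D velocity callback `velocity(x, y, t)` and the function name `rk4_trace` are assumptions.

```python
def rk4_trace(velocity, point, t0, t1, steps):
    """Trace one characteristic dx/dt = v(x, t) with classical RK4.

    velocity(x, y, t) returns the velocity components (vx, vy);
    point is the starting position (x, y) at time t0.
    """
    h = (t1 - t0) / steps
    x, y = point
    t = t0
    for _ in range(steps):
        k1x, k1y = velocity(x, y, t)
        k2x, k2y = velocity(x + 0.5 * h * k1x, y + 0.5 * h * k1y, t + 0.5 * h)
        k3x, k3y = velocity(x + 0.5 * h * k2x, y + 0.5 * h * k2y, t + 0.5 * h)
        k4x, k4y = velocity(x + h * k3x, y + h * k3y, t + h)
        x += h * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        y += h * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
        t += h
    return x, y
```

In an intensity-preserving (transport) model, the transformed image at a point is obtained by evaluating the template at the foot of the characteristic through that point, so each grid point can be traced independently.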
Parallel Symmetric Eigenvalue Problem Solvers

DTIC Science & Technology

2015-05-01

...get research, tutoring, and mentoring experience as an undergraduate. Last but not least, I thank my family for their love and support. ... 4.6.2 Choice of the Ritz shifts ... 4.7 Relationship between ... pencil. I will conclude with a discussion of the relationship between Trace-Min and simultaneous iteration. If both methods solve the linear systems...
Evaluating Sparse Linear System Solvers on Scalable Parallel Architectures

DTIC Science & Technology

2008-10-01

... 3.4 Residual history of WSO banded preconditioner for problem 2D 54019 HIGHK ... 3.5 Residual history of WSO banded preconditioner for problem Appu ... 3.6 Residual history of WSO banded preconditioner for problem ASIC 680k ... 3.7 Residual history of WSO banded preconditioner for problem BUNDLE1
ASIS v1.0: an adaptive solver for the simulation of atmospheric chemistry

NASA Astrophysics Data System (ADS)

Cariolle, Daniel; Moinat, Philippe; Teyssèdre, Hubert; Giraud, Luc; Josse, Béatrice; Lefèvre, Franck

2017-04-01

This article reports on the development and tests of the adaptive semi-implicit scheme (ASIS) solver for the simulation of atmospheric chemistry. To solve the ordinary differential equation systems associated with the time evolution of the species concentrations, ASIS adopts a one-step linearized implicit scheme with specific treatments of the Jacobian of the chemical fluxes. It conserves mass and has a time-stepping module to control the accuracy of the numerical solution. In idealized box-model simulations, ASIS gives results similar to the higher-order implicit schemes derived from the Rosenbrock's and Gear's methods and requires less computation and run time at the moderate precision required for atmospheric applications. When implemented in the MOCAGE chemical transport model and the Laboratoire de Météorologie Dynamique Mars general circulation model, the ASIS solver performs well and reveals weaknesses and limitations of the original semi-implicit solvers used by these two models. ASIS can be easily adapted to various chemical schemes and further developments are foreseen to increase its computational efficiency, and to include the computation of the concentrations of the species in aqueous-phase in addition to gas-phase chemistry.
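A one-step linearized implicit scheme of the general kind mentioned above can be sketched on a toy two-species system. The Python code below is a generic linearized implicit Euler step, (I - dt J) dy = dt f(y), and does not reproduce ASIS's specific treatment of the Jacobian or its time-step control. The reactions A + A -> B (rate k1) and B -> A + A (rate k2) and all names are assumptions for illustration.

```python
def rates(a, b, k1, k2):
    """Net production rates for A + A -> B (k1) and B -> A + A (k2)."""
    forward = k1 * a * a
    backward = k2 * b
    return [-2.0 * forward + 2.0 * backward, forward - backward]


def jacobian(a, b, k1, k2):
    """Jacobian of the rates with respect to (a, b)."""
    return [[-4.0 * k1 * a, 2.0 * k2],
            [2.0 * k1 * a, -k2]]


def linearized_implicit_step(a, b, dt, k1, k2):
    """Advance one step by solving (I - dt*J) dy = dt*f with Cramer's rule."""
    f = rates(a, b, k1, k2)
    j = jacobian(a, b, k1, k2)
    m11 = 1.0 - dt * j[0][0]
    m12 = -dt * j[0][1]
    m21 = -dt * j[1][0]
    m22 = 1.0 - dt * j[1][1]
    det = m11 * m22 - m12 * m21
    r1, r2 = dt * f[0], dt * f[1]
    da = (r1 * m22 - m12 * r2) / det
    db = (m11 * r2 - m21 * r1) / det
    return a + da, b + db
```

Because the weights (1, 2) annihilate both the rate vector and the Jacobian from the left, the quantity a + 2b is conserved by every step up to rounding, which mirrors the mass-conservation property noted in the abstract.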
Final Report, DE-FG01-06ER25718 Domain Decomposition and Parallel Computing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Widlund, Olof B.

2015-06-09

The goal of this project is to develop and improve domain decomposition algorithms for a variety of partial differential equations such as those of linear elasticity and electro-magnetics. These iterative methods are designed for massively parallel computing systems and allow the fast solution of the very large systems of algebraic equations that arise in large scale and complicated simulations. A special emphasis is placed on problems arising from Maxwell's equations. The approximate solvers, the preconditioners, are combined with the conjugate gradient method and must always include a solver of a coarse model in order to have a performance which is independent of the number of processors used in the computer simulation. A recent development allows for an adaptive construction of this coarse component of the preconditioner.
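The way a preconditioner is combined with the conjugate gradient method can be sketched as follows. This Python fragment is a textbook preconditioned conjugate gradient loop; the preconditioner is passed in as a function, and a simple diagonal (Jacobi) preconditioner stands in for the domain decomposition preconditioner with coarse solve that the project actually studies. All names are hypothetical.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


def pcg(matvec, rhs, precond, tol=1e-10, max_iter=200):
    """Preconditioned conjugate gradients for a symmetric positive definite system.

    matvec(v) applies the matrix; precond(r) applies an approximate inverse.
    Returns (solution, iterations_used).
    """
    x = [0.0] * len(rhs)
    r = rhs[:]                  # residual for the zero initial guess
    z = precond(r)
    p = z[:]
    rz = dot(r, z)
    for iteration in range(max_iter):
        if dot(r, r) ** 0.5 <= tol:
            return x, iteration
        ap = matvec(p)
        alpha = rz / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        z = precond(r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter
```

Only the `precond` callback would change when moving from this stand-in to a two-level domain decomposition method: it would then perform the local subdomain solves plus the coarse-model solve.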
Preconditioned conjugate gradient methods for the Navier-Stokes equations

NASA Technical Reports Server (NTRS)

Ajmani, Kumud; Ng, Wing-Fai; Liou, Meng-Sing

1994-01-01

A preconditioned Krylov subspace method (GMRES) is used to solve the linear systems of equations formed at each time-integration step of the unsteady, two-dimensional, compressible Navier-Stokes equations of fluid flow. The Navier-Stokes equations are cast in an implicit, upwind finite-volume, flux-split formulation. Several preconditioning techniques are investigated to enhance the efficiency and convergence rate of the implicit solver based on the GMRES algorithm. The superiority of the new solver is established by comparisons with a conventional implicit solver, namely line Gauss-Seidel relaxation (LGSR). Computational test results for low-speed (incompressible flow over a backward-facing step at Mach 0.1), transonic flow (trailing edge flow in a transonic turbine cascade), and hypersonic flow (shock-on-shock interactions on a cylindrical leading edge at Mach 6.0) are presented. For the Mach 0.1 case, overall speedup factors of up to 17 (in terms of time-steps) and 15 (in terms of CPU time on a CRAY-YMP/8) are found in favor of the preconditioned GMRES solver, when compared with the LGSR solver. The corresponding speedup factors for the transonic flow case are 17 and 23, respectively. The hypersonic flow case shows slightly lower speedup factors of 9 and 13, respectively. The study of preconditioners conducted in this research reveals that a new LUSGS-type preconditioner is much more efficient than a conventional incomplete LU-type preconditioner.
NASA-Ames three-dimensional potential flow analysis system (POTFAN) equation solver code (SOLN) version 1

NASA Technical Reports Server (NTRS)

Davis, J. E.; Bonnett, W. S.; Medan, R. T.

1976-01-01

A computer program known as SOLN was developed as an independent segment of the NASA-Ames three-dimensional potential flow analysis system (POTFAN) for solving systems of linear algebraic equations. Methods used include: LU decomposition, Householder's method, a partitioning scheme, and a block successive relaxation method. Due to the independent modular nature of the program, it may be used by itself and not necessarily in conjunction with other segments of the POTFAN system.

Convergence Speed of a Dynamical System for Sparse Recovery

NASA Astrophysics Data System (ADS)

Balavoine, Aurele; Rozell, Christopher J.; Romberg, Justin

2013-09-01

This paper studies the convergence rate of a continuous-time dynamical system for L1-minimization, known as the Locally Competitive Algorithm (LCA). Solving L1-minimization problems efficiently and rapidly is of great interest to the signal processing community, as these programs have been shown to recover sparse solutions to underdetermined systems of linear equations and come with strong performance guarantees. The LCA under study differs from the typical L1 solver in that it operates in continuous time: instead of being specified by discrete iterations, it evolves according to a system of nonlinear ordinary differential equations. The LCA is constructed from simple components, giving it the potential to be implemented as a large-scale analog circuit. The goal of this paper is to give guarantees on the convergence time of the LCA system. To do so, we analyze how the LCA evolves as it is recovering a sparse signal from underdetermined measurements. We show that under appropriate conditions on the measurement matrix and the problem parameters, the path the LCA follows can be described as a sequence of linear differential equations, each with a small number of active variables. This allows us to relate the convergence time of the system to the restricted isometry constant of the matrix. Interesting parallels to sparse-recovery digital solvers emerge from this study. Our analysis covers both the noisy and noiseless settings and is supported by simulation results.
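The continuous-time dynamics described above can be simulated digitally with a simple forward-Euler integration. The Python sketch below assumes the commonly stated form of the LCA, tau du/dt = -u + Phi^T y - (Phi^T Phi - I) a with a the soft-thresholded state, and unit-norm columns of the measurement matrix. The step size, time constant and function names are assumptions, and the paper itself analyses the continuous system rather than this discretization.

```python
def soft_threshold(value, lam):
    """Soft-thresholding activation used for L1-minimization."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lca(phi, y, lam, tau=1.0, dt=0.01, steps=5000):
    """Forward-Euler simulation of the Locally Competitive Algorithm.

    phi is an m-by-n matrix (list of rows) with unit-norm columns,
    y the measurement vector, lam the threshold (L1 weight).
    """
    m, n = len(phi), len(phi[0])
    drive = [sum(phi[r][i] * y[r] for r in range(m)) for i in range(n)]
    gram = [[sum(phi[r][i] * phi[r][j] for r in range(m)) for j in range(n)]
            for i in range(n)]
    u = [0.0] * n
    for _ in range(steps):
        a = [soft_threshold(ui, lam) for ui in u]
        du = []
        for i in range(n):
            # Lateral inhibition from the other active nodes; the j == i term
            # of (Phi^T Phi - I) vanishes because columns have unit norm.
            inhibition = sum(gram[i][j] * a[j] for j in range(n) if j != i)
            du.append((-u[i] + drive[i] - inhibition) / tau)
        u = [ui + dt * dui for ui, dui in zip(u, du)]
    return [soft_threshold(ui, lam) for ui in u]
```

While the set of nodes above threshold stays fixed, the update is linear in the state, which is the piecewise-linear structure the paper exploits in its convergence analysis.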
Matrix decomposition graphics processing unit solver for Poisson image editing

NASA Astrophysics Data System (ADS)

Lei, Zhao; Wei, Li

2012-10-01

In recent years, gradient-domain methods have been widely discussed in the image processing field, including seamless cloning and image stitching. These algorithms are commonly carried out by solving a large sparse linear system: the Poisson equation. However, solving the Poisson equation is a computationally and memory intensive task, which makes it unsuitable for real-time image editing. A new matrix decomposition graphics processing unit (GPU) solver (MDGS) is proposed to settle the problem. A matrix decomposition method is used to distribute the work among GPU threads, so that MDGS will take full advantage of the computing power of current GPUs. Additionally, MDGS is a hybrid solver (it combines both direct and iterative techniques) and has a two-level architecture. These enable MDGS to generate solutions identical to those of the common Poisson methods and to achieve a high convergence rate in most cases. This approach is advantageous in terms of parallelizability, real-time image processing capability, low memory usage and breadth of application.
TOUGH3: A new efficient version of the TOUGH suite of multiphase flow and transport simulators

NASA Astrophysics Data System (ADS)

Jung, Yoojin; Pau, George Shu Heng; Finsterle, Stefan; Pollyea, Ryan M.

2017-11-01

The TOUGH suite of nonisothermal multiphase flow and transport simulators has been updated by various developers over many years to address a vast range of challenging subsurface problems. The increasing complexity of the simulated processes as well as the growing size of model domains that need to be handled call for an improvement in the simulator's computational robustness and efficiency. Moreover, modifications have been frequently introduced independently, resulting in multiple versions of TOUGH that (1) led to inconsistencies in feature implementation and usage, (2) made code maintenance and development inefficient, and (3) caused confusion to users and developers. TOUGH3, a new base version of TOUGH, addresses these issues. It consolidates both the serial (TOUGH2 V2.1) and parallel (TOUGH2-MP V2.0) implementations, enabling simulations to be performed on desktop computers and supercomputers using a single code. New PETSc parallel linear solvers are added to the existing serial solvers of TOUGH2 and the Aztec solver used in TOUGH2-MP. The PETSc solvers generally perform better than the Aztec solvers in parallel and the internal TOUGH3 linear solver in serial. TOUGH3 also incorporates many new features, addresses bugs, and improves the flexibility of data handling. Due to the improved capabilities and usability, TOUGH3 is more robust and efficient for solving tough and computationally demanding problems in diverse scientific and practical applications related to subsurface flow modeling.
Thyra Abstract Interface Package

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bartlett, Roscoe A.

2005-09-01

Thyra primarily defines a set of abstract C++ class interfaces needed for the development of abstract numerical algorithms (ANAs) such as iterative linear solvers and transient solvers, all the way up to optimization. At the foundation of these interfaces are abstract C++ classes for vectors, vector spaces, linear operators and multi-vectors. Also included in the Thyra package is C++ code for creating concrete vector, vector space, linear operator, and multi-vector subclasses as well as other utilities to aid in the development of ANAs. Currently, very general and efficient concrete subclass implementations exist for serial and SPMD in-core vectors and multi-vectors. Code also currently exists for testing objects and providing composite objects such as product vectors.
A survey of packages for large linear systems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wu, Kesheng; Milne, Brent

2000-02-11

This paper evaluates portable software packages for the iterative solution of very large sparse linear systems on parallel architectures. While we cannot hope to tell individual users which package will best suit their needs, we do hope that our systematic evaluation provides essential unbiased information about the packages and that the evaluation process may serve as an example of how to evaluate these packages. The information contained here includes feature comparisons, usability evaluations and performance characterizations. This review is primarily focused on self-contained packages that can be easily integrated into an existing program and are capable of computing solutions to very large sparse linear systems of equations. More specifically, it concentrates on portable parallel linear system solution packages that provide iterative solution schemes and related preconditioning schemes because iterative methods are more frequently used than competing schemes such as direct methods. The eight packages evaluated are: Aztec, BlockSolve, ISIS++, LINSOL, P-SPARSLIB, PARASOL, PETSc, and PINEAPL. Among the eight portable parallel iterative linear system solvers reviewed, we recommend PETSc and Aztec for most application programmers because they have well-designed user interfaces, extensive documentation and very responsive user support. Both PETSc and Aztec are written in the C language and are callable from Fortran. For those users interested in using Fortran 90, PARASOL is a good alternative. ISIS++ is a good alternative for those who prefer the C++ language. Both PARASOL and ISIS++ are relatively new and are continuously evolving. Thus their user interfaces may change. In general, those packages written in Fortran 77 are more cumbersome to use because the user may need to directly deal with a number of arrays of varying sizes. Languages like C++ and Fortran 90 offer more convenient data encapsulation mechanisms which make it easier to implement a clean and intuitive user interface. In addition to reviewing these portable parallel iterative solver packages, we also provide a more cursory assessment of a range of related packages, from specialized parallel preconditioners to direct methods for sparse linear systems.
Simultaneous elastic parameter inversion in 2-D/3-D TTI medium combined later arrival times

NASA Astrophysics Data System (ADS)

Bai, Chao-ying; Wang, Tao; Yang, Shang-bei; Li, Xing-wang; Huang, Guo-jiao

2016-04-01

Traditional traveltime inversion for anisotropic media is, in general, based on a "weak" assumption in the anisotropic property, which simplifies both the forward part (ray tracing is performed once only) and the inversion part (a linear inversion solver is possible). But for some real applications, a general (both "weak" and "strong") anisotropic medium should be considered. In such cases, one has to develop a ray tracing algorithm to handle the general (including "strong") anisotropic medium and also to design a non-linear inversion solver for later tomography. Meanwhile, it is constructive to investigate how much the tomographic resolution can be improved by introducing the later arrivals. With this motivation, we incorporated our newly developed ray tracing algorithm (multistage irregular shortest-path method) for general anisotropic media with a non-linear inversion solver (a damped minimum norm, constrained least squares problem with a conjugate gradient approach) to formulate a non-linear inversion solver for anisotropic media. This anisotropic traveltime inversion procedure is able to combine the later (reflected) arrival times. Both 2-D/3-D synthetic inversion experiments and comparison tests show that (1) the proposed anisotropic traveltime inversion scheme is able to recover the high contrast anomalies and (2) it is possible to improve the tomographic resolution by introducing the later (reflected) arrivals, but not as much as expected in the isotropic medium, because the different velocity (qP, qSV and qSH) sensitivities (or derivatives) with respect to the different elastic parameters are not the same but are also dependent on the inclination angle.
NAS Experiences of Porting CM Fortran Codes to HPF on IBM SP2 and SGI Power Challenge

NASA Technical Reports Server (NTRS)

Saini, Subhash

1995-01-01

Current Connection Machine (CM) Fortran codes developed for the CM-2 and the CM-5 represent an important class of parallel applications. Several users have employed CM Fortran codes in production mode on the CM-2 and the CM-5 for the last five to six years, constituting a heavy investment in terms of cost and time. With Thinking Machines Corporation's decision to withdraw from the hardware business and with the decommissioning of many CM-2 and CM-5 machines, the best way to protect the substantial investment in CM Fortran codes is to port the codes to High Performance Fortran (HPF) on highly parallel systems. HPF is very similar to CM Fortran and thus represents a natural transition. Conversion issues involved in porting CM Fortran codes on the CM-5 to HPF are presented. In particular, the differences between data distribution directives and the CM Fortran Utility Routines Library, as well as the equivalent functionality in the HPF Library, are discussed. Several CM Fortran codes (the Cannon algorithm for matrix-matrix multiplication, a linear solver for Ax=b, 1-D convolution for 2-D datasets, a Laplace's equation solver, and Direct Simulation Monte Carlo (DSMC) codes) have been ported to Subset HPF on the IBM SP2 and the SGI Power Challenge. Speedup ratios versus number of processors for the linear solver and DSMC code are presented.
Template-Based 3D Reconstruction of Non-rigid Deformable Object from Monocular Video

NASA Astrophysics Data System (ADS)

Liu, Yang; Peng, Xiaodong; Zhou, Wugen; Liu, Bo; Gerndt, Andreas

2018-06-01

In this paper, we propose a template-based 3D surface reconstruction system for non-rigid deformable objects from a monocular video sequence. First, we generate a semi-dense template of the target object with a structure-from-motion method using a subsequence of the video. This video can be captured by a rigidly moving camera oriented toward the static target object or by a static camera observing the rigidly moving target object. Then, with the reference template mesh as input and based on the framework of classical template-based methods, we solve an energy minimization problem to obtain the correspondence between the template and every frame, which yields the time-varying mesh that represents the deformation of the object. The energy terms combine photometric cost, temporal and spatial smoothness costs, and an as-rigid-as-possible cost that allows elastic deformation. An easy and controllable solution for generating the semi-dense template for complex objects is presented. In addition, we use an effective iterative Schur-based linear solver for the energy minimization problem. The experimental evaluation presents qualitative reconstruction results for deforming objects on real sequences. Compared with results obtained using other templates as input, the reconstructions based on our template are more accurate and detailed in certain regions. The experimental results also show that the linear solver we used is more efficient than a traditional conjugate gradient based solver.
Design of a Modular Monolithic Implicit Solver for Multi-Physics Applications

NASA Technical Reports Server (NTRS)

Carton De Wiart, Corentin; Diosady, Laslo T.; Garai, Anirban; Burgess, Nicholas; Blonigan, Patrick; Ekelschot, Dirk; Murman, Scott M.

2018-01-01

The design of a modular multi-physics high-order space-time finite-element framework is presented together with its extension to allow monolithic coupling of different physics. One of the main objectives of the framework is to perform efficient high-fidelity simulations of capsule/parachute systems. This problem requires simulating multiple physics including, but not limited to, the compressible Navier-Stokes equations, the dynamics of a moving body with mesh deformations and adaptation, the linear shell equations, non-reflective boundary conditions and wall modeling. The solver is based on high-order space-time finite-element methods. Continuous, discontinuous and C1-discontinuous Galerkin methods are implemented, allowing one to discretize various physical models. Tangent and adjoint sensitivity analysis are also targeted in order to conduct gradient-based optimization, error estimation, mesh adaptation, and flow control, adding another layer of complexity to the framework. The decisions made to tackle these challenges are presented. The discussion focuses first on the "single-physics" solver and later on its extension to the monolithic coupling of different physics. The implementation of different physics modules relevant to the capsule/parachute system is also presented. Finally, examples of coupled computations are presented, paving the way to the simulation of the full capsule/parachute system.
Implicit filtered P_N for high-energy density thermal radiation transport using discontinuous Galerkin finite elements

DOE Office of Scientific and Technical Information (OSTI.GOV)

Laboure, Vincent M.; McClarren, Ryan G.; Hauck, Cory D.

2016-09-15

In this work, we provide a fully-implicit implementation of the time-dependent, filtered spherical harmonics (FP_N) equations for non-linear, thermal radiative transfer. We investigate local filtering strategies and analyze the effect of the filter on the conditioning of the system, showing in particular that the filter improves the convergence properties of the iterative solver. We also investigate numerically the rigorous error estimates derived in the linear setting, to determine whether they hold also for the non-linear case. Finally, we simulate a standard test problem on an unstructured mesh and make comparisons with implicit Monte Carlo (IMC) calculations.
The Use of Iterative Linear-Equation Solvers in Codes for Large Systems of Stiff IVPs (Initial-Value Problems) for ODEs (Ordinary Differential Equations).

DTIC Science & Technology

1984-04-01

...numerical solution of systems of stiff IVPs for ODEs. Frequently, a substantial portion of the total computational work and storage required to solve stiff ... except, possibly, for special classes of problems. That is, a system of linear or nonlinear algebraic equations must be solved at each step of the numerical ... conjugate gradient method [43] is a well-known example, have proven to be particularly effective for solving the linear systems that arise in the numerical...
Three-dimensional Finite Element Formulation and Scalable Domain Decomposition for High Fidelity Rotor Dynamic Analysis

NASA Technical Reports Server (NTRS)

Datta, Anubhav; Johnson, Wayne R.

2009-01-01

This paper has two objectives. The first objective is to formulate a 3-dimensional Finite Element Model for the dynamic analysis of helicopter rotor blades. The second objective is to implement and analyze a dual-primal iterative substructuring based Krylov solver, that is parallel and scalable, for the solution of the 3-D FEM analysis. The numerical and parallel scalability of the solver is studied using two prototype problems - one for ideal hover (symmetric) and one for a transient forward flight (non-symmetric) - both carried out on up to 48 processors. In both hover and forward flight conditions, a perfect linear speed-up is observed, for a given problem size, up to the point of substructure optimality. Substructure optimality and the linear parallel speed-up range are both shown to depend on the problem size as well as on the selection of the coarse problem. With a larger problem size, linear speed-up is restored up to the new substructure optimality. The solver also scales with problem size - even though this conclusion is premature given the small prototype grids considered in this study.
A Matlab-based finite-difference solver for the Poisson problem with mixed Dirichlet-Neumann boundary conditions

NASA Astrophysics Data System (ADS)

Reimer, Ashton S.; Cheviakov, Alexei F.

2013-03-01

A Matlab-based finite-difference numerical solver for the Poisson equation for a rectangle and a disk in two dimensions, and a spherical domain in three dimensions, is presented. The solver is optimized for handling an arbitrary combination of Dirichlet and Neumann boundary conditions, and allows for full user control of mesh refinement. The solver routines utilize effective and parallelized sparse vector and matrix operations. Computations exhibit high speeds, numerical stability with respect to mesh size and mesh refinement, and acceptable error values even on desktop computers.
Catalogue identifier: AENQ_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AENQ_v1_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: GNU General Public License v3.0
No. of lines in distributed program, including test data, etc.: 102793
No. of bytes in distributed program, including test data, etc.: 369378
Distribution format: tar.gz
Programming language: Matlab 2010a.
Computer: PC, Macintosh.
Operating system: Windows, OSX, Linux.
RAM: 8 GB (8,589,934,592 bytes)
Classification: 4.3.
Nature of problem: To solve the Poisson problem in a standard domain with “patchy surface”-type (strongly heterogeneous) Neumann/Dirichlet boundary conditions.
Solution method: Finite difference with mesh refinement.
Restrictions: Spherical domain in 3D; rectangular domain or a disk in 2D.
Unusual features: Choice between mldivide and an iterative solver for the solution of the large systems of linear algebraic equations that arise. Full user control of Neumann/Dirichlet boundary conditions and mesh refinement.
Running time: Depending on the number of points taken and the geometry of the domain, the routine may take from less than a second to several hours to execute.
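To make the idea of mixed boundary conditions concrete, here is a minimal one-dimensional sketch in Python rather than Matlab. It is not the published package: the function names, the ghost-point treatment of the Neumann end, and the Thomas (tridiagonal) direct solve are all assumptions chosen for brevity. It solves -u'' = f on [0, 1] with a Dirichlet value at x = 0 and a prescribed slope at x = 1.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system.

    lower[0] and upper[-1] are never read.
    """
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def poisson_1d_mixed(f, n, u_left, du_right):
    """Solve -u'' = f on [0, 1] with u(0) = u_left and u'(1) = du_right.

    Uses n >= 2 equal intervals and returns the n + 1 nodal values.
    """
    h = 1.0 / n
    # Unknowns are u_1 .. u_n; every equation is multiplied through by h^2.
    lower = [-1.0] * n
    diag = [2.0] * n
    upper = [-1.0] * n
    rhs = [h * h * f(i * h) for i in range(1, n + 1)]
    # Dirichlet end: the known value u_0 moves to the right-hand side.
    rhs[0] += u_left
    # Neumann end: the ghost value u_{n+1} = u_{n-1} + 2 h u'(1) is eliminated.
    lower[-1] = -2.0
    rhs[-1] += 2.0 * h * du_right
    return [u_left] + solve_tridiagonal(lower, diag, upper, rhs)


# Example: f = 1, u(0) = 0, u'(1) = 0 has the exact solution u = x - x^2 / 2.
u = poisson_1d_mixed(lambda x: 1.0, 8, 0.0, 0.0)
```

Because central differences are exact for quadratics, the nodal values in this example match the analytic solution to rounding error. In two or three dimensions the same construction leads to a large sparse matrix, which is where the choice between a direct and an iterative solve mentioned in the program summary becomes relevant.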
Determining the Optimal Values of Exponential Smoothing Constants--Does Solver Really Work?

ERIC Educational Resources Information Center

Ravinder, Handanhal V.

2013-01-01

A key issue in exponential smoothing is the choice of the values of the smoothing constants used. One approach that is becoming increasingly popular in introductory management science and operations management textbooks is the use of Solver, an Excel-based non-linear optimizer, to identify values of the smoothing constants that minimize a measure…
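As a hedged illustration of the optimization problem described above, the following Python sketch evaluates the mean squared one-step forecast error of simple (single-constant) exponential smoothing and picks the best constant by a coarse grid search. The function names, the choice of MSE as the error measure, and the initialization of the first forecast with the first observation are assumptions; the record does not say which conventions the paper uses.

```python
def ses_mse(series, alpha):
    """Mean squared one-step-ahead error of simple exponential smoothing."""
    forecast = series[0]  # assumption: first forecast equals first observation
    squared_error = 0.0
    for actual in series[1:]:
        squared_error += (actual - forecast) ** 2
        forecast = alpha * actual + (1.0 - alpha) * forecast
    return squared_error / (len(series) - 1)


def best_alpha(series, steps=100):
    """Grid search over alpha in [0, 1]; returns the first minimizer found."""
    candidates = [k / steps for k in range(steps + 1)]
    return min(candidates, key=lambda a: ses_mse(series, a))


# A steadily trending series is tracked best by the largest constant.
alpha_star = best_alpha([1.0, 2.0, 3.0, 4.0, 5.0])
```

A grid search of this kind is slower than a gradient-based optimizer, but it cannot stop at a local minimum between grid points that a coarser look would have exposed, so it can serve as a cross-check on whatever value a spreadsheet optimizer reports.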
Comparing direct and iterative equation solvers in a large structural analysis software system

NASA Technical Reports Server (NTRS)

Poole, E. L.

1991-01-01

Two direct Choleski equation solvers and two iterative preconditioned conjugate gradient (PCG) equation solvers used in a large structural analysis software system are described. The two direct solvers are implementations of the Choleski method for variable-band matrix storage and sparse matrix storage. The two iterative PCG solvers include the Jacobi conjugate gradient method and an incomplete Choleski conjugate gradient method. The performance of the direct and iterative solvers is compared by solving several representative structural analysis problems. Some key factors affecting the performance of the iterative solvers relative to the direct solvers are identified.
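The contrast between the two solver families can be sketched on a tiny dense system. The code below is an illustrative assumption, not the software described in the report: it uses dense list-of-lists storage instead of variable-band or sparse storage, and all function names are hypothetical. It factors a symmetric positive definite matrix with Cholesky and, separately, solves the same system with conjugate gradients preconditioned by the matrix diagonal (Jacobi).

```python
import math


def cholesky_solve(A, b):
    """Direct solve of A x = b for symmetric positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    # Forward substitution: L y = b.
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    # Back substitution: L^T x = y.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def jacobi_pcg(A, b, tol=1e-10, max_iter=200):
    """Conjugate gradients with diagonal (Jacobi) preconditioning."""
    n = len(b)

    def matvec(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    inv_diag = [1.0 / A[i][i] for i in range(n)]
    x = [0.0] * n
    r = list(b)
    z = [inv_diag[i] * r[i] for i in range(n)]
    p = list(z)
    rz = dot(r, z)
    b_norm = math.sqrt(dot(b, b))
    for iteration in range(1, max_iter + 1):
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        if math.sqrt(dot(r, r)) <= tol * b_norm:
            return x, iteration
        z = [inv_diag[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    return x, max_iter


A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
x_direct = cholesky_solve(A, b)
x_iterative, iterations = jacobi_pcg(A, b)
```

On a problem this small the two answers agree to rounding and the comparison says nothing about speed. The trade-off studied in the report only appears at scale, where factorization fill-in drives the cost of the direct solvers and conditioning drives the iteration count of PCG.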
MAFIA Version 4

DOE Office of Scientific and Technical Information (OSTI.GOV)

Weiland, T.; Bartsch, M.; Becker, U.

1997-02-01

MAFIA Version 4.0 is an almost completely new version of the general purpose electromagnetic simulator that has been known for 13 years. The major improvements concern the new graphical user interface based on state of the art technology as well as a series of new solvers for new physics problems. MAFIA now covers heat distribution, electro-quasistatics, S-parameters in frequency domain, particle beam tracking in linear accelerators, acoustics and even elastodynamics. The solvers that were available in earlier versions have also been improved and/or extended, for example the complex eigenmode solver and the 2D-3D coupled PIC solvers. Time domain solvers have new waveguide boundary conditions with an extremely low reflection even near cutoff frequency, and concentrated elements are available as well as a variety of signal processing options. Probably the most valuable addition is the recursive sub-grid capability, which enables modeling of very small details in large structures. © 1997 American Institute of Physics.
Three-Dimensional Nacelle Aeroacoustics Code With Application to Impedance Eduction

NASA Technical Reports Server (NTRS)

Watson, Willie R.

2000-01-01

A three-dimensional nacelle acoustics code that accounts for uniform mean flow and variable surface impedance liners is developed. The code is linked to a commercial version of the NASA-developed General Purpose Solver (for solution of linear systems of equations) in order to obtain the capability to study high frequency waves that may require millions of grid points for resolution. Detailed, single-processor statistics for the performance of the solver in rigid and soft-wall ducts are presented. Over the range of frequencies of current interest in nacelle liner research, noise attenuation levels predicted from the code were in excellent agreement with those predicted from mode theory. The equation solver is memory efficient, requiring only a small fraction of the memory available on modern computers. As an application, the code is combined with an optimization algorithm and used to educe the impedance spectrum of a ceramic liner. The primary problem with using the code to perform optimization studies at frequencies above I1kHz is the excessive CPU time (a major portion of which is matrix assembly). The study recommends that research be directed toward development of a rapid sparse assembler and exploitation of the multiprocessor capability of the solver to further reduce CPU time.
General purpose nonlinear system solver based on Newton-Krylov method.

DOE Office of Scientific and Technical Information (OSTI.GOV)

2013-12-01

KINSOL is part of a software family called SUNDIALS: SUite of Nonlinear and Differential/Algebraic equation Solvers [1]. KINSOL is a general-purpose nonlinear system solver based on Newton-Krylov and fixed-point solver technologies [2].
Applying EXCEL Solver to a watershed management goal-programming problem

Treesearch

J. E. de Steiguer

2000-01-01

This article demonstrates the application of the EXCEL® spreadsheet linear programming (LP) solver to a watershed management multiple use goal programming (GP) problem. The data used to demonstrate the application are from a published study for a watershed in northern Colorado. GP has been used by natural resource managers for many years. However, the GP solution by means...

New algorithms for field-theoretic block copolymer simulations: Progress on using adaptive-mesh refinement and sparse matrix solvers in SCFT calculations

NASA Astrophysics Data System (ADS)

Sides, Scott; Jamroz, Ben; Crockett, Robert; Pletzer, Alexander

2012-02-01

Self-consistent field theory (SCFT) for dense polymer melts has been highly successful in describing complex morphologies in block copolymers. Field-theoretic simulations such as these are able to access large length and time scales that are difficult or impossible for particle-based simulations such as molecular dynamics. The modified diffusion equations that arise as a consequence of the coarse-graining procedure in the SCF theory can be efficiently solved with a pseudo-spectral (PS) method that uses fast-Fourier transforms on uniform Cartesian grids. However, PS methods can be difficult to apply in many block copolymer SCFT simulations (e.g., confinement, interface adsorption) in which small spatial regions might require finer resolution than most of the simulation grid. Progress on using new solver algorithms to address these problems will be presented. The Tech-X Chompst project aims at marrying the best of adaptive mesh refinement with linear matrix solver algorithms. The Tech-X code PolySwift++ is an SCFT simulation platform that leverages ongoing development in coupling Chombo, a package for solving PDEs via block-structured AMR calculations and embedded boundaries, with PETSc, a toolkit that includes a large assortment of sparse linear solvers.
On numerical instabilities of Godunov-type schemes for strong shocks

NASA Astrophysics Data System (ADS)

Xie, Wenjia; Li, Wei; Li, Hua; Tian, Zhengyu; Pan, Sha

2017-12-01

It is well known that low diffusion Riemann solvers with minimal smearing on contact and shear waves are vulnerable to shock instability problems, including the carbuncle phenomenon. In the present study, we concentrate on exploring where the instability grows out and how the dissipation inherent in Riemann solvers affects the unstable behaviors. With the help of numerical experiments and a linearized analysis method, it has been found that the shock instability is strongly related to the unstable modes of intermediate states inside the shock structure. The consistency of mass flux across the normal shock is needed for a Riemann solver to capture strong shocks stably. The famous carbuncle phenomenon is interpreted as the consequence of the inconsistency of mass flux across the normal shock for a low diffusion Riemann solver. Based on the results of numerical experiments and the linearized analysis, a robust Godunov-type scheme with a simple cure for the shock instability is suggested. With only the dissipation corresponding to shear waves introduced in the vicinity of strong shocks, the instability problem is circumvented. Numerical results of several carefully chosen strong shock wave problems are investigated to demonstrate the robustness of the proposed scheme.
High Productivity Computing Systems Analysis and Performance

DTIC Science & Technology

2005-07-01

cubic grid Discrete Math Global Updates per second (GUP/S) RandomAccess Paper & Pencil Contact Bob Lucas (ISI) Multiple Precision none...can be found at the web site. One of the HPCchallenge codes, RandomAccess, is derived from the HPCS discrete math benchmarks that we released, and...Kernels Discrete Math … Graph Analysis … Linear Solvers … Signal Processing Execution Bounds Execution Indicators 6 Scalable Compact
From 2D to 3D modelling in long term tectonics: Modelling challenges and HPC solutions (Invited)

NASA Astrophysics Data System (ADS)

Le Pourhiet, L.; May, D.

2013-12-01

Over the last decades, 3D thermo-mechanical codes have been made available to the long term tectonics community either as open source (Underworld, Gale) or with more limited access (Fantom, Elvis3D, Douar, LaMem, etc.). However, to date, few published results using these methods have included the coupling between crustal and lithospheric dynamics at large strain. The fact that these computations are computationally expensive is not the primary reason for the relatively slow development of 3D modeling in the long term tectonics community, as compared to the rapid development observed within the mantle dynamics community, or in the short-term tectonics field. Long term tectonics problems have specific issues not found in either of these two fields, including large strain (not an issue for short-term tectonics), the inclusion of a free surface and the occurrence of large viscosity contrasts. The first issue is typically eliminated using a combined marker-ALE method instead of a fully Lagrangian method; however, the marker-ALE approach can pose some algorithmic challenges in a massively parallel environment. The last two issues are more problematic because they affect the convergence of the linear/non-linear solver and the memory cost. Two options have been tested so far: using low order elements and solving with a sparse direct solver, or using higher order stable elements together with a multi-grid solver. The first option is simpler to code and to use but reaches its limit at around 80^3 low order elements. The second option requires more operations but allows using iterative solvers on extremely large computers. In this presentation, I will describe the design philosophy and highlight results obtained using a code from the second class of methods. The presentation will be oriented from an end-user point of view, using an application from 3D continental break up to illustrate key concepts. The description will proceed point by point from implementing physics into the code, to dealing with specific issues related to solving the discrete system of non linear equations.
Parallel SOR methods with a parabolic-diffusion acceleration technique for solving an unstructured-grid Poisson equation on 3D arbitrary geometries

NASA Astrophysics Data System (ADS)

Zapata, M. A. Uh; Van Bang, D. Pham; Nguyen, K. D.

2016-05-01

This paper presents a parallel algorithm for the finite-volume discretisation of the Poisson equation on three-dimensional arbitrary geometries. The proposed method is formulated by using a 2D horizontal block domain decomposition and interprocessor data communication techniques with message passing interface. The horizontal unstructured-grid cells are reordered according to the neighbouring relations and decomposed into blocks using a load-balanced distribution to give all processors an equal amount of elements. In this algorithm, two parallel successive over-relaxation methods are presented: a multi-colour ordering technique for unstructured grids based on distributed memory and a block method using reordering index following similar ideas of the partitioning for structured grids. In all cases, the parallel algorithms are combined with an accelerated iterative solver. This solver is based on a parabolic-diffusion equation introduced to obtain faster solutions of the linear systems arising from the discretisation. Numerical results are given to evaluate the performance of the methods, showing speedups better than linear.
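The reason colour orderings matter for parallel SOR can be sketched on the simplest possible case. The Python code below is a serial, structured-grid, two-colour (red-black) simplification and not the unstructured multi-colour or block method of the paper; the function names, the unit-square domain with zero Dirichlet values, and the relaxation factor are assumptions. Within one colour, no cell reads a value that is being updated in the same half-sweep, so all cells of that colour could be processed concurrently.

```python
import math


def source(x, y):
    """Right-hand side chosen so that u = sin(pi x) sin(pi y) solves -Laplace(u) = f."""
    return 2.0 * math.pi ** 2 * math.sin(math.pi * x) * math.sin(math.pi * y)


def red_black_sor(f, n, omega, sweeps):
    """SOR for -Laplace(u) = f on the unit square with u = 0 on the boundary.

    n is the number of interior points per direction; the returned grid has
    (n + 2) x (n + 2) entries including the boundary.
    """
    h = 1.0 / (n + 1)
    u = [[0.0] * (n + 2) for _ in range(n + 2)]
    for _ in range(sweeps):
        for colour in (0, 1):
            # Cells of one colour only depend on cells of the other colour.
            for i in range(1, n + 1):
                for j in range(1, n + 1):
                    if (i + j) % 2 != colour:
                        continue
                    gauss_seidel = 0.25 * (
                        u[i - 1][j] + u[i + 1][j] + u[i][j - 1] + u[i][j + 1]
                        + h * h * f(i * h, j * h)
                    )
                    u[i][j] += omega * (gauss_seidel - u[i][j])
    return u


grid = red_black_sor(source, 15, 1.7, 150)
centre_value = grid[8][8]  # node at (0.5, 0.5), where the exact solution is 1
```

On an unstructured grid two colours are generally not enough, which is why the paper refers to a multi-colour ordering; the principle of updating one independent set at a time is the same.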
Three dimensional modelling of earthquake rupture cycles on frictional faults

NASA Astrophysics Data System (ADS)

Simpson, Guy; May, Dave

2017-04-01

We are developing an efficient MPI-parallel numerical method to simulate earthquake sequences on preexisting faults embedded within a three dimensional viscoelastic half-space. We solve the velocity form of the elasto(visco)dynamic equations using a continuous Galerkin Finite Element Method on an unstructured pentahedral mesh, which thus permits local spatial refinement in the vicinity of the fault. Frictional sliding is coupled to the viscoelastic solid via rate- and state-dependent friction laws using the split-node technique. Our coupled formulation employs a Picard-type non-linear solver with a fully implicit, first order accurate time integrator that utilises an adaptive time step that efficiently evolves the system through multiple seismic cycles. The implementation leverages advanced parallel solvers, preconditioners and linear algebra from the Portable Extensible Toolkit for Scientific Computing (PETSc) library. The model can treat heterogeneous frictional properties and stress states on the fault and surrounding solid as well as non-planar fault geometries. Preliminary tests show that the model successfully reproduces dynamic rupture on a vertical strike-slip fault in a half-space governed by rate-state friction with the ageing law.
An implementation of the look-ahead Lanczos algorithm for non-Hermitian matrices, part 2

NASA Technical Reports Server (NTRS)

Freund, Roland W.; Nachtigal, Noel M.

1990-01-01

It is shown how the look-ahead Lanczos process (combined with a quasi-minimal residual (QMR) approach) can be used to develop a robust black box solver for large sparse non-Hermitian linear systems. Details of an implementation of the resulting QMR algorithm are presented. It is demonstrated that the QMR method is closely related to the biconjugate gradient (BCG) algorithm; however, unlike BCG, the QMR algorithm has smooth convergence curves and good numerical properties. We report numerical experiments with our implementation of the look-ahead Lanczos algorithm, both for eigenvalue problems and linear systems. Also, program listings of FORTRAN implementations of the look-ahead algorithm and the QMR method are included.
Iterative-method performance evaluation for multiple vectors associated with a large-scale sparse matrix

NASA Astrophysics Data System (ADS)

Imamura, Seigo; Ono, Kenji; Yokokawa, Mitsuo

2016-07-01

Ensemble computing, which is an instance of capacity computing, is an effective computing scenario for exascale parallel supercomputers. In ensemble computing, there are multiple linear systems associated with a common coefficient matrix. We improve the performance of iterative solvers for multiple vectors by solving them at the same time, that is, by solving for the product of the matrices. We implemented several iterative methods and compared their performance. The maximum performance on Sparc VIIIfx was 7.6 times higher than that of a naïve implementation. Finally, to deal with the different convergence processes of linear systems, we introduced a control method to eliminate the calculation of already converged vectors.
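The two ideas in this abstract, advancing several right-hand sides together against one shared matrix and skipping vectors that have already converged, can be sketched as follows. This is an assumption-laden illustration in plain Python using conjugate gradients on a small symmetric positive definite matrix; the function name, the choice of CG, and the stopping rule are not taken from the paper.

```python
def multi_rhs_cg(A, rhs_list, tol=1e-10, max_iter=100):
    """Run CG for several right-hand sides in lockstep over one matrix.

    Returns the solutions and the number of iterations spent on each system.
    Converged systems are removed from the active set and cost nothing more.
    """
    n = len(A)
    m = len(rhs_list)

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    X = [[0.0] * n for _ in range(m)]
    R = [list(b) for b in rhs_list]
    P = [list(b) for b in rhs_list]
    rr = [dot(r, r) for r in R]
    target = [tol * tol * dot(b, b) for b in rhs_list]
    iterations = [0] * m
    active = [k for k in range(m) if rr[k] > target[k]]
    for _ in range(max_iter):
        if not active:
            break
        # One pass over the rows of A serves every active vector.
        AP = {k: [0.0] * n for k in active}
        for i, row in enumerate(A):
            for k in active:
                AP[k][i] = sum(a * p for a, p in zip(row, P[k]))
        still_active = []
        for k in active:
            alpha = rr[k] / dot(P[k], AP[k])
            X[k] = [x + alpha * p for x, p in zip(X[k], P[k])]
            R[k] = [r - alpha * ap for r, ap in zip(R[k], AP[k])]
            rr_new = dot(R[k], R[k])
            iterations[k] += 1
            if rr_new > target[k]:
                beta = rr_new / rr[k]
                P[k] = [r + beta * p for r, p in zip(R[k], P[k])]
                still_active.append(k)
            rr[k] = rr_new
        active = still_active
    return X, iterations


A = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
# The first right-hand side is an eigenvector of A, so it converges in one step.
solutions, iterations = multi_rhs_cg(A, [[1.0, 0.0, -1.0], [1.0, 2.0, 3.0]])
```

In this sketch the benefit of the shared pass is only structural. On real hardware the gain would come from reading each matrix row from memory once for all active vectors, and the active-set bookkeeping keeps that saving from being spent on systems that no longer need work.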
Diagnosis of Enzyme Inhibition Using Excel Solver: A Combined Dry and Wet Laboratory Exercise

ERIC Educational Resources Information Center

Dias, Albino A.; Pinto, Paula A.; Fraga, Irene; Bezerra, Rui M. F.

2014-01-01

In enzyme kinetic studies, linear transformations of the Michaelis-Menten equation, such as the Lineweaver-Burk double-reciprocal transformation, present some constraints. The linear transformation distorts the experimental error and the relationship between "x" and "y" axes; consequently, linear regression of transformed data…
Adaptive mesh fluid simulations on GPU

NASA Astrophysics Data System (ADS)

Wang, Peng; Abel, Tom; Kaehler, Ralf

2010-10-01

We describe an implementation of compressible inviscid fluid solvers with block-structured adaptive mesh refinement on Graphics Processing Units using NVIDIA's CUDA. We show that a class of high resolution shock capturing schemes can be mapped naturally on this architecture. Using the method of lines approach with the second order total variation diminishing Runge-Kutta time integration scheme, piecewise linear reconstruction, and a Harten-Lax-van Leer Riemann solver, we achieve an overall speedup of approximately 10 times faster execution on one graphics card as compared to a single core on the host computer. We attain this speedup in uniform grid runs as well as in problems with deep AMR hierarchies. Our framework can readily be applied to more general systems of conservation laws and extended to higher order shock capturing schemes. This is shown directly by an implementation of a magneto-hydrodynamic solver and comparing its performance to the pure hydrodynamic case. Finally, we also combined our CUDA parallel scheme with MPI to make the code run on GPU clusters. Close to ideal speedup is observed on up to four GPUs.
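For readers unfamiliar with the Harten-Lax-van Leer (HLL) Riemann solver named above, the following Python sketch computes the HLL interface flux for the one-dimensional Euler equations. It is a CPU illustration only and not the CUDA implementation of the paper; the ideal-gas constant, the simple minimum/maximum wave-speed estimates, and the function names are assumptions.

```python
import math

GAMMA = 1.4  # assumed ratio of specific heats for an ideal gas


def conserved(rho, u, p):
    """Density, momentum and total energy from primitive variables."""
    return (rho, rho * u, p / (GAMMA - 1.0) + 0.5 * rho * u * u)


def euler_flux(rho, u, p):
    """Physical flux of the 1D Euler equations."""
    energy = p / (GAMMA - 1.0) + 0.5 * rho * u * u
    return (rho * u, rho * u * u + p, u * (energy + p))


def hll_flux(left, right):
    """HLL flux between two primitive states (rho, u, p)."""
    rho_l, u_l, p_l = left
    rho_r, u_r, p_r = right
    c_l = math.sqrt(GAMMA * p_l / rho_l)
    c_r = math.sqrt(GAMMA * p_r / rho_r)
    # Simple bounds on the slowest and fastest signal speeds (an assumption).
    s_l = min(u_l - c_l, u_r - c_r)
    s_r = max(u_l + c_l, u_r + c_r)
    f_l = euler_flux(*left)
    f_r = euler_flux(*right)
    if s_l >= 0.0:
        return f_l  # all waves move right: upwind on the left state
    if s_r <= 0.0:
        return f_r  # all waves move left: upwind on the right state
    q_l = conserved(*left)
    q_r = conserved(*right)
    return tuple(
        (s_r * fl - s_l * fr + s_l * s_r * (qr - ql)) / (s_r - s_l)
        for fl, fr, ql, qr in zip(f_l, f_r, q_l, q_r)
    )


# Sod shock-tube states: high pressure on the left, low pressure on the right.
sod_flux = hll_flux((1.0, 0.0, 1.0), (0.125, 0.0, 0.1))
```

Each interface flux depends only on its two neighbouring states, which is one reason schemes of this type map naturally onto GPU threads, as the abstract notes.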
Review and analysis of dense linear system solver package for distributed memory machines

NASA Technical Reports Server (NTRS)

Narang, H. N.

1993-01-01

A dense linear system solver package recently developed at the University of Texas at Austin for distributed memory machines (e.g. Intel Paragon) has been reviewed and analyzed. The package contains about 45 software routines, some written in FORTRAN, and some in C-language, and forms the basis for parallel/distributed solutions of systems of linear equations encountered in many problems of a scientific and engineering nature. The package, being studied by the Computer Applications Branch of the Analysis and Computation Division, may provide a significant computational resource for NASA scientists and engineers in parallel/distributed computing. Since the package is new and not well tested or documented, many of its underlying concepts and implementations were unclear; our task was to review, analyze, and critique the package as a step in the process that will enable scientists and engineers to apply it to the solution of their problems. All routines in the package were reviewed and analyzed. Underlying theory or concepts which exist in the form of published papers or technical reports, or memos, were either obtained from the author, or from the scientific literature; and general algorithms, explanations, examples, and critiques have been provided to explain the workings of these programs. Wherever things were still unclear, communications were made with the developer (author), either by telephone or by electronic mail, to understand the workings of the routines. Whenever possible, tests were made to verify the concepts and logic employed in their implementations. A detailed report is being separately documented to explain the workings of these routines.
Benchmarking Defmod, an open source FEM code for modeling episodic fault rupture

NASA Astrophysics Data System (ADS)

Meng, Chunfang

2017-03-01

We present Defmod, an open source (linear) finite element code that enables us to efficiently model the crustal deformation due to (quasi-)static and dynamic loadings, poroelastic flow, viscoelastic flow and frictional fault slip. Ali (2015) provides the original code, introducing an implicit solver for (quasi-)static problems and an explicit solver for dynamic problems. The fault constraint is implemented via Lagrange multipliers. Meng (2015) combines these two solvers into a hybrid solver that uses failure criteria and friction laws to adaptively switch between the (quasi-)static state and the dynamic state. The code is capable of modeling episodic fault rupture driven by quasi-static loadings, e.g. due to reservoir fluid withdrawal or injection. Here, we focus on benchmarking the Defmod results against some established results.
Linear instabilities near the DIII-D edge simulated in fluid models

NASA Astrophysics Data System (ADS)

Bass, Eric; Holland, Christopher

2017-10-01

The linear instability spectrum is reported near the DIII-D edge (within the separatrix) for L-mode and H-mode shots using the new eigenvalue solver FluTES (Fluid Toroidal Eigenvalue Solver). FluTES circumvents difficulties with convergence to clean linear eigenmodes (required for diagnosis of nonlinear simulations in codes such as BOUT++) often encountered with fluid initial-value solvers. FluTES is well-verified in analytic cases and against a BOUT++/ELITE benchmark toroidal case. We report results for both a 3-field, one-fluid model (the well-known "elm-pb" model) and a 5-field, two-fluid model. For the peeling-ballooning-dominated H-mode, the two solutions are qualitatively the same. In the driftwave-dominated L-mode edge, only the two-fluid solution gives robust instabilities, which occur primarily at n > 50. FluTES is optimized for this regime (near-flutelike limit, toroidally spectral). Cross-separatrix, coupled fluid and drift instabilities may play a role in explaining the gyrokinetic L-mode edge transport shortfall. Extension of FluTES into the open-field-line region is underway. Prepared by UCSD under Contract Number DE-FG02-06ER54871.
Reduced-Order Models Based on Linear and Nonlinear Aerodynamic Impulse Responses

NASA Technical Reports Server (NTRS)

Silva, Walter A.

1999-01-01

This paper discusses a method for the identification and application of reduced-order models based on linear and nonlinear aerodynamic impulse responses. The Volterra theory of nonlinear systems and an appropriate kernel identification technique are described. Insight into the nature of kernels is provided by applying the method to the nonlinear Riccati equation in a non-aerodynamic application. The method is then applied to a nonlinear aerodynamic model of an RAE 2822 supercritical airfoil undergoing plunge motions using the CFL3D Navier-Stokes flow solver with the Spalart-Allmaras turbulence model. Results demonstrate the computational efficiency of the technique.
Science & Technology Review October 2007

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chinn, D J

Livermore researchers won five R&D 100 awards in R&D Magazine's annual competition for the top 100 industrial innovations worldwide. This issue of Science & Technology Review highlights the award-winning technologies: noninvasive pneumothorax detector, microelectromechanical system-based adaptive optics scanning laser ophthalmoscope, large-area imager, hyper library of linear solvers, and continuous-phase-plate optics system manufactured using magnetorheological finishing. Since 1978, Laboratory researchers have received 118 R&D 100 awards. The R&D 100 logo (on the cover and p 1) is reprinted courtesy of R&D Magazine.
Acceleration of GPU-based Krylov solvers via data transfer reduction

DOE PAGES

Anzt, Hartwig; Tomov, Stanimire; Luszczek, Piotr; ...

2015-04-08

Krylov subspace iterative solvers are often the method of choice when solving large sparse linear systems. At the same time, hardware accelerators such as graphics processing units continue to offer significant floating point performance gains for matrix and vector computations through easy-to-use libraries of computational kernels. However, as these libraries are usually composed of a well optimized but limited set of linear algebra operations, applications that use them often fail to reduce certain data communications, and hence fail to leverage the full potential of the accelerator. In this study, we target the acceleration of Krylov subspace iterative methods for graphics processing units, and show in particular for the Biconjugate Gradient Stabilized solver that significant improvement can be achieved by reformulating the method to reduce data communications through application-specific kernels instead of using the generic BLAS kernels, e.g. as provided by NVIDIA’s cuBLAS library, and by designing a graphics processing unit specific sparse matrix-vector product kernel that is able to more efficiently use the graphics processing unit’s computing power. Furthermore, we derive a model estimating the performance improvement, and use experimental data to validate the expected runtime savings. Finally, considering that the derived implementation achieves significantly higher performance, we assert that similar optimizations addressing algorithm structure, as well as sparse matrix-vector products, are crucial for the subsequent development of high-performance graphics processing unit accelerated Krylov subspace iterative methods.
Multi-GPU Accelerated Admittance Method for High-Resolution Human Exposure Evaluation.

PubMed

Xiong, Zubiao; Feng, Shi; Kautz, Richard; Chandra, Sandeep; Altunyurt, Nevin; Chen, Ji

2015-12-01

A multi-graphics processing unit (GPU) accelerated admittance method solver is presented for solving the induced electric field in high-resolution anatomical models of the human body when exposed to external low-frequency magnetic fields. In the solver, the anatomical model is discretized as a three-dimensional network of admittances. The conjugate orthogonal conjugate gradient (COCG) iterative algorithm is employed to take advantage of the symmetric property of the complex-valued linear system of equations. Compared against the widely used biconjugate gradient stabilized method, the COCG algorithm can reduce the solving time by 3.5 times and reduce the storage requirement by about 40%. The iterative algorithm is then accelerated further by using multiple NVIDIA GPUs. The computations and data transfers between GPUs are overlapped in time by using an asynchronous concurrent execution design. The communication overhead is well hidden so that the acceleration is nearly linear with the number of GPU cards. Numerical examples show that our GPU implementation running on four NVIDIA Tesla K20c cards can run 90 times faster than the CPU implementation running on eight CPU cores (two Intel Xeon E5-2603 processors). The implemented solver is able to solve large dimensional problems efficiently. A whole adult body discretized in 1-mm resolution can be solved in just several minutes. The high efficiency achieved makes it practical to investigate human exposure involving a large number of cases with a high resolution that meets the requirements of international dosimetry guidelines.
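The COCG iteration mentioned above differs from ordinary conjugate gradients in one detail: inner products are taken without complex conjugation, which is what lets it exploit a matrix that is complex symmetric (A equal to its transpose) rather than Hermitian. The following single-threaded Python sketch shows that structure on a 3 x 3 example. The function name, the dense storage, and the test matrix are assumptions, and nothing here reflects the GPU implementation of the paper.

```python
import math


def cocg(A, b, tol=1e-10, max_iter=100):
    """Conjugate orthogonal conjugate gradient for complex symmetric A."""
    n = len(b)

    def matvec(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def bilinear(u, v):
        # Unconjugated product u^T v; this is the defining feature of COCG.
        return sum(ui * vi for ui, vi in zip(u, v))

    def norm(v):
        return math.sqrt(sum(abs(vi) ** 2 for vi in v))

    x = [0j] * n
    r = [complex(bi) for bi in b]
    p = list(r)
    rho = bilinear(r, r)
    b_norm = norm(b)
    for iteration in range(1, max_iter + 1):
        Ap = matvec(p)
        alpha = rho / bilinear(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if norm(r) <= tol * b_norm:
            return x, iteration
        rho_new = bilinear(r, r)
        beta = rho_new / rho
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rho = rho_new
    return x, max_iter


# Complex symmetric (not Hermitian) example matrix: A[i][j] == A[j][i].
A = [
    [4 + 1j, 1 + 0j, 0j],
    [1 + 0j, 3 + 2j, 1j],
    [0j, 1j, 2 + 1j],
]
b = [1 + 0j, 2j, 3 + 0j]
x, iterations = cocg(A, b)
```

Each iteration needs one matrix-vector product, against two for the biconjugate gradient stabilized method, which is consistent with the kind of time saving the abstract reports. The unconjugated products can in principle vanish and cause breakdown, which this sketch does not guard against.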
Comparison of Integer Programming (IP) Solvers for Automated Test Assembly (ATA). Research Report. ETS RR-15-05

ERIC Educational Resources Information Center

Donoghue, John R.

2015-01-01

At the heart of van der Linden's approach to automated test assembly (ATA) is a linear programming/integer programming (LP/IP) problem. A variety of IP solvers are available, ranging in cost from free to hundreds of thousands of dollars. In this paper, I compare several approaches to solving the underlying IP problem. These approaches range from…
A coarse-grid projection method for accelerating incompressible flow computations

NASA Astrophysics Data System (ADS)

San, Omer; Staples, Anne E.

2013-01-01

We present a coarse-grid projection (CGP) method for accelerating incompressible flow computations, which is applicable to methods involving Poisson equations as incompressibility constraints. The CGP methodology is a modular approach that facilitates data transfer with simple interpolations and uses black-box solvers for the Poisson and advection-diffusion equations in the flow solver. After solving the Poisson equation on a coarsened grid, an interpolation scheme is used to obtain the fine data for subsequent time stepping on the full grid. A particular version of the method is applied here to the vorticity-stream function, primitive variable, and vorticity-velocity formulations of incompressible Navier-Stokes equations. We compute several benchmark flow problems on two-dimensional Cartesian and non-Cartesian grids, as well as a three-dimensional flow problem. The method is found to accelerate these computations while retaining a level of accuracy close to that of the fine resolution field, which is significantly better than the accuracy obtained for a similar computation performed solely using a coarse grid. A linear acceleration rate is obtained for all the cases we consider due to the linear-cost elliptic Poisson solver used, with reduction factors in computational time between 2 and 42. The computational savings are larger when a suboptimal Poisson solver is used. We also find that the computational savings increase with increasing distortion ratio on non-Cartesian grids, making the CGP method a useful tool for accelerating generalized curvilinear incompressible flow solvers.

Compressive Sensing with Cross-Validation and Stop-Sampling for Sparse Polynomial Chaos Expansions

DOE Office of Scientific and Technical Information (OSTI.GOV)

Huan, Xun; Safta, Cosmin; Sargsyan, Khachik

Compressive sensing is a powerful technique for recovering sparse solutions of underdetermined linear systems, which is often encountered in uncertainty quantification analysis of expensive and high-dimensional physical models. We perform numerical investigations employing several compressive sensing solvers that target the unconstrained LASSO formulation, with a focus on linear systems that arise in the construction of polynomial chaos expansions. With core solvers of l1_ls, SpaRSA, CGIST, FPC_AS, and ADMM, we develop techniques to mitigate overfitting through an automated selection of regularization constant based on cross-validation, and a heuristic strategy to guide the stop-sampling decision. Practical recommendations on parameter settings for these techniques are provided and discussed. The overall method is applied to a series of numerical examples of increasing complexity, including large eddy simulations of supersonic turbulent jet-in-cross flow involving a 24-dimensional input. Through empirical phase-transition diagrams and convergence plots, we illustrate sparse recovery performance under structures induced by polynomial chaos, accuracy and computational tradeoffs between polynomial bases of different degrees, and practicability of conducting compressive sensing for a realistic, high-dimensional physical application. Across test cases studied in this paper, we find ADMM to have demonstrated empirical advantages through consistent lower errors and faster computational times.
Power/Performance Trade-offs of Small Batched LU Based Solvers on GPUs

DOE Office of Scientific and Technical Information (OSTI.GOV)

Villa, Oreste; Fatica, Massimiliano; Gawande, Nitin A.

In this paper we propose and analyze a set of batched linear solvers for small matrices on Graphic Processing Units (GPUs), evaluating the various alternatives depending on the size of the systems to solve. We discuss three different solutions that operate with different levels of parallelization and GPU features. The first, exploiting the CUBLAS library, manages matrices of size up to 32x32 and employs Warp level (one matrix, one Warp) parallelism and shared memory. The second works at Thread-block level parallelism (one matrix, one Thread-block), still exploiting shared memory but managing matrices up to 76x76. The third is Thread level parallel (one matrix, one thread) and can reach sizes up to 128x128, but it does not exploit shared memory and only relies on the high memory bandwidth of the GPU. The first and second solutions only support partial pivoting; the third easily supports partial and full pivoting, making it attractive for problems that require greater numerical stability. We analyze the trade-offs in terms of performance and power consumption as a function of the size of the linear systems that are simultaneously solved. We execute the three implementations on a Tesla M2090 (Fermi) and on a Tesla K20 (Kepler).
Multigrid solvers and multigrid preconditioners for the solution of variational data assimilation problems

NASA Astrophysics Data System (ADS)

Debreu, Laurent; Neveu, Emilie; Simon, Ehouarn; Le Dimet, Francois Xavier; Vidard, Arthur

2014-05-01

In order to lower the computational cost of the variational data assimilation process, we investigate the use of multigrid methods to solve the associated optimal control system. On a linear advection equation, we study the impact of the regularization term on the optimal control and the impact of discretization errors on the efficiency of the coarse grid correction step. We show that even if the optimal control problem leads to the solution of an elliptic system, numerical errors introduced by the discretization can alter the success of the multigrid methods. The view of the multigrid iteration as a preconditioner for a Krylov optimization method leads to a more robust algorithm. A scale dependent weighting of the multigrid preconditioner and the usual background error covariance matrix based preconditioner is proposed and brings significant improvements. [1] Laurent Debreu, Emilie Neveu, Ehouarn Simon, François-Xavier Le Dimet and Arthur Vidard, 2014: Multigrid solvers and multigrid preconditioners for the solution of variational data assimilation problems, submitted to QJRMS, http://hal.inria.fr/hal-00874643 [2] Emilie Neveu, Laurent Debreu and François-Xavier Le Dimet, 2011: Multigrid methods and data assimilation - Convergence study and first experiments on non-linear equations, ARIMA, 14, 63-80, http://intranet.inria.fr/international/arima/014/014005.html
Parallel-vector computation for linear structural analysis and non-linear unconstrained optimization problems

NASA Technical Reports Server (NTRS)

Nguyen, D. T.; Al-Nasra, M.; Zhang, Y.; Baddourah, M. A.; Agarwal, T. K.; Storaasli, O. O.; Carmona, E. A.

1991-01-01

Several parallel-vector computational improvements to the unconstrained optimization procedure are described which speed up the structural analysis-synthesis process. A fast parallel-vector Choleski-based equation solver, pvsolve, is incorporated into the well-known SAP-4 general-purpose finite-element code. The new code, denoted PV-SAP, is tested for static structural analysis. Initial results on a four processor CRAY 2 show that using pvsolve reduces the equation solution time by a factor of 14-16 over the original SAP-4 code. In addition, parallel-vector procedures for the Golden Block Search technique and the BFGS method are developed and tested for nonlinear unconstrained optimization. A parallel version of an iterative solver and the pvsolve direct solver are incorporated into the BFGS method. Preliminary results on nonlinear unconstrained optimization test problems, using pvsolve in the analysis, show excellent parallel-vector performance indicating that these parallel-vector algorithms can be used in a new generation of finite-element based structural design/analysis-synthesis codes.
Asymptotically and exactly energy balanced augmented flux-ADER schemes with application to hyperbolic conservation laws with geometric source terms

NASA Astrophysics Data System (ADS)

Navas-Montilla, A.; Murillo, J.

2016-07-01

In this work, an arbitrary order HLL-type numerical scheme is constructed using the flux-ADER methodology. The proposed scheme is based on an augmented Derivative Riemann solver that was used for the first time in Navas-Montilla and Murillo (2015) [1]. This solver, hereafter referred to as the Flux-Source (FS) solver, was conceived as a high order extension of the augmented Roe solver and led to the generation of a novel numerical scheme called the AR-ADER scheme. Here, we provide a general definition of the FS solver independently of the Riemann solver used in it. Moreover, a simplified version of the solver, referred to as the Linearized-Flux-Source (LFS) solver, is presented. This novel version of the FS solver allows the solution to be computed without requiring reconstruction of derivatives of the fluxes; nevertheless, some drawbacks are evident. In contrast to other previously defined Derivative Riemann solvers, the proposed FS and LFS solvers take into account the presence of the source term in the resolution of the Derivative Riemann Problem (DRP), which is of particular interest when dealing with geometric source terms. When applied to the shallow water equations, the proposed HLLS-ADER and AR-ADER schemes can be constructed to fulfill the exactly well-balanced property, showing that an arbitrary quadrature of the integral of the source inside the cell does not ensure energy balanced solutions. As a result of this work, energy balanced flux-ADER schemes that provide the exact solution for steady cases and that converge to the exact solution with arbitrary order for transient cases are constructed.
Efficient Parallel Formulations of Hierarchical Methods and Their Applications

NASA Astrophysics Data System (ADS)

Grama, Ananth Y.

1996-01-01

Hierarchical methods such as the Fast Multipole Method (FMM) and Barnes-Hut (BH) are used for rapid evaluation of potential (gravitational, electrostatic) fields in particle systems. They are also used for solving integral equations using boundary element methods. The linear systems arising from these methods are dense and are solved iteratively. Hierarchical methods reduce the complexity of the core matrix-vector product from O(n^2) to O(n log n) and the memory requirement from O(n^2) to O(n). We have developed highly scalable parallel formulations of a hybrid FMM/BH method that are capable of handling arbitrarily irregular distributions. We apply these formulations to astrophysical simulations of Plummer and Gaussian galaxies. We have used our parallel formulations to solve the integral form of the Laplace equation. We show that our parallel hierarchical mat-vecs yield high efficiency and overall performance even on relatively small problems. A problem containing approximately 200K nodes takes under a second to compute on 256 processors and yet yields over 85% efficiency. The efficiency and raw performance are expected to increase for bigger problems. For the 200K node problem, our code delivers about 5 GFLOPS of performance on a 256 processor T3D. This is impressive considering that the problem has floating point divides and roots, and very little locality, resulting in poor cache performance. A dense matrix-vector product of the same dimensions would require about 0.5 TeraBytes of memory and about 770 TeraFLOPS of computing speed. Clearly, if the loss in accuracy resulting from the use of hierarchical methods is acceptable, our code yields significant savings in time and memory. We also study the convergence of a GMRES solver built around this mat-vec. We accelerate the convergence of the solver using three preconditioning techniques: diagonal scaling, block-diagonal preconditioning, and inner-outer preconditioning. 
We study the performance and parallel efficiency of these preconditioned solvers. Using this solver, we solve dense linear systems with hundreds of thousands of unknowns. Solving a 105K unknown problem takes about 10 minutes on a 64 processor T3D. Until very recently, boundary element problems of this magnitude could not even be generated, let alone solved.
The High-Resolution Wave-Propagation Method Applied to Meso- and Micro-Scale Flows

NASA Technical Reports Server (NTRS)

Ahmad, Nashat N.; Proctor, Fred H.

2012-01-01

The high-resolution wave-propagation method for computing the nonhydrostatic atmospheric flows on meso- and micro-scales is described. The design and implementation of the Riemann solver used for computing the Godunov fluxes is discussed in detail. The method uses a flux-based wave decomposition in which the flux differences are written directly as the linear combination of the right eigenvectors of the hyperbolic system. The two advantages of the technique are: 1) the need for an explicit definition of the Roe matrix is eliminated and, 2) the inclusion of source term due to gravity does not result in discretization errors. The resulting flow solver is conservative and able to resolve regions of large gradients without introducing dispersion errors. The methodology is validated against exact analytical solutions and benchmark cases for non-hydrostatic atmospheric flows.
Wide-angle full-vector beam propagation method based on an alternating direction implicit preconditioner

NASA Astrophysics Data System (ADS)

Chui, Siu Lit; Lu, Ya Yan

2004-03-01

Wide-angle full-vector beam propagation methods (BPMs) for three-dimensional wave-guiding structures can be derived on the basis of rational approximants of a square root operator or its exponential (i.e., the one-way propagator). While the less accurate BPM based on the slowly varying envelope approximation can be efficiently solved by the alternating direction implicit (ADI) method, the wide-angle variants involve linear systems that are more difficult to handle. We present an efficient solver for these linear systems that is based on a Krylov subspace method with an ADI preconditioner. The resulting wide-angle full-vector BPM is used to simulate the propagation of wave fields in a Y branch and a taper.
Wide-angle full-vector beam propagation method based on an alternating direction implicit preconditioner.

PubMed

Chui, Siu Lit; Lu, Ya Yan

2004-03-01

Wide-angle full-vector beam propagation methods (BPMs) for three-dimensional wave-guiding structures can be derived on the basis of rational approximants of a square root operator or its exponential (i.e., the one-way propagator). While the less accurate BPM based on the slowly varying envelope approximation can be efficiently solved by the alternating direction implicit (ADI) method, the wide-angle variants involve linear systems that are more difficult to handle. We present an efficient solver for these linear systems that is based on a Krylov subspace method with an ADI preconditioner. The resulting wide-angle full-vector BPM is used to simulate the propagation of wave fields in a Y branch and a taper.
A Comparison of Solver Performance for Complex Gastric Electrophysiology Models

PubMed Central

Sathar, Shameer; Cheng, Leo K.; Trew, Mark L.

2016-01-01

Computational techniques for solving systems of equations arising in gastric electrophysiology have not been studied for solution efficiency. We present a computationally challenging problem of simulating gastric electrophysiology in anatomically realistic stomach geometries with multiple intracellular and extracellular domains. The multiscale nature of the problem and the mesh resolution required to capture geometric and functional features necessitate efficient solution methods if the problem is to be tractable. In this study, we investigated and compared several parallel preconditioners for the linear systems arising from tetrahedral discretisation of electrically isotropic and anisotropic problems, with and without stimuli. The results showed that the isotropic problem was computationally less challenging than the anisotropic problem and that the application of extracellular stimuli increased workload considerably. Preconditioners based on block Jacobi and algebraic multigrid solvers were found to have the best overall solution times and lowest iteration counts, respectively. The algebraic multigrid preconditioner would be expected to perform better on large problems. PMID:26736543
A higher-order conservation element solution element method for solving hyperbolic differential equations on unstructured meshes

NASA Astrophysics Data System (ADS)

Bilyeu, David

This dissertation presents an extension of the Conservation Element Solution Element (CESE) method from second- to higher-order accuracy. The new method retains the favorable characteristics of the original second-order CESE scheme, including (i) the use of the space-time integral equation for conservation laws, (ii) a compact mesh stencil, (iii) stability up to a CFL number of unity, (iv) a fully explicit, time-marching integration scheme, (v) true multidimensionality without using directional splitting, and (vi) the ability to handle two- and three-dimensional geometries by using unstructured meshes. This algorithm has been thoroughly tested in one, two and three spatial dimensions and has been shown to obtain the desired order of accuracy for solving both linear and non-linear hyperbolic partial differential equations. The scheme has also shown its ability to accurately resolve discontinuities in the solutions. Higher order unstructured methods such as the Discontinuous Galerkin (DG) method and the Spectral Volume (SV) methods have been developed for one-, two- and three-dimensional application. Although these schemes have seen extensive development and use, certain drawbacks of these methods have been well documented. For example, the explicit versions of these two methods have very stringent stability criteria. These stability criteria require that the time step be reduced as the order of the solver increases, for a given simulation on a given mesh. The research presented in this dissertation builds upon the work of Chang, who developed a fourth-order CESE scheme to solve a scalar one-dimensional hyperbolic partial differential equation. The completed research has resulted in two key deliverables. The first is a detailed derivation of high-order CESE methods on unstructured meshes for solving the conservation laws in two- and three-dimensional spaces. The second is the implementation of these numerical methods in a computer code. 
For code development, a one-dimensional solver for the Euler equations was developed. This work is an extension of Chang's work on the fourth-order CESE method for solving a one-dimensional scalar convection equation. A generic formulation for the nth-order CESE method, where n ≥ 4, was derived. Indeed, numerical implementation of the scheme confirmed that the order of convergence was consistent with the order of the scheme. For the two- and three-dimensional solvers, SOLVCON was used as the basic framework for code implementation. A new solver kernel for the fourth-order CESE method has been developed and integrated into the framework provided by SOLVCON. The main part of SOLVCON, which deals with unstructured meshes and parallel computing, remains intact. The SOLVCON code for data transmission between computer nodes for High Performance Computing (HPC). To validate and verify the newly developed high-order CESE algorithms, several one-, two- and three-dimensional simulations were conducted. For the arbitrary order, one-dimensional, CESE solver, three sets of governing equations were selected for simulation: (i) the linear convection equation, (ii) the linear acoustic equations, (iii) the nonlinear Euler equations. All three systems of equations were used to verify the order of convergence through mesh refinement. In addition, the Euler equations were used to solve the Shu-Osher and Blastwave problems. These two simulations demonstrated that the new high-order CESE methods can accurately resolve discontinuities in the flow field. For the two-dimensional, fourth-order CESE solver, the Euler equation was employed in four different test cases. The first case was used to verify the order of convergence through mesh refinement. The next three cases demonstrated the ability of the new solver to accurately resolve discontinuities in the flows. 
This was demonstrated through: (i) the interaction between acoustic waves and an entropy pulse, (ii) supersonic flow over a circular blunt body, (iii) supersonic flow over a guttered wedge. To validate and verify the three-dimensional, fourth-order CESE solver, two different simulations were selected. The first used the linear convection equations to demonstrate fourth-order convergence. The second used the Euler equations to simulate supersonic flow over a spherical body to demonstrate the scheme's ability to accurately resolve shocks. All test cases used are well-known benchmark problems and, as such, there are multiple sources available to validate the numerical results. Furthermore, the simulations showed that the high-order CESE solver was stable at a CFL number near unity.
Quantifying the Energy Efficiency of Object Recognition and Optical Flow

DTIC Science & Technology

2014-03-28

other linear solvers, such as conjugate-gradient (CG), preconditioned conjugate-gradient (PCG), and red-black Gauss Seidel (RB). We have also... Seidel, and conjugate gradient solvers. We are interested in the energy it takes to get a given solution quality. In Figure 6, we plot the quality of... in terms of Joules. Conversely, our implementation of red-black Gauss Seidel proves to be very inefficient when we consider Joules instead of just
The U.S. Geological Survey Modular Ground-Water Model - PCGN: A Preconditioned Conjugate Gradient Solver with Improved Nonlinear Control

USGS Publications Warehouse

Naff, Richard L.; Banta, Edward R.

2008-01-01

The preconditioned conjugate gradient with improved nonlinear control (PCGN) package provides additional means by which the solution of nonlinear ground-water flow problems can be controlled as compared to existing solver packages for MODFLOW. Picard iteration is used to solve nonlinear ground-water flow equations by iteratively solving a linear approximation of the nonlinear equations. The linear solution is provided by means of the preconditioned conjugate gradient algorithm where preconditioning is provided by the modified incomplete Cholesky algorithm. The incomplete Cholesky scheme incorporates two levels of fill, 0 and 1, in which the pivots can be modified so that the row sums of the preconditioning matrix and the original matrix are approximately equal. A relaxation factor is used to implement the modified pivots, which determines the degree of modification allowed. The effects of fill level and degree of pivot modification are briefly explored by means of a synthetic, heterogeneous finite-difference matrix; results are reported in the final section of this report. The preconditioned conjugate gradient method is coupled with Picard iteration so as to efficiently solve the nonlinear equations associated with many ground-water flow problems. The description of this coupling of the linear solver with Picard iteration is a primary concern of this document.
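The coupling of Picard iteration with a preconditioned conjugate gradient solve described in this abstract can be sketched on a toy problem. The following is not the PCGN package: it assumes a hypothetical one-dimensional unconfined-flow problem in which the conductance of each link is the mean head of its two end nodes, and it swaps in a simple diagonal (Jacobi) preconditioner in place of modified incomplete Cholesky. All function names are illustrative.

```python
import math

def pcg_jacobi(A, b, x0, tol=1e-12, max_iter=200):
    """Conjugate gradients with a diagonal (Jacobi) preconditioner for SPD A."""
    n = len(b)
    matvec = lambda v: [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
    dot = lambda u, v: sum(ui * vi for ui, vi in zip(u, v))
    x = list(x0)
    r = [bi - ai for bi, ai in zip(b, matvec(x))]
    z = [r[i] / A[i][i] for i in range(n)]      # apply the preconditioner
    p = list(z)
    rz = dot(r, z)
    b_norm = math.sqrt(dot(b, b)) or 1.0
    for _ in range(max_iter):
        if math.sqrt(dot(r, r)) <= tol * b_norm:
            break
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [r[i] / A[i][i] for i in range(n)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x

def assemble(h, h_left, h_right):
    """Linearize: freeze head-dependent link conductances at the current heads."""
    n = len(h)
    full = [h_left] + list(h) + [h_right]
    c = [0.5 * (full[k] + full[k + 1]) for k in range(n + 1)]
    A = [[0.0] * n for _ in range(n)]
    b = [0.0] * n
    for i in range(n):
        A[i][i] = c[i] + c[i + 1]
        if i > 0:
            A[i][i - 1] = -c[i]
        if i < n - 1:
            A[i][i + 1] = -c[i + 1]
    b[0] += c[0] * h_left        # fixed-head boundaries move to the right-hand side
    b[-1] += c[n] * h_right
    return A, b

def picard(n=5, h_left=1.0, h_right=2.0, tol=1e-10, max_outer=100):
    """Outer Picard loop: re-linearize, solve with PCG, repeat until heads settle."""
    h = [0.5 * (h_left + h_right)] * n
    for k in range(1, max_outer + 1):
        A, b = assemble(h, h_left, h_right)
        h_new = pcg_jacobi(A, b, h)
        change = max(abs(a - c) for a, c in zip(h_new, h))
        h = h_new
        if change < tol:
            return h, k
    raise RuntimeError("Picard iteration did not converge")

heads, outer_iterations = picard()
```

For this particular toy problem the squared head varies linearly between the two boundaries, so the converged heads can be checked against `sqrt(1 + 3 * (i + 1) / 6)` for node `i`. The previous Picard iterate is reused as the starting guess for each inner solve, which is one common way to keep the inner iteration count low.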
Structural Analysis Using NX Nastran 9.0

NASA Technical Reports Server (NTRS)

Rolewicz, Benjamin M.

2014-01-01

NX Nastran is a powerful Finite Element Analysis (FEA) software package used to solve linear and non-linear models for structural and thermal systems. The software, which consists of both a solver and user interface, breaks down analysis into four files, each of which is important to the end results of the analysis. The software offers capabilities for a variety of types of analysis, and also contains a respectable modeling program. Over the course of ten weeks, I was trained to effectively implement NX Nastran into structural analysis and refinement for parts of two missions at NASA's Kennedy Space Center, the Restore mission and the Orion mission.
An Unsplit Monte-Carlo solver for the resolution of the linear Boltzmann equation coupled to (stiff) Bateman equations

NASA Astrophysics Data System (ADS)

Bernede, Adrien; Poëtte, Gaël

2018-02-01

In this paper, we are interested in the resolution of the time-dependent problem of particle transport in a medium whose composition evolves with time due to interactions. As a constraint, we want to use a Monte-Carlo (MC) scheme for the transport phase. A common resolution strategy consists of a splitting between the MC/transport phase and the time discretization scheme/medium evolution phase. After going over and illustrating the main drawbacks of split solvers in a simplified configuration (monokinetic, scalar Bateman problem), we build a new Unsplit MC (UMC) solver that improves the accuracy of the solutions, avoids numerical instabilities, and is less sensitive to time discretization. The new solver is essentially based on a Monte Carlo scheme with time dependent cross sections implying the on-the-fly resolution of a reduced model for each MC particle describing the time evolution of the matter along its flight path.
Factorizing the factorization - a spectral-element solver for elliptic equations with linear operation count

NASA Astrophysics Data System (ADS)

Huismann, Immo; Stiller, Jörg; Fröhlich, Jochen

2017-10-01

The paper proposes a novel factorization technique for static condensation of a spectral-element discretization matrix that yields a linear operation count of just 13N multiplications for the residual evaluation, where N is the total number of unknowns. In comparison to previous work it saves a factor larger than 3 and outpaces unfactored variants for all polynomial degrees. Using the new technique as a building block for a preconditioned conjugate gradient method yields linear scaling of the runtime with N which is demonstrated for polynomial degrees from 2 to 32. This makes the spectral-element method cost effective even for low polynomial degrees. Moreover, the dependence of the iterative solution on the element aspect ratio is addressed, showing only a slight increase in the number of iterations for aspect ratios up to 128. Hence, the solver is very robust for practical applications.
Evaluation of automated decisionmaking methodologies and development of an integrated robotic system simulation. Volume 1: Study results

NASA Technical Reports Server (NTRS)

Lowrie, J. W.; Fermelia, A. J.; Haley, D. C.; Gremban, K. D.; Vanbaalen, J.; Walsh, R. W.

1982-01-01

A variety of artificial intelligence techniques which could be used with regard to NASA space applications and robotics were evaluated. The techniques studied were decision tree manipulators, problem solvers, rule based systems, logic programming languages, representation languages, and expert systems. The overall structure of a robotic simulation tool was defined and a framework for that tool developed. Nonlinear and linearized dynamics equations were formulated for n link manipulator configurations. A framework for the robotic simulation was established which uses validated manipulator component models connected according to a user defined configuration.
A fast direct method for block triangular Toeplitz-like with tri-diagonal block systems from time-fractional partial differential equations

NASA Astrophysics Data System (ADS)

Ke, Rihuan; Ng, Michael K.; Sun, Hai-Wei

2015-12-01

In this paper, we study the block lower triangular Toeplitz-like system with tri-diagonal blocks which arises from the time-fractional partial differential equation. Existing fast numerical solvers (e.g., the fast approximate inversion method) cannot handle such a linear system because the main diagonal blocks are different. The main contribution of this paper is to propose a fast direct method for solving this linear system, and to illustrate that the proposed method is much faster than the classical block forward substitution method. Our idea is based on the divide-and-conquer strategy together with fast Fourier transforms for calculating Toeplitz matrix-vector multiplication. The method requires O(MN log^2 M) arithmetic operations, where M is the number of blocks (the number of time steps) in the system and N is the size (number of spatial grid points) of each block. Numerical examples from the finite difference discretization of time-fractional partial differential equations are also given to demonstrate the efficiency of the proposed method.
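The building block named in this abstract, Toeplitz matrix-vector multiplication by fast Fourier transforms, can be illustrated with the standard circulant-embedding trick. This sketch covers only that scalar building block, not the authors' divide-and-conquer block solver. The helper names `fft`, `toeplitz_matvec` and `toeplitz_matvec_direct` are assumptions, and a small recursive radix-2 FFT is included so the example needs nothing beyond the standard library.

```python
import cmath

def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two. No 1/n scaling."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out

def toeplitz_matvec(col, row, x):
    """Multiply the Toeplitz matrix with first column `col` and first row `row`
    (row[0] == col[0]) by the real vector x in O(n log n) operations."""
    n = len(x)
    m = 1
    while m < 2 * n:
        m *= 2
    # Embed the Toeplitz matrix in an m x m circulant, described by its first column.
    circ = list(col) + [0.0] * (m - 2 * n + 1) + list(reversed(row[1:]))
    fc = fft(circ)
    fx = fft(list(x) + [0.0] * (m - n))
    y = fft([a * b for a, b in zip(fc, fx)], invert=True)
    return [v.real / m for v in y[:n]]   # apply the 1/m scaling of the inverse FFT

def toeplitz_matvec_direct(col, row, x):
    """Reference O(n^2) product used only to check the FFT version."""
    n = len(x)
    return [sum((col[i - j] if i >= j else row[j - i]) * x[j] for j in range(n))
            for i in range(n)]

col = [4.0, 1.0, 0.5, 0.25, 0.125]
row = [4.0, -1.0, -0.5, 2.0, 3.0]
x = [1.0, 2.0, 3.0, 4.0, 5.0]
fast = toeplitz_matvec(col, row, x)
slow = toeplitz_matvec_direct(col, row, x)
```

The two results should agree to rounding error. In a production setting one would call an optimized FFT library rather than the recursive routine shown here.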
Generalised Assignment Matrix Methodology in Linear Programming

ERIC Educational Resources Information Center

Jerome, Lawrence

2012-01-01

Discrete Mathematics instructors and students have long been struggling with various labelling and scanning algorithms for solving many important problems. This paper shows how to solve a wide variety of Discrete Mathematics and OR problems using assignment matrices and linear programming, specifically using Excel Solvers although the same…
DL_MG: A Parallel Multigrid Poisson and Poisson-Boltzmann Solver for Electronic Structure Calculations in Vacuum and Solution.

PubMed

Womack, James C; Anton, Lucian; Dziedzic, Jacek; Hasnip, Phil J; Probert, Matt I J; Skylaris, Chris-Kriton

2018-03-13

The solution of the Poisson equation is a crucial step in electronic structure calculations, yielding the electrostatic potential, a key component of the quantum mechanical Hamiltonian. In recent decades, theoretical advances and increases in computer performance have made it possible to simulate the electronic structure of extended systems in complex environments. This requires the solution of more complicated variants of the Poisson equation, featuring nonhomogeneous dielectric permittivities, ionic concentrations with nonlinear dependencies, and diverse boundary conditions. The analytic solutions generally used to solve the Poisson equation in vacuum (or with homogeneous permittivity) are not applicable in these circumstances, and numerical methods must be used. In this work, we present DL_MG, a flexible, scalable, and accurate solver library, developed specifically to tackle the challenges of solving the Poisson equation in modern large-scale electronic structure calculations on parallel computers. Our solver is based on the multigrid approach and uses an iterative high-order defect correction method to improve the accuracy of solutions. Using two chemically relevant model systems, we tested the accuracy and computational performance of DL_MG when solving the generalized Poisson and Poisson-Boltzmann equations, demonstrating excellent agreement with analytic solutions and efficient scaling to ∼10^9 unknowns and 100s of CPU cores. We also applied DL_MG in actual large-scale electronic structure calculations, using the ONETEP linear-scaling electronic structure package to study a 2615 atom protein-ligand complex with routinely available computational resources. In these calculations, the overall execution time with DL_MG was not significantly greater than the time required for calculations using a conventional FFT-based solver.

ALPS: A Linear Program Solver

NASA Technical Reports Server (NTRS)

Ferencz, Donald C.; Viterna, Larry A.

1991-01-01

ALPS is a computer program which can be used to solve general linear program (optimization) problems. ALPS was designed for those who have minimal linear programming (LP) knowledge and features a menu-driven scheme to guide the user through the process of creating and solving LP formulations. Once created, the problems can be edited and stored in standard DOS ASCII files to provide portability to various word processors or even other linear programming packages. Unlike many math-oriented LP solvers, ALPS contains an LP parser that reads through the LP formulation and reports several types of errors to the user. ALPS provides a large amount of solution data which is often useful in problem solving. In addition to pure linear programs, ALPS can solve integer, mixed integer, and binary type problems. Pure linear programs are solved with the revised simplex method. Integer or mixed integer programs are solved initially with the revised simplex method, and then completed using the branch-and-bound technique. Binary programs are solved with the method of implicit enumeration. This manual describes how to use ALPS to create, edit, and solve linear programming problems. Instructions for installing ALPS on a PC compatible computer are included in the appendices along with a general introduction to linear programming. A programmer's guide is also included for assistance in modifying and maintaining the program.
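To make the idea of implicit enumeration for binary programs concrete, here is a simplified depth-first sketch. It is not ALPS code and not a full Balas-style algorithm: the function `solve_binary_program`, the maximization form with `A x <= b` constraints, and the example data are all assumptions. Partial assignments are discarded, without enumerating their completions, when an optimistic objective bound cannot beat the incumbent or when no completion can satisfy a constraint.

```python
def solve_binary_program(c, A, b):
    """Maximize c.x subject to A x <= b with every x[j] in {0, 1}."""
    n = len(c)
    best = {"value": None, "x": None}

    def search(j, x, value, lhs):
        # Optimistic bound: set every remaining variable with positive cost to 1.
        bound = value + sum(max(cj, 0) for cj in c[j:])
        if best["value"] is not None and bound <= best["value"]:
            return  # cannot improve on the incumbent
        # Feasibility: even the most favourable completion must satisfy each row.
        for i, row in enumerate(A):
            if lhs[i] + sum(min(a, 0) for a in row[j:]) > b[i]:
                return
        if j == n:
            best["value"], best["x"] = value, list(x)
            return
        for choice in (1, 0):
            new_lhs = [lhs[i] + A[i][j] * choice for i in range(len(A))]
            search(j + 1, x + [choice], value + c[j] * choice, new_lhs)

    search(0, [], 0, [0] * len(A))
    return best["value"], best["x"]

# Hypothetical three-variable example.
c = [5, 4, 3]
A = [[2, 3, 1],
     [4, 1, 2],
     [3, 4, 2]]
b = [5, 11, 8]
result = solve_binary_program(c, A, b)   # (9, [1, 1, 0])
```

On this example only one complete assignment is ever accepted as feasible; the remaining branches are cut by the bound test. If the problem is infeasible the function returns `(None, None)`.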
A new fast direct solver for the boundary element method

NASA Astrophysics Data System (ADS)

Huang, S.; Liu, Y. J.

2017-09-01

A new fast direct linear equation solver for the boundary element method (BEM) is presented in this paper. The idea of the new fast direct solver stems from the concept of the hierarchical off-diagonal low-rank matrix. The hierarchical off-diagonal low-rank matrix can be decomposed into the multiplication of several diagonal block matrices. The inverse of the hierarchical off-diagonal low-rank matrix can be calculated efficiently with the Sherman-Morrison-Woodbury formula. In this paper, a more general and efficient approach to approximate the coefficient matrix of the BEM with the hierarchical off-diagonal low-rank matrix is proposed. Compared to the current fast direct solver based on the hierarchical off-diagonal low-rank matrix, the proposed method is suitable for solving general 3-D boundary element models. Several numerical examples of 3-D potential problems with more than 200,000 unknowns are presented. The results show that the new fast direct solver can be applied to solve large 3-D BEM models accurately and with better efficiency compared with the conventional BEM.
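For reference, the Sherman-Morrison-Woodbury formula mentioned above reads as follows. The symbols are chosen here for illustration and are not taken from the paper: $A$ is an invertible $n \times n$ matrix, $U$ is $n \times k$, $C$ is an invertible $k \times k$ matrix, and $V$ is $k \times n$.

```latex
(A + U C V)^{-1}
  = A^{-1} - A^{-1} U \left( C^{-1} + V A^{-1} U \right)^{-1} V A^{-1}
```

When $k \ll n$ and $A$ is cheap to invert (block diagonal, for example), the only new inverse required is of the small $k \times k$ matrix $C^{-1} + V A^{-1} U$, which is what makes low-rank off-diagonal structure attractive for direct solvers.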
Implicit–explicit (IMEX) Runge–Kutta methods for non-hydrostatic atmospheric models

DOE PAGES

Gardner, David J.; Guerra, Jorge E.; Hamon, François P.; ...

2018-04-17

The efficient simulation of non-hydrostatic atmospheric dynamics requires time integration methods capable of overcoming the explicit stability constraints on time step size arising from acoustic waves. In this work, we investigate various implicit–explicit (IMEX) additive Runge–Kutta (ARK) methods for evolving acoustic waves implicitly to enable larger time step sizes in a global non-hydrostatic atmospheric model. The IMEX formulations considered include horizontally explicit – vertically implicit (HEVI) approaches as well as splittings that treat some horizontal dynamics implicitly. In each case, the impact of solving nonlinear systems in each implicit ARK stage in a linearly implicit fashion is also explored. The accuracy and efficiency of the IMEX splittings, ARK methods, and solver options are evaluated on a gravity wave and baroclinic wave test case. HEVI splittings that treat some vertical dynamics explicitly do not show a benefit in solution quality or run time over the most implicit HEVI formulation. While splittings that implicitly evolve some horizontal dynamics increase the maximum stable step size of a method, the gains are insufficient to overcome the additional cost of solving a globally coupled system. Solving implicit stage systems in a linearly implicit manner limits the solver cost but this is offset by a reduction in step size to achieve the desired accuracy for some methods. Overall, the third-order ARS343 and ARK324 methods performed the best, followed by the second-order ARS232 and ARK232 methods.
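The basic IMEX idea can be illustrated on a scalar model problem. The sketch below is a first-order IMEX Euler step, far simpler than the ARK methods studied in the paper, applied to the hypothetical test equation y' = -lam*y + g(t): the stiff term -lam*y stands in for the fast (acoustic) dynamics and is treated implicitly, while the slow forcing g(t) is treated explicitly. All names and parameter values are assumptions made for illustration.

```python
import math


def imex_euler(y0, lam, g, h, steps):
    """First-order IMEX Euler for y' = -lam*y + g(t), starting at t = 0."""
    y, t = y0, 0.0
    for _ in range(steps):
        # y_new = y + h*g(t) - h*lam*y_new, solved for y_new in closed form.
        y = (y + h * g(t)) / (1.0 + h * lam)
        t += h
    return y


def explicit_euler(y0, lam, g, h, steps):
    """Fully explicit Euler on the same problem, for comparison."""
    y, t = y0, 0.0
    for _ in range(steps):
        y = y + h * (-lam * y + g(t))
        t += h
    return y


# lam*h = 10 is well beyond the explicit Euler stability limit lam*h <= 2.
lam, h, steps = 1000.0, 0.01, 100
y_imex = imex_euler(1.0, lam, math.sin, h, steps)
y_expl = explicit_euler(1.0, lam, math.sin, h, steps)
```

With this step size the explicit iterate grows without bound, while the IMEX iterate stays close to the slowly varying solution sin(t)/lam. Avoiding the explicit stability limit in this way is the motivation for treating the fast terms implicitly.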
Implicit-explicit (IMEX) Runge-Kutta methods for non-hydrostatic atmospheric models

NASA Astrophysics Data System (ADS)

Gardner, David J.; Guerra, Jorge E.; Hamon, François P.; Reynolds, Daniel R.; Ullrich, Paul A.; Woodward, Carol S.

2018-04-01

The efficient simulation of non-hydrostatic atmospheric dynamics requires time integration methods capable of overcoming the explicit stability constraints on time step size arising from acoustic waves. In this work, we investigate various implicit-explicit (IMEX) additive Runge-Kutta (ARK) methods for evolving acoustic waves implicitly to enable larger time step sizes in a global non-hydrostatic atmospheric model. The IMEX formulations considered include horizontally explicit - vertically implicit (HEVI) approaches as well as splittings that treat some horizontal dynamics implicitly. In each case, the impact of solving nonlinear systems in each implicit ARK stage in a linearly implicit fashion is also explored. The accuracy and efficiency of the IMEX splittings, ARK methods, and solver options are evaluated on a gravity wave and baroclinic wave test case. HEVI splittings that treat some vertical dynamics explicitly do not show a benefit in solution quality or run time over the most implicit HEVI formulation. While splittings that implicitly evolve some horizontal dynamics increase the maximum stable step size of a method, the gains are insufficient to overcome the additional cost of solving a globally coupled system. Solving implicit stage systems in a linearly implicit manner limits the solver cost but this is offset by a reduction in step size to achieve the desired accuracy for some methods. Overall, the third-order ARS343 and ARK324 methods performed the best, followed by the second-order ARS232 and ARK232 methods.
Efficient reconstruction method for ground layer adaptive optics with mixed natural and laser guide stars.

PubMed

Wagner, Roland; Helin, Tapio; Obereder, Andreas; Ramlau, Ronny

2016-02-20

The imaging quality of modern ground-based telescopes such as the planned European Extremely Large Telescope is affected by atmospheric turbulence. In consequence, they heavily depend on stable and high-performance adaptive optics (AO) systems. Using measurements of incoming light from guide stars, an AO system compensates for the effects of turbulence by adjusting so-called deformable mirror(s) (DMs) in real time. In this paper, we introduce a novel reconstruction method for ground layer adaptive optics. In the literature, a common approach to this problem is to use Bayesian inference in order to model the specific noise structure appearing due to spot elongation. This approach leads to large coupled systems with high computational effort. Recently, fast solvers of linear order, i.e., with computational complexity O(n), where n is the number of DM actuators, have emerged. However, the quality of such methods typically degrades in low flux conditions. Our key contribution is to achieve the high quality of the standard Bayesian approach while at the same time maintaining the linear order speed of the recent solvers. Our method is based on performing a separate preprocessing step before applying the cumulative reconstructor (CuReD). The efficiency and performance of the new reconstructor are demonstrated using the OCTOPUS, the official end-to-end simulation environment of the ESO for extremely large telescopes. For more specific simulations we also use the MOST toolbox.
Implicit–explicit (IMEX) Runge–Kutta methods for non-hydrostatic atmospheric models

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gardner, David J.; Guerra, Jorge E.; Hamon, François P.

The efficient simulation of non-hydrostatic atmospheric dynamics requires time integration methods capable of overcoming the explicit stability constraints on time step size arising from acoustic waves. In this work, we investigate various implicit–explicit (IMEX) additive Runge–Kutta (ARK) methods for evolving acoustic waves implicitly to enable larger time step sizes in a global non-hydrostatic atmospheric model. The IMEX formulations considered include horizontally explicit – vertically implicit (HEVI) approaches as well as splittings that treat some horizontal dynamics implicitly. In each case, the impact of solving nonlinear systems in each implicit ARK stage in a linearly implicit fashion is also explored. The accuracy and efficiency of the IMEX splittings, ARK methods, and solver options are evaluated on a gravity wave and baroclinic wave test case. HEVI splittings that treat some vertical dynamics explicitly do not show a benefit in solution quality or run time over the most implicit HEVI formulation. While splittings that implicitly evolve some horizontal dynamics increase the maximum stable step size of a method, the gains are insufficient to overcome the additional cost of solving a globally coupled system. Solving implicit stage systems in a linearly implicit manner limits the solver cost but this is offset by a reduction in step size to achieve the desired accuracy for some methods. Overall, the third-order ARS343 and ARK324 methods performed the best, followed by the second-order ARS232 and ARK232 methods.
LINFLUX-AE: A Turbomachinery Aeroelastic Code Based on a 3-D Linearized Euler Solver

NASA Technical Reports Server (NTRS)

Reddy, T. S. R.; Bakhle, M. A.; Trudell, J. J.; Mehmed, O.; Stefko, G. L.

2004-01-01

This report describes the development and validation of LINFLUX-AE, a turbomachinery aeroelastic code based on the linearized unsteady 3-D Euler solver, LINFLUX. A helical fan with flat plate geometry is selected as the test case for numerical validation. The steady solution required by LINFLUX is obtained from the nonlinear Euler/Navier Stokes solver TURBO-AE. The report briefly describes the salient features of LINFLUX and the details of the aeroelastic extension. The aeroelastic formulation is based on a modal approach. An eigenvalue formulation is used for flutter analysis. The unsteady aerodynamic forces required for flutter are obtained by running LINFLUX for each mode, interblade phase angle and frequency of interest. The unsteady aerodynamic forces for forced response analysis are obtained from LINFLUX for the prescribed excitation, interblade phase angle, and frequency. The forced response amplitude is calculated from the modal summation of the generalized displacements. The unsteady pressures, work done per cycle, eigenvalues and forced response amplitudes obtained from LINFLUX are compared with those obtained from LINSUB, TURBO-AE, ASTROP2, and ANSYS.
Computer-Aided Transformation of PDE Models: Languages, Representations, and a Calculus of Operations

DTIC Science & Technology

2016-01-05

discretizations. We maintain that what is clear at the mathematical level should be equally clear in computation. In this small STIR project, we separate the...concerns of describing and discretizing such models by defining an input language representing PDE, including steady-state and transient, linear and...solvers, such as [8, 9], focused on the solvers themselves and particular families of discretizations (e.g. finite elements), and now it is natural to
Mega-Scale Simulation of Multi-Layer Devices-- Formulation, Kinetics, and Visualization

DTIC Science & Technology

1994-07-28

prototype code STRIDE, also initially developed under ARO support. The focus of the ARO supported research activities has been in the areas of multi ... FORTRAN-77. During its fifteen-year lifespan several generations of researchers have modified the code. Due to this continual development, the...behavior. The replacement of the linear solver had no effect on the remainder of the code. We replaced the existing solver with a distributed multi-frontal
A FEniCS-based programming framework for modeling turbulent flow by the Reynolds-averaged Navier-Stokes equations

NASA Astrophysics Data System (ADS)

Mortensen, Mikael; Langtangen, Hans Petter; Wells, Garth N.

2011-09-01

Finding an appropriate turbulence model for a given flow case usually calls for extensive experimentation with both models and numerical solution methods. This work presents the design and implementation of a flexible, programmable software framework for assisting with numerical experiments in computational turbulence. The framework targets Reynolds-averaged Navier-Stokes models, discretized by finite element methods. The novel implementation makes use of Python and the FEniCS package, the combination of which leads to compact and reusable code, where model- and solver-specific code resemble closely the mathematical formulation of equations and algorithms. The presented ideas and programming techniques are also applicable to other fields that involve systems of nonlinear partial differential equations. We demonstrate the framework in two applications and investigate the impact of various linearizations on the convergence properties of nonlinear solvers for a Reynolds-averaged Navier-Stokes model.
An interior-point method-based solver for simulation of aircraft parts riveting

NASA Astrophysics Data System (ADS)

Stefanova, Maria; Yakunin, Sergey; Petukhova, Margarita; Lupuleac, Sergey; Kokkolaras, Michael

2018-05-01

The particularities of the aircraft parts riveting process simulation necessitate the solution of a large number of contact problems. A primal-dual interior-point method-based solver is proposed for solving such problems efficiently. The proposed method features a worst-case polynomial complexity bound on the number of iterations that depends on n, the dimension of the problem, and ε, a threshold related to the desired accuracy. In practice, the convergence is often faster than this worst-case bound, which makes the method applicable to large-scale problems. The computational challenge is solving the system of linear equations because the associated matrix is ill conditioned. To that end, the authors introduce a preconditioner and a strategy for determining effective initial guesses based on the physics of the problem. Numerical results are compared with ones obtained using the Goldfarb-Idnani algorithm. The results demonstrate the efficiency of the proposed method.
TerraFERMA: Harnessing Advanced Computational Libraries in Earth Science

NASA Astrophysics Data System (ADS)

Wilson, C. R.; Spiegelman, M.; van Keken, P.

2012-12-01

Many important problems in Earth sciences can be described by non-linear coupled systems of partial differential equations. These "multi-physics" problems include thermo-chemical convection in Earth and planetary interiors, interactions of fluids and magmas with the Earth's mantle and crust and coupled flow of water and ice. These problems are of interest to a large community of researchers but are complicated to model and understand. Much of this complexity stems from the nature of multi-physics where small changes in the coupling between variables or constitutive relations can lead to radical changes in behavior, which in turn affect critical computational choices such as discretizations, solvers and preconditioners. Making progress in understanding such coupled systems requires a computational framework where multi-physics problems can be described at a high level while maintaining the flexibility to easily modify the solution algorithm. Fortunately, recent advances in computational science provide a basis for implementing such a framework. Here we present the Transparent Finite Element Rapid Model Assembler (TerraFERMA), which leverages several advanced open-source libraries for core functionality. FEniCS (fenicsproject.org) provides a high-level language for describing the weak forms of coupled systems of equations, and an automatic code generator that produces finite element assembly code. PETSc (www.mcs.anl.gov/petsc) provides a wide range of scalable linear and non-linear solvers that can be composed into effective multi-physics preconditioners. SPuD (amcg.ese.ic.ac.uk/Spud) is an application neutral options system that provides both human and machine-readable interfaces based on a single XML schema. Our software integrates these libraries and provides the user with a framework for exploring multi-physics problems. A single options file fully describes the problem, including all equations, coefficients and solver options.
Custom compiled applications are generated from this file but share an infrastructure for services common to all models, e.g. diagnostics, checkpointing and global non-linear convergence monitoring. This maximizes code reusability, reliability and longevity ensuring that scientific results and the methods used to acquire them are transparent and reproducible. TerraFERMA has been tested against many published geodynamic benchmarks including 2D/3D thermal convection problems, the subduction zone benchmarks and benchmarks for magmatic solitary waves. It is currently being used in the investigation of reactive cracking phenomena with applications to carbon sequestration, but we will principally discuss its use in modeling the migration of fluids in subduction zones. Subduction zones require an understanding of the highly nonlinear interactions of fluids with solids and thus provide an excellent scientific driver for the development of multi-physics software.
Effect of Coannular Flow on Linearized Euler Equation Predictions of Jet Noise

NASA Technical Reports Server (NTRS)

Hixon, R.; Shih, S.-H.; Mankbadi, Reda R.

1997-01-01

An improved version of a previously validated linearized Euler equation solver is used to compute the noise generated by coannular supersonic jets. Results for a single supersonic jet are compared to the results from both a normal velocity profile and an inverted velocity profile supersonic jet.
Second derivative time integration methods for discontinuous Galerkin solutions of unsteady compressible flows

NASA Astrophysics Data System (ADS)

Nigro, A.; De Bartolo, C.; Crivellini, A.; Bassi, F.

2017-12-01

In this paper we investigate the possibility of using the high-order accurate A(α)-stable Second Derivative (SD) schemes proposed by Enright for the implicit time integration of the Discontinuous Galerkin (DG) space-discretized Navier-Stokes equations. These multistep schemes are A-stable up to fourth-order, but their use results in a system matrix that is difficult to compute. Furthermore, the evaluation of the nonlinear function is computationally very demanding. We propose here a Matrix-Free (MF) implementation of Enright schemes that yields a method without the costs of forming, storing and factorizing the system matrix, which is much less computationally expensive than its matrix-explicit counterpart, and which performs competitively with other implicit schemes, such as the Modified Extended Backward Differentiation Formulae (MEBDF). The algorithm makes use of the preconditioned GMRES algorithm for solving the linear system of equations. The preconditioner is based on the ILU(0) factorization of an approximated but computationally cheaper form of the system matrix, and it has been reused for several time steps to improve the efficiency of the MF Newton-Krylov solver. We additionally employ a polynomial extrapolation technique to compute an accurate initial guess to the implicit nonlinear system. The stability properties of SD schemes have been analyzed by solving a linear model problem. For the analysis on the Navier-Stokes equations, two-dimensional inviscid and viscous test cases, both with a known analytical solution, are solved to assess the accuracy properties of the proposed time integration method for nonlinear autonomous and non-autonomous systems, respectively. The performance of the SD algorithm is compared with that obtained using an MF-MEBDF solver, in order to evaluate its effectiveness, identify its limitations and suggest possible further improvements.
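The core of a matrix-free Newton-Krylov solver is that a Krylov method such as GMRES only needs Jacobian-vector products, and these can be approximated from two residual evaluations without ever forming the Jacobian. The sketch below shows the standard forward-difference approximation on a small made-up system; the function names, the step-size heuristic, and the test residual `F` are assumptions for illustration and are not taken from the paper.

```python
import math
import sys


def jacobian_vector_product(F, u, v):
    """Approximate J(u) v by (F(u + eps*v) - F(u)) / eps."""
    norm_u = math.sqrt(sum(ui * ui for ui in u))
    norm_v = math.sqrt(sum(vi * vi for vi in v))
    if norm_v == 0.0:
        return [0.0] * len(u)
    # Common heuristic: balance truncation error against round-off error.
    eps = math.sqrt(sys.float_info.epsilon) * (1.0 + norm_u) / norm_v
    F_base = F(u)
    F_pert = F([ui + eps * vi for ui, vi in zip(u, v)])
    return [(p - q) / eps for p, q in zip(F_pert, F_base)]


def F(u):
    """Hypothetical nonlinear residual with Jacobian [[2*u0, 1], [1, 3*u1**2]]."""
    return [u[0] ** 2 + u[1] - 3.0, u[0] + u[1] ** 3 - 5.0]


u = [1.0, 2.0]
v = [1.0, -1.0]
jv = jacobian_vector_product(F, u, v)

# Exact product from the analytic Jacobian, for comparison.
jv_exact = [2.0 * u[0] * v[0] + v[1], v[0] + 3.0 * u[1] ** 2 * v[1]]
```

Each product costs one extra residual evaluation, which is why reusing a cheap preconditioner over several time steps, as the abstract describes, matters for overall efficiency.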
Keeping it Together: Advanced algorithms and software for magma dynamics (and other coupled multi-physics problems)

NASA Astrophysics Data System (ADS)

Spiegelman, M.; Wilson, C. R.

2011-12-01

A quantitative theory of magma production and transport is essential for understanding the dynamics of magmatic plate boundaries, intra-plate volcanism and the geochemical evolution of the planet. It also provides one of the most challenging computational problems in solid Earth science, as it requires consistent coupling of fluid and solid mechanics together with the thermodynamics of melting and reactive flows. Considerable work on these problems over the past two decades shows that small changes in assumptions of coupling (e.g. the relationship between melt fraction and solid rheology) can have profound effects on the behavior of these systems, which in turn affects critical computational choices such as discretizations, solvers and preconditioners. Making progress in exploring and understanding this physically rich system requires a computational framework that allows more flexible, high-level description of multi-physics problems as well as increased flexibility in composing efficient algorithms for solution of the full non-linear coupled system. Fortunately, recent advances in available computational libraries and algorithms provide a platform for implementing such a framework. We present results from a new model building system that leverages functionality from both the FEniCS project (www.fenicsproject.org) and PETSc libraries (www.mcs.anl.gov/petsc) along with a model independent options system and GUI, Spud (amcg.ese.ic.ac.uk/Spud). Key features from FEniCS include fully unstructured FEM with a wide range of elements; a high-level language (UFL) and code generation compiler (FFC) for describing the weak forms of residuals and automatic differentiation for calculation of exact and approximate Jacobians. The overall strategy is to monitor/calculate residuals and Jacobians for the entire non-linear system of equations within a global non-linear solve based on PETSc's SNES routines.
PETSc already provides a wide range of solvers and preconditioners, from parallel sparse direct to algebraic multigrid, that can be chosen at runtime. In particular, we make extensive use of PETSc's FieldSplit block preconditioners that allow us to use optimal solvers for subproblems (such as Stokes, or advection/diffusion of temperature) as preconditioners for the full problem. Thus these routines let us reuse effective solving recipes/splittings from previous experience while monitoring the convergence of the global problem. These techniques often yield quadratic (Newton-like) convergence for the work of standard Picard schemes. We will illustrate this new framework with examples from the Magma Dynamic Demonstration suite (MADDs) of well understood magma dynamics benchmark problems including Stokes flow in ridge geometries, magmatic solitary waves and shear-driven melt bands. While development of this system has been driven by magma dynamics, this framework is much more general and can be used for a wide range of PDE based multi-physics models.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Dumbser, Michael, E-mail: michael.dumbser@unitn.it; Balsara, Dinshaw S., E-mail: dbalsara@nd.edu

In this paper a new, simple and universal formulation of the HLLEM Riemann solver (RS) is proposed that works for general conservative and non-conservative systems of hyperbolic equations. For non-conservative PDE, a path-conservative formulation of the HLLEM RS is presented for the first time in this paper. The HLLEM Riemann solver is built on top of a novel and very robust path-conservative HLL method. It thus naturally inherits the positivity properties and the entropy enforcement of the underlying HLL scheme. However, with just the slight additional cost of evaluating eigenvectors and eigenvalues of intermediate characteristic fields, we can represent linearly degenerate intermediate waves with a minimum of smearing. For conservative systems, our paper provides the easiest and most seamless path for taking a pre-existing HLL RS and quickly and effortlessly converting it to a RS that provides improved results, comparable with those of an HLLC, HLLD, Osher or Roe-type RS. This is done with minimal additional computational complexity, making our variant of the HLLEM RS also a very fast RS that can accurately represent linearly degenerate discontinuities. Our present HLLEM RS also transparently extends these advantages to non-conservative systems. For shallow water-type systems, the resulting method is proven to be well-balanced. Several test problems are presented for shallow water-type equations and two-phase flow models, as well as for gas dynamics with real equation of state, magnetohydrodynamics (MHD & RMHD), and nonlinear elasticity. Since our new formulation accommodates multiple intermediate waves and has a broader applicability than the original HLLEM method, it could alternatively be called the HLLI Riemann solver, where the “I” stands for the intermediate characteristic fields that can be accounted for.
Highlights:
• New simple and general path-conservative formulation of the HLLEM Riemann solver.
• Application to general conservative and non-conservative hyperbolic systems.
• Inclusion of sub-structure and resolution of intermediate characteristic fields.
• Well-balanced for single- and two-layer shallow water equations and multi-phase flows.
• Euler equations with real equation of state, MHD equations, nonlinear elasticity.
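For orientation, the classical two-wave HLL flux for a conservative system, on which HLLEM-type solvers build, can be written in a few lines. The sketch below is the textbook conservative HLL formula only; it does not include the path-conservative or HLLEM extensions proposed in the paper, and the function and argument names are hypothetical. The wave-speed estimates `s_left` and `s_right` are assumed to be supplied by the caller.

```python
def hll_flux(u_left, u_right, f_left, f_right, s_left, s_right):
    """Classical HLL numerical flux at a cell interface.

    u_left, u_right: conserved states on either side (lists of floats)
    f_left, f_right: physical fluxes evaluated at those states
    s_left, s_right: estimates of the slowest and fastest signal speeds
    """
    if s_left >= 0.0:
        # All waves move to the right: upwind on the left state.
        return list(f_left)
    if s_right <= 0.0:
        # All waves move to the left: upwind on the right state.
        return list(f_right)
    inv_width = 1.0 / (s_right - s_left)
    return [
        (s_right * fl - s_left * fr + s_left * s_right * (ur - ul)) * inv_width
        for ul, ur, fl, fr in zip(u_left, u_right, f_left, f_right)
    ]


# Made-up scalar example: Burgers flux f(u) = u**2 / 2 with u_L = -1, u_R = 1.
flux_fan = hll_flux([-1.0], [1.0], [0.5], [0.5], -1.0, 1.0)
flux_right_moving = hll_flux([2.0], [1.0], [2.0], [0.5], 1.0, 2.0)
```

The single averaged intermediate state in this formula is what smears contact and other linearly degenerate waves; solvers such as HLLC or HLLEM add structure inside the wave fan to reduce that smearing.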
Least-Squares Spectral Element Solutions to the CAA Workshop Benchmark Problems

NASA Technical Reports Server (NTRS)

Lin, Wen H.; Chan, Daniel C.

1997-01-01

This paper presents computed results for some of the CAA benchmark problems via the acoustic solver developed at Rocketdyne CFD Technology Center under the corporate agreement between Boeing North American, Inc. and NASA for the Aerospace Industry Technology Program. The calculations are considered as benchmark testing of the functionality, accuracy, and performance of the solver. Results of these computations demonstrate that the solver is capable of solving the propagation of aeroacoustic signals. Testing of sound generation and on more realistic problems is now pursued for the industrial applications of this solver. Numerical calculations were performed for the second problem of Category 1 of the current workshop problems for an acoustic pulse scattered from a rigid circular cylinder, and for two of the first CAA workshop problems, i.e., the first problem of Category 1 for the propagation of a linear wave and the first problem of Category 4 for an acoustic pulse reflected from a rigid wall in a uniform flow of Mach 0.5. The aim for including the last two problems in this workshop is to test the effectiveness of some boundary conditions set up in the solver. Numerical results of the last two benchmark problems have been compared with their corresponding exact solutions and the comparisons are excellent. This demonstrates the high fidelity of the solver in handling wave propagation problems. This feature makes the method quite attractive for developing a computational acoustic solver for calculating the aero/hydrodynamic noise in a violent flow environment.
RELATIVISTIC MAGNETOHYDRODYNAMICS: RENORMALIZED EIGENVECTORS AND FULL WAVE DECOMPOSITION RIEMANN SOLVER

DOE Office of Scientific and Technical Information (OSTI.GOV)

Anton, Luis; MartI, Jose M; Ibanez, Jose M

2010-05-01

We obtain renormalized sets of right and left eigenvectors of the flux vector Jacobians of the relativistic MHD equations, which are regular and span a complete basis in any physical state including degenerate ones. The renormalization procedure relies on the characterization of the degeneracy types in terms of the normal and tangential components of the magnetic field to the wave front in the fluid rest frame. Proper expressions of the renormalized eigenvectors in conserved variables are obtained through the corresponding matrix transformations. Our work completes previous analyses that present different sets of right eigenvectors for non-degenerate and degenerate states, and can be seen as a relativistic generalization of earlier work performed in classical MHD. Based on the full wave decomposition (FWD) provided by the renormalized set of eigenvectors in conserved variables, we have also developed a linearized (Roe-type) Riemann solver. Extensive testing against one- and two-dimensional standard numerical problems allows us to conclude that our solver is very robust. When compared with a family of simpler solvers that avoid the knowledge of the full characteristic structure of the equations in the computation of the numerical fluxes, our solver turns out to be less diffusive than HLL and HLLC, and comparable in accuracy to the HLLD solver. The number of operations needed by the FWD solver makes it less efficient computationally than those of the HLL family in one-dimensional problems. However, its relative efficiency increases in multidimensional simulations.
Optimal Facility Location Tool for Logistics Battle Command (LBC)

DTIC Science & Technology

2015-08-01

Appendix B. VBA Code...Appendix C. Story...should city planners have located emergency service facilities so that all households (the demand) had equal access to coverage?” The critical...programming language called Visual Basic for Applications (VBA). CPLEX is a commercial solver for linear, integer, and mixed integer linear programming problems
An HLLC Riemann solver for resistive relativistic magnetohydrodynamics

NASA Astrophysics Data System (ADS)

Miranda-Aranguren, S.; Aloy, M. A.; Rembiasz, T.

2018-05-01

We present a new approximate Riemann solver for the augmented system of equations of resistive relativistic magnetohydrodynamics that belongs to the family of Harten-Lax-van Leer contact wave (HLLC) solvers. In HLLC solvers, the solution is approximated by two constant states flanked by two shocks separated by a contact wave. The accuracy of the new approximate solver is calibrated through 1D and 2D test problems.

A scalable geometric multigrid solver for nonsymmetric elliptic systems with application to variable-density flows

NASA Astrophysics Data System (ADS)

Esmaily, M.; Jofre, L.; Mani, A.; Iaccarino, G.

2018-03-01

A geometric multigrid algorithm is introduced for solving nonsymmetric linear systems resulting from the discretization of the variable density Navier-Stokes equations on nonuniform structured rectilinear grids and high-Reynolds number flows. The restriction operation is defined such that the resulting system on the coarser grids is symmetric, thereby allowing for the use of efficient smoother algorithms. To achieve an optimal rate of convergence, the sequence of interpolation and restriction operations is determined through a dynamic procedure. A parallel partitioning strategy is introduced to minimize communication while maintaining the load balance between all processors. To test the proposed algorithm, we consider two cases: 1) homogeneous isotropic turbulence discretized on uniform grids and 2) turbulent duct flow discretized on stretched grids. Testing the algorithm on systems with up to a billion unknowns shows that the cost varies linearly with the number of unknowns. This O(N) behavior confirms the robustness of the proposed multigrid method regarding ill-conditioning of large systems characteristic of multiscale high-Reynolds number turbulent flows. The robustness of our method to density variations is established by considering cases where density varies sharply in space by a factor of up to 10^4, showing its applicability to two-phase flow problems. Strong and weak scalability studies are carried out, employing up to 30,000 processors, to examine the parallel performance of our implementation. Excellent scalability of our solver is shown for a granularity as low as 10^4 to 10^5 unknowns per processor. At its tested peak throughput, it solves approximately 4 billion unknowns per second employing over 16,000 processors with a parallel efficiency higher than 50%.
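The building blocks named in the abstract (smoother, restriction, interpolation) can be seen in a minimal geometric multigrid V-cycle. The sketch below is a deliberately simple stand-in, not the authors' algorithm: it assumes the symmetric, constant-coefficient 1-D Poisson problem -u'' = f with zero Dirichlet boundaries on a uniform grid, uses Gauss-Seidel smoothing, full-weighting restriction and linear interpolation, and every name in it is hypothetical.

```python
import math


def residual(u, f, h):
    """r = f - A u for A = tridiag(-1, 2, -1) / h^2 with zero boundary values."""
    n = len(u)
    r = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        r.append(f[i] - (2.0 * u[i] - left - right) / (h * h))
    return r


def smooth(u, f, h, sweeps=2):
    """Gauss-Seidel sweeps, updating u in place."""
    n = len(u)
    for _ in range(sweeps):
        for i in range(n):
            left = u[i - 1] if i > 0 else 0.0
            right = u[i + 1] if i < n - 1 else 0.0
            u[i] = 0.5 * (left + right + h * h * f[i])


def restrict(r):
    """Full weighting from 2m+1 fine points to m coarse points."""
    m = (len(r) - 1) // 2
    return [0.25 * r[2 * j] + 0.5 * r[2 * j + 1] + 0.25 * r[2 * j + 2]
            for j in range(m)]


def prolong(e, n):
    """Linear interpolation from m coarse points to n = 2m+1 fine points."""
    padded = [0.0] + list(e) + [0.0]  # include the zero boundary values
    fine = []
    for i in range(n):
        if i % 2 == 1:
            fine.append(e[i // 2])  # fine point coincides with a coarse point
        else:
            fine.append(0.5 * (padded[i // 2] + padded[i // 2 + 1]))
    return fine


def v_cycle(u, f, h):
    """One V-cycle; len(u) must be of the form 2**k - 1."""
    n = len(u)
    if n == 1:
        u[0] = 0.5 * h * h * f[0]  # exact solve on the coarsest grid
        return u
    smooth(u, f, h)
    coarse_rhs = restrict(residual(u, f, h))
    coarse_err = v_cycle([0.0] * len(coarse_rhs), coarse_rhs, 2.0 * h)
    for i, e in enumerate(prolong(coarse_err, n)):
        u[i] += e
    smooth(u, f, h)
    return u


# Model problem with exact solution u(x) = sin(pi x) on (0, 1).
n = 63
h = 1.0 / (n + 1)
x = [(i + 1) * h for i in range(n)]
f = [math.pi ** 2 * math.sin(math.pi * xi) for xi in x]
u = [0.0] * n
r0 = max(abs(v) for v in residual(u, f, h))
for _ in range(10):
    v_cycle(u, f, h)
r_final = max(abs(v) for v in residual(u, f, h))
err = max(abs(ui - math.sin(math.pi * xi)) for ui, xi in zip(u, x))
```

Each V-cycle touches every grid level once, and the coarser levels together cost less than the finest one, which is the origin of the O(N) work per cycle that the abstract reports at much larger scale.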
Fast secant methods for the iterative solution of large nonsymmetric linear systems

NASA Technical Reports Server (NTRS)

Deuflhard, Peter; Freund, Roland; Walter, Artur

1990-01-01

A family of secant methods based on general rank-1 updates was revisited in view of the construction of iterative solvers for large non-Hermitian linear systems. As it turns out, both Broyden's good and bad update techniques play a special role, but should be associated with two different line search principles. For Broyden's bad update technique, a minimum residual principle is natural, thus making it theoretically comparable with a series of well known algorithms like GMRES. Broyden's good update technique, however, is shown to be naturally linked with a minimum next correction principle, which asymptotically mimics a minimum error principle. The two minimization principles differ significantly for sufficiently large system dimension. Numerical experiments on discretized partial differential equations of convection diffusion type in 2-D with integral layers give a first impression of the possible power of the derived good Broyden variant.
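For reference, the two classical Broyden rank-1 updates referred to above can be stated as follows, in their standard textbook form for the real case; the notation is chosen here and is not taken from the report. For the linear residual $F(x) = Ax - b$, let $s_k = x_{k+1} - x_k$ and $y_k = F(x_{k+1}) - F(x_k) = A s_k$.

```latex
\begin{aligned}
\text{good Broyden } (B_k \approx A):\quad
  B_{k+1} &= B_k + \frac{(y_k - B_k s_k)\, s_k^{T}}{s_k^{T} s_k}, \\[4pt]
\text{bad Broyden } (H_k \approx A^{-1}):\quad
  H_{k+1} &= H_k + \frac{(s_k - H_k y_k)\, y_k^{T}}{y_k^{T} y_k}.
\end{aligned}
```

Both updates enforce the secant condition, $B_{k+1} s_k = y_k$ and $H_{k+1} y_k = s_k$ respectively, while changing the current approximation by a matrix of rank one.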
2D Electrostatic Potential Solver for Hall Thruster Simulation

DTIC Science & Technology

2006-07-12

... $\mu(j_r B_z - j_z B_r)B_r - j_r B_\theta) + \mu ne E_z + \mu \nabla_z p$, $j_r = \mu(j_z B_\theta - \mu(j_r B_z - j_z B_r)B_z) + \mu ne E_r + \mu \nabla_r p$ (5) This equation system can be solved to isolate the axial and ... Simplification: The following simplifications are used to more easily manipulate the equation system. $j_z = Z_3 E_z + Z_4 E_r + Z_5$, $j_r = R_3 E_z + R_4 E_r + R_5$ (10) where $Z_5$ ... $j-1$ $\Delta r$ (18) With appropriate boundary conditions, a pentadiagonal linear system of equations of the form $Ax = b$ can be constructed (where x is the ...
DOE Office of Scientific and Technical Information (OSTI.GOV)

Vay, Jean-Luc, E-mail: jlvay@lbl.gov; Haber, Irving; Godfrey, Brendan B.

Pseudo-spectral electromagnetic solvers (i.e. representing the fields in Fourier space) have extraordinary precision. In particular, Haber et al. presented in 1973 a pseudo-spectral solver that integrates analytically the solution over a finite time step, under the usual assumption that the source is constant over that time step. Yet, pseudo-spectral solvers have not been widely used, due in part to the difficulty of efficient parallelization owing to global communications associated with global FFTs on the entire computational domains. A method for the parallelization of electromagnetic pseudo-spectral solvers is proposed and tested on single electromagnetic pulses, and on Particle-In-Cell simulations of the wakefield formation in a laser plasma accelerator. The method takes advantage of the properties of the Discrete Fourier Transform, the linearity of Maxwell’s equations and the finite speed of light for limiting the communications of data within guard regions between neighboring computational domains. Although this requires a small approximation, test results show that no significant error is made on the test cases that have been presented. The proposed method opens the way to solvers combining the favorable parallel scaling of standard finite-difference methods with the accuracy advantages of pseudo-spectral methods.
A mass-conservative adaptive FAS multigrid solver for cell-centered finite difference methods on block-structured, locally-cartesian grids

NASA Astrophysics Data System (ADS)

Feng, Wenqiang; Guo, Zhenlin; Lowengrub, John S.; Wise, Steven M.

2018-01-01

We present a mass-conservative full approximation storage (FAS) multigrid solver for cell-centered finite difference methods on block-structured, locally cartesian grids. The algorithm is essentially a standard adaptive FAS (AFAS) scheme, but with a simple modification that comes in the form of a mass-conservative correction to the coarse-level force. This correction is facilitated by the creation of a zombie variable, analogous to a ghost variable, but defined on the coarse grid and lying under the fine grid refinement patch. We show that a number of different types of fine-level ghost cell interpolation strategies could be used in our framework, including low-order linear interpolation. In our approach, the smoother, prolongation, and restriction operations need never be aware of the mass conservation conditions at the coarse-fine interface. To maintain global mass conservation, we need only modify the usual FAS algorithm by correcting the coarse-level force function at points adjacent to the coarse-fine interface. We demonstrate through simulations that the solver converges geometrically, at a rate that is h-independent, and we show the generality of the solver, applying it to several nonlinear, time-dependent, and multi-dimensional problems. In several tests, we show that second-order asymptotic (h → 0) convergence is observed for the discretizations, provided that (1) at least linear interpolation of the ghost variables is employed, and (2) the mass conservation corrections are applied to the coarse-level force term.
BioFVM: an efficient, parallelized diffusive transport solver for 3-D biological simulations

PubMed Central

Ghaffarizadeh, Ahmadreza; Friedman, Samuel H.; Macklin, Paul

2016-01-01

Motivation: Computational models of multicellular systems require solving systems of PDEs for release, uptake, decay and diffusion of multiple substrates in 3D, particularly when incorporating the impact of drugs, growth substrates and signaling factors on cell receptors and subcellular systems biology. Results: We introduce BioFVM, a diffusive transport solver tailored to biological problems. BioFVM can simulate release and uptake of many substrates by cell and bulk sources, diffusion and decay in large 3D domains. It has been parallelized with OpenMP, allowing efficient simulations on desktop workstations or single supercomputer nodes. The code is stable even for large time steps, with linear computational cost scalings. Solutions are first-order accurate in time and second-order accurate in space. The code can be run by itself or as part of a larger simulator. Availability and implementation: BioFVM is written in C++ with parallelization in OpenMP. It is maintained and available for download at http://BioFVM.MathCancer.org and http://BioFVM.sf.net under the Apache License (v2.0). Contact: paul.macklin@usc.edu. Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26656933
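The combination of properties claimed above (stable for large time steps, first-order in time, second-order in space, linear cost) can be illustrated with a minimal one-dimensional sketch. It assumes a backward Euler discretization of the diffusion-decay equation with zero-flux boundaries, solved with the Thomas algorithm; the abstract does not state which scheme BioFVM uses, so the function names (`thomas_solve`, `implicit_step`) and all parameter values here are hypothetical and are not BioFVM's API.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system in O(n) with the Thomas algorithm.

    lower[i] multiplies x[i-1] and upper[i] multiplies x[i+1] in row i;
    lower[0] and upper[-1] are ignored.
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def implicit_step(rho, diffusion, decay, dt, dx):
    """One backward Euler step of d(rho)/dt = D * d2(rho)/dx2 - decay * rho.

    Zero-flux boundaries are assumed (an illustrative choice).
    """
    n = len(rho)
    r = diffusion * dt / dx ** 2
    lower = [-r] * n
    upper = [-r] * n
    diag = [1.0 + decay * dt + 2.0 * r] * n
    # Boundary cells have only one neighbour under zero-flux conditions.
    diag[0] = diag[-1] = 1.0 + decay * dt + r
    return thomas_solve(lower, diag, upper, rho)


# A point release in the middle of the domain, advanced with a large time step.
rho = [0.0] * 21
rho[10] = 1.0
rho = implicit_step(rho, diffusion=1.0, decay=0.1, dt=10.0, dx=1.0)
print(sum(rho))  # total mass is divided by (1 + decay * dt) = 2, so about 0.5
```

The time step here is far beyond the explicit stability limit dx^2 / (2 D), yet the result stays non-negative and the total mass changes only through decay, which is the behaviour an unconditionally stable implicit step is expected to show.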
Performance of a parallel algebraic multilevel preconditioner for stabilized finite element semiconductor device modeling

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lin, Paul T.; Shadid, John N.; Sala, Marzio

In this study results are presented for the large-scale parallel performance of an algebraic multilevel preconditioner for solution of the drift-diffusion model for semiconductor devices. The preconditioner is the key numerical procedure determining the robustness, efficiency and scalability of the fully-coupled Newton-Krylov based, nonlinear solution method that is employed for this system of equations. The coupled system is comprised of a source term dominated Poisson equation for the electric potential, and two convection-diffusion-reaction type equations for the electron and hole concentration. The governing PDEs are discretized in space by a stabilized finite element method. Solution of the discrete system is obtained through a fully-implicit time integrator, a fully-coupled Newton-based nonlinear solver, and a restarted GMRES Krylov linear system solver. The algebraic multilevel preconditioner is based on an aggressive coarsening graph partitioning of the nonzero block structure of the Jacobian matrix. Representative performance results are presented for various choices of multigrid V-cycles and W-cycles and parameter variations for smoothers based on incomplete factorizations. Parallel scalability results are presented for solution of up to 10^8 unknowns on 4096 processors of a Cray XT3/4 and an IBM POWER eServer system.
Multiphase three-dimensional direct numerical simulation of a rotating impeller with code Blue

NASA Astrophysics Data System (ADS)

Kahouadji, Lyes; Shin, Seungwon; Chergui, Jalel; Juric, Damir; Craster, Richard V.; Matar, Omar K.

2017-11-01

The flow driven by a rotating impeller inside an open fixed cylindrical cavity is simulated using code Blue, a solver for massively-parallel simulations of fully three-dimensional multiphase flows. The impeller is composed of four blades at a 45° inclination all attached to a central hub and tube stem. In Blue, solid forms are constructed through the definition of immersed objects via a distance function that accounts for the object's interaction with the flow for both single and two-phase flows. We use a moving frame technique for imposing translation and/or rotation. The variation of the Reynolds number, the clearance, and the tank aspect ratio are considered, and we highlight the importance of the confinement ratio (blade radius versus the tank radius) in the mixing process. Blue uses a domain decomposition strategy for parallelization with MPI. The fluid interface solver is based on a parallel implementation of a hybrid front-tracking/level-set method designed for complex interfacial topological changes. Parallel GMRES and multigrid iterative solvers are applied to the linear systems arising from the implicit solution for the fluid velocities and pressure in the presence of strong density and viscosity discontinuities across fluid phases. EPSRC, UK, MEMPHIS program Grant (EP/K003976/1), RAEng Research Chair (OKM).
A non-linear regression analysis program for describing electrophysiological data with multiple functions using Microsoft Excel.

PubMed

Brown, Angus M

2006-04-01

The objective of the present study was to demonstrate a method for fitting complex electrophysiological data with multiple functions using the SOLVER add-in of the ubiquitous spreadsheet Microsoft Excel. SOLVER minimizes the sum of the squares of the differences between the data to be fit and the function(s) describing the data, using an iterative generalized reduced gradient method. While it is a straightforward procedure to fit data with linear functions, and we have previously demonstrated a method of non-linear regression analysis of experimental data based upon a single function, it is more complex to fit data with multiple functions, usually requiring specialized expensive computer software. In this paper we describe an easily understood program for fitting experimentally acquired data, in this case the stimulus-evoked compound action potential from the mouse optic nerve, with multiple Gaussian functions. The program is flexible and can be applied to describe data with a wide variety of user-input functions.
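The quantity being minimized in such a fit can be written down compactly. The following is a minimal sketch of a sum-of-Gaussians model and its sum-of-squares objective; the function names, the parameter layout (amplitude, centre, width per peak) and the synthetic data are assumptions made for illustration, and the minimization itself (done by SOLVER in the paper) is not reproduced here.

```python
import math


def multi_gaussian(x, peaks):
    """Sum of Gaussian peaks; each peak is (amplitude, centre, width)."""
    return sum(
        a * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))
        for a, mu, sigma in peaks
    )


def sum_squared_error(xs, ys, peaks):
    """Objective an optimizer would minimize over the peak parameters."""
    return sum((y - multi_gaussian(x, peaks)) ** 2 for x, y in zip(xs, ys))


# Synthetic two-peak "recording" (hypothetical values, no noise).
true_peaks = [(1.0, 2.0, 0.5), (0.6, 4.0, 0.8)]
xs = [0.1 * i for i in range(61)]
ys = [multi_gaussian(x, true_peaks) for x in xs]

# A slightly wrong first guess gives a larger objective than the true parameters.
guess = [(0.8, 2.2, 0.5), (0.6, 4.0, 0.8)]
print(sum_squared_error(xs, ys, guess) > sum_squared_error(xs, ys, true_peaks))
```

Any general-purpose optimizer can then be pointed at `sum_squared_error` with the flattened peak parameters as its variables, which mirrors the role of the changing cells in a spreadsheet setup.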
Nonlinear Krylov and moving nodes in the method of lines

NASA Astrophysics Data System (ADS)

Miller, Keith

2005-11-01

We report on some successes and problem areas in the Method of Lines from our work with moving node finite element methods. First, we report on our "nonlinear Krylov accelerator" for the modified Newton's method on the nonlinear equations of our stiff ODE solver. Since 1990 it has been robust, simple, cheap, and automatic on all our moving node computations. We publicize further trials with it here because it should be of great general usefulness to all those solving evolutionary equations. Second, we discuss the need for reliable automatic choice of spatially variable time steps. Third, we discuss the need for robust and efficient iterative solvers for the difficult linearized equations (Jx=b) of our stiff ODE solver. Here, the 1997 thesis of Zulu Xaba has made significant progress.
Hybrid discrete ordinates and characteristics method for solving the linear Boltzmann equation

NASA Astrophysics Data System (ADS)

Yi, Ce

With the ability of computer hardware and software increasing rapidly, deterministic methods to solve the linear Boltzmann equation (LBE) have attracted some attention for computational applications in both the nuclear engineering and medical physics fields. Among various deterministic methods, the discrete ordinates method (SN) and the method of characteristics (MOC) are two of the most widely used methods. The SN method is the traditional approach to solving the LBE because of its stability and efficiency, while the MOC has some advantages in treating complicated geometries. However, in 3-D problems requiring a dense discretization grid in phase space (i.e., a large number of spatial meshes, directions, or energy groups), both methods could suffer from the need for large amounts of memory and computation time. In our study, we developed a new hybrid algorithm by combining the two methods into one code, TITAN. The hybrid approach is specifically designed for application to problems containing low scattering regions. A new serial 3-D time-independent transport code has been developed. Under the hybrid approach, the preferred method can be applied in different regions (blocks) within the same problem model. Since the characteristics method is numerically more efficient in low scattering media, the hybrid approach uses a block-oriented characteristics solver in low scattering regions, and a block-oriented SN solver in the remainder of the physical model. In the TITAN code, a physical problem model is divided into a number of coarse meshes (blocks) in Cartesian geometry. Either the characteristics solver or the SN solver can be chosen to solve the LBE within a coarse mesh. A coarse mesh can be filled with fine meshes or characteristic rays depending on the solver assigned to the coarse mesh.
Furthermore, with its object-oriented programming paradigm and layered code structure, TITAN allows different individual spatial meshing schemes and angular quadrature sets for each coarse mesh. Two quadrature types (level-symmetric and Legendre-Chebyshev quadrature) along with the ordinate splitting techniques (rectangular splitting and PN-TN splitting) are implemented. In the SN solver, we apply a memory-efficient 'front-line' style paradigm to handle the fine mesh interface fluxes. In the characteristics solver, we have developed a novel 'backward' ray-tracing approach, in which a bi-linear interpolation procedure is used on the incoming boundaries of a coarse mesh. A CPU-efficient scattering kernel is shared in both solvers within the source iteration scheme. Angular and spatial projection techniques are developed to transfer the angular fluxes on the interfaces of coarse meshes with different discretization grids. The performance of the hybrid algorithm is tested in a number of benchmark problems in both nuclear engineering and medical physics fields. Among them are the Kobayashi benchmark problems and a computational tomography (CT) device model. We also developed an extra sweep procedure with the fictitious quadrature technique to calculate angular fluxes along directions of interest. The technique is applied in a single photon emission computed tomography (SPECT) phantom model to simulate the SPECT projection images. The accuracy and efficiency of the TITAN code are demonstrated in these benchmarks along with its scalability. A modified version of the characteristics solver is integrated in the PENTRAN code and tested within the parallel engine of PENTRAN. The limitations on the hybrid algorithm are also studied.
Practical Aerodynamic Design Optimization Based on the Navier-Stokes Equations and a Discrete Adjoint Method

NASA Technical Reports Server (NTRS)

Grossman, Bernard

1999-01-01

The technical details are summarized below: Compressible and incompressible versions of a three-dimensional unstructured mesh Reynolds-averaged Navier-Stokes flow solver have been differentiated and resulting derivatives have been verified by comparisons with finite differences and a complex-variable approach. In this implementation, the turbulence model is fully coupled with the flow equations in order to achieve this consistency. The accuracy demonstrated in the current work represents the first time that such an approach has been successfully implemented. The accuracy of a number of simplifying approximations to the linearizations of the residual has been examined. A first-order approximation to the dependent variables in both the adjoint and design equations has been investigated. The effects of a "frozen" eddy viscosity and the ramifications of neglecting some mesh sensitivity terms were also examined. It has been found that none of the approximations yielded derivatives of acceptable accuracy, and the derivatives were often of incorrect sign. However, numerical experiments indicate that an incomplete convergence of the adjoint system often yields sufficiently accurate derivatives, thereby significantly lowering the time required for computing sensitivity information. The convergence rate of the adjoint solver relative to the flow solver has been examined. Inviscid adjoint solutions typically require one to four times the cost of a flow solution, while for turbulent adjoint computations, this ratio can reach as high as eight to ten. Numerical experiments have shown that the adjoint solver can stall before converging the solution to machine accuracy, particularly for viscous cases. A possible remedy for this phenomenon would be to include the complete higher-order linearization in the preconditioning step, or to employ a simple form of mesh sequencing to obtain better approximations to the solution through the use of coarser meshes.
An efficient surface parameterization based on a free-form deformation technique has been utilized and the resulting codes have been integrated with an optimization package. Lastly, sample optimizations have been shown for inviscid and turbulent flow over an ONERA M6 wing. Drag reductions have been demonstrated by reducing shock strengths across the span of the wing.
Application of Conjugate Gradient methods to tidal simulation

USGS Publications Warehouse

Barragy, E.; Carey, G.F.; Walters, R.A.

1993-01-01

A harmonic decomposition technique is applied to the shallow water equations to yield a complex, nonsymmetric, nonlinear, Helmholtz type problem for the sea surface and an accompanying complex, nonlinear diagonal problem for the velocities. The equation for the sea surface is linearized using successive approximation and then discretized with linear, triangular finite elements. The study focuses on applying iterative methods to solve the resulting complex linear systems. The comparative evaluation includes both standard iterative methods for the real subsystems and complex versions of the well known Bi-Conjugate Gradient and Bi-Conjugate Gradient Squared methods. Several Incomplete LU type preconditioners are discussed, and the effects of node ordering, rejection strategy, domain geometry and Coriolis parameter (affecting asymmetry) are investigated. Implementation details for the complex case are discussed. Performance studies are presented and comparisons made with a frontal solver. © 1993.
PDE-based geophysical modelling using finite elements: examples from 3D resistivity and 2D magnetotellurics

NASA Astrophysics Data System (ADS)

Schaa, R.; Gross, L.; du Plessis, J.

2016-04-01

We present a general finite-element solver, escript, tailored to solve geophysical forward and inverse modeling problems in terms of partial differential equations (PDEs) with suitable boundary conditions. Escript’s abstract interface allows geoscientists to focus on solving the actual problem without being experts in numerical modeling. General-purpose finite element solvers have found wide use especially in engineering fields and find increasing application in the geophysical disciplines as these offer a single interface to tackle different geophysical problems. These solvers are useful for data interpretation and for research, but can also be a useful tool in educational settings. This paper serves as an introduction into PDE-based modeling with escript where we demonstrate in detail how escript is used to solve two different forward modeling problems from applied geophysics (3D DC resistivity and 2D magnetotellurics). Based on these two different cases, other geophysical modeling work can easily be realized. The escript package is implemented as a Python library and allows the solution of coupled, linear or non-linear, time-dependent PDEs. Parallel execution for both shared and distributed memory architectures is supported and can be used without modifications to the scripts.
IETI – Isogeometric Tearing and Interconnecting

PubMed Central

Kleiss, Stefan K.; Pechstein, Clemens; Jüttler, Bert; Tomar, Satyendra

2012-01-01

Finite Element Tearing and Interconnecting (FETI) methods are a powerful approach to designing solvers for large-scale problems in computational mechanics. The numerical simulation problem is subdivided into a number of independent sub-problems, which are then coupled in appropriate ways. NURBS- (Non-Uniform Rational B-spline) based isogeometric analysis (IGA) applied to complex geometries requires representing the computational domain as a collection of several NURBS geometries. Since there is a natural decomposition of the computational domain into several subdomains, NURBS-based IGA is particularly well suited for using FETI methods. This paper proposes the new IsogEometric Tearing and Interconnecting (IETI) method, which combines the advanced solver design of FETI with the exact geometry representation of IGA. We describe the IETI framework for two classes of simple model problems (Poisson and linearized elasticity) and discuss the coupling of the subdomains along interfaces (both for matching interfaces and for interfaces with T-joints, i.e. hanging nodes). Special attention is paid to the construction of a suitable preconditioner for the iterative linear solver used for the interface problem. We report several computational experiments to demonstrate the performance of the proposed IETI method. PMID:24511167
Composite solvers for linear saddle point problems arising from the incompressible Stokes equations with highly heterogeneous viscosity structure

NASA Astrophysics Data System (ADS)

Sanan, P.; Schnepp, S. M.; May, D.; Schenk, O.

2014-12-01

Geophysical applications require efficient forward models for non-linear Stokes flow on high resolution spatio-temporal domains. The bottleneck in applying the forward model is solving the linearized, discretized Stokes problem which takes the form of a large, indefinite (saddle point) linear system. Due to the heterogeneity of the effective viscosity in the elliptic operator, devising effective preconditioners for saddle point problems has proven challenging and highly problem-dependent. Nevertheless, at least three approaches show promise for preconditioning these difficult systems in an algorithmically scalable way using multigrid and/or domain decomposition techniques. The first is to work with a hierarchy of coarser or smaller saddle point problems. The second is to use the Schur complement method to decouple and sequentially solve for the pressure and velocity. The third is to use the Schur decomposition to devise preconditioners for the full operator. These involve sub-solves resembling inexact versions of the sequential solve. The choice of approach and sub-methods depends crucially on the motivating physics, the discretization, and available computational resources. Here we examine the performance trade-offs for preconditioning strategies applied to idealized models of mantle convection and lithospheric dynamics, characterized by large viscosity gradients. Due to the arbitrary topological structure of the viscosity field in geodynamical simulations, we utilize low order, inf-sup stable mixed finite element spatial discretizations which are suitable when sharp viscosity variations occur in element interiors.
Particular attention is paid to possibilities within the decoupled and approximate Schur complement factorization-based monolithic approaches to leverage recently-developed flexible, communication-avoiding, and communication-hiding Krylov subspace methods in combination with `heavy' smoothers, which require solutions of large per-node sub-problems, well-suited to solution on hybrid computational clusters. To manage the combinatorial explosion of solver options (which include hybridizations of all the approaches mentioned above), we leverage the modularity of the PETSc library.
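The Schur complement approaches mentioned in this abstract can be made concrete with a short worked derivation. The notation is an assumption, since the abstract introduces no symbols: A denotes the discrete viscous block (taken to be invertible), B the discrete divergence, u the velocity unknowns and p the pressure unknowns.

```latex
% Discrete Stokes (saddle point) system
\begin{pmatrix} A & B^{T} \\ B & 0 \end{pmatrix}
\begin{pmatrix} u \\ p \end{pmatrix}
=
\begin{pmatrix} f \\ g \end{pmatrix}

% Eliminate the velocity from the first block row:
u = A^{-1}\left(f - B^{T} p\right)

% Substitute into the second block row to get the pressure equation,
% with Schur complement S = B A^{-1} B^{T}:
S\,p = B A^{-1} f - g

% Sequential (decoupled) solve: first S p = B A^{-1} f - g, then A u = f - B^{T} p.

% The same elimination written as a block factorization, which is the
% starting point for preconditioners of the full operator:
\begin{pmatrix} A & B^{T} \\ B & 0 \end{pmatrix}
=
\begin{pmatrix} I & 0 \\ B A^{-1} & I \end{pmatrix}
\begin{pmatrix} A & 0 \\ 0 & -S \end{pmatrix}
\begin{pmatrix} I & A^{-1} B^{T} \\ 0 & I \end{pmatrix}
```

In practice neither the inverse of A nor S is formed explicitly; replacing them with approximate sub-solves gives the inexact variants the abstract refers to.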
The novel high-performance 3-D MT inverse solver

NASA Astrophysics Data System (ADS)

Kruglyakov, Mikhail; Geraskin, Alexey; Kuvshinov, Alexey

2016-04-01

We present a novel, robust, scalable, and fast 3-D magnetotelluric (MT) inverse solver. The solver is written in a multi-language paradigm to make it as efficient, readable and maintainable as possible. Separation of concerns and single responsibility concepts run through the implementation of the solver. As a forward modelling engine, a modern scalable solver, extrEMe, based on the contracting integral equation approach, is used. An iterative gradient-type (quasi-Newton) optimization scheme is invoked to search for the (regularized) inverse problem solution, and the adjoint source approach is used to calculate the gradient of the misfit efficiently. The inverse solver is able to deal with highly detailed and contrasting models, allows for working (separately or jointly) with any type of MT responses, and supports massive parallelization. Moreover, different parallelization strategies implemented in the code allow optimal usage of available computational resources for a given problem statement. To parameterize the inverse domain, the so-called mask parameterization is implemented, which means that one can merge any subset of forward modelling cells in order to account for the (usually) irregular distribution of observation sites. We report results of 3-D numerical experiments aimed at analysing the robustness, performance and scalability of the code. In particular, our computational experiments carried out on different platforms ranging from modern laptops to the HPC system Piz Daint (6th supercomputer in the world) demonstrate practically linear scalability of the code up to thousands of nodes.
Algorithm 937: MINRES-QLP for Symmetric and Hermitian Linear Equations and Least-Squares Problems.

PubMed

Choi, Sou-Cheng T; Saunders, Michael A

2014-02-01

We describe algorithm MINRES-QLP and its FORTRAN 90 implementation for solving symmetric or Hermitian linear systems or least-squares problems. If the system is singular, MINRES-QLP computes the unique minimum-length solution (also known as the pseudoinverse solution), which generally eludes MINRES. In all cases, it overcomes a potential instability in the original MINRES algorithm. A positive-definite pre-conditioner may be supplied. Our FORTRAN 90 implementation illustrates a design pattern that allows users to make problem data known to the solver but hidden and secure from other program units. In particular, we circumvent the need for reverse communication. Example test programs input and solve real or complex problems specified in Matrix Market format. While we focus here on a FORTRAN 90 implementation, we also provide and maintain MATLAB versions of MINRES and MINRES-QLP.
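The "minimum-length solution" for a singular system can be stated precisely. The following is the standard definition of the pseudoinverse solution, written in notation that the abstract does not itself introduce (A is the symmetric or Hermitian matrix, b the right-hand side); it describes the target of the computation, not the MINRES-QLP iteration.

```latex
% Set of least-squares solutions (non-unique when A is singular)
\mathcal{S} = \left\{ x : \|Ax - b\|_2 = \min_{z} \|Az - b\|_2 \right\}

% Minimum-length (pseudoinverse) solution: the unique element of smallest norm
x^{\dagger} = \arg\min_{x \in \mathcal{S}} \|x\|_2 = A^{\dagger} b

% For Hermitian A = Q \Lambda Q^{H} with eigenvalues \lambda_i:
A^{\dagger} = Q \Lambda^{\dagger} Q^{H},
\qquad
\left(\Lambda^{\dagger}\right)_{ii} =
\begin{cases}
1/\lambda_i, & \lambda_i \neq 0 \\
0, & \lambda_i = 0
\end{cases}
```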
MODFLOW-NWT, A Newton formulation for MODFLOW-2005

USGS Publications Warehouse

Niswonger, Richard G.; Panday, Sorab; Ibaraki, Motomu

2011-01-01

This report documents a Newton formulation of MODFLOW-2005, called MODFLOW-NWT. MODFLOW-NWT is a standalone program that is intended for solving problems involving drying and rewetting nonlinearities of the unconfined groundwater-flow equation. MODFLOW-NWT must be used with the Upstream-Weighting (UPW) Package for calculating intercell conductances in a different manner than is done in the Block-Centered Flow (BCF), Layer Property Flow (LPF), or Hydrogeologic-Unit Flow (HUF; Anderman and Hill, 2000) Packages. The UPW Package treats nonlinearities of cell drying and rewetting by use of a continuous function of groundwater head, rather than the discrete approach of drying and rewetting that is used by the BCF, LPF, and HUF Packages. This further enables application of the Newton formulation for unconfined groundwater-flow problems because conductance derivatives required by the Newton method are smooth over the full range of head for a model cell. The NWT linearization approach generates an asymmetric matrix, which is different from the standard MODFLOW formulation that generates a symmetric matrix. Because all linear solvers presently available for use with MODFLOW-2005 solve only symmetric matrices, MODFLOW-NWT includes two previously developed asymmetric matrix-solver options. The matrix-solver options include a generalized-minimum-residual (GMRES) Solver and an Orthomin / stabilized conjugate-gradient (CGSTAB) Solver. The GMRES Solver is documented in a previously published report, such that only a brief description and input instructions are provided in this report. However, the CGSTAB Solver (called XMD) is documented in this report. Flow-property input for the UPW Package is designed based on the LPF Package and material-property input is identical to that for the LPF Package except that the rewetting and vertical-conductance correction options of the LPF Package are not available with the UPW Package. 
Input files constructed for the LPF Package can be used with slight modification as input for the UPW Package. This report presents the theory and methods used by MODFLOW-NWT, including the UPW Package. Additionally, this report provides comparisons of the new methodology to analytical solutions of groundwater flow and to standard MODFLOW-2005 results by use of an unconfined aquifer MODFLOW example problem. The standard MODFLOW-2005 simulation uses the LPF Package with the wet/dry option active. A new example problem also is presented to demonstrate MODFLOW-NWT's ability to provide a solution for a difficult unconfined groundwater-flow problem.
Integrated Modeling Tools for Thermal Analysis and Applications

NASA Technical Reports Server (NTRS)

Milman, Mark H.; Needels, Laura; Papalexandris, Miltiadis

1999-01-01

Integrated modeling of spacecraft systems is a rapidly evolving area in which multidisciplinary models are developed to design and analyze spacecraft configurations. These models are especially important in the early design stages where rapid trades between subsystems can substantially impact design decisions. Integrated modeling is one of the cornerstones of two of NASA's planned missions in the Origins Program -- the Next Generation Space Telescope (NGST) and the Space Interferometry Mission (SIM). Common modeling tools for control design and opto-mechanical analysis have recently emerged and are becoming increasingly widely used. A discipline that has been somewhat less integrated, but is nevertheless of critical concern for high precision optical instruments, is thermal analysis and design. A major factor contributing to this mild estrangement is that the modeling philosophies and objectives for structural and thermal systems typically do not coincide. Consequently the tools that are used in these disciplines suffer a degree of incompatibility, each having developed along their own evolutionary path. Although standard thermal tools have worked relatively well in the past, integration with other disciplines requires revisiting modeling assumptions and solution methods. Over the past several years we have been developing a MATLAB based integrated modeling tool called IMOS (Integrated Modeling of Optical Systems) which integrates many aspects of structural, optical, control and dynamical analysis disciplines. Recent efforts have included developing a thermal modeling and analysis capability, which is the subject of this article. Currently, the IMOS thermal suite contains steady state and transient heat equation solvers, and the ability to set up the linear conduction network from an IMOS finite element model.
The IMOS code generates linear conduction elements associated with plates and beams/rods of the thermal network directly from the finite element structural model. Conductances for temperature varying materials are accommodated. This capability both streamlines the process of developing the thermal model from the finite element model, and also makes the structural and thermal models compatible in the sense that each structural node is associated with a thermal node. This is particularly useful when the purpose of the analysis is to predict structural deformations due to thermal loads. The steady state solver uses a restricted step size Newton method, and the transient solver is an adaptive step size implicit method applicable to general differential algebraic systems. Temperature dependent conductances and capacitances are accommodated by the solvers. In addition to discussing the modeling and solution methods, applications where the thermal modeling is "in the loop" with sensitivity analysis, optimization and optical performance, drawn from our experiences with the Space Interferometry Mission (SIM) and the Next Generation Space Telescope (NGST), are presented.
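A linear conduction network of the kind described above reduces, at steady state, to a small linear system for the node temperatures. The sketch below is a simplified illustration only: it assumes constant conductances (so no Newton iteration is needed), it is written in Python rather than MATLAB, and the function names and the three-node example are hypothetical, not part of IMOS.

```python
def gauss_solve(matrix, rhs):
    """Dense Gaussian elimination with partial pivoting (small systems only)."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def steady_state(n_nodes, links, loads, fixed):
    """Node temperatures of a linear conduction network.

    links: (i, j, conductance) triples; loads: {node: heat input};
    fixed: {node: prescribed temperature}.
    """
    # Assemble the conductance matrix the same way a stiffness matrix is built.
    G = [[0.0] * n_nodes for _ in range(n_nodes)]
    for i, j, g in links:
        G[i][i] += g
        G[j][j] += g
        G[i][j] -= g
        G[j][i] -= g
    free = [i for i in range(n_nodes) if i not in fixed]
    # Move the prescribed temperatures to the right-hand side.
    A = [[G[i][j] for j in free] for i in free]
    b = [loads.get(i, 0.0) - sum(G[i][j] * t for j, t in fixed.items())
         for i in free]
    temps = dict(fixed)
    temps.update(zip(free, gauss_solve(A, b)))
    return [temps[i] for i in range(n_nodes)]


# Chain of three nodes: a 300 K sink, two links (1 W/K and 2 W/K), 2 W at the end.
print(steady_state(3, [(0, 1, 1.0), (1, 2, 2.0)], {2: 2.0}, {0: 300.0}))
# Hand check: 2 W flows through both links, giving 300, 302 and 303 K.
```

With temperature-dependent conductances the matrix G would itself depend on the unknown temperatures, which is where a nonlinear iteration such as the restricted step size Newton method named in the abstract comes in.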

Computational aeroelasticity using a pressure-based solver

NASA Astrophysics Data System (ADS)

Kamakoti, Ramji

A computational methodology for performing fluid-structure interaction computations for three-dimensional elastic wing geometries is presented. The flow solver used is based on an unsteady Reynolds-Averaged Navier-Stokes (RANS) model. A well validated k-ε turbulence model with wall function treatment for near wall region was used to perform turbulent flow calculations. Relative merits of alternative flow solvers were investigated. The predictor-corrector-based Pressure Implicit Splitting of Operators (PISO) algorithm was found to be computationally economic for unsteady flow computations. Wing structure was modeled using Bernoulli-Euler beam theory. A fully implicit time-marching scheme (using the Newmark integration method) was used to integrate the equations of motion for structure. Bilinear interpolation and linear extrapolation techniques were used to transfer necessary information between fluid and structure solvers. Geometry deformation was accounted for by using a moving boundary module. The moving grid capability was based on a master/slave concept and transfinite interpolation techniques. Since computations were performed on a moving mesh system, the geometric conservation law must be preserved. This is achieved by appropriately evaluating the Jacobian values associated with each cell. Accurate computation of contravariant velocities for unsteady flows using the momentum interpolation method on collocated, curvilinear grids was also addressed. Flutter computations were performed for the AGARD 445.6 wing at subsonic, transonic and supersonic Mach numbers. Unsteady computations were performed at various dynamic pressures to predict the flutter boundary. Results showed favorable agreement of experiment and previous numerical results. The computational methodology exhibited capabilities to predict both qualitative and quantitative features of aeroelasticity.
Use of general purpose graphics processing units with MODFLOW

USGS Publications Warehouse

Hughes, Joseph D.; White, Jeremy T.

2013-01-01

To evaluate the use of general-purpose graphics processing units (GPGPUs) to improve the performance of MODFLOW, an unstructured preconditioned conjugate gradient (UPCG) solver has been developed. The UPCG solver uses a compressed sparse row storage scheme and includes Jacobi, zero fill-in incomplete, and modified-incomplete lower-upper (LU) factorization, and generalized least-squares polynomial preconditioners. The UPCG solver also includes options for sequential and parallel solution on the central processing unit (CPU) using OpenMP. For simulations utilizing the GPGPU, all basic linear algebra operations are performed on the GPGPU; memory copies between the CPU and GPGPU occur prior to the first iteration of the UPCG solver and after satisfying head and flow criteria or exceeding a maximum number of iterations. The efficiency of the UPCG solver for GPGPU and CPU solutions is benchmarked using simulations of a synthetic, heterogeneous unconfined aquifer with tens of thousands to millions of active grid cells. Testing indicates GPGPU speedups on the order of 2 to 8, relative to the standard MODFLOW preconditioned conjugate gradient (PCG) solver, can be achieved when (1) memory copies between the CPU and GPGPU are optimized, (2) the percentage of time performing memory copies between the CPU and GPGPU is small relative to the calculation time, (3) high-performance GPGPU cards are utilized, and (4) CPU-GPGPU combinations are used to execute sequential operations that are difficult to parallelize. Furthermore, UPCG solver testing indicates GPGPU speedups exceed parallel CPU speedups achieved using OpenMP on multicore CPUs for preconditioners that can be easily parallelized.
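
To make the combination of compressed sparse row storage and a Jacobi preconditioner concrete, the following is a minimal serial sketch in Python. It is not the UPCG code: the function names, the tolerance, and the 3x3 test matrix are assumptions chosen for illustration, and no GPGPU or OpenMP parallelism is shown.

```python
def csr_matvec(data, indices, indptr, x):
    """Return y = A x for a matrix A stored in compressed sparse row form."""
    y = [0.0] * (len(indptr) - 1)
    for row in range(len(y)):
        for k in range(indptr[row], indptr[row + 1]):
            y[row] += data[k] * x[indices[k]]
    return y


def csr_diagonal(data, indices, indptr):
    """Extract the diagonal of A; the Jacobi preconditioner divides by it."""
    n = len(indptr) - 1
    diag = [0.0] * n
    for row in range(n):
        for k in range(indptr[row], indptr[row + 1]):
            if indices[k] == row:
                diag[row] = data[k]
    return diag


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def jacobi_pcg(data, indices, indptr, b, tol=1e-10, max_iter=200):
    """Solve A x = b (A symmetric positive definite) with Jacobi-preconditioned CG."""
    diag = csr_diagonal(data, indices, indptr)
    x = [0.0] * len(b)
    r = list(b)                                  # residual for a zero initial guess
    z = [ri / di for ri, di in zip(r, diag)]     # apply the preconditioner
    p = list(z)
    rz = dot(r, z)
    for iteration in range(1, max_iter + 1):
        Ap = csr_matvec(data, indices, indptr, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if dot(r, r) ** 0.5 < tol:
            return x, iteration
        z = [ri / di for ri, di in zip(r, diag)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


# Hypothetical test matrix [[4,-1,0],[-1,4,-1],[0,-1,4]] in CSR form.
data = [4.0, -1.0, -1.0, 4.0, -1.0, -1.0, 4.0]
indices = [0, 1, 0, 1, 2, 1, 2]
indptr = [0, 2, 5, 7]
b = [3.0, 2.0, 3.0]          # right-hand side for the exact solution [1, 1, 1]
solution, iterations = jacobi_pcg(data, indices, indptr, b)
```

Every operation inside the loop is a matrix-vector product, a dot product, or a vector update, which is why a solver of this shape can keep all of its linear algebra on an accelerator and copy data only before the first iteration and after convergence.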
Parallel Solver for H(div) Problems Using Hybridization and AMG

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lee, Chak S.; Vassilevski, Panayot S.

2016-01-15

In this paper, a scalable parallel solver is proposed for H(div) problems discretized by arbitrary order finite elements on general unstructured meshes. The solver is based on hybridization and algebraic multigrid (AMG). Unlike some previously studied H(div) solvers, the hybridization solver does not require discrete curl and gradient operators as additional input from the user. Instead, only some element information is needed in the construction of the solver. The hybridization results in a H1-equivalent symmetric positive definite system, which is then rescaled and solved by AMG solvers designed for H1 problems. Weak and strong scaling of the method are examined through several numerical tests. Our numerical results show that the proposed solver provides a promising alternative to ADS, a state-of-the-art solver [12], for H(div) problems. In fact, it outperforms ADS for higher order elements.
Optical solver for a system of ordinary differential equations based on an external feedback assisted microring resonator.

PubMed

Hou, Jie; Dong, Jianji; Zhang, Xinliang

2017-06-15

Systems of ordinary differential equations (SODEs) are crucial for describing the dynamic behaviors in various systems such as modern control systems which require observability and controllability. In this Letter, we propose and experimentally demonstrate an all-optical SODE solver based on the silicon-on-insulator platform. We use an add/drop microring resonator to construct two different ordinary differential equations (ODEs) and then introduce two external feedback waveguides to realize the coupling between these ODEs, thus forming the SODE solver. A temporal coupled mode theory is used to deduce the expression of the SODE. A system experiment is carried out for further demonstration. For the input 10 GHz NRZ-like pulses, the measured output waveforms of the SODE solver agree well with the calculated results.
High-performance equation solvers and their impact on finite element analysis

NASA Technical Reports Server (NTRS)

Poole, Eugene L.; Knight, Norman F., Jr.; Davis, D. Dale, Jr.

1990-01-01

The role of equation solvers in modern structural analysis software is described. Direct and iterative equation solvers which exploit vectorization on modern high-performance computer systems are described and compared. The direct solvers are two Cholesky factorization methods. The first method utilizes a novel variable-band data storage format to achieve very high computation rates and the second method uses a sparse data storage format designed to reduce the number of operations. The iterative solvers are preconditioned conjugate gradient methods. Two different preconditioners are included; the first uses a diagonal matrix storage scheme to achieve high computation rates and the second requires a sparse data storage scheme and converges to the solution in fewer iterations than the first. The impact of using all of the equation solvers in a common structural analysis software system is demonstrated by solving several representative structural analysis problems.
High-performance equation solvers and their impact on finite element analysis

NASA Technical Reports Server (NTRS)

Poole, Eugene L.; Knight, Norman F., Jr.; Davis, D. D., Jr.

1992-01-01

The role of equation solvers in modern structural analysis software is described. Direct and iterative equation solvers which exploit vectorization on modern high-performance computer systems are described and compared. The direct solvers are two Cholesky factorization methods. The first method utilizes a novel variable-band data storage format to achieve very high computation rates and the second method uses a sparse data storage format designed to reduce the number of operations. The iterative solvers are preconditioned conjugate gradient methods. Two different preconditioners are included; the first uses a diagonal matrix storage scheme to achieve high computation rates and the second requires a sparse data storage scheme and converges to the solution in fewer iterations than the first. The impact of using all of the equation solvers in a common structural analysis software system is demonstrated by solving several representative structural analysis problems.
Improved Convergence and Robustness of USM3D Solutions on Mixed Element Grids (Invited)

NASA Technical Reports Server (NTRS)

Pandya, Mohagna J.; Diskin, Boris; Thomas, James L.; Frink, Neal T.

2015-01-01

Several improvements to the mixed-element USM3D discretization and defect-correction schemes have been made. A new methodology for nonlinear iterations, called the Hierarchical Adaptive Nonlinear Iteration Scheme (HANIS), has been developed and implemented. It provides two additional hierarchies around a simple and approximate preconditioner of USM3D. The hierarchies are a matrix-free linear solver for the exact linearization of Reynolds-averaged Navier Stokes (RANS) equations and a nonlinear control of the solution update. Two variants of the new methodology are assessed on four benchmark cases, namely, a zero-pressure gradient flat plate, a bump-in-channel configuration, the NACA 0012 airfoil, and a NASA Common Research Model configuration. The new methodology provides a convergence acceleration factor of 1.4 to 13 over the baseline solver technology.
PB-AM: An open-source, fully analytical linear poisson-boltzmann solver

DOE Office of Scientific and Technical Information (OSTI.GOV)

Felberg, Lisa E.; Brookes, David H.; Yap, Eng-Hui

2016-11-02

We present the open source distributed software package Poisson-Boltzmann Analytical Method (PB-AM), a fully analytical solution to the linearized Poisson-Boltzmann equation. The PB-AM software package generates output files appropriate for visualization using VMD, provides a Brownian dynamics scheme that uses periodic boundary conditions to simulate dynamics, allows docking criteria to be specified, and offers two different kinetics schemes to evaluate biomolecular association rate constants. Given that PB-AM defines mutual polarization completely and accurately, it can be refactored as a many-body expansion to explore 2- and 3-body polarization. Additionally, the software has been integrated into the Adaptive Poisson-Boltzmann Solver (APBS) software package to make it more accessible to a larger group of scientists, educators, and students who are more familiar with the APBS framework.
Robust large-scale parallel nonlinear solvers for simulations.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bader, Brett William; Pawlowski, Roger Patrick; Kolda, Tamara Gibson

2005-11-01

This report documents research to develop robust and efficient solution techniques for solving large-scale systems of nonlinear equations. The most widely used method for solving systems of nonlinear equations is Newton's method. While much research has been devoted to augmenting Newton-based solvers (usually with globalization techniques), little has been devoted to exploring the application of different models. Our research has been directed at evaluating techniques using different models than Newton's method: a lower order model, Broyden's method, and a higher order model, the tensor method. We have developed large-scale versions of each of these models and have demonstrated their use in important applications at Sandia. Broyden's method replaces the Jacobian with an approximation, allowing codes that cannot evaluate a Jacobian or have an inaccurate Jacobian to converge to a solution. Limited-memory methods, which have been successful in optimization, allow us to extend this approach to large-scale problems. We compare the robustness and efficiency of Newton's method, modified Newton's method, Jacobian-free Newton-Krylov method, and our limited-memory Broyden method. Comparisons are carried out for large-scale applications of fluid flow simulations and electronic circuit simulations. Results show that, in cases where the Jacobian was inaccurate or could not be computed, Broyden's method converged in some cases where Newton's method failed to converge. We identify conditions where Broyden's method can be more efficient than Newton's method. We also present modifications to a large-scale tensor method, originally proposed by Bouaricha, for greater efficiency, better robustness, and wider applicability. Tensor methods are an alternative to Newton-based methods and are based on computing a step based on a local quadratic model rather than a linear model.
The advantage of Bouaricha's method is that it can use any existing linear solver, which makes it simple to write and easily portable. However, the method usually takes twice as long to solve as Newton-GMRES on general problems because it solves two linear systems at each iteration. In this paper, we discuss modifications to Bouaricha's method for a practical implementation, including a special globalization technique and other modifications for greater efficiency. We present numerical results showing computational advantages over Newton-GMRES on some realistic problems. We further discuss a new approach for dealing with singular (or ill-conditioned) matrices. In particular, we modify an algorithm for identifying a turning point so that an increasingly ill-conditioned Jacobian does not prevent convergence.
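
The idea of replacing the Jacobian with an approximation can be sketched with the classical dense rank-one ("good") Broyden update. This is an illustrative small-scale example, not the limited-memory variant developed in the report; the test system, starting point, and initial Jacobian estimate are assumptions.

```python
def solve_dense(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def broyden(f, x0, J0, tol=1e-10, max_iter=50):
    """Find a root of f using an approximate Jacobian updated by rank-one corrections."""
    x = list(x0)
    J = [row[:] for row in J0]
    fx = f(x)
    for iteration in range(max_iter):
        if max(abs(v) for v in fx) < tol:
            return x, iteration
        dx = solve_dense(J, [-v for v in fx])       # quasi-Newton step
        x_new = [xi + di for xi, di in zip(x, dx)]
        f_new = f(x_new)
        # Broyden update: J += ((df - J dx) dx^T) / (dx^T dx); f is never differentiated.
        J_dx = [sum(Jij * dj for Jij, dj in zip(row, dx)) for row in J]
        denom = sum(d * d for d in dx)
        for i in range(len(x)):
            coeff = (f_new[i] - fx[i] - J_dx[i]) / denom
            for j in range(len(x)):
                J[i][j] += coeff * dx[j]
        x, fx = x_new, f_new
    return x, max_iter


def system(u):
    """Hypothetical test problem with a root at (1, 1)."""
    return [u[0] ** 2 + u[1] ** 2 - 2.0, u[0] - u[1]]


start = [1.2, 0.8]
initial_jacobian = [[2.4, 1.6], [1.0, -1.0]]   # one-time estimate at the starting point
root, steps = broyden(system, start, initial_jacobian)
```

Only residual evaluations are needed after the initial estimate, which is what allows a code with no Jacobian, or an inaccurate one, to be driven toward a solution.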
WARP3D-Release 10.8: Dynamic Nonlinear Analysis of Solids using a Preconditioned Conjugate Gradient Software Architecture

NASA Technical Reports Server (NTRS)

Koppenhoefer, Kyle C.; Gullerud, Arne S.; Ruggieri, Claudio; Dodds, Robert H., Jr.; Healy, Brian E.

1998-01-01

This report describes theoretical background material and commands necessary to use the WARP3D finite element code. WARP3D is under continuing development as a research code for the solution of very large-scale, 3-D solid models subjected to static and dynamic loads. Specific features in the code oriented toward the investigation of ductile fracture in metals include a robust finite strain formulation, a general J-integral computation facility (with inertia, face loading), an element extinction facility to model crack growth, nonlinear material models including viscoplastic effects, and the Gurson-Tvergaard dilatant plasticity model for void growth. The nonlinear, dynamic equilibrium equations are solved using an incremental-iterative, implicit formulation with full Newton iterations to eliminate residual nodal forces. The history integration of the nonlinear equations of motion is accomplished with Newmark's beta method. A central feature of WARP3D involves the use of a linear-preconditioned conjugate gradient (LPCG) solver implemented in an element-by-element format to replace a conventional direct linear equation solver. This software architecture dramatically reduces both the memory requirements and CPU time for very large, nonlinear solid models since formation of the assembled (dynamic) stiffness matrix is avoided. Analyses thus exhibit the numerical stability for large time (load) steps provided by the implicit formulation coupled with the low memory requirements characteristic of an explicit code. In addition to the much lower memory requirements of the LPCG solver, the CPU time required for solution of the linear equations during each Newton iteration is generally one-half or less of the CPU time required for a traditional direct solver. All other computational aspects of the code (element stiffnesses, element strains, stress updating, element internal forces) are implemented in the element-by-element, blocked architecture.
This greatly improves vectorization of the code on uni-processor hardware and enables straightforward parallel-vector processing of element blocks on multi-processor hardware.
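
As a small illustration of Newmark's beta method mentioned above, the sketch below integrates a linear single-degree-of-freedom oscillator with the average-acceleration parameters (beta = 1/4, gamma = 1/2). It is an assumption-laden toy problem, not WARP3D code: the function name, mass, stiffness, and step size are hypothetical, and the nonlinear Newton iterations of the real code are omitted.

```python
def newmark_sdof(m, c, k, force, x0, v0, dt, steps, beta=0.25, gamma=0.5):
    """Integrate m x'' + c x' + k x = force(t) with the implicit Newmark scheme."""
    x, v = x0, v0
    a = (force(0.0) - c * v - k * x) / m            # consistent initial acceleration
    history = [(0.0, x, v)]
    m_eff = m + gamma * dt * c + beta * dt * dt * k  # effective mass of the implicit solve
    for n in range(1, steps + 1):
        t = n * dt
        # Predictors use only quantities known at the previous step.
        x_pred = x + dt * v + (0.5 - beta) * dt * dt * a
        v_pred = v + (1.0 - gamma) * dt * a
        # Enforce equilibrium at the new time to obtain the new acceleration.
        a = (force(t) - c * v_pred - k * x_pred) / m_eff
        x = x_pred + beta * dt * dt * a
        v = v_pred + gamma * dt * a
        history.append((t, x, v))
    return history


def energy(m, k, x, v):
    return 0.5 * m * v * v + 0.5 * k * x * x


# Hypothetical undamped oscillator released from rest at x = 1.
mass, damping, stiffness = 1.0, 0.0, 4.0
trajectory = newmark_sdof(mass, damping, stiffness, lambda t: 0.0,
                          x0=1.0, v0=0.0, dt=0.05, steps=200)
_, x_end, v_end = trajectory[-1]
initial_energy = energy(mass, stiffness, 1.0, 0.0)
final_energy = energy(mass, stiffness, x_end, v_end)
```

For a linear undamped system the average-acceleration variant conserves the discrete energy, so `final_energy` stays at `initial_energy` up to rounding regardless of the step size; this is the unconditional stability for large time steps that implicit formulations rely on.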
Efficient Implementation of Multigrid Solvers on Message-Passing Parallel Systems

NASA Technical Reports Server (NTRS)

Lou, John

1994-01-01

We discuss our implementation strategies for finite difference multigrid partial differential equation (PDE) solvers on message-passing systems. Our target parallel architectures are Intel parallel computers: the Delta and Paragon systems.
Response analysis of a laminar premixed M-flame to flow perturbations using a linearized compressible Navier-Stokes solver

DOE Office of Scientific and Technical Information (OSTI.GOV)

Blanchard, M., E-mail: mathieu.blanchard@ladhyx.polytechnique.fr; Schuller, T.; Centrale-Supélec, Grande Voie des Vignes, 92290 Châtenay-Malabry

2015-04-15

The response of a laminar premixed methane-air flame subjected to flow perturbations around a steady state is examined experimentally and using a linearized compressible Navier-Stokes solver with a one-step chemistry mechanism to describe combustion. The unperturbed flame takes an M-shape stabilized both by a central bluff body and by the external rim of a cylindrical nozzle. This base flow is computed by a nonlinear direct simulation of the steady reacting flow, and the flame topology is shown to qualitatively correspond to experiments conducted under comparable conditions. The flame is then subjected to acoustic disturbances produced at different locations in the numerical domain, and its response is examined using the linearized solver. This linear numerical model then allows the componentwise investigation of the effects of flow disturbances on unsteady combustion and the feedback from the flame on the unsteady flow field. It is shown that a wrinkled reaction layer produces hydrodynamic disturbances in the fresh reactant flow field that superimpose on the acoustic field. This phenomenon, observed in several experiments, is fully interpreted here. The additional perturbations convected by the mean flow stem from the feedback of the perturbed flame sheet dynamics onto the flow field by a mechanism similar to that of a perturbed vortex sheet. The different regimes where this mechanism prevails are investigated by examining the phase and group velocities of flow disturbances along an axis oriented along the main direction of the flow in the fresh reactant flow field. It is shown that this mechanism dominates the low-frequency response of the wrinkled shape taken by the flame and, in particular, that it fully determines the dynamics of the flame tip from where the bulk of noise is radiated.
An Implicit Solver on A Parallel Block-Structured Adaptive Mesh Grid for FLASH

NASA Astrophysics Data System (ADS)

Lee, D.; Gopal, S.; Mohapatra, P.

2012-07-01

We introduce a fully implicit solver for FLASH based on a Jacobian-Free Newton-Krylov (JFNK) approach with an appropriate preconditioner. The main goal of developing this JFNK-type implicit solver is to provide efficient high-order numerical algorithms and methodology for simulating stiff systems of differential equations on large-scale parallel computer architectures. A large number of natural problems in nonlinear physics involve a wide range of spatial and time scales of interest. A system that encompasses such a wide magnitude of scales is described as "stiff." A stiff system can arise in many different fields of physics, including fluid dynamics/aerodynamics, laboratory/space plasma physics, low Mach number flows, reactive flows, radiation hydrodynamics, and geophysical flows. One of the big challenges in solving such a stiff system using current-day computational resources lies in resolving time and length scales varying by several orders of magnitude. We introduce FLASH's preliminary implementation of a time-accurate JFNK-based implicit solver in the framework of FLASH's unsplit hydro solver.
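
The "Jacobian-free" part of a JFNK approach rests on approximating the Jacobian-vector product by a finite difference of the residual, J(u) v ≈ (F(u + εv) − F(u)) / ε, so that a Krylov method never needs the matrix itself. The sketch below illustrates only that building block; the residual function and the choice of ε are assumptions and are unrelated to FLASH.

```python
import math
import sys


def residual(u):
    """Hypothetical nonlinear residual F(u), used only for illustration."""
    return [u[0] ** 2 + u[1] - 3.0, math.sin(u[0]) + u[1] ** 3 - 1.0]


def analytic_jacobian_times(u, v):
    """Exact J(u) v for the residual above, kept only as a reference value."""
    return [2.0 * u[0] * v[0] + v[1],
            math.cos(u[0]) * v[0] + 3.0 * u[1] ** 2 * v[1]]


def matrix_free_jv(F, u, v):
    """Approximate J(u) v with one extra residual evaluation and no Jacobian."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_v == 0.0:
        return [0.0] * len(u)
    # A common heuristic: perturbation scaled by sqrt(machine epsilon).
    eps = math.sqrt(sys.float_info.epsilon) * (1.0 + norm_u) / norm_v
    base = F(u)
    perturbed = F([ui + eps * vi for ui, vi in zip(u, v)])
    return [(p - b) / eps for p, b in zip(perturbed, base)]


state = [1.0, 2.0]
direction = [0.5, -1.0]
approx = matrix_free_jv(residual, state, direction)
exact = analytic_jacobian_times(state, direction)
```

In a full solver this product would be called once per Krylov iteration inside each Newton step, with the preconditioner applied separately.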
Practical Aerodynamic Design Optimization Based on the Navier-Stokes Equations and a Discrete Adjoint Method

NASA Technical Reports Server (NTRS)

Grossman, Bernard

1999-01-01

Compressible and incompressible versions of a three-dimensional unstructured mesh Reynolds-averaged Navier-Stokes flow solver have been differentiated and resulting derivatives have been verified by comparisons with finite differences and a complex-variable approach. In this implementation, the turbulence model is fully coupled with the flow equations in order to achieve this consistency. The accuracy demonstrated in the current work represents the first time that such an approach has been successfully implemented. The accuracy of a number of simplifying approximations to the linearizations of the residual has been examined. A first-order approximation to the dependent variables in both the adjoint and design equations has been investigated. The effects of a "frozen" eddy viscosity and the ramifications of neglecting some mesh sensitivity terms were also examined. It has been found that none of the approximations yielded derivatives of acceptable accuracy and that the derivatives were often of incorrect sign. However, numerical experiments indicate that incomplete convergence of the adjoint system often yields sufficiently accurate derivatives, thereby significantly lowering the time required for computing sensitivity information. The convergence rate of the adjoint solver relative to the flow solver has been examined. Inviscid adjoint solutions typically require one to four times the cost of a flow solution, while for turbulent adjoint computations, this ratio can reach as high as eight to ten. Numerical experiments have shown that the adjoint solver can stall before converging the solution to machine accuracy, particularly for viscous cases. A possible remedy for this phenomenon would be to include the complete higher-order linearization in the preconditioning step, or to employ a simple form of mesh sequencing to obtain better approximations to the solution through the use of coarser meshes.
An efficient surface parameterization based on a free-form deformation technique has been utilized and the resulting codes have been integrated with an optimization package. Lastly, sample optimizations have been shown for inviscid and turbulent flow over an ONERA M6 wing. Drag reductions have been demonstrated by reducing shock strengths across the span of the wing. In order for large scale optimization to become routine, the benefits of parallel architectures should be exploited. Although the flow solver has been parallelized using compiler directives, the parallel efficiency is under 50 percent. Clearly, parallel versions of the codes will have an immediate impact on the ability to design realistic configurations on fine meshes, and this effort is currently underway.
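
The complex-variable approach used above to verify derivatives can be illustrated on a scalar function: for an analytic function f, the derivative is f'(x) ≈ Im f(x + ih) / h, which involves no subtraction and therefore tolerates an extremely small step. The function and step sizes below are assumptions chosen for the example and have no connection to the flow solver.

```python
import cmath
import math


def f(z):
    """Hypothetical analytic test function; accepts real or complex input."""
    return cmath.exp(z) * cmath.sin(z)


def exact_derivative(x):
    return math.exp(x) * (math.sin(x) + math.cos(x))


def complex_step_derivative(func, x, h=1e-20):
    """Derivative from the imaginary part of func evaluated at x + i h."""
    return func(complex(x, h)).imag / h


def central_difference(func, x, h=1e-6):
    """Conventional finite difference, limited by subtractive cancellation."""
    return (func(x + h).real - func(x - h).real) / (2.0 * h)


point = 1.0
reference = exact_derivative(point)
complex_step_error = abs(complex_step_derivative(f, point) - reference)
finite_difference_error = abs(central_difference(f, point) - reference)
```

Because the complex-step result is accurate to roughly machine precision, it serves as an independent check on hand-coded or adjoint derivatives where finite differences alone would leave the step-size choice in doubt.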
TOUGH3 v1.0

DOE Office of Scientific and Technical Information (OSTI.GOV)

PAU, GEORGE; JUNG, YOOJIN; FINSTERLE, STEFAN

2016-09-14

TOUGH3 V1.0 provides capabilities to simulate multi-dimensional, multi-phase, multi-component, non-isothermal flow and transport in fractured porous media, with applications in geosciences, reservoir engineering, and other areas. TOUGH3 V1.0 supports a number of different combinations of fluids and components (updated equation-of-state (EOS) modules from previous versions of TOUGH, including EOS1, EOS2, EOS3, EOS4, EOS5, EOS7, EOS7R, EOS7C, EOS7CA, EOS8, EOS9, EWASG, TMVOC, ECO2N, and ECO2M). This upgrade includes (a) expanded list of updated equation-of-state (EOS) modules, (b) new hysteresis models, (c) new implementation of parallel and solver functionalities, (d) new linear solver options based on PETSc libraries, (e) new automatic build system that automatically downloads and builds third-party libraries and TOUGH3, (f) new printout in CSV format, (g) dynamic memory allocation, (h) various user features, and (i) bug fixes.
Solution of large nonlinear quasistatic structural mechanics problems on distributed-memory multiprocessor computers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Blanford, M.

1997-12-31

Most commercially-available quasistatic finite element programs assemble element stiffnesses into a global stiffness matrix, then use a direct linear equation solver to obtain nodal displacements. However, for large problems (greater than a few hundred thousand degrees of freedom), the memory size and computation time required for this approach become prohibitive. Moreover, direct solution does not lend itself to the parallel processing needed for today's multiprocessor systems. This talk gives an overview of the iterative solution strategy of JAS3D, the nonlinear large-deformation quasistatic finite element program. Because its architecture is derived from an explicit transient-dynamics code, it does not ever assemble a global stiffness matrix. The author describes the approach he used to implement the solver on multiprocessor computers, and shows examples of problems run on hundreds of processors and more than a million degrees of freedom. Finally, he describes some of the work he is presently doing to address the challenges of iterative convergence for ill-conditioned problems.
Adaptive Discrete Hypergraph Matching.

PubMed

Yan, Junchi; Li, Changsheng; Li, Yin; Cao, Guitao

2018-02-01

This paper addresses the problem of hypergraph matching using higher-order affinity information. We propose a solver that iteratively updates the solution in the discrete domain by linear assignment approximation. The proposed method is guaranteed to converge to a stationary discrete solution and avoids the annealing procedure and ad-hoc post binarization step that are required in several previous methods. Specifically, we start with a simple iterative discrete gradient assignment solver. This solver can be trapped in an -circle sequence under moderate conditions, where is the order of the graph matching problem. We then devise an adaptive relaxation mechanism to jump out of this degenerate case and show that the resulting new path will converge to a fixed solution in the discrete domain. The proposed method is tested on both synthetic and real-world benchmarks. The experimental results corroborate the efficacy of our method.
Hybrid Optimization Parallel Search PACKage

DOE Office of Scientific and Technical Information (OSTI.GOV)

2009-11-10

HOPSPACK is open source software for solving optimization problems without derivatives. Application problems may have a fully nonlinear objective function, bound constraints, and linear and nonlinear constraints. Problem variables may be continuous, integer-valued, or a mixture of both. The software provides a framework that supports any derivative-free type of solver algorithm. Through the framework, solvers request parallel function evaluation, which may use MPI (multiple machines) or multithreading (multiple processors/cores on one machine). The framework provides a Cache and Pending Cache of saved evaluations that reduces execution time and facilitates restarts. Solvers can dynamically create other algorithms to solve subproblems, a useful technique for handling multiple start points and integer-valued variables. HOPSPACK ships with the Generating Set Search (GSS) algorithm, developed at Sandia as part of the APPSPACK open source software project.
Shallow-water sloshing in a moving vessel with variable cross-section and wetting-drying using an extension of George's well-balanced finite volume solver

NASA Astrophysics Data System (ADS)

Alemi Ardakani, Hamid; Bridges, Thomas J.; Turner, Matthew R.

2016-06-01

A class of augmented approximate Riemann solvers due to George (2008) [12] is extended to solve the shallow-water equations in a moving vessel with variable bottom topography and variable cross-section with wetting and drying. A class of Roe-type upwind solvers for the system of balance laws is derived which respects the steady-state solutions. The numerical solutions of the new adapted augmented f-wave solvers are validated against the Roe-type solvers. The theory is extended to solve the shallow-water flows in moving vessels with arbitrary cross-section with influx-efflux boundary conditions motivated by the shallow-water sloshing in the ocean wave energy converter (WEC) proposed by Offshore Wave Energy Ltd. (OWEL) [1]. A fractional step approach is used to handle the time-dependent forcing functions. The numerical solutions are compared to an extended new Roe-type solver for the system of balance laws with a time-dependent source function. The shallow-water sloshing finite volume solver can be coupled to a Runge-Kutta integrator for the vessel motion.
Novel Scalable 3-D MT Inverse Solver

NASA Astrophysics Data System (ADS)

Kuvshinov, A. V.; Kruglyakov, M.; Geraskin, A.

2016-12-01

We present a new, robust and fast, three-dimensional (3-D) magnetotelluric (MT) inverse solver. As a forward modelling engine a highly-scalable solver extrEMe [1] is used. The (regularized) inversion is based on an iterative gradient-type optimization (quasi-Newton method) and exploits an adjoint sources approach for fast calculation of the gradient of the misfit. The inverse solver is able to deal with highly detailed and contrasting models, allows for working (separately or jointly) with any type of MT (single-site and/or inter-site) responses, and supports massive parallelization. Different parallelization strategies implemented in the code allow for optimal usage of available computational resources for a given problem setup. To parameterize an inverse domain a mask approach is implemented, which means that one can merge any subset of forward modelling cells in order to account for (usually) irregular distribution of observation sites. We report results of 3-D numerical experiments aimed at analysing the robustness, performance and scalability of the code. In particular, our computational experiments carried out on different platforms ranging from modern laptops to high-performance clusters demonstrate practically linear scalability of the code up to thousands of nodes. 1. Kruglyakov, M., A. Geraskin, A. Kuvshinov, 2016. Novel accurate and scalable 3-D MT forward solver based on a contracting integral equation method, Computers and Geosciences, in press.

N-MODY: A Code for Collisionless N-body Simulations in Modified Newtonian Dynamics

NASA Astrophysics Data System (ADS)

Londrillo, Pasquale; Nipoti, Carlo

2011-02-01

N-MODY is a parallel particle-mesh code for collisionless N-body simulations in modified Newtonian dynamics (MOND). N-MODY is based on a numerical potential solver in spherical coordinates that solves the non-linear MOND field equation, and is ideally suited to simulate isolated stellar systems. N-MODY can be used also to compute the MOND potential of arbitrary static density distributions. A few applications of N-MODY indicate that some astrophysically relevant dynamical processes are profoundly different in MOND and in Newtonian gravity with dark matter.
Adaptation of a Multi-Block Structured Solver for Effective Use in a Hybrid CPU/GPU Massively Parallel Environment

NASA Astrophysics Data System (ADS)

Gutzwiller, David; Gontier, Mathieu; Demeulenaere, Alain

2014-11-01

Multi-Block structured solvers hold many advantages over their unstructured counterparts, such as a smaller memory footprint and efficient serial performance. Historically, multi-block structured solvers have not been easily adapted for use in a High Performance Computing (HPC) environment, and the recent trend towards hybrid GPU/CPU architectures has further complicated the situation. This paper will elaborate on developments and innovations applied to the NUMECA FINE/Turbo solver that have allowed near-linear scalability with real-world problems on over 250 hybrid CPU/GPU cluster nodes. Discussion will focus on the implementation of virtual partitioning and load balancing algorithms using a novel meta-block concept. This implementation is transparent to the user, allowing all pre- and post-processing steps to be performed using a simple, unpartitioned grid topology. Additional discussion will elaborate on developments that have improved parallel performance, including fully parallel I/O with the ADIOS API and the GPU porting of the computationally heavy CPUBooster convergence acceleration module. Head of HPC and Release Management, Numeca International.
Performance of a parallel thermal-hydraulics code TEMPEST

DOE Office of Scientific and Technical Information (OSTI.GOV)

Fann, G.I.; Trent, D.S.

The authors describe the parallelization of the TEMPEST thermal-hydraulics code. The serial version of this code is used for production quality 3-D thermal-hydraulics simulations. Good speedup was obtained with a parallel diagonally preconditioned BiCGStab non-symmetric linear solver, using a spatial domain decomposition approach for the semi-iterative pressure-based and mass-conserved algorithm. The test case used here to illustrate the performance of the BiCGStab solver is a 3-D natural convection problem modeled using finite volume discretization in cylindrical coordinates. The BiCGStab solver replaced the LSOR-ADI method for solving the pressure equation in TEMPEST. BiCGStab also solves the coupled thermal energy equation. Scaling performance for 3 problem sizes (221220 nodes, 358120 nodes, and 701220 nodes) is presented. These problems were run on 2 different parallel machines: IBM-SP and SGI PowerChallenge. The largest problem attains a speedup of 68 on a 128-processor IBM-SP. In real terms, this is over 34 times faster than the fastest serial production time using the LSOR-ADI solver.
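As an illustration of the solver type named in this abstract, the following is a minimal serial sketch of BiCGStab with a diagonal (Jacobi) preconditioner in plain Python. It is not the TEMPEST implementation: the function names, the dense list-of-lists matrix storage, and the 3x3 test system are assumptions made only for this example, and no domain decomposition or parallelism is shown.

```python
# Sketch only: right-preconditioned BiCGStab with a diagonal preconditioner.
# All names and the test matrix are hypothetical, chosen for illustration.

def matvec(A, x):
    """Dense matrix-vector product."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def bicgstab_jacobi(A, b, tol=1e-10, max_iter=100):
    n = len(b)
    inv_diag = [1.0 / A[i][i] for i in range(n)]  # inverse of diag(A)
    x = [0.0] * n
    r = list(b)            # residual b - A x for the zero initial guess
    r_hat = list(r)        # fixed shadow residual
    rho = alpha = omega = 1.0
    v = [0.0] * n
    p = [0.0] * n
    b_norm = dot(b, b) ** 0.5
    if b_norm == 0.0:
        return x
    for _ in range(max_iter):
        rho_new = dot(r_hat, r)
        beta = (rho_new / rho) * (alpha / omega)
        p = [ri + beta * (pi - omega * vi) for ri, pi, vi in zip(r, p, v)]
        y = [d * pi for d, pi in zip(inv_diag, p)]      # apply preconditioner
        v = matvec(A, y)
        alpha = rho_new / dot(r_hat, v)
        s = [ri - alpha * vi for ri, vi in zip(r, v)]
        if dot(s, s) ** 0.5 <= tol * b_norm:            # converged at half step
            x = [xi + alpha * yi for xi, yi in zip(x, y)]
            break
        z = [d * si for d, si in zip(inv_diag, s)]      # apply preconditioner
        t = matvec(A, z)
        omega = dot(t, s) / dot(t, t)
        x = [xi + alpha * yi + omega * zi for xi, yi, zi in zip(x, y, z)]
        r = [si - omega * ti for si, ti in zip(s, t)]
        if dot(r, r) ** 0.5 <= tol * b_norm:
            break
        rho = rho_new
    return x

# Small non-symmetric, diagonally dominant test system with known solution [1, 2, 3].
A_TEST = [[4.0, 1.0, 0.0],
          [2.0, 5.0, 1.0],
          [0.0, 1.0, 3.0]]
B_TEST = [6.0, 15.0, 11.0]
X_TEST = bicgstab_jacobi(A_TEST, B_TEST)
```

The reported figures also give a parallel efficiency of 68/128, or about 0.53, for the largest problem on the IBM-SP.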
Nearly Interactive Parabolized Navier-Stokes Solver for High Speed Forebody and Inlet Flows

NASA Technical Reports Server (NTRS)

Benson, Thomas J.; Liou, May-Fun; Jones, William H.; Trefny, Charles J.

2009-01-01

A system of computer programs is being developed for the preliminary design of high speed inlets and forebodies. The system comprises four functions: geometry definition, flow grid generation, flow solver, and graphics post-processor. The system runs on a dedicated personal computer using the Windows operating system and is controlled by graphical user interfaces written in MATLAB (The Mathworks, Inc.). The flow solver uses the Parabolized Navier-Stokes equations to compute millions of mesh points in several minutes. Sample two-dimensional and three-dimensional calculations are demonstrated in the paper.
Development of the Semi-implicit Time Integration in KIM-SH

NASA Astrophysics Data System (ADS)

NAM, H.

2015-12-01

The Korea Institute of Atmospheric Prediction Systems (KIAPS) was founded in 2011 by the Korea Meteorological Administration (KMA) to develop Korea's own global Numerical Weather Prediction (NWP) system as a nine-year (2011-2019) project. KIM-SH is the spectral element version of the KIAPS Integrated Model, based on HOMME. KIM-SH currently employs explicit time integration schemes. We introduce three- and two-time-level semi-implicit schemes into KIM-SH. Explicit schemes tend to be unstable and require very small timesteps, whereas semi-implicit schemes are very stable and allow much larger timesteps. We define the linear and reference values and then, following the definition of the semi-implicit scheme, apply GMRES as the linear solver. The numerical results from experiments will be introduced with the current development status of the time integration in KIM-SH. Several numerical examples are shown to confirm the efficiency and reliability of the proposed schemes.
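The stability contrast described in this abstract can be sketched on a scalar model problem. The example below is an assumption-laden toy, not the KIM-SH scheme: it integrates dy/dt = -stiffness * y + forcing, treating the fast linear term implicitly and the slow forcing explicitly, with a two-time-level (Euler-type) step. In the scalar case the implicit "linear solve" reduces to a division; in a full model that step is where a solver such as GMRES would be applied. All names and parameter values are hypothetical.

```python
# Toy comparison of explicit and semi-implicit time stepping (illustrative only).

def explicit_euler(y0, dt, steps, stiffness, forcing):
    """Forward Euler: stable only if dt < 2 / stiffness."""
    y = y0
    for _ in range(steps):
        y = y + dt * (-stiffness * y + forcing)
    return y

def semi_implicit_euler(y0, dt, steps, stiffness, forcing):
    """Linear term implicit, forcing explicit:
    (1 + dt * stiffness) * y_new = y_old + dt * forcing."""
    y = y0
    for _ in range(steps):
        y = (y + dt * forcing) / (1.0 + dt * stiffness)
    return y

STIFFNESS = 1000.0   # fast linear term
FORCING = 1.0        # slow term, constant here for simplicity
DT = 0.01            # ten times larger than 1 / STIFFNESS

y_explicit = explicit_euler(1.0, DT, 50, STIFFNESS, FORCING)
y_semi = semi_implicit_euler(1.0, DT, 50, STIFFNESS, FORCING)
steady_state = FORCING / STIFFNESS
```

With this timestep the explicit update multiplies the error by -9 each step and diverges, while the semi-implicit update contracts it by a factor of 11 per step and settles on the steady state.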
Algorithm 937: MINRES-QLP for Symmetric and Hermitian Linear Equations and Least-Squares Problems

PubMed Central

Choi, Sou-Cheng T.; Saunders, Michael A.

2014-01-01

We describe algorithm MINRES-QLP and its FORTRAN 90 implementation for solving symmetric or Hermitian linear systems or least-squares problems. If the system is singular, MINRES-QLP computes the unique minimum-length solution (also known as the pseudoinverse solution), which generally eludes MINRES. In all cases, it overcomes a potential instability in the original MINRES algorithm. A positive-definite pre-conditioner may be supplied. Our FORTRAN 90 implementation illustrates a design pattern that allows users to make problem data known to the solver but hidden and secure from other program units. In particular, we circumvent the need for reverse communication. Example test programs input and solve real or complex problems specified in Matrix Market format. While we focus here on a FORTRAN 90 implementation, we also provide and maintain MATLAB versions of MINRES and MINRES-QLP. PMID:25328255
Higher Order, Hybrid BEM/FEM Methods Applied to Antenna Modeling

NASA Technical Reports Server (NTRS)

Fink, P. W.; Wilton, D. R.; Dobbins, J. A.

2002-01-01

In this presentation, the authors address topics relevant to higher order modeling using hybrid BEM/FEM formulations. The first of these is the limitation on convergence rates imposed by geometric modeling errors in the analysis of scattering by a dielectric sphere. The second topic is the application of an Incomplete LU Threshold (ILUT) preconditioner to solve the linear system resulting from the BEM/FEM formulation. The final topic is the application of the higher order BEM/FEM formulation to antenna modeling problems. The authors have previously presented work on the benefits of higher order modeling. To achieve these benefits, special attention is required in the integration of singular and near-singular terms arising in the surface integral equation. Several methods for handling these terms have been presented. It is also well known that achieving the high rates of convergence afforded by higher order bases may also require the employment of higher order geometry models. A number of publications have described the use of quadratic elements to model curved surfaces. The authors have shown in an EFIE formulation, applied to scattering by a PEC sphere, that quadratic order elements may be insufficient to prevent the domination of modeling errors. In fact, on a PEC sphere with radius r = 0.58 λ₀, a quartic order geometry representation was required to obtain a convergence benefit from quadratic bases when compared to the convergence rate achieved with linear bases. Initial trials indicate that, for a dielectric sphere of the same radius, requirements on the geometry model are not as severe as for the PEC sphere. The authors will present convergence results for higher order bases as a function of the geometry model order in the hybrid BEM/FEM formulation applied to dielectric spheres. It is well known that the system matrix resulting from the hybrid BEM/FEM formulation is ill-conditioned. For many real applications, a good preconditioner is required to obtain usable convergence from an iterative solver. The authors have examined the use of an Incomplete LU Threshold (ILUT) preconditioner to solve linear systems stemming from higher order BEM/FEM formulations in 2D scattering problems. Although the resulting preconditioner provided an excellent approximation to the system inverse, its size in terms of non-zero entries represented only a modest improvement when compared with the fill-in associated with a sparse direct solver. Furthermore, the fill-in of the preconditioner could not be substantially reduced without the occurrence of instabilities. In addition to the results for these 2D problems, the authors will present iterative solution data from the application of the ILUT preconditioner to 3D problems.
Conducting Automated Test Assembly Using the Premium Solver Platform Version 7.0 with Microsoft Excel and the Large-Scale LP/QP Solver Engine Add-In

ERIC Educational Resources Information Center

Cor, Ken; Alves, Cecilia; Gierl, Mark J.

2008-01-01

This review describes and evaluates a software add-in created by Frontline Systems, Inc., that can be used with Microsoft Excel 2007 to solve large, complex test assembly problems. The combination of Microsoft Excel 2007 with the Frontline Systems Premium Solver Platform is significant because Microsoft Excel is the most commonly used spreadsheet…
Parallel filtering in global gyrokinetic simulations

NASA Astrophysics Data System (ADS)

Jolliet, S.; McMillan, B. F.; Villard, L.; Vernay, T.; Angelino, P.; Tran, T. M.; Brunner, S.; Bottino, A.; Idomura, Y.

2012-02-01

In this work, a Fourier solver [B.F. McMillan, S. Jolliet, A. Bottino, P. Angelino, T.M. Tran, L. Villard, Comp. Phys. Commun. 181 (2010) 715] is implemented in the global Eulerian gyrokinetic code GT5D [Y. Idomura, H. Urano, N. Aiba, S. Tokuda, Nucl. Fusion 49 (2009) 065029] and in the global Particle-In-Cell code ORB5 [S. Jolliet, A. Bottino, P. Angelino, R. Hatzky, T.M. Tran, B.F. McMillan, O. Sauter, K. Appert, Y. Idomura, L. Villard, Comp. Phys. Commun. 177 (2007) 409] in order to reduce the memory of the matrix associated with the field equation. This scheme is verified with linear and nonlinear simulations of turbulence. It is demonstrated that the straight-field-line angle is the coordinate that optimizes the Fourier solver, that both linear and nonlinear turbulent states are unaffected by the parallel filtering, and that the k∥ spectrum is independent of plasma size at fixed normalized poloidal wave number.
PB-AM: An open-source, fully analytical linear Poisson-Boltzmann solver.

PubMed

Felberg, Lisa E; Brookes, David H; Yap, Eng-Hui; Jurrus, Elizabeth; Baker, Nathan A; Head-Gordon, Teresa

2017-06-05

We present the open source distributed software package Poisson-Boltzmann Analytical Method (PB-AM), a fully analytical solution to the linearized PB equation, for molecules represented as non-overlapping spherical cavities. The PB-AM software package generates output files appropriate for visualization using Visual Molecular Dynamics, provides a Brownian dynamics scheme that uses periodic boundary conditions to simulate dynamics, allows docking criteria to be specified, and offers two different kinetics schemes to evaluate biomolecular association rate constants. Given that PB-AM defines mutual polarization completely and accurately, it can be refactored as a many-body expansion to explore 2- and 3-body polarization. Additionally, the software has been integrated into the Adaptive Poisson-Boltzmann Solver (APBS) software package to make it more accessible to a larger group of scientists, educators, and students who are more familiar with the APBS framework. © 2016 Wiley Periodicals, Inc.
On solving three-dimensional open-dimension rectangular packing problems

NASA Astrophysics Data System (ADS)

Junqueira, Leonardo; Morabito, Reinaldo

2017-05-01

In this article, a recently proposed three-dimensional open-dimension rectangular packing problem is considered, in which the objective is to find a minimal volume rectangular container that packs a set of rectangular boxes. The literature has tackled small-sized instances of this problem by means of optimization solvers, position-free mixed-integer programming (MIP) formulations and piecewise linearization approaches. In this study, the problem is alternatively addressed by means of grid-based position MIP formulations, while still using optimization solvers and the same piecewise linearization techniques. The computational performance of both models is then compared on benchmark problem instances and on new instances, and it is shown that the grid-based position MIP formulation can be competitive, depending on the characteristics of the instances. The grid-based position MIP formulation is also extended with real-world practical constraints, such as cargo stability, and results are additionally presented.
Improved Convergence and Robustness of USM3D Solutions on Mixed-Element Grids

NASA Technical Reports Server (NTRS)

Pandya, Mohagna J.; Diskin, Boris; Thomas, James L.; Frink, Neal T.

2016-01-01

Several improvements to the mixed-element USM3D discretization and defect-correction schemes have been made. A new methodology for nonlinear iterations, called the Hierarchical Adaptive Nonlinear Iteration Method, has been developed and implemented. The Hierarchical Adaptive Nonlinear Iteration Method provides two additional hierarchies around a simple and approximate preconditioner of USM3D. The hierarchies are a matrix-free linear solver for the exact linearization of Reynolds-averaged Navier-Stokes equations and a nonlinear control of the solution update. Two variants of the Hierarchical Adaptive Nonlinear Iteration Method are assessed on four benchmark cases, namely, a zero-pressure-gradient flat plate, a bump-in-channel configuration, the NACA 0012 airfoil, and a NASA Common Research Model configuration. The new methodology provides a convergence acceleration factor of 1.4 to 13 over the preconditioner-alone method representing the baseline solver technology.
Improving Fidelity of Launch Vehicle Liftoff Acoustic Simulations

NASA Technical Reports Server (NTRS)

Liever, Peter; West, Jeff

2016-01-01

Launch vehicles experience high acoustic loads during ignition and liftoff, affected by the interaction of rocket plume generated acoustic waves with launch pad structures. Application of highly parallelized Computational Fluid Dynamics (CFD) analysis tools optimized for the NAS computer systems, such as the Loci/CHEM program, now enables simulation of time-accurate, turbulent, multi-species plume formation and interaction with launch pad geometry and captures the generation of acoustic noise at the source regions in the plume shear layers and impingement regions. These CFD solvers are robust in capturing the acoustic fluctuations, but they are too dissipative to accurately resolve the propagation of the acoustic waves throughout the launch environment domain along the vehicle. A hybrid Computational Fluid Dynamics and Computational Aero-Acoustics (CFD/CAA) modeling framework has been developed to improve such liftoff acoustic environment predictions. The framework combines the existing highly-scalable NASA production CFD code, Loci/CHEM, with a high-order accurate discontinuous Galerkin (DG) solver, Loci/THRUST, developed in the same computational framework. Loci/THRUST employs a low dissipation, high-order, unstructured DG method to accurately propagate acoustic waves away from the source regions across large distances. The DG solver is currently capable of computing solutions of up to 4th order for non-linear, conservative acoustic field propagation. Higher order boundary conditions are implemented to accurately model the reflection and refraction of acoustic waves on launch pad components. The DG solver accepts generalized unstructured meshes, enabling efficient application of common mesh generation tools for CHEM and THRUST simulations. The DG solution is coupled with the CFD solution at interface boundaries placed near the CFD acoustic source regions. Both simulations are executed simultaneously with coordinated boundary condition data exchange.
Proteus-MOC: A 3D deterministic solver incorporating 2D method of characteristics

DOE Office of Scientific and Technical Information (OSTI.GOV)

Marin-Lafleche, A.; Smith, M. A.; Lee, C.

2013-07-01

A new transport solution methodology was developed by combining the two-dimensional method of characteristics with the discontinuous Galerkin method for the treatment of the axial variable. The method, which can be applied to arbitrary extruded geometries, was implemented in PROTEUS-MOC and includes parallelization in group, angle, plane, and space using a top level GMRES linear algebra solver. Verification tests were performed to show accuracy and stability of the method with the increased number of angular directions and mesh elements. Good scalability with parallelism in angle and axial planes is displayed. (authors)
Using the Gurobi Solvers on the Peregrine System | High-Performance

Science.gov Websites

Peregrine System Gurobi Optimizer is a suite of solvers for mathematical programming. It is licensed for ('GRB_MATLAB_PATH') >> path(path,grb) Gurobi and GAMS GAMS is a high-level modeling system for mathematical
Highly Efficient Parallel Multigrid Solver For Large-Scale Simulation of Grain Growth Using the Structural Phase Field Crystal Model

NASA Astrophysics Data System (ADS)

Guan, Zhen; Pekurovsky, Dmitry; Luce, Jason; Thornton, Katsuyo; Lowengrub, John

The structural phase field crystal (XPFC) model can be used to model grain growth in polycrystalline materials at diffusive time-scales while maintaining atomic scale resolution. However, the governing equation of the XPFC model is an integral-partial-differential-equation (IPDE), which poses challenges in implementation onto high performance computing (HPC) platforms. In collaboration with the XSEDE Extended Collaborative Support Service, we developed a distributed memory HPC solver for the XPFC model, which combines parallel multigrid and P3DFFT. The performance benchmarking on the Stampede supercomputer indicates near linear strong and weak scaling for both multigrid and transfer time between multigrid and FFT modules up to 1024 cores. Scalability of the FFT module begins to decline at 128 cores, but it is sufficient for the type of problem we will be examining. We have demonstrated simulations using 1024 cores, and we expect to achieve 4096 cores and beyond. Ongoing work involves optimization of MPI/OpenMP-based codes for the Intel KNL Many-Core Architecture. This optimizes the code for coming pre-exascale systems, in particular many-core systems such as Stampede 2.0 and Cori 2 at NERSC, without sacrificing efficiency on other general HPC systems.
Nonlinear Conservation Laws and Finite Volume Methods

NASA Astrophysics Data System (ADS)

Leveque, Randall J.

Introduction; Software; Notation; Classification of Differential Equations; Derivation of Conservation Laws; The Euler Equations of Gas Dynamics; Dissipative Fluxes; Source Terms; Radiative Transfer and Isothermal Equations; Multi-dimensional Conservation Laws; The Shock Tube Problem; Mathematical Theory of Hyperbolic Systems; Scalar Equations; Linear Hyperbolic Systems; Nonlinear Systems; The Riemann Problem for the Euler Equations; Numerical Methods in One Dimension; Finite Difference Theory; Finite Volume Methods; Importance of Conservation Form - Incorrect Shock Speeds; Numerical Flux Functions; Godunov's Method; Approximate Riemann Solvers; High-Resolution Methods; Other Approaches; Boundary Conditions; Source Terms and Fractional Steps; Unsplit Methods; Fractional Step Methods; General Formulation of Fractional Step Methods; Stiff Source Terms; Quasi-stationary Flow and Gravity; Multi-dimensional Problems; Dimensional Splitting; Multi-dimensional Finite Volume Methods; Grids and Adaptive Refinement; Computational Difficulties; Low-Density Flows; Discrete Shocks and Viscous Profiles; Start-Up Errors; Wall Heating; Slow-Moving Shocks; Grid Orientation Effects; Grid-Aligned Shocks; Magnetohydrodynamics; The MHD Equations; One-Dimensional MHD; Solving the Riemann Problem; Nonstrict Hyperbolicity; Stiffness; The Divergence of B; Riemann Problems in Multi-dimensional MHD; Staggered Grids; The 8-Wave Riemann Solver; Relativistic Hydrodynamics; Conservation Laws in Spacetime; The Continuity Equation; The 4-Momentum of a Particle; The Stress-Energy Tensor; Finite Volume Methods; Multi-dimensional Relativistic Flow; Gravitation and General Relativity; References
A Nonlinear Modal Aeroelastic Solver for FUN3D

NASA Technical Reports Server (NTRS)

Goldman, Benjamin D.; Bartels, Robert E.; Biedron, Robert T.; Scott, Robert C.

2016-01-01

A nonlinear structural solver has been implemented internally within the NASA FUN3D computational fluid dynamics code, allowing for some new aeroelastic capabilities. Using a modal representation of the structure, a set of differential or differential-algebraic equations is derived for general thin structures with geometric nonlinearities. ODEPACK and LAPACK routines are linked with FUN3D, and the nonlinear equations are solved at each CFD time step. The existing predictor-corrector method is retained, whereby the structural solution is updated after mesh deformation. The nonlinear solver is validated using a test case for a flexible aeroshell at transonic, supersonic, and hypersonic flow conditions. Agreement with linear theory is seen for the static aeroelastic solutions at relatively low dynamic pressures, but structural nonlinearities limit deformation amplitudes at high dynamic pressures. No flutter was found at any of the tested trajectory points, though limit cycle oscillations (LCO) may be possible in the transonic regime.
Parallel Computation of the Jacobian Matrix for Nonlinear Equation Solvers Using MATLAB

NASA Technical Reports Server (NTRS)

Rose, Geoffrey K.; Nguyen, Duc T.; Newman, Brett A.

2017-01-01

Demonstrating speedup for parallel code on a multicore shared memory PC can be challenging in MATLAB due to underlying parallel operations that are often opaque to the user. This can limit potential for improvement of serial code even for the so-called embarrassingly parallel applications. One such application is the computation of the Jacobian matrix inherent to most nonlinear equation solvers. Computation of this matrix represents the primary bottleneck in nonlinear solver speed such that commercial finite element (FE) and multi-body-dynamic (MBD) codes attempt to minimize computations. A timing study using MATLAB's Parallel Computing Toolbox was performed for numerical computation of the Jacobian. Several approaches for implementing parallel code were investigated while only the single program multiple data (spmd) method using composite objects provided positive results. Parallel code speedup is demonstrated but the goal of linear speedup through the addition of processors was not achieved due to PC architecture.
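The column-by-column independence that makes the Jacobian computation "embarrassingly parallel" can be sketched as follows. This is a hedged Python stand-in, not the MATLAB spmd code studied in the paper: the residual function, the step size, and all function names are assumptions for illustration. A thread pool is used only to show the structure; for CPU-bound pure-Python residuals, a process pool would be the more realistic choice for actual speedup.

```python
# Sketch: forward-difference Jacobian with one independent task per column.
import math
from concurrent.futures import ThreadPoolExecutor

def residual(x):
    """Hypothetical nonlinear system F(x) with two equations and two unknowns."""
    return [x[0] ** 2 + x[1], math.sin(x[0]) * x[1]]

def jacobian_column(func, x, j, h):
    """Column j of the Jacobian: (F(x + h * e_j) - F(x)) / h."""
    x_pert = list(x)
    x_pert[j] += h
    f0 = func(x)
    f1 = func(x_pert)
    return [(a - b) / h for a, b in zip(f1, f0)]

def parallel_jacobian(func, x, h=1e-6, workers=2):
    """Each column depends only on x, so the columns can be computed concurrently."""
    n = len(x)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        cols = list(pool.map(lambda j: jacobian_column(func, x, j, h), range(n)))
    m = len(cols[0])
    # cols[j][i] holds dF_i/dx_j; transpose so that rows index equations.
    return [[cols[j][i] for j in range(n)] for i in range(m)]

J_EXAMPLE = parallel_jacobian(residual, [1.0, 2.0])
```

Because each task re-evaluates F(x), a production version would compute the baseline residual once and share it; it is repeated here to keep every column task fully self-contained.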

A matrix-form GSM-CFD solver for incompressible fluids and its application to hemodynamics

NASA Astrophysics Data System (ADS)

Yao, Jianyao; Liu, G. R.

2014-10-01

A GSM-CFD solver for incompressible flows is developed based on the gradient smoothing method (GSM). A matrix-form algorithm and corresponding data structure for GSM are devised to efficiently approximate the spatial gradients of field variables using the gradient smoothing operation. The calculated gradient values on various test fields show that the proposed GSM is capable of exactly reproducing a linear field and is second-order accurate on all kinds of meshes. It is found that the GSM is much more robust to mesh deformation and therefore more suitable for problems with complicated geometries. Integrated with the artificial compressibility approach, the GSM is extended to solve incompressible flows. As an example, the flow simulation of a carotid bifurcation is carried out to show the effectiveness of the proposed GSM-CFD solver. The blood is modeled as an incompressible Newtonian fluid and the vessel is treated as a rigid wall in this paper.
The semi-discrete Galerkin finite element modelling of compressible viscous flow past an airfoil

NASA Technical Reports Server (NTRS)

Meade, Andrew J., Jr.

1992-01-01

A method is developed to solve the two-dimensional, steady, compressible, turbulent boundary-layer equations and is coupled to an existing Euler solver for attached transonic airfoil analysis problems. The boundary-layer formulation utilizes the semi-discrete Galerkin (SDG) method to model the spatial variable normal to the surface with linear finite elements and the time-like variable with finite differences. A Dorodnitsyn transformed system of equations is used to bound the infinite spatial domain thereby permitting the use of a uniform finite element grid which provides high resolution near the wall and automatically follows boundary-layer growth. The second-order accurate Crank-Nicolson scheme is applied along with a linearization method to take advantage of the parabolic nature of the boundary-layer equations and generate a non-iterative marching routine. The SDG code can be applied to any smoothly-connected airfoil shape without modification and can be coupled to any inviscid flow solver. In this analysis, a direct viscous-inviscid interaction is accomplished between the Euler and boundary-layer codes, through the application of a transpiration velocity boundary condition. Results are presented for compressible turbulent flow past NACA 0012 and RAE 2822 airfoils at various freestream Mach numbers, Reynolds numbers, and angles of attack. All results show good agreement with experiment, and the coupled code proved to be a computationally-efficient and accurate airfoil analysis tool.
Analysis Tools for CFD Multigrid Solvers

NASA Technical Reports Server (NTRS)

Mineck, Raymond E.; Thomas, James L.; Diskin, Boris

2004-01-01

Analysis tools are needed to guide the development and evaluate the performance of multigrid solvers for the fluid flow equations. Classical analysis tools, such as local mode analysis, often fail to accurately predict performance. Two-grid analysis tools, herein referred to as Idealized Coarse Grid and Idealized Relaxation iterations, have been developed and evaluated within a pilot multigrid solver. These new tools are applicable to general systems of equations and/or discretizations and point to problem areas within an existing multigrid solver. Idealized Relaxation and Idealized Coarse Grid are applied in developing textbook-efficient multigrid solvers for incompressible stagnation flow problems.
Revisiting Parallel Cyclic Reduction and Parallel Prefix-Based Algorithms for Block Tridiagonal System of Equations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Seal, Sudip K; Perumalla, Kalyan S; Hirshman, Steven Paul

2013-01-01

Simulations that require solutions of block tridiagonal systems of equations rely on fast parallel solvers for runtime efficiency. Leading parallel solvers that are highly effective for general systems of equations, dense or sparse, are limited in scalability when applied to block tridiagonal systems. This paper presents scalability results as well as detailed analyses of two parallel solvers that exploit the special structure of block tridiagonal matrices to deliver superior performance, often by orders of magnitude. A rigorous analysis of their relative parallel runtimes is shown to reveal the existence of a critical block size that separates the parameter space spanned by the number of block rows, the block size and the processor count, into distinct regions that favor one or the other of the two solvers. Dependence of this critical block size on the above parameters as well as on machine-specific constants is established. These formal insights are supported by empirical results on up to 2,048 cores of a Cray XT4 system. To the best of our knowledge, this is the highest reported scalability for parallel block tridiagonal solvers to date.
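To make the idea of parallel cyclic reduction concrete, here is a minimal sketch for a scalar tridiagonal system, written serially in plain Python. It is a simplification under stated assumptions: the paper treats block tridiagonal matrices on distributed-memory machines, whereas this example uses scalar entries, and the function name and test system are hypothetical. At each step every equation is combined with its neighbours at the current stride; the inner loop iterations are mutually independent, which is what a parallel implementation would exploit.

```python
# Sketch: parallel cyclic reduction (PCR) for a scalar tridiagonal system.

def pcr_solve(a, b, c, d):
    """Solve a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i].

    a is the sub-diagonal (a[0] must be 0), b the diagonal,
    c the super-diagonal (c[-1] must be 0), d the right-hand side.
    """
    n = len(b)
    a, b, c, d = list(a), list(b), list(c), list(d)
    stride = 1
    while stride < n:
        na, nb, nc, nd = [0.0] * n, list(b), [0.0] * n, list(d)
        for i in range(n):              # independent per i: the parallel loop
            lo, hi = i - stride, i + stride
            if lo >= 0:                 # eliminate coupling to x[i - stride]
                alpha = -a[i] / b[lo]
                nb[i] += alpha * c[lo]
                nd[i] += alpha * d[lo]
                na[i] = alpha * a[lo]   # new coupling to x[i - 2*stride]
            if hi < n:                  # eliminate coupling to x[i + stride]
                gamma = -c[i] / b[hi]
                nb[i] += gamma * a[hi]
                nd[i] += gamma * d[hi]
                nc[i] = gamma * c[hi]   # new coupling to x[i + 2*stride]
        a, b, c, d = na, nb, nc, nd
        stride *= 2
    # After about log2(n) steps every equation is decoupled.
    return [d[i] / b[i] for i in range(n)]

# Test system: the (-1, 2, -1) matrix with known solution [1, 2, 3, 4, 5].
X_PCR = pcr_solve([0.0, -1.0, -1.0, -1.0, -1.0],
                  [2.0, 2.0, 2.0, 2.0, 2.0],
                  [-1.0, -1.0, -1.0, -1.0, 0.0],
                  [0.0, 0.0, 0.0, 0.0, 6.0])
```

No pivoting is performed, so the sketch assumes a matrix (such as a diagonally dominant one) for which the diagonal entries stay nonzero during the reduction.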
Efficient Kriging Algorithms

NASA Technical Reports Server (NTRS)

Memarsadeghi, Nargess

2011-01-01

More efficient versions of an interpolation method, called kriging, have been introduced in order to reduce its traditionally high computational cost. Written in C++, these approaches were tested on both synthetic and real data. Kriging is a best linear unbiased estimator and is suitable for interpolation of scattered data points. Kriging has long been used in the geostatistics and mining communities, but is now being researched for use in the image fusion of remotely sensed data. This allows a combination of data from various locations to be used to fill in any missing data from any single location. To arrive at the faster algorithms, a sparse SYMMLQ iterative solver, covariance tapering, Fast Multipole Methods (FMM), and nearest neighbor searching techniques were used. These implementations were used when the coefficient matrix in the linear system is symmetric, but not necessarily positive-definite.
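A small sketch may help show why the kriging linear system is symmetric but not positive-definite. In ordinary kriging the covariance matrix is bordered by a row and column of ones (the unbiasedness constraint) with a zero in the corner, which makes the system indefinite. The example below is an assumption for illustration only: it uses 1-D sample locations, an exponential covariance model with a made-up correlation length, and a dense Gaussian elimination with partial pivoting in place of the sparse SYMMLQ solver and acceleration techniques described in the abstract.

```python
# Sketch: ordinary kriging in 1-D with a dense direct solve (illustrative only).
import math

def solve_dense(A, b):
    """Gaussian elimination with partial pivoting on a small dense system."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[p] = M[p], M[k]
        for r in range(k + 1, n):
            f = M[r][k] / M[k][k]
            for col in range(k, n + 1):
                M[r][col] -= f * M[k][col]
    x = [0.0] * n
    for k in reversed(range(n)):
        x[k] = (M[k][n] - sum(M[k][col] * x[col] for col in range(k + 1, n))) / M[k][k]
    return x

def ordinary_kriging(xs, zs, x0, corr_len=1.0):
    """Return (prediction at x0, kriging weights) for samples zs at locations xs."""
    n = len(xs)
    cov = lambda h: math.exp(-abs(h) / corr_len)   # assumed covariance model
    # Bordered system [[C, 1], [1^T, 0]] [w; mu] = [c0; 1]: symmetric, indefinite.
    A = [[cov(xs[i] - xs[j]) for j in range(n)] + [1.0] for i in range(n)]
    A.append([1.0] * n + [0.0])
    rhs = [cov(xs[i] - x0) for i in range(n)] + [1.0]
    sol = solve_dense(A, rhs)
    weights = sol[:n]                               # sol[n] is the Lagrange multiplier
    return sum(w * z for w, z in zip(weights, zs)), weights

SAMPLE_X = [0.0, 1.0, 2.5]
SAMPLE_Z = [1.0, 3.0, 2.0]
pred_at_sample, _ = ordinary_kriging(SAMPLE_X, SAMPLE_Z, 1.0)
pred_between, w_between = ordinary_kriging(SAMPLE_X, SAMPLE_Z, 1.7)
```

Two properties follow from the constraint row and can be checked numerically: the weights sum to one, and predicting at a sample location reproduces the sampled value.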
Numerical Technology for Large-Scale Computational Electromagnetics

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sharpe, R; Champagne, N; White, D

The key bottleneck of implicit computational electromagnetics tools for large complex geometries is the solution of the resulting linear system of equations. The goal of this effort was to research and develop critical numerical technology that alleviates this bottleneck for large-scale computational electromagnetics (CEM). The mathematical operators and numerical formulations used in this arena of CEM yield linear equations that are complex valued, unstructured, and indefinite. Also, simultaneously applying multiple mathematical modeling formulations to different portions of a complex problem (hybrid formulations) results in a mixed structure linear system, further increasing the computational difficulty. Typically, these hybrid linear systems are solved using a direct solution method, which was acceptable for Cray-class machines but does not scale adequately for ASCI-class machines. Additionally, LLNL's previously existing linear solvers were not well suited for the linear systems that are created by hybrid implicit CEM codes. Hence, a new approach was required to make effective use of ASCI-class computing platforms and to enable the next generation design capabilities. Multiple approaches were investigated, including the latest sparse-direct methods developed by our ASCI collaborators. In addition, approaches that combine domain decomposition (or matrix partitioning) with general-purpose iterative methods and special purpose pre-conditioners were investigated. Special-purpose pre-conditioners that take advantage of the structure of the matrix were adapted and developed based on intimate knowledge of the matrix properties. Finally, new operator formulations were developed that radically improve the conditioning of the resulting linear systems thus greatly reducing solution time. The goal was to enable the solution of CEM problems that are 10 to 100 times larger than our previous capability.
Fully-Implicit Orthogonal Reconstructed Discontinuous Galerkin for Fluid Dynamics with Phase Change

DOE PAGES

Nourgaliev, R.; Luo, H.; Weston, B.; ...

2015-11-11

A new reconstructed Discontinuous Galerkin (rDG) method, based on orthogonal basis/test functions, is developed for fluid flows on unstructured meshes. Orthogonality of basis functions is essential for enabling robust and efficient fully-implicit Newton-Krylov based time integration. The method is designed for generic partial differential equations, including transient, hyperbolic, parabolic or elliptic operators, which are attributed to many multiphysics problems. We demonstrate the method’s capabilities for solving compressible fluid-solid systems (in the low Mach number limit), with phase change (melting/solidification), as motivated by applications in Additive Manufacturing (AM). We focus on the method’s accuracy (in both space and time), as well as robustness and solvability of the system of linear equations involved in the linearization steps of Newton-based methods. The performance of the developed method is investigated for highly-stiff problems with melting/solidification, emphasizing the advantages from tight coupling of mass, momentum and energy conservation equations, as well as orthogonality of basis functions, which leads to better conditioning of the underlying (approximate) Jacobian matrices, and rapid convergence of the Krylov-based linear solver.
Albany/FELIX: A parallel, scalable and robust, finite element, first-order Stokes approximation ice sheet solver built for advanced analysis

DOE PAGES

Tezaur, I. K.; Perego, M.; Salinger, A. G.; ...

2015-04-27

This paper describes a new parallel, scalable and robust finite element based solver for the first-order Stokes momentum balance equations for ice flow. The solver, known as Albany/FELIX, is constructed using the component-based approach to building application codes, in which mature, modular libraries developed as a part of the Trilinos project are combined using abstract interfaces and template-based generic programming, resulting in a final code with access to dozens of algorithmic and advanced analysis capabilities. Following an overview of the relevant partial differential equations and boundary conditions, the numerical methods chosen to discretize the ice flow equations are described, along with their implementation. The results of several verification studies of the model accuracy are presented using (1) new test cases for simplified two-dimensional (2-D) versions of the governing equations derived using the method of manufactured solutions, and (2) canonical ice sheet modeling benchmarks. Model accuracy and convergence with respect to mesh resolution are then studied on problems involving a realistic Greenland ice sheet geometry discretized using hexahedral and tetrahedral meshes. Also explored as a part of this study is the effect of vertical mesh resolution on the solution accuracy and solver performance. The robustness and scalability of our solver on these problems are demonstrated. Lastly, we show that good scalability can be achieved by preconditioning the iterative linear solver using a new algebraic multilevel preconditioner, constructed based on the idea of semi-coarsening.
MILAMIN 2 - Fast MATLAB FEM solver

NASA Astrophysics Data System (ADS)

Dabrowski, Marcin; Krotkiewski, Marcin; Schmid, Daniel W.

2013-04-01

MILAMIN is a free and efficient MATLAB-based two-dimensional FEM solver utilizing unstructured meshes [Dabrowski et al., G-cubed (2008)]. The code consists of steady-state thermal diffusion and incompressible Stokes flow solvers implemented in approximately 200 lines of native MATLAB code. The brevity makes the code easily customizable. An important quality of MILAMIN is speed: it can handle millions of nodes within minutes on one CPU core of a standard desktop computer, and is faster than many commercial solutions. The new MILAMIN 2 allows three-dimensional modeling. It is designed as a set of functional modules that can be used as building blocks for efficient FEM simulations using MATLAB. The utilities are largely implemented as native MATLAB functions. For performance-critical parts we use MUTILS, a suite of compiled MEX functions optimized for shared memory multi-core computers. The most important features of MILAMIN 2 are: 1. Modular approach to defining, tracking, and discretizing the geometry of the model; 2. Interfaces to external mesh generators (e.g., Triangle, Fade2d, T3D) and mesh utilities (e.g., element type conversion, fast point location, boundary extraction); 3. Efficient computation of the stiffness matrix for a wide range of element types, anisotropic materials and three-dimensional problems; 4. Fast global matrix assembly using a dedicated MEX function; 5. Automatic integration rules; 6. Flexible prescription (spatial, temporal, and field functions) and efficient application of Dirichlet, Neumann, and periodic boundary conditions; 7. Treatment of transient and non-linear problems; 8. Various iterative and multi-level solution strategies; 9. Post-processing tools (e.g., numerical integration); 10. Visualization primitives using MATLAB, and VTK export functions. We provide a large number of examples that show how to implement a custom FEM solver using the MILAMIN 2 framework.
The examples are MATLAB scripts of increasing complexity that address a given technical topic (e.g., creating meshes, reordering nodes, applying boundary conditions), a given numerical topic (e.g., using various solution strategies, non-linear iterations), or that present a fully-developed solver designed to address a scientific topic (e.g., performing Stokes flow simulations in synthetic porous medium). References: Dabrowski, M., M. Krotkiewski, and D. W. Schmid, MILAMIN: MATLAB-based finite element method solver for large problems, Geochem. Geophys. Geosyst., 9, Q04030, 2008.
A scalable block-preconditioning strategy for divergence-conforming B-spline discretizations of the Stokes problem

DOE PAGES

Cortes, Adriano M.; Dalcin, Lisandro; Sarmiento, Adel F.; ...

2016-10-19

The recently introduced divergence-conforming B-spline discretizations allow the construction of smooth discrete velocity–pressure pairs for viscous incompressible flows that are at the same time inf–sup stable and pointwise divergence-free. When applied to the discretized Stokes problem, these spaces generate a symmetric and indefinite saddle-point linear system. The iterative method of choice to solve such a system is the Generalized Minimum Residual Method. This method lacks robustness, and one remedy is to use preconditioners. For linear systems of saddle-point type, a large family of preconditioners can be obtained by using a block factorization of the system. In this paper, we show how the nesting of “black-box” solvers and preconditioners can be put together in a block triangular strategy to build a scalable block preconditioner for the Stokes system discretized by divergence-conforming B-splines. Lastly, besides the well-known cavity flow problem, we used as benchmarks flows defined on complex geometries: an eccentric annulus and a hollow torus with an eccentric annular cross-section.
Efficiency and flexibility using implicit methods within atmosphere dycores

NASA Astrophysics Data System (ADS)

Evans, K. J.; Archibald, R.; Norman, M. R.; Gardner, D. J.; Woodward, C. S.; Worley, P.; Taylor, M.

2016-12-01

A suite of explicit and implicit methods is evaluated for a range of configurations of the shallow water dynamical core within the spectral-element Community Atmosphere Model (CAM-SE) to explore their relative computational performance. The configurations are designed to explore the attributes of each method under different but relevant model usage scenarios including varied spectral order within an element, static regional refinement, and scaling to large problem sizes. The limitations and benefits of using explicit versus implicit methods, with different discretizations and parameters, are discussed in light of trade-offs such as MPI communication, memory, and inherent efficiency bottlenecks. For the regionally refined shallow water configurations, the implicit BDF2 method is about as efficient as an explicit Runge-Kutta method, without including a preconditioner. Performance of the implicit methods with the residual function executed on a GPU is also presented; there is a speedup for the residual relative to a CPU, but overwhelming transfer costs motivate moving more of the solver to the device. Given the performance behavior of implicit methods within the shallow water dynamical core, the recommendation for future work using implicit solvers is conditional based on scale separation and the stiffness of the problem. The strong growth of linear iterations with increasing resolution or time step size is the main bottleneck to computational efficiency. Within the hydrostatic dynamical core of CAM-SE, we present results utilizing approximate block factorization preconditioners implemented using the Trilinos library of solvers. They reduce the cost of linear system solves and improve parallel scalability.
We provide a summary of the remaining efficiency considerations within the preconditioner and utilization of the GPU, as well as a discussion about the benefits of a time stepping method that provides converged and stable solutions for a much wider range of time step sizes. As more complex model components, for example new physics and aerosols, are connected in the model, having flexibility in the time stepping will enable more options for combining and resolving multiple scales of behavior.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Weston, Brian T.

This dissertation focuses on the development of a fully-implicit, high-order compressible flow solver with phase change. The work is motivated by laser-induced phase change applications, particularly by the need to develop large-scale multi-physics simulations of the selective laser melting (SLM) process in metal additive manufacturing (3D printing). Simulations of the SLM process require precise tracking of multi-material solid-liquid-gas interfaces, due to laser-induced melting/solidification and evaporation/condensation of metal powder in an ambient gas. These rapid density variations and phase change processes tightly couple the governing equations, requiring a fully compressible framework to robustly capture the rapid density variations of the ambient gas and the melting/evaporation of the metal powder. For non-isothermal phase change, the velocity is gradually suppressed through the mushy region by a variable viscosity and Darcy source term model. The governing equations are discretized up to 4th-order accuracy with our reconstructed Discontinuous Galerkin spatial discretization scheme and up to 5th-order accuracy with L-stable fully implicit time discretization schemes (BDF2 and ESDIRK3-5). The resulting set of non-linear equations is solved using a robust Newton-Krylov method, with the Jacobian-free version of the GMRES solver for linear iterations. Due to the stiffness associated with the acoustic waves and thermal and viscous/material strength effects, preconditioning the GMRES solver is essential. A robust and scalable approximate block factorization preconditioner was developed, which utilizes the velocity-pressure (vP) and velocity-temperature (vT) Schur complement systems.
This multigrid block reduction preconditioning technique converges for high CFL/Fourier numbers and exhibits excellent parallel and algorithmic scalability on classic benchmark problems in fluid dynamics (lid-driven cavity flow and natural convection heat transfer) as well as for laser-induced phase change problems in 2D and 3D.
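The "Jacobian-free" ingredient mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common forward-difference approximation J(u) v ≈ (F(u + εv) − F(u)) / ε; the toy residual, the function names, and the choice of ε are all hypothetical and are not taken from the dissertation.

```python
import math
import sys


def residual(u):
    """Toy nonlinear residual F(u); hypothetical, for illustration only."""
    return [u[0] ** 2 + u[1] - 3.0, u[0] + u[1] ** 3 - 5.0]


def jacobian_free_matvec(func, u, v, f_u=None):
    """Approximate J(u) @ v by a forward difference, without forming J."""
    norm_v = math.sqrt(sum(vi * vi for vi in v))
    if norm_v == 0.0:
        return [0.0] * len(v)
    norm_u = math.sqrt(sum(ui * ui for ui in u))
    # One common heuristic for the perturbation size (an assumption here).
    eps = math.sqrt(sys.float_info.epsilon) * (1.0 + norm_u) / norm_v
    if f_u is None:
        f_u = func(u)
    f_pert = func([ui + eps * vi for ui, vi in zip(u, v)])
    return [(fp - f0) / eps for fp, f0 in zip(f_pert, f_u)]


def exact_matvec(u, v):
    """Analytic J(u) @ v for the toy residual, used only as a reference."""
    return [2.0 * u[0] * v[0] + v[1], v[0] + 3.0 * u[1] ** 2 * v[1]]
```

A Krylov method such as GMRES only needs this matrix-vector product, so each linear iteration costs roughly one extra residual evaluation and no Jacobian storage.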
DOE Office of Scientific and Technical Information (OSTI.GOV)

Yeung, Yu-Hong; Pothen, Alex; Halappanavar, Mahantesh

We present an augmented matrix approach to update the solution to a linear system of equations when the coefficient matrix is modified by a few elements within a principal submatrix. This problem arises in the dynamic security analysis of a power grid, where operators need to perform $N-x$ contingency analysis, i.e., determine the state of the system when up to $x$ links from $N$ fail. Our algorithms augment the coefficient matrix to account for the changes in it, and then compute the solution to the augmented system without refactoring the modified matrix. We provide two algorithms, a direct method, and a hybrid direct-iterative method for solving the augmented system. We also exploit the sparsity of the matrices and vectors to accelerate the overall computation. Our algorithms are compared on three power grids with PARDISO, a parallel direct solver, and CHOLMOD, a direct solver with the ability to modify the Cholesky factors of the coefficient matrix. We show that our augmented algorithms outperform PARDISO (by two orders of magnitude), and CHOLMOD (by a factor of up to 5). Further, our algorithms scale better than CHOLMOD as the number of elements updated increases. The solutions are computed with high accuracy. Our algorithms are capable of computing $N-x$ contingency analysis on a $778K$ bus grid, updating a solution with $x=20$ elements in $1.6 \times 10^{-2}$ seconds on an Intel Xeon processor.
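The idea of reusing the factors of the original matrix can be sketched on a small dense example. This is a generic textbook formulation under stated assumptions, not the authors' algorithm: the modified matrix is written as A + H D H^T, where H selects the rows of the principal submatrix and D holds the changes, and the augmented system [[A, H], [D H^T, -I]] [x; y] = [b; 0] is solved by block elimination. All function names are hypothetical, and sparsity is ignored.

```python
def lu_factor(a):
    """LU factorization with partial pivoting of a copy of the dense matrix a."""
    n = len(a)
    lu = [row[:] for row in a]
    perm = list(range(n))
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular")
        lu[k], lu[p] = lu[p], lu[k]
        perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, perm


def lu_solve(factors, b):
    """Solve A x = b given the factors returned by lu_factor(A)."""
    lu, perm = factors
    n = len(lu)
    x = [b[perm[i]] for i in range(n)]
    for i in range(n):  # forward substitution with unit lower factor
        for j in range(i):
            x[i] -= lu[i][j] * x[j]
    for i in reversed(range(n)):  # back substitution with upper factor
        for j in range(i + 1, n):
            x[i] -= lu[i][j] * x[j]
        x[i] /= lu[i][i]
    return x


def solve_modified(factors, b, idx, delta):
    """Solve (A + H*delta*H^T) x = b using only the existing factors of A.

    idx lists the rows/columns of the principal submatrix that changed;
    delta is the small dense matrix of changes to that submatrix.
    """
    n, m = len(b), len(idx)
    z = lu_solve(factors, b)  # z = A^{-1} b
    w_cols = []  # columns of W = A^{-1} H
    for j in idx:
        e = [0.0] * n
        e[j] = 1.0
        w_cols.append(lu_solve(factors, e))
    # Small m-by-m system: (I + delta * H^T W) y = delta * H^T z
    small = [[(1.0 if r == c else 0.0)
              + sum(delta[r][k] * w_cols[c][idx[k]] for k in range(m))
              for c in range(m)] for r in range(m)]
    rhs = [sum(delta[r][k] * z[idx[k]] for k in range(m)) for r in range(m)]
    y = lu_solve(lu_factor(small), rhs)
    return [z[i] - sum(w_cols[c][i] * y[c] for c in range(m)) for i in range(n)]
```

Only the small system, whose size equals the number of modified rows, is factored anew; every other step is a triangular solve with the factors already in hand.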
Computational efficiency improvements for image colorization

NASA Astrophysics Data System (ADS)

Yu, Chao; Sharma, Gaurav; Aly, Hussein

2013-03-01

We propose an efficient algorithm for colorization of greyscale images. As in prior work, colorization is posed as an optimization problem: a user specifies the color for a few scribbles drawn on the greyscale image and the color image is obtained by propagating color information from the scribbles to surrounding regions, while maximizing the local smoothness of colors. In this formulation, colorization is obtained by solving a large sparse linear system, which normally requires substantial computation and memory resources. Our algorithm improves the computational performance through three innovations over prior colorization implementations. First, the linear system is solved iteratively without explicitly constructing the sparse matrix, which significantly reduces the required memory. Second, we formulate each iteration in terms of integral images obtained by dynamic programming, reducing repetitive computation. Third, we use a coarse-to-fine framework, where a lower resolution subsampled image is first colorized and this low resolution color image is upsampled to initialize the colorization process for the fine level. The improvements we develop provide significant speedup and memory savings compared to the conventional approach of solving the linear system directly using off-the-shelf sparse solvers, and allow us to colorize images with typical sizes encountered in realistic applications on typical commodity computing platforms.
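Solving a sparse system "without explicitly constructing the sparse matrix" can be sketched as a matrix-free iteration. The abstract does not name the iterative method, so conjugate gradients is an assumption here, and the 1D second-difference operator is a hypothetical stand-in for the colorization system; the point is only that the solver sees a function that applies the operator, never a stored matrix.

```python
def apply_operator(x):
    """Apply a 1D second-difference stencil (2, -1, -1) without storing a matrix."""
    n = len(x)
    y = [0.0] * n
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        y[i] = 2.0 * x[i] - left - right
    return y


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def conjugate_gradient(apply_a, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite operator given as a function."""
    x = [0.0] * len(b)
    r = b[:]  # residual for the zero initial guess
    p = r[:]
    rs = dot(r, r)
    for _ in range(max_iter):
        if rs ** 0.5 <= tol:
            break
        ap = apply_a(p)
        alpha = rs / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x
```

Memory use is a handful of vectors of the problem size, which is the saving the abstract refers to when it contrasts this with off-the-shelf sparse direct solvers.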
NONLINEAR MULTIGRID SOLVER EXPLOITING AMGe COARSE SPACES WITH APPROXIMATION PROPERTIES

DOE Office of Scientific and Technical Information (OSTI.GOV)

Christensen, Max La Cour; Villa, Umberto E.; Engsig-Karup, Allan P.

The paper introduces a nonlinear multigrid solver for mixed finite element discretizations based on the Full Approximation Scheme (FAS) and element-based Algebraic Multigrid (AMGe). The main motivation to use FAS for unstructured problems is the guaranteed approximation property of the AMGe coarse spaces that were developed recently at Lawrence Livermore National Laboratory. These give the ability to derive stable and accurate coarse nonlinear discretization problems. The previous attempts (including ones with the original AMGe method, [5, 11]), were less successful due to lack of such good approximation properties of the coarse spaces. With coarse spaces with approximation properties, our FAS approach on unstructured meshes should be as powerful/successful as FAS on geometrically refined meshes. For comparison, Newton's method and Picard iterations with an inner state-of-the-art linear solver are compared to FAS on a nonlinear saddle point problem with applications to porous media flow. It is demonstrated that FAS is faster than Newton's method and Picard iterations for the experiments considered here. Due to the guaranteed approximation properties of our AMGe, the coarse spaces are very accurate, providing a solver with the potential for mesh-independent convergence on general unstructured meshes.
A coarse-grid projection method for accelerating incompressible flow computations

NASA Astrophysics Data System (ADS)

San, Omer; Staples, Anne

2011-11-01

We present a coarse-grid projection (CGP) algorithm for accelerating incompressible flow computations, which is applicable to methods involving Poisson equations as incompressibility constraints. CGP methodology is a modular approach that facilitates data transfer with simple interpolations and uses black-box solvers for the Poisson and advection-diffusion equations in the flow solver. Here, we investigate a particular CGP method for the vorticity-stream function formulation that uses the full weighting operation for mapping from fine to coarse grids, the third-order Runge-Kutta method for time stepping, and finite differences for the spatial discretization. After solving the Poisson equation on a coarsened grid, bilinear interpolation is used to obtain the fine data for consequent time stepping on the full grid. We compute several benchmark flows: the Taylor-Green vortex, a vortex pair merging, a double shear layer, decaying turbulence and the Taylor-Green vortex on a distorted grid. In all cases we use either FFT-based or V-cycle multigrid linear-cost Poisson solvers. Reducing the number of degrees of freedom of the Poisson solver by powers of two accelerates these computations while, for the first level of coarsening, retaining the same level of accuracy in the fine resolution vorticity field.
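The two grid-transfer operations named in this abstract, full weighting from fine to coarse and bilinear interpolation back to the fine grid, can be sketched as follows. This assumes square vertex-centered grids with 2n-1 fine points per side for n coarse points, and plain injection on the boundary; both choices and the function names are illustrative assumptions, not details from the paper.

```python
def restrict_full_weighting(fine):
    """Map a (2n-1)x(2n-1) fine grid to an n x n coarse grid.

    Interior coarse points use the 9-point full-weighting stencil
    (weights 4, 2, 1 over 16); boundary points are injected.
    """
    nf = len(fine)
    nc = (nf + 1) // 2
    coarse = [[fine[2 * i][2 * j] for j in range(nc)] for i in range(nc)]
    for i in range(1, nc - 1):
        for j in range(1, nc - 1):
            r, c = 2 * i, 2 * j
            edges = fine[r - 1][c] + fine[r + 1][c] + fine[r][c - 1] + fine[r][c + 1]
            corners = (fine[r - 1][c - 1] + fine[r - 1][c + 1]
                       + fine[r + 1][c - 1] + fine[r + 1][c + 1])
            coarse[i][j] = (4.0 * fine[r][c] + 2.0 * edges + corners) / 16.0
    return coarse


def prolong_bilinear(coarse):
    """Map an n x n coarse grid to a (2n-1)x(2n-1) fine grid by bilinear interpolation."""
    nc = len(coarse)
    nf = 2 * nc - 1
    fine = [[0.0] * nf for _ in range(nf)]
    for r in range(nf):
        for c in range(nf):
            i0, j0 = r // 2, c // 2
            i1 = min(i0 + r % 2, nc - 1)
            j1 = min(j0 + c % 2, nc - 1)
            # Coincident points reduce to a copy; midpoints average 2 or 4 neighbours.
            fine[r][c] = 0.25 * (coarse[i0][j0] + coarse[i0][j1]
                                 + coarse[i1][j0] + coarse[i1][j1])
    return fine
```

In a coarse-grid projection step, the Poisson right-hand side would be restricted, the Poisson problem solved on the coarse grid, and the result prolonged before time stepping continues on the fine grid.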
A Riemann solver for single-phase and two-phase shallow flow models based on relaxation. Relations with Roe and VFRoe solvers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pelanti, Marica, E-mail: Marica.Pelanti@ens.f; Bouchut, Francois, E-mail: francois.bouchut@univ-mlv.f; Mangeney, Anne, E-mail: mangeney@ipgp.jussieu.f

2011-02-01

We present a Riemann solver derived by a relaxation technique for classical single-phase shallow flow equations and for a two-phase shallow flow model describing a mixture of solid granular material and fluid. Our primary interest is the numerical approximation of this two-phase solid/fluid model, whose complexity poses numerical difficulties that cannot be efficiently addressed by existing solvers. In particular, we are concerned with ensuring a robust treatment of dry bed states. The relaxation system used by the proposed solver is formulated by introducing auxiliary variables that replace the momenta in the spatial gradients of the original model systems. The resulting relaxation solver is related to the Roe solver in that its Riemann solution for the flow height and relaxation variables is formally computed as Roe's Riemann solution. The relaxation solver has the advantage of a certain degree of freedom in the specification of the wave structure through the choice of the relaxation parameters. This flexibility can be exploited to robustly handle vacuum states, which is a well-known difficulty of the standard Roe method, while maintaining Roe's low diffusivity. For the single-phase model, positivity of flow height is rigorously preserved. For the two-phase model, positivity of volume fractions in general is not ensured, and a suitable restriction on the CFL number might be needed. Nonetheless, numerical experiments suggest that the proposed two-phase flow solver efficiently models wet/dry fronts and vacuum formation for a large range of flow conditions. As a corollary of our study, we show that for single-phase shallow flow equations the relaxation solver is formally equivalent to the VFRoe solver with conservative variables of Gallouet and Masella [T. Gallouet, J.-M. Masella, Un schema de Godunov approche C.R. Acad. Sci. Paris, Serie I, 323 (1996) 77-84]. The relaxation interpretation allows establishing positivity conditions for this VFRoe method.
Numerical System Solver Developed for the National Cycle Program

NASA Technical Reports Server (NTRS)

Binder, Michael P.

1999-01-01

As part of the National Cycle Program (NCP), a powerful new numerical solver has been developed to support the simulation of aeropropulsion systems. This software uses a hierarchical object-oriented design. It can provide steady-state and time-dependent solutions to nonlinear and even discontinuous problems typically encountered when aircraft and spacecraft propulsion systems are simulated. It also can handle constrained solutions, in which one or more factors may limit the behavior of the engine system. Time-dependent simulation capabilities include adaptive time-stepping and synchronization with digital control elements. The NCP solver is playing an important role in making the NCP a flexible, powerful, and reliable simulation package.
N-MODY: a code for collisionless N-body simulations in modified Newtonian dynamics.

NASA Astrophysics Data System (ADS)

Londrillo, P.; Nipoti, C.

We describe the numerical code N-MODY, a parallel particle-mesh code for collisionless N-body simulations in modified Newtonian dynamics (MOND). N-MODY is based on a numerical potential solver in spherical coordinates that solves the non-linear MOND field equation, and is ideally suited to simulate isolated stellar systems. N-MODY can be used also to compute the MOND potential of arbitrary static density distributions. A few applications of N-MODY indicate that some astrophysically relevant dynamical processes are profoundly different in MOND and in Newtonian gravity with dark matter.

Overview of the CHarring Ablator Response (CHAR) Code

NASA Technical Reports Server (NTRS)

Amar, Adam J.; Oliver, A. Brandon; Kirk, Benjamin S.; Salazar, Giovanni; Droba, Justin

2016-01-01

An overview of the capabilities of the CHarring Ablator Response (CHAR) code is presented. CHAR is a one-, two-, and three-dimensional unstructured continuous Galerkin finite-element heat conduction and ablation solver with both direct and inverse modes. Additionally, CHAR includes a coupled linear thermoelastic solver for determination of internal stresses induced from the temperature field and surface loading. Background on the development process, governing equations, material models, discretization techniques, and numerical methods is provided. Special focus is put on the available boundary conditions including thermochemical ablation and contact interfaces, and example simulations are included. Finally, a discussion of ongoing development efforts is presented.
Overview of the CHarring Ablator Response (CHAR) Code

NASA Technical Reports Server (NTRS)

Amar, Adam J.; Oliver, A. Brandon; Kirk, Benjamin S.; Salazar, Giovanni; Droba, Justin

2016-01-01

An overview of the capabilities of the CHarring Ablator Response (CHAR) code is presented. CHAR is a one-, two-, and three-dimensional unstructured continuous Galerkin finite-element heat conduction and ablation solver with both direct and inverse modes. Additionally, CHAR includes a coupled linear thermoelastic solver for determination of internal stresses induced from the temperature field and surface loading. Background on the development process, governing equations, material models, discretization techniques, and numerical methods is provided. Special focus is put on the available boundary conditions including thermochemical ablation, surface-to-surface radiation exchange, and flowfield coupling. Finally, a discussion of ongoing development efforts is presented.
Development and Verification of the Charring, Ablating Thermal Protection Implicit System Simulator

NASA Technical Reports Server (NTRS)

Amar, Adam J.; Calvert, Nathan; Kirk, Benjamin S.

2011-01-01

The development and verification of the Charring Ablating Thermal Protection Implicit System Solver (CATPISS) is presented. This work concentrates on the derivation and verification of the stationary grid terms in the equations that govern three-dimensional heat and mass transfer for charring thermal protection systems including pyrolysis gas flow through the porous char layer. The governing equations are discretized according to the Galerkin finite element method (FEM) with first and second order fully implicit time integrators. The governing equations are fully coupled and are solved in parallel via Newton's method, while the linear system is solved via the Generalized Minimum Residual method (GMRES). Verification results from exact solutions and Method of Manufactured Solutions (MMS) are presented to show spatial and temporal orders of accuracy as well as nonlinear convergence rates.
Parallel Finite Element Domain Decomposition for Structural/Acoustic Analysis

NASA Technical Reports Server (NTRS)

Nguyen, Duc T.; Tungkahotara, Siroj; Watson, Willie R.; Rajan, Subramaniam D.

2005-01-01

A domain decomposition (DD) formulation for solving sparse linear systems of equations resulting from finite element analysis is presented. The formulation incorporates mixed direct and iterative equation solving strategies and other novel algorithmic ideas that are optimized to take advantage of sparsity and exploit modern computer architecture, such as memory and parallel computing. The most time-consuming part of the formulation is identified and the critical roles of direct sparse and iterative solvers within the framework of the formulation are discussed. Experiments on several computer platforms using several complex test matrices are conducted using software based on the formulation. Small-scale structural examples are used to validate the steps in the formulation and large-scale (1,000,000+ unknowns) duct acoustic examples are used to evaluate the ORIGIN 2000 processors and a cluster of 6 PCs (running under the Windows environment). Statistics show that the formulation is efficient in both sequential and parallel computing environments and that the formulation is significantly faster and consumes less memory than that based on one of the best available commercialized parallel sparse solvers.
A High-Order Low-Order Algorithm with Exponentially Convergent Monte Carlo for Thermal Radiative Transfer

DOE PAGES

Bolding, Simon R.; Cleveland, Mathew Allen; Morel, Jim E.

2016-10-21

In this paper, we have implemented a new high-order low-order (HOLO) algorithm for solving thermal radiative transfer problems. The low-order (LO) system is based on the spatial and angular moments of the transport equation and a linear-discontinuous finite-element spatial representation, producing equations similar to the standard S2 equations. The LO solver is fully implicit in time and efficiently resolves the nonlinear temperature dependence at each time step. The high-order (HO) solver utilizes exponentially convergent Monte Carlo (ECMC) to give a globally accurate solution for the angular intensity to a fixed-source pure-absorber transport problem. This global solution is used to compute consistency terms, which require the HO and LO solutions to converge toward the same solution. The use of ECMC allows for the efficient reduction of statistical noise in the Monte Carlo solution, reducing inaccuracies introduced through the LO consistency terms. Finally, we compare results with an implicit Monte Carlo code for one-dimensional gray test problems and demonstrate the efficiency of ECMC over standard Monte Carlo in this HOLO algorithm.
Theory and implementation of H-matrix based iterative and direct solvers for Helmholtz and elastodynamic oscillatory kernels

NASA Astrophysics Data System (ADS)

Chaillat, Stéphanie; Desiderio, Luca; Ciarlet, Patrick

2017-12-01

In this work, we study the accuracy and efficiency of hierarchical matrix (H-matrix) based fast methods for solving dense linear systems arising from the discretization of the 3D elastodynamic Green's tensors. It is well known in the literature that standard H-matrix based methods, although very efficient tools for asymptotically smooth kernels, are not optimal for oscillatory kernels. H2-matrix and directional approaches have been proposed to overcome this problem. However the implementation of such methods is much more involved than the standard H-matrix representation. The central questions we address are twofold. (i) What is the frequency-range in which the H-matrix format is an efficient representation for 3D elastodynamic problems? (ii) What can be expected of such an approach to model problems in mechanical engineering? We show that even though the method is not optimal (in the sense that more involved representations can lead to faster algorithms) an efficient solver can be easily developed. The capabilities of the method are illustrated on numerical examples using the Boundary Element Method.
Optimization on Paddy Crops in Central Java (with Solver, SVD on Least Square and ACO (Ant Colony Algorithm))

NASA Astrophysics Data System (ADS)

Parhusip, H. A.; Trihandaru, S.; Susanto, B.; Prasetyo, S. Y. J.; Agus, Y. H.; Simanjuntak, B. H.

2017-03-01

Several algorithms and objective functions have been studied to obtain optimal paddy crop yields in Central Java, based on data from Surakarta and Boyolali. The algorithms are a linear solver, least squares, and the Ant Colony Algorithm (ACO), used to develop optimization procedures for paddy crops modelled with Modified GSTAR (Generalized Space-Time Autoregressive) and nonlinear models, where the nonlinear models are quadratic and power functions. The data are used to determine the best planting period in Surakarta for 1992-2012, where 3 planting periods are known, and the optimal amount of paddy crops in Boyolali for 2008-2013. These analyses may guide the local agricultural government in making decisions on rice sustainability in its region. The best planting period in Surakarta, based on the 1992-2012 data, is September-December, with the planting area, the cropping area, and the paddy crops taken as the most important factors. As a result, we can refer to the paddy crops in this best period (about 60.4 thousand tons per year) as the optimal result for 1992-2012, where the objective function used is quadratic. According to the research, the optimal paddy crops in Boyolali are about 280 thousand tons per year, where the studied factors are the amount of rainfall, the harvested area, and the paddy crops in 2008-2013. In this case, linear and power functions are studied as the objective functions. Among all studied algorithms, the linear solver is still recommended as an optimization tool for a local agricultural government to predict future paddy crops.
Parallelization of the preconditioned IDR solver for modern multicore computer systems

NASA Astrophysics Data System (ADS)

Bessonov, O. A.; Fedoseyev, A. I.

2012-10-01

This paper presents the analysis, parallelization and optimization approach for the large sparse matrix solver CNSPACK for modern multicore microprocessors. CNSPACK is an advanced solver successfully used for coupled solution of stiff problems arising in multiphysics applications such as CFD, semiconductor transport, kinetic and quantum problems. It employs the iterative IDR algorithm with ILU preconditioning (with a user-chosen ILU preconditioning order). CNSPACK has been successfully used during the last decade for solving problems in several application areas, including fluid dynamics and semiconductor device simulation. However, processor architectures and computer system organization have changed dramatically in recent years. Because of this, performance criteria and methods have been revisited, and the solver and preconditioner have been parallelized using the OpenMP environment. Results of the successful implementation of efficient parallelization are presented for the most advanced computer systems (Intel Core i7-9xx or two-processor Xeon 55xx/56xx).
Iterative Methods to Solve Linear RF Fields in Hot Plasma

NASA Astrophysics Data System (ADS)

Spencer, Joseph; Svidzinski, Vladimir; Evstatiev, Evstati; Galkin, Sergei; Kim, Jin-Soo

2014-10-01

Most magnetic plasma confinement devices use radio frequency (RF) waves for current drive and/or heating. Numerical modeling of RF fields is an important part of performance analysis of such devices and a predictive tool aiding design and development of future devices. Prior attempts at this modeling have mostly used direct solvers to solve the formulated linear equations. Full wave modeling of RF fields in hot plasma with 3D nonuniformities is mostly prohibited, with memory demands of a direct solver placing a significant limitation on spatial resolution. Iterative methods can significantly increase spatial resolution. We explore the feasibility of using iterative methods in 3D full wave modeling. The linear wave equation is formulated using two approaches: for cold plasmas the local cold plasma dielectric tensor is used (resolving resonances by particle collisions), while for hot plasmas the conductivity kernel (which includes a nonlocal dielectric response) is calculated by integrating along test particle orbits. The wave equation is discretized using a finite difference approach. The initial guess is important in iterative methods, and we examine different initial guesses including the solution to the cold plasma wave equation. Work is supported by the U.S. DOE SBIR program.
A Computational/Experimental Study of Two Optimized Supersonic Transport Designs and the Reference H Baseline

NASA Technical Reports Server (NTRS)

Cliff, Susan E.; Baker, Timothy J.; Hicks, Raymond M.; Reuther, James J.

1999-01-01

Two supersonic transport configurations designed by use of non-linear aerodynamic optimization methods are compared with a linearly designed baseline configuration. One optimized configuration, designated Ames 7-04, was designed at NASA Ames Research Center using an Euler flow solver, and the other, designated Boeing W27, was designed at Boeing using a full-potential method. The two optimized configurations and the baseline were tested in the NASA Langley Unitary Plan Supersonic Wind Tunnel to evaluate the non-linear design optimization methodologies. In addition, the experimental results are compared with computational predictions for each of the three configurations from the Euler flow solver, AIRPLANE. The computational and experimental results both indicate moderate to substantial performance gains for the optimized configurations over the baseline configuration. The computed performance changes with and without diverters and nacelles were in excellent agreement with experiment for all three models. Comparisons of the computational and experimental cruise drag increments for the optimized configurations relative to the baseline show excellent agreement for the model designed by the Euler method, but poorer comparisons were found for the configuration designed by the full-potential code.
Efficient convolutional sparse coding

DOEpatents

Wohlberg, Brendt

2017-06-20

Computationally efficient algorithms may be applied for fast dictionary learning solving the convolutional sparse coding problem in the Fourier domain. More specifically, efficient convolutional sparse coding may be derived within an alternating direction method of multipliers (ADMM) framework that utilizes fast Fourier transforms (FFT) to solve the main linear system in the frequency domain. Such algorithms may enable a significant reduction in computational cost over conventional approaches by implementing a linear solver for the most critical and computationally expensive component of the conventional iterative algorithm. The theoretical computational cost of the algorithm may be reduced from O(M^3 N) to O(MN log N), where N is the dimensionality of the data and M is the number of elements in the dictionary. This significant improvement in efficiency may greatly increase the range of problems that can practically be addressed via convolutional sparse representations.
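As a rough illustration of how a frequency-domain linear system of this kind can be solved cheaply, the sketch below assumes that each frequency yields an independent M x M system of the form (rho*I + a a^H) x = b, i.e. a scaled identity plus a rank-one term. That structure is an assumption made here for illustration and is not stated in the abstract; the function names and the parameter rho are likewise hypothetical. Under that assumption the Sherman-Morrison formula gives the solution in O(M) operations per frequency instead of the O(M^3) of a general dense solve.

```python
def solve_rank_one_plus_identity(a, b, rho):
    """Solve (rho*I + a a^H) x = b with the Sherman-Morrison formula.

    a, b: lists of complex numbers of equal length M (assumed inputs).
    rho:  positive real scalar (assumed penalty-like parameter).
    Cost is O(M) because no matrix is ever formed.
    """
    # Inner products a^H b and a^H a (conjugate taken on a).
    aHb = sum(ai.conjugate() * bi for ai, bi in zip(a, b))
    aHa = sum(abs(ai) ** 2 for ai in a)
    scale = aHb / (rho + aHa)
    # x = (b - a * (a^H b) / (rho + a^H a)) / rho
    return [(bi - ai * scale) / rho for ai, bi in zip(a, b)]


def apply_system(a, x, rho):
    """Apply (rho*I + a a^H) to x, used only to check the solve."""
    aHx = sum(ai.conjugate() * xi for ai, xi in zip(a, x))
    return [rho * xi + ai * aHx for ai, xi in zip(a, x)]
```

Applying `apply_system` to the returned vector should reproduce `b` up to rounding error, which is a convenient check for any single frequency.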
TDIGG - TWO-DIMENSIONAL, INTERACTIVE GRID GENERATION CODE

NASA Technical Reports Server (NTRS)

Vu, B. T.

1994-01-01

TDIGG is a fast and versatile program for generating two-dimensional computational grids for use with finite-difference flow-solvers. Both algebraic and elliptic grid generation systems are included. The method for grid generation by algebraic transformation is based on an interpolation algorithm and the elliptic grid generation is established by solving the partial differential equation (PDE). Non-uniform grid distributions are carried out using a hyperbolic tangent stretching function. For algebraic grid systems, interpolations in one direction (univariate) and two directions (bivariate) are considered. These interpolations are associated with linear or cubic Lagrangian/Hermite/Bezier polynomial functions. The algebraic grids can subsequently be smoothed using an elliptic solver. For elliptic grid systems, the PDE can be in the form of Laplace (zero forcing function) or Poisson. The forcing functions in the Poisson equation come from the boundary or the entire domain of the initial algebraic grids. A graphics interface procedure using the Silicon Graphics (GL) Library is included to allow users to visualize the grid variations at each iteration. This will allow users to interactively modify the grid to match their applications. TDIGG is written in FORTRAN 77 for Silicon Graphics IRIS series computers running IRIX. This package requires either MIT's X Window System, Version 11 Revision 4 or SGI (Motif) Window System. A sample executable is provided on the distribution medium. It requires 148K of RAM for execution. The standard distribution medium is a .25 inch streaming magnetic IRIX tape cartridge in UNIX tar format. This program was developed in 1992.
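The abstract does not give the exact stretching function used in TDIGG. As a minimal sketch, the following assumes one common one-sided hyperbolic tangent form, s(xi) = 1 + tanh(beta*(xi - 1)) / tanh(beta), which maps uniformly spaced xi in [0, 1] to points in [0, 1] clustered toward 0. The function name and the parameter beta are illustrative assumptions.

```python
import math


def tanh_stretch(n, beta):
    """Return n + 1 grid points in [0, 1], clustered toward 0.

    n:    number of intervals (n >= 1).
    beta: stretching strength (> 0); larger values cluster more strongly.
    Uses the assumed form s = 1 + tanh(beta * (xi - 1)) / tanh(beta).
    """
    points = []
    for i in range(n + 1):
        xi = i / n  # uniform computational coordinate
        points.append(1.0 + math.tanh(beta * (xi - 1.0)) / math.tanh(beta))
    return points


grid = tanh_stretch(10, 2.0)
spacings = [b - a for a, b in zip(grid, grid[1:])]
print(spacings)  # spacing grows with distance from the clustered end
```

Mirroring the formula, or blending two such functions, gives clustering at the other end or at both ends of the interval.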
Eigenvalue Solvers for Modeling Nuclear Reactors on Leadership Class Machines

DOE PAGES

Slaybaugh, R. N.; Ramirez-Zweiger, M.; Pandya, Tara; ...

2018-02-20

In this paper, three complementary methods have been implemented in the code Denovo that accelerate neutral particle transport calculations with methods that use leadership-class computers fully and effectively: a multigroup block (MG) Krylov solver, a Rayleigh quotient iteration (RQI) eigenvalue solver, and a multigrid in energy (MGE) preconditioner. The MG Krylov solver converges more quickly than Gauss-Seidel and enables energy decomposition such that Denovo can scale to hundreds of thousands of cores. RQI should converge in fewer iterations than power iteration (PI) for large and challenging problems. RQI creates shifted systems that would not be tractable without the MG Krylov solver. It also creates ill-conditioned matrices. The MGE preconditioner reduces iteration count significantly when used with RQI and takes advantage of the new energy decomposition such that it can scale efficiently. Each individual method has been described before, but this is the first time they have been demonstrated to work together effectively. The combination of solvers enables the RQI eigenvalue solver to work better than the other available solvers for large reactor problems on leadership-class machines. Using these methods together, RQI converged in fewer iterations and in less time than PI for a full pressurized water reactor core. These solvers also performed better than an Arnoldi eigenvalue solver for a reactor benchmark problem when energy decomposition is needed. The MG Krylov, MGE preconditioner, and RQI solver combination also scales well in energy. Finally, this solver set is a strong choice for very large and challenging problems.
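The contrast between power iteration and Rayleigh quotient iteration can be seen on a small symmetric matrix. The sketch below is not Denovo's implementation: it uses a standard (not generalized) eigenproblem, a hypothetical 3 x 3 test matrix, and a dense Gaussian elimination in place of a Krylov solver. It does show the two points made in the abstract: RQI solves a shifted system at every step, and that system becomes nearly singular (ill-conditioned) as the shift approaches an eigenvalue.

```python
def matvec(A, x):
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in A]


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def normalize(x):
    norm = dot(x, x) ** 0.5
    return [xi / norm for xi in x]


def solve(A, b):
    """Dense Gaussian elimination with partial pivoting (stand-in solver)."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def power_iteration(A, x, tol=1e-10, max_iter=1000):
    """Repeatedly apply A; returns (eigenvalue estimate, iterations)."""
    x = normalize(x)
    lam = dot(x, matvec(A, x))
    for it in range(1, max_iter + 1):
        x = normalize(matvec(A, x))
        new = dot(x, matvec(A, x))
        if abs(new - lam) < tol:
            return new, it
        lam = new
    return lam, max_iter


def rayleigh_quotient_iteration(A, x, tol=1e-10, max_iter=50):
    """Solve (A - mu*I) y = x with mu updated to the Rayleigh quotient."""
    n = len(x)
    x = normalize(x)
    mu = dot(x, matvec(A, x))
    for it in range(1, max_iter + 1):
        shifted = [[A[i][j] - (mu if i == j else 0.0) for j in range(n)]
                   for i in range(n)]
        try:
            y = solve(shifted, x)  # nearly singular close to convergence
        except ZeroDivisionError:
            return mu, it  # shift coincides with an eigenvalue
        x = normalize(y)
        new = dot(x, matvec(A, x))
        if abs(new - mu) < tol:
            return new, it
        mu = new
    return mu, max_iter


A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]  # hypothetical matrix
print(power_iteration(A, [1.0, 1.0, 1.0]))
print(rayleigh_quotient_iteration(A, [1.0, 1.0, 1.0]))
```

Note that RQI converges to the eigenvalue nearest its evolving shift, which for this starting vector is the largest one; in general the starting guess determines which eigenpair is found.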
Eigenvalue Solvers for Modeling Nuclear Reactors on Leadership Class Machines

DOE Office of Scientific and Technical Information (OSTI.GOV)

Slaybaugh, R. N.; Ramirez-Zweiger, M.; Pandya, Tara

In this paper, three complementary methods have been implemented in the code Denovo that accelerate neutral particle transport calculations with methods that use leadership-class computers fully and effectively: a multigroup block (MG) Krylov solver, a Rayleigh quotient iteration (RQI) eigenvalue solver, and a multigrid in energy (MGE) preconditioner. The MG Krylov solver converges more quickly than Gauss-Seidel and enables energy decomposition such that Denovo can scale to hundreds of thousands of cores. RQI should converge in fewer iterations than power iteration (PI) for large and challenging problems. RQI creates shifted systems that would not be tractable without the MG Krylov solver. It also creates ill-conditioned matrices. The MGE preconditioner reduces iteration count significantly when used with RQI and takes advantage of the new energy decomposition such that it can scale efficiently. Each individual method has been described before, but this is the first time they have been demonstrated to work together effectively. The combination of solvers enables the RQI eigenvalue solver to work better than the other available solvers for large reactor problems on leadership-class machines. Using these methods together, RQI converged in fewer iterations and in less time than PI for a full pressurized water reactor core. These solvers also performed better than an Arnoldi eigenvalue solver for a reactor benchmark problem when energy decomposition is needed. The MG Krylov, MGE preconditioner, and RQI solver combination also scales well in energy. Finally, this solver set is a strong choice for very large and challenging problems.
Real-time adaptive finite element solution of time-dependent Kohn-Sham equation

NASA Astrophysics Data System (ADS)

Bao, Gang; Hu, Guanghui; Liu, Di

2015-01-01

In our previous paper (Bao et al., 2012 [1]), a general framework of using adaptive finite element methods to solve the Kohn-Sham equation has been presented. This work is concerned with solving the time-dependent Kohn-Sham equations. The numerical methods are studied in the time domain, which can be employed to explain both the linear and the nonlinear effects. A Crank-Nicolson scheme and linear finite element space are employed for the temporal and spatial discretizations, respectively. To resolve the trouble regions in the time-dependent simulations, a heuristic error indicator is introduced for the mesh adaptive methods. An algebraic multigrid solver is developed to efficiently solve the complex-valued system derived from the semi-implicit scheme. A mask function is employed to remove or reduce the boundary reflection of the wavefunction. The effectiveness of our method is verified by numerical simulations for both linear and nonlinear phenomena, in which the effectiveness of the mesh adaptive methods is clearly demonstrated.
Towards development of enhanced fully-Lagrangian mesh-free computational methods for fluid-structure interaction

NASA Astrophysics Data System (ADS)

Khayyer, Abbas; Gotoh, Hitoshi; Falahaty, Hosein; Shimizu, Yuma

2018-02-01

Simulation of incompressible fluid flow-elastic structure interactions is targeted by using fully-Lagrangian mesh-free computational methods. A projection-based fluid model (moving particle semi-implicit (MPS)) is coupled with either a Newtonian or a Hamiltonian Lagrangian structure model (MPS or HMPS) in a mathematically-physically consistent manner. The fluid model is founded on the solution of Navier-Stokes and continuity equations. The structure models are configured either in the framework of Newtonian mechanics on the basis of conservation of linear and angular momenta, or Hamiltonian mechanics on the basis of a variational principle for incompressible elastodynamics. A set of enhanced schemes is incorporated for the projection-based fluid model (Enhanced MPS); thus, the developed coupled solvers for fluid structure interaction (FSI) are referred to as Enhanced MPS-MPS and Enhanced MPS-HMPS. Besides, two smoothed particle hydrodynamics (SPH)-based FSI solvers, being developed by the authors, are considered and their potential applicability and comparable performance are briefly discussed in comparison with MPS-based FSI solvers. The SPH-based FSI solvers are established through coupling of a projection-based incompressible SPH (ISPH) fluid model and SPH-based Newtonian/Hamiltonian structure models, leading to Enhanced ISPH-SPH and Enhanced ISPH-HSPH. A comparative study is carried out on the performances of the FSI solvers through a set of benchmark tests, including hydrostatic water column on an elastic plate, high speed impact of an elastic aluminum beam, hydroelastic slamming of a marine panel and dam break with elastic gate.
Transient dynamic analysis of the Bao'An Stadium

NASA Astrophysics Data System (ADS)

Knight, David; Whitefield, Rowan; Nhieu, Eric; Tahmasebinia, Faham; Ansourian, Peter; Alonso-Marroquin, Fernando

2016-08-01

Bao'An Stadium is a unique structure that utilises 54m span cantilevers with tensioned members to support the roof. This report involves a simplified finite element model of Bao'An stadium using Strand7 to analyse the effects of deflections, buckling and earthquake loading. Modelling the cantilevers of the original structure with a double curvature was problematic due to unrealistic deflections and no total mass participation using the Spectral Response Solver. To rectify this, a simplified symmetrical stadium was created and the cable free length attribute was used to induce tension in the inner ring and bottom chord members to create upwards deflection. Further, in place of the Spectral Response Solver, the Transient Linear Dynamic Solver was inputted with an El-Centro earthquake. The stadium's response to a 0.20g earthquake and self-weight indicated the deflections satisfied AS1170.0, the loading in the columns was below the critical buckling load, and all structural members satisfied AS4100.
Aerothermodynamic Design Sensitivities for a Reacting Gas Flow Solver on an Unstructured Mesh Using a Discrete Adjoint Formulation

NASA Astrophysics Data System (ADS)

Thompson, Kyle Bonner

An algorithm is described to efficiently compute aerothermodynamic design sensitivities using a decoupled variable set. In a conventional approach to computing design sensitivities for reacting flows, the species continuity equations are fully coupled to the conservation laws for momentum and energy. In this algorithm, the species continuity equations are solved separately from the mixture continuity, momentum, and total energy equations. This decoupling simplifies the implicit system, so that the flow solver can be made significantly more efficient, with very little penalty on overall scheme robustness. Most importantly, the computational cost of the point implicit relaxation is shown to scale linearly with the number of species for the decoupled system, whereas the fully coupled approach scales quadratically. Also, the decoupled method significantly reduces the cost in wall time and memory in comparison to the fully coupled approach. This decoupled approach for computing design sensitivities with the adjoint system is demonstrated for inviscid flow in chemical non-equilibrium around a re-entry vehicle with a retro-firing annular nozzle. The sensitivities of the surface temperature and mass flow rate through the nozzle plenum are computed with respect to plenum conditions and verified against sensitivities computed using a complex-variable finite-difference approach. The decoupled scheme significantly reduces the computational time and memory required to complete the optimization, making this an attractive method for high-fidelity design of hypersonic vehicles.
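The verification technique named in the abstract, a complex-variable finite difference, can be illustrated on a scalar function. The sketch below is not taken from the thesis: the test function and step size are assumptions chosen for illustration. The idea is that for a real-analytic f, f'(x) is approximately Im(f(x + i*h)) / h, and because no subtraction of nearly equal numbers occurs, h can be made extremely small without cancellation error.

```python
import cmath
import math


def complex_step_derivative(f, x, h=1e-30):
    """Approximate f'(x) as Im(f(x + i*h)) / h.

    f must accept complex arguments and be analytic near x.
    """
    return f(complex(x, h)).imag / h


def f(z):
    # Hypothetical smooth test function: exp(z) / sqrt(z).
    return cmath.exp(z) / cmath.sqrt(z)


def f_prime_exact(x):
    # Analytic derivative of exp(x) / sqrt(x) for comparison.
    return math.exp(x) / math.sqrt(x) * (1.0 - 1.0 / (2.0 * x))


x0 = 1.5
print(complex_step_derivative(f, x0), f_prime_exact(x0))
```

A reference of this kind is useful for checking adjoint sensitivities because it is accurate to near machine precision, unlike a real-valued finite difference whose step size must balance truncation against rounding.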
Parallel Computation of Flow in Heterogeneous Media Modelled by Mixed Finite Elements

NASA Astrophysics Data System (ADS)

Cliffe, K. A.; Graham, I. G.; Scheichl, R.; Stals, L.

2000-11-01

In this paper we describe a fast parallel method for solving highly ill-conditioned saddle-point systems arising from mixed finite element simulations of stochastic partial differential equations (PDEs) modelling flow in heterogeneous media. Each realisation of these stochastic PDEs requires the solution of the linear first-order velocity-pressure system comprising Darcy's law coupled with an incompressibility constraint. The chief difficulty is that the permeability may be highly variable, especially when the statistical model has a large variance and a small correlation length. For reasonable accuracy, the discretisation has to be extremely fine. We solve these problems by first reducing the saddle-point formulation to a symmetric positive definite (SPD) problem using a suitable basis for the space of divergence-free velocities. The reduced problem is solved using parallel conjugate gradients preconditioned with an algebraically determined additive Schwarz domain decomposition preconditioner. The result is a solver which exhibits a good degree of robustness with respect to the mesh size as well as to the variance and to physically relevant values of the correlation length of the underlying permeability field. Numerical experiments exhibit almost optimal levels of parallel efficiency. The domain decomposition solver (DOUG, http://www.maths.bath.ac.uk/~parsoft) used here not only is applicable to this problem but can be used to solve general unstructured finite element systems on a wide range of parallel architectures.
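The core iteration of the reduced SPD solve, preconditioned conjugate gradients, can be sketched in a few lines. The example below is serial, and it swaps in a simple Jacobi (diagonal) preconditioner for the additive Schwarz domain decomposition preconditioner used by the authors. The 1D variable-coefficient operator is a hypothetical stand-in for a problem with strongly varying permeability, and all function names are assumptions.

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def pcg(apply_A, b, apply_Minv, tol=1e-10, max_iter=500):
    """Preconditioned conjugate gradients for an SPD operator.

    apply_A:    function returning A @ v.
    apply_Minv: function applying the preconditioner inverse to a residual.
    Returns (solution, iterations used).
    """
    x = [0.0] * len(b)
    r = b[:]                      # residual for the zero initial guess
    z = apply_Minv(r)
    p = z[:]
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5
    for it in range(1, max_iter + 1):
        Ap = apply_A(p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, it
        z = apply_Minv(r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


def make_operator(k):
    """1D diffusion operator with interface coefficients k[0..n]."""
    n = len(k) - 1
    diag = [k[i] + k[i + 1] for i in range(n)]

    def apply_A(v):
        out = []
        for i in range(n):
            s = diag[i] * v[i]
            if i > 0:
                s -= k[i] * v[i - 1]
            if i < n - 1:
                s -= k[i + 1] * v[i + 1]
            out.append(s)
        return out

    return apply_A, diag


n = 20
k = [10.0 ** (i % 4) for i in range(n + 1)]  # coefficients spanning 1..1000
apply_A, diag = make_operator(k)
jacobi = lambda r: [ri / di for ri, di in zip(r, diag)]
x, iterations = pcg(apply_A, [1.0] * n, jacobi)
print(iterations)
```

Replacing `jacobi` with a subdomain-based solve is where a domain decomposition preconditioner would enter; the outer iteration stays the same.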

A Linear-Elasticity Solver for Higher-Order Space-Time Mesh Deformation

NASA Technical Reports Server (NTRS)

Diosady, Laslo T.; Murman, Scott M.

2018-01-01

A linear-elasticity approach is presented for the generation of meshes appropriate for a higher-order space-time discontinuous finite-element method. The equations of linear-elasticity are discretized using a higher-order, spatially-continuous, finite-element method. Given an initial finite-element mesh, and a specified boundary displacement, we solve for the mesh displacements to obtain a higher-order curvilinear mesh. Alternatively, for moving-domain problems we use the linear-elasticity approach to solve for a temporally discontinuous mesh velocity on each time-slab and recover a continuous mesh deformation by integrating the velocity. The applicability of this methodology is presented for several benchmark test cases.
Fluid-structure interaction involving large deformations: 3D simulations and applications to biological systems

NASA Astrophysics Data System (ADS)

Tian, Fang-Bao; Dai, Hu; Luo, Haoxiang; Doyle, James F.; Rousseau, Bernard

2014-02-01

Three-dimensional fluid-structure interaction (FSI) involving large deformations of flexible bodies is common in biological systems, but accurate and efficient numerical approaches for modeling such systems are still scarce. In this work, we report a successful case of combining an existing immersed-boundary flow solver with a nonlinear finite-element solid-mechanics solver specifically for three-dimensional FSI simulations. This method represents a significant enhancement from the similar methods that are previously available. Based on the Cartesian grid, the viscous incompressible flow solver can handle boundaries of large displacements with simple mesh generation. The solid-mechanics solver has separate subroutines for analyzing general three-dimensional bodies and thin-walled structures composed of frames, membranes, and plates. Both geometric nonlinearity associated with large displacements and material nonlinearity associated with large strains are incorporated in the solver. The FSI is achieved through a strong coupling and partitioned approach. We perform several validation cases, and the results may be used to expand the currently limited database of FSI benchmark study. Finally, we demonstrate the versatility of the present method by applying it to the aerodynamics of elastic wings of insects and the flow-induced vocal fold vibration.
Fluid–structure interaction involving large deformations: 3D simulations and applications to biological systems

PubMed Central

Tian, Fang-Bao; Dai, Hu; Luo, Haoxiang; Doyle, James F.; Rousseau, Bernard

2013-01-01

Three-dimensional fluid–structure interaction (FSI) involving large deformations of flexible bodies is common in biological systems, but accurate and efficient numerical approaches for modeling such systems are still scarce. In this work, we report a successful case of combining an existing immersed-boundary flow solver with a nonlinear finite-element solid-mechanics solver specifically for three-dimensional FSI simulations. This method represents a significant enhancement from the similar methods that are previously available. Based on the Cartesian grid, the viscous incompressible flow solver can handle boundaries of large displacements with simple mesh generation. The solid-mechanics solver has separate subroutines for analyzing general three-dimensional bodies and thin-walled structures composed of frames, membranes, and plates. Both geometric nonlinearity associated with large displacements and material nonlinearity associated with large strains are incorporated in the solver. The FSI is achieved through a strong coupling and partitioned approach. We perform several validation cases, and the results may be used to expand the currently limited database of FSI benchmark study. Finally, we demonstrate the versatility of the present method by applying it to the aerodynamics of elastic wings of insects and the flow-induced vocal fold vibration. PMID:24415796
Introduction to COFFE: The Next-Generation HPCMP CREATE-AV CFD Solver

NASA Technical Reports Server (NTRS)

Glasby, Ryan S.; Erwin, J. Taylor; Stefanski, Douglas L.; Allmaras, Steven R.; Galbraith, Marshall C.; Anderson, W. Kyle; Nichols, Robert H.

2016-01-01

HPCMP CREATE-AV Conservative Field Finite Element (COFFE) is a modular, extensible, robust numerical solver for the Navier-Stokes equations that invokes modularity and extensibility from its first principles. COFFE employs a flexible, class-based hierarchy that provides a modular approach consisting of discretization, physics, parallelization, and linear algebra components. These components are developed with modern software engineering principles to ensure ease of uptake from a user's or developer's perspective. The Streamwise Upwind/Petrov-Galerkin (SU/PG) method is utilized to discretize the compressible Reynolds-Averaged Navier-Stokes (RANS) equations tightly coupled with a variety of turbulence models. The mathematics and the philosophy of the methodology that makes up COFFE are presented.
Nonlinear study of the parallel velocity/tearing instability using an implicit, nonlinear resistive MHD solver

NASA Astrophysics Data System (ADS)

Chacon, L.; Finn, J. M.; Knoll, D. A.

2000-10-01

Recently, a new parallel velocity instability has been found (J. M. Finn, Phys. Plasmas 2, 12 (1995)). This mode is a tearing mode driven unstable by curvature effects and sound wave coupling in the presence of parallel velocity shear. Under such conditions, linear theory predicts that tearing instabilities will grow even in situations in which the classical tearing mode is stable. This could then be a viable seed mechanism for the neoclassical tearing mode, and hence a non-linear study is of interest. Here, the linear and non-linear stages of this instability are explored using a fully implicit, fully nonlinear 2D reduced resistive MHD code (L. Chacon et al., "Implicit, Jacobian-free Newton-Krylov 2D reduced resistive MHD nonlinear solver," submitted to J. Comput. Phys. (2000)), including viscosity and particle transport effects. The nonlinear implicit time integration is performed using the Newton-Raphson iterative algorithm. Krylov iterative techniques are employed for the required algebraic matrix inversions, implemented Jacobian-free (i.e., without ever forming and storing the Jacobian matrix), and preconditioned with a "physics-based" preconditioner. Nonlinear results indicate that, for large total plasma beta and large parallel velocity shear, the instability results in the generation of large poloidal shear flows and large magnetic islands even in regimes when the classical tearing mode is absolutely stable. For small viscosity, the time asymptotic state can be turbulent.
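The "Jacobian-free" ingredient can be sketched independently of the MHD equations. A Krylov method only needs products of the Jacobian J with a vector v, and these can be approximated by a directional finite difference of the nonlinear residual F, J(u) v ≈ (F(u + h v) - F(u)) / h, so J is never formed or stored. The residual function and the step-size rule below are illustrative assumptions, not the authors' code.

```python
def jacobian_free_matvec(F, u, v, eps=1e-7):
    """Approximate J(u) @ v by a forward difference of the residual F.

    F:   function mapping a list of floats to a list of floats.
    u:   current nonlinear iterate.
    v:   direction supplied by the Krylov method.
    eps: relative perturbation size (assumed value).
    """
    norm_v = sum(vi * vi for vi in v) ** 0.5
    if norm_v == 0.0:
        return [0.0] * len(u)
    norm_u = sum(ui * ui for ui in u) ** 0.5
    h = eps * (1.0 + norm_u) / norm_v  # scale step to the sizes of u and v
    Fu = F(u)
    Fp = F([ui + h * vi for ui, vi in zip(u, v)])
    return [(fp - f0) / h for fp, f0 in zip(Fp, Fu)]


def residual(u):
    # Hypothetical two-equation nonlinear system used only for the demo.
    return [u[0] ** 2 + u[1] - 3.0, u[0] + u[1] ** 3 - 5.0]


print(jacobian_free_matvec(residual, [1.0, 2.0], [0.5, -1.0]))
```

In a full Newton-Krylov solver this function would be passed to the Krylov iteration in place of an explicit matrix, with the preconditioner applied around it.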
Time integration algorithms for the two-dimensional Euler equations on unstructured meshes

NASA Technical Reports Server (NTRS)

Slack, David C.; Whitaker, D. L.; Walters, Robert W.

1994-01-01

Explicit and implicit time integration algorithms for the two-dimensional Euler equations on unstructured grids are presented. Both cell-centered and cell-vertex finite volume upwind schemes utilizing Roe's approximate Riemann solver are developed. For the cell-vertex scheme, a four-stage Runge-Kutta time integration, a four-stage Runge-Kutta time integration with implicit residual averaging, a point Jacobi method, a symmetric point Gauss-Seidel method and two methods utilizing preconditioned sparse matrix solvers are presented. For the cell-centered scheme, a Runge-Kutta scheme, an implicit tridiagonal relaxation scheme modeled after line Gauss-Seidel, a fully implicit lower-upper (LU) decomposition, and a hybrid scheme utilizing both Runge-Kutta and LU methods are presented. A reverse Cuthill-McKee renumbering scheme is employed for the direct solver to decrease CPU time by reducing the fill of the Jacobian matrix. A comparison of the various time integration schemes is made for both first-order and higher order accurate solutions using several mesh sizes; higher order accuracy is achieved by using multidimensional monotone linear reconstruction procedures. The results obtained for a transonic flow over a circular arc suggest that the preconditioned sparse matrix solvers perform better than the other methods as the number of elements in the mesh increases.
Fluid-structure coupling for wind turbine blade analysis using OpenFOAM

NASA Astrophysics Data System (ADS)

Dose, Bastian; Herraez, Ivan; Peinke, Joachim

2015-11-01

Modern wind turbine rotor blades are designed to be increasingly large and flexible. This structural flexibility represents a problem for the field of Computational Fluid Dynamics (CFD), which is used for accurate load calculations and detailed investigations of rotor aerodynamics. As the blade geometries within CFD simulations are considered stiff, the effect of blade deformation caused by aerodynamic loads cannot be captured by the common CFD approach. Coupling the flow solver with a structural solver can overcome this restriction and enables the investigation of flexible wind turbine blades. For this purpose, a new Finite Element (FE) solver was implemented into the open source CFD code OpenFOAM. Using a beam element formulation based on the Geometrically Exact Beam Theory (GEBT), the structural model can capture geometric non-linearities such as large deformations. Coupled with CFD solvers of the OpenFOAM package, the new framework represents a powerful tool for aerodynamic investigations. In this work, we investigated the aerodynamic performance of a state-of-the-art wind turbine. For different wind speeds, aerodynamic key parameters are evaluated and compared for both rigid and flexible blade geometries. The present work is funded within the framework of the joint project Smart Blades (0325601D) by the German Federal Ministry for Economic Affairs and Energy (BMWi) under decision of the German Federal Parliament.
Block Preconditioning to Enable Physics-Compatible Implicit Multifluid Plasma Simulations

NASA Astrophysics Data System (ADS)

Phillips, Edward; Shadid, John; Cyr, Eric; Miller, Sean

2017-10-01

Multifluid plasma simulations involve large systems of partial differential equations in which many time-scales ranging over many orders of magnitude arise. Since the fastest of these time-scales may set a restrictively small time-step limit for explicit methods, the use of implicit or implicit-explicit time integrators can be more tractable for obtaining dynamics at time-scales of interest. Furthermore, to enforce properties such as charge conservation and divergence-free magnetic field, mixed discretizations using volume, nodal, edge-based, and face-based degrees of freedom are often employed in some form. Together with the presence of stiff modes due to integrating over fast time-scales, the mixed discretization makes the required linear solves for implicit methods particularly difficult for black box and monolithic solvers. This work presents a block preconditioning strategy for multifluid plasma systems that segregates the linear system based on discretization type and approximates off-diagonal coupling in block diagonal Schur complement operators. By employing multilevel methods for the block diagonal subsolves, this strategy yields algorithmic and parallel scalability which we demonstrate on a range of problems.
Constraint-Based Abstract Semantics for Temporal Logic: A Direct Approach to Design and Implementation

NASA Astrophysics Data System (ADS)

Banda, Gourinath; Gallagher, John P.

Abstract interpretation provides a practical approach to verifying properties of infinite-state systems. We apply the framework of abstract interpretation to derive an abstract semantic function for the modal μ-calculus, which is the basis for abstract model checking. The abstract semantic function is constructed directly from the standard concrete semantics together with a Galois connection between the concrete state-space and an abstract domain. There is no need for mixed or modal transition systems to abstract arbitrary temporal properties, as in previous work in the area of abstract model checking. Using the modal μ-calculus to implement CTL, the abstract semantics gives an over-approximation of the set of states in which an arbitrary CTL formula holds. Then we show that this leads directly to an effective implementation of an abstract model checking algorithm for CTL using abstract domains based on linear constraints. The implementation of the abstract semantic function makes use of an SMT solver. We describe an implemented system for proving properties of linear hybrid automata and give some experimental results.
A new polytopic approach for the unknown input functional observer design

NASA Astrophysics Data System (ADS)

Bezzaoucha, Souad; Voos, Holger; Darouach, Mohamed

2018-03-01

In this paper, a constructive procedure to design Functional Unknown Input Observers for nonlinear continuous time systems is proposed under the Polytopic Takagi-Sugeno framework. An equivalent representation for the nonlinear model is achieved using the sector nonlinearity transformation. Applying the Lyapunov theory and the ? attenuation, linear matrix inequalities conditions are deduced which are solved for feasibility to obtain the observer design matrices. To cope with the effect of unknown inputs, classical approach of decoupling the unknown input for the linear case is used. Both algebraic and solver-based solutions are proposed (relaxed conditions). Necessary and sufficient conditions for the existence of the functional polytopic observer are given. For both approaches, the general and particular cases (measurable premise variables, full state estimation with full and reduced order cases) are considered and it is shown that the proposed conditions correspond to the one presented for standard linear case. To illustrate the proposed theoretical results, detailed numerical simulations are presented for a Quadrotor Aerial Robots Landing and a Waste Water Treatment Plant. Both systems are highly nonlinear and represented in a T-S polytopic form with unmeasurable premise variables and unknown inputs.
Efficient development of memory bounded geo-applications to scale on modern supercomputers

NASA Astrophysics Data System (ADS)

Räss, Ludovic; Omlin, Samuel; Licul, Aleksandar; Podladchikov, Yuri; Herman, Frédéric

2016-04-01

Numerical modeling is a key tool in the geosciences. The current challenge is to solve problems that are multi-physics and for which the length scale and the place of occurrence might not be known in advance. Also, the spatial extent of the investigated domain might vary strongly in size, ranging from millimeters for reactive transport to kilometers for glacier erosion dynamics. An efficient way to proceed is to develop simple but robust algorithms that perform well and scale on modern supercomputers and therefore permit very high-resolution simulations. We propose an efficient approach to solve memory-bounded real-world applications on modern supercomputer architectures. We optimize the software to run on our newly acquired state-of-the-art GPU cluster "octopus". Our approach shows promising preliminary results on important geodynamical and geomechanical problems: we have developed a Stokes solver for glacier flow and a poromechanical solver including complex rheologies for nonlinear waves in stressed porous rocks. We solve the system of partial differential equations on a regular Cartesian grid and use an iterative finite difference scheme with preconditioning of the residuals. The MPI communication happens only locally (point-to-point); this method is known to scale linearly by construction. The "octopus" GPU cluster, which we use for the computations, has been designed to achieve maximal data transfer throughput at minimal hardware cost. It is composed of twenty compute nodes, each hosting four Nvidia Titan X GPU accelerators. These high-density nodes are interconnected with a parallel (dual-rail) FDR InfiniBand network. Our efforts show promising preliminary results for the different physics investigated. The glacier flow solver achieves good accuracy in the relevant benchmarks and the coupled poromechanical solver makes it possible to explain previously unresolvable focused fluid flow as a natural outcome of the porosity setup.
In both cases, near peak memory bandwidth transfer is achieved. Our approach allows us to get the best out of the current hardware.
Diffraction of a Shock Wave on a Wedge in a Dusty Gas

NASA Astrophysics Data System (ADS)

Surov, V. S.

2017-09-01

Within the framework of one- and multivelocity dusty-gas models, the author has investigated, on a curvilinear grid, the flow arising when a shock wave reflects from a wedge-shaped surface in an air-droplet mixture, using the Godunov method with a linearized Riemann solver.
A dynamic-solver-consistent minimum action method: With an application to 2D Navier-Stokes equations

NASA Astrophysics Data System (ADS)

Wan, Xiaoliang; Yu, Haijun

2017-02-01

This paper discusses the necessity and strategy to unify the development of a dynamic solver and a minimum action method (MAM) for a spatially extended system when employing the large deviation principle (LDP) to study the effects of small random perturbations. A dynamic solver is used to approximate the unperturbed system, and a minimum action method is used to approximate the LDP, which corresponds to solving an Euler-Lagrange equation related to but more complicated than the unperturbed system. We will clarify possible inconsistencies induced by independent numerical approximations of the unperturbed system and the LDP, based on which we propose to define both the dynamic solver and the MAM on the same approximation space for spatial discretization. The semi-discrete LDP can then be regarded as the exact LDP of the semi-discrete unperturbed system, which is a finite-dimensional ODE system. We achieve this methodology for the two-dimensional Navier-Stokes equations using a divergence-free approximation space. The method developed can be used to study the nonlinear instability of wall-bounded parallel shear flows, and be generalized straightforwardly to three-dimensional cases. Numerical experiments are presented.
Time Dependent Holographic Interferometry and Finite-Element Analysis of Heat Transfer Within a Rectangular Enclosure

DTIC Science & Technology

1976-09-01

describing the system are correctly assembled, a library subroutine (LEQT2F) functioning as a linear equation solver is called and the desired nodal...
Implicit solvers for unstructured meshes

NASA Technical Reports Server (NTRS)

Venkatakrishnan, V.; Mavriplis, Dimitri J.

1991-01-01

Implicit methods for unstructured mesh computations are developed and tested. The approximate system which arises from the Newton-linearization of the nonlinear evolution operator is solved by using the preconditioned generalized minimum residual technique. Three different preconditioners are investigated: the incomplete LU factorization (ILU), block diagonal factorization, and the symmetric successive over-relaxation (SSOR). The preconditioners have been optimized to have good vectorization properties. The various methods are compared over a wide range of problems. Ordering of the unknowns, which affects the convergence of these sparse matrix iterative methods, is also investigated. Results are presented for inviscid and turbulent viscous calculations on single and multielement airfoil configurations using globally and adaptively generated meshes.
Coupled electromagnetic-thermodynamic simulations of microwave heating problems using the FDTD algorithm.

PubMed

Kopyt, Paweł; Celuch, Małgorzata

2007-01-01

A practical implementation of a hybrid simulation system capable of modeling coupled electromagnetic-thermodynamic problems typical in microwave heating is described. The paper presents two approaches to modeling such problems. Both are based on an FDTD-based commercial electromagnetic solver coupled to an external thermodynamic analysis tool required for calculations of heat diffusion. The first approach utilizes a simple FDTD-based thermal solver while in the second it is replaced by a universal commercial CFD solver. The accuracy of the two modeling systems is verified against the original experimental data as well as the measurement results available in literature.
A comparison of optimization algorithms for localized in vivo B0 shimming.

PubMed

Nassirpour, Sahar; Chang, Paul; Fillmer, Ariane; Henning, Anke

2018-02-01

To compare several different optimization algorithms currently used for localized in vivo B0 shimming, and to introduce a novel, fast, and robust constrained regularized algorithm (ConsTru) for this purpose. Ten different optimization algorithms (including samples from both generic and dedicated least-squares solvers, and a novel constrained regularized inversion method) were implemented and compared for shimming in five different shimming volumes on 66 in vivo data sets from both 7 T and 9.4 T. The best algorithm was chosen to perform single-voxel spectroscopy at 9.4 T in the frontal cortex of the brain on 10 volunteers. The results of the performance tests proved that the shimming algorithm is prone to unstable solutions if it depends on the value of a starting point, and is not regularized to handle ill-conditioned problems. The ConsTru algorithm proved to be the most robust, fast, and efficient algorithm among all of the chosen algorithms. It enabled acquisition of spectra of reproducible high quality in the frontal cortex at 9.4 T. For localized in vivo B0 shimming, the use of a dedicated linear least-squares solver instead of a generic nonlinear one is highly recommended. Among all of the linear solvers, the constrained regularized method (ConsTru) was found to be both fast and most robust. Magn Reson Med 79:1145-1156, 2018. © 2017 International Society for Magnetic Resonance in Medicine.
Finite difference method accelerated with sparse solvers for structural analysis of the metal-organic complexes

NASA Astrophysics Data System (ADS)

Guda, A. A.; Guda, S. A.; Soldatov, M. A.; Lomachenko, K. A.; Bugaev, A. L.; Lamberti, C.; Gawelda, W.; Bressler, C.; Smolentsev, G.; Soldatov, A. V.; Joly, Y.

2016-05-01

Finite difference method (FDM) implemented in the FDMNES software [Phys. Rev. B, 2001, 63, 125120] was revised. Thorough analysis shows that the calculated diagonal in the FDM matrix consists of about 96% zero elements. Thus a sparse solver would be more suitable for the problem than traditional Gaussian elimination for the diagonal neighbourhood. We have tried several iterative sparse solvers, and the direct MUMPS solver with METIS ordering turned out to be the best. Compared to the Gaussian solver, the present method is up to 40 times faster and allows XANES simulations for complex systems already on personal computers. We show applicability of the software for the metal-organic [Fe(bpy)3]2+ complex both for low spin and high spin states populated after laser excitation.
An approximate Riemann solver for real gas parabolized Navier-Stokes equations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Urbano, Annafederica, E-mail: annafederica.urbano@uniroma1.it; Nasuti, Francesco, E-mail: francesco.nasuti@uniroma1.it

2013-01-15

Under specific assumptions, parabolized Navier-Stokes equations are a suitable means to study channel flows. A special case is that of high pressure flow of real gases in cooling channels where large crosswise gradients of thermophysical properties occur. To solve the parabolized Navier-Stokes equations by a space marching approach, the hyperbolicity of the system of governing equations is obtained, even for very low Mach number flow, by recasting equations such that the streamwise pressure gradient is considered as a source term. For this system of equations an approximate Roe's Riemann solver is developed as the core of a Godunov type finite volume algorithm. The properties of the approximated Riemann solver, which is a modification of Roe's Riemann solver for the parabolized Navier-Stokes equations, are presented and discussed with emphasis given to its original features introduced to handle fluids governed by a generic real gas EoS. Sample solutions are obtained for low Mach number highly compressible flows of transcritical methane, heated in straight long channels, to demonstrate the solver's ability to describe flows dominated by complex thermodynamic phenomena.
CUDA GPU based full-Stokes finite difference modelling of glaciers

NASA Astrophysics Data System (ADS)

Brædstrup, C. F.; Egholm, D. L.

2012-04-01

Many have stressed the limitations of using the shallow shelf and shallow ice approximations when modelling ice streams or surging glaciers. Using a full-Stokes approach requires either large amounts of computer power or time and is therefore seldom an option for most glaciologists. Recent advances in graphics card (GPU) technology for high performance computing have proven extremely efficient in accelerating many large scale scientific computations. The general purpose GPU (GPGPU) technology is cheap, has a low power consumption and fits into a normal desktop computer. It could therefore provide a powerful tool for many glaciologists. Our full-Stokes ice sheet model implements a Red-Black Gauss-Seidel iterative linear solver to solve the full Stokes equations. This technique has proven very effective when applied to the Stokes equation in geodynamics problems, and should therefore also perform well in glaciological flow problems. The Gauss-Seidel iterator is known to be robust but several other linear solvers have a much faster convergence. To aid convergence, the solver uses a multigrid approach where values are interpolated and extrapolated between different grid resolutions to minimize the short wavelength errors efficiently. This reduces the iteration count by several orders of magnitude. The run-time is further reduced by using the GPGPU technology where each card has up to 448 cores. Researchers utilizing the GPGPU technique in other areas have reported between 2 and 11 times speedup compared to multicore CPU implementations on similar problems. The goal of these initial investigations into the possible usage of GPGPU technology in glacial modelling is to apply the enhanced resolution of a full-Stokes solver to ice streams and surging glaciers. This is an area of growing interest because ice streams are the main drainage conjugates for large ice sheets. It is therefore crucial to understand this streaming behavior and its impact up-ice.
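The red-black ordering named in the abstract can be illustrated on a much simpler model problem. The following is a minimal sketch, not the authors' glacier code: it assumes a scalar Poisson problem on a square grid with zero boundary values, runs on a single grid on the CPU, and omits the multigrid and GPU parts entirely. The function names are hypothetical.

```python
def red_black_gauss_seidel(f, h, sweeps):
    """Approximate -laplace(u) = f on a square grid, with u = 0 on the boundary.

    Assumption for illustration: 5-point finite-difference stencil, grid spacing h.
    """
    n = len(f)
    u = [[0.0] * n for _ in range(n)]
    for _ in range(sweeps):
        # Points of one colour only depend on points of the other colour,
        # so each half-sweep could be updated fully in parallel.
        for colour in (0, 1):  # 0 = "red", 1 = "black"
            for i in range(1, n - 1):
                for j in range(1, n - 1):
                    if (i + j) % 2 == colour:
                        u[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                          + u[i][j - 1] + u[i][j + 1]
                                          + h * h * f[i][j])
    return u


def max_residual(u, f, h):
    """Largest absolute residual of the discrete equation over interior points."""
    n = len(f)
    worst = 0.0
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            lap = (4.0 * u[i][j] - u[i - 1][j] - u[i + 1][j]
                   - u[i][j - 1] - u[i][j + 1]) / (h * h)
            worst = max(worst, abs(f[i][j] - lap))
    return worst


n = 17
h = 1.0 / (n - 1)
f = [[1.0] * n for _ in range(n)]
u = red_black_gauss_seidel(f, h, sweeps=600)
print("max residual:", max_residual(u, f, h))
```

Because every red point has only black neighbours (and vice versa), each half-sweep is free of data dependencies, which is what makes this ordering attractive for many-core hardware. The slow convergence of the plain iteration on fine grids is the reason the abstract pairs it with multigrid.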

Monte Carlo modelling of Schottky diode for rectenna simulation

NASA Astrophysics Data System (ADS)

Bernuchon, E.; Aniel, F.; Zerounian, N.; Grimault-Jacquin, A. S.

2017-09-01

Before designing a detector circuit, the electrical parameters extraction of the Schottky diode is a critical step. This article is based on a Monte-Carlo (MC) solver of the Boltzmann Transport Equation (BTE) including different transport mechanisms at the metal-semiconductor contact such as image force effect or tunneling. The weight of tunneling and thermionic current is quantified according to different degrees of tunneling modelling. The I-V characteristic highlights the dependence of the ideality factor and the current saturation with bias. Harmonic Balance (HB) simulation on a rectifier circuit within Advanced Design System (ADS) software shows that considering non-linear ideality factor and saturation current for the electrical model of the Schottky diode does not seem essential. Indeed, bias independent values extracted in forward regime on I-V curve are sufficient. However, the non-linear series resistance extracted from a small signal analysis (SSA) strongly influences the conversion efficiency at low input powers.
An adaptive discontinuous Galerkin solver for aerodynamic flows

NASA Astrophysics Data System (ADS)

Burgess, Nicholas K.

This work considers the accuracy, efficiency, and robustness of an unstructured high-order accurate discontinuous Galerkin (DG) solver for computational fluid dynamics (CFD). Recently, there has been a drive to reduce the discretization error of CFD simulations using high-order methods on unstructured grids. However, high-order methods are often criticized for lacking robustness and having high computational cost. The goal of this work is to investigate methods that enhance the robustness of high-order discontinuous Galerkin (DG) methods on unstructured meshes, while maintaining low computational cost and high accuracy of the numerical solutions. This work investigates robustness enhancement of high-order methods by examining effective non-linear solvers, shock capturing methods, turbulence model discretizations and adaptive refinement techniques. The goal is to develop an all-encompassing solver that can simulate a large range of physical phenomena, where all aspects of the solver work together to achieve a robust, efficient and accurate solution strategy. The components and framework for a robust high-order accurate solver that is capable of solving viscous, Reynolds Averaged Navier-Stokes (RANS) and shocked flows are presented. In particular, this work discusses robust discretizations of the turbulence model equation used to close the RANS equations, as well as stable shock capturing strategies that are applicable across a wide range of discretization orders and applicable to very strong shock waves. Furthermore, refinement techniques are considered as both efficiency and robustness enhancement strategies. Additionally, efficient non-linear solvers based on multigrid and Krylov subspace methods are presented. The accuracy, efficiency, and robustness of the solver are demonstrated using a variety of challenging aerodynamic test problems, which include turbulent high-lift and viscous hypersonic flows.
Adaptive mesh refinement was found to play a critical role in obtaining a robust and efficient high-order accurate flow solver. A goal-oriented error estimation technique has been developed to estimate the discretization error of simulation outputs. For high-order discretizations, it is shown that functional output error super-convergence can be obtained, provided the discretization satisfies a property known as dual consistency. The dual consistency of the DG methods developed in this work is shown via mathematical analysis and numerical experimentation. Goal-oriented error estimation is also used to drive an hp-adaptive mesh refinement strategy, where a combination of mesh or h-refinement, and order or p-enrichment, is employed based on the smoothness of the solution. The results demonstrate that the combination of goal-oriented error estimation and hp-adaptation yields superior accuracy, as well as enhanced robustness and efficiency for a variety of aerodynamic flows including flows with strong shock waves. This work demonstrates that DG discretizations can be the basis of an accurate, efficient, and robust CFD solver. Furthermore, enhancing the robustness of DG methods does not adversely impact the accuracy or efficiency of the solver for challenging and complex flow problems. In particular, when considering the computation of shocked flows, this work demonstrates that the available shock capturing techniques are sufficiently accurate and robust, particularly when used in conjunction with adaptive mesh refinement. This work also demonstrates that robust solutions of the Reynolds Averaged Navier-Stokes (RANS) and turbulence model equations can be obtained for complex and challenging aerodynamic flows. In this context, the most robust strategy was determined to be a low-order turbulence model discretization coupled to a high-order discretization of the RANS equations.
Although RANS solutions using high-order accurate discretizations of the turbulence model were obtained, the behavior of current-day RANS turbulence models discretized to high-order was found to be problematic, leading to solver robustness issues. This suggests that future work is warranted in the area of turbulence model formulation for use with high-order discretizations. Alternately, the use of Large-Eddy Simulation (LES) subgrid scale models with high-order DG methods offers the potential to leverage the high accuracy of these methods for very high fidelity turbulent simulations. This thesis has developed the algorithmic improvements that will lay the foundation for the development of a three-dimensional high-order flow solution strategy that can be used as the basis for future LES simulations.
Nonlinear Analysis of Airfoil High-Intensity Gust Response Using a High-Order Prefactored Compact Code

NASA Technical Reports Server (NTRS)

Crivellini, A.; Golubev, V.; Mankbadi, R.; Scott, J. R.; Hixon, R.; Povinelli, L.; Kiraly, L. James (Technical Monitor)

2002-01-01

The nonlinear response of symmetric and loaded airfoils to an impinging vortical gust is investigated in the parametric space of gust dimension, intensity, and frequency. The study, which was designed to investigate the validity limits for a linear analysis, is implemented by applying a nonlinear high-order prefactored compact code and comparing results with linear solutions from the GUST3D frequency-domain solver. Both the unsteady aerodynamic and acoustic gust responses are examined.
Spherical Harmonic Decomposition of Gravitational Waves Across Mesh Refinement Boundaries

NASA Technical Reports Server (NTRS)

Fiske, David R.; Baker, John; vanMeter, James R.; Centrella, Joan M.

2005-01-01

We evolve a linearized (Teukolsky) solution of the Einstein equations with a non-linear Einstein solver. Using this testbed, we are able to show that such gravitational waves, defined by the Weyl scalars in the Newman-Penrose formalism, propagate faithfully across mesh refinement boundaries, and use, for the first time to our knowledge, a novel algorithm due to Misner to compute spherical harmonic components of our waveforms. We show that the algorithm performs extremely well, even when the extraction sphere intersects refinement boundaries.
A High Performance Block Eigensolver for Nuclear Configuration Interaction Calculations

DOE PAGES

Aktulga, Hasan Metin; Afibuzzaman, Md.; Williams, Samuel; ...

2017-06-01

As on-node parallelism increases and the performance gap between the processor and the memory system widens, achieving high performance in large-scale scientific applications requires an architecture-aware design of algorithms and solvers. We focus on the eigenvalue problem arising in nuclear Configuration Interaction (CI) calculations, where a few extreme eigenpairs of a sparse symmetric matrix are needed. Here, we consider a block iterative eigensolver whose main computational kernels are the multiplication of a sparse matrix with multiple vectors (SpMM), and tall-skinny matrix operations. We then present techniques to significantly improve the SpMM and the transpose operation SpMM^T by using the compressed sparse blocks (CSB) format. We achieve 3-4× speedup on the requisite operations over good implementations with the commonly used compressed sparse row (CSR) format. We develop a performance model that allows us to correctly estimate the performance of our SpMM kernel implementations, and we identify cache bandwidth as a potential performance bottleneck beyond DRAM. We also analyze and optimize the performance of LOBPCG kernels (inner product and linear combinations on multiple vectors) and show up to 15× speedup over using high performance BLAS libraries for these operations. The resulting high performance LOBPCG solver achieves 1.4× to 1.8× speedup over the existing Lanczos solver on a series of CI computations on high-end multicore architectures (Intel Xeons). We also analyze the performance of our techniques on an Intel Xeon Phi Knights Corner (KNC) processor.
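For readers unfamiliar with the SpMM kernel, here is a minimal reference sketch of the baseline the abstract compares against: a sparse matrix in CSR form multiplied by a block of dense vectors. It is plain, unoptimized Python written for clarity; it is not the authors' CSB implementation, and the function name and the small test matrix are assumptions made for illustration.

```python
def csr_spmm(indptr, indices, data, X):
    """Compute Y = A @ X, where A is given in CSR form and X is a dense block.

    indptr[i]:indptr[i+1] delimits the nonzeros of row i;
    indices holds their column numbers and data their values.
    X is a list of rows, each with k entries (k right-hand vectors).
    """
    n_rows = len(indptr) - 1
    k = len(X[0])
    Y = [[0.0] * k for _ in range(n_rows)]
    for i in range(n_rows):
        out = Y[i]
        for p in range(indptr[i], indptr[i + 1]):
            a = data[p]
            x_row = X[indices[p]]
            # One matrix entry is reused across all k vectors; this reuse is
            # why SpMM has better arithmetic intensity than k separate SpMVs.
            for c in range(k):
                out[c] += a * x_row[c]
    return Y


# A = [[2, 0, 1],
#      [0, 3, 0],
#      [4, 0, 5]]
indptr = [0, 2, 3, 5]
indices = [0, 2, 1, 0, 2]
data = [2.0, 1.0, 3.0, 4.0, 5.0]
X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
Y = csr_spmm(indptr, indices, data, X)
print(Y)
```

The transpose product SpMM^T is awkward in CSR because it turns the row-wise gather into a scatter across output rows. That is one motivation for blocked layouts such as CSB, where both directions traverse the same storage.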
A High Performance Block Eigensolver for Nuclear Configuration Interaction Calculations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Aktulga, Hasan Metin; Afibuzzaman, Md.; Williams, Samuel

As on-node parallelism increases and the performance gap between the processor and the memory system widens, achieving high performance in large-scale scientific applications requires an architecture-aware design of algorithms and solvers. We focus on the eigenvalue problem arising in nuclear Configuration Interaction (CI) calculations, where a few extreme eigenpairs of a sparse symmetric matrix are needed. Here, we consider a block iterative eigensolver whose main computational kernels are the multiplication of a sparse matrix with multiple vectors (SpMM), and tall-skinny matrix operations. We then present techniques to significantly improve the SpMM and the transpose operation SpMM^T by using the compressed sparse blocks (CSB) format. We achieve 3-4× speedup on the requisite operations over good implementations with the commonly used compressed sparse row (CSR) format. We develop a performance model that allows us to correctly estimate the performance of our SpMM kernel implementations, and we identify cache bandwidth as a potential performance bottleneck beyond DRAM. We also analyze and optimize the performance of LOBPCG kernels (inner product and linear combinations on multiple vectors) and show up to 15× speedup over using high performance BLAS libraries for these operations. The resulting high performance LOBPCG solver achieves 1.4× to 1.8× speedup over the existing Lanczos solver on a series of CI computations on high-end multicore architectures (Intel Xeons). We also analyze the performance of our techniques on an Intel Xeon Phi Knights Corner (KNC) processor.
Advanced Fast 3-D Electromagnetic Solver for Microwave Tomography Imaging.

PubMed

Simonov, Nikolai; Kim, Bo-Ra; Lee, Kwang-Jae; Jeon, Soon-Ik; Son, Seong-Ho

2017-10-01

This paper describes a fast-forward electromagnetic solver (FFS) for the image reconstruction algorithm of our microwave tomography system. Our apparatus is a preclinical prototype of a biomedical imaging system, designed for the purpose of early breast cancer detection. It operates in the 3-6-GHz frequency band using a circular array of probe antennas immersed in a matching liquid; it produces image reconstructions of the permittivity and conductivity profiles of the breast under examination. Our reconstruction algorithm solves the electromagnetic (EM) inverse problem and takes into account the real EM properties of the probe antenna array as well as the influence of the patient's body and that of the upper metal screen sheet. This FFS algorithm is much faster than conventional EM simulation solvers. In comparison, in the same PC, the CST solver takes ~45 min, while the FFS takes ~1 s of effective simulation time for the same EM model of a numerical breast phantom.
A new family Jacobian solver for global three-dimensional modeling of atmospheric chemistry

NASA Astrophysics Data System (ADS)

Zhao, Xuepeng; Turco, Richard P.; Shen, Mei

1999-01-01

We present a new technique to solve complex sets of photochemical rate equations that is applicable to global modeling of the troposphere and stratosphere. The approach is based on the concept of "families" of species, whose chemical rate equations are tightly coupled. Variations of species concentrations within a family can be determined by inverting a linearized Jacobian matrix representing the family group. Since this group consists of a relatively small number of species the corresponding Jacobian has a low order (a minimatrix) compared to the Jacobian of the entire system. However, we go further and define a super-family that is the set of all families. The super-family is also solved by linearization and matrix inversion. The resulting Super-Family Matrix Inversion (SFMI) scheme is more stable and accurate than common family approaches. We discuss the numerical structure of the SFMI scheme and apply our algorithms to a comprehensive set of photochemical reactions. To evaluate performance, the SFMI scheme is compared with an optimized Gear solver. We find that the SFMI technique can be at least an order of magnitude more efficient than existing chemical solvers while maintaining relative errors in the calculations of 15% or less over a diurnal cycle. The largest SFMI errors arise at sunrise and sunset and during the evening when species concentrations may be very low. We show that sunrise/sunset errors can be minimized through a careful treatment of photodissociation during these periods; the nighttime deviations are negligible from the point of view of acceptable computational accuracy. The stability and flexibility of the SFMI algorithm should be sufficient for most modeling applications until major improvements in other modeling factors are achieved. In addition, because of its balanced computational design, SFMI can easily be adapted to parallel computing architectures. 
SFMI thus should allow practical long-term integrations of global chemistry coupled to general circulation and climate models, studies of interannual and interdecadal variability in atmospheric composition, simulations of past multidecadal trends owing to anthropogenic emissions, long-term forecasting associated with projected emissions, and sensitivity analyses for a wide range of physical and chemical parameters.
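The core operation described above, advancing a small group of coupled species by solving a linearized Jacobian system, can be sketched for a two-species "family". This is only an illustration of a linearized implicit step, (I - dt J) dc = dt f(c), and not the SFMI scheme itself. The reversible reaction A <-> B, the rate constants, and all function names are assumptions chosen for the example.

```python
def linearized_implicit_step(c, rate, jac, dt):
    """Advance a two-species system by one step of (I - dt*J) dc = dt*f(c)."""
    f1, f2 = rate(c)
    (j11, j12), (j21, j22) = jac(c)
    # Assemble M = I - dt*J for the 2x2 "minimatrix".
    m11, m12 = 1.0 - dt * j11, -dt * j12
    m21, m22 = -dt * j21, 1.0 - dt * j22
    det = m11 * m22 - m12 * m21
    r1, r2 = dt * f1, dt * f2
    # Solve M dc = r with Cramer's rule (fine for a 2x2 system).
    dc1 = (r1 * m22 - m12 * r2) / det
    dc2 = (m11 * r2 - m21 * r1) / det
    return (c[0] + dc1, c[1] + dc2)


# Hypothetical stiff pair: A -> B with rate K1, B -> A with rate K2.
K1, K2 = 1.0e4, 1.0e2


def rate(c):
    a, b = c
    return (-K1 * a + K2 * b, K1 * a - K2 * b)


def jac(c):
    return ((-K1, K2), (K1, -K2))


c = (1.0, 0.0)
for _ in range(10):
    c = linearized_implicit_step(c, rate, jac, dt=1.0)
print(c)
```

The time step here is far larger than the fast chemical time scale 1/(K1 + K2), yet the step remains stable and the pair relaxes to its equilibrium partition A/B = K2/K1 while conserving A + B. Tolerating large steps on tightly coupled species is the property that family-type schemes rely on.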
Unified solver for fluid dynamics and aeroacoustics in isentropic gas flows

NASA Astrophysics Data System (ADS)

Pont, Arnau; Codina, Ramon; Baiges, Joan; Guasch, Oriol

2018-06-01

The high computational cost of solving numerically the fully compressible Navier-Stokes equations, together with the poor performance of most numerical formulations for compressible flow in the low Mach number regime, has led to the necessity for more affordable numerical models for Computational Aeroacoustics. For low Mach number subsonic flows with neither shocks nor thermal coupling, both flow dynamics and wave propagation can be considered isentropic. Therefore, a joint isentropic formulation for flow and aeroacoustics can be devised which avoids the need for segregating flow and acoustic scales. Under these assumptions density and pressure fluctuations are directly proportional, and a two field velocity-pressure compressible formulation can be derived as an extension of an incompressible solver. Moreover, the linear system of equations which arises from the proposed isentropic formulation is better conditioned than the homologous incompressible one due to the presence of a pressure time derivative. Similarly to other compressible formulations the prescription of boundary conditions will have to deal with the backscattering of acoustic waves. In this sense, a separated imposition of boundary conditions for flow and acoustic scales which allows the evacuation of waves through Dirichlet boundaries without using any tailored damping model will be presented.
Validation of High-Fidelity CFD/CAA Framework for Launch Vehicle Acoustic Environment Simulation against Scale Model Test Data

NASA Technical Reports Server (NTRS)

Liever, Peter A.; West, Jeffrey S.

2016-01-01

A hybrid Computational Fluid Dynamics and Computational Aero-Acoustics (CFD/CAA) modeling framework has been developed for launch vehicle liftoff acoustic environment predictions. The framework couples the existing highly-scalable NASA production CFD code, Loci/CHEM, with a high-order accurate discontinuous Galerkin solver developed in the same production framework, Loci/THRUST, to accurately resolve and propagate acoustic physics across the entire launch environment. Time-accurate, Hybrid RANS/LES CFD modeling is applied for predicting the acoustic generation physics at the plume source, and a high-order accurate unstructured discontinuous Galerkin (DG) method is employed to propagate acoustic waves away from the source across large distances using high-order accurate schemes. The DG solver is capable of solving 2nd, 3rd, and 4th order Euler solutions for non-linear, conservative acoustic field propagation. Initial application testing and validation has been carried out against high resolution acoustic data from the Ares Scale Model Acoustic Test (ASMAT) series to evaluate the capabilities and production readiness of the CFD/CAA system to resolve the observed spectrum of acoustic frequency content. This paper presents results from this validation and outlines efforts to mature and improve the computational simulation framework.
Solution Methods for 3D Tomographic Inversion Using A Highly Non-Linear Ray Tracer

NASA Astrophysics Data System (ADS)

Hipp, J. R.; Ballard, S.; Young, C. J.; Chang, M.

2008-12-01

To develop 3D velocity models to improve nuclear explosion monitoring capability, we have developed a 3D tomographic modeling system that traces rays using an implementation of the Um and Thurber ray pseudo- bending approach, with full enforcement of Snell's Law in 3D at the major discontinuities. Due to the highly non-linear nature of the ray tracer, however, we are forced to substantially damp the inversion in order to converge on a reasonable model. Unfortunately the amount of damping is not known a priori and can significantly extend the number of calls of the computationally expensive ray-tracer and the least squares matrix solver. If the damping term is too small the solution step-size produces either an un-realistic model velocity change or places the solution in or near a local minimum from which extrication is nearly impossible. If the damping term is too large, convergence can be very slow or premature convergence can occur. Standard approaches involve running inversions with a suite of damping parameters to find the best model. A better solution methodology is to take advantage of existing non-linear solution techniques such as Levenberg-Marquardt (LM) or quasi-newton iterative solvers. In particular, the LM algorithm was specifically designed to find the minimum of a multi-variate function that is expressed as the sum of squares of non-linear real-valued functions. It has become a standard technique for solving non-linear least squared problems, and is widely adopted in a broad spectrum of disciplines, including the geosciences. At each iteration, the LM approach dynamically varies the level of damping to optimize convergence. When the current estimate of the solution is far from the ultimate solution LM behaves as a steepest decent method, but transitions to Gauss- Newton behavior, with near quadratic convergence, as the estimate approaches the final solution. 
We show typical linear solution techniques and how they can lead to local minima if the damping is set too low. We also describe the LM technique and show how it automatically determines the appropriate damping factor as it iteratively converges on the best solution. Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy's National Nuclear Security Administration under Contract DE-AC04-94AL85000.
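The adaptive damping described in this abstract can be illustrated with a minimal sketch. It is not the authors' tomography code: the two-parameter exponential model, the data, the factor-of-ten damping update, and all function names are assumptions chosen only to keep the example small. Damping is lowered after a successful step (toward Gauss-Newton) and raised after a failed one (toward a short, gradient-like step).

```python
import math


def levenberg_marquardt(residual, jacobian, p0, lam=1e-2, max_iter=100, tol=1e-12):
    """Minimize sum(r_i(p)^2) over two parameters with adaptive damping."""
    p = list(p0)
    r = residual(p)
    cost = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        J = jacobian(p)
        # Gauss-Newton normal equations: A = J^T J, g = J^T r.
        a00 = sum(row[0] * row[0] for row in J)
        a01 = sum(row[0] * row[1] for row in J)
        a11 = sum(row[1] * row[1] for row in J)
        g0 = sum(row[0] * ri for row, ri in zip(J, r))
        g1 = sum(row[1] * ri for row, ri in zip(J, r))
        # Damped system (A + lam * diag(A)) s = -g, solved by Cramer's rule.
        d00 = a00 * (1.0 + lam)
        d11 = a11 * (1.0 + lam)
        det = d00 * d11 - a01 * a01
        s0 = (-g0 * d11 + g1 * a01) / det
        s1 = (-g1 * d00 + g0 * a01) / det
        trial = [p[0] + s0, p[1] + s1]
        r_trial = residual(trial)
        cost_trial = sum(ri * ri for ri in r_trial)
        if cost_trial < cost:
            # Step accepted: reduce damping, moving toward Gauss-Newton.
            improvement = cost - cost_trial
            p, r, cost = trial, r_trial, cost_trial
            lam /= 10.0
            if improvement < tol:
                break
        else:
            # Step rejected: increase damping, moving toward a short gradient step.
            lam *= 10.0
    return p, cost


# Hypothetical test problem: recover a = 2, b = -0.5 in y = a * exp(b * t).
ts = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * math.exp(-0.5 * t) for t in ts]


def residual(p):
    return [p[0] * math.exp(p[1] * t) - y for t, y in zip(ts, ys)]


def jacobian(p):
    return [[math.exp(p[1] * t), p[0] * t * math.exp(p[1] * t)] for t in ts]


params, final_cost = levenberg_marquardt(residual, jacobian, [1.0, 0.0])
```

No damping value has to be chosen by trial and error here; the initial `lam` only sets the starting point of the adaptation.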
GlobiPack v. 1.0

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bartlett, Roscoe

2010-03-31

GlobiPack contains a small collection of optimization globalization algorithms. These algorithms are used by optimization and various nonlinear equation solver algorithms, serving as the line-search procedure within Newton and quasi-Newton optimization and nonlinear equation solver methods. These are standard published 1-D line search algorithms such as those described in Nocedal and Wright, Numerical Optimization, 2nd edition, 2006. One set of algorithms was copied and refactored from the existing open-source Trilinos package MOOCHO, where the line search code is used to globalize SQP methods. This software is generic to any mathematical optimization problem where smooth derivatives exist. There is no specific connection or mention whatsoever to any specific application, period. You cannot find more general mathematical software.
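As an illustration of what a 1-D globalization line search does, the following is a minimal sketch of textbook backtracking with the Armijo sufficient-decrease condition. It is not GlobiPack's C++ interface; the function name, parameters, and the quadratic test function are assumptions made for this example.

```python
def backtracking_line_search(f, x, fx, grad, direction,
                             alpha0=1.0, c1=1e-4, shrink=0.5, max_steps=50):
    """Return a step length alpha satisfying the Armijo condition
    f(x + alpha * d) <= f(x) + c1 * alpha * grad(x)^T d."""
    slope = sum(g * d for g, d in zip(grad, direction))
    if slope >= 0.0:
        raise ValueError("direction is not a descent direction")
    alpha = alpha0
    for _ in range(max_steps):
        trial = [xi + alpha * di for xi, di in zip(x, direction)]
        if f(trial) <= fx + c1 * alpha * slope:
            return alpha
        alpha *= shrink  # step too long: shrink and try again
    return alpha


# Hypothetical example: a badly scaled quadratic and a steepest-descent direction.
def quadratic(v):
    return v[0] ** 2 + 10.0 * v[1] ** 2


x0 = [1.0, 1.0]
gradient = [2.0 * x0[0], 20.0 * x0[1]]
descent = [-g for g in gradient]
step = backtracking_line_search(quadratic, x0, quadratic(x0), gradient, descent)
x1 = [xi + step * di for xi, di in zip(x0, descent)]
```

The full step would increase the objective here, so the search halves the step until sufficient decrease holds; that safeguard is what "globalization" refers to.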
GASPACHO: a generic automatic solver using proximal algorithms for convex huge optimization problems

NASA Astrophysics Data System (ADS)

Goossens, Bart; Luong, Hiêp; Philips, Wilfried

2017-08-01

Many inverse problems (e.g., demosaicking, deblurring, denoising, image fusion, HDR synthesis) share various similarities: degradation operators are often modeled by a specific data fitting function while image prior knowledge (e.g., sparsity) is incorporated by additional regularization terms. In this paper, we investigate automatic algorithmic techniques for evaluating proximal operators. These algorithmic techniques also enable efficient calculation of adjoints from linear operators in a general matrix-free setting. In particular, we study the simultaneous-direction method of multipliers (SDMM) and the parallel proximal algorithm (PPXA) solvers and show that the automatically derived implementations are well suited for both single-GPU and multi-GPU processing. We demonstrate this approach for an Electron Microscopy (EM) deconvolution problem.
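For readers unfamiliar with proximal operators, a minimal sketch of one well-known closed form follows: the proximal operator of the l1 norm (a common sparsity regularizer) is element-wise soft-thresholding. This is a textbook result, not output of the GASPACHO solver; the function name and sample values are assumptions.

```python
import math


def prox_l1(v, lam):
    """Proximal operator of lam * ||x||_1, i.e. the minimizer of
    0.5 * ||x - v||^2 + lam * ||x||_1, computed by soft-thresholding."""
    return [math.copysign(max(abs(vi) - lam, 0.0), vi) for vi in v]


# Hypothetical noisy coefficients: small entries are set to zero,
# large entries are shrunk toward zero by lam.
shrunk = prox_l1([3.0, -0.5, 1.0, -2.0], 1.0)
```

Solvers such as SDMM and PPXA combine several operators of this kind, one per term of the objective.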
smoothG

DOE Office of Scientific and Technical Information (OSTI.GOV)

Barker, Andrew T.; Gelever, Stephan A.; Lee, Chak S.

2017-12-12

smoothG is a collection of parallel C++ classes/functions that algebraically constructs reduced models of different resolutions from a given high-fidelity graph model. In addition, smoothG provides efficient linear solvers for the reduced models. Beyond pure graph problems, the software finds application in subsurface flow and power grid simulations, in which graph Laplacians arise.
Using Excel's Solver Function to Facilitate Reciprocal Service Department Cost Allocations

ERIC Educational Resources Information Center

Leese, Wallace R.

2013-01-01

The reciprocal method of service department cost allocation requires linear equations to be solved simultaneously. These computations are often so complex as to cause the abandonment of the reciprocal method in favor of the less sophisticated and theoretically incorrect direct or step-down methods. This article illustrates how Excel's Solver…
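The simultaneous equations behind the reciprocal method are small enough to solve directly. The sketch below uses two service departments with made-up figures (direct costs and usage percentages are assumptions, not taken from the article) and a plain Gaussian elimination routine in place of Excel's Solver.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


# Hypothetical data: S1 has direct costs of 100,000 and sends 20% of its
# service to S2; S2 has direct costs of 50,000 and sends 10% to S1.
#   T1 = 100000 + 0.10 * T2
#   T2 =  50000 + 0.20 * T1
A = [[1.0, -0.10],
     [-0.20, 1.0]]
direct_costs = [100000.0, 50000.0]
T1, T2 = solve_linear(A, direct_costs)

# The remaining 80% of S1 and 90% of S2 go to production departments.
to_production = 0.80 * T1 + 0.90 * T2
```

A useful check is that the amount passed on to production departments equals the total direct cost of the service departments, 150,000 in this example.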
On the Development of an Efficient Parallel Hybrid Solver with Application to Acoustically Treated Aero-Engine Nacelles

NASA Technical Reports Server (NTRS)

Watson, Willie R.; Nark, Douglas M.; Nguyen, Duc T.; Tungkahotara, Siroj

2006-01-01

A finite element solution to the convected Helmholtz equation in a nonuniform flow is used to model the noise field within 3-D acoustically treated aero-engine nacelles. Options to select linear or cubic Hermite polynomial basis functions and isoparametric elements are included. However, the key feature of the method is a domain decomposition procedure that is based upon the inter-mixing of an iterative and a direct solve strategy for solving the discrete finite element equations. This procedure is optimized to take full advantage of sparsity and exploit the increased memory and parallel processing capability of modern computer architectures. Example computations are presented for the Langley Flow Impedance Test facility and a rectangular mapping of a full scale, generic aero-engine nacelle. The accuracy and parallel performance of this new solver are tested on both model problems using a supercomputer that contains hundreds of central processing units. Results show that the method gives extremely accurate attenuation predictions, achieves super-linear speedup over hundreds of CPUs, and solves upward of 25 million complex equations in a quarter of an hour.
The international river interface cooperative: Public domain flow and morphodynamics software for education and applications

NASA Astrophysics Data System (ADS)

Nelson, Jonathan M.; Shimizu, Yasuyuki; Abe, Takaaki; Asahi, Kazutake; Gamou, Mineyuki; Inoue, Takuya; Iwasaki, Toshiki; Kakinuma, Takaharu; Kawamura, Satomi; Kimura, Ichiro; Kyuka, Tomoko; McDonald, Richard R.; Nabi, Mohamed; Nakatsugawa, Makoto; Simões, Francisco R.; Takebayashi, Hiroshi; Watanabe, Yasunori

2016-07-01

This paper describes a new, public-domain interface for modeling flow, sediment transport and morphodynamics in rivers and other geophysical flows. The interface is named after the International River Interface Cooperative (iRIC), the group that constructed the interface and many of the current solvers included in iRIC. The interface is entirely free to any user and currently houses thirteen models ranging from simple one-dimensional models through three-dimensional large-eddy simulation models. Solvers are only loosely coupled to the interface so it is straightforward to modify existing solvers or to introduce other solvers into the system. Six of the most widely-used solvers are described in detail including example calculations to serve as an aid for users choosing what approach might be most appropriate for their own applications. The example calculations range from practical computations of bed evolution in natural rivers to highly detailed predictions of the development of small-scale bedforms on an initially flat bed. The remaining solvers are also briefly described. Although the focus of most solvers is coupled flow and morphodynamics, several of the solvers are also specifically aimed at providing flood inundation predictions over large spatial domains. Potential users can download the application, solvers, manuals, and educational materials including detailed tutorials at www.i-ric.org. The iRIC development group encourages scientists and engineers to use the tool and to consider adding their own methods to the iRIC suite of tools.
The international river interface cooperative: Public domain flow and morphodynamics software for education and applications

USGS Publications Warehouse

Nelson, Jonathan M.; Shimizu, Yasuyuki; Abe, Takaaki; Asahi, Kazutake; Gamou, Mineyuki; Inoue, Takuya; Iwasaki, Toshiki; Kakinuma, Takaharu; Kawamura, Satomi; Kimura, Ichiro; Kyuka, Tomoko; McDonald, Richard R.; Nabi, Mohamed; Nakatsugawa, Makoto; Simoes, Francisco J.; Takebayashi, Hiroshi; Watanabe, Yasunori

2016-01-01

This paper describes a new, public-domain interface for modeling flow, sediment transport and morphodynamics in rivers and other geophysical flows. The interface is named after the International River Interface Cooperative (iRIC), the group that constructed the interface and many of the current solvers included in iRIC. The interface is entirely free to any user and currently houses thirteen models ranging from simple one-dimensional models through three-dimensional large-eddy simulation models. Solvers are only loosely coupled to the interface so it is straightforward to modify existing solvers or to introduce other solvers into the system. Six of the most widely-used solvers are described in detail including example calculations to serve as an aid for users choosing what approach might be most appropriate for their own applications. The example calculations range from practical computations of bed evolution in natural rivers to highly detailed predictions of the development of small-scale bedforms on an initially flat bed. The remaining solvers are also briefly described. Although the focus of most solvers is coupled flow and morphodynamics, several of the solvers are also specifically aimed at providing flood inundation predictions over large spatial domains. Potential users can download the application, solvers, manuals, and educational materials including detailed tutorials at www.i-ric.org. The iRIC development group encourages scientists and engineers to use the tool and to consider adding their own methods to the iRIC suite of tools.
A fast and robust computational method for the ionization cross sections of the driven Schrödinger equation using an O (N) multigrid-based scheme

NASA Astrophysics Data System (ADS)

Cools, S.; Vanroose, W.

2016-03-01

This paper improves the convergence and robustness of a multigrid-based solver for the cross sections of the driven Schrödinger equation. Adding a Coupled Channel Correction Step (CCCS) after each multigrid (MG) V-cycle efficiently removes the errors that remain after the V-cycle sweep. The combined iterative solution scheme (MG-CCCS) is shown to feature significantly improved convergence rates over the classical MG method at energies where bound states dominate the solution, resulting in a fast and scalable solution method for the complex-valued Schrödinger break-up problem for any energy regime. The proposed solver displays optimal scaling; a solution is found in a time that is linear in the number of unknowns. The method is validated on a 2D Temkin-Poet model problem, and convergence results both as a solver and preconditioner are provided to support the O (N) scalability of the method. This paper extends the applicability of the complex contour approach for far field map computation (Cools et al. (2014) [10]).
CubiCal - Fast radio interferometric calibration suite exploiting complex optimisation

NASA Astrophysics Data System (ADS)

Kenyon, J. S.; Smirnov, O. M.; Grobler, T. L.; Perkins, S. J.

2018-05-01

It has recently been shown that radio interferometric gain calibration can be expressed succinctly in the language of complex optimisation. In addition to providing an elegant framework for further development, it exposes properties of the calibration problem which can be exploited to accelerate traditional non-linear least squares solvers such as Gauss-Newton and Levenberg-Marquardt. We extend existing derivations to chains of Jones terms: products of several gains which model different aberrant effects. In doing so, we find that the useful properties found in the single term case still hold. We also develop several specialised solvers which deal with complex gains parameterised by real values. The newly developed solvers have been implemented in a Python package called CubiCal, which uses a combination of Cython, multiprocessing and shared memory to leverage the power of modern hardware. We apply CubiCal to both simulated and real data, and perform both direction-independent and direction-dependent self-calibration. Finally, we present the results of some rudimentary profiling to show that CubiCal is competitive with respect to existing calibration tools such as MeqTrees.

On the eddy-resolving capability of high-order discontinuous Galerkin approaches to implicit LES / under-resolved DNS of Euler turbulence

NASA Astrophysics Data System (ADS)

Moura, R. C.; Mengaldo, G.; Peiró, J.; Sherwin, S. J.

2017-02-01

We present estimates of spectral resolution power for under-resolved turbulent Euler flows obtained with high-order discontinuous Galerkin (DG) methods. The '1% rule' based on linear dispersion-diffusion analysis introduced by Moura et al. (2015) [10] is here adapted for 3D energy spectra and validated through the inviscid Taylor-Green vortex problem. The 1% rule estimates the wavenumber beyond which numerical diffusion induces an artificial dissipation range on measured energy spectra. As the original rule relies on standard upwinding, different Riemann solvers are tested. Very good agreement is found for solvers which treat the different physical waves in a consistent manner. Relatively good agreement is still found for simpler solvers. The latter however displayed spurious features attributed to the inconsistent treatment of different physical waves. It is argued that, in the limit of vanishing viscosity, such features might have a significant impact on robustness and solution quality. The estimates proposed are regarded as useful guidelines for no-model DG-based simulations of free turbulence at very high Reynolds numbers.
Non-linear eigensolver-based alternative to traditional SCF methods

NASA Astrophysics Data System (ADS)

Gavin, Brendan; Polizzi, Eric

2013-03-01

The self-consistent iterative procedure in Density Functional Theory calculations is revisited using a new, highly efficient and robust algorithm for solving the non-linear eigenvector problem (i.e., H(X)X = EX) of the Kohn-Sham equations. This new scheme is derived from a generalization of the FEAST eigenvalue algorithm, and provides a fundamental and practical numerical solution for addressing the non-linearity of the Hamiltonian with the occupied eigenvectors. In contrast to SCF techniques, the traditional outer iterations are replaced by subspace iterations that are intrinsic to the FEAST algorithm, while the non-linearity is handled at the level of a projected reduced system which is orders of magnitude smaller than the original one. Using a series of numerical examples, it will be shown that our approach can outperform traditional SCF mixing techniques such as Pulay-DIIS by providing a high convergence rate and by converging to the correct solution regardless of the choice of the initial guess. We also discuss a practical implementation of the technique that can be achieved effectively using the FEAST solver package. This research is supported by NSF under Grant #ECCS-0846457 and Intel Corporation.
A parallel time integrator for noisy nonlinear oscillatory systems

NASA Astrophysics Data System (ADS)

Subber, Waad; Sarkar, Abhijit

2018-06-01

In this paper, we adapt a parallel time integration scheme to track the trajectories of noisy non-linear dynamical systems. Specifically, we formulate a parallel algorithm to generate the sample path of a nonlinear oscillator defined by stochastic differential equations (SDEs) using the so-called parareal method for ordinary differential equations (ODEs). The presence of the Wiener process in SDEs causes difficulties in the direct application of numerical integration techniques for ODEs, including the parareal algorithm. The parallel implementation of the algorithm involves two SDE solvers, namely a fine-level scheme to integrate the system in parallel and a coarse-level scheme to generate and correct the required initial conditions to start the fine-level integrators. For the numerical illustration, a randomly excited Duffing oscillator is investigated in order to study the performance of the stochastic parallel algorithm with respect to a range of system parameters. The distributed implementation of the algorithm uses the Message Passing Interface (MPI).
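The coarse/fine structure of parareal can be seen in a minimal sketch for a deterministic scalar ODE. This is the classical ODE version, not the stochastic variant developed in the paper, and it runs serially; the explicit Euler propagators, step counts, and the test equation are assumptions made for illustration.

```python
def euler(f, y, t0, t1, steps):
    """Explicit Euler integration of y' = f(t, y) from t0 to t1."""
    h = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        y = y + h * f(t, y)
        t += h
    return y


def parareal(f, y0, t0, t1, n_intervals, n_iter, coarse_steps=1, fine_steps=100):
    """Return the parareal approximations at the interval boundaries."""
    ts = [t0 + (t1 - t0) * n / n_intervals for n in range(n_intervals + 1)]

    def coarse(y, n):
        return euler(f, y, ts[n], ts[n + 1], coarse_steps)

    def fine(y, n):
        return euler(f, y, ts[n], ts[n + 1], fine_steps)

    # Initial guess: one serial sweep of the cheap coarse propagator.
    U = [y0]
    for n in range(n_intervals):
        U.append(coarse(U[n], n))

    for _ in range(n_iter):
        # These fine solves are independent and could run in parallel.
        fine_vals = [fine(U[n], n) for n in range(n_intervals)]
        coarse_old = [coarse(U[n], n) for n in range(n_intervals)]
        # Serial correction sweep: coarse prediction plus fine-minus-coarse defect.
        U_new = [y0]
        for n in range(n_intervals):
            U_new.append(coarse(U_new[n], n) + fine_vals[n] - coarse_old[n])
        U = U_new
    return U


def decay(t, y):
    return -y


# Reference: the fine propagator applied serially over all ten intervals.
reference = 1.0
for n in range(10):
    reference = euler(decay, reference, 0.2 * n, 0.2 * (n + 1), 100)

err_0 = abs(parareal(decay, 1.0, 0.0, 2.0, 10, 0)[-1] - reference)
err_3 = abs(parareal(decay, 1.0, 0.0, 2.0, 10, 3)[-1] - reference)
err_10 = abs(parareal(decay, 1.0, 0.0, 2.0, 10, 10)[-1] - reference)
```

After as many iterations as there are intervals, the result matches the serial fine solution up to rounding; the method pays off only when far fewer iterations are enough.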
Faster than Real-Time Dynamic Simulation for Large-Size Power System with Detailed Dynamic Models using High-Performance Computing Platform

DOE Office of Scientific and Technical Information (OSTI.GOV)

Huang, Renke; Jin, Shuangshuang; Chen, Yousu

This paper presents a faster-than-real-time dynamic simulation software package that is designed for large-size power system dynamic simulation. It was developed on the GridPACK™ high-performance computing (HPC) framework. The key features of the developed software package include (1) faster-than-real-time dynamic simulation for a WECC system (17,000 buses) with different types of detailed generator, controller, and relay dynamic models, (2) a decoupled parallel dynamic simulation algorithm with optimized computation architecture to better leverage HPC resources and technologies, (3) options for HPC-based linear and iterative solvers, (4) hidden HPC details, such as data communication and distribution, to enable development centered on mathematical models and algorithms rather than on computational details for power system researchers, and (5) easy integration of new dynamic models and related algorithms into the software package.
Chance-Constrained AC Optimal Power Flow for Distribution Systems With Renewables

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dall'Anese, Emiliano; Baker, Kyri; Summers, Tyler

This paper focuses on distribution systems featuring renewable energy sources (RESs) and energy storage systems, and presents an AC optimal power flow (OPF) approach to optimize system-level performance objectives while coping with uncertainty in both RES generation and loads. The proposed method hinges on a chance-constrained AC OPF formulation where probabilistic constraints are utilized to enforce voltage regulation with prescribed probability. A computationally more affordable convex reformulation is developed by resorting to suitable linear approximations of the AC power-flow equations as well as convex approximations of the chance constraints. The approximate chance constraints provide conservative bounds that hold for arbitrary distributions of the forecasting errors. An adaptive strategy is then obtained by embedding the proposed AC OPF task into a model predictive control framework. Finally, a distributed solver is developed to strategically distribute the solution of the optimization problems across utility and customers.
All-optical 1st- and 2nd-order differential equation solvers with large tuning ranges using Fabry-Pérot semiconductor optical amplifiers.

PubMed

Chen, Kaisheng; Hou, Jie; Huang, Zhuyang; Cao, Tong; Zhang, Jihua; Yu, Yuan; Zhang, Xinliang

2015-02-09

We experimentally demonstrate an all-optical temporal computation scheme for solving 1st- and 2nd-order linear ordinary differential equations (ODEs) with tunable constant coefficients by using Fabry-Pérot semiconductor optical amplifiers (FP-SOAs). By changing the injection currents of the FP-SOAs, the constant coefficients of the differential equations are tuned in practice. A large constant-coefficient tuning range from 0.0026/ps to 0.085/ps is achieved for the 1st-order differential equation. Moreover, the constant coefficient p of the 2nd-order ODE solver can be continuously tuned from 0.0216/ps to 0.158/ps, with the constant coefficient q correspondingly varying from 0.0000494/ps² to 0.006205/ps². Additionally, a theoretical model combining the carrier density rate equation of the semiconductor optical amplifier (SOA) with the transfer function of the Fabry-Pérot (FP) cavity is used to analyze the solving processes. For both 1st- and 2nd-order solvers, excellent agreement between the numerical simulations and the experimental results is obtained. The FP-SOA-based all-optical differential-equation solvers can be easily integrated with other optical components based on InP/InGaAsP materials, such as lasers, modulators, photodetectors and waveguides, which can motivate the realization of complicated optical computing on a single integrated chip.
Scalable smoothing strategies for a geometric multigrid method for the immersed boundary equations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bhalla, Amneet Pal Singh; Knepley, Matthew G.; Adams, Mark F.

2016-12-20

The immersed boundary (IB) method is a widely used approach to simulating fluid-structure interaction (FSI). Although explicit versions of the IB method can suffer from severe time step size restrictions, these methods remain popular because of their simplicity and generality. In prior work (Guy et al., Adv Comput Math, 2015), some of us developed a geometric multigrid preconditioner for a stable semi-implicit IB method under Stokes flow conditions; however, this solver methodology used a Vanka-type smoother that presented limited opportunities for parallelization. This work extends this Stokes-IB solver methodology by developing smoothing techniques that are suitable for parallel implementation. Specifically, we demonstrate that an additive version of the Vanka smoother can yield an effective multigrid preconditioner for the Stokes-IB equations, and we introduce an efficient Schur complement-based smoother that is also shown to be effective for the Stokes-IB equations. We investigate the performance of these solvers for a broad range of material stiffnesses, both for Stokes flows and flows at nonzero Reynolds numbers, and for thick and thin structural models. We show here that linear solver performance degrades with increasing Reynolds number and material stiffness, especially for thin interface cases. Nonetheless, the proposed approaches promise to yield effective solution algorithms, especially at lower Reynolds numbers and at modest-to-high elastic stiffnesses.
A novel post-processing scheme for two-dimensional electrical impedance tomography based on artificial neural networks

PubMed Central

2017-01-01

Objective: Electrical Impedance Tomography (EIT) is a powerful non-invasive technique for imaging applications. The goal is to estimate the electrical properties of living tissues by measuring the potential at the boundary of the domain. Being safe with respect to patient health, non-invasive, and having no known hazards, EIT is an attractive and promising technology. However, it suffers from a particular technical difficulty, which consists of solving a nonlinear inverse problem in real time. Several nonlinear approaches have been proposed as a replacement for the linear solver, but in practice very few are capable of stable, high-quality, and real-time EIT imaging because of their very low robustness to errors and inaccurate modeling, or because they require considerable computational effort. Methods: In this paper, a post-processing technique based on an artificial neural network (ANN) is proposed to obtain a nonlinear solution to the inverse problem, starting from a linear solution. While common reconstruction methods based on ANNs estimate the solution directly from the measured data, the method proposed here enhances the solution obtained from a linear solver. Conclusion: Applying a linear reconstruction algorithm before applying an ANN reduces the effects of noise and modeling errors. Hence, this approach significantly reduces the error associated with solving 2D inverse problems using machine-learning-based algorithms. Significance: This work presents radical enhancements in the stability of nonlinear methods for biomedical EIT applications. PMID:29206856
An Inviscid Decoupled Method for the Roe FDS Scheme in the Reacting Gas Path of FUN3D

NASA Technical Reports Server (NTRS)

Thompson, Kyle B.; Gnoffo, Peter A.

2016-01-01

An approach is described to decouple the species continuity equations from the mixture continuity, momentum, and total energy equations for the Roe flux difference splitting scheme. This decoupling simplifies the implicit system, so that the flow solver can be made significantly more efficient, with very little penalty on overall scheme robustness. Most importantly, the computational cost of the point implicit relaxation is shown to scale linearly with the number of species for the decoupled system, whereas the fully coupled approach scales quadratically. Also, the decoupled method significantly reduces the cost in wall time and memory in comparison to the fully coupled approach. This work lays the foundation for development of an efficient adjoint solution procedure for high speed reacting flow.
A numerical study of blood flow using mixture theory

PubMed Central

Wu, Wei-Tao; Aubry, Nadine; Massoudi, Mehrdad; Kim, Jeongho; Antaki, James F.

2014-01-01

In this paper, we consider the two-dimensional flow of blood in a rectangular microfluidic channel. We use Mixture Theory to treat this problem as a two-component system: One component is the red blood cells (RBCs) modeled as a generalized Reiner–Rivlin type fluid, which considers the effects of volume fraction (hematocrit) and influence of shear rate upon viscosity. The other component, plasma, is assumed to behave as a linear viscous fluid. A CFD solver based on OpenFOAM® was developed and employed to simulate a specific problem, namely blood flow in a two-dimensional micro-channel. Finally, to better understand this two-component flow system and the effects of the different parameters, the equations are made dimensionless and a parametric study is performed. PMID:24791016
A numerical study of blood flow using mixture theory.

PubMed

Wu, Wei-Tao; Aubry, Nadine; Massoudi, Mehrdad; Kim, Jeongho; Antaki, James F

2014-03-01

In this paper, we consider the two-dimensional flow of blood in a rectangular microfluidic channel. We use Mixture Theory to treat this problem as a two-component system: One component is the red blood cells (RBCs) modeled as a generalized Reiner-Rivlin type fluid, which considers the effects of volume fraction (hematocrit) and influence of shear rate upon viscosity. The other component, plasma, is assumed to behave as a linear viscous fluid. A CFD solver based on OpenFOAM® was developed and employed to simulate a specific problem, namely blood flow in a two-dimensional micro-channel. Finally, to better understand this two-component flow system and the effects of the different parameters, the equations are made dimensionless and a parametric study is performed.
Hybrid ODE/SSA methods and the cell cycle model

NASA Astrophysics Data System (ADS)

Wang, S.; Chen, M.; Cao, Y.

2017-07-01

Stochastic effects in cellular systems have been an important topic in systems biology. Stochastic modeling and simulation methods are important tools for studying stochastic effects. Given the low efficiency of stochastic simulation algorithms, the hybrid method, which combines an ordinary differential equation (ODE) system with a stochastic chemically reacting system, shows unique advantages in the modeling and simulation of biochemical systems. The efficiency of the hybrid method is usually limited by reactions in the stochastic subsystem, which are modeled and simulated using Gillespie's framework and frequently interrupt the integration of the ODE subsystem. In this paper we develop an efficient implementation approach for the hybrid method coupled with traditional ODE solvers. We also compare the efficiency of hybrid methods with three widely used ODE solvers: RADAU5, DASSL, and DLSODAR. Numerical experiments with three biochemical models are presented, along with a detailed discussion of the performance of the three ODE solvers.
Box truss analysis and technology development. Task 1: Mesh analysis and control

NASA Technical Reports Server (NTRS)

Bachtell, E. E.; Bettadapur, S. S.; Coyner, J. V.

1985-01-01

An analytical tool was developed to model, analyze and predict RF performance of box truss antennas with reflective mesh surfaces. The analysis system is unique in that it integrates custom written programs for cord tied mesh surfaces, thereby drastically reducing the cost of analysis. The analysis system is capable of determining the RF performance of antennas under any type of manufacturing or operating environment by integrating together the various disciplines of design, finite element analysis, surface best fit analysis and RF analysis. The Integrated Mesh Analysis System consists of six separate programs: The Mesh Tie System Model Generator, The Loadcase Generator, The Model Optimizer, The Model Solver, The Surface Topography Solver and The RF Performance Solver. Additionally, a study using the mesh analysis system was performed to determine the effect of on orbit calibration, i.e., surface adjustment, on a typical box truss antenna.
CAPRI (Computational Analysis PRogramming Interface): A Solid Modeling Based Infra-Structure for Engineering Analysis and Design Simulations

NASA Technical Reports Server (NTRS)

Haimes, Robert; Follen, Gregory J.

1998-01-01

CAPRI is a CAD-vendor neutral application programming interface designed for the construction of analysis and design systems. By allowing access to the geometry from within all modules (grid generators, solvers and post-processors) such tasks as meshing on the actual surfaces, node enrichment by solvers and defining which mesh faces are boundaries (for the solver and visualization system) become simpler. The overall reliance on file 'standards' is minimized. This 'Geometry Centric' approach makes multi-physics (multi-disciplinary) analysis codes much easier to build. By using the shared (coupled) surface as the foundation, CAPRI provides a single call to interpolate grid-node based data from the surface discretization in one volume to another. Finally, design systems are possible where the results can be brought back into the CAD system (and therefore manufactured) because all geometry construction and modification are performed using the CAD system's geometry kernel.
A stopping criterion for the iterative solution of partial differential equations

NASA Astrophysics Data System (ADS)

Rao, Kaustubh; Malan, Paul; Perot, J. Blair

2018-01-01

A stopping criterion for iterative solution methods is presented that accurately estimates the solution error using low computational overhead. The proposed criterion uses information from prior solution changes to estimate the error. When the solution changes are noisy or stagnating it reverts to a less accurate but more robust, low-cost singular value estimate to approximate the error given the residual. This estimator can also be applied to iterative linear matrix solvers such as Krylov subspace or multigrid methods. Examples of the stopping criterion's ability to accurately estimate the non-linear and linear solution error are provided for a number of different test cases in incompressible fluid dynamics.
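The idea of estimating the error from prior solution changes can be sketched with a textbook geometric-series argument: if successive changes shrink by a roughly constant ratio rho, the remaining error is approximately the last change times rho / (1 - rho). This is an assumption-laden illustration, not the estimator proposed in the paper, and it omits the singular-value fallback the abstract mentions. The Richardson iteration, matrix, and names are chosen only for the example.

```python
import math


def norm(v):
    return math.sqrt(sum(vi * vi for vi in v))


def richardson_with_error_estimate(A, b, omega, tol, max_iter=10000):
    """Iterate x <- x + omega * (b - A x) and stop when the estimated
    solution error, based on successive changes, falls below tol."""
    n = len(b)
    x = [0.0] * n
    prev_change = None
    for k in range(1, max_iter + 1):
        res = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        x_new = [x[i] + omega * res[i] for i in range(n)]
        change = norm([x_new[i] - x[i] for i in range(n)])
        x = x_new
        if change == 0.0:
            return x, 0.0, k
        if prev_change is not None:
            rho = change / prev_change  # observed contraction ratio
            if rho < 1.0:
                estimate = change * rho / (1.0 - rho)
                if estimate < tol:
                    return x, estimate, k
        prev_change = change
    raise RuntimeError("iteration did not converge")


# Hypothetical 2x2 system with exact solution [0, 1].
A = [[2.0, 1.0], [1.0, 2.0]]
b = [1.0, 2.0]
x, estimate, iterations = richardson_with_error_estimate(A, b, 0.25, 1e-8)
true_error = norm([x[0] - 0.0, x[1] - 1.0])
```

Stopping on the residual or on the raw change alone would not account for the contraction ratio; the estimate tracks the true error closely once a single slow mode dominates.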
Implicit integration methods for dislocation dynamics

DOE PAGES

Gardner, D. J.; Woodward, C. S.; Reynolds, D. R.; ...

2015-01-20

In dislocation dynamics simulations, strain hardening simulations require integrating stiff systems of ordinary differential equations in time with expensive force calculations, discontinuous topological events, and rapidly changing problem size. Current solvers in use often result in small time steps and long simulation times. Faster solvers may help dislocation dynamics simulations accumulate plastic strains at strain rates comparable to experimental observations. Here, this paper investigates the viability of high order implicit time integrators and robust nonlinear solvers to reduce simulation run times while maintaining the accuracy of the computed solution. In particular, implicit Runge-Kutta time integrators are explored as a way of providing greater accuracy over a larger time step than is typically done with the standard second-order trapezoidal method. In addition, both accelerated fixed point and Newton's method are investigated to provide fast and effective solves for the nonlinear systems that must be resolved within each time step. Results show that integrators of third order are the most effective, while accelerated fixed point and Newton's method both improve solver performance over the standard fixed point method used for the solution of the nonlinear systems.
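To make the role of the nonlinear solver inside an implicit step concrete, here is a minimal sketch of the second-order trapezoidal method with a Newton solve for a scalar stiff ODE. It is not the dislocation dynamics code or its integrator library; the test equation, step size, and names are assumptions.

```python
def trapezoidal_newton(f, dfdy, y0, t0, t1, n_steps,
                       newton_tol=1e-12, max_newton=20):
    """Integrate y' = f(t, y) with the implicit trapezoidal rule,
    solving each implicit step by Newton's method."""
    h = (t1 - t0) / n_steps
    t, y = t0, y0
    for _ in range(n_steps):
        f_old = f(t, y)
        t_new = t + h
        z = y  # initial Newton guess: the previous solution value
        for _ in range(max_newton):
            # Residual of z = y + (h / 2) * (f(t, y) + f(t_new, z)).
            g = z - y - 0.5 * h * (f_old + f(t_new, z))
            if abs(g) < newton_tol:
                break
            z -= g / (1.0 - 0.5 * h * dfdy(t_new, z))
        t, y = t_new, z
    return y


# Hypothetical stiff test problem: y' = -1000 * (y - 1), y(0) = 0,
# whose solution relaxes rapidly to 1.
def rhs(t, y):
    return -1000.0 * (y - 1.0)


def rhs_dy(t, y):
    return -1000.0


y_end = trapezoidal_newton(rhs, rhs_dy, 0.0, 0.0, 1.0, 100)
```

With this step size, (h / 2) * |df/dy| = 5, so a plain fixed-point iteration on the same implicit equation would diverge, while Newton converges in one correction per step for this linear problem.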
A 3D approximate maximum likelihood solver for localization of fish implanted with acoustic transmitters

DOE PAGES

Li, Xinya; Deng, Z. Daniel; USA, Richland Washington; ...

2014-11-27

Better understanding of fish behavior is vital for recovery of many endangered species including salmon. The Juvenile Salmon Acoustic Telemetry System (JSATS) was developed to observe the out-migratory behavior of juvenile salmonids tagged by surgical implantation of acoustic micro-transmitters and to estimate the survival when passing through dams on the Snake and Columbia Rivers. A robust three-dimensional solver was needed to accurately and efficiently estimate the time sequence of locations of fish tagged with JSATS acoustic transmitters, to describe in sufficient detail the information needed to assess the function of dam-passage design alternatives. An approximate maximum likelihood solver was developed using measurements of time difference of arrival from all hydrophones in receiving arrays on which a transmission was detected. Field experiments demonstrated that the developed solver performed significantly better in tracking efficiency and accuracy than other solvers described in the literature.
A 3D approximate maximum likelihood solver for localization of fish implanted with acoustic transmitters

NASA Astrophysics Data System (ADS)

Li, Xinya; Deng, Z. Daniel; Sun, Yannan; Martinez, Jayson J.; Fu, Tao; McMichael, Geoffrey A.; Carlson, Thomas J.

2014-11-01

Better understanding of fish behavior is vital for recovery of many endangered species including salmon. The Juvenile Salmon Acoustic Telemetry System (JSATS) was developed to observe the out-migratory behavior of juvenile salmonids tagged by surgical implantation of acoustic micro-transmitters and to estimate the survival when passing through dams on the Snake and Columbia Rivers. A robust three-dimensional solver was needed to accurately and efficiently estimate the time sequence of locations of fish tagged with JSATS acoustic transmitters, to describe in sufficient detail the information needed to assess the function of dam-passage design alternatives. An approximate maximum likelihood solver was developed using measurements of time difference of arrival from all hydrophones in receiving arrays on which a transmission was detected. Field experiments demonstrated that the developed solver performed significantly better in tracking efficiency and accuracy than other solvers described in the literature.
A 3D approximate maximum likelihood solver for localization of fish implanted with acoustic transmitters

PubMed Central

Li, Xinya; Deng, Z. Daniel; Sun, Yannan; Martinez, Jayson J.; Fu, Tao; McMichael, Geoffrey A.; Carlson, Thomas J.

2014-01-01

Better understanding of fish behavior is vital for recovery of many endangered species including salmon. The Juvenile Salmon Acoustic Telemetry System (JSATS) was developed to observe the out-migratory behavior of juvenile salmonids tagged by surgical implantation of acoustic micro-transmitters and to estimate the survival when passing through dams on the Snake and Columbia Rivers. A robust three-dimensional solver was needed to accurately and efficiently estimate the time sequence of locations of fish tagged with JSATS acoustic transmitters, to describe in sufficient detail the information needed to assess the function of dam-passage design alternatives. An approximate maximum likelihood solver was developed using measurements of time difference of arrival from all hydrophones in receiving arrays on which a transmission was detected. Field experiments demonstrated that the developed solver performed significantly better in tracking efficiency and accuracy than other solvers described in the literature. PMID:25427517
A 3D approximate maximum likelihood solver for localization of fish implanted with acoustic transmitters.

PubMed

Li, Xinya; Deng, Z Daniel; Sun, Yannan; Martinez, Jayson J; Fu, Tao; McMichael, Geoffrey A; Carlson, Thomas J

2014-11-27

Better understanding of fish behavior is vital for recovery of many endangered species including salmon. The Juvenile Salmon Acoustic Telemetry System (JSATS) was developed to observe the out-migratory behavior of juvenile salmonids tagged by surgical implantation of acoustic micro-transmitters and to estimate the survival when passing through dams on the Snake and Columbia Rivers. A robust three-dimensional solver was needed to accurately and efficiently estimate the time sequence of locations of fish tagged with JSATS acoustic transmitters, to describe in sufficient detail the information needed to assess the function of dam-passage design alternatives. An approximate maximum likelihood solver was developed using measurements of time difference of arrival from all hydrophones in receiving arrays on which a transmission was detected. Field experiments demonstrated that the developed solver performed significantly better in tracking efficiency and accuracy than other solvers described in the literature.

A 3D approximate maximum likelihood solver for localization of fish implanted with acoustic transmitters

DOE Office of Scientific and Technical Information (OSTI.GOV)

Li, Xinya; Deng, Z. Daniel; USA, Richland Washington

Better understanding of fish behavior is vital for recovery of many endangered species including salmon. The Juvenile Salmon Acoustic Telemetry System (JSATS) was developed to observe the out-migratory behavior of juvenile salmonids tagged by surgical implantation of acoustic micro-transmitters and to estimate the survival when passing through dams on the Snake and Columbia Rivers. A robust three-dimensional solver was needed to accurately and efficiently estimate the time sequence of locations of fish tagged with JSATS acoustic transmitters, to describe in sufficient detail the information needed to assess the function of dam-passage design alternatives. An approximate maximum likelihood solver was developed using measurements of time difference of arrival from all hydrophones in receiving arrays on which a transmission was detected. Field experiments demonstrated that the developed solver performed significantly better in tracking efficiency and accuracy than other solvers described in the literature.
An interior penalty stabilised incompressible discontinuous Galerkin-Fourier solver for implicit large eddy simulations

NASA Astrophysics Data System (ADS)

Ferrer, Esteban

2017-11-01

We present an implicit Large Eddy Simulation (iLES) h / p high order (≥2) unstructured Discontinuous Galerkin-Fourier solver with sliding meshes. The solver extends the laminar version of Ferrer and Willden, 2012 [34], to enable the simulation of turbulent flows at moderately high Reynolds numbers in the incompressible regime. This solver allows accurate flow solutions of the laminar and turbulent 3D incompressible Navier-Stokes equations on moving and static regions coupled through a high order sliding interface. The spatial discretisation is provided by the Symmetric Interior Penalty Discontinuous Galerkin (IP-DG) method in the x-y plane coupled with a purely spectral method that uses Fourier series and allows efficient computation of spanwise periodic three-dimensional flows. Since high order methods (e.g. discontinuous Galerkin and Fourier) are unable to provide enough numerical dissipation to enable under-resolved high Reynolds computations (i.e. as necessary in the iLES approach), we adapt the laminar version of the solver to increase (controllably) the dissipation and enhance the stability in under-resolved simulations. The novel stabilisation relies on increasing the penalty parameter included in the DG interior penalty (IP) formulation. The latter penalty term is included when discretising the linear viscous terms in the incompressible Navier-Stokes equations. These viscous penalty fluxes substitute the stabilising effect of non-linear fluxes, which has been the main trend in implicit LES discontinuous Galerkin approaches. The IP-DG penalty term provides energy dissipation, which is controlled by the numerical jumps at element interfaces (e.g. large in under-resolved regions) such as to stabilise under-resolved high Reynolds number flows. This dissipative term has minimal impact in well resolved regions and its implicit treatment does not restrict the use of large time steps, thus providing an efficient stabilization mechanism for iLES. 
The IP-DG stabilisation is complemented with a Spectral Vanishing Viscosity (SVV) method, in the z-direction, to enhance stability in the continuous Fourier space. The coupling between the numerical viscosity in the DG plane and the SVV damping, provides an efficient approach to stabilise high order methods at moderately high Reynolds numbers. We validate the formulation for three turbulent flow cases: a circular cylinder at Re = 3900, a static and pitch oscillating NACA 0012 airfoil at Re = 10000 and finally a rotating vertical-axis turbine at Re = 40000, with Reynolds based on the circular diameter, airfoil chord and turbine diameter, respectively. All our results compare favourably with published direct numerical simulations, large eddy simulations or experimental data. We conclude that the DG-Fourier high order solver, with IP-SVV stabilisation, proves to be a valuable tool to predict turbulent flows and associated statistics for both static and rotating machinery.
The development of an intelligent interface to a computational fluid dynamics flow-solver code

NASA Technical Reports Server (NTRS)

Williams, Anthony D.

1988-01-01

Researchers at NASA Lewis are currently developing an 'intelligent' interface to aid in the development and use of large, computational fluid dynamics flow-solver codes for studying the internal fluid behavior of aerospace propulsion systems. This paper discusses the requirements, design, and implementation of an intelligent interface to Proteus, a general purpose, 3-D, Navier-Stokes flow solver. The interface is called PROTAIS to denote its introduction of artificial intelligence (AI) concepts to the Proteus code.
Comparison of Quasi-Conservative Pressure-Based and Fully-Conservative Formulations for the Simulation of Transcritical Flows

NASA Astrophysics Data System (ADS)

Lacaze, Guilhem; Oefelein, Joseph

2016-11-01

High-pressure flows are known to be challenging to simulate due to thermodynamic non-linearities occurring in the vicinity of the pseudo-boiling line. This study investigates the origin of this issue by analyzing the behavior of thermodynamic processes at elevated pressure and low temperature. We show that under transcritical conditions, non-linearities significantly amplify numerical errors associated with construction of fluxes. These errors affect the local density and energy balances, which in turn creates pressure oscillations. For that reason, solvers based on a conservative system of equations that transport density and total energy are subject to unphysical pressure variations in gradient regions. These perturbations hinder numerical stability and degrade the accuracy of predictions. To circumvent this problem, the governing system can be reformulated to a pressure-based treatment of energy. We present comparisons between the pressure-based and fully conservative formulations using a progressive set of canonical cases, including a cryogenic turbulent mixing layer at rocket engine conditions. Department of Energy, Office of Science, Basic Energy Sciences Program.
Fast computation of an optimal controller for large-scale adaptive optics.

PubMed

Massioni, Paolo; Kulcsár, Caroline; Raynaud, Henri-François; Conan, Jean-Marc

2011-11-01

The linear quadratic Gaussian regulator provides the minimum-variance control solution for a linear time-invariant system. For adaptive optics (AO) applications, under the hypothesis of a deformable mirror with instantaneous response, such a controller boils down to a minimum-variance phase estimator (a Kalman filter) and a projection onto the mirror space. The Kalman filter gain can be computed by solving an algebraic Riccati matrix equation, whose computational complexity grows very quickly with the size of the telescope aperture. This "curse of dimensionality" makes the standard solvers for Riccati equations very slow in the case of extremely large telescopes. In this article, we propose a way of computing the Kalman gain for AO systems by means of an approximation that considers the turbulence phase screen as the cropped version of an infinite-size screen. We demonstrate the advantages of the methods for both off- and on-line computational time, and we evaluate its performance for classical AO as well as for wide-field tomographic AO with multiple natural guide stars. Simulation results are reported.
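For orientation, the relationship between the Riccati equation and the Kalman gain can be sketched in the scalar case, where the algebraic Riccati equation is solved by simply iterating the covariance recursion to its fixed point. This is a brute-force scalar analogue under assumed model parameters (a, c, q, r) and a hypothetical function name; it is not the infinite-phase-screen approximation proposed in the record above.

```python
def steady_state_kalman_gain(a, c, q, r, tol=1e-12, max_iter=10000):
    """Steady-state Kalman gain for the scalar model
        x[k+1] = a * x[k] + w,   var(w) = q
        y[k]   = c * x[k] + v,   var(v) = r
    obtained by iterating the Riccati recursion for the prediction
    error variance p until it stops changing.
    """
    p = q
    for _ in range(max_iter):
        k = p * c / (c * c * p + r)              # gain for the current p
        p_next = a * a * (1.0 - k * c) * p + q   # Riccati recursion
        if abs(p_next - p) < tol:
            gain = p_next * c / (c * c * p_next + r)
            return p_next, gain
        p = p_next
    raise RuntimeError("Riccati iteration did not converge")

a, c, q, r = 0.9, 1.0, 0.1, 1.0   # assumed example values
p_inf, k_inf = steady_state_kalman_gain(a, c, q, r)
```

In the matrix case p becomes an n-by-n covariance and each iteration costs on the order of n^3 operations, which is the growth with aperture size that the record describes.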
Parallel iterative solution for h and p approximations of the shallow water equations

USGS Publications Warehouse

Barragy, E.J.; Walters, R.A.

1998-01-01

A p finite element scheme and parallel iterative solver are introduced for a modified form of the shallow water equations. The governing equations are the three-dimensional shallow water equations. After a harmonic decomposition in time and rearrangement, the resulting equations are a complex Helmholtz problem for surface elevation, and a complex momentum equation for the horizontal velocity. Both equations are nonlinear and the resulting system is solved using the Picard iteration combined with a preconditioned biconjugate gradient (PBCG) method for the linearized subproblems. A subdomain-based parallel preconditioner is developed which uses incomplete LU factorization with thresholding (ILUT) methods within subdomains, overlapping ILUT factorizations for subdomain boundaries and under-relaxed iteration for the resulting block system. The method builds on techniques successfully applied to linear elements by introducing ordering and condensation techniques to handle uniform p refinement. The combined methods show good performance for a range of p (element order), h (element size), and N (number of processors). Performance and scalability results are presented for a field scale problem where up to 512 processors are used. © 1998 Elsevier Science Ltd. All rights reserved.
Cooperative control of two active spacecraft during proximity operations. M.S. Thesis - MIT

NASA Technical Reports Server (NTRS)

Polutchko, Robert J.

1989-01-01

A cooperative autopilot is developed for the control of the relative attitude, relative position and absolute attitude of two maneuvering spacecraft during on-orbit proximity operations. The autopilot consists of an open-loop trajectory solver which computes a nine-dimensional linearized nominal state trajectory at the beginning of each maneuver and a phase space regulator which maintains the two spacecraft on the nominal trajectory during coast phases of the maneuver. A linear programming algorithm is used to perform jet selection. Simulation tests using a system of two space shuttle vehicles are performed to verify the performance of the cooperative controller and comparisons are made to a traditional passive target/active pursuit vehicle approach to proximity operations. The cooperative autopilot is shown to be able to control the two-vehicle system when both the would-be pursuit vehicle and the target vehicle are not completely controllable in six degrees of freedom. The cooperative controller is also shown to use as much as 37 percent less fuel and 57 percent fewer jet firings than a single pursuit vehicle during a simple docking approach maneuver.
Design of a microfluidic system for red blood cell aggregation investigation.

PubMed

Mehri, R; Mavriplis, C; Fenech, M

2014-06-01

The purpose of this paper is to design a microfluidic apparatus capable of providing controlled flow conditions suitable for red blood cell (RBC) aggregation analysis. The linear velocity engendered from the controlled flow provides constant shear rates used to qualitatively analyze RBC aggregates. The design of the apparatus is based on numerical and experimental work. The numerical work consists of 3D numerical simulations performed using a research computational fluid dynamics (CFD) solver, Nek5000, while the experiments are conducted using a microparticle image velocimetry system. A Newtonian model is tested numerically and experimentally, then blood is tested experimentally under several conditions (hematocrit, shear rate, and fluid suspension) to be compared to the simulation results. We find that using a velocity ratio of 4 between the two Newtonian fluids, the layer corresponding to blood expands to fill 35% of the channel thickness where the constant shear rate is achieved. For blood experiments, the velocity profile in the blood layer is approximately linear, resulting in the desired controlled conditions for the study of RBC aggregation under several flow scenarios.
Robust control of combustion instabilities

NASA Astrophysics Data System (ADS)

Hong, Boe-Shong

Several interactive dynamical subsystems, each of which has its own time scale and physical significance, are decomposed to build a feedback-controlled combustion-fluid robust dynamics. On the fast time scale, the phenomenon of combustion instability corresponds to the internal feedback of two subsystems: acoustic dynamics and flame dynamics, which are parametrically dependent on the slow-time-scale mean-flow dynamics controlled for global performance by a mean-flow controller. This dissertation constructs such a control system, through modeling, analysis and synthesis, to deal with model uncertainties, environmental noises and time-varying mean-flow operation. The conservation law is decomposed into fast-time acoustic dynamics and slow-time mean-flow dynamics, which serve for synthesizing an LPV (linear parameter varying) L2-gain robust control law, in which a robust observer is embedded for estimating and controlling the internal status, while achieving trade-offs among robustness, performance and operation. The robust controller is formulated as two LPV-type Linear Matrix Inequalities (LMIs), whose numerical solver is developed by the finite-element method. Some important issues related to physical understanding and engineering application are discussed in simulated results of the control system.
Spacecraft Formation Flying Maneuvers Using Linear-Quadratic Regulation with No Radial Axis Inputs

NASA Technical Reports Server (NTRS)

Starin, Scott R.; Yedavalli, R. K.; Sparks, Andrew G.; Bauer, Frank H. (Technical Monitor)

2001-01-01

Regarding multiple spacecraft formation flying, the observation has been made that control thrust need only be applied coplanar to the local horizon to achieve complete controllability of a two-satellite (leader-follower) formation. A formulation of orbital dynamics using the state of one satellite relative to another is used. Without the need for thrust along the radial (zenith-nadir) axis of the relative reference frame, propulsion system simplifications and weight reduction may be accomplished. Several linear-quadratic regulators (LQR) are explored and compared based on performance measures likely to be important to many missions, but not directly optimized in the LQR designs. Maneuver simulations are performed using commercial ODE solvers to propagate the Keplerian dynamics of a controlled satellite relative to an uncontrolled leader. These short maneuver simulations demonstrate the capacity of the controller to perform changes from one formation geometry to another. This work focusses on formations in which the controlled satellite has a relative trajectory which projects onto the local horizon of the uncontrolled satellite as a circle. This formation has potential uses for distributed remote sensing systems.
Real-time scene and signature generation for ladar and imaging sensors

NASA Astrophysics Data System (ADS)

Swierkowski, Leszek; Christie, Chad L.; Antanovskii, Leonid; Gouthas, Efthimios

2014-05-01

This paper describes development of two key functionalities within the VIRSuite scene simulation program, broadening its scene generation capabilities and increasing accuracy of thermal signatures. Firstly, a new LADAR scene generation module has been designed. It is capable of simulating range imagery for Geiger mode LADAR, in addition to the already existing functionality for linear mode systems. Furthermore, a new 3D heat diffusion solver has been developed within the VIRSuite signature prediction module. It is capable of calculating the temperature distribution in complex three-dimensional objects for enhanced dynamic prediction of thermal signatures. With these enhancements, VIRSuite is now a robust tool for conducting dynamic simulation for missiles with multi-mode seekers.
TADS: A CFD-Based Turbomachinery Analysis and Design System with GUI: Methods and Results. 2.0

NASA Technical Reports Server (NTRS)

Koiro, M. J.; Myers, R. A.; Delaney, R. A.

1999-01-01

The primary objective of this study was the development of a Computational Fluid Dynamics (CFD) based turbomachinery airfoil analysis and design system, controlled by a Graphical User Interface (GUI). The computer codes resulting from this effort are referred to as TADS (Turbomachinery Analysis and Design System). This document is the Final Report describing the theoretical basis and analytical results from the TADS system developed under Task 10 of NASA Contract NAS3-27394, ADPAC System Coupling to Blade Analysis & Design System GUI, Phase II-Loss, Design and Multi-stage Analysis. TADS couples a throughflow solver (ADPAC) with a quasi-3D blade-to-blade solver (RVCQ3D) or a 3-D solver with slip condition on the end walls (B2BADPAC) in an interactive package. Throughflow analysis and design capability was developed in ADPAC through the addition of blade force and blockage terms to the governing equations. A GUI was developed to simplify user input and automate the many tasks required to perform turbomachinery analysis and design. The coupling of the various programs was done in such a way that alternative solvers or grid generators could be easily incorporated into the TADS framework. Results of aerodynamic calculations using the TADS system are presented for a multistage compressor, a multistage turbine, two highly loaded fans, and several single stage compressor and turbine example cases.
A mixed fluid-kinetic solver for the Vlasov-Poisson equations

NASA Astrophysics Data System (ADS)

Cheng, Yongtao

Plasmas are ionized gases that appear in a wide range of applications including astrophysics and space physics, as well as in laboratory settings such as in magnetically confined fusion. There are two prevailing types of modeling strategies to describe a plasma system: kinetic models and fluid models. Kinetic models evolve particle probability density distributions (PDFs) in phase space, which are accurate but computationally expensive. Fluid models evolve a small number of moments of the distribution function and reduce the dimension of the solution. However, some approximation is necessary to close the system, and finding an accurate moment closure that correctly captures the dynamics away from thermodynamic equilibrium is a difficult and still open problem. The main contributions of the present work can be divided into two main parts: (1) a new class of moment closures, based on a modification of existing quadrature-based moment-closure methods, is developed using bi-B-spline and bi-bubble representations; and (2) a novel mixed solver that combines a fluid and a kinetic solver is proposed, which uses the new class of moment-closure methods described in the first part. For the newly developed quadrature-based moment closure based on bi-B-spline and bi-bubble representations, the explicit form of the flux terms and the moment-realizability conditions are given. It is shown that while the bi-delta system is weakly hyperbolic, the newly proposed fluid models are strongly hyperbolic. Using a high-order Runge-Kutta discontinuous Galerkin method together with Strang operator splitting, the resulting models are applied to the Vlasov-Poisson-Fokker-Planck system in the high field limit. In the second part of this work, results from the kinetic solver are used to provide a corrected closure to the fluid model. This correction keeps the fluid model hyperbolic and gives fluid results that match the moments as computed from the kinetic solution.
Furthermore, a prolongation operation based on the bi-bubble moment closure is used to make the first few moments of the kinetic and fluid solvers match. This results in a kinetic solver that exactly conserves mass and total energy. This mixed fluid-kinetic solver is applied to standard test problems for the Vlasov-Poisson system, including the two-stream instability and Landau damping.
Parallel-vector solution of large-scale structural analysis problems on supercomputers

NASA Technical Reports Server (NTRS)

Storaasli, Olaf O.; Nguyen, Duc T.; Agarwal, Tarun K.

1989-01-01

A direct linear equation solution method based on the Choleski factorization procedure is presented which exploits both parallel and vector features of supercomputers. The new equation solver is described, and its performance is evaluated by solving structural analysis problems on three high-performance computers. The method has been implemented using Force, a generic parallel FORTRAN language.
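The factor-then-substitute structure of a Cholesky-based direct solver can be sketched in a few lines. The version below is a serial, dense, pure-Python illustration with assumed function names and a small assumed test matrix; it shows none of the parallel or vector techniques that the record is about, and it is not the Force implementation.

```python
import math

def cholesky(a):
    """Return lower-triangular L with A = L * L^T for a symmetric
    positive definite matrix A given as a list of rows."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        s = a[j][j] - sum(l[j][k] ** 2 for k in range(j))
        if s <= 0.0:
            raise ValueError("matrix is not positive definite")
        l[j][j] = math.sqrt(s)
        for i in range(j + 1, n):
            l[i][j] = (a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))) / l[j][j]
    return l

def solve_cholesky(a, b):
    """Solve A x = b by factoring A, then forward and back substitution."""
    l = cholesky(a)
    n = len(b)
    y = [0.0] * n
    for i in range(n):                      # forward solve L y = b
        y[i] = (b[i] - sum(l[i][k] * y[k] for k in range(i))) / l[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):            # back solve L^T x = y
        x[i] = (y[i] - sum(l[k][i] * x[k] for k in range(i + 1, n))) / l[i][i]
    return x

# Small tridiagonal system of the kind that arises from 1D stiffness matrices.
A = [[4.0, -1.0, 0.0],
     [-1.0, 4.0, -1.0],
     [0.0, -1.0, 4.0]]
b = [2.0, 4.0, 10.0]
x = solve_cholesky(A, b)
```

The inner sums over k are the dot products that vector hardware accelerates, and the column updates for different i are independent of each other, which is where parallelism can be exploited.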
JPLEX: Java Simplex Implementation with Branch-and-Bound Search for Automated Test Assembly

ERIC Educational Resources Information Center

Park, Ryoungsun; Kim, Jiseon; Dodd, Barbara G.; Chung, Hyewon

2011-01-01

JPLEX, short for Java simPLEX, is an automated test assembly (ATA) program. It is a mixed integer linear programming (MILP) solver written in Java. It reads in a configuration file, solves the minimization problem, and produces an output file for postprocessing. It implements the simplex algorithm to create a fully relaxed solution and…
Numerical Simulation of the Interaction of an Air Shock Wave with a Surface Gas-Dust Layer

NASA Astrophysics Data System (ADS)

Surov, V. S.

2018-05-01

Within the framework of the one-velocity and multivelocity models of a dust-laden gas with the use of the Godunov method with a linearized Riemann solver, the problem of the interaction of a shock wave with a dust-laden gas layer located along a solid plane surface has been studied.
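As a minimal stand-in for the Godunov method with a linearized Riemann solver, the sketch below applies a Roe-type linearized flux to the scalar inviscid Burgers equation on a periodic grid. The equation, grid, initial data, and function names are all assumptions for illustration; the one-velocity and multivelocity dust-laden gas models of the record are systems of equations and are not reproduced here, and no entropy fix is included.

```python
def roe_flux(ul, ur):
    """Linearized (Roe-type) interface flux for f(u) = u**2 / 2.

    The Riemann problem between ul and ur is replaced by a linear one
    with the single wave speed a = (ul + ur) / 2.
    """
    a = 0.5 * (ul + ur)
    return 0.5 * (0.5 * ul * ul + 0.5 * ur * ur) - 0.5 * abs(a) * (ur - ul)

def godunov_step(u, dx, dt):
    """One conservative finite-volume update with periodic boundaries."""
    n = len(u)
    # flux[i] is the flux through the interface between cell i and cell i+1
    flux = [roe_flux(u[i], u[(i + 1) % n]) for i in range(n)]
    return [u[i] - dt / dx * (flux[i] - flux[i - 1]) for i in range(n)]

n, dx = 100, 0.01
u = [1.0 if 20 <= i < 40 else 0.2 for i in range(n)]   # square pulse
total_before = sum(u) * dx
dt = 0.4 * dx / max(abs(v) for v in u)                 # CFL number 0.4
for _ in range(50):
    u = godunov_step(u, dx, dt)
total_after = sum(u) * dx
```

Because every interface flux is added to one cell and subtracted from its neighbour, the discrete total of u is preserved up to rounding, which is the property that lets schemes of this type capture shock speeds correctly.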
Numerical Simulation of the Interaction of an Air Shock Wave with a Surface Gas-Dust Layer

NASA Astrophysics Data System (ADS)

Surov, V. S.

2018-03-01

Within the framework of the one-velocity and multivelocity models of a dust-laden gas with the use of the Godunov method with a linearized Riemann solver, the problem of the interaction of a shock wave with a dust-laden gas layer located along a solid plane surface has been studied.
Uncertainty Quantification of Non-linear Oscillation Triggering in a Multi-injector Liquid-propellant Rocket Combustion Chamber

NASA Astrophysics Data System (ADS)

Popov, Pavel; Sideris, Athanasios; Sirignano, William

2014-11-01

We examine the non-linear dynamics of the transverse modes of combustion-driven acoustic instability in a liquid-propellant rocket engine. Triggering can occur, whereby small perturbations from mean conditions decay, while larger disturbances grow to a limit-cycle of amplitude that may compare to the mean pressure. For a deterministic perturbation, the system is also deterministic, computed by coupled finite-volume solvers at low computational cost for a single realization. The randomness of the triggering disturbance is captured by treating the injector flow rates, local pressure disturbances, and sudden acceleration of the entire combustion chamber as random variables. The combustor chamber with its many sub-fields resulting from many injector ports may be viewed as a multi-scale complex system wherein the developing acoustic oscillation is the emergent structure. Numerical simulation of the resulting stochastic PDE system is performed using the polynomial chaos expansion method. The overall probability of unstable growth is assessed in different regions of the parameter space. We address, in particular, the seven-injector, rectangular Purdue University experimental combustion chamber. In addition to the novel geometry, new features include disturbances caused by engine acceleration and unsteady thruster nozzle flow.
Computing Generalized Matrix Inverse on Spiking Neural Substrate.

PubMed

Shukla, Rohit; Khoram, Soroosh; Jorgensen, Erik; Li, Jing; Lipasti, Mikko; Wright, Stephen

2018-01-01

Emerging neural hardware substrates, such as IBM's TrueNorth Neurosynaptic System, can provide an appealing platform for deploying numerical algorithms. For example, a recurrent Hopfield neural network can be used to find the Moore-Penrose generalized inverse of a matrix, thus enabling a broad class of linear optimizations to be solved efficiently, at low energy cost. However, deploying numerical algorithms on hardware platforms that severely limit the range and precision of representation for numeric quantities can be quite challenging. This paper discusses these challenges and proposes a rigorous mathematical framework for reasoning about range and precision on such substrates. The paper derives techniques for normalizing inputs and properly quantizing synaptic weights originating from arbitrary systems of linear equations, so that solvers for those systems can be implemented in a provably correct manner on hardware-constrained neural substrates. The analytical model is empirically validated on the IBM TrueNorth platform, and results show that the guarantees provided by the framework for range and precision hold under experimental conditions. Experiments with optical flow demonstrate the energy benefits of deploying a reduced-precision and energy-efficient generalized matrix inverse engine on the IBM TrueNorth platform, reflecting 10× to 100× improvement over FPGA and ARM core baselines.
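The idea of reaching the Moore-Penrose inverse through a recurrent update can be illustrated with a simple gradient-type iteration in ordinary floating point. This sketch assumes the update X <- X + alpha * A^T (I - A X) started from zero, with a step size chosen from the Frobenius norm; the function names and the example matrix are hypothetical, and nothing here models spiking neurons, weight quantization, or the TrueNorth hardware.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def pseudoinverse_iterative(a, iters=500):
    """Approximate the Moore-Penrose inverse of a nonzero m-by-n matrix.

    Repeats X <- X + alpha * A^T (I - A X) from X = 0. With
    alpha = 1 / ||A||_F^2 the step is small enough for the recurrence
    to converge, and starting from zero keeps X in the range of A^T,
    so the limit is the pseudoinverse.
    """
    m, n = len(a), len(a[0])
    at = transpose(a)
    alpha = 1.0 / sum(v * v for row in a for v in row)
    x = [[0.0] * m for _ in range(n)]
    for _ in range(iters):
        ax = matmul(a, x)                                   # m x m
        resid = [[(1.0 if i == j else 0.0) - ax[i][j] for j in range(m)]
                 for i in range(m)]
        corr = matmul(at, resid)                            # n x m
        x = [[x[i][j] + alpha * corr[i][j] for j in range(m)]
             for i in range(n)]
    return x

A = [[1.0, 1.0],
     [0.0, 1.0],
     [1.0, 0.0]]
A_pinv = pseudoinverse_iterative(A)
```

The convergence rate depends on the ratio of the smallest to the largest singular value, so input normalization of the kind the record discusses matters for iteration count as well as for numeric range.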

Physics-Based Preconditioning of a Compressible Flow Solver for Large-Scale Simulations of Additive Manufacturing Processes

NASA Astrophysics Data System (ADS)

Weston, Brian; Nourgaliev, Robert; Delplanque, Jean-Pierre

2017-11-01

We present a new block-based Schur complement preconditioner for simulating all-speed compressible flow with phase change. The conservation equations are discretized with a reconstructed Discontinuous Galerkin method and integrated in time with fully implicit time discretization schemes. The resulting set of non-linear equations is converged using a robust Newton-Krylov framework. Due to the stiffness of the underlying physics associated with stiff acoustic waves and viscous material strength effects, we solve for the primitive-variables (pressure, velocity, and temperature). To enable convergence of the highly ill-conditioned linearized systems, we develop a physics-based preconditioner, utilizing approximate block factorization techniques to reduce the fully-coupled 3×3 system to a pair of reduced 2×2 systems. We demonstrate that our preconditioned Newton-Krylov framework converges on very stiff multi-physics problems, corresponding to large CFL and Fourier numbers, with excellent algorithmic and parallel scalability. Results are shown for the classic lid-driven cavity flow problem as well as for 3D laser-induced phase change. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.
Large Scale, High Resolution, Mantle Dynamics Modeling

NASA Astrophysics Data System (ADS)

Geenen, T.; Berg, A. V.; Spakman, W.

2007-12-01

To model the geodynamic evolution of plate convergence, subduction and collision, and to allow for a connection to various types of observational data (geophysical, geodetic and geological), we developed a 4D (space-time) numerical mantle convection code. The model is based on a spherical 3D Eulerian FEM model with quadratic elements, on top of which we constructed a 3D Lagrangian particle-in-cell (PIC) method. We use the PIC method to transport material properties and to incorporate a viscoelastic rheology. Since capturing small-scale processes associated with localization phenomena requires high resolution, we spent considerable effort on implementing solvers suitable for models with over 100 million degrees of freedom. We implemented additive Schwarz type ILU-based methods in combination with a Krylov solver, GMRES. However, we found that for problems with over 500 thousand degrees of freedom the convergence of the solver degraded severely. This behavior is known from the literature [Saad, 2003] and results from the local character of the ILU preconditioner, which gives a poor approximation of the inverse of A for large A. The size of A for which ILU is no longer usable depends on the condition of A and on the amount of fill-in allowed for the ILU preconditioner. We found that for our problems with over 5×10^5 degrees of freedom, convergence became too slow to solve the system within an acceptable amount of wall time (one minute), even when allowing for a considerable amount of fill-in. We also implemented MUMPS and found good scaling results for problems up to 10^7 degrees of freedom on up to 32 CPUs. For problems with over 100 million degrees of freedom we implemented algebraic multigrid (AMG) methods from the ML library [Sala, 2006]. Since multigrid methods are most effective for single-parameter problems, we rebuilt our model to use the SIMPLE method in the Stokes solver [Patankar, 1980].
We present scaling results from these solvers for 3D spherical models. We also applied the above-mentioned method to a high-resolution (~1 km) 2D mantle convection model with temperature-, pressure- and phase-dependent rheology including several phase transitions. We focus on a model of a subducting lithospheric slab that is subject to strong folding at the bottom of the mantle's D" region, which includes the postperovskite phase boundary. For a detailed description of this model we refer to poster [Mantle convection models of the D" region, U17]. References: [Saad, 2003] Saad, Y. (2003). Iterative Methods for Sparse Linear Systems. [Sala, 2006] Sala, M. (2006). An Object-Oriented Framework for the Development of Scalable Parallel Multilevel Preconditioners. ACM Transactions on Mathematical Software, 32(3). [Patankar, 1980] Patankar, S. V. (1980). Numerical Heat Transfer and Fluid Flow. Hemisphere, Washington.
Fully-Implicit Reconstructed Discontinuous Galerkin Method for Stiff Multiphysics Problems

NASA Astrophysics Data System (ADS)

Nourgaliev, Robert

2015-11-01

A new reconstructed Discontinuous Galerkin (rDG) method, based on orthogonal basis/test functions, is developed for fluid flows on unstructured meshes. Orthogonality of basis functions is essential for enabling robust and efficient fully-implicit Newton-Krylov based time integration. The method is designed for generic partial differential equations, including transient, hyperbolic, parabolic or elliptic operators, which are attributed to many multiphysics problems. We demonstrate the method's capabilities for solving compressible fluid-solid systems (in the low Mach number limit), with phase change (melting/solidification), as motivated by applications in Additive Manufacturing. We focus on the method's accuracy (in both space and time), as well as robustness and solvability of the system of linear equations involved in the linearization steps of Newton-based methods. The performance of the developed method is investigated for highly-stiff problems with melting/solidification, emphasizing the advantages from tight coupling of mass, momentum and energy conservation equations, as well as orthogonality of basis functions, which leads to better conditioning of the underlying (approximate) Jacobian matrices, and rapid convergence of the Krylov-based linear solver. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344, and funded by the LDRD at LLNL under project tracking code 13-SI-002.
TADS: A CFD-based turbomachinery and analysis design system with GUI. Volume 2: User's manual

NASA Technical Reports Server (NTRS)

Myers, R. A.; Topp, D. A.; Delaney, R. A.

1995-01-01

The primary objective of this study was the development of a computational fluid dynamics (CFD) based turbomachinery airfoil analysis and design system, controlled by a graphical user interface (GUI). The computer codes resulting from this effort are referred to as the Turbomachinery Analysis and Design System (TADS). This document is intended to serve as a user's manual for the computer programs which comprise the TADS system. TADS couples a throughflow solver (ADPAC) with a quasi-3D blade-to-blade solver (RVCQ3D) in an interactive package. Throughflow analysis capability was developed in ADPAC through the addition of blade force and blockage terms to the governing equations. A GUI was developed to simplify user input and automate the many tasks required to perform turbomachinery analysis and design. The coupling of various programs was done in a way that alternative solvers or grid generators could be easily incorporated into the TADS framework.
Applying Reduced Generator Models in the Coarse Solver of Parareal in Time Parallel Power System Simulation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Duan, Nan; Dimitrovski, Aleksandar D; Simunovic, Srdjan

2016-01-01

The development of high-performance computing techniques and platforms has provided many opportunities for real-time or even faster-than-real-time implementation of power system simulations. One approach uses the Parareal in time framework. The Parareal algorithm has shown promising theoretical simulation speedups by temporally decomposing a simulation run into a coarse simulation on the entire simulation interval and fine simulations on sequential sub-intervals linked through the coarse simulation. However, it has been found that the time cost of the coarse solver needs to be reduced to fully exploit the potential of the Parareal algorithm. This paper studies a Parareal implementation using reduced generator models for the coarse solver and reports the testing results on the IEEE 39-bus system and a 327-generator 2383-bus Polish system model.
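The coarse/fine decomposition described in this abstract can be illustrated with a minimal Parareal sketch. This is not the authors' power system code: the scalar test equation, the explicit Euler propagators, and the names `euler` and `parareal` are assumptions chosen only to show the update U[n+1] = G(new U[n]) + F(old U[n]) - G(old U[n]).

```python
def euler(f, t0, u0, t1, steps):
    """Integrate du/dt = f(t, u) from t0 to t1 with `steps` explicit Euler steps."""
    h = (t1 - t0) / steps
    t, u = t0, u0
    for _ in range(steps):
        u += h * f(t, u)
        t += h
    return u


def parareal(f, u0, t_end, n_intervals, n_iters, fine_steps=100):
    """Return the Parareal iterate at the sub-interval boundaries."""
    ts = [t_end * i / n_intervals for i in range(n_intervals + 1)]

    def coarse(n, u):  # cheap propagator G: one Euler step per sub-interval
        return euler(f, ts[n], u, ts[n + 1], 1)

    def fine(n, u):  # expensive propagator F: many Euler steps per sub-interval
        return euler(f, ts[n], u, ts[n + 1], fine_steps)

    # Initial guess: one sequential coarse sweep over the whole interval.
    u = [u0]
    for n in range(n_intervals):
        u.append(coarse(n, u[n]))

    for _ in range(n_iters):
        # These solves use only the previous iterate, so each sub-interval
        # could run on its own processor (done serially here).
        f_old = [fine(n, u[n]) for n in range(n_intervals)]
        g_old = [coarse(n, u[n]) for n in range(n_intervals)]
        # Sequential correction sweep: only the coarse solver is on this path.
        new = [u0]
        for n in range(n_intervals):
            new.append(coarse(n, new[n]) + f_old[n] - g_old[n])
        u = new
    return u
```

Only the coarse propagator sits on the sequential path of each iteration, which is why reducing its cost (here it is a single Euler step; in the paper, a reduced generator model) matters for the achievable speedup.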
Study of dynamic fluid-structure coupling with application to human phonation

NASA Astrophysics Data System (ADS)

Saurabh, Shakti; Faber, Justin; Bodony, Daniel

2013-11-01

Two-dimensional direct numerical simulations of a compressible, viscous fluid interacting with a non-linear, viscoelastic solid are used to study the generation of the human voice. The vocal fold (VF) tissues are modeled using a finite-strain fractional derivative constitutive model implemented in a quadratic finite element code and coupled to a high-order compressible Navier-Stokes solver through a boundary-fitted fluid-solid interface. The viscoelastic solver is validated through in-house experiments using Agarose Gel, a human tissue simulant, undergoing static and harmonic deformation measured with load cell and optical diagnostics. The phonation simulations highlight the role tissue nonlinearity and viscosity play in the glottal jet dynamics and in the radiated sound. Supported by the National Science Foundation (CAREER award number 1150439).
Implementing High-Performance Geometric Multigrid Solver with Naturally Grained Messages

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shan, Hongzhang; Williams, Samuel; Zheng, Yili

2015-10-26

Structured-grid linear solvers often require manual packing and unpacking of communication data to achieve high performance. Orchestrating this process efficiently is challenging, labor-intensive, and potentially error-prone. In this paper, we explore an alternative approach that communicates the data with naturally grained message sizes without manual packing and unpacking. This approach is the distributed analogue of shared-memory programming, taking advantage of the global address space in PGAS languages to provide substantial programming ease. However, its performance may suffer from the large number of small messages. We investigate the runtime support required in the UPC++ library for this naturally grained version to close the performance gap between the two approaches and attain comparable performance at scale using the High-Performance Geometric Multigrid (HPGMG-FV) benchmark as a driver.
Establishing Approaches to Modeling the Ares I-X and Ares I Roll Control System with Free-stream Interaction

NASA Technical Reports Server (NTRS)

Pao, S. Paul; Deere, Karen A.; Abdol-Hamid, Khales S.

2011-01-01

Approaches were established for modeling the roll control system and analyzing the jet interactions of the activated roll control system on Ares-type configurations using the USM3D Navier-Stokes solver. Components of the modeling approach for the roll control system include a choice of turbulence models, a basis for computing a dynamic equivalence of the real gas rocket exhaust flow in terms of an ideal gas, and techniques to evaluate roll control system performance for wind tunnel and flight conditions. A simplified Ares I-X configuration was used during the development phase of the roll control system modeling approach. A limited set of Navier-Stokes solutions was obtained for the purposes of this investigation and highlights of the results are included in this paper. The USM3D solutions were compared to equivalent solutions at select flow conditions from a real gas Navier-Stokes solver (Loci-CHEM) and a structured overset grid Navier-Stokes solver (OVERFLOW).
A fast, preconditioned conjugate gradient Toeplitz solver

NASA Technical Reports Server (NTRS)

Pan, Victor; Schrieber, Robert

1989-01-01

A simple factorization is given of an arbitrary Hermitian, positive definite matrix in which the factors are well-conditioned, Hermitian, and positive definite. In fact, given knowledge of the extreme eigenvalues of the original matrix A, an optimal improvement can be achieved, making the condition number of each of the two factors equal to the square root of the condition number of A. This technique is applied to the solution of Hermitian, positive definite Toeplitz systems. Large linear systems with Hermitian, positive definite Toeplitz matrices arise in some signal processing applications. A stable fast algorithm is given for solving these systems that is based on the preconditioned conjugate gradient method. The algorithm exploits Toeplitz structure to reduce the cost of an iteration to O(n log n) by applying the fast Fourier transform to compute matrix-vector products. Matrix factorization is used as a preconditioner.
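The O(n log n) matrix-vector product mentioned in this abstract can be sketched by embedding the Toeplitz matrix in a circulant matrix of twice the size and multiplying with FFTs. The sketch below is an illustration under stated assumptions, not the authors' algorithm: it handles only real symmetric Toeplitz matrices whose size is a power of two, it uses plain (unpreconditioned) conjugate gradients and so omits the factorization preconditioner of the paper, and the names `fft`, `toeplitz_matvec` and `cg_toeplitz` are made up for this example.

```python
import cmath


def fft(a, invert=False):
    """Recursive radix-2 FFT (unnormalized); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def toeplitz_matvec(t, x):
    """y = T x for the symmetric Toeplitz matrix T with first column t."""
    n = len(t)
    # First column of a 2n-by-2n circulant matrix whose top-left block is T.
    c = list(t) + [0.0] + list(reversed(t[1:]))
    xp = list(x) + [0.0] * n  # zero-pad x to length 2n
    prod = [a * b for a, b in zip(fft(c), fft(xp))]
    y = fft(prod, invert=True)
    return [v.real / (2 * n) for v in y[:n]]


def cg_toeplitz(t, b, tol=1e-10, max_iter=200):
    """Plain conjugate gradients for T x = b using the FFT-based product."""
    x = [0.0] * len(b)
    r = list(b)
    p = list(r)
    rs = sum(v * v for v in r)
    for _ in range(max_iter):
        if rs ** 0.5 < tol:
            break
        ap = toeplitz_matvec(t, p)
        alpha = rs / sum(pi * ai for pi, ai in zip(p, ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ai for ri, ai in zip(r, ap)]
        rs_new = sum(v * v for v in r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x
```

A preconditioner would enter this loop as one extra solve per iteration applied to the residual `r`; it is left out here to keep the FFT-based product in focus.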
A medical image-based graphical platform -- features, applications and relevance for brachytherapy.

PubMed

Fonseca, Gabriel P; Reniers, Brigitte; Landry, Guillaume; White, Shane; Bellezzo, Murillo; Antunes, Paula C G; de Sales, Camila P; Welteman, Eduardo; Yoriyaz, Hélio; Verhaegen, Frank

2014-01-01

Brachytherapy dose calculation is commonly performed using the Task Group-No 43 Report-Updated protocol (TG-43U1) formalism. Recently, a more accurate approach has been proposed that can handle tissue composition, tissue density, body shape, applicator geometry, and dose reporting either in media or water. Some model-based dose calculation algorithms are based on Monte Carlo (MC) simulations. This work presents a software platform capable of processing medical images and treatment plans, and preparing the required input data for MC simulations. The A Medical Image-based Graphical platfOrm-Brachytherapy module (AMIGOBrachy) is a user interface, coupled to the MCNP6 MC code, for absorbed dose calculations. The AMIGOBrachy was first validated in water for a high-dose-rate (192)Ir source. Next, dose distributions were validated in uniform phantoms consisting of different materials. Finally, dose distributions were obtained in patient geometries. Results were compared against a treatment planning system including a linear Boltzmann transport equation (LBTE) solver capable of handling nonwater heterogeneities. The TG-43U1 source parameters are in good agreement with literature with more than 90% of anisotropy values within 1%. No significant dependence on the tissue composition was observed comparing MC results against an LBTE solver. Clinical cases showed differences up to 25%, when comparing MC results against TG-43U1. About 92% of the voxels exhibited dose differences lower than 2% when comparing MC results against an LBTE solver. The AMIGOBrachy can improve the accuracy of the TG-43U1 dose calculation by using a more accurate MC dose calculation algorithm. The AMIGOBrachy can be incorporated in clinical practice via a user-friendly graphical interface. Copyright © 2014 American Brachytherapy Society. Published by Elsevier Inc. All rights reserved.
Algebraic multigrid preconditioners for two-phase flow in porous media with phase transitions

NASA Astrophysics Data System (ADS)

Bui, Quan M.; Wang, Lu; Osei-Kuffuor, Daniel

2018-04-01

Multiphase flow is a critical process in a wide range of applications, including oil and gas recovery, carbon sequestration, and contaminant remediation. Numerical simulation of multiphase flow requires solving of a large, sparse linear system resulting from the discretization of the partial differential equations modeling the flow. In the case of multiphase multicomponent flow with miscible effect, this is a very challenging task. The problem becomes even more difficult if phase transitions are taken into account. A new approach to handle phase transitions is to formulate the system as a nonlinear complementarity problem (NCP). Unlike in the primary variable switching technique, the set of primary variables in this approach is fixed even when there is phase transition. Not only does this improve the robustness of the nonlinear solver, it opens up the possibility to use multigrid methods to solve the resulting linear system. The disadvantage of the complementarity approach, however, is that when a phase disappears, the linear system has the structure of a saddle point problem and becomes indefinite, and current algebraic multigrid (AMG) algorithms cannot be applied directly. In this study, we explore the effectiveness of a new multilevel strategy, based on the multigrid reduction technique, to deal with problems of this type. We demonstrate the effectiveness of the method through numerical results for the case of two-phase, two-component flow with phase appearance/disappearance. We also show that the strategy is efficient and scales optimally with problem size.
HST3D; a computer code for simulation of heat and solute transport in three-dimensional ground-water flow systems

USGS Publications Warehouse

Kipp, K.L.

1987-01-01

The Heat- and Solute-Transport Program (HST3D) simulates groundwater flow and associated heat and solute transport in three dimensions. The three governing equations are coupled through the interstitial pore velocity, the dependence of the fluid density on pressure, temperature, and the solute-mass fraction, and the dependence of the fluid viscosity on temperature and solute-mass fraction. The solute transport equation is for only a single solute species with possible linear equilibrium sorption and linear decay. Finite difference techniques are used to discretize the governing equations using a point-distributed grid. The flow-, heat- and solute-transport equations are solved, in turn, after a partial Gauss-reduction scheme is used to modify them. The modified equations are more tightly coupled and have better stability for the numerical solutions. The basic source-sink term represents wells. A complex well flow model may be used to simulate specified flow rate and pressure conditions at the land surface or within the aquifer, with or without pressure and flow rate constraints. Boundary condition types offered include specified value, specified flux, leakage, heat conduction, an approximate free surface, and two types of aquifer influence functions. All boundary conditions can be functions of time. Two techniques are available for solution of the finite difference matrix equations. One technique is a direct-elimination solver, using equations reordered by alternating diagonal planes. The other technique is an iterative solver, using two-line successive over-relaxation. A restart option is available for storing intermediate results and restarting the simulation at an intermediate time with modified boundary conditions. This feature also can be used as protection against computer system failure. Data input and output may be in metric (SI) units or inch-pound units.
Output may include tables of dependent variables and parameters, zoned-contour maps, and plots of the dependent variables versus time. (Lantz-PTT)
Knowledge-based design of generate-and-patch problem solvers that solve global resource assignment problems

NASA Technical Reports Server (NTRS)

Voigt, Kerstin

1992-01-01

We present MENDER, a knowledge-based system that implements software design techniques that are specialized to automatically compile generate-and-patch problem solvers that satisfy global resource assignment problems. We provide empirical evidence of the superior performance of generate-and-patch over generate-and-test, even with constrained generation, for a global constraint in the domain of '2D-floorplanning'. For a second constraint in '2D-floorplanning' we show that even when it is possible to incorporate the constraint into a constrained generator, a generate-and-patch problem solver may satisfy the constraint more rapidly. We also briefly summarize how an extended version of our system applies to a constraint in the domain of 'multiprocessor scheduling'.
A purely Lagrangian method for computing linearly-perturbed flows in spherical geometry

NASA Astrophysics Data System (ADS)

Jaouen, Stéphane

2007-07-01

In many physical applications, one wishes to control the development of multi-dimensional instabilities around a one-dimensional (1D) complex flow. For predicting the growth rates of these perturbations, a viable general numerical approach consists in solving simultaneously the one-dimensional equations and their linearized form for three-dimensional perturbations. In Clarisse et al. [J.-M. Clarisse, S. Jaouen, P.-A. Raviart, A Godunov-type method in Lagrangian coordinates for computing linearly-perturbed planar-symmetric flows of gas dynamics, J. Comp. Phys. 198 (2004) 80-105], a class of Godunov-type schemes for planar-symmetric flows of gas dynamics has been proposed. Pursuing this effort, we extend these results to spherically symmetric flows. A new method to derive the Lagrangian perturbation equations, based on the canonical form of systems of conservation laws with zero entropy flux [B. Després, Lagrangian systems of conservation laws. Invariance properties of Lagrangian systems of conservation laws, approximate Riemann solvers and the entropy condition, Numer. Math. 89 (2001) 99-134; B. Després, C. Mazeran, Lagrangian gas dynamics in two dimensions and Lagrangian systems, Arch. Rational Mech. Anal. 178 (2005) 327-372] is also described. It leads to several advantages. First, many physical problems we are interested in fit this formalism (gas dynamics, two-temperature plasma equations, ideal magnetohydrodynamics, etc.), whatever the geometry. Second, a class of numerical entropic schemes is available for the basic flow [11]. Last, linearizing and devising numerical schemes for the perturbed flow is straightforward. The numerical capabilities of these methods are illustrated on three test cases of increasing difficulty, and we show that, owing to its simplicity and its low computational cost, the Linear Perturbations Code (LPC) is a powerful tool to understand and predict the development of hydrodynamic instabilities in the linear regime.
TransCut: interactive rendering of translucent cutouts.

PubMed

Li, Dongping; Sun, Xin; Ren, Zhong; Lin, Stephen; Tong, Yiying; Guo, Baining; Zhou, Kun

2013-03-01

We present TransCut, a technique for interactive rendering of translucent objects undergoing fracturing and cutting operations. As the object is fractured or cut open, the user can directly examine and intuitively understand the complex translucent interior, as well as edit material properties through painting on cross sections and recombining the broken pieces, all with immediate and realistic visual feedback. This new mode of interaction with translucent volumes is made possible with two technical contributions. The first is a novel solver for the diffusion equation (DE) over a tetrahedral mesh that produces high-quality results comparable to the state-of-the-art finite element method (FEM) of Arbree et al. but at substantially higher speeds. This accuracy and efficiency are obtained by computing the discrete divergences of the diffusion equation and constructing the DE matrix using analytic formulas derived for linear finite elements. The second contribution is a multiresolution algorithm to significantly accelerate our DE solver while adapting to the frequent changes in topological structure of dynamic objects. The entire multiresolution DE solver is highly parallel and easily implemented on the GPU. We believe TransCut provides a novel visual effect for heterogeneous translucent objects undergoing fracturing and cutting operations.
Parameter investigation with line-implicit lower-upper symmetric Gauss-Seidel on 3D stretched grids

NASA Astrophysics Data System (ADS)

Otero, Evelyn; Eliasson, Peter

2015-03-01

An implicit lower-upper symmetric Gauss-Seidel (LU-SGS) solver has been implemented as a multigrid smoother combined with a line-implicit method as an acceleration technique for Reynolds-averaged Navier-Stokes (RANS) simulation on stretched meshes. The computational fluid dynamics code concerned is Edge, an edge-based finite volume Navier-Stokes flow solver for structured and unstructured grids. The paper focuses on the investigation of the parameters related to our novel line-implicit LU-SGS solver for convergence acceleration on 3D RANS meshes. The LU-SGS parameters are defined as the Courant-Friedrichs-Lewy number, the left-hand side dissipation, and the convergence of iterative solution of the linear problem arising from the linearisation of the implicit scheme. The influence of these parameters on the overall convergence is presented and default values are defined for maximum convergence acceleration. The optimised settings are applied to 3D RANS computations for comparison with explicit and line-implicit Runge-Kutta smoothing. For most of the cases, a computing time acceleration of the order of 2 is found depending on the mesh type, namely the boundary layer and the magnitude of residual reduction.
A fast direct solver for boundary value problems on locally perturbed geometries

NASA Astrophysics Data System (ADS)

Zhang, Yabin; Gillman, Adrianna

2018-03-01

Many applications, including optimal design and adaptive discretization techniques, involve solving several boundary value problems on geometries that are local perturbations of an original geometry. This manuscript presents a fast direct solver for boundary value problems that are recast as boundary integral equations. The idea is to write the discretized boundary integral equation on a new geometry as a low-rank update to the discretized problem on the original geometry. Using the Sherman-Morrison formula, the inverse can be expressed in terms of the inverse of the original system applied to the low-rank factors and the right-hand side. Numerical results illustrate that, for problems where the perturbation is localized, the fast direct solver is three times faster than building a new solver from scratch.
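The update idea in this abstract can be sketched in its simplest, rank-one form: for a perturbed system (A + u vᵀ) x = b, the Sherman-Morrison formula gives x = A⁻¹b − A⁻¹u (vᵀA⁻¹b) / (1 + vᵀA⁻¹u), so only solves with the original A are needed. The code below is a small dense illustration, not the authors' boundary integral solver; `solve_dense` is a hypothetical stand-in for an already-built fast direct solver of the original problem.

```python
def solve_dense(A, b):
    """Gaussian elimination with partial pivoting (stand-in for a stored solver of A)."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix [A | b]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            m = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= m * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def sherman_morrison_solve(solve_A, u, v, b):
    """Solve (A + u v^T) x = b using only solves with the original A."""
    y = solve_A(b)  # A^{-1} b
    z = solve_A(u)  # A^{-1} u
    dot = lambda a, c: sum(ai * ci for ai, ci in zip(a, c))
    denom = 1.0 + dot(v, z)  # must be nonzero for the update to be valid
    scale = dot(v, y) / denom
    return [yi - scale * zi for yi, zi in zip(y, z)]


# Original system and a rank-one perturbation that changes entry (0, 2).
A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
u = [1.0, 0.0, 0.0]
v = [0.0, 0.0, 0.5]
b = [1.0, 2.0, 3.0]
x_updated = sherman_morrison_solve(lambda rhs: solve_dense(A, rhs), u, v, b)
```

For a rank-k perturbation the same pattern applies with k columns in place of `u` and a small k-by-k system in place of the scalar `denom`.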
Circuit-based versus full-wave modelling of active microwave circuits

NASA Astrophysics Data System (ADS)

Bukvić, Branko; Ilić, Andjelija Ž.; Ilić, Milan M.

2018-03-01

Modern full-wave computational tools enable rigorous simulations of linear parts of complex microwave circuits within minutes, taking into account all physical electromagnetic (EM) phenomena. Non-linear components and other discrete elements of the hybrid microwave circuit are then easily added within the circuit simulator. This combined full-wave and circuit-based analysis is a must in the final stages of the circuit design, although initial designs and optimisations are still faster and more comfortably done completely in the circuit-based environment, which offers real-time solutions at the expense of accuracy. However, due to insufficient information and a general lack of specific case studies, practitioners still struggle when choosing an appropriate analysis method, or a component model, because different choices lead to different solutions, often with uncertain accuracy and unexplained discrepancies arising between the simulations and measurements. Here we design a reconfigurable power amplifier, as a case study, using both a circuit-based solver and a full-wave EM solver. We compare numerical simulations with measurements on the manufactured prototypes, discussing the obtained differences, pointing out the importance of de-embedding the measured parameters and of appropriate modelling of discrete components, and giving specific recipes for good modelling practices.
Monte Carlo simulation of parameter confidence intervals for non-linear regression analysis of biological data using Microsoft Excel.

PubMed

Lambert, Ronald J W; Mytilinaios, Ioannis; Maitland, Luke; Brown, Angus M

2012-08-01

This study describes a method to obtain parameter confidence intervals from the fitting of non-linear functions to experimental data, using the SOLVER and Analysis ToolPak add-ins of the Microsoft Excel spreadsheet. Previously we have shown that Excel can fit complex multiple functions to biological data, obtaining values equivalent to those returned by more specialized statistical or mathematical software. However, a disadvantage of the Excel method was its inability to return confidence intervals for the computed parameters or the correlations between them. Using a simple Monte Carlo procedure within the Excel spreadsheet (without recourse to programming), SOLVER can provide parameter estimates (up to 200 at a time) for multiple 'virtual' data sets, from which the required confidence intervals and correlation coefficients can be obtained. The general utility of the method is exemplified by applying it to the analysis of the growth of Listeria monocytogenes, the growth inhibition of Pseudomonas aeruginosa by chlorhexidine and the further analysis of the electrophysiological data from the compound action potential of the rodent optic nerve. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
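The 'virtual data set' procedure described here can be sketched outside a spreadsheet as follows. Everything specific in the sketch is an assumption made for illustration: the one-parameter model y = exp(-k x), the golden-section search standing in for Excel's SOLVER, Gaussian noise scaled by the residual standard deviation, and the names `fit_k` and `monte_carlo_ci`. The abstract itself does not specify how the virtual data sets are generated.

```python
import math
import random


def model(k, x):
    return math.exp(-k * x)


def sse(k, xs, ys):
    """Sum of squared residuals for parameter k."""
    return sum((y - model(k, x)) ** 2 for x, y in zip(xs, ys))


def fit_k(xs, ys, lo=0.0, hi=5.0, iters=60):
    """Least-squares fit of k by golden-section search on [lo, hi]."""
    g = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    for _ in range(iters):
        c, d = b - g * (b - a), a + g * (b - a)
        if sse(c, xs, ys) < sse(d, xs, ys):
            b = d
        else:
            a = c
    return (a + b) / 2


def monte_carlo_ci(xs, ys, n_sets=200, level=0.95, seed=0):
    """Return (k_hat, lower, upper) from refits to simulated data sets."""
    rng = random.Random(seed)
    k_hat = fit_k(xs, ys)
    # Residual standard deviation with one fitted parameter.
    sd = math.sqrt(sse(k_hat, xs, ys) / (len(xs) - 1))
    fits = []
    for _ in range(n_sets):
        # Virtual data set: fitted curve plus fresh Gaussian noise.
        virtual = [model(k_hat, x) + rng.gauss(0.0, sd) for x in xs]
        fits.append(fit_k(xs, virtual))
    fits.sort()
    tail = (1 - level) / 2
    lower = fits[int(tail * n_sets)]
    upper = fits[int((1 - tail) * n_sets) - 1]
    return k_hat, lower, upper
```

With more than one parameter, the same list of refitted estimates also yields the correlation coefficients between parameters mentioned in the abstract.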
Fast inverse scattering solutions using the distorted Born iterative method and the multilevel fast multipole algorithm

PubMed Central

Hesford, Andrew J.; Chew, Weng C.

2010-01-01

The distorted Born iterative method (DBIM) computes iterative solutions to nonlinear inverse scattering problems through successive linear approximations. By decomposing the scattered field into a superposition of scattering by an inhomogeneous background and by a material perturbation, large or high-contrast variations in medium properties can be imaged through iterations that are each subject to the distorted Born approximation. However, the need to repeatedly compute forward solutions still imposes a very heavy computational burden. To ameliorate this problem, the multilevel fast multipole algorithm (MLFMA) has been applied as a forward solver within the DBIM. The MLFMA computes forward solutions in linear time for volumetric scatterers. The typically regular distribution and shape of scattering elements in the inverse scattering problem allow the method to take advantage of data redundancy and reduce the computational demands of the normally expensive MLFMA setup. Additional benefits are gained by employing Kaczmarz-like iterations, where partial measurements are used to accelerate convergence. Numerical results demonstrate both the efficiency of the forward solver and the successful application of the inverse method to imaging problems with dimensions in the neighborhood of ten wavelengths. PMID:20707438

Verification of continuum drift kinetic equation solvers in NIMROD

DOE Office of Scientific and Technical Information (OSTI.GOV)

Held, E. D.; Ji, J.-Y.; Kruger, S. E.

Verification of continuum solutions to the electron and ion drift kinetic equations (DKEs) in NIMROD [C. R. Sovinec et al., J. Comp. Phys. 195, 355 (2004)] is demonstrated through comparison with several neoclassical transport codes, most notably NEO [E. A. Belli and J. Candy, Plasma Phys. Controlled Fusion 54, 015015 (2012)]. The DKE solutions use NIMROD's spatial representation, 2D finite elements in the poloidal plane and a 1D Fourier expansion in toroidal angle. For 2D velocity space, a novel 1D expansion in finite elements is applied for the pitch angle dependence and a collocation grid is used for the normalized speed coordinate. The full, linearized Coulomb collision operator is kept and shown to be important for obtaining quantitative results. Bootstrap currents, parallel ion flows, and radial particle and heat fluxes show quantitative agreement between NIMROD and NEO for a variety of tokamak equilibria. In addition, velocity space distribution function contours for ions and electrons show nearly identical detailed structure and agree quantitatively. A Θ-centered, implicit time discretization and a block-preconditioned, iterative linear algebra solver provide efficient electron and ion DKE solutions that ultimately will be used to obtain closures for NIMROD's evolving fluid model.
Geopotential Error Analysis from Satellite Gradiometer and Global Positioning System Observables on Parallel Architecture

NASA Technical Reports Server (NTRS)

Schutz, Bob E.; Baker, Gregory A.

1997-01-01

The recovery of a high resolution geopotential from satellite gradiometer observations motivates the examination of high performance computational techniques. The primary subject matter addresses specifically the use of satellite gradiometer and GPS observations to form and invert the normal matrix associated with a large degree and order geopotential solution. Memory resident and out-of-core parallel linear algebra techniques along with data parallel batch algorithms form the foundation of the least squares application structure. A secondary topic includes the adoption of object oriented programming techniques to enhance modularity and reusability of code. Applications implementing the parallel and object oriented methods successfully calculate the degree variance for a degree and order 110 geopotential solution on 32 processors of the Cray T3E. The memory resident gradiometer application exhibits an overall application performance of 5.4 Gflops, and the out-of-core linear solver exhibits an overall performance of 2.4 Gflops. The combination solution derived from a sun synchronous gradiometer orbit produces average geoid height variances of 17 millimeters.
Explicit methods in extended phase space for inseparable Hamiltonian problems

NASA Astrophysics Data System (ADS)

Pihajoki, Pauli

2015-03-01

We present a method for explicit leapfrog integration of inseparable Hamiltonian systems by means of an extended phase space. A suitably defined new Hamiltonian on the extended phase space leads to equations of motion that can be numerically integrated by standard symplectic leapfrog (splitting) methods. When the leapfrog is combined with coordinate mixing transformations, the resulting algorithm shows good long term stability and error behaviour. We extend the method to non-Hamiltonian problems as well, and investigate optimal methods of projecting the extended phase space back to original dimension. Finally, we apply the methods to a Hamiltonian problem of geodesics in a curved space, and a non-Hamiltonian problem of a forced non-linear oscillator. We compare the performance of the methods to a general purpose differential equation solver LSODE, and the implicit midpoint method, a symplectic one-step method. We find the extended phase space methods to compare favorably to both for the Hamiltonian problem, and to the implicit midpoint method in the case of the non-linear oscillator.
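A minimal sketch of the extended phase space idea, assuming the extended Hamiltonian is split as H(q, p̃) + H(q̃, p) with an auxiliary copy (q̃, p̃) of the state: each half of the split then has an exact, explicit flow even when H itself is inseparable. The function name `extended_leapfrog` and the example Hamiltonian are hypothetical choices for illustration, and the sketch omits the coordinate mixing transformations and projection methods that the abstract reports as important for long-term behaviour.

```python
import numpy as np

def extended_leapfrog(dH_dq, dH_dp, q0, p0, h, n_steps):
    """Explicit leapfrog on the extended phase space (q, p, qt, pt).

    Assumed splitting: H_A = H(q, pt) and H_B = H(qt, p).
    dH_dq(q, p) and dH_dp(q, p) are the partial derivatives of H.
    """
    q, p = float(q0), float(p0)
    qt, pt = q, p  # the auxiliary copy starts equal to the original state
    for _ in range(n_steps):
        # Half step with H_A: q and pt are constant, so qt and p move exactly.
        qt += 0.5 * h * dH_dp(q, pt)
        p -= 0.5 * h * dH_dq(q, pt)
        # Full step with H_B: qt and p are constant, so q and pt move exactly.
        q += h * dH_dp(qt, p)
        pt -= h * dH_dq(qt, p)
        # Second half step with H_A.
        qt += 0.5 * h * dH_dp(q, pt)
        p -= 0.5 * h * dH_dq(q, pt)
    return q, p, qt, pt

# Hypothetical inseparable test Hamiltonian: H = p^2 (1 + q^2) / 2 + q^2 / 2.
def H(q, p):
    return 0.5 * p**2 * (1.0 + q**2) + 0.5 * q**2

def dH_dq(q, p):
    return q * p**2 + q

def dH_dp(q, p):
    return p * (1.0 + q**2)

q, p, qt, pt = extended_leapfrog(dH_dq, dH_dp, q0=1.0, p0=0.0, h=0.01, n_steps=1000)
print("energy error:", abs(H(q, p) - H(1.0, 0.0)))
print("copy separation:", abs(q - qt), abs(p - pt))
```

The separation between the two copies is worth monitoring: without some form of mixing, nothing in this sketch keeps (q, p) and (q̃, p̃) close over long integrations.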
Geopotential error analysis from satellite gradiometer and global positioning system observables on parallel architectures

NASA Astrophysics Data System (ADS)

Baker, Gregory Allen

The recovery of a high resolution geopotential from satellite gradiometer observations motivates the examination of high performance computational techniques. The primary subject matter addresses specifically the use of satellite gradiometer and GPS observations to form and invert the normal matrix associated with a large degree and order geopotential solution. Memory resident and out-of-core parallel linear algebra techniques along with data parallel batch algorithms form the foundation of the least squares application structure. A secondary topic includes the adoption of object oriented programming techniques to enhance modularity and reusability of code. Applications implementing the parallel and object oriented methods successfully calculate the degree variance for a degree and order 110 geopotential solution on 32 processors of the Cray T3E. The memory resident gradiometer application exhibits an overall application performance of 5.4 Gflops, and the out-of-core linear solver exhibits an overall performance of 2.4 Gflops. The combination solution derived from a sun synchronous gradiometer orbit produces average geoid height variances of 17 millimeters.
BOOK REVIEW: Advanced Topics in Computational Partial Differential Equations: Numerical Methods and Diffpack Programming

NASA Astrophysics Data System (ADS)

Katsaounis, T. D.

2005-02-01

The scope of this book is to present well known simple and advanced numerical methods for solving partial differential equations (PDEs) and how to implement these methods using the programming environment of the software package Diffpack. A basic background in PDEs and numerical methods is required by the potential reader. Further, a basic knowledge of the finite element method and its implementation in one and two space dimensions is required. The authors claim that no prior knowledge of the package Diffpack is required, which is true, but the reader should be at least familiar with an object oriented programming language like C++ in order to better comprehend the programming environment of Diffpack. Certainly, a prior knowledge or usage of Diffpack would be a great advantage to the reader. The book consists of 15 chapters, each one written by one or more authors. Each chapter is basically divided into two parts: the first part is about mathematical models described by PDEs and numerical methods to solve these models and the second part describes how to implement the numerical methods using the programming environment of Diffpack. Each chapter closes with a list of references on its subject. The first nine chapters cover well known numerical methods for solving the basic types of PDEs. Further, programming techniques on the serial as well as on the parallel implementation of numerical methods are also included in these chapters. The last five chapters are dedicated to applications, modelled by PDEs, in a variety of fields. The first chapter is an introduction to parallel processing. It covers fundamentals of parallel processing in a simple and concrete way and no prior knowledge of the subject is required. Examples of parallel implementation of basic linear algebra operations are presented using the Message Passing Interface (MPI) programming environment. Here, some knowledge of MPI routines is required by the reader. 
Examples solving in parallel simple PDEs using Diffpack and MPI are also presented. Chapter 2 presents the overlapping domain decomposition method for solving PDEs. It is well known that these methods are suitable for parallel processing. The first part of the chapter covers the mathematical formulation of the method as well as algorithmic and implementational issues. The second part presents a serial and a parallel implementational framework within the programming environment of Diffpack. The chapter closes by showing how to solve two application examples with the overlapping domain decomposition method using Diffpack. Chapter 3 is a tutorial about how to incorporate the multigrid solver in Diffpack. The method is illustrated by examples such as a Poisson solver, a general elliptic problem with various types of boundary conditions and a nonlinear Poisson type problem. In chapter 4 the mixed finite element is introduced. Technical issues concerning the practical implementation of the method are also presented. The main difficulties of the efficient implementation of the method, especially in two and three space dimensions on unstructured grids, are presented and addressed in the framework of Diffpack. The implementational process is illustrated by two examples, namely the system formulation of the Poisson problem and the Stokes problem. Chapter 5 is closely related to chapter 4 and addresses the problem of how to solve efficiently the linear systems arising by the application of the mixed finite element method. The proposed method is block preconditioning. Efficient techniques for implementing the method within Diffpack are presented. Optimal block preconditioners are used to solve the system formulation of the Poisson problem, the Stokes problem and the bidomain model for the electrical activity in the heart. The subject of chapter 6 is systems of PDEs. Linear and nonlinear systems are discussed. Fully implicit and operator splitting methods are presented. 
Special attention is paid to how existing solvers for scalar equations in Diffpack can be used to derive fully implicit solvers for systems. The proposed techniques are illustrated in terms of two applications, namely a system of PDEs modelling pipeflow and a two-phase porous media flow. Stochastic PDEs is the topic of chapter 7. The first part of the chapter is a simple introduction to stochastic PDEs; basic analytical properties are presented for simple models like transport phenomena and viscous drag forces. The second part considers the numerical solution of stochastic PDEs. Two basic techniques are presented, namely Monte Carlo and perturbation methods. The last part explains how to implement and incorporate these solvers into Diffpack. Chapter 8 describes how to operate Diffpack from Python scripts. The main goal here is to provide all the programming and technical details in order to glue the programming environment of Diffpack with visualization packages through Python and in general take advantage of the Python interfaces. Chapter 9 attempts to show how to use numerical experiments to measure the performance of various PDE solvers. The authors gathered a rather impressive list, a total of 14 PDE solvers. Solvers for problems like Poisson, Navier--Stokes, elasticity, two-phase flows and methods such as finite difference, finite element, multigrid, and gradient type methods are presented. The authors provide a series of numerical results combining various solvers with various methods in order to gain insight into their computational performance and efficiency. In Chapter 10 the authors consider a computationally challenging problem, namely the computation of the electrical activity of the human heart. After a brief introduction on the biology of the problem the authors present the mathematical models involved and a numerical method for solving them within the framework of Diffpack. 
Chapters 11 and 12 are closely related; actually they could have been combined in a single chapter. Chapter 11 introduces several mathematical models used in finance, based on the Black--Scholes equation. Chapter 12 considers several numerical methods like Monte Carlo, lattice methods, finite difference and finite element methods. Implementation of these methods within Diffpack is presented in the last part of the chapter. Chapter 13 presents how the finite element method is used for the modelling and analysis of elastic structures. The authors describe the structural elements of Diffpack which include popular elements such as beams and plates and examples are presented on how to use them to simulate elastic structures. Chapter 14 describes an application problem, namely the extrusion of aluminum. This is a rather complicated process which involves non-Newtonian flow, heat transfer and elasticity. The authors describe the systems of PDEs modelling the underlying process and use a finite element method to obtain a numerical solution. The implementation of the numerical method in Diffpack is presented along with some applications. The last chapter, chapter 15, focuses on mathematical and numerical models of systems of PDEs governing geological processes in sedimentary basins. The underlying mathematical model is solved using the finite element method within a fully implicit scheme. The authors discuss the implementational issues involved within Diffpack and they present results from several examples. In summary, the book focuses on the computational and implementational issues involved in solving partial differential equations. The potential reader should have a basic knowledge of PDEs and the finite difference and finite element methods. The examples presented are solved within the programming framework of Diffpack and the reader should have prior experience with the particular software in order to take full advantage of the book. 
Overall the book is well written, the subject of each chapter is well presented and can serve as a reference for graduate students, researchers and engineers who are interested in the numerical solution of partial differential equations modelling various applications.
Validation of High-Fidelity CFD/CAA Framework for Launch Vehicle Acoustic Environment Simulation against Scale Model Test Data

NASA Technical Reports Server (NTRS)

Liever, Peter A.; West, Jeffrey S.; Harris, Robert E.

2016-01-01

A hybrid Computational Fluid Dynamics and Computational Aero-Acoustics (CFD/CAA) modeling framework has been developed for launch vehicle liftoff acoustic environment predictions. The framework couples the existing highly-scalable NASA production CFD code, Loci/CHEM, with a high-order accurate Discontinuous Galerkin solver developed in the same production framework, Loci/THRUST, to accurately resolve and propagate acoustic physics across the entire launch environment. Time-accurate, Hybrid RANS/LES CFD modeling is applied for predicting the acoustic generation physics at the plume source, and a high-order accurate unstructured mesh Discontinuous Galerkin (DG) method is employed to propagate acoustic waves away from the source across large distances using high-order accurate schemes. The DG solver is capable of solving 2nd, 3rd, and 4th order Euler solutions for non-linear, conservative acoustic field propagation. Initial application testing and validation has been carried out against high resolution acoustic data from the Ares Scale Model Acoustic Test (ASMAT) series to evaluate the capabilities and production readiness of the CFD/CAA system to resolve the observed spectrum of acoustic frequency content. This paper presents results from this validation and outlines efforts to mature and improve the computational simulation framework.
A Computational Model for Path Loss in Wireless Sensor Networks in Orchard Environments

PubMed Central

Anastassiu, Hristos T.; Vougioukas, Stavros; Fronimos, Theodoros; Regen, Christian; Petrou, Loukas; Zude, Manuela; Käthner, Jana

2014-01-01

A computational model for radio wave propagation through tree orchards is presented. Trees are modeled as collections of branches, geometrically approximated by cylinders, whose dimensions are determined on the basis of measurements in a cherry orchard. Tree canopies are modeled as dielectric spheres of appropriate size. A single row of trees was modeled by creating copies of a representative tree model positioned on top of a rectangular, lossy dielectric slab that simulated the ground. The complete scattering model, including soil and trees, enhanced by periodicity conditions corresponding to the array, was characterized via a commercial computational software tool for simulating the wave propagation by means of the Finite Element Method. The attenuation of the simulated signal was compared to measurements taken in the cherry orchard, using two ZigBee receiver-transmitter modules. Near the top of the tree canopies (at 3 m), the predicted attenuation was close to the measured one, only slightly underestimated. However, at 1.5 m the solver underestimated the measured attenuation significantly, especially when leaves were present and as distances grew longer. This suggests that the effects of scattering from neighboring tree rows need to be incorporated into the model. However, complex geometries result in ill-conditioned linear systems that affect the solver's convergence. PMID:24625738
Coupling of Acoustic Cavitation with Dem-Based Particle Solvers for Modeling De-agglomeration of Particle Clusters in Liquid Metals

NASA Astrophysics Data System (ADS)

Manoylov, Anton; Lebon, Bruno; Djambazov, Georgi; Pericleous, Koulis

2017-11-01

The aerospace and automotive industries are seeking advanced materials with low weight yet high strength and durability. Aluminum and magnesium-based metal matrix composites with ceramic micro- and nano-reinforcements promise the desirable properties. However, larger surface-area-to-volume ratio in micro- and especially nanoparticles gives rise to van der Waals and adhesion forces that cause the particles to agglomerate in clusters. Such clusters lead to adverse effects on final properties, no longer acting as dislocation anchors but instead becoming defects. Also, agglomeration causes the particle distribution to become uneven, leading to inconsistent properties. To break up clusters, ultrasonic processing may be used via an immersed sonotrode, or alternatively via electromagnetic vibration. This paper combines a fundamental study of acoustic cavitation in liquid aluminum with a study of the interaction forces causing particles to agglomerate, as well as mechanisms of cluster breakup. A non-linear acoustic cavitation model utilizing pressure waves produced by an immersed horn is presented, and then applied to cavitation in liquid aluminum. Physical quantities related to fluid flow and quantities specific to the cavitation solver are passed to a discrete element method particles model. The coupled system is then used for a detailed study of clusters' breakup by cavitation.
SOME NEW FINITE DIFFERENCE METHODS FOR HELMHOLTZ EQUATIONS ON IRREGULAR DOMAINS OR WITH INTERFACES

PubMed Central

Wan, Xiaohai; Li, Zhilin

2012-01-01

Solving a Helmholtz equation Δu + λu = f efficiently is a challenge for many applications. For example, the core part of many efficient solvers for the incompressible Navier-Stokes equations is to solve one or several Helmholtz equations. In this paper, two new finite difference methods are proposed for solving Helmholtz equations on irregular domains, or with interfaces. For Helmholtz equations on irregular domains, the accuracy of the numerical solution obtained using the existing augmented immersed interface method (AIIM) may deteriorate when the magnitude of λ is large. In our new method, we use a level set function to extend the source term and the PDE to a larger domain before we apply the AIIM. For Helmholtz equations with interfaces, a new maximum principle preserving finite difference method is developed. The new method still uses the standard five-point stencil with modifications of the finite difference scheme at irregular grid points. The resulting coefficient matrix of the linear system of finite difference equations satisfies the sign property of the discrete maximum principle and can be solved efficiently using a multigrid solver. The finite difference method is also extended to handle temporal discretized equations where the solution coefficient λ is inversely proportional to the mesh size. PMID:22701346
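For orientation, the standard five-point stencil that the new methods modify at irregular grid points can be sketched on a plain unit square. This is a hedged illustration only: it assumes homogeneous Dirichlet boundary conditions, a regular domain with no interface, a sparse direct solve in place of multigrid, and a hypothetical function name `helmholtz_5pt`; it does not implement the AIIM or the maximum principle preserving scheme of the paper.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

def helmholtz_5pt(n, lam, f):
    """Solve Laplace(u) + lam * u = f on the unit square, u = 0 on the boundary.

    Uses n x n interior grid points and the standard five-point stencil.
    Assumes lam is not an eigenvalue shift that makes the matrix singular.
    """
    h = 1.0 / (n + 1)
    x = np.linspace(h, 1.0 - h, n)
    X, Y = np.meshgrid(x, x, indexing="ij")
    # 1D second-difference matrix; the 2D Laplacian is a Kronecker sum.
    T = sp.diags([1.0, -2.0, 1.0], [-1, 0, 1], shape=(n, n)) / h**2
    I = sp.identity(n)
    A = sp.kron(T, I) + sp.kron(I, T) + lam * sp.identity(n * n)
    u = spsolve(A.tocsc(), f(X, Y).ravel())
    return X, Y, u.reshape(n, n)

# Manufactured solution u = sin(pi x) sin(pi y), so f = (lam - 2 pi^2) u.
lam = -1.0
exact = lambda X, Y: np.sin(np.pi * X) * np.sin(np.pi * Y)
rhs = lambda X, Y: (lam - 2.0 * np.pi**2) * exact(X, Y)

X, Y, u = helmholtz_5pt(50, lam, rhs)
err = np.max(np.abs(u - exact(X, Y)))
print("max error:", err)
```

Halving the mesh size should reduce the error by roughly a factor of four, the second-order behaviour expected of this stencil on a smooth solution.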
SOME NEW FINITE DIFFERENCE METHODS FOR HELMHOLTZ EQUATIONS ON IRREGULAR DOMAINS OR WITH INTERFACES.

PubMed

Wan, Xiaohai; Li, Zhilin

2012-06-01

Solving a Helmholtz equation Δu + λu = f efficiently is a challenge for many applications. For example, the core part of many efficient solvers for the incompressible Navier-Stokes equations is to solve one or several Helmholtz equations. In this paper, two new finite difference methods are proposed for solving Helmholtz equations on irregular domains, or with interfaces. For Helmholtz equations on irregular domains, the accuracy of the numerical solution obtained using the existing augmented immersed interface method (AIIM) may deteriorate when the magnitude of λ is large. In our new method, we use a level set function to extend the source term and the PDE to a larger domain before we apply the AIIM. For Helmholtz equations with interfaces, a new maximum principle preserving finite difference method is developed. The new method still uses the standard five-point stencil with modifications of the finite difference scheme at irregular grid points. The resulting coefficient matrix of the linear system of finite difference equations satisfies the sign property of the discrete maximum principle and can be solved efficiently using a multigrid solver. The finite difference method is also extended to handle temporal discretized equations where the solution coefficient λ is inversely proportional to the mesh size.
Fluid-acoustic interactions and their impact on pathological voiced speech

NASA Astrophysics Data System (ADS)

Erath, Byron D.; Zanartu, Matias; Peterson, Sean D.; Plesniak, Michael W.

2011-11-01

Voiced speech is produced by vibration of the vocal fold structures. Vocal fold dynamics arise from aerodynamic pressure loadings, tissue properties, and acoustic modulation of the driving pressures. Recent speech science advancements have produced a physiologically-realistic fluid flow solver (BLEAP) capable of prescribing asymmetric intraglottal flow attachment that can be easily assimilated into reduced order models of speech. The BLEAP flow solver is extended to incorporate acoustic loading and sound propagation in the vocal tract by implementing a wave reflection analog approach for sound propagation based on the governing BLEAP equations. This enhanced physiological description of the physics of voiced speech is implemented into a two-mass model of speech. The impact of fluid-acoustic interactions on vocal fold dynamics is elucidated for both normal and pathological speech through linear and nonlinear analysis techniques. Supported by NSF Grant CBET-1036280.
Impulse propagation over a complex site: a comparison of experimental results and numerical predictions.

PubMed

Dragna, Didier; Blanc-Benon, Philippe; Poisson, Franck

2014-03-01

Results from outdoor acoustic measurements performed at a railway site near Reims in France in May 2010 are compared to those obtained from a finite-difference time-domain solver of the linearized Euler equations. During the experiments, the ground profile and the different ground surface impedances were determined. Meteorological measurements were also performed to deduce mean vertical profiles of wind and temperature. An alarm pistol was used as a source of impulse signals and three microphones were located along a propagation path. The various measured parameters are introduced as input data into the numerical solver. In the frequency domain, the numerical results are in good agreement with the measurements up to a frequency of 2 kHz. In the time domain, apart from a time shift, the predicted waveforms closely match the measured waveforms.
Recent Enhancements To The FUN3D Flow Solver For Moving-Mesh Applications

NASA Technical Reports Server (NTRS)

Biedron, Robert T.; Thomas, James L.

2009-01-01

An unsteady Reynolds-averaged Navier-Stokes solver for unstructured grids has been extended to handle general mesh movement involving rigid, deforming, and overset meshes. Mesh deformation is achieved through analogy to elastic media by solving the linear elasticity equations. A general method for specifying the motion of moving bodies within the mesh has been implemented that allows for inherited motion through parent-child relationships, enabling simulations involving multiple moving bodies. Several example calculations are shown to illustrate the range of potential applications. For problems in which an isolated body is rotating with a fixed rate, a noninertial reference-frame formulation is available. An example calculation for a tilt-wing rotor is used to demonstrate that the time-dependent moving grid and noninertial formulations produce the same results in the limit of zero time-step size.
Experimental evaluation of model predictive control and inverse dynamics control for spacecraft proximity and docking maneuvers

NASA Astrophysics Data System (ADS)

Virgili-Llop, Josep; Zagaris, Costantinos; Park, Hyeongjun; Zappulla, Richard; Romano, Marcello

2018-03-01

An experimental campaign has been conducted to evaluate the performance of two different guidance and control algorithms on a multi-constrained docking maneuver. The evaluated algorithms are model predictive control (MPC) and inverse dynamics in the virtual domain (IDVD). A linear-quadratic approach with a quadratic programming solver is used for the MPC approach. A nonconvex optimization problem results from the IDVD approach, and a nonlinear programming solver is used. The docking scenario is constrained by the presence of a keep-out zone, an entry cone, and by the chaser's maximum actuation level. The performance metrics for the experiments and numerical simulations include the required control effort and time to dock. The experiments have been conducted in a ground-based air-bearing test bed, using spacecraft simulators that float over a granite table.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Lao, Lang L.; St John, Holger; Staebler, Gary M.

This report describes the work done under U.S. Department of Energy grant number DE-FG02-07ER54935 for the period ending July 31, 2010. The goal of this project was to provide predictive transport analysis to the PTRANSP code. Our contribution to this effort consisted of three parts: (a) a predictive solver suitable for use with highly non-linear transport models and installation of the turbulent confinement models GLF23 and TGLF, (b) an interface of this solver with the PTRANSP code, and (c) initial development of an EPED1 edge pedestal model interface with PTRANSP. PTRANSP has been installed locally on this cluster by importing a complete PTRANSP build environment that always contains the proper version of the libraries and other object files that PTRANSP requires. The GCNMP package and its interface code have been added to the SVN repository at PPPL.
Implicit solvers for unstructured meshes

NASA Technical Reports Server (NTRS)

Venkatakrishnan, V.; Mavriplis, Dimitri J.

1991-01-01

Implicit methods were developed and tested for unstructured mesh computations. The approximate system which arises from the Newton linearization of the nonlinear evolution operator is solved by using the preconditioned GMRES (Generalized Minimum Residual) technique. Three different preconditioners were studied, namely, the incomplete LU factorization (ILU), block diagonal factorization, and the symmetric successive over relaxation (SSOR). The preconditioners were optimized to have good vectorization properties. SSOR and ILU were also studied as iterative schemes. The various methods are compared over a wide range of problems. Ordering of the unknowns, which affects the convergence of these sparse matrix iterative methods, is also studied. Results are presented for inviscid and turbulent viscous calculations on single and multielement airfoil configurations using globally and adaptively generated meshes.
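The combination of GMRES with an incomplete LU preconditioner can be tried out with SciPy. The sketch below is an assumption-laden stand-in: the matrix is a small made-up nonsymmetric convection-diffusion operator rather than a Newton linearization on an unstructured mesh, and the `drop_tol`, `fill_factor`, and `restart` values are arbitrary illustration choices.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import LinearOperator, gmres, spilu

# Made-up nonsymmetric test matrix: 2D Laplacian plus a convection term.
n = 30
T = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n))
C = sp.diags([-0.5, 0.5], [-1, 1], shape=(n, n))
I = sp.identity(n)
A = (sp.kron(T, I) + sp.kron(I, T) + 0.3 * sp.kron(C, I)).tocsc()
b = np.ones(A.shape[0])

# Incomplete LU factorization, wrapped so GMRES can apply it as M ~ A^{-1}.
ilu = spilu(A, drop_tol=1e-4, fill_factor=10)
M = LinearOperator(A.shape, matvec=ilu.solve, dtype=A.dtype)

# Restarted GMRES with the ILU preconditioner (default tolerance).
x, info = gmres(A, b, M=M, restart=30, maxiter=200)

rel_res = np.linalg.norm(b - A @ x) / np.linalg.norm(b)
print("info:", info, "relative residual:", rel_res)
```

An `info` value of 0 signals convergence. Swapping `M` for a block-diagonal or SSOR-type operator is one way to compare preconditioners in the spirit of the study, and reordering the unknowns before factorization changes the quality of the ILU.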
Solvers for the Cardiac Bidomain Equations

PubMed Central

Vigmond, E.J.; Weber dos Santos, R.; Prassl, A.J.; Deo, M.; Plank, G.

2010-01-01

The bidomain equations are widely used for the simulation of electrical activity in cardiac tissue. They are especially important for accurately modelling extracellular stimulation, as evidenced by their prediction of virtual electrode polarization before experimental verification. However, solution of the equations is computationally expensive due to the fine spatial and temporal discretization needed. This limits the size and duration of the problem which can be modeled. Regardless of the specific form into which they are cast, the computational bottleneck becomes the repeated solution of a large, linear system. The purpose of this review is to give an overview of the equations, and the methods by which they have been solved. Of particular note are recent developments in multigrid methods, which have proven to be the most efficient. PMID:17900668
Conservative, unconditionally stable discretization methods for Hamiltonian equations, applied to wave motion in lattice equations modeling protein molecules

NASA Astrophysics Data System (ADS)

LeMesurier, Brenton

2012-01-01

A new approach is described for generating exactly energy-momentum conserving time discretizations for a wide class of Hamiltonian systems of DEs with quadratic momenta, including mechanical systems with central forces; it is well-suited in particular to the large systems that arise in both spatial discretizations of nonlinear wave equations and lattice equations such as the Davydov System modeling energetic pulse propagation in protein molecules. The method is unconditionally stable, making it well-suited to equations of broadly “Discrete NLS form”, including many arising in nonlinear optics. Key features of the resulting discretizations are exact conservation of both the Hamiltonian and quadratic conserved quantities related to continuous linear symmetries, preservation of time reversal symmetry, unconditional stability, and respecting the linearity of certain terms. The last feature allows a simple, efficient iterative solution of the resulting nonlinear algebraic systems that retain unconditional stability, avoiding the need for full Newton-type solvers. One distinction from earlier work on conservative discretizations is a new and more straightforward nearly canonical procedure for constructing the discretizations, based on a “discrete gradient calculus with product rule” that mimics the essential properties of partial derivatives. This numerical method is then used to study the Davydov system, revealing that previously conjectured continuum limit approximations by NLS do not hold, but that sech-like pulses related to NLS solitons can nevertheless sometimes arise.
P-CSI v1.0, an accelerated barotropic solver for the high-resolution ocean model component in the Community Earth System Model v2.0

NASA Astrophysics Data System (ADS)

Huang, Xiaomeng; Tang, Qiang; Tseng, Yuheng; Hu, Yong; Baker, Allison H.; Bryan, Frank O.; Dennis, John; Fu, Haohuan; Yang, Guangwen

2016-11-01

In the Community Earth System Model (CESM), the ocean model is computationally expensive for high-resolution grids and is often the least scalable component for high-resolution production experiments. The major bottleneck is that the barotropic solver scales poorly at high core counts. We design a new barotropic solver to accelerate the high-resolution ocean simulation. The novel solver adopts a Chebyshev-type iterative method to reduce the global communication cost in conjunction with an effective block preconditioner to further reduce the iterations. The algorithm and its computational complexity are theoretically analyzed and compared with other existing methods. We confirm the significant reduction of the global communication time with a competitive convergence rate using a series of idealized tests. Numerical experiments using the CESM 0.1° global ocean model show that the proposed approach results in a factor of 1.7 speed-up over the original method with no loss of accuracy, achieving 10.5 simulated years per wall-clock day on 16 875 cores.
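The reason a Chebyshev-type method reduces global communication is visible in the classical Chebyshev iteration: unlike conjugate gradients, it needs no inner products, only matrix-vector products and vector updates, at the price of requiring eigenvalue bounds. The sketch below is the textbook unpreconditioned iteration on a small made-up matrix, not the P-CSI solver; the function name `chebyshev_solve` and the bounds `lmin` and `lmax` are assumptions supplied by hand.

```python
import numpy as np

def chebyshev_solve(A, b, lmin, lmax, n_iter, x0=None):
    """Classical Chebyshev iteration for A x = b, A symmetric positive definite.

    lmin and lmax are assumed bounds on the eigenvalues of A (0 < lmin < lmax).
    """
    theta = 0.5 * (lmax + lmin)   # centre of the eigenvalue interval
    delta = 0.5 * (lmax - lmin)   # half-width of the interval
    sigma1 = theta / delta
    x = np.zeros_like(b) if x0 is None else x0.copy()
    r = b - A @ x
    rho = 1.0 / sigma1
    d = r / theta
    for _ in range(n_iter):
        # Only matrix-vector products and vector updates: no dot products,
        # hence no global reductions inside the loop.
        x = x + d
        r = r - A @ d
        rho_new = 1.0 / (2.0 * sigma1 - rho)
        d = rho_new * rho * d + (2.0 * rho_new / delta) * r
        rho = rho_new
    return x

# Made-up test problem: tridiag(-1, 3, -1), eigenvalues inside [1, 5].
n = 50
A = 3.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
rng = np.random.default_rng(1)
b = rng.normal(size=n)

x = chebyshev_solve(A, b, lmin=1.0, lmax=5.0, n_iter=60)
residual = np.linalg.norm(b - A @ x)
print("residual norm:", residual)
```

In a distributed setting a convergence check would still need an occasional norm, but it can be done every few iterations rather than every step, which is where the communication saving comes from.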
Numerical Approach to Spatial Deterministic-Stochastic Models Arising in Cell Biology.

PubMed

Schaff, James C; Gao, Fei; Li, Ye; Novak, Igor L; Slepchenko, Boris M

2016-12-01

Hybrid deterministic-stochastic methods provide an efficient alternative to a fully stochastic treatment of models which include components with disparate levels of stochasticity. However, general-purpose hybrid solvers for spatially resolved simulations of reaction-diffusion systems are not widely available. Here we describe fundamentals of a general-purpose spatial hybrid method. The method generates realizations of a spatially inhomogeneous hybrid system by appropriately integrating capabilities of a deterministic partial differential equation solver with a popular particle-based stochastic simulator, Smoldyn. Rigorous validation of the algorithm is detailed, using a simple model of calcium 'sparks' as a testbed. The solver is then applied to a deterministic-stochastic model of spontaneous emergence of cell polarity. The approach is general enough to be implemented within biologist-friendly software frameworks such as Virtual Cell.

Perm State University HPC-hardware and software services: capabilities for aircraft engine aeroacoustics problems solving

NASA Astrophysics Data System (ADS)

Demenev, A. G.

2018-02-01

This work analyzes the capabilities of the high-performance computing (HPC) infrastructure at Perm State University for solving aircraft engine aeroacoustics problems. We explore the ability to develop new computational aeroacoustics methods and solvers for computer-aided engineering (CAE) systems that handle complicated industrial problems of engine noise prediction. Leading aircraft engine engineering companies, including “UEC-Aviadvigatel” JSC (our industrial partner in Perm, Russia), require such methods and solvers to optimize aircraft engine geometry for fan noise reduction. We analyzed how to use Perm State University's HPC hardware resources and software services efficiently. The results demonstrate that the Perm State University HPC infrastructure is mature enough to tackle industrial-like problems of developing a CAE system with HPC methods and CFD solvers.
Accurate evaluation of exchange fields in finite element micromagnetic solvers

NASA Astrophysics Data System (ADS)

Chang, R.; Escobar, M. A.; Li, S.; Lubarda, M. V.; Lomakin, V.

2012-04-01

Quadratic basis functions (QBFs) are implemented for solving the Landau-Lifshitz-Gilbert equation via the finite element method. This involves the introduction of a set of special testing functions compatible with the QBFs for evaluating the Laplacian operator. The QBF approach leads to significantly more accurate results than conventionally used approaches based on linear basis functions. Importantly, QBFs allow the error in computing the exchange field to be reduced by increasing the mesh density, for both structured and unstructured meshes. Numerical examples demonstrate the feasibility of the method.
Sonic Boom Prediction and Minimization of the Douglas Reference OPT5 Configuration

NASA Technical Reports Server (NTRS)

Siclari, Michael J.

1999-01-01

Conventional CFD methods and grids do not yield adequate resolution of the complex shock flow pattern generated by a real aircraft geometry. As a result, a unique grid topology and supersonic flow solver was developed at Northrop Grumman based on the characteristic behavior of supersonic wave patterns emanating from the aircraft. Using this approach, it was possible to compute flow fields with adequate resolution several body lengths below the aircraft. In this region, three-dimensional effects are diminished and conventional two-dimensional modified linear theory (MLT) can be applied to estimate ground pressure signatures or sonic booms. To accommodate real aircraft geometries and alleviate the burdensome grid generation task, an implicit marching multi-block, multi-grid finite-volume Euler code was developed as the basis for the sonic boom prediction methodology. The Thomas two-dimensional extrapolation method is built into the Euler code so that ground signatures can be obtained quickly and efficiently with minimum computational effort suitable to the aircraft design environment. The loudness levels of these signatures can then be determined using a NASA generated noise code. Since the Euler code is a three-dimensional flow field solver, the complete circumferential region below the aircraft is computed. The extrapolation of all this field data from a cylinder of constant radius leads to the definition of the entire boom corridor occurring directly below and off to the side of the aircraft's flight path yielding an estimate for the entire noise "annoyance" corridor in miles as well as its magnitude. An automated multidisciplinary sonic boom design optimization software system was developed during the latter part of HSR Phase 1. Using this system, it was found that sonic boom signatures could be reduced through optimization of a variety of geometric aircraft parameters. 
This system uses a gradient based nonlinear optimizer as the driver in conjunction with a computationally efficient Euler CFD solver (NIIM3DSB) for computing the three-dimensional near-field characteristics of the aircraft. The intent of the design system is to identify and optimize geometric design variables that have a beneficial impact on the ground sonic boom. The system uses a simple wave drag data format to specify the aircraft geometry. The geometry is internally enhanced and analytic methods are used to generate marching grids suitable for the multi-block Euler solver. The Thomas extrapolation method is integrated into this system, and hence, the aircraft's centerline ground sonic boom signature is also automatically computed for a specified cruise altitude and yields the parameters necessary to evaluate the design function. The entire design system has been automated since the gradient based optimization software requires many flow analyses in order to obtain the required sensitivity derivatives for each design variable in order to converge on an optimal solution. Hence, once the problem is defined which includes defining the objective function and geometric and aerodynamic constraints, the system will automatically regenerate the perturbed geometry, the necessary grids, the Euler solution, and finally the ground sonic boom signature at the request of the optimizer.
Hypersonic flow analysis

NASA Technical Reports Server (NTRS)

Chow, Chuen-Yen; Ryan, James S.

1987-01-01

While the zonal grid system of Transonic Navier-Stokes (TNS) provides excellent modeling of complex geometries, improved shock capturing, and a higher Mach number range will be required if flows about hypersonic aircraft are to be modeled accurately. A computational fluid dynamics (CFD) code, the Compressible Navier-Stokes (CNS), is under development to combine the required high Mach number capability with the existing TNS geometry capability. One of several candidate flow solvers for inclusion in the CNS is that of F3D. This upwinding flow solver promises improved shock capturing, and more accurate hypersonic solutions overall, compared to the solver currently used in TNS.
System and method for modeling and analyzing complex scenarios

DOEpatents

Shevitz, Daniel Wolf

2013-04-09

An embodiment of the present invention includes a method for analyzing and solving a possibility tree. A possibility tree having a plurality of programmable nodes is constructed and solved with a solver module executed by a processor element. The solver module executes the programming of said nodes, and tracks the state of at least one variable through a branch. When a variable of said branch is out of tolerance with a parameter, the solver disables the remaining nodes of the branch and marks the branch as an invalid solution. The valid solutions are then aggregated and displayed as valid tree solutions.
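The pruning behaviour described above can be illustrated with a minimal sketch. The abstract does not specify data structures, so the dictionary-based node layout, the single numeric state variable, and the names `solve_tree`, `update`, `lower` and `upper` are all assumptions made for illustration only.

```python
def solve_tree(node, state, lower, upper, path=()):
    """Depth-first walk of a hypothetical possibility tree.

    Each node carries a small program (`update`) that transforms the tracked
    variable. If the variable leaves [lower, upper], the rest of that branch
    is skipped and the branch contributes no solution.
    Returns a list of (path, final_state) pairs for the valid branches.
    """
    new_state = node["update"](state)
    path = path + (node["name"],)
    if not (lower <= new_state <= upper):
        return []  # out of tolerance: remaining nodes of this branch never run
    if not node["children"]:
        return [(path, new_state)]  # a leaf reached within tolerance
    results = []
    for child in node["children"]:
        results.extend(solve_tree(child, new_state, lower, upper, path))
    return results


def make_node(name, delta, children=()):
    """Hypothetical helper: a node whose program adds `delta` to the variable."""
    return {"name": name, "update": lambda s: s + delta, "children": list(children)}


# Illustrative tree: only root -> a -> c keeps the variable inside [0, 10].
EXAMPLE_TREE = make_node("root", 0, [
    make_node("a", 5, [make_node("c", 3), make_node("d", 20)]),
    make_node("b", 50, [make_node("e", -45)]),
])

if __name__ == "__main__":
    for branch, value in solve_tree(EXAMPLE_TREE, 0, 0, 10):
        print(" -> ".join(branch), "ends with value", value)
```

Node "e" would bring the variable back into range, but it is never executed because its parent "b" already violated the tolerance, which mirrors the "disables remaining nodes" step in the abstract.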
Matlab Geochemistry: An open source geochemistry solver based on MRST

NASA Astrophysics Data System (ADS)

McNeece, C. J.; Raynaud, X.; Nilsen, H.; Hesse, M. A.

2017-12-01

The study of geological systems often requires the solution of complex geochemical relations. To address this need we present an open source geochemical solver based on the Matlab Reservoir Simulation Toolbox (MRST) developed by SINTEF. The implementation supports non-isothermal multicomponent aqueous complexation, surface complexation, ion exchange, and dissolution/precipitation reactions. The suite of tools available in MRST allows for rapid model development, in particular the incorporation of geochemical calculations into transport simulations of multiple phases, complex domain geometry and geomechanics. Different numerical schemes and additional physics can be easily incorporated into the existing tools through the object-oriented framework employed by MRST. The solver leverages the automatic differentiation tools available in MRST to solve arbitrarily complex geochemical systems with any choice of species or element concentration as input. Four mathematical approaches enable the solver to be quite robust: 1) the choice of chemical elements as the basis components makes all entries in the composition matrix positive thus preserving convexity, 2) a log variable transformation is used which transfers the nonlinearity to the convex composition matrix, 3) a priori bounds on variables are calculated from the structure of the problem, constraining Newton's path and 4) an initial guess is calculated implicitly by sequentially adding model complexity. As a benchmark we compare the model to experimental and semi-analytic solutions of the coupled salinity-acidity transport system. Together with the reservoir simulation capabilities of MRST the solver offers a promising tool for geochemical simulations in reservoir domains for applications in a diversity of fields from enhanced oil recovery to radionuclide storage.
Using SPARK as a Solver for Modelica

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wetter, Michael; Haves, Philip

Modelica is an object-oriented acausal modeling language that is well positioned to become a de-facto standard for expressing models of complex physical systems. To simulate a model expressed in Modelica, it needs to be translated into executable code. For generating run-time efficient code, such a translation needs to employ algebraic formula manipulations. As the SPARK solver has been shown to be competitive for generating such code but currently cannot be used with the Modelica language, we report in this paper how SPARK's symbolic and numerical algorithms can be implemented in OpenModelica, an open-source implementation of a Modelica modeling and simulation environment. We also report benchmark results that show that for our air flow network simulation benchmark, the SPARK solver is competitive with Dymola, which is believed to provide the best solver for Modelica.
Method and apparatus for automatically generating airfoil performance tables

NASA Technical Reports Server (NTRS)

van Dam, Cornelis P. (Inventor); Mayda, Edward A. (Inventor); Strawn, Roger Clayton (Inventor)

2006-01-01

One embodiment of the present invention provides a system that facilitates automatically generating a performance table for an object, wherein the object is subject to fluid flow. The system operates by first receiving a description of the object and testing parameters for the object. The system executes a flow solver using the testing parameters and the description of the object to produce an output. Next, the system determines if the output of the flow solver indicates negative density or pressure. If not, the system analyzes the output to determine if the output is converging. If converging, the system writes the output to the performance table for the object.
Highly efficient and exact method for parallelization of grid-based algorithms and its implementation in DelPhi

PubMed Central

Li, Chuan; Li, Lin; Zhang, Jie; Alexov, Emil

2012-01-01

The Gauss-Seidel method is a standard iterative numerical method widely used to solve systems of equations and, in general, is more efficient than other iterative methods, such as the Jacobi method. However, the standard implementation of the Gauss-Seidel method restricts its use in parallel computing because it requires updated neighboring values (i.e., those from the current iteration) to be used as soon as they are available. Here we report an efficient and exact (not requiring assumptions) method to parallelize iterations and to reduce the computational time as a linear/nearly linear function of the number of CPUs. In contrast to other existing solutions, our method does not require any assumptions and is equally applicable for solving linear and nonlinear equations. This approach is implemented in the DelPhi program, which is a finite difference Poisson-Boltzmann equation solver to model electrostatics in molecular biology. This development makes the iterative procedure for obtaining the electrostatic potential distribution in the parallelized DelPhi several times faster than that in the serial code. Further, we demonstrate the advantages of the new parallelized DelPhi by computing the electrostatic potential and the corresponding energies of large supramolecular structures. PMID:22674480
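To make the data dependency mentioned above concrete, here is a minimal serial sketch contrasting one Jacobi sweep with one Gauss-Seidel sweep. It is not the DelPhi parallelization scheme, whose details the abstract does not give; the tridiagonal test matrix and all function names are assumptions chosen for illustration.

```python
def jacobi_step(A, b, x):
    """One Jacobi sweep: every update reads only the previous iterate."""
    n = len(b)
    return [
        (b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
        for i in range(n)
    ]


def gauss_seidel_step(A, b, x):
    """One Gauss-Seidel sweep: each update reuses entries already updated
    in this sweep, which is the dependency that complicates parallelism."""
    n = len(b)
    x = list(x)
    for i in range(n):
        s = sum(A[i][j] * x[j] for j in range(n) if j != i)
        x[i] = (b[i] - s) / A[i][i]
    return x


def iterate(step, A, b, tol=1e-10, max_iter=10_000):
    """Apply `step` until the largest change between sweeps is below tol."""
    x = [0.0] * len(b)
    for k in range(1, max_iter + 1):
        x_new = step(A, b, x)
        if max(abs(u - v) for u, v in zip(x_new, x)) < tol:
            return x_new, k
        x = x_new
    raise RuntimeError("iteration did not converge")


def poisson_1d(n):
    """Hypothetical test matrix: tridiagonal (-1, 2, -1)."""
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = 2.0
        if i > 0:
            A[i][i - 1] = -1.0
        if i < n - 1:
            A[i][i + 1] = -1.0
    return A


if __name__ == "__main__":
    A, b = poisson_1d(10), [1.0] * 10
    _, sweeps_jacobi = iterate(jacobi_step, A, b)
    _, sweeps_gs = iterate(gauss_seidel_step, A, b)
    print("Jacobi sweeps:", sweeps_jacobi, "Gauss-Seidel sweeps:", sweeps_gs)
```

The Jacobi list comprehension could be evaluated in any order, or concurrently, whereas the Gauss-Seidel loop must respect the order in which entries are overwritten.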
Reprint of Solution of Ambrosio-Tortorelli model for image segmentation by generalized relaxation method

NASA Astrophysics Data System (ADS)

D'Ambra, Pasqua; Tartaglione, Gaetano

2015-04-01

Image segmentation addresses the problem of partitioning a given image into its constituent objects and then identifying the boundaries of the objects. This problem can be formulated in terms of a variational model that seeks optimal approximations of a bounded function by piecewise-smooth functions, minimizing a given functional. The corresponding Euler-Lagrange equations are a set of two coupled elliptic partial differential equations with varying coefficients. Numerical solution of the above system often relies on alternating minimization techniques involving descent methods coupled with explicit or semi-implicit finite-difference discretization schemes, which are slowly convergent and poorly scalable with respect to image size. In this work we focus on generalized relaxation methods, also coupled with multigrid linear solvers, when a finite-difference discretization is applied to the Euler-Lagrange equations of the Ambrosio-Tortorelli model. We show that non-linear Gauss-Seidel, accelerated by inner linear iterations, is an effective method for large-scale image analysis such as that arising from high-throughput screening platforms for stem cell targeted differentiation, where one of the main goals is the segmentation of thousands of images to analyze cell colony morphology.
Solution of Ambrosio-Tortorelli model for image segmentation by generalized relaxation method

NASA Astrophysics Data System (ADS)

D'Ambra, Pasqua; Tartaglione, Gaetano

2015-03-01

Image segmentation addresses the problem of partitioning a given image into its constituent objects and then identifying the boundaries of the objects. This problem can be formulated in terms of a variational model that seeks optimal approximations of a bounded function by piecewise-smooth functions, minimizing a given functional. The corresponding Euler-Lagrange equations are a set of two coupled elliptic partial differential equations with varying coefficients. Numerical solution of the above system often relies on alternating minimization techniques involving descent methods coupled with explicit or semi-implicit finite-difference discretization schemes, which are slowly convergent and poorly scalable with respect to image size. In this work we focus on generalized relaxation methods, also coupled with multigrid linear solvers, when a finite-difference discretization is applied to the Euler-Lagrange equations of the Ambrosio-Tortorelli model. We show that non-linear Gauss-Seidel, accelerated by inner linear iterations, is an effective method for large-scale image analysis such as that arising from high-throughput screening platforms for stem cell targeted differentiation, where one of the main goals is the segmentation of thousands of images to analyze cell colony morphology.
Jacobian-free approximate solvers for hyperbolic systems: Application to relativistic magnetohydrodynamics

NASA Astrophysics Data System (ADS)

Castro, Manuel J.; Gallardo, José M.; Marquina, Antonio

2017-10-01

We present recent advances in PVM (Polynomial Viscosity Matrix) methods based on internal approximations to the absolute value function, and compare them with Chebyshev-based PVM solvers. These solvers only require a bound on the maximum wave speed, so no spectral decomposition is needed. Another important feature of the proposed methods is that they are suitable to be written in Jacobian-free form, in which only evaluations of the physical flux are used. This is particularly interesting when considering systems for which the Jacobians involve complex expressions, e.g., the relativistic magnetohydrodynamics (RMHD) equations. On the other hand, the proposed Jacobian-free solvers have also been extended to the case of approximate DOT (Dumbser-Osher-Toro) methods, which can be regarded as simple and efficient approximations to the classical Osher-Solomon method, sharing most of its interesting features and being applicable to general hyperbolic systems. To test the properties of our schemes a number of numerical experiments involving the RMHD equations are presented, both in one and two dimensions. The obtained results are in good agreement with those found in the literature and show that our schemes are robust and accurate, running stable under a satisfactory time step restriction. It is worth emphasizing that, although this work focuses on RMHD, the proposed schemes are suitable to be applied to general hyperbolic systems.
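As a rough illustration of a numerical flux that needs only a wave speed bound and evaluations of the physical flux, the sketch below uses the Rusanov (local Lax-Friedrichs) flux for the scalar Burgers equation. This is a much simpler scheme than the solvers proposed in the paper and is swapped in purely for illustration; the function names and the periodic test setup are assumptions.

```python
def burgers_flux(u):
    """Physical flux of the inviscid Burgers equation, f(u) = u^2 / 2."""
    return 0.5 * u * u


def rusanov_flux(f, u_left, u_right, max_speed):
    """Central flux average plus a viscosity term scaled by a speed bound.

    No Jacobian or spectral decomposition is used, only calls to f.
    """
    return 0.5 * (f(u_left) + f(u_right)) - 0.5 * max_speed * (u_right - u_left)


def step(u, dt, dx):
    """One first-order finite volume update with periodic boundaries."""
    n = len(u)
    flux = []
    for i in range(n):
        u_l, u_r = u[i], u[(i + 1) % n]
        speed = max(abs(u_l), abs(u_r))  # bounds |f'(u)| = |u| for Burgers
        flux.append(rusanov_flux(burgers_flux, u_l, u_r, speed))
    # flux[i] sits on the interface between cells i and i+1;
    # flux[i - 1] wraps around to the last interface when i == 0.
    return [u[i] - dt / dx * (flux[i] - flux[i - 1]) for i in range(n)]


if __name__ == "__main__":
    cells = [1.0, 1.0, 0.5, 0.0, 0.0, 0.0, 0.5, 1.0]
    updated = step(cells, dt=0.05, dx=0.125)
    print("total before:", sum(cells), "total after:", sum(updated))
```

For a system such as RMHD the same structure would apply componentwise to the state vector, with the speed bound estimated from the physics rather than from an eigen-decomposition.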
Molecular dynamics simulations in hybrid particle-continuum schemes: Pitfalls and caveats

NASA Astrophysics Data System (ADS)

Stalter, S.; Yelash, L.; Emamy, N.; Statt, A.; Hanke, M.; Lukáčová-Medvid'ová, M.; Virnau, P.

2018-03-01

Heterogeneous multiscale methods (HMM) combine molecular accuracy of particle-based simulations with the computational efficiency of continuum descriptions to model flow in soft matter liquids. In these schemes, molecular simulations typically pose a computational bottleneck, which we investigate in detail in this study. We find that it is preferable to simulate many small systems as opposed to a few large systems, and that a choice of a simple isokinetic thermostat is typically sufficient while thermostats such as Lowe-Andersen allow for simulations at elevated viscosity. We discuss suitable choices for time steps and finite-size effects which arise in the limit of very small simulation boxes. We also argue that if colloidal systems are considered as opposed to atomistic systems, the gap between microscopic and macroscopic simulations regarding time and length scales is significantly smaller. We propose a novel reduced-order technique for the coupling to the macroscopic solver, which allows us to approximate a non-linear stress-strain relation efficiently and thus further reduce computational effort of microscopic simulations.
Research in computer science

NASA Technical Reports Server (NTRS)

Ortega, J. M.

1986-01-01

Various graduate research activities in the field of computer science are reported. Among the topics discussed are: (1) failure probabilities in multi-version software; (2) Gaussian elimination on parallel computers; (3) three-dimensional Poisson solvers on parallel/vector computers; (4) automated task decomposition for multiple robot arms; (5) multi-color incomplete Cholesky conjugate gradient methods on the Cyber 205; and (6) parallel implementation of iterative methods for solving linear equations.
Class and Homework Problems: The Break-Even Radius of Insulation Computed Using Excel Solver and WolframAlpha

ERIC Educational Resources Information Center

Foley, Greg

2014-01-01

A problem that illustrates two ways of computing the break-even radius of insulation is outlined. The problem is suitable for students who are taking an introductory module in heat transfer or transport phenomena and who have some previous knowledge of the numerical solution of nonlinear algebraic equations. The potential for computer algebra,…
The fundamentals of adaptive grid movement

NASA Technical Reports Server (NTRS)

Eiseman, Peter R.

1990-01-01

Basic grid point movement schemes are studied. The schemes are referred to as adaptive grids. Weight functions and equidistribution in one dimension are treated. The specification of coefficients in the linear weight, attraction to a given grid or a curve, and evolutionary forces are considered. Curve by curve and finite volume methods are described. The temporal coupling of partial differential equation solvers and grid generators is discussed.
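One-dimensional equidistribution can be sketched as follows: grid points are placed so that every cell holds the same share of the integral of a positive weight function. The report's own formulation is not given in the abstract, so the trapezoidal quadrature, the linear inversion and the name `equidistribute` are assumptions.

```python
def equidistribute(weight, a, b, n_cells, n_samples=2000):
    """Return n_cells + 1 grid points on [a, b] such that the integral of
    `weight` over every cell is (approximately) the same.

    `weight` must be strictly positive on [a, b].
    """
    # Sample the weight on a fine uniform grid and accumulate its integral
    # with the trapezoidal rule.
    h = (b - a) / n_samples
    xs = [a + k * h for k in range(n_samples + 1)]
    ws = [weight(x) for x in xs]
    cum = [0.0]
    for k in range(n_samples):
        cum.append(cum[-1] + 0.5 * h * (ws[k] + ws[k + 1]))
    total = cum[-1]

    # Invert the cumulative integral at equally spaced target values,
    # interpolating linearly inside the sample interval that brackets each one.
    grid = [a]
    k = 0
    for i in range(1, n_cells):
        target = total * i / n_cells
        while cum[k + 1] < target:
            k += 1
        frac = (target - cum[k]) / (cum[k + 1] - cum[k])
        grid.append(xs[k] + frac * h)
    grid.append(b)
    return grid


if __name__ == "__main__":
    # Hypothetical weight that is large near x = 0, so points cluster there.
    points = equidistribute(lambda x: 1.0 / (0.01 + x), 0.0, 1.0, 8)
    print([round(p, 4) for p in points])
```

In an adaptive solver the weight would typically be built from the current solution, for example from its gradient, and the grid would be recomputed as the solution evolves.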
Parallel Performance of Linear Solvers and Preconditioners

DTIC Science & Technology

2014-01-01

are produced by a discrete dislocation dynamics (DDD) simulation and change with each timestep of the DDD simulation as the dislocation structure...evolves. However, the coefficient (or stiffness) matrix remains constant during the DDD simulation and some expensive matrix factorizations only occur once...discrete dislocation dynamics (DDD) simulations. This can be achieved by coupling a DDD simulator for bulk material (Arsenlis et al., 2007) to a
Advanced Signal Processing for Integrated LES-RANS Simulations: Anti-aliasing Filters

NASA Technical Reports Server (NTRS)

Schlueter, J. U.

2003-01-01

Currently, a wide variety of flow phenomena are addressed with numerical simulations. Many flow solvers are optimized to simulate a limited spectrum of flow effects effectively, such as single parts of a flow system, but are either inadequate or too expensive to be applied to a very complex problem. As an example, the flow through a gas turbine can be considered. In the compressor and the turbine section, the flow solver has to be able to handle the moving blades, model the wall turbulence, and predict the pressure and density distribution properly. This can be done by a flow solver based on the Reynolds-Averaged Navier-Stokes (RANS) approach. On the other hand, the flow in the combustion chamber is governed by large scale turbulence, chemical reactions, and the presence of fuel spray. Experience shows that these phenomena require an unsteady approach. Hence, for the combustor, the use of a Large Eddy Simulation (LES) flow solver is desirable. While many design problems of a single flow passage can be addressed by separate computations, only the simultaneous computation of all parts can guarantee the proper prediction of multi-component phenomena, such as compressor/combustor instability and combustor/turbine hot-streak migration. Therefore, a promising strategy to perform full aero-thermal simulations of gas-turbine engines is the use of a RANS flow solver for the compressor sections, an LES flow solver for the combustor, and again a RANS flow solver for the turbine section.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Spotz, William F.

PyTrilinos is a set of Python interfaces to compiled Trilinos packages. This collection supports serial and parallel dense linear algebra, serial and parallel sparse linear algebra, direct and iterative linear solution techniques, algebraic and multilevel preconditioners, nonlinear solvers and continuation algorithms, eigensolvers and partitioning algorithms. Also included are a variety of related utility functions and classes, including distributed I/O, coloring algorithms and matrix generation. PyTrilinos vector objects are compatible with the popular NumPy Python package. As a Python front end to compiled libraries, PyTrilinos takes advantage of the flexibility and ease of use of Python, and the efficiency of the underlying C++, C and Fortran numerical kernels. This paper covers recent, previously unpublished advances in the PyTrilinos package.
Decision Engines for Software Analysis Using Satisfiability Modulo Theories Solvers

NASA Technical Reports Server (NTRS)

Bjorner, Nikolaj

2010-01-01

The area of software analysis, testing and verification is now undergoing a revolution thanks to the use of automated and scalable support for logical methods. A well-recognized premise is that at the core of software analysis engines is invariably a component using logical formulas for describing states and transformations between system states. The process of using this information for discovering and checking program properties (including such important properties as safety and security) amounts to automatic theorem proving. In particular, theorem provers that directly support common software constructs offer a compelling basis. Such provers are commonly called satisfiability modulo theories (SMT) solvers. Z3 is a state-of-the-art SMT solver. It is developed at Microsoft Research. It can be used to check the satisfiability of logical formulas over one or more theories such as arithmetic, bit-vectors, lists, records and arrays. The talk describes some of the technology behind modern SMT solvers, including the solver Z3. Z3 is currently mainly targeted at solving problems that arise in software analysis and verification. It has been applied to various contexts, such as systems for dynamic symbolic simulation (Pex, SAGE, Vigilante), for program verification and extended static checking (Spec#/Boogie, VCC, HAVOC), for software model checking (Yogi, SLAM), model-based design (FORMULA), security protocol code (F7), program run-time analysis and invariant generation (VS3). We will describe how it integrates support for a variety of theories that arise naturally in the context of the applications. There are several new promising avenues and the talk will touch on some of these and the challenges related to SMT solvers. Proceedings

Solving regularly and singularly perturbed reaction-diffusion equations in three space dimensions

NASA Astrophysics Data System (ADS)

Moore, Peter K.

2007-06-01

In [P.K. Moore, Effects of basis selection and h-refinement on error estimator reliability and solution efficiency for higher-order methods in three space dimensions, Int. J. Numer. Anal. Mod. 3 (2006) 21-51] a fixed, high-order h-refinement finite element algorithm, Href, was introduced for solving reaction-diffusion equations in three space dimensions. In this paper Href is coupled with continuation creating an automatic method for solving regularly and singularly perturbed reaction-diffusion equations. The simple quasilinear Newton solver of Moore, (2006) is replaced by the nonlinear solver NITSOL [M. Pernice, H.F. Walker, NITSOL: a Newton iterative solver for nonlinear systems, SIAM J. Sci. Comput. 19 (1998) 302-318]. Good initial guesses for the nonlinear solver are obtained using continuation in the small parameter ɛ. Two strategies allow adaptive selection of ɛ. The first depends on the rate of convergence of the nonlinear solver and the second implements backtracking in ɛ. Finally a simple method is used to select the initial ɛ. Several examples illustrate the effectiveness of the algorithm.
New preconditioning strategy for Jacobian-free solvers for variably saturated flows with Richards’ equation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lipnikov, Konstantin; Moulton, David; Svyatskiy, Daniil

2016-04-29

We develop a new approach for solving the nonlinear Richards’ equation arising in variably saturated flow modeling. The growing complexity of geometric models for simulation of subsurface flows leads to the necessity of using unstructured meshes and advanced discretization methods. Typically, a numerical solution is obtained by first discretizing PDEs and then solving the resulting system of nonlinear discrete equations with a Newton-Raphson-type method. Efficiency and robustness of the existing solvers rely on many factors, including an empiric quality control of intermediate iterates, complexity of the employed discretization method and a customized preconditioner. We propose and analyze a new preconditioning strategy that is based on a stable discretization of the continuum Jacobian. We will show with numerical experiments for challenging problems in subsurface hydrology that this new preconditioner improves convergence of the existing Jacobian-free solvers 3-20 times. Furthermore, we show that the Picard method with this preconditioner becomes a more efficient nonlinear solver than a few widely used Jacobian-free solvers.
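The "Jacobian-free" ingredient of such solvers is usually a finite-difference approximation of the Jacobian-vector product, J(u) v ≈ (F(u + εv) − F(u)) / ε, which a Krylov method can use in place of an assembled matrix. The sketch below shows only that building block, not the paper's preconditioner; the toy residual, the step-size rule and all names are assumptions.

```python
import math


def jacobian_free_matvec(F, u, v, rel_eps=1e-7):
    """Approximate J(u) @ v using two residual evaluations and no matrix."""
    norm_v = math.sqrt(sum(vi * vi for vi in v))
    if norm_v == 0.0:
        return [0.0] * len(u)
    norm_u = math.sqrt(sum(ui * ui for ui in u))
    eps = rel_eps * (1.0 + norm_u) / norm_v  # a common heuristic, assumed here
    f_base = F(u)
    f_pert = F([ui + eps * vi for ui, vi in zip(u, v)])
    return [(p - q) / eps for p, q in zip(f_pert, f_base)]


def residual(u):
    """Hypothetical nonlinear residual: F_i = u_i^3 + 2 u_i - u_{i-1} - u_{i+1} - 1."""
    n = len(u)
    out = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        out.append(u[i] ** 3 + 2.0 * u[i] - left - right - 1.0)
    return out


def exact_jacobian_matvec(u, v):
    """Analytic J(u) @ v for the toy residual, used only as a cross-check."""
    n = len(u)
    out = []
    for i in range(n):
        left = v[i - 1] if i > 0 else 0.0
        right = v[i + 1] if i < n - 1 else 0.0
        out.append((3.0 * u[i] ** 2 + 2.0) * v[i] - left - right)
    return out


if __name__ == "__main__":
    u, v = [0.5, 1.0, -0.3], [1.0, -2.0, 0.5]
    print("finite difference:", jacobian_free_matvec(residual, u, v))
    print("analytic         :", exact_jacobian_matvec(u, v))
```

Because the Krylov solver never sees the matrix entries, the preconditioner has to be built from some other, cheaper approximation of the Jacobian, which is the question the abstract addresses.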
Status Report on NEAMS System Analysis Module Development

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hu, R.; Fanning, T. H.; Sumner, T.

2015-12-01

Under the Reactor Product Line (RPL) of DOE-NE’s Nuclear Energy Advanced Modeling and Simulation (NEAMS) program, an advanced SFR System Analysis Module (SAM) is being developed at Argonne National Laboratory. The goal of the SAM development is to provide fast-running, improved-fidelity, whole-plant transient analyses capabilities. SAM utilizes an object-oriented application framework (MOOSE), and its underlying meshing and finite-element library libMesh, as well as linear and non-linear solvers PETSc, to leverage modern advanced software environments and numerical methods. It also incorporates advances in physical and empirical models and seeks closure models based on information from high-fidelity simulations and experiments. This report provides an update on the SAM development, and summarizes the activities performed in FY15 and the first quarter of FY16. The tasks include: (1) implement the support of 2nd-order finite elements in SAM components for improved accuracy and computational efficiency; (2) improve the conjugate heat transfer modeling and develop pseudo 3-D full-core reactor heat transfer capabilities; (3) perform verification and validation tests as well as demonstration simulations; (4) develop the coupling requirements for SAS4A/SASSYS-1 and SAM integration.
Multichannel myopic deconvolution in underwater acoustic channels via low-rank recovery

PubMed Central

Tian, Ning; Byun, Sung-Hoon; Sabra, Karim; Romberg, Justin

2017-01-01

This paper presents a technique for solving the multichannel blind deconvolution problem. The authors observe the convolution of a single (unknown) source with K different (unknown) channel responses; from these channel outputs, the authors want to estimate both the source and the channel responses. The authors show how this classical signal processing problem can be viewed as solving a system of bilinear equations, and in turn can be recast as recovering a rank-1 matrix from a set of linear observations. Results of prior studies in the area of low-rank matrix recovery have identified effective convex relaxations for problems of this type and efficient, scalable heuristic solvers that enable these techniques to work with thousands of unknown variables. The authors show how a priori information about the channels can be used to build a linear model for the channels, which in turn makes solving these systems of equations well-posed. This study demonstrates the robustness of this methodology to measurement noises and parametrization errors of the channel impulse responses with several stylized and shallow water acoustic channel simulations. The performance of this methodology is also verified experimentally using shipping noise recorded on short bottom-mounted vertical line arrays. PMID:28599565
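The step from bilinear equations to a linear problem in a rank-1 matrix can be checked numerically on a single channel. Each sample of the convolution of a channel h with a source x is a sum of products h[i]·x[j], so it is a linear function of the outer-product matrix M = h xᵀ. The sketch below is only a sanity check of that identity, not the recovery algorithm; the function names are hypothetical and the paper's channel parametrization is not reproduced.

```python
def convolve(h, x):
    """Linear convolution: bilinear in the pair (h, x)."""
    y = [0.0] * (len(h) + len(x) - 1)
    for i, hi in enumerate(h):
        for j, xj in enumerate(x):
            y[i + j] += hi * xj
    return y


def outer(h, x):
    """Rank-1 matrix M with M[i][j] = h[i] * x[j]."""
    return [[hi * xj for xj in x] for hi in h]


def linear_observations(M):
    """Sum the anti-diagonals of M. This map is linear in the entries of M
    and returns the convolution whenever M is the outer product of h and x."""
    rows, cols = len(M), len(M[0])
    y = [0.0] * (rows + cols - 1)
    for i in range(rows):
        for j in range(cols):
            y[i + j] += M[i][j]
    return y


if __name__ == "__main__":
    h = [1.0, -2.0, 0.5]      # made-up channel response
    x = [3.0, 0.0, 1.0, 2.0]  # made-up source signal
    print(convolve(h, x))
    print(linear_observations(outer(h, x)))
```

Recovering h and x then amounts to finding a rank-1 matrix consistent with the observations, which is where the convex relaxations and heuristic solvers mentioned in the abstract come in.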
Parallel computation of fluid-structural interactions using high resolution upwind schemes

NASA Astrophysics Data System (ADS)

Hu, Zongjun

An efficient and accurate solver is developed to simulate the non-linear fluid-structural interactions in turbomachinery flutter flows. A new low diffusion E-CUSP scheme, the Zha CUSP scheme, is developed to improve the efficiency and accuracy of the inviscid flux computation. The 3D unsteady Navier-Stokes equations with the Baldwin-Lomax turbulence model are solved using the finite volume method with the dual-time stepping scheme. The linearized equations are solved with Gauss-Seidel line iterations. The parallel computation is implemented using the MPI protocol. The solver is validated with 2D cases for its turbulence modeling, parallel computation and unsteady calculation. The Zha CUSP scheme is validated with 2D cases, including a supersonic flat plate boundary layer, a transonic converging-diverging nozzle and a transonic inlet diffuser. The Zha CUSP2 scheme is tested with 3D cases, including a circular-to-rectangular nozzle, a subsonic compressor cascade and a transonic channel. The Zha CUSP schemes are shown to be accurate, robust and efficient in these tests. The steady and unsteady separation flows in a 3D stationary cascade under high incidence and three inlet Mach numbers are calculated to study the steady state separation flow patterns and their unsteady oscillation characteristics. The leading edge vortex shedding is the mechanism behind the unsteady characteristics of the high incidence separated flows. The separation flow characteristics are affected by the inlet Mach number. The blade aeroelasticity of a linear cascade with forced oscillating blades is studied using parallel computation. A simplified two-passage cascade with periodic boundary conditions is first calculated under a medium frequency and a low incidence. The full scale cascade with 9 blades and two end walls is then studied more extensively under three oscillation frequencies and two incidence angles.
The end wall influence and the blade stability are studied and compared under different frequencies and incidence angles. The Zha CUSP schemes are applied here for the first time to moving grid systems and to 2D and 3D calculations. The implicit Gauss-Seidel iteration with dual time stepping is used for the first time for moving grid systems. The NASA flutter cascade is calculated for the first time at full scale.
A massively parallel adaptive scheme for melt migration in geodynamics computations

NASA Astrophysics Data System (ADS)

Dannberg, Juliane; Heister, Timo; Grove, Ryan

2016-04-01

Melt generation and migration are important processes for the evolution of the Earth's interior and impact the global convection of the mantle. While they have been the subject of numerous investigations, the typical time and length-scales of melt transport are vastly different from global mantle convection, which determines where melt is generated. This makes it difficult to study mantle convection and melt migration in a unified framework. In addition, modelling magma dynamics poses the challenge of highly non-linear and spatially variable material properties, in particular the viscosity. We describe our extension of the community mantle convection code ASPECT that adds equations describing the behaviour of silicate melt percolating through and interacting with a viscously deforming host rock. We use the original compressible formulation of the McKenzie equations, augmented by an equation for the conservation of energy. This approach includes both melt migration and melt generation with the accompanying latent heat effects, and it incorporates the individual compressibilities of the solid and the fluid phase. For this, we derive an accurate and stable Finite Element scheme that can be combined with adaptive mesh refinement. This is particularly advantageous for this type of problem, as the resolution can be increased in mesh cells where melt is present and viscosity gradients are high, whereas a lower resolution is sufficient in regions without melt. Together with a high-performance, massively parallel implementation, this allows for high resolution, 3d, compressible, global mantle convection simulations coupled with melt migration. Furthermore, scalable iterative linear solvers are required to solve the large linear systems arising from the discretized system. 
Finally, we present benchmarks and scaling tests of our solver up to tens of thousands of cores, show the effectiveness of adaptive mesh refinement when applied to melt migration and compare the compressible and incompressible formulation. We then apply our software to large-scale 3d simulations of melting and melt transport in mantle plumes interacting with the lithosphere. Our model of magma dynamics provides a framework for modelling processes on different scales and investigating links between processes occurring in the deep mantle and melt generation and migration. The presented implementation is available online under an Open Source license together with an extensive documentation.
On the scalability of the Albany/FELIX first-order Stokes approximation ice sheet solver for large-scale simulations of the Greenland and Antarctic ice sheets

DOE PAGES

Tezaur, Irina K.; Tuminaro, Raymond S.; Perego, Mauro; ...

2015-01-01

We examine the scalability of the recently developed Albany/FELIX finite-element based code for the first-order Stokes momentum balance equations for ice flow. We focus our analysis on the performance of two possible preconditioners for the iterative solution of the sparse linear systems that arise from the discretization of the governing equations: (1) a preconditioner based on the incomplete LU (ILU) factorization, and (2) a recently-developed algebraic multigrid (AMG) preconditioner, constructed using the idea of semi-coarsening. A strong scalability study on a realistic, high resolution Greenland ice sheet problem reveals that, for a given number of processor cores, the AMG preconditioner results in faster linear solve times but the ILU preconditioner exhibits better scalability. In addition, a weak scalability study is performed on a realistic, moderate resolution Antarctic ice sheet problem, a substantial fraction of which contains floating ice shelves, making it fundamentally different from the Greenland ice sheet problem. We show that as the problem size increases, the performance of the ILU preconditioner deteriorates whereas the AMG preconditioner maintains scalability. This is because the linear systems are extremely ill-conditioned in the presence of floating ice shelves, and the ill-conditioning has a greater negative effect on the ILU preconditioner than on the AMG preconditioner.
An Aeroelastic Analysis of a Thin Flexible Membrane

NASA Technical Reports Server (NTRS)

Scott, Robert C.; Bartels, Robert E.; Kandil, Osama A.

2007-01-01

Studies have shown that significant vehicle mass and cost savings are possible with the use of ballutes for aero-capture. Through NASA's In-Space Propulsion program, a preliminary examination of ballute sensitivity to geometry and Reynolds number was conducted, and a single-pass coupling between an aero code and a finite element solver was used to assess the static aeroelastic effects. There remain, however, a variety of open questions regarding the dynamic aeroelastic stability of membrane structures for aero-capture, with the primary challenge being the prediction of the membrane flutter onset. The purpose of this paper is to describe and begin addressing these issues. The paper includes a review of the literature associated with the structural analysis of membranes and membrane flutter. Flow/structure analysis coupling and hypersonic flow solver options are also discussed. An approach is proposed for tackling this problem that starts with a relatively simple geometry and develops and evaluates analysis methods and procedures. This preliminary study considers a computationally manageable 2-dimensional problem. The membrane structural models used in the paper include a nonlinear finite-difference model for static and dynamic analysis and a NASTRAN finite element membrane model for nonlinear static and linear normal modes analysis. Both structural models are coupled with a structured compressible flow solver for static aeroelastic analysis. For dynamic aeroelastic analyses, the NASTRAN normal modes are used in the structured compressible flow solver and 3rd order piston theories were used with the finite difference membrane model to simulate flutter onset. Results from the various static and dynamic aeroelastic analyses are compared.
TADS: A CFD-based turbomachinery and analysis design system with GUI. Volume 1: Method and results

NASA Technical Reports Server (NTRS)

Topp, D. A.; Myers, R. A.; Delaney, R. A.

1995-01-01

The primary objective of this study was the development of a computational fluid dynamics (CFD) based turbomachinery airfoil analysis and design system, controlled by a graphical user interface (GUI). The computer codes resulting from this effort are referred to as the Turbomachinery Analysis and Design System (TADS). This document describes the theoretical basis and analytical results from the TADS system. TADS couples a throughflow solver (ADPAC) with a quasi-3D blade-to-blade solver (RVCQ3D) in an interactive package. Throughflow analysis capability was developed in ADPAC through the addition of blade force and blockage terms to the governing equations. A GUI was developed to simplify user input and automate the many tasks required to perform turbomachinery analysis and design. The coupling of various programs was done in a way that alternative solvers or grid generators could be easily incorporated into the TADS framework. Results of aerodynamic calculations using the TADS system are presented for a highly loaded fan, a compressor stator, a low-speed turbine blade, and a transonic turbine vane.
Modeling TAE Response To Nonlinear Drives

NASA Astrophysics Data System (ADS)

Zhang, Bo; Berk, Herbert; Breizman, Boris; Zheng, Linjin

2012-10-01

Experiment has detected Toroidal Alfven Eigenmodes (TAE) with signals at twice the eigenfrequency. These harmonic modes arise from the second order perturbation in amplitude of the MHD equation for the linear modes that are driven by the energetic particle free energy. The structure of TAE in realistic geometry can be calculated by generalizing the linear numerical solver (AEGIS package). We have inserted all the nonlinear MHD source terms, which are quadratic in the linear amplitudes, into the AEGIS code. We then invert the linear MHD equation at the second harmonic frequency. The ratio of the amplitudes of the first and second harmonic terms is used to determine the internal field amplitude. The spatial structure of the energy and density distributions is investigated. The results can be directly compared with experiments to determine the Alfven wave amplitude in the plasma region.
The fastclime Package for Linear Programming and Large-Scale Precision Matrix Estimation in R.

PubMed

Pang, Haotian; Liu, Han; Vanderbei, Robert

2014-02-01

We develop an R package fastclime for solving a family of regularized linear programming (LP) problems. Our package efficiently implements the parametric simplex algorithm, which provides a scalable and sophisticated tool for solving large-scale linear programs. As an illustrative example, one use of our LP solver is to implement an important sparse precision matrix estimation method called CLIME (Constrained L1 Minimization Estimator). Compared with existing packages for this problem such as clime and flare, our package has three advantages: (1) it efficiently calculates the full piecewise-linear regularization path; (2) it provides an accurate dual certificate as stopping criterion; (3) it is completely coded in C and is highly portable. This package is designed to be useful to statisticians and machine learning researchers for solving a wide range of problems.
Simulation results for a finite element-based cumulative reconstructor

NASA Astrophysics Data System (ADS)

Wagner, Roland; Neubauer, Andreas; Ramlau, Ronny

2017-10-01

Modern ground-based telescopes rely on adaptive optics (AO) systems for the compensation of image degradation caused by atmospheric turbulence. Within an AO system, measurements of incoming light from guide stars are used to adjust deformable mirror(s) in real time that correct for atmospheric distortions. The incoming wavefront has to be derived from sensor measurements, and this intermediate result is then translated into the shape(s) of the deformable mirror(s). Rapid changes of the atmosphere lead to the need for fast wavefront reconstruction algorithms. We review a fast matrix-free algorithm that was developed by Neubauer to reconstruct the incoming wavefront from Shack-Hartmann measurements based on a finite element discretization of the telescope aperture. The method is enhanced by a domain decomposition ansatz. We show that this algorithm reaches the quality of standard approaches in end-to-end simulation while at the same time maintaining the speed of recently introduced solvers with linear order speed.
Well-balanced high-order solver for blood flow in networks of vessels with variable properties.

PubMed

Müller, Lucas O; Toro, Eleuterio F

2013-12-01

We present a well-balanced, high-order non-linear numerical scheme for solving a hyperbolic system that models one-dimensional flow in blood vessels with variable mechanical and geometrical properties along their length. Using a suitable set of test problems with exact solution, we rigorously assess the performance of the scheme. In particular, we assess the well-balanced property and the effective order of accuracy through an empirical convergence rate study. Schemes of up to fifth order of accuracy in both space and time are implemented and assessed. The numerical methodology is then extended to realistic networks of elastic vessels and is validated against published state-of-the-art numerical solutions and experimental measurements. It is envisaged that the present scheme will constitute the building block for a closed, global model for the human circulation system involving arteries, veins, capillaries and cerebrospinal fluid. Copyright © 2013 John Wiley & Sons, Ltd.
Computing Generalized Matrix Inverse on Spiking Neural Substrate

PubMed Central

Shukla, Rohit; Khoram, Soroosh; Jorgensen, Erik; Li, Jing; Lipasti, Mikko; Wright, Stephen

2018-01-01

Emerging neural hardware substrates, such as IBM's TrueNorth Neurosynaptic System, can provide an appealing platform for deploying numerical algorithms. For example, a recurrent Hopfield neural network can be used to find the Moore-Penrose generalized inverse of a matrix, thus enabling a broad class of linear optimizations to be solved efficiently, at low energy cost. However, deploying numerical algorithms on hardware platforms that severely limit the range and precision of representation for numeric quantities can be quite challenging. This paper discusses these challenges and proposes a rigorous mathematical framework for reasoning about range and precision on such substrates. The paper derives techniques for normalizing inputs and properly quantizing synaptic weights originating from arbitrary systems of linear equations, so that solvers for those systems can be implemented in a provably correct manner on hardware-constrained neural substrates. The analytical model is empirically validated on the IBM TrueNorth platform, and results show that the guarantees provided by the framework for range and precision hold under experimental conditions. Experiments with optical flow demonstrate the energy benefits of deploying a reduced-precision and energy-efficient generalized matrix inverse engine on the IBM TrueNorth platform, reflecting 10× to 100× improvement over FPGA and ARM core baselines. PMID:29593483
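As a rough illustration of how an iterative, recurrence-style update can converge to the Moore-Penrose inverse, the following Python sketch applies a plain gradient iteration to the residual A X - I. This is not the Hopfield network or the TrueNorth mapping from the paper; the function names, the step-size rule, and the test matrix are assumptions chosen for illustration only. The step size is scaled by matrix norms, loosely echoing the input normalization the abstract mentions.

```python
def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def pinv_by_gradient_iteration(a, iters=500):
    """Approximate the Moore-Penrose inverse of a nonzero matrix `a`.

    Iterates X <- X - alpha * A^T (A X - I), starting from X = 0.
    """
    m, n = len(a), len(a[0])
    at = transpose(a)
    # ||A||_1 * ||A||_inf bounds the largest squared singular value,
    # so this step size keeps the iteration stable.
    norm_1 = max(sum(abs(a[i][j]) for i in range(m)) for j in range(n))
    norm_inf = max(sum(abs(v) for v in row) for row in a)
    alpha = 1.0 / (norm_1 * norm_inf)
    x = [[0.0] * m for _ in range(n)]
    for _ in range(iters):
        residual = matmul(a, x)
        for i in range(m):
            residual[i][i] -= 1.0          # residual = A X - I
        grad = matmul(at, residual)        # gradient of 0.5 * ||A X - I||_F^2
        x = [[x[i][j] - alpha * grad[i][j] for j in range(m)]
             for i in range(n)]
    return x


# Hypothetical 3x2 example with a zero row, so A has no ordinary inverse.
A = [[2.0, 1.0], [1.0, 2.0], [0.0, 0.0]]
A_pinv = pinv_by_gradient_iteration(A)
```

Starting from X = 0 keeps every iterate in the row space of A, which is why the limit is the pseudoinverse rather than some other left inverse. Convergence slows as the matrix becomes ill-conditioned, so the fixed iteration count here is only adequate for well-conditioned toy inputs.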
Numerical Approach to Spatial Deterministic-Stochastic Models Arising in Cell Biology

PubMed Central

Gao, Fei; Li, Ye; Novak, Igor L.; Slepchenko, Boris M.

2016-01-01

Hybrid deterministic-stochastic methods provide an efficient alternative to a fully stochastic treatment of models which include components with disparate levels of stochasticity. However, general-purpose hybrid solvers for spatially resolved simulations of reaction-diffusion systems are not widely available. Here we describe fundamentals of a general-purpose spatial hybrid method. The method generates realizations of a spatially inhomogeneous hybrid system by appropriately integrating capabilities of a deterministic partial differential equation solver with a popular particle-based stochastic simulator, Smoldyn. Rigorous validation of the algorithm is detailed, using a simple model of calcium ‘sparks’ as a testbed. The solver is then applied to a deterministic-stochastic model of spontaneous emergence of cell polarity. The approach is general enough to be implemented within biologist-friendly software frameworks such as Virtual Cell. PMID:27959915
Visualization and Tracking of Parallel CFD Simulations

NASA Technical Reports Server (NTRS)

Vaziri, Arsi; Kremenetsky, Mark

1995-01-01

We describe a system for interactive visualization and tracking of a 3-D unsteady computational fluid dynamics (CFD) simulation on a parallel computer. CM/AVS, a distributed, parallel implementation of a visualization environment (AVS) runs on the CM-5 parallel supercomputer. A CFD solver is run as a CM/AVS module on the CM-5. Data communication between the solver, other parallel visualization modules, and a graphics workstation, which is running AVS, are handled by CM/AVS. Partitioning of the visualization task, between CM-5 and the workstation, can be done interactively in the visual programming environment provided by AVS. Flow solver parameters can also be altered by programmable interactive widgets. This system partially removes the requirement of storing large solution files at frequent time steps, a characteristic of the traditional 'simulate (yields) store (yields) visualize' post-processing approach.
Algebraic multigrid preconditioners for two-phase flow in porous media with phase transitions [Algebraic multigrid preconditioners for multiphase flow in porous media with phase transitions]

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bui, Quan M.; Wang, Lu; Osei-Kuffuor, Daniel

Multiphase flow is a critical process in a wide range of applications, including oil and gas recovery, carbon sequestration, and contaminant remediation. Numerical simulation of multiphase flow requires solving of a large, sparse linear system resulting from the discretization of the partial differential equations modeling the flow. In the case of multiphase multicomponent flow with miscible effect, this is a very challenging task. The problem becomes even more difficult if phase transitions are taken into account. A new approach to handle phase transitions is to formulate the system as a nonlinear complementarity problem (NCP). Unlike in the primary variable switching technique, the set of primary variables in this approach is fixed even when there is phase transition. Not only does this improve the robustness of the nonlinear solver, it opens up the possibility to use multigrid methods to solve the resulting linear system. The disadvantage of the complementarity approach, however, is that when a phase disappears, the linear system has the structure of a saddle point problem and becomes indefinite, and current algebraic multigrid (AMG) algorithms cannot be applied directly. In this study, we explore the effectiveness of a new multilevel strategy, based on the multigrid reduction technique, to deal with problems of this type. We demonstrate the effectiveness of the method through numerical results for the case of two-phase, two-component flow with phase appearance/disappearance. In conclusion, we also show that the strategy is efficient and scales optimally with problem size.
Algebraic multigrid preconditioners for two-phase flow in porous media with phase transitions [Algebraic multigrid preconditioners for multiphase flow in porous media with phase transitions]

DOE PAGES

Bui, Quan M.; Wang, Lu; Osei-Kuffuor, Daniel

2018-02-06

Multiphase flow is a critical process in a wide range of applications, including oil and gas recovery, carbon sequestration, and contaminant remediation. Numerical simulation of multiphase flow requires solving of a large, sparse linear system resulting from the discretization of the partial differential equations modeling the flow. In the case of multiphase multicomponent flow with miscible effect, this is a very challenging task. The problem becomes even more difficult if phase transitions are taken into account. A new approach to handle phase transitions is to formulate the system as a nonlinear complementarity problem (NCP). Unlike in the primary variable switching technique, the set of primary variables in this approach is fixed even when there is phase transition. Not only does this improve the robustness of the nonlinear solver, it opens up the possibility to use multigrid methods to solve the resulting linear system. The disadvantage of the complementarity approach, however, is that when a phase disappears, the linear system has the structure of a saddle point problem and becomes indefinite, and current algebraic multigrid (AMG) algorithms cannot be applied directly. In this study, we explore the effectiveness of a new multilevel strategy, based on the multigrid reduction technique, to deal with problems of this type. We demonstrate the effectiveness of the method through numerical results for the case of two-phase, two-component flow with phase appearance/disappearance. In conclusion, we also show that the strategy is efficient and scales optimally with problem size.
Validation of a Simulation Process for Assessing the Response of a Vehicle and Its Occupants to an Explosive Threat

DTIC Science & Technology

2010-01-01

gross vehicle response; and the effects of blast mitigation material, restraint system, and seat design to the loads developed on the members of an...occupant. A Blast Event Simulation sysTem (BEST) has been developed for facilitating the easy use of the LS-DYNA solvers for conducting a...et al, 1999] for modeling blast events. In this paper the Eulerian solver of LS-DYNA is employed for simulating the soil – explosive – air
TADS: A CFD-based turbomachinery and analysis design system with GUI. Volume 1: Method and results

NASA Technical Reports Server (NTRS)

Topp, D. A.; Myers, R. A.; Delaney, R. A.

1995-01-01

The primary objective of this study was the development of a CFD (Computational Fluid Dynamics) based turbomachinery airfoil analysis and design system, controlled by a GUI (Graphical User Interface). The computer codes resulting from this effort are referred to as TADS (Turbomachinery Analysis and Design System). This document is the Final Report describing the theoretical basis and analytical results from the TADS system, developed under Task 18 of NASA Contract NAS3-25950, ADPAC System Coupling to Blade Analysis & Design System GUI. TADS couples a throughflow solver (ADPAC) with a quasi-3D blade-to-blade solver (RVCQ3D) in an interactive package. Throughflow analysis capability was developed in ADPAC through the addition of blade force and blockage terms to the governing equations. A GUI was developed to simplify user input and automate the many tasks required to perform turbomachinery analysis and design. The coupling of the various programs was done in such a way that alternative solvers or grid generators could be easily incorporated into the TADS framework. Results of aerodynamic calculations using the TADS system are presented for a highly loaded fan, a compressor stator, a low speed turbine blade and a transonic turbine vane.

Optimized and parallelized implementation of the electronegativity equalization method and the atom-bond electronegativity equalization method.

PubMed

Vareková, R Svobodová; Koca, J

2006-02-01

The most common way to calculate charge distribution in a molecule is ab initio quantum mechanics (QM). Some faster alternatives to QM have also been developed, the so-called "equalization methods" EEM and ABEEM, which are based on DFT. We have implemented and optimized the EEM and ABEEM methods and created the EEM SOLVER and ABEEM SOLVER programs. It has been found that the most time-consuming part of equalization methods is the reduction of the matrix belonging to the equation system generated by the method. Therefore, for both methods this part was replaced by the parallel algorithm WIRS and implemented within the PVM environment. The parallelized versions of the programs EEM SOLVER and ABEEM SOLVER showed promising results, especially on a single computer with several processors (compact PVM). The implemented programs are available through the Web page http://ncbr.chemi.muni.cz/~n19n/eem_abeem.
A Parallel Multigrid Solver for Viscous Flows on Anisotropic Structured Grids

NASA Technical Reports Server (NTRS)

Prieto, Manuel; Montero, Ruben S.; Llorente, Ignacio M.; Bushnell, Dennis M. (Technical Monitor)

2001-01-01

This paper presents an efficient parallel multigrid solver for speeding up the computation of a 3-D model that treats the flow of a viscous fluid over a flat plate. The main interest of this simulation lies in exhibiting some basic difficulties that prevent optimal multigrid efficiencies from being achieved. As the computing platform, we have used Coral, a Beowulf-class system based on Intel Pentium processors and equipped with GigaNet cLAN and switched Fast Ethernet networks. Our study not only examines the scalability of the solver but also includes a performance evaluation of Coral where the investigated solver has been used to compare several of its design choices, namely, the interconnection network (GigaNet versus switched Fast-Ethernet) and the node configuration (dual nodes versus single nodes). As a reference, the performance results have been compared with those obtained with the NAS-MG benchmark.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Huang, Kuo-Ling; Mehrotra, Sanjay

We present a homogeneous algorithm equipped with a modified potential function for the monotone complementarity problem. We show that this potential function is reduced by at least a constant amount if a scaled Lipschitz condition (SLC) is satisfied. A practical algorithm based on this potential function is implemented in a software package named iOptimize. The implementation in iOptimize maintains global linear and polynomial time convergence properties, while achieving practical performance. It either successfully solves the problem, or concludes that the SLC is not satisfied. When compared with the mature software package MOSEK (barrier solver version 6.0.0.106), iOptimize solves convex quadratic programming problems, convex quadratically constrained quadratic programming problems, and general convex programming problems in fewer iterations. Moreover, several problems for which MOSEK fails are solved to optimality. In addition, we also find that iOptimize detects infeasibility more reliably than the general nonlinear solvers Ipopt (version 3.9.2) and Knitro (version 8.0).
The DANTE Boltzmann transport solver: An unstructured mesh, 3-D, spherical harmonics algorithm compatible with parallel computer architectures

DOE Office of Scientific and Technical Information (OSTI.GOV)

McGhee, J.M.; Roberts, R.M.; Morel, J.E.

1997-06-01

A spherical harmonics research code (DANTE) has been developed which is compatible with parallel computer architectures. DANTE provides 3-D, multi-material, deterministic, transport capabilities using an arbitrary finite element mesh. The linearized Boltzmann transport equation is solved in a second order self-adjoint form utilizing a Galerkin finite element spatial differencing scheme. The core solver utilizes a preconditioned conjugate gradient algorithm. Other distinguishing features of the code include options for discrete-ordinates and simplified spherical harmonics angular differencing, an exact Marshak boundary treatment for arbitrarily oriented boundary faces, in-line matrix construction techniques to minimize memory consumption, and an effective diffusion based preconditioner for scattering dominated problems. Algorithm efficiency is demonstrated for a massively parallel SIMD architecture (CM-5), and compatibility with MPP multiprocessor platforms or workstation clusters is anticipated.
Multimodel methods for optimal control of aeroacoustics.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chen, Guoquan; Collis, Samuel Scott

2005-01-01

A new multidomain/multiphysics computational framework for optimal control of aeroacoustic noise has been developed based on a near-field compressible Navier-Stokes solver coupled with a far-field linearized Euler solver both based on a discontinuous Galerkin formulation. In this approach, the coupling of near- and far-field domains is achieved by weakly enforcing continuity of normal fluxes across a coupling surface that encloses all nonlinearities and noise sources. For optimal control, gradient information is obtained by the solution of an appropriate adjoint problem that involves the propagation of adjoint information from the far-field to the near-field. This computational framework has been successfully applied to study optimal boundary-control of blade-vortex interaction, which is a significant noise source for helicopters on approach to landing. In the model-problem presented here, the noise propagated toward the ground is reduced by 12 dB.
POSTPROCESSING MIXED FINITE ELEMENT METHODS FOR SOLVING CAHN-HILLIARD EQUATION: METHODS AND ERROR ANALYSIS

PubMed Central

Wang, Wansheng; Chen, Long; Zhou, Jie

2015-01-01

A postprocessing technique for mixed finite element methods for the Cahn-Hilliard equation is developed and analyzed. Once the mixed finite element approximations have been computed at a fixed time on the coarser mesh, the approximations are postprocessed by solving two decoupled Poisson equations in an enriched finite element space (either on a finer grid or a higher-order space) for which many fast Poisson solvers can be applied. The nonlinear iteration is only applied to a much smaller size problem and the computational cost using Newton and direct solvers is negligible compared with the cost of the linear problem. The analysis presented here shows that this technique retains the optimal rate of convergence for both the concentration and the chemical potential approximations. The corresponding error estimates obtained in our paper, especially the negative norm error estimates, are non-trivial and differ from the existing results in the literature. PMID:27110063
Novel numerical techniques for magma dynamics

NASA Astrophysics Data System (ADS)

Rhebergen, S.; Katz, R. F.; Wathen, A.; Alisic, L.; Rudge, J. F.; Wells, G.

2013-12-01

We discuss the development of finite element techniques and solvers for magma dynamics computations. These are implemented within the FEniCS framework. This approach allows for user-friendly, expressive, high-level code development, but also provides access to powerful, scalable numerical solvers and a large family of finite element discretisations. With the recent addition of dolfin-adjoint, FEniCS supports automated adjoint and tangent-linear models, enabling the rapid development of Generalised Stability Analysis. The ability to easily scale codes to three dimensions with large meshes, and/or to apply intricate adjoint calculations means that efficiency of the numerical algorithms is vital. We therefore describe our development and analysis of preconditioners designed specifically for finite element discretizations of equations governing magma dynamics. The preconditioners are based on Elman-Silvester-Wathen methods for the Stokes equation, and we extend these to flows with compaction. Our simulations are validated by comparison of results with laboratory experiments on partially molten aggregates.
Low-memory iterative density fitting.

PubMed

Grajciar, Lukáš

2015-07-30

A new low-memory modification of the density fitting approximation based on a combination of a continuous fast multipole method (CFMM) and a preconditioned conjugate gradient solver is presented. The iterative conjugate gradient solver uses preconditioners formed from blocks of the Coulomb metric matrix that decrease the number of iterations needed for convergence by up to one order of magnitude. The matrix-vector products needed within the iterative algorithm are calculated using CFMM, which evaluates them with only linear-scaling memory requirements. Compared with the standard density fitting implementation, up to a 15-fold reduction of the memory requirements is achieved for the most efficient preconditioner at a cost of only a 25% increase in computational time. The potential of the method is demonstrated by performing density functional theory calculations for a zeolite fragment with 2592 atoms and 121,248 auxiliary basis functions on a single 12-core CPU workstation. © 2015 Wiley Periodicals, Inc.
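The structure described above, a conjugate gradient loop that only needs a matrix-vector product and a preconditioner application, can be sketched in a few lines of Python. This is a generic textbook preconditioned conjugate gradient, not the author's implementation: the `matvec` callback merely stands in for the CFMM evaluation, and the diagonal (Jacobi) preconditioner is an assumed simplification of the block preconditioners built from the Coulomb metric matrix. All names and the 2x2 test system are hypothetical.

```python
def dot(u, v):
    """Euclidean inner product of two vectors stored as lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


def pcg(matvec, b, apply_prec, tol=1e-10, max_iter=200):
    """Preconditioned conjugate gradients for a symmetric positive definite operator.

    matvec(v) returns A @ v without requiring A to be stored;
    apply_prec(r) returns an approximation of A^{-1} @ r.
    """
    x = [0.0] * len(b)
    r = list(b)                       # residual b - A x for the start x = 0
    z = apply_prec(r)
    p = list(z)
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5 or 1.0
    for iteration in range(max_iter):
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, iteration
        ap = matvec(p)
        alpha = rz / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        z = apply_prec(r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


# Hypothetical 2x2 SPD test problem; the diagonal (Jacobi) preconditioner
# stands in for the block preconditioners described in the abstract.
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
solution, n_iter = pcg(
    matvec=lambda v: [dot(row, v) for row in A],
    b=b,
    apply_prec=lambda r: [ri / A[i][i] for i, ri in enumerate(r)],
)
```

Because the solver touches the matrix only through `matvec`, the memory footprint is set by whatever evaluates the product, which is the property the abstract exploits by using CFMM for that step.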
The Programming Language Python In Earth System Simulations

NASA Astrophysics Data System (ADS)

Gross, L.; Imranullah, A.; Mora, P.; Saez, E.; Smillie, J.; Wang, C.

2004-12-01

Mathematical models in earth sciences are based on the solution of systems of coupled, non-linear, time-dependent partial differential equations (PDEs). The spatial and time scales vary from a planetary scale and millions of years for convection problems to 100 km and 10 years for fault system simulations. Various techniques are in use to deal with the time dependency (e.g. Crank-Nicolson), with the non-linearity (e.g. Newton-Raphson) and with weakly coupled equations (e.g. non-linear Gauss-Seidel). Besides these high-level solution algorithms, discretization methods (e.g. finite element method (FEM), boundary element method (BEM)) are used to deal with spatial derivatives. Typically, large-scale, three-dimensional meshes are required to resolve geometrical complexity (e.g. in the case of fault systems) or features in the solution (e.g. in mantle convection simulations). The modelling environment escript allows the rapid implementation of new physics as required for the development of simulation codes in earth sciences. Its main objective is to provide a programming language in which the user can define new models and rapidly develop high-level solution algorithms. The current implementation is linked with the finite element package finley as a PDE solver. However, the design is open and other discretization technologies such as finite differences and boundary element methods could be included. escript is implemented as an extension of the interactive programming environment python (see www.python.org). Key concepts introduced are Data objects, which hold values on nodes or elements of the finite element mesh, and linearPDE objects, which define linear partial differential equations to be solved by the underlying discretization technology. In this paper we will show the basic concepts of escript and will show how escript is used to implement a simulation code for interacting fault systems.
We will show some results of large-scale, parallel simulations on an SGI Altix system. Acknowledgements: Project work is supported by Australian Commonwealth Government through the Australian Computational Earth Systems Simulator Major National Research Facility, Queensland State Government Smart State Research Facility Fund, The University of Queensland and SGI.
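To make the time-stepping technique named in this abstract concrete, the following Python sketch applies the Crank-Nicolson scheme to a scalar decay equation du/dt = -rate * u. It does not use escript or finley, and it is far simpler than a PDE solve; the function name and the test problem are assumptions chosen only to show the averaging of old and new states that defines the scheme.

```python
import math


def crank_nicolson_decay(rate, u0, t_end, n_steps):
    """Integrate du/dt = -rate * u with the Crank-Nicolson scheme."""
    dt = t_end / n_steps
    # (u_new - u_old) / dt = -rate * (u_new + u_old) / 2, solved for u_new.
    growth = (1.0 - 0.5 * rate * dt) / (1.0 + 0.5 * rate * dt)
    u = u0
    for _ in range(n_steps):
        u *= growth
    return u


# Compare against the exact solution exp(-t) at t = 1 on two step sizes.
exact = math.exp(-1.0)
err_coarse = abs(crank_nicolson_decay(1.0, 1.0, 1.0, 10) - exact)
err_fine = abs(crank_nicolson_decay(1.0, 1.0, 1.0, 20) - exact)
```

Halving the step size should reduce the error by roughly a factor of four, which reflects the second-order accuracy in time that makes the scheme attractive for long simulations.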
Automation of the CFD Process on Distributed Computing Systems

NASA Technical Reports Server (NTRS)

Tejnil, Ed; Gee, Ken; Rizk, Yehia M.

2000-01-01

A script system was developed to automate and streamline portions of the CFD process. The system was designed to facilitate the use of CFD flow solvers on supercomputer and workstation platforms within a parametric design event. Integrating solver pre- and postprocessing phases, the fully automated ADTT script system marshalled the required input data, submitted the jobs to available computational resources, and processed the resulting output data. A number of codes were incorporated into the script system, which itself was part of a larger integrated design environment software package. The IDE and scripts were used in a design event involving a wind tunnel test. This experience highlighted the need for efficient data and resource management in all parts of the CFD process. To facilitate the use of CFD methods to perform parametric design studies, the script system was developed using UNIX shell and Perl languages. The goal of the work was to minimize the user interaction required to generate the data necessary to fill a parametric design space. The scripts wrote out the required input files for the user-specified flow solver, transferred all necessary input files to the computational resource, submitted and tracked the jobs using the resource queuing structure, and retrieved and post-processed the resulting dataset. For computational resources that did not run queueing software, the script system established its own simple first-in-first-out queueing structure to manage the workload. A variety of flow solvers were incorporated in the script system, including INS2D, PMARC, TIGER and GASP. Adapting the script system to a new flow solver was made easier through the use of object-oriented programming methods. The script system was incorporated into an ADTT integrated design environment and evaluated as part of a wind tunnel experiment. The system successfully generated the data required to fill the desired parametric design space. 
This stressed the computational resources required to compute and store the information. The scripts were continually modified to improve the utilization of the computational resources and reduce the likelihood of data loss due to failures. An ad-hoc file server was created to manage the large amount of data being generated as part of the design event. Files were stored and retrieved as needed to create new jobs and analyze the results. Additional information is contained in the original.
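The first-in-first-out queueing idea described above can be sketched in a few lines. This is only an illustrative sketch in Python, not the original UNIX shell and Perl implementation; the class name `FifoJobQueue`, its methods, and the job names are hypothetical.

```python
from collections import deque


class FifoJobQueue:
    """Minimal first-in-first-out job queue (hypothetical helper)."""

    def __init__(self, max_running=1):
        self.max_running = max_running  # jobs allowed to run at once
        self.pending = deque()          # jobs waiting, oldest on the left
        self.running = []               # jobs currently executing
        self.finished = []              # jobs completed, in completion order

    def submit(self, job_name):
        """Append a job to the back of the queue, then try to start work."""
        self.pending.append(job_name)
        self._start_jobs()

    def complete(self, job_name):
        """Mark a running job as done and start the next waiting job."""
        self.running.remove(job_name)
        self.finished.append(job_name)
        self._start_jobs()

    def _start_jobs(self):
        # Start the oldest waiting jobs while free slots remain.
        while self.pending and len(self.running) < self.max_running:
            self.running.append(self.pending.popleft())


queue = FifoJobQueue(max_running=2)
for name in ["case_a", "case_b", "case_c", "case_d"]:
    queue.submit(name)

started_first = list(queue.running)   # the two earliest submissions
queue.complete("case_a")              # frees one slot for "case_c"
running_after = list(queue.running)
```

Jobs start strictly in submission order, so a resource without its own queueing software is never given more than `max_running` cases at a time.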
Complex wet-environments in electronic-structure calculations

NASA Astrophysics Data System (ADS)

Fisicaro, Giuseppe; Genovese, Luigi; Andreussi, Oliviero; Marzari, Nicola; Goedecker, Stefan

The computational study of chemical reactions in complex, wet environments is critical for applications in many fields. It is often essential to study chemical reactions in the presence of applied electrochemical potentials, including complex electrostatic screening coming from the solvent. Here we present a solver that handles both the Generalized Poisson and the Poisson-Boltzmann equations. A preconditioned conjugate gradient (PCG) method has been implemented for the Generalized Poisson equation and the linear regime of the Poisson-Boltzmann equation, allowing the minimization problem to be solved iteratively in about ten iterations. On the other hand, a self-consistent procedure enables us to solve the Poisson-Boltzmann problem. The algorithms take advantage of a preconditioning procedure based on the BigDFT Poisson solver for the standard Poisson equation. They exhibit very high accuracy and parallel efficiency, and allow different boundary conditions, including surfaces. The solver has been integrated into the BigDFT and Quantum-ESPRESSO electronic-structure packages and will be released as an independent program, suitable for integration in other codes. We present test calculations for large proteins to demonstrate efficiency and performance. This work was done within the PASC and NCCR MARVEL projects. Computer resources were provided by the Swiss National Supercomputing Centre (CSCS) under Project ID s499. LG also acknowledges support from the EXTMOS EU project.
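As a rough illustration of the PCG iteration mentioned in this abstract, the sketch below solves a one-dimensional finite-difference analogue of the generalized Poisson problem, -(d/dx)(eps d(phi)/dx) = rho, with zero Dirichlet boundaries. Everything here is an assumption made for illustration: the grid, the permittivity profile, the charge density, and in particular the simple diagonal (Jacobi) preconditioner, which stands in for the BigDFT-based preconditioner of the paper and should not be expected to reach the iteration counts reported there.

```python
import math


def apply_operator(eps_half, phi, h):
    """Apply the 1D finite-difference operator -(d/dx)(eps d(phi)/dx).

    eps_half[i] is the permittivity on the face between grid points i-1
    and i; phi is taken to be zero beyond both ends (Dirichlet boundary).
    """
    n = len(phi)
    out = [0.0] * n
    for i in range(n):
        left = phi[i - 1] if i > 0 else 0.0
        right = phi[i + 1] if i < n - 1 else 0.0
        out[i] = (eps_half[i] * (phi[i] - left)
                  - eps_half[i + 1] * (right - phi[i])) / h**2
    return out


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def pcg(apply_a, apply_minv, b, tol=1e-8, max_iter=500):
    """Preconditioned conjugate gradient for a symmetric positive definite A."""
    x = [0.0] * len(b)
    r = list(b)                      # residual b - A x for the zero start
    z = apply_minv(r)                # preconditioned residual
    p = list(z)                      # first search direction
    rz = dot(r, z)
    b_norm = math.sqrt(dot(b, b))
    for iteration in range(1, max_iter + 1):
        ap = apply_a(p)
        alpha = rz / dot(p, ap)      # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        if math.sqrt(dot(r, r)) <= tol * b_norm:
            return x, iteration
        z = apply_minv(r)
        rz_new = dot(r, z)
        beta = rz_new / rz           # keeps directions A-conjugate
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


# Hypothetical test problem: permittivity rising smoothly from 1 to 78.
n = 50
h = 1.0 / (n + 1)
eps_half = [1.0 + 77.0 * 0.5 * (1.0 + math.tanh(((i + 0.5) * h - 0.5) / 0.1))
            for i in range(n + 1)]
rho = [math.exp(-(((i + 1) * h - 0.3) / 0.05) ** 2) for i in range(n)]
diagonal = [(eps_half[i] + eps_half[i + 1]) / h**2 for i in range(n)]

phi, iterations = pcg(
    lambda v: apply_operator(eps_half, v, h),
    lambda r: [ri / di for ri, di in zip(r, diagonal)],  # Jacobi stand-in
    rho,
)
residual = [ai - bi for ai, bi in zip(apply_operator(eps_half, phi, h), rho)]
relative_residual = math.sqrt(dot(residual, residual)) / math.sqrt(dot(rho, rho))
```

The preconditioner enters only through `apply_minv`, so a stronger one (for example, an exact solve of the constant-permittivity Poisson problem) could be swapped in without touching the iteration itself.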
Solving groundwater flow problems by conjugate-gradient methods and the strongly implicit procedure

USGS Publications Warehouse

Hill, Mary C.

1990-01-01

The performance of the preconditioned conjugate-gradient method with three preconditioners is compared with the strongly implicit procedure (SIP) using a scalar computer. The preconditioners considered are the incomplete Cholesky (ICCG) and the modified incomplete Cholesky (MICCG), which require the same computer storage as SIP as programmed for a problem with a symmetric matrix, and a polynomial preconditioner (POLCG), which requires less computer storage than SIP. Although POLCG is usually used on vector computers, it is included here because of its small storage requirements. In this paper, published comparisons of the solvers are evaluated, all four solvers are compared for the first time, and new test cases are presented to provide a more complete basis by which the solvers can be judged for typical groundwater flow problems. Based on nine test cases, the following conclusions are reached: (1) SIP is actually as efficient as ICCG for some of the published, linear, two-dimensional test cases that were reportedly solved much more efficiently by ICCG; (2) SIP is more efficient than other published comparisons would indicate when common convergence criteria are used; and (3) for problems that are three-dimensional, nonlinear, or both, and for which common convergence criteria are used, SIP is often more efficient than ICCG, and is sometimes more efficient than MICCG.
RF Wave Simulation Using the MFEM Open Source FEM Package

NASA Astrophysics Data System (ADS)

Stillerman, J.; Shiraiwa, S.; Bonoli, P. T.; Wright, J. C.; Green, D. L.; Kolev, T.

2016-10-01

A new plasma wave simulation environment based on the finite element method is presented. MFEM, a scalable open-source FEM library, is used as the basis for this capability. MFEM allows for assembling an FEM matrix of arbitrarily high order in a parallel computing environment. A 3D frequency domain RF physics layer was implemented using a Python wrapper for MFEM and a cold collisional plasma model was ported. This physics layer allows for defining the plasma RF wave simulation model without user knowledge of the FEM weak-form formulation. A graphical user interface is built on πScope, a Python-based scientific workbench, such that a user can build a model definition file interactively. Benchmark cases have been ported to this new environment, with results being consistent with those obtained using COMSOL Multiphysics, GENRAY, and TORIC/TORLH spectral solvers. This work is a first step in bringing to bear the sophisticated computational tool suite that MFEM provides (e.g., adaptive mesh refinement, solver suite, element types) to the linear plasma-wave interaction problem, and within more complicated integrated workflows, such as coupling with a core spectral solver, or incorporating additional physics such as an RF sheath potential model or kinetic effects. USDoE Awards DE-FC02-99ER54512, DE-FC02-01ER54648.
Using the General Algebraic Modeling System on Peregrine | High-Performance

Science.gov Websites

directory, type the following: module load gams cp /nopt/nrel/apps/gams/example/trnsport.gms . gams trnsport file. For example, if your model input uses LP procedure and you want to use Gurobi solver to solve it directory that you run GAMS. For example, for the Gurobi solver, its option file is "gurobi.opt"
Equation solvers for distributed-memory computers

NASA Technical Reports Server (NTRS)

Storaasli, Olaf O.

1994-01-01

A large number of scientific and engineering problems require the rapid solution of large systems of simultaneous equations. The performance of parallel computers in this area now exceeds that of traditional vector computers by nearly an order of magnitude. This talk describes the major issues involved in parallel equation solvers, with particular emphasis on the Intel Paragon, IBM SP-1 and SP-2 processors.
Conservative tightly-coupled simulations of stochastic multiscale systems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Taverniers, Søren; Pigarov, Alexander Y.; Tartakovsky, Daniel M., E-mail: dmt@ucsd.edu

2016-05-15

Multiphysics problems often involve components whose macroscopic dynamics is driven by microscopic random fluctuations. The fidelity of simulations of such systems depends on their ability to propagate these random fluctuations throughout a computational domain, including subdomains represented by deterministic solvers. When the constituent processes take place in nonoverlapping subdomains, system behavior can be modeled via a domain-decomposition approach that couples separate components at the interfaces between these subdomains. Its coupling algorithm has to maintain a stable and efficient numerical time integration even at high noise strength. We propose a conservative domain-decomposition algorithm in which tight coupling is achieved by employing either Picard's or Newton's iterative method. Coupled diffusion equations, one of which has a Gaussian white-noise source term, provide a computational testbed for analysis of these two coupling strategies. Fully-converged (“implicit”) coupling with Newton's method typically outperforms its Picard counterpart, especially at high noise levels. This is because the number of Newton iterations scales linearly with the amplitude of the Gaussian noise, while the number of Picard iterations can scale superlinearly. At large time intervals between two subsequent inter-solver communications, the solution error for single-iteration (“explicit”) Picard's coupling can be several orders of magnitude higher than that for implicit coupling. Increasing the explicit coupling's communication frequency reduces this difference, but the resulting increase in computational cost can make it less efficient than implicit coupling at similar levels of solution error, depending on the communication frequency of the latter and the noise strength. This trend carries over into higher dimensions, although at high noise strength explicit coupling may be the only computationally viable option.
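The general difference between the two iterative methods named above can be seen on a toy problem. The sketch below is not the stochastic domain-decomposition algorithm of the paper; it is a scalar stand-in, assuming a hypothetical coupling condition x = cos(x), that contrasts the linear convergence of Picard (fixed-point) iteration with the faster convergence of Newton's method. The function names are illustrative.

```python
import math


def picard(g, x0, tol=1e-12, max_iter=1000):
    """Fixed-point (Picard) iteration x_{k+1} = g(x_k)."""
    x = x0
    for iteration in range(1, max_iter + 1):
        x_new = g(x)
        if abs(x_new - x) <= tol:
            return x_new, iteration
        x = x_new
    raise RuntimeError("Picard iteration did not converge")


def newton(f, dfdx, x0, tol=1e-12, max_iter=100):
    """Newton iteration x_{k+1} = x_k - f(x_k) / f'(x_k)."""
    x = x0
    for iteration in range(1, max_iter + 1):
        step = f(x) / dfdx(x)
        x -= step
        if abs(step) <= tol:
            return x, iteration
    raise RuntimeError("Newton iteration did not converge")


# Toy coupling condition: find x with x = cos(x).
x_picard, n_picard = picard(math.cos, 1.0)
x_newton, n_newton = newton(
    lambda x: x - math.cos(x),      # residual of the coupling condition
    lambda x: 1.0 + math.sin(x),    # its derivative
    1.0,
)
```

Both iterations reach the same fixed point, but Newton's method needs far fewer steps because it uses derivative information, at the price of evaluating that derivative on every iteration.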
Parallel Unsteady Overset Mesh Methodology for Adaptive and Moving Grids with Multiple Solvers

DTIC Science & Technology

2010-01-01

Research Laboratory Hampton, Virginia Jayanarayanan Sitaraman National Institute of Aerospace Hampton, Virginia ABSTRACT This paper describes a new...Army Research Laboratory ,Hampton, VA, , , 8. PERFORMING ORGANIZATION REPORT NUMBER 9. SPONSORING/MONITORING AGENCY NAME(S) AND ADDRESS(ES) NATO/RTO...results section ( 3.6 and 3.5). Good linear scalability was observed for all three cases up to 12 processors. Beyond that the scalability drops off
3D Gaussian Beam Modeling

DTIC Science & Technology

2011-09-01

optimized building blocks such as a parallelized tri-diagonal linear solver (used in the “implicit finite differences” and split-step Padé PE models...and Ding Lee. “A finite-difference treatment of interface conditions for the parabolic wave equation: The horizontal interface.” The Journal of the...Acoustical Society of America, 71(4):855, 1982. 3. Ding Lee and Suzanne T. McDaniel. “A finite-difference treatment of interface conditions for
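For readers unfamiliar with the tri-diagonal solver mentioned in this excerpt, the following is a minimal serial sketch of the standard Thomas algorithm. It is not the parallelized solver from the report, whose details are not given here; the function name and the small test system are hypothetical.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the (serial) Thomas algorithm.

    lower[i] multiplies x[i-1] in row i (lower[0] is unused),
    upper[i] multiplies x[i+1] in row i (upper[-1] is unused).
    No pivoting is done, so the matrix should be diagonally dominant.
    """
    n = len(diag)
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    # Forward sweep: eliminate the sub-diagonal.
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c_prime[i - 1]
        c_prime[i] = upper[i] / denom if i < n - 1 else 0.0
        d_prime[i] = (rhs[i] - lower[i] * d_prime[i - 1]) / denom
    # Back substitution.
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x


# Small test system whose exact solution is [1, 2, 3, 4].
lower = [0.0, -1.0, -1.0, -1.0]
diag = [2.0, 2.0, 2.0, 2.0]
upper = [-1.0, -1.0, -1.0, 0.0]
rhs = [0.0, 0.0, 0.0, 5.0]
solution = solve_tridiagonal(lower, diag, upper, rhs)
```

The forward sweep is inherently sequential, which is why parallel variants generally rely on a different formulation such as cyclic reduction.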
Conformal Grid Generation

DTIC Science & Technology

1982-04-01

highly recommended for mappings of the form Z = f(ζ) that a complicated mapping be restated as a sequence of simpler mappings for actual implementation in a...Society for Industrial and Applied Mathematics (SIAM), Ref. [53], contains linear equation solvers that can be useful in mapping operations...Press, New York, pp. 9-16. 53. Dongarra, J.J. (1979) "LINPACK User’s Guide," Society for Industrial and Applied Mathematics
An efficient direct solver for rarefied gas flows with arbitrary statistics

DOE Office of Scientific and Technical Information (OSTI.GOV)

Diaz, Manuel A., E-mail: f99543083@ntu.edu.tw; Yang, Jaw-Yen, E-mail: yangjy@iam.ntu.edu.tw; Center of Advanced Study in Theoretical Science, National Taiwan University, Taipei 10167, Taiwan

2016-01-15

A new numerical methodology associated with a unified treatment is presented to solve the Boltzmann–BGK equation of gas dynamics for the classical and quantum gases described by the Bose–Einstein and Fermi–Dirac statistics. Utilizing a class of globally-stiffly-accurate implicit–explicit Runge–Kutta schemes for the temporal evolution, together with the discrete ordinate method for the quadratures in momentum space and the weighted essentially non-oscillatory method for the spatial discretization, the proposed scheme is asymptotic-preserving and requires neither a non-linear solver nor knowledge of the fugacity and temperature to capture the flow structures in the hydrodynamic (Euler) limit. The proposed treatment overcomes the limitations found in the work by Yang and Muljadi (2011) [33] due to the non-linear nature of quantum relations, and can be applied in studying the dynamics of a gas with internal degrees of freedom with correct values of the ratio of specific heats for flow regimes at all Knudsen numbers and energy wavelengths. The present methodology is numerically validated with the unified treatment on the one-dimensional shock tube problem and the two-dimensional Riemann problems for gases of arbitrary statistics. Descriptions of ideal quantum gases including rotational degrees of freedom have been successfully achieved under the proposed methodology.
